The Synthetic Data Age is here to stay: generative AI models are training on AI-generated data, whether knowingly or unknowingly. What does this mean for the future, and how do we handle it?
Advances in generative artificial intelligence (AI) algorithms for text, imagery, and other data types have led to the temptation to use AI-synthesized data to train next-generation models. Unfortunately, recent research has shown that repeated training with synthetic data forms a self-consuming feedback loop that causes the model distribution to drift away from reality, reinforcing biases, amplifying artifacts, and lowering the quality and diversity of next-generation models. This workshop will explore, crystallize, and build a community around this important emerging research area from three diverse perspectives:
- theoretical and empirical studies of AI model deterioration due to synthetic data training
- potential strategies to mitigate the deterioration
- the ethical and societal implications of self-consuming AI model training.
Schedule
⭐ Link to NeurIPS page: https://neurips.cc/virtual/2024/workshop/????? ⭐
Time (PST) | Session | Speaker(s) |
---|---|---|
09:00 am | Opening Remarks | Organizers |
09:00 am | Invited Talk 1: Title 1 | Speaker 1 |
09:30 am | Invited Talk 2: Title 2 | Speaker 2 |
10:00 am | Oral 1: Paper Title | Paper Authors |
10:15 am | Oral 2: Paper Title | Paper Authors |
10:30 am | Break and Discussion 1 | |
11:00 am | Panel Session 1: Topic 1 | Panelists |
12:00 pm | Oral 3: Paper Title | Paper Authors |
12:15 pm | Oral 4: Paper Title | Paper Authors |
12:30 pm | Lunch Break | |
01:00 pm | Poster Session and Discussion 2 | |
03:00 pm | Break and Discussion 3 | |
03:30 pm | Invited Talk 3: Talk 3 | Speaker 3 |
04:00 pm | Invited Talk 4: Talk 4 | Speaker 4 |
04:30 pm | Invited Talk 5: Talk 5 | Speaker 5 |
05:00 pm | Discussion 4 and Closing Remarks | Organizers |
Speakers
- Josue Casco-Rodriguez, Rice University
- Tom Goldstein, University of Maryland
- Abeba Birhane, Mozilla Foundation
- Zakhar Shumaylov, University of Cambridge
- Nate Gillman, Brown University
- Elvis Dohmatob, Meta FAIR
Panelists
- Maria del Rio-Chanona, University College London, UK
- Justin Norman, UC Berkeley, USA
- Sahand Sharifzadeh, DeepMind, UK
Organizers
- Richard G. Baraniuk, Rice University, USA
- Sina Alemohammad, Rice University, USA
- Julia Kempe, New York University, USA
- Ilia Shumailov, DeepMind, UK
- Tshilidzi Marwala, United Nations University, Japan
- Jathan Sadowski, Monash University, Australia
Call for Papers
We cordially invite submissions to and participation in our “AI in the Synthetic Data Age: Unintended Consequences and Potential Mitigation” workshop, which will be held on December 14 or 15, 2024 at NeurIPS 2024, Vancouver, Canada.
The submission deadline is August 30th, 2024, 23:59 AoE, and the submission link is https://openreview.net/group?id=NeurIPS.cc/2024/Workshop/????.
Topics
We welcome submissions related to any aspect of self-consuming AI model training and synthetic data, including but not limited to:
- what happens when generative AI models train on AI-generated or AI-curated data?
  - theoretical guarantees
  - empirical descriptions
  - stability
- how can we mitigate the negative effects of AI self-consumption?
  - detection and removal of harmful synthetic data
  - new, less harmful, ways of training on synthetic data
  - self-improvement: improving generative AI models with self-synthesized data
  - synthetic data accumulation
  - synthetic data correction
- how will AI self-consumption impact society? examples include:
  - LLMs in search engines training and finding AI-generated data
  - AI model collapse and creativity
The workshop will employ a double-blind review process. Each submission will be evaluated based on the following criteria:
- Soundness of the methodology
- Relevance to the workshop
- Societal impacts
We only consider submissions that haven’t been published yet in any peer-reviewed venue, including the NeurIPS 2024 conference. We allow dual submissions with other workshops or conferences. The workshop is non-archival and will not have any official proceedings. All accepted papers will be allocated either a poster presentation or a talk slot.
Important Dates
- Submission deadline: August 30th, 2024, 23:59 Anywhere on Earth (AoE)
- Author notification: September 30th, 2024
- Camera-ready deadline: December 1st, 2024, 23:59 Anywhere on Earth (AoE)
- Workshop date: December 14th or 15th, 2024 (Full-day Event)
Submission Instructions
Papers should be submitted to OpenReview: https://openreview.net/group?id=NeurIPS.cc/2024/Workshop/????
Submitted papers should be at most 6 pages long (excluding references, acknowledgments, and appendices). Please use the NeurIPS submission template provided at https://neurips.cc/Conferences/2024/CallForPapers. Submissions must be anonymous, following the NeurIPS double-blind reviewing guidelines, the NeurIPS Code of Conduct, and the Code of Ethics. Accepted papers will be hosted on the workshop website but are considered non-archival and can be submitted to other workshops, conferences, or journals if their submission policy allows.