Check out the new website for the awesome OpenStax Research team!
Jasper Tan Defends PhD Thesis
Rice DSP graduate student Jasper Tan successfully defended his PhD thesis entitled "Privacy-Preserving Machine Learning: The Role of Overparameterization and Solutions in Computational Imaging".
Abstract: While the accelerating deployment of machine learning (ML) models brings benefits to various aspects of human life, it also opens the door to serious privacy risks. In particular, it is sometimes possible to reverse engineer a given model to extract information about the data on which it was trained. Such leakage is especially dangerous if the model's training data contains sensitive information, such as medical records, personal media, or consumer behavior. This thesis is concerned with two big questions around this privacy issue: (1) "what makes ML models vulnerable to privacy attacks?" and (2) "how do we preserve privacy in ML applications?". For question (1), I present a detailed analysis of the effect that increased overparameterization has on a model's vulnerability to the membership inference (MI) privacy attack, the task of identifying whether or not a given point is included in the model's training dataset. I theoretically and empirically show multiple settings wherein increased overparameterization leads to increased vulnerability to MI even while improving generalization performance. However, I then show that incorporating proper regularization while increasing overparameterization can eliminate this effect and can actually increase privacy while preserving generalization performance, yielding a "blessing of dimensionality" for privacy through regularization. For question (2), I present results on the privacy-preserving techniques of synthetic training data simulation and privacy-preserving sensing, both in the domain of computational imaging. I first present a training data simulator for accurate ML-based depth of field (DoF) extension for time-of-flight (ToF) imagers, resulting in a 3.6x increase in a conventional ToF camera's DoF when used with a deblurring neural network. This simulator allows ML to be used without the need for potentially private real training data. Second, I propose a design for a sensor whose measurements obfuscate person identities while still allowing person detection to be performed. Ultimately, it is my hope that these findings and results take the community one step closer toward the responsible deployment of ML models without putting sensitive user data at risk.
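For readers new to membership inference, the sketch below is a minimal illustration of the standard loss-thresholding MI baseline against an overparameterized linear model. It is not code from the thesis; the synthetic data, sizes, and threshold sweep are assumptions chosen only to make the leakage visible. Because an interpolating model fits its training points nearly perfectly, their losses alone betray membership.

```python
# Illustrative sketch (not from the thesis): loss-thresholding membership
# inference against an overparameterized least-squares model.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 400                      # more parameters than training points
beta = rng.normal(size=p) / np.sqrt(p)

def sample(m):
    X = rng.normal(size=(m, p))
    return X, X @ beta + 0.5 * rng.normal(size=m)

X_train, y_train = sample(n)         # members of the training set
X_out, y_out = sample(n)             # non-members from the same distribution

# Minimum-norm interpolating solution.
w = np.linalg.pinv(X_train) @ y_train

loss_in = (X_train @ w - y_train) ** 2
loss_out = (X_out @ w - y_out) ** 2

# Guess "member" whenever the loss falls below a threshold; sweep thresholds
# and report the best attack accuracy (0.5 would mean no leakage).
thresholds = np.quantile(np.concatenate([loss_in, loss_out]), np.linspace(0, 1, 101))
acc = max(0.5 * ((loss_in < t).mean() + (loss_out >= t).mean()) for t in thresholds)
print(f"membership inference accuracy ~ {acc:.2f}")
```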
Jasper’s next step is the start-up GLASS Imaging, where he will be leveraging machine learning and computational photography techniques to design a new type of camera with SLR image quality that can fit in your pocket.
DSP Faculty Member Richard Baraniuk Honored with IEEE SPS Norbert Wiener Society Award
Richard Baraniuk has been selected for the 2022 IEEE SPS Norbert Wiener Society Award "for fundamental contributions to sparsity-based signal processing and pioneering broad dissemination of open educational resources". The Society Award honors outstanding technical contributions in a field within the scope of the IEEE Signal Processing Society and outstanding leadership in that field.
Why Some Deep Networks are Easier to Optimize Than Others
"Singular Value Perturbation and Deep Network Optimization", Rudolf H. Riedi, Randall Balestriero, and Richard G. Baraniuk, Constructive Approximation, 27 November 2022 (also arXiv preprint 2203.03099, 7 March 2022)
Deep learning practitioners know that ResNets and DenseNets are much preferred over ConvNets because, empirically, gradient descent converges faster and more stably to a better solution when training them. In other words, it is not what a deep network can approximate that matters, but rather how it learns to approximate. Empirical studies have indicated that this is because the so-called loss landscape of the objective function navigated by gradient descent as it optimizes the deep network parameters is much smoother for ResNets and DenseNets than for ConvNets (see Figure 1 from Tom Goldstein's group below). However, to date there has been no analytical work in this direction.
Building on our earlier work connecting deep networks with continuous piecewise-affine splines, we develop an exact local linear representation of a deep network layer for a family of modern deep networks that includes ConvNets at one end of a spectrum and networks with skip connections, such as ResNets and DenseNets, at the other. For tasks that optimize the squared-error loss, we prove that the optimization loss surface of a modern deep network is piecewise quadratic in the parameters, with local shape governed by the singular values of a matrix that is a function of the local linear representation. We develop new perturbation results for how the singular values of matrices of this sort behave as we add a fraction of the identity and multiply by certain diagonal matrices. A direct application of our perturbation results explains analytically why a network with skip connections (e.g., ResNet or DenseNet) is easier to optimize than a ConvNet: thanks to its more stable singular values and smaller condition number, the local loss surface of a network with skip connections is less erratic, less eccentric, and features local minima that are more accommodating to gradient-based optimization. Our results also shed new light on the impact of different nonlinear activation functions on a deep network's singular values, regardless of its architecture.
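As a rough numerical illustration of this intuition (our own toy example with arbitrary sizes and scaling, not an experiment from the paper), one can compare the condition number of a random layer matrix W with that of I + W, the local linear map induced by a skip connection; the identity shift typically pushes the smallest singular value away from zero and shrinks the condition number.

```python
# Toy illustration (not from the paper): conditioning of a plain layer W
# versus the skip-connection map I + W.
import numpy as np

rng = np.random.default_rng(0)
d = 256
W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))   # random layer weights

for name, A in [("plain layer W", W), ("skip layer I + W", np.eye(d) + W)]:
    s = np.linalg.svd(A, compute_uv=False)
    print(f"{name}: sigma_max = {s[0]:.2f}, sigma_min = {s[-1]:.4f}, "
          f"condition number = {s[0] / s[-1]:.1f}")
```

A smaller condition number corresponds to a less eccentric local loss surface, which is exactly the regime in which gradient-based optimization behaves well.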
Daniel LeJeune Defends PhD Thesis
Rice DSP graduate student Daniel LeJeune successfully defended his PhD thesis entitled "Ridge Regularization by Randomization in Linear Ensembles".
Abstract: Ensemble methods such as the ever-popular random forest, which average over a collection of independent predictors each limited to random sampling of both the examples and features of the training data, command a significant presence in machine learning. Combining many such randomized predictors into an ensemble produces a highly robust predictor with excellent generalization properties; however, the specific effect of randomization on ensemble behavior has received little theoretical attention. We study the case of an ensemble of linear predictors, where each individual predictor is fit on a randomized sample of the data matrix. We first give a straightforward argument that an ensemble of ordinary least squares predictors fit on simple subsamples can achieve the optimal ridge regression risk in a standard Gaussian data setting. We then significantly generalize this result to eliminate essentially all assumptions on the data by considering ensembles of linear random projections or sketches of the data, and in doing so reveal an asymptotic first-order equivalence between linear regression on sketched data and ridge regression. By extending this analysis to a second-order characterization, we show how large ensembles converge to ridge regression under quadratic metrics.
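The following toy simulation (our own illustration with arbitrary sizes, not code from the thesis) conveys the flavor of the first result: averaging many ordinary least squares fits, each on a random subsample of the examples and features of a Gaussian dataset, yields coefficients close to a ridge regression solution at some effective penalty.

```python
# Toy illustration (not from the thesis): an ensemble of subsampled OLS fits
# behaves like ridge regression at some effective penalty.
import numpy as np

rng = np.random.default_rng(0)
n, p = 600, 60
X = rng.normal(size=(n, p))
beta = rng.normal(size=p) / np.sqrt(p)
y = X @ beta + 0.5 * rng.normal(size=n)

B, n_sub, p_sub = 500, 300, 30       # ensemble size and subsample sizes (arbitrary choices)
ensemble_coef = np.zeros(p)
for _ in range(B):
    rows = rng.choice(n, size=n_sub, replace=False)
    cols = rng.choice(p, size=p_sub, replace=False)
    coef = np.linalg.lstsq(X[np.ix_(rows, cols)], y[rows], rcond=None)[0]
    full = np.zeros(p)
    full[cols] = coef
    ensemble_coef += full / B

# Find the ridge penalty whose solution is closest to the ensemble average.
gap, lam = min(
    (np.linalg.norm(np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y) - ensemble_coef), lam)
    for lam in np.logspace(-2, 3, 60)
)
print(f"closest ridge penalty: lambda ~ {lam:.2f}, coefficient gap {gap:.4f} "
      f"(ensemble coefficient norm {np.linalg.norm(ensemble_coef):.4f})")
```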
Daniel's next step is a postdoc with Emmanuel Candes at Stanford University.
New Course for Spring 2023: ELEC378 – Machine Learning: Concepts and Techniques
ELEC378 - Machine Learning: Concepts and Techniques
Instructor: Prof. Richard Baraniuk
Machine learning is a powerful new way to build signal processing models and systems using data rather than physics. This introductory course covers the key ideas, algorithms, and implementations of both classical and modern methods. Topics include supervised and unsupervised learning, optimization, linear regression, logistic regression, support vector machines, deep neural networks, clustering, and data mining. A course highlight is a hands-on team project competition using real-world data.
The course is open to students at all levels who are comfortable with linear algebra and coding in Python (ideally), R, or MATLAB.
Course webpage; more information coming soon!
A Visual Tour of Current Challenges in Multimodal Language Models
"A Visual Tour of Current Challenges in Multimodal Language Models", Shashank Sonkar, Naiming Liu, and Richard G. Baraniuk, arXiv preprint 2210.12565, October 2022
Transformer models trained on massive text corpora have become the de facto models for a wide range of natural language processing tasks. However, learning effective word representations for function words remains challenging. Multimodal learning, which visually grounds transformer models in imagery, can overcome the challenges to some extent; however, there is still much work to be done. In this study, we explore the extent to which visual grounding facilitates the acquisition of function words using stable diffusion models that employ multimodal models for text-to-image generation. Out of seven categories of function words, along with numerous subcategories, we find that stable diffusion models effectively model only a small fraction of function words – a few pronoun subcategories and relatives. We hope that our findings will stimulate the development of new datasets and approaches that enable multimodal models to learn better representations of function words.
Above: Sample images depicting a stable diffusion model's (SDM's) success (green border) and failure (red border) in capturing the semantics of different subcategories of pronouns. (a)-(c) show that the information about gender and count implicit in subject pronouns like he, she, we is accurately depicted. But for indefinite pronouns, SDMs fail to capture the notion of negation ((d) nobody), existentials ((e) some), and universals ((f) everyone). Likewise, SDMs fail to capture the meaning of reflexive pronouns such as (g) myself, (h) himself, (i) herself.
We provide the code on GitHub for readers to replicate our findings and explore further.
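The repository above is the place to go to reproduce the paper's results. Purely as a sketch of the probing setup, one can feed a stable diffusion pipeline prompts that form minimal pairs differing only in a function word and inspect the generated images; the model identifier, prompts, and settings below are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch (not the paper's code): probe how a text-to-image model
# renders function words by generating images from minimal prompt pairs.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed checkpoint
).to("cuda")

# Hypothetical minimal pairs probing indefinite pronouns.
prompts = [
    "everyone at the party is dancing",
    "nobody at the party is dancing",
    "some of the people at the party are dancing",
]
for prompt in prompts:
    image = pipe(prompt, num_inference_steps=30).images[0]
    image.save(prompt.replace(" ", "_") + ".png")
```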
DSP PhD Student Jack Wang Named a University of Chicago “Rising Star in Data Science”

CJ Barberan Defends PhD on Interpretable Deep Learning
CJ Barberan defended his PhD thesis entitled "NeuroView: Explainable Deep Network Decision Making". CJ's next step is the Microsoft AI Development Acceleration Program (MAIDAP) in Cambridge, MA.