Lossy Loops: Shannon’s DPI and Information Decay in Generative Model Training
- Published
- Server: Preprints.org
- DOI: 10.20944/preprints202507.2260.v1
Model collapse, the progressive degradation of generative models trained repeatedly on their own synthetic output, poses a critical challenge for modern AI systems. This paper establishes a theoretical framework based on Shannon's Data Processing Inequality (DPI) to explain the phenomenon. We conceptualize generative AI models as lossy communication channels and predict progressive mutual information decay across iterative training generations. We derive testable hypotheses for exponential decay rates (λ ∈ [0.2, 0.4] per iteration) and propose mitigation paradigms that require future empirical validation. See also: https://doi.org/10.5281/zenodo.15199262 for a related philosophical exploration.
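The hypothesized decay can be sketched numerically. The snippet below is an illustrative assumption, not the paper's implementation: it plugs the stated decay-rate range λ ∈ [0.2, 0.4] into a simple exponential model I_n = I_0 · e^(−λn), where I_n is the mutual information between the original data distribution and the model's output after n training iterations. The function name and the normalized starting value I_0 = 1.0 are hypothetical choices for demonstration.

```python
import math

def mutual_info_after_iterations(i0: float, lam: float, n: int) -> float:
    """Hypothesized exponential decay of mutual information: I_n = i0 * exp(-lam * n)."""
    return i0 * math.exp(-lam * n)

# Illustrative trajectories over 5 synthetic-data training generations,
# using the decay-rate bounds hypothesized in the abstract.
for lam in (0.2, 0.4):
    trajectory = [round(mutual_info_after_iterations(1.0, lam, n), 3) for n in range(6)]
    print(f"lambda={lam}: {trajectory}")
```

Under this sketch, even the lower bound λ = 0.2 leaves roughly e⁻¹ ≈ 37% of the original mutual information after five generations, which conveys why iterative synthetic training is predicted to degrade models quickly.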