From Entropy and Beyond: A Comprehensive Survey of Probability-Space Unsupervised Objectives
- Server: Preprints.org
- DOI: 10.20944/preprints202606.0394.v1
In an era where computational capabilities are advancing rapidly through better algorithms and larger clusters, the growth of labeled data, the fossil fuel of AI, has not kept pace. This disparity has spurred growing interest in learning paradigms that rely solely on unlabeled data. One class of these paradigms employs unsupervised learning objectives that operate directly in the probability or prediction space, with Shannon entropy being one common example among many. Such objectives leverage unlabeled domain data to enable diverse tasks within the target domain. Yet these methods remain scattered across the literature, with no systematic overview to guide their comparison or use. This work addresses that gap by providing a high-level compilation of the designs, implementations, and applications of 17 such unsupervised loss functions, focusing on their roles in common learning applications while also exploring their broader potential. By presenting their theoretical underpinnings, practical applications, and small-scale yet extensive experiments, this study aims to shape future research on addressing data scarcity, reducing dependence on labeled annotations, and enabling the unsupervised optimization of increasingly large models.