Deep Learning Architectures for Multi-Omics Data Integration: Bridging Biomarker Discovery and Clinical Translation
- Posted
- Server: Preprints.org
- DOI: 10.20944/preprints202601.1884.v1
The integration of heterogeneous molecular data across multiple omics layers is essential for understanding complex disease biology, yet conventional analytical approaches struggle with the high dimensionality, non-linearity, missingness, and technical variability of multi-omics datasets. This review critically evaluates how deep learning (DL) methodologies address these challenges and assesses their translational relevance within biomedical informatics.

We systematically review contemporary DL-based approaches for multi-omics data integration and categorize them into two principal methodological strategies: (i) unsupervised latent-space learning models, such as variational autoencoders, which enable probabilistic feature fusion, denoising, and data imputation; and (ii) network-based integration frameworks, including graph neural networks, which incorporate biological prior knowledge through molecular interaction graphs. Supervised extensions, including attention-based models for clinical prediction tasks, are also examined with emphasis on architectural design, data fusion mechanisms, and validation practices.

DL-based multi-omics integration methods have demonstrated improved performance over conventional approaches in disease subtyping, biomarker discovery, and prognostic modeling by capturing complex, non-linear interactions across molecular layers. Latent-space models provide robust representations in the presence of incomplete data, while network-based approaches facilitate the identification of biologically coherent molecular subnetworks. However, most reported performance gains rely on internal validation, with limited evidence of external generalizability, robust interpretability, or readiness for clinical deployment.

Deep learning has emerged as a powerful paradigm for multi-omics integration in biomedical informatics, enabling clinically relevant molecular stratification and prediction. Nevertheless, effective translation into routine clinical practice requires moving beyond predictive accuracy toward explainable modeling, standardized external validation on independent cohorts, and integration with real-world clinical data, all of which are essential for establishing trustworthy and actionable decision-support systems for precision medicine.
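
To make the idea of probabilistic feature fusion under incomplete data concrete, the following is a minimal sketch of one possible fusion rule, a product of Gaussian experts over per-omics latent posteriors. The abstract does not state which fusion rule the reviewed models use, so the choice of rule, the function name `fuse_gaussian_experts`, and the toy inputs are all assumptions made for illustration. In a real variational autoencoder the means and log-variances would come from trained per-omics encoders; here they are passed in directly.

```python
import numpy as np


def fuse_gaussian_experts(mus, logvars, present):
    """Fuse per-omics Gaussian posteriors with a product of experts.

    Hypothetical helper for illustration only.
    mus, logvars: arrays of shape (n_modalities, latent_dim)
    present: boolean array of shape (n_modalities,), False where an
             omics layer is missing for this sample
    Returns the mean and variance of the fused Gaussian, including a
    standard normal prior expert N(0, I).
    """
    mus = np.asarray(mus, dtype=float)
    logvars = np.asarray(logvars, dtype=float)
    mask = np.asarray(present, dtype=bool)[:, None]

    # Precision (inverse variance) of each expert; a missing modality
    # contributes zero precision and therefore drops out of the product.
    precision = np.exp(-logvars) * mask
    # Replace means of missing modalities (possibly NaN) with 0 so that
    # they cannot contaminate the weighted sum.
    safe_mus = np.where(mask, mus, 0.0)

    # The prior expert N(0, I) adds precision 1 and mean 0.
    total_precision = 1.0 + precision.sum(axis=0)
    fused_var = 1.0 / total_precision
    fused_mu = fused_var * (precision * safe_mus).sum(axis=0)
    return fused_mu, fused_var


# Toy example: two omics layers, two latent dimensions, unit variances.
mus = [[2.0, 2.0], [4.0, 0.0]]
logvars = [[0.0, 0.0], [0.0, 0.0]]
mu_all, var_all = fuse_gaussian_experts(mus, logvars, [True, True])
mu_one, var_one = fuse_gaussian_experts(mus, logvars, [True, False])
```

Under this rule, a sample with a missing omics layer still receives a latent representation; its fused variance is simply larger because fewer experts contribute. This is one way a latent-space model can remain usable on incomplete data, not a description of any specific method covered by the review.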