
Write a PREreview

Multi-Omic Integration and Machine Learning Reveal Regulatory Networks Driving Breast Cancer Progression

Server: Preprints.org
DOI: 10.20944/preprints202512.0929.v1

Breast cancer progression from early to late stages involves complex molecular changes that traditional anatomic staging inadequately captures. Integration of microRNA (miRNA) and messenger RNA (mRNA) expression profiles through machine learning offers potential for identifying biological markers that distinguish progression states independent of tumor size and lymph node status. This study analyzed 1,081 primary breast cancer samples from The Cancer Genome Atlas with combined miRNA-Seq and RNA-Seq data, stratified into early-stage (Stage I-II, n=822) and late-stage (Stage III-IV, n=259) groups. Following variance-based feature selection retaining 3,000 high-variability features and sample-level log2-CPM normalization, nested 5-fold cross-validation with stratified sampling addressed the 3.2:1 class imbalance. Nine machine learning algorithms were evaluated, with XGBoost selected for final modeling after Bayesian hyperparameter optimization. The integrated miRNA-mRNA XGBoost classifier achieved test set accuracy of 79.8% (95% CI: 73.2-85.3%) with AUC 0.687 (95% CI: 0.622-0.748), outperforming single-platform mRNA-only models (AUC 0.654) and miRNA-only approaches (AUC 0.612). Top discriminative features included miR-21-5p, miR-155-5p, miR-200c-3p, and miR-145-5p among miRNAs, alongside mRNA targets PIK3CA, CCND1, MYC, and ERBB2. Network analysis revealed three core regulatory modules: epithelial-mesenchymal transition controlled by the miR-200 family targeting ZEB1/ZEB2, metabolic reprogramming via the miR-155/HK2 axis enhancing glycolysis, and immune evasion through miR-34a/PD-L1 regulation. Differential expression analysis identified 15 significant miRNAs and 194 significant mRNAs distinguishing progression groups. Hub miRNA analysis revealed 15 miRNAs with extensive target networks ranging from 97 to 516 targets. 
Multi-omic integration of miRNA and mRNA expression captures biological progression signatures beyond anatomic staging, with moderate but consistent classification performance validated through rigorous statistical methods. The identified regulatory networks provide mechanistic insights into progression drivers and potential therapeutic vulnerabilities applicable across diverse populations and resource settings.
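
The preprocessing steps named in the abstract (sample-level log2-CPM normalization, variance-based feature selection, and stratified 5-fold splitting) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names, the pseudocount of 1, the use of population variance, and the order of the steps are all assumptions made for illustration, and the abstract does not state whether feature selection was repeated inside each training fold.

```python
import math
import random
from statistics import pvariance


def log2_cpm(counts, pseudocount=1.0):
    """Log2 counts-per-million for one sample (pseudocount is an assumption)."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + pseudocount) for c in counts]


def top_variance_features(matrix, k):
    """Indices of the k features (columns) with the highest variance across samples."""
    n_features = len(matrix[0])
    variances = [pvariance([row[j] for row in matrix]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign a fold index to each sample so every fold keeps roughly the class ratio."""
    rng = random.Random(seed)
    folds = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled samples of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[i] = pos % n_folds
    return folds


# Class sizes taken from the abstract: 822 early-stage and 259 late-stage samples.
labels = ["early"] * 822 + ["late"] * 259
folds = stratified_folds(labels)
late_per_fold = [
    sum(1 for f, y in zip(folds, labels) if f == k and y == "late") for k in range(5)
]
print("late-stage samples per fold:", late_per_fold)
print("imbalance ratio:", round(822 / 259, 1))
```

In a nested design, a split like this would form the outer loop, with hyperparameter tuning run in an inner loop on each outer training set. To avoid information leakage, variance-based selection would normally be fitted on the training portion only, although the abstract does not say how the authors handled this.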

You can write a PREreview of Multi-Omic Integration and Machine Learning Reveal Regulatory Networks Driving Breast Cancer Progression. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a peer-review report organized by a journal.
