Multi-Omic Integration and Machine Learning Reveal Regulatory Networks Driving Breast Cancer Progression

Published
Server: Preprints.org
DOI: 10.20944/preprints202512.0929.v1

Breast cancer progression from early to late stages involves complex molecular changes that traditional anatomic staging inadequately captures. Integration of microRNA (miRNA) and messenger RNA (mRNA) expression profiles through machine learning offers potential for identifying biological markers that distinguish progression states independent of tumor size and lymph node status. This study analyzed 1,081 primary breast cancer samples from The Cancer Genome Atlas with combined miRNA-Seq and RNA-Seq data, stratified into early-stage (Stage I-II, n=822) and late-stage (Stage III-IV, n=259) groups. Following variance-based feature selection retaining 3,000 high-variability features and sample-level log2-CPM normalization, nested 5-fold cross-validation with stratified sampling addressed the 3.2:1 class imbalance. Nine machine learning algorithms were evaluated, with XGBoost selected for final modeling after Bayesian hyperparameter optimization. The integrated miRNA-mRNA XGBoost classifier achieved test set accuracy of 79.8% (95% CI: 73.2-85.3%) with AUC 0.687 (95% CI: 0.622-0.748), outperforming single-platform mRNA-only models (AUC 0.654) and miRNA-only approaches (AUC 0.612). Top discriminative features included miR-21-5p, miR-155-5p, miR-200c-3p, and miR-145-5p among miRNAs, alongside mRNA targets PIK3CA, CCND1, MYC, and ERBB2. Network analysis revealed three core regulatory modules: epithelial-mesenchymal transition controlled by the miR-200 family targeting ZEB1/ZEB2, metabolic reprogramming via the miR-155/HK2 axis enhancing glycolysis, and immune evasion through miR-34a/PD-L1 regulation. Differential expression analysis identified 15 significant miRNAs and 194 significant mRNAs distinguishing progression groups. Hub miRNA analysis revealed 15 miRNAs with extensive target networks ranging from 97 to 516 targets. 
Multi-omic integration of miRNA and mRNA expression captures biological progression signatures beyond anatomic staging, with moderate but consistent classification performance validated through rigorous statistical methods. The identified regulatory networks provide mechanistic insights into progression drivers and potential therapeutic vulnerabilities applicable across diverse populations and resource settings.
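
The preprocessing and validation steps named in the abstract (log2-CPM normalization, variance-based feature selection, stratified 5-fold splitting) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the pseudocount of 1, the use of population variance, and the order of the steps are all assumptions made for illustration, and only the Python standard library is used. The abstract does not say whether feature selection was repeated inside each training fold, so the sketch leaves that choice to the caller.

```python
import math
import random
from statistics import pvariance


def log2_cpm(sample_counts, pseudocount=1.0):
    """Per-sample log2 counts-per-million.

    The pseudocount (assumed here to be 1) keeps zero counts finite.
    """
    total = sum(sample_counts)
    return [math.log2(c / total * 1e6 + pseudocount) for c in sample_counts]


def top_variance_features(matrix, k):
    """Return the column indices of the k highest-variance features.

    `matrix` is a list of samples, each a list of feature values.
    Population variance is an assumption; the abstract only says
    "variance-based".
    """
    n_features = len(matrix[0])
    variances = [pvariance([row[j] for row in matrix]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign a fold index to every sample, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [None] * len(labels)
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal class members round-robin so fold sizes differ by at most one.
        for position, i in enumerate(members):
            folds[i] = position % n_folds
    return folds


if __name__ == "__main__":
    # Class sizes taken from the abstract: 822 early-stage, 259 late-stage.
    labels = ["early"] * 822 + ["late"] * 259
    folds = stratified_folds(labels)
    for k in range(5):
        n_late = sum(1 for f, y in zip(folds, labels) if f == k and y == "late")
        n_all = folds.count(k)
        print(f"fold {k}: {n_all} samples, {n_late} late-stage")
```

With the class sizes reported in the abstract, each fold keeps roughly the same early-to-late ratio as the full cohort, which is the point of stratifying under a 3.2:1 imbalance. In a nested scheme, the same splitting would be applied again inside each outer training set for hyperparameter tuning.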
