
Write a PREreview

SHAP‑Guided CpG Selection with Ensemble Learning for Epigenetic Age Prediction

Published
Server: bioRxiv
DOI: 10.64898/2026.02.20.707142

Abstract: Epigenetic biomarkers offer critical insight into biological aging and disease risk, yet most deep learning models lack interpretability and generalization across tissues. We present a reproducible pipeline for interpretable age classification using SHAP-guided CpG prioritization, enhancer and gene annotation, and stacked ensemble modeling. Across both blood and brain samples (GSE41826, GSE40279), certain CpGs showed reproducible age-linked methylation changes. Comparative performance metrics, SHAP breakdowns, and CpG-level stability analyses support their potential as cross-tissue anchor sites. A multi-model ensemble combining XGBoost, MLP, TabTransformer→XGBoost, and LightGBM yielded high predictive accuracy (92.4%) and a macro F1 of 92.3%. Biological support for these findings stems from motif scans, enrichment results, and visual mapping of CpG-to-gene relationships using Sankey diagrams. Delta-based stacking improved prediction confidence in borderline age groups, notably boosting middle-age recall through complementary model behavior. This work lays the groundwork for explainable epigenetic clocks that transcend tissue boundaries.
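For readers weighing the reported accuracy against the macro F1 and the middle-age recall, the following is a minimal sketch of how those metrics relate. It is not the authors' evaluation code: the three class names (`young`, `middle`, `old`) and the toy labels are assumptions made purely for illustration, and the helper functions are hypothetical.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so small classes count equally."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (assumed, not from the preprint): one middle-age sample is
# misclassified as young, the rest are correct.
y_true = ["young"] * 4 + ["middle"] * 2 + ["old"] * 4
y_pred = ["young"] * 4 + ["young", "middle"] + ["old"] * 4

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
middle_recall = per_class_scores(y_true, y_pred)["middle"][1]
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f} middle_recall={middle_recall:.3f}")
```

In this toy case a single error in the smallest class lowers macro F1 more than it lowers accuracy, which is why a macro F1 close to the accuracy suggests that no age group is being systematically missed.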

You can write a PREreview of SHAP‑Guided CpG Selection with Ensemble Learning for Epigenetic Age Prediction. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report organized by a journal.

Before you start

We will ask you to log in with your ORCID iD. If you don't have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from others with the same or a similar name.
