A machine learning approach to identify key Epigenetic Transcripts for Ageing research in human blood (Epitage)
- Posted
- Server: bioRxiv
- DOI: 10.64898/2026.02.09.704870
DNA methylation is an established biomarker of human ageing, and analysing CpGs grouped by transcript as functional units may reveal new insights into the processes of ageing. In this study, we analysed the GSE87571 dataset (714 samples from individuals aged 14–94 years) to assess the relationship between transcript-level methylation profiles and chronological age in human blood. This approach led to the creation of Epitage, a curated set of 48 transcripts from 13 genes identified through machine learning as having methylation profiles that strongly correlate with age (R² ≥ 0.8). The analysis highlighted transcripts from the genes KCNS1, SPTBN4, and VTRNA1-2, which have only rarely been reported as age-related methylation markers in humans, making them underexplored candidates for future investigation. In addition, the list includes genes already implicated in ageing or related pathways, such as ELOVL2, FHL2, KLF14, TRIM59, MIR29B2CHG, CALB1, OBSCN, PRRT1, OTUD7A, and SYNGR3. To validate models efficiently while ensuring reproducibility, we developed ugPlot, an open-source R package with a graphical user interface (GUI) that automates routine steps for training and testing hundreds of machine-learning models. The tool also streamlines dataset import and manipulation, reducing human error and generating publication-ready plots. Epitage thus provides a focused and accessible starting point for experimental and translational studies into the roles of DNA methylation and transcript regulation in human ageing.
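The idea of scoring transcripts by how well their methylation tracks age can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use ugPlot: the abstract does not specify the machine-learning models, so this example substitutes a much simpler stand-in, namely the per-sample mean beta value across a transcript's CpGs followed by a simple linear fit against age. The function names, CpG and transcript identifiers, and toy values are all hypothetical; only the R² ≥ 0.8 cut-off comes from the text.

```python
from statistics import mean


def r_squared(x, y):
    """R^2 of a simple linear fit of y on x (the squared Pearson correlation)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no age information
    return (sxy * sxy) / (sxx * syy)


def select_transcripts(betas, transcript_cpgs, ages, threshold=0.8):
    """Return {transcript: R^2} for transcripts whose mean methylation tracks age.

    betas: dict mapping CpG id -> list of beta values, one per sample
    transcript_cpgs: dict mapping transcript id -> list of CpG ids
    ages: list of chronological ages, one per sample
    """
    selected = {}
    for transcript, cpgs in transcript_cpgs.items():
        # Assumption: summarise the transcript by its mean beta value per sample.
        profile = [mean(betas[c][i] for c in cpgs) for i in range(len(ages))]
        score = r_squared(profile, ages)
        if score >= threshold:
            selected[transcript] = score
    return selected


# Toy data with hypothetical identifiers (not taken from GSE87571).
ages = [20, 40, 60, 80]
betas = {
    "cpg_a": [0.2, 0.4, 0.6, 0.8],
    "cpg_b": [0.3, 0.5, 0.7, 0.9],
    "cpg_c": [0.5, 0.1, 0.6, 0.2],
}
transcript_cpgs = {
    "tx_age_linked": ["cpg_a", "cpg_b"],
    "tx_unrelated": ["cpg_c"],
}

selected = select_transcripts(betas, transcript_cpgs, ages)
print(sorted(selected))
```

On the toy data only the hypothetical `tx_age_linked` transcript passes the threshold. A realistic analysis would estimate R² on held-out samples rather than on the data used for fitting, and would likely model the CpGs within a transcript jointly instead of averaging them.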