Subcellular Localization Constrains Protein Detectability and Reveals Systematic RNA-Protein Discordance Across Cancers

Server: bioRxiv
DOI: 10.64898/2026.03.30.713919

Transcript abundance is widely used as a proxy for protein expression in cancer studies; however, mRNA levels often fail to predict protein detectability due to post-transcriptional and compartment-specific regulatory processes. Here, we present a machine learning framework that integrates RNA expression, gene-level attributes, and subcellular localization to model protein detectability across human cancers.

Using transcriptomic data from TCGA, TARGET, and GTEx, and protein annotations from the Human Protein Atlas, we constructed a dataset comprising over 100,000 gene–cancer pairs across seven tumor types. Models based on RNA features alone achieved moderate predictive performance (ROC-AUC ~0.71), whereas incorporating subcellular localization significantly improved discrimination (ROC-AUC ~0.82). Paired bootstrap analysis confirmed that these gains were statistically robust.
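For readers unfamiliar with the comparison described above, the following is a minimal sketch of a paired bootstrap on the ROC-AUC difference between two models scored on the same gene–cancer pairs. It is not the authors' code: the function names (`roc_auc`, `paired_bootstrap_auc_diff`), the number of resamples, the percentile interval, and the toy inputs are all assumptions made for illustration. The key idea is that each resample draws the same indices for both models, so the difference is computed on matched data.

```python
import random


def roc_auc(labels, scores):
    """Rank-based (Mann-Whitney) ROC-AUC; tied scores receive average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def paired_bootstrap_auc_diff(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Return (mean, lower, upper) for AUC(b) - AUC(a) over paired resamples.

    Assumption: a simple 95% percentile interval; the preprint's exact
    procedure is not specified in the abstract.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        # Same resampled indices for both models keeps the comparison paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if sum(y) in (0, n):
            continue  # skip resamples that contain only one class
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(roc_auc(y, b) - roc_auc(y, a))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return sum(diffs) / n_boot, lower, upper


if __name__ == "__main__":
    # Toy data only: 1 = protein detected, 0 = not detected.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    rna_only = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3, 0.1, 0.8]
    rna_plus_loc = [0.1, 0.3, 0.6, 0.9, 0.2, 0.7, 0.4, 0.8]
    print(paired_bootstrap_auc_diff(y, rna_only, rna_plus_loc, n_boot=500))
```

An interval for the difference that excludes zero would be consistent with a robust gain; whether the authors used a percentile interval or another summary is not stated in the abstract.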

We further identify a substantial set of genes with high transcript abundance yet absent protein detection, revealing widespread RNA-protein decoupling. These discordant genes are enriched in mitochondrial, metabolic, and translational regulatory pathways, suggesting that discordance reflects structured biological processes rather than stochastic variation. Together, our results demonstrate that cellular context, particularly subcellular localization, is a key determinant of protein detectability and underscore the limitations of transcript-centric interpretations in cancer genomics.
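As a rough illustration of what "high transcript abundance yet absent protein detection" could mean operationally, the sketch below flags gene–cancer pairs whose RNA value is at or above a chosen percentile while the protein is recorded as not detected. The abstract does not state the threshold or the data layout, so the 75th-percentile cutoff, the record fields (`gene`, `cancer`, `rna`, `protein_detected`), and the function name `discordant_pairs` are hypothetical.

```python
from statistics import quantiles


def discordant_pairs(records, rna_percentile=75):
    """Flag pairs with high RNA but no detected protein.

    Assumption: "high" means at or above the given percentile of RNA values
    across all supplied records; the preprint's actual criterion may differ.
    """
    values = [r["rna"] for r in records]
    # quantiles(..., n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cutoff = quantiles(values, n=100, method="inclusive")[rna_percentile - 1]
    return [
        (r["gene"], r["cancer"])
        for r in records
        if r["rna"] >= cutoff and not r["protein_detected"]
    ]


if __name__ == "__main__":
    # Placeholder records, not real measurements.
    toy = [
        {"gene": "GENE_A", "cancer": "cancer_1", "rna": 1.0, "protein_detected": False},
        {"gene": "GENE_B", "cancer": "cancer_1", "rna": 2.0, "protein_detected": True},
        {"gene": "GENE_C", "cancer": "cancer_1", "rna": 3.0, "protein_detected": False},
        {"gene": "GENE_D", "cancer": "cancer_1", "rna": 4.0, "protein_detected": False},
        {"gene": "GENE_E", "cancer": "cancer_1", "rna": 5.0, "protein_detected": True},
    ]
    print(discordant_pairs(toy))
```

A global percentile is only one possible choice; a per-cancer cutoff or an absolute expression threshold would change which pairs are flagged.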
