Structural semantic evolutionary distance (SSED) unifies the selection of cancer driver genes across macroevolution and tumorigenesis

Posted
Server: bioRxiv
DOI: 10.64898/2025.12.17.694808

The non-random, site-specific enrichment of somatic mutations in cancer driver genes (CDGs) suggests their emergence is governed by underlying evolutionary constraints. However, quantitative methods to define these constraints and their underlying principles remain underexplored. To address this, we introduce the Structural Semantic Evolutionary Distance (SSED), a metric leveraging the pretrained ESM-3 protein language model to quantify evolutionary divergence within a unified structural semantic space. Our analysis demonstrates that CDGs are subject to persistent structural semantic constraints across species, tolerating a significantly narrower range of structural semantic changes during evolution compared to non-CDGs. Crucially, clinically observed oncogenic mutations follow this same principle, favoring minimal structural perturbation as shaped by long-term gene evolution. Such mutations maintain core protein function while conferring a capacity for immune evasion, thereby driving clonal expansion. Guided by this “evolutionary constraint” framework, we successfully predicted and experimentally validated a previously uncharacterized oncogenic mutation, KRAS R135L, in bronchial epithelial cells. Furthermore, clinical cohort analysis demonstrated that SSED acts as an independent predictor of response to immune checkpoint blockade, offering information orthogonal to tumor mutational burden (TMB). This study unifies the evolutionary principles governing CDGs across macroevolutionary and microevolutionary (tumorigenesis) timescales, elucidates the balance between structural adaptability, functional conservation, and immune pressure, and identifies a novel predictive biomarker for cancer immunotherapy.
