Cis-regulatory elements (CREs) contain key regulatory components that fine-tune gene expression patterns. The manuscript by Hale et al. addresses how CREs evolve during adaptive divergence in plants. While many models have been proposed to explain CRE evolution, the authors focus on two contrasting hypotheses: one proposes that adaptive divergence involves shifts in transcription factor (TF) binding preference (trans changes), while the other suggests that TF preferences remain conserved, but the distribution of their binding sites (TFBSs) evolves (binding site turnover). The authors refer to the latter as the ‘stable code, variable sites’ hypothesis. Grasses serve as a suitable model for testing this hypothesis due to the conservation of gene content and collinearity despite significant alterations in ploidy, genome size, and regulatory patterns. The authors effectively describe the extent of cis-regulatory conservation in unmethylated regions across five model grasses and develop novel computational methods to explore how turnover in TF motifs might be related to ecological niche transitions across 589 grass species. While this study represents a major advance in our understanding of cis-regulatory evolution, we have a few concerns and questions that may help improve the interpretation of the findings. It is worth noting that the authors themselves raise some of these concerns in their discussion.
1. The authors used OrthoFinder to construct orthogroups for 32 representative genomes. OrthoFinder is not synteny-aware and could miss true orthologs that synteny-based approaches would detect. Genomic rearrangements, including duplications and deletions, differ among genomes and challenge the effectiveness of this approach. Inferring orthologs from genomic neighborhoods can reveal high-confidence 1:1 orthologs (Conover, Sharbrough, and Wendel 2021; Ludwig and Mrázek 2024). Additionally, the authors use BLAST to retrieve homologous sequences for predicted ancestral proteins from hundreds of short-read genome assemblies. This approach relies on the quality of ancestral sequence reconstruction for each gene family, which may be compromised for fast-evolving protein sequences, as is often the case for families involved in processes such as defense and reproduction. In light of this, might the “low-conservation genes” (Fig S10) be an artifact of orthology assignment? Orthogonal homology assignment methods might resolve this.
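As a rough illustration of the neighborhood-based idea in point 1, the sketch below scores a candidate ortholog pair by how many orthogroups are shared among the genes flanking each member. It is a toy example under our own assumptions, not the method used by the authors, OrthoFinder, pSONIC, or OrthoRefine; the function names, the window size `k`, and the gene and orthogroup labels are all hypothetical.

```python
def neighbors(gene_order, gene, k):
    """Return up to k genes on each side of `gene` in an ordered chromosome list."""
    i = gene_order.index(gene)
    return gene_order[max(0, i - k):i] + gene_order[i + 1:i + 1 + k]


def synteny_support(order_a, order_b, gene_a, gene_b, orthogroup, k=5):
    """Count orthogroups shared between the flanking windows of two genes.

    `orthogroup` maps gene IDs to orthogroup IDs (hypothetical labels here).
    A higher count means the two genes sit in more similar neighborhoods.
    """
    og_a = {orthogroup[g] for g in neighbors(order_a, gene_a, k) if g in orthogroup}
    og_b = {orthogroup[g] for g in neighbors(order_b, gene_b, k) if g in orthogroup}
    return len(og_a & og_b)


# Toy data: genome B carries two members of OG3 (b3 and a relocated copy b9).
order_a = ["a1", "a2", "a3", "a4", "a5"]
order_b = ["b1", "b2", "b3", "b4", "b5", "b9"]
orthogroup = {
    "a1": "OG1", "a2": "OG2", "a3": "OG3", "a4": "OG4", "a5": "OG5",
    "b1": "OG1", "b2": "OG2", "b3": "OG3", "b4": "OG4", "b5": "OG5",
    "b9": "OG3",
}

score_b3 = synteny_support(order_a, order_b, "a3", "b3", orthogroup, k=2)
score_b9 = synteny_support(order_a, order_b, "a3", "b9", orthogroup, k=2)
print(score_b3, score_b9)
```

In this toy case the copy that retains its ancestral neighborhood (b3) scores higher than the relocated copy (b9), which is the kind of signal a sequence-only orthogroup assignment cannot use.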
2. In theory, cis-regulatory changes can only accompany phenotypic changes when they drive changes in gene expression. The authors provide a compelling argument that selection might constrain motif turnover (Fig 3A) and that variable motif abundance is associated with ecological niche transitions (Figs 4, 5), but do these constraints have an impact on gene expression? The mere presence or absence of proximal upstream TFBSs may not fully explain the expression pattern of the associated gene(s). Multi-modal (RNAseq and ATACseq) data would be needed to confirm these associations. For example, evaluating whether HSF-GARP motif abundance influences steady-state or cold-inducible abundance of OG0018131 transcripts (Fig 5) in a subset of grasses would provide functional support for the authors’ claims.
3. While the authors argue for a “stable code, variable sites” model of expression evolution, the conclusion appears to rest heavily on the conservation of transcription factor binding sites in a limited set of representative grasses (Fig 1). The study then focuses exclusively on the 377 motifs most conserved across the selected five genomes. We wonder if this may have biased the conclusion regarding stable code and variable sites. To address this, the study could include a global assessment (across 589 grass species) of the least locally conserved motifs (the other 137 motifs not conserved across the selected five genomes). However, this raises another challenge: identifying biologically relevant motifs in non-model organisms. Multi-modal tests (ATACseq and RNAseq) do not scale to this many species and would require higher-quality genome sequences. Additionally, although the authors acknowledge this limitation, they do not present evidence that would rule out a significant role for trans-regulatory evolution, raising the possibility that their conclusion is biased toward cis-regulatory evolution and is not a comprehensive test of the different regulatory mechanisms.
4. The comparative study focused exclusively on the 500 bp upstream of genes’ translation start sites, which simplifies the assumptions of gene regulation to gain a general understanding of evolutionary context. However, CREs can be located much farther upstream and even downstream of genes. There could additionally be lineage-specific differences where motifs are conserved, but their location upstream or downstream becomes more relevant in gene regulation. Deeper sequencing of the genomes of non-model organisms could help address this limitation. Moreover, the authors cite the ‘higher proportion of TEs beyond 500 bp upstream’ as the reason for considering only 500 bp upstream in their analysis. Since TEs can be a source of CRE evolution, particularly in environmental adaptation, we believe that some of the de novo CREs involved in niche transitions may have been overlooked.
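To make the window-size concern in point 4 concrete, the sketch below computes strand-aware coordinates for the region upstream of a translation start site, so that a 500 bp window can be compared with a wider one. It is a minimal illustration under our own assumptions (0-based, half-open coordinates; a hypothetical `upstream_window` helper; made-up gene coordinates) and does not reproduce the authors’ pipeline.

```python
def upstream_window(cds_start, cds_end, strand, length, chrom_len):
    """Return 0-based, half-open (lo, hi) coordinates of the upstream region.

    cds_start/cds_end bound the coding sequence on the forward strand.
    On "+" the translation start is at cds_start, so upstream lies to the left;
    on "-" it is at cds_end, so upstream lies to the right.
    The window is clipped to the chromosome boundaries.
    """
    if strand == "+":
        return max(0, cds_start - length), cds_start
    if strand == "-":
        return cds_end, min(chrom_len, cds_end + length)
    raise ValueError(f"unknown strand: {strand!r}")


# Hypothetical gene on a 10 kb contig, evaluated at two window sizes.
for size in (500, 2000):
    plus = upstream_window(1000, 2000, "+", size, 10_000)
    minus = upstream_window(1000, 2000, "-", size, 10_000)
    print(size, plus, minus)
```

Motif counts could then be tallied separately for each window size to check whether conclusions drawn from the proximal 500 bp hold when more distal sequence is included. Note that the 2000 bp window on the forward strand is clipped at the contig start, which is a frequent situation in fragmented short-read assemblies.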
Overall, the study represents a unique and scalable way of exploring the conservation of CREs using advanced computational modelling and large genomic datasets. While the conclusions reported herein rely heavily on the quality of genome assemblies and the rigor of analytical pipelines, we believe this study represents a conceptual advance in our understanding of expression evolution. As the feasibility of generating orthogonal transcriptome and chromatin accessibility datasets increases, these findings can be functionally validated.
Acknowledgement
This preprint was discussed in the BPSC 240 course offered by Sunil Kenchanmane Raju in the Department of Botany and Plant Sciences at the University of California, Riverside, in Spring 2025. The authors thank the participants of the course for the detailed discussions, particularly Angel Morris, Skyler Wong, Wesley George, and Simoné Murguia.
References
Conover, Justin L., Joel Sharbrough, and Jonathan F. Wendel. 2021. “PSONIC: Ploidy-Aware Syntenic Orthologous Networks Identified via Collinearity.” G3 (Bethesda, Md.) 11 (8). https://doi.org/10.1093/g3journal/jkab170.
Ludwig, J., and J. Mrázek. 2024. “OrthoRefine: Automated Enhancement of Prior Ortholog Identification via Synteny.” BMC Bioinformatics 25 (1): 163.
The authors declare that they have no competing interests.