PREreview of Highly multiplexed design of an allosteric transcription factor to sense novel ligands

Published
DOI: 10.5281/zenodo.11247730
License: CC BY 4.0

Summary

In this paper, the authors present Sensor-seq, a platform for creating allosteric transcription factor (aTF) biosensors with high sensitivity and at scale. By starting from a promiscuous aTF, TtgR, the platform enables evolution of aTFs that bind ligands structurally dissimilar from the native ligand. The authors validate the method with a pilot library before completing a larger screen against seven ligands. They then perform secondary and tertiary cell-based fluorescent screens to identify high-performing TtgR variants with affinity for new ligands. In their analysis, the authors identify amino acid preferences among hits that are sensitive to the ligands tested. The authors also solve a crystal structure of one of the variants bound to its ligand, naltrexone, which provides some structural information about which interactions enable specific binding. Lastly, the authors engineer cell-free biosensors for two ligands from the Sensor-seq screen, naltrexone and quinine, based on TtgR hits for those ligands.

The authors of this paper succeed in combining multiple tools to create a straightforward and effective pipeline for generating aTF biosensors for non-native ligands. Specifically, their implementation of FuncLib [1] clearly produces a large pool of TtgR variants capable of binding non-native ligands while also retaining their allosteric function. Additionally, their library construction involving aTF variants in cis with random barcodes enables rapid testing of many variants with multiple ligands in parallel. The characterization of amino acid preferences for each ligand provides a wealth of information for validating structural contributors to the activation of the aTF with each ligand. The provided crystal structure gives insight into the structural contributors of a TtgR mutant binding to naltrexone, and validates the rationale around choice of TtgR as a starting point. Furthermore, the cell-free platform demonstrates the practical utility of the hits found in the Sensor-seq pipeline, with clear rationale behind the choice to build biosensor platforms for the selected ligands.

While the paper develops a significant methodological advance, the authors could provide more rationale behind the choices in their analysis. Additionally, the section on amino acid preferences for given ligands based on the clustering analysis would be more convincing if the authors reasoned more about the preferences of each ligand given the ligand structure and knowledge about the binding site of TtgR. Overall, this paper provides proof of concept for a pipeline that should enable generation of aTF biosensors for a range of biological and synthetic small molecules.  

Major points

  • While the authors do well to solve a structure of an important variant, we think including more structural rationale behind the mutations mentioned would greatly support the amino acid preference analysis. For example, considerations about how amino acid preferences relate to stability in the binding site could be corroborated by referencing the crystal structure section where the authors discuss Met-Aro interactions. Including connections between the crystal structure and the mutations from the heatmap would also support discussion of important allosteric interactions. In the reviewers' opinion, the analysis in the "Unsupervised learning reveals key residues for ligand specificity" and "Crystal structure reveals protein-stabilizing motifs for a non-native ligand interaction" sections would benefit from a unifying focus on the structural contributions of the different amino acid preferences identified in Figure 3, mapped onto the crystal structure.

    • Limiting the amino acid preferences discussion to the differences between clusters that rank highly for naltrexone would aid in focusing the discussion and establishing clear rationale for the clustering analysis. Swapping Figures 3 and 4, and their corresponding analyses, could also help unify the claims made in both sections. Additionally, marking the relevant mutation positions on the crystal structure with naltrexone bound would benefit the analysis corresponding to Figure 3.

  • Figure 3 presents unsupervised clustering of variants based on amino acid physicochemical embeddings, a dot plot showing the enrichment of different clusters for binding to different ligands, and a heatmap showing the top variants for each cluster and the measured enrichment for specific mutations. In the text, the section referencing Figure 3 (“Unsupervised learning reveals key residues for ligand specificity”) claims that “these results highlight the power of Sensor-seq to provide a holistic view of the mutational adaptability of TtgR to accommodate diverse ligands, and the key determinants of specificity for each ligand.”  We think that revising the figure to directly target the parts of the clustering analysis that support the claims made in the relevant section would improve the interpretability of this figure. 

    • Many of the text references are to specific sections of the heatmap in Figure 3c. Since the text discusses the rationale behind the enrichment scores, it would be helpful to use the figure to support that rationale by including only the relevant portion of the heatmap along with a structure mapping the scores, or by directly highlighting the comparison of different variants in the clustering analysis. For example, the full heatmap could be moved to the supplementary material, while the specific relevant portions discussed in the text (e.g. mutations at 68, 78, and 110 interacting with certain ligand hydroxy groups) could be highlighted along with a visualization of the structural rationale.

    • The authors’ use of unsupervised learning to establish a way to analyze the physicochemical differences between variants is valuable, but we found the UMAP-then-HDBSCAN analysis in Figure 3a, and its relation to the physicochemical rationale in the following panels, hard to follow. Given the crystal structure shown in Figure 4, it seems reasonable to focus this analysis by providing structural context.

Minor Points

  • Understanding the distinction between the secondary and tertiary screens was somewhat tricky on a first read. A simple graphical cartoon in the supplementary figures showing the protocol for the tertiary clonal screen may be useful. If space allows, also consider moving these two graphical representations into a main figure. 

  • Colors used in Figure 2 could be kept more consistent. In 2c, blue represents a drop in efficiency, while in 2e darker blue is used to show high levels of efficiency.

  • Similarly, it would be helpful to maintain a consistent color scheme between 5b/c and 5d/e.

  • The analysis of barcode coverage bootstrapping (Figure 2g) may fit better in the supplement.

  • Supplementary Figure 8 and Figure 9b use flipped axes relative to one another, which caused us some confusion. Consider keeping these consistent.

  • In Figure 3a, we found the wording unclear about whether HDBSCAN is applied to the UMAP dimensions or to some distance calculated from the physicochemical features. We presume the former, and this would be clearer if the previous two sentences about the feature calculations were placed before the mention of UMAP. The methods section (“Unsupervised learning of ligand-agnostic dataset”) is also not precise about what is done, as it refers to “a combination of UMAP and HDBSCAN” and mentions only one set of hyperparameters (though the other set of hyperparameters is in Supp. Fig. 15).

  • The lack of contrast between cluster colors in 3a makes it hard to distinguish which is which at first glance. Coloring clusters with contrasting colors adjacent to each other would help.

  • In Figure 3b, as there are no F-scores represented that are below 1.5 according to the figure description, consider changing the range of the colorbar used.

  • We find it unclear why the variant 3A7 is chosen for crystal analysis over the other variants for naltrexone. Is it because 3A7 is a quadruple mutant? This could be clarified with a textual edit. If it was due to crystallization, this could also be clarified.

References

  1. Khersonsky, O.; Lipsh, R.; Avizemer, Z.; Ashani, Y.; Goldsmith, M.; Leader, H.; Dym, O.; Rogotner, S.; Trudeau, D. L.; Prilusky, J.; Amengual-Rigo, P.; Guallar, V.; Tawfik, D. S.; Fleishman, S. J. Automated Design of Efficient and Functionally Diverse Enzyme Repertoires. Mol. Cell 2018, 72 (1), 178-186.e5. https://doi.org/10.1016/j.molcel.2018.08.033.

Competing interests

The authors declare that they have no competing interests.