Review of “Predicting Relative Populations of Protein Conformations without a Physics Engine Using AlphaFold2”
The development of machine learning algorithms, most notably AlphaFold2 (AF2), has improved the speed, quality and accuracy of protein structure prediction. A next challenge is to use these approaches to predict alternate conformations and the effects of sequence variants on structure. Considering the ubiquity of functionally significant fold-switching and order-disorder transitions, developing the ability to predict these alternate conformations has the potential to inform the discovery of new drug targets. Similarly, the conformational equilibrium of drug receptors relates to their affinities for drugs, highlighting the importance of predicting the relative population of different conformations.
Previous research has found that subsampling the input multiple sequence alignments (MSAs) in AF2 and increasing the number of predictions can sample alternative structures of the same target protein, even capturing different fold-switching states of known metamorphic proteins. Prior work has also generated conformational ensembles by reducing the max_seq:extra_seq parameter values and used these ensembles as starting points for molecular dynamics (MD) simulations to sample more conformations of interest, such as cryptic ligand binding pockets.
Here, the authors use a similar MSA subsampling approach to discover the alternate conformations of certain proteins and their relative populations using the AF2 pipeline alone, without the need for extensive MD simulations. They demonstrate how subsampling the MSA by modulating the max_seq:extra_seq parameters can generate ensembles of protein conformations whose relative populations correlate with experimental knowledge. They test AF2’s capacity to predict differences in conformer populations with two example proteins: the Abl1 tyrosine kinase core and granulocyte-macrophage colony-stimulating factor (GMCSF). With Abl1, they found that AF2 can qualitatively predict the effects of mutations on active state populations of kinase cores with up to 80% accuracy. They also found that their method predicted most of the activation loop intermediate states in the active-to-inactive transition of the kinase core, performing comparably to predictions obtained from multi-microsecond MD simulations. Despite the paucity of sequence data for GMCSF compared with Abl1, they were able to predict the extent of variation in backbone dynamics among GMCSF variants, which allowed them to conclude that AF2’s prediction engine could decode population signals from relatively scarce data. Overall, the results are very interesting and encouraging, and the manuscript is well written. We have the following points which we feel, if addressed, could make this manuscript stronger.
The MSA subsampling approach that the authors have adapted in this work has been used by others previously (as cited by the authors themselves), albeit with some modifications. It is therefore important to see whether existing methodologies, for instance the DBSCAN-based clustering and MSA subsampling by Wayment-Steele et al., are able to predict these relative state populations of variants. Also, the optimization of max_seq:extra_seq requires quite a bit of pre-existing experimental information. How would this method be applied to a new system for which little experimental information is available? The authors could also provide some guidelines on how the max_seq:extra_seq values to be sampled are chosen and, more generally, comment on the hyper-parameter space of their approach and how it compares to other schemes.
Apart from the large change in the A-loop from the active to the inactive state in Abl kinase, the other important structural change involves the 𝛼C helix moving out (as shown in Reference 22 cited in the preprint). The authors have not discussed this aspect. The snapshots shown from enhanced MD do not seem to show this change either (upon visual examination of the snapshots shown in the figures). Hence, the biological relevance of the MD simulation becomes questionable. Does the AF2 subsampled ensemble reflect the change in the helix position?
The authors have not performed statistical analyses on the RMSD comparisons or the CSP comparisons of GMCSF to establish whether the differences are significant. For example, the authors say their approach has worked “as the range of the distribution of RMSDs of residues 80-90 and 110-125 is significantly larger for most of the mutations tested at both of these sites”. What is this distribution of RMSD compared against? Are these differences statistically significant?
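As one illustration of the kind of analysis we have in mind, the sketch below runs a one-sided permutation test on the difference in RMSD range between two ensembles. It is only a minimal example under our own assumptions: the function names are hypothetical, the choice of the range as the test statistic simply mirrors the wording of the claim, and the RMSD values are made-up placeholders, not data from the preprint.

```python
import random


def value_range(values):
    """Range (max minus min) of a list of per-prediction RMSD values."""
    return max(values) - min(values)


def permutation_test_range(reference, variant, n_permutations=10000, seed=0):
    """One-sided permutation test on the difference in RMSD range.

    Null hypothesis: the two sets of RMSDs are exchangeable, so the range of
    `variant` is no larger than that of `reference` beyond chance.
    Returns (observed difference in range, p-value).
    """
    rng = random.Random(seed)
    observed = value_range(variant) - value_range(reference)
    pooled = list(reference) + list(variant)
    n_ref = len(reference)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = value_range(pooled[n_ref:]) - value_range(pooled[:n_ref])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Placeholder RMSD values (in Angstrom) for illustration only; NOT from the preprint.
wild_type_rmsd = [0.80, 0.90, 1.00, 1.10, 0.95, 1.05, 0.85, 1.00]
mutant_rmsd = [0.50, 1.90, 1.00, 2.60, 0.70, 3.10, 1.40, 2.20]

observed_diff, p = permutation_test_range(wild_type_rmsd, mutant_rmsd)
print(f"difference in range: {observed_diff:.2f} A, permutation p-value: {p:.4f}")
```

A test of this kind would make explicit which reference distribution each variant is compared against. Because many variants are tested, a correction for multiple comparisons would also be appropriate.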
Given that GMCSF has very limited sequence data in its MSA to start with, does MSA subsampling actually help? The authors could run predictions using the traditional AF2 pipeline and compare those distributions against their approach.
Although the authors are right in looking for only the ground and I2 states in Abl kinase predictions, it would be interesting to explore whether any predictions matched the I1 state and, if not, to speculate more extensively as to why.
The data on some of the max_seq:extra_seq optimizations discussed for Abl kinase are missing, for example 512:8 or 8:1024.
There is no citation provided for the single and double mutants whose relative ground state populations were tested for Abl kinase.
The nature of these mutations in Abl kinase is not discussed. Are some of these mutations pathogenic or drug-resistant? It would be interesting to correlate the nature of each mutation with its structural effects. The authors could provide more background on how these mutations were identified and add more discussion of trends.
What is the rationale behind choosing the PCs mentioned by the authors for Abl kinase enhanced sampling?
Why have the authors not shown the RMSD distribution of Distance 2 in Figure 4C?
How were the mutations on the histidine triad of GMCSF chosen?
Sebollela et al. 2005 (https://pubmed.ncbi.nlm.nih.gov/16027123/), which is not cited in this paper directly but is cited in one of the papers that it cites (Cui et al. 2020 - https://doi.org/10.1021/acs.biochem.0c00538), substitutes H15 with alanine to demonstrate a decrease in heparin affinity.
For the GMCSF system, do the authors see a relationship between the pLDDT scores and the extent of RMSD?
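One simple way to examine this would be a rank correlation between per-residue pLDDT and per-residue RMSD across the ensemble. The sketch below computes a Spearman correlation using only the Python standard library. The function names are our own, and the input values are made-up placeholders rather than data from the preprint.

```python
def average_ranks(values):
    """1-based ranks of `values`; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder per-residue values for illustration only; NOT from the preprint.
plddt = [92.0, 88.5, 75.0, 60.2, 81.3, 55.0]
rmsd = [0.4, 0.6, 1.8, 2.9, 1.1, 3.5]

print(f"Spearman correlation (pLDDT vs RMSD): {spearman(plddt, rmsd):.3f}")
```

A strongly negative coefficient would suggest that low-confidence residues account for most of the structural variation, which would help in interpreting the RMSD distributions.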
Prior work that uses AF2 to sample conformational ensembles has seen that AF2 is able to predict more diverse conformations when the protein is not part of AF2’s training dataset. Was GMCSF part of the training dataset? If yes, how would the authors’ approach vary for a protein that is not part of the training dataset?
Some of the figures are not informative or important enough to be main figures. For example, Figure 2 mainly shows the AF2 pipeline, and Figure 5 is just a pictorial representation of Supplementary Table S1. Also, Figures 6 and 7 could be combined into a single figure.
The CSP data for H15N are not shown in Figure 9B, whereas its RMSD is shown in Figure 9C.
The cut-off values used for jackhmmer are not mentioned.
Residues are referred to as codons in some places in the text.
The authors may also want to include a few sentences contrasting their approach with this recently posted work: https://www.biorxiv.org/content/10.1101/2023.08.06.552168v1 in the introduction or discussion.
Review written by Ashraya Ravikumar and Sonya Lee with input from other Fraser Lab members at UCSF
The authors declare that they have no competing interests.