PREreview of AF2χ: Predicting protein side-chain rotamer distributions with AlphaFold2

by Stephanie A. Wankowicz and 1 other author

Published: August 5, 2025
DOI: 10.5281/zenodo.16746471
License: CC BY 4.0

Summary

This paper introduces a new method to predict the distribution of protein side chain conformations from structures generated by AlphaFold2. While ‘pseudoensembles’, or sets of structures with highly similar sequences, can provide the distribution of side chain conformations, the authors concluded that at least 20 structures are needed to establish a ‘good enough’ distribution, so this approach is not available for every protein.

The authors therefore asked whether AlphaFold2 itself could predict side chain distributions. Their method, AF2χ, leverages the inner layers of AlphaFold2 to predict the distributions of side chain conformations. They then use Bayesian/maximum entropy (BME) reweighting of the inner AlphaFold2 predictions with existing rotamer libraries to refine the predicted side chain distributions.
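For readers unfamiliar with BME reweighting, the idea can be sketched in a few lines. This is a generic, single-observable illustration and not the authors' implementation: the function name `bme_reweight`, the three-state toy prior, the indicator observable, and the values of `sigma` and `theta` are all assumptions made for this example. It minimizes the usual BME dual function, log Z(λ) + λ·target + θσ²λ²/2, with Newton steps.

```python
import numpy as np

def bme_reweight(prior, obs, target, sigma, theta, n_iter=50):
    """Reweight discrete states so the average of `obs` moves toward `target`.

    Minimizes gamma(lam) = log Z(lam) + lam*target + 0.5*theta*sigma**2*lam**2,
    where Z(lam) = sum_i prior_i * exp(-lam * obs_i). Hypothetical helper.
    """
    prior = np.asarray(prior, dtype=float)
    prior = prior / prior.sum()
    obs = np.asarray(obs, dtype=float)
    lam = 0.0
    for _ in range(n_iter):
        w = prior * np.exp(-lam * obs)
        w /= w.sum()
        mean = np.dot(w, obs)                 # current reweighted average
        var = np.dot(w, (obs - mean) ** 2)    # its variance (curvature of log Z)
        grad = target - mean + theta * sigma**2 * lam
        hess = var + theta * sigma**2
        step = grad / hess
        lam -= step
        if abs(step) < 1e-12:
            break
    w = prior * np.exp(-lam * obs)
    return w / w.sum()

# Toy example (made-up numbers): three rotamer states, with a target
# population of 0.5 for the second state.
prior = [0.5, 0.3, 0.2]
is_second_state = [0.0, 1.0, 0.0]
weights = bme_reweight(prior, is_second_state, target=0.5, sigma=0.05, theta=1.0)
print(weights)
```

A larger `theta` keeps the result closer to the prior, while a smaller one trusts the target more; how the prior and target are defined in AF2χ is described in the manuscript itself.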

The authors tested their model's predictions in several ways. Side chain distributions were compared against nuclear magnetic resonance (NMR) J-couplings, molecular dynamics simulations, and S2-axis methyl NMR data, and were then benchmarked against new generative artificial intelligence methods: BioEmu with HPacker, aSAMt, MDGen, and SeqDance.

The authors’ main finding is that Bayesian reweighting monotonically improves the model’s accuracy, as measured by correlation with NMR data. Ultimately, the new model performs at least as well as rotamer libraries and current prediction methods, and sometimes better than traditional molecular dynamics simulations, when compared to side chain ensemble distributions determined by NMR. However, the details and comparisons within the paper are relatively light. While the authors are limited by the availability of well-studied systems, the lack of quantitative data and of comparisons with a larger set of side chain conformational distributions limits what can be concluded about how widely applicable the findings are and about their impact.

Major Revisions (please note, we used the bioRxiv version to provide line numbers)

The claims of superior performance of AF2χ require stronger quantitative backing. This is especially important when discussing differences between the AF2χ prior and AF2χ. In many plots these look comparable. See lines 201 through 207, lines 230 through 232, lines 272 through 273, lines 285 through 287, lines 298 through 310, lines 329 through 337, lines 338 through 361, and figures 3 through 6.
The null model is presented only for S2-axis methyl data; developing a null model across all evaluated models would be beneficial, especially when comparing JS-divergence and J-couplings, to determine whether AF2χ is better than expectation. See lines 231 through 240, lines 286 through 291, figure 5, and supplemental table 4.
The authors only use globular proteins when testing side chain ensemble distributions. While the level of validation via NMR and MD is commendable, testing on more types of proteins, such as membrane proteins, is important for helping to classify the differences in side chain dynamics across proteins. All analyses should be rerun for other types of proteins, such as GPCRs. See lines 221 through 240 and lines 243 through 277.
Qualitatively explaining the structural differences between the ensembles from HSP and AF2χ will help readers understand the value the model provides. For example, do you see differences between buried and exposed side chains, between amino acid types, or for side chains that sample multiple rotamer wells in HSP? See lines 292 through 310.
While we understand that the authors are limited in the comparisons they can make to good NMR metrics, there was good correlation between HSPs with more than 20 members and NMR metrics, and many more HSPs with 20+ members could be created. We therefore encourage the authors to do a larger comparison to determine how widely AF2χ can pick up on side chain heterogeneity.
While the authors briefly comment on timescale in comparison with methyl order parameters, we think this point deserves a second line in the conclusions. They are clear in stating that they are looking at side chain heterogeneity, but specifying that each method probes different but overlapping timescales would emphasize for downstream users that the method describes heterogeneity, not dynamics.
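To make the null-model request for JS-divergence concrete, one possible construction is sketched below. It is an assumption on our part rather than the procedure used in the manuscript: predicted rotamer distributions are shuffled across residues, and the observed mean JS-divergence is compared with the shuffled values. The function names and the toy populations are hypothetical.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, range 0 to 1) of two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def mean_js(pred, ref):
    """Mean per-residue JS-divergence between predicted and reference distributions."""
    return float(np.mean([js_divergence(p, r) for p, r in zip(pred, ref)]))

def shuffled_null(pred, ref, n_perm=200, seed=0):
    """Mean JS-divergence after randomly reassigning predictions to residues."""
    rng = np.random.default_rng(seed)
    pred = np.asarray(pred, dtype=float)
    n = len(pred)
    return np.array([mean_js(pred[rng.permutation(n)], ref) for _ in range(n_perm)])

# Toy data (made up): 4 residues x 3 rotamer states.
ref = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.4, 0.4, 0.2]])
pred = np.array([[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.5, 0.3, 0.2]])

observed = mean_js(pred, ref)
null = shuffled_null(pred, ref)
fraction_as_good = float(np.mean(null <= observed))
print(observed, null.mean(), fraction_as_good)
```

In practice the shuffle would probably need to be restricted to residues of the same amino acid type, since rotamer states are not comparable across types.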

Minor Revisions

We suggest citing Wankowicz et al., 2022 (eLife) instead of Wankowicz & Fraser, 2024 when talking about the impact of side chain heterogeneity with ligand binding. See line 34.
Please add a citation to Karplus, 1963, “Vicinal Proton Coupling in Nuclear Magnetic Resonance”, when referring to the Karplus equation in the HSP section. See lines 107 through 108.
In the model parameterization section, where Fig S3 is referenced, please clarify that the decoy structure is AF2 in single sequence mode. See lines 173 through 181.
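For reference, the general form of the Karplus relation mentioned in the minor revisions is given below. Here θ is the dihedral angle between the coupled nuclei and A, B, and C are empirical coefficients that depend on the nuclei involved. The second expression, an average over discrete rotamer states k with populations p_k, is our assumption about how a rotamer distribution would be mapped to a J-coupling and may differ from the manuscript's exact procedure.

```latex
{}^{3}J(\theta) = A\cos^{2}\theta + B\cos\theta + C,
\qquad
\langle {}^{3}J \rangle = \sum_{k} p_{k}\, {}^{3}J(\theta_{k})
```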

Competing interests

The authors declare that they have no competing interests.
