PREreview of AF2χ: Predicting protein side-chain rotamer distributions with AlphaFold2

by Daniel Keedy

Published: September 3, 2025
DOI: 10.5281/zenodo.17049067
License: CC BY 4.0

Overview:

This is an innovative manuscript that explores how the inner workings of the AlphaFold2 deep learning algorithm can be exploited to predict protein side-chain dihedral-angle (torsion-angle) distributions and subsequently to construct atomistic ensemble-style models consistent with those distributions. The authors have developed an approach called AF2chi that, on a per-residue and per-chi basis, “mixes” a discrete chi prediction from traditional AF2 output with Top8000 distributions from many crystal structures of many proteins, then reweights the resulting chi distribution using an average of discrete chi angle predictions from AF2’s black-box-esque “inner layers”. The method then uses these distributions for most chi angles (see comments below about chi1-2 vs. chi3-4) to construct an atomistic ensemble of the full protein (100 distinct models) that reflects the reweighted chi distributions, with some accompanying small coordinate shifts to accommodate them (see comments below about clashes and backbone shifts).

The authors show that the chi angle distributions from their method are consistent with various experimental and computational points of reference including NMR 3J coupling data, NMR order parameters, “HSP” ensembles of closely related crystal structures, and MD simulations. The comparisons span various protein systems depending on what types of data are available for each, as appropriate. The method does not require an extensive multiple sequence alignment and can work equally well with only a single sequence as input. Notably, AF2chi is orders of magnitude faster than MD simulations, which are comparatively expensive, yet achieves similar results by these metrics.

Together, these qualities make AF2chi an attractive solution to modeling side-chain conformational heterogeneity for any protein! Still, we note a few caveats and areas for improvement below.

Major Comments:

Our biggest area of criticism relates to the generation of the structural ensemble, which is not the focus of most of the manuscript. How are chi1-2 values “sampled” for this stage? Is it random from the reweighted distributions, independently for each residue and for each chi (chi1 and chi2) for each residue -- or in some other, perhaps more principled way? This question is key because the answer has much to say about what information the resulting ensemble contains about coupling between spatially adjacent residues. It seems clear that the AF2 outer layer, and inner layers for that matter, contain significant information about how the chi angles of spatially adjacent residues are coupled (otherwise presumably the protein interior could not be packed in a satisfactory way by AF2!). However, it is less clear how much of this valuable information remains in the final structural ensemble, after the AF2 sequence-specific information is mixed with the Top8000 sequence-independent information.
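
To make the concern concrete, here is a minimal sketch of what is lost if chi angles are drawn independently per residue. It is our own illustration and not the authors' sampling procedure: the joint chi1 rotamer table for two contacting residues is hypothetical, as are the function names. The sketch compares the mutual information of a coupled joint distribution with that of the product of its marginals, which is what independent per-residue sampling would reproduce.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in nats) of a 2D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    p_a = joint.sum(axis=1, keepdims=True)  # marginal of residue A
    p_b = joint.sum(axis=0, keepdims=True)  # marginal of residue B
    indep = p_a * p_b
    mask = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))


def independent_model(joint):
    """Joint table implied by sampling each residue from its own marginal."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    return np.outer(joint.sum(axis=1), joint.sum(axis=0))


# Hypothetical joint chi1 distribution for two residues in contact.
# Rows: residue A in (g-, t, g+); columns: residue B in (g-, t, g+).
coupled = np.array([
    [0.45, 0.03, 0.02],
    [0.03, 0.25, 0.02],
    [0.02, 0.02, 0.16],
])

independent = independent_model(coupled)
mi_coupled = mutual_information(coupled)
mi_independent = mutual_information(independent)

print(f"MI with coupling retained:       {mi_coupled:.3f} nats")
print(f"MI after independent sampling:   {mi_independent:.3f} nats")
```

Both tables have identical per-residue marginals, so any validation that looks only at per-residue chi distributions cannot distinguish them. Only a pairwise analysis of the final ensemble would reveal whether coupling survives.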

Relatedly, we were surprised to read the following: "In principle, correlations between side-chain conformations could be captured in the structure generation step of AF2χ, as structures with clashes are rejected; in practice we find that the acceptance rate of structures is close to 100%. This suggests that any remaining steric clashes may be resolved when the backbone structure is relaxed, in line with previous observations that small backbone movements help decouple side-chain motions (Davis et al., 2006; DuBay et al., 2011)." This raises a few questions.

First, is it possible that there are actually significant remaining clashes, but the clash detection method employed here is too lenient? This could be examined by varying the clash detection threshold and repeating certain analyses. This could have the additional interesting advantage of helping to dissect degrees of local allosteric coupling based on which clashes are more vs. less easily resolved by relaxation.
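
As an illustration of the kind of threshold sweep we have in mind, the sketch below counts atom pairs whose van der Waals overlap meets a given cutoff, for several cutoffs. This is a simplified stand-in written by us, not the clash criterion used in AF2chi, which we do not know. The toy coordinates, the uniform 1.7 Å radius, and the function name are assumptions. For reference, MolProbity's clashscore counts overlaps of at least 0.4 Å.

```python
import numpy as np


def count_clashes(coords, radii, overlap_cutoff, excluded=frozenset()):
    """Count atom pairs whose vdW overlap is >= overlap_cutoff (in Angstrom).

    overlap = r_i + r_j - distance(i, j)
    `excluded` holds (i, j) index pairs with i < j to skip, e.g. bonded atoms.
    """
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    overlaps = radii[:, None] + radii[None, :] - dists
    i_idx, j_idx = np.triu_indices(len(coords), k=1)  # each pair once
    n_clashes = 0
    for i, j in zip(i_idx, j_idx):
        if (int(i), int(j)) in excluded:
            continue
        if overlaps[i, j] >= overlap_cutoff:
            n_clashes += 1
    return n_clashes


# Toy system: three non-bonded carbon-like atoms (hypothetical coordinates).
coords = [(0.0, 0.0, 0.0), (3.1, 0.0, 0.0), (0.0, 3.3, 0.0)]
radii = [1.7, 1.7, 1.7]

# Stricter (smaller) cutoffs flag more pairs as clashing.
sweep = {cutoff: count_clashes(coords, radii, cutoff) for cutoff in (0.4, 0.2, 0.0)}
for cutoff, n in sweep.items():
    print(f"overlap >= {cutoff:.1f} A: {n} clashing pair(s)")
```

Repeating the ensemble generation while tightening the cutoff in this way would show whether the near-100% acceptance rate is robust or an artifact of a lenient definition.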

Second, is it possible that any initial clashes are indeed resolved, but at the expense of local geometry, which may become strained in terms of rotamericity (rotamer percentile), Ramachandran percentile, Cbeta deviations, bond length/angle deviations, etc.? This would depend on how “aggressive” the relaxation method is, given the tight imposed restraints on chi angles (per the Methods). To address this concern, all of these geometry metrics should be compared using MolProbity for (a) residues that start with a clash pre-relaxation that is resolved by relaxation vs. (b) residues that start with no clashes pre-relaxation.

Third, do initial clashes happen more/less with reweighting vs. without reweighting? (We’re not sure if structural ensembles were generated without reweighting, but this could be attempted.) This could provide insight as to the extent to which the inner layer information about side-chain flexibility contributes to the clashes, independent of / controlling for other aspects of the AF2chi pipeline such as the routine for sampling chi angles for structural ensemble generation.

Fourth, are specific backbone adjustments such as the backrub actually responsible for resolving initial clashes after relaxation? The authors should check their stated hypothesis on this matter for the same sets of (a) vs. (b) residues as listed above.

Minor Comments:

Overall, it is a bit disappointing that AF2chi vs. AF2chi prior (with vs. without reweighting based on the AF2 inner layers) yield similar results in many of the presented analyses. However, to their credit the authors openly acknowledge this, and point out that the reweighting does not typically do any harm in individual cases, even if it may not help much overall / in the aggregate. So there is likely still some (perhaps small) value in this step.

What is the sequence identity threshold for constructing HSP ensembles? Is this an important parameter? Do the results of the comparisons presented here depend on this threshold choice? The answer may also depend on other characteristics of the HSP ensemble, such as the distribution of resolutions, crystal symmetries, etc.

Bootstrap sampling is mentioned a few times, but its meaning in each such context could be better explained.
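
For concreteness, one common usage is a bootstrap over residues: per-residue values are resampled with replacement to put a confidence interval on an aggregate statistic. The sketch below is our own illustration and not the authors' procedure. The function name, the per-residue error arrays, and the toy numbers are all hypothetical. It uses paired resampling so that two methods are always compared on the same residues.

```python
import numpy as np


def paired_bootstrap_ci(err_a, err_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-residue difference err_a - err_b.

    Residues are resampled with replacement; each resample keeps the pairing
    between the two methods' values for the same residue.
    Returns (observed_mean_difference, ci_lower, ci_upper).
    """
    err_a = np.asarray(err_a, dtype=float)
    err_b = np.asarray(err_b, dtype=float)
    if err_a.shape != err_b.shape:
        raise ValueError("err_a and err_b must have the same shape")
    diffs = err_a - err_b
    rng = np.random.default_rng(seed)
    # Each row of idx is one bootstrap resample of residue indices.
    idx = rng.integers(0, len(diffs), size=(n_boot, len(diffs)))
    boot_means = diffs[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(diffs.mean()), float(lower), float(upper)


# Hypothetical per-residue disagreement with a reference ensemble
# (lower is better) for two methods evaluated on the same eight residues.
method_a_err = np.array([0.30, 0.42, 0.25, 0.51, 0.38, 0.29, 0.47, 0.33])
method_b_err = np.array([0.22, 0.40, 0.27, 0.35, 0.31, 0.30, 0.36, 0.28])

mean_diff, ci_lower, ci_upper = paired_bootstrap_ci(method_a_err, method_b_err)
print(f"mean difference = {mean_diff:.3f}, 95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
```

If the interval for the mean difference excludes zero, the two methods differ on this metric at roughly the chosen confidence level. Stating which unit is resampled in each context (residues, proteins, or ensemble members) would resolve the ambiguity.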

Fig. 1g-j are confusing in a few regards. We think there may be an error in the order of inner layer / outer layer for Fig. 1g,h vs. Fig. 1i,j. This is made more confusing by the use of the terms “inner” and “outer” to describe the parts of the circular plots, which are not conceptually related to the inner and outer layers of the algorithm…

Fig. 3c: Why are there apparently 2 distinct clusters?

"For residues that sample multiple χ1-angle free-energy wells in the HSP ensemble, we found that AF2χ provides better agreement with the HSP ensemble than CHARMM36m MD simulations, suggesting that AF2χ can accurately capture the structural heterogeneity of dynamic side chains." They look pretty similar in the plot; is this statistically significant?

What about chi3 and chi4? Are they handled analogously to chi1-2 within AF2chi? The Methods mention that in the ensemble generation step only chi1-2 are sampled, so how are chi3-4 handled there? Are there analogous ground-truth standards to compare against for these dihedral angles farther from the backbone? This is not touched upon in the manuscript, unless we missed it -- but is obviously important for defining full side-chain conformations, including the termini of longer side chains that engage in various important interactions in protein structures.

What would be the equivalent of AF2chi for the protein backbone? Are the backbone dihedral angles phi, psi, and omega handled analogously to side-chain chi angles within AF2, in which case e.g. Top8000 Ramachandran distributions could be used to construct priors per residue? Or would prediction of backbone heterogeneity need to take a different form due to the inner workings of AF2?

(This preprint review stems from a journal club discussion in the Keedy lab at the CUNY Advanced Science Research Center on August 29, 2025.)
