AlphaFold and related methods have captured the imagination of structural biologists and the wider public. Despite the promise of readily calculable models with near-experimental accuracy, AF models are clearly not yet complete substitutes for experimentally refined models, with two obvious limitations:
1) unanticipated/rare/out-of-distribution conformational changes,
2) details of geometry/packing/etc. (see also the recent preprint: https://www.biorxiv.org/content/10.1101/2025.06.30.662466v1)
Both of these limitations have clear ramifications for the long-term hope of improving the AI methods themselves so that they can entirely substitute for experiment across wider applications. As a workaround for the first limitation, many approaches have been introduced to “hack” the multiple sequence alignment (MSA) information that acts as a core distance constraint in AF methods (e.g. clustering, subsampling, noising). The second limitation (and to some extent also the first) can also be worked around by procedures such as the “PredictAndBuild” pipeline introduced previously by a subset of these authors, where externally iterative cycles of prediction and conventional (experimentally driven) model refinement clean up the predictions and increase agreement with experimental data.
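For concreteness, a minimal sketch of the subsampling flavor of these MSA hacks might look like the following (our own illustration, not any specific published pipeline; file names and sampling depth are placeholders, and a simple two-line-per-entry a3m alignment is assumed):

```python
import random

# Minimal sketch of random MSA subsampling (placeholder file names/depth).
# Idea: feeding AF a random subset of the alignment perturbs the
# coevolutionary "distance constraint", letting alternative conformations
# surface across repeated runs with different subsamples.
def subsample_a3m(in_path: str, out_path: str, depth: int = 32, seed: int = 0) -> None:
    with open(in_path) as fh:
        lines = fh.read().splitlines()
    # Assumes (header, sequence) pairs on alternating lines; keep the query
    # (first entry) in every subsample.
    entries = [(lines[i], lines[i + 1]) for i in range(0, len(lines), 2)]
    query, rest = entries[0], entries[1:]
    random.seed(seed)
    picked = random.sample(rest, min(depth, len(rest)))
    with open(out_path, "w") as fh:
        for header, seq in [query] + picked:
            fh.write(f"{header}\n{seq}\n")

subsample_a3m("full_msa.a3m", "subsampled_msa.a3m", depth=32, seed=1)
```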
Here, the major goal is to create a more “end-to-end/internally iterative” solution rather than an externally iterative one. ROCKET introduces a clever way of (likely) disambiguating signals from the MSA by internally and iteratively updating these signals based on their agreement with experimental data. This is very cool and distinct from other approaches that bias sampling based on experimental signals in different ways (e.g. https://arxiv.org/pdf/2406.04239, https://arxiv.org/abs/2502.09372). However, the authors could more directly compare the following:
1) AF prediction +/- conventional refinement
2) external iterative templating of the above (PredictAndBuild) +/- conventional refinement
3) MSA subsampling plus conventional refinement, and possibly iterative templating (PredictAndBuild) +/- conventional refinement
4) internal ‘guided’ MSA features modified by experimental data (ROCKET in Figs. 2/3) +/- conventional refinement
5) MSA subsampling followed by internal ‘guided’ MSA features modified by experimental data (ROCKET in Fig. 4) +/- conventional refinement
Such comparisons would clarify the value of the internal loop relative to MSA subsampling and, more importantly, relative to subsequent conventional refinement. There are also unique tests that could help in understanding the signal propagating back to the MSA: for example, restricting the gradient updates to only portions of the ROCKET weights tensor and comparing performance against the full ROCKET tensor.
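As a concrete illustration of the kind of ablation we mean, here is a toy PyTorch sketch (our construction, not the authors' code): a small frozen predictor stands in for the AF trunk, an MSE term stands in for the experimental-data loss, and a binary mask restricts which entries of the learnable MSA-bias tensor receive gradient updates:

```python
import torch

# Toy illustration of masked gradient updates to an MSA-bias tensor. In
# ROCKET the predictor would be a frozen AF2 trunk and the loss an
# experimental-data target (crystallographic/cryo-EM); here a linear layer
# and an MSE term stand in so the sketch is self-contained.
torch.manual_seed(0)
predictor = torch.nn.Linear(64, 3)              # stand-in for frozen AF trunk
for p in predictor.parameters():
    p.requires_grad_(False)
msa_feats = torch.randn(128, 64)                # fixed MSA-derived features
target = torch.randn(128, 3)                    # stand-in experimental signal

bias = torch.zeros_like(msa_feats, requires_grad=True)  # learnable MSA bias
mask = torch.ones_like(msa_feats)
mask[:, 32:] = 0.0   # ablation: allow updates in only part of the bias tensor

opt = torch.optim.Adam([bias], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    pred = predictor(msa_feats + bias)          # forward through frozen model
    loss = torch.nn.functional.mse_loss(pred, target)
    loss.backward()
    bias.grad *= mask                           # zero gradients outside region
    opt.step()
print(f"final data loss: {loss.item():.4f}")
```

Comparing performance with `mask` set to all-ones versus various restricted masks would indicate which parts of the bias tensor carry the experimentally driven signal.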
An exciting avenue that this opens up is the idea that MSA disambiguation driven by experimental data can help amplify the MSA patterns corresponding to distinct conformations, which might enable better ensemble prediction right off the bat by accounting for conformational averaging. That said, the difference between PredictAndBuild and ROCKET in terms of fit to experimental data and/or PDB-REDO structures is small (Figure 2). However, as shown in Supplementary Figure 7, there are situations where ROCKET clearly outperforms PredictAndBuild. Are these all low-confidence, flexible loops? We suggest the authors further explore where, and potentially why, PredictAndBuild-type external iteration is doomed to fail but an internal MSA-biasing iteration can shine (a sketch of one such analysis follows below). The limitations of the current ROCKET MSA-disambiguation capabilities are exposed in Figure 4, where MSA subsampling is used in combination with ROCKET to drive sampling of an even larger (out-of-AF-distribution) conformation. Does this specific subsampled MSA also enable PredictAndBuild to converge on a better solution? What are the signals in that subsample that cannot be disambiguated from the full MSA by the ROCKET procedure?
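One way to probe the “low confidence and flexible loops” question would be to stratify per-residue differences between the two models by AF confidence. A small self-contained sketch (ours, with simplifying assumptions: CA-only comparison, shared residue numbering, pLDDT stored in the B-factor column as in standard AF output):

```python
import sys
from math import dist

# Sketch: bin per-residue CA displacement between two models of the same
# chain by pLDDT, to test whether disagreements localize to low-confidence
# regions. Usage: python compare_by_plddt.py rocket.pdb predictandbuild.pdb
def ca_records(path):
    """Map (chain, resseq) -> (xyz, B/pLDDT) for CA atoms of a PDB file."""
    out = {}
    with open(path) as fh:
        for line in fh:
            if line.startswith(("ATOM", "HETATM")) and line[12:16].strip() == "CA":
                key = (line[21], int(line[22:26]))
                xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                out[key] = (xyz, float(line[60:66]))
    return out

rocket = ca_records(sys.argv[1])   # e.g. ROCKET model (B column = pLDDT)
pnb = ca_records(sys.argv[2])      # e.g. PredictAndBuild model

for lo, hi in [(0, 50), (50, 70), (70, 90), (90, 101)]:
    shifts = [dist(rocket[k][0], pnb[k][0])
              for k in rocket.keys() & pnb.keys() if lo <= rocket[k][1] < hi]
    if shifts:
        print(f"pLDDT {lo}-{hi}: n={len(shifts)}  "
              f"mean CA shift = {sum(shifts)/len(shifts):.2f} Å")
```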
The greatest potential novelty is the tomography case, as the other baseline methods have not yet been calibrated for low-resolution maps. Here, the comparison to the PDB structure may be a bit unfair, as the refinement of the deposited model appears incomplete. Simply putting that model into phenix.real_space_refine improves it substantially in our hands. Although PredictAndBuild is not applicable here because of its focus on iterative X-ray refinement, the core concept would probably work well with an “external” loop of AF -> phenix.real_space_refine, etc., even at the low resolutions common in subtomogram averaging (see the sketch below). As in the X-ray and cryo-EM examples, the value of an internal loop that interacts with the MSA information as implemented in ROCKET is the major conceptual advance that might enable future performance gains, but the performance gain over external loops is potentially exaggerated by the lack of conventional refinement for the deposited and AF comparisons (even without iteration, as in a PredictAndBuild-type real-space implementation, or true B-factor refinement for the AF comparison).
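To make the suggested baseline concrete, such an external loop might look like the sketch below. The `predict_with_template()` helper is a hypothetical placeholder for re-running AF with the refined model as a custom template; only the basic phenix.real_space_refine invocation (model + map + resolution, with Phenix installed and on PATH) reflects the real CLI, and the output file naming is an assumption:

```python
import subprocess

def predict_with_template(template_path: str) -> str:
    """Hypothetical placeholder: re-run AlphaFold with `template_path` as a
    custom template and return the path of the new prediction."""
    raise NotImplementedError

# Placeholder inputs: an AF prediction and a subtomogram-averaged map.
model, density_map, resolution = "af_model.pdb", "subtomo_avg.mrc", 8.0

for cycle in range(3):
    # Real-space refinement of the current model against the map.
    subprocess.run(
        ["phenix.real_space_refine", model, density_map,
         f"resolution={resolution}"],
        check=True,
    )
    refined = model.replace(".pdb", "_real_space_refined.pdb")  # assumed naming
    model = predict_with_template(refined)  # external predict step, then repeat
```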
Overall, this work represents a big conceptual advance in how MSA information can interact with experimental data. However, whether to ascribe the improvements to the way the MSA interacts with the experimental data, the specific curation of the (subsampled) MSA, or (especially) the subsequent refinement with conventional (experimentally driven) pipelines remains unclear in parts and potentially underemphasized in others.
The authors declare that they have no competing interests.