Skip to PREreview

PREreview of Large scale prospective evaluation of co-folding across 557 Mac1-ligand complexes and three virtual screens

Published
DOI
10.5281/zenodo.18250549
License
CC BY 4.0

Summary

This manuscript evaluates the performance of co-folding models when tasked with 1) the recapitulation of a large number of experimentally determined co-crystal structures of Mac1 with a series of Mac1 ligands and 2) the rescoring of hits to identify false positives originally derived from a set of large docking-based virtual screens.  The evaluation leverages a dataset of crystal structures and affinity data from high-throughput crystallographic and biophysical screens, respectively. These data uniquely enable this report to focus on the ability of co-folding models to handle ligands, resulting in an analysis that is particularly timely given the wide adoption of co-folding models and the relative scarcity of such ligand-focused benchmarks among existing evaluations, which have primarily focused on protein structure prediction or binder design.

Feedback

The experiments and analyses in the manuscript are well thought-out and do not have any significant issues. There are a few high-level points that may improve the clarity and completeness of the results. Importantly, none of the suggested additional experiments will affect the conclusions of the paper, but rather help provide additional context for the results:

·      The first section presents an exciting opportunity to frame the Mac1 ligands against ligands in the PDB more broadly. It would be informative to assess whether chemotypes that are easier or harder to predict accurately and confidently are over- or under-represented in the PDB as a whole. Note that this is not a recommendation that new scaffold similarity metrics be incorporated into the analysis, but rather that analyses similar to those already performed in the manuscript are performed using all ligands in the PDB. For example, PCA-based analyses similar to those in Fig. 1c could be used to examine Mac1 ligands in the context of all PDB ligands enabling questions such as whether similarity to a nearest PDB neighbor, cluster size in a Tc/MCS PCA space, or other frequency-based measures show any relationship with prediction vs. crystal structure RMSD. Such analyses could provide additional insight into how effectively models leverage ligand information present in the PDB overall, as opposed to biases arising specifically from scaffolds represented in Mac1 structures in the PDB, which are already well covered in the manuscript. The conclusion that Tc/MCS do not correlate with the ligand RMSDs for the ligands already associated with the Mac1 is well supported, and presumably suggests that a correlation would not exist against the backdrop of the PDB, but it would be interesting to see the data using analyses similar to those already done in the manuscript nonetheless.

·      Similarly, even though the four proteins used in this manuscript are not themselves the primary focus of the analysis, it would be valuable to perform a high-level assessment of the precedent for each protein in the PDB (beyond the count of liganded structures in Table S6), either in protein sequence space (e.g., MSAs) or structural space (e.g., FoldSeek). An analysis like this would provide important context about whether any of the proteins in the study have close homologs with liganded structures in the PDB, or are generally overrepresented in the PDB. The fact that the AUC for L-pLDDT for AmpC is higher than σ2 and D4, for example, is notable given the relative abundance of liganded AmpC structures in the PDB (this raises potentially interesting questions related to where DOCK3.7 and AF3 actually place the ligands, given the orthosteric β-lactam binding pocket in AmpC, although this is outside of the scope of this manuscript).

·      A discussion of the affinity probability results (`affinity_probability_binary`) from Boltz-2 is likely warranted in the second section in addition to the pIC50s that are already reported (`affinity_pred_value`). The former seems like it would be more applicable for section 2 of the manuscript, but both warrant inclusion—they should both be calculated by default when the affinity pipeline in Boltz-2 is turned on, so it wouldn't involve any more inference.

Minor points

·      A more detailed description of the experimental methods used to generate the ground-truth data in the introduction (even though these have been explained in prior works) would help orient the reader early on, and ground the benchmarking aspect of the story. In general, the abstract and introduction would benefit from a more cohesive through-line to tie the two complementary but orthogonal sections of the paper together.

·      The cutoffs in the "Co-folding can accurately reproduce..." section shift between 2.5 Å (from the ligand center of mass) and 2.0 Å. Is there a reason for this? Along similar lines, mentioning cutoffs for true positives/negatives when introducing the ROC analyses later on in the Mac1 section seems unnecessary since no cutoff should be necessary here.

Competing interests

Our lab has published with the authors in the last five years (https://doi.org/10.5281/zenodo.11391920)

Use of Artificial Intelligence (AI)

The authors declare that they did not use generative AI to come up with new ideas for their review.