PREreview of Training a force field for proteins and small molecules from scratch
- Published
- DOI
- 10.5281/zenodo.19731247
- License
- CC BY 4.0
Preprint
Alexandre Blanco-González, Thea K Schulze, Evianne Rovers, Joe G Greener, arXiv:2603.16770
Authors of the review
Giovanni Bussi
This report was written after a journal club given by Giovanni Bussi in the bussilab group meeting. All the members of the group, including external guests, are acknowledged for participating in the discussion and providing feedback that was useful to prepare this report.
The corresponding authors of the original manuscript were consulted before posting this report.
Summary
The authors report the fitting of a force field for biomolecules from scratch, called Garnet, covering small molecules and proteins. Parameters are partly inferred from QM calculations and partly from experimental data. Importantly, the parametrization procedure includes benchmarking simulations of proteins against NMR data. Results are on par with state-of-the-art force fields.
Comments
This is impressive work. Although the resulting force field does not outperform existing ones, the fact that all parameters are obtained within this paper through an end-to-end procedure makes updating it straightforward. In principle, it should be possible to add simulations of (a) additional proteins or (b) different macromolecules and obtain a more transferable parametrization.
Figure 2C reports deviations in line with other force fields. However, their magnitude appears to be significantly larger than kBT. Can the authors comment on the expected impact of these deviations on the populations of the observed states?
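To make the concern concrete, the sketch below (not taken from the manuscript) shows how an energy error translates into a population error through the Boltzmann factor. It assumes a simple two-state model, energies in kcal/mol and a temperature of 300 K, where kBT is about 0.6 kcal/mol. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def population_ratio_factor(delta_e_kcal, temperature=300.0):
    """Factor by which the ratio p_A/p_B changes when the energy of
    state A is shifted by delta_e_kcal (Boltzmann weighting)."""
    return math.exp(-delta_e_kcal / (R_KCAL * temperature))


def two_state_population(delta_e_kcal, temperature=300.0):
    """Population of state A in a two-state model in which A lies
    delta_e_kcal above state B."""
    w = population_ratio_factor(delta_e_kcal, temperature)
    return w / (1.0 + w)


if __name__ == "__main__":
    kbt = R_KCAL * 300.0
    for n in (0.5, 1.0, 2.0, 4.0):
        print(f"error = {n} kBT -> ratio factor = "
              f"{population_ratio_factor(n * kbt):.3f}")
```

In this simplified picture, a deviation of 2 kBT already changes a population ratio by a factor of e^2, about 7.4, which is why the size of the deviations relative to kBT matters.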
Some of the protein simulations (Figures 3A and 4A) display a drift, which suggests they are unstable. The authors could comment on how the reported metrics (same figures, additional panels) differ when computed using only the first or only the second half of the trajectory.
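The check suggested here could be as simple as the following sketch, which splits a per-frame metric into two halves and compares their averages. The input series is synthetic and the function name is hypothetical; the metrics actually used in the manuscript would take its place.

```python
from statistics import fmean


def half_trajectory_means(values):
    """Return the mean of a per-frame metric over the first and the
    second half of a trajectory."""
    values = list(values)
    mid = len(values) // 2
    return fmean(values[:mid]), fmean(values[mid:])


if __name__ == "__main__":
    # Synthetic RMSD-like series with a linear drift from 1.0 to 3.0.
    n_frames = 1000
    rmsd = [1.0 + 2.0 * i / (n_frames - 1) for i in range(n_frames)]
    first, second = half_trajectory_means(rmsd)
    print(f"first half: {first:.2f}, second half: {second:.2f}")
```

A large difference between the two halves would indicate that the reported averages depend on the simulation length.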
Performance on IDPs is reported to be better than that of many force fields. The authors suggest this might be due to a bias in human-derived force fields towards structured states. Is it possible instead that the Garnet force field is simply suboptimal at preserving structured proteins and, as a result, yields apparently better IDP simulations? Or is the improvement a genuine consequence of the training procedure?
Pages 21-22: "we computed the minimum and maximum values attainable by the corresponding Karplus curve and clamped the experimentally reported coupling to this theoretically admissible interval prior to analysis." Does this suggest that the empirical coefficients are not correct? How much does this clamping affect the final result?
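For readers unfamiliar with the procedure quoted above, the sketch below illustrates one way such a clamping could work, assuming the usual Karplus form J = A cos^2(theta) + B cos(theta) + C. The coefficients in the example are illustrative placeholders, not the values used in the manuscript, and the function names are hypothetical.

```python
def karplus_range(a, b, c):
    """Minimum and maximum of J = a*x**2 + b*x + c for x = cos(theta)
    in [-1, 1]."""
    candidates = [a - b + c, a + b + c]  # values at x = -1 and x = +1
    if a != 0.0:
        x_vertex = -b / (2.0 * a)  # stationary point of the parabola
        if -1.0 <= x_vertex <= 1.0:
            candidates.append(a * x_vertex**2 + b * x_vertex + c)
    return min(candidates), max(candidates)


def clamp_coupling(j_exp, a, b, c):
    """Clamp an experimental coupling to the range of the Karplus curve."""
    j_min, j_max = karplus_range(a, b, c)
    return min(max(j_exp, j_min), j_max)


if __name__ == "__main__":
    A, B, C = 6.5, -1.8, 1.6  # illustrative placeholder coefficients (Hz)
    print(karplus_range(A, B, C))
    print(clamp_coupling(10.5, A, B, C))
```

Any experimental value that falls outside the returned interval is changed by the clamping, so the number of affected couplings and the size of the shifts would be informative to report.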
The two densities in Fig SI 2 look quite similar. Perhaps showing them as standard histograms with discrete bins, without KDE, and with the same bin spacing, would make the figure clearer. I would also suggest avoiding the KDE because it produces a misleading negative tail, which is clearly a plotting artifact (absolute charges cannot be negative). In fact, it might even be better to show a scatter plot (similarly to Fig SI 3), either in addition to the histogram or as an alternative to it.
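As a sketch of the suggested alternative, the code below bins two data sets on a common set of bin edges spanning the data range, so that no bin extends below the smallest observed value. The input values are made up for illustration and the function names are hypothetical.

```python
def shared_bin_edges(a, b, n_bins=20):
    """Equally spaced bin edges covering the joint range of a and b."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    if hi == lo:  # avoid zero-width bins for degenerate input
        hi = lo + 1.0
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def histogram(values, edges):
    """Count values in the bins defined by equally spaced edges.
    The last bin includes its right edge."""
    n_bins = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    charges_a = [0.0, 0.1, 0.5]  # made-up absolute charges
    charges_b = [0.2, 1.0]
    edges = shared_bin_edges(charges_a, charges_b, n_bins=2)
    print(edges, histogram(charges_a, edges), histogram(charges_b, edges))
```

Because both histograms share the same edges, their bars can be compared directly, and no density is drawn at negative values.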
Writing remarks:
The authors use a double exponential form to replace the standard Lennard-Jones potential. In the paper, they refer to this as "Van Der Waals". However, the only part of the Lennard-Jones potential that represents true van der Waals forces is the attractive 1/r^6 term. I believe it would be clearer to give the double exponential used in this paper a different name.
I would recommend using consistent colors to distinguish the different methods. For instance, in Figure 3 Garnet is orange and AMBER is blue. In Figure 4, the colors are swapped.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.