Giovanni Bussi, Ivan Gilardoni
This report was written after a journal club given by GB in the bussilab group meeting. All the members of the group, including external guests, are acknowledged for participating in the discussion and providing feedback that was useful in preparing this report. The corresponding author of the original manuscript was consulted before this report was posted.
The authors introduce a method to perform maximum entropy reweighting of molecular dynamics (MD) trajectories, specifically addressing the choice of regularization parameters. Interestingly, an automatic procedure to determine the relative strength of the regularization terms acting on different experimental datapoints is proposed. The method is then applied to extensive MD simulations of 5 intrinsically disordered proteins (IDPs) using 3 different force fields. For 3 of the tested IDPs, the different simulations converge to similar ensembles after reweighting. For the other 2, the ensemble overlap is too limited, and the reweighted ensemble is significantly force-field dependent.

We are mostly interested in the methodological part of the paper. Technical innovations are:
The idea of including the statistical error of the MD, as computed with block analysis, in the regularization term.
The idea of performing preliminary regularization scans for individual sets of experiments to tune the relative uncertainty, ultimately ending up with a single parameter (prefactor) to be tuned.
The idea of tuning the parameters by monitoring the Kish sample size (see the sketch after this list).
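As a reference for readers unfamiliar with this quantity, the Kish effective sample size of a set of weights w_i is (sum_i w_i)^2 / sum_i w_i^2, i.e., the number of frames that effectively contribute to the reweighted ensemble. A minimal sketch (our own illustration, not the authors' code):

```python
import numpy as np

def kish_sample_size(weights):
    """Kish effective sample size of a weighted ensemble.

    Ranges from 1 (a single frame dominates) to len(weights)
    (uniform weights). Often reported normalized by the number
    of frames.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize the weights
    return 1.0 / np.sum(w**2)

# Example: weights produced by a hypothetical reweighting step
rng = np.random.default_rng(0)
w = rng.exponential(size=1000)
print(kish_sample_size(w) / len(w))  # normalized Kish size in (0, 1]
```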
To our knowledge, the comparative results for the 5 IDPs and 3 force fields are also new and very relevant.
Could the authors speculate on the relative roles of the water model and of the protein force field? As the results are presented, it is not clear whether the discrepancies originate from the protein force field or from the water model. Maybe it is not possible to disentangle them.
The authors evaluate the errors of the employed forward models by applying the Flyvbjerg block analysis method, which only estimates statistical errors, without taking into account possible systematic errors in the forward models themselves. We would suggest clarifying this in the text.
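For reference, a minimal sketch of the Flyvbjerg-Petersen blocking procedure on a correlated time series (our own illustration, not the authors' implementation):

```python
import numpy as np

def blocking_errors(x):
    """Flyvbjerg-Petersen blocking estimates of the standard error
    of the mean of a correlated time series.

    Repeatedly averages pairs of neighboring points and records the
    naive standard error at each level; the estimate plateaus once
    the blocks are longer than the correlation time.
    """
    x = np.asarray(x, dtype=float)
    errors = []
    while len(x) >= 4:
        errors.append(np.sqrt(np.var(x, ddof=1) / len(x)))
        if len(x) % 2:                    # drop the last point if odd
            x = x[:-1]
        x = 0.5 * (x[0::2] + x[1::2])     # block transformation
    return errors  # inspect for a plateau, e.g., with a plot

# Example: an AR(1) series with a short correlation time
rng = np.random.default_rng(1)
n, phi = 2**14, 0.9
x = np.empty(n); x[0] = 0.0
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()
print(blocking_errors(x))
```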
In many sentences of the main text, the authors use the word "weight" to refer both to the statistical weight of each frame in the trajectory and to the strength of the experimental restraints. To avoid misunderstandings, we suggest using words like "strength" or "confidence" for the experimental data/restraints.
Figure 2B is unexpected. The authors used the derivation from Cesari et al., JCTC (2016) to introduce uncertainty in the experimental observations. This is formally equivalent to BioEn (Köfinger and Hummer, 2015) if the theta hyperparameter is set to 1. If we understood correctly how the global scaling was applied, its scan would correspond to a scan over theta in the BioEn formalism. When doing so, it can be proven analytically that chi^2 is a monotonic function of the regularization hyperparameter. Hence, when the authors decrease sigma_global, the average RMSE should decrease monotonically. However, in the figure all the reported RMSEs seem to increase for sigma → 0, so their average will increase as well. This seems inconsistent. Is this related to some issue with the minimization process (e.g., minimizations with a very low sigma not converging)? Or is there something we misunderstood in the explanation?
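To make the monotonicity argument concrete, here is a self-contained toy demonstration of a BioEn-style theta scan (our own sketch, not the authors' method; the synthetic data and the softmax parametrization of the weights are ours). It minimizes theta * KL(w||w0) + chi^2/2 over the weights for several values of theta; the resulting chi^2 is nondecreasing in theta:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n_frames, n_obs = 100, 5
F = rng.normal(size=(n_frames, n_obs))      # per-frame observables
d = rng.normal(size=n_obs)                  # synthetic "experimental" data
sigma = np.ones(n_obs)                      # experimental uncertainties
w0 = np.full(n_frames, 1.0 / n_frames)      # reference (prior) weights

def chi2(w):
    return np.sum(((w @ F - d) / sigma) ** 2)

def loss(g, theta):
    # weights parametrized through a softmax so they stay normalized
    w = np.exp(g - g.max()); w /= w.sum()
    kl = np.sum(w * np.log(w / w0))          # relative entropy to the prior
    return theta * kl + 0.5 * chi2(w)

for theta in [0.01, 0.1, 1.0, 10.0, 100.0]:
    res = minimize(loss, np.zeros(n_frames), args=(theta,), method="L-BFGS-B")
    w = np.exp(res.x - res.x.max()); w /= w.sum()
    print(f"theta={theta:7.2f}  chi2={chi2(w):8.4f}")
# chi2 increases monotonically with theta,
# i.e., it decreases monotonically as the regularization is relaxed
```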
The perfect match of the SAXS data for PaaA2 for the AMBER force field (before reweighting) is striking. Was the force field perhaps optimized using this system as a reference? If so, we would suggest mentioning this in the text.
“An IDP ensemble containing ten consecutive residues [...] neighboring residues”. This is super interesting. Could the authors explicitly compute this and show it? We believe it is a very interesting example of something that can be observed in an MD simulation but not in a bulk experiment, and so it would be relevant to know whether the answer is force-field dependent or not. For instance, one could plot a 2D map with the ΔΔG associated with the helical content.
The authors quantify the similarity of two ensembles by computing the normalized overlap integral of the kernel densities in the ELViM latent space. This quantity is, by construction, sensitive to the particular choice of ensemble representation, here based on the Cα atoms, which is in effect a dimensionality reduction. We agree that it is in practice unfeasible to evaluate the similarity between ensembles sampled in different MD simulations using quantities that do not rely on dimensionality reduction. However, when comparing reweighted with unbiased ensembles, we expect the Kish sample size to be relatively low (around 0.1), whereas the reported density overlap S is high (>0.66 for all the examined IDPs; values along the diagonal in Fig. 8B). Does this happen because the employed dimensionality reduction (based on the Cα atoms) focuses on some robust part of the structural ensembles that is preserved by the reweighting? Or is it because two different (possibly incomparable) metrics are being used?
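For concreteness, one common way to compute such a density overlap on a 2D embedding is S = ∫ min(p, q), estimated on a grid from Gaussian kernel density estimates. The sketch below is our own illustration; the exact overlap definition used in the paper may differ:

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_overlap(xy_a, xy_b, grid_size=100):
    """Overlap S = integral of min(p, q) between kernel density
    estimates fitted to two 2D point clouds (e.g., latent-space
    projections of two ensembles). S = 1 for identical densities,
    S = 0 for disjoint ones. For reweighted ensembles, frame weights
    can be passed via gaussian_kde(..., weights=w).
    """
    pts = np.vstack([xy_a, xy_b])
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    xs = np.linspace(lo[0], hi[0], grid_size)
    ys = np.linspace(lo[1], hi[1], grid_size)
    X, Y = np.meshgrid(xs, ys)
    grid = np.vstack([X.ravel(), Y.ravel()])
    p = gaussian_kde(xy_a.T)(grid)
    q = gaussian_kde(xy_b.T)(grid)
    cell = (xs[1] - xs[0]) * (ys[1] - ys[0])
    # renormalize on the grid to reduce truncation at the borders
    p /= p.sum() * cell
    q /= q.sum() * cell
    return np.sum(np.minimum(p, q)) * cell

# Example with two partially overlapping 2D Gaussians
rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, size=(2000, 2))
b = rng.normal(0.5, 1.0, size=(2000, 2))
print(density_overlap(a, b))
```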
The authors comment that leave-one-out cross validation should not be used with correlated data, and claim that this is the reason for the unexpectedly good performance of cross validation in Figure 2C. This is very interesting as, to our knowledge, it has not been reported before. Adding more points in the low-Kish-size region might be interesting.
At some point the authors write: "We note a conceptual similarity of the reweighting procedure proposed here with the concept of gentle ensemble refinement in the recently published work of Kofinger and Hummer.32" We believe the authors meant to cite Ref. 33 rather than Ref. 32.
The authors declare that they have no competing interests.