PREreview of Mag-Net: Rapid enrichment of membrane-bound particles enables high coverage quantitative analysis of the plasma proteome

DOI: 10.5281/zenodo.13334307
License: CC BY 4.0

Do the beads not also enrich unwanted negatively charged proteins (those with a low isoelectric point)? It is not clear what principle would favor particle enrichment over protein enrichment.

The threshold used in the volcano plot to determine which proteins are differentially expressed is stated in the results, but it should also be stated in the figure legend, along with the statistical test that was used.

It is a strength that they assessed the enrichment of markers, the size distribution of the enriched particles, and TEM images of the beads, which increases confidence that they have captured the particles of interest.

Figure 3D: does it make sense that the EVs would be visible on the outside of the beads? My impression was that the EVs would be captured inside the beads.

It is not clear how the standard PAC in Figure 2A/B differs from Figure 4A. Why would they identify over 4,000 proteins in 4A but only ~1,000 proteins in 2A/B?

Some figures are small and difficult to read, such as Figures 6B and 6D.

It appears that some plasma samples in each dementia sample group may have been collected with different protocols as part of separate studies. Different studies and collection protocols can introduce major batch effects that may confound the data interpretation, so the strong differentiation they observe may be due to differences in sample collection.

Some key details about how the classification task was performed should be mentioned in the results text rather than only in the methods, for example which model was used and how it was fit.

Given that their data distributions are imbalanced for classification (sometimes 10 out of 40 total in the positive class, or 0.25 of the data), they should not use AUROC to assess performance. Precision and recall, or summaries of precision and recall such as F1 or AUPR, would be more appropriate. 
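
To illustrate this concern, below is a minimal sketch using scikit-learn on simulated data with a similar class balance (roughly 10 of 40 positive). It is not the authors' pipeline; the data and classifier are hypothetical placeholders, but it shows how AUROC, AUPR, and F1 are computed side by side.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import average_precision_score, f1_score, roc_auc_score
    from sklearn.model_selection import train_test_split

    # Simulated 40-sample cohort with ~25% positives, echoing the class balance
    # described above (hypothetical data, not the study's).
    X, y = make_classification(n_samples=40, n_features=50, n_informative=5,
                               weights=[0.75], random_state=0)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]

    # AUROC can look optimistic when negatives dominate, whereas AUPR and F1 are
    # anchored to the positive (minority) class.
    print("AUROC:", roc_auc_score(y_test, scores))
    print("AUPR :", average_precision_score(y_test, scores))
    print("F1   :", f1_score(y_test, clf.predict(X_test)))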

Averaging metrics across k-fold cross-validation folds is also not the best way to assess performance. It would be better to use nested k-fold cross-validation and report the average of the metrics on the outer (nested) test sets. Given the simple model, this may not substantially change the results, but it would be a more faithful assessment of generalizability.
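
As a sketch of what nested cross-validation could look like here, the snippet below uses scikit-learn with a placeholder logistic-regression model and simulated data. This is an assumption about the setup, not the authors' code, but it shows the inner/outer split and reporting only on the outer test folds.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

    # Placeholder data with roughly the same size and imbalance as above.
    X, y = make_classification(n_samples=40, n_features=50, n_informative=5,
                               weights=[0.75], random_state=0)

    inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tuning
    outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation

    # Any hyperparameter tuning happens only on the inner folds...
    model = GridSearchCV(LogisticRegression(max_iter=1000),
                         param_grid={"C": [0.01, 0.1, 1, 10]},
                         cv=inner_cv, scoring="average_precision")

    # ...and performance is reported as the average over outer test folds that
    # were never used for tuning.
    nested = cross_val_score(model, X, y, cv=outer_cv, scoring="average_precision")
    print("Nested CV AUPR: %.3f +/- %.3f" % (nested.mean(), nested.std()))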

Details on how the statistically significant proteins or peptides were used in relation to the machine-learning models are a little sparse. For example, if the significant proteins were determined using the whole dataset and those proteins were then used as features during the 5-fold CV training, this constitutes test-set leakage.
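
One way to avoid this, sketched below with scikit-learn (again hypothetical data and model names, not the authors' workflow), is to keep the feature-selection step inside the cross-validation pipeline so that the "significant" proteins are re-selected from the training folds only.

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.pipeline import Pipeline

    # Placeholder matrix of 40 samples x 500 "proteins" with a 25% positive class.
    X, y = make_classification(n_samples=40, n_features=500, n_informative=10,
                               weights=[0.75], random_state=0)

    # Leaky version (to avoid): run the significance/feature selection on the full
    # matrix first, then cross-validate on the reduced matrix -- the held-out folds
    # have already influenced which features were kept.

    # Non-leaky version: selection and classification live in one pipeline, so the
    # feature selection is refit on the training folds of each split.
    pipe = Pipeline([("select", SelectKBest(f_classif, k=20)),
                     ("clf", LogisticRegression(max_iter=1000))])

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="average_precision")
    print("AUPR with selection inside CV: %.3f" % scores.mean())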

It is also not clear how the single-protein AUROC values were determined. Were these also averages over a 5-fold CV? Again, AUPR or F1 score would better reflect the imbalanced classes.

Competing interests

The authors declare that they have no competing interests.