PREreview of Machine Learning-Driven Identification of Serotype-Independent Pneumococcal Vaccine Candidates using samples from Human Infection Challenge Studies

Published
DOI
10.5281/zenodo.17636208
License
CC0 1.0

PAPER SUMMARY

Cheliotis et al. address a critical limitation in pneumococcal vaccination: current vaccines target only specific serotypes among more than 100 variants, leading to serotype replacement and a continued disease burden. Streptococcus pneumoniae remains the leading infectious cause of childhood pneumonia globally, with nasopharyngeal colonization serving both as a prerequisite for invasive disease and as a source of community transmission. Serotype-independent vaccines targeting conserved protein antigens could provide broader protection while being more cost-effective for resource-limited settings. The study employed controlled human infection models with two pneumococcal serotypes (6B and 15B) in 86 healthy adults to identify immune correlates of protection. The authors developed a novel Luminex assay measuring baseline IgG responses to 75 conserved pneumococcal proteins and assessed cytokine responses from stimulated peripheral blood mononuclear cells. Machine learning algorithms, specifically Random Forest analysis, were applied to identify predictive immune signatures after traditional univariate analyses failed to show significant associations following multiple testing correction. The machine learning approach identified IgG responses to three proteins (PdB, SP1069, SP0899) and specific cytokine patterns (MCP-1 responses to SP1069 and SP0899, IL-17 production to SP0648-3) as predictive of colonization resistance. Elevated baseline interferon-gamma, RANTES, and anti-protein IgG correlated with lower colonization density among infected participants. These findings highlight SP1069 and SP0899 as potential vaccine candidates and demonstrate machine learning's utility for identifying complex immune correlates that traditional statistical approaches might miss, though independent validation remains essential before clinical application.

MAJOR REVISIONS: Multiple Comparison Correction Implementation Critique

The study's approach to controlling Type I error requires substantial refinement in its implementation of multiple testing corrections. The Benjamini-Hochberg correction was applied in a fragmented manner across separate analyses rather than controlling the false discovery rate across the entire family of immune measurements, which included 75 proteins for IgG responses plus 84 additional tests from cytokine measurements (21 proteins × 4 cytokines). This piecemeal approach fundamentally undermines the protection against Type I error inflation that multiple testing correction is designed to provide (Benjamini & Hochberg, 1995). Furthermore, the authors' reliance on uncorrected p-values (particularly MCP-1 responses to SP1069 p=0.035 and SP0899 p=0.049) after acknowledging that significance was "lost following correction" represents inappropriate statistical cherry-picking that violates fundamental principles of hypothesis testing (Rothman, 1990). The subsequent transition to Random Forest analysis after univariate failure lacks statistical justification and constitutes potential data dredging, as no a priori hypothesis was provided for why machine learning would succeed where traditional statistical methods failed. 

Relevant section: Figure 2 caption states "this significance was lost following correction" yet Figure 3 presents uncorrected p-values; the Methods section describes separate corrections for different analyses rather than global correction. 

Framework for Addressing: Apply Benjamini-Hochberg correction across all 159 immune measurements simultaneously (75 IgG + 84 cytokine tests) and report only corrected p-values throughout. Remove any reference to uncorrected p-values as "trends" or meaningful findings. Establish a clear statistical analysis plan specifying when machine learning approaches would be employed, ideally as a pre-specified alternative analysis rather than a post-hoc rescue strategy. Provide power calculations demonstrating adequate sample size for the intended statistical comparisons.
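
To make the recommendation concrete, a minimal sketch of a global correction in Python follows, assuming the univariate p-values can be collected into a single vector. The arrays here are random placeholders rather than the study's data, and statsmodels' multipletests is used purely as one standard implementation of Benjamini-Hochberg.

# Sketch: one Benjamini-Hochberg correction across the full family of
# 159 tests (75 IgG + 84 cytokine comparisons), not per sub-analysis.
# Placeholder p-values stand in for the study's univariate results.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
igg_pvals = rng.uniform(size=75)       # placeholder: 75 IgG test p-values
cytokine_pvals = rng.uniform(size=84)  # placeholder: 21 proteins x 4 cytokines

all_pvals = np.concatenate([igg_pvals, cytokine_pvals])  # one family, n=159

# Control the false discovery rate at 5% across the entire family
reject, p_adj, _, _ = multipletests(all_pvals, alpha=0.05, method="fdr_bh")

print(f"{reject.sum()} of {len(all_pvals)} tests significant after global BH")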

MAJOR REVISIONS: Cross-Serotype Comparison Validity and Temporal Bias Critique

The integration of data from both pneumococcal serotypes introduces systematic bias that fundamentally compromises the validity of identifying "universal" immune correlates. The different follow-up periods between serotype cohorts (29 days for 6B versus 14 days for 15B) create systematic differences in colonization detection sensitivity and AUC calculation accuracy, as transient or delayed colonization events may be captured in the 6B cohort but missed in the 15B cohort (Ferreira et al., 2013). This temporal asymmetry particularly affects participants with delayed colonization kinetics, who would be classified as protected in the 15B cohort but as colonized in the 6B cohort, potentially misclassifying protection status. When the data are combined for machine learning analysis, this differential observation period creates systematic differences in outcome ascertainment that could generate false associations between immune responses and apparent protection. Additionally, the retrospective definition of colonization density thresholds using quartile-based divisions from historical data (128 participants for 6B versus only 9 for 15B) lacks scientific justification and yields unreliable thresholds for the 15B cohort.

Relevant section: Methods sections describe different follow-up periods; Figure 1 shows different sampling schedules; AUC calculations use truncated data (days 2-14) for both serotypes despite different study durations. 

Framework for Addressing: Analyze serotype cohorts separately to avoid temporal bias, or standardize follow-up periods in future studies. If combined analysis is essential, implement statistical methods to account for differential observation periods such as survival analysis with censoring. Define colonization density thresholds prospectively using biological markers rather than retrospective quartiles. Validate that AUC calculations using days 2-14 data adequately capture colonization dynamics for both serotypes through sensitivity analyses using different time windows.
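
One way to respect the unequal observation windows is to model time-to-colonization with right-censoring, for example with the lifelines library. A minimal sketch follows; the data frame, column names, and values are hypothetical stand-ins for the per-participant follow-up records, not the authors' dataset.

# Sketch: time-to-colonization as a survival outcome, so the cohorts'
# unequal follow-up (29 days for 6B, 14 days for 15B) is handled by
# right-censoring rather than by pooling raw binary outcomes.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical records: "time" is the day colonization was first detected,
# or the last follow-up day if never detected; "event" is 1 if colonized,
# 0 if right-censored at the end of that cohort's follow-up.
df = pd.DataFrame({
    "serotype": ["6B", "6B", "15B", "15B"],
    "time":     [5, 29, 3, 14],
    "event":    [1, 0, 1, 0],
})

# Kaplan-Meier curves per cohort honour each cohort's own follow-up window
kmf = KaplanMeierFitter()
for serotype, grp in df.groupby("serotype"):
    kmf.fit(grp["time"], event_observed=grp["event"], label=serotype)
    print(serotype, kmf.median_survival_time_)

# Log-rank test compares colonization kinetics while respecting censoring
six_b, fifteen_b = df[df["serotype"] == "6B"], df[df["serotype"] == "15B"]
res = logrank_test(six_b["time"], fifteen_b["time"],
                   event_observed_A=six_b["event"],
                   event_observed_B=fifteen_b["event"])
print(res.p_value)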

MINOR REVISIONS: Machine Learning Performance Assessment Critique

The evaluation of Random Forest model performance requires substantial enhancement through comprehensive reporting of standard machine learning metrics beyond the reported out-of-bag error rates. While the authors report 36.05% and 38.98% error rates, these figures represent only marginal improvement over chance classification (a 50% error rate for a balanced two-class problem), indicating poor predictive performance that undermines claims of robust immune correlate identification (Hastie et al., 2001). The absence of precision, recall, F1-score, area under the ROC curve, confusion matrices, or sensitivity analyses prevents proper assessment of model discrimination ability and clinical utility. Additionally, the hyperparameter choice of mtry=1 is unusually restrictive, considering only one feature per split and departing from standard recommendations.

Relevant section: the Results section reports only out-of-bag error rates.

Framework for Addressing: Report comprehensive performance metrics including precision, recall, F1-score, area under the ROC curve, and confusion matrices for both antibody and cytokine models. Provide systematic hyperparameter optimization results showing why mtry=1 was selected over standard recommendations. Include cross-validation performance distributions across folds to demonstrate model stability. Add sensitivity analyses showing how performance varies with different feature subsets and hyperparameter choices. Explicitly acknowledge that 61-64% accuracy represents modest predictive ability requiring independent validation before clinical application.
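
For illustration, a minimal scikit-learn sketch of the requested reporting follows (max_features plays the role of R's mtry). The feature matrix, labels, and grid values are placeholders and assumptions, not the authors' pipeline or settings.

# Sketch: hyperparameter search over max_features (scikit-learn's analogue
# of mtry) plus cross-validated precision/recall/F1, confusion matrix, and
# ROC AUC. Random data stands in for the 86-participant feature matrix.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(86, 75))    # placeholder: 86 participants x 75 IgG features
y = rng.integers(0, 2, size=86)  # placeholder: colonized vs protected

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Systematic search instead of fixing mtry (max_features) at 1
grid = GridSearchCV(
    RandomForestClassifier(n_estimators=1000, random_state=0),
    param_grid={"max_features": [1, "sqrt", 0.25, 0.5]},
    scoring="roc_auc", cv=cv,
)
grid.fit(X, y)
print("best max_features:", grid.best_params_, "CV AUC:", grid.best_score_)

# Cross-validated predictions yield the full metric panel the review requests
y_pred = cross_val_predict(grid.best_estimator_, X, y, cv=cv)
y_prob = cross_val_predict(grid.best_estimator_, X, y, cv=cv,
                           method="predict_proba")[:, 1]
print(confusion_matrix(y, y_pred))
print(classification_report(y, y_pred))  # precision, recall, F1 per class
print("ROC AUC:", roc_auc_score(y, y_prob))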

Competing interests

The author declares that they have no competing interests.

Use of Artificial Intelligence (AI)

The author declares that they used generative AI to come up with new ideas for their review.
