PREreview of Next-Generation Sequencing Methods for Sensitive Hepatitis B Viral Genome Analysis: A European Study

by Brenda Ametepe

Published: August 26, 2025
DOI: 10.5281/zenodo.16945679
License: CC BY 4.0

PAPER SUMMARY

This multicenter comparative study tackles an important and timely challenge: evaluating next-generation sequencing (NGS) methods for HBV genome detection in low viral load samples. The authors compare a diverse range of approaches (untargeted metagenomics, probe capture, and PCR-based methods) on Illumina and Oxford Nanopore platforms across six laboratories. The work is ambitious and potentially impactful, especially in clinical contexts where sensitive and specific HBV detection is critical. However, substantial issues with contamination, data interpretation, and statistical methodology currently limit the reliability of the comparative conclusions. The study lays strong groundwork for method optimization, and with targeted revisions it can offer valuable insight into the future of HBV diagnostics.

MAJOR REVISIONS

1. Contamination Across PCR-Based Methods

Critique:

The detection of HBV sequences with 48-100% genome coverage in negative controls across all four PCR-based protocols is a fundamental quality-control failure that must be resolved to preserve the study's scientific validity. A false positive in a negative control calls for investigation and a review of method validation; without these, the entire analytical run is potentially invalid. The high genome coverage of these false positives points to contamination, which needs to be addressed.

Framework for Addressing:

Acknowledge the limitation explicitly and transparently in the manuscript.
Reprocess samples with enhanced contamination control protocols (separate spaces, extraction blanks, environmental testing).
Delay comparative performance claims for PCR methods until a contamination-free dataset is available.
Increase the number and distribution of negative controls to monitor contamination throughout runs.

2. Statistical Misapplication in Correlation Analyses

Critique:

Figures 1, 2, and 4 contain a statistical reporting error that must be corrected for the correlation analyses to be interpreted accurately. The figures display R² values for Spearman correlation analyses. This is methodologically incorrect: Spearman correlation measures monotonic relationships and should be reported as the correlation coefficient ρ (rho), not the coefficient of determination. Some figures display negative R² values, which are impossible for a squared correlation coefficient and further confirm that the notation is wrong. Moreover, Spearman's ρ quantifies association, not statistical significance.

Framework for Addressing:

Replace all R² values with correctly reported Spearman ρ (rho) values and include 95% CIs.
Apply censored data models (e.g., Tobit regression or Kaplan-Meier) to properly handle non-detects.
For diagnostic performance, consider ROC analyses and logistic regression to assess detection probability versus viral load.
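
To illustrate the first point, the following is a minimal sketch of reporting Spearman's ρ with a bootstrap 95% CI, written in plain Python so that each step is visible. The data values, variable names, and the percentile-bootstrap choice are assumptions made for illustration only; they are not taken from the manuscript, and the authors may prefer an established statistics package.

```python
import random

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for rho, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman_rho([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop NaN from degenerate resamples
            stats.append(rho)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper

# Hypothetical values: log10 viral load and percent genome coverage.
log10_viral_load = [1.2, 1.8, 2.1, 2.5, 3.0, 3.4, 4.1, 4.8]
genome_coverage = [5, 12, 9, 40, 55, 52, 90, 98]

rho = spearman_rho(log10_viral_load, genome_coverage)
ci_low, ci_high = bootstrap_ci(log10_viral_load, genome_coverage)
print(f"Spearman rho = {rho:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")

# Reversing one variable gives a negative rho, which is valid for a
# correlation coefficient but could never appear as a squared quantity.
rho_reversed = spearman_rho(log10_viral_load, genome_coverage[::-1])
print(f"rho = {rho_reversed:.3f}, rho squared = {rho_reversed ** 2:.3f}")
```

The second print line shows why a negative value labelled R² signals a notation problem: the sign belongs to ρ itself, and any squared quantity derived from it is non-negative.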

3. Arbitrary Threshold for Contamination Identification

Critique: The 5% nucleotide divergence threshold used to define contamination lacks empirical validation. Given known HBV subgenotype divergence (4–8%), the threshold risks misclassifying true biological variants. The authors provide no empirical analysis of sequencing error rates, no justification for selecting this threshold, and no validation against a phylogenetic analysis of known HBV strains. This arbitrary choice fundamentally affects how contamination is distinguished from legitimate viral diversity.

Framework for Addressing:

Conduct phylogenetic analysis to empirically calibrate divergence thresholds.
Compare contamination classifications across multiple cutoffs (3%, 5%, 7%, 10%) to test robustness.
Incorporate platform-specific error profiles using known standards to better ground threshold decisions.
Include supplemental trees showing the relationship of detected sequences to HBV references.
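
The robustness check across cutoffs suggested above could be sketched as follows. This assumes that a sequence is flagged when its uncorrected pairwise divergence (p-distance) from the expected sample sequence exceeds the cutoff; that direction, the function names, and the toy sequences are illustrative assumptions, not the authors' pipeline.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing sites among aligned positions where both are A/C/G/T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = "ACGT"
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases (e.g. '-' or 'N') in either sequence.
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared

def classify_across_cutoffs(divergence, cutoffs=(0.03, 0.05, 0.07, 0.10)):
    """Map each cutoff to True when the divergence exceeds it (sequence flagged)."""
    return {cutoff: divergence > cutoff for cutoff in cutoffs}

# Toy aligned sequences: one mismatch and one ambiguous base in 20 positions.
expected = "ACGTACGTACGTACGTACGT"
observed = "ACGTACGAACGTACGTACNT"

divergence = p_distance(expected, observed)
flags = classify_across_cutoffs(divergence)
for cutoff, flagged in flags.items():
    print(f"cutoff {cutoff:.0%}: {'flagged' if flagged else 'not flagged'}")
```

In this toy case the divergence is 1/19 (about 5.3%), so the sequence is flagged at the 3% and 5% cutoffs but not at 7% or 10%. A classification that flips this close to the chosen threshold is the kind of sensitivity the manuscript should report.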

MINOR REVISIONS

1. Lack of Reference Standardization in Mapping

Critique: Variation in reference sequences across protocols may skew assembly metrics and coverage comparisons.

Framework for Addressing:

Standardize the reference genome used across methods or provide justification for divergence.
Document reference accession numbers and parameters used in mapping.
Conduct sensitivity analyses to demonstrate impact of reference choice.

2. Incorrect Log-Scale Notation in Figures

Critique: Figures use "<100" on log-transformed axes, which is mathematically misleading and visually confusing.

Framework for Addressing:

Update axes to reflect accurate log-scale notation (e.g., "<1" or "10⁰").
Double-check all figures for mathematical consistency and precision in labeling.

3. Post-hoc Contamination Thresholds

Critique: The 10-base minimum threshold for consensus calling is applied post-hoc and lacks validation, which is particularly problematic for low viral load samples.

Framework for Addressing:

Establish minimum coverage thresholds based on empirical LOD studies and known viral controls.
Use ROC curves to optimize sensitivity/specificity tradeoffs.
Avoid retrospective thresholding; instead, predefine and justify thresholds before analysis.
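
As a sketch of how a threshold might be chosen from control data before the main analysis, the example below scans candidate cutoffs and picks the one maximizing Youden's J (sensitivity + specificity - 1). The depth values for known-positive and known-negative controls are invented for illustration, and Youden's J is only one of several possible selection criteria.

```python
def sensitivity_specificity(positives, negatives, threshold):
    """Treat a sample as detected when its value is >= threshold."""
    true_pos = sum(1 for v in positives if v >= threshold)
    true_neg = sum(1 for v in negatives if v < threshold)
    return true_pos / len(positives), true_neg / len(negatives)

def best_threshold_by_youden(positives, negatives, candidates):
    """Return (threshold, J, sensitivity, specificity) with the highest Youden's J.

    Ties keep the first candidate encountered.
    """
    best = None
    for threshold in candidates:
        sens, spec = sensitivity_specificity(positives, negatives, threshold)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (threshold, j, sens, spec)
    return best

# Hypothetical read depths from a calibration set of controls.
positive_control_depths = [3, 6, 12, 25, 40, 80]
negative_control_depths = [0, 0, 1, 2, 4, 4]
candidate_thresholds = [2, 5, 10, 20, 50]

threshold, j, sens, spec = best_threshold_by_youden(
    positive_control_depths, negative_control_depths, candidate_thresholds
)
print(f"threshold={threshold}, J={j:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

The key design point is that the calibration set is separate from the study samples, so the selected threshold can be fixed and justified before it is applied.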

Competing interests

The author declares that they have no competing interests.

Use of Artificial Intelligence (AI)

The author declares that they used generative AI to come up with new ideas for their review.
