
PREreview of In Vivo Dual RNA-Seq Analysis Reveals the Basis for Differential Tissue Tropism of Clinical Isolates of Streptococcus pneumoniae

CC BY 4.0

We, the students of MICI5029/5049, a Graduate Level Molecular Pathogenesis Journal Club at Dalhousie University in Halifax, NS, Canada, hereby submit a review of the following BioRxiv preprint:

Vikrant Minhas, Rieza Aprianto, Lauren J. McAllister, Hui Wang, Shannon C. David, Kimberley T. McLean, Iain Comerford, Shaun R. McColl, James C. Paton, Jan-Willem Veening, Claudia Trappetti. In Vivo Dual RNA-Seq Analysis Reveals the Basis for Differential Tissue Tropism of Clinical Isolates of Streptococcus pneumoniae. BioRxiv 862755; doi:

We will adhere to the Universal Principled (UP) Review guidelines proposed in:

Universal Principled Review: A Community-Driven Method to Improve Peer Review. Krummel M, Blish C, Kuhns M, Cadwell K, Oberst A, Goldrath A, Ansel KM, Chi H, O'Connell R, Wherry EJ, Pepper M; Future Immunology Consortium. Cell. 2019 Dec 12;179(7):1441-1445.

SUMMARY: Previous work has shown that two genetically related clinical isolates of Streptococcus pneumoniae serotype 14 ST15 (the 9-47-Ear RafRD249 isolate and the 4559-Blood RafRG249 isolate) differ in virulence in the mouse lung following intranasal challenge. These virulence phenotypes correlated with a non-conservative single nucleotide polymorphism (SNP) in the raffinose pathway regulatory gene rafR (RafRD249G). In this study, Minhas et al. investigated the effects of the RafRD249G SNP on bacterial and host transcriptomes in infected lungs using a dual RNA-seq approach. They observed that the RafRD249G SNP drove differential expression of genes encoding sugar transporters that fine-tune carbohydrate metabolism, which contributed to distinct host niche specialization for the two clinical isolates. RNA-seq analysis of the host response to S. pneumoniae infection suggested that the bacterial RafRD249G SNP caused differential expression of host genes encoding multiple cytokines, cytokine receptors, chemokines and chemokine ligands. In particular, the expression of IL-17-related genes was enriched in murine lungs infected with the 9-47-Ear isolate carrying the RafRD249 allele compared to the 9-47-Ear isolate carrying the RafRG249 allele. Using in vivo neutrophil depletion and IL-17A neutralization, the authors found that IL-17-induced neutrophil recruitment was partially responsible for the more efficient clearance of the 9-47-Ear RafRD249 isolate observed in the murine lungs. Overall, this study showed that the RafRD249G SNP had a significant impact on bacterial and host transcriptomes, leading to distinct virulence phenotypes and host responses. The use of dual RNA-seq in this study provided a powerful approach for investigating in vivo host-pathogen interactions.


STRENGTHS: The authors conducted a transcriptomic study of the impact of a non-conservative SNP, RafRD249G, on host cells and two S. pneumoniae isolates (ear and blood) using a dual RNA-seq approach in murine lung infection. Their RNA-seq data strongly suggested that the RafRD249G SNP between the two isolates caused distinct patterns of gene expression. Moreover, the dual RNA-seq on the host side showed that the RafRD249G SNP also played a decisive role in triggering differential host responses in the infected lung. This study provides solid evidence that the dual RNA-seq approach is a useful tool for the in vivo study of complex host-pathogen interactions.

WEAKNESSES: There are a few weaknesses in the data analysis, data presentation, and experimental design. The lack of detail about the RNA-seq data analysis made it difficult for the reviewers to evaluate data quality. Some of the figure labels are difficult to interpret. In certain instances, methods were not described in sufficient detail to enable others to reproduce the experiments. In this study, the RNA-seq data suggested an association between IL-17 expression and the bacterial RafRD249G SNP. However, the conclusion that the RafRD249G SNP was responsible for bacterial clearance via IL-17-mediated neutrophil recruitment was insufficiently supported. This could be improved by using appropriate controls, testing neutrophil recruitment in the murine lung, and comparing data within each treatment. Moreover, we found some of the writing in the manuscript misleading and confusing. We believe that some of the weaknesses mentioned above could be resolved by simplifying and clarifying the data presentation and interpretation.



1.     Quality: Experiments (1-3 scale) SCORE = 1.5

Figure by Figure, do experiments, as performed, have the proper controls? 

·       Figures 2B and 2C: There appears to be an enrichment in higher fold changes on either side of the replication origin. Have the authors considered whether cell populations might be rapidly growing, which could provide multiple copies of the origin per cell, thereby clouding interpretation?

·       Figure 5: The gating strategy for multi-color flow cytometry analysis should be provided.

·       Figures 5 and 6A: We think that adding a mock-infected negative control, or showing the time-zero data as a baseline, is necessary to interpret these data.

Are specific analyses performed using methods that are consistent with answering the specific question? Is there the appropriate technical expertise in the collection and analysis of data presented?

·       In Figures 2 and 3, the meaning of “member genes” and “non-member genes” needs to be explained more clearly in the Methods and Figure Legends.

·       In Figure 2 panels D, E and F and Figure 3 panels D, E and F, we think it would be easier to follow the data presentation if only the differentially expressed genes were shown as the fold enrichment in a bar graph instead of the dot plot. The rest of the granular dot plot data can be presented in supplementary material. 

Do analyses use the best possible (most unambiguous) available methods, quantified via appropriate statistical comparisons? 

·       OK

Are controls or experimental foundations consistent with established findings in the field? A review that raises concerns regarding inconsistency with widely reproduced observations should list at least 2 examples in the literature of such results. To address this question may occasionally require a supplemental figure that, for example, re-graphs multi-axis data from the primary figure using established axes or gating strategies to demonstrate how results in this paper line up with established understandings. It should not be necessary to defend exactly why these may be different from established truths, although doing so may increase the impact of the study. 

·       OK

2.     Quality: Completeness (1-3 scale) SCORE = 2

Does the collection of experiments and associated analysis of data support the proposed title/abstract-level conclusions? Typically, the major (title or abstract level) conclusions are expected to be supported by at least two experimental systems. 

·       It was clear that the bacterial clearance of both isolates was partially due to neutrophil recruitment to the infected lung. It was also clear that the difference in lung bacterial loads between the two isolates at 24 h post infection was driven by the RafR SNP. However, the direct evidence for the conclusion that IL-17 expression and its downstream neutrophil recruitment in the infected lung contributed to the RafR SNP-mediated phenotypic differences between the two isolates was not convincing. We believe that there are a few points in the data presentation and interpretation (listed below) that need to be further clarified or explained.

·       There were two variables (different isolate and different neutrophil treatment) between “9-47-Ear isolate with anti-mouse Ly6G treatment” and “4559-Blood isolate with isotype control treatment”, as well as between “4559M isolate with anti-mouse Ly6G treatment” and “4559-Blood isolate with isotype control treatment” in Figure 6C. We feel the authors need to explain the rationale for comparing the bacterial burden of “9-47-Ear isolate with anti-mouse Ly6G treatment”, “4559M isolate with anti-mouse Ly6G treatment” and “4559-Blood isolate with isotype control treatment” (lines 347-348).

·       The authors should clarify the comparison that was described on lines 340-342 since it was not shown in Figure 6C.

·       We would like to see a comparison of the bacterial loads between lungs infected with different isolates under each treatment, especially under anti-mouse Ly6G treatment and anti-IL17A treatment. We would also like to see whether the difference in bacterial loads between isolates changes when neutrophils are depleted or IL-17A is neutralized.

·       In lines 262-266, the authors wrote that “Common differentially expressed genes of this function include genes encoding interleukin 17F (Il17f) and chemokine ligands (Cxcl2, Cxcl3 and Ccl20), with ascending expression level of response to 9-47M, 9-47-Ear and 4559-Blood”. The authors need to indicate the corresponding figure. If it was Figure 3C, comparison group B (9-47-Ear to 4559-Blood) did not show a significant difference.

·       We think it is necessary to test neutrophil recruitment in infected and mock-infected lungs with and without anti-IL17A treatment.
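
The within-treatment comparison requested above could be tabulated along the following lines. This is only a minimal sketch in Python: the CFU values, the dictionary layout, and the function names are hypothetical placeholders chosen for illustration, not data or methods from the preprint.

```python
from math import log10
from statistics import mean

# Hypothetical per-mouse lung CFU counts, keyed by (treatment, isolate).
# These are placeholders for illustration, not data from the preprint.
cfu = {
    ("isotype", "9-47-Ear"): [1e3, 2e3, 5e2],
    ("isotype", "4559-Blood"): [1e5, 2e5, 5e4],
    ("anti-Ly6G", "9-47-Ear"): [1e5, 4e4, 2.5e5],
    ("anti-Ly6G", "4559-Blood"): [1e6, 2e6, 5e5],
}


def mean_log10(counts):
    """Mean of log10-transformed CFU counts (log10 of the geometric mean)."""
    return mean(log10(c) for c in counts)


def isolate_gap(data, treatment, isolate_a, isolate_b):
    """Mean log10 CFU of isolate_b minus isolate_a, within a single treatment."""
    return mean_log10(data[(treatment, isolate_b)]) - mean_log10(data[(treatment, isolate_a)])


# Only one variable (the isolate) changes inside each comparison.
gaps = {
    treatment: isolate_gap(cfu, treatment, "9-47-Ear", "4559-Blood")
    for treatment in ("isotype", "anti-Ly6G")
}

for treatment, gap in gaps.items():
    print(f"{treatment}: 4559-Blood minus 9-47-Ear = {gap:.2f} log10 CFU")
```

If the gap between isolates shrinks under neutrophil depletion or IL-17A neutralization, that would be consistent with the proposed mechanism, although an appropriate statistical test on the real data would still be needed.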

Are there experiments or analyses that have not been performed, but if "true" would disprove the conclusion (sometimes considered a fatal flaw in the study)? In some cases, a reviewer may propose an alternative conclusion/abstract that is clearly defensible with the experiments as presented, and one solution to ‘completeness’ here should always be to temper an abstract or remove a conclusion and to discuss this alternative in the discussion section. 

·       OK

3.     Quality: Reproducibility (1-3 scale) SCORE = 1.5

Figure by Figure, were experiments repeated per a standard of 3x repeats or 5 mice/cohort etc.? 

·       OK

Is there sufficient “raw data” presented to assess rigor of the analysis? 

·       Instead of showing sequencing depth as “times”, the common practice is to report the raw read counts, the percentage of reads mapped to host and S. pneumoniae genes, and the overall alignment percentage. These data can be organized as a table and included as one of the main results in the presentation of the sequencing data.
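
As an illustration of the kind of table we have in mind, the sketch below derives mapping percentages from raw counts. It is a hedged example in Python: the sample names, read counts, and field names are hypothetical and do not come from the preprint.

```python
# Hypothetical read counts per library (placeholders, not values from the preprint).
libraries = [
    {"sample": "9-47-Ear_rep1", "raw_reads": 50_000_000,
     "host_mapped": 45_000_000, "pathogen_mapped": 500_000},
    {"sample": "4559-Blood_rep1", "raw_reads": 40_000_000,
     "host_mapped": 34_000_000, "pathogen_mapped": 2_000_000},
]


def mapping_summary(lib):
    """Return raw reads plus percent of reads mapped to host, pathogen, and overall."""
    raw = lib["raw_reads"]
    host_pct = 100 * lib["host_mapped"] / raw
    pathogen_pct = 100 * lib["pathogen_mapped"] / raw
    return {
        "sample": lib["sample"],
        "raw_reads": raw,
        "host_pct": host_pct,
        "pathogen_pct": pathogen_pct,
        "total_pct": host_pct + pathogen_pct,
    }


rows = [mapping_summary(lib) for lib in libraries]

# Print a plain-text table, one row per library.
print(f"{'sample':<18}{'raw reads':>12}{'% host':>9}{'% pathogen':>12}{'% aligned':>11}")
for r in rows:
    print(f"{r['sample']:<18}{r['raw_reads']:>12}{r['host_pct']:>9.1f}"
          f"{r['pathogen_pct']:>12.1f}{r['total_pct']:>11.1f}")
```

Reporting the pathogen fraction alongside the host fraction matters for dual RNA-seq because bacterial reads are typically a small share of each library, so depth alone says little about bacterial coverage.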

Are methods for experimentation and analysis adequately outlined to permit reproducibility? 

·       In general, more details are needed for each experiment and data analysis. Most importantly, details are needed for the cDNA library preparation, the DESeq2 analysis and the enrichment analysis.

If a ‘discovery’ dataset is used, has a ‘validation’ cohort been assessed and/or has the issue of false-discovery been addressed? 

·       OK

4.     Quality: Scholarship (1-3 scale), generally not the basis for acceptance/rejection: SCORE = 3

Has the author cited and discussed the merits of the relevant data that would argue against their conclusion? Has the author cited and/or discussed the important works that are consistent with their conclusion and which a reader should be especially familiar when considering the work? 

·       In the conclusion, the authors stated that “All these variations, and the downstream consequences thereof, are ultimately sensed by host cells, including epithelial and immune cells, resulting in the observed divergence of host response to the various strains, particularly with respect to expression of genes encoding cytokine and chemokine ligands and receptors, as well as those associated with programmed cell death” without any citations. We think the authors need to provide citations to support this statement or exclude this statement from their conclusion. 

Specific (helpful) comments on grammar/diction, paper structure or data presentation (e.g. change a graph style or color scheme) go in this section, but scores in this area not to be significant bases for decisions. 

·       The authors could write more clearly and concisely to aid reader comprehension. 

·       It would be helpful if the authors included the RafR mutation site when describing the isolates (e.g. “9-47-Ear RafRD249” instead of “9-47-Ear”).

·       Isolate names, such as “9-47-Ear” and “9-47-EarM”, and rafR gene backgrounds, such as “RafRD249G”, were often mixed within one sentence. This made it harder for readers to follow the data interpretation.

·       Figure 3E would be better as supplemental information.

·       There was an incorrect reference to Figure 6B (nasopharynx) on line 348, which should be to Figure 6C (lungs).

·       Capital and lowercase letters were used inconsistently for figure panels (i.e., Figure 2a vs. Figure 6A).

·       Many gene names and p-values were written out in the Results (e.g., lines 182-203); these could be presented in a table.

·       In line 288, the authors stated that “In addition to the above, necroptosis, a programmed cell death, is almost significantly enriched (p= 0.07) in differentially expressed murine genes because of the rafR swap in the blood isolate background”. We think statements of this kind, which describe a non-significant result as “almost significantly enriched”, should be avoided.

·       The authors stated that “……with the strains expressing the G249 allele triggering a strong pro-inflammatory IL-17 response in the lungs post-infection. This response leads to an influx of neutrophils to the lungs, resulting in the clearance of bacteria. Conversely, expression of the D249 rafR allele results in a more subdued IL-17 host response, allowing for bacterial persistence in the lungs.” at the end of their conclusion. However, we noticed that this statement contradicts both the previous and the current study. The 4559-Blood isolate (G249 allele) was found to be the persistent isolate in the lung (Minhas et al., 2019) and induced less neutrophil recruitment in the lung compared to the 9-47-Ear isolate (Figure 5).

·       On line 341, the authors need to clarify whether it is “…higher than 9-37-Ear and 4559” or “…higher than 9-37-Ear and 4559M”.


1.     Impact: Novelty/Fundamental and Broad Interest (1-4 scale) SCORE = 2

A score here should be accompanied by a statement delineating the most interesting/important conceptual finding(s), as they stand right now with the current scope of the paper. A ‘1’ would be expected to be understood for the importance by a layperson but would also be of top interest (will have lasting impact) on the field. 

How big of an advance would you consider the findings to be if fully supported but not extended? It would be appropriate to cite literature to provide context for evaluating the advance. However, great care must be taken to avoid exaggerating what is known comparing these findings to the current dogma (see Table 2). Citations (figure by figure) are essential here. 

·       The authors demonstrated that dual RNA-seq analysis is an excellent approach for studying host-pathogen interactions at the transcriptomic level. They identified multiple genes in bacteria and host cells that were affected by RafRD249G. Several molecular mechanisms were proposed to explain the interaction between the S. pneumoniae isolates and the murine lung. However, the results of the microbiological and immunological tests in this study were not sufficient to support those predictions. This is a major drawback of the study because it is important to validate findings from transcriptomic studies with functional characterization of the host-pathogen interaction. Part of this issue can be solved by improving the presentation and interpretation of the existing data.

·       We also think that the implications and significance of this study could be expanded by addressing the rationale behind the choice of the two isolates, such as the epidemiology or prevalence of these strains in a clinical setting.

2.     Impact: Extensibility (1-4 scale) SCORE = N/A

Has an initial result (e.g. of a paradigm in a cell line) been extended to be shown (or implicated) to be important in a bigger scheme (e.g. in animals, or in a human cohort)? 

This criterion is only valuable as a scoring parameter if it is present, indicated by the N/A option if it simply doesn’t apply. The extent to which this is necessary for a result to be considered of value is important. It should be explicitly discussed by a reviewer why it would be required. What work (scope and expected time) and/or discussion would improve this score, and what would this improvement add to the conclusions of the study? Care should be taken to avoid casually suggesting experiments of great cost (e.g. ‘repeat a mouse-based experiment in humans’) and difficulty that merely confirm, but do not extend. (see Bad Behaviors, Table 2). 

·       N/A