This preprint integrates large-scale single-cell RNA-seq datasets from 110 treatment-naïve NSCLC tumors together with a prospective scRNA-seq cohort and a bulk RNA-seq clinical trial cohort to identify cellular determinants of resistance to PD-1/PD-L1 blockade. By stratifying tumors according to the abundance of CXCL13⁺ CD4 and CD8 tumor-reactive T cells, the authors systematically scan the tumor microenvironment for populations that inversely associate with these cells and home in on MMP1⁺ cancer-associated fibroblasts, further defined as tumor–stroma boundary CAFs (tsbCAFs). These cells are enriched in tumors with low tumor-reactive T-cell infiltration, are more abundant in non-responders in the on-treatment cohort, and are linked to poorer outcomes in the atezolizumab OAK trial using a two-gene tsbCAF signature (MMP1/CRABP1). Spatial transcriptomics and protein data support a striking one-layer peritumoral arrangement of these MMP1⁺ fibroblasts at the tumor–stroma interface, consistent with a possible physical barrier to T-cell access. Conceptually, the work is important for colleagues working on tumor immunology, stromal biology, CAF heterogeneity, and immunotherapy resistance, and it offers a pragmatic computational framework that others could apply to different cancer types or resistance phenotypes. At the same time, several aspects would benefit from clarification or additional analysis, including more precise description of the CXCL13⁺ T-cell–based stratification scheme, stronger functional linkage between tsbCAFs and T-cell exclusion, clearer justification and benchmarking of the MMP1/CRABP1 bulk signature, and more discussion of potential confounders such as treatment regimen, disease stage and spatial sampling. Addressing these points should help sharpen the story and make the mechanistic inferences more robust.
The tumor stratification based on “high” versus “low” CXCL13⁺ CD4 and CD8 T cells is central to the entire discovery framework (“the tertile split was used to dichotomize tumors”), but the rationale and robustness of this cut-off are not fully explored. Could you quantify how sensitive your downstream identification of MMP1⁺ CAFs is to the exact tertile thresholds, for example by testing alternative schemes (median split, quartiles, continuous modeling of CXCL13⁺ T-cell frequency)? It would be helpful to see, even in supplementary form, how the effect sizes for MMP1⁺ CAFs change if you vary the cut-offs, and whether a continuous association (e.g., correlation between tsbCAF proportion and CXCL13⁺ T-cell fraction) remains consistent across datasets once you adjust for total T-cell infiltration, tumor purity, histology, and stage.
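To make this request concrete, a minimal sketch of such a sensitivity check is given below. It uses purely synthetic per-tumor fractions (not data from the preprint), and the names `t_cell`, `caf`, and `high_low_gap` are hypothetical. It compares the low-versus-high group difference in tsbCAF fraction under quartile, tertile, and median splits with a continuous Spearman correlation; covariate adjustment is deliberately left out.

```python
import random
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def high_low_gap(t_cell, caf, frac):
    """Mean CAF fraction in the bottom `frac` of tumors minus the top `frac`,
    with tumors ranked by their CXCL13+ T-cell fraction."""
    order = sorted(range(len(t_cell)), key=lambda i: t_cell[i])
    n = max(1, int(len(order) * frac))
    low = [caf[i] for i in order[:n]]
    high = [caf[i] for i in order[-n:]]
    return mean(low) - mean(high)

# Synthetic cohort (assumption): CAF fraction decreases with T-cell fraction plus noise.
rng = random.Random(0)
t_cell = [rng.random() for _ in range(110)]
caf = [max(0.0, 0.3 - 0.25 * t + rng.gauss(0, 0.05)) for t in t_cell]

schemes = {"quartile": 0.25, "tertile": 1 / 3, "median": 0.5}
gaps = {name: high_low_gap(t_cell, caf, frac) for name, frac in schemes.items()}
rho = spearman(t_cell, caf)

for name, gap in gaps.items():
    print(f"{name:8s} low-minus-high tsbCAF fraction: {gap:.3f}")
print(f"Spearman rho (continuous): {rho:.3f}")
```

A table of this kind, computed per dataset on the real fractions, would show at a glance whether the inverse association depends on the tertile choice.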
The conclusion that MMP1⁺ tsbCAFs act as a “physical barrier that prevents tumor-reactive T cells from recognizing and killing cancer cells” currently relies largely on spatial colocalization and ECM-related gene expression; this is more suggestive than mechanistically supported. Can you probe more directly whether regions enriched for tsbCAFs indeed exhibit reduced T-cell infiltration or activation in a spatially resolved manner (e.g., T-cell distances to cancer cells across tsbCAF-positive vs. tsbCAF-negative interfaces, spatial gradients of CXCL13 or cytotoxic markers, spatial correlation statistics)? If this is not possible with existing datasets, please make explicit which mechanistic steps remain hypothetical and consider discussing alternative mechanisms beyond a purely physical barrier (e.g., local cytokine gradients, chemokine decoys, or interactions with endothelial cells) that could also contribute to T-cell exclusion in tsbCAF-rich regions.
The use of MMP1 and CRABP1 as a bulk RNA-seq proxy for tsbCAF abundance in the OAK cohort is an attractive idea, but the derivation, specificity and potential confounders of this two-gene signature are not entirely clear. Could you quantify how specific MMP1 and CRABP1 are to tsbCAFs compared with other stromal and malignant compartments across all single-cell datasets (including previously published ones) and show how often they are co-expressed at the single-cell level? It would also be useful to compare the prognostic and predictive performance of the two-gene signature against alternative CAF/ECM-related gene sets (e.g., FAP, COL genes, generic CAF scores, Grout et al. CAF matrices) in the same OAK dataset, and to adjust the survival models for known clinical correlates such as PD-L1 expression, TMB, histology, and prior lines of therapy. This would help establish that the bulk tsbCAF signal adds information beyond more generic stromal or CAF signatures.
In the description of single-cell quality control (“The qualified cells were defined as those with (1) >600 expressed genes and (2) <25,000 UMIs”), the choice of these specific thresholds is not justified. Could you briefly explain how you arrived at these cut-offs, whether additional QC criteria were considered (e.g., doublet scores, mitochondrial content, cell-cycle or ribosomal content), and whether the inclusion or exclusion of borderline cells materially changes the inferred abundance of tsbCAFs or CXCL13⁺ T cells across samples?
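As an illustration of the borderline-cell check suggested here, the sketch below sweeps the gene and UMI cut-offs around the quoted values and reports how many cells survive and what fraction of them carry a given label. The five toy cells, the alternative cut-offs, and the function names are assumptions made for the example only.

```python
def passes_qc(cell, min_genes=600, max_umis=25000):
    """Apply the two quoted criteria: >min_genes expressed genes and <max_umis UMIs."""
    return cell["n_genes"] > min_genes and cell["n_umis"] < max_umis

def fraction_by_threshold(cells, label, gene_cutoffs, umi_cutoffs):
    """For each (gene cut-off, UMI cut-off) pair, return (cells kept, fraction with `label`)."""
    table = {}
    for g in gene_cutoffs:
        for u in umi_cutoffs:
            kept = [c for c in cells if passes_qc(c, g, u)]
            if kept:
                frac = sum(c["label"] == label for c in kept) / len(kept)
            else:
                frac = float("nan")
            table[(g, u)] = (len(kept), frac)
    return table

# Toy cells (assumption); real input would be per-cell QC metrics and annotations.
cells = [
    {"n_genes": 550, "n_umis": 1200, "label": "tsbCAF"},
    {"n_genes": 700, "n_umis": 3000, "label": "tsbCAF"},
    {"n_genes": 900, "n_umis": 8000, "label": "T"},
    {"n_genes": 2500, "n_umis": 26000, "label": "T"},
    {"n_genes": 1500, "n_umis": 12000, "label": "other"},
]

table = fraction_by_threshold(cells, "tsbCAF", [500, 600, 800], [25000, 30000])
for (g, u), (n_kept, frac) in sorted(table.items()):
    print(f">{g} genes, <{u} UMIs: {n_kept} cells kept, tsbCAF fraction {frac:.2f}")
```

Run per sample on the real data, the same sweep would indicate whether tsbCAF or CXCL13⁺ T-cell abundance estimates are stable around the chosen thresholds.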
In the methods section on batch correction (“we implemented the BBKNN pipeline to obtain a batch-corrected space”), it would help to provide more detail on the parameter settings (e.g., number of neighbors, number of PCs, metric) and to show at least one diagnostic plot or quantitative metric (e.g., kBET, LISI) indicating that cell-type structure is preserved while batch effects are reduced. Given that your key readout is relative cell-type composition across samples, it would also be useful to comment on whether batch correction was applied before or after cell-type frequency estimation and how you avoided artificially homogenizing sample-specific composition differences.
The MetaCell-based “computational gating strategy” for fibroblast subsets is an interesting and somewhat non-standard step, but the current description is quite dense. Could you add a short schematic or pseudo-code description that clarifies: (i) how metacells were defined (k, minimal metacell size, coverage thresholds), (ii) how marker genes were selected for gating (e.g., thresholds on average expression and specificity), and (iii) how you handled metacells expressing overlapping signatures (e.g., partial MMP1 and COL18A1 expression)? This will make it easier for others to reproduce or adapt your approach to different cell types.
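To indicate the level of detail that would help, here is a minimal sketch of one possible gating rule. It is not the authors' procedure: the thresholds `min_expr` and `min_margin`, the AND-gate scoring, and the `GATES` table are all assumptions chosen for illustration. Metacells whose top two gate scores are too close are flagged as ambiguous rather than forced into a subset.

```python
# Hypothetical gates: each subset is defined by one or more marker genes.
GATES = {
    "MMP1+": ["MMP1"],
    "COL18A1+": ["COL18A1"],
    "FAP+ACTA2+": ["FAP", "ACTA2"],
}

def gate_metacell(expr, gates=GATES, min_expr=1.0, min_margin=0.5):
    """Assign a metacell to a fibroblast subset.

    expr: dict mapping gene -> mean log-expression in the metacell.
    A gate's score is the minimum over its markers (all markers must be expressed).
    Returns a gate label, "ambiguous", or "unassigned".
    """
    scores = {
        label: min(expr.get(gene, 0.0) for gene in genes)
        for label, genes in gates.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_label, best_score = ranked[0]
    runner_up_score = ranked[1][1]
    if best_score < min_expr:
        return "unassigned"  # no gate is convincingly expressed
    if best_score - runner_up_score < min_margin:
        return "ambiguous"  # overlapping signatures, e.g. partial MMP1 and COL18A1
    return best_label

print(gate_metacell({"MMP1": 3.0, "COL18A1": 0.2, "FAP": 0.5, "ACTA2": 0.1}))
print(gate_metacell({"MMP1": 2.0, "COL18A1": 1.8}))
print(gate_metacell({"MMP1": 0.3}))
```

Stating the actual thresholds and the rule applied to overlapping metacells in a comparably explicit form would make the gating reproducible.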
When you compare MMP1⁺ fibroblasts with FAP⁺ACTA2⁺ CAFs and COL18A1⁺ fibroblasts (“MMP1⁺ fibroblasts upregulated matrix metallopeptidase-associated genes”), it would be informative to provide a more quantitative summary of the ECM-related programs, for example using pathway enrichment or gene set scores for collagen deposition, matrix degradation and stiffness-related signatures. Have you checked whether tsbCAFs are enriched for any known CAF subtypes from Lavie et al. or Grout et al., or whether they represent a distinct axis orthogonal to the existing iCAF/myCAF/apCAF classification?
In validation cohort 1, the on-treatment biopsies were collected under combination regimens (PD-1 or PD-L1 blockade plus chemotherapy), which can substantially reshape the TME by themselves. Could you clarify the timing of biopsies relative to both immunotherapy and chemotherapy cycles, and comment on whether chemotherapy-only effects on CAFs and T cells could confound your attribution of tsbCAF changes to ICB response? If possible, even a simple analysis stratified by chemotherapy regimen or number of cycles would help gauge how robust the tsbCAF–response association is to these clinical variables.
For the comparison of cell-type frequencies in validation cohort 1 (“The Dirichlet-multinomial regression model was utilized to test for differences in cell composition”), some additional detail would be useful. What covariates were included in the model (e.g., total cell count per sample, batch, patient ID, biopsy timing)? How did you correct for multiple testing across all clusters, and are the reported P values FDR-adjusted? A short summary of the model formula and the software/package used would help readers assess the strength of the compositional evidence.
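On the multiple-testing point, the Benjamini-Hochberg procedure is one common choice; whether it was used here is not stated, so the sketch below is only an example of what "FDR-adjusted" would mean across clusters. The cluster names and raw P values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of the adjusted values.
    for steps_from_end, idx in enumerate(reversed(order)):
        rank = m - steps_from_end
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented raw P values for four clusters (assumption).
clusters = ["cluster_A", "cluster_B", "cluster_C", "cluster_D"]
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)

for name, p, q in zip(clusters, raw_p, adj_p):
    print(f"{name}: raw P = {p:.3f}, BH-adjusted = {q:.3f}")
```

Reporting both raw and adjusted values per cluster, together with the number of clusters tested, would let readers judge the compositional evidence directly.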
In the spatial analyses, you mention that MMP1⁺ fibroblasts were observed only in certain sections (“we still identified MMP1⁺ fibroblasts in a lung tumor section”) and that only one of the two MERFISH samples contained MMP1⁺ cells. Could you discuss more explicitly how representative these sections are of the broader cohort, and whether you attempted any quantitative spatial statistics (e.g., Ripley’s K, cross-K functions, neighborhood enrichment) to formally assess the “single-cell layer” description rather than relying on visual impression alone? Even negative or inconclusive attempts would help calibrate how far the spatial evidence can currently be pushed.
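As an example of the kind of statistic meant here, the sketch below computes a cross-type Ripley's K (without edge correction) between two cell types and compares it with a label-permutation null. The coordinates are synthetic (a tumor disk, a ring of hypothetical tsbCAFs just outside it, and scattered other cells), and the radius, counts, and function names are assumptions for the example.

```python
import math
import random

def cross_k(points_a, points_b, radius, area):
    """Cross-type Ripley's K at one radius, without edge correction."""
    close_pairs = sum(
        1
        for ax, ay in points_a
        for bx, by in points_b
        if math.hypot(ax - bx, ay - by) <= radius
    )
    return area * close_pairs / (len(points_a) * len(points_b))

def label_permutation_test(coords, labels, type_a, type_b, radius, area,
                           n_perm=200, seed=0):
    """Return (observed K, one-sided P value) under random relabeling of cells."""
    rng = random.Random(seed)

    def statistic(current_labels):
        a = [c for c, lab in zip(coords, current_labels) if lab == type_a]
        b = [c for c, lab in zip(coords, current_labels) if lab == type_b]
        return cross_k(a, b, radius, area)

    observed = statistic(labels)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)

# Synthetic 100 x 100 section (assumption), centered on a tumor nest.
rng = random.Random(1)
coords, labels = [], []
for _ in range(150):  # tumor cells, uniform in a disk of radius 20
    r, theta = 20 * math.sqrt(rng.random()), rng.uniform(0, 2 * math.pi)
    coords.append((50 + r * math.cos(theta), 50 + r * math.sin(theta)))
    labels.append("tumor")
for k in range(30):  # hypothetical tsbCAFs on a ring of radius 21
    theta = 2 * math.pi * k / 30
    coords.append((50 + 21 * math.cos(theta), 50 + 21 * math.sin(theta)))
    labels.append("tsbCAF")
while labels.count("other") < 400:  # other cells, outside radius 22
    x, y = rng.uniform(0, 100), rng.uniform(0, 100)
    if math.hypot(x - 50, y - 50) > 22:
        coords.append((x, y))
        labels.append("other")

observed_k, p_value = label_permutation_test(
    coords, labels, "tsbCAF", "tumor", radius=5, area=100 * 100
)
print(f"cross-K at r=5: {observed_k:.1f} (pi*r^2 = {math.pi * 25:.1f}), P = {p_value:.3f}")
```

Note that a statistic of this type only tests enrichment of MMP1⁺ fibroblasts near tumor cells. Supporting the "single-cell layer" description specifically would additionally require something like the distribution of fibroblast distances to the nearest tumor boundary.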
The choice of F2R as an IHC surrogate for tsbCAFs is reasonable but also clearly imperfect given its expression in endothelial cells and COL18A1⁺ CAFs. It would help to more systematically describe how many NSCLC cases in the Human Protein Atlas display the “one-layer peritumoral” F2R staining pattern, and whether these regions overlap with endothelial markers (e.g., CD31, CLDN5) or collagen-rich stromal regions. If dual-marker data are not available, please emphasize this limitation more clearly and consider suggesting which multiplex IHC or imaging mass cytometry panels would best validate tsbCAF identity in future work.
In the Discussion you propose that tsbCAFs may drive T-cell exclusion through ECM remodeling and specific 3D matrix architectures, which is plausible but quite high-level. Could you briefly comment on how your tsbCAF program overlaps with published ECM-related drivers of immune exclusion, such as DDR1-dependent collagen alignment or LOX-mediated matrix cross-linking, and whether any of these pathways are specifically upregulated in your tsbCAF cluster compared with other CAF populations? This might help prioritize candidate mechanisms and possible therapeutic targets for follow-up.
Some terminology would benefit from standardization and minor clarification. For example, you refer to “MMP1⁺ fibroblasts,” “MMP1⁺ CAFs,” and “tsbCAFs” at different points—would you consider clearly defining these terms once and then consistently using one main label (e.g., tsbCAFs) throughout the main text? Similarly, when you say “tumors with high levels of tumor-reactive T cells,” it would be useful to remind the reader in the figure legends that this refers to the top tertile of both CXCL13⁺ CD4 and CXCL13⁺ CD8 T-cell fractions, to avoid confusion with total T-cell infiltration.
The author declares that they have no competing interests.
The author declares that they did not use generative AI to come up with new ideas for their review.