
PREreview of Shotgun metagenomic analysis of the skin mucus bacteriome of the common carp (Cyprinus carpio)

Published
DOI: 10.5281/zenodo.12608639
License: CC BY 4.0

Papp et al. apply shotgun metagenomic sequencing to the skin mucus of the common carp. Their key findings include:

  • The bacteriome of carp skin mucus resembles that of the surrounding pond water.

  • A characterization of the predominant community composition, especially at the phylum level.

  • Insight into some of the functional potential of the microbiome.

Applying shotgun metagenomic sequencing to fish for microbial detection is challenging, so the authors took on a difficult task. Given the limitations of their data, it is commendable that they were able to assemble a coherent narrative and relate their biological findings to those in the field.

Major issues

  • Improve the introduction: The introduction doesn’t currently give the reader enough contextual information about the project and its challenges or about the potential usefulness of the results. I’ve included some suggestions below that might improve the intro.

    • Include a methodological discussion of why using shotgun metagenomic sequencing on fish samples is difficult for microbial cataloging.

    • A discussion of the differences between wild and farmed fish and how those might affect the findings or the methodology used in the study. In particular, how does the microbial load differ? How is the microbiome of the water different?

    • Can the microbiome be pathogenic, or is it only ever beneficial?

    • Specific examples of the economic importance of carp. It would also be helpful to know why this study would benefit the farmed carp industry.

  • Introduce the study's limitations earlier: Currently, the reader doesn't find out that less than 1% of the sample is bacterial until the results section. It would be very helpful to read this result in the abstract, as I think it would contextualize the paper better. It would also be helpful to know both the percentage and the number of reads that were carp.

  • Potential room for improvement in the methodological approach: As I understand it, the authors quality-controlled their reads, used Kraken to identify bacterial reads, and then analyzed those reads. I think this might dramatically undersample the bacterial reads present. I would take a different approach; below are some ideas:

    • De novo assembly approach 1: Assemble all reads. Use a tool like diamond to assign a taxonomy to each contig (see the diamond sketch after this list for building a taxonomy-aware database). Filter out the carp contigs and analyze the non-carp contigs (which might also include archaea and fungi).

    • De novo assembly approach 2: Map the reads to the common carp genome, assemble anything that doesn't map, and then work with those contigs (see the sketch after this list).

    • Mapping approach: Use a tool like genome-grist to identify which microbes are in the sample. genome-grist first uses sourmash to determine which microbial genomes are present, then downloads those genomes and maps the reads back to them (see the sketch after this list). This might provide a fuller picture of what's in the sample, given that I expect the assembly would be highly fragmented. Note: I am an author of the sourmash tool, so this suggestion comes with a conflict of interest :)

      * It would be helpful to see the analysis code for this project. Would it be possible to upload it to a GitHub repository?
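
For de novo assembly approach 1, a minimal sketch of the diamond step might look like the following. This is only a sketch, assuming NCBI nr and its taxonomy files as the reference; the file names are placeholders, and the contigs would come from whatever assembler the authors choose.

    # Build a taxonomy-aware diamond database (input file names are placeholders)
    diamond makedb --in nr.faa.gz --db nr \
        --taxonmap prot.accession2taxid.gz \
        --taxonnodes nodes.dmp \
        --taxonnames names.dmp

    # Assign a taxon to each assembled contig by LCA of its protein hits;
    # --outfmt 102 reports the contig name, assigned taxid, and e-value
    diamond blastx --db nr --query contigs.fa \
        --outfmt 102 --out contig_taxonomy.tsv

Contigs assigned to Cyprinus (or other eukaryotes) could then be filtered out and the remaining contigs treated as the microbial fraction.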
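
For de novo assembly approach 2, a rough sketch could be the following; the tool choices (minimap2, samtools, MEGAHIT) are mine rather than the authors', and the reference and read file names are placeholders.

    # Map reads to the common carp reference genome and name-sort the alignments
    minimap2 -ax sr carp_genome.fa reads_R1.fq.gz reads_R2.fq.gz | \
        samtools sort -n -o aligned.bam -

    # Keep read pairs where neither mate mapped to carp (-f 12 = read unmapped and mate unmapped)
    samtools fastq -f 12 -1 nonhost_R1.fq.gz -2 nonhost_R2.fq.gz aligned.bam

    # Assemble only the non-host reads
    megahit -1 nonhost_R1.fq.gz -2 nonhost_R2.fq.gz -o nonhost_assembly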
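
For the mapping approach, the sourmash step that genome-grist automates might look roughly like this. The read and database file names are placeholders (sourmash provides prepared databases, e.g. GTDB collections); genome-grist itself is driven by a configuration file and additionally downloads the matched genomes and maps the reads back to them.

    # Sketch the quality-controlled reads (reads.fq.gz is a placeholder)
    sourmash sketch dna -p k=31,scaled=1000,abund reads.fq.gz -o sample.sig

    # Report which genomes in the reference database are detectably present
    sourmash gather sample.sig gtdb-reps-k31.zip -o gather_results.csv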

Minor issues

  • Ideas for how this work could be used by others

    • Marker gene panel: Could the data in this study be used to generate a marker gene panel for taxonomic profiling? Such a panel would include more genes than 16S rRNA alone while helping to limit host contamination. How could this type of panel be used by the field? Would many researchers need to adopt it for it to be useful? A tool like singleM might help identify the marker genes (and the taxonomic composition) in the sample (see the sketch after this list).

    • Relate the microbiome to the host: Is there enough data in this study to relate the microbiome to the host genome (SNPs, copy number variation, etc.)? If not, how much more data would need to be collected to enable this? What lessons could we learn from such a study to help the field?

  • It would be helpful if you could report how many reads were lost when filtering with TrimGalore. Also, note that fastp is now generally the standard in the field for quality control (it would replace both FastQC and TrimGalore; see the sketch after this list). I'm not suggesting that the authors use a different tool, though.
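
For the marker gene idea, a rough singleM run might look like the following. This is a sketch only: the read file names are placeholders, and the exact option names vary between singleM versions.

    # Profile the community using single-copy marker genes
    # (option spellings may differ by version, e.g. --otu-table vs --otu_table; check `singlem pipe --help`)
    singlem pipe --forward reads_R1.fq.gz --reverse reads_R2.fq.gz \
        --otu-table sample_otu_table.csv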
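
On the fastp note above, a typical invocation looks like the following; its JSON/HTML reports include before- and after-filtering read counts, which would also answer the question about how many reads were lost. File names are placeholders.

    # Adapter- and quality-trim paired reads; the reports include read counts before/after filtering
    fastp -i raw_R1.fq.gz -I raw_R2.fq.gz \
          -o trimmed_R1.fq.gz -O trimmed_R2.fq.gz \
          --json fastp_report.json --html fastp_report.html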

Competing interests

The author declares that they have no competing interests.