
PREreview of Implementing Code Review in the Scientific Workflow: Insights from Ecology and Evolutionary Biology

DOI: 10.5281/zenodo.8122424
License: CC BY 4.0

This review was written as part of a community of practice supported by ASAPbio and BIOS2. One of the authors, GH, is a signatory of Publish Your Reviews and has committed to publishing her peer reviews alongside the preprint version of an article. For more information, see publishyourreviews.org.

Authors’ background:

  1. GH: theoretical ecology, open science, open data, ecological networks

  2. VC: plant ecology, conservation biology, taxonomy, ecological modelling

  3. MG: plant biology, seed science and technology, agronomy, urban evolutionary biology

  4. KH: biodiversity change, indicators, population dynamics, island biogeography

  5. TS: network ecology, computational ecology, prediction

---

Summary:

The authors do a great job of filling a gap in the literature on the importance of reviewing code (beyond simply sharing it), without jumping straight to ‘it should be a part of the peer review process’; instead, they show that code review can happen at a smaller, pre-review scale (in small groups and through voluntary reviews). They delineate a way to implement code review in ecology and evolutionary biology projects, including how to review code, how to set up projects to facilitate code review, and ideas for how this could be implemented during the research process.

The goal is to encourage and envision the implementation of code review to improve the reliability of eco-evo research, which increasingly requires software development but rarely involves code review. They do this by defining “the 4 R’s of code review” (is the code as reported, does the code run, is the code reliable, are the results reproducible), which can be applied in any programming language.

The workflow they provide is clear and seems accessible and easy to understand and follow for people with no computational background. For example, they provide a useful graphical summary at the end of each main section that helps readers consolidate what they have read so far. The manuscript also covers some common tools used in ecological research (GitHub, R) and common data management/sharing standards (e.g., metadata).

Big-picture comments

Good/Excellent Things

  • The manuscript is generally clear and readable and touches on key points of code review and sharing. The implementation ideas are concrete and feel accessible.

  • We love how they build a picture of reviewing code throughout the manuscript's “lifetime” - it’s not a common view in the publishing system, but it’s very important to avoid a crisis of unmaintained software.

  • The R’s of code review are defined clearly and concisely, and nicely visualised in Figure 1. 

  • The manuscript is full of linked resources to help readers set up a code review, which makes this feel achievable (it could also be made into a table to make it a shareable resource).

  • We really liked the discussion in ‘Output Reproducibility’ (starting l. 235) and the point that we might not expect exactly identical outputs when running some sort of random simulation (it’s a minor thing but so easy to forget); see the sketch below.
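
As a small illustration of what we mean (this sketch and its numbers are ours, not the preprint’s), fixing the random seed makes stochastic output repeatable, and tolerance-based comparison is often more appropriate than demanding bit-for-bit identical results:

```r
# Purely illustrative example (ours, not from the preprint): reproducing
# stochastic simulation output. Without a fixed seed, two runs differ;
# with set.seed() they match, and all.equal() compares up to a tolerance.

simulate_growth <- function(n = 100, r = 0.1) {
  # toy stochastic population growth trajectory
  cumprod(1 + rnorm(n, mean = r, sd = 0.05))
}

run1 <- simulate_growth()
run2 <- simulate_growth()
identical(run1, run2)    # FALSE: no seed, so the runs differ

set.seed(42)
run3 <- simulate_growth()
set.seed(42)
run4 <- simulate_growth()
identical(run3, run4)    # TRUE: same seed, same output

# When exact equality is too strict (e.g. across platforms or library
# versions), compare summary statistics with a numerical tolerance instead.
all.equal(mean(run3), mean(run4), tolerance = 1e-8)
```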

Things to be improved

  • Figure 3 is a useful resource and could also be turned into a (printable?) checklist, instead of (or in addition to) the figure, because those questions would be good to check off as yes/no with comments. The checklist could then easily be converted into a GitHub issue template that anyone could use!

  • We would have liked to see a bit more development of the “Formal code review” (l. 323) section. The authors state (rightly) that adding code review to the peer review process could put more burden on already overworked academics. Can the authors provide any suggestions on how to mitigate the effects of this additional burden, and ways of making the peer review process both sustainable and scientifically robust?

  • We missed a bit of a connection with the broader peer review and publishing systems. How would code review fit into a traditional peer review system? What would need to change to accommodate it properly? We felt this discussion came a little late in the manuscript, so the question stayed in the back of our minds while reading the paper (until the section on “Formal code review”).

  • In some places, the authors could give a specific example to make clear exactly what they mean. For example, in the section “Is the code reliable” on page 5, they could have included “mock” code to give the reader a clearer picture of what unreliable code looks like; a sketch of the kind of example we mean follows below.
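
To illustrate the kind of mock example we have in mind (this sketch is ours, not from the preprint; the function names, data, and threshold are invented), a short contrast between code that quietly depends on global objects and undocumented constants and a parameterised, input-checked version would help readers see what ‘reliable’ means in practice:

```r
# Purely illustrative mock-up (ours, not from the preprint) of what a code
# reviewer might flag as unreliable, next to a more reliable rewrite.

set.seed(2023)
dat <- data.frame(x = runif(200, 0, 10), y = rnorm(200))  # stand-in data

## Unreliable: depends on the global object `dat`, uses an undocumented
## magic number, and gives no hint of what it assumes about its input.
clean_unreliable <- function() {
  dat[dat$x > 3.7, ]            # why 3.7? no units, no explanation
}

## More reliable: data and threshold are explicit, the threshold is
## documented, and obviously wrong inputs fail loudly.
clean_reliable <- function(data, x_threshold = 3.7) {
  # x_threshold: minimum x value retained (e.g. a detection limit, in cm)
  stopifnot(is.data.frame(data), "x" %in% names(data))
  data[data$x > x_threshold, ]
}

nrow(clean_unreliable())        # works, but only by accident of scope
nrow(clean_reliable(dat))       # same result, but self-contained and checkable
```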

Small-picture comments

  • L. 176: The sentence seems to be missing a piece.

  • Check the stability/longevity of links, e.g. GitHub issues can be closed or deleted; a specific example is at ll. 276-277.

  • Fig 2: 

    • Missing a question mark in “Does the code match the methods”.

    • There is an argument that all errors should be reported as issues, no matter their complexity; even if you fix an error yourself, file a PR.

  • L. 85: There is a question mark where there should be a full stop.

  • Maybe consider asking for consent from the authors when referring to their GitHub issues as a use case.

Competing interests

KH is co-authoring a paper with Matthew Grainger, who is on the author list. GH has collaborated on a peer review with Elliot Gould.