PREreview of The Contributor Role Taxonomy (CRediT) at ten: a retrospective analysis of the diversity of contributions to published research output
- Published
- DOI: 10.5281/zenodo.18459736
- License: CC BY 4.0
Review of “The Contributor Role Taxonomy (CRediT) at ten: a retrospective analysis of the diversity of contributions to published research output”
This review was the result of a live review session and a period of asynchronous review that included Alan Colín-Arce as an additional discussion participant.
Summary
This preprint examines global adoption of the Contributor Role Taxonomy (CRediT) from 2015 to 2024. Because implementation has been inconsistent across publishers, and because systematic data collection remains challenging, the authors use the Dimensions database and a multi-stage text-extraction approach via Google BigQuery (GBQ). They identify more than 3.2 million research outputs (articles, preprints, and conference papers) that include CRediT roles and report accelerating uptake, with CRediT appearing in 22.5% of 2024 publications where full text is available. The manuscript also aims to situate CRediT within responsible research assessment discussions by aligning it with DORA and CoARA and emphasizing that author position is not a reliable indicator of contribution. A validation exercise reports a 0.971 matching score when comparing roles across paired XML/PDF files, suggesting the reliability of the computational workflow. Further, the analysis documents variation by publisher, country, and discipline, with figures that suggest additional patterns—such as a recent decline in MDPI’s reported CRediT usage and decreasing reporting rates for several roles over time—that are not yet explored in depth in the manuscript.
Strengths
A major strength of this manuscript is its scale and scope. It provides an impressive census of CRediT’s first decade, drawing on more than 3.2 million publications and mapping temporal, disciplinary, geographic, and publisher-level patterns of uptake. The paper is also timely and well-positioned within broader responsible research assessment discussions, emphasizing that author position is not a reliable indicator of contribution and noting CRediT’s relevance for surfacing “hidden” technical, software, and data-centric work.
The authors also make a commendable effort toward transparency by sharing processed data and Python scripts for figure generation via Figshare (doi:10.6084/m9.figshare.28816703). The coding notebook is well-documented and includes helpful explanatory comments.
Major issues
While the study’s objective is clearly stated, the absence of explicitly formulated research questions weakens the analytical framing and makes it difficult to assess whether the methods and results are optimally aligned with the study’s aims.
As the analysis supports a published Commentary, this relationship should be stated explicitly in the introduction, with a full citation of the Commentary included.
The authors should also better justify their methodological choices and the use of Dimensions as the primary data source. Relatedly, it would be helpful to provide more detail about the Dimensions database (its content, strengths relative to Scopus, Web of Science or OpenAlex, and limitations) and about Google BigQuery (GBQ), including a short explanation of what it is and how it supports large-scale querying and analytics.
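To make this concrete for readers unfamiliar with GBQ, even a brief illustration of the kind of query it enables would help. The sketch below is ours rather than the authors' pipeline, and the project, dataset, table, and column names are hypothetical placeholders:

```python
# Illustrative only: a hypothetical GBQ query for CRediT role phrases in full text.
# Project, dataset, table, and column names are placeholders, not the authors' schema.
from google.cloud import bigquery

client = bigquery.Client()  # assumes Google Cloud credentials are configured

query = """
SELECT id, publisher, year
FROM `my_project.dimensions_full_text.publications`   -- hypothetical table
WHERE REGEXP_CONTAINS(LOWER(full_text), r'conceptualization|data curation|formal analysis')
  AND year BETWEEN 2015 AND 2024
"""

# BigQuery executes the scan server-side, so millions of full-text records can be
# searched without downloading them; only the matching rows are returned.
matches = client.query(query).to_dataframe()
print(f"{len(matches)} publications matched the illustrative pattern")
```

A paragraph-length description along these lines would help readers judge the feasibility and limitations of the approach.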
A central interpretive concern is that the analysis is shaped by the restriction to publications in Dimensions where “full-text is available to search.” If full-text availability varies systematically by publisher, discipline, or region, this may introduce bias in precisely the strata used to interpret adoption patterns. We recommend that the authors provide quantitative information (e.g., a diagram or table) and/or a qualitative discussion of full-text coverage over time and across major strata (publisher, field, country). This would help readers assess whether observed growth reflects, at least in part, improvements in full-text ingestion and Open Access availability rather than CRediT adoption alone.
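As a sketch of the kind of coverage summary we have in mind (assuming a per-publication metadata export with hypothetical columns year, has_full_text, and has_credit, not the authors' actual data structure), separating full-text coverage from CRediT detection could look roughly like this:

```python
# Illustrative sketch: separate growth in full-text coverage from growth in CRediT
# detection. Assumes a hypothetical per-publication export with columns
# year, has_full_text (bool), and has_credit (bool, meaningful only where full text exists).
import pandas as pd

pubs = pd.read_csv("dimensions_metadata_sample.csv")  # hypothetical export

by_year = pubs.groupby("year").agg(
    n_publications=("has_full_text", "size"),
    full_text_share=("has_full_text", "mean"),
)

# CRediT detection rate among the searchable subset only
searchable = pubs[pubs["has_full_text"]]
by_year["credit_share_of_searchable"] = searchable.groupby("year")["has_credit"].mean()

print(by_year)
```

A similar breakdown by publisher, field, or country would let readers see where rising CRediT counts simply track rising full-text availability.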
Several core results would also benefit from stronger normalization and interpretation. The analysis is often presented in absolute numbers, which can be misleading without normalization by each country’s total output, disciplinary output, or total publisher output. Normalizing selected results would help readers to identify which entities are truly playing leadership roles in the adoption of CRediT and to interpret the activities of dominant players (China, Elsevier, MDPI, the U.S.) more reliably.
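A minimal sketch of the normalization we suggest, with counts invented purely to illustrate the arithmetic:

```python
# Illustrative normalization: CRediT adoption as a share of each country's total output.
# The numbers below are invented for illustration only.
import pandas as pd

counts = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C"],
    "credit_outputs": [900_000, 600_000, 60_000],
    "total_outputs": [9_000_000, 5_000_000, 400_000],
})

counts["credit_share"] = counts["credit_outputs"] / counts["total_outputs"]

# Ranking by share rather than raw counts can change which entities appear to lead
# CRediT adoption (here, Country C leads on share despite the smallest raw count).
print(counts.sort_values("credit_share", ascending=False))
```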
The manuscript also includes notable descriptive patterns that warrant deeper probing. For example, the decline in MDPI’s CRediT usage (Fig 3A) requires further investigation to determine whether it reflects a policy shift, changes in publication volume, or a technical artifact in algorithmic detection. Although these patterns are visible in the figures, they are not explicitly analyzed in the Results or Discussion sections. Similarly, declining role reporting rates over time (Fig 2) remain under-examined, including whether these patterns reflect changes in enthusiasm, normalization, reporting practices, or detection. More broadly, the Results and Discussion sometimes attribute trends to publisher adoption and systems integrations. While plausible, the evidence is observational and supports association rather than causation, so tightening language to avoid causal interpretations would reduce the risk of overclaiming. At times, the manuscript moves from descriptive analysis to normative or advocacy-oriented claims without clearly distinguishing between empirical findings and interpretive claims.
Conceptually, the manuscript would be strengthened by greater reflexivity about what “adoption” means in practice. The manuscript implicitly assumes a relatively linear model of research dissemination (from contribution to publication to assessment), whereas preprints and contributorship statements may serve heterogeneous purposes that complicate this trajectory. The analysis at times risks conflating recorded uptake of CRediT with meaningful or ethical use, without sufficiently discussing whether reported roles reflect contributor practices, power asymmetries within teams, or strategic compliance driven by publisher mandates. A dedicated limitations section (or an expanded limitations discussion) that addresses these issues would improve analytical depth and credibility, especially when making claims about global uptake and equity-related interpretations.
Methodologically, further transparency is needed regarding the NLP pipeline, including spaCy model configuration, role-disambiguation rules, and separate reporting of precision and recall. Please provide the matching formula in text or supplemental materials. It would also help to discuss how validation based on recent, predominantly English-language data may limit generalizability. Reproducibility would be further enhanced by more granular detail on the specific search algorithms used. While open code and data are a strength, the Figshare repository does not include the full raw dataset (it provides summary statistics and a validation subset), and sharing a small representative sample of raw data and/or additional metadata could further strengthen reproducibility and reuse.
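To illustrate the level of detail we have in mind (an assumed formulation, not necessarily the metric the authors used), a set-based comparison of roles extracted from paired XML and PDF files could report precision, recall, and a Jaccard-style matching score along these lines:

```python
# Illustrative sketch (not the authors' formula): per-paper agreement between roles
# extracted from the XML and PDF versions of the same publication, treating the
# XML-derived set as reference and the PDF-derived set as prediction.
def role_agreement(xml_roles: set[str], pdf_roles: set[str]) -> dict[str, float]:
    intersection = xml_roles & pdf_roles
    union = xml_roles | pdf_roles
    precision = len(intersection) / len(pdf_roles) if pdf_roles else 1.0
    recall = len(intersection) / len(xml_roles) if xml_roles else 1.0
    matching = len(intersection) / len(union) if union else 1.0  # Jaccard-style score
    return {"precision": precision, "recall": recall, "matching": matching}

# Toy example with hypothetical extractions for one paper
xml = {"Conceptualization", "Methodology", "Writing – original draft"}
pdf = {"Conceptualization", "Methodology", "Software"}
print(role_agreement(xml, pdf))  # precision 0.67, recall 0.67, matching 0.50
```

Reporting precision and recall separately, alongside the definition of the 0.971 score, would make the validation considerably easier to interpret.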
Also, please provide complete citations for all sources rather than bare URLs.
Finally, we note that two of the authors are affiliated with Digital Science, the company behind Dimensions and involved in the introduction of the CRediT taxonomy. Given the centrality of Dimensions to the analysis, we recommend that this affiliation be explicitly disclosed in the competing interests statement, alongside disclosure of other relevant roles (e.g., NISO Standing Committee on CRediT).
Conflict of Interest
No conflicts of interest were identified by the review authors.
This review represents the opinions of the authors and does not represent the position of Future of Research Evaluation and e-Scholarship (FORCE11) as an organization.
Use of Artificial Intelligence (AI)
The authors declare that they did not use generative AI to come up with new ideas for their review.
Competing interests
The authors declare that they have no competing interests.