PREreview of Changes in Manuscript Length, Research Team Size, and International Collaboration in the Post-2022 Period: Evidence from PLOS ONE

by Jennifer Miller, Ifeanyichukwu Akuma, Guleda Dogan, Rosario Rogel-Salazar, Alan Colin-Arce, Randa Salah Gomaa Mahmoud, and Xiuqi Li of the Future of Research Communication and e-Scholarship (FORCE11)

Published: April 10, 2026
DOI: 10.5281/zenodo.19500176
License: CC BY 4.0

This review was the result of a live review session and a period of asynchronous review.

Summary:

The preprint examines changes in the structural characteristics of articles published in PLOS ONE between 2019 and 2025, with particular attention to 2022 as a potential inflection point. Drawing on a large corpus (over 100,000 articles), the study analyzes variations in manuscript length, team size, reference counts, and collaboration patterns between authors classified as native and non-native English speakers (NES/NNES). This classification is constructed based on authors’ institutional affiliations—specifically, their association with institutions in English-speaking countries—as a proxy for linguistic background.

Methodologically, the study combines large-scale quantitative analysis, including temporal comparisons, statistical significance testing, and trend modeling, along with automated procedures for author classification and corpus processing. The study also incorporates open science practices by making data, prompts, and analytical materials publicly available.

The findings indicate a sustained increase in manuscript length, a modest convergence between the NES and NNES groups in text length, a reduction in the number of co-authors, and shifts in collaboration patterns—including a decrease in the proportion of NES authors in NNES-led papers. Additionally, the study reports a moderate increase in the number of references, although without substantial differences between groups. The manuscript suggests that these changes may be partially associated with the adoption of generative AI tools following 2022; however, it also acknowledges important limitations in establishing direct causal relationships and the potential influence of alternative explanatory factors.

List of major concerns and feedback:

The following major concerns reflect issues that, if not adequately addressed, may affect the interpretation and robustness of the study’s findings. These comments focus on key aspects of conceptual framing, methodological validity, and analytical transparency. While the reviewers recognize the value and relevance of the work, addressing these points would significantly strengthen the clarity, rigor, and interpretive balance of the manuscript.

Interpretive scope and causal attribution of findings: The manuscript suggests that the changes observed after 2022 may be associated with the adoption of generative AI tools. However, the study design is observational and does not include a direct measure of LLM use, nor a strategy to isolate its effects from other concurrent factors. This limits the ability to establish causal relationships and may lead to an overemphasis on the role of AI in interpreting the results.

Suggestion: The discussion and conclusions should more clearly reflect the correlational nature of the findings. The argument could be strengthened by incorporating complementary analyses (e.g., more direct proxies of AI use, comparisons across journals with different profiles, or finer-grained temporal segmentation) or by expanding the discussion of alternative explanations.

Conceptual validity of the NES/NNES classification: The classification of authors as NES/NNES is based on institutional affiliation in some English-speaking countries as a proxy for linguistic background. This assumption introduces important ambiguities, as institutional affiliation does not necessarily reflect an author’s linguistic trajectory. Moreover, this approach may conflate linguistic, geographic, and institutional dimensions, affecting the interpretation of differences between groups.

Suggestion: The manuscript should explicitly problematize this proxy in both the methods and discussion sections, acknowledging its conceptual limitations. Where possible, the authors could consider complementary indicators or, at a minimum, refine the terminology (e.g., avoiding direct equivalence with “native speaker”) to more accurately reflect what the variable captures.

Limited consideration of alternative explanations and corpus context: Although the manuscript acknowledges some limitations, the discussion does not sufficiently develop alternative explanations for the observed changes, such as disciplinary differences, pandemic-related effects, shifts in collaboration patterns, or journal-specific editorial dynamics. Given the multidisciplinary nature of PLOS ONE, these factors may significantly influence variables such as manuscript length, number of authors, and reference counts.

Suggestion: The discussion would benefit from a more systematic consideration of these factors and, where feasible, additional stratified analyses or controls to assess their relative influence. This would support a more robust and context-sensitive interpretation of the findings.

Transparency and reproducibility of the analysis: The manuscript adopts valuable open science practices by sharing data, prompts, and analytical materials. However, some elements limit full reproducibility, including the absence of complete analysis scripts, limited repository documentation (e.g., README), and insufficient detail on missing data handling and validation procedures. In addition, there appear to be discrepancies between the number of records reported and those available in the dataset.

Suggestion: The authors are encouraged to provide full analysis scripts (e.g., in R or Python), improve repository documentation, and more clearly describe data cleaning, validation, and missing data procedures. These steps would significantly strengthen transparency and reusability.

Interpretation of results relies predominantly on statistical significance: Given the large sample size, statistical significance may not necessarily indicate substantive relevance. Some reported findings, while statistically significant, appear to have small effect sizes, making it difficult to assess their practical implications for scientific writing practices.

Suggestion: The analysis would be strengthened by incorporating effect size measures and explicitly discussing the substantive significance of the observed changes. This would support a more balanced interpretation and help avoid overstating the implications of the findings.
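To illustrate the point, the sketch below shows how a very small standardized difference can still yield a tiny p-value when samples are large. It is only an illustration: the means, standard deviations, and sample sizes are hypothetical and are not taken from the preprint, and the function names are ours. Cohen's d and a large-sample z-test are used here as one possible pairing, not as the analysis the authors performed.

```python
import math


def cohens_d_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d (group B minus group A) using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)


def z_test_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p-value from a large-sample z-test on the difference in means."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_b - mean_a) / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical word-count summaries for two periods (NOT values from the preprint).
before = dict(mean_a=5000, sd_a=2000, n_a=50_000)
after = dict(mean_b=5100, sd_b=2000, n_b=50_000)

d = cohens_d_from_summary(**before, **after)
p = z_test_p_value(**before, **after)
print(f"Cohen's d = {d:.3f}, two-sided p = {p:.2e}")
```

With these made-up inputs the difference is highly significant even though d is about 0.05, well below the conventional threshold for a "small" effect. Reporting both quantities side by side would let readers judge substantive relevance for themselves.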

List of minor concerns and feedback:

Clarity and readability of figures: Some figures present issues of interpretation or readability. In particular, the density shown in Figure 1A is difficult to interpret without additional context, and certain graphical elements in other figures (e.g., Figures 5 and 6B) could benefit from improved visual clarity or simplification.

Suggestion: The authors are encouraged to revise the design of the figures to enhance clarity, including more explicit labeling, adjustments to scales, or the addition of brief explanatory notes to guide interpretation.

Incomplete visualization of some results: Some relevant results described in the text, such as patterns in reference volume, are not accompanied by clear graphical representations comparable to those provided for other variables.

Suggestion: Consider adding additional figures or reorganizing existing ones to ensure a more balanced visual representation of the main findings.

Consistency in temporal comparisons: The manuscript alternates between different temporal comparison schemes (e.g., 2019–2025 vs. comparisons centered on 2022), which may create some confusion in interpreting the results.

Suggestion: Clarify the rationale behind each type of temporal comparison and maintain greater consistency in their use throughout the manuscript.

Methodological and statistical details: Certain methodological aspects would benefit from additional clarification, such as the justification of specific statistical tests, assessment of assumptions (e.g., normality), and decisions related to data processing (e.g., whether word counts include or exclude references).

Suggestion: Provide further clarification in the methods section or supplementary materials to support transparency and facilitate evaluation of the analysis.

Some specific things to clarify:

The potential selection bias introduced by using a single journal with an article processing charge (APC) business model (PLOS ONE) and how this may affect NES vs. NNES comparisons.
The statistical adequacy and representativeness of the validation sample (n=100) used for NES/NNES classification (e.g., confidence intervals or margin of error).
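As an example of the kind of uncertainty estimate meant in the second point, the sketch below computes a Wilson score interval for a classification accuracy estimated from a validation sample. The count of 95 correct classifications out of 100 is a hypothetical figure chosen for illustration, not a result reported in the preprint, and the Wilson interval is only one of several reasonable choices for a binomial proportion.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical validation outcome: 95 of 100 authors classified correctly.
low, high = wilson_interval(95, 100)
print(f"Estimated accuracy 0.95, 95% interval: ({low:.3f}, {high:.3f})")
```

Under this assumed outcome the interval spans roughly 0.89 to 0.98, which shows how much uncertainty a sample of 100 leaves around the classifier's accuracy.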

Terminological and stylistic consistency: Minor inconsistencies are present in terminology, abbreviations, and citation style (e.g., use of “et al.”, capitalization of terms), as well as occasional typographical errors.

Suggestion: A careful editorial review would help ensure consistency and precision throughout the manuscript.

Some specific things to clarify:

Delete extra period after “was” in the second paragraph of section 2.4.
Most major style guides do not list more than two authors in an in-text citation; instead, they give the first author’s name followed by “et al.” Check your style guide and follow it consistently.
Spell out abbreviations on first use, include the abbreviation in parentheses, then use the abbreviation throughout the rest of the paper (examples include API, MASS, emmeans).
Page 10, Disciplinary Heterogeneity in Textual Expansion: (2022–2025) should be corrected to (2022 vs. 2025).
Correct capitalization in the References section, especially for country names and software.

Overall Assessment

The manuscript addresses a timely and relevant question regarding potential transformations in scientific writing in the context of adopting generative artificial intelligence tools. Its main strengths lie in the use of a large-scale corpus, the application of quantitative analytical techniques, and the incorporation of open science practices that enhance transparency and data reuse.

However, the interpretation of the findings requires greater caution. In particular, the emphasis on linking the observed changes to LLM adoption is not fully supported by the study design, introducing a risk of overstating their role relative to other concurrent factors. Similarly, the NES/NNES classification raises important conceptual concerns, as it relies on institutional proxies that may oversimplify the diversity of linguistic trajectories and potentially reproduce categories that are not analytically neutral.

The manuscript would also benefit from a more robust discussion of alternative explanations, including disciplinary and contextual differences, as well as strengthened methodological documentation to improve reproducibility.

This is a valuable contribution to an emerging area of research; however, it would benefit from substantive revisions to ensure a more precise, nuanced, and conceptually robust interpretation of its findings.

Concluding remarks

We thank the authors of the preprint for posting their work openly for feedback. We also thank all participants of the Live Review call for their time and for engaging in the lively discussion that generated this review.

This review represents the opinions of the authors and does not represent the position of Future of Research Communication and e-Scholarship (FORCE11) as an organization.

Competing interests

The authors declare that they have no competing interests.

Use of Artificial Intelligence (AI)

The authors declare that they did not use generative AI to come up with new ideas for their review.
