
Write a PREreview

Set-up, validation, evaluation, and cost-benefit analysis of an AI-assisted assessment of responsible research practices in a sample of life science publications

Published
Server
bioRxiv
DOI
10.64898/2026.01.23.701317

The (semi-)automated screening of publications for diverse quality and transparency criteria is at the core of systematic literature assessment. Typically, the assessment process involves two initial reviewers and one additional reviewer for cases that require reconciliation. Here, we explore to what extent this process can be assisted by Large Language Models (LLMs). Specifically, we ask whether LLMs can robustly assess responsible research practices (RRPs) in scientific papers. We employed proprietary LLMs to assess an initial set of 37 papers across ten RRPs. The same papers were also reviewed by three human reviewers. We iteratively redesigned prompts to increase model accuracy relative to human ratings, which we treated as the gold standard. The resulting pipeline was validated on an additional set of 15 papers. We show that LLM accuracy is comparable to single human reviewer performance (90% for the LLM vs 86% for a single human reviewer). However, performance depended strongly on the specific RRP, with accuracy ranging from 40% to 100%. LLMs exhibited an affirmative bias, making more errors when practices were not reported in the papers. Overall, we show how such an approach could potentially replace one human reviewer, enabling AI-assisted assessment of research papers. We discuss how dataset imbalances, validation procedures, and implementation time limit the broad applicability of such approaches. Through this, we develop initial guidance on the utility of proprietary LLMs in evidence synthesis.
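
The evaluation described in the abstract (agreement with a human gold standard, and more errors when a practice is not reported) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the tie-break rule for reconciliation, and the toy ratings below are all hypothetical and chosen only to show how the two quantities could be computed for a single RRP.

```python
def reconcile(first, second, tiebreak):
    """Assumed reconciliation rule: keep the rating two reviewers agree on,
    otherwise defer to a third reviewer. The preprint may use another rule."""
    return first if first == second else tiebreak


def accuracy(pred, gold):
    """Fraction of papers where the model rating matches the gold standard."""
    if len(pred) != len(gold) or not gold:
        raise ValueError("pred and gold must be non-empty and of equal length")
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def error_rate_by_gold_label(pred, gold):
    """Error rate computed separately for papers where the practice is
    reported (gold True) and not reported (gold False)."""
    errors = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for p, g in zip(pred, gold):
        totals[g] += 1
        errors[g] += int(p != g)
    return {
        label: (errors[label] / totals[label] if totals[label] else None)
        for label in (True, False)
    }


# Hypothetical ratings for ONE practice across eight papers (not real data).
gold = [True, True, True, False, False, True, False, True]
llm = [True, True, True, True, False, True, True, True]

overall = accuracy(llm, gold)
rates = error_rate_by_gold_label(llm, gold)
print(f"accuracy: {overall:.2f}")
print(f"error rate when reported: {rates[True]:.2f}")
print(f"error rate when not reported: {rates[False]:.2f}")
```

In this toy case every error falls on papers where the practice is not reported, which is the pattern an affirmative bias would produce. Splitting the error rate by gold label also makes the effect of dataset imbalance visible: when most papers report a practice, overall accuracy can stay high even if the not-reported cases are often misjudged.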

You can write a PREreview of Set-up, validation, evaluation, and cost-benefit analysis of an AI-assisted assessment of responsible research practices in a sample of life science publications. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report organized by a journal.
