
Write a PREreview

All We Also Need Is ABSTAIN: Eliminating Hallucinations via a Single Token

Published
Server: Preprints.org
DOI: 10.20944/preprints202510.1827.v1

Large language models (LLMs) suffer from hallucinations: they confidently generate false information when uncertain. Here we demonstrate that hallucinations stem primarily from the constraint that models must always select a token from a fixed vocabulary, with no mechanism to express uncertainty. We propose and test a simple solution: we add a single ABSTAIN token to the vocabulary and train models to predict it using corruption augmentation, a scalable data augmentation technique in which corrupted inputs are mapped to the ABSTAIN token. In a simple feedforward network tasked with single-token prediction, this approach eliminated hallucinations on unseen data (reducing the hallucination rate from 95% to 0%) while maintaining perfect accuracy on known examples. The same principle also scaled to a real question-answering (QA) model: a distilled BERT fine-tuned on SQuAD abstained on 95% of nonsense questions at the optimal corruption level without a catastrophic reduction in accuracy.
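To make the abstract's core idea concrete, here is a minimal, hypothetical sketch of corruption augmentation on a toy single-token prediction task. The vocabulary size, network shape, optimizer settings, and the choice of "corrupted" inputs (unseen token ids mapped to ABSTAIN) are all assumptions for illustration, not the authors' implementation.

```python
# Toy sketch: append one ABSTAIN token to the vocabulary and train a small
# feedforward network so that inputs outside the known facts map to ABSTAIN.
import torch
import torch.nn as nn

VOCAB = 100                # ordinary tokens 0..99
ABSTAIN = VOCAB            # the single extra token appended to the vocabulary
EMBED, HIDDEN = 32, 64

# "Known" facts: each input token maps deterministically to one target token.
known_x = torch.arange(0, 50)
known_y = (known_x * 7 + 3) % VOCAB

# Corruption augmentation (assumed scheme): inputs not paired with any known
# fact are labeled with the ABSTAIN token.
corrupt_x = torch.randint(50, VOCAB, (200,))
corrupt_y = torch.full((200,), ABSTAIN)

x = torch.cat([known_x, corrupt_x])
y = torch.cat([known_y, corrupt_y])

# Simple feedforward predictor over an extended (VOCAB + 1)-way output.
model = nn.Sequential(
    nn.Embedding(VOCAB, EMBED),
    nn.Flatten(start_dim=1),
    nn.Linear(EMBED, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, VOCAB + 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(300):
    opt.zero_grad()
    loss = loss_fn(model(x.unsqueeze(1)), y)
    loss.backward()
    opt.step()

# Unseen inputs should now route to ABSTAIN instead of a confident wrong token.
test_x = torch.randint(50, VOCAB, (20,))
pred = model(test_x.unsqueeze(1)).argmax(dim=-1)
print("abstain rate on unseen inputs:", (pred == ABSTAIN).float().mean().item())
```

Under these assumptions, the ratio of corrupted to clean examples plays the role of the "corruption level" the abstract refers to: too little corruption and unseen inputs still get confident wrong answers, too much and accuracy on known examples degrades.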

You can write a PREreview of All We Also Need Is ABSTAIN: Eliminating Hallucinations via a Single Token. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a journal-organized peer review report.

Before you start

We will ask you to log in with your ORCID iD. If you do not have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from others with the same or a similar name.

Start now