Large language models (LLMs) suffer from hallucinations: they confidently generate false information when uncertain. Here we demonstrate that hallucinations stem primarily from the constraint that models must always select a token from a fixed vocabulary, with no mechanism to express uncertainty. We propose and test a simple solution: we add a single ABSTAIN token to the vocabulary and train models to predict it using corruption augmentation, a scalable data augmentation technique in which corrupted inputs are mapped to the ABSTAIN token. In a simple feedforward network tasked with single-token prediction, this approach eliminated hallucinations on unseen data, reducing the hallucination rate from 95% to 0%, while maintaining perfect accuracy on known examples. The same principle also scaled to a real question-answering (QA) model: a distilled BERT fine-tuned on SQuAD abstained on 95% of nonsense questions at the optimal corruption level without a catastrophic drop in accuracy.
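
The following is a minimal sketch of the corruption-augmentation idea described above, not the paper's actual training pipeline; the toy vocabulary, the `corruption_level` parameter, and the helper names `corrupt` and `corruption_augment` are illustrative assumptions.

```python
import random

# Hypothetical vocabulary: the last index is reserved for the ABSTAIN token.
VOCAB = ["paris", "london", "tokyo", "berlin", "<ABSTAIN>"]
ABSTAIN_ID = len(VOCAB) - 1

# Toy supervised pairs for single-token prediction: input token IDs -> target token ID.
clean_examples = [
    ([0, 1], 2),  # illustrative IDs only
    ([1, 3], 0),
]

def corrupt(input_ids, vocab_size, noise=0.5):
    """Randomly replace a fraction of the input tokens with arbitrary vocabulary IDs."""
    return [
        random.randrange(vocab_size) if random.random() < noise else tok
        for tok in input_ids
    ]

def corruption_augment(examples, corruption_level=0.3):
    """For a fraction of the clean data, add a corrupted copy whose target is ABSTAIN."""
    augmented = list(examples)
    for input_ids, _target in examples:
        if random.random() < corruption_level:
            augmented.append((corrupt(input_ids, len(VOCAB)), ABSTAIN_ID))
    return augmented

# The augmented set mixes clean pairs with corrupted pairs labeled ABSTAIN,
# so the model learns to emit ABSTAIN on inputs it cannot ground.
train_set = corruption_augment(clean_examples, corruption_level=0.3)
```

The `corruption_level` knob here stands in for the "optimal corruption level" mentioned in the results: raising it makes the model abstain more readily at some cost to accuracy on known examples.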