

Adversarially Robust Real-Time Fake News Detection Using RoBERTa with Continuous Learning and Browser-Native Deployment

Published
Server: Preprints.org
DOI: 10.20944/preprints202605.0899.v1

Automated fake news detection has advanced substantially through transformer-based classification, yet two critical gaps persist in the literature: static models degrade as misinformation tactics evolve, and high-performing systems rarely reach end users in accessible forms. This paper addresses both gaps with a system that couples RoBERTa-based classification with a post-deployment continuous learning pipeline and a browser-native Chrome extension. We curate a corpus of 70,556 unique articles from three established benchmark datasets (ISOT, WELFake, and the COVID-19 Constraint dataset) after eliminating 42.9% of the initially gathered samples as cross-dataset duplicates. A systematic comparison of XGBoost (95.88%), DistilBERT (97.74%), and RoBERTa-base (98.51%) establishes the production model, with selection driven primarily by false negative rate: RoBERTa achieves 1.09%, a 69% reduction relative to XGBoost and a 28% reduction relative to DistilBERT. A documented vulnerability of transformer classifiers is their susceptibility to formally worded misinformation that mimics journalistic style. We construct a dedicated adversarial training set of 70 examples spanning health misinformation, suppression narratives, and election fraud claims, and demonstrate that targeted fine-tuning raises adversarial detection accuracy from approximately 40% to 95.71% while maintaining 98.60% accuracy on standard benchmarks, a result achieved through experience replay, which prevents catastrophic forgetting. For deployment, ONNX INT8 quantization reduces model size from 500 MB to 125 MB without accuracy loss, enabling inference on free CPU infrastructure. A GitHub Actions pipeline collects fresh labeled articles nightly, and a FastAPI service running on Hugging Face Spaces serves predictions with 150–200 ms latency. A Chrome extension that provides paragraph-level hover detection, LIME-based word attribution, source credibility scoring, and multilingual support across 19 languages makes the system accessible to non-technical users. End-to-end evaluation across 50 curated articles yields 98% accuracy; research-backed adversarial testing across seven categories achieves 91.7%, with perfect detection on adversarial attacks, AI-generated misinformation, temporal domain shifts, and multilingual content.
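The abstract credits experience replay with preserving benchmark accuracy during adversarial fine-tuning, but it does not say how the replayed data is chosen or mixed. The following is only a minimal sketch of the general idea, not the authors' pipeline: the function name `build_replay_mix`, the 4:1 replay ratio, the uniform random sampling, and the placeholder data are all assumptions made for illustration. Only the count of 70 adversarial examples comes from the abstract.

```python
import random
from typing import List, Tuple

# (text, label) pairs; the encoding 1 = fake, 0 = real is an assumption.
Example = Tuple[str, int]


def build_replay_mix(
    adversarial: List[Example],
    original: List[Example],
    replay_ratio: int = 4,  # hypothetical ratio, not stated in the abstract
    seed: int = 0,
) -> List[Example]:
    """Combine new adversarial examples with replayed original examples.

    Fine-tuning on the adversarial set alone risks overwriting what the
    model learned from the original corpus (catastrophic forgetting).
    Replaying a sample of the original data alongside the new examples
    keeps the earlier distribution present in every fine-tuning run.
    """
    rng = random.Random(seed)
    # Never request more replay examples than the original corpus holds.
    n_replay = min(len(original), replay_ratio * len(adversarial))
    replayed = rng.sample(original, n_replay)
    mixed = list(adversarial) + replayed
    rng.shuffle(mixed)  # interleave so batches contain both kinds of example
    return mixed


# Placeholder data: 70 adversarial examples (the size given in the abstract)
# and a small stand-in for the original training corpus.
adversarial_set = [(f"adversarial article {i}", 1) for i in range(70)]
original_set = [(f"benchmark article {i}", i % 2) for i in range(1000)]

mixed_set = build_replay_mix(adversarial_set, original_set)
print(len(mixed_set))
```

With these placeholder inputs the mixed set holds the 70 adversarial examples plus 280 replayed ones. In practice the ratio would be tuned against accuracy on a held-out slice of the standard benchmarks.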
