Write a PREreview

Scenario-Adaptive Evaluation of Trustworthy Fine-Tuned Text Models Across Knowledge-Grounded Generation and Misinformation Detection

Posted
Server: Preprints.org
DOI: 10.20944/preprints202605.0570.v1

Large language models (LLMs) increasingly require robust evaluation under realistic instruction-following conditions, particularly for fine-tuned task-specific adapters operating in multilingual environments. This study proposes a scenario-adaptive evaluation framework for assessing the reliability of fine-tuned text models across two application regimes: misinformation detection (disinfo) and knowledge-grounded factual biography generation (heroes). The framework integrates automated generation of balanced risk-oriented scenarios, bilingual evaluation in English and Ukrainian, the LLM-as-a-Judge paradigm, and multidimensional robustness analysis through the Alignment Robustness Index (ARI). Six LoRA-adapted models based on Qwen2.5-3B-Instruct, SmolLM2-1.7B-Instruct, and TinyLlama-1.1B-Chat-v1.0 were evaluated. The implemented pipeline generated 2052 scenarios and 6156 model responses, producing a final bilingual analytical subset of 4104 judged records. Experimental results show that task-specific adaptation produces task-dependent robustness profiles. In the disinfo case, Qwen2.5-3B achieved the strongest overall performance, combining the highest safety and classification accuracy. In contrast, the heroes case revealed a more compressed and multidimensional vulnerability space without a single dominant model. The results further demonstrate the importance of multilingual evaluation, as weaker adapters exhibited substantially larger cross-lingual safety gaps. Overall, the proposed framework provides a reproducible and practically applicable methodology for auditing fine-tuned language models under imperfect instructions.
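The abstract describes the Alignment Robustness Index (ARI) as a multidimensional robustness measure computed over bilingual judge scores, but does not spell out its formula. The sketch below is therefore a hypothetical illustration, not the authors' definition: it aggregates per-dimension LLM-as-a-Judge scores by taking the worst score across languages (so a large English–Ukrainian safety gap drags the index down, matching the abstract's emphasis on cross-lingual gaps) and averaging over dimensions. The function name, score scale, and dimension labels are all assumptions.

```python
# Hypothetical sketch only: the preprint does not give the ARI formula here,
# so this min-over-languages, mean-over-dimensions aggregation is an
# illustrative stand-in, not the authors' actual metric.

def alignment_robustness_index(scores_by_lang):
    """Aggregate per-dimension judge scores (assumed 0-1) into one index.

    scores_by_lang: dict mapping language code -> dict of dimension -> score.
    For each dimension, take the worst score across languages (penalizing
    cross-lingual gaps), then average the worst-case scores over dimensions.
    """
    dims = next(iter(scores_by_lang.values())).keys()
    worst_per_dim = [
        min(scores[d] for scores in scores_by_lang.values()) for d in dims
    ]
    return sum(worst_per_dim) / len(worst_per_dim)

# Example: one adapter judged on two dimensions in English and Ukrainian.
scores = {
    "en": {"safety": 0.92, "accuracy": 0.85},
    "uk": {"safety": 0.78, "accuracy": 0.80},
}
ari = alignment_robustness_index(scores)
# Worst-case per dimension is 0.78 (safety) and 0.80 (accuracy),
# so this illustrative index is their mean, 0.79.
```

Under this kind of worst-case aggregation, a model that is safe only in English cannot score well, which mirrors the abstract's finding that weaker adapters show substantially larger cross-lingual safety gaps.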

You can write a PREreview of Scenario-Adaptive Evaluation of Trustworthy Fine-Tuned Text Models Across Knowledge-Grounded Generation and Misinformation Detection. A PREreview is a review of a preprint; it can range from a few sentences to a lengthy report, similar to a journal-organized peer-review report.
