PREreviews of “Towards a Science of AI Agent Reliability”

Towards a Science of AI Agent Reliability

by Stephan Rabanser, Sayash Kapoor, Peter Kirgis, Kangheng Liu, Saiteja Utpala, and Arvind Narayanan

Published: February 18, 2026
Server: arXiv
DOI: 10.48550/arxiv.2602.16666

Abstract

AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress, many agents still continue to fail in practice. This discrepancy highlights a fundamental limitation of current evaluations: compressing agent behavior into a single success metric obscures critical operational flaws. Notably, it ignores whether agents behave consistently across runs, withstand perturbations, fail predictably, or have bounded error severity. Grounded in safety-critical engineering, we provide a holistic performance profile by proposing twelve concrete metrics that decompose agent reliability along four key dimensions: consistency, robustness, predictability, and safety. Evaluating 14 models across two complementary benchmarks, we find that recent capability gains have only yielded small improvements in reliability. By exposing these persistent limitations, our metrics complement traditional evaluations while offering tools for reasoning about how agents perform, degrade, and fail.

1 PREreview

PREreview by Zirui Wei

Authored by Zirui Wei

Summary

This paper argues that current agent evaluation practice — reporting mean task success rates — fundamentally fails to capture whether agents are reliable enough for real-world deployment. The authors propose a four-dimensional reliability framework decomposed into twelve concrete metrics:…
