
Write a PREreview

The Coverage-Deferral Trade-Off: Fairness Implications of Conformal Prediction in Human-in-the-Loop Decision Systems

Published
Server: Preprints.org
DOI: 10.20944/preprints202512.2631.v1

Conformal prediction (CP) provides distribution-free uncertainty quantification by constructing prediction sets with guaranteed coverage. In human-in-the-loop (HITL) decision systems, these sets naturally define deferral policies: cases with singleton sets proceed automatically, while those with multiple labels require human review. Mondrian CP, which calibrates separately per group, has been proposed to achieve group-conditional coverage validity, ensuring each demographic group meets the target coverage level. However, we demonstrate through extensive experiments (832K evaluations across 14K configurations, 6 datasets, 100 seeds) that improving coverage validity comes at a significant cost: Mondrian CP increases deferral disparity by 143% compared to global CP, despite reducing coverage disparity by 26% on average. This coverage-deferral trade-off is fundamental: it persists across all datasets (p < 0.001), is invariant to HITL parameters, and exhibits monotonic behavior with respect to the shrinkage interpolation parameter γ. We prove an analogous impossibility result for conformal prediction: under specific conditions, coverage parity and deferral parity cannot be simultaneously achieved when base rates differ between groups. We further demonstrate that standard fairness metrics (Equalized Odds, Average Odds Difference) are invariant to CP method choice, identifying deferral gap as a critical operational fairness metric that captures CP’s unique impact on who receives human review, a dimension invisible to standard EO metrics. Our findings provide actionable guidance: use Mondrian for group-conditional coverage validity, global CP for deferral fairness, or shrinkage for balanced trade-offs.
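The deferral policy and the global-versus-Mondrian contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The following are assumptions made for illustration only: the nonconformity score `1 - p(label)`, the choice to defer empty prediction sets as well as multi-label ones, the reading of the shrinkage parameter γ as a linear interpolation between the global and per-group thresholds, and every function name.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label is included.
        return math.inf
    return sorted(scores)[k - 1]


def group_thresholds(cal_scores, cal_groups, alpha, gamma):
    """Per-group thresholds; gamma = 0 is global CP, gamma = 1 is Mondrian CP.

    Intermediate gamma interpolates linearly (an assumed form of shrinkage).
    """
    q_global = conformal_quantile(cal_scores, alpha)
    thresholds = {}
    for g in sorted(set(cal_groups)):
        q_group = conformal_quantile(
            [s for s, h in zip(cal_scores, cal_groups) if h == g], alpha
        )
        if gamma == 0:
            thresholds[g] = q_global
        elif gamma == 1:
            thresholds[g] = q_group
        else:
            thresholds[g] = (1 - gamma) * q_global + gamma * q_group
    return thresholds


def prediction_set(probs, qhat):
    """Labels whose assumed score 1 - p(label) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= qhat]


def defer(probs, qhat):
    """Singleton sets proceed automatically; anything else goes to a human.

    Deferring empty sets is an assumption; the abstract only mentions
    sets with multiple labels.
    """
    return len(prediction_set(probs, qhat)) != 1


def deferral_gap(test_probs, test_groups, thresholds):
    """Largest minus smallest per-group deferral rate."""
    counts, deferred = {}, {}
    for probs, g in zip(test_probs, test_groups):
        counts[g] = counts.get(g, 0) + 1
        deferred[g] = deferred.get(g, 0) + int(defer(probs, thresholds[g]))
    rates = [deferred[g] / counts[g] for g in counts]
    return max(rates) - min(rates)
```

Calling `group_thresholds` with `gamma` set to 0, 1, or a value in between, and passing the result to `deferral_gap`, gives a way to compare how the three options treat each group's deferral rate on a held-out set. Because Mondrian calibration gives each group its own threshold, groups with different score distributions can end up with different shares of non-singleton sets, which is the mechanism behind the trade-off the abstract reports.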

You can write a PREreview of The Coverage-Deferral Trade-Off: Fairness Implications of Conformal Prediction in Human-in-the-Loop Decision Systems. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a peer-review report organized by a journal.

Before you start

We will ask you to log in with your ORCID iD. If you do not have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from other people with the same or a similar name.
