Write a PREreview

Path-Sensitive AGI Alignment: Cognitive Integrity, Escape Cost, and Trajectory Risk in Augmented State Space

Published
Server: Preprints.org
DOI: 10.20944/preprints202605.0905.v1

AGI alignment is often evaluated at a snapshot: a system is judged by its current outputs, policy profile, benchmark behavior, or apparent corrigibility. Snapshot evaluation misses a central risk of advanced deployment: a good endpoint can still be reached by a bad journey. Two trajectories may arrive in similar behavioral regions while differing in reversibility, opacity, intervention cost, memory entanglement, institutional dependency, and the quality of human judgment left available for oversight. This paper develops a path-sensitive alternative. It represents AGI development as motion through an augmented state space Z containing model and environment state, world-model structure, policy state, memory and provenance traces, governance affordances, institutional embedding, and human evaluative capacity. Cognitive integrity — the capacity of individuals, teams, or institutions to sustain calibrated attention, trust, contestability, and decision under pressure [1] — is introduced here as an alignment-relevant state variable rather than assumed as a familiar metric. The formal contribution is a scaffold of definitions: controlled transition laws over augmented state, escape cost, path-level alignment functionals, viability floors, forbidden regions, and trajectory classes distinguished by lock-in, basin structure, retargetability, and integrity preservation. The result does not supply a calibrated empirical model of deployed AGI systems. It specifies what such a model must track if alignment evidence is to cover both present behavior and the remaining possibility of legible, reversible, and cognitively intact correction.
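
To make the contrast between snapshot and path-sensitive evaluation concrete, here is a minimal toy sketch. It is not the paper's formalism: the `State` fields, the `integrity_floor` threshold, and the scoring rule in `path_score` are all hypothetical choices made for illustration, standing in for the abstract's augmented state, viability floor, and path-level alignment functional.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class State:
    """Toy stand-in for a point in the augmented state space Z (hypothetical fields)."""
    behavior: float     # snapshot behavioral score, higher is better
    escape_cost: float  # assumed cost of reversing course from this state
    integrity: float    # assumed cognitive-integrity level of overseers, in [0, 1]


def snapshot_score(path: List[State]) -> float:
    """Snapshot evaluation: judge the trajectory by its endpoint behavior only."""
    return path[-1].behavior


def path_score(path: List[State], integrity_floor: float = 0.5) -> float:
    """Illustrative path-level functional (an assumption, not the paper's definition).

    Any path that dips below the integrity floor is treated as entering a
    forbidden region; otherwise the score is endpoint behavior minus the
    mean escape cost along the path.
    """
    if any(s.integrity < integrity_floor for s in path):
        return float("-inf")
    mean_escape = sum(s.escape_cost for s in path) / len(path)
    return path[-1].behavior - mean_escape


# Two trajectories that share a start and an endpoint but differ in the middle.
end = State(behavior=0.9, escape_cost=0.2, integrity=0.8)
careful = [State(0.5, 0.1, 0.9), State(0.7, 0.1, 0.85), end]
risky = [State(0.5, 0.1, 0.9), State(0.8, 0.9, 0.3), end]

print("snapshot:", snapshot_score(careful), snapshot_score(risky))
print("path:    ", path_score(careful), path_score(risky))
```

In this toy setup both trajectories receive the same snapshot score, while the path-level score separates them because the second one passes through a state with high escape cost and degraded overseer integrity. A realistic model would need the richer state components the abstract lists (memory and provenance traces, governance affordances, institutional embedding), which this sketch omits.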

You can write a PREreview of Path-Sensitive AGI Alignment: Cognitive Integrity, Escape Cost, and Trajectory Risk in Augmented State Space. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report organized by a journal.
