Path-Sensitive AGI Alignment: Cognitive Integrity, Escape Cost, and Trajectory Risk in Augmented State Space

Server: Preprints.org
DOI: 10.20944/preprints202605.0905.v1

AGI alignment is often evaluated at a snapshot: a system is judged by its current outputs, policy profile, benchmark behavior, or apparent corrigibility. Snapshot evaluation misses a central risk of advanced deployment: a good endpoint can still be reached by a bad journey. Two trajectories may arrive in similar behavioral regions while differing in reversibility, opacity, intervention cost, memory entanglement, institutional dependency, and the quality of human judgment left available for oversight. This paper develops a path-sensitive alternative. It represents AGI development as motion through an augmented state space Z containing model and environment state, world-model structure, policy state, memory and provenance traces, governance affordances, institutional embedding, and human evaluative capacity. Cognitive integrity — the capacity of individuals, teams, or institutions to sustain calibrated attention, trust, contestability, and decision under pressure [1] — is introduced here as an alignment-relevant state variable rather than assumed as a familiar metric. The formal contribution is a scaffold of definitions: controlled transition laws over augmented state, escape cost, path-level alignment functionals, viability floors, forbidden regions, and trajectory classes distinguished by lock-in, basin structure, retargetability, and integrity preservation. The result does not supply a calibrated empirical model of deployed AGI systems. It specifies what such a model must track if alignment evidence is to cover both present behavior and the remaining possibility of legible, reversible, and cognitively intact correction.
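The contrast between snapshot and path-sensitive evaluation can be illustrated with a toy sketch. Everything here is a labeled assumption rather than the paper's formalism: the `State` fields, the form of `escape_cost`, the `integrity_floor` value, and the particular path functional are hypothetical stand-ins chosen only to show how two trajectories with the same endpoint behavior can score differently once the journey is taken into account.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class State:
    """Toy augmented state (hypothetical coordinates, not the paper's Z)."""
    behavior: float       # snapshot-visible behavioral quality, in [0, 1]
    reversibility: float  # how cheaply this state can be exited, in [0, 1]
    integrity: float      # stand-in for human evaluative capacity, in [0, 1]


def snapshot_score(path: List[State]) -> float:
    """Snapshot evaluation: judge the trajectory by its final behavior only."""
    return path[-1].behavior


def escape_cost(state: State) -> float:
    """Assumed escape cost: rises as reversibility falls."""
    return 1.0 - state.reversibility


def path_score(path: List[State], integrity_floor: float = 0.5) -> float:
    """Assumed path-level functional.

    Returns -inf if any state drops below the integrity floor (a toy
    viability floor); otherwise final behavior minus mean escape cost.
    """
    if any(s.integrity < integrity_floor for s in path):
        return float("-inf")
    mean_cost = sum(escape_cost(s) for s in path) / len(path)
    return path[-1].behavior - mean_cost


# Two illustrative trajectories that end with the same behavior value.
good_journey = [
    State(0.5, 0.9, 0.9),
    State(0.7, 0.9, 0.9),
    State(0.9, 0.9, 0.9),
]
bad_journey = [
    State(0.5, 0.9, 0.9),
    State(0.7, 0.2, 0.4),  # passes through a low-integrity, hard-to-exit state
    State(0.9, 0.3, 0.6),
]

if __name__ == "__main__":
    for name, path in [("good", good_journey), ("bad", bad_journey)]:
        print(name, snapshot_score(path), path_score(path))
```

In this sketch the two trajectories are indistinguishable under `snapshot_score`, while `path_score` separates them because the second one crosses the assumed integrity floor along the way. A calibrated model would need empirically grounded state variables and costs, which the abstract explicitly says the paper does not supply.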
