
Write a PREreview

Learning Contraction Metrics for Provably Stable Model-Based Reinforcement Learning

Published
Server
Preprints.org
DOI
10.20944/preprints202601.1382.v1

Model-based reinforcement learning (MBRL) offers improved sample efficiency but faces instability from model errors and compounding uncertainties. We present the Contraction Dynamics Model (CDM), a framework that learns state-dependent Riemannian contraction metrics jointly with system dynamics and control policies to ensure stability during training and deployment. The method uses a softplus-Cholesky decomposition for positive-definite metric parameterization and optimizes via virtual displacements to minimize trajectory divergence energy. An adaptive stability regularizer incorporates the learned metric into policy objectives, guiding exploration toward contracting regions of the state space. Theoretically, we establish exponential trajectory convergence in expectation, derive robustness bounds against model errors, and characterize sample complexity. Empirically, on continuous control benchmarks (Pendulum, CartPole, HalfCheetah), contraction-guided learning enhances stability, sample efficiency (38.9% step reduction), and resilience to model errors (78% performance retention vs 52% for baselines at 10% noise) compared to MBRL baselines (PETS, MBPO) and safe RL methods. Ablation studies confirm design choices, showing that learned metrics yield 10-40% performance gains with 20% computational overhead. This work demonstrates that learning contraction metrics enables practical, scalable embedding of nonlinear stability guarantees in deep reinforcement learning.
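The abstract does not spell out the softplus-Cholesky parameterization, so the following is only a minimal sketch of one common way such a construction could work, not the authors' implementation. It assumes the metric is built as M = L Lᵀ + εI, where L is lower triangular with a softplus applied to its diagonal. The names `softplus`, `metric_from_params`, `theta`, and `eps` are hypothetical; in the preprint, the unconstrained parameters would presumably come from a network evaluated at the current state.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); always strictly positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def metric_from_params(theta, n, eps=1e-6):
    """Map n*(n+1)/2 unconstrained numbers to a symmetric positive
    definite n x n matrix M = L L^T + eps*I (hypothetical helper)."""
    if len(theta) != n * (n + 1) // 2:
        raise ValueError("theta must have n*(n+1)/2 entries")

    # Fill the lower triangle row by row; softplus keeps the diagonal > 0,
    # which makes L invertible and therefore L L^T positive definite.
    L = [[0.0] * n for _ in range(n)]
    idx = 0
    for i in range(n):
        for j in range(i + 1):
            L[i][j] = softplus(theta[idx]) if i == j else theta[idx]
            idx += 1

    # M = L L^T, plus a small eps on the diagonal as a lower bound (assumption).
    M = [[sum(L[i][m] * L[j][m] for m in range(n)) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        M[i][i] += eps
    return M


if __name__ == "__main__":
    # Example: a 2 x 2 metric from three arbitrary unconstrained parameters.
    for row in metric_from_params([-3.0, 0.5, 0.0], n=2):
        print(row)
```

Because the diagonal of L is strictly positive for any real input, every parameter vector maps to a valid metric, so a gradient-based optimizer can update the parameters without a projection step.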

You can write a PREreview of Learning Contraction Metrics for Provably Stable Model-Based Reinforcement Learning. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a peer-review report organized by a journal.

Before you start

We will ask you to log in with your ORCID iD. If you don't have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from other people with the same or a similar name.
