The Sleep Mechanism of LLMs

Published
Server
Preprints.org
DOI
10.20944/preprints202508.0071.v1

In this paper, we propose that prompts in large language models (LLMs) can be viewed as hypernetworks. From this viewpoint, prompt engineering acts as a form of post-training for LLMs. Building on this foundation, we present a training-free approach that transforms system prompts into model parameters and serves as a sleep mechanism within LLMs. Through this mechanism, the knowledge and memory contained in system prompts are converted into model parameters, which enhances the adaptability and efficiency of language models without traditional training.
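The claim that a fixed prompt can be absorbed into parameters can be illustrated with a toy case. The sketch below is not the paper's method. It assumes a single linear layer that reads the concatenation of a prompt vector and an input vector, and the names `layer_with_prompt`, `fold_prompt`, and `layer_folded` are hypothetical. Because the layer is linear, the prompt's contribution is a constant that can be added to the bias once, after which the prompt no longer needs to be supplied.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w (rows x cols) by vector x (length cols)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def layer_with_prompt(w: Matrix, b: Vector, prompt: Vector, x: Vector) -> Vector:
    """Toy linear layer that reads the concatenation [prompt; x] on every call."""
    return [y + bi for y, bi in zip(matvec(w, prompt + x), b)]


def fold_prompt(w: Matrix, b: Vector, prompt: Vector) -> Tuple[Matrix, Vector]:
    """Absorb a fixed prompt into the bias (illustrative assumption, no training).

    Splits w column-wise into the part acting on the prompt and the part
    acting on x, then returns (w_x, b + w_prompt @ prompt).
    """
    k = len(prompt)
    w_prompt = [row[:k] for row in w]
    w_x = [row[k:] for row in w]
    new_b = [bi + di for bi, di in zip(b, matvec(w_prompt, prompt))]
    return w_x, new_b


def layer_folded(w_x: Matrix, new_b: Vector, x: Vector) -> Vector:
    """The same layer after folding: it only needs x."""
    return [y + bi for y, bi in zip(matvec(w_x, x), new_b)]


# Made-up values chosen so the arithmetic is exact in floating point.
w = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]
b = [0.0, 1.0]
prompt = [2.0, 1.0]
x = [3.0]

w_x, new_b = fold_prompt(w, b, prompt)
print(layer_with_prompt(w, b, prompt, x))  # [13.0, 13.0]
print(layer_folded(w_x, new_b, x))         # [13.0, 13.0]
```

In a real transformer the prompt interacts with the input through nonlinear attention, so the equivalence is not this simple. The toy only shows why "prompt as a generator of parameters" is a coherent way to read a prompt.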
