Write a PREreview

SeMaNet: Semantic-Guided Low-Light Image Enhancement with Hybrid Transformer-Mamba Architecture

by Tianzhi Jia, Shikui Wei, and Yao Zhao

Published: April 23, 2026
Server: Preprints.org
DOI: 10.20944/preprints202604.1600.v1

Low-light image enhancement aims to recover high-quality visuals from poorly illuminated inputs, yet existing methods often suffer from over-enhancement, noise amplification, and semantic inconsistency in complex scenes. In this paper, we propose SeMaNet, a novel semantic-guided framework that integrates textual priors with a hybrid Transformer-Mamba architecture for controllable and efficient low-light enhancement. Our approach begins by leveraging pre-trained CLIP to generate semantically meaningful attention maps from natural language prompts, enabling interpretable region-aware enhancement without requiring pixel-level annotations. These semantic priors are then fused with illumination estimates and raw image features through a cross-attention mechanism, allowing dynamic interaction among multi-modal cues. To balance global context modeling and computational efficiency, we design a U-Net-based restoration network that interleaves Transformer blocks for long-range dependency capture and Mamba layers for linear-time sequence processing. Furthermore, our method explicitly models the image formation process via a perturbation-aware Retinex decomposition, enhancing physical plausibility. Extensive experiments on the LOL-v1, LOL-v2-real, LOL-v2-synthetic, SID, SMID, and SDSD-out datasets demonstrate that SeMaNet achieves state-of-the-art performance in both quantitative metrics (PSNR, SSIM) and visual quality, particularly excelling in preserving semantic coherence and fine details under challenging lighting conditions. The hybrid architecture also offers superior inference efficiency compared to pure Transformer-based models.
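
For reviewers who want to sanity-check reported numbers, the PSNR metric named in the abstract can be sketched in a few lines. This is a minimal illustration of the standard definition, not the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the default `max_value=1.0` (intensities scaled to [0, 1]) are all assumptions made here for the example.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB (standard definition).

    `reference` and `estimate` are flat sequences of pixel intensities
    (a hypothetical representation chosen for this sketch); `max_value`
    is the largest possible intensity, e.g. 1.0 or 255.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

Note that PSNR values depend on the intensity range and on any preprocessing (for example, color space or brightness rescaling against the ground truth), so comparisons across papers are only meaningful when the evaluation protocol matches.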

You can write a PREreview of SeMaNet: Semantic-Guided Low-Light Image Enhancement with Hybrid Transformer-Mamba Architecture. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report organized by a journal.
