Write a PREreview

SeMaNet: Semantic-Guided Low-Light Image Enhancement with Hybrid Transformer-Mamba Architecture

Published
Server: Preprints.org
DOI: 10.20944/preprints202604.1600.v1

Low-light image enhancement aims to recover high-quality images from poorly illuminated inputs, yet existing methods often suffer from over-enhancement, noise amplification, and semantic inconsistency in complex scenes. In this paper, we propose SeMaNet, a semantic-guided framework that integrates textual priors with a hybrid Transformer-Mamba architecture for controllable and efficient low-light enhancement. Our approach first uses a pre-trained CLIP model to generate semantically meaningful attention maps from natural language prompts, enabling interpretable region-aware enhancement without pixel-level annotations. These semantic priors are then fused with illumination estimates and raw image features through a cross-attention mechanism, allowing the multi-modal cues to interact dynamically. To balance global context modeling and computational efficiency, we design a U-Net-based restoration network that interleaves Transformer blocks, which capture long-range dependencies, with Mamba layers, which process sequences in linear time. Furthermore, our method explicitly models the image formation process via a perturbation-aware Retinex decomposition, which improves physical plausibility. Extensive experiments on the LOL-v1, LOL-v2-real, LOL-v2-synthetic, SID, SMID, and SDSD-out datasets demonstrate that SeMaNet achieves state-of-the-art performance in both quantitative metrics (PSNR, SSIM) and visual quality, and that it is particularly strong at preserving semantic coherence and fine details under challenging lighting conditions. The hybrid architecture also offers faster inference than pure Transformer-based models.
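For readers preparing a review, the two standard building blocks the abstract names can be sketched in a few lines. The abstract does not give the perturbation terms of the authors' decomposition, so the sketch below shows only the classical Retinex model (observed image = reflectance × illumination) and the usual PSNR definition. The function names `retinex_enhance` and `psnr`, the flat-list pixel representation, and the `eps` floor are assumptions made for illustration, not the paper's implementation.

```python
import math


def retinex_enhance(pixels, illumination, eps=1e-3):
    """Classical Retinex recovery: I = R * L, so R = I / L.

    `pixels` and `illumination` are flat lists of intensities in [0, 1].
    `eps` (an assumed value) keeps the division stable in very dark regions.
    The result is clipped to 1.0 so it stays a valid intensity.
    """
    if len(pixels) != len(illumination):
        raise ValueError("pixels and illumination must have equal length")
    return [min(1.0, p / max(l, eps)) for p, l in zip(pixels, illumination)]


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    dark = [0.05, 0.10, 0.20]    # toy low-light pixels
    light = [0.10, 0.20, 0.40]   # toy illumination estimate
    target = [0.50, 0.50, 0.50]  # toy ground-truth reflectance
    restored = retinex_enhance(dark, light)
    print(restored, psnr(target, restored))
```

Dividing by a noisy illumination estimate also amplifies noise in dark regions, which is the failure mode that a perturbation-aware formulation is presumably meant to address. A review could ask how the paper's perturbation terms are defined and whether the reported PSNR values use a peak of 1.0 or 255.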

You can write a PREreview of SeMaNet: Semantic-Guided Low-Light Image Enhancement with Hybrid Transformer-Mamba Architecture. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a peer-review report organized by a journal.
