
Write a PREreview

DNABERT2-CAMP: A Hybrid Transformer-CNN Model for E. coli Promoter Recognition

Published
Server
Preprints.org
DOI
10.20944/preprints202512.1533.v1

Accurate promoter identification is essential for deciphering gene regulation, but it remains challenging because transcriptional initiation signals are complex and variable. Existing deep learning models often fail to capture long-range dependencies and precise local motifs in DNA sequences at the same time. To address this, we propose DNABERT2-CAMP, a hybrid deep learning framework that integrates global sequence context with localized feature extraction for promoter recognition in Escherichia coli. The model uses a pre-trained DNABERT-2 Transformer to encode evolutionarily conserved patterns across extended contexts, while a novel CAMP (CNN-Attention-Mean Pooling) module detects fine-grained promoter motifs through convolutional filtering, multi-head attention, and mean pooling. By fusing global embeddings with high-resolution local features, the approach discriminates robustly between promoter and non-promoter sequences. Under 5-fold cross-validation, DNABERT2-CAMP attained an accuracy of 93.10% and a ROC AUC of 97.28%. It also generalized to independent external data, achieving 89.83% accuracy and 92.79% ROC AUC. These results underscore the advantage of combining global contextual modeling with targeted local motif analysis for accurate and interpretable promoter identification, offering a useful tool for synthetic biology and genomic research.
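For readers preparing a review, the data flow the abstract describes (convolution, then multi-head attention, then mean pooling, fused with a global embedding) can be illustrated with a small dependency-free Python sketch. This is not the authors' implementation. The function names, toy dimensions, kernel size, identity query/key/value projections, fusion by concatenation, and logistic output layer are all assumptions made for illustration. Random vectors stand in for DNABERT-2 token embeddings, and the weights are untrained.

```python
import math
import random


def conv1d_relu(x, kernels, bias):
    """Same-padded 1D convolution over positions, followed by ReLU.

    x: L x D token embeddings; kernels: F filters, each K x D (K odd).
    Returns L x F local feature vectors.
    """
    k_size, dim = len(kernels[0]), len(x[0])
    pad = k_size // 2
    zero = [0.0] * dim
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for t in range(len(x)):
        row = []
        for kern, b in zip(kernels, bias):
            s = b
            for k in range(k_size):
                s += sum(w * v for w, v in zip(kern[k], padded[t + k]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(x, n_heads):
    """Scaled dot-product self-attention, one feature slice per head.

    Simplification (assumption): Q, K and V are the inputs themselves,
    with no learned projection matrices.
    """
    n_feat = len(x[0])
    if n_feat % n_heads:
        raise ValueError("feature size must be divisible by n_heads")
    dh = n_feat // n_heads
    out = [[] for _ in x]
    for h in range(n_heads):
        head = [row[h * dh:(h + 1) * dh] for row in x]
        for i, q in enumerate(head):
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dh) for k in head]
            weights = softmax(scores)
            out[i].extend(
                sum(w * v[c] for w, v in zip(weights, head)) for c in range(dh)
            )
    return out


def mean_pool(x):
    """Average over sequence positions: L x F -> F."""
    return [sum(col) / len(x) for col in zip(*x)]


def camp_forward(token_emb, global_emb, params, n_heads=2):
    """Local branch (conv -> attention -> mean pool) fused with a global vector."""
    local = conv1d_relu(token_emb, params["kernels"], params["bias"])
    local = mean_pool(multi_head_self_attention(local, n_heads))
    fused = global_emb + local  # fusion by concatenation (assumption)
    logit = sum(w * v for w, v in zip(params["w_out"], fused)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))


# Toy sizes, all hypothetical: 12 tokens, 8-dim embeddings, 4 filters of width 3.
random.seed(0)
seq_len, emb_dim, n_filters, k_size = 12, 8, 4, 3
token_emb = [[random.gauss(0, 1) for _ in range(emb_dim)] for _ in range(seq_len)]
global_emb = mean_pool(token_emb)  # stand-in for a pooled DNABERT-2 embedding
params = {
    "kernels": [[[random.gauss(0, 0.1) for _ in range(emb_dim)]
                 for _ in range(k_size)] for _ in range(n_filters)],
    "bias": [0.0] * n_filters,
    "w_out": [random.gauss(0, 0.1) for _ in range(emb_dim + n_filters)],
    "b_out": 0.0,
}
prob = camp_forward(token_emb, global_emb, params)
print(f"promoter probability (untrained, random weights): {prob:.3f}")
```

The sketch separates the two information paths so that each can be inspected on its own: `global_emb` carries sequence-wide context, while the convolution and attention steps operate on per-position features before pooling. A reviewer might check how the preprint actually defines each of these steps, since the abstract does not specify kernel sizes, the number of heads, or the fusion operator.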

You can write a PREreview of DNABERT2-CAMP: A Hybrid Transformer-CNN Model for E. coli Promoter Recognition. A PREreview is a review of a preprint and can range from a few sentences to a lengthy report, similar to a peer-review report organized by a journal.

Before you start

We will ask you to log in with your ORCID iD. If you do not have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from others with the same or a similar name.
