Saltar al contenido principal

Escribe una PREreview

LVC2-DViT: Landview Creation for Landview Classification

Publicada
Servidor
Preprints.org
DOI
10.20944/preprints202507.1001.v1

Remote sensing land-cover classification is impeded by limited annotated data and pronounced geometric distortion, hindering its value for environmental monitoring and land planning. We introduce LVC2‑DViT (Landview Creation for Landview Classification with Deformable Vision Transformer), an end‑to‑end framework evaluated on five Aerial Image Dataset (AID) scene types, including Beach, Bridge, Pond, Port and River. LVC2‑DViT fuses two modules: (i) a data creation pipeline that converts ChatGPT-4o-generated textual scene descriptions into class‑balanced, high-fidelity images via Stable Diffusion, and (ii) DViT, a deformation‑aware Vision Transformer dedicated to land‑use classification whose adaptive receptive fields more faithfully model irregular landform geometries. Without increasing model size, LVC2‑DViT improves Overall Accuracy by 2.13 percentage points and Cohen’s Kappa by 2.66 percentage points over a strong vanilla ViT baseline, and also surpasses FlashAttention variant. These results confirm the effectiveness of combining generative augmentation with deformable attention for robust land‑use mapping. The project is available at here.

Puedes escribir una PREreview de LVC2-DViT: Landview Creation for Landview Classification. Una PREreview es una revisión de un preprint y puede variar desde unas pocas oraciones hasta un extenso informe, similar a un informe de revisión por pares organizado por una revista.

Antes de comenzar

Te pediremos que inicies sesión con tu ORCID iD. Si no tienes un iD, puedes crear uno.

¿Qué es un ORCID iD?

Un ORCID iD es un identificador único que te distingue de otros/as con tu mismo nombre o uno similar.

Comenzar ahora