DDPM-SegFormer: Highly refined feature land use and land cover segmentation with a fused denoising diffusion probabilistic model and transformer
by: Junfu Fan, Zongwen Shi, Zhoupeng Ren, Yuke Zhou, Min Ji
| Format: | Article |
|---|---|
| Published: | Elsevier 2024-09-01 |
Description
The semantic segmentation of land use and land cover (LULC) is a crucial and widely employed remote sensing task. Conventional convolutional neural networks (CNNs) and vision transformers have been used extensively for LULC segmentation. However, high-resolution remote sensing images contain a wealth of spatial, color, and texture information that traditional deep learning approaches do not fully exploit. The information bottleneck of CNNs and transformers causes a significant amount of texture detail to be lost during feature extraction, which limits LULC segmentation performance. We present DDPM-SegFormer, a new framework that merges a denoising diffusion probabilistic model (DDPM) and a vision transformer for LULC segmentation. The aim is to address the difficulties of feature extraction in complex geographic landscapes and to alleviate information bottlenecks. The framework combines the ability of a DDPM to generate refined semantic features with the ability of a vision transformer to model the global image context. Our framework introduces two main innovations. First, we apply a DDPM to LULC segmentation for the first time, generating highly refined multiscale semantic features. This approach alleviates the information bottleneck caused by relying solely on a CNN or transformer architecture. Second, we develop an effective feature-level fusion strategy that uses multihead cross-attention between the DDPM and the transformer. This strategy fuses fine-scale semantic features from both branches, generating continuous and highly refined semantic features that enhance segmentation accuracy. The results indicate that DDPM-SegFormer achieves an mIoU of 83.72% and an F1-score of 90.97% on the large-scale LoveDA dataset, and an mIoU of 90.91% and an F1-score of 93.30% on the Tarim Basin LULC dataset in a desert scenario. These results demonstrate that the refined and continuous semantic features produced by DDPM-SegFormer can significantly enhance LULC segmentation performance.
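For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean intersection over union (mIoU) and an F1-score are commonly computed from per-pixel class labels. It is not taken from the article: the function names, the toy labels, and the choice of macro-averaging over classes are assumptions made for illustration, and the abstract does not state which averaging scheme the authors used.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def miou_and_macro_f1(matrix):
    """Return (mean IoU, macro-averaged F1) over classes that occur at least once.

    Assumption: both metrics are averaged per class with equal weight.
    """
    n = len(matrix)
    ious, f1s = [], []
    for c in range(n):
        tp = matrix[c][c]                                # true positives
        fn = sum(matrix[c]) - tp                         # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(n)) - tp    # pixels wrongly labeled c
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: six pixels and three hypothetical classes (not real LULC data)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou, f1 = miou_and_macro_f1(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU = {miou:.4f}, macro F1 = {f1:.4f}")
```

Because IoU penalizes each error more heavily than F1 does, a per-class F1 is never lower than the corresponding IoU, which is consistent with the F1-scores in the abstract exceeding the mIoU values on both datasets.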