CNN Attention Enhanced ViT Network for Occluded Person Re-Identification
By: Jing Wang, Peitong Li, Rongfeng Zhao, Ruyan Zhou, Yanling Han
| Format: | Article |
| --- | --- |
| Published: | MDPI AG, 2023-03-01 |
Description
Person re-identification (ReID) is often affected by occlusion, which causes much of the feature content extracted by ReID models to consist of identity-irrelevant noise. Recently, the Vision Transformer (ViT) has enabled significant progress in many visual artificial intelligence tasks. However, ViT has limited capability to extract local information, a shortcoming of particular concern for occluded ReID. This paper studies how attention mechanisms can be exploited to enhance ViT for ReID and proposes an Attention Enhanced ViT Network (AET-Net) for occluded ReID. We use ViT as the backbone network to extract image features; since occlusion and outliers still degrade these features, we integrate a spatial attention mechanism into the ViT architecture to strengthen the attention of the ViT patch embedding vectors on important regions. In addition, we design a Multi-Feature Training Module that optimizes the network by constructing multiple classification features and computing a multi-feature loss, further improving model performance. Finally, the effectiveness and superiority of the proposed method are demonstrated by extensive experiments on both occluded and non-occluded datasets.
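The description mentions re-weighting ViT patch embeddings with a spatial attention mechanism so that informative (non-occluded) regions contribute more. The PyTorch sketch below is a minimal illustration of that general idea only, not the paper's actual AET-Net design; the module name `SpatialAttention`, the small scoring MLP, and the residual re-weighting are assumptions made for this example.

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Illustrative spatial attention over ViT patch embeddings (hypothetical;
    the paper's exact module may differ)."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        # Score each patch embedding, then normalize the scores across patches.
        self.scorer = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, num_patches, dim) -- patch tokens without the class token
        scores = self.scorer(patches)             # (batch, num_patches, 1)
        weights = torch.softmax(scores, dim=1)    # emphasis on important regions
        return patches * (1.0 + weights)          # residual re-weighting keeps the original signal


if __name__ == "__main__":
    # Usage sketch: re-weight patch tokens before they pass through further transformer blocks.
    x = torch.randn(2, 196, 768)                  # e.g. 14x14 patches with ViT-Base embedding size
    attn = SpatialAttention(dim=768)
    print(attn(x).shape)                          # torch.Size([2, 196, 768])
```

The residual form `patches * (1.0 + weights)` is one common design choice for attention re-weighting; it boosts highly scored patches without suppressing the backbone's original features entirely.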