Downsampling Algorithm with Fusion of Different Receptive Field Sizes in Deep Detection Methods

by: GU Zhenghua, LIU Gaqiong, SHAO Changbin, YU Hualong

Format: Article
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 2024-10-01

Description

The strength of deep detection models stems primarily from the feature representation ability of the backbone network, in which down-sampling plays a key role in semantic integration. However, existing down-sampling approaches often ignore the global structural information of features because they rely on small receptive fields. To address this issue, this paper proposes a plug-and-play dual-path down-sampling method (DPDM), which improves the backbone network's support for subsequent detection by adding a large receptive field branch. Alongside the traditional small receptive field path, DPDM constructs an efficient large receptive field branch to capture the structural information of features. Inspired by the space-to-depth operation, this branch achieves the effect of a large receptive field with a conventional convolution kernel size. The dual-path design increases feature diversity but does not by itself coordinate the two types of features. DPDM therefore applies channel concatenation followed by point-wise convolution to merge the features of the two paths. Using the YOLO family as the baseline, experimental evaluations of three models (YOLOX, YOLOv5, YOLOv6) on different datasets demonstrate that the method improves detection accuracy.
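
The data flow described above (a small receptive field path, a space-to-depth branch, channel concatenation, then point-wise convolution) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the layers of either path, so the 2x2 average pooling used for the small receptive field path, the block size of 2, and all function names (`space_to_depth`, `avg_pool`, `pointwise_conv`, `dual_path_downsample`) are assumptions chosen for illustration. Learned spatial convolutions, normalization, and activations are omitted.

```python
from typing import List

# A feature map as nested lists indexed [channel][row][column].
FeatureMap = List[List[List[float]]]


def space_to_depth(x: FeatureMap, block: int = 2) -> FeatureMap:
    """Move each block x block spatial neighborhood into the channel axis.

    Output has C * block**2 channels and spatial size (H / block, W / block).
    No pixel is discarded, only rearranged.
    """
    h, w = len(x[0]), len(x[0][0])
    assert h % block == 0 and w % block == 0, "H and W must be divisible by block"
    out: FeatureMap = []
    for dy in range(block):
        for dx in range(block):
            for plane in x:
                out.append([[plane[i][j] for j in range(dx, w, block)]
                            for i in range(dy, h, block)])
    return out


def avg_pool(x: FeatureMap, block: int = 2) -> FeatureMap:
    """Assumed stand-in for the small receptive field path: block x block mean."""
    h, w = len(x[0]), len(x[0][0])
    area = block * block
    return [[[sum(plane[i + a][j + b] for a in range(block) for b in range(block)) / area
              for j in range(0, w, block)]
             for i in range(0, h, block)]
            for plane in x]


def pointwise_conv(x: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """1x1 convolution: weights[o][k] mixes input channel k into output channel o."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wk * x[k][i][j] for k, wk in enumerate(row)) for j in range(w)]
             for i in range(h)]
            for row in weights]


def dual_path_downsample(x: FeatureMap, weights: List[List[float]],
                         block: int = 2) -> FeatureMap:
    """Run both paths, concatenate along channels, fuse with a 1x1 convolution."""
    small = avg_pool(x, block)        # C channels (assumed small-field path)
    large = space_to_depth(x, block)  # C * block**2 channels
    fused = small + large             # list concatenation = channel concatenation
    return pointwise_conv(fused, weights)


# Usage: one 4x4 input channel holding the values 0..15.
x = [[[float(4 * r + c) for c in range(4)] for r in range(4)]]
# Hypothetical fusion weights: 2 output channels over 1 + 4 = 5 fused channels.
weights = [[1.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0, 0.0]]
y = dual_path_downsample(x, weights)
```

With a block size of 2, each output position of the space-to-depth branch holds a full 2x2 input neighborhood in its channels, so a k x k convolution applied afterwards would span a 2k x 2k region of the original feature map. That is one way to read the abstract's claim of a large receptive field under a conventional kernel setting.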