Abstract
This paper proposes a feature enhancement method based on multi-scale self-attention. It consists mainly of two parts: a multi-scale feature combination module and a self-attention module. The multi-scale feature combination module integrates the multi-layer features extracted by the backbone network in both the top-down and bottom-up directions, so that shallow and deep features are combined. The self-attention module enhances the feature representation by assigning attention weights to features that are intrinsically connected to the features of the target. By combining deep with shallow features and local with global features, the method improves the detection of small targets in complex scenes. Experimental results show the effectiveness of the proposed feature enhancement method.
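The two modules can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer details, so the sketch assumes one-dimensional feature sequences instead of 2-D feature maps, a factor-of-two resolution step between levels, element-wise addition as the combination operator, and identity query/key/value projections in the attention step. All function names (`fuse_multiscale`, `self_attention`, `enhance`) are hypothetical.

```python
import math

def upsample2(level):
    """Nearest-neighbour upsampling: repeat every position twice."""
    return [vec[:] for vec in level for _ in range(2)]

def downsample2(level):
    """Average-pool adjacent pairs of positions."""
    return [[(a + b) / 2 for a, b in zip(level[i], level[i + 1])]
            for i in range(0, len(level), 2)]

def add(x, y):
    """Element-wise sum of two levels with identical shape."""
    return [[a + b for a, b in zip(u, v)] for u, v in zip(x, y)]

def fuse_multiscale(levels):
    """Combine levels ordered shallow (long) to deep (short).

    Assumption: each level is half as long as the previous one.
    """
    # Top-down pass: deep features flow into shallower levels.
    top_down = [None] * len(levels)
    top_down[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = add(levels[i], upsample2(top_down[i + 1]))
    # Bottom-up pass: shallow features flow back into deeper levels.
    fused = [top_down[0]]
    for i in range(1, len(levels)):
        fused.append(add(top_down[i], downsample2(fused[i - 1])))
    return fused

def self_attention(level):
    """Scaled dot-product self-attention over the positions of one level.

    Assumption: identity Q/K/V projections; a real module would learn them.
    """
    d = len(level[0])
    out = []
    for q in level:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in level]
        m = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output position is a weighted average of all positions.
        out.append([sum(w * v[c] for w, v in zip(weights, level)) for c in range(d)])
    return out

def enhance(levels):
    """Fuse the scales, then apply self-attention to each fused level."""
    return [self_attention(level) for level in fuse_multiscale(levels)]
```

Calling `enhance` on a list of three levels with 4, 2, and 1 positions returns three levels of the same shapes. Each position now mixes information from the other scales (through fusion) and from the other positions at its own scale (through attention).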
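Affiliations: Graduate University of Chinese Academy of Sciences; Nanchang Hangkong University; Nanjing University of Aeronautics and Astronautics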