Multi-scale attention guided network for end-to-end face alignment and recognition

Authors: Shakeel, M. Saad*; Zhang, Yuxuan; Wang, Xin; Kang, Wenxiong; Mahmood, Arif
Source: JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 88: 103628.
DOI: 10.1016/j.jvcir.2022.103628

Summary

Attention modules embedded in deep networks mediate the selection of informative regions for object recognition. In addition, combining features learned from different branches of a network can enhance their discriminative power. However, fusing features with inconsistent scales is a less-studied problem. In this paper, we first propose a multi-scale channel attention network with an adaptive feature fusion strategy (MSCAN-AFF) for face recognition (FR), which fuses the relevant feature channels and improves the network's representational power. In FR, face alignment is performed independently prior to recognition. This step requires the efficient localization of facial landmarks, which might be unavailable in uncontrolled scenarios such as low resolution and occlusion. Therefore, we propose utilizing our MSCAN-AFF to guide the Spatial Transformer Network (MSCAN-STN) to align feature maps learned from an unaligned training set in an end-to-end manner. Experiments on benchmark datasets demonstrate the effectiveness of our proposed MSCAN-AFF and MSCAN-STN.
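
The idea of attention-weighted fusion of two feature branches can be illustrated with a small sketch. This is not the authors' MSCAN-AFF: the summary does not give the layer design, so the gating rule below (a sigmoid over the sum of a per-position "local" term and a channel-mean "global" term, with no learned weights) and the function names `sigmoid` and `fuse` are assumptions made purely for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def fuse(x, y):
    """Fuse two feature maps of shape [C][H][W] (nested lists of floats).

    Hypothetical gating rule, not taken from the paper:
      local context  = x + y at each position
      global context = mean of (x + y) over the whole channel
      gate           = sigmoid(local + global)
      fused          = gate * x + (1 - gate) * y
    A real network would pass both contexts through learned layers.
    """
    fused = []
    for cx, cy in zip(x, y):
        # Element-wise sum of the two branches for this channel.
        summed = [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(cx, cy)]
        # Global context: average over all spatial positions.
        global_ctx = sum(map(sum, summed)) / (len(summed) * len(summed[0]))
        channel = []
        for rs, rx, ry in zip(summed, cx, cy):
            row = []
            for local_ctx, a, b in zip(rs, rx, ry):
                gate = sigmoid(local_ctx + global_ctx)
                # Convex combination, so the result lies between a and b.
                row.append(gate * a + (1.0 - gate) * b)
            channel.append(row)
        fused.append(channel)
    return fused


if __name__ == "__main__":
    branch_a = [[[1.0, 2.0], [3.0, 4.0]]]  # one channel, 2 x 2
    branch_b = [[[0.0, 0.0], [0.0, 0.0]]]
    print(fuse(branch_a, branch_b))
```

Because each gate lies strictly between 0 and 1, every fused value is a convex combination of the two inputs. That soft selection between branches is the property this sketch is meant to show.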

  • Institution
    Maoming University (茂名学院)
