Efficient disentangled representation learning for multi-modal finger biometrics
Abstract
Most multi-modal biometric systems use multiple devices to capture different traits and fuse the multi-modal data directly, ignoring the correlation between modalities. In this paper, finger skin and finger vein images are acquired from the same region of the finger and are therefore more strongly correlated. To represent these data efficiently, we propose a novel Finger Disentangled Representation Learning Framework (FDRL-Net) based on a factorization concept: it disentangles each modality into shared and private features, which improves complementarity for better fusion and yields modality-invariant features for heterogeneous recognition. In addition, to capture as much finger texture as possible, we use three-view finger images to reconstruct full-view multi-spectral finger traits, which increases identity information and robustness to finger posture variation. Finally, a Boat-Trackers-based multi-task distillation method is proposed to transfer the feature representation ability to a lightweight multi-task network. Extensive experiments on six single-view multi-spectral finger datasets and two full-view multi-spectral finger datasets demonstrate the effectiveness of our approach.
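
The shared/private factorization can be illustrated with a minimal sketch. It is not the FDRL-Net architecture: the linear projections, the function names (`disentangle`, `similarity_loss`, `orthogonality_loss`), the feature sizes, and the two loss terms are all assumptions, chosen as one common way to express a shared-private split. The similarity term pulls the shared features of the two modalities together, and the orthogonality term discourages each modality's private features from duplicating its shared ones.

```python
import numpy as np

def disentangle(x, w_shared, w_private):
    """Project one modality's features into a shared part and a private part.

    Hypothetical linear encoders; a real model would use a deep network.
    """
    return x @ w_shared, x @ w_private

def similarity_loss(s_a, s_b):
    """Mean squared distance between the shared features of two modalities."""
    return float(np.mean((s_a - s_b) ** 2))

def orthogonality_loss(shared, private):
    """Squared Frobenius norm of shared^T @ private.

    It is zero when every shared feature column is orthogonal to every
    private feature column over the batch.
    """
    return float(np.sum((shared.T @ private) ** 2))

# Toy inputs standing in for finger-skin and finger-vein embeddings (assumed sizes).
rng = np.random.default_rng(0)
batch, dim, k = 8, 16, 4
skin = rng.normal(size=(batch, dim))
vein = rng.normal(size=(batch, dim))

# One pair of projection matrices per modality (random here, learned in practice).
w = {name: rng.normal(size=(dim, k)) for name in
     ("skin_shared", "skin_private", "vein_shared", "vein_private")}

skin_s, skin_p = disentangle(skin, w["skin_shared"], w["skin_private"])
vein_s, vein_p = disentangle(vein, w["vein_shared"], w["vein_private"])

# Combined objective: align shared parts, decorrelate shared from private.
total = (similarity_loss(skin_s, vein_s)
         + orthogonality_loss(skin_s, skin_p)
         + orthogonality_loss(vein_s, vein_p))
print(f"factorization loss on random weights: {total:.3f}")
```

Under these assumptions, a fusion step could concatenate the shared features with both private features, while heterogeneous (cross-modality) matching would compare the shared features alone.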
