Focalized contrastive view-invariant learning for self-supervised skeleton-based action recognition

Men, Qianhui*; Ho, Edmond S. L.; Shum, Hubert P. H.; Leung, Howard

doi:10.1016/j.neucom.2023.03.070

Abstract

Learning view-invariant representations is key to improving the discriminative power of features for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint because their representations remain implicitly view-dependent. In this work, we propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL), which significantly suppresses view-specific information in a representation space where the viewpoints are coarsely aligned. By maximizing mutual information with an effective contrastive loss between multi-view sample pairs, FoCoViL associates actions that share common view-invariant properties and simultaneously separates dissimilar ones. We further propose an adaptive focalization method based on pairwise similarity to enhance contrastive learning and obtain clearer cluster boundaries in the learned space. Unlike many existing self-supervised representation learning methods, which rely heavily on supervised classifiers, FoCoViL performs well with both unsupervised and supervised classifiers and achieves superior recognition performance. Extensive experiments also show that the proposed contrastive-based focalization generates a more discriminative latent representation.
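The idea of a contrastive loss over multi-view pairs, reweighted by pairwise similarity, can be illustrated with a minimal sketch. This is not the FoCoViL loss: the abstract does not give its formula, so the sketch assumes a standard InfoNCE objective with a generic focal-style weight (1 - p)^gamma on each positive pair. The function names, the temperature, and gamma are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)


def focal_info_nce(view_a, view_b, temperature=0.1, gamma=2.0):
    """Illustrative focal-weighted InfoNCE loss (an assumption, not the paper's loss).

    view_a[i] and view_b[i] are embeddings of the same action captured from
    two viewpoints (a positive pair); all other pairings act as negatives.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        # Similarity of the anchor to every sample in the other view.
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-softmax for the positive pair.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
        log_p_pos = logits[i] - log_norm
        p_pos = math.exp(log_p_pos)
        # Assumed focal-style weight: pairs that are already well matched
        # (p_pos close to 1) contribute less, hard pairs contribute more.
        weight = max(0.0, 1.0 - p_pos) ** gamma
        total += -weight * log_p_pos
    return total / len(view_a)
```

With gamma set to 0 the weight is always 1 and the function reduces to plain InfoNCE; larger gamma values shift the emphasis toward positive pairs that the current embedding still confuses with negatives.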


Last updated: 2024-03-23 01:45
