2026, IEEE Transactions on Multimedia

Ranking-Based Self-Supervised Representation Learning for Skeleton-Based Action Recognition

Bizhu Wu, Junliang Chen, Jinheng Xie, Qiufu Li, Jianfeng Ren, Ruibin Bai, Rong Qu, and Linlin Shen

Recently, researchers have achieved significant results in skeleton-based action recognition. To better model skeleton sequences, we drive the encoder to learn more discriminative representations in the self-supervised setting. We find that, instead of clustering feature vectors to assign pseudo labels to samples as in DeepCluster, ranking them is a more reasonable, reliable, and efficient way to learn effective feature representations. With this intuition, we propose a novel self-supervised learning framework, DeepRank. Specifically, we rank triplets of skeleton sequences using ranking labels obtained from the relative distances among them. In addition, to mine the complementary discriminative information that exists in different modalities of skeleton sequences, we further propose Multi-View DeepRank (MV-DeepRank), which enables encoders to learn complementary features from multiple modalities. Extensive experimental results on the NTU RGB+D, NTU RGB+D 120, PKU-MMD I, and PKU-MMD II datasets under various evaluation settings demonstrate the generality, transferability, and superiority of our proposed self-supervised learning frameworks. Notably, our frameworks surpass previous methods that employ the same backbone networks by at least 1.8% (ST-GCN) and 2.1% (STTFormer) under the finetuning setting. Additionally, DeepRank gains a significant advantage in computational complexity, $O(1)$, over contrastive learning-based methods, $O(\text{batch size})$, and clustering-based methods, $O(\text{number of clusters})$.
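The idea of deriving ranking labels from relative distances can be illustrated with a minimal sketch. This is not the paper's implementation: the Euclidean metric, the binary label convention, and the function names `euclidean`, `ranking_label`, and `triplet_ranking_labels` are all assumptions made for illustration, and the feature vectors are toy values rather than encoder outputs.

```python
import math
from itertools import permutations


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def ranking_label(anchor, first, second):
    """Hypothetical label: 0 if `first` is at least as close to `anchor` as `second`, else 1."""
    return 0 if euclidean(anchor, first) <= euclidean(anchor, second) else 1


def triplet_ranking_labels(features):
    """Assign a ranking label to every (anchor, j, k) triplet with j < k."""
    labels = {}
    for i, j, k in permutations(range(len(features)), 3):
        if j < k:  # skip the mirrored ordering of the same pair
            labels[(i, j, k)] = ranking_label(features[i], features[j], features[k])
    return labels


# Toy feature vectors standing in for encoded skeleton sequences.
features = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
labels = triplet_ranking_labels(features)
print(labels)
```

In this sketch each label depends only on the three items in its triplet; how the actual framework samples triplets and trains on the labels is described in the paper itself.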

Keywords: Discriminative model; Feature learning; Ranking (information retrieval); Learning to rank; Feature (linguistics); Representation (politics); Autoencoder; Cluster analysis; Pattern recognition (psychology)

Ruibin Bai

Director of Lab

Computer Science and Operations Research

