Poseview3d: dynamic multi-view angular encoding for skeleton-based action recognition
Date
2025
Publisher
IEEE
Abstract
Human action recognition has gained prominence in computer vision due to its wide applicability. Among the various approaches, skeleton-based methods stand out for their compact motion representation, which effectively minimizes environmental noise. Despite recent advances, accurately recognizing human actions remains challenging, especially when distinguishing between fine-grained, visually similar motions. Existing methods often rely on complex neural networks to model joint relationships, yet still struggle with subtle differences between actions. In this paper, we introduce PoseView3D, which enhances recognition by generating angular information from pose data. This angular representation serves as a complementary feature that improves action classification. Our method introduces temporally dynamic anchor points that provide a multi-view perspective of motion, enabling more discriminative and robust skeleton representations. These anchor points are learned through a customized Non-Local Neural Network, which uses self-attention to capture both spatial and temporal joint relationships. PoseView3D outperforms current state-of-the-art methods that rely on single-stream angular data. We conduct comprehensive experiments on the NTU RGB+D and NTU RGB+D 120 datasets to evaluate performance across various configurations. Our results demonstrate that PoseView3D delivers competitive accuracy and robust recognition, validating its effectiveness in capturing nuanced motion features for human action recognition.
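To make the idea of anchor-based angular features concrete, the following is a minimal sketch, not the PoseView3D implementation. The abstract does not give the exact angle definition, so this sketch assumes one plausible choice: for each frame and each anchor point, the cosine of the angle between the anchor-to-joint vector and the anchor-to-centroid vector. The function name `angular_encoding`, the array shapes, and the use of fixed (rather than learned) anchors are all assumptions made for illustration.

```python
import numpy as np

def angular_encoding(joints, anchors, eps=1e-8):
    """Hypothetical angular features relative to per-frame anchor points.

    joints:  array of shape (T, J, 3), 3D joint positions over T frames.
    anchors: array of shape (T, K, 3), K anchor points per frame
             (learned by a network in the paper; supplied by hand here).
    Returns an array of shape (T, K, J) of cosines in [-1, 1].
    """
    # Vectors from every anchor to every joint: (T, K, J, 3).
    to_joints = joints[:, None, :, :] - anchors[:, :, None, :]
    # Assumed reference direction: anchor to skeleton centroid, (T, K, 1, 3).
    centroid = joints.mean(axis=1, keepdims=True)      # (T, 1, 3)
    to_center = (centroid - anchors)[:, :, None, :]    # (T, K, 1, 3)
    # Cosine of the angle between the two vectors.
    dot = (to_joints * to_center).sum(axis=-1)         # (T, K, J)
    norms = (np.linalg.norm(to_joints, axis=-1)
             * np.linalg.norm(to_center, axis=-1))     # (T, K, J)
    return np.clip(dot / (norms + eps), -1.0, 1.0)

# Toy usage: 4 frames, 25 joints (as in NTU RGB+D skeletons), 3 anchors.
rng = np.random.default_rng(0)
demo_joints = rng.normal(size=(4, 25, 3))
demo_anchors = rng.normal(size=(4, 3, 3))
demo_features = angular_encoding(demo_joints, demo_anchors)
```

Because each anchor views the skeleton from a different position, stacking the `K` feature maps gives a simple multi-view angular description of the same pose. In the paper the anchors vary over time and are produced by a self-attention network, which this sketch does not attempt to reproduce.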
