
Where to Focus on for Human Action Recognition?

Srijan Das 1 Arpit Chaudhary 1 Francois Bremond 1 Monique Thonnat 1 
1 STARS - Spatio-Temporal Activity Recognition Systems
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : In this paper, we present a new attention model for recognizing human actions from RGB-D videos. We propose an attention mechanism based on 3D articulated pose, whose objective is to focus on the body parts most relevant to the action. For action classification, we propose a network composed of spatio-temporal sub-networks that model the appearance of human body parts and an RNN attention sub-network that implements our attention mechanism. We train the proposed network end-to-end using a regularized cross-entropy loss, which jointly trains the RNN so that it delivers attention globally over the whole set of spatio-temporal features extracted by the 3D ConvNets. Our method outperforms state-of-the-art methods on the largest human activity recognition dataset available to date (the multi-view NTU RGB+D dataset) and on a human action recognition dataset with object interaction (the Northwestern-UCLA Multiview Action 3D dataset).
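Illustration : The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of weighting body-part features by attention and training with a regularized cross-entropy loss. Everything named here is an assumption made for illustration: the softmax normalization, the function names `attend` and `regularized_cross_entropy`, the weight `lam`, and in particular the form of the regularizer. The scores passed to `attend` stand in for the output of the pose-driven RNN, and the feature vectors stand in for the 3D ConvNet outputs.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(part_features, part_scores):
    """Fuse per-body-part feature vectors using attention weights.

    part_features: one feature vector per body part (placeholder for
                   the spatio-temporal features of each sub-network).
    part_scores:   one raw score per body part (placeholder for the
                   output of the pose-based RNN).
    """
    weights = softmax(part_scores)
    dim = len(part_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, part_features))
        for d in range(dim)
    ]
    return fused, weights

def regularized_cross_entropy(class_probs, true_class, weights, lam=0.01):
    """Cross-entropy plus a penalty on the attention weights.

    The penalty (squared deviation from uniform attention) is a
    hypothetical choice, not the regularizer used in the paper.
    """
    cross_entropy = -math.log(class_probs[true_class])
    uniform = 1.0 / len(weights)
    penalty = sum((w - uniform) ** 2 for w in weights)
    return cross_entropy + lam * penalty

# Toy usage: three hypothetical body parts with 2-dimensional features.
part_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
part_scores = [0.0, 0.0, 0.0]  # equal scores give uniform attention
fused, weights = attend(part_features, part_scores)
loss = regularized_cross_entropy([0.5, 0.5], 0, weights)
```

With equal scores the weights are uniform, so the fused vector is the plain average of the part features and the penalty term is zero. Raising one score shifts the fused vector toward that part's features.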
Document type : Conference papers

Cited literature : 47 references
Contributor : SRIJAN DAS
Submitted on : Monday, November 19, 2018 - 7:08:55 PM
Last modification on : Saturday, June 25, 2022 - 11:32:46 PM
Long-term archiving on : Wednesday, February 20, 2019 - 4:18:21 PM


Files produced by the author(s)


  • HAL Id : hal-01927432, version 1



Srijan Das, Arpit Chaudhary, Francois Bremond, Monique Thonnat. Where to Focus on for Human Action Recognition?. WACV 2019 - IEEE Winter Conference on Applications of Computer Vision, Jan 2019, Waikoloa Village, Hawaii, United States. pp.1-10. ⟨hal-01927432⟩


