Where to Focus on for Human Action Recognition?

Abstract : In this paper, we present a new attention model for recognizing human actions from RGB-D videos. We propose an attention mechanism based on 3D articulated pose, whose objective is to focus on the body parts most relevant to the action. For action classification, we propose a network composed of spatio-temporal sub-networks that model the appearance of human body parts and an RNN attention sub-network that implements our attention mechanism. We train the proposed network end-to-end with a regularized cross-entropy loss, so that the RNN is trained jointly and delivers attention globally over the whole set of spatio-temporal features extracted from 3D ConvNets. Our method outperforms state-of-the-art methods on the largest human activity recognition dataset available to date (NTU RGB+D Dataset), which is also multi-view, and on a human action recognition dataset with object interaction (Northwestern-UCLA Multiview Action 3D Dataset).
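As a rough illustration of the idea in the abstract, the sketch below weights per-body-part feature vectors by attention weights and adds a penalty term to a cross-entropy loss. It is a minimal sketch under stated assumptions, not the authors' implementation: the attention scores are assumed to come from a pose-driven RNN that is not modeled here, the three example parts are hypothetical, and the squared-weight penalty is a placeholder because the abstract does not specify the regularizer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_part_features(part_features, attention_scores):
    """Weight each body-part feature vector by its attention weight and sum.

    part_features: one feature vector per body part (assumed here to be
        the output of a per-part 3D ConvNet).
    attention_scores: one raw score per body part (assumed here to be
        produced by an RNN fed with 3D pose; not modeled in this sketch).
    """
    weights = softmax(attention_scores)
    dim = len(part_features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, part_features):
        for i in range(dim):
            fused[i] += w * feat[i]
    return weights, fused


def regularized_cross_entropy(class_probs, true_class, weights, lam=0.01):
    """Cross-entropy plus a penalty on the attention weights.

    The squared-weight penalty is a hypothetical stand-in; the abstract
    only states that the loss is regularized.
    """
    ce = -math.log(class_probs[true_class])
    penalty = lam * sum(w * w for w in weights)
    return ce + penalty


# Hypothetical example: three body parts with 2-D features and equal scores.
part_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, fused = fuse_part_features(part_features, [0.0, 0.0, 0.0])
loss = regularized_cross_entropy([0.5, 0.5], 0, weights)
```

With equal scores the attention weights are uniform, so the fused vector is the plain average of the part features; unequal scores shift the fused representation toward the parts the attention mechanism considers most relevant.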
Document type :
Conference papers

Cited literature [47 references]

https://hal.inria.fr/hal-01927432
Contributor : Srijan Das
Submitted on : Monday, November 19, 2018 - 7:08:55 PM
Last modification on : Wednesday, January 6, 2021 - 10:54:25 AM
Long-term archiving on : Wednesday, February 20, 2019 - 4:18:21 PM

File

421.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01927432, version 1

Citation

Srijan Das, Arpit Chaudhary, Francois Bremond, Monique Thonnat. Where to Focus on for Human Action Recognition? WACV 2019 - IEEE Winter Conference on Applications of Computer Vision, Jan 2019, Waikoloa Village, Hawaii, United States. pp.1-10. ⟨hal-01927432⟩

Metrics

Record views

449

File downloads

2216