Theses

Human action recognition in videos with local representation

Michal Koperski 1
1 STARS - Spatio-Temporal Activity Recognition Systems
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : This thesis targets the recognition of human actions in videos. This problem can be defined as the ability to name the action that occurs in a video. Due to the complexity of human actions, such as variations in appearance and motion patterns, many open questions keep action recognition far from being solved. Current state-of-the-art methods achieve satisfactory results based only on local features. To handle the complexity of actions, we propose two methods that model the spatio-temporal relationship between features: (1) modeling the pairwise relationship between features with Brownian covariance, and (2) modeling the spatial layout of features with respect to the person bounding box. Our methods are generic and can improve both hand-crafted and deep-learning-based methods. Another question is whether 3D information can improve action recognition. Many methods use 3D information only to obtain body joints. We show that 3D information can be used for more than joint detection. We propose a novel descriptor that introduces 3D trajectories computed on RGB-D information. In the evaluation, we focus on daily living actions, that is, actions performed by people in their daily self-care routine. Recognition of such actions is important for patient monitoring and assistive robot systems. To evaluate our methods, we created a large-scale dataset consisting of 160 hours of video footage of 20 seniors. We annotated 35 action classes. The actions are performed in an un-acted way, so the dataset introduces real-world challenges that are absent from many public datasets. We also evaluated our methods on public datasets: CAD60, CAD120, and MSRDailyActivity3D. The experiments show that our methods improve on state-of-the-art results.

Cited literature [125 references]

https://hal.inria.fr/tel-01648968
Contributor : Abes Star
Submitted on : Friday, February 9, 2018 - 9:49:08 AM
Last modification on : Thursday, May 28, 2020 - 10:58:34 AM

File

2017AZUR4096.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01648968, version 2

Citation

Michal Koperski. Human action recognition in videos with local representation. Computer Vision and Pattern Recognition [cs.CV]. COMUE Université Côte d'Azur (2015 - 2019), 2017. English. ⟨NNT : 2017AZUR4096⟩. ⟨tel-01648968v2⟩

Metrics

Record views

739

File downloads

3532