Toward unsupervised human activity and gesture recognition in videos

Farhood Negin 1
1 STARS - Spatio-Temporal Activity Recognition Systems
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : The main goal of this thesis is to propose a complete framework for the automatic discovery, modeling and recognition of human activities in videos. To model and recognize activities in long-term videos, we propose a framework that combines global and local perceptual information from the scene and constructs hierarchical activity models accordingly. In the first variation of the framework, a supervised classifier based on Fisher vectors is trained, and the predicted semantic labels are embedded in the constructed hierarchical models. In the second variation, to obtain a completely unsupervised framework, the trained visual codebooks are stored in the models instead of the semantic labels. We evaluate the proposed frameworks on two realistic Activities of Daily Living datasets recorded from patients in a hospital environment. Furthermore, to model fine motions of the human body, we propose four different gesture recognition frameworks, each of which accepts one data modality or a combination of modalities as input. We evaluate the developed frameworks in the context of a medical diagnostic test, the Praxis test. The Praxis test is a gesture-based diagnostic test that has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer’s disease. We suggest a new challenge in gesture recognition: obtaining an objective opinion about correct and incorrect performances of very similar gestures. The experiments show the effectiveness of our deep learning based approach in gesture recognition and performance assessment tasks.

Cited literature [247 references]

https://hal.inria.fr/tel-01947341
Contributor : Abes Star
Submitted on : Tuesday, February 26, 2019 - 4:12:25 PM
Last modification on : Thursday, February 28, 2019 - 9:43:47 AM
Long-term archiving on : Monday, May 27, 2019 - 2:48:01 PM

File

2018AZUR4246.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01947341, version 2

Citation

Farhood Negin. Toward unsupervised human activity and gesture recognition in videos. Computer Vision and Pattern Recognition [cs.CV]. Université Côte d'Azur, 2018. English. ⟨NNT : 2018AZUR4246⟩. ⟨tel-01947341v2⟩
