
Learning human body and human action representations from visual data

Abstract : The focus of visual content is often people. Automatic analysis of people from visual data is therefore of great importance for numerous applications in content search, autonomous driving, surveillance, health care, and entertainment. The goal of this thesis is to learn visual representations for human understanding. Particular emphasis is given to two closely related areas of computer vision: human body analysis and human action recognition. In summary, our contributions are the following: (i) we generate photo-realistic synthetic data of people that allows training CNNs for human body analysis; (ii) we propose a multi-task architecture to recover a volumetric body shape from a single image; (iii) we study the benefits of long-term temporal convolutions for human action recognition using 3D CNNs; and (iv) we incorporate similarity training in multi-view videos to design view-independent representations for action recognition.

Cited literature [374 references]
Contributor : ABES STAR
Submitted on : Monday, May 18, 2020 - 5:58:16 PM
Last modification on : Friday, June 24, 2022 - 3:16:10 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02266593, version 2



Gül Varol. Learning human body and human action representations from visual data. Computer Vision and Pattern Recognition [cs.CV]. Université Paris sciences et lettres, 2019. English. ⟨NNT : 2019PSLEE029⟩. ⟨tel-02266593v2⟩


