Synthetic Humans for Action Recognition from Unseen Viewpoints

Gül Varol; Ivan Laptev; Cordelia Schmid; Andrew Zisserman

doi:10.1007/s11263-021-01467-7

Journal article, International Journal of Computer Vision, Year: 2021

Gül Varol

Role: Author
PersonId: 11217
IdHAL: gul-varol
ORCID: 0000-0002-8438-6152
IdRef: 244277400

Models of visual object recognition and scene understanding

Visual Geometry Group

Laboratoire d'Informatique Gaspard-Monge

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Cordelia Schmid

Role: Author
PersonId: 831154

Research at Google

Learning models from massive data

Andrew Zisserman

Role: Author
PersonId: 878447

Visual Geometry Group

Abstract

Although synthetic training data has been shown to be beneficial for tasks such as human pose estimation, its use for RGB human action recognition is relatively unexplored. Our goal in this work is to answer the question of whether synthetic humans can improve the performance of human action recognition, with a particular focus on generalization to unseen viewpoints. We make use of recent advances in monocular 3D human body reconstruction from real action sequences to automatically render synthetic training videos for the action labels. We make the following contributions: (i) we investigate the extent of variations and augmentations that are beneficial to improving performance at new viewpoints. We consider changes in body shape and clothing for individuals, as well as more action-relevant augmentations such as non-uniform frame sampling and interpolating between the motion of individuals performing the same action; (ii) we introduce a new data generation methodology, SURREACT, that allows supervised training of spatio-temporal CNNs for action classification; (iii) we substantially improve the state-of-the-art action recognition performance on the NTU RGB+D and UESTC standard human action multi-view benchmarks; and finally, (iv) we extend the augmentation approach to in-the-wild videos from a subset of the Kinetics dataset to investigate the case when only one-shot training data is available, and demonstrate improvements in this case as well.
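
To illustrate the non-uniform frame sampling augmentation mentioned in the abstract, here is a minimal sketch. It is not the authors' implementation: the function name `sample_frames_nonuniform`, its parameters, and the choice of drawing frame indices uniformly at random without replacement are all assumptions made for illustration. The idea shown is only that the gaps between consecutive sampled frames vary, so the resulting clip has a locally varying playback speed.

```python
import random


def sample_frames_nonuniform(num_frames, clip_len, rng=None):
    """Pick `clip_len` frame indices from a video with `num_frames` frames.

    Hypothetical helper (not from the paper): indices are drawn at random
    without replacement and then sorted, so they are strictly increasing
    but the step between consecutive frames is irregular.
    """
    if clip_len < 1 or num_frames < clip_len:
        raise ValueError("need 1 <= clip_len <= num_frames")
    rng = rng or random.Random()
    # Sorting a sample without replacement preserves temporal order
    # while leaving gaps of varying size between the chosen frames.
    return sorted(rng.sample(range(num_frames), clip_len))


# Example: choose 16 of 100 frames with a fixed seed for reproducibility.
indices = sample_frames_nonuniform(100, 16, random.Random(0))
```

A fixed-stride sampler would instead return indices with a constant gap; the random gaps here are one simple way to vary the apparent speed of an action during training.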

Keywords

Synthetic humans; Action recognition

Domains

Computer Vision and Pattern Recognition [cs.CV]

https://inria.hal.science/hal-02435731

Submitted on: Saturday, January 11, 2020, 12:12:30

Last modified on: Saturday, April 27, 2024, 03:13:49

Dates and versions

hal-02435731, version 1 (11-01-2020)

Identifiers

HAL Id: hal-02435731, version 1
arXiv: 1912.04070
DOI: 10.1007/s11263-021-01467-7

Cite

Gül Varol, Ivan Laptev, Cordelia Schmid, Andrew Zisserman. Synthetic Humans for Action Recognition from Unseen Viewpoints. International Journal of Computer Vision, 2021, 129, pp.2264-2287. ⟨10.1007/s11263-021-01467-7⟩. ⟨hal-02435731⟩

Collections

ENS-PARIS ENPC UNIV-RENNES1 UGA CNRS INRIA IRISA LIGM_A3SI INSMI PARISTECH LIGM LJK LJK_GI INRIA2 LJK-GI-THOTH PSL UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES ANR PRAIRIE-IA UR1-MATH-NUM UNIV-EIFFEL JSE2024

332 views

0 downloads
