Long-term Temporal Convolutions for Action Recognition

Gül Varol; Ivan Laptev; Cordelia Schmid

doi:10.1109/TPAMI.2017.2712608

Journal article in IEEE Transactions on Pattern Analysis and Machine Intelligence. Year: 2018

Gül Varol

Role: Author
PersonId : 11217
IdHAL : gul-varol
ORCID : 0000-0002-8438-6152
IdRef : 244277400

Models of visual object recognition and scene understanding

Apprentissage de modèles à partir de données massives

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Cordelia Schmid

Role: Author
PersonId : 831154

Apprentissage de modèles à partir de données massives

Abstract

Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames, failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields, and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition: UCF101 (92.7%) and HMDB51 (67.2%).
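
As a rough illustration of the idea in the abstract, the sketch below runs a small stack of temporal convolutions and temporal poolings over a short and a long clip. It is not the authors' LTC architecture: the function names, the clip lengths (16 and 60 frames), the kernel, the pooling size, and the use of one scalar feature per frame are all assumptions chosen for illustration.

```python
def temporal_conv(signal, kernel):
    """Valid 1D cross-correlation along the time axis."""
    k = len(kernel)
    return [sum(signal[t + i] * kernel[i] for i in range(k))
            for t in range(len(signal) - k + 1)]


def temporal_max_pool(signal, size):
    """Non-overlapping max pooling along the time axis."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def toy_temporal_stack(signal, kernel, pool_size=2):
    """Hypothetical stack: conv -> pool -> conv -> pool (illustration only)."""
    x = temporal_conv(signal, kernel)
    x = temporal_max_pool(x, pool_size)
    x = temporal_conv(x, kernel)
    return temporal_max_pool(x, pool_size)


# One made-up scalar feature per frame; real models use full frames or flow fields.
short_clip = [float(t % 5) for t in range(16)]  # assumed short temporal extent
long_clip = [float(t % 5) for t in range(60)]   # assumed long temporal extent
kernel = [1.0, 0.0, -1.0]                       # hypothetical temporal kernel

short_out = toy_temporal_stack(short_clip, kernel)
long_out = toy_temporal_stack(long_clip, kernel)

print(len(short_out))  # 2
print(len(long_out))   # 13
```

With the same layers, the longer clip leaves more temporal positions after the stack, so later layers can keep convolving over time instead of collapsing the temporal axis early.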

Keywords

Spatio-temporal convolutions; Neural networks; Video analysis; Action recognition; Representation learning

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

VarolPAMI2017.pdf (2.33 MB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01241518

Submitted on: Friday, June 2, 2017, 14:04:32

Last modified on: Friday, April 19, 2024, 16:18:58

Dates and versions

hal-01241518 , version 2 (15-04-2016)

hal-01241518 , version 3 (02-06-2017)

Identifiers

HAL Id : hal-01241518 , version 3
ARXIV : 1604.04494
DOI : 10.1109/TPAMI.2017.2712608

Cite

Gül Varol, Ivan Laptev, Cordelia Schmid. Long-term Temporal Convolutions for Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40 (6), pp.1510-1517. ⟨10.1109/TPAMI.2017.2712608⟩. ⟨hal-01241518v3⟩

Collections

ENS-PARIS UGA CNRS INRIA LJK LJK_GI INRIA2 LJK-GI-THOTH PSL

8134 views

2175 downloads
