Modeling Spatio-Temporal Human Track Structure for Action Localization

Guilhem Chéron; Anton Osokin; Ivan Laptev; Cordelia Schmid

Preprint, Working Paper. Year: 2019

Guilhem Chéron

Role: Author
PersonId : 1039639

Models of visual object recognition and scene understanding

Learning models from massive data

Anton Osokin

Role: Author

Models of visual object recognition and scene understanding

Vysšaja škola èkonomiki = National Research University Higher School of Economics [Moscow]

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Cordelia Schmid

Role: Author

Learning models from massive data

Abstract

This paper addresses spatio-temporal localization of human actions in video. To localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions at the level of person tracks. Our model is trained to simultaneously recognize and localize action classes in time and is based on two-layer gated recurrent units (GRUs) applied separately to two streams, i.e. appearance and optical flow. When used together with state-of-the-art person detection and tracking, our model is shown to substantially improve spatio-temporal action localization in videos. The gain is shown to be mainly due to improved temporal localization. We evaluate our method on two recent datasets for spatio-temporal action localization, UCF101-24 and DALY, demonstrating a significant improvement over the state of the art.
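
As an illustration of what temporal localization along a person track can look like, the following minimal sketch turns per-frame action scores from two streams into temporal segments. It is not the RecLNet method: the abstract does not specify how the streams are fused or how scores become segments, so the averaging rule, the threshold, and the names `fuse_scores` and `temporal_segments` are all assumptions made for this example, and the scores are invented.

```python
from typing import List, Tuple


def fuse_scores(appearance: List[float], flow: List[float]) -> List[float]:
    """Average per-frame scores of the two streams (fusion rule is an assumption)."""
    if len(appearance) != len(flow):
        raise ValueError("both streams must score the same frames")
    return [(a + f) / 2.0 for a, f in zip(appearance, flow)]


def temporal_segments(scores: List[float], threshold: float = 0.5,
                      min_length: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges whose score is >= threshold."""
    segments: List[Tuple[int, int]] = []
    start = None  # first frame of the segment currently being built, if any
    for t, score in enumerate(scores):
        if score >= threshold and start is None:
            start = t
        elif score < threshold and start is not None:
            if t - start >= min_length:
                segments.append((start, t - 1))
            start = None
    # Close a segment that runs to the last frame of the track.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores) - 1))
    return segments


# Hypothetical per-frame scores for one action class on one person track.
appearance = [0.1, 0.2, 0.8, 0.9, 0.7, 0.2, 0.1, 0.6]
flow = [0.1, 0.4, 0.6, 0.9, 0.9, 0.4, 0.1, 0.8]
fused = fuse_scores(appearance, flow)
print(temporal_segments(fused))  # [(2, 4), (7, 7)]
```

Raising `min_length` discards very short detections such as the single-frame segment at frame 7; a real system would more likely smooth the scores before thresholding.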

Domains

Computer Vision and Pattern Recognition [cs.CV]

https://inria.hal.science/hal-01979583

Submitted on: Sunday, January 13, 2019, 14:36:01

Last modified on: Friday, April 19, 2024, 16:18:58

Dates and versions

hal-01979583, version 1 (13-01-2019)

Identifiers

HAL Id: hal-01979583, version 1
arXiv: 1806.11008

Cite

Guilhem Chéron, Anton Osokin, Ivan Laptev, Cordelia Schmid. Modeling Spatio-Temporal Human Track Structure for Action Localization. 2019. ⟨hal-01979583⟩

Collections

ENS-PARIS UGA CNRS INRIA LJK LJK_GI INRIA2 LJK-GI-THOTH PSL

165 views

0 downloads
