Learning to track for spatio-temporal action localization - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2015

Learning to track for spatio-temporal action localization

Abstract

We propose an effective approach for action localization, in both the spatial and temporal domains, in realistic videos. The approach starts by detecting proposals at the frame level and then scores them using a combination of state-of-the-art static and motion features extracted from CNNs. We then track a selection of proposals throughout the video, using a tracking-by-detection approach that leverages a combination of instance-level and class-specific learned detectors. The tracks are scored using a spatio-temporal motion histogram (STMH), a novel descriptor at the track level, in combination with the CNN features. Finally, we perform temporal localization of the action using a sliding-window approach. We present experimental results on the UCF-Sports and J-HMDB action localization datasets, where our approach outperforms the state of the art by margins of 15% and 7% in mAP, respectively. Furthermore, we present the first experimental results on the challenging UCF-101 localization dataset with 24 classes, where we also obtain promising performance.
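As an illustration of the final step only, the sliding-window idea can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `best_temporal_window`, the per-frame score input, the candidate window lengths, and the choice of scoring a window by its mean per-frame score are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def best_temporal_window(
    frame_scores: Sequence[float],
    window_lengths: Sequence[int],
    stride: int = 1,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring window (end exclusive).

    Hypothetical helper: scoring a window by the mean of its per-frame
    scores is an assumption of this sketch, not the paper's scoring.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")

    # Prefix sums let us compute the total of any window in O(1).
    prefix: List[float] = [0.0]
    for score in frame_scores:
        prefix.append(prefix[-1] + score)

    n = len(frame_scores)
    best: Optional[Tuple[int, int, float]] = None
    for length in window_lengths:
        if length <= 0 or length > n:
            continue  # skip lengths that cannot fit inside the track
        for start in range(0, n - length + 1, stride):
            end = start + length
            mean = (prefix[end] - prefix[start]) / length
            # Strict comparison keeps the earliest window on ties.
            if best is None or mean > best[2]:
                best = (start, end, mean)

    if best is None:
        raise ValueError("no window length fits the track")
    return best
```

Given per-frame scores along one track, calling `best_temporal_window(scores, window_lengths=[20, 30, 40], stride=5)` would return the temporal extent with the highest mean score. Note that a plain mean tends to favor short windows, so a practical system would likely need some form of length normalization or a learned window-level score.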
Main file
WeinzaepfelICCV2015.pdf (742.29 KB). Origin: publisher files allowed on an open archive
UCFSports_002_30.jpg (11.29 KB). Format: figure, image. Origin: files produced by the author(s)
WeinzaepfelICCV2015_video.mp4 (11.3 MB). Format: video. Origin: files produced by the author(s)

Dates and versions

hal-01159941, version 1 (05-06-2015)
hal-01159941, version 2 (01-10-2015)

Cite

Philippe Weinzaepfel, Zaid Harchaoui, Cordelia Schmid. Learning to track for spatio-temporal action localization. ICCV - IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile. pp.3164-3172, ⟨10.1109/ICCV.2015.362⟩. ⟨hal-01159941v2⟩