Human Focused Action Localization in Video

Alexander Klaser 1, Marcin Marszałek 2, Cordelia Schmid 1, Andrew Zisserman 2
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract:

We propose a novel human-centric approach to detect and localize human actions in challenging video data, such as Hollywood movies. Our goal is to localize actions in time through the video and spatially in each frame. We achieve this by first obtaining generic spatio-temporal human tracks and then detecting specific actions within these using a sliding window classifier.

We make the following contributions: (i) We show that splitting the action localization task into spatial and temporal search leads to an efficient localization algorithm where generic human tracks can be reused to recognize multiple human actions; (ii) We develop a human detector and tracker which is able to cope with a wide range of postures, articulations, motions and camera viewpoints. The tracker includes detection interpolation and a principled classification stage to suppress false positive tracks; (iii) We propose a track-aligned 3D-HOG action representation, investigate its parameters, and show that action localization benefits from using tracks; and (iv) We introduce a new action localization dataset based on Hollywood movies.

Results are presented on a number of real-world movies with crowded, dynamic environments, partial occlusion and cluttered backgrounds. On the Coffee&Cigarettes dataset we significantly improve over the state of the art. Furthermore, we obtain excellent results on the new Hollywood-Localization dataset.
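
The second stage described in the abstract, a sliding window classifier run along an existing human track, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `sliding_window_detections`, `temporal_iou`, and `score_fn`, the window lengths, the stride, the threshold, and the greedy overlap suppression are all assumptions made for illustration. In the paper the score would come from a classifier on track-aligned 3D-HOG descriptors; here `score_fn` is a placeholder supplied by the caller.

```python
from typing import Callable, List, Sequence, Tuple

# A temporal detection on one track: (start_frame, end_frame_exclusive, score).
Window = Tuple[int, int, float]


def temporal_iou(a: Window, b: Window) -> float:
    """Intersection over union of two frame intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def sliding_window_detections(
    track_len: int,
    score_fn: Callable[[int, int], float],  # hypothetical classifier stand-in
    lengths: Sequence[int],
    stride: int,
    threshold: float,
    max_overlap: float = 0.2,
) -> List[Window]:
    """Score temporal windows along one track and keep non-overlapping peaks."""
    candidates: List[Window] = []
    for length in lengths:
        # Slide a window of this length over the frames of the track.
        for start in range(0, track_len - length + 1, stride):
            score = score_fn(start, start + length)
            if score >= threshold:
                candidates.append((start, start + length, score))

    # Greedy suppression (an assumption): keep the best-scoring windows and
    # drop any candidate that overlaps an already kept window too much.
    candidates.sort(key=lambda w: w[2], reverse=True)
    kept: List[Window] = []
    for cand in candidates:
        if all(temporal_iou(cand, k) <= max_overlap for k in kept):
            kept.append(cand)
    return kept
```

Because the spatial search is already fixed by the track, the loop only has to search over time, and the same tracks can be rescored with a different `score_fn` for each action class.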

Document type:
Conference paper
Kiriakos N. Kutulakos. SGA 2010 - International Workshop on Sign, Gesture, and Activity, ECCV 2010 Workshops, Sep 2010, Hersonissos, Heraklion, Crete, Greece. Springer, 6553, pp.219-233, 2010, Lecture Notes in Computer Science; Trends and Topics in Computer Vision. <10.1007/978-3-642-35749-7_17>

https://hal.inria.fr/inria-00514845
Contributor: Alexander Klaser
Submitted on: Friday, September 3, 2010 - 13:59:16
Last modified on: Wednesday, July 9, 2014 - 15:07:20
Document(s) archived on: Tuesday, October 23, 2012 - 15:30:46

Citation

Alexander Klaser, Marcin Marszałek, Cordelia Schmid, Andrew Zisserman. Human Focused Action Localization in Video. Kiriakos N. Kutulakos. SGA 2010 - International Workshop on Sign, Gesture, and Activity, ECCV 2010 Workshops, Sep 2010, Hersonissos, Heraklion, Crete, Greece. Springer, 6553, pp.219-233, 2010, Lecture Notes in Computer Science; Trends and Topics in Computer Vision. <10.1007/978-3-642-35749-7_17>. <inria-00514845>
