Actions in Context

Marcin Marszałek (1), Ivan Laptev (2), Cordelia Schmid (1)
(1) LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
(2) VISTAS - Spatio-Temporal Vision and Learning
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: This paper exploits the context of natural dynamic scenes for human action recognition in video. Human actions are frequently constrained by the purpose and the physical properties of scenes and demonstrate high correlation with particular scene classes. For example, eating often happens in a kitchen while running is more common outdoors. The contribution of this paper is three-fold: (a) we automatically discover relevant scene classes and their correlation with human actions, (b) we show how to learn selected scene classes from video without manual supervision, and (c) we develop a joint framework for action and scene recognition and demonstrate improved recognition of both in natural video. We use movie scripts as a means of automatic supervision for training. For selected action classes we identify correlated scene classes in text and then retrieve video samples of actions and scenes for training using script-to-video alignment. Our visual models for scenes and actions are formulated within the bag-of-features framework and are combined in a joint scene-action SVM-based classifier. We report experimental results and validate the method on a new large dataset with twelve action classes and ten scene classes acquired from 69 movies.
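As a rough illustration of contribution (a), the correlation between scene classes and actions can be thought of as a conditional frequency estimated from co-occurring labels. The sketch below is not the authors' procedure: the function name `action_given_scene` and the toy label pairs are hypothetical, and the counts stand in for what script text mining would provide.

```python
from collections import Counter, defaultdict


def action_given_scene(samples):
    """Estimate p(action | scene) from a list of (scene, action) label pairs.

    Assumption: each pair marks one co-occurrence of a scene class and an
    action class, e.g. as mined from a movie script.
    """
    pair_counts = Counter(samples)
    scene_counts = Counter(scene for scene, _ in samples)
    probs = defaultdict(dict)
    for (scene, action), n in pair_counts.items():
        # Relative frequency of the action among all samples of this scene.
        probs[scene][action] = n / scene_counts[scene]
    return dict(probs)


# Hypothetical toy annotations, invented purely for illustration.
samples = [
    ("kitchen", "eat"), ("kitchen", "eat"), ("kitchen", "stand up"),
    ("outdoors", "run"), ("outdoors", "run"), ("outdoors", "eat"),
]
probs = action_given_scene(samples)
```

In this toy data, "eat" dominates the "kitchen" scene and "run" dominates "outdoors", which mirrors the eating-in-a-kitchen example in the abstract. A table of this kind is one simple way scene evidence could reweight action scores.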
Document type:
Conference paper
CVPR 2009 - IEEE Conference on Computer Vision & Pattern Recognition, Jun 2009, Miami, United States. IEEE Computer Society, pp.2929-2936, 2009, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5206557>. <10.1109/CVPR.2009.5206557>
https://hal.inria.fr/inria-00548645
Contributor: Thoth Team
Submitted on: Monday, 20 December 2010 - 10:24:06
Last modified on: Friday, 13 January 2017 - 14:15:09
Document(s) archived on: Monday, 5 November 2012 - 14:37:10

Citation

Marcin Marszałek, Ivan Laptev, Cordelia Schmid. Actions in Context. CVPR 2009 - IEEE Conference on Computer Vision & Pattern Recognition, Jun 2009, Miami, United States. IEEE Computer Society, pp.2929-2936, 2009, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5206557>. <10.1109/CVPR.2009.5206557>. <inria-00548645>
