Recognizing human actions in still images: a study of bag-of-features and part-based representations

Vincent Delaitre 1, Ivan Laptev 2, Josef Sivic 2
2 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: Recognition of human actions is usually addressed in the scope of video interpretation. Meanwhile, common human actions such as "reading a book", "playing a guitar" or "writing notes" also provide a natural description for many still images. In addition, some actions in video such as "taking a photograph" are static by their nature and may require recognition methods based on static cues only. Motivated by the potential impact of recognizing actions in still images and the little attention this problem has received in computer vision so far, we address recognition of human actions in consumer photographs. We construct a new dataset available at http://www.di.ens.fr/willow/research/stillactions/ with seven classes of actions in 911 Flickr images representing natural variations of human actions in terms of camera view-point, human pose, clothing, occlusions and scene background. We study action recognition in still images using the state-of-the-art bag-of-features methods as well as their combination with the part-based Latent SVM approach of Felzenszwalb et al. In particular, we investigate the role of background scene context and demonstrate that improved action recognition performance can be achieved by (i) combining the statistical and part-based representations, and (ii) integrating person-centric description with the background scene context. We show results on our newly collected dataset of seven common actions as well as demonstrate improved performance over existing methods on the datasets of Gupta et al. and Yao and Fei-Fei.
Document type: Conference paper
BMVC 2010 - 21st British Machine Vision Conference, Aug 2010, Aberystwyth, United Kingdom. 2010

Cited literature: 23 references

https://hal.inria.fr/hal-01060885
Contributor: Vincent Delaitre
Submitted on: Thursday, 4 September 2014 - 14:37:16
Last modified on: Thursday, 11 January 2018 - 06:23:05
Document(s) archived on: Friday, 5 December 2014 - 10:28:40

File

delaitre_BMVC10.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01060885, version 1

Citation

Vincent Delaitre, Ivan Laptev, Josef Sivic. Recognizing human actions in still images: a study of bag-of-features and part-based representations. BMVC 2010 - 21st British Machine Vision Conference, Aug 2010, Aberystwyth, United Kingdom. 2010. 〈hal-01060885〉

Metrics

Record views: 458

File downloads: 412