Learning person-object interactions for action recognition in still images

V. Delaitre 1, *, J. Sivic 1, I. Laptev 1
* Corresponding author
1 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: We investigate a discriminatively trained model of person-object interactions for recognizing common human actions in still images. We build on the locally orderless spatial pyramid bag-of-features model, which has been shown to perform extremely well on a range of object, scene, and human action recognition tasks. We make three principal contributions. First, we replace the standard quantized local HOG/SIFT features with stronger discriminatively trained body part and object detectors. Second, we introduce new person-object interaction features based on spatial co-occurrences of individual body parts and objects. Third, we address the combinatorial problem posed by the large number of possible interaction pairs and propose a discriminative selection procedure using a linear support vector machine (SVM) with a sparsity-inducing regularizer. Learning action-specific body part and object interactions bypasses the difficult problem of estimating the complete human body pose configuration. We demonstrate the benefits of the proposed model on human action recognition in consumer photographs, where it outperforms the strong bag-of-features baseline.
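The third contribution, selecting a small set of interaction pairs with a sparsity-inducing linear classifier, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a squared hinge loss with an L1 penalty, fitted by proximal gradient descent in place of an SVM solver. The function names, the hyperparameters, and the three feature columns (imagined here as responses of three candidate body part/object pairs) are all hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def train_sparse_linear_classifier(X, y, lam=0.1, step=0.1, iters=500):
    """Minimize (1/n) * sum(max(0, 1 - y_i * w.x_i)^2) + lam * ||w||_1.

    X is a list of feature rows, y a list of labels in {-1, +1}.
    Each iteration takes a gradient step on the squared hinge loss,
    then applies soft-thresholding, which sets weak weights exactly to zero.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            slack = 1.0 - margin
            if slack > 0:  # only margin violations contribute to the loss
                for j in range(d):
                    grad[j] -= 2.0 * slack * yi * xi[j] / n
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data: one row per image, one column per candidate pair.
# Column 0 tracks the label strongly, column 1 only weakly, column 2 not at all.
X = [
    [1.0, 0.02, 1.0],
    [1.0, 0.02, -1.0],
    [-1.0, -0.02, 1.0],
    [-1.0, -0.02, -1.0],
]
y = [1, 1, -1, -1]

w = train_sparse_linear_classifier(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(selected)  # prints [0]: only the strongly informative pair is kept
```

On this toy data the L1 penalty drives the weights of the weak and uninformative columns to exactly zero, which is the sense in which a sparsity-inducing regularizer performs selection among many candidate pairs.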
Document type:
Conference paper
J. Shawe-Taylor and R.S. Zemel and P. Bartlett and F. Pereira and K.Q. Weinberger. NIPS 2011: Twenty-Fifth Annual Conference on Neural Information Processing Systems, Dec 2011, Granada, Spain. 2011, Advances in Neural Information Processing Systems

Cited literature: 37 references

https://hal.inria.fr/hal-00648156
Contributor: Josef Sivic
Submitted on: Monday, December 5, 2011 - 11:47:52
Last modified on: Tuesday, April 17, 2018 - 11:29:00
Document(s) archived on: Friday, November 16, 2012 - 14:21:04

File

delaitre11.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00648156, version 1

Citation

V. Delaitre, J. Sivic, I. Laptev. Learning person-object interactions for action recognition in still images. J. Shawe-Taylor and R.S. Zemel and P. Bartlett and F. Pereira and K.Q. Weinberger. NIPS 2011: Twenty-Fifth Annual Conference on Neural Information Processing Systems, Dec 2011, Granada, Spain. 2011, Advances in Neural Information Processing Systems. 〈hal-00648156〉

Metrics

Record views: 321

File downloads: 138