Learning person-object interactions for action recognition in still images

V. Delaitre (1, *), J. Sivic (1), I. Laptev (1)
* Corresponding author
1 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : We investigate a discriminatively trained model of person-object interactions for recognizing common human actions in still images. We build on the locally order-less spatial pyramid bag-of-features model, which has been shown to perform extremely well on a range of object, scene and human action recognition tasks. We introduce three principal contributions. First, we replace the standard quantized local HOG/SIFT features with stronger discriminatively trained body part and object detectors. Second, we introduce new person-object interaction features based on spatial co-occurrences of individual body parts and objects. Third, we address the combinatorial problem of a large number of possible interaction pairs and propose a discriminative selection procedure using a linear support vector machine (SVM) with a sparsity-inducing regularizer. Learning action-specific body part and object interactions bypasses the difficult problem of estimating the complete human body pose configuration. The benefits of the proposed model are shown on human action recognition in consumer photographs, where it outperforms the strong bag-of-features baseline.
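The selection step named in the third contribution can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an L1 penalty as the sparsity-inducing regularizer and a plain proximal subgradient solver, neither of which the abstract specifies. The feature names and toy values below are hypothetical and stand in for candidate person-object interaction pairs.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_l1_svm(X, y, lam=0.3, lr=0.1, epochs=200):
    """Minimise mean hinge loss + lam * ||w||_1 by proximal subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge loss is active for this example
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        # Gradient step on the hinge loss, then L1 shrinkage on the weights.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical candidate interaction pairs (names are illustrative only).
names = ["hand-racket", "torso-horse", "head-laptop"]

# Toy labels: +1 = image shows the action, -1 = it does not.
y = [1, -1, 1, -1, 1, -1, 1, -1]

# Toy feature columns: the first tracks the label exactly, the second is
# only weakly correlated with it, the third is uncorrelated.
col0 = [1, -1, 1, -1, 1, -1, 1, -1]
col1 = [1, 1, -1, -1, 1, 1, 1, -1]
col2 = [1, 1, 1, 1, -1, -1, -1, -1]
X = [list(row) for row in zip(col0, col1, col2)]

w, b = train_l1_svm(X, y)

# Pairs whose weight was not driven to zero are the "selected" interactions.
selected = [name for name, wj in zip(names, w) if wj != 0.0]
print(selected)
```

In this toy setting the L1 shrinkage keeps the weight of the weakly correlated pair at exactly zero, so only the informative pair survives; an unpenalised SVM would assign it a nonzero weight. With real detector responses the penalty strength would have to be tuned, for example by cross-validation.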
Document type : Conference papers

Cited literature : 37 references

https://hal.inria.fr/hal-00648156
Contributor : Josef Sivic
Submitted on : Monday, December 5, 2011 - 11:47:52 AM
Last modification on : Thursday, July 1, 2021 - 5:58:06 PM
Long-term archiving on : Friday, November 16, 2012 - 2:21:04 PM

File

delaitre11.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00648156, version 1

Citation

V. Delaitre, J. Sivic, I. Laptev. Learning person-object interactions for action recognition in still images. NIPS 2011 : Twenty-Fifth Annual Conference on Neural Information Processing Systems, NIPS Foundation, Dec 2011, Granada, Spain. ⟨hal-00648156⟩
