Temporal Localization of Actions with Actoms

Adrien Gaidon (1, 2, *), Zaid Harchaoui (1), Cordelia Schmid (1)
* Corresponding author
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : We address the problem of detecting actions, such as drinking or opening a door, in hours of challenging video data. We propose a model based on a sequence of atomic action units, termed "actoms", that are semantically meaningful and characteristic of the action. Our Actom Sequence Model (ASM) represents the temporal structure of actions as a sequence of histograms of actom-anchored visual features. Our representation, which can be seen as a temporally structured extension of the bag-of-features, is flexible, sparse, and discriminative. Training requires the annotation of actoms for action examples. At test time, actoms are detected automatically based on a non-parametric model of the distribution of actoms, which also acts as a prior on an action's temporal structure. We present experimental results on two recent benchmarks for temporal action detection: "Coffee and Cigarettes" and the "DLSBP" dataset. We also adapt our approach to a classification-by-detection set-up and demonstrate its applicability on the challenging "Hollywood 2" dataset. We show that our ASM method outperforms the current state of the art in temporal action detection, as well as baselines that detect actions with a sliding-window method combined with bag-of-features.
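As a rough illustration of what "a sequence of histograms of actom-anchored visual features" could look like, the sketch below builds one visual-word histogram per actom and concatenates them. It is not the authors' implementation: the function name `actom_sequence_descriptor`, the Gaussian temporal weighting, the fixed `bandwidth` parameter, and the L1 normalization are all assumptions made for this example.

```python
import math

def actom_sequence_descriptor(features, actom_times, vocab_size, bandwidth=5.0):
    """Concatenate one temporally weighted histogram per actom.

    features    : list of (frame_index, visual_word_id) pairs (hypothetical input format)
    actom_times : frame indices of the actoms, in temporal order
    vocab_size  : number of visual words
    bandwidth   : width of the temporal weighting, in frames (assumed, fixed here)
    """
    descriptor = []
    for t_actom in actom_times:
        hist = [0.0] * vocab_size
        for t, word in features:
            # Assumed weighting: features close in time to the actom count more.
            hist[word] += math.exp(-0.5 * ((t - t_actom) / bandwidth) ** 2)
        total = sum(hist)
        if total > 0:
            # L1-normalize each per-actom histogram.
            hist = [h / total for h in hist]
        descriptor.extend(hist)
    return descriptor

# Toy example: word 0 occurs near frame 0, word 1 near frame 20.
features = [(0, 0), (1, 0), (20, 1), (21, 1)]
desc = actom_sequence_descriptor(features, actom_times=[0, 20], vocab_size=2)
print(len(desc))
```

With a single actom and a very large bandwidth, this sketch reduces to an ordinary bag-of-features histogram over the whole window, which is one way to read the abstract's description of the representation as a temporally structured extension of bag-of-features.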

https://hal.inria.fr/hal-00687312
Contributor : Thoth Team
Submitted on : Monday, January 21, 2013 - 11:38:23 AM
Last modification on : Monday, December 17, 2018 - 11:22:02 AM
Long-term archiving on : Monday, April 22, 2013 - 3:52:36 AM

Files

RR-7930.pdf

Identifiers

  • HAL Id : hal-00687312, version 2

Citation

Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. Temporal Localization of Actions with Actoms. [Research Report] RR-7930, INRIA. 2012. ⟨hal-00687312v2⟩

Metrics

Record views : 946
File downloads : 788