Recognizing activities with cluster-trees of tracklets

Adrien Gaidon 1, 2, *, Zaid Harchaoui 1, Cordelia Schmid 1
* Corresponding author
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : We address the problem of recognizing complex activities, such as pole vaulting, which are characterized by the composition of a large and variable number of different spatio-temporal parts. We represent a video as a hierarchy of mid-level motion components. This hierarchy is a data-driven decomposition specific to each video. We introduce a divisive clustering algorithm that can efficiently extract a hierarchy over a large number of local trajectories. We use this structure to represent a video as an unordered binary tree. This tree is modeled by nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two tree decompositions by relying on models of their edges. Contrary to most approaches based on action decompositions, we propose to use the full hierarchical action structure instead of selecting a small fixed number of parts. We present experimental results on two recent challenging benchmarks that focus on complex activities and show that our kernel on per-video hierarchies makes it possible to efficiently discriminate between complex activities that share common action parts. Our approach improves over the state of the art, including unstructured activity models, baselines using other motion decomposition algorithms, graph matching, and latent models explicitly selecting a fixed number of parts.
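The pipeline outlined in the abstract (divisive clustering of tracklets, nested histograms, a kernel over tree edges) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tracklet layout `(x, y, t, word)`, the median split on the highest-variance coordinate, and the edge kernel (a product of histogram intersections averaged over all edge pairs) are assumptions chosen for illustration; the names `build_tree`, `edges`, and `tree_kernel` are hypothetical.

```python
from collections import Counter
from statistics import pvariance

# Assumed layout: a tracklet is (x, y, t, word), i.e. a mean space-time
# position plus the index of a quantized local motion feature.

def histogram(tracklets, vocab_size):
    """L1-normalized bag-of-words histogram over the tracklets' words."""
    counts = Counter(word for _, _, _, word in tracklets)
    return [counts[w] / len(tracklets) for w in range(vocab_size)]

def build_tree(tracklets, vocab_size, min_size=2):
    """Divisively split a non-empty tracklet list into a binary tree.

    Each node stores the histogram of the tracklets it contains, so a
    child's tracklets are always a subset of its parent's (nested models).
    """
    node = {"hist": histogram(tracklets, vocab_size), "children": []}
    if len(tracklets) < 2 * min_size:
        return node  # too few tracklets to split further: this is a leaf
    # Stand-in split rule (an assumption, not the paper's algorithm):
    # cut at the median of the coordinate with the largest variance.
    axis = max(range(3), key=lambda a: pvariance([tr[a] for tr in tracklets]))
    ordered = sorted(tracklets, key=lambda tr: tr[axis])
    mid = len(ordered) // 2
    node["children"] = [build_tree(part, vocab_size, min_size)
                        for part in (ordered[:mid], ordered[mid:])]
    return node

def edges(node):
    """Yield (parent_hist, child_hist) for every edge of the tree."""
    for child in node["children"]:
        yield node["hist"], child["hist"]
        yield from edges(child)

def intersection(h1, h2):
    """Histogram intersection between two histograms of equal length."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def tree_kernel(tree_a, tree_b):
    """Average similarity over all pairs of edges of the two trees."""
    ea, eb = list(edges(tree_a)), list(edges(tree_b))
    if not ea or not eb:
        return 0.0  # a single-node tree has no edges to compare
    total = sum(intersection(pa, pb) * intersection(ca, cb)
                for pa, ca in ea for pb, cb in eb)
    return total / (len(ea) * len(eb))
```

Because the kernel compares sets of edges rather than aligned nodes, it does not depend on the order of a node's children, which matches the unordered binary tree mentioned in the abstract. The resulting values could be placed in a Gram matrix for a kernel classifier.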
Document type :
Conference papers
Richard Bowden and John P. Collomosse and Krystian Mikolajczyk. BMVC 2012 - British Machine Vision Conference, Sep 2012, Guildford, United Kingdom. BMVA Press, pp.30.1-30.13, 2012, 〈10.5244/C.26.30〉

https://hal.inria.fr/hal-00722955
Contributor : Thoth Team
Submitted on : Tuesday, August 7, 2012 - 9:50:26 AM
Last modification on : Monday, July 14, 2014 - 10:41:53 PM
Document(s) archived on : Thursday, November 8, 2012 - 2:20:45 AM

Citation

Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. Recognizing activities with cluster-trees of tracklets. Richard Bowden and John P. Collomosse and Krystian Mikolajczyk. BMVC 2012 - British Machine Vision Conference, Sep 2012, Guildford, United Kingdom. BMVA Press, pp.30.1-30.13, 2012, 〈10.5244/C.26.30〉. 〈hal-00722955v2〉
