Conference papers

Recognizing activities with cluster-trees of tracklets

Adrien Gaidon 1, 2, * Zaid Harchaoui 1 Cordelia Schmid 1
* Corresponding author
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: We address the problem of recognizing complex activities, such as pole vaulting, which are characterized by the composition of a large and variable number of different spatio-temporal parts. We represent a video as a hierarchy of mid-level motion components. This hierarchy is a data-driven decomposition specific to each video. We introduce a divisive clustering algorithm that can efficiently extract a hierarchy over a large number of local trajectories. We use this structure to represent a video as an unordered binary tree. This tree is modeled by nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two tree decompositions by relying on models of their edges. Contrary to most approaches based on action decompositions, we propose to use the full hierarchical action structure instead of selecting a small fixed number of parts. We present experimental results on two recent challenging benchmarks that focus on complex activities and show that our kernel on per-video hierarchies efficiently discriminates between complex activities sharing common action parts. Our approach improves over the state of the art, including unstructured activity models, baselines using other motion decomposition algorithms, graph matching, and latent models explicitly selecting a fixed number of parts.
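The pipeline described in the abstract (divisive clustering of trajectories, a binary tree of nested histograms, and a kernel over tree edges) can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each tracklet is summarized by a mean (x, y, t) position and a quantized motion codeword, it substitutes a simple median split along the principal axis for the paper's clustering algorithm, and it compares edges with a product of linear kernels on L2-normalized histograms. The names `build_tree`, `edges`, and `edge_kernel` are hypothetical.

```python
import numpy as np


def build_tree(points, words, n_words, min_size=4, depth=0, max_depth=3):
    """Recursively bisect tracklets into a binary tree.

    points: (n, d) array of tracklet positions (assumed summary, e.g. x, y, t).
    words:  (n,) integer array of codeword indices in [0, n_words).
    Each node stores the histogram of the codewords of its members, so a
    parent's histogram is the sum of its children's (nested histograms).
    """
    node = {"hist": np.bincount(words, minlength=n_words).astype(float),
            "children": []}
    if len(points) < 2 * min_size or depth >= max_depth:
        return node
    # Assumption: split at the median along the principal axis. The paper
    # uses its own divisive clustering algorithm; this is only a stand-in.
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    left = proj <= np.median(proj)
    if left.all() or not left.any():
        return node  # degenerate split: keep this node as a leaf
    for mask in (left, ~left):
        node["children"].append(
            build_tree(points[mask], words[mask], n_words,
                       min_size, depth + 1, max_depth))
    return node


def edges(node):
    """Return all (parent histogram, child histogram) pairs of a tree."""
    out = []
    for child in node["children"]:
        out.append((node["hist"], child["hist"]))
        out.extend(edges(child))
    return out


def _unit(h):
    """L2-normalize a histogram, leaving all-zero histograms unchanged."""
    norm = np.linalg.norm(h)
    return h / norm if norm > 0 else h


def edge_kernel(tree_a, tree_b):
    """Average, over all pairs of edges, of k(parents) * k(children).

    With a linear base kernel this equals an inner product between mean
    tensor-product edge embeddings, so it is positive definite.
    """
    ea, eb = edges(tree_a), edges(tree_b)
    if not ea or not eb:
        return 0.0
    pa = np.array([_unit(p) for p, _ in ea])
    ca = np.array([_unit(c) for _, c in ea])
    pb = np.array([_unit(p) for p, _ in eb])
    cb = np.array([_unit(c) for _, c in eb])
    return float(np.mean((pa @ pb.T) * (ca @ cb.T)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "videos": random tracklet positions and codewords.
    video_a = (rng.normal(size=(40, 3)), rng.integers(0, 8, size=40))
    video_b = (rng.normal(size=(50, 3)), rng.integers(0, 8, size=50))
    tree_a = build_tree(*video_a, n_words=8)
    tree_b = build_tree(*video_b, n_words=8)
    print(edge_kernel(tree_a, tree_b))
```

Because the kernel sums over all edges rather than a selected subset, every level of the hierarchy contributes to the similarity, which mirrors the abstract's point about using the full hierarchical structure instead of a small fixed number of parts.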

Cited literature: 6 references
Contributor: Thoth Team
Submitted on: Tuesday, August 7, 2012 - 9:50:26 AM
Last modification on: Thursday, March 26, 2020 - 8:49:27 PM
Document(s) archived on: Thursday, November 8, 2012 - 2:20:45 AM




Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. Recognizing activities with cluster-trees of tracklets. BMVC 2012 - British Machine Vision Conference, Sep 2012, Guildford, United Kingdom. pp.30.1-30.13, ⟨10.5244/C.26.30⟩. ⟨hal-00722955v2⟩


