Skip to Main content Skip to Navigation
Conference papers

Recognizing activities with cluster-trees of tracklets

Adrien Gaidon 1, 2, * Zaid Harchaoui 1 Cordelia Schmid 1
* Corresponding author
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract : We address the problem of recognizing complex activities, such as pole vaulting, which are characterized by the composition of a large and variable number of different spatio-temporal parts. We represent a video as a hierarchy of mid-level motion components. This hierarchy is a data-driven decomposition specific to each video. We introduce a divisive clustering algorithm that can efficiently extract a hierarchy over a large number of local trajectories. We use this structure to represent a video as an unordered binary tree. This tree is modeled by nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two tree decompositions by relying on models of their edges. Contrary to most approaches based on action decompositions, we propose to use the full hierarchical action structure instead of selecting a small fixed number of parts. We present experimental results on two recent challenging benchmarks that focus on complex activities and show that our kernel on per-video hierarchies allows to efficiently discriminate between complex activities sharing common action parts. Our approach improves over the state of the art, including unstructured activity models, baselines using other motion decomposition algorithms, graph matching, and latent models explicitly selecting a fixed number of parts.
Document type :
Conference papers
Complete list of metadata

Cited literature [6 references]  Display  Hide  Download
Contributor : Thoth Team Connect in order to contact the contributor
Submitted on : Tuesday, August 7, 2012 - 9:50:26 AM
Last modification on : Tuesday, October 19, 2021 - 11:13:04 PM
Long-term archiving on: : Thursday, November 8, 2012 - 2:20:45 AM




Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. Recognizing activities with cluster-trees of tracklets. BMVC 2012 - British Machine Vision Conference, Sep 2012, Guildford, United Kingdom. pp.30.1-30.13, ⟨10.5244/C.26.30⟩. ⟨hal-00722955v2⟩



Record views


Files downloads