Activity representation with motion hierarchies

Adrien Gaidon 1, 2, 3, * Zaid Harchaoui 1 Cordelia Schmid 1
* Corresponding author
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 Computer Vision
Xerox Research Centre Europe [Meylan]
Abstract: Complex activities, e.g., pole vaulting, are composed of a variable number of sub-events connected by complex spatio-temporal relations, whereas simple actions can be represented as sequences of short temporal parts. In this paper, we learn hierarchical representations of activity videos in an unsupervised manner. These hierarchies of mid-level motion components are data-driven decompositions specific to each video. We introduce a spectral divisive clustering algorithm to efficiently extract a hierarchy over a large number of tracklets (i.e., local trajectories). We use this structure to represent a video as an unordered binary tree. We model this tree using nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two hierarchical decompositions by relying on models of their parent-child relations. We present experimental results on four recent challenging benchmarks: the High Five dataset [Patron-Perez et al., 2010], the Olympic Sports dataset [Niebles et al., 2010], the Hollywood 2 dataset [Marszalek et al., 2009], and the HMDB dataset [Kuehne et al., 2011]. We show that per-video hierarchies provide additional information for activity recognition. Our approach improves over unstructured activity models, baselines using other motion decomposition algorithms, and the state of the art.
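To make the idea of divisive spectral clustering into an unordered binary tree more concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a generic recursive bipartition using the sign of the Fiedler vector of a normalized graph Laplacian, a Gaussian affinity on toy feature vectors standing in for tracklets, and hypothetical names (`fiedler_split`, `divisive_tree`, `min_size`, `sigma`) chosen for illustration only.

```python
import numpy as np


def fiedler_split(affinity):
    """Bipartition items by the sign of the Fiedler vector (assumed generic variant)."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(affinity)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * inv_sqrt      # second-smallest eigenvector, rescaled
    return fiedler >= 0


def divisive_tree(points, indices=None, min_size=2, sigma=1.0):
    """Recursively split points into an unordered binary tree of index sets."""
    if indices is None:
        indices = np.arange(len(points))
    node = {"indices": indices, "children": []}
    if len(indices) < 2 * min_size:
        return node  # too small to split further
    sub = points[indices]
    sq_dists = ((sub[:, None, :] - sub[None, :, :]) ** 2).sum(axis=-1)
    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))  # assumed Gaussian affinity
    mask = fiedler_split(affinity)
    left, right = indices[mask], indices[~mask]
    if len(left) < min_size or len(right) < min_size:
        return node  # reject degenerate splits
    node["children"] = [
        divisive_tree(points, left, min_size, sigma),
        divisive_tree(points, right, min_size, sigma),
    ]
    return node


# Toy usage: two small groups of 2-D points standing in for tracklet features.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
tree = divisive_tree(points, min_size=2, sigma=1.0)
for child in tree["children"]:
    print(sorted(child["indices"].tolist()))
```

Each node stores the indices of the items it covers, so a per-node descriptor (for example a histogram of local motion features, as the abstract describes) could be computed over nested subsets. How the paper defines affinities, stopping criteria, and the tree kernel is not specified in this record.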
Document type:
Journal article
International Journal of Computer Vision, Springer Verlag, 2014, 107 (3), pp.219-238. 〈10.1007/s11263-013-0677-1〉
Cited literature [25 references]


https://hal.inria.fr/hal-00908581
Contributor: Thoth Team
Submitted on: Monday, November 25, 2013 - 09:17:13
Last modified on: Thursday, January 11, 2018 - 06:21:56
Document(s) archived on: Wednesday, February 26, 2014 - 04:24:27

Files

tracklets_journal.pdf
Files produced by the author(s)

Citation

Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. Activity representation with motion hierarchies. International Journal of Computer Vision, Springer Verlag, 2014, 107 (3), pp.219-238. 〈10.1007/s11263-013-0677-1〉. 〈hal-00908581〉

Metrics

Record views: 1194

File downloads: 1484