A Spatio-Temporal Descriptor Based on 3D-Gradients

Alexander Klaser (1), Marcin Marszałek (2), Cordelia Schmid (1)
(1) LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: In this work, we present a novel local descriptor for video sequences. The proposed descriptor is based on histograms of oriented 3D spatio-temporal gradients. Our contribution is four-fold. (i) To compute 3D gradients for arbitrary scales, we develop a memory-efficient algorithm based on integral videos. (ii) We propose a generic 3D orientation quantization based on regular polyhedra. (iii) We perform an in-depth evaluation of all descriptor parameters and optimize them for action recognition. (iv) We apply our descriptor to various action datasets (KTH, Weizmann, Hollywood) and show that it outperforms the state of the art.
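The two building blocks named in points (i) and (ii) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (integral_video, box_sum, dodecahedron_face_axes, orientation_histogram) are hypothetical, the choice of a dodecahedron with 12 face-center axes is an assumption made for brevity, and the voting rule (clip negative projections, then scale by gradient magnitude) is a simplification, since the abstract does not describe the exact scheme used in the paper.

```python
import numpy as np

def integral_video(v):
    """3D summed-volume table with a zero border (axes: t, y, x)."""
    iv = np.zeros(tuple(s + 1 for s in v.shape), dtype=np.float64)
    iv[1:, 1:, 1:] = v.cumsum(0).cumsum(1).cumsum(2)
    return iv

def box_sum(iv, t0, t1, y0, y1, x0, x1):
    """Sum of the source video over [t0:t1, y0:y1, x0:x1] with 8 lookups."""
    return (iv[t1, y1, x1]
            - iv[t0, y1, x1] - iv[t1, y0, x1] - iv[t1, y1, x0]
            + iv[t0, y0, x1] + iv[t0, y1, x0] + iv[t1, y0, x0]
            - iv[t0, y0, x0])

def dodecahedron_face_axes():
    """Unit vectors through the 12 face centers of a regular dodecahedron
    (equivalently, the 12 vertices of a regular icosahedron)."""
    phi = (1 + 5 ** 0.5) / 2
    axes = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            a, b, c = 0.0, float(s1), s2 * phi
            # The three cyclic permutations of (0, +-1, +-phi).
            axes += [(a, b, c), (c, a, b), (b, c, a)]
    p = np.array(axes)
    return p / np.linalg.norm(p, axis=1, keepdims=True)

def orientation_histogram(g, axes):
    """Simplified vote: project the gradient direction on every axis,
    drop negative projections, and weight by the gradient magnitude."""
    mag = np.linalg.norm(g)
    if mag == 0:
        return np.zeros(len(axes))
    q = np.clip(axes @ (g / mag), 0.0, None)
    return mag * q / q.sum()

# Usage: mean 3D gradient of a cuboid at an arbitrary scale, then quantized.
video = np.random.default_rng(0).random((16, 32, 32))
grads = np.gradient(video)                    # derivatives along t, y, x
tables = [integral_video(g) for g in grads]   # one integral video each
cuboid = (2, 10, 4, 20, 4, 20)                # t0, t1, y0, y1, x0, x1
volume = (10 - 2) * (20 - 4) * (20 - 4)
mean_grad = np.array([box_sum(t, *cuboid) for t in tables]) / volume
hist = orientation_histogram(mean_grad, dodecahedron_face_axes())
```

With the integral videos precomputed once, the mean gradient of any cuboid costs eight lookups per component regardless of its size, which is what makes evaluation at arbitrary scales cheap.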
Document type:
Conference paper
Mark Everingham and Chris Needham and Roberto Fraile. BMVC 2008 - 19th British Machine Vision Conference, Sep 2008, Leeds, United Kingdom. British Machine Vision Association, pp.275:1-10, 2008

https://hal.inria.fr/inria-00514853
Contributor: Alexander Klaser
Submitted on: Friday, 3 September 2010 - 14:15:04
Last modified on: Wednesday, 9 July 2014 - 15:16:21
Document(s) archived on: Tuesday, 23 October 2012 - 15:30:54

Identifiers

  • HAL Id : inria-00514853, version 1

Citation

Alexander Klaser, Marcin Marszałek, Cordelia Schmid. A Spatio-Temporal Descriptor Based on 3D-Gradients. Mark Everingham and Chris Needham and Roberto Fraile. BMVC 2008 - 19th British Machine Vision Conference, Sep 2008, Leeds, United Kingdom. British Machine Vision Association, pp.275:1-10, 2008. <inria-00514853>

Metrics

Record views: 2462

Document downloads: 2120