
A Spatio-Temporal Descriptor Based on 3D-Gradients

Alexander Klaser (1), Marcin Marszałek (2), Cordelia Schmid (1)
(1) LEAR - Learning and recognition in vision, Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract: In this work, we present a novel local descriptor for video sequences. The proposed descriptor is based on histograms of oriented 3D spatio-temporal gradients. Our contribution is four-fold. (i) To compute 3D gradients for arbitrary scales, we develop a memory-efficient algorithm based on integral videos. (ii) We propose a generic 3D orientation quantization which is based on regular polyhedrons. (iii) We perform an in-depth evaluation of all descriptor parameters and optimize them for action recognition. (iv) We apply our descriptor to various action datasets (KTH, Weizmann, Hollywood) and show that we outperform the state-of-the-art.
Document type: Conference papers

Contributor: Alexander Klaser
Submitted on: Friday, September 3, 2010 - 2:15:04 PM
Last modification on: Tuesday, October 19, 2021 - 11:13:04 PM
Long-term archiving on: Tuesday, October 23, 2012 - 3:30:54 PM


  • HAL Id: inria-00514853, version 1



Alexander Klaser, Marcin Marszałek, Cordelia Schmid. A Spatio-Temporal Descriptor Based on 3D-Gradients. BMVC 2008 - 19th British Machine Vision Conference, Sep 2008, Leeds, United Kingdom. pp.275:1-10. ⟨inria-00514853⟩


