
Encoding Feature Maps of CNNs for Action Recognition

Xiaojiang Peng 1 Cordelia Schmid 1
1 LEAR - Learning and recognition in vision
Grenoble INP [2007-2019] - Institut polytechnique de Grenoble - Grenoble Institute of Technology [2007-2019], LJK [2007-2015] - Laboratoire Jean Kuntzmann [2007-2015], Inria Grenoble - Rhône-Alpes
Abstract : We describe our approach to action classification in the THUMOS Challenge 2015. The approach is based on two types of features: improved dense trajectories and CNN features. For trajectory features, we extract HOG, HOF, MBHx, and MBHy descriptors and apply Fisher vector encoding. For CNN features, we use a recent deep CNN model, VGG19, to capture appearance features, and we use VLAD encoding to encode and pool the convolutional feature maps, which performs better than both average pooling of the feature maps and fully connected activation features. After concatenating the two encoded representations, we train a linear SVM classifier for each class in a one-vs-all scheme.
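As a rough illustration of the VLAD step mentioned in the abstract, the sketch below treats each spatial location of a convolutional feature map as one local descriptor, assigns it to its nearest codebook center, and accumulates the residuals. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names `vlad_encode` and `average_pool`, the 14 x 14 x 512 map shape, the codebook size of 8, the random codebook (in practice it would typically be learned, for example with k-means), and the power plus L2 normalization are all illustrative choices that the record does not specify.

```python
import numpy as np


def vlad_encode(feature_map, centers):
    """VLAD-encode a feature map of shape (H, W, C) with a codebook of shape (K, C)."""
    h, w, c = feature_map.shape
    # Each spatial location contributes one C-dimensional local descriptor.
    descriptors = feature_map.reshape(h * w, c)

    # Hard-assign every descriptor to its nearest center (squared Euclidean distance).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    # Sum the residuals (descriptor minus center) per center.
    k = centers.shape[0]
    vlad = np.zeros((k, c))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members) > 0:
            vlad[i] = (members - centers[i]).sum(axis=0)

    # Assumption: signed square-root (power) normalization followed by L2 normalization.
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def average_pool(feature_map):
    """Baseline for comparison: average over all spatial locations, giving a C-dim vector."""
    return feature_map.mean(axis=(0, 1))


# Synthetic stand-ins for a real feature map and a learned codebook (assumptions).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((14, 14, 512))
centers = rng.standard_normal((8, 512))

vlad_vector = vlad_encode(feature_map, centers)
avg_vector = average_pool(feature_map)
```

The VLAD vector has K x C dimensions, against C for average pooling, so it retains per-center residual information that averaging discards. In a pipeline like the one described, such a vector would be concatenated with the Fisher vector of the trajectory descriptors before training the linear SVMs.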
Document type : Other publications

Cited literature : 9 references
Contributor : Thoth Team
Submitted on : Thursday, December 10, 2015 - 4:43:43 PM
Last modification on : Friday, July 17, 2020 - 11:38:58 AM
Long-term archiving on : Saturday, April 29, 2017 - 2:32:32 AM


Files produced by the author(s)


  • HAL Id : hal-01236843, version 1



Xiaojiang Peng, Cordelia Schmid. Encoding Feature Maps of CNNs for Action Recognition. 2015. ⟨hal-01236843⟩


