Encoding Feature Maps of CNNs for Action Recognition

Xiaojiang Peng 1 Cordelia Schmid 1 
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract : We describe our approach to action classification in the THUMOS Challenge 2015. Our approach is based on two types of features: improved dense trajectories and CNN features. For trajectory features, we extract HOG, HOF, MBHx, and MBHy descriptors and apply Fisher vector encoding. For CNN features, we use a recent deep CNN model, VGG19, to capture appearance features, and apply VLAD encoding to encode and pool the convolutional feature maps, which performs better than either average pooling of the feature maps or fully connected activation features. After concatenating the two representations, we train a linear SVM classifier for each class in a one-vs-all scheme.
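As an illustration of the VLAD step mentioned in the abstract, the sketch below treats each spatial location of a convolutional feature map as one local descriptor, assigns it to its nearest codebook center, and accumulates the residuals per center. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `vlad_encode` and `average_pool`, the final L2 normalization, the feature-map shape, and the random codebook are all illustrative choices (in practice a codebook would typically be learned, for example with k-means).

```python
import numpy as np


def vlad_encode(feature_map, centers):
    """Encode a (C, H, W) feature map into a K*C VLAD vector.

    feature_map: array of shape (C, H, W), e.g. one conv-layer output.
    centers:     array of shape (K, C), the codebook (assumed given).
    """
    c, h, w = feature_map.shape
    # Each spatial location becomes one C-dimensional local descriptor.
    descriptors = feature_map.reshape(c, h * w).T  # shape (H*W, C)

    # Squared Euclidean distance from every descriptor to every center.
    diff = descriptors[:, None, :] - centers[None, :, :]
    nearest = (diff ** 2).sum(axis=2).argmin(axis=1)  # shape (H*W,)

    k = centers.shape[0]
    vlad = np.zeros((k, c), dtype=np.float64)
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            # Sum of residuals between descriptors and their center.
            vlad[i] = (assigned - centers[i]).sum(axis=0)

    # L2 normalization (an assumption; the abstract does not specify one).
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def average_pool(feature_map):
    """Baseline for comparison: mean over all spatial locations."""
    return feature_map.mean(axis=(1, 2))


# Illustrative sizes only: a 512-channel, 14x14 map and 16 random centers.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((512, 14, 14))
codebook = rng.standard_normal((16, 512))
video_descriptor = vlad_encode(fmap, codebook)  # length 16 * 512
pooled_baseline = average_pool(fmap)            # length 512
```

Unlike average pooling, which collapses the map to a single C-dimensional mean, the VLAD vector keeps one residual sum per codebook center, so it retains more information about how the local descriptors are distributed.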
Document type :
Other publications
Contributor : THOTH Team
Submitted on : Thursday, December 10, 2015 - 4:43:43 PM
Last modification on : Thursday, January 20, 2022 - 5:28:04 PM
Long-term archiving on : Saturday, April 29, 2017 - 2:32:32 AM




  • HAL Id : hal-01236843, version 1



Xiaojiang Peng, Cordelia Schmid. Encoding Feature Maps of CNNs for Action Recognition. 2015. ⟨hal-01236843⟩


