Encoding Feature Maps of CNNs for Action Recognition

Xiaojiang Peng 1, Cordelia Schmid 1
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : We describe our approach for action classification in the THUMOS Challenge 2015. Our approach is based on two types of features: improved dense trajectories and CNN features. For trajectory features, we extract HOG, HOF, MBHx, and MBHy descriptors and apply Fisher vector encoding. For CNN features, we use a recent deep CNN model, VGG19, to capture appearance features, and we encode and pool its convolutional feature maps with VLAD, which performs better than both average pooling of the feature maps and fully connected activation features. After concatenating the two encoded representations, we train a linear SVM classifier for each class in a one-vs-all scheme.
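The VLAD step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `feature_map_to_descriptors` and `vlad_encode`, the hard nearest-center assignment, the plain L2 normalization, and the assumption that a codebook has already been learned (for example with k-means) are all illustrative choices. Each spatial location of a convolutional feature map is treated as one local descriptor whose dimension equals the number of channels.

```python
import math


def feature_map_to_descriptors(feature_map):
    """Turn feature_map[channel][row][col] into one descriptor per spatial location."""
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [feature_map[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def vlad_encode(descriptors, centers):
    """Sum residuals to the nearest center, concatenate, then L2-normalize.

    `centers` is a hypothetical, pre-learned codebook (a list of K vectors).
    """
    k = len(centers)
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest codebook center.
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
        )
        for i in range(dim):
            residual_sums[nearest][i] += x[i] - centers[nearest][i]
    # Concatenate the K residual sums into a single K * dim vector.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy example: 2 channels, a 1 x 2 spatial grid, and a 2-center codebook.
toy_map = [[[1.0, 4.0]], [[0.0, 0.0]]]
toy_centers = [[0.0, 0.0], [5.0, 0.0]]
toy_vlad = vlad_encode(feature_map_to_descriptors(toy_map), toy_centers)
```

The encoded vector has length K times the number of channels, regardless of the spatial size of the feature map, which is what allows frames or videos of different sizes to be fed to the same linear classifier.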
Document type : Other publications

https://hal.inria.fr/hal-01236843
Contributor : Thoth Team
Submitted on : Thursday, December 10, 2015 - 4:43:43 PM
Last modification on : Monday, December 17, 2018 - 11:22:02 AM
Long-term archiving on : Saturday, April 29, 2017 - 2:32:32 AM

File

thumos15_f2_xpeng.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01236843, version 1

Citation

Xiaojiang Peng, Cordelia Schmid. Encoding Feature Maps of CNNs for Action Recognition. 2015. ⟨hal-01236843⟩
