Image Classification with the Fisher Vector: Theory and Practice

Jorge Sanchez 1, Florent Perronnin 2, Thomas Mensink 3, Jakob Verbeek 4, *
* Corresponding author
4 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: A standard approach to describing an image for classification and retrieval purposes is to extract a set of local patch descriptors, encode them into a high-dimensional vector and pool them into an image-level signature. The most common patch encoding strategy consists of quantizing the local descriptors into a finite set of prototypical elements. This leads to the popular Bag-of-Visual words (BoV) representation. In this work, we propose to use the Fisher Kernel framework as an alternative patch encoding strategy: we describe patches by their deviation from a "universal" generative Gaussian mixture model. This representation, which we call the Fisher Vector (FV), has many advantages: it is efficient to compute, it leads to excellent results even with efficient linear classifiers, and it can be compressed with a minimal loss of accuracy using product quantization. We report experimental results on five standard datasets (PASCAL VOC 2007, Caltech 256, SUN 397, ILSVRC 2010 and ImageNet10K) with up to 9M images and 10K classes, showing that the FV framework is a state-of-the-art patch encoding technique.
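To make the encoding step in the abstract concrete, here is a minimal sketch of how a set of local descriptors could be turned into a Fisher Vector. It is not the authors' implementation. It assumes a diagonal-covariance Gaussian mixture model whose parameters are already known, keeps only the gradients with respect to the means and standard deviations, and applies power and L2 normalization, which are commonly used with Fisher Vectors. The function name `fisher_vector` and the toy parameters are illustrative assumptions; in practice the mixture would be fit on a large sample of descriptors.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Encode descriptors X (T, D) against a diagonal GMM.

    weights: (K,) mixture weights; means, sigmas: (K, D) means and
    standard deviations. Returns a vector of length 2 * K * D.
    """
    T, D = X.shape
    # Deviation of every descriptor from every Gaussian, in units of sigma: (T, K, D)
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log of weight times Gaussian density for each descriptor and component: (T, K)
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :]
             - 0.5 * D * np.log(2.0 * np.pi))
    # Soft assignments (posteriors), computed stably in the log domain
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Gradients w.r.t. means and standard deviations, averaged over the T descriptors
    g_mu = np.einsum('tk,tkd->kd', gamma, diff) / (T * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum('tk,tkd->kd', gamma, diff ** 2 - 1.0)
               / (T * np.sqrt(2.0 * weights)[:, None]))
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    # Assumed post-processing: signed square root (power normalization), then L2
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv

# Toy usage with made-up GMM parameters (K = 3 components, D = 4 dimensions)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(size=(3, 4))
sigmas = np.ones((3, 4))
fv = fisher_vector(X, weights, means, sigmas)
print(fv.shape)  # (24,) = 2 * K * D
```

The resulting vector has a fixed length regardless of how many descriptors the image contains, which is what allows it to be fed directly to a linear classifier.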
Document type:
Journal article
International Journal of Computer Vision, Springer Verlag, 2013, 105 (3), pp.222-245. 〈10.1007/s11263-013-0636-x〉

Cited literature [43 references]


https://hal.inria.fr/hal-00830491
Contributor: Jakob Verbeek
Submitted on: Wednesday, June 12, 2013 - 15:39:44
Last modified on: Wednesday, October 4, 2017 - 16:08:21
Document(s) archived on: Tuesday, April 4, 2017 - 20:53:41

Files

journal.pdf
Files produced by the author(s)

Citation

Jorge Sanchez, Florent Perronnin, Thomas Mensink, Jakob Verbeek. Image Classification with the Fisher Vector: Theory and Practice. International Journal of Computer Vision, Springer Verlag, 2013, 105 (3), pp.222-245. 〈10.1007/s11263-013-0636-x〉. 〈hal-00830491v2〉

Metrics

Record views: 1941

Document downloads: 12380