Improving the Fisher Kernel for Large-Scale Image Classification

Florent Perronnin; Jorge Sánchez; Thomas Mensink

doi:10.1007/978-3-642-15561-1_11

Conference paper, Year: 2010


Florent Perronnin

Role: Author

Xerox Research Centre Europe [Meylan]

Jorge Sánchez

Role: Author

Xerox Research Centre Europe [Meylan]

Thomas Mensink

Role: Author

Learning and recognition in vision

Xerox Research Centre Europe [Meylan]

Abstract

The Fisher kernel (FK) is a generic framework which combines the benefits of generative and discriminative approaches. In the context of image classification the FK was shown to extend the popular bag-of-visual-words (BOV) by going beyond count statistics. However, in practice, this enriched representation has not yet shown its superiority over the BOV. In the first part we show that with several well-motivated modifications over the original framework we can boost the accuracy of the FK. On PASCAL VOC 2007 we increase the Average Precision (AP) from 47.9% to 58.3%. Similarly, we demonstrate state-of-the-art accuracy on CalTech 256. A major advantage is that these results are obtained using only SIFT descriptors and costless linear classifiers. Equipped with this representation, we can now explore image classification on a larger scale. In the second part, as an application, we compare two abundant resources of labeled images to learn classifiers: ImageNet and Flickr groups. In an evaluation involving hundreds of thousands of training images we show that classifiers learned on Flickr groups perform surprisingly well (although they were not intended for this purpose) and that they can complement classifiers learned on more carefully annotated datasets.
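To illustrate what "going beyond count statistics" can mean, here is a minimal sketch, not the authors' implementation. It assumes a diagonal-covariance Gaussian mixture as the generative model and uses the commonly cited normalized gradient with respect to the Gaussian means. The function names, the random mixture parameters, and the toy dimensions (4 Gaussians, 8-dimensional descriptors) are all hypothetical choices made for the example. A soft BOV histogram keeps one number per Gaussian, while the Fisher-style vector keeps one number per Gaussian and per descriptor dimension.

```python
import numpy as np

def soft_assignments(X, weights, means, variances):
    """Posterior of each diagonal-covariance Gaussian for each descriptor, shape (T, K)."""
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances[None], axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1)[None])
    log_post = np.log(weights)[None] + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def bov_histogram(X, weights, means, variances):
    """Soft bag-of-visual-words: average occupancy per Gaussian (counts only)."""
    return soft_assignments(X, weights, means, variances).mean(axis=0)

def fisher_vector_means(X, weights, means, variances):
    """Gradient with respect to the means: first-order statistics per Gaussian."""
    gamma = soft_assignments(X, weights, means, variances)        # (T, K)
    T = X.shape[0]
    diff = (X[:, None, :] - means[None]) / np.sqrt(variances)[None]
    G = (gamma[:, :, None] * diff).sum(axis=0)                    # (K, D)
    G /= T * np.sqrt(weights)[:, None]
    return G.ravel()

# Toy data: random stand-ins for a trained mixture and local descriptors (assumption).
rng = np.random.default_rng(0)
K, D, T = 4, 8, 50
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, D))
variances = np.ones((K, D))
X = rng.normal(size=(T, D))

hist = bov_histogram(X, weights, means, variances)
fv = fisher_vector_means(X, weights, means, variances)
print(hist.shape, fv.shape)  # (4,) (32,)
```

With the same mixture, the histogram has K entries while the gradient vector has K times D, which is one way to see why a small vocabulary can still yield a rich representation suited to linear classifiers.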

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

PSM10_0766.pdf (167.49 KB)

poster.png (266.22 KB)

PSM10_ILSVRC.pdf (1.35 MB)

PSM10_poster.pdf (9.44 MB)

Origin: Files produced by the author(s)

Format: Figure, Image

Format: Other


https://inria.hal.science/inria-00548630

Submitted on: Monday, December 20, 2010, 10:22:06

Last modified on: Thursday, April 4, 2024, 21:21:26

Long-term archiving on: Monday, March 21, 2011, 03:24:55

Dates and versions

inria-00548630, version 1 (20-12-2010)

Identifiers

HAL Id: inria-00548630, version 1
DOI: 10.1007/978-3-642-15561-1_11

Cite

Florent Perronnin, Jorge Sánchez, Thomas Mensink. Improving the Fisher Kernel for Large-Scale Image Classification. ECCV 2010 - European Conference on Computer Vision, Sep 2010, Heraklion, Greece. pp.143-156, ⟨10.1007/978-3-642-15561-1_11⟩. ⟨inria-00548630⟩


Collections

UGA CNRS INRIA LJK LJK_GI LJK_GI_LEAR INRIA2

1848 views

5716 downloads
