Ask the locals: multi-way local pooling for image recognition

Y-Lan Boureau 1, 2 Nicolas Le Roux 2, 3 Francis Bach 2, 3 Jean Ponce 1, 2 Yann Lecun 4
1 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
3 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: Invariant representations in object recognition systems are generally obtained by pooling feature vectors over spatially local neighborhoods. But pooling is not local in the feature vector space, so that widely dissimilar features may be pooled together if they are in nearby locations. Recent approaches rely on sophisticated encoding methods and more specialized codebooks (or dictionaries), e.g., learned on subsets of descriptors which are close in feature space, to circumvent this problem. In this work, we argue that a common trait found in much recent work in image recognition or retrieval is that it leverages locality in feature space on top of purely spatial locality. We propose to apply this idea in its simplest form to an object recognition system based on the spatial pyramid framework, to increase the performance of small dictionaries with very little added engineering. State-of-the-art results on several object recognition benchmarks show the promise of this approach.
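The core idea of the abstract, pooling only over features that are close both in the image and in feature space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hard nearest-centroid assignment, the use of max pooling, and the toy data are all assumptions made for illustration, and the codes are assumed to be non-negative.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_centroid(x: Vector, centroids: Sequence[Vector]) -> int:
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    return dists.index(min(dists))


def multiway_local_max_pool(
    descriptors: Sequence[Vector],  # raw local descriptors, one per image patch
    codes: Sequence[Vector],        # non-negative encoding of each descriptor (assumed)
    cell_ids: Sequence[int],        # spatial cell index of each patch
    centroids: Sequence[Vector],    # feature-space bin centers (hypothetical, e.g. from k-means)
    n_cells: int,
) -> List[float]:
    """Max-pool codes separately in every (spatial cell, feature-space bin) pair."""
    n_bins = len(centroids)
    code_dim = len(codes[0])
    # One pooled vector per (cell, bin); empty bins stay at zero.
    pooled = [[[0.0] * code_dim for _ in range(n_bins)] for _ in range(n_cells)]
    for x, code, cell in zip(descriptors, codes, cell_ids):
        slot = pooled[cell][nearest_centroid(x, centroids)]
        for k, value in enumerate(code):
            slot[k] = max(slot[k], value)
    # Concatenate all pooled vectors into one image-level feature vector.
    return [v for cell in pooled for slot in cell for v in slot]


# Toy example: three patches in the same spatial cell, two feature-space bins.
descriptors = [[0.0, 0.1], [0.9, 1.0], [0.1, 0.0]]
codes = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
cell_ids = [0, 0, 0]
centroids = [[0.0, 0.0], [1.0, 1.0]]
features = multiway_local_max_pool(descriptors, codes, cell_ids, centroids, n_cells=1)
print(features)  # [1.0, 0.0, 0.0, 2.0]
```

In this toy case, purely spatial max pooling would merge all three patches into the single vector [1.0, 2.0]. Binning by feature-space proximity keeps the dissimilar second patch in its own slot, at the cost of a feature vector that grows with the number of bins.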
Document type:
Conference paper
ICCV'11 - The 13th International Conference on Computer Vision, Nov 2011, Barcelona, Spain. 2011

Cited literature: 43 references

https://hal.inria.fr/hal-00646816
Contributor: Nicolas Le Roux
Submitted on: Wednesday, 30 November 2011 - 17:47:35
Last modified on: Friday, 25 May 2018 - 12:02:06
Document(s) archived on: Thursday, 1 March 2012 - 02:32:53

File

boureau-iccv-11.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00646816, version 1

Citation

Y-Lan Boureau, Nicolas Le Roux, Francis Bach, Jean Ponce, Yann Lecun. Ask the locals: multi-way local pooling for image recognition. ICCV'11 - The 13th International Conference on Computer Vision, Nov 2011, Barcelona, Spain. 2011. 〈hal-00646816〉

Metrics

Record views: 3314
File downloads: 1279