Creating Efficient Codebooks for Visual Recognition

Frédéric Jurie 1 Bill Triggs 1
1 LEAR - Learning and recognition in vision
GRAVIR - IMAG - Graphisme, Vision et Robotique, Inria Grenoble - Rhône-Alpes, CNRS - Centre National de la Recherche Scientifique : FR71
Abstract: Visual codebook based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for texture analysis and scene classification. Codebooks are usually constructed by using a method such as k-means to cluster the descriptor vectors of patches sampled either densely ('textons') or sparsely ('bags of features' based on key-points or salience measures) from a set of training images. This works well for texture analysis in homogeneous images, but the images that arise in natural object recognition tasks have far less uniform statistics. We show that for dense sampling, k-means over-adapts to this, clustering centres almost exclusively around the densest few regions in descriptor space and thus failing to code other informative regions. This gives suboptimal codes that are no better than using randomly selected centres. We describe a scalable acceptance-radius based clusterer that generates better codebooks and study its performance on several image classification tasks. We also show that dense representations outperform equivalent keypoint based ones on these tasks and that SVM or mutual information based feature selection starting from a dense codebook further improves the performance.
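The abstract does not spell out how the acceptance-radius clusterer works, so the following is only a minimal sketch of one greedy fixed-radius scheme, not the authors' algorithm. It illustrates the general idea of letting each centre claim every descriptor within a fixed radius, so that a few very dense regions cannot absorb all the centres the way they can under k-means. The function names `radius_codebook` and `quantize`, the Euclidean metric, and the neighbour-count density estimate are all assumptions made for illustration.

```python
import math
import random


def radius_codebook(points, radius, max_centres):
    """Greedy fixed-radius clustering (illustrative assumption, not the paper's method).

    Repeatedly picks the remaining point with the most neighbours within
    `radius`, makes it a centre, and discards every point that centre covers.
    """
    remaining = list(points)
    centres = []
    while remaining and len(centres) < max_centres:
        # Crude density estimate: neighbours within the radius (O(n^2) per pass).
        counts = [
            sum(1 for q in remaining if math.dist(p, q) <= radius)
            for p in remaining
        ]
        best = remaining[counts.index(max(counts))]
        centres.append(best)
        # Drop everything inside the acceptance radius of the new centre.
        remaining = [q for q in remaining if math.dist(best, q) > radius]
    return centres


def quantize(descriptor, centres):
    """Return the index of the centre nearest to `descriptor`."""
    return min(range(len(centres)), key=lambda i: math.dist(descriptor, centres[i]))


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical 2-D "descriptors": one dense blob and one sparse blob.
    dense = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(200)]
    sparse = [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(20)]
    codebook = radius_codebook(dense + sparse, radius=1.0, max_centres=10)
    print(len(codebook), "centres")
    print("code for (4.8, 5.2):", quantize((4.8, 5.2), codebook))
```

Because every point within the radius of a chosen centre is removed, any two centres in this sketch are more than `radius` apart, which spreads the codebook over sparse regions as well as dense ones. The quadratic neighbour count is only practical for small samples; a scalable version would need subsampling or a spatial index.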
Document type:
Conference paper
10th International Conference on Computer Vision (ICCV '05), Oct 2005, Beijing, China. IEEE Computer Society, 1, pp.604 -- 610, 2005, 〈http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1541309〉. 〈10.1109/ICCV.2005.66〉

https://hal.inria.fr/inria-00548511
Contributor: Thoth Team
Submitted on: Monday, 20 December 2010 - 09:08:15
Last modified on: Tuesday, 5 June 2018 - 18:00:02
Document(s) archived on: Monday, 21 March 2011 - 03:06:24

File

05-jurie-triggs-iccv.pdf
Files produced by the author(s)

Collections

IMAG | INRIA | UGA

Citation

Frédéric Jurie, Bill Triggs. Creating Efficient Codebooks for Visual Recognition. 10th International Conference on Computer Vision (ICCV '05), Oct 2005, Beijing, China. IEEE Computer Society, 1, pp.604 -- 610, 2005, 〈http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1541309〉. 〈10.1109/ICCV.2005.66〉. 〈inria-00548511〉
