Hyperfeatures - Multilevel Local Coding for Visual Recognition

Ankur Agarwal; Bill Triggs

Report (Research Report) Year: 2005


Ankur Agarwal

Role: Author

Learning and recognition in vision

Bill Triggs

Role: Author
PersonId : 741773
IdHAL : bill-triggs
ORCID : 0000-0003-4116-6296
IdRef : 068974116

Learning and recognition in vision

Abstract

Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and resist local occlusions and geometric and photometric variations well, but they cannot exploit spatial co-occurrence statistics of features at scales larger than their local input patches. We present a new multilevel visual representation, 'hyperfeatures', that is designed to remedy this. The work builds on the familiar notion that, to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments. This process can be formalized as comparison (vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. It converts collections of local image descriptor vectors into slightly less local histogram vectors, that is, higher-level but spatially coarser descriptors. Our central observation is that it can therefore be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or 'semantic' image properties. This repeated nonlinear 'folding' is essentially different from that of hierarchical models such as Convolutional Neural Networks and HMAX, being based on repeated comparison to local prototypes and accumulation of co-occurrence statistics rather than on repeated convolution and rectification. We formulate the hyperfeatures model and study its performance under several different image coding methods, including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
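The quantize-then-aggregate loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hard nearest-centre vector quantization, descriptors laid out on a regular grid, non-overlapping square neighbourhoods, and codebooks that are already given (the report also studies Gaussian Mixture and other codings, which are not shown here). The function names `quantize`, `local_histograms` and `hyperfeature_levels`, and the `window` parameter, are hypothetical.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Hard vector quantization: index of the nearest codebook centre.

    descriptors: (H, W, D) grid of local descriptor vectors.
    codebook:    (K, D) array of centres (assumed to be learned elsewhere).
    """
    diffs = descriptors[:, :, None, :] - codebook[None, None, :, :]
    return np.argmin((diffs ** 2).sum(axis=-1), axis=-1)  # (H, W) code indices

def local_histograms(codes, num_codes, window):
    """Count code memberships over non-overlapping window x window blocks."""
    H, W = codes.shape
    h, w = H // window, W // window
    # One-hot membership vectors, cropped so the grid divides evenly.
    one_hot = np.eye(num_codes)[codes[:h * window, :w * window]]
    blocks = one_hot.reshape(h, window, w, window, num_codes)
    return blocks.sum(axis=(1, 3))  # (h, w, K): coarser grid of histograms

def hyperfeature_levels(descriptors, codebooks, window=2):
    """Iterate coding and local aggregation; return one global histogram per level."""
    global_hists = []
    level_input = descriptors
    for codebook in codebooks:
        codes = quantize(level_input, codebook)
        num_codes = codebook.shape[0]
        global_hists.append(np.bincount(codes.ravel(), minlength=num_codes))
        # The local histograms become the descriptor vectors of the next level.
        level_input = local_histograms(codes, num_codes, window)
    return global_hists
```

In this sketch the codebook at each level above the first must have as many columns as the previous level has centres, because the next level's descriptors are the previous level's local histograms. Each pass halves the grid resolution (for `window=2`) while the vectors describe co-occurrences over a larger image region.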

Keywords

COMPUTER VISION; VISUAL RECOGNITION; IMAGE CODING; IMAGE CLASSIFICATION

Domains

Other [cs.OH]

Main file

RR-5655.pdf (1.35 MB)

test_level3_0095.png (145.02 KB)

Format: Figure, Image

https://inria.hal.science/inria-00070355

Submitted on: Friday, May 19, 2006, 20:14:31

Last modified on: Thursday, April 4, 2024, 21:29:17

Long-term archiving on: Sunday, April 4, 2010, 21:01:15

Dates and versions

inria-00070355 , version 1 (19-05-2006)

Identifiers

HAL Id: inria-00070355, version 1

Cite

Ankur Agarwal, Bill Triggs. Hyperfeatures - Multilevel Local Coding for Visual Recognition. [Research Report] RR-5655, INRIA. 2005, pp.19. ⟨inria-00070355⟩

Collections

UGA IMAG CNRS INRIA INRIA-RRRT INRIA2 LARA

397 views

923 downloads
