Aggregating local image descriptors into compact codes

Hervé Jégou (1), Florent Perronnin (2), Matthijs Douze (3, 4), Jorge Sánchez (2), Patrick Pérez (5), Cordelia Schmid (3)
(1) TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
(3) LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : This paper addresses the problem of large-scale image search. Three constraints have to be taken into account: search accuracy, efficiency, and memory usage. We first present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual-words approach for any given vector dimension. We then jointly optimize dimensionality reduction and indexing in order to obtain a precise vector comparison as well as a compact representation. The evaluation shows that the image representation can be reduced to a few dozen bytes while preserving high accuracy. Searching a dataset of 100 million images takes about 250 ms on one processor core.
Document type :
Journal articles
IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2012, 34 (9), pp.1704-1716. <10.1109/TPAMI.2011.235>

https://hal.inria.fr/inria-00633013
Contributor : Hervé Jégou
Submitted on : Monday, October 17, 2011 - 2:18:08 PM
Last modification on : Friday, January 13, 2017 - 2:17:51 PM
Document(s) archived on : Thursday, November 15, 2012 - 9:50:17 AM

Files

jegou_aggregate.pdf
Files produced by the author(s)

Citation

Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, et al. Aggregating local image descriptors into compact codes. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2012, 34 (9), pp.1704-1716. <10.1109/TPAMI.2011.235>. <inria-00633013>

Metrics

Record views : 4144
Document downloads : 11347