Searching in one billion vectors: re-rank with source coding

Hervé Jégou 1 Romain Tavenard 1 Matthijs Douze 2, 3 Laurent Amsaleg 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: Recent indexing techniques inspired by source coding have been shown to successfully index billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculations on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, which avoids reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full vector representation.
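The re-ranking idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithmic details, so everything below is an assumption made for illustration: each vector is stored as two short codes (a coarse quantization index plus an index that quantizes the residual), the short-list is built from the coarse code alone, and the short-list is then re-ranked with the refined reconstruction, without ever reading the original vectors. The function names (`encode`, `search`) and the tiny hand-picked codebooks are hypothetical and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codebook entry closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def encode(db, coarse_cb, refine_cb):
    """Replace each vector by two short codes: (coarse index, residual index)."""
    codes = []
    for x in db:
        c = nearest(coarse_cb, x)
        residual = [xi - ci for xi, ci in zip(x, coarse_cb[c])]
        codes.append((c, nearest(refine_cb, residual)))
    return codes


def search(query, codes, coarse_cb, refine_cb, shortlist_size, k):
    """Two-stage search that only touches the codes, never the full vectors."""
    # Stage 1: estimate distances from the coarse code alone and keep a short-list.
    coarse_d = [sq_dist(query, c) for c in coarse_cb]
    shortlist = sorted(range(len(codes)),
                       key=lambda i: coarse_d[codes[i][0]])[:shortlist_size]

    # Stage 2: refine each estimate by adding the quantized residual, then re-rank.
    def refined(i):
        c, r = codes[i]
        recon = [a + b for a, b in zip(coarse_cb[c], refine_cb[r])]
        return sq_dist(query, recon)

    return sorted(shortlist, key=refined)[:k]


# Toy data (assumption: hand-picked codebooks instead of learned ones).
coarse_cb = [[0.0, 0.0], [10.0, 10.0]]
refine_cb = [[-1.0, 0.0], [1.0, 0.0]]
db = [[-1.0, 0.0], [1.0, 0.0], [9.0, 10.0], [11.0, 10.0]]
codes = encode(db, coarse_cb, refine_cb)

# Items 0 and 1 share the same coarse code, so stage 1 cannot separate them;
# the refinement code lets stage 2 put item 1 first.
print(search([0.9, 0.0], codes, coarse_cb, refine_cb, shortlist_size=2, k=1))
```

In this toy setting the memory cost per vector is two small integers rather than the full coordinates, which is the trade-off the abstract points to; a realistic system would learn both codebooks from data and use much larger ones.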
Document type: Conference paper
ICASSP 2011 - International Conference on Acoustics, Speech and Signal Processing, May 2011, Prague, Czech Republic. IEEE, pp.861-864, 2011, 〈10.1109/ICASSP.2011.5946540〉

Cited literature: 15 references


https://hal.inria.fr/inria-00566883
Contributor: Hervé Jégou
Submitted on: Thursday, 17 February 2011 - 18:39:31
Last modified on: Monday, 2 October 2017 - 10:42:01
Document(s) archived on: Thursday, 30 June 2011 - 13:51:24

Files

paper.pdf
Files produced by the author(s)

Citation

Hervé Jégou, Romain Tavenard, Matthijs Douze, Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. ICASSP 2011 - International Conference on Acoustics, Speech and Signal Processing, May 2011, Prague, Czech Republic. IEEE, pp.861-864, 2011, 〈10.1109/ICASSP.2011.5946540〉. 〈inria-00566883〉
