Conference paper, Year: 2011

Searching in one billion vectors: re-rank with source coding

Abstract

Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculations on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, which avoids reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and propose an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full vector representation.
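The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general two-stage idea it describes: a first short code per vector yields a short-list of hypotheses, and a second short code, here assumed to quantize the residual left by the first, refines the estimated distances without touching the original vectors. The function names (`encode`, `rerank_search`), the plain codebooks, and the toy data are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codebook entry closest to v."""
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i], v))


def encode(v, coarse, refine):
    """Return two short codes: one for v, one for its quantization residual."""
    c = nearest(coarse, v)
    residual = tuple(x - y for x, y in zip(v, coarse[c]))
    return c, nearest(refine, residual)


def rerank_search(query, codes, coarse, refine, shortlist, k):
    """Build a short-list from first-stage codes, then re-rank it."""
    # Stage 1: estimate distances from the first-stage reconstruction only.
    first = sorted(range(len(codes)),
                   key=lambda i: sqdist(query, coarse[codes[i][0]]))[:shortlist]

    # Stage 2: refine each estimate with the residual code; the original
    # vectors are never read (assumption: they would live on disk).
    def refined(i):
        c, r = codes[i]
        recon = tuple(x + y for x, y in zip(coarse[c], refine[r]))
        return sqdist(query, recon)

    return sorted(first, key=refined)[:k]


# Toy example with hand-picked (hypothetical) codebooks.
coarse = [(0.0, 0.0), (10.0, 10.0)]
refine = [(-1.0, 0.0), (1.0, 0.0)]
database = [(-1.0, 0.0), (1.0, 0.0), (11.0, 10.0)]
codes = [encode(v, coarse, refine) for v in database]

# Vectors 0 and 1 tie after stage 1; the refinement code separates them.
print(rerank_search((0.9, 0.0), codes, coarse, refine, shortlist=2, k=1))  # [1]
```

In this toy case the first-stage codes cannot distinguish the two vectors near the origin, and only the refinement code recovers the true nearest neighbor. A realistic setup would learn both codebooks from training data and use far larger codebooks than shown here.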
Main file
paper.pdf (97.47 KB)
Screen_shot_2011-02-17_at_12.15.11.png (42.25 KB)
Origin: Files produced by the author(s)
Format: Figure, Image

Dates and versions

inria-00566883 , version 1 (17-02-2011)

Identifiers

Cite

Hervé Jégou, Romain Tavenard, Matthijs Douze, Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. ICASSP 2011 - International Conference on Acoustics, Speech and Signal Processing, May 2011, Prague, Czech Republic. pp.861-864, ⟨10.1109/ICASSP.2011.5946540⟩. ⟨inria-00566883⟩
