Searching in one billion vectors: re-rank with source coding

Hervé Jégou; Romain Tavenard; Matthijs Douze; Laurent Amsaleg

doi:10.1109/ICASSP.2011.5946540

Conference paper, Year: 2011

Hervé Jégou

Role: Author
PersonId : 833473

Multimedia content-based indexing

Romain Tavenard

Role: Author
PersonId : 5645
IdHAL : rtavenar
ORCID : 0000-0002-1439-8465
IdRef : 154729507

Multimedia content-based indexing

Matthijs Douze

Role: Author
PersonId : 843109

Learning and recognition in vision

Service Expérimentation et Développement

Laurent Amsaleg

Role: Author
PersonId : 15318
IdHAL : laurent-amsaleg
ORCID : 0000-0003-0204-0930
IdRef : 154720879

Multimedia content-based indexing

Abstract

Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculation on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, which avoids reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses while using little memory compared to the full vector representation.
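To make the re-ranking idea in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes plain k-means quantizers in place of the compressed-domain index used in the paper, and the names `kmeans`, `assign`, `build_index`, and `search` are hypothetical. Each database vector is stored as a coarse code plus a short refinement code for its quantization residual; a query first builds a short-list from the coarse estimates, then re-ranks it using the refined reconstructions, so the original vectors are never read.

```python
import numpy as np

def assign(data, centroids):
    """Return the index of the nearest centroid for each row of data."""
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative stand-in); returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, centroids)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def build_index(base, k_coarse=16, k_refine=64):
    """Encode each vector as a coarse code plus a refinement code."""
    coarse_cb = kmeans(base, k_coarse, seed=0)
    coarse_codes = assign(base, coarse_cb)
    # The refinement quantizer is trained on what the coarse code missed.
    residuals = base - coarse_cb[coarse_codes]
    refine_cb = kmeans(residuals, k_refine, seed=1)
    refine_codes = assign(residuals, refine_cb)
    return coarse_cb, coarse_codes, refine_cb, refine_codes

def search(query, index, shortlist=50, topk=5):
    """Short-list with coarse estimates, then re-rank with refined ones."""
    coarse_cb, coarse_codes, refine_cb, refine_codes = index
    # Stage 1: distance from the query to each vector's coarse reconstruction.
    d_coarse = ((query - coarse_cb) ** 2).sum(axis=1)[coarse_codes]
    cand = np.argsort(d_coarse)[:shortlist]
    # Stage 2: refine the reconstruction of short-listed vectors only.
    recon = coarse_cb[coarse_codes[cand]] + refine_cb[refine_codes[cand]]
    d_refined = ((query - recon) ** 2).sum(axis=1)
    return cand[np.argsort(d_refined)[:topk]]
```

With `index = build_index(base)` for a float array `base` of shape `(n, d)`, calling `search(q, index)` returns the ids of the re-ranked candidates. The short-list size trades accuracy against the number of refinement codes that must be decoded per query.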

Keywords

approximate nearest neighbors; large scale indexing; source coding; similarity search

Domains

Information Retrieval [cs.IR]; Computer Vision and Pattern Recognition [cs.CV]

Main file

paper.pdf (97.47 KB)

Screen_shot_2011-02-17_at_12.15.11.png (42.25 KB)

Origin: Files produced by the author(s)

Format: Figure, Image

Contributor: Hervé Jégou

https://inria.hal.science/inria-00566883

Submitted on: Thursday, February 17, 2011, 18:39:31

Last modified on: Thursday, April 4, 2024, 18:25:52

Long-term archiving on: Thursday, June 30, 2011, 13:51:24

Dates and versions

inria-00566883 , version 1 (17-02-2011)

Identifiers

HAL Id : inria-00566883 , version 1
ARXIV : 1102.3828
DOI : 10.1109/ICASSP.2011.5946540

Cite

Hervé Jégou, Romain Tavenard, Matthijs Douze, Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. ICASSP 2011 - International Conference on Acoustics, Speech and Signal Processing, May 2011, Prague, Czech Republic. pp.861-864, ⟨10.1109/ICASSP.2011.5946540⟩. ⟨inria-00566883⟩

Collections

EC-PARIS UNIV-RENNES1 UGA CNRS INRIA INSA-RENNES IRISA LJK LJK_GI LJK_GI_LEAR IRISA-D6 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

844 views

821 downloads
