Memory vectors for similarity search in high-dimensional spaces

Abstract: We study an indexing architecture to store and search in a database of high-dimensional vectors from the perspective of statistical signal processing and decision theory. This architecture is composed of several memory units, each of which summarizes a fraction of the database by a single representative vector. The potential similarity of the query to one of the vectors stored in the memory unit is gauged by a simple correlation with the memory unit's representative vector. This representative optimizes the test of the following hypothesis: the query is independent of any vector in the memory unit vs. the query is a simple perturbation of one of the stored vectors. Compared to exhaustive search, our approach finds the most similar database vectors significantly faster without a noticeable reduction in search quality. Interestingly, the reduction of complexity is provably better in high-dimensional spaces. We empirically demonstrate its practical interest in a large-scale image search scenario with off-the-shelf state-of-the-art descriptors.
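The two-stage search described in the abstract can be illustrated with a minimal sketch. It assumes that each representative vector is simply the sum of the vectors in its memory unit and that units are formed from consecutive database entries; neither choice is stated in the abstract, and the optimized representative studied in the paper may differ. The function names (`build_memory_units`, `search`), the parameter `n_probe`, and the synthetic data are hypothetical and serve only as illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product (correlation, for unit-norm vectors)."""
    return sum(x * y for x, y in zip(a, b))


def build_memory_units(database, unit_size):
    """Split the database into units of consecutive vectors. Each unit keeps
    its member ids and one representative vector (here the plain sum of its
    members, which is an assumption of this sketch)."""
    dim = len(database[0])
    units = []
    for start in range(0, len(database), unit_size):
        ids = list(range(start, min(start + unit_size, len(database))))
        rep = [sum(database[i][k] for i in ids) for k in range(dim)]
        units.append((ids, rep))
    return units


def search(query, database, units, n_probe):
    """Score each unit with a single correlation, then compare the query
    exhaustively only against the members of the n_probe best units."""
    ranked = sorted(units, key=lambda u: dot(query, u[1]), reverse=True)
    candidates = [i for ids, _ in ranked[:n_probe] for i in ids]
    return max(candidates, key=lambda i: dot(query, database[i]))


# Synthetic data: random unit-norm vectors (illustrative only).
random.seed(0)
dim, n, unit_size = 256, 200, 5
database = [normalize([random.gauss(0, 1) for _ in range(dim)]) for _ in range(n)]
units = build_memory_units(database, unit_size)

# The query is a small perturbation of one stored vector.
target = 57
query = normalize([x + 0.01 * random.gauss(0, 1) for x in database[target]])
best = search(query, database, units, n_probe=3)
print(best == target)
```

In this sketch a query costs one correlation per memory unit plus one per vector in the probed units (40 + 15 here) instead of one per database vector (200), which is the kind of saving the abstract refers to.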
Document type:
Journal article
IEEE Transactions on Big Data, 2017, 〈10.1109/TBDATA.2017.2677964〉

Cited literature [33 references]

https://hal.inria.fr/hal-01481220
Contributor: Ahmet Iscen
Submitted on: Thursday, March 2, 2017 - 12:40:32
Last modified on: Thursday, November 23, 2017 - 14:03:25
Document(s) archived on: Wednesday, May 31, 2017 - 13:53:37

File

iscen_tbd.pdf
Files produced by the author(s)

Citation

Ahmet Iscen, Teddy Furon, Vincent Gripon, Michael Rabbat, Hervé Jégou. Memory vectors for similarity search in high-dimensional spaces. IEEE Transactions on Big Data, 2017, 〈10.1109/TBDATA.2017.2677964〉. 〈hal-01481220〉

Metrics

Record views: 385

File downloads: 77