Memory vectors for similarity search in high-dimensional spaces

Ahmet Iscen (1, 2), Teddy Furon (1), Vincent Gripon (2, 3), Michael Rabbat (4), Hervé Jégou (5)
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA-D6 - MEDIA ET INTERACTIONS
2 Lab-STICC_TB_CACS_IAS
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract: We study an indexing architecture to store and search in a database of high-dimensional vectors from the perspective of statistical signal processing and decision theory. This architecture is composed of several memory units, each of which summarizes a fraction of the database by a single representative vector. The potential similarity of the query to one of the vectors stored in the memory unit is gauged by a simple correlation with the memory unit's representative vector. This representative optimizes the test of the following hypothesis: the query is independent from any vector in the memory unit vs. the query is a simple perturbation of one of the stored vectors. Compared to exhaustive search, our approach finds the most similar database vectors significantly faster without a noticeable reduction in search quality. Interestingly, the reduction of complexity is provably better in high-dimensional spaces. We empirically demonstrate its practical interest in a large-scale image search scenario with off-the-shelf state-of-the-art descriptors.
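The two-stage search described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from this record: each memory unit's representative is simply the sum of its stored vectors (a simple stand-in, since the abstract does not say how the optimized representative is constructed), the data are synthetic random unit vectors, and every name (build_memory_units, search, n_probe) is hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product, used here as the correlation score."""
    return sum(x * y for x, y in zip(a, b))


def random_unit_vector(dim, rng):
    return normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])


def build_memory_units(database, unit_size):
    """Split the database into units of unit_size vectors.

    ASSUMPTION: the representative is the plain sum of the unit's vectors;
    this record does not specify the construction.
    """
    units = []
    for start in range(0, len(database), unit_size):
        ids = list(range(start, min(start + unit_size, len(database))))
        representative = [sum(col) for col in zip(*(database[i] for i in ids))]
        units.append((representative, ids))
    return units


def search(query, database, units, n_probe):
    """Rank units by correlation with their representative, then compare the
    query exhaustively only against vectors in the n_probe best units."""
    ranked = sorted(units, key=lambda u: dot(query, u[0]), reverse=True)
    candidates = [i for _, ids in ranked[:n_probe] for i in ids]
    best = max(candidates, key=lambda i: dot(query, database[i]))
    return best, len(candidates)


rng = random.Random(0)
dim, n_vectors, unit_size = 256, 200, 10
database = [random_unit_vector(dim, rng) for _ in range(n_vectors)]
units = build_memory_units(database, unit_size)

# The query is a small perturbation of one stored vector (synthetic setup).
target = 42
noise = random_unit_vector(dim, rng)
query = normalize([x + 0.2 * e for x, e in zip(database[target], noise)])

best, n_compared = search(query, database, units, n_probe=3)
print("best match:", best, "| vectors compared:", n_compared, "of", n_vectors)
```

In this toy setup the query is scored against 20 representatives and then against 30 stored vectors, instead of all 200. How much is saved in practice, and at what cost in search quality, depends on the unit size, the dimension, and how the representative is built.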
Document type: Journal article

https://hal.inria.fr/hal-01481220
Contributor: Ahmet Iscen
Submitted on: Thursday, March 2, 2017 - 12:40:32 PM
Last modification on: Friday, September 13, 2019 - 9:48:07 AM
Long-term archiving on: Wednesday, May 31, 2017 - 1:53:37 PM

File

iscen_tbd.pdf
Files produced by the author(s)

Citation

Ahmet Iscen, Teddy Furon, Vincent Gripon, Michael Rabbat, Hervé Jégou. Memory vectors for similarity search in high-dimensional spaces. IEEE Transactions on Big Data, IEEE, 2017, 4 (1), pp. 65-77. ⟨10.1109/TBDATA.2017.2677964⟩. ⟨hal-01481220⟩
