Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan

Abstract : Nearest Neighbor (NN) search in high dimension is an important feature in many applications (e.g., image retrieval, multimedia databases). Product Quantization (PQ) is a widely used solution which offers high performance, i.e., low response time while preserving a high accuracy. PQ represents high-dimensional vectors (e.g., image descriptors) by compact codes. Hence, very large databases can be stored in memory, allowing NN queries without resorting to slow I/O operations. PQ computes distances to neighbors using cache-resident lookup tables, thus its performance remains limited by (i) the many cache accesses that the algorithm requires, and (ii) its inability to leverage SIMD instructions available on modern CPUs. In this paper, we advocate that cache locality is not sufficient for efficiency. To address these limitations, we design a novel algorithm, PQ Fast Scan, that transforms the cache-resident lookup tables into small tables, sized to fit SIMD registers. This transformation allows (i) in-register lookups in place of cache accesses and (ii) an efficient SIMD implementation. PQ Fast Scan has the exact same accuracy as PQ, while having 4 to 6 times lower response time (e.g., for 25 million vectors, scan time is reduced from 74ms to 13ms).
Type de document :
Communication dans un congrès
42nd International Conference on Very Large Data Bases, Sep 2016, New Delhi, India. Proceedings of the VLDB Endowment, 9 (4), pp.12, 2015, 〈http://www.vldb.org/pvldb/〉
Liste complète des métadonnées

Littérature citée [23 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/hal-01239055
Contributeur : Fabien André <>
Soumis le : lundi 7 décembre 2015 - 14:11:26
Dernière modification le : vendredi 16 novembre 2018 - 01:38:34
Document(s) archivé(s) le : mardi 8 mars 2016 - 12:51:13

Fichier

p288-andre.pdf
Fichiers éditeurs autorisés sur une archive ouverte

Licence


Distributed under a Creative Commons Paternité - Pas d'utilisation commerciale - Pas de modification 4.0 International License

Identifiants

  • HAL Id : hal-01239055, version 1

Citation

Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec. Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan. 42nd International Conference on Very Large Data Bases, Sep 2016, New Delhi, India. Proceedings of the VLDB Endowment, 9 (4), pp.12, 2015, 〈http://www.vldb.org/pvldb/〉. 〈hal-01239055〉

Partager

Métriques

Consultations de la notice

1506

Téléchargements de fichiers

404