Approximate k-Nearest-Neighbor Searches: A New Algorithm with Probabilistic Control of the Precision

Sid-Ahmed Berrani 1 Laurent Amsaleg 2 Patrick Gros 2
2 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes a new approach for performing efficient approximate k-nearest-neighbor searches in high-dimensional databases. This approach allows a fine and intuitive control over the precision of the search by setting at run time the maximum probability for a vector that would be in the exact answer set to be missing in the approximate set of answers. One of the contribution of this paper is an off-line process computing controlled approximations shrinking each cluster within which feature vectors are enclosed. Those approximations are values for (approximate) radiuses of clusters, and they are computed for all the levels of precision defined beforehand. Therefore, to answer a query, the search process simply considers the appropriate approximations corresponding to the desired level of precision. This may cause the actual nearest-neighbors of the query point to be ignored. Our method, however, probabilistically bounds the chances for this to happen. This paper also presents a performance study of the implementa- tion. It shows, for example, that our method is 6.72 times faster than the sequential scan when it handles more then 5 10^6 24-dimensions vectors, even when the probability of missing one of the true nearest-neighbors is below 0.01.
Type de document :
Rapport
[Research Report] RR-4675, INRIA. 2002
Liste complète des métadonnées

https://hal.inria.fr/inria-00071910
Contributeur : Rapport de Recherche Inria <>
Soumis le : mardi 23 mai 2006 - 19:13:45
Dernière modification le : vendredi 16 novembre 2018 - 01:25:10
Document(s) archivé(s) le : dimanche 4 avril 2010 - 22:44:07

Fichiers

Identifiants

  • HAL Id : inria-00071910, version 1

Citation

Sid-Ahmed Berrani, Laurent Amsaleg, Patrick Gros. Approximate k-Nearest-Neighbor Searches: A New Algorithm with Probabilistic Control of the Precision. [Research Report] RR-4675, INRIA. 2002. 〈inria-00071910〉

Partager

Métriques

Consultations de la notice

334

Téléchargements de fichiers

242