Approximate k-Nearest-Neighbor Searches: A New Algorithm with Probabilistic Control of the Precision

Sid-Ahmed Berrani 1 Laurent Amsaleg 2 Patrick Gros 2
2 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes a new approach for performing efficient approximate k-nearest-neighbor searches in high-dimensional databases. This approach allows a fine and intuitive control over the precision of the search by setting at run time the maximum probability for a vector that would be in the exact answer set to be missing in the approximate set of answers. One of the contribution of this paper is an off-line process computing controlled approximations shrinking each cluster within which feature vectors are enclosed. Those approximations are values for (approximate) radiuses of clusters, and they are computed for all the levels of precision defined beforehand. Therefore, to answer a query, the search process simply considers the appropriate approximations corresponding to the desired level of precision. This may cause the actual nearest-neighbors of the query point to be ignored. Our method, however, probabilistically bounds the chances for this to happen. This paper also presents a performance study of the implementa- tion. It shows, for example, that our method is 6.72 times faster than the sequential scan when it handles more then 5 10^6 24-dimensions vectors, even when the probability of missing one of the true nearest-neighbors is below 0.01.
Document type :
Liste complète des métadonnées
Contributor : Rapport de Recherche Inria <>
Submitted on : Tuesday, May 23, 2006 - 7:13:45 PM
Last modification on : Friday, November 16, 2018 - 1:25:10 AM
Document(s) archivé(s) le : Sunday, April 4, 2010 - 10:44:07 PM


  • HAL Id : inria-00071910, version 1


Sid-Ahmed Berrani, Laurent Amsaleg, Patrick Gros. Approximate k-Nearest-Neighbor Searches: A New Algorithm with Probabilistic Control of the Precision. [Research Report] RR-4675, INRIA. 2002. ⟨inria-00071910⟩



Record views


Files downloads