Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

High-dimensional approximate nearest neighbor: k -d Generalized Randomized Forests

Abstract : We propose a new data-structure, the generalized randomized k-d forest, or k-d GeRaF, for approximate nearest neighbor searching in high dimensions. In particular, we introduce new ran-domization techniques to specify a set of independently constructed trees where search is performed simultaneously, hence increasing accuracy. We omit backtracking, and we optimize distance computations , thus accelerating queries. We release public domain software GeRaF and we compare it to existing implementations of state-of-the-art methods including BBD-trees, Locality Sensitive Hashing, randomized k-d forests, and product quantization. Experimental results indicate that our method would be the method of choice in dimensions around 1,000, and probably up to 10,000, and pointsets of cardinality up to a few hundred thousands or even one million; this range of inputs is encountered in many critical applications today. For instance, we handle a real dataset of 10 6 images represented in 960 dimensions with a query time of less than 1sec on average and 90% responses being true nearest neighbors.
Document type :
Preprints, Working Papers, ...
Complete list of metadatas

Cited literature [17 references]  Display  Hide  Download
Contributor : Yannis Avrithis <>
Submitted on : Tuesday, November 19, 2019 - 1:40:37 PM
Last modification on : Thursday, November 26, 2020 - 4:00:02 PM
Long-term archiving on: : Thursday, February 20, 2020 - 4:47:41 PM


Files produced by the author(s)


  • HAL Id : hal-02370318, version 1
  • ARXIV : 1603.09596


Yannis Avrithis, Ioannis Z. Emiris, Giorgos Samaras. High-dimensional approximate nearest neighbor: k -d Generalized Randomized Forests. 2016. ⟨hal-02370318⟩



Record views


Files downloads