Skip to Main content Skip to Navigation
Reports

Finding Friends and Followers in Sub-linear Time

David Arthur 1 Steve Oudot 2, * Anneesh Sharma 3
* Corresponding author
2 GEOMETRICA - Geometric computing
CRISAM - Inria Sophia Antipolis - Méditerranée , Inria Saclay - Ile de France
Abstract : The approximate Nearest Neighbor (NN) search problem asks to pre-process a given set of points $P$ in such a way that, given any query point $q$, one can retrieve a point in $P$ that is approximately closest to $q$. Of particular interest is the case of points lying in high dimensions, which has seen rapid developments since the introduction of the Locality-Sensitive Hashing (LSH) data structure by Indyk and Motwani. Combined with a space decomposition by Har-Peled, the LSH data structure can answer approximate NN queries in sub-linear time using polynomial (in both $d$ and $n$) space. Unfortunately, it is not known whether Har-Peled's space decomposition can be maintained efficiently under point insertions and deletions, so the above solution only works in a static setting. In this paper we present a variant of Har-Peled's decomposition, based on random semi-regular grids, which can achieve the same query time with the added advantage that it can be maintained efficiently even under adversarial point insertions and deletions. The outcome is a new data structure to answer approximate NN queries efficiently in dynamic settings. Another related problem known as Reverse Nearest Neighbor (RNN) search is to find the influence set of a given query point $q$, i.e. the subset of points of $P$ that have $q$ as their nearest neighbor. Although this problem finds many practical applications, very little is known about its complexity. In particular, no algorithm is known to solve it in high dimensions in sub-linear time using sub-exponential space. In this paper we show how to pre-process the data points, so that Har-Peled's space decomposition combined with modified LSH data structures can solve an approximate variant of the RNN problem efficiently, using polynomial space. The query time of our approach is bounded by two terms: the first one is sub-linear in the size of $P$ and corresponds roughly to the incompressible time needed to locate the query point in the data structure; the second one is proportional to the size of the output, which is a set of points as opposed to a single point for (approximate) NN queries. An interesting feature of our RNN solution is that it is flexible enough to be applied indifferently in monochromatic or bichromatic settings.
Document type :
Reports
Complete list of metadata

https://hal.inria.fr/inria-00429459
Contributor : Steve Oudot Connect in order to contact the contributor
Submitted on : Monday, November 9, 2009 - 8:02:17 AM
Last modification on : Friday, February 26, 2021 - 11:36:04 AM
Long-term archiving on: : Saturday, November 26, 2016 - 2:22:42 PM

File

RR-7084.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00429459, version 2

Collections

Citation

David Arthur, Steve Oudot, Anneesh Sharma. Finding Friends and Followers in Sub-linear Time. [Research Report] RR-7084, 2009. ⟨inria-00429459v2⟩

Share

Metrics

Les métriques sont temporairement indisponibles