hal-00722635, version 1
Nearest neighbor search for arbitrary kernels with explicit embeddings
N° RR-8040 (2012)
Abstract: Many algorithms have been proposed to handle efficient search in large databases for simple metrics such as the Euclidean distance. However, few approaches apply to more sophisticated Positive Semi-Definite (PSD) kernels. In this document, we propose for such kernels to use the concept of explicit embedding and to cast the search problem into a Euclidean space. We first describe an exact nearest neighbor search technique which relies on bounds on the approximation of the kernel. We show that, in the case of SIFT descriptors, one can retrieve the nearest neighbor with probability 1 by computing only a fraction of the costly kernels between the query and the database vectors. We then propose to combine explicit embedding with a recent Euclidean approximate nearest neighbor search method and show that it leads to significant improvements with respect to the state-of-the-art methods which rely on an implicit embedding. The database vectors being indexed by short codes, the approach is shown to scale to a dataset comprising 200 million vectors on a commodity server.
- 1:
- CNRS : UMR6074 – INRIA – Institut National des Sciences Appliquées (INSA) - Rennes – Université de Rennes 1
- 2:
- Technicolor
- 3:
- Xerox
- 4:
- CNRS : UMR6074 – INRIA – Institut National des Sciences Appliquées (INSA) - Rennes – Université de Rennes 1
- Domain : Computer Science/Information Retrieval
- Internal note : RR-8040
- hal-00722635, version 1
- http://hal.inria.fr/hal-00722635
- oai:hal.inria.fr:hal-00722635
- From:
- Submitted on: Thursday, 2 August 2012 17:14:14
- Updated on: Friday, 3 August 2012 08:13:31





Associated documents
Export