Near-Neighbor Preserving Dimension Reduction for Doubling Subsets of L1

Ioannis Z. Emiris; Vasilis Margonis; Ioannis Psarros

doi:10.4230/LIPIcs.APPROX-RANDOM.2019.47

Conference paper, Year: 2019


Ioannis Z. Emiris

Role: Author
PersonId : 10682
IdHAL : ioannis-emiris
ORCID : 0000-0002-2339-5303
IdRef : 092244807

National and Kapodistrian University of Athens

AlgebRe, geOmetrie, Modelisation et AlgoriTHmes

Vasilis Margonis

Role: Author
PersonId : 1060218

National and Kapodistrian University of Athens

Ioannis Psarros

Role: Author

Universität Bonn = University of Bonn

Abstract

Randomized dimensionality reduction has been recognized as one of the fundamental techniques in handling high-dimensional data. Starting with the celebrated Johnson-Lindenstrauss Lemma, such reductions have been studied in depth for the Euclidean (L2) metric, but much less for the Manhattan (L1) metric. Our primary motivation is the approximate nearest neighbor problem in L1. We exploit its reduction to the decision-with-witness version, called approximate near neighbor, which incurs a roughly logarithmic overhead. In 2007, Indyk and Naor, in the context of approximate nearest neighbors, introduced the notion of nearest neighbor-preserving embeddings. These are randomized embeddings between two metric spaces that guarantee bounded distortion only for the distances between a query point and a point set. Such embeddings are known to exist for both the L2 and L1 metrics, as well as for doubling subsets of L2. The case that remained open was that of doubling subsets of L1. In this paper, we propose a dimension reduction by means of a near neighbor-preserving embedding for doubling subsets of L1. Our approach is to represent the point set with a carefully chosen covering set, and then randomly project the latter. We study two types of covering sets, c-approximate r-nets and randomly shifted grids, and we discuss the tradeoff between them in terms of preprocessing time and target dimension. We employ Cauchy variables; certain concentration bounds that we derive should be of independent interest.
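The two ingredients named in the abstract, a covering set and a random projection with Cauchy variables, can be illustrated with a minimal sketch. It is not the authors' construction: the function names are hypothetical, the grid step snaps each point to a vertex of a randomly shifted grid without any of the paper's parameter choices, and the distance is recovered with the standard median estimator for Cauchy projections (which relies on the 1-stability of the Cauchy distribution) rather than with the embedding analyzed in the paper.

```python
import math
import random
import statistics


def shifted_grid_snap(point, cell, shift):
    """Snap a point to the nearest vertex of a grid of side `cell` translated by `shift`.

    Each coordinate moves by at most cell / 2, so the L1 displacement
    is at most len(point) * cell / 2.
    """
    return [math.floor((x - s) / cell + 0.5) * cell + s for x, s in zip(point, shift)]


def cauchy_matrix(k, d, rng):
    """Return a k x d matrix of i.i.d. standard Cauchy entries (inverse-CDF sampling)."""
    return [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(d)] for _ in range(k)]


def project(matrix, point):
    """Apply the linear map given by `matrix` to `point`."""
    return [sum(a * x for a, x in zip(row, point)) for row in matrix]


def l1(u, v):
    """L1 (Manhattan) distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))


def estimate_l1(pu, pv):
    """Estimate the original L1 distance from two projected vectors.

    By 1-stability, each coordinate of A(u - v) is distributed as
    ||u - v||_1 times a standard Cauchy variable, whose absolute value
    has median 1, so the sample median of |A(u - v)| estimates ||u - v||_1.
    """
    return statistics.median(abs(a - b) for a, b in zip(pu, pv))


if __name__ == "__main__":
    rng = random.Random(1)
    d, k, cell = 50, 501, 0.05  # illustrative values, not taken from the paper
    query = [rng.uniform(-1, 1) for _ in range(d)]
    data = [rng.uniform(-1, 1) for _ in range(d)]
    shift = [rng.uniform(0, cell) for _ in range(d)]
    representative = shifted_grid_snap(data, cell, shift)
    A = cauchy_matrix(k, d, rng)
    print("true L1 distance:     ", l1(query, data))
    print("estimate via sketch:  ", estimate_l1(project(A, query), project(A, representative)))
```

The estimate compares the query against the grid representative of the data point, so its error combines the snapping displacement with the randomness of the projection; how to balance the two is the kind of tradeoff the abstract refers to.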

Keywords

Randomized embedding; Nearest neighbor algorithms; Approximate nearest neighbor; Dimensionality reduction; Manhattan metric

Domains

Metric Geometry [math.MG]; Computational Geometry [cs.CG]; Computational Complexity [cs.CC]

Main file

EmMaPs-APPROX-RANDOM-2019.pdf (501.13 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-02398741

Submitted on: Saturday, December 7, 2019, 22:48:21

Last modified on: Monday, April 8, 2024, 10:41:35

Long-term archiving on: Sunday, March 8, 2020, 14:36:34

Dates and versions

hal-02398741, version 1 (07-12-2019)

Identifiers

HAL Id: hal-02398741, version 1
DOI: 10.4230/LIPIcs.APPROX-RANDOM.2019.47

Cite

Ioannis Z. Emiris, Vasilis Margonis, Ioannis Psarros. Near-Neighbor Preserving Dimension Reduction for Doubling Subsets of L1. APPROX 2019 - Workshop on Approximation Algorithms for Combinatorial Optimization Problems, Sep 2019, Boston, United States. ⟨10.4230/LIPIcs.APPROX-RANDOM.2019.47⟩. ⟨hal-02398741⟩


Collections

INRIA INRIA2 UNIV-COTEDAZUR

45 views

52 downloads
