The Out-of-core KNN Awakens: The light side of computation force on large datasets

Nitin Chiluka 1,*, Anne-Marie Kermarrec 1, Javier Olivares 1
* Corresponding author
1 ASAP - As Scalable As Possible: foundations of large scale dynamic distributed systems
Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract: K-Nearest Neighbors (KNN) is a crucial tool for many applications, e.g. recommender systems, image classification and web-related applications. However, KNN is a resource-greedy operation, particularly for large datasets. We focus on the challenge of KNN computation over large datasets on a single commodity PC with limited memory. We propose a novel approach to compute KNN on large datasets by leveraging both disk and main memory efficiently. The main rationale of our approach is to minimize random accesses to disk, maximize sequential accesses to data, and make efficient use of only the available memory. We evaluate our approach on large datasets in terms of performance and memory consumption. The evaluation shows that our approach requires only 7% of the time needed by an in-memory baseline to compute a KNN graph.
Document type:
Conference paper
The International Conference on Networked Systems NETYS, May 2016, Marrakech, Morocco. 〈http://netys.net/〉

Cited literature: 22 references

https://hal.inria.fr/hal-01336673
Contributor: Javier Olivares
Submitted on: Thursday, June 23, 2016 - 15:24:29
Last modified on: Friday, November 16, 2018 - 01:39:27
Document(s) archived on: Saturday, September 24, 2016 - 11:45:04

File

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01336673, version 1

Citation

Nitin Chiluka, Anne-Marie Kermarrec, Javier Olivares. The Out-of-core KNN Awakens: The light side of computation force on large datasets. The International Conference on Networked Systems NETYS, May 2016, Marrakech, Morocco. 〈http://netys.net/〉. 〈hal-01336673〉


Metrics

Record views

498

File downloads

275