Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond

Abstract: In this paper, we consider how the emblematic problem of link prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called SNAPLE, exploits a novel highly-localized vertex scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, SNAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that SNAPLE can process a graph containing 1.4 billion edges on a 256-core cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.

Cited literature [42 references]

https://hal.inria.fr/hal-01111459
Contributor: Juan Manuel Tirado Martin
Submitted on: Friday, January 30, 2015 - 13:59:04
Last modified on: Wednesday, May 16, 2018 - 11:23:29
Document(s) archived on: Saturday, April 15, 2017 - 23:37:49

File

RT-454.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01111459, version 1

Citation

Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond. [Technical Report] RT-0454, Inria Rennes; INRIA. 2015. 〈hal-01111459〉

Metrics

Record views

762

File downloads

405