Scaling Out Link Prediction with SNAPLE

Abstract : A growing number of organizations are seeking to analyze extra large graphs in a timely and resource-efficient manner. With some graphs containing well over a billion elements, these organizations are turning to distributed graph-computing platforms that can scale out easily in existing data-centers and clouds. Unfortunately such platforms usually impose programming models that can be ill suited to typical graph computations, fundamentally undermining their potential benefits. In this paper, we consider how the emblematic problem of link-prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called Snaple, exploits a novel highly-localized vertex scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, Snaple can scale to very large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that Snaple can process a graph containing 1.4 billions edges on a 256 cores cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
Type de document :
Communication dans un congrès
16th Annual ACM/IFIP/USENIX Middleware Conference, Dec 2015, Vancouver, Canada. Middleware '15 Proceedings of the 16th Annual Middleware Conference, pp.12, 2015, 〈http://dl.acm.org/citation.cfm?id=2814576&picked=prox&cfid=738349186&cftoken=78689160〉. 〈10.1145/2814576.2814810〉
Liste complète des métadonnées

Littérature citée [43 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/hal-01244663
Contributeur : François Taïani <>
Soumis le : mercredi 16 décembre 2015 - 10:02:39
Dernière modification le : mardi 16 janvier 2018 - 15:54:13
Document(s) archivé(s) le : samedi 29 avril 2017 - 16:16:07

Fichier

manuscript (1).pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Citation

Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE. 16th Annual ACM/IFIP/USENIX Middleware Conference, Dec 2015, Vancouver, Canada. Middleware '15 Proceedings of the 16th Annual Middleware Conference, pp.12, 2015, 〈http://dl.acm.org/citation.cfm?id=2814576&picked=prox&cfid=738349186&cftoken=78689160〉. 〈10.1145/2814576.2814810〉. 〈hal-01244663〉

Partager

Métriques

Consultations de la notice

616

Téléchargements de fichiers

73