Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Rapport (Rapport Technique) Année : 2015

Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond

Résumé

In this paper, we consider how the emblematic problem of link-prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called S NAPLE , exploits a novel highly-localized vertex scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, S NAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that S NAPLE can process a graph containing 1.4 billions edges on a 256 cores cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
Fichier principal
Vignette du fichier
RT-454.pdf (861.45 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-01111459 , version 1 (30-01-2015)

Identifiants

  • HAL Id : hal-01111459 , version 1

Citer

Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond. [Technical Report] RT-0454, Inria Rennes; INRIA. 2015. ⟨hal-01111459⟩
350 Consultations
329 Téléchargements

Partager

Gmail Facebook X LinkedIn More