Skip to Main content Skip to Navigation

Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond

Abstract : In this paper, we consider how the emblematic problem of link-prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called S NAPLE , exploits a novel highly-localized vertex scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, S NAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that S NAPLE can process a graph containing 1.4 billions edges on a 256 cores cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
Complete list of metadata

Cited literature [42 references]  Display  Hide  Download
Contributor : Juan Manuel Tirado Martin <>
Submitted on : Friday, January 30, 2015 - 1:59:04 PM
Last modification on : Thursday, January 7, 2021 - 4:25:29 PM
Long-term archiving on: : Saturday, April 15, 2017 - 11:37:49 PM


Files produced by the author(s)


  • HAL Id : hal-01111459, version 1


Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond. [Technical Report] RT-0454, Inria Rennes; INRIA. 2015. ⟨hal-01111459⟩



Record views


Files downloads