HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation

Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond

Abstract : In this paper, we consider how the emblematic problem of link-prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called S NAPLE , exploits a novel highly-localized vertex scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, S NAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that S NAPLE can process a graph containing 1.4 billions edges on a 256 cores cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
Complete list of metadata

Cited literature [42 references]  Display  Hide  Download

Contributor : Juan Manuel Tirado Martin Connect in order to contact the contributor
Submitted on : Friday, January 30, 2015 - 1:59:04 PM
Last modification on : Thursday, January 20, 2022 - 4:19:59 PM
Long-term archiving on: : Saturday, April 15, 2017 - 11:37:49 PM


Files produced by the author(s)


  • HAL Id : hal-01111459, version 1


Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond. [Technical Report] RT-0454, Inria Rennes; INRIA. 2015. ⟨hal-01111459⟩



Record views


Files downloads