Conference papers

Scaling Out Link Prediction with SNAPLE

Abstract: A growing number of organizations are seeking to analyze extra-large graphs in a timely and resource-efficient manner. With some graphs containing well over a billion elements, these organizations are turning to distributed graph-computing platforms that can scale out easily in existing data centers and clouds. Unfortunately, such platforms usually impose programming models that can be ill-suited to typical graph computations, fundamentally undermining their potential benefits. In this paper, we consider how the emblematic problem of link prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called Snaple, exploits a novel, highly localized vertex-scoring technique, and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, Snaple can scale to very large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that Snaple can process a graph containing 1.4 billion edges on a 256-core cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to a super-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
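The "super-linear" qualifier can be checked from the figures quoted in the abstract alone. The following is a minimal sketch, not taken from the paper: the function name and its parameters are hypothetical, and it assumes that "super-linear" simply means the reported speedup exceeds the ratio of core counts between the two setups.

```python
def speedup_vs_core_ratio(speedup, cores_distributed, cores_baseline):
    """Compare a reported speedup with the ratio of available cores.

    Hypothetical helper for illustration: a speedup larger than the
    core ratio is what this sketch treats as super-linear.
    """
    core_ratio = cores_distributed / cores_baseline
    return core_ratio, speedup > core_ratio


# Figures quoted in the abstract: speedup of 30, 256 cores vs. 20 cores.
core_ratio, is_superlinear = speedup_vs_core_ratio(30, 256, 20)
print(f"core ratio: {core_ratio:.1f}, super-linear: {is_superlinear}")
```

With 12.8 times as many cores, a linear scale-out would be expected to yield a speedup of roughly 12.8, so the reported factor of 30 is more than twice that.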
Cited literature: 43 references

https://hal.inria.fr/hal-01244663
Contributor: François Taïani
Submitted on: Wednesday, December 16, 2015 - 10:02:39 AM
Last modification on: Thursday, January 7, 2021 - 4:29:18 PM
Long-term archiving on: Saturday, April 29, 2017 - 4:16:07 PM

File

manuscript (1).pdf
Files produced by the author(s)

Citation

Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE. 16th Annual ACM/IFIP/USENIX Middleware Conference, Dec 2015, Vancouver, Canada. pp.12, ⟨10.1145/2814576.2814810⟩. ⟨hal-01244663⟩
