Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond

Anne-Marie Kermarrec; François Taïani; Juan Manuel Tirado Martin

Report (Technical Report) Year: 2015


Anne-Marie Kermarrec

Role: Author

As Scalable As Possible: foundations of large scale dynamic distributed systems

François Taïani

Role: Author
PersonId: 855
IdHAL: francois-taiani
ORCID: 0000-0002-9692-5678
IdRef: 081430264

Université de Rennes

As Scalable As Possible: foundations of large scale dynamic distributed systems

Juan Manuel Tirado Martin

Role: Author
PersonId: 963497

As Scalable As Possible: foundations of large scale dynamic distributed systems

Abstract

In this paper, we consider how the emblematic problem of link prediction can be implemented efficiently on platforms built around gather-apply-scatter (GAS), a popular distributed graph-computation model. Our proposal, called SNAPLE, exploits a novel, highly localized vertex-scoring technique and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, SNAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that SNAPLE can process a graph containing 1.4 billion edges on a 256-core cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to a super-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.

Keywords

big data; distributed systems; graph link prediction

Domains

Computer Science [cs]; Data Structures and Algorithms [cs.DS]; Distributed, Parallel, and Cluster Computing [cs.DC]; Information Retrieval [cs.IR]; Computation and Language [cs.CL]

Main file

RT-454.pdf (861.45 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01111459

Submitted on: Friday, January 30, 2015, 13:59:04

Last modified on: Friday, March 24, 2023, 14:53:00

Long-term archiving on: Saturday, April 15, 2017, 23:37:49

Dates and versions

hal-01111459, version 1 (30-01-2015)

Identifiers

HAL Id: hal-01111459, version 1

Cite

Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin. Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond. [Technical Report] RT-0454, Inria Rennes; INRIA. 2015. ⟨hal-01111459⟩

Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA INRIA-RRRT CENTRALESUPELEC IRISA-D1 INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
