Inference in OSNs via Lightweight Partial Crawls - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper. Year: 2016

Inference in OSNs via Lightweight Partial Crawls

Abstract

Are users of Online Social Network (OSN) A more likely to form friendships with those who have similar attributes? Do users of OSN B score content more favorably than users of OSN C? Such questions frequently arise in Social Network Analysis (SNA), but crawling an OSN via its Application Programming Interface (API) is often the only way for a third party to gather data. To date, these partial API crawls make up the majority of public datasets and are synonymous with a lack of statistical guarantees in incomplete-data comparisons, which severely limits progress in SNA research. Using the regenerative properties of random walks, we propose estimation techniques based on short crawls that have proven statistical guarantees. Moreover, our short crawls can be implemented in massively distributed algorithms. We also provide an adaptive crawler that makes our method parameter-free, significantly improving our statistical guarantees. We then derive the Bayesian approximation of the posterior of the estimates. In addition, we obtain an estimator for the expected value of node and edge statistics in an equivalent configuration model or Chung-Lu random graph model of the given network (where nodes are connected randomly) and use it as a basis for testing null hypotheses. The theoretical results are supported by simulations on a variety of real-world networks.
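As a rough illustration of the regenerative idea mentioned in the abstract, the sketch below estimates an edge statistic from short random-walk "tours", that is, walks that start at a fixed node and stop at the first return to it. It is not the paper's algorithm: it assumes a connected undirected toy graph held in memory, a single start node, and a symmetric edge function, and it relies only on the standard renewal-reward identity for simple random walks (the expected tour sum of f equals the sum of f over directed edges divided by the degree of the start node). The names `tour_estimate`, `adj`, and `attr` are hypothetical.

```python
import random


def tour_estimate(adj, start, f, num_tours, rng):
    """Estimate the sum of f(u, v) over undirected edges from random-walk tours.

    Assumes a connected undirected graph and a symmetric f (f(u, v) == f(v, u)).
    Each tour starts at `start` and ends at the first return to `start`.
    """
    total = 0.0
    for _ in range(num_tours):
        u = start
        while True:
            v = rng.choice(adj[u])   # one step of the simple random walk
            total += f(u, v)         # accumulate the statistic along the tour
            u = v
            if u == start:           # tour ends on first return to the start
                break
    # Renewal-reward identity: E[tour sum] = (directed-edge sum of f) / deg(start),
    # and the directed-edge sum is twice the undirected-edge sum for symmetric f.
    return len(adj[start]) / 2.0 * total / num_tours


# Toy graph (an assumption for illustration): two triangles joined by edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
attr = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

rng = random.Random(0)
# Number of edges (true value: 7).
edge_est = tour_estimate(adj, 2, lambda u, v: 1.0, 20000, rng)
# Number of edges joining nodes with the same attribute (true value: 6).
same_est = tour_estimate(adj, 2, lambda u, v: float(attr[u] == attr[v]), 20000, rng)
print(edge_est, same_est)
```

Because tours are independent of one another, they can in principle be run in parallel by separate crawlers and averaged afterwards, which is in the spirit of the distributed implementation the abstract refers to. On large networks, tours from a single node can be very long; this sketch does not address that.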
Main file
Sigm2016.pdf (961.19 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01403018 , version 1 (25-11-2016)

Identifiers

HAL Id: hal-01403018
DOI: 10.1145/2896377.2901477

Cite

Konstantin Avrachenkov, Bruno Ribeiro, Jithin K Sreedharan. Inference in OSNs via Lightweight Partial Crawls. ACM SIGMETRICS, Jun 2016, Juan Les Pins, France. ⟨10.1145/2896377.2901477⟩. ⟨hal-01403018⟩

Collections

INRIA INRIA2
121 views
208 downloads
