
How Information Propagates on Twitter?

Abstract : This thesis presents a measurement study of online social networks, focusing on Twitter. Twitter is one of the largest social networks that uses exclusively directed links among accounts. This makes the Twitter social graph much closer than, for instance, Facebook's to the social graph that supports real-life communication. Understanding the structure of the Twitter social graph and the way information propagates through it is therefore of interest not only to computer scientists but also to researchers in other fields, such as sociologists. However, little is known about how information propagates on Twitter.

In the first part, we present an in-depth study of the macroscopic structure of the Twitter social graph, unveiling the highways on which tweets propagate. For this study, we crawled Twitter to retrieve all accounts and all social relationships (follow links) among accounts. We present a methodology to unveil the macroscopic structure of the Twitter social graph, which consists of 8 components defined by their connectivity characteristics. We found that each component groups users with a specific usage of Twitter. Finally, we present a method to approximate the macroscopic structure of the Twitter social graph in the past, validate this method using old datasets, and discuss the evolution of the macroscopic structure of the Twitter social graph over the past 6 years.

In the second part, we study information propagation on Twitter by looking at the news media articles shared there. Online news domains increasingly rely on social media to drive traffic to their websites. Yet we know surprisingly little about how a social media conversation mentioning an online article actually generates a click to it. We present a large-scale, validated, and reproducible study of social clicks, gathering a month of web visits to online resources that are hosted on 5 leading news domains and mentioned on Twitter. As we show, properties of clicks and the social media Click-Per-Follower rate impact multiple aspects of information diffusion, all previously unknown. Secondary resources, which are not promoted through headlines and are responsible for the long tail of content popularity, generate more clicks in both absolute and relative terms. Social media attention is actually long-lived, in contrast with the temporal evolution estimated from posts or receptions. The actual influence of an intermediary or a resource is poorly predicted by its posting behavior, but we show how that prediction can be made more precise.

In the third part, we present an experimental study of graph sampling. Online social networks (OSNs) are an important source of information for scientists in fields such as computer science, sociology, and economics. However, OSNs are hard to study because they are very large. In addition, companies take measures to prevent crawls of their OSNs and refrain from sharing their data with the research community. For these reasons, we argue that sampling will be the best way to study OSNs in the future. In this part, we take an experimental approach to study the characteristics of well-known sampling techniques on a full social graph of Twitter that we crawled in 2012.
Contributor : Maksym Gabielkov
Submitted on : Wednesday, June 22, 2016 - 5:03:13 PM
Last modification on : Wednesday, September 21, 2016 - 1:04:33 AM


  • HAL Id : tel-01336218, version 1


Maksym Gabielkov. How Information Propagates on Twitter?. Social and Information Networks [cs.SI]. Université Nice Sophia Antipolis, 2016. English. ⟨tel-01336218v1⟩


