
Data Harvesting 2.0: from the Visible to the Invisible Web

Claude Castelluccia 1, * Stéphane Grumbach 2, * Lukasz Olejnik 1
* Corresponding author
1 PRIVATICS - Privacy Models, Architectures and Tools for the Information Society
CITI - CITI Centre of Innovation in Telecommunications and Integration of services, Inria Grenoble - Rhône-Alpes
2 DICE - Data on the Internet at the Core of the Economy
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
Abstract : Personal data are fuelling a fast-emerging industry which transforms them into added value. Harvesting these data is therefore of the utmost importance for the economy. In this paper, we study the flows of personal data at a global level, and distinguish countries based on their capacity to harvest data. We establish a cartography of international data channels on the visible and invisible Web. The visible Web is composed of the sites that are available to the general public and are typically indexed by search engines. The invisible Web refers to tags, Web bugs, pixels and beacons that appear on Websites to track and profile users. It is well known that the US dominates the visible Web with more than 70% of the top 100 sites in the world. We show that this domination is even stronger on the invisible Web. In most countries, the largest proportion of trackers is indeed from the US. Apart from the US, two countries exhibit an original strategy. China dominates its visible Web with a majority of local sites, but surprisingly these sites still contain a majority of US trackers. Russia also dominates its visible Web, and is the only country with more local trackers than US ones.
keyword : trackers
Document type : Conference papers

Cited literature : 22 references
Contributor : Stephane Grumbach
Submitted on : Tuesday, June 11, 2013 - 2:37:51 PM
Last modification on : Friday, December 10, 2021 - 1:16:03 PM
Long-term archiving on : Thursday, September 12, 2013 - 4:07:54 AM




  • HAL Id : hal-00832784, version 1



Claude Castelluccia, Stéphane Grumbach, Lukasz Olejnik. Data Harvesting 2.0: from the Visible to the Invisible Web. The Twelfth Workshop on the Economics of Information Security, Allan Friedman, Jun 2013, Washington, DC, United States. ⟨hal-00832784⟩


