The Observable Web

Yacine Boufkhad; Laurent Viennot

Report (Research Report), Year: 2003

Yacine Boufkhad

Role: Author
PersonId: 7352
IdHAL: yacine-boufkhad
IdRef: 14974398X

Dynamic graphs and the web graph

Laurent Viennot

Role: Author
PersonId: 1841
IdHAL: laurentviennot
IdRef: 034781072

Dynamic graphs and the web graph

Abstract

The web is now de facto the first place to publish data. However, retrieving the whole database represented by the web appears almost impossible. Some parts are known to be hard to discover automatically, giving rise to the so-called hidden or invisible web. Search engines try to index most of the web, and almost all related work is based on discovering the web by crawling. This paper estimates how accurate the view of the web obtained by crawling is. Our approach is to compare crawling to other ways of discovering the web, mainly the analysis of server or proxy logs of web surfers' activity. This work is a first step towards identifying the observable web.

Keywords

CRAWL; ACCESS LOGS; HIDDEN WEB

Domains

Other [cs.OH]

Main file

RR-4790.pdf (140.51 KB)

https://inria.hal.science/inria-00071796

Submitted on: Tuesday, May 23, 2006, 18:48:10

Last modified on: Tuesday, February 7, 2023, 03:39:03

Long-term archiving on: Sunday, April 4, 2010, 22:38:31

Dates and versions

inria-00071796, version 1 (23-05-2006)

Identifiers

HAL Id: inria-00071796, version 1

Cite

Yacine Boufkhad, Laurent Viennot. The Observable Web. [Research Report] RR-4790, INRIA. 2003. ⟨inria-00071796⟩

Collections

INRIA INRIA-RRRT INRIA2 LARA

89 views

77 downloads
