The web changes everything, Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM '09, 2009.
DOI : 10.1145/1498759.1498837
Extracting structured data from Web pages, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD '03, 2003.
DOI : 10.1145/872757.872799
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9515
The evolution of the Web and implications for an incremental crawler, Proc. VLDB, 2000.
RoadRunner, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, SIGMOD '02, 2002.
DOI : 10.1145/564691.564778
Resource Harvesting within the OAI-PMH Framework, D-Lib Magazine, 2004.
DOI : 10.1045/december2004-vandesompel
As we may perceive: Inferring logical documents from hypertext, Proc. HT, 2005.
A large-scale study of the evolution of web pages, Proc. WWW, 2003.
Implementing Preservation Strategies for Complex Multimedia Objects, Proc. ECDL, 2003.
DOI : 10.1007/978-3-540-45175-4_43
Application of Kalman filters to identify unexpected change in blogs, Proc. JCDL, 2008.
Boilerplate detection using shallow text features, Proceedings of the third ACM international conference on Web search and data mining, WSDM '10, 2010.
DOI : 10.1145/1718487.1718542
Data Without Meaning: Establishing the Significant Properties of Digital Research, International Journal of Digital Curation, vol.4, issue.1, 2009.
DOI : 10.2218/ijdc.v4i1.86
Mining data records in Web pages, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, 2003.
DOI : 10.1145/956750.956826
What's new on the web? The evolution of the web from a search engine perspective, Proc. WWW, 2004.
Archivage du contenu éphémère du Web à l'aide des flux Web, Proc. BDA (demonstration; conference without formal proceedings), 2010.
Extracting article text from the web with maximum subsequence segmentation, Proceedings of the 18th international conference on World wide web, WWW '09, 2009.
DOI : 10.1145/1526709.1526840
A novel Web archiving approach based on visual pages analysis, Proc. IWAW, 2009.
ArchivePress: A really simple solution to archiving blog content, Proc. iPRES, 2009.
Efficient monitoring algorithm for fast news alerts, IEEE Trans. on Knowl. and Data Eng., vol.19, issue.7, 2007.
Incremental crawling with Heritrix, Proc. IWAW, 2005.
Catch me if you can. Temporal coherence of Web archives, Proc. IWAW, 2008.
Migrating content in WARC files, Proc. IWAW, 2009.
Dynamic Web file format transformations with grace, Proc. IWAW, 2005.
Extracting structured data from Web pages, Proc. SIGMOD, 2009.
Improving pseudo-relevance feedback in web information retrieval using web page segmentation, Proceedings of the twelfth international conference on World Wide Web, WWW '03, 2003.
DOI : 10.1145/775152.775155