Conference papers

Wrapping Web Pages into XML Documents: A Practical Experience and Comparison of Two Tools

Abstract : The notion of wrapping a web server to produce XML documents from unstructured web pages is driven by the need to produce structured data that can be used by a variety of applications. The web contains vast amounts of information that most applications cannot use because it targets a human audience. A solution is to automate the browsing process and convert the extracted unstructured information into a more structured format such as XML. This is called wrapping. We have used two different tools to wrap several tourist sites into XML. The tools are Norfolk, a system developed by the CSIRO TED group, and W4F, initially developed at the University of Pennsylvania and now a commercial product. This report describes our practical experience with the tools and compares them. The comparison highlights the features a wrapper system requires to support real applications.
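
To make the idea of wrapping concrete, here is a minimal sketch in Python using only the standard library. It is not how Norfolk or W4F work; the sample page, the class names `name` and `price`, the `HotelExtractor` class, and the `hotels`/`hotel` element names are all hypothetical, chosen only to illustrate extracting fields from HTML and emitting them as XML.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page; it stands in for a tourist site and is not
# taken from the paper.
SAMPLE_HTML = """
<html><body>
  <h2 class="name">Harbour View Hotel</h2>
  <span class="price">120</span>
  <h2 class="name">Beach Cabins</h2>
  <span class="price">85</span>
</body></html>
"""


class HotelExtractor(HTMLParser):
    """Collects the text of elements whose class is 'name' or 'price'."""

    def __init__(self):
        super().__init__()
        self.field = None   # field expected from the next text node
        self.records = []   # one dict per extracted hotel

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in ("name", "price"):
            self.field = cls

    def handle_data(self, data):
        text = data.strip()
        if self.field is None or not text:
            return
        if self.field == "name":
            # A name starts a new record.
            self.records.append({"name": text})
        elif self.records:
            # A price is attached to the most recent record.
            self.records[-1]["price"] = text
        self.field = None


def wrap(html):
    """Convert the HTML page into an XML string (assumed output schema)."""
    parser = HotelExtractor()
    parser.feed(html)
    parser.close()
    root = ET.Element("hotels")
    for rec in parser.records:
        hotel = ET.SubElement(root, "hotel")
        for key, value in rec.items():
            ET.SubElement(hotel, key).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(wrap(SAMPLE_HTML))
```

A hand-written extractor like this is tied to one page layout and breaks when the layout changes, which is one reason dedicated wrapper tools describe extraction rules declaratively.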
Cited literature [12 references]

https://hal.inria.fr/inria-00092248
Contributor : Anne-Marie Vercoustre
Submitted on : Friday, September 8, 2006 - 3:44:27 PM
Last modification on : Thursday, January 11, 2018 - 5:22:01 PM
Long-term archiving on : Monday, April 5, 2010 - 11:35:57 PM

Identifiers

  • HAL Id : inria-00092248, version 1

Citation

Sabine Jabbour, Anne-Marie Vercoustre. Wrapping Web Pages into XML Documents: A Practical Experience and Comparison of Two Tools. The Eighth Australian World Wide Web Conference, Jul 2002, Sunshine Coast, Queensland, Australia. ⟨inria-00092248⟩

Metrics

  • Record views : 46
  • Files downloads : 115