Conference papers

Knowledge extraction from webpages

Sylvain Tenier 1, 2 Amedeo Napoli 1 Xavier Polanco 2 Yannick Toussaint 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This article presents a system that extracts knowledge from webpages by producing semantic annotations. Taking semantic information from the domain into account when annotating an element of a webpage requires solving two problems: (1) identifying the syntactic structure of the element within the webpage, and (2) identifying the most specific concept of the ontology (in terms of subsumption) that will be used to annotate the element. Our approach relies on a wrapper-based machine learning algorithm combined with reasoning that makes use of the formal structure of the ontology.
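To make problem (2) concrete, the following is a minimal sketch of choosing the most specific concept under subsumption. It is not the authors' algorithm: the toy hierarchy, the concept names, and the functions `ancestors` and `most_specific` are all hypothetical, and the ontology is assumed to be a simple acyclic parent map rather than a full description-logic knowledge base.

```python
# Toy ontology (hypothetical): each concept maps to its direct subsumers.
PARENTS = {
    "Thing": [],
    "Person": ["Thing"],
    "Researcher": ["Person"],
    "PhDStudent": ["Researcher"],
    "Organization": ["Thing"],
}


def ancestors(concept, parents=PARENTS):
    """Return every concept that strictly subsumes `concept`."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def most_specific(candidates, parents=PARENTS):
    """Keep only the candidates that do not subsume another candidate."""
    candidates = set(candidates)
    subsumers = set()
    for concept in candidates:
        subsumers |= ancestors(concept, parents)
    return candidates - subsumers


# An element matched by three concepts is annotated with the most specific one.
print(sorted(most_specific({"Thing", "Person", "Researcher"})))  # ['Researcher']
```

When the candidates are not comparable (for example `Researcher` and `Organization`), this sketch returns all of them; deciding between such candidates would need additional evidence, which is outside what the abstract describes.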

Contributor : Sylvain Tenier
Submitted on : Tuesday, November 22, 2005 - 11:36:12 AM
Last modification on : Thursday, January 20, 2022 - 4:13:47 PM
Long-term archiving on : Friday, April 2, 2010 - 6:13:36 PM


  • HAL Id : inria-00000822, version 1



Sylvain Tenier, Amedeo Napoli, Xavier Polanco, Yannick Toussaint. Knowledge extraction from webpages. 5th International Workshop on Knowledge Markup and Semantic Annotation - SemAnnot 2005, Siegfried Handschuh, Nov 2005, Galway, Ireland. ⟨inria-00000822⟩


