Conference paper, Year: 2005

Knowledge extraction from webpages

Abstract

This article presents a system that extracts knowledge from webpages by producing semantic annotations. Taking into account semantic information from the domain to annotate an element in a webpage implies solving two problems: (1) identifying the syntactic structure of this element in the webpage and (2) identifying the most specific concept (in terms of subsumption) of the ontology that will be used to annotate this element. Our approach relies on a wrapper-based machine learning algorithm combined with reasoning that makes use of the formal structure of the ontology.
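As an illustration of problem (2), the following minimal sketch shows one way to pick the most specific concept among several candidates using subsumption. It is not the authors' algorithm: the toy hierarchy `PARENTS` and the helper names `ancestors`, `subsumes`, and `most_specific` are hypothetical and were chosen only for this example.

```python
# Toy is-a hierarchy (hypothetical): each concept maps to its direct parents.
PARENTS = {
    "Thing": [],
    "Person": ["Thing"],
    "Researcher": ["Person"],
    "PhDStudent": ["Researcher"],
    "Organization": ["Thing"],
}


def ancestors(concept):
    """Return every concept that strictly subsumes `concept`."""
    seen = set()
    stack = list(PARENTS[concept])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS[parent])
    return seen


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its ancestors."""
    return general == specific or general in ancestors(specific)


def most_specific(candidates):
    """Keep the candidates that do not strictly subsume another candidate."""
    unique = set(candidates)
    return sorted(
        c for c in unique
        if not any(c != other and subsumes(c, other) for other in unique)
    )
```

When the candidates lie on a single branch of the hierarchy, `most_specific` returns one concept; when they lie on unrelated branches, it returns all of them, and a real system would need an additional rule to choose among them.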
Main file
semannot_stenier.pdf (116.15 KB)

Dates and versions

inria-00000822, version 1 (22-11-2005)

Identifiers

  • HAL Id: inria-00000822, version 1

Cite

Sylvain Tenier, Amedeo Napoli, Xavier Polanco, Yannick Toussaint. Knowledge extraction from webpages. 5th International Workshop on Knowledge Markup and Semantic Annotation (SemAnnot 2005) located at the 4th International Semantic Web Conference ISWC 2005, Siegfried Handschuh; Thierry Declerck; Marja-Riitta Koivunen, Nov 2005, Galway, Ireland. ⟨inria-00000822⟩
