Conditional Random Fields for XML Applications - Inria - Institut national de recherche en sciences et technologies du numérique
Reports (Research Report). Year: 2008

Conditional Random Fields for XML Applications

Abstract

XML tree labeling is the problem of classifying elements in XML documents. It is a fundamental task for applications such as XML transformation, schema matching, and information extraction. In this paper we propose XCRFs, conditional random fields for XML tree labeling. Dealing with trees often raises complexity problems. We describe optimization methods, based on constraints and combination techniques, that allow XCRFs to be used in real tasks and in interactive machine learning programs. We show that domain knowledge in XML applications transfers easily to XCRFs thanks to constraints and the combination of XCRFs. We describe an approach based on XCRFs for learning tree transformations. The approach makes it possible to solve XML data integration and restructuring tasks. We have developed an open source toolbox for XCRFs. We use it to propose a Web service that generates personalized RSS feeds from HTML pages.
Main file: RR-6738.pdf (381.46 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00342279, version 1 (27-11-2008)

Identifiers

  • HAL Id: inria-00342279, version 1

Cite

Rémi Gilleron, Florent Jousse, Marc Tommasi, Isabelle Tellier. Conditional Random Fields for XML Applications. [Research Report] RR-6738, INRIA. 2008. ⟨inria-00342279⟩