Reports (Research report)

Conditional Random Fields for XML Applications

Abstract : XML tree labeling is the problem of classifying elements in XML documents. It is a fundamental task for applications such as XML transformation, schema matching, and information extraction. In this paper we propose XCRFs, conditional random fields for XML tree labeling. Dealing with trees often raises complexity problems. We describe optimization methods based on constraints and combination techniques that allow XCRFs to be used in real tasks and in interactive machine learning programs. We show that domain knowledge in XML applications transfers easily to XCRFs through constraints and combinations of XCRFs. We describe an approach based on XCRFs for learning tree transformations. This approach makes it possible to solve XML data integration and restructuring tasks. We have developed an open-source toolbox for XCRFs and use it to provide a Web service that generates personalized RSS feeds from HTML pages.

Cited literature [45 references]
Contributor : Marc Tommasi
Submitted on : Thursday, November 27, 2008 - 9:03:35 AM
Last modification on : Wednesday, October 26, 2022 - 8:16:44 AM
Long-term archiving on : Thursday, October 11, 2012 - 12:07:18 PM


Files produced by the author(s)


  • HAL Id : inria-00342279, version 1


Rémi Gilleron, Florent Jousse, Marc Tommasi, Isabelle Tellier. Conditional Random Fields for XML Applications. [Research Report] RR-6738, INRIA. 2008. ⟨inria-00342279⟩


