XML Document Transformation with Conditional Random Fields

Rémi Gilleron (1), Florent Jousse (1), Isabelle Tellier (1), Marc Tommasi (1)
(1) MOSTRARE - Modeling Tree Structures, Machine Learning, and Information Extraction
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract: We address the problem of structure mapping, which arises in XML data exchange or XML document transformation. Our approach relies on annotating XML documents with semantic labels that describe local tree edit operations. We propose XML Conditional Random Fields (XCRFs), a framework for building conditional models for labeling XML documents. We equip XCRFs with efficient algorithms for inference and parameter estimation. We provide theoretical arguments and practical experiments that illustrate their expressivity and efficiency. Experiments on the Structure Mapping movie datasets of the INEX XML Document Mining Challenge yield very good results.
Document type: Conference paper

https://hal.inria.fr/inria-00147052
Contributor: Marc Tommasi
Submitted on: Tuesday, May 15, 2007 - 5:16:41 PM
Last modification on: Thursday, February 21, 2019 - 10:52:49 AM

Identifiers

  • HAL Id : inria-00147052, version 1

Citation

Rémi Gilleron, Florent Jousse, Isabelle Tellier, Marc Tommasi. XML Document Transformation with Conditional Random Fields. INEX 2006, Dec 2006, Dagstuhl, Germany. ⟨inria-00147052⟩
