XML Document Transformation with Conditional Random Fields

Rémi Gilleron 1, Florent Jousse 1, Isabelle Tellier 1, Marc Tommasi 1
1 MOSTRARE - Modeling Tree Structures, Machine Learning, and Information Extraction
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract: We address the problem of structure mapping that arises in XML data exchange or XML document transformation. Our approach relies on annotating XML documents with semantic labels that describe local tree edit operations. We propose XML Conditional Random Fields (XCRFs), a framework for building conditional models for labeling XML documents. We equip XCRFs with efficient algorithms for inference and parameter estimation. We provide theoretical arguments and practical experiments that illustrate their expressivity and efficiency. Experiments on the Structure Mapping movie datasets of the INEX XML Document Mining Challenge yield very good results.
Document type:
Conference paper
INEX 2006, Dec 2006, Dagstuhl, Germany. 4518 (4518), 2006, LNCS

https://hal.inria.fr/inria-00147052
Contributor: Marc Tommasi
Submitted on: Tuesday, May 15, 2007 - 17:16:41
Last modified on: Thursday, January 11, 2018 - 06:22:13

Identifiers

  • HAL Id: inria-00147052, version 1

Citation

Rémi Gilleron, Florent Jousse, Isabelle Tellier, Marc Tommasi. XML Document Transformation with Conditional Random Fields. INEX 2006, Dec 2006, Dagstuhl, Germany. 4518 (4518), 2006, LNCS. 〈inria-00147052〉
