Abstract: We address the problem of structure mapping, which arises in XML data exchange or XML document transformation. Our approach relies on annotating XML documents with semantic labels that describe local tree edit operations. We propose XML Conditional Random Fields (XCRFs), a framework for building conditional models for labeling XML documents. We equip XCRFs with efficient algorithms for inference and parameter estimation. We provide theoretical arguments and practical experiments that illustrate their expressivity and efficiency. Experiments on the Structure Mapping movie datasets of the INEX XML Document Mining Challenge yield very good results.
Rémi Gilleron, Florent Jousse, Isabelle Tellier, Marc Tommasi. XML Document Transformation with Conditional Random Fields. INEX 2006, Dec 2006, Dagstuhl, Germany. ⟨inria-00147052⟩