Learning n-ary Node Selecting Tree Transducers from Completely Annotated Examples

Aurélien Lemay 1 Joachim Niehren 1 Rémi Gilleron 1
1 MOSTRARE - Modeling Tree Structures, Machine Learning, and Information Extraction
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract : We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (NSTTs), that are known to capture the class of MSO-definable n-ary queries. Despite of this highly expressive, we show that n-aryy queries, selecting a polynomially bounded number of tuples per tree, represented by deterministic NSTTs can be learned from polynomial time and data while allowing for efficient enumeration of query answers. An application to wrapper induction in Web information extraction yields encouraging results.
Document type :
Conference papers
Complete list of metadatas

Cited literature [19 references]  Display  Hide  Download

https://hal.inria.fr/inria-00088077
Contributor : Joachim Niehren <>
Submitted on : Sunday, August 13, 2006 - 9:42:21 PM
Last modification on : Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on : Monday, September 20, 2010 - 4:08:09 PM

File

Identifiers

  • HAL Id : inria-00088077, version 2

Collections

Citation

Aurélien Lemay, Joachim Niehren, Rémi Gilleron. Learning n-ary Node Selecting Tree Transducers from Completely Annotated Examples. 8th International Colloquium on Grammatical Inference, Sep 2006, Tokyo, Japan. pp.253-267. ⟨inria-00088077v2⟩

Share

Metrics

Record views

441

Files downloads

403