Learning Node Selecting Tree Transducer from Completely Annotated Examples

Abstract: Web documents in HTML or XML form trees with nodes containing text. A basic problem in Web information extraction is to find appropriate queries for informative nodes in trees. We propose to learn queries for nodes in trees automatically from examples. We introduce node selecting tree transducers (NSTTs) for representing node queries in trees and show how to induce deterministic NSTTs in polynomial time from completely annotated examples by methods of grammatical inference. We have implemented learning algorithms for NSTTs, started applying them to Web information extraction, and present first experimental results.
Document type:
Conference paper
Georgios Paliouras and Yasubumi Sakakibara. 7th International Colloquium on Grammatical Inference, 2004, Athens, Greece. Springer, 3264, pp.91--102, 2004, Lecture Notes in Artificial Intelligence

https://hal.inria.fr/inria-00536528
Contributor: Joachim Niehren
Submitted on: Tuesday, November 16, 2010 - 13:41:26
Last modified on: Thursday, January 11, 2018 - 06:22:13
Document(s) archived on: Thursday, February 17, 2011 - 02:26:16

File

icgi04.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00536528, version 1

Citation

Julien Carme, Aurélien Lemay, Joachim Niehren. Learning Node Selecting Tree Transducer from Completely Annotated Examples. Georgios Paliouras and Yasubumi Sakakibara. 7th International Colloquium on Grammatical Inference, 2004, Athens, Greece. Springer, 3264, pp.91--102, 2004, Lecture Notes in Artificial Intelligence. 〈inria-00536528〉

Metrics

Record views: 307

File downloads: 250