A Framework for Multi-level Linguistic Annotation

Patrice Lopez (1), Laurent Romary (1)
(1) LANGUE ET DIALOGUE - Human-machine dialogue with a significant language component
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This article presents a 3-step model for multi-layer annotation of corpora. Each kind of annotation of a textual corpus corresponds to a different view on the same document. This principle can be expressed first with a general relational model dedicated to the organisation of LR. This abstract model is then implemented as an application of the XML formalism for the encoding of large corpora. The exploitation of this kind of annotated corpus requires efficient manipulation processes and reversive access. We propose to use a third-step representation based on a set of optimised FSA resulting from the parsing of the XML documents. These propositions have been implemented in the first version of a workbench dedicated to the French Le Monde corpus.
Document type:
Conference paper
LREC Workshop on Large Corpus Annotation and Software Standards, Data Architectures and Software Support for Large Corpora, May 2000, Athens, Greece. 2000

https://hal.inria.fr/inria-00525227
Contributor: Laurent Romary
Submitted on: Monday, 11 October 2010 - 14:30:23
Last modified on: Thursday, 11 January 2018 - 06:19:48
Document(s) archived on: Thursday, 30 June 2011 - 13:32:36

File

lopez-romary.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00525227, version 1

Citation

Patrice Lopez, Laurent Romary. A Framework for Multi-level Linguistic Annotation. LREC Workshop on Large Corpus Annotation and Software Standards, Data Architectures and Software Support for Large Corpora, May 2000, Athens, Greece. 2000. 〈inria-00525227〉

Metrics

Record views: 162

File downloads: 87