Conference paper. Year: 1999

Discourse Data in DiET

Abstract

The DiET project provides systematically constructed and annotated test items and associated tools, enabling fast system debugging and evaluation, and automatic linkage from test items to real corpus instances. This paper concentrates on the discourse test suite and its use. The discourse test suite covers discourse phenomena such as pronouns, definites and ellipsis. These can be used to evaluate the coverage and accuracy of implementations of anaphora resolution algorithms. We also examine the text profiling support within the DiET tools. Text profiling identifies typical and salient corpus characteristics, e.g. the frequency and distribution of part-of-speech tags and vocabulary richness. Profiling also provides candidate sentences instantiating predefined syntactic phenomena. Profiling enables users to select test items appropriate to their domain-specific corpus. The paper shows how the corpus search engine can be used to identify discourse phenomena in a corpus and presents concrete results of this evaluation scenario.

Domains

Linguistics
Main file
diet_discourse99.pdf (344.21 KB)
Origin: Files produced by the author(s)

Dates and versions

halshs-01322332, version 1 (27-05-2016)

Identifiers

  • HAL Id: halshs-01322332, version 1

Cite

Ian Lewin, Pierrette Bouillon, Sabine Lehmann, David Milward, Ludovic Tanguy. Discourse Data in DiET. EACL Workshop on Linguistically Interpreted Corpora (LINC), 1999, Bergen, Norway. ⟨halshs-01322332⟩
