Modelling frequency data -- Methodological considerations on the relationship between dictionaries and corpora
Abstract
The research questions addressed in our paper stem from a bundle of linguistically focused projects which, among other activities, create glossaries and dictionaries intended to be usable both by human readers and by particular NLP applications. The paper comprises two parts: the first gives a concise overview of the projects and their goals; the second concentrates on encoding issues involved in the related dictionary production. Particular focus is put on modelling an encoding scheme for statistical information on lexicographic data gleaned from digital corpora.
Domains
Computation and Language [cs.CL]
Origin: Files produced by the author(s)