Modelling frequency data -- Methodological considerations on the relationship between dictionaries and corpora

Abstract : The research questions addressed in our paper stem from a group of linguistically focused projects which, among other activities, create glossaries and dictionaries intended to be usable both by human readers and by particular NLP applications. The paper will comprise two parts: the first will give a concise overview of the projects and their goals, and the second will concentrate on encoding issues involved in the related dictionary production. Particular focus will be put on modelling an encoding scheme for statistical information on lexicographic data gleaned from digital corpora.
Document type :
Conference papers

https://hal.inria.fr/hal-00922068
Contributor : Laurent Romary
Submitted on : Monday, December 23, 2013 - 1:58:48 PM
Last modification on : Friday, March 22, 2019 - 2:22:12 PM
Document(s) archived on : Monday, March 24, 2014 - 12:35:08 AM

Files

tei_rome_abstract.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00922068, version 1

Citation

Gerhard Budin, Karlheinz Mörth, Laurent Romary. Modelling frequency data -- Methodological considerations on the relationship between dictionaries and corpora. TEI Conference 2013, Oct 2013, Roma, Italy. ⟨hal-00922068⟩

Metrics

Record views

442

Files downloads

449