Conference papers

Speeding up corpus development for linguistic research: language documentation and acquisition in Romansh Tuatschin

Abstract: In this paper, we present ongoing work on developing language resources and basic NLP tools for an undocumented variety of Romansh, in the context of a language documentation and language acquisition project. Our tools are designed to improve the speed and reliability of corpus annotation for noisy data involving large amounts of code-switching, child speech, and orthographic noise. Increasing the efficiency of language resource development for language documentation and acquisition research is also a step towards solving the data sparsity issues with which researchers have been struggling.

https://hal.inria.fr/hal-01570614
Contributor: Benoît Sagot
Submitted on: Monday, July 31, 2017 - 7:08:29 PM
Last modification on: Thursday, August 29, 2019 - 2:24:09 PM

Citation

Géraldine Walther, Benoît Sagot. Speeding up corpus development for linguistic research: language documentation and acquisition in Romansh Tuatschin. Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Aug 2017, Vancouver, Canada. pp. 89-94, ⟨10.18653/v1/W17-2212⟩. ⟨hal-01570614⟩