
French text preprocessing with TTL

Abstract : In this paper we present some experiments on the building of French resources for the TTL POS tagger (Ion, 2007). TTL is a collection of interconnected text preprocessing modules (sentence splitter, tokenizer, tagger, lemmatizer and chunker) with resources for Romanian and English but with no resources available for French. We show how we develop the required POS tagging training corpus and that the average POS tagging accuracy for French exceeds 97% when TTL is trained on this corpus.
Document type : Journal articles

https://hal.inria.fr/hal-00867452
Contributor : Amalia Todirascu
Submitted on : Sunday, September 29, 2013 - 11:47:55 PM
Last modification on : Thursday, February 27, 2020 - 2:16:02 PM

Identifiers

  • HAL Id : hal-00867452, version 1

Citation

Amalia Todirascu, Radu Ion, Mirabela Navlea, Laurence Longo. French text preprocessing with TTL. Proceedings of Romanian Academy - Series A (Mathematics, Physics, Technical Sciences, Information Science), The Publishing House of the Romanian Academy, 2011, 12 (2), pp. 151-158. ⟨hal-00867452⟩
