Exploitation d'une ressource lexicale pour la construction d'un étiqueteur morpho-syntaxique état-de-l'art du français

Pascal Denis 1, Benoît Sagot 1
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
Abstract : This paper presents MEltfr, an automatic POS tagger for French. The system relies on a sequential probabilistic model that exploits information extracted from an external lexicon, namely Lefff. When evaluated on the FTB corpus, MEltfr achieves an accuracy of 97.75% (91.36% on unknown words) using a tagset of 29 categories. This corresponds to an error rate reduction of 18% (36.1% on unknown words) compared to the same model without Lefff information. We investigate the contribution of this resource in more detail through two sets of experiments. These reveal in particular that the Lefff features allow for increased coverage and a finer-grained modeling of the context to the right of a word.
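The abstract gives accuracies and error rate reductions but not the accuracy of the model without Lefff information. The sketch below back-computes that implied baseline. It assumes the reduction is relative, i.e. (baseline error - new error) / baseline error, which is the usual reading but is not spelled out in the abstract. The function name is hypothetical, and the results are only approximate because the published percentages are rounded.

```python
def implied_baseline_accuracy(accuracy_pct: float, error_reduction_pct: float) -> float:
    """Back out the baseline accuracy (in %) from a reported accuracy and a
    relative error rate reduction (in %).

    Assumption (not stated in the abstract): the reduction is relative,
    so new_error = baseline_error * (1 - reduction).
    """
    new_error = 100.0 - accuracy_pct
    baseline_error = new_error / (1.0 - error_reduction_pct / 100.0)
    return 100.0 - baseline_error


# Figures taken from the abstract: overall and unknown-word results.
overall_baseline = implied_baseline_accuracy(97.75, 18.0)
unknown_baseline = implied_baseline_accuracy(91.36, 36.1)

print(f"Implied baseline accuracy (all words):     {overall_baseline:.2f}%")
print(f"Implied baseline accuracy (unknown words): {unknown_baseline:.2f}%")
```

Under this assumption, the model without Lefff features would score roughly 97.26% overall and 86.48% on unknown words. These values are derived here for illustration and are not quoted from the paper.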
Document type :
Conference papers

https://hal.inria.fr/inria-00514364
Contributor : Pascal Denis
Submitted on : Thursday, September 2, 2010 - 8:57:46 AM
Last modification on : Friday, January 4, 2019 - 5:33:24 PM

Identifiers

  • HAL Id : inria-00514364, version 1

Citation

Pascal Denis, Benoît Sagot. Exploitation d'une ressource lexicale pour la construction d'un étiqueteur morpho-syntaxique état-de-l'art du français. Traitement automatique des langues naturelles, Association pour le Traitement Automatique des Langues, Jul 2010, Montréal, Canada. ⟨inria-00514364⟩
