
Trouver et confondre les coupables : un processus sophistiqué de correction de lexique (Finding and exposing the culprits: a sophisticated lexicon correction process)

Abstract: The coverage of a parser depends mostly on the quality of the underlying grammar and lexicon. Developing a lexicon that is both complete and accurate is an intricate and demanding task, especially when aiming for a high level of quality and coverage. We introduce an automatic process that detects missing or incomplete entries in a lexicon and suggests correction hypotheses for these entries. The detection of dubious lexical entries is tackled by two techniques, relying either on a specific statistical model or on the information provided by a part-of-speech tagger. Correction hypotheses for the detected entries are generated by studying which modifications would improve the parse rate of the sentences in which the entries occur. This process brings together various techniques based on different tools, such as taggers, parsers and entropy classifiers. Applying it to the Lefff, a large-coverage morphological and syntactic French lexicon, has already allowed us to perform noticeable improvements.
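The two steps the abstract describes can be sketched in miniature: flag an entry as dubious when a POS tagger disagrees with the lexicon's category for it, then rank candidate corrections by how many previously unparseable sentences each one repairs. This is a minimal illustrative sketch, not the paper's actual method: the toy data, the stub parser, and the function names (`suspicion_score`, `rank_corrections`) are all hypothetical and stand in for the real statistical models, taggers and parsers used with the Lefff.

```python
from collections import Counter

def suspicion_score(entry_tag, tagger_votes):
    """Fraction of occurrences on which a POS tagger disagrees with the
    lexicon's category for a form -- a toy version of the tagger-based
    detection signal described in the abstract (the other signal being
    a dedicated statistical model)."""
    votes = Counter(tagger_votes)
    total = sum(votes.values())
    return 1.0 - votes[entry_tag] / total if total else 0.0

def rank_corrections(candidates, failing_sentences, parses_with):
    """Rank candidate corrections by how many previously failing
    sentences become parseable once the correction is applied.
    `parses_with(candidate, sentence)` is a hypothetical parser hook."""
    scored = [(sum(bool(parses_with(c, s)) for s in failing_sentences), c)
              for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

# Toy scenario: the lexicon lists "ferme" only as a noun, but the tagger
# mostly labels its occurrences as verbs.
score = suspicion_score("NOUN", ["VERB", "VERB", "NOUN", "VERB"])

best = rank_corrections(
    ["add VERB reading", "add ADJ reading"],
    ["sentence 1", "sentence 2"],
    # Stub parser: only the verb reading makes the failing sentences parse.
    lambda c, s: c == "add VERB reading",
)
```

With these toy inputs the entry scores 0.75 (the tagger disagrees three times out of four), and the verb-reading correction is ranked first because it repairs both failing sentences.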
Document type :
Conference papers

Cited literature [14 references]
Contributor: Eric Villemonte de la Clergerie
Submitted on : Thursday, January 6, 2011 - 9:58:39 PM
Last modification on : Tuesday, January 11, 2022 - 11:16:23 AM
Long-term archiving on: Thursday, April 7, 2011 - 2:33:28 AM




HAL Id: inria-00553257, version 1


Lionel Nicolas, Benoît Sagot, Miguel Molinero, Jacques Farré, Éric Villemonte de la Clergerie. Trouver et confondre les coupables : un processus sophistiqué de correction de lexique. 16ème conférence sur le Traitement Automatique des Langues Naturelles : TALN'09, ATALA ; LIPN, Jun 2009, Senlis, France. ⟨inria-00553257⟩


