Mining Parsing Results for Lexical Correction: Toward a Complete Correction Process of Wide-Coverage Lexicons

Abstract: The coverage of a parser depends mostly on the quality of the underlying grammar and lexicon. Developing a lexicon that is both complete and accurate is an intricate and demanding task. We introduce an automatic process for detecting missing, incomplete and erroneous entries in a morphological and syntactic lexicon, and for suggesting correction hypotheses for these entries. The detection of dubious lexical entries is tackled by two different techniques: the first is based on a specific statistical model, and the second benefits from information provided by a part-of-speech tagger. Correction hypotheses for dubious lexical entries are generated by studying which modifications could improve the successful parse rate of the sentences in which they occur. This process brings together various techniques based on taggers, parsers and statistical models. We report on its application for improving a large-coverage morphological and syntactic French lexicon, the Lefff.
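As a rough illustration of the detection idea only, the sketch below ranks word forms by the share of unparsable sentences in which they occur. It is not the statistical model used in the paper, which the abstract does not specify. The function name `suspicion_scores`, the `min_occurrences` threshold, and the toy corpus (including the made-up form "zorble") are all assumptions introduced for this example.

```python
from collections import Counter


def suspicion_scores(parsed_sentences, min_occurrences=2):
    """Rank word forms by the fraction of their sentences that failed to parse.

    parsed_sentences: iterable of (tokens, parsed_ok) pairs, where tokens is a
    list of word forms and parsed_ok tells whether the parser succeeded.
    This is a naive failure-rate heuristic, not the paper's model.
    """
    total = Counter()
    failed = Counter()
    for tokens, parsed_ok in parsed_sentences:
        for form in set(tokens):  # count each form once per sentence
            total[form] += 1
            if not parsed_ok:
                failed[form] += 1
    # Ignore rare forms: a single failure says little about the entry.
    scores = {
        form: failed[form] / count
        for form, count in total.items()
        if count >= min_occurrences
    }
    # Most suspicious first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus: each sentence is paired with its parse outcome.
corpus = [
    (["le", "chat", "dort"], True),
    (["le", "chien", "zorble"], False),
    (["un", "chat", "zorble"], False),
    (["le", "chien", "dort"], True),
]
ranking = suspicion_scores(corpus)
for form, score in ranking:
    print(f"{form}\t{score:.2f}")
```

A form that only ever appears in failed sentences rises to the top of the ranking. A real detector would also have to separate a genuinely faulty entry from forms that merely co-occur with one, which this heuristic does not attempt.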
Document type:
Conference paper
Zygmunt Vetulani and Hans Uszkoreit. LTC 2007 - Third Language and Technology Conference, Oct 2007, Poznan, Poland. Springer, 5603, pp.178-191, 2009, Lecture Notes in Computer Science; Human Language Technology. Challenges of the Information Society. <10.1007/978-3-642-04235-5_16>

https://hal.inria.fr/hal-00793052
Contributor: Brigitte Briot
Submitted on: Thursday, February 21, 2013 - 14:46:29
Last modified on: Wednesday, October 12, 2016 - 01:23:17

Citation

Lionel Nicolas, Benoît Sagot, Miguel Molinero, Jacques Farré, Éric De La Clergerie. Mining Parsing Results for Lexical Correction: Toward a Complete Correction Process of Wide-Coverage Lexicons. Zygmunt Vetulani and Hans Uszkoreit. LTC 2007 - Third Language and Technology Conference, Oct 2007, Poznan, Poland. Springer, 5603, pp.178-191, 2009, Lecture Notes in Computer Science; Human Language Technology. Challenges of the Information Society. <10.1007/978-3-642-04235-5_16>. <hal-00793052>
