
inria-00417541, version 1

Word- and sentence-level confidence measures for machine translation

Sylvain Raybaud (corresponding author) 1, Caroline Lavecchia 1, David Langlois 1, Kamel Smaïli 1

13th Annual Meeting of the European Association for Machine Translation - EAMT 09 (2009)

Abstract: A machine-translated sentence is seldom completely correct. Confidence measures are designed to detect incorrect words, phrases or sentences, or to provide an estimate of the probability of correctness. In this article we describe several word- and sentence-level confidence measures relying on different features: mutual information between words, n-gram and backward n-gram language models, and linguistic features. We also try different combinations of these measures. Their accuracy is evaluated on a classification task. We achieve a 17% error rate (0.84 F-measure) at the word level and a 31% error rate (0.71 F-measure) at the sentence level.

  • 1: PAROLE (INRIA Lorraine - LORIA)
  • INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
  • Domain: Computer science/Text and document processing
  • Keywords: confidence measure – machine translation – mutual information
 
  • oai:hal.inria.fr:inria-00417541
  • Contributor:
  • Submitted on: Wednesday, 16 September 2009, 10:43:29
  • Last modified on: Wednesday, 7 October 2009, 10:47:19