An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation

Abstract: During decoding, the Statistical Machine Translation (SMT) decoder traverses all complete paths in the Search Graph (SG), seeks those with the lowest costs, and back-tracks to read off the best translations. Although these winners beat the rest in model scores, there is no guarantee that they have the highest quality with respect to the human references. This paper exploits Word Confidence Estimation (WCE) scores in a second decoding pass to improve Machine Translation (MT) quality. By using the confidence score of each word in the N-best list to update the cost of the SG hypotheses containing it, we aim to "reinforce" or "weaken" those hypotheses according to word quality. After the update, the best translations are re-determined using the updated costs. In experiments with our real WCE scores and with ideal (oracle) ones, the oracle scores improve on the one-pass decoder by 7.87 BLEU points, while the real WCE scores yield an improvement of 1.49 BLEU points.
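The second-pass idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update formula, so the additive rule below, the `weight` and `neutral` parameters, and the `Hypothesis` and `rescore` names are all assumptions made for illustration. The sketch also works on a flat list of hypotheses rather than on a real search graph.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: list[str]
    cost: float  # decoder model cost; lower is better


def rescore(hypotheses, word_confidence, weight=1.0, neutral=0.5):
    """Re-rank hypotheses by a confidence-adjusted cost.

    Hypothetical update rule (an assumption, not taken from the paper):
    each word shifts the cost by weight * (confidence - neutral).
    """
    rescored = []
    for hyp in hypotheses:
        # Confidence above `neutral` lowers the cost ("reinforce");
        # confidence below it raises the cost ("weaken").
        # Words without a score are treated as neutral.
        adjustment = sum(
            word_confidence.get(word, neutral) - neutral for word in hyp.words
        )
        rescored.append(Hypothesis(hyp.words, hyp.cost - weight * adjustment))
    # The new best translation is the one with the lowest updated cost.
    return sorted(rescored, key=lambda h: h.cost)


# Toy data: the first-pass winner contains a low-confidence word.
first_pass = [
    Hypothesis(["the", "cat", "sat"], cost=1.0),
    Hypothesis(["the", "cat", "sits"], cost=1.2),
]
confidence = {"the": 0.9, "cat": 0.9, "sat": 0.1, "sits": 0.9}

ranked = rescore(first_pass, confidence)
best = ranked[0]
```

In this toy case the hypothesis that was second in the first pass becomes the best one after the update, because all of its words carry high confidence. With oracle scores, `confidence` would hold the true good/bad label of each word instead of an estimate.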
Document type:
Conference paper
European Association for Machine Translation (EAMT), Jun 2014, Dubrovnik, Croatia. 2014

https://hal.inria.fr/hal-01002922
Contributor: Laurent Besacier
Submitted on: Friday, 23 February 2018 - 12:44:19
Last modified on: Thursday, 11 October 2018 - 08:48:03
Document(s) archived on: Thursday, 24 May 2018 - 21:12:43

File

eamt14.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01002922, version 1

Citation

Luong Ngoc Quang, Laurent Besacier, Lecouteux Benjamin. An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation. European Association for Machine Translation (EAMT), Jun 2014, Dubrovnik, Croatia. 2014. 〈hal-01002922〉
