An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Communication Dans Un Congrès Année : 2014

An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation

Résumé

During decoding, the Statistical Machine Translation (SMT) decoder travels over all complete paths on the Search Graph (SG), seeks those with cheapest costs and back-tracks to read off the best translations. Although these winners beat the rest in model scores, there is no certain guarantee that they have the highest quality with respect to the human references. This paper exploits Word Confidence Estimation (WCE) scores in the second pass of decoding to enhance the Machine Translation (MT) quality. By using the confidence score of each word in the N-best list to update the cost of SG hypotheses containing it, we hope to " reinforce " or " weaken " them relied on word quality. After the update, new best translations are re-determined using updated costs. In the experiments on our real WCE scores and ideal (oracle) ones, the latter significantly boosts one-pass de-coder by 7.87 BLEU points, meanwhile the former yields an improvement of 1.49 points for the same metric.
Fichier principal
Vignette du fichier
eamt14.pdf (386.43 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-01002922 , version 1 (23-02-2018)

Identifiants

  • HAL Id : hal-01002922 , version 1

Citer

Ngoc Quang Luong, Laurent Besacier, Benjamin Lecouteux. An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation. European Association for Machine Translation (EAMT), Jun 2014, Dubrovnik, Croatia. ⟨hal-01002922⟩
176 Consultations
71 Téléchargements

Partager

Gmail Facebook X LinkedIn More