An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation

Abstract : During decoding, the Statistical Machine Translation (SMT) decoder traverses all complete paths in the Search Graph (SG), seeks those with the lowest costs, and backtracks to read off the best translations. Although these winners beat the rest in model scores, there is no guarantee that they have the highest quality with respect to human references. This paper exploits Word Confidence Estimation (WCE) scores in a second decoding pass to improve Machine Translation (MT) quality. By using the confidence score of each word in the N-best list to update the cost of the SG hypotheses containing it, we hope to "reinforce" or "weaken" them based on word quality. After the update, the best translations are re-determined using the updated costs. In experiments with our real WCE scores and with ideal (oracle) ones, the latter significantly boosts the one-pass decoder by 7.87 BLEU points, while the former yields an improvement of 1.49 points on the same metric.
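
The second-pass idea in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the search graph is reduced to a small DAG of word-labelled edges, the names `Edge`, `best_path` and `rescore` are hypothetical, and the additive update rule (with its `weight` and `threshold` parameters) is just one simple way to reward confident words and penalise doubtful ones. The abstract does not specify the actual update functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One hypothesis extension in a toy search graph (hypothetical structure)."""
    src: int
    dst: int
    word: str
    cost: float  # lower is better


def best_path(edges, start, goal):
    """Return (cost, words) of the cheapest path from start to goal.

    Assumes node ids are topologically ordered, i.e. src < dst on every edge.
    """
    best = {start: (0.0, [])}
    for e in sorted(edges, key=lambda e: e.src):
        if e.src not in best:
            continue  # source node is unreachable from start
        cost, words = best[e.src]
        candidate = (cost + e.cost, words + [e.word])
        if e.dst not in best or candidate[0] < best[e.dst][0]:
            best[e.dst] = candidate
    return best[goal]


def rescore(edges, confidence, weight=1.0, threshold=0.5):
    """Second pass: shift each edge cost using a word confidence score.

    Assumed rule (not from the paper): words scoring above `threshold`
    get cheaper, words scoring below it get more expensive, and words
    without a score keep their original cost.
    """
    updated = []
    for e in edges:
        score = confidence.get(e.word)
        if score is None:
            updated.append(e)
            continue
        delta = weight * (threshold - score)
        updated.append(Edge(e.src, e.dst, e.word, e.cost + delta))
    return updated


# Toy graph: two competing hypotheses for the second word.
edges = [
    Edge(0, 1, "the", 1.0),
    Edge(1, 2, "house", 1.0),
    Edge(1, 2, "home", 1.2),
]
# Made-up confidence scores standing in for WCE output.
confidence = {"house": 0.2, "home": 0.9}

first_pass = best_path(edges, 0, 2)
second_pass = best_path(rescore(edges, confidence), 0, 2)
print(first_pass[1])   # ['the', 'house']
print(second_pass[1])  # ['the', 'home']
```

In this toy case the first pass prefers "house" on model cost alone, while the rescored graph prefers "home" because its made-up confidence score is higher. A real decoder would operate on the full search graph and on scores produced by a trained WCE system.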
Document type :
Conference papers

Cited literature : 19 references

https://hal.inria.fr/hal-01002922
Contributor : Laurent Besacier
Submitted on : Friday, February 23, 2018 - 12:44:19 PM
Last modification on : Monday, July 8, 2019 - 3:10:29 PM
Long-term archiving on : Thursday, May 24, 2018 - 9:12:43 PM

File

eamt14.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01002922, version 1

Citation

Ngoc Quang Luong, Laurent Besacier, Benjamin Lecouteux. An Efficient Two-Pass Decoder for SMT Using Word Confidence Estimation. European Association for Machine Translation (EAMT), Jun 2014, Dubrovnik, Croatia. ⟨hal-01002922⟩
