Word Confidence Estimation for SMT N-best List Re-ranking

Abstract : This paper proposes using Word Confidence Estimation (WCE) information to improve MT outputs via N-best list re-ranking. From the confidence label assigned to each word in the MT hypothesis, we add six scores to the baseline log-linear model in order to re-rank the N-best list. First, the correlation between the WCE-based sentence-level scores and the conventional evaluation scores (BLEU, TER, TERp-A) is investigated. Then, N-best list re-ranking is evaluated over different WCE system performance levels: from our real, efficient WCE system (ranked first in the WMT 2013 Quality Estimation Task) to an oracle WCE (which simulates an interactive scenario in which a user simply validates the words of an MT hypothesis and the new output is automatically regenerated). The results suggest that our real WCE system yields a slight but significant improvement over the baseline, whereas the oracle WCE improves it substantially; overall, better WCE leads to better MT quality.
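The re-ranking idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the six WCE-based scores, so the two features below (ratio of words labeled good, and the normalized longest run of good words), the "G"/"B" label convention, the `Hypothesis` class, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hypothesis:
    """One entry of an N-best list (hypothetical container for this sketch)."""
    words: List[str]
    labels: List[str]       # assumed WCE output: "G" (good) or "B" (bad) per word
    baseline_score: float   # log-linear model score assigned by the decoder


def wce_features(hyp: Hypothesis) -> List[float]:
    """Compute two illustrative sentence-level scores from word-level labels.

    These are stand-ins; the six scores used in the paper are not given
    in the abstract.
    """
    n = len(hyp.labels)
    if n == 0:
        return [0.0, 0.0]
    good_ratio = sum(1 for label in hyp.labels if label == "G") / n
    longest = run = 0
    for label in hyp.labels:
        run = run + 1 if label == "G" else 0
        longest = max(longest, run)
    return [good_ratio, longest / n]


def rerank(nbest: Sequence[Hypothesis], weights: Sequence[float]) -> List[Hypothesis]:
    """Sort hypotheses by baseline score plus weighted WCE-based scores."""
    def total(hyp: Hypothesis) -> float:
        extra = sum(w * f for w, f in zip(weights, wce_features(hyp)))
        return hyp.baseline_score + extra
    return sorted(nbest, key=total, reverse=True)


nbest = [
    Hypothesis(["a", "b", "c"], ["B", "B", "G"], -1.0),
    Hypothesis(["a", "d", "c"], ["G", "G", "G"], -1.2),
]
reranked = rerank(nbest, weights=[0.5, 0.5])
```

In this toy list the second hypothesis has a lower decoder score but all of its words are labeled good, so the added scores move it to the top. In practice the weights would be tuned on development data rather than set by hand.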
Document type :
Conference papers

Cited literature [27 references]

https://hal.inria.fr/hal-00953719
Contributor : Laurent Besacier
Submitted on : Friday, February 23, 2018 - 12:46:01 PM
Last modification on : Tuesday, February 12, 2019 - 1:31:31 AM
Document(s) archived on : Thursday, May 24, 2018 - 9:25:26 PM

File

eacl2014_cameraready.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00953719, version 1

Citation

Ngoc-Quang Luong, Laurent Besacier, Benjamin Lecouteux. Word Confidence Estimation for SMT N-best List Re-ranking. Proceedings of the Workshop on Humans and Computer-assisted Translation (HaCaT) during EACL, 2014, Gothenburg, Sweden. ⟨hal-00953719⟩

Metrics

Record views

539

File downloads

71