LORIA System for the WMT13 Quality Estimation Shared Task

David Langlois 1, *, Kamel Smaïli 1
* Corresponding author
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In this paper we present the system we submitted to the WMT13 shared task on Quality Estimation. We participated in Task 1.1, in which each translated sentence is given a score between 0 and 1. The score is obtained from several numerical or boolean features computed from the source and target sentences. We perform a linear regression of the feature space against scores in the range [0..1]; to this end, we use a Support Vector Machine with 66 features. We propose to increase the size of the training corpus: we add the post-edited and reference corpora to the training step after assigning a score to each sentence of these corpora, and we then tune these scores on a development corpus. This leads to an improvement of 10.5% on the development corpus in terms of Mean Absolute Error (MAE), but achieves only a slight improvement on the test corpus.
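The evaluation measure named in the abstract (MAE) and the relative improvement it reports can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes MAE is the mean absolute difference between predicted and gold scores, and that the improvement is the relative MAE reduction over a baseline. The function names and the four sample scores are hypothetical and do not come from the paper.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Average of |predicted - gold| over all sentences."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(predicted, gold)) / len(gold)


def relative_improvement(baseline_mae: float, system_mae: float) -> float:
    """Relative MAE reduction in percent (positive means the system is better)."""
    return 100.0 * (baseline_mae - system_mae) / baseline_mae


# Hypothetical sentence-level scores in [0, 1] (illustrative only, not from the paper).
gold = [0.20, 0.50, 0.80, 0.40]
baseline = [0.40, 0.50, 0.60, 0.60]   # e.g. a system trained on the original corpus
augmented = [0.30, 0.50, 0.70, 0.50]  # e.g. a system trained on the enlarged corpus

baseline_mae = mean_absolute_error(baseline, gold)
augmented_mae = mean_absolute_error(augmented, gold)
gain = relative_improvement(baseline_mae, augmented_mae)
print(f"baseline MAE={baseline_mae:.3f}, augmented MAE={augmented_mae:.3f}, gain={gain:.1f}%")
```

Because lower MAE is better, a positive relative improvement means the second system predicts scores closer to the gold scores.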
Document type:
Conference paper
ACL 2013 - Eighth Workshop on Statistical Machine Translation, Aug 2013, Sofia, Bulgaria. pp.380 - 385, 2013

https://hal.inria.fr/hal-00923623
Contributor: David Langlois
Submitted on: Wednesday, November 15, 2017 - 10:46:28
Last modified on: Thursday, January 11, 2018 - 06:25:24
Document(s) archived on: Friday, February 16, 2018 - 14:02:34

File

wmt2013_langlois_smaili_prepri...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00923623, version 1

Citation

David Langlois, Kamel Smaïli. LORIA System for the WMT13 Quality Estimation Shared Task. ACL 2013 - Eighth Workshop on Statistical Machine Translation, Aug 2013, Sofia, Bulgaria. pp.380 - 385, 2013. 〈hal-00923623〉
