LORIA System for the WMT12 Quality Estimation Shared Task

David Langlois 1, Sylvain Raybaud 1, Kamel Smaïli 1
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In this paper we present the system we submitted to the WMT12 shared task on Quality Estimation. Each translated sentence is given a score between 1 and 5. The score is obtained from several numerical or boolean features computed from the source and target sentences. We perform a linear regression of the feature space against scores in the range [1:5]. To this end, we use a Support Vector Machine. We experiment with two kernels: linear and radial basis function. In our submission we use the features from the shared task baseline system together with our own features, for a total of 66 features. To deal with this large number of features, we propose an in-house feature selection algorithm. Our results show that much of the information is already present in the baseline features, and that our feature selection algorithm discards features which are linearly correlated.
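The two kernels named in the abstract can be illustrated with a minimal sketch. It shows only the standard textbook definitions of the linear and radial basis function kernels applied to toy feature vectors; the function names, the `gamma` value, and the example vectors are assumptions for illustration and are not taken from the paper or its system.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2).

    gamma is a hypothetical default chosen for the example only.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Toy two-dimensional feature vectors (illustrative values only).
sentence_a = [1.0, 2.0]
sentence_b = [3.0, 4.0]

similarity_linear = linear_kernel(sentence_a, sentence_b)
similarity_rbf = rbf_kernel(sentence_a, sentence_b)
```

With the linear kernel the regression function is linear in the original features, whereas the radial basis function kernel compares sentences by the distance between their feature vectors, which lets the model fit non-linear relations between features and scores.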
Document type:
Conference paper
NAACL 2012 - The Seventh Workshop on Statistical Machine Translation, Jun 2012, Montréal, Canada. Association for Computational Linguistics, pp.114--119, 2012, Statistical Machine Translation

Cited literature [14 references]

https://hal.inria.fr/hal-00726372
Contributor: David Langlois
Submitted on: Wednesday, November 15, 2017 - 10:30:11
Last modified on: Thursday, January 11, 2018 - 06:25:24
Document(s) archived on: Friday, February 16, 2018 - 12:48:05

File

wmt2012_langlois_raybaud_smail...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00726372, version 1

Citation

David Langlois, Sylvain Raybaud, Kamel Smaïli. LORIA System for the WMT12 Quality Estimation Shared Task. NAACL 2012 - The Seventh Workshop on Statistical Machine Translation, Jun 2012, Montréal, Canada. Association for Computational Linguistics, pp.114--119, 2012, Statistical Machine Translation. 〈hal-00726372〉
