Discovering Phrases in Machine Translation by Simulated Annealing

Caroline Lavecchia (1), David Langlois (1), Kamel Smaïli (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In this paper, we propose a new phrase-based translation model based on inter-lingual triggers. The originality of our method is twofold. First, we identify common source phrases. Then, we use inter-lingual triggers to retrieve their translations. Furthermore, we treat the extraction of phrase translations as an optimization problem: we use a simulated annealing algorithm to find the best phrase translations among all those determined by inter-lingual triggers. The best phrases are those that improve translation quality in terms of BLEU score. Experiments are carried out on the proceedings of the European Parliament corpus. Training uses a corpus of 596K parallel sentences (French-English), and testing uses a corpus of 1444 sentences. With only 8.1% of the identified source phrases occurring in the test corpus, our system outperforms the baseline model by almost 3 points.
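The selection step described in the abstract can be illustrated with a minimal simulated annealing sketch. This is not the authors' implementation: the function names, the geometric cooling schedule, the "toggle one candidate" neighbor move, and the toy gain values are all assumptions made for illustration. In particular, `toy_score` is a hypothetical stand-in for the BLEU score, which in a real system would require decoding a development corpus with the selected phrase pairs.

```python
import math
import random


def simulated_annealing(candidates, score, steps=2000,
                        t_start=1.0, t_end=0.01, seed=0):
    """Search for a subset of `candidates` that maximizes `score`.

    `score` maps a frozenset of candidates to a float. It stands in for
    translation quality; the schedule and move below are assumptions.
    """
    rng = random.Random(seed)
    current = frozenset()              # start with no phrase pairs selected
    current_score = score(current)
    best, best_score = current, current_score

    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        progress = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** progress

        # Neighbor: toggle one randomly chosen candidate in or out.
        neighbor = current ^ {rng.choice(candidates)}
        neighbor_score = score(neighbor)

        # Always accept improvements; accept a worse neighbor with
        # probability exp(delta / temperature), which shrinks while cooling.
        delta = neighbor_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = neighbor, neighbor_score
            if current_score > best_score:
                best, best_score = current, current_score

    return best, best_score


# Hypothetical candidate phrase pairs with made-up quality contributions.
TOY_GAIN = {
    ("pomme de terre", "potato"): 0.8,
    ("pomme de terre", "apple of earth"): -0.5,
    ("mise à jour", "update"): 0.6,
    ("mise à jour", "put to day"): -0.4,
}


def toy_score(selected):
    """Toy objective: sum of the assumed gains of the selected pairs."""
    return sum(TOY_GAIN[pair] for pair in selected)


best, best_score = simulated_annealing(list(TOY_GAIN), toy_score)
for source, target in sorted(best):
    print(f"{source} -> {target}")
```

Accepting occasional worse subsets early on is what lets the search escape local optima; as the temperature drops, the search behaves more and more like plain hill climbing.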
Document type:
Conference paper
INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Sep 2008, Brisbane, Australia. pp.2354-2357, 2008


https://hal.inria.fr/inria-00331327
Contributor: Caroline Lavecchia
Submitted on: Thursday, October 16, 2008 - 12:46:11
Last modified on: Thursday, January 11, 2018 - 06:19:56
Document(s) archived on: Monday, June 7, 2010 - 20:16:25

File

CarolineLavecchia.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00331327, version 1


Citation

Caroline Lavecchia, David Langlois, Kamel Smaïli. Discovering Phrases in Machine Translation by Simulated Annealing. INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Sep 2008, Brisbane, Australia. pp.2354-2357, 2008. 〈inria-00331327〉

