Phrase-Based Language Model in Statistical Machine Translation

Abstract: As one of the most important modules in statistical machine translation (SMT), the language model measures whether one translation hypothesis is more grammatically correct than other hypotheses. Current state-of-the-art SMT systems use standard word n-gram language models, whereas the translation model is phrase-based. In this paper, we propose using a phrase-based language model instead. To build it, the target-side phrases of the translation table are retrieved and used to rewrite the training corpus, from which a phrase n-gram language model is computed. We perform experiments with two language models: word-based (WBLM) and phrase-based (PBLM). The SMT systems are trained with three optimization algorithms: MERT, MIRA and PRO. The PBLM systems are then compared to the baseline system in terms of BLEU and TER. The experimental results show that using a phrase-based language model in SMT can improve translation results and is particularly effective at reducing the error rate.
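The corpus-rewriting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how sentences are segmented into phrases, so the greedy longest-match strategy, the `max_len` limit, the underscore joining of phrase words, the unsmoothed bigram estimate, and all function names and toy data below are assumptions made for illustration only.

```python
from collections import Counter


def segment(tokens, phrases, max_len=3):
    """Greedy left-to-right longest-match segmentation (an assumed strategy)."""
    units = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; a single word always matches.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if n == 1 or candidate in phrases:
                units.append("_".join(candidate))
                i += n
                break
    return units


def phrase_bigram_counts(corpus, phrases, max_len=3):
    """Rewrite each sentence as phrase units and count unit bigrams."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        units = ["<s>"] + segment(sentence.split(), phrases, max_len) + ["</s>"]
        unigrams.update(units[:-1])            # counts of bigram histories
        bigrams.update(zip(units, units[1:]))  # counts of adjacent unit pairs
    return unigrams, bigrams


def bigram_prob(prev, unit, unigrams, bigrams):
    """Unsmoothed maximum-likelihood estimate of P(unit | prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, unit)] / unigrams[prev]


# Hypothetical target-side phrases, standing in for a real translation table.
phrases = {("machine", "translation"), ("language", "model")}
corpus = [
    "the language model helps machine translation",
    "machine translation needs a language model",
]
unigrams, bigrams = phrase_bigram_counts(corpus, phrases)
print(segment(corpus[0].split(), phrases))
print(bigram_prob("<s>", "machine_translation", unigrams, bigrams))
```

In this sketch each multi-word phrase becomes a single vocabulary item, so the n-gram model conditions on phrases rather than words. A real system would add smoothing and would typically hand the rewritten corpus to a standard language-modeling toolkit.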
Document type:
Journal article
International Journal of Computational Linguistics and Applications, Alexander Gelbukh, 2016

Cited literature: 24 references

https://hal.inria.fr/hal-01336485
Contributor: Kamel Smaïli
Submitted on: Thursday, 23 June 2016 - 11:08:40
Last modified on: Tuesday, 24 April 2018 - 13:34:42

File

CICLING2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01336485, version 1

Citation

Achraf Ben Romdhane, Salma Jamoussi, Abdelmajid Ben Hamadou, Kamel Smaïli. Phrase-Based Language Model in Statistical Machine Translation. International Journal of Computational Linguistics and Applications, Alexander Gelbukh, 2016. 〈hal-01336485〉

Metrics

Record views

260

File downloads

31