Training phrase-based SMT without explicit word alignment

Cyrine Nasri; Kamel Smaïli; Chiraz Latiri

Conference paper, Year: 2014


Cyrine Nasri

Role: Author
PersonId : 929496

Statistical Machine Translation and Speech Modelization and Text

Kamel Smaïli

Role: Author
PersonId : 2521
IdHAL : kamel-smaili
IdRef : 034429700

Statistical Machine Translation and Speech Modelization and Text

Chiraz Latiri

Role: Author
PersonId : 929497

URPAH Tunis

Abstract

Machine translation systems usually build an initial word-to-word alignment before training the phrase translation pairs. This approach requires a lot of matching between individual words of the two languages considered. In this paper, we propose a new approach for phrase-based machine translation that does not require any word alignment. The method is based on inter-lingual triggers retrieved by Multivariate Mutual Information. The algorithm segments sentences into phrases and finds their alignments simultaneously. The main objective of this work is to directly build valid alignments between source and target phrases. The results achieved are satisfactory in terms of performance, and the resulting translation table is smaller than the reference one; this approach could therefore be considered an alternative to the classical methods.
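The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea behind inter-lingual triggers: for each source word, rank the target words that co-occur with it in aligned sentence pairs by a mutual-information score. It uses a simplified pairwise (word-to-word) score rather than the Multivariate Mutual Information the authors describe, and the function name `interlingual_triggers`, the `top_k` parameter, and the per-sentence presence counting are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def interlingual_triggers(bitext, top_k=2):
    """Rank, for each source word, the target words it most strongly triggers.

    bitext: list of (source_tokens, target_tokens) sentence pairs.
    Probabilities are estimated from sentence-level presence counts
    (an assumption of this sketch, not taken from the paper).
    """
    n = len(bitext)
    src_count, tgt_count, joint_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        # Every source/target word pair seen in the same sentence pair.
        joint_count.update(product(src_set, tgt_set))

    scores = {}
    for (s, t), c in joint_count.items():
        p_joint = c / n
        p_s = src_count[s] / n
        p_t = tgt_count[t] / n
        # Pairwise score: P(s,t) * log(P(s,t) / (P(s) * P(t)))
        mi = p_joint * math.log(p_joint / (p_s * p_t))
        scores.setdefault(s, []).append((mi, t))

    # Highest score first; ties broken alphabetically for determinism.
    return {
        s: [t for _, t in sorted(cands, key=lambda x: (-x[0], x[1]))[:top_k]]
        for s, cands in scores.items()
    }
```

Calling `interlingual_triggers` on a small sentence-aligned corpus returns a dictionary mapping each source word to its best-scoring target words. Extending the score from single words to word sequences is the step that would move this toward phrase-level triggers.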

Keywords

Statistical Machine Translation; Inter-Lingual triggers; Multivariate Mutual Information

Domains

Computation and Language [cs.CL]

Main file

CICLING2014Nasri.pdf (262.26 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01067051

Submitted on: Monday, 22 September 2014, 18:06:40

Last modified on: Monday, 11 September 2023, 17:41:19

Dates and versions

hal-01067051, version 1 (22-09-2014)

Identifiers

HAL Id: hal-01067051, version 1

Cite

Cyrine Nasri, Kamel Smaïli, Chiraz Latiri. Training phrase-based SMT without explicit word alignment. 15th International Conference on Intelligent Text Processing and Computational Linguistics, Apr 2014, Kathmandu, Nepal. pp.233-241. ⟨hal-01067051⟩


Collections

CNRS INRIA UNIV-LORRAINE LORIA LORIA-NLPKD

268 views

22 downloads
