inria-00579335, version 1

Phrase-based machine translation based on text mining and statistical language modeling techniques

Chiraz Latiri 1, Kamel Smaïli 2, Caroline Lavecchia 2, Cyrine Nasri 1, David Langlois 2

12th International Conference on Intelligent Text Processing and Computational Linguistics - CICLing2011 (2011)

Abstract: In this paper, we introduce two new methods for phrase-based machine translation. Both mine a parallel corpus to find pairs of linguistic units that are translations of each other. Unlike the usual practice in the statistical machine translation community, the presented methods do not rely on any alignment. Each produces a complete translation table containing translations of single words and phrases. The first method is inspired by the well-known trigger language model, while the second is inspired by association rule mining. All experiments are conducted on a large part of the EUROPARL corpus and highlight the utility of both proposed approaches.
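
As a rough illustration of the general idea in the abstract (scoring cross-lingual word pairs directly from sentence-level co-occurrence, without a word alignment step), the following minimal Python sketch may help. It is not the authors' implementation: the function name `cooccurrence_scores`, the toy corpus, and the choice of pointwise mutual information as a stand-in for a trigger-style score are all assumptions made for this example, and the support and confidence values are the standard association-rule measures rather than the paper's exact formulas, which this record does not give.

```python
import math
from collections import Counter
from itertools import product


def cooccurrence_scores(sentence_pairs):
    """Score (source word, target word) pairs from sentence-aligned text.

    Hypothetical helper for illustration only. Returns a dict mapping
    (src_word, tgt_word) -> (pmi, support, confidence), where counts are
    numbers of sentence pairs containing the word(s).
    """
    n = len(sentence_pairs)
    src_count, tgt_count, joint_count = Counter(), Counter(), Counter()

    for src_sentence, tgt_sentence in sentence_pairs:
        # Use sets so each word counts once per sentence pair.
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint_count.update(product(src_words, tgt_words))

    scores = {}
    for (s, t), c in joint_count.items():
        # Pointwise mutual information: log2( P(s,t) / (P(s) * P(t)) ).
        pmi = math.log2(c * n / (src_count[s] * tgt_count[t]))
        support = c / n                 # fraction of pairs containing both words
        confidence = c / src_count[s]   # estimate of P(t present | s present)
        scores[(s, t)] = (pmi, support, confidence)
    return scores


# Toy sentence-aligned corpus (invented for this example).
toy_corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]
scores = cooccurrence_scores(toy_corpus)

# Rank candidate translations of "house" by PMI, best first.
house_candidates = sorted(
    ((t, vals) for (s, t), vals in scores.items() if s == "house"),
    key=lambda item: item[1][0],
    reverse=True,
)
```

In this sketch, keeping only the top-scoring target words for each source word would give a crude single-word translation table; extending the counting from words to word sequences is what a phrase-level table would additionally require.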

  • 1: Unité de Recherche en Programmation Algorithmique et Heuristique (URPAH), Faculté des Sciences de Tunis
  • 2: PAROLE (INRIA Lorraine - LORIA), INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
  • Domain : Computer Science/Information Retrieval
    Computer Science/Artificial Intelligence
  • Keywords : Statistical machine translation – Sequence mining – Inter-lingual triggers – Inter-lingual association rules – Bilingual corpora
 
  • oai:hal.inria.fr:inria-00579335
  • Submitted on: Wednesday, 23 March 2011 15:27:09
  • Updated on: Thursday, 24 March 2011 10:02:00