Conference paper, Year: 2019

The University of Edinburgh’s Submissions to the WMT19 News Translation Task

Abstract

The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English↔Gujarati, English↔Chinese, German→English, and English→Czech. For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data. For English↔Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training, and translation pivoting through Hindi. For translation to and from Chinese, we investigated character-based tokenisation vs. sub-word segmentation of Chinese text. For German→English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018). For English→Czech, we compared different pre-processing and tokenisation regimes.
Main file
UEDIN_at_WMT19.pdf (332.47 KB)

Dates and versions

hal-02986330, version 1 (03-11-2020)

Identifiers

  • HAL Id: hal-02986330, version 1

Cite

Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, et al. The University of Edinburgh’s Submissions to the WMT19 News Translation Task. 4th Conference on Machine Translation, 2019, Florence, Italy. ⟨hal-02986330⟩