Conference paper, Year: 2022

T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation

Abstract

We present a new approach to zero-shot cross-modal transfer between speech and text for translation tasks. Multilingual speech and text are encoded in a joint fixed-size representation space. We then compare different approaches to decoding these multimodal, multilingual fixed-size representations, enabling zero-shot translation between languages and modalities. All our models are trained without the need for cross-modal labeled translation data. Despite the fixed-size representation, we achieve very competitive results on several text and speech translation tasks. In particular, we outperform the state of the art for zero-shot speech translation on MuST-C. We also introduce the first results for zero-shot direct speech-to-speech and text-to-speech translation.
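The modular idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the real encoders and decoders are trained neural networks, whereas the functions below are hypothetical stand-ins. The names `EMBED_DIM`, `pool_fixed_size`, `TranslationModules`, and the toy encoders and decoder are all assumptions made for illustration, and max-pooling is only one possible way to obtain a fixed-size vector. The sketch shows only the structural point: once every encoder emits a vector of the same size in one shared space, any encoder can be paired with any decoder, including pairs never trained together.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
EMBED_DIM = 4  # hypothetical size of the shared representation space


def pool_fixed_size(frames: Sequence[Sequence[float]]) -> Vector:
    """Max-pool a variable-length sequence of frames into one fixed-size vector."""
    if not frames:
        raise ValueError("need at least one frame")
    return [max(frame[i] for frame in frames) for i in range(EMBED_DIM)]


class TranslationModules:
    """Registry of independent encoders and decoders sharing one space."""

    def __init__(self) -> None:
        self.encoders: Dict[str, Callable[..., Vector]] = {}
        self.decoders: Dict[str, Callable[[Vector], object]] = {}

    def translate(self, source: str, target: str, item: object) -> object:
        # Encode with the source module, then decode with the target module.
        embedding = self.encoders[source](item)
        if len(embedding) != EMBED_DIM:
            raise ValueError("encoder output is not in the shared space")
        return self.decoders[target](embedding)


def toy_text_encoder(tokens: List[str]) -> Vector:
    # Stand-in for a text encoder: one made-up feature frame per token.
    frames = [[float(len(t)), float(t.count("a")), 1.0, 0.0] for t in tokens]
    return pool_fixed_size(frames)


def toy_speech_encoder(frames: List[List[float]]) -> Vector:
    # Stand-in for a speech encoder: the input is already a list of frames.
    return pool_fixed_size(frames)


def toy_text_decoder(embedding: Vector) -> str:
    # Stand-in for a text decoder: it only formats the embedding as a string.
    return "text<" + ",".join(f"{x:.1f}" for x in embedding) + ">"


modules = TranslationModules()
modules.encoders["text"] = toy_text_encoder
modules.encoders["speech"] = toy_speech_encoder
modules.decoders["text"] = toy_text_decoder

# The speech encoder and text decoder were registered independently,
# yet they compose because both use the same fixed-size space.
speech_input = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.9]]
speech_to_text = modules.translate("speech", "text", speech_input)
```

In this sketch, adding a new language or modality means registering one more encoder or decoder, without touching the existing modules.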
Main file
T_modules___EMNLP_2022___8_pages-3.pdf (767.66 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03834732, version 1 (30-10-2022)

Identifiers

  • HAL Id: hal-03834732, version 1

Cite

Paul-Ambroise Duquenne, Hongyu Gong, Benoît Sagot, Holger Schwenk. T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation. EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Dec 2022, Abu Dhabi, United Arab Emirates. ⟨hal-03834732⟩

Collections

INRIA INRIA2
45 views
92 downloads
