Speech technology for unwritten languages
Odette Scharenborg (1), Laurent Besacier (2), Alan Black (3), Mark Hasegawa-Johnson (4), Florian Metze (3), Graham Neubig (3), Sebastian Stuker (5), Pierre Godard (6), Markus Müller (5), Lucas Ondel (7, 8), Shruti Palaskar (3), Philip Arthur (3), Francesco Ciannella (3), Mingxing Du (9), Elin Larsen (9), Danny Merkx (10), Rachid Riad (9), Liming Wang (4), Emmanuel Dupoux (9, 11)
1 TU Delft - Delft University of Technology
2 GETALP - Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole
3 CMU - Carnegie Mellon University [Pittsburgh]
4 Department of Computer Science [Illinois]
5 KIT - Institute for Anthropomatics
6 TLP - Traitement du Langage Parlé
7 BUT - Brno University of Technology [Brno]
8 JHU - Johns Hopkins University
9 LSCP - Laboratoire de sciences cognitives et psycholinguistique
10 Radboud University [Nijmegen]
11 CoML - Apprentissage machine et développement cognitif
Laurent Besacier
- Role: Author
- PersonId: 1521
- IdHAL: laurent-besacier
- ORCID: 0000-0001-7411-9125
- IdRef: 079377017
Pierre Godard
- Role: Author
- PersonId: 762183
- ORCID: 0000-0002-5402-6033
- IdRef: 140798072
Lucas Ondel
- Role: Author
- PersonId: 750844
- IdHAL: lucas-ondel
- ORCID: 0000-0003-4512-0471
Emmanuel Dupoux
- Role: Author
- PersonId: 857216
Abstract
Speech technology plays an important role in our everyday life. Speech is used, among other things, for human-computer interaction, for instance in information retrieval and online shopping. For an unwritten language, however, speech technology is difficult to build, because the standard approach of combining pre-trained speech-to-text and text-to-speech subsystems is not available. The research presented in this paper takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test whether the learned representations are sufficient to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that it is possible to build systems that go directly from speech to meaning and from meaning to speech, bypassing the need for text.