Speech technology for unwritten languages

Odette Scharenborg; Laurent Besacier; Alan Black; Mark Hasegawa-Johnson; Florian Metze; Graham Neubig; Sebastian Stüker; Pierre Godard; Markus Müller; Lucas Ondel; Shruti Palaskar; Philip Arthur; Francesco Ciannella; Mingxing Du; Elin Larsen; Danny Merkx; Rachid Riad; Liming Wang; Emmanuel Dupoux

doi:10.1109/TASLP.2020.2973896

Journal article: IEEE/ACM Transactions on Audio, Speech and Language Processing. Year: 2020

Odette Scharenborg

Role: Author

Delft University of Technology

Laurent Besacier

Role: Author
PersonId: 1521
IdHAL: laurent-besacier
ORCID: 0000-0001-7411-9125
IdRef: 079377017

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Alan Black

Role: Author

Carnegie Mellon University [Pittsburgh]

Mark Hasegawa-Johnson

Role: Author

Department of Computer Science [Illinois]

Florian Metze

Role: Author

Carnegie Mellon University [Pittsburgh]

Graham Neubig

Role: Author

Carnegie Mellon University [Pittsburgh]

Sebastian Stüker

Role: Author

Institute for Anthropomatics

Pierre Godard

Role: Author
PersonId: 762183
ORCID: 0000-0002-5402-6033
IdRef: 140798072

Traitement du Langage Parlé

Markus Müller

Role: Author
PersonId: 942224

Institute for Anthropomatics

Lucas Ondel

Role: Author
PersonId: 750844
IdHAL: lucas-ondel
ORCID: 0000-0003-4512-0471

Brno University of Technology [Brno]

Johns Hopkins University

Shruti Palaskar

Role: Author

Carnegie Mellon University [Pittsburgh]

Philip Arthur

Role: Author

Carnegie Mellon University [Pittsburgh]

Francesco Ciannella

Role: Author

Carnegie Mellon University [Pittsburgh]

Mingxing Du

Role: Author

Laboratoire de sciences cognitives et psycholinguistique

Elin Larsen

Role: Author

Laboratoire de sciences cognitives et psycholinguistique

Danny Merkx

Role: Author

Radboud University [Nijmegen]

Rachid Riad

Role: Author

Laboratoire de sciences cognitives et psycholinguistique

Liming Wang

Role: Author

Department of Computer Science [Illinois]

Emmanuel Dupoux

Role: Author
PersonId: 857216

Laboratoire de sciences cognitives et psycholinguistique

Apprentissage machine et développement cognitif

Abstract

Speech technology plays an important role in our everyday life. Among other things, speech is used for human-computer interaction, for instance in information retrieval and online shopping. For an unwritten language, however, speech technology is difficult to create, because it cannot be built from the standard combination of pre-trained speech-to-text and text-to-speech subsystems. The research presented in this paper takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test the sufficiency of the learned representations to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that building systems that go directly from speech to meaning and from meaning to speech, bypassing the need for text, is possible.

Domains

Computation and Language [cs.CL]

Contributor: Laurent Besacier

https://inria.hal.science/hal-02480675

Submitted on: Sunday, 16 February 2020, 22:23:01

Last modified on: Friday, 19 April 2024, 16:18:55

Dates and versions

hal-02480675, version 1 (16-02-2020)

Identifiers

HAL Id: hal-02480675, version 1
DOI: 10.1109/TASLP.2020.2973896

Cite

Odette Scharenborg, Laurent Besacier, Alan Black, Mark Hasegawa-Johnson, Florian Metze, et al. Speech technology for unwritten languages. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, ⟨10.1109/TASLP.2020.2973896⟩. ⟨hal-02480675⟩

Collections

ENS-PARIS UGA CNRS INRIA EHESS LIG LSCP DEC LIMSI LIG_TDCGE_GETALP INRIA2 PSL UNIV-PARIS-SACLAY POLYTECH-GRENOBLE MIAI ANR PRAIRIE-IA LISN GS-ENGINEERING GS-COMPUTER-SCIENCE LISN-TLP LIG_SIDCH

192 views

0 downloads
