Phoneme-to-Articulatory mapping using bidirectional gated RNN

Théo Biasutto-Lervat; Slim Ouni

Conference paper. Year: 2018

Théo Biasutto-Lervat

Role: Author
PersonId: 1035892

Speech Modeling for Facilitating Oral-Based Communication

Slim Ouni

Role: Author
PersonId: 1158
IdHAL: slim-ouni
ORCID: 0000-0001-5286-7368

Speech Modeling for Facilitating Oral-Based Communication

Abstract

Deriving articulatory dynamics from the acoustic speech signal has been addressed in several speech production studies. In this paper, we investigate whether articulatory dynamics can be predicted from phonetic information alone, without the acoustic speech signal. Such input may be considered acoustically poor, since it probably carries no explicit coarticulation information, but we expect the phonetic sequence to provide compact yet rich knowledge. Motivated by the recent success of deep learning techniques in acoustic-to-articulatory inversion, we experimented with bidirectional gated recurrent neural network architectures. We trained these models on an electromagnetic articulography (EMA) corpus and obtained good performance, similar to state-of-the-art articulatory inversion from line spectral frequency (LSF) features, while using only the phoneme labels and durations.
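To make the input and output of such a mapping concrete, here is a minimal, untrained sketch in plain Python. It is not the authors' implementation: the 100 Hz frame rate, the one-hot phoneme encoding, the hidden size, the 12 output channels, the linear output layer, and every function name are assumptions chosen for illustration. It expands (phoneme, duration) pairs into one label per frame, runs one common formulation of the GRU forwards and backwards over the frames, and projects the concatenated states to one articulatory vector per frame.

```python
import math
import random


def expand_to_frames(segments, frame_rate_hz=100.0):
    """Turn (phoneme, duration_in_seconds) pairs into one label per frame."""
    frames = []
    for phoneme, duration_s in segments:
        n_frames = max(1, round(duration_s * frame_rate_hz))
        frames.extend([phoneme] * n_frames)
    return frames


def one_hot(label, inventory):
    return [1.0 if label == p else 0.0 for p in inventory]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """One common GRU formulation (biases omitted for brevity)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # Input (w) and recurrent (u) weights for the update gate z,
        # the reset gate r, and the candidate state n.
        self.w = {g: random_matrix(n_hidden, n_in, rng) for g in "zrn"}
        self.u = {g: random_matrix(n_hidden, n_hidden, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], reset_h))]
        # Blend the previous state and the candidate state with the update gate.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def run(cell, inputs):
    h = [0.0] * cell.n_hidden
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bidirectional_gru(inputs, fwd, bwd):
    forward = run(fwd, inputs)
    # Process the reversed sequence, then restore the original frame order.
    backward = run(bwd, inputs[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]


def predict_articulators(segments, inventory, n_hidden=8, n_outputs=12, seed=0):
    """Random, untrained weights: the output only illustrates the shapes."""
    rng = random.Random(seed)
    inputs = [one_hot(p, inventory) for p in expand_to_frames(segments)]
    fwd = GRUCell(len(inventory), n_hidden, rng)
    bwd = GRUCell(len(inventory), n_hidden, rng)
    w_out = random_matrix(n_outputs, 2 * n_hidden, rng)
    return [matvec(w_out, h) for h in bidirectional_gru(inputs, fwd, bwd)]


# Hypothetical four-phoneme utterance with durations in seconds.
segments = [("h", 0.05), ("e", 0.12), ("l", 0.07), ("o", 0.20)]
trajectory = predict_articulators(segments, inventory=["h", "e", "l", "o"])
```

Because the backward pass reads the sequence from its end, the state at each frame depends on upcoming phonemes as well as past ones, which is one plausible way for such a model to capture anticipatory coarticulation without acoustic input. A trained system would fit the weights to measured EMA trajectories instead of drawing them at random.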

Keywords

speech production; coarticulation modeling; bidirectional recurrent neural network (BRNN)

Domains

Artificial intelligence [cs.AI]; Human-computer interaction [cs.HC]; Neural networks [cs.NE]; Modeling and simulation; Computer science; Information and communication sciences

Main file

1202_Paper.pdf (376.74 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01862587

Submitted on: Monday, 27 August 2018, 14:40:29

Last modified on: Monday, 11 September 2023, 17:41:19

Long-term archiving on: Wednesday, 28 November 2018, 16:15:47

Dates and versions

hal-01862587, version 1 (27-08-2018)

Identifiers

HAL Id: hal-01862587, version 1

Cite

Théo Biasutto-Lervat, Slim Ouni. Phoneme-to-Articulatory mapping using bidirectional gated RNN. Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Sep 2018, Hyderabad, India. ⟨hal-01862587⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 TDS-MACS LORIA LORIA-NLPKD