An episodic memory-based solution for the acoustic-to-articulatory inversion problem

Sébastien Demange (1), Slim Ouni (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper presents an acoustic-to-articulatory inversion method based on an episodic memory. An episodic memory is an attractive model for two reasons. First, it makes no assumptions about the mapping function and relies instead on real, synchronized acoustic and articulatory data streams. Second, the memory inherently represents the real articulatory dynamics as observed. It is argued that computational models of episodic memory, as they are usually designed, cannot provide a satisfactory solution to the acoustic-to-articulatory inversion problem because the quantity of training data is insufficient. Therefore, an episodic memory is proposed, called generative episodic memory (G-Mem), which can produce articulatory trajectories that do not belong to the set of episodes on which the memory is based. G-Mem is evaluated on two electromagnetic articulography corpora, one for English and one for French. It is compared with a codebook-based method and with a classical episodic memory (termed concatenative episodic memory) in order to assess both its modeling of articulatory dynamics and its generalization capabilities. The results show the effectiveness of the method: G-Mem obtains an overall root-mean-square error of 1.65 mm and a correlation of 0.71, which are comparable to those of recently proposed methods.
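The two figures of merit quoted in the abstract, root-mean-square error and correlation, can be illustrated with a minimal sketch. It assumes a single articulatory channel (for example, one coordinate of one articulograph coil) sampled at the same frames in the measured and estimated trajectories. The function names and the toy values are hypothetical, and the abstract does not say how the overall figures are aggregated across channels, so no aggregation is shown.

```python
import math


def rmse(measured, estimated):
    """Root-mean-square error between two equal-length trajectories."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = sum((e - m) ** 2 for m, e in zip(measured, estimated))
    return math.sqrt(squared / len(measured))


def pearson(measured, estimated):
    """Pearson correlation coefficient between two equal-length trajectories."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("trajectories must be non-empty and of equal length")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_m * var_e)


# Made-up values in mm for one hypothetical channel (not data from the paper).
measured = [0.0, 1.0, 2.0, 3.0]
estimated = [1.0, 2.0, 3.0, 4.0]  # same shape, shifted by a constant 1 mm

print(rmse(measured, estimated))     # distance between the trajectories
print(pearson(measured, estimated))  # similarity of their shapes
```

In this toy case the estimate has exactly the right shape but a constant offset, so the correlation is perfect while the error is not zero. That is why the two measures are reported together: the first reflects positional accuracy, the second how well the trajectory dynamics are followed.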
Document type: Journal article
Journal of the Acoustical Society of America, Acoustical Society of America, 2013, 133 (5), pp.2921-2930. 〈http://asadl.org/jasa/resource/1/jasman/v133/i5/p2921_s1〉. 〈10.1121/1.4798665〉

https://hal.inria.fr/hal-00834556
Contributor: Slim Ouni
Submitted on: Tuesday, October 6, 2015 - 15:34:42
Last modified on: Thursday, January 11, 2018 - 06:19:57
Document(s) archived on: Thursday, January 7, 2016 - 10:41:27

File

JASA_Demange_Ouni.pdf
Files produced by the author(s)

Citation

Sébastien Demange, Slim Ouni. An episodic memory-based solution for the acoustic-to-articulatory inversion problem. Journal of the Acoustical Society of America, Acoustical Society of America, 2013, 133 (5), pp.2921-2930. 〈http://asadl.org/jasa/resource/1/jasman/v133/i5/p2921_s1〉. 〈10.1121/1.4798665〉. 〈hal-00834556〉
