Inversion from Audiovisual Speech to Articulatory Information by Exploiting Multimodal Data
Conference paper · Year: 2008

Abstract

We present an inversion framework to identify speech production properties from audiovisual information. Our system is built on a multimodal articulatory dataset comprising ultrasound, X-ray, and magnetic resonance images as well as audio and stereovisual recordings of the speaker. Visual information is captured via stereovision, while the vocal tract state is represented by a properly trained articulatory model. Inversion is based on an adaptive piecewise linear approximation of the audiovisual-to-articulation mapping. The presented system can recover the hidden vocal tract shapes and may serve as a basis for a more widely applicable inversion setup.
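The piecewise linear approximation mentioned in the abstract can be sketched, very roughly, as partitioning the audiovisual feature space and fitting one affine map per region. Everything below (the synthetic data, feature dimensions, number of regions K, and the plain k-means partitioning step) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical shapes): rows are frames, columns are features.
av = rng.normal(size=(500, 6))  # audiovisual features (audio + stereo-visual)
# Synthetic articulatory targets: a nonlinear function of the features,
# so a single global linear map would underfit.
art = np.tanh(av[:, :3]) + 0.1 * rng.normal(size=(500, 3))

# 1) Partition the audiovisual space with a few k-means iterations.
K = 8
centers = av[rng.choice(len(av), K, replace=False)]
for _ in range(20):
    labels = np.argmin(((av[:, None] - centers) ** 2).sum(-1), axis=1)
    for k in range(K):
        if np.any(labels == k):
            centers[k] = av[labels == k].mean(axis=0)

# 2) Fit one affine map per region: art ≈ [av, 1] @ W_k.
maps = []
for k in range(K):
    idx = labels == k
    X = np.hstack([av[idx], np.ones((idx.sum(), 1))])
    W, *_ = np.linalg.lstsq(X, art[idx], rcond=None)
    maps.append(W)

def invert(frame):
    """Map one audiovisual frame to articulatory parameters
    using the affine map of its nearest region."""
    k = int(np.argmin(((frame - centers) ** 2).sum(-1)))
    return np.append(frame, 1.0) @ maps[k]

pred = np.array([invert(f) for f in av])
print("mean squared error:", np.mean((pred - art) ** 2))
```

The local affine maps together approximate the nonlinear audiovisual-to-articulatory relationship far better than a single global regression would; the paper's "adaptive" aspect (how regions are chosen and refined) is not reproduced here.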
Main file: KatsamanisRoussosMaragosAronBerger_AVInversionMultimodalArtData_issp2008.pdf (696.35 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00327031, version 1 (06-01-2009)

Identifiers

  • HAL Id: inria-00327031, version 1

Cite

Athanassios Katsamanis, Anastasios Roussos, Petros Maragos, Michael Aron, Marie-Odile Berger. Inversion from Audiovisual Speech to Articulatory Information by Exploiting Multimodal Data. 8th International Seminar On Speech Production - ISSP'08, Dec 2008, Strasbourg, France. ⟨inria-00327031⟩