An Effective Lip Tracking Algorithm for Acoustic-to-Articulatory Inversion

Jingying Chen (1), Marie-Odile Berger (2), Yves Laprie (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
(2) ISA - Models, algorithms and geometry for computer graphics and vision
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Although automatic speech recognition systems can now perform well under certain conditions, they still do not provide good results in real-life conditions, especially in noisy environments. Several authors have suggested that using articulatory features rather than acoustic features as a basis for speech parameterization would help yield better recognition results. The articulatory features can be recovered from the speech signal by acoustic-to-articulatory inversion. Recovering the articulatory state from the acoustic signal alone is considered difficult because of the "one-to-many" nature of the acoustic-to-articulatory inversion problem: a given articulatory state always has exactly one acoustic realization, but an acoustic signal can be the outcome of more than one articulatory state. Since visual information is complementary to acoustic information in the inversion, this paper proposes lip tracking as a source of visual information about lip movement for acoustic-to-articulatory inversion. Encouraging results demonstrate the effectiveness of this method, which provides useful information (i.e., mouth width and height) for inversion.
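The abstract names mouth width and height as the measurements that lip tracking supplies to the inversion. As a rough illustration only, the sketch below derives these two values from a tracked lip contour. It assumes the tracker returns the contour as a list of (x, y) pixel coordinates and that width and height are taken as the horizontal and vertical extent of that contour; the function name `mouth_width_height`, the contour representation, and the sample coordinates are all hypothetical and are not taken from the paper, whose actual measurement procedure may differ.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def mouth_width_height(contour: Iterable[Point]) -> Tuple[float, float]:
    """Return (width, height) as the axis-aligned extent of a lip contour.

    Assumption: the contour is a collection of (x, y) pixel coordinates.
    This is an illustrative definition, not the paper's stated method.
    """
    points = list(contour)
    if not points:
        raise ValueError("contour must contain at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(xs) - min(xs), max(ys) - min(ys)


# Hypothetical outer-lip contour in pixel coordinates (made-up values).
example_contour = [
    (10.0, 50.0), (30.0, 42.0), (50.0, 40.0), (70.0, 43.0),
    (90.0, 50.0), (70.0, 60.0), (50.0, 64.0), (30.0, 59.0),
]

width, height = mouth_width_height(example_contour)
print(f"width={width}, height={height}")
```

Computed per video frame, such a pair of values could serve as a visual constraint for narrowing down the candidate articulatory states that are consistent with a given acoustic signal.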
Document type:
Conference paper
5th International Workshop on Image Analysis for Multimedia - WIAMIS'2004, Apr 2004, Lisbon, Portugal, 3 p, 2004

https://hal.inria.fr/inria-00099905
Contributor: Publications Loria
Submitted on: Tuesday, 26 September 2006 - 10:06:01
Last modified on: Thursday, 11 January 2018 - 06:19:57
Document(s) archived on: Wednesday, 29 March 2017 - 12:44:17

Identifiers

  • HAL Id : inria-00099905, version 1

Citation

Jingying Chen, Marie-Odile Berger, Yves Laprie. An Effective Lip Tracking Algorithm for Acoustic-to-Articulatory Inversion. 5th International Workshop on Image Analysis for Multimedia - WIAMIS'2004, Apr 2004, Lisbon, Portugal, 3 p, 2004. 〈inria-00099905〉
