A study on auditory feature spaces for speech-driven lip animation

Guylaine Le-Jan 1 Yannick Benezeth 1 Guillaume Gravier 2 Frédéric Bimbot 1
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: This paper presents a study of auditory feature spaces for speech-driven face animation. The goal is to provide a solid analytical basis for assessing how well some well-known features describe speech with respect to lip sync. A set of audio features describing the temporal and spectral shape of the speech signal is computed on annotated audio extracts. The dimension of the input feature space is reduced with PCA, and the contribution of each input feature is investigated to determine which are the most descriptive. The resulting feature space is analyzed quantitatively and qualitatively for the description of acoustic units (phonemes, visemes, etc.), and we demonstrate that using some low-level features in addition to MFCC increases the relevance of the feature space. Finally, we evaluate the stability of these features with respect to the gender of the speaker.
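
The PCA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature set, the scaling, or how contributions are measured, so the synthetic data, the standardization step, the use of squared loadings as a contribution measure, and the function name `pca_feature_contributions` are all assumptions made for illustration.

```python
import numpy as np


def pca_feature_contributions(X, n_components=2):
    """Standardize X (frames x features), run PCA via SVD, and return
    (projected data, explained-variance ratios, per-feature contributions).

    Hypothetical helper: squared loadings are one common way to measure
    how much each input feature contributes to a principal axis.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal axes, sorted by decreasing singular value.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    var = s ** 2
    ratio = var[:n_components] / var.sum()
    axes = Vt[:n_components]
    # Each axis has unit norm, so its squared loadings sum to 1 and can be
    # read as the share of that axis carried by each input feature.
    contributions = axes ** 2
    return Z @ axes.T, ratio, contributions


# Synthetic stand-in for real audio descriptors (assumption): two strongly
# correlated features plus one independent feature.
rng = np.random.default_rng(0)
n_frames = 500
latent = rng.normal(size=n_frames)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=n_frames),
    latent + 0.1 * rng.normal(size=n_frames),
    rng.normal(size=n_frames),
])
projected, ratio, contributions = pca_feature_contributions(X, n_components=2)
```

In this toy setting, the first principal axis is dominated by the two correlated features, which is the kind of reading a contribution analysis is meant to support. With real data, each column of `X` would hold one audio descriptor (for example an MFCC coefficient) computed per frame.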
Document type:
Conference paper
Interspeech, Aug 2011, Florence, Italy. 2011

https://hal.inria.fr/inria-00598314
Contributor: Yannick Benezeth
Submitted on: Tuesday, October 16, 2012 - 15:08:30
Last modified on: Thursday, January 11, 2018 - 06:20:10
Document(s) archived on: Tuesday, December 13, 2016 - 18:20:09

File

interspeech_2011.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00598314, version 1

Citation

Guylaine Le-Jan, Yannick Benezeth, Guillaume Gravier, Frédéric Bimbot. A study on auditory feature spaces for speech-driven lip animation. Interspeech, Aug 2011, Florence, Italy. 2011. 〈inria-00598314〉
