HMM-based Automatic Visual Speech Segmentation Using Facial Data

Utpala Musti ¹, Asterios Toutios ¹, Slim Ouni ¹, Vincent Colotte ¹, Brigitte Wrobel-Dautcourt ², Marie-Odile Berger ²
¹ PAROLE - Analysis, perception and recognition of speech, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
² MAGRIT - Visual Augmentation of Complex Environments, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: We describe automatic visual speech segmentation using facial data captured by a stereo-vision technique. The segmentation is performed using an HMM-based forced alignment mechanism widely used in automatic speech recognition. The idea is based on the assumption that training on visual speech data alone might capture the uniqueness of the facial component of speech articulation, the asynchrony (time lags) between visual and acoustic speech segments, and significant coarticulation effects. This should yield valuable information on the extent to which a phoneme affects surrounding phonemes visually, and help label visual speech segments according to their dominant coarticulatory contexts.
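The core mechanism the abstract refers to, forced alignment, constrains Viterbi decoding to a known phone sequence and lets only the segment boundaries vary. A minimal sketch of that idea is below; the monotonic left-to-right dynamic program over per-frame log-likelihood scores is an illustrative simplification (the paper itself uses full HMMs over visual features, and the phone labels and scores here are invented for the example).

```python
import math

def forced_align(phones, frame_scores):
    """Toy Viterbi forced alignment: assign each frame to one phone, in order.

    phones: the known phone sequence, length P (labels, hypothetical here)
    frame_scores: frame_scores[t][phone] = log-likelihood of `phone` at frame t
    Returns: list of length T with the phone *index* chosen for each frame;
    segment boundaries fall where the index increments.
    """
    T, P = len(frame_scores), len(phones)
    NEG = float("-inf")
    # dp[t][j] = best total log score with frame t assigned to phone j
    dp = [[NEG] * P for _ in range(T)]
    back = [[0] * P for _ in range(T)]
    dp[0][0] = frame_scores[0][phones[0]]
    for t in range(1, T):
        for j in range(P):
            stay = dp[t - 1][j]                      # remain in phone j
            move = dp[t - 1][j - 1] if j > 0 else NEG  # advance from phone j-1
            if stay >= move:
                dp[t][j], back[t][j] = stay, j
            else:
                dp[t][j], back[t][j] = move, j - 1
            dp[t][j] += frame_scores[t][phones[j]]
    # Backtrace from the last phone at the last frame (alignment must end there).
    path = [P - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical two-phone utterance over four frames: the scores favor "a"
# early and "b" late, so the boundary should fall between frames 1 and 2.
scores = [
    {"a": 0.0, "b": -5.0},
    {"a": 0.0, "b": -5.0},
    {"a": -5.0, "b": 0.0},
    {"a": -5.0, "b": 0.0},
]
print(forced_align(["a", "b"], scores))  # → [0, 0, 1, 1]
```

The same decoding run over visual-only feature streams would place boundaries where the facial evidence changes, which is exactly why visually derived segment boundaries can lag or lead the acoustic ones.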

https://hal.inria.fr/inria-00526776
Contributor: Slim Ouni
Submitted on : Friday, October 15, 2010 - 4:55:46 PM


Identifiers

  • HAL Id : inria-00526776, version 1

Citation

Utpala Musti, Asterios Toutios, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, et al.. HMM-based Automatic Visual Speech Segmentation Using Facial Data. Interspeech 2010, ISCA, Sep 2010, Makuhari, Chiba, Japan. pp.1401-1404. ⟨inria-00526776⟩
