J. Lee and C. Park, Robust Audio-Visual Speech Recognition Based on Late Integration, IEEE Transactions on Multimedia, vol. 10, no. 5, pp. 767-779, 2008.

T. Frank, M. Hoch, and G. Trogemann, Automated Lip-Sync for 3D-Character Animation, 15th IMACS World Congress on Scientific Computation, Modelling and Applied Mathematics, pp. 24-29, 1997.

S. Nakamura, Statistical multimodal integration for audio-visual speech processing, IEEE Transactions on Neural Networks, vol. 13, no. 4, pp. 854-866, 2002. DOI: 10.1109/TNN.2002.1021886

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.

J. Park and H. Ko, Real-Time Continuous Phoneme Recognition System Using Class-Dependent Tied-Mixture HMM With HBT Structure for Speech-Driven Lip-Sync, IEEE Transactions on Multimedia, vol. 10, no. 7, pp. 1299-1306, 2008. DOI: 10.1109/TMM.2008.2004908

S. Foo and L. Dong, Recognition of Visual Speech Elements Using Hidden Markov Models, Advances in Multimedia Information Processing, pp. 153-173, 2002. DOI: 10.1007/3-540-36228-2_75

G. Gravier, J. Bonastre, E. Geoffrois, S. Galliano, K. McTait et al., ESTER, une campagne d'évaluation des systèmes d'indexation automatique d'émissions radiophoniques en français, Journées d'Étude sur la Parole (JEP), 2004.

C. Benoît, T. Lallouache, T. Mohamadi, and C. Abry, A set of French visemes for visual speech synthesis, Les Cahiers de l'ICP, Rapport de recherche, pp. 113-129, 1994.

O. Govokhina, Modèles de trajectoires pour l'animation de visages parlants, PhD thesis, Institut National Polytechnique de Grenoble, 2008.

E. Bozkurt, Ç. E. Erdem, E. Erzin, T. Erdem, and M. Özkan, Comparison of Phoneme and Viseme Based Acoustic Units for Speech Driven Realistic Lip Animation, 3DTV International Conference, 2007.