J. Barker and F. Berthommier, Evidence of correlation between acoustic and visual features of speech, ICPhS, 1999.

H. Yehia, P. Rubin, and E. Vatikiotis-bateson, Quantitative association of vocal-tract and facial behavior, Speech Communication, vol.26, issue.1-2, pp.23-43, 1998.
DOI : 10.1016/S0167-6393(98)00048-X

W. H. Sumby and I. Pollack, Visual Contribution to Speech Intelligibility in Noise, The Journal of the Acoustical Society of America, vol.26, issue.2, pp.212-215, 1954.
DOI : 10.1121/1.1907309

B. , L. Goff, T. Guiard-marigny, M. Cohen, and C. Benoit, Realtime analysis-synthesis and intelligibility of talking faces, 2nd International Conference on Speech Synthesis, 1994.

S. Ouni, M. Cohen, H. Ishak, and D. Massaro, Visual Contribution to Speech Perception: Measuring the Intelligibility of Animated Talking Heads, EURASIP Journal on Audio, Speech, and Music Processing, vol.41, issue.3, p.47891, 2007.
DOI : 10.1121/1.429611
URL : https://hal.archives-ouvertes.fr/hal-00184425

G. Bailly, M. Bérar, F. Elisei, and M. Odisio, Audiovisual speech synthesis, International Journal of Speech Technology, vol.6, issue.4, pp.331-346, 2003.
DOI : 10.1023/A:1025700715107
URL : https://hal.archives-ouvertes.fr/hal-00169556

W. Mattheyses, L. Latacz, and W. Verhelst, On the Importance of Audiovisual Coherence for the Perceived Quality of Synthesized Visual Speech, EURASIP Journal on Audio, Speech, and Music Processing, vol.9, issue.5588, p.169819, 2009.
DOI : 10.1016/j.specom.2004.06.004

A. Hallgren and B. Lyberg, Visual speech synthesis with concatenative speech, AVSP, 1998.

S. Minnis and A. Breen, Modeling visual coarticulation in synthetic talking heads using a lip motion unit inventory with concatenative synthesis, Interspeech, 2000.

B. Wrobel-dautcourt, M. Berger, B. Potard, Y. Laprie, and S. Ouni, A low-cost stereovision based system for acquisition of visible articulatory data, AVSP, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000432

S. Maeda, Compensatory Articulation During Speech: Evidence from the Analysis and Synthesis of Vocal-Tract Shapes Using an Articulatory Model, Speech production and speech modelling, pp.131-149, 1990.
DOI : 10.1007/978-94-009-2037-8_6

V. Colotte and R. Beaufort, Linguistic features weighting for a Text-To-Speech system without prosody model, Interspeech, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00012561

K. Liu and J. Ostermann, Optimization of an Image-Based Talking Head System, Speech, and Music Processing, p.174192, 2009.
DOI : 10.1016/j.specom.2004.07.002

M. Berger, Realistic face animation from sparse stereo meshes, AVSP, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00169216

J. Kim and C. Davis, Visible speech cues and auditory detection of spoken sentences: an effect of degree of correlation between acoustic and visual properties, AVSP, 2001.

J. Schwartz, F. Berthommier, and C. Savariaux, Audio-visual scene analysis: evidence for a " very-early " integration process in audio-visual speech perception, Interspeech, 2002.