W. H. Sumby and I. Pollack, Visual Contribution to Speech Intelligibility in Noise, The Journal of the Acoustical Society of America, vol.26, issue.2, pp.212-215, 1954.
DOI : 10.1121/1.1907309

B. Le Goff, T. Guiard-Marigny, M. Cohen, and C. Benoît, Real-time analysis-synthesis and intelligibility of talking faces, 2nd International Conference on Speech Synthesis, pp.53-56, 1994.

S. Ouni, M. M. Cohen, H. Ishak, and D. W. Massaro, Visual Contribution to Speech Perception: Measuring the Intelligibility of Animated Talking Heads, EURASIP Journal on Audio, Speech, and Music Processing, vol.2007, Article ID 47891, 2007.
DOI : 10.1155/2007/47891

URL : https://hal.archives-ouvertes.fr/hal-00184425

Z. Deng and J. Noh, Computer Facial Animation: A Survey, in Data-Driven 3D Facial Animation, Springer, pp.1-28, 2008.
DOI : 10.1007/978-1-84628-907-1_1

T. Guiard-Marigny, N. Tsingos, A. Adjoudani, C. Benoît, and M.-P. Gascuel, 3D models of the lips for realistic speech animation, Computer Animation '96 Proceedings, IEEE, pp.80-89, 1996.

C. Pelachaud, E. Magno-Caldognetto, C. Zmarich, and P. Cosi, Modelling an Italian talking head, AVSP 2001 - International Conference on Auditory-Visual Speech Processing, 2001.

S. King, R. Parent, and B. Olsafsky, A muscle-based 3D parametric lip model for speech-synchronized facial animation, in Deformable Avatars, ser. IFIP The International Federation for Information Processing, pp.12-23, 2001.

T. Kuratate and M. Riley, Building speaker-specific lip models for talking heads from 3D face data, AVSP 2010 - International Conference on Auditory-Visual Speech Processing, 2010.

S. Zhang and P. Huang, High-Resolution, Real-time 3D Shape Acquisition, 2004 Conference on Computer Vision and Pattern Recognition Workshop, p.28, 2004.
DOI : 10.1109/CVPR.2004.360

S. Ouni and S. Dahmani, Is markerless acquisition technique adequate for speech production?, The Journal of the Acoustical Society of America, vol.139, issue.6, pp.234-239, 2016.
DOI : 10.1121/1.4954497

URL : http://asa.scitation.org/doi/pdf/10.1121/1.4954497

B. Bickel, M. Botsch, R. Angst, W. Matusik, M. Otaduy et al., Multi-scale capture of facial geometry and motion, ACM Trans. Graph, vol.26, issue.3, 2007.
DOI : 10.1145/1276377.1276419

M. Berger, J. Ponroy, and B. Wrobel-Dautcourt, Realistic Face Animation for Audiovisual Speech Applications: A Densification Approach Driven by Sparse Stereo Meshes, Computer Vision/Computer Graphics Collaboration Techniques, ser. Lecture Notes in Computer Science, pp.297-307, 2009.
DOI : 10.1007/3-540-47979-1_51

URL : https://hal.archives-ouvertes.fr/inria-00429338

K. S. Bhat, R. Goldenthal, Y. Ye, R. Mallet, and M. Koperwas, High fidelity facial animation capture and retargeting with contours, Proceedings of the 12th ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA '13, pp.7-14, 2013.
DOI : 10.1145/2485895.2485915

Y. Cao, W. C. Tien, P. Faloutsos, and F. Pighin, Expressive speech-driven facial animation, ACM Transactions on Graphics, vol.24, issue.4, pp.1283-1302, 2005.
DOI : 10.1145/1095878.1095881

K. Wampler, D. Sasaki, L. Zhang, and Z. Popović, Dynamic, expressive speech animation from a single mesh, Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, ser. SCA '07. Aire-la-Ville, Switzerland: Eurographics Association, pp.53-62, 2007.

Z. Deng and X. Ma, Perceptually guided expressive facial animation, Proceedings of the 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp.67-76, 2008.

G. Bailly, O. Govokhina, F. Elisei, and G. Breton, Lip-Synching Using Speaker-Specific Articulation, Shape and Appearance Models, EURASIP Journal on Audio, Speech, and Music Processing, 2009.
DOI : 10.1155/2009/769494

URL : https://hal.archives-ouvertes.fr/hal-00447061

S. Ouni, V. Colotte, U. Musti, A. Toutios, B. Wrobel-Dautcourt, M.-O. Berger, and C. Lavecchia, Acoustic-visual synthesis technique using bimodal unit-selection, EURASIP Journal on Audio, Speech, and Music Processing, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00835854

L. Williams, Performance-driven facial animation, Proceedings of the 17th Annual Conference on Computer Graphics and Interactive Techniques, ser. SIGGRAPH '90, pp.235-242, 1990.
DOI : 10.1145/97880.97906

F. Elisei, M. Odisio, G. Bailly, and P. Badin, Creating and controlling video-realistic talking heads, AVSP 2001-International Conference on Auditory-Visual Speech Processing, 2001.

B. Wrobel-Dautcourt, M. Berger, B. Potard, Y. Laprie, and S. Ouni, A low-cost stereovision based system for acquisition of visible articulatory data, 5th Conference on Auditory-Visual Speech Processing - AVSP'2005, 2005.

D. Bradley, W. Heidrich, T. Popa, and A. Sheffer, High resolution passive facial performance capture, ACM Transactions on Graphics (Proc. SIGGRAPH), 2010.
DOI : 10.1145/1778765.1778778

T. Weise, S. Bouaziz, H. Li, and M. Pauly, Realtime performance-based facial animation, ACM SIGGRAPH 2011 Papers, ser. SIGGRAPH '11, Article 77, 2011.
DOI : 10.1145/1964921.1964972

D. B. Gennery, Stereo vision for the acquisition and tracking of moving three-dimensional objects, 1986.

P. Hoole and A. Zierdt, Five-dimensional articulography, in Speech Motor Control: New developments in basic and applied research, pp.331-349, 2010.

P. Hoole and S. Gfoerer, Electromagnetic articulography as a tool in the study of lingual coarticulation, The Journal of the Acoustical Society of America, vol.87, issue.S1, p.S123, 1990.
DOI : 10.1121/1.2027899

J. S. Perkell, M. H. Cohen, M. A. Svirsky, M. L. Matthies, I. Garabieta et al., Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements, The Journal of the Acoustical Society of America, vol.92, issue.6, pp.3078-3096, 1992.
DOI : 10.1121/1.404204

L. Wang, H. Chen, S. Li, and H. M. Meng, Phoneme-level articulatory animation in pronunciation training, Speech Communication, vol.54, issue.7, pp.845-856, 2012.
DOI : 10.1016/j.specom.2012.02.003

H. Li, M. Yang, and J. Tao, Speaker-independent lips and tongue visualization of vowels, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.8106-8110, 2013.
DOI : 10.1109/ICASSP.2013.6639244

Y. Arieli, B. Freedman, M. Machline, and A. Shpunt, Depth mapping using projected patterns, U.S. Patent 8,150,142, 2012.

T. Weise, B. Leibe, and L. Van Gool, Accurate and robust registration for in-hand modeling, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587832

URL : http://www.vision.ee.ethz.ch/~bleibe/papers/weise-inhandscanning-cvpr08.pdf

T. Rhee, Y. Hwang, J. D. Kim, and C. Kim, Real-time facial animation from live video tracking, Proceedings of the 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA '11, pp.215-224, 2011.
DOI : 10.1145/2019406.2019435

J. R. Green, E. M. Wilson, Y. Wang, and C. A. Moore, Estimating Mandibular Motion Based on Chin Surface Targets During Speech, Journal of Speech, Language, and Hearing Research, vol.50, issue.4, pp.928-939, 2007.
DOI : 10.1044/1092-4388(2007/066)

URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745713/pdf

Faceshift, facial motion-capture software; no longer available since 2015.

X. Huang, F. Alleva, H. Hon, M. Hwang, and R. Rosenfeld, The SPHINX-II speech recognition system: an overview, Carnegie Mellon University, Technical Report CMU-CS-92-112, 1992.
DOI : 10.1006/csla.1993.1007

URL : ftp://reports.adm.cs.cmu.edu/usr/anon/1992/CMU-CS-92-112.ps

D. W. Massaro, Perceiving Talking Faces: From Speech Perception to a Behavioral Principle. Cambridge, MA: MIT Press, 1998.

H. Zen, A. Senior, and M. Schuster, Statistical parametric speech synthesis using deep neural networks, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7962-7966, 2013.
DOI : 10.1109/ICASSP.2013.6639215

URL : http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40837.pdf

B. Fan, L. Xie, S. Yang, L. Wang, and F. K. Soong, A deep bidirectional LSTM approach for video-realistic talking head, Multimedia Tools and Applications, vol.75, pp.5287-5309, 2016.
DOI : 10.1007/s11042-015-2944-3