J. Yamagishi, Z. Ling, and S. King, Robustness of HMM-based speech synthesis, Proc. of Interspeech, pp.2-5, 2008.

H. Ze, A. Senior, and M. Schuster, Statistical parametric speech synthesis using deep neural networks, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7962-7966, 2013.
DOI : 10.1109/ICASSP.2013.6639215
URL : http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40837.pdf

Y. Sagisaka, Speech synthesis by rule using an optimal selection of non-uniform synthesis units, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing, pp.679-682, 1988.
DOI : 10.1109/ICASSP.1988.196677

A. W. Black and P. Taylor, CHATR, Proceedings of the 15th conference on Computational linguistics, pp.983-986, 1994.
DOI : 10.3115/991250.991307

A. Hunt and A. W. Black, Unit selection in a concatenative speech synthesis system using a large speech database, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.373-376, 1996.
DOI : 10.1109/ICASSP.1996.541110

P. Taylor, A. Black, and R. Caley, The architecture of the Festival speech synthesis system, Proc. of the ESCA Workshop in Speech Synthesis, pp.147-151, 1998.

A. P. Breen and P. Jackson, Non-uniform unit selection and the similarity metric within BT's Laureate TTS system, Proc. of the ESCA Workshop on Speech Synthesis, pp.373-376, 1998.

R. A. Clark, K. Richmond, and S. King, Multisyn: Open-domain unit selection for the Festival speech synthesis system, Speech Communication, vol.49, issue.4, pp.317-330, 2007.
DOI : 10.1016/j.specom.2007.01.014
URL : https://hal.archives-ouvertes.fr/hal-00499177

H. Patil, T. Patel, N. Shah, H. Sailor, R. Krishnan et al., A syllable-based framework for unit selection synthesis in 13 Indian languages, 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), pp.1-8, 2013.
DOI : 10.1109/ICSDA.2013.6709851

Y. Stylianou and A. Syrdal, Perceptual and objective detection of discontinuities in concatenative speech synthesis, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), pp.837-840, 2001.
DOI : 10.1109/ICASSP.2001.941045

D. Tihelka, J. Matoušek, and Z. Hanzlíček, Modelling F0 Dynamics in Unit Selection Based Speech Synthesis, Proc. of TSD, pp.457-464, 2014.
DOI : 10.1007/978-3-319-10816-2_55

J. Yi, Natural-sounding speech synthesis using variable-length units, 1998.

D. Cadic, C. Boidin, and C. d'Alessandro, Vocalic sandwich, a unit designed for unit selection TTS, Proc. of Interspeech, pp.2079-2082, 2009.

P. Boersma, Praat, a system for doing phonetics by computer, pp.341-345, 2002.

D. Talkin, A robust algorithm for pitch tracking (RAPT), in Speech coding and synthesis, pp.495-518, 1995.

J. Duddington, eSpeak text to speech, 2012.

S. Young, G. Evermann, M. Gales, T. Hein, D. Kershaw et al., The HTK book. for version 3, 2005.

J. Chevelu, G. Lecorvé, and D. Lolive, ROOTS: a toolkit for easy, fast and consistent processing of large sequential annotated data collections, Proc. of LREC, pp.619-626, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00974628

D. Guennec and D. Lolive, Unit Selection Cost Function Exploration Using an A* Based Text-to-Speech System, Proc. of TSD, pp.432-440, 2014.
DOI : 10.1007/978-3-319-10816-2_52
URL : https://hal.archives-ouvertes.fr/hal-01133321

P. Alain, J. Chevelu, D. Guennec, G. Lecorvé, and D. Lolive, The IRISA Text-To-Speech System for the Blizzard Challenge 2016, Blizzard Challenge 2016 workshop, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01375897

C. Blouin, O. Rosec, P. Bagshaw, and C. d'Alessandro, Concatenation cost calculation and optimisation for unit selection in TTS, IEEE Workshop on Speech Synthesis, pp.0-3, 2002.

F. Alías, L. Formiga, and X. Llorá, Efficient and reliable perceptual weight tuning for unit-selection text-to-speech synthesis based on active interactive genetic algorithms: A proof-of-concept, Speech Communication, vol.53, issue.5, pp.786-800, 2011.
DOI : 10.1016/j.specom.2011.01.004

D. Guennec and D. Lolive, On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine, Interspeech 2016, 2016.
DOI : 10.21437/Interspeech.2016-1222
URL : https://hal.archives-ouvertes.fr/hal-01338839

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine learning in python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905