O. Abdel-hamid and H. Jiang, Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition, Proc. of INTERSPEECH, pp.1248-1252, 2013.

B. Angelini, F. Brugnara, D. Falavigna, D. Giuliani, R. Gretter et al., Speaker Independent Continuous Speech Recognition Using an Acoustic-Phonetic Italian Corpus, Proc. of ICSLP, pp.1391-1394, 1994.

P. Bell, P. Swietojanski, R. , and S. , Multi-level adaptive networks in tandem and hybrid ASR systems, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6975-6979, 2013.
DOI : 10.1109/ICASSP.2013.6639014

Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, Greedy layerwise training of deep networks, Advances in Neural Information Processing Systems, vol.19, p.153, 2007.

Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives. Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.35, issue.8, pp.1798-1828, 2013.

H. A. Bourlard and N. Morgan, Connectionist speech recognition: a hybrid approach, 1994.
DOI : 10.1007/978-1-4615-3210-1

L. Burget, P. Schwarz, M. Agarwal, P. Akyazi, K. Feng et al., Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4334-4337, 2010.
DOI : 10.1109/ICASSP.2010.5495646

T. Claes, I. Dologlou, L. Ten-bosch, C. , and D. V. , A novel feature transformation for vocal tract length normalization in automatic speech recognition, IEEE Transactions on Speech and Audio Processing, vol.6, issue.6, pp.549-557, 1998.
DOI : 10.1109/89.725321

G. Dahl, D. Yu, L. Deng, and A. Acero, Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.1, pp.30-42, 2012.
DOI : 10.1109/TASL.2011.2134090

S. Das, D. Nix, and M. Picheny, Improvements in children's speech recognition performance, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), 1998.
DOI : 10.1109/ICASSP.1998.674460

E. Eide and H. Gish, A Parametric Approach to Vocal Tract Lenght Normalization, Proc. of IEEE ICASSP, pp.346-349, 1996.

D. Erhan, Y. Bengio, A. Courville, P. Manzagol, P. Vincent et al., Why does unsupervised pre-training help deep learning?, The Journal of Machine Learning Research, vol.11, pp.625-660, 2010.

W. T. Fitch and J. Giedd, Morphology and development of the human vocal tract: A study using magnetic resonance imaging, The Journal of the Acoustical Society of America, vol.106, issue.3, pp.1511-1522, 1999.
DOI : 10.1121/1.427148

M. J. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Computer Speech & Language, vol.12, issue.2, pp.75-98, 1998.
DOI : 10.1006/csla.1998.0043

M. Gerosa, D. Giuliani, and F. Brugnara, Acoustic variability and automatic recognition of children???s speech, Speech Communication, vol.49, issue.10-11, pp.10-11847, 2007.
DOI : 10.1016/j.specom.2007.01.002

M. Gerosa, D. Giuliani, and F. Brugnara, Towards age-independent acoustic modeling, Speech Communication, vol.51, issue.6, pp.499-509, 2009.
DOI : 10.1016/j.specom.2009.01.006

URL : https://hal.archives-ouvertes.fr/hal-00524121

M. Gerosa, D. Giuliani, S. Narayanan, and A. Potamianos, A review of ASR technologies for children's speech, Proceedings of the 2nd Workshop on Child, Computer and Interaction, WOCCI '09, pp.1-7, 2009.
DOI : 10.1145/1640377.1640384

L. Gillick and S. Cox, Some statistical issues in the comparison of speech recognition algorithms, International Conference on Acoustics, Speech, and Signal Processing, pp.532-535, 1989.
DOI : 10.1109/ICASSP.1989.266481

D. Giuliani and M. Gerosa, Investigating Recognition of Children Speech, Proc. of IEEE ICASSP, pp.137-140, 2003.

A. Hagen, B. Pellom, C. , and R. , Children's speech recognition with application to interactive books and tutors, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721), 2003.
DOI : 10.1109/ASRU.2003.1318426

G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed et al., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, issue.6, pp.2982-97, 2012.

G. Hinton, S. Osindero, and Y. Teh, A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, vol.18, issue.7, pp.1527-1554, 2006.
DOI : 10.1162/jmlr.2003.4.7-8.1235

J. E. Huber, E. T. Stathopoulos, G. M. Curione, T. A. Ash, J. et al., Formants of children, women, and men: The effects of vocal intensity variation, The Journal of the Acoustical Society of America, vol.106, issue.3, pp.1532-1542, 1999.
DOI : 10.1121/1.427150

D. Imseng, P. Motlicek, P. Garner, and H. Bourlard, Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.332-337, 2013.
DOI : 10.1109/ASRU.2013.6707752

N. Kumar and A. G. Andreou, Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition, Speech Communication, vol.26, issue.4, pp.283-97, 1998.
DOI : 10.1016/S0167-6393(98)00061-2

. Viet-bac, . Le, L. Lamel, and J. Gauvain, Multi-style ML features for BN transcription, Proc. of IEEE ICASSP, pp.4866-4869, 2010.

C. Lee and J. Gauvain, Speaker adaptation based on MAP estimation of HMM parameters, IEEE International Conference on Acoustics Speech and Signal Processing, pp.558-561, 1993.
DOI : 10.1109/ICASSP.1993.319368

L. Lee and R. C. Rose, Speaker normalization using efficient frequency warping procedures, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.353-356, 1996.
DOI : 10.1109/ICASSP.1996.541105

S. Lee, A. Potamianos, and S. Narayanan, Acoustics of children???s speech: Developmental changes of temporal and spectral parameters, The Journal of the Acoustical Society of America, vol.105, issue.3, pp.1455-1468, 1999.
DOI : 10.1121/1.426686

Q. Li and M. Russell, Why is Automatic Recognition of Children's Speech Difficult?, Proc. of the Seventh European Conference on Speech Communication and Technology, 2001.

H. Liao, Speaker adaptation of context dependent deep neural networks, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7947-7951, 2013.
DOI : 10.1109/ICASSP.2013.6639212

A. Metallinou and J. Cheng, Using Deep Neural Networks to Improve Proficiency Assessment for Children English Language Learners, Proc. of INTERSPEECH, pp.1468-1472, 2014.

A. Mohamed, G. Dahl, and G. Hinton, Acoustic Modeling Using Deep Belief Networks, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.1, pp.14-22, 2012.
DOI : 10.1109/TASL.2011.2109382

R. Nisimura, A. Lee, H. Saruwatari, and K. Shikano, Public speech-oriented guidance system with adult and child discrimination capability, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.
DOI : 10.1109/ICASSP.2004.1326015

J. Pinto, . Magimai-doss, and H. Bourlard, MLP based hierarchical system for task adaptation in ASR, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, pp.365-370, 2009.
DOI : 10.1109/ASRU.2009.5373383

A. Potamianos and S. Narayanan, Robust recognition of children's speech, IEEE Transactions on Speech and Audio Processing, vol.11, issue.6, pp.603-615, 2003.
DOI : 10.1109/TSA.2003.818026

F. Seide, G. Li, X. Chen, Y. , and D. , Feature engineering in contextdependent deep neural networks for conversational speech transcription, Proc. of IEEE ASRU Workshop, pp.24-29, 2011.

F. Seide, G. Li, X. Chen, Y. , and D. , Feature engineering in contextdependent deep neural networks for conversational speech transcription, Proc. of IEEE ASRU Workshop, 2011.

M. Seltzer, D. Yu, W. , and Y. , An investigation of deep neural networks for noise robust speech recognition, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
DOI : 10.1109/ICASSP.2013.6639100

A. Senior and I. Lopez-moreno, Improving DNN speaker independence with I-vector inputs, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
DOI : 10.1109/ICASSP.2014.6853591

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.674.5253

R. Serizel and D. Giuliani, Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition, 2014 IEEE Spoken Language Technology Workshop (SLT), 2014.
DOI : 10.1109/SLT.2014.7078563

R. Serizel and D. Giuliani, Deep neural network adaptation for children's and adults' speech recognition, Proc. of the First Italian Computational Linguistics Conference, 2014.

S. Sivadas and H. Hermansky, On use of task independent training data in tandem feature extraction, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.541-545, 2004.
DOI : 10.1109/ICASSP.2004.1326042

S. Steidl, G. Stemmer, C. Hacker, E. Nöth, and H. Niemann, Improving Children???s Speech Recognition by HMM Interpolation with an Adults??? Speech Recognizer, Pattern Recognition, 25th DAGM Symposium, pp.600-607, 2003.
DOI : 10.1007/978-3-540-45243-0_76

A. Stolcke, F. Grezl, M. Hwang, X. Lei, N. Morgan et al., Cross-Domain and Cross-Language Portability of Acoustic Features Estimated Romain Serizel and Diego Giuliani by Multilayer Perceptrons, Proc. of IEEE ICASSP, pp.321-334, 2006.

P. Swietojanski, A. Ghoshal, R. , and S. , Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR, 2012 IEEE Spoken Language Technology Workshop (SLT), pp.246-251, 2012.
DOI : 10.1109/SLT.2012.6424230

S. Thomas, M. Seltzer, K. Church, and H. Hermansky, Deep neural network features and semi-supervised training for low resource speech recognition, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6704-6708, 2013.
DOI : 10.1109/ICASSP.2013.6638959

K. Veseì-y, L. Burget, and F. Grézl, Parallel training of neural networks for speech recognition, Text, Speech and Dialogue, pp.439-446, 2010.

S. Wegmann, D. Mcallaster, J. Orloff, and B. Peskin, Speaker Normalisation on Conversational Telephone Speech, Proc. of IEEE ICASSP, pp.339-341, 1996.

L. Welling, S. Kanthak, and H. Ney, Improved methods for vocal tract normalization, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258), pp.761-764, 1999.
DOI : 10.1109/ICASSP.1999.759780

J. G. Wilpon and C. N. Jacobsen, A study of speech recognition for children and the elderly, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.349-352, 1996.
DOI : 10.1109/ICASSP.1996.541104

M. Wöllmer, B. Schuller, A. Batliner, S. Steidl, and D. Seppi, Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario, ACM Transasctions Speech Language Processing, vol.712, issue.4, pp.1-1222, 2011.

P. Woodland, J. J. Odell, V. Valtchev, Y. , and S. J. , Large vocabulary continuous speech recognition using HTK, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing, pp.125-128, 1994.
DOI : 10.1109/ICASSP.1994.389562

K. Yochai and N. Morgan, GDNN: a gender-dependent neural network for continuous speech recognition, Proc. of Iternational Joint Conference on Neural Networks, pp.332-337, 1992.