C. Anderson, P. Teal, and M. Poletti, Spatially Robust Far-field Beamforming Using the von Mises(-Fisher) Distribution, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.12, pp.2189-2197, 2015.
DOI : 10.1109/TASLP.2015.2473684

X. Anguera, C. Wooters, and J. Hernando, Acoustic Beamforming for Speaker Diarization of Meetings, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.7, pp.2011-2023, 2007.
DOI : 10.1109/TASL.2007.902460

S. Araki, S. Makino, Y. Hinamoto, R. Mukai, T. Nishikawa et al., Equivalence between Frequency-Domain Blind Source Separation and Frequency-Domain Adaptive Beamforming for Convolutive Mixtures, EURASIP Journal on Advances in Signal Processing, vol.2003, issue.11, pp.1157-1166, 2003.
DOI : 10.1155/S1110865703305074

D. Bagchi, M. I. Mandel, Z. Wang, Y. He, A. Plummer et al., Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.496-503, 2015.
DOI : 10.1109/ASRU.2015.7404836

J. M. Baker, L. Deng, J. Glass, S. Khudanpur, C. Lee et al., Developments and directions in speech recognition and understanding, Part 1 [DSP Education], IEEE Signal Processing Magazine, vol.26, issue.3, pp.75-80, 2009.
DOI : 10.1109/MSP.2009.932166

H. Barfuss, C. Huemmer, A. Schwarz, and W. Kellermann, Robust coherence-based spectral enhancement for distant speech recognition, 2015.

J. Barker, R. Marxer, E. Vincent, and S. Watanabe, The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.504-511, 2015.
DOI : 10.1109/ASRU.2015.7404837

URL : https://hal.archives-ouvertes.fr/hal-01211376

J. Barker, R. Marxer, E. Vincent, and S. Watanabe, The third 'CHiME' speech separation and recognition challenge: Analysis and outcomes, Computer Speech & Language, 2016.
DOI : 10.1016/j.csl.2016.10.005

URL : https://hal.archives-ouvertes.fr/hal-01382108

J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green, The PASCAL CHiME speech separation and recognition challenge, Computer Speech & Language, vol.27, issue.3, pp.621-633, 2013.
DOI : 10.1016/j.csl.2012.10.004

URL : https://hal.archives-ouvertes.fr/hal-00646370

P. Bell, M. J. Gales, T. Hain, J. Kilgour, P. Lanchantin et al., The MGB challenge: Evaluating multi-genre broadcast media recognition, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.687-693, 2015.
DOI : 10.1109/ASRU.2015.7404863

A. Brutti and M. Matassoni, On the relationship between Early-to-Late Ratio of Room Impulse Responses and ASR performance in reverberant environments, Speech Communication, vol.76, pp.170-185, 2016.
DOI : 10.1016/j.specom.2015.09.004

J. Chen, Y. Wang, and D. Wang, Noise Perturbation Improves Supervised Speech Separation, Proc. 12th Int. Conf. on Latent Variable Analysis and Signal Separation, pp.83-90, 2015.
DOI : 10.1007/978-3-319-22482-4_10

I. Cohen, J. Benesty, and S. Gannot, Speech processing in modern communication: Challenges and perspectives, 2010.
DOI : 10.1007/978-3-642-11130-3

H. Cox, R. Zeskind, and M. Owen, Robust adaptive beamforming, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.35, issue.10, pp.1365-1376, 1987.
DOI : 10.1109/TASSP.1987.1165054

J. DiBiase, H. Silverman, and M. Brandstein, Robust Localization in Reverberant Rooms, Microphone arrays: signal processing techniques and applications, pp.157-180, 2001.
DOI : 10.1007/978-3-662-04619-7_8

S. Doclo and M. Moonen, Superdirective Beamforming Robust Against Microphone Mismatch, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.2, pp.617-631, 2007.
DOI : 10.1109/TASL.2006.881676

N. Q. Duong, E. Vincent, and R. Gribonval, Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1830-1840, 2010.
DOI : 10.1109/TASL.2010.2050716

URL : https://hal.archives-ouvertes.fr/inria-00435807

J. G. Fiscus, A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.347-354, 1997.
DOI : 10.1109/ASRU.1997.659110

C. Fox, Y. Liu, E. Zwyssig, and T. Hain, The Sheffield wargames corpus, Proc. Interspeech, pp.1116-1120, 2013.

Y. Fujita, R. Takashima, T. Homma, R. Ikeshita, Y. Kawaguchi et al., Unified ASR system using LGM-based source separation, noise-robust feature extraction, and word hypothesis selection, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.416-422, 2015.
DOI : 10.1109/ASRU.2015.7404825

M. J. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Computer Speech & Language, vol.12, issue.2, pp.75-98, 1998.
DOI : 10.1006/csla.1998.0043

S. Gannot, D. Burshtein, and E. Weinstein, Signal enhancement using beamforming and nonstationarity with applications to speech, IEEE Transactions on Signal Processing, vol.49, issue.8, pp.1614-1626, 2001.
DOI : 10.1109/78.934132

L. Gillick and S. J. Cox, Some statistical issues in the comparison of speech recognition algorithms, International Conference on Acoustics, Speech, and Signal Processing, pp.532-535, 1989.
DOI : 10.1109/ICASSP.1989.266481

J. H. Hansen, P. Angkititrakul, J. Plucienkowski, S. Gallant, and U. Yapanel, "CU-Move": Analysis & corpus development for interactive in-vehicle speech systems, Proc. Eurospeech, pp.2023-2026, 2001.

M. Harper, The automatic speech recognition in reverberant environments (ASpIRE) challenge, Proc. 2015 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp.547-554, 2015.

J. Heymann, L. Drude, A. Chinaev, and R. Haeb-Umbach, BLSTM supported GEV beamformer front-end for the 3rd CHiME challenge, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.444-451, 2015.
DOI : 10.1109/ASRU.2015.7404829

H. Hirsch and D. Pearce, The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, Proc. ASR2000, pp.181-188, 2000.

T. Hori, Z. Chen, H. Erdogan, J. R. Hershey, J. Le Roux et al., The MERL/SRI system for the 3rd CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.475-481, 2015.
DOI : 10.1109/ASRU.2015.7404833

A. Hurmalainen, J. F. Gemmeke, and T. Virtanen, Modelling non-stationary noise with spectral factorisation in automatic speech recognition, Computer Speech & Language, vol.27, issue.3, pp.763-779, 2013.
DOI : 10.1016/j.csl.2012.07.008

N. Kanda, R. Takeda, and Y. Obuchi, Elastic spectral distortion for low resource speech recognition with deep neural networks, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.309-314, 2013.
DOI : 10.1109/ASRU.2013.6707748

M. Karafiát, L. Burget, P. Matějka, O. Glembek, and J. Černocký, iVector-based discriminative adaptation for automatic speech recognition, Proc. 2011 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp.152-157, 2011.

P. Karanasou, Y. Wang, M. J. Gales, and P. C. Woodland, Adaptation of deep neural network acoustic models using factorised i-vectors, Proc. Interspeech, pp.2180-2184, 2014.

M. Kim and P. Smaragdis, Adaptive Denoising Autoencoders: A Fine-Tuning Scheme to Learn from Test Mixtures, Proc. 12th Int. Conf. on Latent Variable Analysis and Signal Separation, pp.100-107, 2015.
DOI : 10.1007/978-3-319-22482-4_12

K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, E. Habets et al., The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp.1-4, 2013.
DOI : 10.1109/WASPAA.2013.6701894

R. Kneser and H. Ney, Improved backing-off for M-gram language modeling, 1995 International Conference on Acoustics, Speech, and Signal Processing, pp.181-184, 1995.
DOI : 10.1109/ICASSP.1995.479394

K. Kumatani, T. Arakawa, K. Yamamoto, J. McDonough, B. Raj et al., Microphone array processing for distant speech recognition: Towards real-world deployment, Proc. APSIPA Annual Summit and Conf, pp.1-10, 2012.

L. Lamel, F. Schiel, A. Fourcin, J. Mariani, and H. Tillman, The translingual English database (TED), Proc. 3rd Int. Conf. on Spoken Language Processing (ICSLP), 1994.

J. Li, L. Deng, R. Haeb-Umbach, and Y. Gong, Robust Automatic Speech Recognition: A Bridge to Practical Applications, 2015.

A. Liutkus, D. Fitzgerald, and Z. Rafii, Scalable audio separation with light Kernel Additive Modelling, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.76-80, 2015.
DOI : 10.1109/ICASSP.2015.7177935

URL : https://hal.archives-ouvertes.fr/hal-01114890

M. I. Mandel, R. J. Weiss, and D. P. Ellis, Model-Based Expectation-Maximization Source Separation and Localization, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.2, pp.382-394, 2010.
DOI : 10.1109/TASL.2009.2029711

A. C. Martinez and B. Meyer, Mutual benefits of auditory spectrotemporal Gabor features and deep learning for the 3rd CHiME challenge, 2015.

X. Mestre and M. A. Lagunas, On diagonal loading for minimum variance beamformers, Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (IEEE Cat. No.03EX795), pp.459-462, 2003.
DOI : 10.1109/ISSPIT.2003.1341157

T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, Recurrent neural network based language model, Proc. Interspeech, pp.1045-1048, 2010.

V. Mitra, H. Franco, and M. Graciarena, Damped oscillator cepstral coefficients for robust speech recognition, Proc. Interspeech, pp.886-890, 2013.

V. Mitra, H. Franco, M. Graciarena, and D. Vergyri, Medium-duration modulation cepstral feature for robust speech recognition, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.1749-1753, 2014.
DOI : 10.1109/ICASSP.2014.6853898

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.587.296

N. Moritz, S. Gerlach, K. Adiloglu, J. Anemüller, B. Kollmeier et al., A CHiME-3 challenge system: Long-term acoustic features for noise robust automatic speech recognition, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.468-474, 2015.
DOI : 10.1109/ASRU.2015.7404832

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel Audio Source Separation With Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.9, pp.1652-1664, 2016.
DOI : 10.1109/TASLP.2016.2580946

URL : https://hal.archives-ouvertes.fr/hal-01163369

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel music separation with deep neural networks, 2016 24th European Signal Processing Conference (EUSIPCO), 2016.
DOI : 10.1109/EUSIPCO.2016.7760548

URL : https://hal.archives-ouvertes.fr/hal-01334614

Z. Pang and F. Zhu, Noise-robust ASR for the third 'CHiME' challenge exploiting time-frequency masking based multi-channel speech enhancement and recurrent neural network, 2015.

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek et al., The Kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding (ASRU), 2011.

A. Prudnikov, M. Korenevsky, and S. Aleinik, Adaptive beamforming and adaptive training of DNN acoustic models for enhanced multichannel noisy speech recognition, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.401-408, 2015.
DOI : 10.1109/ASRU.2015.7404823

M. Ravanelli, L. Cristoforetti, R. Gretter, M. Pellin, A. Sosi et al., The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.275-282, 2015.
DOI : 10.1109/ASRU.2015.7404805

S. Renals, T. Hain, and H. Bourlard, Interpretation of Multiparty Meetings: The AMI and AMIDA Projects, 2008 Hands-Free Speech Communication and Microphone Arrays, pp.115-118, 2008.
DOI : 10.1109/HSCMA.2008.4538700

A. Schwarz and W. Kellermann, Unbiased coherent-to-diffuse ratio estimation for dereverberation, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC), pp.6-10, 2014.
DOI : 10.1109/IWAENC.2014.6953306

M. L. Seltzer, D. Yu, and Y. Wang, An investigation of deep neural networks for noise robust speech recognition, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7398-7402, 2013.
DOI : 10.1109/ICASSP.2013.6639100

K. Shinoda, Speaker adaptation techniques for automatic speech recognition, Proc. APSIPA ASC, 2011.

K. U. Simmer, S. Fischer, and A. Wasiljeff, Suppression of coherent and incoherent noise using a microphone array, Annals of telecommunications, vol.78, pp.439-446, 1994.

S. Sivasankaran, A. A. Nugraha, E. Vincent, J. A. Morales-Cordovilla, S. Dalmia et al., Robust ASR using neural network based speech enhancement and feature simulation, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.482-489, 2015.
DOI : 10.1109/ASRU.2015.7404834

URL : https://hal.archives-ouvertes.fr/hal-01204553

M. Stolbov and S. Aleinik, Improvement of microphone array characteristics for speech capturing, Modern Applied Science, vol.9, issue.6, pp.310-319, 2015.

A. Stupakov, E. Hanusa, D. Vijaywargi, D. Fox, and J. Bilmes, The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments, Computer Speech & Language, vol.26, issue.1, pp.52-66, 2011.
DOI : 10.1016/j.csl.2010.12.003

P. Swietojanski and S. Renals, Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, 2014 IEEE Spoken Language Technology Workshop (SLT), pp.171-176, 2014.
DOI : 10.1109/SLT.2014.7078569

Y. Tachioka, H. Kanagawa, and J. Ishii, The overview of the MELCO ASR system for the third CHiME challenge, 2015.

E. Vincent, R. Gribonval, and M. Plumbley, Oracle estimators for the benchmarking of source separation algorithms, Signal Processing, vol.87, issue.8, pp.1933-1950, 2007.
DOI : 10.1016/j.sigpro.2007.01.016

URL : https://hal.archives-ouvertes.fr/inria-00544194

T. T. Vu, B. Bigot, and E. S. Chng, Speech enhancement using beamforming and non negative matrix factorization for robust speech recognition in the CHiME-3 challenge, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.423-429, 2015.
DOI : 10.1109/ASRU.2015.7404826

X. Wang, C. Wu, P. Zhang, Z. Wang, Y. Liu et al., Noise robust IOA/CAS speech separation and recognition system for the third 'CHiME' challenge, 2015.

Y. Wang, A. Narayanan, and D. Wang, On Training Targets for Supervised Speech Separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.12, pp.1849-1858, 2014.
DOI : 10.1109/TASLP.2014.2352935

F. Weninger, H. Erdogan, S. Watanabe, E. Vincent, J. Le Roux et al., Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR, Proc. 12th Int. Conf. on Latent Variable Analysis and Signal Separation, pp.91-99, 2015.
DOI : 10.1007/978-3-319-22482-4_11

URL : https://hal.archives-ouvertes.fr/hal-01163493

M. Wölfel and J. McDonough, Distant Speech Recognition, 2009.

Y. Xu, J. Du, L. Dai, and C. Lee, An Experimental Study on Speech Enhancement Based on Deep Neural Networks, IEEE Signal Processing Letters, vol.21, issue.1, pp.65-68, 2014.
DOI : 10.1109/LSP.2013.2291240

T. Yoshioka, N. Ito, M. Delcroix, A. Ogawa, K. Kinoshita et al., The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.436-443, 2015.
DOI : 10.1109/ASRU.2015.7404828

T. Yoshioka, T. Nakatani, M. Miyoshi, and H. Okuno, Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.1, pp.69-84, 2010.
DOI : 10.1109/TASL.2010.2045183

R. Zelinski, A microphone array with adaptive post-filtering for noise reduction in reverberant rooms, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing, pp.2578-2581, 1988.
DOI : 10.1109/ICASSP.1988.197172