T. Yoshioka, N. Ito, M. Delcroix, A. Ogawa, and K. Kinoshita, The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.436-443, 2015.
DOI : 10.1109/ASRU.2015.7404828

S. Doclo, A. Spriet, J. Wouters, and M. Moonen, Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction, Speech Communication, vol.49, issue.7-8, pp.636-656, 2007.
DOI : 10.1016/j.specom.2007.02.001

URL : https://hal.archives-ouvertes.fr/hal-00499178

M. Souden, J. Benesty, and S. Affes, On Optimal Frequency-Domain Multichannel Linear Filtering for Noise Reduction, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.2, pp.260-276, 2010.
DOI : 10.1109/TASL.2009.2025790

S. Doclo and M. Moonen, GSVD-based optimal filtering for single and multimicrophone speech enhancement, IEEE Transactions on Signal Processing, vol.50, issue.9, pp.2230-2244, 2002.
DOI : 10.1109/TSP.2002.801937

E. Warsitz and R. Haeb-Umbach, Blind Acoustic Beamforming Based on Generalized Eigenvalue Decomposition, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.5, pp.1529-1539, 2007.
DOI : 10.1109/TASL.2007.898454

R. Serizel, M. Moonen, B. van Dijk, and J. Wouters, Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.4, pp.785-799, 2014.
DOI : 10.1109/TASLP.2014.2304240

URL : https://hal.archives-ouvertes.fr/hal-01390918

J. Heymann, L. Drude, and R. Haeb-Umbach, Neural network based spectral mask estimation for acoustic beamforming, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.196-200, 2016.
DOI : 10.1109/ICASSP.2016.7471664

F. Weninger, F. Eyben, and B. Schuller, Single-channel speech separation with memory-enhanced recurrent neural networks, Proc. of ICASSP, pp.3709-3713, 2014.
DOI : 10.1109/icassp.2014.6854294

H. Erdogan, J. R. Hershey, S. Watanabe, and J. Le Roux, Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.708-712, 2015.
DOI : 10.1109/ICASSP.2015.7178061

F. Weninger, J. R. Hershey, J. Le Roux, and B. Schuller, Discriminatively trained recurrent neural networks for single-channel speech separation, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp.577-581, 2014.
DOI : 10.1109/GlobalSIP.2014.7032183

P. Pertilä and J. Nikunen, Distant speech separation using predicted time-frequency masks from spatial features, Speech Communication, vol.68, pp.97-106, 2015.
DOI : 10.1016/j.specom.2015.01.006

A. Schwarz, C. Huemmer, R. Maas, and W. Kellermann, Spatial diffuseness features for DNN-based speech recognition in noisy and reverberant environments, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4380-4384, 2015.
DOI : 10.1109/ICASSP.2015.7178798

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel Audio Source Separation With Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.9, pp.1652-1664, 2016.
DOI : 10.1109/TASLP.2016.2580946

URL : https://hal.archives-ouvertes.fr/hal-01163369

Z. Wang, E. Vincent, R. Serizel, and Y. Yan, Rank-1 constrained Multichannel Wiener Filter for speech recognition in noisy environments, Computer Speech & Language, vol.49, 2017.
DOI : 10.1016/j.csl.2017.11.003

URL : https://hal.archives-ouvertes.fr/hal-01634449

J. R. Hershey, Z. Chen, J. Le Roux, and S. Watanabe, Deep clustering: Discriminative embeddings for segmentation and separation, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.31-35, 2016.
DOI : 10.1109/ICASSP.2016.7471631

URL : http://arxiv.org/pdf/1508.04306

J. Daniel, Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia [Representation of acoustic fields, application to the transmission and reproduction of complex sound scenes in a multimedia context], Doctoral thesis, 2000.

N. Epain and C. Jin, Independent Component Analysis Using Spherical Microphone Arrays, Acta Acustica united with Acustica, vol.98, issue.1, pp.91-102, 2012.
DOI : 10.3813/AAA.918495

URL : http://www.cel.usyd.edu.au/carlab/UserFiles/File/SmaIcaActa.pdf

P. K. Wu, N. Epain, and C. Jin, A super-resolution beamforming algorithm for spherical microphone arrays using a compressed sensing approach, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.649-653, 2013.
DOI : 10.1109/ICASSP.2013.6637728

M. Baqué, Analyse de scène sonore multi-capteurs [Multi-sensor sound scene analysis], 2017.

L. Griffiths and C. Jim, An alternative approach to linearly constrained adaptive beamforming, IEEE Transactions on Antennas and Propagation, vol.30, issue.1, pp.27-34, 1982.
DOI : 10.1109/TAP.1982.1142739

L. F. Lamel, J. Gauvain, and M. Eskénazi, BREF, a large vocabulary spoken corpus for French, Proc. of Eurospeech, pp.505-508, 1991.

G. Gravier, J. Bonastre, E. Geoffrois, S. Galliano, K. McTait et al., The ESTER evaluation campaign for the rich transcription of French broadcast news, Proc. of LREC, 2004.

T. Dozat, Incorporating Nesterov momentum into Adam, Tech. Rep, 2015.

D. Povey, V. Peddinti, D. Galvez, P. Ghahremani, and V. Manohar, Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI, Interspeech 2016, pp.2751-2755, 2016.
DOI : 10.21437/Interspeech.2016-595

T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B. Juang, Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1717-1731, 2010.
DOI : 10.1109/TASL.2010.2052251