V. Pulkki, S. Delikaris-Manias, and A. Politis, Parametric Time-Frequency Domain Spatial Audio, 2017.

E. Vincent, T. Virtanen, and S. Gannot, Audio Source Separation and Speech Enhancement, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01120685

C. Knapp and G. Carter, The generalized correlation method for estimation of time delay, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.24, issue.4, pp.320-327, 1976.
DOI : 10.1109/TASSP.1976.1162830

M. S. Brandstein and H. F. Silverman, A robust method for speech signal time-delay estimation in reverberant rooms, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.375-378, 1997.
DOI : 10.1109/ICASSP.1997.599651
URL : http://cvrr.ucsd.edu/ece285/papers/tdoa_brandstein.pdf

R. Schmidt, Multiple emitter location and signal parameter estimation, IEEE Transactions on Antennas and Propagation, vol.34, issue.3, pp.276-280, 1986.
DOI : 10.1109/TAP.1986.1143830

M. A. Gerzon, Periphony: with-height sound reproduction, J. Audio Eng. Soc, vol.21, issue.1, pp.2-10, 1973.

J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties, MPEG-H 3D Audio: The New Standard for Coding of Immersive Spatial Audio, IEEE Journal of Selected Topics in Signal Processing, vol.9, issue.5, pp.770-779, 2015.
DOI : 10.1109/JSTSP.2015.2411578

V. Pulkki, Spatial sound reproduction with directional audio coding, J. Audio Eng. Soc, vol.55, issue.6, pp.503-516, 2007.

D. P. Jarrett, E. A. Habets, and P. A. Naylor, 3D source localization in the spherical harmonic domain using a pseudointensity vector, Proc. of EUSIPCO, pp.442-446, 2010.

H. Khaddour, J. Schimmel, and M. Trzos, Three-dimensional sound source localization using B-format signals, Int. J. of Advances in Telecommunications, Electrotechnics, Signals and Systems, vol.2, issue.2, 2013.
DOI : 10.11601/ijates.v2i2.35
URL : http://www.ijates.org/index.php/ijates/article/download/35/50

C. Dimoulas, G. Kalliris, K. Avdelidis, and G. Papanikolaou, Improved localization of sound sources using multi-band processing of ambisonic components, Proc. of AES Conv, pp.1-11, 2009.

O. Nadiri and B. Rafaely, Localization of Multiple Speakers under High Reverberation using a Spherical Microphone Array and the Direct-Path Dominance Test, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.10, pp.1494-1505, 2014.
DOI : 10.1109/TASLP.2014.2337846

M. Baqué, Analyse de scène sonore multi-capteurs, 2017.

N. Ma, G. J. Brown, and T. May, Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions, Proc. of Interspeech, pp.3302-3306, 2015.
DOI : 10.1109/taslp.2017.2750760
URL : http://orbit.dtu.dk/files/139685925/taslp_ma_2750760_proof_color.pdf

X. Xiao, A learning-based approach to direction of arrival estimation in noisy and reverberant environments, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2814-2818, 2015.
DOI : 10.1109/ICASSP.2015.7178484

R. Takeda and K. Komatani, Sound source localization based on deep neural networks with directional activate function exploiting phase information, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.405-409, 2016.
DOI : 10.1109/ICASSP.2016.7471706

V. Varanasi, R. Serizel, and E. Vincent, DNN based robust DOA estimation in reverberant, noisy and multisource environment, 2018.

S. Chakrabarty and E. A. Habets, Broadband DOA estimation using convolutional neural networks trained with noise signals, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.136-140, 2017.
DOI : 10.1109/WASPAA.2017.8170010
URL : http://arxiv.org/pdf/1705.00919

S. Adavanne, A. Politis, and T. Virtanen, Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network, 2017.

F. Jacobsen, A note on instantaneous and time-averaged active and reactive sound intensity, Journal of Sound and Vibration, vol.147, issue.3, pp.489-496, 1991.
DOI : 10.1016/0022-460X(91)90496-7

T. Dozat, Incorporating Nesterov momentum into Adam, Tech. Rep, 2015.

W. He, P. Motlicek, and J. Odobez, Deep neural networks for multiple speaker detection and localization, Proc. of ICRA, 2018.

J. B. Allen and D. A. Berkley, Image method for efficiently simulating small-room acoustics, The Journal of the Acoustical Society of America, vol.65, issue.4, pp.943-950, 1979.
DOI : 10.1121/1.382599

E. A. Habets, Room impulse response generator, Tech. Rep, 2006.

L. F. Lamel, J. Gauvain, and M. Eskénazi, BREF, a large vocabulary spoken corpus for French, Proc. of Eurospeech, pp.505-508, 1991.

E. Vincent, S. Araki, and P. Bofill, The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation, Proc. of ICA, pp.734-741, 2009.
DOI : 10.1109/ICASSP.2009.4959531
URL : https://hal.archives-ouvertes.fr/inria-00544168

X. L. Li and T. Adali, A novel entropy estimator and its application to ICA, 2009 IEEE International Workshop on Machine Learning for Signal Processing, pp.1-6, 2009.
DOI : 10.1109/MLSP.2009.5306208