J. Chen, J. Benesty, and Y. Huang, Time Delay Estimation in Room Acoustic Environments: An Overview, EURASIP Journal on Advances in Signal Processing, vol.2006, 2006.
DOI : 10.1155/ASP/2006/26503

URL : https://doi.org/10.1155/asp/2006/26503

C. Knapp and G. C. Carter, The generalized correlation method for estimation of time delay, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.24, issue.4, pp.320-327, 1976.
DOI : 10.1109/TASSP.1976.1162830

Y. Huang and J. Benesty, Adaptive Multichannel Time Delay Estimation Based on Blind System Identification for Acoustic Source Localization, Adaptive Signal Processing, pp.227-247, 2003.
DOI : 10.1007/978-3-662-11028-7_8

S. Doclo and M. Moonen, Robust Adaptive Time Delay Estimation for Speaker Localization in Noisy and Reverberant Acoustic Environments, EURASIP Journal on Advances in Signal Processing, vol.2003, issue.11, pp.1110-1124, 2003.
DOI : 10.1155/S111086570330602X

URL : https://doi.org/10.1155/s111086570330602x

T. G. Dvorkind and S. Gannot, Time difference of arrival estimation of speech source in a noisy and reverberant environment, Signal Processing, vol.85, issue.1, pp.177-204, 2005.
DOI : 10.1016/j.sigpro.2004.09.014

K. Kowalczyk, E. A. Habets, W. Kellermann, and P. A. Naylor, Blind System Identification Using Sparse Learning for TDOA Estimation of Room Reflections, IEEE Signal Processing Letters, vol.20, issue.7, pp.653-656, 2013.
DOI : 10.1109/LSP.2013.2261059

O. Yilmaz and S. Rickard, Blind Separation of Speech Mixtures via Time-Frequency Masking, IEEE Transactions on Signal Processing, vol.52, issue.7, pp.1830-1847, 2004.
DOI : 10.1109/TSP.2004.828896

URL : http://www-sigproc.eng.cam.ac.uk/research/reading%20group/material/yilmaz%20rickard%20-%202004%20-%20blind%20separation%20of%20speech%20mixtures%20via%20time-frequencymasking.pdf

Y. Dorfan and S. Gannot, Tree-Based Recursive Expectation-Maximization Algorithm for Localization of Acoustic Sources, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.10, pp.1692-1703, 2015.
DOI : 10.1109/TASLP.2015.2444654

X. Li, L. Girin, R. Horaud, and S. Gannot, Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.10, pp.1997-2012, 2017.
DOI : 10.1109/TASLP.2017.2740001

URL : https://hal.archives-ouvertes.fr/hal-01413417

Y. Avargel and I. Cohen, System Identification in the Short-Time Fourier Transform Domain With Crossband Filtering, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.4, pp.1305-1319, 2007.
DOI : 10.1109/TASL.2006.889720

R. Talmon, I. Cohen, and S. Gannot, Relative Transfer Function Identification Using Convolutive Transfer Function Approximation, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.4, pp.546-555, 2009.
DOI : 10.1109/TASL.2008.2009576

N. Roman and D. Wang, Binaural Tracking of Multiple Moving Sources, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.4, pp.728-739, 2008.
DOI : 10.1109/TASL.2008.918978

C. Evers, A. H. Moore, P. A. Naylor, J. Sheaffer, and B. Rafaely, Bearing-only acoustic tracking of moving speakers for robot audition, 2015 IEEE International Conference on Digital Signal Processing (DSP), pp.1206-1210, 2015.
DOI : 10.1109/ICDSP.2015.7252071

URL : http://www.commsp.ee.ic.ac.uk/%7Esap/uploads/publications/Evers2015.pdf

Y. Ban, L. Girin, X. Alameda-Pineda, and R. Horaud, Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), 2017.
DOI : 10.1109/ICCVW.2017.60

URL : https://hal.archives-ouvertes.fr/hal-01577965

S. Ba, X. Alameda-Pineda, A. Xompero, and R. Horaud, An on-line variational Bayesian model for multi-person tracking from cluttered scenes, Computer Vision and Image Understanding, vol.153, pp.64-76, 2016.
DOI : 10.1016/j.cviu.2016.07.006

URL : https://hal.archives-ouvertes.fr/hal-01349763

I. Gebru, S. Ba, X. Li, and R. Horaud, Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, issue.5, 2017.
DOI : 10.1109/TPAMI.2017.2648793

URL : https://hal.archives-ouvertes.fr/hal-01413403

O. Schwartz and S. Gannot, Speaker Tracking Using Recursive EM Algorithms, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.2, pp.392-402, 2014.
DOI : 10.1109/TASLP.2013.2292361

M. I. Mandel, R. J. Weiss, and D. P. Ellis, Model-Based Expectation-Maximization Source Separation and Localization, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.2, pp.382-394, 2010.
DOI : 10.1109/TASL.2009.2029711

URL : http://www.ee.columbia.edu/%7Eronw/pubs/taslp09-messl.pdf

G. Xu, H. Liu, L. Tong, and T. Kailath, A least-squares approach to blind channel identification, IEEE Transactions on Signal Processing, vol.43, issue.12, pp.2982-2993, 1995.

X. Li, L. Girin, R. Horaud, and S. Gannot, Estimation of the Direct-Path Relative Transfer Function for Supervised Sound-Source Localization, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.11, pp.2171-2186, 2016.
DOI : 10.1109/TASLP.2016.2598319

URL : https://hal.archives-ouvertes.fr/hal-01349691

X. Li, L. Girin, R. Horaud, and S. Gannot, Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.320-324, 2015.
DOI : 10.1109/ICASSP.2015.7177983

URL : https://hal.archives-ouvertes.fr/hal-01119186

X. Li, L. Girin, F. Badeig, and R. Horaud, Reverberant sound localization with a robot head based on direct-path relative transfer function, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.2819-2826, 2016.
DOI : 10.1109/IROS.2016.7759437

URL : https://hal.archives-ouvertes.fr/hal-01349771

J. H. DiBiase, H. F. Silverman, and M. S. Brandstein, Robust Localization in Reverberant Rooms, Microphone Arrays, pp.157-180, 2001.
DOI : 10.1007/978-3-662-04619-7_8