M. Aharon, M. Elad, and A. Bruckstein, K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on signal processing, vol.54, issue.11, pp.4311-4322, 2006.
DOI : 10.1109/tsp.2006.881199

S. Argentieri, P. Danès, and P. Souères, A survey on sound source localization in robotics: From binaural to array processing methods, Computer Speech & Language, vol.34, issue.1, pp.87-112, 2015.
DOI : 10.1016/j.csl.2015.03.003
URL : https://hal.archives-ouvertes.fr/hal-01058575

M. Aytekin, C. F. Moss, and J. Z. Simon, A sensorimotor approach to sound localization, Neural Computation, vol.20, issue.3, pp.603-635, 2008.
DOI : 10.1162/neco.2007.12-05-094
URL : http://wexler.free.fr/library/files/aytekin%20(2007)%20a%20sensorimotor%20approach%20to%20sound%20localization.pdf

H. Barfuss and W. Kellermann, An adaptive microphone array topology for target signal extraction with humanoid robots, Acoustic Signal Enhancement (IWAENC), pp.16-20, 2014.
DOI : 10.1109/iwaenc.2014.6953315

E. Berglund and J. Sitte, Sound source localisation through active audition, IEEE/RSJ International Conference on, pp.653-658, 2005.
DOI : 10.1109/iros.2005.1545032

M. Bernard, S. Nguyen, P. Pirim, B. Gas, and J. Meyer, Phonotaxis behavior in the artificial rat psikharpax, International Symposium on Robotics and Intelligent Sensors, IRIS2010, pp.118-122, 2010.

M. Bernard, P. Pirim, A. de Cheveigné, and B. Gas, Sensorimotor learning of sound localization from an auditory evoked behavior, Robotics and Automation (ICRA), 2012 IEEE International Conference on, pp.91-96, 2012.

J. Braasch, S. Clapp, A. Parks, T. Pastore, and N. Xiang, A binaural model that analyses acoustic spaces and stereophonic reproduction systems by utilizing head rotations, The technology of binaural listening, pp.201-223, 2013.
DOI : 10.1007/978-3-642-37762-4_8

G. Bustamante, P. Danés, T. Forgue, and A. Podlubne, Towards information-based feedback control for binaural active localization, Acoustics, Speech and Signal Processing, pp.6325-6329, 2016.
DOI : 10.1109/icassp.2016.7472894
URL : https://hal.archives-ouvertes.fr/hal-01969304

M. Cooke, J. Barker, S. Cunningham, and X. Shao, An audio-visual corpus for speech perception and automatic speech recognition, The Journal of the Acoustical Society of America, vol.120, issue.5, pp.2421-2424, 2006.
DOI : 10.1121/1.2229005

G. Davis, S. Mallat, and M. Avellaneda, Adaptive greedy approximations, Constructive Approximation, vol.13, pp.57-98, 1997.
DOI : 10.1007/s003659900033

A. Deleforge, F. Forbes, and R. Horaud, Acoustic space learning for sound-source separation and localization on binaural manifolds, International Journal of Neural Systems, vol.25, p.1440003, 2015.
DOI : 10.1142/s0129065714400036
URL : https://hal.archives-ouvertes.fr/hal-00960796

A. Deleforge, F. Forbes, and R. Horaud, High-dimensional regression with gaussian mixtures and partially-latent response variables, Statistics and Computing, vol.25, pp.893-911, 2015.
DOI : 10.1007/s11222-014-9461-5
URL : https://hal.archives-ouvertes.fr/hal-00863468

A. Deleforge and R. Horaud, Learning the direction of a sound source using head motions and spectral features, INRIA, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00564708

A. Deleforge and R. Horaud, The cocktail party robot: Sound source separation and localisation with an active binaural head, Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction, pp.431-438, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00768668

A. Deleforge, R. Horaud, Y. Y. Schechner, and L. Girin, Co-localization of audio sources in images using binaural features and locally-linear regression, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol.23, issue.4, pp.718-731, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01112834

A. Deleforge and W. Kellermann, Phase-optimized k-svd for signal extraction from underdetermined multichannel sparse mixtures, Acoustics, Speech and Signal Processing, pp.355-359, 2015.

A. Deleforge and Y. Traonmilin, Phase unmixing: Multichannel source separation with magnitude constraints, Acoustics, Speech and Signal Processing, pp.161-165, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01372418

C. Evers, A. H. Moore, and P. Naylor, Acoustic simultaneous localization and mapping (a-slam) of a moving microphone array and its surrounding speakers, Acoustics, Speech and Signal Processing, pp.6-10, 2016.

K. Furukawa, K. Okutani, K. Nagira, T. Otsuka, K. Itoyama et al., Noise correlation matrix estimation for improving sound source localization by multirotor UAV, Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on, pp.3943-3948, 2013.

S. Gannot, E. Vincent, S. Markovich-golan, and A. Ozerov, A consolidated perspective on multimicrophone speech enhancement and source separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.4, pp.692-730, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01414179

C. Gaultier, S. Kataria, and A. Deleforge, Vast: The virtual acoustic space traveler dataset, International Conference on Latent Variable Analysis and Signal Separation, pp.68-79, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01416508

D. Gouaillier, V. Hugel, P. Blazevic, C. Kilner, J. Monceaux et al., Mechatronic design of nao humanoid, Robotics and Automation, 2009. ICRA'09. IEEE International Conference on, pp.769-774, 2009.

S. Haykin and Z. Chen, The cocktail party problem, Neural computation, vol.17, issue.9, pp.1875-1902, 2005.

P. M. Hofman and A. J. Van Opstal, Spectro-temporal factors in two-dimensional human sound localization, JASA, vol.103, issue.5, pp.2634-2648, 1998.

P. M. Hofman, J. G. A. Van Riswick, and A. J. Van Opstal, Relearning sound localization with new ears, Nature neuroscience, vol.1, issue.5, pp.417-421, 1998.

J. Hornstein, M. Lopes, J. Santos-Victor, and F. Lacerda, Sound localization for humanoid robots - building audio-motor maps based on the HRTF, IEEE/RSJ International Conference on, pp.1170-1176, 2006.

J. Huang, N. Ohnishi, and N. Sugie, Building ears for robots: sound localization and separation, Artificial Life and Robotics, vol.1, issue.4, pp.157-163, 1997.

D. Huggins-daines, M. Kumar, A. Chan, A. W. Black, M. Ravishankar et al., Pocketsphinx: A free, real-time continuous speech recognition system for hand-held devices, Proceedings. 2006 IEEE International Conference on, vol.1, 2006.

G. Ince, K. Nakadai, T. Rodemann, Y. Hasegawa, H. Tsujino et al., Ego noise suppression of a robot using template subtraction, IEEE/RSJ International Conference on, pp.199-204, 2009.

A. Ito, T. Kanayama, M. Suzuki, and S. Makino, Internal noise suppression for speech recognition by small robots, Ninth European Conference on Speech Communication and Technology, 2005.

M. Kato, H. Uematsu, M. Kashino, and T. Hirahara, The effect of head motion on the accuracy of sound localization, Acoustical science and technology, vol.24, issue.5, pp.315-317, 2003.

L. Kneip and C. Baumann, Binaural model for artificial spatial sound localization based on interaural time delays and movements of the interaural axis, The Journal of the Acoustical Society of America, vol.124, issue.5, pp.3108-3119, 2008.

M. Kreković, I. Dokmanić, and M. Vetterli, EchoSLAM: Simultaneous localization and mapping with acoustic echoes, Acoustics, Speech and Signal Processing, pp.11-15, 2016.

Y. Li and A. Ngom, Versatile sparse matrix factorization and its applications in high-dimensional biological data analysis, IAPR International Conference on Pattern Recognition in Bioinformatics, pp.91-101, 2013.

H. W. Löllmann, H. Barfuss, A. Deleforge, S. Meier, and W. Kellermann, Challenges in acoustic signal enhancement for human-robot communication, Speech Communication; 11. ITG Symposium; Proceedings of

H. W. Löllmann, A. H. Moore, P. A. Naylor, B. Rafaely et al., Microphone array signal processing for robot audition, Hands-free Speech Communications and Microphone Arrays (HSCMA), pp.51-55, 2017.

N. Ma, T. May, and G. Brown, Exploiting deep neural networks and head movements for robust binaural localization of multiple sources in reverberant environments, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.12, pp.2444-2453, 2017.

N. Ma, T. May, H. Wierstorf, and G. Brown, A machine-hearing system exploiting head movements for binaural sound localisation in reverberant conditions, Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp.2699-2703, 2015.

A. Magassouba, N. Bertin, and F. Chaumette, Sound-based control with two microphones, Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, pp.5568-5573, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01185841

T. May, N. Ma, and G. Brown, Robust localisation of multiple speakers exploiting head movements and multi-conditional training of binaural cues, Acoustics, Speech and Signal Processing, pp.2679-2683, 2015.

J. C. Middlebrooks and D. M. Green, Sound localization by human listeners, Annual Review of Psychology, vol.42, pp.135-159, 1991.

K. Nakadai, T. Lourens, H. G. Okuno, and H. Kitano, Active audition for humanoid, AAAI/IAAI, pp.832-839, 2000.

K. Nakadai, H. G. Okuno, and H. Kitano, Real-time sound source localization and separation for robot audition, Seventh International Conference on Spoken Language Processing, 2002.

K. Nakadai, H. G. Okuno, and H. Kitano, Robot recognizes three simultaneous speech by active audition, Robotics and Automation, 2003. Proceedings. ICRA'03. IEEE International Conference on, vol.1, pp.398-405, 2003.

K. Nakadai, T. Takahashi, H. G. Okuno, H. Nakajima et al., Design and implementation of robot audition system HARK - open source software for listening to three simultaneous speakers, Advanced Robotics, vol.24, issue.5-6, pp.739-761, 2010.

P. A. Naylor and N. D. Gaubitch, , 2010.

Q. V. Nguyen, F. Colas, E. Vincent, and F. Charpillet, Long-term robot motion planning for active sound source localization with Monte Carlo tree search, Hands-free Speech Communications and Microphone Arrays (HSCMA), pp.61-65, 2017.

J. K. O'Regan and A. Noë, A sensorimotor account of vision and visual consciousness, Behavioral and brain sciences, vol.24, issue.5, pp.939-973, 2001.

M. Otani, T. Hirahara, and S. Ise, Numerical study on source-distance dependency of head-related transfer functions, The Journal of the Acoustical Society of America, vol.125, issue.5, pp.3253-3261, 2009.

K. Otsuka, S. Araki, K. Ishizuka, M. Fujimoto, M. Heinrich et al., A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization, Proceedings of the 10th international conference on Multimodal interfaces, pp.257-264, 2008.

A. Ozerov and C. Févotte, Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.550-563, 2010.
DOI : 10.1109/tasl.2009.2031510

S. Perrett and W. Noble, The effect of head rotations on vertical plane sound localization, The Journal of the Acoustical Society of America, vol.102, issue.4, pp.2325-2332, 1997.

H. Poincaré, The foundations of science; Science and hypothesis, the value of science, science and method, 1905.

A. Portello, P. Danes, and S. Argentieri, Acoustic models and kalman filtering strategies for active binaural sound localization, Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, pp.137-142, 2011.
DOI : 10.1109/iros.2011.6094842

R. Prasad, H. Saruwatari, and K. Shikano, Robots that can hear, understand and talk, Advanced Robotics, vol.18, issue.5, pp.533-564, 2004.
DOI : 10.1163/156855304774195064

C. Rascon and I. Meza, Localization of sound sources in robotics: A review, Robotics and Autonomous Systems, vol.96, pp.184-210, 2017.

J. Sanchez-Riera, X. Alameda-Pineda, J. Wienke, A. Deleforge, S. Arias et al., Online multimodal speaker detection for humanoid robots, Humanoid Robots (Humanoids), 2012 12th IEEE-RAS International Conference on, pp.126-133, IEEE, 2012.
DOI : 10.1109/humanoids.2012.6651509
URL : https://hal.archives-ouvertes.fr/hal-00768764

H. Sawada, H. Kameoka, S. Araki, and N. Ueda, Multichannel extensions of non-negative matrix factorization with complex-valued data, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.5, pp.971-982, 2013.
DOI : 10.1109/tasl.2013.2239990
URL : http://www.brl.ntt.co.jp/people/kameoka/publications/Sawada2013IEEETrans05-published.pdf

A. Schmidt, A. Deleforge, and W. Kellermann, Ego-noise reduction using a motor data-guided multichannel dictionary, Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pp.1281-1286, 2016.
DOI : 10.1109/iros.2016.7759212
URL : https://hal.archives-ouvertes.fr/hal-01415723

A. Schmidt, H. Loellmann, and W. Kellermann, A novel ego-noise suppression algorithm for acoustic signal enhancement in autonomous systems, Acoustics, Speech and Signal Processing (ICASSP), 2018.

B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola et al., Estimating the support of a high-dimensional distribution, Neural computation, vol.13, issue.7, pp.1443-1471, 2001.

P. Smaragdis and J. C. Brown, Non-negative matrix factorization for polyphonic music transcription, Applications of Signal Processing to Audio and Acoustics, pp.177-180, 2003.
DOI : 10.1109/aspaa.2003.1285860
URL : http://www.merl.com/publications/docs/TR2003-139.pdf

J. W. Strutt, On the perception of the direction of sound, Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, vol.83, issue.559, pp.61-64, 1909.

R. Talmon, I. Cohen, and S. Gannot, Supervised source localization using diffusion kernels, Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.245-248, 2011.
DOI : 10.1109/aspaa.2011.6082267

W. R. Thurlow, J. W. Mangels, and P. Runge, Head movements during sound localization, The Journal of the Acoustical Society of America, vol.42, issue.2, pp.489-493, 1967.

V. Tourbabin, H. Barfuss, B. Rafaely, and W. Kellermann, Enhanced robot audition by dynamic acoustic sensing in moving humanoids, Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp.5625-5629, 2015.
DOI : 10.1109/icassp.2015.7179048

J. A. Tropp and A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE Transactions on information theory, vol.53, issue.12, pp.4655-4666, 2007.

J. Valin, S. Yamamoto, J. Rouat, F. Michaud et al., Robust recognition of simultaneous speech by a mobile robot, IEEE Transactions on Robotics, vol.23, issue.4, pp.742-752, 2007.
DOI : 10.1109/tro.2007.900612
URL : http://arxiv.org/pdf/1602.06442

E. Vincent, J. Barker, S. Watanabe, J. Le Roux, F. Nesta et al., The second CHiME speech separation and recognition challenge: An overview of challenge systems and outcomes, Automatic Speech Recognition and Understanding (ASRU), pp.162-167, 2013.

E. Vincent, R. Gribonval, and C. Févotte, Performance measurement in blind audio source separation, IEEE transactions on audio, speech, and language processing, vol.14, pp.1462-1469, 2006.
DOI : 10.1109/tsa.2005.858005
URL : https://hal.archives-ouvertes.fr/inria-00544230

T. Virtanen, Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria, IEEE transactions on audio, speech, and language processing, vol.15, pp.1066-1074, 2007.
DOI : 10.1109/tasl.2006.885253
URL : http://www.cs.tut.fi/sgn/arg/music/tuomasv/virtanen_taslp2007.pdf

H. Wallach, The role of head movements and vestibular and visual cues in sound localization, Journal of Experimental Psychology, vol.27, issue.4, p.339, 1940.

D. Wang and G. J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms and Applications, 2006.

L. Wang and A. Cavallaro, Ear in the sky: Ego-noise reduction for auditory micro aerial vehicles, Advanced Video and Signal Based Surveillance (AVSS), pp.152-158, 2016.
DOI : 10.1109/avss.2016.7738063

F. L. Wightman and D. J. Kistler, Resolution of front-back ambiguity in spatial hearing by listener and source movement, The Journal of the Acoustical Society of America, vol.105, issue.5, pp.2841-2853, 1999.

B. A. Wright and Y. Zhang, A review of learning with normal and altered sound-localization cues in human adults, International journal of audiology, vol.45, issue.S1, pp.92-98, 2006.

T. Xiao and Q. Liu, Finite difference computation of head-related transfer function for human hearing, The Journal of the Acoustical Society of America, vol.113, issue.5, pp.2434-2441, 2003.
DOI : 10.1121/1.1561495