M. J. Arcaro, P. F. Schade, J. L. Vincent, C. R. Ponce, and M. S. Livingstone, Seeing faces is necessary for face-domain formation, Nature Neuroscience, vol.20, issue.10, pp.1404-1412, 2017.

F. Badeig, Q. Pelorson, S. Arias, V. Drouard, I. Gebru et al., A distributed architecture for interacting with NAO, ACM ICMI, pp.385-386, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01201716

Y. Ban, X. Alameda-Pineda, F. Badeig, S. Ba, and R. Horaud, Tracking a varying number of people with a visually-controlled robotic head, IEEE/RSJ IROS, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01542987

M. Bennewitz, F. Faber, D. Joho, M. Schreiber, and S. Behnke, Towards a humanoid museum guide robot that interacts with multiple persons, IEEE-RAS, pp.418-423, 2005.

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, IEEE CVPR, 2017.

F. Cruz, G. I. Parisi, J. Twiefel, and S. Wermter, Multi-modal integration of dynamic audiovisual patterns for an interactive reinforcement learning scenario, IEEE/RSJ IROS, pp.759-766, 2016.

I. Gebru, S. Ba, X. Li, and R. Horaud, Audiovisual speaker diarization based on spatiotemporal Bayesian fusion, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01413403

A. Ghadirzadeh, J. Bütepage, A. Maki, D. Kragic, and M. Björkman, A sensorimotor reinforcement learning framework for physical Human-Robot Interaction, IEEE/RSJ IROS, pp.2682-2688, 2016.

I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, 2016.

M. A. Goodrich and A. C. Schultz, Human-robot Interaction: A Survey, Foundations and Trends in Human-Computer Interaction, vol.1, issue.3, pp.203-275, 2007.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, 1997.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, ICLR, 2014.

J. Kober, J. A. Bagnell, and J. Peters, Reinforcement learning in robotics: A survey, IJRR, 2013.

X. Li, L. Girin, F. Badeig, and R. Horaud, Reverberant sound localization with a robot head based on direct-path relative transfer function, IEEE/RSJ IROS, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01349771

X. Li, L. Girin, R. Horaud, and S. Gannot, Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization, IEEE/ACM TASLP, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01413417

S. Ljungblad, J. Kotrbova, M. Jacobsson, H. Cramer, and K. Niechwiadowicz, Hospital Robot at Work: Something Alien or an Intelligent Colleague?, ACM CSCW, pp.177-186, 2012.

N. Mitsunaga, C. Smith, T. Kanda, H. Ishiguro, and N. Hagita, Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning, JRSJ, 2006.

V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou et al., Playing Atari With Deep Reinforcement Learning, NIPS Deep Learning Workshop, 2013.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, 2015.

S. Pourmehr, J. Thomas, J. Bruce, J. Wawerla, and R. Vaughan, Robust sensor fusion for finding HRI partners in a crowd, IEEE ICRA, pp.3272-3278, 2017.

A. H. Qureshi, Y. Nakamura, Y. Yoshikawa, and H. Ishiguro, Robot gains social intelligence through multimodal deep reinforcement learning, IEEE Humanoids, pp.745-751, 2016.

A. H. Qureshi, Y. Nakamura, Y. Yoshikawa, and H. Ishiguro, Show, attend and interact: Perceivable human-robot social interaction through neural attention Q-network, IEEE ICRA, pp.1639-1645, 2017.

M. Rothbucher, C. Denk, and K. Diepold, Robotic gaze control using reinforcement learning, IEEE HAVE, 2012.

A. Sauppé and B. Mutlu, The Social Impact of a Robot Co-Worker in Industrial Settings, ACM CHI, pp.3613-3622, 2015.

G. Skantze, A. Hjalmarsson, and C. Oertel, Turn-taking, feedback and joint attention in situated human-robot interaction, Speech Communication, vol.65, pp.50-66, 2014.

R. S. Sutton and A. G. Barto, Introduction to Reinforcement Learning, 1998.

A. L. Thomaz, G. Hoffman, and C. Breazeal, Reinforcement learning with human teachers: Understanding how people want to teach robots, IEEE RO-MAN, pp.352-357, 2006.

M. Vázquez, A. Steinfeld, and S. E. Hudson, Maintaining awareness of the focus of attention of a conversation: A robot-centric reinforcement learning approach, IEEE RO-MAN, 2016.

C. J. Watkins and P. Dayan, Q-learning, Mach. Learn., 1992.

R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Mach. Learn., 1992.

S. Yun, A gaze control of socially interactive robots in multiple-person interaction, Robotica, vol.35, issue.11, pp.2122-2138, 2017.