E. G. Freedman and D. L. Sparks, Eye-head coordination during head-unrestrained gaze shifts in rhesus monkeys, Journal of Neurophysiology, 1997.

E. G. Freedman, Coordination of the eyes and head during visual orienting, Experimental Brain Research, vol.77, issue.4, 2008.

D. B. Jayagopi, The vernissage corpus: A multimodal human-robot-interaction dataset, IDIAP, Tech. Rep, 2012.

M. J. Marin-Jimenez, A. Zisserman, M. Eichner, and V. Ferrari, Detecting People Looking at Each Other in Videos, International Journal of Computer Vision, vol.25, issue.1, 2014.
DOI : 10.1007/11526346_26

L. H. Yu and M. Eizenman, A new methodology for determining point-of-gaze in head-mounted eye tracking systems, IEEE Transactions on Biomedical Engineering, vol.51, 2004.

T. Toyama, T. Kieninger, F. Shafait, and A. Dengel, Gaze guided object recognition using a head-mounted eye tracker, Proceedings of the Symposium on Eye Tracking Research and Applications, ETRA '12, 2012.
DOI : 10.1145/2168556.2168570

A. K. Hong, J. Pelz, and J. Cockburn, Lightweight, low-cost, side-mounted mobile eye tracking system, IEEE WNYIPW, 2012.

K. Kurzhals, M. Hlawatsch, C. Seeger, and D. Weiskopf, Visual Analytics for Mobile Eye Tracking, IEEE Transactions on Visualization and Computer Graphics, vol.23, issue.1, pp.301-310, 2017.
DOI : 10.1109/TVCG.2016.2598695

P. Smith, M. Shah, and N. da Vitoria Lobo, Determining driver visual attention with one camera, IEEE Transactions on Intelligent Transportation Systems, vol.4, issue.4, 2003.
DOI : 10.1109/TITS.2003.821342

K. Krafka, A. Khosla, P. Kellnhofer, H. Kannan, S. Bhandarkar et al., Eye Tracking for Everyone, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.239

Y. Matsumoto, T. Ogasawara, and A. Zelinsky, Behavior recognition based on head pose and gaze direction measurement, Proceedings. 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000) (Cat. No.00CH37113), 2000.
DOI : 10.1109/IROS.2000.895285

T. Ohno and N. Mukawa, A free-head, simple calibration, gaze tracking system that enables gaze-based interaction, Proceedings of the Eye Tracking Research & Applications Symposium, ETRA'2004, 2004.
DOI : 10.1145/968363.968387

F. Lu, Y. Sugano, T. Okabe, and Y. Sato, Adaptive Linear Regression for Appearance-Based Gaze Estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.10, 2014.
DOI : 10.1109/TPAMI.2014.2313123

F. Lu, T. Okabe, Y. Sugano, and Y. Sato, Learning gaze biases with head motion for head pose-free gaze estimation, Image and Vision Computing, vol.32, issue.3, 2014.
DOI : 10.1016/j.imavis.2014.01.005

E. Murphy-Chutorian and M. Trivedi, Head Pose Estimation in Computer Vision: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.4, 2009.
DOI : 10.1109/TPAMI.2008.106

X. Zabulis, T. Sarmis, and A. A. Argyros, 3D head pose estimation from multiple distant views, Proceedings of the British Machine Vision Conference 2009, 2009.
DOI : 10.5244/C.23.118

I. Chamveha, Y. Sugano, D. Sugimura, T. Siriteerakul, T. Okabe et al., Head direction estimation from low resolution images with scene adaptation, Computer Vision and Image Understanding, vol.117, issue.10, 2013.
DOI : 10.1016/j.cviu.2013.06.005

A. K. Rajagopal, R. Subramanian, E. Ricci, R. L. Vieriu, O. Lanz et al., Exploring Transfer Learning Approaches for Head Pose Classification from Multi-view Surveillance Images, International Journal of Computer Vision, vol.30, issue.7, 2014.

Y. Yan, E. Ricci, R. Subramanian, G. Liu, O. Lanz et al., A Multi-Task Learning Framework for Head Pose Estimation under Target Motion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.6, 2016.
DOI : 10.1109/TPAMI.2015.2477843

Z. Qin and C. R. Shelton, Social Grouping for Multi-Target Tracking and Head Pose Estimation in Video, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.10, 2016.
DOI : 10.1109/TPAMI.2015.2505292

J. S. Stahl, Amplitude of human head movements associated with horizontal saccades, Experimental Brain Research, vol.126, issue.1, 1999.
DOI : 10.1007/s002210050715

H. H. Goossens and A. J. Van Opstal, Human eye-head coordination in two dimensions under different sensorimotor conditions, Experimental Brain Research, vol.114, issue.3, 1997.
DOI : 10.1007/PL00005663

R. Stiefelhagen and J. Zhu, Head orientation and gaze direction in meetings, CHI '02 Extended Abstracts on Human Factors in Computing Systems, CHI '02, 2002.
DOI : 10.1145/506443.506634

P. Lanillos, J. F. Ferreira, and J. Dias, A Bayesian hierarchy for robust gaze estimation in human-robot interaction, International Journal of Approximate Reasoning, vol.87, 2017.
DOI : 10.1016/j.ijar.2017.04.007

S. Asteriadis, K. Karpouzis, and S. Kollias, Visual Focus of Attention in Non-calibrated Environments using Gaze Estimation, International Journal of Computer Vision, vol.13, issue.2, 2014.

S. Ba and J. Odobez, Recognizing Visual Focus of Attention From Head Pose in Natural Meetings, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol.39, issue.1, 2009.
DOI : 10.1109/TSMCB.2008.927274

S. Sheikhi and J. Odobez, Recognizing the Visual Focus of Attention for Human Robot Interaction, Human Behavior Understanding Workshop, 2012.
DOI : 10.1007/978-3-642-34014-7_9

Z. Yucel, A. A. Salah, C. Mericli, T. Mericli, R. Valenti et al., Joint Attention by Gaze Interpolation and Saliency, IEEE Transactions on Cybernetics, vol.43, issue.3, 2013.
DOI : 10.1109/TSMCB.2012.2216979

K. Otsuka, J. Yamato, and Y. Takemae, Conversation scene analysis with dynamic bayesian network based on visual head tracking, IEEE ICME, 2006.

S. Duffner and C. Garcia, Visual Focus of Attention Estimation With Unsupervised Incremental Learning, IEEE Transactions on Circuits and Systems for Video Technology, 2015.
DOI : 10.1109/TCSVT.2015.2501920

URL : https://hal.archives-ouvertes.fr/hal-01153969

S. Sheikhi and J. Odobez, Combining dynamic head pose-gaze mapping with the robot conversational state for attention recognition in human-robot interactions, Pattern Recognition Letters, vol.66, 2015.
DOI : 10.1016/j.patrec.2014.10.002

B. Massé, S. Ba, and R. Horaud, Simultaneous estimation of gaze direction and visual focus of attention for multi-person-to-robot interaction, 2016 IEEE International Conference on Multimedia and Expo (ICME), 2016.
DOI : 10.1109/ICME.2016.7552986

K. P. Murphy, Switching Kalman filters, UC Berkeley, Tech. Rep, 1998.

D. Simon, Kalman filtering with state constraints: a survey of linear and nonlinear algorithms, IET Control Theory & Applications, vol.4, issue.8, 2010.
DOI : 10.1049/iet-cta.2009.0032

C. M. Bishop, Pattern Recognition and Machine Learning, 2006.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 2001.
DOI : 10.1109/CVPR.2001.990517

S. Bae and K. Yoon, Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.159

V. Drouard, R. Horaud, A. Deleforge, S. Ba, and G. Evangelidis, Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions, IEEE Transactions on Image Processing, vol.26, issue.3, 2017.
DOI : 10.1109/TIP.2017.2654165

URL : https://hal.archives-ouvertes.fr/hal-01413406

A. Patron-Perez, M. Marszałek, A. Zisserman, and I. D. Reid, High five: Recognising human interactions in TV shows, British Machine Vision Conference, 2010.

X. Li, L. Girin, R. Horaud, and S. Gannot, Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.10, pp.1997-2012, 2017.
DOI : 10.1109/TASLP.2017.2740001

URL : https://hal.archives-ouvertes.fr/hal-01413417

I. Gebru, S. Ba, X. Li, and R. Horaud, Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
DOI : 10.1109/TPAMI.2017.2648793

URL : https://hal.archives-ouvertes.fr/hal-01413403