S. Escalera, X. Baró, H. J. Escalante, and I. Guyon, ChaLearn looking at people: A review of events and resources, 2017 International Joint Conference on Neural Networks (IJCNN), pp.1594-1601, 2017.
DOI : 10.1109/IJCNN.2017.7966041
URL : https://hal.archives-ouvertes.fr/hal-01677944

Z. Zeng, M. Pantic, G. I. Roisman, and T. S. Huang, A survey of affect recognition methods, Proceedings of the ninth international conference on Multimodal interfaces, ICMI '07, pp.126-133, 2007.
DOI : 10.1145/1322192.1322216

I. Lüsi, J. C. Junior, J. Gorbova, X. Baró, S. Escalera et al., Joint Challenge on Dominant and Complementary Emotion Recognition Using Micro Emotion Features and Head-Pose Estimation: Databases, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp.809-813, 2017.
DOI : 10.1109/FG.2017.102

C. Loob, P. Rasti, I. Lüsi, J. C. Junior, X. Baró et al., Dominant and Complementary Multi-Emotional Facial Expression Recognition Using C-Support Vector Classification, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp.833-838, 2017.
DOI : 10.1109/FG.2017.106

H. J. Escalante, V. Ponce-López, J. Wan, M. A. Riegler, B. Chen et al., ChaLearn Joint Contest on Multimedia Challenges Beyond Visual Analysis: An overview, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.67-73, 2016.
DOI : 10.1109/ICPR.2016.7899609
URL : https://hal.archives-ouvertes.fr/hal-01381144

J. Wan, Y. Zhao, S. Zhou, I. Guyon, S. Escalera et al., ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp.56-64, 2016.
DOI : 10.1109/CVPRW.2016.100
URL : https://hal.archives-ouvertes.fr/hal-01381151

S. Escalera, X. Baró, J. Gonzalez, M. A. Bautista, M. Madadi et al., ChaLearn Looking at People Challenge 2014: Dataset and Results, Computer Vision - ECCV 2014 Workshops, pp.459-473, 2014.
DOI : 10.1007/978-3-319-16178-5_32
URL : https://hal.archives-ouvertes.fr/hal-01381162

X. Baró, J. Gonzalez, J. Fabian, M. A. Bautista, M. Oliu et al., ChaLearn Looking at People 2015 challenges: Action spotting and cultural event recognition, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp.1-9, 2015.
DOI : 10.1109/CVPRW.2015.7301329

J. Duan, J. Wan, S. Zhou, X. Guo, and S. Li, A unified framework for multi-modal isolated gesture recognition, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2017.

Y. Li, Q. Miao, K. Tian, Y. Fan, X. Xu et al., Large-scale gesture recognition with a fusion of RGB-D data based on the C3D model, ICPR, pp.25-30, 2016.

P. Wang, W. Li, S. Liu, Z. Gao, C. Tang et al., Large-scale Isolated Gesture Recognition using Convolutional Neural Networks, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.7-12, 2016.
DOI : 10.1109/ICPR.2016.7899599
URL : http://arxiv.org/pdf/1701.01814

G. Zhu, L. Zhang, L. Mei, J. Shao, J. Song et al., Large-scale Isolated Gesture Recognition using pyramidal 3D convolutional networks, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.19-24, 2016.
DOI : 10.1109/ICPR.2016.7899601

D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks, 2015 IEEE International Conference on Computer Vision (ICCV), pp.4489-4497, 2015.
DOI : 10.1109/ICCV.2015.510
URL : http://arxiv.org/pdf/1412.0767

L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin et al., Temporal Segment Networks: Towards Good Practices for Deep Action Recognition, ECCV, pp.20-36, 2016.
DOI : 10.1007/978-3-319-46484-8_2

H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, and S. Gould, Dynamic image networks for action recognition, CVPR, pp.3034-3042, 2016.

H. Fang, S. Xie, and C. Lu, RMPE: Regional Multi-person Pose Estimation, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.256

X. Shi, Z. Chen, H. Wang, D. Yeung, W. Wong et al., Convolutional LSTM network: A machine learning approach for precipitation nowcasting, NIPS, pp.802-810, 2015.

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS, pp.91-99, 2015.
DOI : 10.1109/TPAMI.2016.2577031

G. Zhu, L. Zhang, P. Shen, and J. Song, Multimodal gesture recognition using 3d convolution and convolutional lstm, IEEE Access, issue.4

X. Chai, Z. Liu, F. Yin, Z. Liu, and X. Chen, Two streams Recurrent Neural Networks for Large-Scale Continuous Gesture Recognition, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.31-36, 2016.
DOI : 10.1109/ICPR.2016.7899603

N. C. Camgoz, S. Hadfield, O. Koller, and R. Bowden, Using Convolutional 3D Neural Networks for User-independent continuous gesture recognition, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.49-54, 2016.
DOI : 10.1109/ICPR.2016.7899606

P. Wang, W. Li, S. Liu, Y. Zhang, Z. Gao et al., Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks, 2016 23rd International Conference on Pattern Recognition (ICPR), pp.13-18, 2016.
DOI : 10.1109/ICPR.2016.7899600

L. Pigou, A. Van-den-oord, S. Dieleman, M. Van-herreweghe, and J. Dambre, Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video, International Journal of Computer Vision, vol.86, issue.11, pp.1-10, 2015.
DOI : 10.1109/CVPR.2015.7298935

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.770-778, 2016.
DOI : 10.1109/CVPR.2016.90

D. Clevert, T. Unterthiner, and S. Hochreiter, Fast and accurate deep network learning by exponential linear units (ELUs), arXiv:1511.07289, 2015.

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, ICML, pp.448-456, 2015.

I. Ofodile, K. Kulkarni, C. A. Corneanu, S. Escalera, X. Baro et al., Automatic recognition of deceptive facial expressions of emotion, 2017.

D. E. King, Easily create high quality object detectors with deep learning, 2016.

V. Kazemi and J. Sullivan, One millisecond face alignment with an ensemble of regression trees, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.1867-1874, 2014.
DOI : 10.1109/CVPR.2014.241

P. Werner, F. Saxen, and A. Al-Hamadi, Handling Data Imbalance in Automatic Facial Action Intensity Estimation, Proceedings of the British Machine Vision Conference 2015, p.8, 2015.
DOI : 10.5244/C.29.124

H. Ng, V. D. Nguyen, V. Vonikakis, and S. Winkler, Deep Learning for Emotion Recognition on Small Datasets using Transfer Learning, Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, ICMI '15, 2015.
DOI : 10.1145/2818346.2830593

P. Werner, A. Al-Hamadi, K. Limbrecht-Ecklundt, S. Walter, S. Gruss et al., Automatic Pain Assessment with Facial Activity Descriptors, IEEE Transactions on Affective Computing, vol.8, issue.3, pp.286-299, 2017.
DOI : 10.1109/TAFFC.2016.2537327

T. Joachims, Optimizing search engines using clickthrough data, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, pp.133-142, 2002.
DOI : 10.1145/775047.775067
URL : http://www.cse.unsw.edu.au/~qzhang/papers/459.pdf

J. Tani, M. Ito, and Y. Sugita, Self-organization of distributedly represented multiple behavior schemata in a mirror system: reviews of robot experiments using RNNPB, Neural Networks, vol.17, issue.8-9, 2004.
DOI : 10.1016/j.neunet.2004.05.007

Z. Xu, Y. Yang, and A. G. Hauptmann, A discriminative CNN video representation for event detection, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298789

Y. Gao, O. Beijbom, N. Zhang, and T. Darrell, Compact Bilinear Pooling, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.41

W. Pei, T. Baltrušaitis, D. M. Tax, and L. Morency, Temporal Attention-Gated Model for Robust Sequence Classification, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.94