M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015.

F. Baradel, C. Wolf, and J. Mille, Human action recognition: Pose-based attention draws focus to hands, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp.604-613, 2017.
DOI : 10.1109/iccvw.2017.77

URL : https://hal.archives-ouvertes.fr/hal-01575390

J. Carreira and A. Zisserman, Quo vadis, action recognition? a new model and the kinetics dataset, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4724-4733, 2017.
DOI : 10.1109/cvpr.2017.502

URL : http://arxiv.org/pdf/1705.07750

G. Cheron, I. Laptev, and C. Schmid, P-cnn: Pose-based cnn features for action recognition, ICCV, 2015.
DOI : 10.1109/iccv.2015.368

URL : https://hal.archives-ouvertes.fr/hal-01187690

F. Chollet, Keras, 2015.

S. Das, M. Koperski, F. Bremond, and G. Francesca, Action recognition based on a mixture of rgb and depth based skeleton, AVSS, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01639504

J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan et al., Long-term recurrent convolutional networks for visual recognition and description, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

C. Feichtenhofer, A. Pinz, and A. Zisserman, Convolutional two-stream network fusion for video action recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1933-1941, 2016.

J. F. Hu, W. S. Zheng, J. Lai, and J. Zhang, Jointly learning heterogeneous features for rgb-d activity recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.39, issue.11, pp.2186-2200, 2017.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.

M. Koperski, P. Bilinski, and F. Bremond, 3D Trajectories for Action Recognition, ICIP, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01054949

M. Koperski and F. Bremond, Modeling spatial layout of features for real world scenario rgb-d action recognition, AVSS, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01399037

H. S. Koppula, R. Gupta, and A. Saxena, Learning human activities and object affordances from rgb-d videos, Int. J. Rob. Res, vol.32, issue.8, pp.951-970, 2013.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, NIPS, 2012.

J. Liu, A. Shahroudy, D. Xu, and G. Wang, Spatio-temporal lstm with trust gates for 3d human action recognition, Computer Vision-ECCV 2016, pp.816-833, 2016.

M. Liu, H. Liu, and C. Chen, Enhanced skeleton visualization for view invariant human action recognition, Pattern Recognition, vol.68, pp.346-362, 2017.

B. Mahasseni and S. Todorovic, Regularizing long short term memory with 3d human-skeleton sequences for action recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3054-3062, 2016.

O. Oreifej and Z. Liu, Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences, CVPR, 2013.

A. Shahroudy, J. Liu, T. Ng, and G. Wang, Ntu rgb+d: A large scale dataset for 3d human activity analysis, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/cvpr.2016.115

URL : http://arxiv.org/pdf/1604.02808

A. Shahroudy, T. T. Ng, Y. Gong, and G. Wang, Deep multimodal feature analysis for action recognition in rgb+d videos, IEEE Transactions on Pattern Analysis and Machine Intelligence, issue.99, pp.1-1, 2017.
DOI : 10.1109/tpami.2017.2691321

URL : http://arxiv.org/pdf/1603.07120

K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, Advances in neural information processing systems, pp.568-576, 2014.

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, J. Mach. Learn. Res, vol.15, issue.1, pp.1929-1958, 2014.

J. Sung, C. Ponce, B. Selman, and A. Saxena, Unstructured human activity detection from rgbd images, ICRA, 2012.

I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, Proceedings of the 27th International Conference on Neural Information Processing Systems, vol.2, pp.3104-3112, 2014.

H. Wang, A. Kläser, C. Schmid, and C. Liu, Action Recognition by Dense Trajectories, IEEE Conference on Computer Vision & Pattern Recognition, pp.3169-3176, 2011.
DOI : 10.1109/cvpr.2011.5995407

URL : https://hal.archives-ouvertes.fr/inria-00583818

H. Wang and C. Schmid, Action recognition with improved trajectories, ICCV, 2013.
DOI : 10.1109/iccv.2013.441

URL : https://hal.archives-ouvertes.fr/hal-00873267

J. Wang, Z. Liu, Y. Wu, and J. Yuan, Mining actionlet ensemble for action recognition with depth cameras, CVPR, 2012.

L. Xia and J. Aggarwal, Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera, CVPR, 2013.
DOI : 10.1109/cvpr.2013.365

URL : http://cvrc.ece.utexas.edu/lu/CVPR2013_Lu_20130918.pdf

J. Y.-H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals et al., Beyond Short Snippets: Deep Networks for Video Classification, 2015.

P. Zhang, C. Lan, J. Xing, W. Zeng, J. Xue et al., View adaptive recurrent neural networks for high performance human action recognition from skeleton data, The IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/iccv.2017.233

URL : http://arxiv.org/pdf/1703.08274

S. Zhang, X. Liu, and J. Xiao, On geometric features for skeleton-based action recognition using multilayer lstm networks, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp.148-157, 2017.
DOI : 10.1109/wacv.2017.24

M. Zolfaghari, G. L. Oliveira, N. Sedaghat, and T. Brox, Chained multi-stream networks exploiting pose, motion, and appearance for action classification and detection, 2017 IEEE International Conference on Computer Vision (ICCV), pp.2923-2932, 2017.
DOI : 10.1109/iccv.2017.316

URL : http://arxiv.org/pdf/1704.00616