R. V. Babu and K. R. Ramakrishnan, Recognition of human actions using motion history information extracted from the compressed video, Image and Vision Computing, vol.22, issue.8, pp.597-607, 2004.
DOI : 10.1016/j.imavis.2003.11.004

D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black, A Naturalistic Open Source Movie for Optical Flow Evaluation, ECCV, Part IV, pp.611-625, 2012.
DOI : 10.1007/978-3-642-33783-3_44

K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman, The devil is in the details: an evaluation of recent feature encoding methods, Proceedings of the British Machine Vision Conference 2011, 2011.
DOI : 10.5244/C.25.76

N. Dalal, B. Triggs, and C. Schmid, Human Detection Using Oriented Histograms of Flow and Appearance, ECCV, 2006.
DOI : 10.1023/A:1008162616689

URL : https://hal.archives-ouvertes.fr/inria-00548587

P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, Behavior Recognition via Sparse Spatio-Temporal Features, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp.65-72, 2005.
DOI : 10.1109/VSPETS.2005.1570899

G. Farnebäck, Two-Frame Motion Estimation Based on Polynomial Expansion, SCIA, p.3, 2003.
DOI : 10.1007/3-540-45103-X_50

J. Liu, B. Kuipers, and S. Savarese, Recognizing human actions by attributes, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995353

M. Jain, H. Jégou, and P. Bouthemy, Better Exploiting Motion for Better Action Recognition, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.330

URL : https://hal.archives-ouvertes.fr/hal-00813014

H. Jégou, M. Douze, C. Schmid, and P. Pérez, Aggregating local descriptors into a compact image representation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3304-3311, 2010.
DOI : 10.1109/CVPR.2010.5540039

URL : https://hal.archives-ouvertes.fr/inria-00548637

H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez et al., Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9, 2012.
DOI : 10.1109/TPAMI.2011.235

A. Kläser, M. Marszałek, and C. Schmid, A Spatio-Temporal Descriptor Based on 3D-Gradients, Proceedings of the British Machine Vision Conference 2008, 2008.
DOI : 10.5244/C.22.99

URL : https://hal.archives-ouvertes.fr/inria-00514853

O. Kliper-Gross, Y. Gurovich, T. Hassner, and L. Wolf, Motion Interchange Patterns for Action Recognition in Unconstrained Videos, ECCV, pp.256-269, 2012.
DOI : 10.1007/978-3-642-33783-3_19

H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre, HMDB: A large video database for human motion recognition, 2011 International Conference on Computer Vision, pp.2556-2563, 2011.
DOI : 10.1109/ICCV.2011.6126543

I. Laptev, On Space-Time Interest Points, International Journal of Computer Vision, vol.64, issue.2-3, pp.107-123, 2005.
DOI : 10.1007/s11263-005-1838-7

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.1419

I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, Learning realistic human actions from movies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587756

URL : https://hal.archives-ouvertes.fr/inria-00548659

I. Laptev and P. Pérez, Retrieving actions in movies, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4409105

B. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, DARPA Image Understanding Workshop, pp.121-130, 1981.

M. Marszałek, I. Laptev, and C. Schmid, Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009.
DOI : 10.1109/CVPR.2009.5206557

P. Matikainen, M. Hebert, and R. Sukthankar, Trajectons: Action recognition through the motion analysis of tracked features, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2009.
DOI : 10.1109/ICCVW.2009.5457659

M. Muja and D. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, VISAPP, pp.331-340, 2009.

J. Niebles, C. Chen, and L. Fei-Fei, Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification, ECCV, 2010.
DOI : 10.1007/978-3-642-15552-9_29

F. Perronnin and J. Sánchez, High-dimensional signature compression for large-scale image classification, CVPR, 2012.

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV, pp.143-156, 2010.
DOI : 10.1007/978-3-642-15561-1_11

URL : https://hal.archives-ouvertes.fr/inria-00548630

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383172

R. Messing, C. Pal, and H. Kautz, Activity recognition using the velocity histories of tracked keypoints, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459154

K. Reddy and M. Shah, Recognizing 50 human action categories of web videos, Machine Vision and Applications, pp.1-11, 2012.
DOI : 10.1007/s00138-012-0450-4

J. Revaud, M. Douze, C. Schmid, and H. Jégou, Event Retrieval in Large Video Collections with Circulant Temporal Encoding, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.318

URL : https://hal.archives-ouvertes.fr/hal-00801714

M. Rodriguez, J. Ahmed, and M. Shah, Action MACH: a spatio-temporal Maximum Average Correlation Height filter for action recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587727

M. S. Ryoo and J. K. Aggarwal, UT-Interaction Dataset, ICPR contest on Semantic Description of Human Activities (SDHA), p.7, 2010.

S. Sadanand and J. J. Corso, Action bank: A high-level representation of activity in video, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.1234-1241, 2012.
DOI : 10.1109/CVPR.2012.6247806

C. Schüldt, I. Laptev, and B. Caputo, Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), pp.32-36, 2004.
DOI : 10.1109/ICPR.2004.1334462

P. Scovanner, S. Ali, and M. Shah, A 3-dimensional SIFT descriptor and its application to action recognition, Proceedings of the 15th International Conference on Multimedia, MULTIMEDIA '07, 2007.
DOI : 10.1145/1291233.1291311

F. Shi, E. Petriu, and R. Laganiere, Sampling Strategies for Real-Time Action Recognition, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.2595-2602, 2013.
DOI : 10.1109/CVPR.2013.335

H. Wang, A. Kläser, C. Schmid, and C. Liu, Dense Trajectories and Motion Boundary Descriptors for Action Recognition, International Journal of Computer Vision, vol.103, issue.1, 2013.
DOI : 10.1007/s11263-012-0594-8

URL : https://hal.archives-ouvertes.fr/hal-00725627

H. Wang and C. Schmid, Action Recognition with Improved Trajectories, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.441

URL : https://hal.archives-ouvertes.fr/hal-00873267

L. Yeffet and L. Wolf, Local Trinary Patterns for human action recognition, 2009 IEEE 12th International Conference on Computer Vision, pp.492-497, 2009.
DOI : 10.1109/ICCV.2009.5459201

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.9905

C. Yeo, P. Ahammad, K. Ramchandran, and S. S. Sastry, Compressed Domain Real-time Action Recognition, 2006 IEEE Workshop on Multimedia Signal Processing, 2006.
DOI : 10.1109/MMSP.2006.285263

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.111.1547

C. Yeo, P. Ahammad, K. Ramchandran, and S. S. Sastry, High-speed action recognition and localization in compressed domain videos, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.8, pp.1006-1015, 2008.

T. Yu, T. Kim, and R. Cipolla, Real-time Action Recognition by Spatiotemporal Semantic and Structural Forests, Proceedings of the British Machine Vision Conference 2010, 2010.
DOI : 10.5244/C.24.52

J. Zhang, M. Marszałek, S. Lazebnik, and C. Schmid, Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, International Journal of Computer Vision, vol.73, issue.2, pp.213-238, 2007.
DOI : 10.1007/s11263-006-9794-4

URL : https://hal.archives-ouvertes.fr/inria-00548574