YouTube-8M: A large-scale video classification benchmark, 2016. ,
Grammar of the film language, 1991. ,
Midwest and its children: The psychological ecology of an American town. Row, Peterson and Company, 1954. ,
Actions as space-time shapes, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, 2005. ,
DOI : 10.1109/ICCV.2005.28
URL : http://www.wisdom.weizmann.ac.il/~yelenag/spaceTimeActionsTPAMI2007.pdf
ActivityNet: A large-scale video benchmark for human activity understanding, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7298698
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
DOI : 10.1109/CVPR.2017.502
HICO: A Benchmark for Recognizing Human-Object Interactions in Images, 2015 IEEE International Conference on Computer Vision (ICCV), p.8, 2015. ,
DOI : 10.1109/ICCV.2015.122
Word association norms, mutual information, and lexicography, Computational Linguistics, vol.16, issue.1, 1990. ,
Finding action tubes, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7298676
The "Something Something" Video Database for Learning and Evaluating Visual Common Sense, 2017 IEEE International Conference on Computer Vision (ICCV), p.8, 2017. ,
DOI : 10.1109/ICCV.2017.622
Visual semantic role labeling. CoRR ,
Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.90
The devil is in the tails: Fine-grained classification in the wild, 2017. ,
Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos, 2017 IEEE International Conference on Computer Vision (ICCV), 2017. ,
DOI : 10.1109/ICCV.2017.620
Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
DOI : 10.1109/CVPR.2017.351
The THUMOS challenge on action recognition for videos "in the wild". ,
FlowNet 2.0: Evolution of optical flow estimation with deep networks, CVPR, 2017. ,
Towards Understanding Action Recognition, 2013 IEEE International Conference on Computer Vision, p.7 ,
DOI : 10.1109/ICCV.2013.396
URL : https://hal.archives-ouvertes.fr/hal-00906902
Large-Scale Video Classification with Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014. ,
DOI : 10.1109/CVPR.2014.223
The Kinetics human action video dataset, 2017. ,
Efficient visual event detection using volumetric features, ICCV, 2005. ,
HMDB: A large video database for human motion recognition, 2011 International Conference on Computer Vision, 2011. ,
DOI : 10.1109/ICCV.2011.6126543
The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, vol.3, issue.1-2, pp.83-97, 1955. ,
DOI : 10.2140/pjm.1953.3.369
Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009. ,
DOI : 10.1109/CVPR.2009.5206557
URL : https://hal.archives-ouvertes.fr/inria-00548645
Spot On: Action Localization from Pointly-Supervised Proposals, ECCV, 2016. ,
DOI : 10.1007/s11263-013-0636-x
TRECVID 2014 - an overview of the goals, tasks, data, evaluation mechanisms and metrics, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01230444
Multi-region Two-Stream R-CNN for Action Detection, ECCV, p.7, 2016. ,
DOI : 10.1109/CVPR.2015.7298735
URL : https://hal.archives-ouvertes.fr/hal-01349107
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS, 2015. ,
DOI : 10.1109/TPAMI.2016.2577031
Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.3, 2008. ,
DOI : 10.1109/CVPR.2008.4587727
AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture, 2017 IEEE International Conference on Computer Vision (ICCV), 2017. ,
DOI : 10.1109/ICCV.2017.473
Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos, Proceedings of the British Machine Vision Conference 2016, 2016. ,
DOI : 10.5244/C.30.58
Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., 2004. ,
DOI : 10.1109/ICPR.2004.1334462
Much ado about time: Exhaustive annotation of temporal data, Conference on Human Computation and Crowdsourcing, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01431527
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding, ECCV, 2016. ,
DOI : 10.1109/ICCV.2015.515
URL : https://hal.archives-ouvertes.fr/hal-01418216
Online Real-Time Multiple Spatiotemporal Action Localisation and Prediction, 2017 IEEE International Conference on Computer Vision (ICCV), p.7, 2017. ,
DOI : 10.1109/ICCV.2017.393
UCF101: A dataset of 101 human actions classes from videos in the wild, p.3, 2012. ,
Rethinking the Inception Architecture for Computer Vision, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.308
Action Tubelet Detector for Spatio-Temporal Action Localization, 2017 IEEE International Conference on Computer Vision (ICCV), 2017. ,
DOI : 10.1109/ICCV.2017.472
URL : https://hal.archives-ouvertes.fr/hal-01519812
Actionness Estimation Using Hybrid Fully Convolutional Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.296
Learning to Track for Spatio-Temporal Action Localization, 2015 IEEE International Conference on Computer Vision (ICCV), 2015. ,
DOI : 10.1109/ICCV.2015.362
URL : https://hal.archives-ouvertes.fr/hal-01159941
Towards weakly-supervised action localization, 2016. ,
PersonNet: Person re-identification with deep convolutional neural networks, 2016. ,
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos, International Journal of Computer Vision, vol.25, issue.1, 2017. ,
DOI : 10.1109/CVPR.1992.223161
Discriminative subvolume search for efficient action detection, CVPR, 2009. ,
SLAC: A sparsely labeled dataset for action classification and localization. arXiv preprint, 2017. ,
Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection, 2017 IEEE International Conference on Computer Vision (ICCV), 2017. ,
DOI : 10.1109/ICCV.2017.316