R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua et al., SLIC Superpixels Compared to State-of-the-Art Superpixel Methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2274-2282, 2012.
DOI : 10.1109/TPAMI.2012.120

B. Alexe, T. Deselaers, and V. Ferrari, Measuring the Objectness of Image Windows, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2189-2202, 2012.
DOI : 10.1109/TPAMI.2012.28

P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik, Contour Detection and Hierarchical Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.5, pp.898-916, 2011.
DOI : 10.1109/TPAMI.2010.161

T. Brox and J. Malik, Object Segmentation by Long Term Analysis of Point Trajectories, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_21

T. Brox and J. Malik, Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.3, 2011.
DOI : 10.1109/TPAMI.2010.143

J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu, Semantic segmentation with second-order pooling, ECCV, 2012.

A. Chen and J. Corso, Propagating multi-class pixel labels throughout video frames, 2010 Western New York Image Processing Workshop, 2010.
DOI : 10.1109/WNYIPW.2010.5649773
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.188.7421

R. Cinbis, J. Verbeek, and C. Schmid, Segmentation Driven Object Detection with Fisher Vectors, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.369
URL : https://hal.archives-ouvertes.fr/hal-00873134

J. Corso, E. Sharon, S. Dube, S. El-Saden, U. Sinha et al., Efficient Multilevel Brain Tumor Segmentation With Integrated Bayesian Model Classification, IEEE Transactions on Medical Imaging, vol.27, issue.5, pp.629-640, 2008.
DOI : 10.1109/TMI.2007.912817

P. Dollár and C. Zitnick, Structured Forests for Fast Edge Detection, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.231

O. Duchenne, I. Laptev, J. Sivic, F. Bach, and J. Ponce, Automatic annotation of human actions in video, 2009 IEEE 12th International Conference on Computer Vision (ICCV), 2009.
DOI : 10.1109/ICCV.2009.5459279

I. Endres and D. Hoiem, Category Independent Object Proposals, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_42
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.178.6784

P. Felzenszwalb and D. Huttenlocher, Efficient Graph-Based Image Segmentation, International Journal of Computer Vision, vol.59, issue.2, pp.167-181, 2004.
DOI : 10.1023/B:VISI.0000022288.19776.77

C. Fowlkes, S. Belongie, F. Chung, and J. Malik, Spectral grouping using the Nyström method, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.2, pp.214-225, 2004.
DOI : 10.1109/TPAMI.2004.1262185

A. Gaidon, Z. Harchaoui, and C. Schmid, Actom sequence models for efficient action detection, CVPR, 2011.
DOI : 10.1109/CVPR.2011.5995646
URL : https://hal.archives-ouvertes.fr/inria-00575217

F. Galasso, N. Nagaraja, T. Cardenas, T. Brox, and B. Schiele, A unified video segmentation benchmark: Annotation, metrics and analysis, ICCV, 2013.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
DOI : 10.1109/CVPR.2014.81

M. Grundmann, V. Kwatra, M. Han, and I. Essa, Efficient hierarchical graph-based video segmentation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
DOI : 10.1109/CVPR.2010.5539893

M. Jain, H. Jégou, and P. Bouthemy, Better Exploiting Motion for Better Action Recognition, 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
DOI : 10.1109/CVPR.2013.330
URL : https://hal.archives-ouvertes.fr/hal-00813014

M. Jain, J. van Gemert, P. Bouthemy, H. Jégou, and C. Snoek, Action Localization with Tubelets from Motion, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
DOI : 10.1109/CVPR.2014.100
URL : https://hal.archives-ouvertes.fr/hal-00996844

H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre, HMDB: A large video database for human motion recognition, 2011 International Conference on Computer Vision (ICCV), 2011.
DOI : 10.1109/ICCV.2011.6126543

C. Lampert, M. Blaschko, and T. Hofmann, Efficient Subwindow Search: A Branch and Bound Framework for Object Localization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.12, pp.2129-2142, 2009.
DOI : 10.1109/TPAMI.2009.144

Z. Li, E. Gavves, K. van de Sande, C. Snoek, and A. Smeulders, Codemaps - Segment, Classify and Search Objects Locally, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.454
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.431.4858

J. Lu, H. Yang, D. Min, and M. Do, Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation, 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
DOI : 10.1109/CVPR.2013.242

T. Ma and L. Latecki, Maximum weight cliques with mutex constraints for video object segmentation, CVPR, 2012.

S. Manén, M. Guillaumin, and L. Van Gool, Prime Object Proposals with Randomized Prim's Algorithm, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.315

M. Marszalek, I. Laptev, and C. Schmid, Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
DOI : 10.1109/CVPR.2009.5206557
URL : https://hal.archives-ouvertes.fr/inria-00548645

D. Oneata, J. Verbeek, and C. Schmid, Action and Event Recognition with Fisher Vectors on a Compact Feature Set, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.228
URL : https://hal.archives-ouvertes.fr/hal-00873662

D. Oneata, J. Verbeek, and C. Schmid, Efficient Action Localization with Approximately Normalized Fisher Vectors, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
DOI : 10.1109/CVPR.2014.326
URL : https://hal.archives-ouvertes.fr/hal-00979594

P. Over, G. Awad, M. Michel, J. Fiscus, G. Sanders et al., TRECVID 2012 - an overview of the goals, tasks, data, evaluation mechanisms and metrics, Proceedings of TRECVID, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00953826

A. Papazoglou and V. Ferrari, Fast Object Segmentation in Unconstrained Video, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.223

S. Paris and F. Durand, A Topological Approach to Hierarchical Segmentation using Mean Shift, 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.
DOI : 10.1109/CVPR.2007.383228

O. Pele and M. Werman, Fast and robust Earth Mover's Distances, 2009 IEEE 12th International Conference on Computer Vision (ICCV), 2009.
DOI : 10.1109/ICCV.2009.5459199

A. Prest, C. Leistner, J. Civera, C. Schmid, and V. Ferrari, Learning object class detectors from weakly annotated video, 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
DOI : 10.1109/CVPR.2012.6248065
URL : https://hal.archives-ouvertes.fr/hal-00695940

M. Rodriguez, J. Ahmed, and M. Shah, Action MACH: a spatio-temporal Maximum Average Correlation Height filter for action recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
DOI : 10.1109/CVPR.2008.4587727

J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek, Image Classification with the Fisher Vector: Theory and Practice, International Journal of Computer Vision, vol.105, issue.3, pp.222-245, 2013.
DOI : 10.1007/s11263-013-0636-x

N. Sundaram, T. Brox, and K. Keutzer, Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow, ECCV, 2010.
DOI : 10.1007/978-3-642-15549-9_32

D. Tran and J. Yuan, Optimal spatio-temporal path discovery for video event detection, CVPR, 2011.
DOI : 10.1109/CVPR.2011.5995416

J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders, Selective Search for Object Recognition, International Journal of Computer Vision, vol.104, issue.2, pp.154-171, 2013.
DOI : 10.1007/s11263-013-0620-5

K. van de Sande, C. Snoek, and A. Smeulders, Fisher and VLAD with FLAIR, CVPR, 2014.

M. Van den Bergh, G. Roig, X. Boix, S. Manén, and L. Van Gool, Online video SEEDS for temporal window objectness, ICCV, 2013.

H. Wang and C. Schmid, Action Recognition with Improved Trajectories, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.441
URL : https://hal.archives-ouvertes.fr/hal-00873267

X. Wang, M. Yang, S. Zhu, and Y. Lin, Regionlets for generic object detection, ICCV, 2013.

O. Weber, Y. Devir, A. Bronstein, M. Bronstein, and R. Kimmel, Parallel algorithms for approximation of distance maps on parametric surfaces, ACM Transactions on Graphics, vol.27, issue.4, 2008.
DOI : 10.1145/1409625.1409626

C. Xu and J. Corso, Evaluation of super-voxel methods for early video processing, CVPR, 2012.

C. Xu, S. Whitt, and J. Corso, Flattening Supervoxel Hierarchies by the Uniform Entropy Slice, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.279

C. Xu, C. Xiong, and J. Corso, Streaming Hierarchical Video Segmentation, ECCV, 2012.
DOI : 10.1007/978-3-642-33783-3_45

J. Yuan, Z. Liu, and Y. Wu, Discriminative subvolume search for efficient action detection, CVPR, 2009.

D. Zhang, O. Javed, and M. Shah, Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions, 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
DOI : 10.1109/CVPR.2013.87