R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua et al., SLIC Superpixels Compared to State-of-the-Art Superpixel Methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2274-2282, 2012.
DOI : 10.1109/TPAMI.2012.120

P. Arbeláez, B. Hariharan, C. Gu, S. Gupta, L. Bourdev et al., Semantic segmentation using regions and parts, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6248077

H. Azizpour and I. Laptev, Object Detection Using Strongly-Supervised Deformable Part Models, ECCV, 2012.
DOI : 10.1007/978-3-642-33718-5_60

URL : https://hal.archives-ouvertes.fr/hal-01063338

L. Bourdev and J. Malik, Poselets: Body part detectors trained using 3D human pose annotations, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459303

Y. Boureau, F. Bach, Y. LeCun, and J. Ponce, Learning mid-level features for recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539963

Y. Boureau, N. Le Roux, F. Bach, J. Ponce, and Y. LeCun, Ask the locals: Multi-way local pooling for image recognition, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126555

URL : https://hal.archives-ouvertes.fr/hal-00646816

Y. Boykov and V. Kolmogorov, An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.9, pp.1124-1137, 2004.
DOI : 10.1109/TPAMI.2004.60

Y. Chai, E. Rahtu, V. Lempitsky, L. Van Gool, and A. Zisserman, TriCoS: A Tri-level Class-Discriminative Co-segmentation Method for Image Classification, ECCV, 2012.
DOI : 10.1007/978-3-642-33718-5_57

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, ECCV workshop on statistical learning in computer vision, 2004.

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.177

URL : https://hal.archives-ouvertes.fr/inria-00548512

C. Doersch, S. Singh, A. Gupta, J. Sivic, and A. Efros, What makes Paris look like Paris?, ACM Transactions on Graphics, vol.31, issue.4, p.101, 2012.
DOI : 10.1145/2185520.2185597

URL : https://hal.archives-ouvertes.fr/hal-01053876

O. Duchenne, A. Joulin, and J. Ponce, A graph-matching kernel for object categorization, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126445

URL : https://hal.archives-ouvertes.fr/hal-00650345

J. Duchi and Y. Singer, Efficient learning using forward-backward splitting, NIPS, 2009.

L. Fei-Fei, R. Fergus, and P. Perona, Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories, CVPR workshop on generative-model based vision, 2004.
DOI : 10.1016/j.cviu.2005.09.012

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010.
DOI : 10.1109/TPAMI.2009.167

A. Joulin, F. Bach, and J. Ponce, Discriminative clustering for image cosegmentation, CVPR, 2010.

A. Joulin, F. Bach, and J. Ponce, Multi-class cosegmentation, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247719

URL : https://hal.archives-ouvertes.fr/hal-00717448

G. Kim and E. P. Xing, On multiple foreground cosegmentation, CVPR, 2012.

G. Kim, E. P. Xing, L. Fei-Fei, and T. Kanade, Distributed cosegmentation via submodular optimization on anisotropic diffusion, ICCV, 2011.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, CVPR, 2006.

L. Li, H. Su, E. Xing, and L. Fei-Fei, Object bank: A high-level image representation for scene classification and semantic feature sparsification, NIPS, 2010.

L. Li and L. Fei-Fei, What, where and who? Classifying events by scene and object recognition, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4408872

L. Liu, L. Wang, and X. Liu, In defense of soft-assignment coding, ICCV, 2011.

M. Juneja, A. Vedaldi, C. V. Jawahar, and A. Zisserman, Blocks That Shout: Distinctive Parts for Scene Classification, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.124

L. Mukherjee, V. Singh, and J. Peng, Scale invariant cosegmentation for image groups, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995420

M. Pandey and S. Lazebnik, Scene recognition and weakly supervised object localization with deformable part-based models, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126383

A. Quattoni and A. Torralba, Recognizing indoor scenes, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009.
DOI : 10.1109/CVPR.2009.5206537

C. Rother, V. Kolmogorov, and A. Blake, "GrabCut", ACM Transactions on Graphics, vol.23, issue.3, pp.309-314, 2004.
DOI : 10.1145/1015706.1015720

S. Sadanand and J. J. Corso, Action bank: A high-level representation of activity in video, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247806

F. Sadeghi and M. F. Tappen, Latent Pyramidal Regions for Recognizing Scenes, ECCV, 2012.
DOI : 10.1007/978-3-642-33715-4_17

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.297.3130

G. Sharma, F. Jurie, and C. Schmid, Discriminative spatial saliency for image classification, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6248093

URL : https://hal.archives-ouvertes.fr/hal-00714311

S. Singh, A. Gupta, and A. Efros, Unsupervised Discovery of Mid-Level Discriminative Patches, ECCV, 2012.
DOI : 10.1007/978-3-642-33709-3_6

Y. Su and F. Jurie, Visual word disambiguation by semantic contexts, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126257

URL : https://hal.archives-ouvertes.fr/hal-00808655

S. Vicente, C. Rother, and V. Kolmogorov, Object cosegmentation, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995530

J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, SUN database: Large-scale scene recognition from abbey to zoo, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539970

S. Yan, X. Xu, D. Xu, S. Lin, and X. Li, Beyond Spatial Pyramids: A New Feature Extraction Framework with Dense Spatial Sampling for Image Classification, ECCV, 2012.
DOI : 10.1007/978-3-642-33765-9_34

J. Yang, Y. Li, Y. Tian, L. Duan, and W. Gao, Group-sensitive multiple kernel learning for object categorization, CVPR, 2009.

J. Yang, K. Yu, Y. Gong, and T. Huang, Linear spatial pyramid matching using sparse coding for image classification, CVPR, 2009.

B. Yao, X. Jiang, A. Khosla, A. Lin, L. Guibas et al., Human action recognition by learning bases of action attributes and parts, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126386

M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, pp.49-67, 2006.
DOI : 10.1198/016214502753479356

Y. Zheng, Y. Jiang, and X. Xue, Learning Hybrid Part Filters for Scene Recognition, ECCV, 2012.
DOI : 10.1007/978-3-642-33715-4_13