M. Blaschko, A. Vedaldi, and A. Zisserman, Simultaneous object detection and ranking with weak supervision, NIPS, 2010.

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), p.7, 2005.
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512

V. Delaitre, I. Laptev, and J. Sivic, Recognizing human actions in still images: a study of bag-of-features and part-based representations, Proceedings of the British Machine Vision Conference 2010, 2010.
DOI : 10.5244/C.24.97
URL : https://hal.archives-ouvertes.fr/hal-01060885

T. Deselaers, B. Alexe, and V. Ferrari, Localizing Objects While Learning Their Appearance, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_33

M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes (VOC) challenge, IJCV, p.5, 2010.

P. Felzenszwalb, R. Girshick, and D. McAllester, Discriminatively trained deformable part models, release 4.

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object detection with discriminatively trained part-based models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010.

T. Joachims, Making large-scale SVM learning practical, Advances in Kernel Methods, 1999.

T. Joachims, T. Finley, and C. J. Yu, Cutting-plane training of structural SVMs, Machine Learning, 2009.
DOI : 10.1007/s10994-009-5108-8

M. P. Kumar, B. Packer, and D. Koller, Modeling latent variable uncertainty for loss-based learning, ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00773605

M. P. Kumar, H. Turki, D. Preston, and D. Koller, Learning specific-class segmentation from diverse data, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126446

T. Lan, Y. Wang, W. Yang, S. Robinovitch, and G. Mori, Discriminative latent models for recognizing contextual group activities, 2012.

Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour et al., Large-scale image classification: Fast feature extraction and SVM training, CVPR, 2011.

S. Maji, L. Bourdev, and J. Malik, Action recognition from a distributed representation of pose and appearance, CVPR, 2011.

K. Miller, M. P. Kumar, B. Packer, D. Goodman, and D. Koller, Max-margin min-entropy models, AISTATS, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00773602

A. Mishra, K. Alahari, and C. V. Jawahar, Scene Text Recognition using Higher Order Language Priors, Proceedings of the British Machine Vision Conference 2012, 2012.
DOI : 10.5244/C.26.127
URL : https://hal.archives-ouvertes.fr/hal-00818183

M. Pandey and S. Lazebnik, Scene recognition and weakly supervised object localization with deformable part-based models, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126383

A. Prest, C. Schmid, and V. Ferrari, Weakly Supervised Learning of Interactions between Humans and Objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.3, 2012.
DOI : 10.1109/TPAMI.2011.158
URL : https://hal.archives-ouvertes.fr/inria-00516477

O. Russakovsky, Y. Lin, K. Yu, and L. Fei-Fei, Object-centric spatial pooling for image classification, ECCV, 2012.

S. Shalev-Shwartz, Y. Singer, and N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273598

A. Smola, S. Vishwanathan, and T. Hofmann, Kernel methods for missing variables, AISTATS, 2005.

V. Vapnik, Statistical learning theory, 1998.

A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman, Multiple kernels for object detection, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459183

A. Vezhnevets, J. Buhmann, and V. Ferrari, Weakly supervised structured output learning for semantic segmentation, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247757

H. Wang, S. Gould, and D. Koller, Discriminative learning with latent variables for cluttered indoor scene understanding, ECCV, 2010.
DOI : 10.1145/2436256.2436276

Y. Wang and G. Mori, A Discriminative Latent Model of Object Classes and Attributes, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_12

J. Yang, K. Yu, Y. Gong, and T. Huang, Linear spatial pyramid matching using sparse coding for image classification, CVPR, 2009.

L. Yang, R. Jin, R. Sukthankar, and F. Jurie, Unifying discriminative visual codebook generation with classifier training for object category recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587504
URL : https://hal.archives-ouvertes.fr/inria-00548653

W. Yang, Y. Wang, and G. Mori, Recognizing human actions from still images with latent poses, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539879

B. Yao, X. Jiang, A. Khosla, A. Lin, L. Guibas et al., Human action recognition by learning bases of action attributes and parts, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126386

C. Yu and T. Joachims, Learning structural SVMs with latent variables, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553523

Y. Yue, T. Finley, F. Radlinski, and T. Joachims, A support vector method for optimizing average precision, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, 2007.
DOI : 10.1145/1277741.1277790

A. Yuille and A. Rangarajan, The Concave-Convex Procedure, Neural Computation, vol.15, issue.4, 2003.
DOI : 10.1162/08997660260028674