H. Bilen, V. Namboodiri, and L. Van Gool, Object and Action Classification with Latent Variables, Proceedings of the British Machine Vision Conference 2011, 2011.
DOI : 10.5244/C.25.17

O. Chapelle, Training a Support Vector Machine in the Primal, Neural Computation, vol.19, issue.5, pp.1155-1178, 2007.
DOI : 10.1198/106186005X25619

G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, Intl. Workshop on Stat. Learning in Comp. Vision, 2004.

V. Delaitre, I. Laptev, and J. Sivic, Recognizing human actions in still images: a study of bag-of-features and part-based representations, Proceedings of the British Machine Vision Conference 2010, p.7, 2010.
DOI : 10.5244/C.24.97
URL : https://hal.archives-ouvertes.fr/hal-01060885

V. Delaitre, J. Sivic, and I. Laptev, Learning person-object interactions for action recognition in still images, NIPS, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00648156

M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, 2010.
DOI : 10.1007/s11263-009-0275-4

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, LIBLINEAR: A library for large linear classification, Journal of Machine Learning Research, vol.9, issue.5, pp.1871-1874, 2008.

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object detection with discriminatively trained part based models, Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010.

D. Gao and N. Vasconcelos, Discriminant saliency for visual recognition from cluttered scenes, NIPS, 2004.

D. Gao and N. Vasconcelos, Integrated learning of saliency, complex features and object detectors from cluttered scenes, CVPR, 2005.

T. Harada, Y. Ushiku, Y. Yamashita, and Y. Kuniyoshi, Discriminative spatial pyramid, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995691
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.662.6786

H. Harzallah, F. Jurie, and C. Schmid, Combining efficient object localization and image classification, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459257
URL : https://hal.archives-ouvertes.fr/inria-00439516

L. Itti, C. Koch, and E. Niebur, A model of saliency-based visual attention for rapid scene analysis, Pattern Analysis and Machine Intelligence, 1998.

F. S. Khan, J. van de Weijer, and M. Vanrell, Top-down color attention for object recognition, ICCV, 2009.

C. Koch and S. Ullman, Shifts in Selective Visual Attention: Towards the Underlying Neural Circuitry, Human Neurobiology, vol.4, issue.4, pp.219-227, 1985.
DOI : 10.1007/978-94-009-3833-5_5

J. Krapac, J. Verbeek, and F. Jurie, Learning tree-structured descriptor quantizers for image categorization, BMVC, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00613118

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), 2006.
DOI : 10.1109/CVPR.2006.68
URL : https://hal.archives-ouvertes.fr/inria-00548585

D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

K. Mikolajczyk and C. Schmid, Scale & Affine Invariant Interest Point Detectors, International Journal of Computer Vision, vol.60, issue.1, pp.63-86, 2004.
DOI : 10.1023/B:VISI.0000027790.02288.f2
URL : https://hal.archives-ouvertes.fr/inria-00548554

F. Moosmann, D. Larlus, and F. Jurie, Learning saliency maps for object categorization, ECCV Workshops, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00203726

N. Murray, M. Vanrell, X. Otazu, and C. A. Parraga, Saliency estimation using a non-parametric low-level vision model, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995506

E. Nowak, F. Jurie, and B. Triggs, Sampling Strategies for Bag-of-Features Image Classification, ECCV, 2006.
DOI : 10.1007/11744085_38
URL : https://hal.archives-ouvertes.fr/hal-00203752

D. Parikh, L. Zitnick, and T. Chen, Determining Patch Saliency Using Low-Level Context, ECCV, 2008.
DOI : 10.1007/978-3-540-88688-4_33
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.146.3666

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_11
URL : https://hal.archives-ouvertes.fr/inria-00548630

A. M. Treisman and G. Gelade, A feature-integration theory of attention, Cognitive Psychology, vol.12, issue.1, pp.97-136, 1980.
DOI : 10.1016/0010-0285(80)90005-5

A. Vedaldi and B. Fulkerson, VLFeat, Proceedings of the international conference on Multimedia, MM '10, 2010.
DOI : 10.1145/1873951.1874249

A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman, Multiple kernels for object detection, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459183

A. Vedaldi and A. Zisserman, Efficient additive kernels using explicit feature maps, CVPR, p.5, 2010.
DOI : 10.1109/cvpr.2010.5539949
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.7024

M. Wang, J. Konrad, P. Ishwar, K. Jing, and H. Rowley, Image saliency: From intrinsic to extrinsic context, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995743

J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba, SUN database: Large-scale scene recognition from abbey to zoo, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539970

B. Yao and L. Fei-Fei, Grouplet: A structured image representation for recognizing human and object interactions, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540234

B. Yao, A. Khosla, and L. Fei-Fei, Combining randomization and discrimination for fine-grained image categorization, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995368