E. L. Allwein, R. E. Schapire, and Y. Singer, Reducing multiclass to binary: A unifying approach for margin classifiers, ICML, 2000.

B. Bai, J. Weston, D. Grangier, R. Collobert, O. Chapelle et al., Supervised semantic indexing, CIKM, 2009.
DOI : 10.1007/978-3-642-00958-7_81
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.162.1162

P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe, Convexity, Classification, and Risk Bounds, NIPS, 2003.
DOI : 10.1198/016214505000000907
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.3497

S. Bengio, J. Weston, and D. Grangier, Label embedding trees for large multi-class tasks, NIPS, 2010.

Y. Bengio, A. Courville, and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8
DOI : 10.1109/TPAMI.2013.50

A. Bergamo, L. Torresani, and A. Fitzgibbon, PICODES: Learning a compact code for novel-category recognition, NIPS, 2011.

A. Beygelzimer, V. Dani, T. P. Hayes, J. Langford, and B. Zadrozny, Error limiting reductions between classification tasks, Proceedings of the 22nd international conference on Machine learning, ICML '05, 2005.
DOI : 10.1145/1102351.1102358
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.81.4721

A. Bordes, L. Bottou, P. Gallinari, and J. Weston, Solving multiclass support vector machines with LaRank, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273508
URL : https://hal.archives-ouvertes.fr/hal-00750277

L. Bottou and O. Bousquet, The tradeoffs of large scale learning, NIPS, 2007.

Y. Boureau, F. Bach, Y. LeCun, and J. Ponce, Learning mid-level features for recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539963

G. Burghouts and J. Geusebroek, Performance evaluation of local colour invariants, Computer Vision and Image Understanding, vol.113, issue.1, 2009.
DOI : 10.1016/j.cviu.2008.07.003

P. K. Chan and S. J. Stolfo, On the accuracy of meta-learning for scalable data mining, JIIS, issue.3, 1997.

C. Chang and C. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, p.7, 2011.
DOI : 10.1145/1961189.1961199

K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman, The devil is in the details: an evaluation of recent feature encoding methods, Proceedings of the British Machine Vision Conference 2011, p.7, 2011.
DOI : 10.5244/C.25.76

K. Crammer and Y. Singer, On the algorithmic implementation of multiclass kernel-based vector machines, JMLR, 2002.

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, ECCV SLCV workshop, 2004.

J. Dean, G. Corrado, R. Monga, M. Devin, Q. Le et al., Large scale distributed deep networks, NIPS, 2012.

J. Deng, A. Berg, K. Li, and L. Fei-Fei, What Does Classifying More Than 10,000 Image Categories Tell Us?, ECCV, p.12, 2010.
DOI : 10.1007/978-3-642-15555-0_6

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, CVPR, 2009.

J. Deng, S. Satheesh, A. Berg, and L. Fei-Fei, Fast and balanced: efficient label tree learning for large scale object recognition, NIPS, 2011.

T. Deselaers and V. Ferrari, Visual and semantic similarity in ImageNet, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995474

T. G. Dietterich and G. Bakiri, Solving multiclass learning problems via error-correcting output codes, 1995.

M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman, The PASCAL visual object classes (VOC) challenge, IJCV, p.9, 2010.

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, LIBLINEAR: A library for large linear classification, JMLR, issue.7, 2008.

J. Farquhar, S. Szedmak, H. Meng, and J. Shawe-Taylor, Improving bag-of-keypoints image categorisation, 2005.

V. Franc and S. Sonnenburg, Optimized cutting plane algorithm for support vector machines, Proceedings of the 25th international conference on Machine learning, ICML '08, 2008.
DOI : 10.1145/1390156.1390197

T. Gao and D. Koller, Discriminative learning of relaxed hierarchy for large-scale visual recognition, ICCV, 2011.

J. Gehrke, R. Ramakrishnan, and V. Ganti, RainForest - a framework for fast decision tree construction of large datasets, 2000.

Y. Gong and S. Lazebnik, Comparing data-dependent and data-independent embeddings for classification and ranking of internet images, CVPR, 2011.

D. Grangier, F. Monay, and S. Bengio, A Discriminative Approach for the Retrieval of Images from Text Queries, ECML, 2006.
DOI : 10.1007/11871842_19

C. Hsieh, K. Chang, C. Lin, S. S. Keerthi, and S. Sundararajan, A dual coordinate descent method for large-scale linear SVM, Proceedings of the 25th international conference on Machine learning, ICML '08, 2008.
DOI : 10.1145/1390156.1390208

K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun, What is the best multi-stage architecture for object recognition?, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459469

H. Jégou, M. Douze, and C. Schmid, Product Quantization for Nearest Neighbor Search, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.1, p.7, 2011.
DOI : 10.1109/TPAMI.2010.57

H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez et al., Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9
DOI : 10.1109/TPAMI.2011.235

T. Joachims, Making large-scale support vector machine learning practical, Advances in kernel methods, 1999.

T. Joachims, Optimizing search engines using clickthrough data, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, pp.133-142, 2002.
DOI : 10.1145/775047.775067
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.3161

T. Joachims, Training linear SVMs in linear time, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06, 2006.
DOI : 10.1145/1150402.1150429

A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, p.12, 2012.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), 2006.
DOI : 10.1109/CVPR.2006.68
URL : https://hal.archives-ouvertes.fr/inria-00548585

Q. Le, M. Ranzato, R. Monga, M. Devin, K. Chen et al., Building high-level features using large scale unsupervised learning, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, p.12
DOI : 10.1109/ICASSP.2013.6639343
URL : http://arxiv.org/abs/1112.6209

Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard et al., Handwritten digit recognition with a back-propagation network, NIPS, 1989.

Y. LeCun, L. Bottou, G. Orr, and K. Muller, Efficient backprop, Neural Networks: Tricks of the trade, 1998.

Y. LeCun, F. Huang, and L. Bottou, Learning methods for generic object recognition with invariance to pose and lighting, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, 2004.
DOI : 10.1109/CVPR.2004.1315150

Y. Lee, Y. Lin, and G. Wahba, Multicategory Support Vector Machines, Journal of the American Statistical Association, vol.99, issue.465, 2004.
DOI : 10.1198/016214504000000098
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.1879

L. Li, H. Su, E. Xing, and L. Fei-Fei, Object bank: A high-level image representation for scene classification and semantic feature sparsification, NIPS, 2010.

Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour et al., Large-scale image classification: Fast feature extraction and SVM training, CVPR 2011, p.11, 2011.
DOI : 10.1109/CVPR.2011.5995477
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.225.3736

D. G. Lowe, Distinctive image features from scale-invariant keypoints, IJCV, p.7, 2004.
DOI : 10.1023/b:visi.0000029664.99615.94
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.4931

S. Maji and A. Berg, Max-margin additive classifiers for detection, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459203

M. Marszalek and C. Schmid, Constructing Category Hierarchies for Visual Recognition, ECCV, 2008.
DOI : 10.1007/978-3-540-88693-8_35
URL : https://hal.archives-ouvertes.fr/inria-00548656

M. Mehta, R. Agrawal, and J. Rissanen, SLIQ: A fast scalable classifier for data mining, EDBT, 1996.
DOI : 10.1007/BFb0014141

S. Nowozin and C. Lampert, Structured Learning and Prediction in Computer Vision, Foundations and Trends in Computer Graphics and Vision, 2011.

F. Perronnin, Z. Akata, Z. Harchaoui, and C. Schmid, Towards good practice in large-scale learning for image classification, 2012 IEEE Conference on Computer Vision and Pattern Recognition, p.11
DOI : 10.1109/CVPR.2012.6248090
URL : https://hal.archives-ouvertes.fr/hal-00690014

F. Perronnin and C. Dance, Fisher Kernels on Visual Vocabularies for Image Categorization, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383266

F. Perronnin, J. Sánchez, and Y. Liu, Large-scale image categorization with explicit data embedding, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539914

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV, p.5, 2010.
DOI : 10.1007/978-3-642-15561-1_11
URL : https://hal.archives-ouvertes.fr/inria-00548630

J. C. Platt, Fast training of support vector machines using sequential minimal optimization, Advances in kernel methods, 1999.

M. Rastegari, C. Fang, and L. Torresani, Scalable object-class retrieval with approximate and top-k ranking, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126556

R. Rifkin and A. Klautau, In defense of one-vs-all classification, JMLR, issue.3, 2004.

M. Rohrbach, M. Stark, and B. Schiele, Evaluating knowledge transfer and zero-shot learning in a large-scale setting, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995627

L. Rokach and O. Maimon, Top-Down Induction of Decision Trees Classifiers - A Survey, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), vol.35, issue.4, 2005.
DOI : 10.1109/TSMCC.2004.843247

B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, LabelMe: A Database and Web-Based Tool for Image Annotation, International Journal of Computer Vision, vol.3, issue.1, 2008.
DOI : 10.1007/s11263-007-0090-8

S. L. Salzberg, On comparing classifiers: Pitfalls to avoid and a recommended approach

J. Sánchez and F. Perronnin, High-dimensional signature compression for large-scale image classification, CVPR 2011, p.12, 2011.
DOI : 10.1109/CVPR.2011.5995504

S. Shalev-shwartz, Y. Singer, and N. Srebro, Pegasos, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273598

D. J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, Chapman & Hall/CRC, issue.9, 2007.
DOI : 10.1201/9781420036268

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, 2003.
DOI : 10.1109/ICCV.2003.1238663

D. Tao, X. Tang, X. Li, and X. Wu, Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.7, 2006.

A. Tewari and P. L. Bartlett, On the Consistency of Multiclass Classification Methods, JMLR, vol.3, p.4, 2007.
DOI : 10.1007/11503415_10

A. Torralba, R. Fergus, and W. Freeman, 80 million tiny images: a large dataset for non-parametric object and scene recognition, IEEE TPAMI, vol.2, issue.1 3, 2008.

L. Torresani, M. Szummer, and A. Fitzgibbon, Efficient Object Category Recognition Using Classemes, ECCV, 2010.
DOI : 10.1007/978-3-642-15549-9_56

I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, Large margin methods for structured and interdependent output variables, JMLR, issue.4, 2005.

N. Usunier, D. Buffoni, and P. Gallinari, Ranking with ordered weighted pairwise classification, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553509
URL : https://hal.archives-ouvertes.fr/hal-01297974

J. C. van Gemert, C. J. Veenman, A. W. Smeulders, and J. M. Geusebroek, Visual Word Ambiguity, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.7, 2010.
DOI : 10.1109/TPAMI.2009.132

A. Vedaldi and A. Zisserman, Efficient additive kernels via explicit feature maps, CVPR, 2010.
DOI : 10.1109/cvpr.2010.5539949
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.7024

A. Vedaldi and A. Zisserman, Sparse kernel approximations for efficient classification and detection, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247943
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.363.5902

V. Vural and J. G. Dy, A hierarchical method for multi-class support vector machines, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015427

G. Wang, D. Hoiem, and D. Forsyth, Learning image similarity from Flickr groups using Stochastic Intersection Kernel Machines, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459167

J. Wang, J. Yang, K. Yu, F. Lv, T. Huang et al., Locality-constrained linear coding for image classification, CVPR, 2010.

J. Weston, S. Bengio, and N. Usunier, Large scale image annotation: learning to rank with joint word-image embeddings, Machine Learning, vol.5, issue.1, p.12, 2010.
DOI : 10.1007/s10994-010-5198-3

J. Weston, S. Bengio, and N. Usunier, Wsabie: Scaling up to large vocabulary image annotation, IJCAI, 2011.

J. Weston and C. Watkins, Multi-class support vector machines, 1998.

J. Weston and C. Watkins, Support vector machines for multi-class pattern recognition, ESANN, 1999.

J. Xu, T. Liu, M. Lu, H. Li, and W. Ma, Directly optimizing evaluation measures in learning to rank, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '08, p.5, 2008.
DOI : 10.1145/1390334.1390355

J. Yang, K. Yu, Y. Gong, and T. S. Huang, Linear spatial pyramid matching using sparse coding for image classification, CVPR, 2009.

Y. Yue, T. Finley, F. Radlinski, and T. Joachims, A support vector method for optimizing average precision, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, p.5, 2007.
DOI : 10.1145/1277741.1277790

B. Zhao, L. Fei-Fei, and E. Xing, Large-scale category structure aware image categorization, NIPS, 2011.

X. Zhou, K. Yu, T. Zhang, and T. Huang, Image Classification Using Super-Vector Coding of Local Image Descriptors, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_11