R. Bekkerman, M. Bilenko, and J. Langford, Scaling Up Machine Learning, 2011.

A. Bellet and A. Habrard, Robustness and generalization for metric learning, Neurocomputing, vol.151, issue.1, pp.259-267, 2015.
DOI : 10.1016/j.neucom.2014.09.044
URL : https://hal.archives-ouvertes.fr/hal-01075370

A. Bellet, A. Habrard, and M. Sebban, A survey on metric learning for feature vectors and structured data, ArXiv e-prints, 2013.

P. Bertail and J. Tressou, Incomplete Generalized U-Statistics for Food Risk Assessment, Biometrics, vol.62, issue.1, pp.66-74, 2006.
DOI : 10.1111/j.1541-0420.2005.00401.x
URL : https://hal.archives-ouvertes.fr/hal-01068794

P. Bianchi, S. Clémençon, J. Jakubowicz, and G. Moral-Adell, On-line learning gossip algorithm in multi-agent systems with local decision rules, Proceedings of the IEEE International Conference on Big Data, 2013.

G. Blom, Some properties of incomplete U-statistics, Biometrika, vol.63, issue.3, pp.573-580, 1976.
DOI : 10.1093/biomet/63.3.573
URL : https://hal.archives-ouvertes.fr/in2p3-00608259

L. Bottou, Online Algorithms and Stochastic Approximations: Online Learning and Neural Networks, 1998.

S. Boucheron, O. Bousquet, and G. Lugosi, Theory of Classification: a Survey of Some Recent Advances, ESAIM: Probability and Statistics, vol.9, pp.323-375, 2005.
DOI : 10.1051/ps:2005018
URL : https://hal.archives-ouvertes.fr/hal-00017923

B. M. Brown and D. G. Kildea, Reduced $U$-Statistics and the Hodges-Lehmann Estimator, The Annals of Statistics, vol.6, issue.4, pp.828-835, 1978.
DOI : 10.1214/aos/1176344256

Q. Cao, Z. Guo, and Y. Ying, Generalization bounds for metric and similarity learning, Machine Learning, vol.13, issue.1, 2012.
DOI : 10.1007/s10994-015-5499-7

G. Chechik, V. Sharma, U. Shalit, and S. Bengio, Large Scale Online Learning of Image Similarity through Ranking, Journal of Machine Learning Research, vol.11, pp.1109-1135, 2010.
DOI : 10.1007/978-3-642-02172-5_2

S. Clémençon, A statistical view of clustering performance through the theory of U-processes, Journal of Multivariate Analysis, vol.124, pp.42-56, 2014.
DOI : 10.1016/j.jmva.2013.10.001

S. Clémençon and S. Robbiano, Building confidence regions for the ROC surface, to appear in Pattern Recognition Letters, 2014.

S. Clémençon and N. Vayatis, Tree-Based Ranking Methods, IEEE Transactions on Information Theory, vol.55, issue.9, pp.4316-4336, 2009.
DOI : 10.1109/TIT.2009.2025558

S. Clémençon, G. Lugosi, and N. Vayatis, Ranking and scoring using empirical risk minimization, Proceedings of COLT, 2005.

S. Clémençon, G. Lugosi, and N. Vayatis, Ranking and empirical risk minimization of U-statistics, The Annals of Statistics, pp.844-874, 2008.

S. Clémençon, S. Robbiano, and N. Vayatis, Ranking data with ordinal labels: optimality and pairwise aggregation, Machine Learning, pp.67-104, 2013.

W. G. Cochran, Sampling techniques, 1977.

V. de la Peña and E. Giné, Decoupling: from Dependence to Independence, 1999.
DOI : 10.1007/978-1-4612-0537-1

J. C. Deville, Réplications d'échantillons, demi-échantillons, Jackknife, bootstrap dans les sondages, Economica, 1987.

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, 1996.
DOI : 10.1007/978-1-4612-0711-5

E. Enqvist, On sampling from sets of random variables with application to incomplete U-statistics, 1978.

J. Friedman, T. Hastie, and R. Tibshirani, The Elements of Statistical Learning, 2009.

D. K. Fuk and S. V. Nagaev, Probability Inequalities for Sums of Independent Random Variables, Theory of Probability & Its Applications, vol.16, issue.4, pp.643-660, 1971.
DOI : 10.1137/1116071

E. Giné and J. Zinn, Some limit theorems for empirical processes, The Annals of Probability, pp.929-989, 1984.

W. Grams and R. Serfling, Convergence Rates for $U$-Statistics and Related Statistics, The Annals of Statistics, vol.1, issue.1, pp.153-160, 1973.
DOI : 10.1214/aos/1193342392

J. Hájek, Asymptotic Theory of Rejective Sampling with Varying Probabilities from a Finite Population, The Annals of Mathematical Statistics, vol.35, issue.4, pp.1491-1523, 1964.
DOI : 10.1214/aoms/1177700375

J. Hájek, Asymptotic Normality of Simple Linear Rank Statistics Under Alternatives, The Annals of Mathematical Statistics, vol.39, issue.2, pp.325-346, 1968.
DOI : 10.1214/aoms/1177698394

W. Hoeffding, A Class of Statistics with Asymptotically Normal Distribution, The Annals of Mathematical Statistics, vol.19, issue.3, pp.293-325, 1948.
DOI : 10.1214/aoms/1177730196

D. G. Horvitz and D. J. Thompson, A Generalization of Sampling Without Replacement from a Finite Universe, Journal of the American Statistical Association, vol.47, issue.260, pp.663-685, 1952.
DOI : 10.1080/01621459.1949.10483288

S. Janson, The asymptotic distributions of incomplete U-statistics, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol.49, issue.4, pp.495-505, 1984.
DOI : 10.1007/BF00531887

R. Jin, S. Wang, and Y. Zhou, Regularized distance metric learning: theory and algorithm, Advances in Neural Information Processing Systems 22, pp.862-870, 2009.

R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems 26, pp.315-323, 2013.

T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman et al., A local search approximation algorithm for k-means clustering, Computational Geometry, vol.28, issue.2-3, pp.89-112, 2004.

N. Le Roux, M. W. Schmidt, and F. Bach, A Stochastic gradient method with an exponential convergence rate for finite training sets, Advances in Neural Information Processing Systems 25, pp.2672-2680, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00674995

M. Ledoux and M. Talagrand, Probability in Banach Spaces, 1991.
DOI : 10.1007/978-3-642-20212-4

A. J. Lee, U-statistics: Theory and Practice, 1990.

G. Lugosi, Pattern Classification and Learning Theory, Principles of Nonparametric Learning, pp.1-56, 2002.
DOI : 10.1007/978-3-7091-2568-7_1

P. Massart, Concentration Inequalities and Model Selection, Lecture Notes in Mathematics, 2006.

C. McDiarmid, On the method of bounded differences, Surveys in Combinatorics, pp.148-188, 1989.
DOI : 10.1017/CBO9781107359949.008

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

R. J. Serfling, Probability inequalities for the sum in sampling without replacement, The Annals of Statistics, pp.39-48, 1974.

R. J. Serfling, Approximation Theorems of Mathematical Statistics, 1980.

Y. Tillé, Sampling Algorithms, 2006.
DOI : 10.1007/978-3-642-04898-2_501

V. N. Vapnik, An overview of statistical learning theory, IEEE Transactions on Neural Networks, vol.10, issue.5, pp.988-999, 1999.
DOI : 10.1109/72.788640

J. H. Ward, Hierarchical Grouping to Optimize an Objective Function, Journal of the American Statistical Association, vol.58, issue.301, pp.236-244, 1963.
DOI : 10.1007/BF02289263

K. Q. Weinberger and L. K. Saul, Distance metric learning for large margin nearest neighbor classification, Journal of Machine Learning Research, vol.10, pp.207-244, 2009.