P. Agrawal, V. K. Garg, and R. Narayanam, Link label prediction in signed social networks, IJCAI, pp.2591-2597, 2013.

Y. Atzmon, U. Shalit, and G. Chechik, Learning sparse metrics, one feature at a time, NIPS 2015 Workshop on Feature Extraction: Modern Questions and Challenges, 2015.

R. Bardenet and O. A. Maillard, Concentration inequalities for sampling without replacement, Bernoulli, vol.21, pp.1361-1385, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01216652

A. Bellet and A. Habrard, Robustness and Generalization for Metric Learning, Neurocomputing, vol.151, pp.259-267, 2015.
DOI : 10.1016/j.neucom.2014.09.044
URL : http://arxiv.org/pdf/1209.1086

A. Bellet, A. Habrard, and M. Sebban, Similarity Learning for Provably Accurate Sparse Linear Classification, pp.1871-1878, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00708401

A. Bellet, A. Habrard, and M. Sebban, Metric Learning, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01121733

W. Bian and D. Tao, Learning a Distance Metric by Empirical Loss Minimization, pp.1186-1191, 2011.

D. Cai and X. He, Manifold Adaptive Experimental Design for Text Categorization, IEEE Transactions on Knowledge and Data Engineering, vol.24, pp.707-719, 2012.

Q. Cao, Z. C. Guo, and Y. Ying, Generalization Bounds for Metric and Similarity Learning, 2012.
DOI : 10.1007/s10994-015-5499-7
URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-015-5499-7.pdf

Q. Cao, Y. Ying, and P. Li, ECML/PKDD, pp.283-298, 2012.

R. Caruana, N. Karampatziakis, and A. Yessenalina, An empirical evaluation of supervised learning in high dimensions, pp.96-103, 2008.

C. C. Chang and C. J. Lin, LIBSVM: A Library for Support Vector Machines, ACM Transactions on Intelligent Systems and Technology, vol.2, pp.27-27, 2011.

Y. W. Chang, C. J. Hsieh, K. W. Chang, M. Ringgaard, and C. J. Lin, Training and Testing Low-degree Polynomial Data Mappings via Linear SVM, Journal of Machine Learning Research, vol.11, pp.1471-1490, 2010.

G. Chechik, U. Shalit, V. Sharma, and S. Bengio, An online algorithm for large scale image similarity learning, NIPS, pp.306-314, 2009.

Y. Chen, D. Pavlov, and J. F. Canny, Large-scale behavioral targeting, 2009.
DOI : 10.1145/1557019.1557048
URL : http://www.cs.berkeley.edu/~jfc/papers/09/KDD09.pdf

K. L. Clarkson, Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm, ACM Transactions on Algorithms, vol.6, pp.1-30, 2010.

S. Clémençon, I. Colin, and A. Bellet, Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics, Journal of Machine Learning Research, vol.17, pp.1-36, 2016.

S. Clémençon, G. Lugosi, and N. Vayatis, Ranking and Empirical Minimization of U-statistics, Annals of Statistics, vol.36, pp.844-874, 2008.

J. V. Davis, B. Kulis, P. Jain, S. Sra, and I. S. Dhillon, Information-theoretic metric learning, pp.209-216, 2007.
DOI : 10.1145/1273496.1273523

R. E. Fan, K. W. Chang, C. J. Hsieh, X. R. Wang, and C. J. Lin, LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research, vol.9, pp.1871-1874, 2008.

S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, 2013.

D. Fradkin and D. Madigan, Experiments with random projections for machine learning, pp.517-522, 2003.

M. Frank and P. Wolfe, An algorithm for quadratic programming, Naval Research Logistics Quarterly, vol.3, pp.95-110, 1956.

R. M. Freund and P. Grigas, New Analysis and Results for the Conditional Gradient Method, 2013.

X. Gao, S. C. Hoi, Y. Zhang, J. Wan, and J. Li, SOML: Sparse Online Metric Learning with Application to Image Retrieval, pp.1206-1212, 2014.

J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov, Neighbourhood Components Analysis, NIPS, pp.513-520, 2004.

J. Guélat and P. Marcotte, Some comments on Wolfe's away step, Mathematical Programming, vol.35, pp.110-119, 1986.

M. Guillaumin, J. J. Verbeek, and C. Schmid, Is that you? Metric learning approaches for face identification, pp.498-505, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00439290

Z. C. Guo and Y. Ying, Guaranteed Classification via Regularized Similarity Learning, Neural Computation, vol.26, pp.497-522, 2014.
DOI : 10.1162/neco_a_00556
URL : http://arxiv.org/pdf/1306.3108

I. Guyon, S. R. Gunn, A. Ben-hur, and G. Dror, , 2004.

W. Hoeffding, A Class of Statistics with Asymptotically Normal Distribution, The Annals of Mathematical Statistics, vol.19, pp.293-325, 1948.

M. Jaggi, Sparse Convex Optimization Methods for Machine Learning, 2011.

M. Jaggi, Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization, 2013.

L. Jain, B. Mason, and R. Nowak, Learning Low-Dimensional Metrics, 2017.

R. Jin, S. Wang, and Y. Zhou, Regularized Distance Metric Learning: Theory and Algorithm, 2009.

D. Kedem, S. Tyree, K. Weinberger, F. Sha, and G. Lanckriet, Non-linear Metric Learning, NIPS, pp.2582-2590, 2012.

B. Kulis, Metric Learning: A Survey, Foundations and Trends in Machine Learning, vol.5, pp.287-364, 2012.

S. Lacoste-Julien and M. Jaggi, On the Global Linear Convergence of Frank-Wolfe Optimization Variants, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01248675

A. R. Leach and V. J. Gillet, An Introduction to Chemoinformatics, 2007.

A. J. Lee, U-Statistics: Theory and Practice, 1990.

D. K. Lim, B. McFee, and G. Lanckriet, Robust Structural Metric Learning, 2013.

K. Liu, A. Bellet, and F. Sha, Similarity Learning for High-Dimensional Sparse Data, pp.653-662, 2015.

W. Liu, C. Mu, R. Ji, S. Ma, J. R. Smith et al., Low-Rank Similarity Metric Learning in High Dimensions, 2015.

C. McDiarmid, On the method of bounded differences, Surveys in Combinatorics, vol.141, pp.148-188, 1989.

G. J. Qi, J. Tang, Z. J. Zha, T. S. Chua, and H. J. Zhang, An Efficient Sparse Metric Learning in High-Dimensional Space via l1-Penalized Log-Determinant Regularization, 2009.

Q. Qian, R. Jin, L. Zhang, and S. Zhu, Towards Making High Dimensional Distance Metric Learning Practical, 2015.

Q. Qian, R. Jin, S. Zhu, and Y. Lin, An Integrated Framework for High Dimensional Distance Metric Learning and Its Application to Fine-Grained Visual Categorization, 2014.

R. Rosales and G. Fung, Learning Sparse Metrics via Linear Programming, pp.367-373, 2006.

M. Schultz and T. Joachims, Learning a Distance Metric from Relative Comparisons, 2003.

R. J. Serfling, Probability inequalities for the sum in sampling without replacement, The Annals of Statistics, vol.2, pp.39-48, 1974.

S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, 2014.

C. Shen, J. Kim, L. Wang, and A. van den Hengel, Positive Semidefinite Metric Learning Using Boosting-like Algorithms, Journal of Machine Learning Research, vol.13, pp.1007-1036, 2012.

Y. Shi, A. Bellet, and F. Sha, Sparse Compositional Metric Learning, pp.2078-2084, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01430847

J. St. Amand and J. Huan, Sparse Compositional Local Metric Learning, 2017.

N. Verma and K. Branson, Sample complexity of learning Mahalanobis distance metrics, 2015.

J. Wang, A. Woznica, and A. Kalousis, Parametric Local Metric Learning for Nearest Neighbor Classification, NIPS, pp.1610-1618, 2012.

K. Q. Weinberger and L. K. Saul, Distance Metric Learning for Large Margin Nearest Neighbor Classification, Journal of Machine Learning Research, vol.10, pp.207-244, 2009.

D. Yao, P. Zhao, T. A. Pham, and G. Cong, High-dimensional Similarity Learning via Dual-sparse Random Projection, 2018.
DOI : 10.24963/ijcai.2018/417
URL : https://www.ijcai.org/proceedings/2018/0417.pdf

Y. Ying, K. Huang, and C. Campbell, Sparse Metric Learning via Smooth Optimization, NIPS, pp.2214-2222, 2009.

Y. Ying and P. Li, Distance Metric Learning with Eigenvalue Optimization, Journal of Machine Learning Research, vol.13, pp.1-26, 2012.

J. Zhang and L. Zhang, Efficient Stochastic Optimization for Low-Rank Distance Metric Learning, 2017.