F. Bach and Z. Harchaoui, DIFFRAC: a discriminative and flexible framework for clustering, Advances in Neural Information Processing Systems, 2007.

F. Bach and M. I. Jordan, Learning spectral clustering, Advances in Neural Information Processing Systems 16, 2004.

Y. Bengio, H. Larochelle, and P. Vincent, Nonlocal manifold Parzen windows, Advances in Neural Information Processing Systems 18, pp.115-122, 2006.

G. Blanchard, M. Kawanabe, M. Sugiyama, V. Spokoiny, and K. Müller, In search of non-Gaussian components of a high-dimensional distribution, J. Mach. Learn. Res., vol.7, pp.247-282, 2006.

J. M. Borwein and A. S. Lewis, Convex analysis and nonlinear optimization, theory and examples, 2000.

R. Boscolo, H. Pan, and V. P. Roychowdhury, Independent Component Analysis Based on Nonparametric Density Estimation, IEEE Transactions on Neural Networks, vol.15, issue.1, pp.55-65, 2004.
DOI : 10.1109/TNN.2003.820667

A. Bowman, An alternative method of cross-validation for the smoothing of density estimates, Biometrika, vol.71, issue.2, pp.353-360, 1984.
DOI : 10.1093/biomet/71.2.353

W. Chen, Y. Song, H. Bai, C. Lin, and E. Y. Chang, Parallel Spectral Clustering in Distributed Systems, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.3, pp.568-586, 2011.
DOI : 10.1109/TPAMI.2010.88

Y. Chow, S. Geman, and L. Wu, Consistent Cross-Validated Density Estimation, The Annals of Statistics, vol.11, issue.1, pp.25-38, 1983.
DOI : 10.1214/aos/1176346053

D. Comaniciu and P. Meer, Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.5, pp.603-619, 2002.
DOI : 10.1109/34.1000236

V. de Silva and J. B. Tenenbaum, Sparse multidimensional scaling using landmark points, Technical report, Stanford University, 2004.

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

R. P. Duin, On the choice of smoothing parameters for Parzen estimators of probability density functions, IEEE Transactions on Computers, vol.25, issue.11, pp.1175-1179, 1976.

T. Duong and M. Hazelton, Cross-validation Bandwidth Matrices for Multivariate Kernel Density Estimation, Scandinavian Journal of Statistics, vol.32, issue.3, pp.485-506, 2005.
DOI : 10.1016/S0167-9473(01)00053-6

A. Globerson and S. Roweis, Metric learning by collapsing classes, Advances in Neural Information Processing Systems, pp.451-458, 2006.

J. Goldberger, S. Roweis, G. E. Hinton, and R. Salakhutdinov, Neighbourhood components analysis, Advances in Neural Information Processing Systems 17, 2004.

A. Hyvärinen, J. Karhunen, and E. Oja, Independent component analysis, 2001.

M. C. Jones and D. A. Henderson, Maximum likelihood kernel density estimation: On the potential of convolution sieves, Computational Statistics & Data Analysis, vol.53, issue.10, pp.3726-3733, 2009.
DOI : 10.1016/j.csda.2009.03.019

A. Ng, M. I. Jordan, and Y. Weiss, On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems 14, 2001.

B. Park and J. S. Marron, Comparison of Data-Driven Bandwidth Selectors, Journal of the American Statistical Association, vol.85, issue.409, pp.66-72, 1990.
DOI : 10.1214/aoms/1177696810

E. Parzen, On estimation of a probability density function and mode, The Annals of Mathematical Statistics, vol.33, issue.3, pp.1065-1076, 1962.

M. Rosenblatt, Remarks on Some Nonparametric Estimates of a Density Function, The Annals of Mathematical Statistics, vol.27, issue.3, pp.832-837, 1956.
DOI : 10.1214/aoms/1177728190

M. Rudemo, Empirical choice of histograms and kernel density estimators, Scandinavian Journal of Statistics, vol.9, issue.2, pp.65-78, 1982.

S. J. Sheather and M. C. Jones, A reliable data-based bandwidth selection method for kernel density estimation, Journal of the Royal Statistical Society. Series B (Methodological), vol.53, issue.3, pp.683-690, 1991.

B. W. Silverman, Density estimation for statistics and data analysis, 1986.
DOI : 10.1007/978-1-4899-3324-9

P. Vincent and Y. Bengio, Manifold Parzen windows, Advances in Neural Information Processing Systems 15, pp.825-832, 2002.

E. Xing, A. Ng, M. I. Jordan, and S. Russell, Distance metric learning with application to clustering with side-information, Advances in Neural Information Processing Systems 14, 2002.

L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, Advances in Neural Information Processing Systems 17, 2004.