R. J. A. Little and D. B. Rubin, Statistical Analysis with Missing Data, 2002.
DOI : 10.1002/9781119013563

D. B. Rubin, Multiple Imputation for Nonresponse in Surveys, 1987.
DOI : 10.1002/9780470316696

C. K. Enders, Applied Missing Data Analysis, Methodology in the Social Sciences, 2010.

E. R. Hruschka, E. R. Hruschka Jr., and N. F. Ebecken, Evaluating a Nearest-Neighbor Method to Substitute Continuous Missing Values, AI 2003: Advances in Artificial Intelligence, pp.723-734, 2003.
DOI : 10.1007/978-3-540-24581-0_62

J. Van Hulse and T. M. Khoshgoftaar, Incomplete-case nearest neighbor imputation in software measurement data, Information Sciences, in press, 2011.
DOI : 10.1016/j.ins.2010.12.017

S. van Buuren, J. P. Brand, C. G. Groothuis-Oudshoorn, and D. B. Rubin, Fully conditional specification in multivariate imputation, Journal of Statistical Computation and Simulation, vol.76, issue.12, pp.1049-1064, 2006.
DOI : 10.1080/10629360600810434

A. Aussem and S. R. de Morais, A conservative feature subset selection algorithm with missing data, Neurocomputing, vol.73, issue.4-6, pp.585-590, 2010.
DOI : 10.1016/j.neucom.2009.05.019

URL : https://hal.archives-ouvertes.fr/hal-00383775

G. Doquire and M. Verleysen, Feature selection with missing data using mutual information estimators, Neurocomputing, vol.90, pp.3-11, 2012.
DOI : 10.1016/j.neucom.2012.02.031

J. W. Grzymala-Busse and W. J. Grzymala-Busse, Handling missing attribute values, Data Mining and Knowledge Discovery Handbook, pp.33-51, 2010.

P. J. García-Laencina, J. Sancho-Gómez, A. R. Figueiras-Vidal, and M. Verleysen, K nearest neighbours with mutual information for simultaneous classification and missing data imputation, Neurocomputing, vol.72, issue.7-9, pp.1483-1493, 2009.
DOI : 10.1016/j.neucom.2008.11.026

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, pp.1-38, 1977.

Z. Ghahramani and M. Jordan, Learning From Incomplete Data, 1995.

L. Hunt and M. Jorgensen, Mixture model clustering for mixed data with missing information, Computational Statistics & Data Analysis, vol.41, issue.3-4, pp.429-440, 2003.
DOI : 10.1016/S0167-9473(02)00190-1

T. I. Lin, J. C. Lee, and H. J. Ho, On fast supervised learning for normal mixture models with missing information, Pattern Recognition, vol.39, issue.6, pp.1177-1187, 2006.
DOI : 10.1016/j.patcog.2005.12.014

R. J. Steele, N. Wang, and A. E. Raftery, Inference from multiple imputation for missing data using mixtures of normals, Statistical Methodology, vol.7, issue.3, pp.351-365, 2010.
DOI : 10.1016/j.stamet.2010.01.003

O. Delalleau, A. C. Courville, and Y. Bengio, Efficient EM training of Gaussian mixtures with missing data, arXiv:1209.0521, 2012.

V. Tresp, S. Ahmad, and R. Neuneier, Training neural networks with deficient data, Advances in Neural Information Processing Systems, pp.128-135, 1994.

Z. Viharos, L. Monostori, and T. Vincze, Training and Application of Artificial Neural Networks with Incomplete Data, Developments in Applied Artificial Intelligence, pp.649-659, 2002.
DOI : 10.1007/3-540-48035-8_63

A. Morris, M. Cooke, and P. Green, Some solutions to the missing feature problem in data classification, with application to noise robust ASR, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98), pp.737-740, 1998.
DOI : 10.1109/ICASSP.1998.675370

C. Beunckens, G. Molenberghs, and M. G. Kenward, Direct likelihood analysis versus simple forms of imputation for missing data in randomized clinical trials, Clinical Trials, vol.2, issue.5, pp.379-386, 2005.
DOI : 10.1191/1740774505cn119oa

C. Bouveyron, S. Girard, and C. Schmid, High-dimensional data clustering, Computational Statistics & Data Analysis, vol.52, issue.1, pp.502-519, 2007.
DOI : 10.1016/j.csda.2007.02.009

URL : https://hal.archives-ouvertes.fr/inria-00548573

J. K. Dixon, Pattern recognition with partly missing data, IEEE Transactions on Systems, Man and Cybernetics, vol.9, pp.617-621, 1979.

T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 2003.

H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, pp.716-723, 1974.

C. M. Hurvich and C. Tsai, Regression and time series model selection in small samples, Biometrika, vol.76, issue.2, pp.297-307, 1989.
DOI : 10.1093/biomet/76.2.297

A. Asuncion and D. J. Newman, UCI machine learning repository, 2012.

J. Shaffer, Multiple hypothesis testing, Annual Review of Psychology, pp.561-584, 1995.

G. Huang, Q. Zhu, and C. Siew, Extreme learning machine: Theory and applications, Neurocomputing, vol.70, issue.1-3, pp.489-501, 2006.
DOI : 10.1016/j.neucom.2005.12.126

G. Huang and C. Siew, Extreme learning machine with randomly assigned RBF kernels, International Journal of Information Technology, vol.11, pp.16-24, 2005.