I. Guyon and A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research, vol.3, pp.1157-1182, 2003.

D. François, F. Rossi, V. Wertz, and M. Verleysen, Resampling methods for parameter-free and robust feature selection with mutual information, Neurocomputing, vol.70, issue.7-9, p.1276, 2007.

M. Verleysen, F. Rossi, and D. François, Advances in Feature Selection with Mutual Information, Similarity-Based Clustering, pp.52-69, 2009.

URL : https://hal.archives-ouvertes.fr/hal-00413154

B. Frénay, M. van Heeswijk, Y. Miche, M. Verleysen, and A. Lendasse, Feature selection for nonlinear models with extreme learning machines, Neurocomputing, vol.102, pp.111-124, 2013.

V. Gómez-Verdejo, M. Verleysen, and J. Fleury, Information-theoretic feature selection for functional data classification, Neurocomputing, vol.72, issue.16-18, p.3580, 2009.
DOI : 10.1016/j.neucom.2008.12.035

R. Kohavi and G. H. John, Wrappers for feature subset selection, Artificial Intelligence, vol.97, issue.1-2, pp.273-324, 1997.
DOI : 10.1016/S0004-3702(97)00043-X

URL : https://doi.org/10.1016/s0004-3702(97)00043-x

J. Paul, R. D'Ambrosio, and P. Dupont, Kernel methods for heterogeneous feature selection, Neurocomputing, vol.169, pp.187-195, 2015.
DOI : 10.1016/j.neucom.2014.12.098

URL : https://www.info.ucl.ac.be/%7Epdupont/pdupont/pdf/neurocomputing_15.pdf

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, Annals of Statistics, vol.32, pp.407-499, 2004.

B. Frénay, G. Doquire, and M. Verleysen, Is mutual information adequate for feature selection in regression?, Neural Networks, vol.48, pp.1-7, 2013.
DOI : 10.1016/j.neunet.2013.07.003

G. Doquire, B. Frénay, and M. Verleysen, Risk estimation and feature selection, Proceedings of the 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2013.

A. Degeest, M. Verleysen, and B. Frénay, Feature ranking in changing environments where new features are introduced, 2015 International Joint Conference on Neural Networks (IJCNN), pp.1-8, 2015.
DOI : 10.1109/IJCNN.2015.7280533

G. Brown, A. Pocock, M. Zhao, and M. Luján, Conditional likelihood maximisation: A unifying framework for mutual information feature selection, Journal of Machine Learning Research, vol.13, pp.27-66, 2012.

R. Battiti, Using mutual information for selecting features in supervised neural net learning, IEEE Transactions on Neural Networks, vol.5, issue.4, pp.537-550, 1994.
DOI : 10.1109/72.298224

URL : http://rtm.science.unitn.it/~battiti/archive/mutual-nn.pdf

J. R. Vergara and P. A. Estévez, A review of feature selection methods based on mutual information, Neural Computing and Applications, vol.24, issue.1, pp.175-186, 2014.
DOI : 10.1007/s00521-013-1368-0

URL : http://arxiv.org/pdf/1509.07577

C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, vol.27, issue.3, pp.379-423, 1948.
DOI : 10.1002/j.1538-7305.1948.tb01338.x

B. Frénay, G. Doquire, and M. Verleysen, Theoretical and empirical study on the potential inadequacy of mutual information for feature selection in classification, Neurocomputing, vol.112, pp.64-78, 2013.
DOI : 10.1016/j.neucom.2012.12.051

A. Guillén, D. Sovilj, F. Mateo, I. Rojas, and A. Lendasse, New methodologies based on delta test for variable selection in regression problems, Workshop on Parallel Architectures and Bioinspired Algorithms, 2008.

Q. Yu, E. Séverin, and A. Lendasse, Variable selection for financial modeling, Proceedings of the CEF 2007, 13th International Conference on Computing in Economics and Finance, pp.237-241, 2007.

A. Kraskov, H. Stögbauer, and P. Grassberger, Estimating mutual information, Physical Review E, vol.69, issue.6, p.066138, 2004.
DOI : 10.1103/PhysRevE.69.066138

URL : http://juser.fz-juelich.de/record/42907/files/60015.pdf

L. F. Kozachenko and N. Leonenko, Sample estimate of the entropy of a random vector, Problems Inform. Transmission, vol.23, pp.95-101, 1987.

E. Eirola, E. Liitiäinen, A. Lendasse, F. Corona, and M. Verleysen, Using the delta test for variable selection, Proceedings of ESANN'08, 2008.

E. Eirola, A. Lendasse, F. Corona, and M. Verleysen, The delta test: The 1-NN estimator as a feature selection criterion, Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), pp.4214-4222, 2014.
DOI : 10.1109/ijcnn.2014.6889560

URL : http://research.ics.aalto.fi/eiml/Publications/Publication213.pdf

D. S. Wookey and G. D. Konidaris, Regularized feature selection in reinforcement learning, Machine Learning, pp.655-676, 2015.
DOI : 10.1007/s10994-015-5518-8

URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-015-5518-8.pdf