6.1.2 Intrinsic dimension and selection subset size
6.3.1 Sensitivity w.r.t. initialization of network parameters
6.4.1 Sensitivity w.r.t. number of selected features k
6.4.2.1 Classification-based criterion
Towards more robust and computationally efficient agnostic feature selection