Variable selection in model-based clustering and discriminant analysis with a regularization approach

Gilles Celeux 1 Cathy Maugis-Rabusseau 2 Mohammed Sedki 3
1 SELECT - Model selection in statistical learning
Inria Saclay - Ile de France, LMO - Laboratoire de Mathématiques d'Orsay, CNRS - Centre National de la Recherche Scientifique : UMR
Abstract: Several variable selection methods have been proposed for model-based clustering and classification. They rely on backward or forward stepwise procedures to determine the roles of the variables. Such stepwise procedures are slow, and the resulting algorithms become inefficient on large data sets with many variables. In this paper, we propose an alternative regularization approach to variable selection in model-based clustering and classification. The variables are first ranked using a lasso-like procedure, which avoids the slow stepwise algorithms. The variable selection methodology of Maugis et al. (2009b) can then be applied efficiently to high-dimensional data sets.
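The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the lasso-like procedure, so the code below swaps in a simple stand-in for the discriminant analysis setting (known labels): the standardized class means are soft-thresholded over a grid of penalties, and each variable is scored by the number of penalties at which it keeps a nonzero mean. The names `soft_threshold` and `rank_variables`, the penalty grid, and the simulated data are all assumptions made for illustration; this is not the authors' algorithm.

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft-thresholding (proximal operator of the L1 penalty)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def rank_variables(X, labels, lambdas):
    """Hypothetical helper: rank variables by how long their class means
    survive L1 shrinkage. A stand-in for a lasso-like ranking, assuming
    known labels and no constant columns in X."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Standardize so one penalty value is comparable across variables.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    classes = np.unique(labels)
    # Class means of the standardized data, shape (n_classes, n_variables).
    means = np.vstack([Z[labels == k].mean(axis=0) for k in classes])
    # scores[j] counts the penalties at which variable j is still active,
    # i.e. at least one shrunken class mean is nonzero.
    scores = np.zeros(Z.shape[1], dtype=int)
    for lam in lambdas:
        active = np.any(soft_threshold(means, lam) != 0.0, axis=0)
        scores += active
    # Most relevant variables first; a stable sort keeps ties in index order.
    order = np.argsort(-scores, kind="stable")
    return order, scores

# Simulated example (assumed data): variable 0 separates the two classes
# strongly, variable 1 weakly, variables 2 to 4 are pure noise.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[labels == 1, 0] += 3.0
X[labels == 1, 1] += 1.0

order, scores = rank_variables(X, labels, np.linspace(0.05, 1.0, 20))
print("ranking:", order)
print("scores:", scores)
```

Once the variables are ranked, a selection procedure only needs to scan them in that order rather than run a full stepwise search, which is the computational gain the abstract points to.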
Document type:
Preprint, working paper
Under revision at ADAC. 2014

https://hal.inria.fr/hal-01053784
Contributor: Mohammed Amechtoh Sedki
Submitted on: Tuesday, 28 November 2017 - 14:03:03
Last modified on: Monday, 15 January 2018 - 11:46:03

File

article.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01053784, version 2

Citation

Gilles Celeux, Cathy Maugis-Rabusseau, Mohammed Sedki. Variable selection in model-based clustering and discriminant analysis with a regularization approach. Under revision at ADAC. 2014. 〈hal-01053784v2〉
