Variable Selection in Model-based Clustering: A General Variable Role Modeling

Abstract: Currently available variable selection procedures in model-based clustering assume that the variables irrelevant for clustering are either all independent of, or all linked with, the relevant clustering variables. We propose a more versatile variable selection model that describes three possible roles for each variable: relevant clustering variables, irrelevant variables that depend on a subset of the relevant clustering variables, and irrelevant variables that are totally independent of all the relevant variables. A model selection criterion and a variable selection algorithm are derived for this new variable role modeling. The model identifiability and the consistency of the variable selection criterion are also established. Numerical experiments on simulated datasets and on a real dataset highlight the value of this new modeling.
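The three variable roles named in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' procedure: the function name `simulate_three_roles`, the two-cluster Gaussian setup, the linear link, and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_three_roles(n=500, seed=0):
    """Simulate columns playing the three roles described in the abstract.

    Every size, mean, and coefficient below is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    # Hidden cluster labels: two equally likely clusters (assumption).
    z = rng.integers(0, 2, size=n)
    # Role 1: relevant clustering variables, whose mean depends on the cluster.
    means = np.array([[0.0, 0.0], [4.0, 4.0]])
    relevant = means[z] + rng.normal(size=(n, 2))
    # Role 2: irrelevant variable linked to a part of the relevant variables,
    # here through a linear relation with the first relevant variable only.
    dependent = 1.0 + 2.0 * relevant[:, [0]] + rng.normal(scale=0.5, size=(n, 1))
    # Role 3: irrelevant variable independent of all relevant variables.
    independent = rng.normal(size=(n, 1))
    data = np.hstack([relevant, dependent, independent])
    roles = ["relevant", "relevant", "dependent", "independent"]
    return data, z, roles


data, z, roles = simulate_three_roles()
# Empirical correlations between columns hint at the role of each variable.
corr = np.corrcoef(data, rowvar=False)
```

In this toy dataset the third column is strongly correlated with the first, while the fourth is close to uncorrelated with every other column. A selection procedure that allows only one kind of irrelevant variable would have to treat these two columns the same way.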
Document type:
Preprint, working paper
RR-6744. 2008

Cited literature: 9 references

https://hal.inria.fr/inria-00342108
Contributor: Cathy Maugis
Submitted on: Monday, December 1, 2008 - 12:08:33
Last modified on: Thursday, January 11, 2018 - 06:22:14
Document(s) archived on: Wednesday, September 22, 2010 - 11:10:16

File

RR-6744.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00342108, version 2

Citation

Cathy Maugis, Gilles Celeux, Marie-Laure Martin-Magniette. Variable Selection in Model-based Clustering: A General Variable Role Modeling. RR-6744. 2008. 〈inria-00342108v2〉
