SSC : Statistical Subspace Clustering

Abstract: Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. This is a particularly important challenge with high-dimensional data, where the curse of dimensionality occurs. It also has the benefit of providing smaller descriptions of the clusters found. Existing methods only consider numerical databases and do not propose any method for cluster visualization. Besides, they require input parameters that are difficult for the user to set. The aim of this paper is to propose a new subspace clustering algorithm that can handle databases containing continuous as well as discrete attributes, requires as few user parameters as possible, and produces an interpretable output. We present a method based on the well-known EM algorithm, applied to a probabilistic model designed under some specific hypotheses, which allows us to present the result as a set of rules, each one defined with as few relevant dimensions as possible. Experiments conducted on artificial as well as real databases show that our algorithm gives robust results in terms of classification and interpretability of the output.
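To make the idea of running EM on a model with both continuous and discrete attributes concrete, here is a minimal sketch. It is not the SSC algorithm from the paper: the abstract does not state the model's hypotheses, so the sketch assumes that attributes are independent given the cluster, with one Gaussian attribute and one categorical attribute. The function names (`gaussian_pdf`, `em_mixed`), the quantile-based initialization, the smoothing constant, and the toy data are all hypothetical choices made for illustration. The rule-extraction step mentioned in the abstract is not covered.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_mixed(rows, n_clusters, n_iter=50):
    """EM for a mixture over rows of (continuous value, discrete label).

    Assumption made for this sketch: attributes are independent given
    the cluster (Gaussian for the number, categorical for the label).
    """
    n = len(rows)
    xs = sorted(x for x, _ in rows)
    cats = sorted({c for _, c in rows})
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    weights = [1.0 / n_clusters] * n_clusters
    # Deterministic start: means at evenly spaced quantiles of the data.
    params = [
        {"mean": xs[(2 * k + 1) * n // (2 * n_clusters)],
         "var": grand_var,
         "probs": {c: 1.0 / len(cats) for c in cats}}
        for k in range(n_clusters)
    ]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each row.
        resp = []
        for x, c in rows:
            joint = [w * gaussian_pdf(x, p["mean"], p["var"]) * p["probs"][c]
                     for w, p in zip(weights, params)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([j / total for j in joint])
        # M-step: re-estimate each cluster from its weighted rows.
        for k in range(n_clusters):
            nk = sum(r[k] for r in resp) + 1e-12
            mean = sum(r[k] * x for r, (x, _) in zip(resp, rows)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, (x, _) in zip(resp, rows)) / nk
            probs = {}
            for c in cats:
                mass = sum(r[k] for r, (_, cc) in zip(resp, rows) if cc == c)
                # Small additive smoothing keeps every probability positive.
                probs[c] = (mass + 1e-3) / (nk + 1e-3 * len(cats))
            weights[k] = nk / n
            params[k] = {"mean": mean, "var": max(var, 1e-6), "probs": probs}
    return weights, params, resp

# Toy data: two well-separated groups, each favouring one discrete label.
low = [(-1.0 + 0.1 * i, "b" if i % 5 == 0 else "a") for i in range(20)]
high = [(9.0 + 0.1 * i, "a" if i % 5 == 0 else "b") for i in range(20)]
rows = low + high
weights, params, resp = em_mixed(rows, n_clusters=2)
# Hard assignment: pick the most probable cluster for each row.
labels = [max(range(2), key=lambda k: r[k]) for r in resp]
```

Each fitted cluster ends up described by a mean and variance for the continuous attribute and a probability table for the discrete one, which is the kind of per-dimension description that a rule-based presentation could draw on.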
Document type:
Conference paper
Petra Perner and Atsushi Imiya. 4th International Conference on Machine Learning and Data Mining in Pattern Recognition, 2005, Leipzig, Germany. Springer, 3587, pp. 100-109, 2005, Lecture Notes in Computer Science


https://hal.inria.fr/inria-00536697
Contributor: Isabelle Tellier
Submitted on: Tuesday, 16 November 2010 - 17:32:22
Last modified on: Saturday, 25 August 2018 - 14:48:02
Document(s) archived on: Thursday, 17 February 2011 - 03:07:31

File

MLDM05.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00536697, version 1


Citation

Laurent Candillier, Isabelle Tellier, Fabien Torre, Olivier Bousquet. SSC : Statistical Subspace Clustering. Petra Perner and Atsushi Imiya. 4th International Conference on Machine Learning and Data Mining in Pattern Recognition, 2005, Leipzig, Germany. Springer, 3587, pp. 100-109, 2005, Lecture Notes in Computer Science. 〈inria-00536697〉


Metrics

Record views

255

File downloads

346