inria-00122088, version 3
Evolutionary Latent Class Clustering of Qualitative Data
Damien Tessier (a, 1), Marc Schoenauer (a, 1), Christophe Biernacki (b, 2), Gilles Celeux (a, 3), Gérard Govaert (c, 4)
N° RR-6082 (2006)
Abstract: The latent class model, or multivariate multinomial mixture, is a powerful model for clustering discrete data and is expected to be useful for representing non-homogeneous populations. It relies on a conditional independence assumption given the latent class to which a statistical unit belongs. A predictive approach to cluster analysis of qualitative data can easily be derived from a fully Bayesian analysis with Jeffreys non-informative prior distributions. However, it leads to a criterion (the integrated completed likelihood derived from the latent class model) that proves difficult to optimize with the standard approach based on the EM algorithm. An Evolutionary Algorithm is designed to tackle this discrete optimization problem, and an extensive parameter study on a large artificial dataset makes it possible to derive stable parameters. A Monte Carlo approach is used to validate those parameters on other artificial datasets, as well as on some well-known real data: the Evolutionary Algorithm seems to repeatedly perform better than other standard clustering techniques on the same data.
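As a rough illustration of the kind of criterion the abstract refers to, the sketch below scores a hard partition of integer-coded qualitative data by the log of the complete-data likelihood integrated over the model parameters. It assumes symmetric Dirichlet(1/2) priors (the Jeffreys prior for a multinomial) on the mixing proportions and on each within-class level distribution, which gives a closed form in terms of Gamma functions. The function names, the data layout, and the argument names are hypothetical and not taken from the report, whose exact criterion may differ in its details.

```python
from math import lgamma


def log_dirichlet_multinomial(counts, alpha=0.5):
    """Log marginal probability of an observed sequence of categories,
    summarised by its level counts, under a symmetric Dirichlet(alpha) prior."""
    m = len(counts)
    n = sum(counts)
    return (lgamma(m * alpha) - lgamma(n + m * alpha)
            + sum(lgamma(c + alpha) - lgamma(alpha) for c in counts))


def integrated_completed_loglik(data, labels, n_clusters, n_levels, alpha=0.5):
    """Score a partition (hypothetical helper, not the report's code).

    data       : list of rows, each a list of integer-coded levels
    labels     : cluster index (0 .. n_clusters - 1) for each row
    n_levels   : number of levels of each variable
    """
    # Term for the cluster sizes (mixing proportions integrated out).
    sizes = [0] * n_clusters
    for z in labels:
        sizes[z] += 1
    total = log_dirichlet_multinomial(sizes, alpha)

    # Conditional independence: one term per (cluster, variable) pair.
    for k in range(n_clusters):
        rows = [row for row, z in zip(data, labels) if z == k]
        for j, m in enumerate(n_levels):
            counts = [0] * m
            for row in rows:
                counts[row[j]] += 1
            total += log_dirichlet_multinomial(counts, alpha)
    return total


# Toy data: two binary variables, two obvious groups.
data = [[0, 0], [0, 0], [0, 0], [1, 1], [1, 1], [1, 1]]
good = integrated_completed_loglik(data, [0, 0, 0, 1, 1, 1], 2, [2, 2])
bad = integrated_completed_loglik(data, [0, 1, 0, 1, 0, 1], 2, [2, 2])
print(good > bad)
```

On this toy dataset the partition that matches the two groups scores higher than the interleaved one. Because the score is defined directly on the label vector, maximizing it is a discrete search over partitions, which is the kind of problem an evolutionary algorithm can be applied to.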
- a – INRIA
- b – Université des Sciences et Technologie de Lille - Lille I
- c – Université de Technologie de Compiègne
- 1: TAO (INRIA Futurs)
- INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
- 2: Laboratoire Paul Painlevé (LPP)
- CNRS : UMR8524 – Université Lille 1 - Sciences et Technologies
- 3: SELECT (INRIA Futurs)
- INRIA – Université Paris XI - Paris Sud
- 4: UMR CNRS 6599
- Université de Technologie de Compiègne
- Domain : Computer Science/Artificial Intelligence
- Keywords : Clustering – Evolutionary Computation – Qualitative features
- Internal note : RR-6082
- Available versions : v1 (2006-12-26) v2 (2006-12-27) v3 (2006-12-29)
- http://hal.inria.fr/inria-00122088
- oai:hal.inria.fr:inria-00122088
- From: Marc Schoenauer
- Submitted on: Wednesday, 27 December 2006 16:36:36
- Updated on: Friday, 29 December 2006 09:04:11





