Strategies for Getting the Highest Likelihood in Mixture Models

Christophe Biernacki 1 Gilles Celeux 1 Gérard Govaert 1
1 IS2 - Statistical Inference for Industry and Health
Inria Grenoble - Rhône-Alpes, LBBE - Laboratoire de Biométrie et Biologie Evolutive - UMR 5558
Abstract : We compare simple strategies for obtaining maximum likelihood parameter estimates in mixture models with the EM algorithm. All the strategies considered aim to initialise EM well. They are based on random initialisation, on a Classification EM algorithm (CEM), on a Stochastic EM algorithm (SEM), or on preliminary short runs of EM itself. They are compared for multivariate Gaussian mixtures through numerical experiments on both simulated and real data sets. The main conclusions of these experiments are the following. Simple random initialisation, probably the most widely used way of starting EM, is often outperformed by strategies that use CEM, SEM or short runs of EM before running EM, so those strategies can be preferred to it. Repeating runs of EM is also generally worthwhile, since a single run of EM can often lead to a suboptimal solution. Beyond that, none of the strategies tested can be regarded as the best, and it is difficult to characterise the situations in which a particular strategy can be expected to outperform the others. However, the strategy that initialises EM with repeated short runs of EM can be recommended. This strategy, which as far as we know had not been used before the present study, has several advantages: it is simple, it performs well in many situations without presupposing any particular form for the mixture to be fitted to the data, and it seems relatively insensitive to noisy data.
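The recommended strategy (several short runs of EM, then a long run from the best of them) can be illustrated with a minimal sketch. The abstract gives no algorithmic details, so everything below is an assumption made for illustration: the mixture is one-dimensional rather than multivariate, each short run uses a fixed number of iterations, initial means are drawn from the data points, and a small variance floor guards against degenerate components. The names `short_runs_em`, `n_short`, `short_iter`, `long_iter` and `min_var` are hypothetical and do not come from the report.

```python
import math
import random


def _densities(x, weights, means, variances):
    """Weighted component densities p_k * N(x | mu_k, var_k) at one point."""
    return [
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_likelihood(data, params):
    """Observed-data log-likelihood; params = (weights, means, variances)."""
    return sum(math.log(sum(_densities(x, *params)) + 1e-300) for x in data)


def em_step(data, params, min_var=1e-6):
    """One EM iteration for a 1-D Gaussian mixture."""
    k, n = len(params[0]), len(data)
    # E step: posterior probability of each component for each point.
    resp = []
    for x in data:
        d = _densities(x, *params)
        total = sum(d) + 1e-300
        resp.append([dj / total for dj in d])
    # M step: weighted proportions, means and variances.
    weights, means, variances = [], [], []
    for j in range(k):
        nj = max(sum(r[j] for r in resp), 1e-12)
        mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj
        weights.append(nj / n)
        means.append(mu)
        variances.append(max(var, min_var))  # assumed floor, not from the report
    return weights, means, variances


def run_em(data, params, max_iter, tol=1e-8):
    """Iterate EM until the log-likelihood gain drops below tol or max_iter."""
    ll = log_likelihood(data, params)
    for _ in range(max_iter):
        params = em_step(data, params)
        new_ll = log_likelihood(data, params)
        converged = abs(new_ll - ll) < tol
        ll = new_ll
        if converged:
            break
    return params, ll


def random_init(data, k, rng):
    """Equal weights, means at k random data points, pooled variance."""
    mean = sum(data) / len(data)
    var = max(sum((x - mean) ** 2 for x in data) / len(data), 1e-6)
    return [1.0 / k] * k, rng.sample(data, k), [var] * k


def short_runs_em(data, k, n_short=10, short_iter=5, long_iter=200, seed=0):
    """Run n_short short EM runs, then a long EM run from the best one."""
    rng = random.Random(seed)
    best_params, best_ll = None, -math.inf
    for _ in range(n_short):
        # tol=0.0 makes every short run use exactly short_iter iterations.
        params, ll = run_em(data, random_init(data, k, rng), short_iter, tol=0.0)
        if ll > best_ll:
            best_params, best_ll = params, ll
    return run_em(data, best_params, long_iter)


if __name__ == "__main__":
    gen = random.Random(1)
    sample = [gen.gauss(0.0, 1.0) for _ in range(100)]
    sample += [gen.gauss(10.0, 1.0) for _ in range(100)]
    (w, mu, var), ll = short_runs_em(sample, k=2)
    print("weights:", w, "means:", mu, "variances:", var, "loglik:", ll)
```

The short runs are cheap, so many starting points can be screened by likelihood before the cost of a full run is paid once. The CEM and SEM based strategies mentioned in the abstract would replace the short EM runs with runs of those algorithms; they are not sketched here.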
Document type : Reports

https://hal.inria.fr/inria-00072333
Contributor : Rapport de Recherche Inria
Submitted on : Tuesday, May 23, 2006 - 8:25:16 PM
Last modification on : Monday, February 10, 2020 - 4:36:45 PM
Long-term archiving on : Sunday, April 4, 2010 - 11:04:09 PM

Identifiers

  • HAL Id : inria-00072333, version 1

Citation

Christophe Biernacki, Gilles Celeux, Gérard Govaert. Strategies for Getting the Highest Likelihood in Mixture Models. [Research Report] RR-4255, INRIA. 2001. ⟨inria-00072333⟩
