EM for mixtures - Initialization requires special care

Jean-Patrick Baudry (1), Gilles Celeux (2)
(2) SELECT - Model selection in statistical learning, Inria Saclay - Île-de-France; LMO - Laboratoire de Mathématiques d'Orsay; CNRS - Centre National de la Recherche Scientifique : UMR
Abstract: Maximum likelihood via the EM algorithm is widely used to estimate the parameters of hidden structure models such as Gaussian mixture models. But the EM algorithm has well-documented drawbacks: its solution can depend heavily on its starting position, and it may fail as a result of degeneracies. We stress the practical dangers of these limitations and how carefully they should be dealt with. Our main conclusion is that no method addresses them satisfactorily in all situations. But improvements are obtained by, first, using a penalized log-likelihood of Gaussian mixture models in a Bayesian regularization perspective and, second, choosing the best among several relevant initialization strategies. In this perspective, we also propose new recursive initialization strategies which prove helpful. They are compared with standard initialization procedures through numerical experiments, and their effects on model selection criteria are analyzed.
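The initialization sensitivity described in the abstract is easy to reproduce. The sketch below is purely illustrative and is not the authors' code: a minimal EM for a two-component one-dimensional Gaussian mixture, run from a well-separated initialization and from a near-collapsed one (both starting means on the same side of the data).

```python
import numpy as np

def em_gmm_1d(x, means_init, n_iter=100, tol=1e-8):
    """Minimal EM for a two-component 1-D Gaussian mixture.

    Illustrative sketch only: no restart strategy, and only a crude
    variance floor guarding against the degeneracies discussed above.
    Returns (means, variances, weights, log-likelihood).
    """
    m = np.asarray(means_init, dtype=float)
    s = np.array([np.var(x), np.var(x)])   # component variances
    w = np.array([0.5, 0.5])               # mixing proportions
    ll_old = -np.inf
    for _ in range(n_iter):
        # E step: responsibilities (posterior component probabilities)
        dens = w * np.exp(-0.5 * (x[:, None] - m) ** 2 / s) / np.sqrt(2 * np.pi * s)
        total = dens.sum(axis=1, keepdims=True)
        r = dens / total
        # M step: update weights, means and variances from responsibilities
        nk = r.sum(axis=0)
        w = nk / len(x)
        m = (r * x[:, None]).sum(axis=0) / nk
        s = (r * (x[:, None] - m) ** 2).sum(axis=0) / nk
        s = np.maximum(s, 1e-6)            # crude guard against degeneracy
        ll = np.log(total).sum()
        if ll - ll_old < tol:              # EM increases the log-likelihood
            break
        ll_old = ll
    return m, s, w, ll

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])

# Well-separated starting means vs. two nearly identical ones:
m_good, _, _, ll_good = em_gmm_1d(x, means_init=[-1.0, 1.0])
m_bad, _, _, ll_bad = em_gmm_1d(x, means_init=[3.0, 3.1])
```

From the separated start, the fitted means land near the true values -3 and 3; from the near-collapsed start, the two components stay almost indistinguishable and the run stalls at a poorer solution, which is why the paper compares several initialization strategies rather than trusting a single run.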
Document type: Preprints, Working Papers, ...

Cited literature [24 references]

https://hal.inria.fr/hal-01113242
Contributor: Gilles Celeux
Submitted on : Wednesday, February 4, 2015 - 5:09:02 PM
Last modification on : Thursday, March 21, 2019 - 1:20:43 PM
Long-term archiving on : Sunday, April 16, 2017 - 8:09:15 AM

File

KM1.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01113242, version 1

Citation

Jean-Patrick Baudry, Gilles Celeux. EM for mixtures - Initialization requires special care. 2015. ⟨hal-01113242⟩


Metrics: 577 record views, 1606 file downloads