inria-00321486, version 2
A variational EM algorithm for large-scale mixture modeling
Jakob Verbeek (1)
Nikos Vlassis (a, 1)
Jan Nunnink (b, 1)
9th Annual Conference of the Advanced School for Computing and Imaging (ASCI '03) (2003) 136--143
Abstract: Mixture densities constitute a rich family of models that can be used in several data mining and machine learning applications, for instance, clustering. Although practical algorithms exist for learning such models from data, these algorithms typically do not scale very well with large datasets. Our approach, which builds on previous work by other authors, offers an acceleration of the EM algorithm for Gaussian mixtures by precomputing and storing sufficient statistics of the data in the nodes of a kd-tree. Contrary to other works, we obtain algorithms that strictly increase a lower bound on the data log-likelihood in every learning step. Experimental results illustrate the validity of our approach.
- a – Technical University of Crete
- b – Universiteit van Amsterdam
- 1: Instituut voor Informatica (IvI), Universiteit van Amsterdam
- Domain : Computer Science/Learning
- Keywords : Gaussian mixture – EM algorithm – variational approximation – clustering – very large database
- Available versions : v1 (2011-02-03) v2 (2011-03-08)
- http://hal.inria.fr/inria-00321486
- oai:hal.inria.fr:inria-00321486
- From: Jakob Verbeek
- Submitted on: Tuesday, 8 March 2011 15:08:17
- Updated on: Tuesday, 8 March 2011 15:54:31






