Bayesian mixture models (in)consistency for the number of clusters - Inria - Institut national de recherche en sciences et technologies du numérique
Preprints, Working Papers, ... (Preprint) Year: 2023

Bayesian mixture models (in)consistency for the number of clusters

Abstract

Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, their application to clustering has some limitations. Recent results proved that, for Dirichlet process and Pitman–Yor process mixture models, the posterior on the number of clusters is inconsistent when the true number of clusters is finite. We extend these results to additional Bayesian nonparametric priors such as Gibbs-type processes and finite-dimensional representations thereof. The latter include the Dirichlet multinomial process and the recently proposed Pitman–Yor and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a post-processing algorithm introduced for the Dirichlet process can be extended to more general models and provides a consistent method to estimate the number of components.
Main file: 2210.14201v2.pdf (1.95 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03866434, version 1 (22-11-2022)
hal-03866434, version 2 (22-02-2023)

Cite

Louise Alamichel, Daria Bystrova, Julyan Arbel, Guillaume Kon Kam King. Bayesian mixture models (in)consistency for the number of clusters. 2023. ⟨hal-03866434v2⟩