A Quasi-Bayesian Perspective to Online Clustering

Le Li (1), Benjamin Guedj (2), Sébastien Loustau (3)
(2) MODAL - MOdel for Data Analysis and Learning
LPP - Laboratoire Paul Painlevé - UMR 8524, Université de Lille, Sciences et Technologies, Inria Lille - Nord Europe, CERIM - Santé publique : épidémiologie et qualité des soins-EA 2694, Polytech Lille - École polytechnique universitaire de Lille
Abstract: When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that our approach is supported by minimax regret bounds. We also provide an RJMCMC-flavored implementation (called PACBO, see https://cran.r-project.org/web/packages/PACBO/index.html) for which we give a convergence guarantee. Finally, numerical experiments illustrate the potential of our procedure.
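
To make the idea of a time-dependent estimate of the number of clusters concrete, here is a minimal Python sketch. It is not PACBO and not the algorithm of the paper: it assumes a generic exponentially weighted (Gibbs-type) quasi-posterior over a small finite set of candidate values of K, with one sequential k-means "expert" per candidate. The names `online_cluster`, `gibbs_weights` and `nearest`, the prior `2 ** -k`, and the parameters `lam` and `k_max` are all hypothetical choices made for illustration.

```python
import math
import random


def nearest(centers, x):
    """Return (index, squared distance) of the center closest to x."""
    best_i, best_d = 0, float("inf")
    for i, c in enumerate(centers):
        d = sum((a - b) ** 2 for a, b in zip(c, x))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def gibbs_weights(cum_losses, lam, prior):
    """Quasi-posterior over K: weight proportional to prior(K) * exp(-lam * loss_K)."""
    ks = sorted(cum_losses)
    logs = [math.log(prior[k]) - lam * cum_losses[k] for k in ks]
    m = max(logs)  # subtract the max before exponentiating, for stability
    unnorm = [math.exp(v - m) for v in logs]
    z = sum(unnorm)
    return {k: u / z for k, u in zip(ks, unnorm)}


def online_cluster(stream, k_max=4, lam=0.5, seed=0):
    """Process points one at a time; return (K drawn at each step, final weights)."""
    rng = random.Random(seed)
    # Assumption: a prior that favours small K (normalised inside gibbs_weights).
    prior = {k: 2.0 ** -k for k in range(1, k_max + 1)}
    experts = {k: {"centers": [], "counts": []} for k in prior}
    cum_losses = {k: 0.0 for k in prior}
    chosen = []
    for x in stream:
        # Draw the current number of clusters from the quasi-posterior.
        w = gibbs_weights(cum_losses, lam, prior)
        ks = list(w)
        chosen.append(rng.choices(ks, weights=[w[k] for k in ks])[0])
        for k, e in experts.items():
            # Loss is measured before the expert sees x (0 if it has no center yet).
            if e["centers"]:
                i, d = nearest(e["centers"], x)
                cum_losses[k] += d
            if len(e["centers"]) < k:
                # The expert still has free slots: x becomes a new center.
                e["centers"].append(list(x))
                e["counts"].append(1)
            else:
                # Sequential k-means step: move the nearest center toward x.
                e["counts"][i] += 1
                c = e["centers"][i]
                for j, xj in enumerate(x):
                    c[j] += (xj - c[j]) / e["counts"][i]
    return chosen, gibbs_weights(cum_losses, lam, prior)
```

On a stream alternating between two well-separated points, for instance `online_cluster([(0.0, 0.0), (10.0, 10.0)] * 10)`, the K = 1 expert accumulates a large loss while the K = 2, 3 and 4 experts all accumulate the same loss, so the final weights are driven by the prior and concentrate on K = 2.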

https://hal.inria.fr/hal-01264233
Contributor: Benjamin Guedj
Submitted on: Friday, May 25, 2018 - 5:54:31 PM
Last modification on: Tuesday, May 28, 2019 - 4:16:05 PM
Long-term archiving on: Sunday, August 26, 2018 - 2:24:11 PM

File

main.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

Citation

Le Li, Benjamin Guedj, Sébastien Loustau. A Quasi-Bayesian Perspective to Online Clustering. Electronic Journal of Statistics, Shaker Heights, OH: Institute of Mathematical Statistics, 2018, ⟨10.1214/18-EJS1479⟩. ⟨hal-01264233v4⟩

Metrics

Record views: 1075
File downloads: 595