Journal articles

Evolutionary clustering for categorical data using parametric links among multinomial mixture models

Md Abul Hasnat 1 Julien Velcin 1, 2 Stephane Bonnevay 1 Julien Jacques 3, 1, 4, 2
3 MODAL - MOdel for Data Analysis and Learning
Inria Lille - Nord Europe, LPP - Laboratoire Paul Painlevé - UMR 8524, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille, Université de Lille, Sciences et Technologies
Abstract : In this paper, we propose a novel evolutionary clustering method for temporal categorical data based on parametric links among multinomial mixture models. Besides clustering, our main goal is to interpret the evolution of clusters over time. To this end, we first formulate a generalized model that establishes parametric links between two multinomial mixture models. We then define different parametric sub-models in order to model typical evolutions of the clustering structure. Model selection criteria make it possible to select the best sub-models and thus to infer the clustering evolution. In the experiments, we first evaluate the proposed method on synthetic temporal data. Next, we apply it to the analysis of annotated social media data. Results show that the proposed method outperforms state-of-the-art methods on common evaluation metrics. Additionally, it provides an interpretation of the temporal evolution of the clusters.

https://hal.inria.fr/hal-01204613
Contributor : Julien Jacques
Submitted on : Wednesday, February 27, 2019 - 4:11:04 PM
Last modification on : Tuesday, December 8, 2020 - 9:44:59 AM

File

PLMM.pdf
Files produced by the author(s)

Citation

Md Abul Hasnat, Julien Velcin, Stephane Bonnevay, Julien Jacques. Evolutionary clustering for categorical data using parametric links among multinomial mixture models. Econometrics and Statistics, Elsevier, 2017, 3, pp.141-159. ⟨10.1016/j.ecosta.2017.03.004⟩. ⟨hal-01204613v3⟩
