Consistent Algorithms for Clustering Time Series

Azadeh Khaleghi; Daniil Ryabko; Jérémie Mary; Philippe Preux

Journal article, Journal of Machine Learning Research, Year: 2016



Azadeh Khaleghi

Role: Author
PersonId : 993820

Department of Mathematics & Statistics [Lancaster]

Daniil Ryabko

Role: Author
PersonId : 848126

Sequential Learning

Jérémie Mary

Role: Author
PersonId : 740984
IdHAL : jeremie-mary

Sequential Learning

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Philippe Preux

Role: Author
PersonId : 5488
IdHAL : preux-philippe
IdRef : 059896353

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Sequential Learning

Abstract

The problem of clustering is considered for the case where every point is a time series. The time series are either given in one batch (offline setting), or they are allowed to grow with time and new time series can be added along the way (online setting). We propose a natural notion of consistency for this problem, and show that there are simple, computationally efficient algorithms that are asymptotically consistent under extremely weak assumptions on the distributions that generate the data. The notion of consistency is as follows. A clustering algorithm is called consistent if it places two time series into the same cluster if and only if the distribution that generates them is the same. In the considered framework the time series are allowed to be highly dependent, and the dependence can have arbitrary form. If the number of clusters is known, the only assumption we make is that the (marginal) distribution of each time series is stationary ergodic. No parametric, memory or mixing assumptions are made. When the number of clusters is unknown, stronger assumptions are provably necessary, but it is still possible to devise nonparametric algorithms that are consistent under very general conditions. The theoretical findings of this work are illustrated with experiments on both synthetic and real data.
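The consistency notion in the abstract (same cluster if and only if same generating distribution) can be illustrated with a minimal sketch for the offline setting with a known number of clusters. The abstract does not specify a distance or a clustering procedure, so everything below is an assumption made for illustration: the function names (`word_frequencies`, `empirical_distance`, `cluster`), the choice of comparing empirical frequencies of short words with weights 2^-m, the farthest-point selection of cluster centers, and the toy Bernoulli data. It is not presented as the authors' implementation.

```python
import random
from collections import Counter

def word_frequencies(seq, m):
    """Empirical frequencies of all length-m words (as tuples) occurring in seq."""
    n = len(seq) - m + 1
    counts = Counter(tuple(seq[i:i + m]) for i in range(n))
    return {w: c / n for w, c in counts.items()}

def empirical_distance(x, y, max_len=3):
    """Assumed distance: weighted sum, over word lengths, of the L1 gap
    between the word frequencies of the two sequences."""
    total = 0.0
    for m in range(1, max_len + 1):
        fx, fy = word_frequencies(x, m), word_frequencies(y, m)
        gap = sum(abs(fx.get(w, 0.0) - fy.get(w, 0.0)) for w in set(fx) | set(fy))
        total += 2.0 ** (-m) * gap
    return total

def cluster(series, k, max_len=3):
    """Pick k centers by farthest-point selection, then assign every
    series to its nearest center. Returns one label per series."""
    centers = [0]  # start from the first series
    while len(centers) < k:
        # next center: the series farthest from its nearest existing center
        far = max(
            range(len(series)),
            key=lambda i: min(
                empirical_distance(series[i], series[c], max_len) for c in centers
            ),
        )
        centers.append(far)
    return [
        min(range(k), key=lambda j: empirical_distance(s, series[centers[j]], max_len))
        for s in series
    ]

def bernoulli(p, n, rng):
    """Toy stationary ergodic source: i.i.d. binary samples with P(1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

rng = random.Random(0)
series = [
    bernoulli(0.2, 2000, rng),
    bernoulli(0.8, 2000, rng),
    bernoulli(0.2, 2000, rng),
    bernoulli(0.8, 2000, rng),
]
labels = cluster(series, k=2)
print(labels)
```

In this toy run, series generated by the same source should share a label and series from different sources should not, which is the behavior the consistency notion asks for as the sequences grow. The i.i.d. data is only for convenience; the framework in the abstract allows highly dependent series.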

Keywords

clustering, time series, ergodicity, unsupervised learning

Domains

Machine Learning [cs.LG], Information Theory [cs.IT], Statistics [math.ST], Theory [stat.TH]

Main file

khaleghi16a.pdf (477.17 KB)

Origin: Files produced by the author(s)

Contributor: Daniil Ryabko

https://inria.hal.science/hal-01399613

Submitted on: Saturday, November 19, 2016, 21:37:12

Last modified on: Thursday, March 28, 2024, 13:22:03

Dates and versions

hal-01399613, version 1 (19-11-2016)

Identifiers

HAL Id: hal-01399613, version 1

Cite

Azadeh Khaleghi, Daniil Ryabko, Jérémie Mary, Philippe Preux. Consistent Algorithms for Clustering Time Series. Journal of Machine Learning Research, 2016, 17 (3), pp. 1-32. ⟨hal-01399613⟩


Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

160 views

383 downloads
