F. R. Bach and M. I. Jordan, Learning Graphical Models for Stationary Time Series, IEEE Transactions on Signal Processing, vol.52, issue.8, pp.2189-2199, 2004.
DOI : 10.1109/TSP.2004.831032

M. F. Balcan, A. Blum, and S. Vempala, A discriminative framework for clustering via similarity functions, Proceedings of the fortieth annual ACM symposium on Theory of computing, STOC '08, pp.671-680, 2008.
DOI : 10.1145/1374376.1374474

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2000.

P. Billingsley, Statistical Methods in Markov Chains, The Annals of Mathematical Statistics, vol.32, issue.1, pp.12-40, 1961.
DOI : 10.1214/aoms/1177705136

P. Billingsley, Convergence of Probability Measures, 1999.
DOI : 10.1002/9780470316962

D. Bosq, Nonparametric Statistics for Stochastic Processes. Estimation and Prediction, 1996.
DOI : 10.1007/978-1-4612-1718-3

R. C. Bradley, Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions, Probability Surveys, pp.107-144, 2005.
DOI : 10.1214/154957805100000104

E. Carlstein and S. Lele, Nonparametric Change-Point Estimation for Data from an Ergodic Sequence, Theory of Probability & its Applications, pp.726-733, 1994.
DOI : 10.1137/1138073

R. Cilibrasi and P. M. Vitanyi, Clustering by Compression, IEEE Transactions on Information Theory, vol.51, issue.4, pp.1523-1545, 2005.
DOI : 10.1109/TIT.2005.844059
URL : http://arxiv.org/abs/cs/0312044

R. Grossi and J. S. Vitter, Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching, SIAM Journal on Computing, vol.35, issue.2, pp.378-407, 2005.
DOI : 10.1137/S0097539702402354
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.968

M. Gutman, Asymptotically optimal classification for multiple tests with empirically observed statistics, IEEE Transactions on Information Theory, vol.35, issue.2, pp.402-408, 1989.
DOI : 10.1109/18.32134

T. Jebara, Y. Song, and K. Thadani, Spectral Clustering and Embedding with Hidden Markov Models, Machine Learning: ECML 2007, pp.164-175, 2007.
DOI : 10.1007/978-3-540-74958-5_18

I. Katsavounidis, C. Kuo, and Z. Zhang, A new initialization technique for generalized Lloyd iteration, IEEE Signal Processing Letters, vol.1, issue.10, pp.144-146, 1994.
DOI : 10.1109/97.329844

A. Khaleghi and D. Ryabko, Locating changes in highly-dependent data with unknown number of change points, Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, United States, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00765436

A. Khaleghi, D. Ryabko, J. Mary, and P. Preux, Online clustering of processes, International Conference on Artificial Intelligence and Statistics (AI&STATS), JMLR W&CP 22, pp.601-609, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00765462

A. Khaleghi and D. Ryabko, Asymptotically consistent estimation of the number of change points in highly dependent time series, Proceedings of the 31st International Conference on Machine Learning (ICML), JMLR W&CP, pp.539-547, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01026583

J. Kleinberg, An impossibility theorem for clustering, Neural Information Processing Systems (NIPS), pp.446-453, 2002.

I. Kontoyiannis and Y. M. Suhov, Prefixes and the entropy rate for long-range sources, Proceedings of 1994 IEEE International Symposium on Information Theory, p.194, 1994.
DOI : 10.1109/ISIT.1994.394774

M. Kumar, N. R. Patel, and J. Woo, Clustering seasonality patterns in the presence of errors, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, pp.557-563, 2002.
DOI : 10.1145/775047.775129

E. Lehmann, Testing Statistical Hypotheses, 1986.

L. Li and B. A. Prakash, Time series clustering: Complex is simpler, Proceedings of the 28th International Conference on Machine Learning (ICML), pp.185-192, 2011.

M. Mahajan, P. Nimbhorkar, and K. Varadarajan, The Planar k-Means Problem is NP-Hard, Proceedings of the 3rd International Workshop on Algorithms and Computation (WALCOM), pp.274-285, 2009.

G. Morvai and B. Weiss, On classifying processes, Bernoulli, vol.11, issue.3, pp.523-532, 2005.
DOI : 10.3150/bj/1120591187

G. Morvai and B. Weiss, A note on prediction for discrete time series, Kybernetika, vol.48, issue.4, pp.809-823, 2012.

D. S. Ornstein and B. Weiss, How Sampling Reveals a Process, The Annals of Probability, vol.18, issue.3, pp.905-930, 1990.
DOI : 10.1214/aop/1176990729

E. Rio, Théorie asymptotique des processus aléatoires faiblement dépendants, 1999.

B. Ryabko, Prediction of random sequences and universal coding, Problems of Information Transmission, pp.87-96, 1988.

B. Ryabko and J. Astola, Universal codes as a basis for time series testing, Statistical Methodology, vol.3, issue.4, pp.375-397, 2006.
DOI : 10.1016/j.stamet.2005.10.004

B. Ryabko, Applications of universal source coding to statistical analysis of time series, Selected Topics in Information and Coding Theory, pp.289-338, 2010.

D. Ryabko, Clustering processes, Proceedings of the 27th International Conference on Machine Learning (ICML), pp.919-926, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00477238

D. Ryabko, Discrimination Between B-Processes is Impossible, Journal of Theoretical Probability, vol.23, issue.2, pp.565-575, 2010.
DOI : 10.1007/s10959-009-0263-1
URL : https://hal.archives-ouvertes.fr/hal-00639537

D. Ryabko, Testing composite hypotheses about discrete ergodic processes, TEST, vol.21, issue.2, pp.317-329, 2012.
DOI : 10.1007/s11749-011-0245-3
URL : https://hal.archives-ouvertes.fr/hal-00639477

D. Ryabko, Uniform hypothesis testing for finite-valued stationary processes, Statistics, vol.48, issue.1, pp.121-128, 2014.
URL : https://hal.archives-ouvertes.fr/inria-00610009

D. Ryabko and B. Ryabko, Nonparametric Statistical Inference for Ergodic Processes, IEEE Transactions on Information Theory, vol.56, issue.3, pp.1430-1435, 2010.
DOI : 10.1109/TIT.2009.2039169
URL : https://hal.archives-ouvertes.fr/inria-00269249

D. Ryabko and J. Mary, A binary-classification-based metric between time-series distributions and its use in statistical and learning problems, Journal of Machine Learning Research, vol.14, pp.2837-2856, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00913240

P. Shields, The Ergodic Theory of Discrete Sample Paths, 1996.
DOI : 10.1090/gsm/013

P. Smyth, Clustering sequences with hidden Markov models, Advances in Neural Information Processing Systems, pp.648-654, 1997.

M. Tschannen and H. Bölcskei, Nonparametric nearest neighbor random process clustering, 2015 IEEE International Symposium on Information Theory (ISIT), pp.1207-1211, 2015.
DOI : 10.1109/ISIT.2015.7282647
URL : http://arxiv.org/abs/1504.05059

E. Ukkonen, On-line construction of suffix trees, Algorithmica, vol.10, issue.3, pp.249-260, 1995.
DOI : 10.1007/BF01206331

S. Zhong and J. Ghosh, A unified framework for model-based clustering, Journal of Machine Learning Research, vol.4, pp.1001-1037, 2003.