Selecting Hidden Markov Chain States Number with Cross-Validated Likelihood

Gilles Celeux 1 Jean-Baptiste Durand 1
1 SELECT - Model selection in statistical learning
LMO - Laboratoire de Mathématiques d'Orsay, Inria Saclay - Ile de France
Abstract : The problem of estimating the number of hidden states in a hidden Markov chain model is considered. Emphasis is placed on cross-validated likelihood criteria. Using cross-validation to assess the number of hidden states makes it possible to circumvent the well-documented technical difficulties of the order identification problem in mixture models. Moreover, from a predictive perspective, it does not require that the sampling distribution belong to one of the competing models. However, computing the cross-validated likelihood for hidden Markov chains is difficult because the data are not independent. Two approaches are proposed to compute the cross-validated likelihood for a hidden Markov chain. The first uses a deterministic half-sampling procedure; the second adapts the EM algorithm for hidden Markov chains to take into account the randomly missing values induced by cross-validation. Numerical experiments on both simulated and real data sets compare different versions of the cross-validated likelihood criterion with penalised likelihood criteria, including BIC and a penalised marginal likelihood criterion. These experiments highlight the promising behaviour of the deterministic half-sampling criterion.
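As an illustration of the ideas in the abstract, the following is a minimal sketch, not the report's implementation. It assumes a Gaussian-emission hidden Markov chain, an odd/even split as one possible reading of "deterministic half-sampling", and parameters that have already been estimated (for example by EM on the training half). The function names `hmm_loglik` and `half_sampling_masks` are hypothetical. The sketch shows how a scaled forward recursion can evaluate the likelihood of a subset of the observations by treating the others as missing, that is, by skipping the emission term at those time steps.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hmm_loglik(obs, init, trans, means, sds, observed=None):
    """Log-likelihood of the observed entries of obs under a Gaussian HMM.

    observed[t] is False when obs[t] is treated as missing: the hidden
    chain is still propagated through time t, but no emission term is
    applied, which marginalises the missing value out.
    """
    k = len(init)
    if observed is None:
        observed = [True] * len(obs)
    loglik = 0.0
    alpha = list(init)  # filtered state distribution
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate the state distribution one step forward.
            alpha = [sum(alpha[i] * trans[i][j] for i in range(k))
                     for j in range(k)]
        if observed[t]:
            alpha = [alpha[j] * gaussian_pdf(y, means[j], sds[j])
                     for j in range(k)]
        norm = sum(alpha)  # equals 1 when obs[t] is missing
        loglik += math.log(norm)
        alpha = [a / norm for a in alpha]
    return loglik


def half_sampling_masks(n):
    """Assumed odd/even split: even indices for training, odd for testing."""
    train = [t % 2 == 0 for t in range(n)]
    test = [not m for m in train]
    return train, test


# Usage: score the held-out half under parameters assumed to be already fitted.
obs = [0.1, 2.9, -0.2, 3.1, 0.0, 3.2]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
means, sds = [0.0, 3.0], [1.0, 1.0]
train_mask, test_mask = half_sampling_masks(len(obs))
cv_score = hmm_loglik(obs, init, trans, means, sds, observed=test_mask)
```

In a model-selection loop, such a score would be computed for each candidate number of hidden states, and the candidate with the highest held-out log-likelihood would be retained. The parameter-fitting step is deliberately left out of this sketch.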
Document type :
Reports

https://hal.inria.fr/inria-00071392
Contributor : Rapport de Recherche Inria
Submitted on : Tuesday, May 23, 2006 - 5:08:56 PM
Last modification on : Wednesday, September 16, 2020 - 5:07:09 PM
Long-term archiving on : Sunday, April 4, 2010 - 10:09:50 PM

Identifiers

  • HAL Id : inria-00071392, version 1

Citation

Gilles Celeux, Jean-Baptiste Durand. Selecting Hidden Markov Chain States Number with Cross-Validated Likelihood. [Research Report] RR-5877, INRIA. 2006. ⟨inria-00071392⟩
