Journal articles

A predictive deviance criterion for selecting a generative model in semi-supervised classification

Abstract : Semi-supervised classification can be expected to improve generative classifiers by exploiting the information provided by unlabeled data points, especially when unlabeled data far outnumber labeled data. This paper is concerned with selecting a generative classification model from both unlabeled and labeled data. We propose a predictive deviance criterion, AIC$_{cond}$, that aims to select a parsimonious and relevant generative classifier in the semi-supervised context. In contrast to standard information criteria such as AIC and BIC, AIC$_{cond}$ focuses on the classification task, since it aims to measure the predictive power of a generative model by approximating its predictive deviance. On the other hand, it avoids the computational burden that cross-validation criteria incur through repeated use of the EM algorithm. AIC$_{cond}$ is proved to have consistency properties that ensure its parsimony compared with the Bayesian Entropy Criterion (BEC), which has a similar focus to AIC$_{cond}$. In addition, numerical experiments on both simulated and real data sets show encouraging behavior of AIC$_{cond}$ for variable and model selection in comparison with the other criteria mentioned.

https://hal.inria.fr/inria-00516991
Contributor : Gilles Celeux
Submitted on : Monday, September 13, 2010 - 12:39:14 PM
Last modification on : Friday, November 27, 2020 - 2:18:02 PM
Long-term archiving on : Tuesday, December 14, 2010 - 2:44:12 AM

Files

RR-7377.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00516991, version 1

Citation

Vincent Vandewalle, Christophe Biernacki, Gilles Celeux, Gérard Govaert. A predictive deviance criterion for selecting a generative model in semi-supervised classification. Computational Statistics and Data Analysis, Elsevier, 2013, 64, pp.220-236. ⟨inria-00516991⟩
