Journal articles

A predictive deviance criterion for selecting a generative model in semi-supervised classification

Vincent Vandewalle (1, 2), Christophe Biernacki (3), Gilles Celeux (1), Gérard Govaert (4)
1 SELECT - Model selection in statistical learning
Inria Saclay - Ile de France, LMO - Laboratoire de Mathématiques d'Orsay
3 MODAL - MOdel for Data Analysis and Learning
LPP - Laboratoire Paul Painlevé - UMR 8524, Université de Lille, Sciences et Technologies, Inria Lille - Nord Europe, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille
Abstract: Semi-supervised classification can improve generative classifiers by exploiting the information carried by unlabeled data points, especially when unlabeled data far outnumber labeled data. The aim is to select a generative classification model using both unlabeled and labeled data. A predictive deviance criterion, AIC$_{cond}$, is proposed for selecting a parsimonious and relevant generative classifier in the semi-supervised context. In contrast to standard information criteria such as AIC and BIC, AIC$_{cond}$ focuses on the classification task: it attempts to measure the predictive power of a generative model by approximating its predictive deviance. At the same time, it avoids the computational cost of cross-validation criteria, which make repeated use of the EM algorithm. AIC$_{cond}$ is proved to have consistency properties that ensure its parsimony when compared with the Bayesian Entropy Criterion (BEC), whose focus is similar to that of AIC$_{cond}$. Numerical experiments on both simulated and real data sets show that AIC$_{cond}$ behaves encouragingly in variable and model selection when compared with competing criteria.
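To make the notion of predictive deviance concrete, here is a minimal sketch, not the authors' implementation. It assumes a one-dimensional Gaussian generative classifier fitted by a semi-supervised EM algorithm, and then evaluates the plug-in conditional deviance $-2 \sum_i \log p(z_i \mid x_i; \hat{\theta})$ on the labeled points. The function names, the Gaussian model and the simulated data are all hypothetical choices made for illustration. The quantity computed is only the raw deviance that a criterion such as AIC$_{cond}$ would approximate or correct; the abstract does not state the correction terms, so none are reproduced here.

```python
import numpy as np


def log_joint(x, pi, mu, var):
    """Return an (n, K) array of log p(x_i, z_i = k) for a 1-D Gaussian model."""
    x = x[:, None]
    log_dens = -0.5 * (np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)
    return np.log(pi) + log_dens


def semi_supervised_em(x_lab, z_lab, x_unl, n_classes, n_iter=100):
    """EM in which labeled points keep fixed (one-hot) class memberships.

    Assumes every class has at least one labeled point (illustrative restriction).
    """
    x_all = np.concatenate([x_lab, x_unl])
    t_lab = np.eye(n_classes)[z_lab]  # fixed memberships of labeled points

    # Initialise the parameters from the labeled sample only.
    nk = t_lab.sum(axis=0)
    pi = nk / nk.sum()
    mu = (t_lab * x_lab[:, None]).sum(axis=0) / nk
    var = np.maximum((t_lab * (x_lab[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)

    for _ in range(n_iter):
        # E-step: posterior class probabilities for unlabeled points only.
        lj = log_joint(x_unl, pi, mu, var)
        lj -= lj.max(axis=1, keepdims=True)  # stabilise the exponentials
        t_unl = np.exp(lj)
        t_unl /= t_unl.sum(axis=1, keepdims=True)
        t = np.vstack([t_lab, t_unl])

        # M-step: weighted maximum-likelihood updates on all points.
        nk = t.sum(axis=0)
        pi = nk / nk.sum()
        mu = (t * x_all[:, None]).sum(axis=0) / nk
        var = np.maximum((t * (x_all[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)
    return pi, mu, var


def conditional_deviance(x_lab, z_lab, pi, mu, var):
    """Plug-in deviance -2 * sum_i log p(z_i | x_i) on the labeled points."""
    lj = log_joint(x_lab, pi, mu, var)
    m = lj.max(axis=1, keepdims=True)
    log_marginal = (m + np.log(np.exp(lj - m).sum(axis=1, keepdims=True)))[:, 0]
    log_cond = lj[np.arange(len(z_lab)), z_lab] - log_marginal
    return -2.0 * log_cond.sum()


# Simulated example: few labeled points, many unlabeled points.
rng = np.random.default_rng(0)
x_lab = np.concatenate([rng.normal(-2.0, 1.0, 10), rng.normal(2.0, 1.0, 10)])
z_lab = np.repeat([0, 1], 10)
x_unl = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])

pi, mu, var = semi_supervised_em(x_lab, z_lab, x_unl, n_classes=2)
deviance = conditional_deviance(x_lab, z_lab, pi, mu, var)
```

Evaluating this deviance on the same labeled points used for fitting is optimistic. A cross-validated version would rerun the EM fit once per fold, which is the computational cost the abstract says AIC$_{cond}$ is designed to avoid.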
Document type: Journal articles

https://hal.inria.fr/hal-00778130
Contributor: Erwan Le Pennec
Submitted on: Friday, January 18, 2013 - 5:38:18 PM
Last modification on: Sunday, June 26, 2022 - 11:57:51 AM

Citation

Vincent Vandewalle, Christophe Biernacki, Gilles Celeux, Gérard Govaert. A predictive deviance criterion for selecting a generative model in semi-supervised classification. Computational Statistics and Data Analysis, Elsevier, 2013, 64, pp.220-236. ⟨10.1016/j.csda.2013.02.010⟩. ⟨hal-00778130⟩
