Cross-validation failure: small sample sizes lead to large error bars - Inria - Institut national de recherche en sciences et technologies du numérique
Journal article, NeuroImage, Year: 2017

Cross-validation failure: small sample sizes lead to large error bars

Abstract

Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establishing their validity and usefulness is cross-validation, testing prediction on unseen data. Here, I would like to raise awareness of the error bars of cross-validation, which are often underestimated. Simple experiments show that the sample sizes of many neuroimaging studies inherently lead to large error bars, e.g. ±10% for 100 samples. The standard error across folds strongly underestimates them. These large error bars compromise the reliability of conclusions drawn with predictive models, such as biomarkers or methods development, where, unlike with cognitive neuroimaging MVPA approaches, more samples cannot be acquired by repeating the experiment across many subjects. Solutions to increase sample size must be investigated, tackling possible increases in heterogeneity of the data.
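
The order of magnitude quoted above (about ±10% for 100 samples) can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's experiment: it assumes that each test prediction is an independent Bernoulli trial and uses the normal approximation to the binomial distribution. The function name `binomial_accuracy_halfwidth` and the chance-level accuracy of 0.5 are illustrative assumptions.

```python
import math


def binomial_accuracy_halfwidth(n_test, accuracy=0.5, z=1.96):
    """Half-width of an approximate 95% interval on a measured accuracy.

    Assumption (not from the paper): the n_test predictions are independent
    Bernoulli trials, so the measured accuracy follows a binomial law and
    the normal approximation applies.
    """
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_test)


# How the interval shrinks as the number of test samples grows.
for n in (30, 100, 300, 1000):
    half_width = binomial_accuracy_halfwidth(n)
    print(f"n = {n:4d}: about +/- {100 * half_width:.1f}% accuracy")
```

Under these assumptions the interval at 100 samples is close to ±10%, and it shrinks only with the square root of the sample size. Cross-validation folds share training data and are therefore not independent, so this simple model should be read as a rough guide rather than as the error bars of an actual cross-validation.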
Main file
paper.pdf (789.04 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01545002, version 1 (22-06-2017)

Cite

Gaël Varoquaux. Cross-validation failure: small sample sizes lead to large error bars. NeuroImage, 2017, ⟨10.1016/j.neuroimage.2017.06.061⟩. ⟨hal-01545002⟩