Book Sections, Year: 2023

Evaluating machine learning models and their diagnostic value


This chapter describes model validation, a crucial part of machine learning, whether the goal is to select the best model or to assess the risk of a given model. We start by detailing the main performance metrics for different tasks (classification, regression) and how they may be interpreted, including in the face of class imbalance, varying prevalence, or asymmetric cost-benefit trade-offs. We then explain how to estimate these metrics in an unbiased manner using training, validation, and test sets. We describe cross-validation procedures, which use a larger part of the data for both training and testing, and the dangers of data leakage, an optimism bias due to training data contaminating the test set. Finally, we discuss how to obtain confidence intervals for performance metrics, distinguishing two situations: internal validation, or evaluation of learning algorithms, and external validation, or evaluation of resulting prediction models.
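To illustrate why varying prevalence matters when interpreting classification metrics, here is a minimal sketch that is not taken from the chapter itself. It applies Bayes' rule to convert sensitivity and specificity into positive and negative predictive values (PPV, NPV) at a given prevalence. The function name `predictive_values` and the example figures (90% sensitivity and specificity) are hypothetical, chosen only for illustration, and the sketch assumes a prevalence strictly between 0 and 1.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary classifier via Bayes' rule.

    Illustrative helper (hypothetical name); assumes 0 < prevalence < 1
    and that the classifier is not degenerate (denominators non-zero).
    """
    # Expected fraction of the population in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive prediction)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative prediction)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical classifier evaluated at three different prevalences.
    for prevalence in (0.5, 0.1, 0.01):
        ppv, npv = predictive_values(0.9, 0.9, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed figures, the same classifier has a PPV of 0.9 on a balanced population but only 0.5 when prevalence drops to 10%, which is one reason a metric measured on one population may not transfer to another.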
Main file
main.pdf (1.66 MB)
Origin : Files produced by the author(s)

Dates and versions

hal-03682454, version 1 (31-05-2022)
hal-03682454, version 2 (01-06-2022)
hal-03682454, version 3 (02-06-2022)
hal-03682454, version 4 (21-01-2023)


  • HAL Id: hal-03682454, version 4


Gaël Varoquaux, Olivier Colliot. Evaluating machine learning models and their diagnostic value. Olivier Colliot. Machine Learning for Brain Disorders, Springer, In press. ⟨hal-03682454v4⟩

