Mixability is Bayes Risk Curvature Relative to Log Loss

Tim Van Erven 1, 2, Mark D. Reid 3, 4, Robert C. Williamson 3, 4
1 SELECT - Model selection in statistical learning
Inria Saclay - Ile de France, LMO - Laboratoire de Mathématiques d'Orsay, CNRS - Centre National de la Recherche Scientifique : UMR
Abstract: Mixability of a loss characterizes fast rates in the online learning setting of prediction with expert advice. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses, where there are few existing results. We show that mixability is governed by the maximum eigenvalue of the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.
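To make the binary-case statement concrete, here is a minimal numerical sketch. It assumes the characterisation takes the form of the smallest ratio, over p in (0, 1), of the second derivative of the Bayes risk for log loss to the second derivative of the Bayes risk for the loss in question; the abstract does not spell out the formula, so this form is an assumption. The function names, the grid, the finite-difference step, and the choice of square loss as the example are all illustrative and not taken from the paper.

```python
import math


def second_derivative(f, p, h=1e-4):
    """Central finite-difference estimate of f''(p)."""
    return (f(p + h) - 2.0 * f(p) + f(p - h)) / (h * h)


def bayes_risk_log(p):
    """Bayes risk of log loss for a binary outcome with P(y = 1) = p."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def bayes_risk_square(p):
    """Bayes risk of the assumed example loss (y - p_hat)^2, y in {0, 1}."""
    return p * (1.0 - p)


def curvature_ratio_min(bayes_risk, grid_size=99):
    """Smallest ratio of log-loss curvature to the given loss's curvature.

    Evaluated on an evenly spaced grid strictly inside (0, 1). Under the
    assumption stated in the lead-in, this approximates the mixability
    constant of a binary proper loss.
    """
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(
        second_derivative(bayes_risk_log, p) / second_derivative(bayes_risk, p)
        for p in grid
    )


eta_square = curvature_ratio_min(bayes_risk_square)
eta_log = curvature_ratio_min(bayes_risk_log)
print(f"square loss: {eta_square:.4f}, log loss: {eta_log:.4f}")
```

For this square loss the ratio is smallest at p = 1/2 and comes out close to 2, while log loss compared with itself gives 1. A loss normalised differently (for instance, summing the squared error over both classes) would scale the result accordingly.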

https://hal.inria.fr/hal-00758204
Contributor: Tim Van Erven
Submitted on: Wednesday, November 28, 2012 - 12:21:18 PM
Last modification on: Thursday, February 7, 2019 - 4:16:24 PM
Document(s) archived on: Saturday, December 17, 2016 - 4:03:17 PM

File

vanerven12a.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-00758204, version 1

Citation

Tim Van Erven, Mark D. Reid, Robert C. Williamson. Mixability is Bayes Risk Curvature Relative to Log Loss. Journal of Machine Learning Research, special issue on Inductive Logic Programming, Microtome Publishing, 2012, pp.1639−1663. ⟨hal-00758204⟩
