Journal article, Journal of Machine Learning Research, 2022

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Abstract

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approximately. Results on regression and classification problems reveal computational benefits for hyperparameter optimization, especially when multiple hyperparameters are required.
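As a rough illustration of the two differentiation schemes mentioned in the abstract, below is a minimal sketch for the Lasso, a non-smooth inner problem, with a single regularization hyperparameter lam. The function names (soft_threshold, forward_mode_ista, implicit_hypergrad) are illustrative and not taken from the paper or its code; the sketch only assumes standard proximal gradient (ISTA) updates and the usual Lasso optimality conditions restricted to the active set.

    import numpy as np

    def soft_threshold(z, t):
        """Proximal operator of t * ||.||_1."""
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def forward_mode_ista(X, y, lam, n_iter=500):
        """Run proximal gradient descent (ISTA) on the Lasso
            min_b  ||y - X b||^2 / (2 n) + lam * ||b||_1
        while propagating the Jacobian d beta / d lam by forward-mode
        (iterative) differentiation of each update."""
        n, p = X.shape
        L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the smooth part
        beta = np.zeros(p)
        jac = np.zeros(p)                        # current estimate of d beta / d lam
        for _ in range(n_iter):
            grad = X.T @ (X @ beta - y) / n
            beta = soft_threshold(beta - grad / L, lam / L)
            support = beta != 0
            # Chain rule through soft-thresholding: the Jacobian is zero outside
            # the support, and d ST(z, t)/dt = -sign(beta) on the support (t = lam / L).
            jac = support * (jac - X.T @ (X @ jac) / (n * L)) - support * np.sign(beta) / L
        return beta, jac

    def implicit_hypergrad(X, y, X_val, y_val, beta, lam):
        """Implicit differentiation restricted to the support of beta:
        the Lasso optimality conditions on the active set S give
        d beta_S / d lam = -n (X_S^T X_S)^{-1} sign(beta_S), and zero elsewhere."""
        n = X.shape[0]
        S = beta != 0
        jac = np.zeros_like(beta)
        if S.any():
            XS = X[:, S]
            jac[S] = np.linalg.solve(XS.T @ XS / n, -np.sign(beta[S]))
        # Hypergradient of the validation loss ||y_val - X_val beta||^2 / (2 n_val).
        grad_val = X_val.T @ (X_val @ beta - y_val) / X_val.shape[0]
        return jac @ grad_val

    # Usage sketch, assuming train/validation splits are available:
    # beta, jac = forward_mode_ista(X_train, y_train, lam)
    # hg_forward = jac @ (X_val.T @ (X_val @ beta - y_val)) / X_val.shape[0]
    # hg_implicit = implicit_hypergrad(X_train, y_train, X_val, y_val, beta, lam)

The second routine hints at why non-smoothness can be an advantage: since the Jacobian is supported on the active set of the Lasso solution, the linear system to solve involves only the (typically few) active features, which is the kind of speed-up the abstract refers to.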
Main file: journal.pdf (1.41 MB). Origin: Files produced by the author(s).

Dates and versions

hal-03228663, version 1 (18-05-2021)
hal-03228663, version 2 (18-10-2022)

Identifiers

  • HAL Id: hal-03228663, version 2

Cite

Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, et al. Implicit differentiation for fast hyperparameter selection in non-smooth convex learning. Journal of Machine Learning Research, 2022, 23 (149), pp. 1-48. ⟨hal-03228663v2⟩