Conference papers

Combining policies: the best of human expertise and neurocontrol

Vincent Berthier 1, 2 Adrien Couëtoux 1, 2 Olivier Teytaud 1, 2 
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : We consider sequential decision making in the case where a generative model and a parametric policy are available. Such a framework is naturally tackled with Direct Policy Search, i.e. parametric optimisation over simulations. We propose a simple method that combines this parametric policy with a more generic neural network, where all parameters are trained simultaneously. As such, our approach doesn't require any computational overhead. We show that the resulting policy significantly outperforms both the domain-specific policies and the neural network on a unit commitment test problem.
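The abstract does not say how the two policies are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an additive combination (expert action plus a neural-network correction), a toy one-dimensional control problem standing in for the unit commitment benchmark, and a (1+1) evolution strategy as the Direct Policy Search routine. All names (`expert_action`, `network_action`, `combined_action`, `simulate`, `direct_policy_search`) and all constants are hypothetical.

```python
import math
import random

HIDDEN = 3  # hypothetical hidden-layer width


def expert_action(state, theta):
    # Hypothetical domain-specific policy: a proportional controller with one gain.
    return -theta[0] * state


def network_action(state, weights):
    # One tanh hidden layer with a scalar output.
    # Weights are laid out as (w_in, b_in, w_out) for each hidden unit.
    out = 0.0
    for h in range(HIDDEN):
        w_in, b_in, w_out = weights[3 * h: 3 * h + 3]
        out += w_out * math.tanh(w_in * state + b_in)
    return out


def combined_action(state, params):
    # ASSUMPTION: the two policies are combined additively.
    # params = [expert gain] + network weights, all in one flat vector.
    theta, weights = params[:1], params[1:]
    return expert_action(state, theta) + network_action(state, weights)


def simulate(params, seed, horizon=20):
    # Toy generative model: noisy 1-D dynamics with a quadratic cost.
    rng = random.Random(seed)
    state, cost = 1.0, 0.0
    for _ in range(horizon):
        action = combined_action(state, params)
        state = state + action + rng.gauss(0.0, 0.1)
        cost += state * state + 0.1 * action * action
    return cost


def average_cost(params, seeds=range(5)):
    # Fixed seeds make the evaluation deterministic across candidates.
    return sum(simulate(params, s) for s in seeds) / len(seeds)


def direct_policy_search(params, iterations=200, sigma=0.1, seed=0):
    # (1+1) evolution strategy: perturb every parameter at once and
    # keep the candidate only if its simulated cost is no worse.
    rng = random.Random(seed)
    best, best_cost = list(params), average_cost(params)
    for _ in range(iterations):
        candidate = [p + sigma * rng.gauss(0.0, 1.0) for p in best]
        cost = average_cost(candidate)
        if cost <= best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Start from the expert policy alone: network weights are all zero.
initial = [0.5] + [0.0] * (3 * HIDDEN)
tuned, tuned_cost = direct_policy_search(initial)
```

In this sketch the expert gain and the network weights sit in a single parameter vector, so one search loop tunes both, and each candidate still costs the same number of simulations as tuning either policy alone. Initialising the network weights to zero means the search starts exactly at the expert policy.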

Cited literature [24 references]
Contributor : Olivier Teytaud
Submitted on : Monday, September 7, 2015 - 10:42:01 AM
Last modification on : Saturday, June 25, 2022 - 10:17:28 PM
Long-term archiving on : Tuesday, December 8, 2015 - 11:38:11 AM


Files produced by the author(s)


  • HAL Id : hal-01194516, version 1


Vincent Berthier, Adrien Couëtoux, Olivier Teytaud. Combining policies: the best of human expertise and neurocontrol. Artificial Evolution 2015, 2015, Lyon, France. To appear. ⟨hal-01194516⟩


