CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Direct Policy Search (DPS) is a widely used tool for reinforcement learning; however, it is usually not suitable for handling high-dimensional constrained action spaces such as those arising in power system control (unit commitment problems). We propose Direct Value Search, a hybridization of DPS with Bellman decomposition techniques. We prove runtime properties, and apply the results to an energy management problem.
https://hal.inria.fr/hal-00997562
Contributor : Jérémie Decock
Submitted on : Wednesday, June 4, 2014 - 11:54:07 PM
Last modification on : Thursday, July 8, 2021 - 3:48:25 AM
Long-term archiving on : Thursday, September 4, 2014 - 10:41:33 AM
Jérémie Decock, Jean-Joseph Christophe, Olivier Teytaud. Optimization of Energy Policies Using Direct Value Search. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. 2014. ⟨hal-00997562⟩