Non-Markovian Reinforcement Learning for Reactive Grid scheduling

Julien Perez (1), Balázs Kégl (2, 3, 4), Cecile Germain-Renaud (2, 4)
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Two issues recur when solving real-world policy search problems. First, the variables defining the Markov decision process (MDP) are often continuous, which requires either discretizing the state/action space or using a regression model, often non-linear, to approximate the Q-function needed in the reinforcement learning paradigm. Second, the Markovian hypothesis is often highly questionable, and assuming it can lead to unacceptably suboptimal policies. In this paper, the job scheduling problem in grid infrastructures is modeled as a multi-objective reinforcement learning problem over a continuous state-action space, under realistic assumptions; the high-level goals of users, administrators, and shareholders are captured through simple utility functions. Formalizing the problem as a partially observable Markov decision process (POMDP), we detail a fitted Q-function learning algorithm that uses an Echo State Network. Experiments conducted on simulations of real grid activity demonstrate a significant gain for the method over the native scheduling infrastructure and over a classic feed-forward neural network (FFNN) trained by back-propagation for Q-function learning, in the most difficult cases.
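As an illustration of the idea named in the abstract, the following is a minimal sketch of fitted Q-function learning with an Echo State Network as the regressor. It is not the authors' implementation: the function names (`make_reservoir`, `step`, `run`, `fitted_q`), the hyperparameters (reservoir size, spectral radius, ridge penalty, discount factor) and the toy trajectory are all assumptions chosen for the example. The reservoir weights are random and fixed, and only the linear readout is refit at each iteration by ridge regression on the bootstrapped targets.

```python
import numpy as np


def make_reservoir(n_in, n_res=40, spectral_radius=0.9, seed=0):
    """Random, fixed ESN weights (hypothetical sizes); only the readout is trained."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def step(w_in, w, x, u):
    """One reservoir update from state x with input u."""
    return np.tanh(w_in @ u + w @ x)


def run(w_in, w, inputs):
    """Reservoir states along a trajectory; states[t] has seen inputs 0..t."""
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        x = step(w_in, w, x, u)
        states.append(x)
    return np.array(states)


def fitted_q(obs, acts, rewards, candidates, gamma=0.9, n_iter=5,
             ridge=1e-2, seed=0):
    """Fitted Q iteration on one trajectory with a linear ESN readout."""
    inputs = np.column_stack([obs, acts])  # input at time t is (o_t, a_t)
    w_in, w = make_reservoir(inputs.shape[1], seed=seed)
    states = run(w_in, w, inputs)
    w_out = np.zeros(states.shape[1])
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    for _ in range(n_iter):
        targets = np.array(rewards, dtype=float)
        for t in range(len(targets) - 1):
            # Evaluate every candidate action at t+1 from the memory held at t.
            q_next = [step(w_in, w, states[t], np.append(obs[t + 1], a)) @ w_out
                      for a in candidates]
            targets[t] += gamma * max(q_next)
        # Ridge-regression refit of the readout on the bootstrapped targets.
        w_out = np.linalg.solve(gram, states.T @ targets)
    return w_in, w, w_out, states @ w_out


# Toy trajectory (assumed data): 2-D observations, binary actions.
rng = np.random.default_rng(1)
T = 200
obs = rng.normal(size=(T, 2))
acts = rng.integers(0, 2, size=T).astype(float)
rewards = -np.abs(obs[:, 0] - acts)
w_in, w, w_out, q = fitted_q(obs, acts, rewards, candidates=[0.0, 1.0])
print(q[:5])
```

Because the reservoir state summarizes the history of past observations and actions, the readout can estimate Q-values without assuming that the current observation alone is a sufficient state, which is the motivation the abstract gives for moving beyond the Markovian hypothesis.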

https://hal.inria.fr/inria-00586504
Contributor : Cecile Germain
Submitted on : Saturday, April 16, 2011 - 3:37:16 PM
Last modification on : Thursday, April 5, 2018 - 12:30:12 PM
Long-term archiving on : Sunday, July 17, 2011 - 2:27:18 AM

File

esn-cap11.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00586504, version 1

Citation

Julien Perez, Balázs Kégl, Cecile Germain-Renaud. Non-Markovian Reinforcement Learning for Reactive Grid scheduling. Conférence Francophone d'Apprentissage, May 2011, Chambéry, France. ⟨inria-00586504⟩
