Non-Markovian Reinforcement Learning for Reactive Grid scheduling

Julien Perez (1), Balázs Kégl (2, 3, 4), Cecile Germain-Renaud (2, 4)
(2) TAO - Machine Learning and Optimisation: CNRS UMR8623, Inria Saclay - Île-de-France, Université Paris-Sud (Paris 11), LRI - Laboratoire de Recherche en Informatique
Abstract: Two recurrent questions arise when solving many real-world policy search problems. First, the variables defining the so-called Markov Decision Process are often continuous, which requires either discretizing the state/action space or using a regression model, often non-linear, to approximate the Q-function needed in the reinforcement learning paradigm. Second, the Markov hypothesis is usually made, although it is often highly questionable and can lead to unacceptably suboptimal policies. In this paper, the job scheduling problem in grid infrastructures is modeled as a continuous state-action space, multi-objective reinforcement learning problem, under realistic assumptions; the high-level goals of users, administrators, and shareholders are captured through simple utility functions. Formalizing the problem as a partially observable Markov decision process (POMDP), we detail an algorithm for fitted Q-function learning using an Echo State Network (ESN). Experiments conducted on a simulation of real grid activity demonstrate the significant gain of the method over the native scheduling infrastructure and over a classical feed-forward back-propagated neural network (FFNN) used for Q-function learning, in the most difficult cases.
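To make the approach concrete, below is a minimal sketch of fitted Q-learning with an Echo State Network readout, in the spirit of the abstract. It is not the paper's implementation: the reservoir size, spectral radius, candidate-action grid, and all names (reservoir_states, q_features, fitted_q, ...) are illustrative assumptions. The idea shown is that a fixed random reservoir turns the observation history into a state vector (which is how the non-Markovian aspect is handled), and only a linear readout is refitted, here by ridge regression on Bellman targets.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and hyper-parameters (assumptions, not the paper's).
n_obs, n_res = 4, 200                          # observation size, reservoir size
gamma, ridge = 0.95, 1e-3                      # discount factor, regularizer
candidate_actions = np.linspace(0.0, 1.0, 11)  # discretized grid for the max

W_in = rng.uniform(-0.5, 0.5, (n_res, n_obs))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # spectral radius < 1 (echo state property)

def reservoir_states(observations):
    # The fixed reservoir integrates the whole observation history, so each
    # state x_t summarizes the past: this is what relaxes the Markov hypothesis.
    x, states = np.zeros(n_res), []
    for o in observations:
        x = np.tanh(W_in @ o + W @ x)
        states.append(x)
    return np.array(states)

def q_features(x, a):
    # Linear readout over (reservoir state, action); only its weights are trained.
    return np.append(x, a)

def fitted_q(observations, actions, rewards, n_iter=20):
    # Fitted Q iteration: repeatedly refit the readout on Bellman targets
    # r_t + gamma * max_a' Q(x_{t+1}, a'), the max taken over the action grid.
    X = reservoir_states(observations)
    Phi = np.array([q_features(x, a) for x, a in zip(X[:-1], actions[:-1])])
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        q_next = np.array([max(q_features(x, a) @ w for a in candidate_actions)
                           for x in X[1:]])
        targets = rewards[:-1] + gamma * q_next
        w = np.linalg.solve(A, Phi.T @ targets)  # ridge regression refit
    return w

In a full system along these lines, the reward sequence would be derived from the utility functions mentioned in the abstract (e.g. responsiveness for users, utilization for administrators), and the runtime policy would pick the argmax action over the same candidate grid.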
Document type:
Conference paper
Conférence Francophone d'Apprentissage, May 2011, Chambéry, France. Presses Universitaires des Antilles et de la Guyane / Publibook, 2011

Cited literature: 7 references

https://hal.inria.fr/inria-00586504
Contributor: Cecile Germain
Submitted on: Saturday, April 16, 2011 - 15:37:16
Last modified on: Thursday, April 5, 2018 - 12:30:12
Long-term archiving on: Sunday, July 17, 2011 - 02:27:18

File

esn-cap11.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00586504, version 1

Citation

Julien Perez, Balázs Kégl, Cecile Germain-Renaud. Non-Markovian Reinforcement Learning for Reactive Grid scheduling. Presses Universitaires des Antilles et de la Guyane. Conférence Francophone d'Apprentissage, May 2011, Chambéry, France. Publibook, 2011. 〈inria-00586504〉

Metrics

Record views: 574
File downloads: 179