Multi-objective Monte-Carlo Tree Search

Weijia Wang 1, 2 Michèle Sebag 1, 2
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract: Concerned with multi-objective reinforcement learning (MORL), this paper presents MO-MCTS, an extension of Monte-Carlo Tree Search to multi-objective sequential decision making. The hypervolume indicator, a well-known multi-objective quality indicator, is used to define an action selection criterion that replaces the UCB criterion in order to handle multi-dimensional rewards. MO-MCTS is first compared with an existing MORL algorithm on the artificial Deep Sea Treasure problem. A scalability study of MO-MCTS is then conducted on the NP-hard problem of grid scheduling, showing that the performance of MO-MCTS matches the non-RL-based state of the art, albeit at a higher computational cost.
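The hypervolume indicator named in the abstract measures how much of the objective space is dominated by a set of reward vectors, relative to a reference point. The following is a minimal sketch for two objectives that are both maximized. It is not the authors' implementation and does not reproduce the MO-MCTS selection rule itself; the function names `pareto_front` and `hypervolume_2d`, the reference point, and the sample reward vectors are all illustrative assumptions.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def pareto_front(points: Iterable[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are maximized."""
    pts = list(set(points))
    front = []
    for p in pts:
        # p is dominated if another point is at least as good on both objectives.
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1] for q in pts
        )
        if not dominated:
            front.append(p)
    return front


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded below by the reference point `ref`."""
    # Only points strictly better than the reference point contribute area.
    inside = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # On a 2-D Pareto front, sorting by the first objective in descending
    # order makes the second objective ascend.
    front = sorted(pareto_front(inside), key=lambda p: p[0], reverse=True)
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        # Add the horizontal strip not yet covered by earlier points.
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


# Hypothetical archive of two-dimensional rewards (illustrative values only).
archive = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
reference = (0.0, 0.0)
base = hypervolume_2d(archive, reference)
# How much a new reward vector would enlarge the dominated area.
gain = hypervolume_2d(archive + [(2.5, 2.5)], reference) - base
```

A dominated candidate leaves the hypervolume unchanged, which is one reason a selection criterion built on this indicator needs some additional way to rank dominated reward vectors.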
Document type:
Conference paper
Steven C.H. Hoi and Wray Buntine. Asian Conference on Machine Learning, Nov 2012, Singapore, Singapore. 25, pp.507-522, 2012, JMLR: Workshop and Conference Proceedings
Cited literature [30 references]

https://hal.inria.fr/hal-00758379
Contributor: Weijia Wang
Submitted on: Wednesday, November 28, 2012 - 16:06:33
Last modified on: Thursday, April 5, 2018 - 12:30:12
Document(s) archived on: Saturday, December 17, 2016 - 16:23:00

Files

wang88.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00758379, version 1

Citation

Weijia Wang, Michèle Sebag. Multi-objective Monte-Carlo Tree Search. Steven C.H. Hoi and Wray Buntine. Asian Conference on Machine Learning, Nov 2012, Singapore, Singapore. 25, pp.507-522, 2012, JMLR: Workshop and Conference Proceedings. 〈hal-00758379〉

Metrics

Record views

700

File downloads

655