
Multi-objective Monte-Carlo Tree Search

Weijia Wang 1, 2 Michèle Sebag 1, 2
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Concerned with multi-objective reinforcement learning (MORL), this paper presents MO-MCTS, an extension of Monte-Carlo Tree Search to multi-objective sequential decision making. The well-known multi-objective indicator called the hypervolume indicator is used to define an action selection criterion, replacing the UCB criterion in order to handle multi-dimensional rewards. MO-MCTS is first compared with an existing MORL algorithm on the artificial Deep Sea Treasure problem. A scalability study of MO-MCTS is then conducted on the NP-hard problem of grid scheduling, showing that the performance of MO-MCTS matches the non-RL-based state of the art, albeit at a higher computational cost.
Document type : Conference papers

Cited literature : 30 references
Contributor : Weijia Wang
Submitted on : Wednesday, November 28, 2012 - 4:06:33 PM
Last modification on : Thursday, July 8, 2021 - 3:48:25 AM
Long-term archiving on : Saturday, December 17, 2016 - 4:23:00 PM




  • HAL Id : hal-00758379, version 1



Weijia Wang, Michèle Sebag. Multi-objective Monte-Carlo Tree Search. Asian Conference on Machine Learning, Nov 2012, Singapore, Singapore. pp.507-522. ⟨hal-00758379⟩


