Adding expert knowledge and exploration in Monte-Carlo Tree Search

Guillaume Chaslot 1 Christophe Fiter 2 Jean-Baptiste Hoock 2 Arpad Rimmel 2 Olivier Teytaud 2, 3, 4
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
3 TANC - Algorithmic number theory for cryptology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France, Polytechnique - X, CNRS - Centre National de la Recherche Scientifique : UMR7161
Abstract: We present a new exploration term that is more efficient than classical UCT-like exploration terms and that efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values and classical online values. As this improved bandit formula does not solve several important situations (semeais, nakade) in computer Go, we present three other important improvements which are central to the recent progress of our program MoGo:
  • We show an expert-based improvement of Monte-Carlo simulations for nakade situations; we also emphasize some limitations of this modification.
  • We show a technique which preserves diversity in the Monte-Carlo simulation, which greatly improves the results in 19x19.
  • Whereas the UCB-based exploration term is not efficient in MoGo, we show a new exploration term which is highly efficient in MoGo.
MoGo recently won a game with handicap 7 against a 9-dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a game with handicap 6 against a 1-dan professional player, Li-Chen Chien.
Document type:
Conference paper
Advances in Computer Games, 2009, Pamplona, Spain. Springer, 2009

Cited literature [13 references]

https://hal.inria.fr/inria-00386477
Contributor: Olivier Teytaud
Submitted on: Thursday, 21 May 2009 - 22:29:55
Last modified on: Thursday, 22 February 2018 - 01:23:20
Document(s) archived on: Monday, 15 October 2012 - 10:51:28

File

peacg.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00386477, version 1

Citation

Guillaume Chaslot, Christophe Fiter, Jean-Baptiste Hoock, Arpad Rimmel, Olivier Teytaud. Adding expert knowledge and exploration in Monte-Carlo Tree Search. Advances in Computer Games, 2009, Pamplona, Spain. Springer, 2009. 〈inria-00386477〉
