inria-00386477, version 1
Adding expert knowledge and exploration in Monte-Carlo Tree Search
Guillaume Chaslot (a, 1), Christophe Fiter (2), Jean-Baptiste Hoock (2), Arpad Rimmel (2), Olivier Teytaud (2, 3, 4)
Advances in Computer Games (2009)
Abstract: We present a new exploration term that is more efficient than classical UCT-like exploration terms and that efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values and classical online values. As this improved bandit formula does not handle several important situations in computer Go (semeais, nakade), we present three other important improvements which are central to the recent progress of our program MoGo:

- We show an expert-based improvement of Monte-Carlo simulations for nakade situations; we also emphasize some limitations of this modification.
- We show a technique which preserves diversity in the Monte-Carlo simulation, which greatly improves the results in 19x19.
- Whereas the UCB-based exploration term is not efficient in MoGo, we show a new exploration term which is highly efficient in MoGo.

MoGo recently won a game with handicap 7 against a 9-dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a game with handicap 6 against a 1-dan professional player, Li-Chen Chien.

- a – University of Maastricht
- 1 : Maastricht University
- univ. Maastricht
- 2 : TAO (INRIA Saclay - Ile de France)
- INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
- 3 : TAO (INRIA Futurs)
- INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
- 4 : Laboratoire de Recherche en Informatique (LRI)
- CNRS : UMR8623 – Université Paris XI - Paris Sud
- Domain: Mathematics / Optimization and control
- http://hal.inria.fr/inria-00386477
- oai:hal.inria.fr:inria-00386477
- Contributor: Olivier Teytaud
- Submitted on: Thursday 21 May 2009, 22:29:55
- Last modified on: Friday 22 May 2009, 08:33:53





