Simulation-based search of combinatorial games - Archive ouverte HAL Access content directly
Conference Papers Year : 2010

Simulation-based search of combinatorial games

(1) , (2)
Rémi Coulom
  • Function : Author
  • PersonId : 836867


Monte-Carlo Tree Search is a very successful game playing algorithm. Unfortunately it su ers from the horizon e ect: some important tactical sequences may be delayed beyond the depth of the search tree, causing evaluation errors. Temporal-di erence search with a function approximation is a method that was proposed to overcome these weaknesses, by adaptively changing the simulation policy outside the tree. In this paper we present an experimental evidence demonstrating that a temporal di erence search may fail to nd an optimal policy, even in very simple game positions. Classical temporal di erence algorithms try to evaluate a local situation with a numerical value, but, as it appears, a single number is not enough to model the dynamics of a partial two-player game state. As a solution we propose to replace numerical values by approximate thermographs. With this richer representation of partial states, reinforcement-learning algorithms converge and accurately represent dynamics of states, allowing to nd an optimal policy.
Not file

Dates and versions

hal-00694030 , version 1 (03-05-2012)


  • HAL Id : hal-00694030 , version 1


Lukasz Lew, Rémi Coulom. Simulation-based search of combinatorial games. ICML 2010 : Workshop on Machine Learning and Games, Jun 2010, Haifa, Israel. ⟨hal-00694030⟩
112 View
0 Download


Gmail Facebook Twitter LinkedIn More