Conference papers

Nash and the Bandit Approach for Adversarial Portfolios

David L. Saint-Pierre 1, 2 Olivier Teytaud 1, 2 
1 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : In this paper we study the use of a portfolio of policies for adversarial problems. We use two different portfolios of policies and apply them to the game of Go. The first portfolio is composed of different versions of the GnuGo agent. The second portfolio is composed of fixed random seeds. First, we demonstrate that learning an offline combination of these policies using the notion of Nash Equilibrium generates a stronger opponent. Second, we show that such distributions can be learned online through a bandit approach. The advantages of our approach are (i) diversity (the Nash-Portfolio is more variable than its components), (ii) adaptivity (the Bandit-Portfolio adapts to the opponent), (iii) simplicity (no computational overhead), and (iv) increased performance. Due to the importance of games on mobile devices, designing artificial intelligences for small computational power is crucial; our approach is particularly suited to mobile devices since it creates a stronger opponent simply by biasing the distribution over the policies, and moreover it generalizes quite well.
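The two ideas in the abstract, an offline Nash mixture over a portfolio and an online bandit over the same portfolio, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the algorithms used, so fictitious play stands in for the offline Nash computation and Exp3 stands in for the bandit, and the function names, win-rate matrices, and parameters are hypothetical.

```python
import math
import random


def fictitious_play(win_rate, iterations=20000):
    """Approximate a Nash mixed strategy for the row player of a zero-sum game.

    win_rate[i][j] is a hypothetical probability that row policy i beats
    column policy j. Returns the row player's empirical play frequencies.
    """
    n_rows, n_cols = len(win_rate), len(win_rate[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = col_counts[0] = 1  # arbitrary initial plays
    for _ in range(iterations):
        # Row best-responds to the column player's empirical mixture.
        row_values = [sum(win_rate[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        # Column minimizes the row win rate against the row empirical mixture.
        col_values = [sum(win_rate[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_counts[row_values.index(max(row_values))] += 1
        col_counts[col_values.index(min(col_values))] += 1
    total = sum(row_counts)
    return [c / total for c in row_counts]


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration."""
    total = sum(weights)
    k = len(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]


def exp3_run(reward_fn, n_arms, rounds=2000, gamma=0.1, seed=0):
    """Run Exp3 over n_arms policies; reward_fn(arm) must return a value in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    for _ in range(rounds):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm)
        # Importance-weighted update: only the played arm changes.
        weights[arm] *= math.exp(gamma * reward / (probs[arm] * n_arms))
    return exp3_probabilities(weights, gamma)


if __name__ == "__main__":
    # Hypothetical cyclic portfolio: each policy beats one other and loses to one.
    cyclic = [[0.5, 0.0, 1.0],
              [1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5]]
    print("offline mixture:", fictitious_play(cyclic))
    # Hypothetical opponent against which only policy 0 ever wins.
    print("online mixture:", exp3_run(lambda arm: 1.0 if arm == 0 else 0.0, 3))
```

In this toy setting the offline mixture spreads its weight across the three cyclic policies, while the online mixture shifts toward the policy that wins against the particular opponent, which mirrors the diversity and adaptivity points listed in the abstract. Selecting a policy from either distribution costs one random draw, so no extra computation is added at play time.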
Document type :
Conference papers

Cited literature : 28 references
Contributor : Olivier Teytaud
Submitted on : Monday, November 3, 2014 - 8:14:41 AM
Last modification on : Sunday, June 26, 2022 - 12:02:13 PM
Long-term archiving on : Wednesday, February 4, 2015 - 10:07:03 AM


Files produced by the author(s)




David L. Saint-Pierre, Olivier Teytaud. Nash and the Bandit Approach for Adversarial Portfolios. CIG 2014 - Computational Intelligence in Games, IEEE, Aug 2014, Dortmund, Germany. pp.7, ⟨10.1109/CIG.2014.6932897⟩. ⟨hal-01077628⟩


