Conference papers

Cooperation in stochastic games through communication

Raghav Aras 1 Alain Dutech 1 François Charpillet 1
1 MAIA - Autonomous intelligent machine
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: We describe a process of reinforcement learning in two-agent general-sum stochastic games under imperfect observability of moves and payoffs. It is known that, in practice, agents using naive Q-learning can learn equilibrium policies under the discounted reward criterion, although, in the absence of global optima, these policies may be arbitrarily worse for both agents than a non-equilibrium policy. We aim for Pareto-efficient policies, in which agents enjoy higher payoffs than in an equilibrium, and show that agents can achieve this by employing naive Q-learning augmented with communication and a payoff interpretation rule. In principle, our objective is to shift the focus of learning from equilibria (to which solipsistic algorithms converge) to non-equilibria by transforming the latter into equilibria.
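The gap between an equilibrium and a Pareto-efficient outcome can be illustrated with a minimal sketch. It assumes a hypothetical one-state, two-agent game with Prisoner's Dilemma payoffs, and none of the names or numbers come from the paper. The `shared_payoffs` rule is a placeholder chosen only to show how reinterpreting communicated payoffs can turn a non-equilibrium into an equilibrium; it is not the payoff interpretation rule the authors propose.

```python
from itertools import product

# Hypothetical one-state, two-agent game (Prisoner's Dilemma payoffs).
# Actions: 0 = cooperate, 1 = defect. PAYOFFS[(a1, a2)] = (r1, r2).
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def pure_equilibria(payoffs):
    """Return joint actions from which no agent gains by deviating alone."""
    result = []
    for a1, a2 in product(ACTIONS, repeat=2):
        r1, r2 = payoffs[(a1, a2)]
        best1 = all(payoffs[(d, a2)][0] <= r1 for d in ACTIONS)
        best2 = all(payoffs[(a1, d)][1] <= r2 for d in ACTIONS)
        if best1 and best2:
            result.append((a1, a2))
    return result


def pareto_dominated(joint, payoffs):
    """True if another joint action is no worse for both agents and better for one."""
    r = payoffs[joint]
    return any(
        all(o >= x for o, x in zip(other, r))
        and any(o > x for o, x in zip(other, r))
        for other in payoffs.values()
    )


def shared_payoffs(payoffs):
    """Placeholder interpretation rule (an assumption, not the paper's rule):
    each agent scores a joint action by the sum of both communicated payoffs."""
    return {joint: (sum(r), sum(r)) for joint, r in payoffs.items()}


if __name__ == "__main__":
    print("equilibria of original game:", pure_equilibria(PAYOFFS))
    print("(defect, defect) Pareto-dominated:", pareto_dominated((1, 1), PAYOFFS))
    print("equilibria after reinterpretation:", pure_equilibria(shared_payoffs(PAYOFFS)))
```

In this toy game the only pure equilibrium is mutual defection, which both agents would trade for mutual cooperation. Under the placeholder rule, mutual cooperation becomes the only pure equilibrium, which is the kind of transformation the abstract alludes to.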
Document type: Conference papers
Contributor: Raghav Aras
Submitted on: Tuesday, September 13, 2005 - 2:54:10 PM
Last modification on: Friday, February 26, 2021 - 3:28:04 PM
Long-term archiving on: Thursday, April 1, 2010 - 10:24:00 PM




Raghav Aras, Alain Dutech, François Charpillet. Cooperation in stochastic games through communication. 4th International Joint Conference on Autonomous Agents and Multiagent Systems - AAMAS'05, Jul 2005, Utrecht, The Netherlands, pp. 1197-1198, ⟨10.1145/1082473.1082691⟩. ⟨inria-00000208⟩


