Bellmanian Bandit Network - Inria - National Institute for Research in Digital Science and Technology
Conference paper, Year: 2014

Bellmanian Bandit Network

Abstract

This paper presents a new reinforcement learning (RL) algorithm, the Bellmanian Bandit Network (BBN), in which action selection in each state is formalized as a multi-armed bandit problem. The first contribution is the definition of an exploratory reward, inspired by the intrinsic motivation criterion [1], that is combined with the RL reward. The second contribution is the use of a network of multi-armed bandits to achieve convergence toward the optimal Q-value function. The BBN algorithm is validated in stationary and non-stationary grid-world environments and compared with [1].
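The idea of one bandit per state, fed with a payoff that mixes the RL reward, an exploratory reward, and the value of the next state's bandit, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the bandit rule or the exploratory reward, so the sketch assumes UCB1 for arm selection and a count-based bonus `beta / sqrt(n)` as a stand-in for the intrinsic-motivation term. The names `StateBandit`, `chain_step`, `train`, `beta`, and the five-state chain environment are all hypothetical.

```python
import math


class StateBandit:
    """One bandit per state; each arm is an action (UCB1 is an assumption)."""

    def __init__(self, n_actions):
        self.counts = [0] * n_actions    # number of pulls per action
        self.values = [0.0] * n_actions  # running mean payoff, used as a Q estimate

    def select(self):
        # Try every action once before applying the UCB1 rule.
        for action, count in enumerate(self.counts):
            if count == 0:
                return action
        total = sum(self.counts)

        def ucb(action):
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[action])
            return self.values[action] + bonus

        return max(range(len(self.counts)), key=ucb)

    def update(self, action, payoff):
        # Incremental running mean of the payoffs seen for this action.
        self.counts[action] += 1
        self.values[action] += (payoff - self.values[action]) / self.counts[action]

    def best_value(self):
        return max(self.values)


def chain_step(state, action, n_states):
    """Hypothetical toy environment: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1.
    """
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done


def train(n_states=5, episodes=200, max_steps=50, gamma=0.9, beta=0.1):
    """Train one bandit per state on the chain and return the list of bandits."""
    bandits = [StateBandit(2) for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            bandit = bandits[state]
            action = bandit.select()
            nxt, reward, done = chain_step(state, action, n_states)
            # Assumed exploratory reward: shrinks as the action is tried more often.
            explore = beta / math.sqrt(bandit.counts[action] + 1)
            # Bellman-style bootstrap from the next state's bandit.
            future = 0.0 if done else gamma * bandits[nxt].best_value()
            bandit.update(action, reward + explore + future)
            if done:
                break
            state = nxt
    return bandits
```

Calling `train()` returns the per-state bandits; `bandits[s].values` can then be read as rough Q-value estimates for state `s`. Whether these estimates approach the optimal Q-values depends on the choices assumed here, which may differ from those analyzed in the paper.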
Main file
nips14_BBN.pdf (489.7 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01102970, version 1 (13-01-2015)

Identifiers

  • HAL Id: hal-01102970, version 1

Cite

Antoine Bureau, Michèle Sebag. Bellmanian Bandit Network. Autonomously Learning Robots, at NIPS 2014, Gerhard Neumann (TU-Darmstadt); Joelle Pineau (McGill University); Peter Auer (Uni Leoben); Marc Toussaint (Uni Stuttgart), Dec 2014, Montréal, Canada. ⟨hal-01102970⟩
410 views
190 downloads
