Conference Papers, Year: 2013

Learning Exploration Strategies in Model-Based Reinforcement Learning

Todd Hester, Peter Stone, Manuel Lopes

Abstract

Reinforcement learning (RL) is a paradigm for learning sequential decision-making tasks. However, typically the user must hand-tune exploration parameters for each different domain and/or algorithm that they are using. In this work, we present an algorithm called LEO for learning these exploration strategies on-line. This algorithm makes use of bandit-type algorithms to adaptively select exploration strategies based on the rewards received when following them. We show empirically that this method performs well across a set of five domains. In contrast, for a given algorithm, no set of parameters is best across all domains. Our results demonstrate that the LEO algorithm successfully learns the best exploration strategies on-line, increasing the received reward over static parameterizations of exploration and reducing the need for hand-tuning exploration parameters.
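
The idea of treating exploration strategies as bandit arms can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which bandit algorithm is used, so a standard UCB1 rule is substituted here, and the strategy names, the class StrategyBandit, and the simulated reward function are all hypothetical placeholders standing in for a real model-based RL agent.

```python
import math
import random


class StrategyBandit:
    """UCB1 bandit whose arms are exploration strategies (illustrative only)."""

    def __init__(self, strategy_names):
        self.names = list(strategy_names)
        self.counts = [0] * len(self.names)   # episodes run with each strategy
        self.means = [0.0] * len(self.names)  # average reward per strategy

    def select(self):
        # Try every strategy once before applying the UCB1 rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)
        scores = [
            m + math.sqrt(2.0 * math.log(total) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, episode_reward):
        # Incremental mean of the reward received while following this strategy.
        self.counts[arm] += 1
        self.means[arm] += (episode_reward - self.means[arm]) / self.counts[arm]


def simulated_episode_reward(arm, rng):
    # Assumption: stand-in for running one episode of an RL agent under the
    # chosen strategy. Rewards are Bernoulli draws with made-up means.
    true_means = [0.2, 0.5, 0.8]
    return 1.0 if rng.random() < true_means[arm] else 0.0


def run(num_episodes=2000, seed=0):
    rng = random.Random(seed)
    # Hypothetical strategy names; the paper's actual strategy set is not
    # given in the abstract.
    bandit = StrategyBandit(["greedy", "epsilon_greedy", "bonus_based"])
    for _ in range(num_episodes):
        arm = bandit.select()
        bandit.update(arm, simulated_episode_reward(arm, rng))
    return bandit


if __name__ == "__main__":
    result = run()
    for name, count, mean in zip(result.names, result.counts, result.means):
        print(f"{name}: episodes={count}, mean_reward={mean:.3f}")
```

In this toy setting the bandit concentrates its episodes on whichever strategy yields the highest simulated reward, which mirrors the abstract's description of selecting strategies based on the rewards received when following them. In a real agent, the reward would come from environment interaction rather than a fixed distribution, and it would likely be non-stationary as the learned model improves.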

Dates and versions

hal-00871861, version 1 (10-10-2013)

Identifiers

  • HAL Id: hal-00871861, version 1

Cite

Todd Hester, Peter Stone, Manuel Lopes. Learning Exploration Strategies in Model-Based Reinforcement Learning. AAMAS 2013 - 12th International Conference on Autonomous Agents and Multiagent Systems, May 2013, St. Paul, MN, United States. pp.1069-1076. ⟨hal-00871861⟩