Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies

Clément Moulin-Frier 1, * Pierre-Yves Oudeyer 1
* Corresponding author
1 Flowers - Flowing Epigenetic Robots and Systems
Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes
Abstract : In the field of developmental robotics, we are particularly interested in the exploration strategies which can drive an agent to learn how to reach a wide variety of goals. In this paper, we unify and compare such strategies, recently shown to be efficient for learning complex non-linear redundant sensorimotor mappings. They combine two main principles. The first one concerns the space in which the learning agent chooses points to explore (motor space vs. goal space). Previous works have shown that redundant inverse models can be learned more efficiently if exploration is driven by goal babbling, which triggers reaching, rather than by direct motor babbling. Goal babbling is especially efficient for learning highly redundant mappings (e.g., the inverse kinematics of an arm). At each time step, the agent chooses a goal in a goal space (e.g., uniformly), uses the current knowledge of an inverse model to infer a motor command to reach that goal, observes the corresponding consequence and updates its inverse model according to this new experience. This exploration strategy allows the agent to cover the goal space more efficiently, without wasting time in redundant parts of the sensorimotor space (e.g., executing many motor commands that actually reach the same goal). The second principle comes from the field of active learning, where exploration strategies are conceived as an optimization process. Samples in the input space (i.e., the motor space) are collected in order to minimize a given property of the learning process, e.g., the uncertainty or the prediction error of the model. This allows the agent to focus on parts of the sensorimotor space in which exploration is expected to improve the quality of the model.
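The goal babbling loop described in the abstract (choose a goal, infer a motor command from the current inverse model, observe the consequence, update the model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the planar arm with equal-length links, the nearest-neighbour inverse model with Gaussian perturbation, the grid-based coverage measure, and all function names (forward, infer_motor, goal_babbling, motor_babbling, coverage). Only the first principle (goal space vs. motor space) is sketched; goals are drawn uniformly, with no active learning criterion.

```python
import math
import random


def forward(m):
    """Hand position (x, y) of a planar arm with joint angles m.
    Assumption: equal-length links with a total arm length of 1."""
    length = 1.0 / len(m)
    x = y = angle = 0.0
    for q in m:
        angle += q
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return (x, y)


def infer_motor(memory, goal, rng, sigma=0.05):
    """Hypothetical inverse model: reuse the stored command whose observed
    outcome is closest to the goal, perturbed by Gaussian noise."""
    m, _ = min(memory, key=lambda pair: math.dist(pair[1], goal))
    return [q + rng.gauss(0.0, sigma) for q in m]


def goal_babbling(n_joints=7, steps=500, seed=0):
    rng = random.Random(seed)
    rest = [0.0] * n_joints
    memory = [(rest, forward(rest))]  # bootstrap with a rest posture
    for _ in range(steps):
        goal = (rng.uniform(-1, 1), rng.uniform(-1, 1))  # choose a goal uniformly
        m = infer_motor(memory, goal, rng)               # infer a motor command
        s = forward(m)                                   # observe the consequence
        memory.append((m, s))                            # update the inverse model
    return memory


def motor_babbling(n_joints=7, steps=500, seed=0):
    """Baseline: sample motor commands directly in the motor space."""
    rng = random.Random(seed)
    memory = []
    for _ in range(steps):
        m = [rng.uniform(-math.pi, math.pi) for _ in range(n_joints)]
        memory.append((m, forward(m)))
    return memory


def coverage(memory, bins=10):
    """Number of distinct cells of a bins x bins grid over [-1, 1]^2
    that contain at least one observed hand position."""
    cells = set()
    for _, (x, y) in memory:
        i = max(0, min(int((x + 1) / 2 * bins), bins - 1))
        j = max(0, min(int((y + 1) / 2 * bins), bins - 1))
        cells.add((i, j))
    return len(cells)


if __name__ == "__main__":
    print("goal babbling coverage :", coverage(goal_babbling()))
    print("motor babbling coverage:", coverage(motor_babbling()))
```

Running the script prints the number of grid cells reached under each strategy. The values depend on the arm, the noise level and the number of steps chosen here, so they should be read as an illustration of the procedure rather than as a reproduction of the paper's comparison.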

https://hal.inria.fr/hal-00922537
Contributor : Clément Moulin-Frier
Submitted on : Friday, December 27, 2013 - 4:27:03 PM
Last modification on : Monday, December 17, 2018 - 10:23:40 AM
Document(s) archived on : Friday, March 28, 2014 - 4:50:28 PM

Files

rldm.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00922537, version 1

Citation

Clément Moulin-Frier, Pierre-Yves Oudeyer. Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013), Princeton University, New Jersey, Oct 2013, Princeton, United States. 2013. 〈hal-00922537〉
