Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies

Clément Moulin-Frier 1, * Pierre-Yves Oudeyer 1
* Corresponding author
1 Flowers - Flowing Epigenetic Robots and Systems
Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes
Abstract: In the field of developmental robotics, we are particularly interested in the exploration strategies that can drive an agent to learn how to reach a wide variety of goals. In this paper, we unify and compare such strategies, recently shown to be efficient for learning complex, non-linear, redundant sensorimotor mappings. They combine two main principles. The first concerns the space in which the learning agent chooses points to explore (motor space vs. goal space). Previous work has shown that redundant inverse models can be learned more efficiently if exploration is driven by goal babbling, which triggers reaching, rather than by direct motor babbling. Goal babbling is especially efficient for learning highly redundant mappings (e.g., the inverse kinematics of an arm). At each time step, the agent chooses a goal in a goal space (e.g., uniformly), uses its current inverse model to infer a motor command to reach that goal, observes the corresponding consequence, and updates its inverse model according to this new experience. This exploration strategy allows the agent to cover the goal space more efficiently, avoiding wasting time in redundant parts of the sensorimotor space (e.g., executing many motor commands that all reach the same goal). The second principle comes from the field of active learning, where exploration strategies are conceived as an optimization process. Samples in the input space (i.e., the motor space) are collected in order to minimize a given property of the learning process, e.g., the uncertainty or the prediction error of the model. This allows the agent to focus on parts of the sensorimotor space in which exploration is expected to improve the quality of the model.
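The goal-babbling loop described in the abstract (choose a goal, infer a motor command from the current inverse model, observe the consequence, update the model) can be sketched as follows. The planar two-joint arm, the nearest-neighbor inverse model, and all parameter values are illustrative assumptions for this sketch, not the authors' implementation.

```python
import math
import random

def forward(q):
    """Forward kinematics of a planar two-joint arm with unit-length links."""
    x = math.cos(q[0]) + math.cos(q[0] + q[1])
    y = math.sin(q[0]) + math.sin(q[0] + q[1])
    return (x, y)

class NearestNeighborInverseModel:
    """Memory-based inverse model: stores (motor command, observed outcome)
    pairs and answers 'which command reached the point closest to this goal?'."""
    def __init__(self):
        self.memory = []  # list of (motor_command, observed_outcome)

    def update(self, q, s):
        self.memory.append((q, s))

    def infer(self, goal):
        # Return the stored command whose observed outcome is nearest to the goal.
        q, _ = min(self.memory,
                   key=lambda m: (m[1][0] - goal[0]) ** 2 + (m[1][1] - goal[1]) ** 2)
        return q

def goal_babbling(n_steps=2000, noise=0.1, seed=0):
    rng = random.Random(seed)
    model = NearestNeighborInverseModel()
    # Bootstrap the memory with one random motor command.
    q0 = (rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
    model.update(q0, forward(q0))
    for _ in range(n_steps):
        # 1. Choose a goal uniformly in a box covering the reachable space.
        goal = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        # 2. Infer a motor command from the current inverse model,
        #    perturbed with exploration noise.
        q = model.infer(goal)
        q = (q[0] + rng.gauss(0, noise), q[1] + rng.gauss(0, noise))
        # 3. Execute, observe the consequence, and update the inverse model.
        model.update(q, forward(q))
    return model
```

Motor babbling would instead sample `q` uniformly at each step; the contrast the paper draws is that the loop above spreads samples over the *goal* space, so redundant commands that reach the same point are not explored over and over.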
Document type:
Conference paper
1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013), Princeton University, Oct 2013, Princeton, New Jersey, United States. 2013

https://hal.inria.fr/hal-00922537
Contributor: Clément Moulin-Frier <>
Submitted on: Friday, December 27, 2013 - 16:27:03
Last modified on: Thursday, April 12, 2018 - 13:06:49
Long-term archiving on: Friday, March 28, 2014 - 16:50:28

Files

rldm.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-00922537, version 1


Citation

Clément Moulin-Frier, Pierre-Yves Oudeyer. Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013), Princeton University, Oct 2013, Princeton, New Jersey, United States. 2013. 〈hal-00922537〉


Metrics

Record views

714

File downloads

199