Cooperation between reinforcement and procedural learning in the basal ganglia

Nishal Shah, Frédéric Alexandre
CORTEX - Neuromimetic Intelligence
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Describing cognition as a set of cooperating learning mechanisms [1] is a fruitful way to approach its complexity and dynamics. In a simple model, we explore a possible cooperation between a long-lasting procedural memory and a dynamic reinforcement learning mechanism, assumed to be located in the parietal cortex and the basal ganglia, respectively. In [2], the authors describe the visual system not only as perceiving features but also as preparing the appropriate motor outputs elicited by the perceived features; they state that this association is built in the parietal cortex. Action selection is one of the goals of reinforcement learning [3] and aims at triggering the action that maximizes the expectation of reward. The basal ganglia have been proposed as a substrate for this selection [4]. Few models of the basal ganglia consider that action selection could operate on a restricted set of pre-activated actions. We have recently incorporated realistic physiological and behavioral characteristics into the RDDR model (Reinforcement Driven Dimensionality Reduction [5]), including a neuronal formalism of computation, a learning protocol, and cerebral information flows. Concerning the latter, the network is composed of a sensorimotor cortical axis and a basal loop, intersecting in a cortical motor area. The cortical flow results from a perceptive analysis in a visual area and an associative matching in a parietal area; this pre-activates possible actions in the motor area. The basal loop integrates the cortical information in its input structure, the striatum, and compresses it in its output structure (GPi/SNr), where a strong reduction of dimensionality takes place. Action selection is made at this level, thanks to the modulatory effect of the reward, and the resulting activity is sent back to the motor area.
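The information flow described above can be sketched minimally as a feed-forward pass with a dimensionality-reducing output stage. All layer sizes, weights, and function names below are illustrative assumptions; the abstract does not specify dimensions or the exact neuronal formalism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes (the model's actual dimensions are not given).
N_VISUAL, N_PARIETAL, N_MOTOR, N_STRIATUM, N_GPI = 16, 12, 8, 10, 3

# Random weights stand in for learned connections along the pathway.
W_vis_par = 0.1 * rng.normal(size=(N_PARIETAL, N_VISUAL))
W_par_mot = 0.1 * rng.normal(size=(N_MOTOR, N_PARIETAL))
W_mot_str = 0.1 * rng.normal(size=(N_STRIATUM, N_MOTOR))
W_str_gpi = 0.1 * rng.normal(size=(N_GPI, N_STRIATUM))  # compression stage

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(visual_input):
    parietal = sigmoid(W_vis_par @ visual_input)  # associative matching
    motor_pre = sigmoid(W_par_mot @ parietal)     # pre-activation of actions
    striatum = sigmoid(W_mot_str @ motor_pre)     # basal-loop input structure
    gpi = sigmoid(W_str_gpi @ striatum)           # output structure (GPi/SNr)
    return motor_pre, gpi

motor_pre, gpi = forward(rng.random(N_VISUAL))
print(motor_pre.shape, gpi.shape)  # (8,) (3,): strong reduction of dimensionality
```

The point of the sketch is the shape change at the last stage: eight candidate motor units are compressed into three output units, mirroring the striatum-to-GPi/SNr reduction where reward-modulated selection takes place.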
The parietal pre-activation of the motor area is not sufficient to trigger an action, but it is sufficient to activate the striatum and to make selection operate on a restricted set of actions, which speeds up the convergence of reinforcement learning. As is often required in reinforcement learning, an exploration mechanism is added to complement the pure exploitation of current knowledge; it occasionally triggers an action never associated before and thus discovers new rewarding rules. Such a new perception-action association, if validated by the delivery of reward, also modifies the associative learning in the cortex. An interplay between two memory systems is consequently observed: a procedural memory limits the choices available to action selection by reinforcement learning and is in turn fed by the results of that selection, made by exploitation and exploration.
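The selection-and-feedback loop just described can be illustrated with an epsilon-greedy-style rule restricted to pre-activated actions. This is a sketch under stated assumptions, not the paper's implementation: the threshold, exploration rate, learning rate, and the simple additive strengthening of the pre-activation are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ACTIONS = 8
EPSILON = 0.1        # exploration rate (assumed value)
ALPHA = 0.1          # learning rate (assumed value)
PRE_THRESHOLD = 0.5  # hypothetical pre-activation threshold

q_values = np.zeros(N_ACTIONS)  # reward expectations per action

def select_action(pre_activation):
    """Exploit among pre-activated actions only; occasionally explore the
    full action set to discover new rewarding perception-action rules."""
    candidates = np.flatnonzero(pre_activation > PRE_THRESHOLD)
    if candidates.size == 0 or rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))                  # explore anywhere
    return int(candidates[np.argmax(q_values[candidates])])  # restricted exploit

def update(action, reward, pre_activation):
    """Reward updates the value estimate; a rewarded action also feeds back
    into the (cortical) pre-activation, strengthening the association."""
    q_values[action] += ALPHA * (reward - q_values[action])
    if reward > 0:
        pre_activation[action] = min(1.0, pre_activation[action] + ALPHA)

pre = rng.random(N_ACTIONS)
a = select_action(pre)
update(a, reward=1.0, pre_activation=pre)
```

The two directions of the interplay appear as the two halves of `update`: reward shapes the value estimates used for selection, and a validated selection in turn strengthens the procedural pre-activation that restricts future choices.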
Document type:
Conference paper
International Joint Conference on Neural Networks IJCNN 2011 Special Topic Neuroscience and Neurocognition, Jul 2011, San Jose, CA, United States. 2011

https://hal.inria.fr/inria-00586250
Contributor: Frédéric Alexandre
Submitted on: Friday, April 15, 2011 - 13:06:59
Last modified on: Thursday, January 11, 2018 - 06:25:24

Identifiants

  • HAL Id: inria-00586250, version 1

Citation

Nishal Shah, Frédéric Alexandre. Cooperation between reinforcement and procedural learning in the basal ganglia. International Joint Conference on Neural Networks IJCNN 2011 Special Topic Neuroscience and Neurocognition, Jul 2011, San Jose, CA, United States. 2011. 〈inria-00586250〉
