Which reinforcing signals in autonomous systems?

Thomas Beati; Maxime Carrere; Frédéric Alexandre

Abstract

Decision making is deeply influenced by signals (called here reinforcing signals), elaborated from biologically significant (aversive or appetitive) stimuli and from internal computations. They are exploited to anticipate punishment or reward as well as to select actions such as avoidance and orientation, to maximize benefits for the body. Such signals appeared to be fundamental in the design of a fully autonomous system, i.e. an agent endowed with needs and aimed at learning how to satisfy them autonomously. In this system, beyond classical perceptive inputs, the status of the agent's body (pleasure and pain) is provided by interoceptive sensors. The agent is also wired to automatically identify biologically significant stimuli. We have developed a computational neuroscience approach to design modules implementing neuronal structures and information flows, in order to elaborate reinforcing signals for the advanced detection of meaningful targets and for the corresponding choice of actions, in the perspective of complete autonomy. In particular, an amygdala module has been designed for the implementation of respondent learning, i.e. the ability to detect unconditioned stimuli (US) and prepare the body accordingly (unconditioned response, UR). The module is made of models of three neuronal structures, corresponding to three nuclei of the amygdala. In the model, the lateral nucleus receives the sensory information flows, not only from the outside (interoception and perception) but also processed by other modules (Sensory Cortex and Hippocampus); these flows might correspond to US and to stimuli that can be predictive of US (conditioned stimuli, CS). Basically, the lateral nucleus is responsible for CS-US associations. The central nucleus is responsible for the Pavlovian responses to US, which can be motor (e.g. freezing), autonomic (e.g. changes in heart rate) and hormonal.
This latter point corresponds, for example, to an excitation of VTA-SNc (part of the PFC-BG module) for dopamine release at the moment of a US or CS. The basolateral nuclei are the most recent structures in the amygdala and are involved in a variety of functions that we tried to make more explicit in our model. Basically, these nuclei have been proposed to represent US along various attributes: their valence (positive or negative), their intensity (e.g. strong reward), their proximity in space and in time, their intrinsic value for the body (hedonic value) and the value they can give to actions that might produce them (incentive value). Another fundamental role of the amygdala, thanks to Pavlovian learning, is to produce a prediction error signal, corresponding to the difference between predicted rewards and rewards actually received. In an early step, we implemented a classical learning rule for Pavlovian learning, the Rescorla-Wagner rule, for the CS-US association. Subsequently, our modeling work mainly consisted in implementing other modules, accounting for other cerebral structures such as cortex and hippocampus [refs]. These modules receive and exploit the variety of reinforcing signals originating from the amygdala, and send back more elaborated signals to complement sensory inputs in the lateral nucleus and to modify US representations in the basolateral nuclei.
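
To make the CS-US association and the prediction error concrete, here is a minimal sketch of the textbook Rescorla-Wagner rule. It is an illustration only, not the authors' amygdala module: the function name `rescorla_wagner`, the trial format and the parameters `alpha` (CS salience) and `beta` (US learning rate) are assumptions chosen for this example. On each trial the prediction is the summed associative strength of the CS that are present, the prediction error is the US magnitude actually received minus that prediction, and every present CS is updated in proportion to the error.

```python
def rescorla_wagner(trials, n_cs, alpha=0.1, beta=1.0):
    """Textbook Rescorla-Wagner update (illustrative sketch, hypothetical API).

    trials: list of (cs_present, us_magnitude) pairs, where cs_present is a
            sequence of n_cs booleans and us_magnitude is the US actually
            received on that trial (0.0 when the US is omitted).
    Returns the final associative strengths and the per-trial prediction errors.
    """
    strengths = [0.0] * n_cs  # associative strength V_i of each CS
    errors = []
    for cs_present, us_magnitude in trials:
        # Predicted US: sum of the strengths of all CS present on this trial.
        prediction = sum(v for v, present in zip(strengths, cs_present) if present)
        # Prediction error: US actually received minus US predicted.
        delta = us_magnitude - prediction
        # Only the CS that are present share the update.
        for i, present in enumerate(cs_present):
            if present:
                strengths[i] += alpha * beta * delta
        errors.append(delta)
    return strengths, errors


# Acquisition: a single CS repeatedly paired with a US of magnitude 1.0.
strengths, errors = rescorla_wagner([([True], 1.0)] * 50, n_cs=1, alpha=0.2)
print(strengths[0], errors[0], errors[-1])
```

In this acquisition run the associative strength climbs toward the US magnitude while the prediction error shrinks toward zero, which is the behaviour expected of a signal that reports the difference between predicted and received reward.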
