A reinforcement learning approach to instrumental contingency degradation in rats

Alain Dutech; Etienne Coutureau; Alain Marchand

doi:10.1016/j.jphysparis.2011.07.017

Journal article, Journal of Physiology - Paris, Year: 2011

Alain Dutech

Role: Author
PersonId : 1580
IdHAL : alain-dutech
ORCID : 0000-0001-7549-7988
IdRef : 131102532

Autonomous intelligent machine

Etienne Coutureau

Role: Author
PersonId : 879114

Centre de neurosciences intégratives et cognitives

Alain Marchand

Role: Author
PersonId : 178346
IdHAL : alain-marchand
ORCID : 0000-0002-0231-5562

Centre de neurosciences intégratives et cognitives

Abstract

Goal-directed action involves a representation of action consequences. Adapting to changes in action-outcome contingency requires the prefrontal region. Indeed, rats with lesions of the medial prefrontal cortex do not adapt their free operant response when food delivery becomes unrelated to lever-pressing. The present study explores the bases of this deficit through a combined behavioural and computational approach. We show that lesioned rats retain some behavioural flexibility and stop pressing if this action prevents food delivery. We attempt to model this phenomenon in a reinforcement learning framework. The model assumes that distinct action values are learned in an incremental manner in distinct states. The model represents states as n-tuples of events, emphasizing sequences rather than the continuous passage of time. Probabilities of lever-pressing and visits to the food magazine observed in the behavioural experiments are first analyzed as a function of these states, to identify sequences of events that influence action choice. Observed action probabilities appear to be essentially a function of the last event that occurred, with reward delivery significantly facilitating magazine visits and waiting significantly facilitating lever-pressing. Behavioural sequences of normal and lesioned rats are then fed into the model: action values are updated at each event transition according to the SARSA algorithm, and predicted action probabilities are derived through a softmax policy. The model captures the time course of learning, as well as the differential adaptation of normal and prefrontal lesioned rats to contingency degradation, with the same parameters for both groups. The results suggest that simple temporal difference algorithms with low learning rates can largely account for instrumental learning and performance. Prefrontal lesioned rats appear to differ from control rats mainly in their low rates of visits to the magazine after a lever press, and in their inability to initially detect weak contingency changes.
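
The update-and-predict loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the action set, the event labels, the reward coding, the helper names (`softmax_policy`, `sarsa_update`, `replay`) and the parameter values (`alpha`, `gamma`, `beta`) are all hypothetical. Only the general scheme comes from the abstract: states are tuples of the last n events, action values are updated at each event transition with SARSA, and action probabilities are read out through a softmax.

```python
import math
from collections import defaultdict

# Hypothetical action set; the abstract mentions lever-pressing,
# magazine visits and waiting.
ACTIONS = ("press", "visit", "wait")


def softmax_policy(q, state, beta=1.0):
    """Return action probabilities from a softmax over Q(state, .)."""
    prefs = [beta * q[(state, a)] for a in ACTIONS]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return {a: e / total for a, e in zip(ACTIONS, exps)}


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.05, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_error = r + gamma * q[(s_next, a_next)] - q[(s, a)]
    q[(s, a)] += alpha * td_error
    return td_error


def replay(steps, n=1, alpha=0.05, gamma=0.9):
    """Feed an observed sequence of (action, reward) pairs through SARSA.

    The state is the tuple of the last n events (n >= 1). As an assumed
    coding, an event is labelled "reward" when food was delivered and is
    labelled with the action otherwise.
    """
    q = defaultdict(float)
    history = ("start",) * n
    pending = None  # (state, action, reward) waiting for its successor
    for action, reward in steps:
        state = history
        if pending is not None:
            s0, a0, r0 = pending
            sarsa_update(q, s0, a0, r0, state, action, alpha, gamma)
        pending = (state, action, reward)
        event = "reward" if reward > 0 else action
        history = (history + (event,))[-n:]
    # The final step has no successor and is left without an update.
    return q


# Usage: a toy sequence in which every press is rewarded and followed
# by a magazine visit.
q = replay([("press", 1.0), ("visit", 0.0)] * 50)
probs = softmax_policy(q, ("visit",), beta=2.0)
```

Because the observed actions are replayed rather than sampled from the policy, the softmax serves only to produce predicted action probabilities that can be compared with the observed ones. A small `alpha` corresponds to the low learning rates mentioned in the abstract.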

Domains

Artificial intelligence [cs.AI]; Neuroscience [q-bio.NC]

https://inria.hal.science/hal-00642715

Submitted on: Friday, 18 November 2011, 15:46:40

Last modified on: Thursday, 18 April 2024, 08:08:03

Dates and versions

hal-00642715 , version 1 (18-11-2011)

Identifiers

HAL Id : hal-00642715 , version 1
DOI : 10.1016/j.jphysparis.2011.07.017

Cite

Alain Dutech, Etienne Coutureau, Alain Marchand. A reinforcement learning approach to instrumental contingency degradation in rats. Journal of Physiology - Paris, 2011, Computational Neuroscience: Neurocomp 2010, 105 (1-3), pp.36-44. ⟨10.1016/j.jphysparis.2011.07.017⟩. ⟨hal-00642715⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA UNIV-LORRAINE INRIA2 LORIA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
