K. Amemori, L. G. Gibb, and A. M. Graybiel, Shifting Responsibly: The Importance of Striatal Modularity to Reinforcement Learning in Uncertain Environments, Frontiers in Human Neuroscience, vol.5, p.47, 2011.
DOI : 10.3389/fnhum.2011.00047

M. Basseville and I. V. Nikiforov, Detection of abrupt changes: theory and application

E. Brunswik, Probability as a determiner of rat behavior, Journal of Experimental Psychology, vol.25, issue.2, p.175, 1939.
DOI : 10.1037/h0061204

V. S. Chakravarthy, D. Joseph, and R. S. Bapi, What do the basal ganglia do? A modeling perspective, Biological Cybernetics, vol.103, issue.3, pp.237-253, 2010.
DOI : 10.1007/s00422-010-0401-y

S. Charpier and J. Deniau, In vivo activity-dependent plasticity at cortico-striatal connections: Evidence for physiological long-term potentiation, Proceedings of the National Academy of Sciences, vol.94, issue.13, pp.7036-7040, 1997.
DOI : 10.1073/pnas.94.13.7036

K. Doya, Metalearning and neuromodulation, Neural Networks, vol.15, issue.4-6, pp.495-506, 2002.
DOI : 10.1016/S0893-6080(02)00044-8

K. Doya, Modulators of decision making, Nature Neuroscience, vol.11, issue.4, pp.410-416, 2008.
DOI : 10.1038/nn2077

K. Doya, K. Samejima, K. Katagiri, and M. Kawato, Multiple Model-Based Reinforcement Learning, Neural Computation, vol.14, issue.6, pp.1347-1369, 2002.
DOI : 10.1162/089976602753712972

F. Eblen and A. M. Graybiel, Highly restricted origin of prefrontal cortical inputs to striosomes in the macaque monkey, Journal of Neuroscience, vol.15, issue.9, pp.5999-6013, 1995.

A. Flaherty and A. M. Graybiel, Input-output organization of the sensorimotor striatum in the squirrel monkey, Journal of Neuroscience, vol.14, issue.2, pp.599-610, 1994.

R. Granger, Engines of the brain: The computational instruction set of human cognition, AI Magazine, vol.27, issue.2, p.15, 2006.

A. Graybiel, A. Flaherty, and J. Gimenez-Amaya, Striosomes and matrisomes, The Basal Ganglia III, pp.3-12, 1991.
DOI : 10.1007/978-1-4684-5871-8_1

A. M. Graybiel, The basal ganglia: learning new tricks and loving it, Current Opinion in Neurobiology, vol.15, issue.6, pp.638-644, 2005.
DOI : 10.1016/j.conb.2005.10.006

C. Hartland, N. Baskiotis, S. Gelly, M. Sebag, and O. Teytaud, Change point detection and meta-bandits for online learning in dynamic environments, pp.237-250, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00164033

C. Hartland, S. Gelly, N. Baskiotis, O. Teytaud, and M. Sebag, Multi-armed bandit, dynamic environments and meta-bandits

D. V. Hinkley, Inference about the change-point in a sequence of random variables, pp.1-17, 1970.

D. Joel, Y. Niv, and E. Ruppin, Actor-critic models of the basal ganglia: new anatomical and computational perspectives, Neural Networks, vol.15, issue.4-6, pp.535-547, 2002.
DOI : 10.1016/S0893-6080(02)00047-3

L. P. Kaelbling, M. L. Littman, and A. W. Moore, Reinforcement learning: A survey, Journal of artificial intelligence research, vol.4, pp.237-285, 1996.

S. K. Kalva, M. Rengaswamy, V. S. Chakravarthy, and N. Gupte, On the neural substrates for exploratory dynamics in basal ganglia: A model, Neural Networks, vol.32, pp.65-73, 2012.
DOI : 10.1016/j.neunet.2012.02.031

T. Kohonen, The self-organizing map, Neurocomputing, vol.21, issue.1-3, pp.1-6, 1998.
DOI : 10.1016/S0925-2312(98)00030-7

J. L. Lanciego, N. Luquin, and J. A. Obeso, Functional Neuroanatomy of the Basal Ganglia, Cold Spring Harbor Perspectives in Medicine, vol.2, issue.12, p.a009621, 2012.
DOI : 10.1101/cshperspect.a009621

J. Langford and T. Zhang, The epoch-greedy algorithm for multi-armed bandits with side information, Advances in Neural Information Processing Systems, 2008.

K. Lloyd and D. S. Leslie, Context-dependent decision-making: a simple Bayesian model, Journal of The Royal Society Interface, vol.10, issue.82, p.20130069, 2013.
DOI : 10.1098/rsif.2013.0069

G. Lorden, Procedures for Reacting to a Change in Distribution, The Annals of Mathematical Statistics, pp.1897-1908, 1971.
DOI : 10.1214/aoms/1177693055

R. G. Miltenberger, Behavior modification: Principles and procedures, Cengage Learning

D. S. Olton, Mazes, maps, and memory, American Psychologist, vol.34, issue.7, p.583, 1979.

B. Pasquereau, A. Nadjar, D. Arkadir, E. Bezard, M. Goillandeau et al., Shaping of Motor Responses by Incentive Values through the Basal Ganglia, Journal of Neuroscience, vol.27, issue.5, pp.1176-1183, 2007.
DOI : 10.1523/JNEUROSCI.3745-06.2007

K. Samejima, Y. Ueda, K. Doya, and M. Kimura, Representation of Action-Specific Reward Values in the Striatum, Science, vol.310, issue.5752, pp.1337-1340, 2005.
DOI : 10.1126/science.1115270

W. Schultz, Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology, Current Opinion in Neurobiology, vol.14, issue.2, pp.139-147, 2004.
DOI : 10.1016/j.conb.2004.03.017

M. Seo, E. Lee, and B. B. Averbeck, Action Selection and Action Value in Frontal-Striatal Circuits, Neuron, vol.74, issue.5, pp.947-960, 2012.
DOI : 10.1016/j.neuron.2012.03.037

S. Shivkumar, V. Muralidharan, and V. S. Chakravarthy, A Biologically Plausible Architecture of the Striatum to Solve Context-Dependent Reinforcement Learning Tasks, Frontiers in Neural Circuits, vol.11, p.45, 2017.
DOI : 10.3389/fncir.2017.00045

M. A. Sullivan, H. Chen, and H. Morikawa, Recurrent Inhibitory Network among Striatal Cholinergic Interneurons, Journal of Neuroscience, vol.28, issue.35, pp.8682-8690, 2008.
DOI : 10.1523/JNEUROSCI.2411-08.2008

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192

M. T. Todd, Y. Niv, and J. D. Cohen, Learning to use working memory in partially observable environments through dopaminergic reinforcement, Advances in Neural Information Processing Systems, 2009.

R. C. Wilson, Y. K. Takahashi, G. Schoenbaum, and Y. Niv, Orbitofrontal Cortex as a Cognitive Map of Task Space, Neuron, vol.81, issue.2, pp.267-279, 2014.
DOI : 10.1016/j.neuron.2013.11.005

A. Yu and P. Dayan, Expected and unexpected uncertainty: ACh and NE in the neocortex