B. S. Barto-a, Linear least-squares algorithms for temporal difference learning, pp.33-57, 1996.

B. O. and D. A. Charpillet-f, Shaping multi-agent systems with gradient reinforcement learning, Autonomous Agent and Multi-Agent System Journal (AAMASJ), pp.197-220, 2007.

G. Pdmia, Processus Décisionnels de Markov en Intelligence Artificielle. (Edité par Olivier Buffet et Olivier Sigaud, 2008.

L. M. Parr-r and . Littman-m, Least-squares methods in reinforcement learning for control, Proc ; of the 2nd Hellenic Conference on Artificial Intelligence (SETN-02), number 2308 in Lecture Notes on Artificial Intelligence, pp.249-260, 2002.

L. M. Metta-g, . Pfeifer-r, and . Sandini-g, Developmental robotics : a survey, Connection Science, vol.15, issue.4, pp.151-190, 2003.

M. J. Hayes-p, Some philosophical problems from the standpoint of artificial intelligence, Machine Intelligence, vol.4, pp.463-502, 1969.

N. A. Harada-d and . Russell-s, Policy invariance under reward transformations : Theory and application to reward shaping, Proceedings of the Sixteenth International Conference on Machine Learning, ICML-99, pp.278-287, 1999.

O. Kaplan-f and . Hafner-v, Intrinsic motivation systems for autonomous mental development, IEEE Transactions on Evolutionnary Computation, vol.11, issue.2, pp.265-286, 2007.

R. N. Boniface, Dynamic Self-Organising Map, Neurocomputing, vol.74, issue.11, pp.1840-1847, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00495827

S. L. Buffet-o and . Dutech-a, Apprentissage par renforcement développemental en robotique autonome, Conférence Francophone d'Apprentissage, 2011.

S. R. Barto-a, Reinforcement Learning, 1998.
DOI : 10.1016/B978-012526430-3/50003-9

T. G. Kenny-p, The role of developmental limitations of sensory input on sensory/perceptual organization, Developmental & Behavioral Pediatrics, vol.6, issue.5, pp.302-306, 1985.