L. Busoniu, D. Ernst, B. De Schutter, and R. Babuska, Cross-entropy optimization of control policies with adaptive basis functions, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol.41, issue.1, p.196, 2011.

M. V. Butz and O. Herbort, Context-dependent predictions and cognitive arm control with XCSF, Proceedings of the 10th annual conference on Genetic and evolutionary computation, GECCO '08, pp.1357-1364, 2008.
DOI: 10.1145/1389095.1389360

C. Cao, N. Hovakimyan, and J. Wang, Intelligent excitation for adaptive control with unknown parameters in reference input, IEEE Transactions on Automatic Control, vol.52, issue.8, p.1525, 2007.

N. Hansen and A. Ostermeier, Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001.

V. Heidrich-Meisner and C. Igel, Evolution strategies for direct policy search, Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X, pp.428-437, 2008.

A. Ijspeert, J. Nakanishi, and S. Schaal, Movement imitation with nonlinear dynamical systems in humanoid robots, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2002.

J. Kober and J. Peters, Policy search for motor primitives in robotics, Machine Learning, pp.171-203, 2011.

S. Mannor, R. Rubinstein, and Y. Gat, The cross-entropy method for fast policy search, Proceedings of the 20th International Conference on Machine Learning, pp.512-519, 2003.

D. Marin, J. Decock, L. Rigoux, and O. Sigaud, Learning cost-efficient control policies with XCSF: Generalization capabilities and further improvement, Proceedings of the 13th annual conference on Genetic and evolutionary computation, pp.1235-1242, 2011.

D. Marin and O. Sigaud, Towards fast and adaptive optimal control policies for robots: A direct policy search approach, Proceedings Robotica, pp.21-26, 2012.

J. Peters and S. Schaal, Natural actor-critic, Neurocomputing, vol.71, issue.7-9, 2008.

T. Rückstiess, F. Sehnke, T. Schaul, D. Wierstra, Y. Sun, and J. Schmidhuber, Exploring parameter space in reinforcement learning, Paladyn, Journal of Behavioral Robotics, vol.1, pp.14-24, 2010.

F. Stulp, E. Theodorou, J. Buchli, and S. Schaal, Learning to grasp under uncertainty, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2011.

I. Szita and A. Lörincz, Learning tetris using the noisy cross-entropy method, Neural Computation, vol.18, issue.12, pp.2936-2941, 2006.

M. Tamosiunaite, B. Nemec, A. Ude, and F. Wörgötter, Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives, Robotics and Autonomous Systems, vol.59, p.910, 2011.

E. Theodorou, J. Buchli, and S. Schaal, A generalized path integral control approach to reinforcement learning, Journal of Machine Learning Research, vol.11, pp.3137-3181, 2010.