Autonomous helicopter aerobatics through apprenticeship learning, International Journal of Robotics Research, vol.29, pp.1608-1639, 2010.
Direct gradient-based reinforcement learning, Proceedings of the IEEE International Symposium on Circuits and Systems, pp.271-274, 2002.
Real-time learning: A ball on a beam, International Joint Conference on Neural Networks, 1992.
Natural actor-critic algorithms, Automatica, pp.2471-2482, 2009.
Simultaneous adversarial multi-robot learning, International Joint Conference on Artificial Intelligence, pp.2471-2482, 2003.
Reinforcement learning of walking behavior for a four-legged robot, Proceedings of the 40th IEEE Conference on Decision and Control, pp.411-416, 2002.
Policy gradient reinforcement learning for fast quadrupedal locomotion, Proceedings of the International Conference on Robotics and Automation, pp.2619-2624, 2004.
Natural actor-critic, Neurocomputing, pp.1180-1190, 2008.
Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning, Proceedings of the IEEE International Conference on Rehabilitation Robotics, pp.134-140, 2011.
Reinforcement Learning, 1998.
DOI: 10.1016/B978-012526430-3/50003-9
On the role of tracking in stationary environments, Proceedings of the 24th International Conference on Machine Learning, pp.871-878, 2007.
Policy gradient methods for reinforcement learning with function approximation, Advances in Neural Information Processing Systems, pp.1057-1063, 2000.
Online learning with random representations, Proceedings of the Tenth International Conference on Machine Learning, pp.314-321, 1993.
Stochastic policy gradient reinforcement learning on a simple 3d biped, IEEE Proceedings of the IEEE, pp.2849-2854, 2005.