Infinite-horizon gradient-based policy search, Journal of Artificial Intelligence Research, vol.15, pp.319-350, 2001. ,
Perturbation methods in optimal control Wiley/Gauthier-Villars Series in Modern Applied Mathematics, 1988. ,
Optimal control of a double inverted pendulum on a cart, CSEE, OGI School of Science and Engineering, 2004. ,
Likelihood ratio gradient estimation: an overview, Proceedings of the 1987 Winter Simulation Conference, pp.366-375, 1987. ,
Sensitivity Analysis Using It??--Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control, SIAM Journal on Control and Optimization, vol.43, issue.5, pp.1676-1713, 2005. ,
DOI : 10.1137/S0363012902419059
Numerical Solutions of Stochastic Differential Equations, 1995. ,
Stochastic Approximation Algorithms and Applications, 1997. ,
DOI : 10.1007/978-1-4899-2696-8
Planning Algorithms, 2006. ,
DOI : 10.1017/CBO9780511546877
The concentration of measure phenomenon, 2001. ,
DOI : 10.1090/surv/089
Approximate gradient methods in policy-space optimization of Markov reward processes, Discrete Event Dynamic Systems, vol.13, issue.1/2, pp.111-148, 2003. ,
DOI : 10.1023/A:1022145020786
Introduction to Optimization. Optimization Software Inc, 1987. ,
Sensitivity analysis via likelihood ratios, Proceedings of the 18th conference on Winter simulation , WSC '86, pp.285-289, 1986. ,
DOI : 10.1145/318242.318450
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Policy gradient methods for reinforcement learning with function approximation. Neural Information Processing Systems, POLICY GRADIENT IN CONTINUOUS TIME, pp.1057-1063, 2000. ,
A new look at independence. Annals of Probability, pp.1-34, 1996. ,
Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, pp.229-256, 1992. ,
A Monte Carlo Method for Sensitivity Analysis and Parametric Optimization of Nonlinear Stochastic Systems, SIAM Journal on Control and Optimization, vol.29, issue.5, pp.1216-1249, 1991. ,
DOI : 10.1137/0329064