Iterative procedures for nonlinear integral equations, Journal of the ACM (JACM), vol.12, issue.4, pp.547-560, 1965. ,
On the generation of Markov decision processes, Journal of the Operational Research Society, pp.354-361, 1995. ,
Natural actor-critic algorithms, Automatica, vol.45, issue.11, pp.2471-2482, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00840470
Linear Least-Squares algorithms for temporal difference learning, Machine Learning, vol.22, pp.33-57, 1996. ,
A survey on policy search for robotics, Foundations and Trends R in Robotics, vol.2, issue.1-2, pp.1-142, 2013. ,
Two classes of multisecant methods for nonlinear acceleration, Numerical Linear Algebra with Applications, vol.16, issue.3, pp.197-221, 2009. ,
Damped anderson acceleration with restarts and monotonicity control for accelerating em and em-like algorithms, 2018. ,
A fixed-point of view on gradient methods for big data, Frontiers in Applied Mathematics and Statistics, vol.3, p.18, 2017. ,
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, p.529, 2015. ,
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994. ,
Approximate modified policy iteration and its application to the game of tetris, Journal of Machine Learning Research, vol.16, pp.1629-1676, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01091341
Regularized nonlinear acceleration, Advances In Neural Information Processing Systems, pp.712-720, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01384682
Reinforcement learning: An introduction, 1998. ,
Convergence analysis for anderson acceleration, SIAM Journal on Numerical Analysis, vol.53, issue.2, pp.805-819, 2015. ,
Anderson acceleration for fixed-point iterations, SIAM Journal on Numerical Analysis, vol.49, issue.4, pp.1715-1735, 2011. ,
Interpolatron: Interpolation or extrapolation schemes to accelerate optimization for deep neural networks, 2018. ,