S. Astete-Morales, J. Liu, and O. Teytaud, Log-log Convergence for Noisy Optimization, Artificial Evolution - 11th International Conference, pp.16-28, 2013.
DOI : 10.1007/978-3-319-11683-9_2

R. Bellman, Dynamic Programming, 1957.

D. P. Bertsekas, Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC, European Journal of Control, vol.11, issue.4-5, pp.310-334, 2005.
DOI : 10.3166/ejc.11.310-334
URL : http://www-mit.mit.edu/dimitrib/www/ADP-MPC.pdf

H. Beyer, The Theory of Evolution Strategies, Natural Computing Series, 2001.

J. Christophe, J. Decock, and O. Teytaud, Direct model predictive control, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2014.
URL : https://hal.archives-ouvertes.fr/hal-00958192

R. Coulom, CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning, in H. Jaap van den Herik and Aske Plaat (eds.), Advances in Computer Games, number 7168 in Lecture Notes in Computer Science, pp.146-157, 2011.
DOI : 10.1007/978-3-642-31866-5_13
URL : https://hal.archives-ouvertes.fr/hal-00750326

V. Fabian, Stochastic Approximation of Minima with Improved Asymptotic Speed, The Annals of Mathematical Statistics, vol.38, issue.1, pp.191-200, 1967.
DOI : 10.1214/aoms/1177699070

A. Frangioni, Solving Nonlinear Single-Unit Commitment Problems with Ramping Constraints, Operations Research, vol.54, issue.4, pp.767-775, 2006.
DOI : 10.1287/opre.1060.0309
URL : http://www.di.unipi.it/~frangio/papers/1UC-OpRes.pdf

V. Heidrich-Meisner and C. Igel, Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.401-408, 2009.
DOI : 10.1145/1553374.1553426

S. Najafi and Y. Pourjamal, A New Heuristic Algorithm for Unit Commitment Problem, 2nd International Conference on Advances in Energy Engineering (ICAEE), 2005.
DOI : 10.1016/j.egypro.2011.12.1201
URL : https://doi.org/10.1016/j.egypro.2011.12.1201

M. V. Pereira and L. M. Pinto, Multi-stage stochastic optimization applied to energy planning, Mathematical Programming, vol.52, issue.1-3, pp.359-375, 1991.
DOI : 10.1007/BF01582895

M. Schoenauer and E. Ronald, Neuro-genetic truck backer-upper controller, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence, 1994.
DOI : 10.1109/ICEC.1994.349969
URL : http://www.eeaax.polytechnique.fr/papers/marc/icec94.ps.gz

A. Schrijver, Theory of Linear and Integer Programming, 1986.

O. Shamir, On the complexity of bandit and derivative-free stochastic convex optimization, COLT 2013 - The 26th Annual Conference on Learning Theory, pp.3-24, 2013.

A. Shapiro, Analysis of stochastic dual dynamic programming method, European Journal of Operational Research, vol.209, issue.1, pp.63-72, 2011.
DOI : 10.1016/j.ejor.2010.08.007

J. C. Spall, Adaptive stochastic approximation by the simultaneous perturbation method, IEEE Transactions on Automatic Control, vol.45, issue.10, pp.1839-1853, 2000.
DOI : 10.1109/TAC.2000.880982

J. C. Spall, Feedback and Weighting Mechanisms for Improving Jacobian Estimates in the Adaptive Simultaneous Perturbation Algorithm, IEEE Transactions on Automatic Control, vol.54, issue.6, pp.1216-1229, 2009.
DOI : 10.1109/TAC.2009.2019793