J. Audibert, R. Munos, and C. Szepesvári, Tuning Bandit Algorithms in Stochastic Environments, Algorithmic Learning Theory, pp.150-165, 2007.
DOI : 10.1007/978-3-540-75225-7_15
URL : https://hal.archives-ouvertes.fr/inria-00203487

T. Bäck, Evolutionary Algorithms in Theory and Practice, 1996.

D. P. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, 1996.

H. Beyer and H. Schwefel, Evolution strategies, Scholarpedia, vol.2, issue.8, pp.3-52, 2002.
DOI : 10.4249/scholarpedia.1965

N. Böhm, G. Kókai, and S. Mandl, An Evolutionary Approach to Tetris, Proc. of the 6th Metaheuristics International Conference, CD-ROM, 2005.

R. Breukelaar, H. J. Hoogeboom, and W. A. Kosters, Tetris is Hard, Made Easy, 2003.

H. Burgiel, How to Lose at Tetris, The Mathematical Gazette, vol.81, issue.491, pp.194-200, 1997.
DOI : 10.2307/3619195

D. Carr, Adapting Reinforcement Learning to Tetris, 2005.

P. de Boer, D. Kroese, S. Mannor, and R. Rubinstein, A Tutorial on the Cross-Entropy Method, Annals of Operations Research, vol.134, issue.1, pp.19-67, 2005.
DOI : 10.1007/s10479-005-5724-z

E. D. Demaine, S. Hohenberger, and D. Liben-Nowell, Tetris is Hard, Even to Approximate, Proc. 9th COCOON, pp.351-363, 2003.
DOI : 10.1007/3-540-45071-8_36

V. Farias and B. Van Roy, Tetris: A Study of Randomized Constraint Sampling, in Probabilistic and Randomized Methods for Design Under Uncertainty, 2006.

N. Hansen, S. P. Niederberger, L. Guzzella, and P. Koumoutsakos, A Method for Handling Uncertainty in Evolutionary Optimization With an Application to Feedback Control of Combustion, IEEE Transactions on Evolutionary Computation, vol.13, issue.1, pp.180-197, 2009.
DOI : 10.1109/TEVC.2008.924423
URL : https://hal.archives-ouvertes.fr/inria-00276216

N. Hansen and A. Ostermeier, Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation, Proceedings of IEEE International Conference on Evolutionary Computation, pp.312-317, 1996.
DOI : 10.1109/ICEC.1996.542381

N. Hansen and A. Ostermeier, Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001.
DOI : 10.1162/106365601750190398

V. Heidrich-Meisner and C. Igel, Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.401-408, 2009.
DOI : 10.1145/1553374.1553426

Y. Jin and J. Branke, Evolutionary Optimization in Uncertain Environments - A Survey, IEEE Transactions on Evolutionary Computation, vol.9, issue.3, pp.303-317, 2005.
DOI : 10.1109/TEVC.2005.846356

S. Kakade, A natural policy gradient, Advances in Neural Information Processing Systems (NIPS 14), pp.1531-1538, 2001.

J. Kennedy, R. C. Eberhart, and Y. Shi, Swarm Intelligence, 2001.
DOI : 10.1007/0-387-27705-6_6

J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, 1992.

M. G. Lagoudakis, R. Parr, and M. L. Littman, Least-Squares Methods in Reinforcement Learning for Control, SETN '02: Proc. of the Second Hellenic Conference on AI, pp.249-260, 2002.
DOI : 10.1007/3-540-46014-4_23

L. Langenhoven, W. S. van Heerden, and A. P. Engelbrecht, Swarm Tetris: Applying Particle Swarm Optimization to Tetris, IEEE Congress on Evolutionary Computation, pp.1-8, 2010.
DOI : 10.1109/CEC.2010.5586033

R. E. Llima, Xtris readme, 2005.

O. Maron and A. W. Moore, Hoeffding races: Accelerating model selection search for classification and function approximation, Proc. Advances in neural information processing systems, pp.59-66, 1994.

O. Maron and A. W. Moore, The Racing Algorithm: Model Selection for Lazy Learners, Artificial Intelligence Review, vol.11, pp.193-225, 1997.
DOI : 10.1007/978-94-017-2053-3_8

A. Ostermeier, A. Gawelczyk, and N. Hansen, A Derandomized Approach to Self-Adaptation of Evolution Strategies, Evolutionary Computation, vol.2, issue.4, pp.369-407, 1994.
DOI : 10.1162/evco.1994.2.4.369

I. Rechenberg, Evolution strategy, Computational Intelligence imitating life, pp.147-159, 1994.

C. Schmidt, J. Branke, and S. Chick, Integrating Techniques from Statistical Ranking into Evolutionary Algorithms, LNCS, vol.3907, pp.752-763, 2006.
DOI : 10.1007/11732242_73

E. V. Siegel and A. D. Chaffee, Genetically optimizing the speed of programs evolved to play tetris, Advances in Genetic Programming 2, pp.279-298, 1996.

P. Stagge, Averaging efficiently in the presence of noise, Proc. of PPSN 5, pp.188-197, 1998.
DOI : 10.1007/BFb0056862

I. Szita and A. Lörincz, Learning Tetris Using the Noisy Cross-Entropy Method, Neural Computation, vol.18, issue.12, pp.2936-2941, 2006.
DOI : 10.1162/neco.2006.18.12.2936

C. Thiery and B. Scherrer, Construction d'un joueur artificiel pour Tetris, Revue d'Intelligence Artificielle, pp.387-407, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00418922

C. Thiery and B. Scherrer, Building Controllers for Tetris, ICGA Journal, vol.32, issue.1, pp.3-11, 2010.
DOI : 10.3233/ICG-2009-32102
URL : https://hal.archives-ouvertes.fr/inria-00418954

C. Thiery and B. Scherrer, Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems, Proc. ICML, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00520841

J. N. Tsitsiklis and B. Van Roy, Feature-based methods for large scale dynamic programming, Machine Learning, pp.59-94, 1996.