G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman et al., , 2016.

R. J. Cabin and R. J. Mitchell, To bonferroni or not to bonferroni: when and how are the questions, Bulletin of the Ecological Society of America, vol.81, issue.3, pp.246-248, 2000.

P. Dhariwal, C. Hesse, O. Klimov, A. Nichol, M. Plappert et al., , 2017.

P. Henderson, R. Islam, P. Bachman, J. Pineau, D. Precup et al., Deep reinforcement learning that matters, 2017.

R. Islam, P. Henderson, M. Gomrokchi, and D. Precup, Reproducibility of benchmarked deep reinforcement learning tasks for continuous control, Proceedings of the ICML 2017 workshop on Reproducibility in Machine Learning (RML), 2017.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, 2015.

H. Mania, A. Guy, and B. Recht, Simple random search provides a competitive approach to reinforcement learning, 2018.

M. Plappert, R. Houthooft, P. Dhariwal, S. Sidor, R. Y. Chen et al., Parameter space noise for exploration, 2017.

W. R. Rice, Analyzing tables of statistical tests, Evolution, vol.43, issue.1, pp.223-225, 1989.

J. Schulman, S. Levine, P. Moritz, M. I. Jordan, and P. Abbeel, Trust region policy optimization, 2015.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal policy optimization algorithms, 2017.

B. L. Welch, The generalization ofstudent's' problem when several different population variances are involved, Biometrika, vol.34, issue.1/2, pp.28-35, 1947.

Y. Wu, E. Mansimov, S. Liao, R. Grosse, and J. Ba, Scalable trust-region method for deep reinforcement learning using kronecker-factored approximation, 2017.