A. Agelastos, B. Allan, J. Brandt, A. Gentile, S. Lefantzi et al., Toward rapid understanding of production HPC applications and systems, 2015 IEEE International Conference on Cluster Computing, pp.464-473, 2015.

G. Aupy, A. Gainaru, V. Honoré, P. Raghavan, Y. Robert et al., Reservation strategies for stochastic jobs, IEEE IPDPS, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01968419

M. J. Berger and P. Colella, Local adaptive mesh refinement for shock hydrodynamics, J. Comput. Phys., vol.82, issue.1, pp.64-84, 1989.

D. Braess, J. Forster, T. Sauer, and H. Simon, How to achieve minimax expected Kullback-Leibler distance from an unknown finite distribution, International Conference on Algorithmic Learning Theory, pp.380-394, 2002.

D. Braess and T. Sauer, Bernstein polynomials and learning theory, Journal of Approximation Theory, vol.128, issue.2, pp.187-206, 2004.

N. Capit, G. Costa, Y. Georgiou, G. Huard, C. Martin et al., A batch scheduler with high level components, CCGrid, pp.776-783, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00005106

L. Carrington, D. Komatitsch, M. Laurenzano, M. M. Tikir, D. Michea et al., High-frequency simulations of global seismic wave propagation using SPECFEM3D_GLOBE on 62K processors, SC '08, 2008.

L. Carrington, A. Snavely, and N. Wolter, A performance prediction framework for scientific applications, Future Generation Computer Systems, vol.22, issue.3, pp.336-346, 2006.

S.-O. Chan, I. Diakonikolas, R. A. Servedio, and X. Sun, Learning mixtures of structured distributions over discrete domains, ACM-SIAM SODA, pp.1380-1394, 2013.

A. Gainaru, ScheduleFlow: A simulator for HPC schedulers, vol.19, 2019.

T. Evans, W. L. Barth, J. C. Browne, R. L. Deleon, T. R. Furlani et al., Comprehensive resource use monitoring for HPC systems with TACC Stats, 2014 First International Workshop on HPC User Support Tools, pp.13-21, 2014.

A. Gainaru, G. Pallez, H. Sun, and P. Raghavan, Speculative scheduling for stochastic HPC applications, ICPP, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02158598

E. Gaussier, D. Glesser, V. Reis, and D. Trystram, Improving backfilling by using machine learning to predict running times, SC '15, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01221186

B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph et al., Mesos: A platform for fine-grained resource sharing in the data center, Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, pp.295-308, 2011.

S. Kamath, A. Orlitsky, D. Pichapati, and A. Suresh, On learning distributions from their samples, Conference on Learning Theory, pp.1066-1100, 2015.

MASI Lab, Medical-image Analysis and Statistical Interpretation (MASI) Lab.

S. Mustafa, I. Elghandour, and M. A. Ismail, A machine learning approach for predicting execution time of Spark jobs, Alexandria Engineering Journal, vol.57, issue.4, pp.3767-3778, 2018.

G. Staples, TORQUE resource manager, SC '06, 2006.

D. Tsafrir, Y. Etsion, and D. G. Feitelson, Backfilling using system-generated predictions rather than user runtime estimates, IEEE Transactions on Parallel and Distributed Systems, vol.18, issue.6, pp.789-803, 2007.

O. Tuncer, E. Ates, Y. Zhang, A. Turk, J. Brandt et al., Diagnosing performance variations in HPC applications using machine learning, High Performance Computing, pp.355-373, 2017.

A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune et al., Large-scale cluster management at Google with Borg, Proceedings of the Tenth European Conference on Computer Systems, EuroSys '15, vol.18, pp.1-18, 2015.

O. Weidner, M. Atkinson, A. Barker, and R. Vicente, Rethinking high performance computing platforms: Challenges, opportunities and recommendations, ACM DIDC '16, pp.19-26, 2016.

L. T. Yang, X. Ma, and F. Mueller, Cross-platform performance prediction of parallel applications using partial execution, SC '05, p.40, 2005.

A. B. Yoo, M. A. Jette, and M. Grondona, SLURM: Simple Linux utility for resource management, JSSPP, pp.44-60, 2003.