Improving Performance via Mini-applications, Sandia National Laboratories Research Report, vol.5574, 2009. ,
SPEEDUP-AWARE CO-SCHEDULES FOR EFFICIENT WORKLOAD MANAGEMENT, Parallel Processing Letters, vol.23, issue.02, p.1340001, 2013. ,
DOI : 10.1142/S012962641340001X
Co-scheduling algorithms for high-throughput workload execution, Journal of Scheduling, vol.23, issue.2, p.7793, 1304. ,
DOI : 10.1109/DATE.2012.6176641
URL : https://hal.archives-ouvertes.fr/hal-01252366
A survey of rollback-recovery protocols in message-passing systems, ACM Computing Surveys, vol.34, issue.3, pp.375-408, 2002. ,
DOI : 10.1145/568522.568525
A first order approximation to the optimum checkpoint interval, Communications of the ACM, vol.17, issue.9, pp.530-531, 1974. ,
DOI : 10.1145/361147.361115
A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Computer Systems, vol.22, issue.3, pp.303-312, 2004. ,
DOI : 10.1016/j.future.2004.11.016
Scheduling Multiprocessor Tasks to Minimize Schedule Length, IEEE Transactions on Computers, vol.35, issue.5, pp.389-393, 1986. ,
DOI : 10.1109/TC.1986.1676781
Complexity of Scheduling Parallel Task Systems, SIAM Journal on Discrete Mathematics, vol.2, issue.4, pp.473-487, 1989. ,
DOI : 10.1137/0402042
Approximation Algorithms for Scheduling Independent Malleable Tasks, Euro-Par 2001 Parallel Processing, pp.191-197, 2001. ,
DOI : 10.1007/3-540-44681-8_29
The Implementation of the Cilk-5 ,
Enhancing the performance of malleable MPI applications by using performance-aware dynamic reconfiguration, Parallel Computing, vol.46, pp.60-77, 2015. ,
DOI : 10.1016/j.parco.2015.04.003
Detection and correction of silent data corruption for large-scale high-performance computing Storage and Analysis, ser. SC '12, Proceedings of the International Conference on High Performance Computing, Networking, pp.1-78, 2012. ,
Hiding Checkpoint Overhead in HPC Applications with a Semi-Blocking Algorithm, 2012 IEEE International Conference on Cluster Computing, pp.364-372, 2012. ,
DOI : 10.1109/CLUSTER.2012.82
Performance and reliability trade-offs for the double checkpointing algorithm, International Journal of Networking and Computing, vol.4, issue.1, pp.23-41, 2014. ,
DOI : 10.15803/ijnc.4.1_23
URL : https://hal.archives-ouvertes.fr/hal-01091928
Batch Resizing Policies and Techniques for Fine-Grain Grid Tasks: The Nuts and Bolts, Journal of Information Processing Systems, vol.7, issue.2, 2011. ,
DOI : 10.3745/JIPS.2011.7.2.299
Fault-Tolerance Techniques for High-Performance Computing, 2015. ,
DOI : 10.1007/978-3-319-20943-2
URL : https://hal.archives-ouvertes.fr/hal-01200488
A first order approximation to the optimum checkpoint interval, Communications of the ACM, vol.17, issue.9, pp.530-531, 1974. ,
DOI : 10.1145/361147.361115
Graph theory with applications, 1976. ,
DOI : 10.1007/978-1-349-03521-2
Computers and Intractability, A Guide to the Theory of NP-Completeness, 1979. ,
Checkpointing strategies for parallel jobs, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '11, pp.1-11, 2011. ,
DOI : 10.1145/2063384.2063428
URL : https://hal.archives-ouvertes.fr/hal-00738504