G. Aupy, M. Shantharam, A. Benoit, Y. Robert, and P. Raghavan, Coscheduling algorithms for high-throughput workload execution, Journal of Scheduling, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01252366

A. Benoit, L. Pottier, and Y. Robert, Resilient application co-scheduling with processor redistribution
URL : https://hal.archives-ouvertes.fr/hal-01354863

M. Bhadauria and S. A. McKee, An approach to resource-aware coscheduling for CMPs, Proc. 24th ACM Int. Conf. on Supercomputing, ICS '10, 2010.

P. B. Bhat, C. S. Raghavendra, and V. K. Prasanna, Efficient collective communication in distributed heterogeneous systems, Journal of Parallel and Distributed Computing, vol.63, issue.3, pp.251-263, 2003.

J. Blazewicz, M. Machowiak, G. Mounie, and D. Trystram, Approximation algorithms for scheduling independent malleable tasks, Parallel Processing, vol.2150, pp.191-197, 2001.

J. A. Bondy and U. S. Murty, Graph theory with applications, 1976.

G. Bosilca, A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra et al., Unified model for assessing checkpointing protocols at extreme-scale, Concurrency and Computation: Practice and Experience, vol.26, issue.17, pp.2772-2791, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00696154

M. Bougeret, H. Casanova, M. Rabie, Y. Robert, and F. Vivien, Checkpointing strategies for parallel jobs, High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for, pp.1-11, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00560582

P. Brucker, A. Gladky, H. Hoogeveen, M. Y. Kovalyov, C. Potts et al., Scheduling a batching machine, J. Scheduling, vol.1, pp.31-54, 1998.

D. Chandra, F. Guo, S. Kim, and Y. Solihin, Predicting inter-thread cache contention on a chip multi-processor architecture, HPCA 11, pp.340-351, 2005.

E. G. Coffman, M. R. Garey, D. S. Johnson, and R. E. Tarjan, Performance bounds for level-oriented two-dimensional packing algorithms, SIAM Journal on Computing, vol.9, issue.4, pp.808-826, 1980.

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2009.

J. T. Daly, A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Computer Systems, vol.22, issue.3, pp.303-312, 2004.

R. K. Deb and R. F. Serfozo, Optimal control of batch service queues, Advances in Applied Probability, pp.340-361, 1973.

J. Dongarra, T. Herault, and Y. Robert, Performance and reliability trade-offs for the double checkpointing algorithm, International Journal of Networking and Computing, vol.4, issue.1, pp.23-41, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01091928

P. Dutot, Scheduling parallel tasks: Approximation algorithms, Handbook of Scheduling: Algorithms, Models, and Performance Analysis, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00003126

E. N. Elnozahy, L. Alvisi, Y. Wang, and D. B. Johnson, A survey of rollback-recovery protocols in message-passing systems, ACM Comput. Surv., vol.34, issue.3, pp.375-408, 2002.

D. Fiala, F. Mueller, C. Engelmann, R. Riesen, K. Ferreira et al., Detection and correction of silent data corruption for large-scale high-performance computing, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, vol.78, pp.1-78, 2012.

E. Frachtenberg, D. Feitelson, F. Petrini, and J. Fernandez, Adaptive parallel job scheduling with flexible coscheduling, IEEE Trans. Parallel and Distributed Systems, vol.16, issue.11, pp.1066-1077, 2005.

M. Frigo, C. E. Leiserson, and K. H. Randall, The Implementation of the Cilk-5 Multithreaded Language, Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, PLDI '98, pp.212-223, 1998.

C. Hankendi and A. Coskun, Reducing the energy cost of computing through efficient co-scheduling of parallel workloads, Design, Automation & Test in Europe Conference & Exhibition (DATE), pp.994-999, 2012.

T. Herault and Y. Robert, Fault-Tolerance Techniques for High-Performance Computing, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01200488

M. A. Heroux, D. W. Doerfler, P. S. Crozier, J. M. Willenbring, H. C. Edwards et al., Improving Performance via Mini-applications, Research Report, vol.5574, 2009.

Y. Ikura and M. Gimple, Efficient scheduling algorithms for a single batch processing machine, Operations Research Letters, vol.5, issue.2, pp.61-65, 1986.

F. Koehler and S. Khuller, Optimal batch schedules for parallel machines, Proceedings of the 13th Annual Algorithms and Data Structures Symposium, 2013.

G. Koole and R. Righter, A stochastic batching and scheduling problem, Probability in the Engineering and Informational Sciences, vol.15, issue.4, pp.465-479, 2001.

D. Li, D. S. Nikolopoulos, K. Cameron, B. R. de Supinski, and M. Schulz, Power-aware MPI task aggregation prediction for high-end computing systems, IPDPS 10, pp.1-12, 2010.

N. Muthuvelu, I. Chai, E. Chikkannan, and R. Buyya, Batch resizing policies and techniques for fine-grain grid tasks: The nuts and bolts, J. Information Processing Systems, vol.7, issue.2, 2011.

X. Ni, E. Meneses, and L. Kale, Hiding Checkpoint Overhead in HPC Applications with a Semi-Blocking Algorithm, Cluster Computing (CLUSTER), 2012 IEEE International Conference on, pp.364-372, 2012.

C. N. Potts and M. Y. Kovalyov, Scheduling with batching: a review, European Journal of Operational Research, vol.120, issue.2, pp.228-249, 2000.

M. Shantharam, Y. Youn, and P. Raghavan, Speedup-aware co-schedules for efficient workload management, Parallel Processing Letters, vol.23, issue.2, 2013.

J. W. Young, A first order approximation to the optimum checkpoint interval, Comm. of the ACM, vol.17, issue.9, pp.530-531, 1974.