G. Aupy, A. Benoit, R. Melhem, P. Renaud-Goud, and Y. Robert, Energy-aware checkpointing of divisible tasks with soft or hard deadlines, International Green Computing Conference (IGCC), pp.1-8, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00857244

G. Bosilca, A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra et al., Unified model for assessing checkpointing protocols at extreme-scale, Concurrency and Computation: Practice and Experience, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00696154

A. Bouteiller, G. Bosilca, and J. Dongarra, Redesigning the message logging model for high performance, Concurrency and Computation: Practice and Experience, vol.22, issue.16, pp.2196-2211, 2010.

F. Cappello, E. Caron, M. J. Daydé, F. Desprez, Y. Jégou et al., Grid'5000: A Large Scale, Reconfigurable, Controlable and Monitorable Grid Platform, IEEE/ACM Grid 2005, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000284

F. Cappello, H. Casanova, and Y. Robert, Preventive migration vs. preventive checkpointing for extreme scale supercomputers, Parallel Processing Letters, vol.21, issue.2, pp.111-132, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00945068

F. Cappello, A. Geist, B. Gropp, S. Kale, B. Kramer et al., Toward exascale resilience, International Journal of High Performance Computing Applications, vol.23, pp.374-388, 2009.

K. M. Chandy and L. Lamport, Distributed snapshots: Determining global states of distributed systems, ACM Transactions on Computer Systems, vol.3, issue.1, pp.63-75, 1985.

J. T. Daly, A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Comp. Syst, vol.22, issue.3, pp.303-312, 2006.

M. Dias de Assunção, J. Gelas, L. Lefèvre, and A. Orgerie, The Green Grid5000: Instrumenting a grid with energy sensors, 5th International Workshop on Distributed Cooperative Laboratories: Instrumenting the Grid (IN-GRID 2010), 2010.
URL : https://hal.archives-ouvertes.fr/hal-00690638

M. Dias de Assunção, A. Orgerie, and L. Lefèvre, An analysis of power consumption logs from a monitored grid site, IEEE/ACM International Conference on Green Computing and Communications (GreenCom-2010), pp.61-68, 2010.
URL : https://hal.archives-ouvertes.fr/ensl-00579429

M. E. Diouri, M. F. Dolz, O. Glück, L. Lefèvre, P. Alonso et al., Solving some mysteries in power monitoring of servers: Take care of your wattmeters!, Energy Efficiency in Large Scale Distributed Systems (EE-LSDS), 2013.
URL : https://hal.archives-ouvertes.fr/hal-00806504

M. E. Diouri, O. Glück, and L. Lefèvre, Your Cluster is not Power Homogeneous: Take Care when Designing Green Schedulers!, 4th IEEE International Green Computing Conference (IGCC), 2013.

M. E. Diouri, O. Glück, L. Lefèvre, and F. Cappello, Energy considerations in checkpointing and fault tolerance protocols, DSNW, pp.1-6, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00748006

M. E. Diouri, O. Glück, L. Lefèvre, and F. Cappello, ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance protocols during HPC executions, 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.522-529, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00806500

M. E. Diouri, G. L. Tsafack Chetsa, O. Glück, L. Lefèvre, J. Pierson et al., Energy efficiency in high-performance computing with and without knowledge of applications and services, International Journal of High Performance Computing Applications (IJHPCA), 2013.
URL : https://hal.archives-ouvertes.fr/hal-00870615

J. Dongarra, P. Beckman, P. Aerts, F. Cappello, T. Lippert et al., The international exascale software project: a call to cooperative action by the global high-performance community, Int. Journal of High Performance Computing Applications, vol.23, issue.4, pp.309-322, 2009.

J. Dongarra, T. Hérault, and Y. Robert, Revisiting the double checkpointing algorithm, 15th Workshop on Advances in Parallel and Distributed Computational Models APDCM 2013, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00925168

E. N. Elnozahy, L. Alvisi, Y. Wang, and D. B. Johnson, A Survey of Rollback-Recovery Protocols in Message-Passing Systems, ACM Computing Surveys, vol.34, issue.3, pp.375-408, 2002.

M. Etinski, J. Corbalan, J. Labarta, and M. Valero, Utilization driven power-aware parallel job scheduling, Computer Science - Research and Development, vol.25, issue.3-4, pp.207-216, 2010.

W. Feng, X. Feng, and R. Ge, Green supercomputing comes of age, IT Professional, vol.10, issue.1, pp.17-23, 2008.

K. Ferreira, J. Stearley, J. H. Laros, R. Oldfield, K. Pedretti et al., Evaluating the Viability of Process Replication Reliability for Exascale Systems, Proceedings of the 2011 ACM/IEEE Conference on Supercomputing, 2011.

V. W. Freeh, D. K. Lowenthal, F. Pan, N. Kappiah, R. Springer et al., Analyzing the energy-time trade-off in high-performance computing applications, IEEE Trans. Parallel Distrib. Syst, vol.18, issue.6, pp.835-848, 2007.

F. Hermenier, N. Loriant, and J. Menaud, Power Management in Grid Computing with Xen, Frontiers of High Performance Computing and Networking - ISPA 2006 Workshops, ISPA 2006 International Workshops, vol.4331, pp.407-416, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00441366

H. Hlavacs, G. Da Costa, and J. Pierson, Energy consumption of residential and professional switches, IEEE CSE, 2009.

Y. Hotta, M. Sato, H. Kimura, S. Matsuoka, T. Boku et al., Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster, Proceedings of the 20th International Parallel and Distributed Processing Symposium, 2006.

C. Hsu, W. Feng, and J. S. Archuleta, Towards efficient supercomputing: A quest for the right metric, Proceedings of the High Performance Power-Aware Computing Workshop, 2005.

C. Hsu and W. Feng, A power-aware run-time system for high-performance computing, Proceedings of the ACM/IEEE SC 2005 Conference, 2005.

J. Dongarra, The international ExaScale software project roadmap, Int. J. of High Performance Computing & Applications, vol.25, issue.1, 2011.

J. H. Laros III, K. T. Pedretti, S. M. Kelly, W. Shu et al., Energy based performance tuning for large scale high performance computing systems, Proceedings of the 2012 Symposium on High Performance Computing, HPC '12, vol.6, pp.1-6, 2012.

P. Mahadevan, P. Sharma, S. Banerjee, and P. Ranganathan, A power benchmarking framework for network devices, NETWORKING 2009 Conference, pp.795-808, 2009.

E. Meneses, O. Sarood, and L. V. Kalé, Assessing Energy Efficiency of Fault Tolerance Protocols for HPC Systems, Proceedings of the 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2012), 2012.

R. H. Netzer and J. Xu, Necessary and sufficient conditions for consistent global snapshots, IEEE Transactions on Parallel and Distributed Systems, vol.6, issue.2, pp.165-169, 1995.

X. Ni, E. Meneses, and L. V. Kalé, Hiding checkpoint overhead in HPC applications with a semi-blocking algorithm, Proc. 2012 IEEE Int. Conf. Cluster Computing, 2012.

A. Orgerie, L. Lefèvre, and J. Gelas, Save Watts in your Grid: Green Strategies for Energy-Aware Framework in Large Scale Distributed Systems, ICPADS 2008: The 14th IEEE International Conference on Parallel and Distributed Systems, 2008.
URL : https://hal.archives-ouvertes.fr/ensl-00474726

P. Alonso, M. F. Dolz, R. Mayo, and E. S. Quintana-Ortí, Energy-efficient execution of dense linear algebra algorithms on multi-core processors, Cluster Computing, 2012.

P. Alonso, M. F. Dolz, F. D. Igual, R. Mayo, and E. S. Quintana-Ortí, DVFS-control techniques for dense linear algebra operations on multi-core processors, Computer Science - R&D, vol.27, issue.4, pp.289-298, 2012.

E. Pinheiro, R. Bianchini, E. V. Carrera, and T. Heath, Load balancing and unbalancing for power and performance in cluster-based systems, Workshop on Compilers and Operating Systems for Low Power, 2001.

R. Rajachandrasekar, A. Moody, K. Mohror, and D. K. Panda, A 1 PB/s file system to checkpoint three million MPI tasks, Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, HPDC '13, pp.143-154, 2013.

C. Rao, H. Toutenburg, A. Fieger, C. Heumann, T. Nittner et al., Linear models: Least squares and alternatives, Springer Series in Statistics, 1999.

V. Sarkar, Exascale software study: Software challenges in extreme scale systems, 2009.

J. Shalf, S. Dosanjh, and J. Morrison, Exascale computing technology challenges, VECPAR'10, the 9th Int. Conf. High Performance Computing for Computational Science, vol.6449, pp.1-25, 2011.

J. W. Young, A first order approximation to the optimum checkpoint interval, Communications of the ACM, vol.17, issue.9, pp.530-531, 1974.

G. Zheng, X. Ni, and L. V. Kalé, A scalable double in-memory checkpoint and restart scheme towards exascale, Dependable Systems and Networks Workshops (DSN-W), 2012.

G. Zheng, L. Shi, and L. V. Kalé, FTC-Charm++: an in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI, Proc. 2004 IEEE Int. Conf. Cluster Computing, 2004.