Optimal Checkpointing Period: Time vs. Energy - Inria - Institut national de recherche en sciences et technologies du numérique
Conference Papers, Year: 2013

Optimal Checkpointing Period: Time vs. Energy

Abstract

This short paper addresses parallel scientific applications that use non-blocking, periodic coordinated checkpointing to enforce resilience. We provide a model and detailed formulas for total execution time and consumed energy. We characterize the optimal period for each objective, and we assess the range of time/energy trade-offs by instantiating the model with a set of realistic scenarios for Exascale systems. We place particular emphasis on I/O transfers, because the relative cost of communication is expected to increase dramatically, in terms of both latency and consumed energy, on future Exascale platforms.
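For readers unfamiliar with the notion of an optimal checkpointing period, the sketch below illustrates the idea with the classical first-order approximation due to Young, which applies to blocking checkpoints and minimizes execution time only. It is not the model or the formulas of this paper, which covers non-blocking checkpointing and energy and is not reproduced on this page. The function names and the numeric values are hypothetical, chosen only for illustration.

```python
import math


def young_period(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the time-optimal period.

    Classical result for blocking checkpoints: T = sqrt(2 * C * mu),
    where C is the checkpoint cost and mu the platform MTBF.
    This is NOT the formula derived in the paper.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def first_order_waste(period_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order fraction of time lost to resilience.

    C / T is the checkpoint overhead per period; T / (2 * mu) is the
    expected re-execution after a failure (downtime and recovery ignored).
    """
    return checkpoint_cost_s / period_s + period_s / (2.0 * mtbf_s)


# Hypothetical values, for illustration only.
C = 600.0      # checkpoint cost in seconds (assumed)
MU = 86400.0   # platform MTBF in seconds (assumed)

T = young_period(C, MU)
print(f"period: {T:.1f} s, waste: {first_order_waste(T, C, MU):.4f}")
```

At the period returned by `young_period`, the two waste terms are equal, which is why shortening or lengthening the period increases the first-order waste. An energy-optimal period generally differs from this time-optimal one, which is the trade-off the abstract refers to.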
Main file: main.pdf (317.2 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00926199 , version 1 (09-01-2014)

Identifiers

  • HAL Id : hal-00926199 , version 1

Cite

Guillaume Aupy, Anne Benoit, Thomas Hérault, Yves Robert, Jack Dongarra. Optimal Checkpointing Period: Time vs. Energy. Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Nov 2013, Denver, United States. ⟨hal-00926199⟩