A different re-execution speed can help

Abstract: We consider divisible load scientific applications executing on large-scale platforms subject to silent errors. While the goal is usually to complete the execution as fast as possible in expectation, another major concern is energy consumption. The use of dynamic voltage and frequency scaling (DVFS) can help save energy, but at the price of performance degradation. Consider the execution model where a set of K different speeds is given, and whenever a failure occurs, a different re-execution speed may be used. Can this help? We address the following bi-criteria problem: how to compute the optimal checkpointing period to minimize energy consumption while bounding the degradation in performance. We solve this bi-criteria problem by providing a closed-form solution for the checkpointing period, and demonstrate via a comprehensive set of simulations that a different re-execution speed can indeed help.
Document type:
Conference paper
5th International Workshop on Power-aware Algorithms, Systems, and Architectures (PASA'16), held in conjunction with ICPP 2016, the 45th International Conference on Parallel Processing, Aug 2016, Philadelphia, United States. 2016, Proceedings of ICPP'2016 workshops (ICPPW'16)
Cited literature [18 references]

https://hal.inria.fr/hal-01354887
Contributor: Equipe Roma
Submitted on: Friday, 19 August 2016 - 19:03:59
Last modified on: Friday, 20 April 2018 - 15:44:27
Document(s) archived on: Sunday, 20 November 2016 - 10:40:25

File

pasa2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01354887, version 1

Citation

Anne Benoit, Aurélien Cavelan, Valentin Le Fèvre, Yves Robert, Hongyang Sun. A different re-execution speed can help. 5th International Workshop on Power-aware Algorithms, Systems, and Architectures (PASA'16), held in conjunction with ICPP 2016, the 45th International Conference on Parallel Processing, Aug 2016, Philadelphia, United States. 2016, Proceedings of ICPP'2016 workshops (ICPPW'16). 〈hal-01354887〉
