A different re-execution speed can help - Inria - Institut national de recherche en sciences et technologies du numérique
Report (Research Report) Year: 2016

A different re-execution speed can help

Abstract

We consider divisible load scientific applications executing on large-scale platforms subject to silent errors. While the goal is usually to complete the execution as fast as possible in expectation, another major concern is energy consumption. The use of dynamic voltage and frequency scaling (DVFS) can help save energy, but at the price of performance degradation. Consider the execution model where a set of $K$ different speeds is given, and whenever a failure occurs, a different re-execution speed may be used. Can this help? We address the following bi-criteria problem: how to compute the optimal checkpointing period to minimize energy consumption while bounding the degradation in performance. We solve this bi-criteria problem by providing a closed-form solution for the checkpointing period, and demonstrate via a comprehensive set of experiments that a different re-execution speed can indeed help.
Main file
RR-8888.pdf (2.95 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01297125, version 1 (02-04-2016)

Identifiers

  • HAL Id: hal-01297125, version 1

Cite

Anne Benoit, Aurélien Cavelan, Valentin Le Fèvre, Yves Robert, Hongyang Sun. A different re-execution speed can help. [Research Report] RR-8888, INRIA Grenoble - Rhone-Alpes. 2016. ⟨hal-01297125⟩