On the complexity of scheduling checkpoints for computational workflows

This paper deals with the complexity of scheduling computational workflows in the presence of Exponentially distributed failures. When such a failure occurs, rollback and recovery are used so that the execution can resume from the last checkpointed state. The goal is to minimize the expected execution time, which requires deciding in which order to execute the tasks and whether to checkpoint after the completion of each task. We show that this scheduling problem is strongly NP-complete, and propose a polynomial-time dynamic programming algorithm for the case where the application graph is a linear chain. These results lay the theoretical foundations of the problem, and are a prerequisite for discussing scheduling strategies for arbitrary DAGs of moldable tasks subject to general failure distributions.
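The linear-chain case mentioned in the abstract can be illustrated with a short sketch. It is not taken from the paper's text. It assumes the closed form commonly used for Exponential failures of rate `lam`, namely that a segment of work `W` followed by a checkpoint of cost `C`, with recovery cost `R` and downtime `D`, takes `exp(lam*R) * (1/lam + D) * (exp(lam*(W+C)) - 1)` in expectation. It also assumes that a checkpoint is always taken after the last task. All function and parameter names (`expected_segment_time`, `schedule_chain`, `initial_recovery`, and so on) are hypothetical.

```python
import math


def expected_segment_time(work, ckpt, recovery, downtime, lam):
    """Expected time to run `work` then a checkpoint of cost `ckpt`.

    Assumed closed form for Exponential(lam) failures; `recovery` is the cost
    of restarting from the checkpoint that precedes the segment.
    """
    return (math.exp(lam * recovery) * (1.0 / lam + downtime)
            * math.expm1(lam * (work + ckpt)))


def schedule_chain(weights, ckpt_costs, rec_costs, lam,
                   downtime=0.0, initial_recovery=0.0):
    """O(n^2) dynamic program over a linear chain of n tasks.

    best[i] is the minimum expected time to finish tasks 1..i when a
    checkpoint is taken right after task i. Returns (best[n], checkpoints),
    where checkpoints lists the 1-based task indices followed by a checkpoint.
    """
    n = len(weights)
    prefix = [0.0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    best = [0.0] + [math.inf] * n
    prev = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):  # j = index of the previous checkpoint (0 = start)
            rec = initial_recovery if j == 0 else rec_costs[j - 1]
            cand = best[j] + expected_segment_time(
                prefix[i] - prefix[j], ckpt_costs[i - 1], rec, downtime, lam)
            if cand < best[i]:
                best[i], prev[i] = cand, j

    # Walk back through the stored predecessors to recover the decisions.
    checkpoints = []
    i = n
    while i > 0:
        checkpoints.append(i)
        i = prev[i]
    checkpoints.reverse()
    return best[n], checkpoints
```

Because Exponential failures are memoryless, the expected times of consecutive segments simply add up, which is what makes the recurrence over the position of the previous checkpoint valid in this sketch.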

Domains

Data Structures and Algorithms [cs.DS]; Distributed, Parallel, and Cluster Computing [cs.DC]

Main file

Complexity-Scheduling-Zaidouni.pdf (381.98 KB)

Origin: Files produced by the author(s)

Contributor: Equipe Roma

https://inria.hal.science/hal-00763382

Submitted on: Monday, December 10, 2012, 16:21:37

Last modified on: Thursday, February 15, 2024, 03:30:51

Long-term archiving on: Saturday, December 17, 2016, 23:48:19

Dates and versions

hal-00763382 , version 1 (10-12-2012)

Identifiers

HAL Id: hal-00763382, version 1
DOI: 10.1109/DSNW.2012.6264675

Cite

Yves Robert, Frédéric Vivien, Dounia Zaidouni. On the complexity of scheduling checkpoints for computational workflows. FTXS'2012, the Workshop on Fault-Tolerance for HPC at Extreme Scale, in conjunction with the 42nd Annual IEEE/IFIP Int. Conf. on Dependable Systems and Networks (DSN 2012), 2012, Boston, United States. ⟨10.1109/DSNW.2012.6264675⟩. ⟨hal-00763382⟩

Collections

ENS-LYON UNIV-RENNES1 CNRS INRIA UNIV-LYON1 IRISA INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UDL UR1-MATH-NUM

141 views

192 downloads