Skip to Main content Skip to Navigation
Journal articles

Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms

Anne Benoit 1, 2 Louis-Claude Canon 3 Emmanuel Jeannot 4 Yves Robert 1, 2
2 ROMA - Optimisation des ressources : modèles, algorithmes et ordonnancement
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
3 ALGORILLE - Algorithms for the Grid
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
4 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
Abstract : This paper deals with the reliability of task graph schedules with transient and fail-stop failures. While computing the reliability of a given schedule is easy in the absence of task replication, the problem becomes much more difficult when task replication is used. We fill a complexity gap of the scheduling literature: our main result is that this reliability problem is #P'-Complete (hence at least as hard as NP-Complete problems), both for transient and for fail-stop processor failures. We also study the evaluation of a restricted class of schedules, where a task cannot be scheduled before all replicas of all its predecessors have completed their execution. Although the complexity in this case with fail-stop failures remains open, we provide an algorithm to estimate the reliability while limiting evaluation costs, and we validate this approach through simulations.
Complete list of metadata

https://hal.inria.fr/hal-00653477
Contributor : Louis-Claude Canon <>
Submitted on : Monday, December 19, 2011 - 3:42:30 PM
Last modification on : Friday, February 26, 2021 - 3:28:02 PM
Long-term archiving on: : Friday, November 16, 2012 - 4:00:49 PM

File

JoS_rev1.pdf
Files produced by the author(s)

Identifiers

Collections

Citation

Anne Benoit, Louis-Claude Canon, Emmanuel Jeannot, Yves Robert. Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms. Journal of Scheduling, Springer Verlag, 2012, 15 (5), pp.615-627. ⟨10.1007/s10951-011-0236-y⟩. ⟨hal-00653477⟩

Share

Metrics

Record views

532

Files downloads

613