Static strategies for worksharing with unrecoverable interruptions

Abstract: One has a large workload that is "divisible" (its constituent work's granularity can be adjusted arbitrarily), and one has access to p remote computers that can assist in computing the workload. The problem is that the remote computers are subject to interruptions of known likelihood that kill all work in progress. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. This paper studies strategies for achieving that goal by balancing two competing desires: checkpointing often, in order to decrease the amount of vulnerable work at any point, and avoiding the context-switching that checkpointing requires. Strategies are devised that provably maximize the expected amount of work when there is only one remote computer (the case p = 1). Results suggest the intractability of such maximization for higher values of p, which motivates the development of heuristic approaches. Heuristics are developed that replicate work on several remote computers, in the hope of thereby decreasing the impact of work-killing interruptions. The quality of these heuristics is assessed through exhaustive simulations.
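The trade-off described in the abstract (checkpoint often versus avoid checkpoint overhead) can be illustrated with a small sketch for the case p = 1. Everything below is an assumption made for illustration and is not taken from this record: the function names, the equal-size chunking, the fixed per-chunk overhead, and the risk model in which the probability of having been interrupted by time t is min(1, kappa * t). It is not the provably optimal strategy the abstract refers to.

```python
def expected_work(total, n_chunks, overhead, kappa):
    """Expected completed work when `total` units are split into equal chunks.

    Assumptions (hypothetical, for illustration only):
    - one unit of work takes one unit of time;
    - each chunk pays a fixed `overhead` (checkpoint / context switch);
    - P(interrupted by time t) = min(1, kappa * t);
    - a chunk counts only if it finishes before the interruption.
    """
    chunk = total / n_chunks
    expected = 0.0
    for i in range(1, n_chunks + 1):
        finish_time = i * (chunk + overhead)
        survival = max(0.0, 1.0 - kappa * finish_time)
        expected += chunk * survival
    return expected


def best_chunk_count(total, overhead, kappa, max_chunks=200):
    """Brute-force search for the chunk count with the highest expected work."""
    return max(
        range(1, max_chunks + 1),
        key=lambda n: expected_work(total, n, overhead, kappa),
    )


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 200):
        print(n, round(expected_work(1.0, n, 0.1, 0.5), 4))
    print("best chunk count:", best_chunk_count(1.0, 0.1, 0.5))
```

Under these assumptions, a single large chunk leaves all work vulnerable until the end, while very many small chunks spend most of the available time on overhead; the expected work peaks at an intermediate chunk count.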
Document type:
Conference paper
IPDPS, May 2009, Rome, Italy. 2009, 〈10.1109/IPDPS.2009.5161044〉
Contributor: Equipe Roma
Submitted on: Friday, 29 August 2014 - 16:15:38
Last modified on: Friday, 20 April 2018 - 15:44:27




Anne Benoit, Yves Robert, Arnold Rosenberg, Frédéric Vivien. Static strategies for worksharing with unrecoverable interruptions. IPDPS, May 2009, Rome, Italy. 2009, 〈10.1109/IPDPS.2009.5161044〉. 〈hal-01059271〉


