# Static Strategies for Worksharing with Unrecoverable Interruptions (Extended version)

1 GRAAL - Algorithms and Scheduling for Distributed Heterogeneous Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract : One has a large workload that is "divisible" (its constituent work's granularity can be adjusted arbitrarily), and one has access to p remote computers that can assist in computing the workload. How can one best utilize these computers? Complicating this question is the fact that each remote computer is subject to interruptions (of known likelihood) that kill all work in progress on it. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. Strategies are presented for achieving this goal, by balancing the desire to checkpoint often, thereby decreasing the amount of vulnerable work at any point, against the desire to avoid the context-switching required to checkpoint. The current study demonstrates the accessibility of strategies that provably maximize the expected amount of work when there is only one remote computer (the case p=1) and, at least in an asymptotic sense, when there are two remote computers (the case p=2); but the study strongly suggests the intractability of exact maximization for p ≥ 2 computers. This study responds to that challenge by developing efficient heuristics that employ both checkpointing and work replication as mechanisms for decreasing the impact of work-killing interruptions. The quality of these heuristics, in expected amount of work completed, is assessed through exhaustive simulations that use both idealized models and actual trace data.
Document type :
Reports (Research report)

Cited literature [40 references]

https://hal.inria.fr/inria-00413977
Contributor : Frédéric Vivien
Submitted on : Monday, September 7, 2009 - 3:03:00 PM
Last modification on : Friday, November 18, 2022 - 9:24:52 AM

### Files

RR-7029.pdf
Files produced by the author(s)

### Identifiers

• HAL Id : inria-00413977, version 1

### Citation

Anne Benoit, Yves Robert, Arnold Rosenberg, Frédéric Vivien. Static Strategies for Worksharing with Unrecoverable Interruptions (Extended version). [Research Report] RR-7029, LIP RR-2008-29, INRIA, LIP. 2009, pp.132. ⟨inria-00413977⟩