Reports (Research Report) Year: 2003

Checkpointing and Recovery of Shared Memory Parallel Applications in a Cluster

Abstract

This paper describes issues in the design and implementation of checkpointing and recovery modules for the Kerrighed DSM cluster system. Our design targets a DSM that supports the sequential consistency model. The mechanisms are general enough to be used in a number of different checkpointing and recovery protocols. They are designed to support common performance optimizations suggested in the literature while remaining lightweight during fault-free execution. We also present preliminary performance results of the current implementation.

Domains

Other [cs.OH]
Main file
RR-4806.pdf (248.13 KB)

Dates and versions

inria-00071780, version 1 (23-05-2006)

Identifiers

  • HAL Id: inria-00071780, version 1

Cite

Ramamurthy Badrinath, Christine Morin, Geoffroy Vallée. Checkpointing and Recovery of Shared Memory Parallel Applications in a Cluster. [Research Report] RR-4806, INRIA. 2003. ⟨inria-00071780⟩