Common Mechanisms for Supporting Fault Tolerance in DSM and Message Passing Systems - Inria - Institut national de recherche en sciences et technologies du numérique
Report (Research Report) Year: 2002

Common Mechanisms for Supporting Fault Tolerance in DSM and Message Passing Systems

Abstract

Backward error recovery, which involves checkpointing and restarting tasks, is an important component of any system that provides fault tolerance to applications distributed over a network. A central problem in checkpointing and recovery is tracking dependencies so as to arrive at a consistent global checkpoint. Traditionally, the literature treats either distributed shared memory (DSM) or message passing, but not both, as the interprocess communication (IPC) mechanism when considering fault tolerance. This paper describes a preliminary investigation into common mechanisms that can be implemented to support a wide variety of protocols in both shared memory and message passing systems. In effect, these mechanisms can be used in a system that combines both IPC mechanisms.
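To make the notion of a consistent global checkpoint concrete, here is a minimal sketch in Python. It is not taken from the report: the `Message` record, the interval numbering, and both function names are assumptions made for illustration. Each process takes checkpoints numbered 0, 1, 2, ..., and interval i is the execution between checkpoint i and checkpoint i+1. A set of checkpoints is treated as consistent when no message is recorded as received without also being recorded as sent. A DSM read of a value written by another process could plausibly be modeled as the same kind of dependency, which is again an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical dependency record (not from the report)."""
    sender: str
    send_interval: int   # interval in which the sender sent the message
    receiver: str
    recv_interval: int   # interval in which the receiver delivered it


def is_orphan(cut, m):
    # An event in interval i is part of checkpoint c exactly when i < c.
    # A message is an orphan if its receive is in the cut but its send is not.
    received = m.recv_interval < cut[m.receiver]
    sent = m.send_interval < cut[m.sender]
    return received and not sent


def is_consistent(cut, messages):
    """cut maps each process name to the index of its chosen checkpoint."""
    return not any(is_orphan(cut, m) for m in messages)


def latest_consistent_cut(latest, messages):
    """Roll receivers back until no orphan message remains.

    Starts from each process's latest checkpoint. Indices only decrease
    and are bounded below by 0, so the loop terminates.
    """
    cut = dict(latest)
    changed = True
    while changed:
        changed = False
        for m in messages:
            if is_orphan(cut, m):
                # Restart the receiver from the checkpoint that precedes
                # the orphan receive.
                cut[m.receiver] = m.recv_interval
                changed = True
    return cut


# P sends after its checkpoint 1; Q receives before its checkpoint 1.
msgs = [Message("P", 1, "Q", 0)]
latest = {"P": 1, "Q": 1}
recovery_line = latest_consistent_cut(latest, msgs)
```

In this example the latest checkpoints of P and Q are not consistent with each other, so Q is rolled back to its initial checkpoint while P keeps checkpoint 1. Adding a message in the opposite direction can force P back as well, which is the rollback propagation that dependency tracking is meant to bound.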
Main file
RR-4613.pdf (223.41 KB)

Dates and versions

inria-00071972, version 1 (23-05-2006)

Identifiers

  • HAL Id: inria-00071972, version 1

Cite

Ramamurthy Badrinath, Christine Morin. Common Mechanisms for Supporting Fault Tolerance in DSM and Message Passing Systems. [Research Report] RR-4613, INRIA. 2002. ⟨inria-00071972⟩