Communicating processes and fault tolerance : a shared memory multiprocessor experience

Michel Banâtre 1 Maurice Jégado 1 Philippe Joubert 1 Christine Morin 1
1 LSP - Langages et Systèmes Parallèles
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires
Abstract : The concept of backward recovery is now well established as a means of restoring a consistent state of a fault-tolerant system should faults occur. In this paper, we consider a system of communicating processes mapped onto a multilevel execution support. A shared memory multiprocessor machine is assumed. Our interest is in tolerating the hardware faults that may occur during the execution of a concurrent computation. The machine provides a hardware backward recovery protocol based on a specialized memory device which tracks dependencies between the processors accessing shared data residing in memory. The transparency provided by the protocol is discussed by considering, in turn, the models of computation at the various levels of abstraction of the execution support.
Document type : Research Report
Contributor : Rapport de Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 4:52:58 PM
Last modification on : Thursday, January 7, 2021 - 4:28:14 PM
Long-term archiving on : Tuesday, April 12, 2011 - 8:04:11 PM


  • HAL Id : inria-00074911, version 1


Michel Banâtre, Maurice Jégado, Philippe Joubert, Christine Morin. Communicating processes and fault tolerance : a shared memory multiprocessor experience. [Research Report] RR-1649, INRIA. 1992. ⟨inria-00074911⟩


