An architecture for tolerating processor failures in shared-memory multiprocessors

Abstract: This paper addresses the problem of recovering from processor failures in shared-memory multiprocessors. We propose an architecture designed to tolerate processor failures transparently. The recoverable shared memory (RSM) is the main component of this architecture and provides a hardware-supported backward error recovery mechanism. This technique works with standard caches and cache coherence protocols and avoids rollback propagation. The performance of the architecture during normal execution is evaluated and compared with that of existing fault-tolerant shared-memory multiprocessors. The performance study was conducted by simulation, using address traces collected from real parallel applications.
Document type:
[Research Report] RR-1965, INRIA. 1993
Contributor: Rapport de Recherche Inria
Submitted on: Wednesday, May 24, 2006 - 16:06:56
Last modified on: Wednesday, May 16, 2018 - 11:23:14
Document(s) archived on: Monday, April 5, 2010 - 00:13:47



  • HAL Id: inria-00074708, version 1


Michel Banâtre, Alain Gefflaut, Philippe Joubert, Peter Lee, Christine Morin. An architecture for tolerating processor failures in shared-memory multiprocessors. [Research Report] RR-1965, INRIA. 1993. 〈inria-00074708〉


