An architecture for tolerating processor failures in shared-memory multiprocessors

Abstract: This paper addresses the problem of recovering from processor failures in shared-memory multiprocessors. We propose an architecture designed to tolerate processor failures transparently. The recoverable shared memory (RSM) is the main component of this architecture; it provides a hardware-supported backward error recovery mechanism. This technique works with standard caches and cache coherence protocols and avoids rollback propagation. The performance of the architecture during normal execution is evaluated and compared with that of existing fault-tolerant shared-memory multiprocessors. The performance study was conducted by simulation using address traces collected from real parallel applications.
Document type:
Report
[Research Report] RR-1965, INRIA. 1993

https://hal.inria.fr/inria-00074708
Contributor: Rapport de Recherche Inria
Submitted on: Wednesday, 24 May 2006 - 16:06:56
Last modified on: Thursday, 11 January 2018 - 06:21:20
Document(s) archived on: Monday, 5 April 2010 - 00:13:47

Identifiers

  • HAL Id : inria-00074708, version 1

Citation

Michel Banâtre, Alain Gefflaut, Philippe Joubert, Peter Lee, Christine Morin. An architecture for tolerating processor failures in shared-memory multiprocessors. [Research Report] RR-1965, INRIA. 1993. 〈inria-00074708〉