Lessons from FTM: an Experiment in the Design and Implementation of a Low Cost Fault Tolerant System

Gilles Muller 1 Michel Banâtre 1 Mireille Hue 1 Nadine Peyrouze 1 Bruno Rochat 1
1 SOLIDOR - Design of Distributed Operating Systems
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, INRIA Rennes
Abstract : This report describes an experiment in the design of a general purpose fault tolerant system, FTM. The main objective of the FTM design was to implement a "low-cost" fault tolerant system that could be used on standard workstations. At the operating system level, our goal was to provide a methodology for the design of modular reliable operating systems, while offering fault tolerance transparency to user applications. In other words, porting an application to FTM had only to require compiling the source code without having to modify it. These objectives were achieved using the Mach micro-kernel and a modular set of reliable servers which implement application checkpoints and provide continuous system functions despite machine crashes. At the architectural level, our approach relies on a high performance stable storage implementation, called Stable Transactional Memory (STM), which can be implemented either by hardware or software. We first motivate our design choices, then we detail the FTM implementation at both architectural and operating system level. We comment on the reasons for the evolution of our stable memory technology from hardware to software. Finally, we present a performance evaluation of the FTM prototype. We conclude with lessons learned and give some assessments.
Type de document :
[Research Report] RR-2517, INRIA. 1995
Liste complète des métadonnées

Littérature citée [43 références]  Voir  Masquer  Télécharger

Contributeur : Rapport de Recherche Inria <>
Soumis le : mercredi 24 mai 2006 - 14:38:54
Dernière modification le : mercredi 16 mai 2018 - 11:23:05
Document(s) archivé(s) le : dimanche 4 avril 2010 - 21:54:38



  • HAL Id : inria-00074161, version 1


Gilles Muller, Michel Banâtre, Mireille Hue, Nadine Peyrouze, Bruno Rochat. Lessons from FTM: an Experiment in the Design and Implementation of a Low Cost Fault Tolerant System. [Research Report] RR-2517, INRIA. 1995. 〈inria-00074161〉



Consultations de la notice


Téléchargements de fichiers