Lessons from FTM: an Experiment in the Design and Implementation of a Low Cost Fault Tolerant System

Gilles Muller 1 Michel Banâtre 1 Mireille Hue 1 Nadine Peyrouze 1 Bruno Rochat 1
1 SOLIDOR - Design of Distributed Operating Systems
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, INRIA Rennes
Abstract : This report describes an experiment in the design of a general-purpose fault-tolerant system, FTM. The main objective of the FTM design was to implement a "low-cost" fault-tolerant system that could be used on standard workstations. At the operating system level, our goal was to provide a methodology for the design of modular, reliable operating systems while making fault tolerance transparent to user applications. In other words, porting an application to FTM should require only compiling its source code, without modifying it. These objectives were achieved using the Mach micro-kernel and a modular set of reliable servers that implement application checkpoints and provide continuous system functions despite machine crashes. At the architectural level, our approach relies on a high-performance stable storage implementation, called Stable Transactional Memory (STM), which can be implemented in either hardware or software. We first motivate our design choices, then detail the FTM implementation at both the architectural and operating system levels. We comment on the reasons for the evolution of our stable memory technology from hardware to software. Finally, we present a performance evaluation of the FTM prototype. We conclude with lessons learned and give some assessments.
Document type :
Reports

Cited literature : 43 references

https://hal.inria.fr/inria-00074161
Contributor : Rapport de Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 2:38:54 PM
Last modification on : Thursday, February 11, 2021 - 2:48:05 PM
Long-term archiving on : Sunday, April 4, 2010 - 9:54:38 PM

Identifiers

  • HAL Id : inria-00074161, version 1

Citation

Gilles Muller, Michel Banâtre, Mireille Hue, Nadine Peyrouze, Bruno Rochat. Lessons from FTM: an Experiment in the Design and Implementation of a Low Cost Fault Tolerant System. [Research Report] RR-2517, INRIA. 1995. ⟨inria-00074161⟩

Metrics

Record views : 353
File downloads : 626