Journal articles

How to bring together fault tolerance and data consistency to enable grid data sharing

Gabriel Antoniu 1 Jean-François Deverge 1 Sébastien Monnet 1
1 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : This paper addresses the challenge of transparent data sharing within computing grids built as cluster federations. On such platforms, the availability of storage resources may change dynamically, often because of hardware failures. We focus on the problem of keeping replicated data consistent in the presence of failures. We propose a software architecture that decouples consistency management from fault-tolerance management. We illustrate this architecture with a case study showing how to design a consistency protocol using fault-tolerant building blocks. As a proof of concept, we describe a prototype implementation of this protocol within JuxMem, an experimental software platform for grid data sharing, and we report on a preliminary experimental evaluation of the proposed approach.

Contributor : Sébastien Monnet
Submitted on : Wednesday, January 11, 2006 - 9:38:55 AM
Last modification on : Friday, February 4, 2022 - 3:22:05 AM
Long-term archiving on : Monday, September 20, 2010 - 2:00:06 PM


  • HAL Id : inria-00000987, version 2


Gabriel Antoniu, Jean-François Deverge, Sébastien Monnet. How to bring together fault tolerance and data consistency to enable grid data sharing. Concurrency and Computation: Practice and Experience, Wiley, 2006, pp. 1-19. ⟨inria-00000987v2⟩


