Journal articles

How to bring together fault tolerance and data consistency to enable grid data sharing

Gabriel Antoniu 1, Jean-François Deverge 1, Sébastien Monnet 1
1 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : This paper addresses the challenge of transparent data sharing within computing grids built as cluster federations. On such platforms, the availability of storage resources may change dynamically, often due to hardware failures. We focus on the problem of keeping replicated data consistent in the presence of failures. We propose a software architecture that decouples consistency management from fault tolerance management. We illustrate this architecture with a case study showing how to design a consistency protocol using fault-tolerant building blocks. As a proof of concept, we describe a prototype implementation of this protocol within JuxMem, an experimental software platform for grid data sharing, and we report on a preliminary experimental evaluation of the proposed approach.

https://hal.inria.fr/inria-00000987
Contributor : Sébastien Monnet
Submitted on : Tuesday, January 10, 2006 - 4:53:28 PM
Last modification on : Thursday, November 15, 2018 - 11:57:12 AM
Long-term archiving on : Saturday, April 3, 2010 - 9:06:35 PM

Identifiers

  • HAL Id : inria-00000987, version 1

Citation

Gabriel Antoniu, Jean-François Deverge, Sébastien Monnet. How to bring together fault tolerance and data consistency to enable grid data sharing. Concurrency and Computation: Practice and Experience, Wiley, 2006, pp.1-19. ⟨inria-00000987v1⟩
