Semias: A Framework for Highly Available and Self-Healing Services in Large Scale Dynamic Distributed Systems

Stefania Costache (1), Thomas Ropars (1), Christine Morin (1)
(1) PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract: Next-generation HPC systems will be large-scale distributed systems spread over wide-area networks. Overlays are used in those systems to provide scalable and fault-tolerant communication mechanisms. In such a context, providing highly available services to users is challenging. In this paper, we present Semias, a framework that provides stateful services with high availability and self-healing. Based on active replication on top of a structured overlay, Semias requires very few modifications to existing services. Its self-healing mechanisms are designed to minimize the number of reconfigurations of replicated services while ensuring high availability. We have used Semias to make the services of the Vigne grid middleware highly available. Experiments run on the Grid'5000 testbed show the performance and self-healing properties of the framework.
Document type: Report

https://hal.inria.fr/inria-00430443
Contributor: Thomas Ropars
Submitted on: Friday, November 6, 2009 - 5:54:48 PM
Last modification on: Friday, November 16, 2018 - 1:24:29 AM
Long-term archiving on: Tuesday, October 16, 2012 - 1:26:16 PM

File

RR7083.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00430443, version 1

Citation

Stefania Costache, Thomas Ropars, Christine Morin. Semias: A Framework for Highly Available and Self-Healing Services in Large Scale Dynamic Distributed Systems. [Research Report] RR-7083, INRIA. 2009. ⟨inria-00430443⟩
