A Framework for High Availability Based on a Single System Image

Geoffroy Vallée 1, 2, Christine Morin 1, Stephen Scott 2
1 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : High availability (HA) is an important issue in cluster computing today: as clusters grow larger, failures become more frequent. The literature offers many HA strategies for tolerating application failures, whether the applications are sequential or parallel. Unfortunately, these HA policies remain difficult to implement in a real system, so most studies of them are purely theoretical, with no real implementation. A framework that eases the implementation of such policies is therefore of interest. A single system image (SSI), thanks to its mechanisms for the global management of cluster resources, is a good candidate to provide such a framework. This paper presents a preliminary study of such a framework on top of the Kerrighed SSI.
Document type : Reports

https://hal.inria.fr/inria-00070284
Contributor : Rapport de Recherche Inria
Submitted on : Friday, May 19, 2006 - 7:54:53 PM
Last modification on : Monday, December 10, 2018 - 11:34:08 AM
Long-term archiving on : Sunday, April 4, 2010 - 8:50:19 PM

Identifiers

  • HAL Id : inria-00070284, version 1

Citation

Geoffroy Vallée, Christine Morin, Stephen Scott. A Framework for High Availability Based on a Single System Image. [Research Report] RR-5734, INRIA. 2005, pp.10. ⟨inria-00070284⟩
