Journal articles

Independent Checkpointing in a Heterogeneous Grid Environment

Abstract: The EU-funded XtreemOS project implements an open-source grid operating system based on Linux. To provide fault tolerance and migration for grid applications, it integrates a distributed grid-checkpointing service called XtreemGCP. This service is designed to support different checkpointing protocols and to address the underlying grid-node checkpointers (e.g., BLCR, LinuxSSI, OpenVZ) transparently through a uniform interface. In this paper, we present the integration of an independent checkpointing and rollback-recovery protocol into XtreemGCP. The solution we propose is not bound to a particular checkpointer and can therefore be used transparently on top of any grid-node checkpointer. To evaluate the prototype, we run it in a heterogeneous environment composed of single-PC nodes and a Single System Image (SSI) cluster. The experimental results demonstrate that the XtreemGCP service can integrate different checkpointing protocols and independently checkpoint a distributed application within a heterogeneous grid environment. Moreover, the performance evaluation shows that our solution outperforms the existing coordinated checkpointing protocol in terms of scalability.
Contributor: Eugen Feller
Submitted on: Monday, July 4, 2011 - 6:04:31 PM
Last modification on: Thursday, January 20, 2022 - 4:15:30 PM




Eugen Feller, John Mehnert-Spahn, Michael Schoettner, Christine Morin. Independent Checkpointing in a Heterogeneous Grid Environment. Future Generation Computer Systems, 2012, 28 (1), pp.163-170. ⟨10.1016/j.future.2011.03.012⟩. ⟨inria-00605914⟩


