
A Javaspace-based Framework for Efficient Fault-Tolerant Master-Worker Distributed Applications

Constantinos Makassikis 1 Virginie Galtier 2 Stéphane Vialle 1, 2 
1 ALGORILLE - Algorithms for the Grid
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : We propose a framework built around a JavaSpace to ease the development of bag-of-tasks applications. The framework can optionally and automatically tolerate transient crash failures occurring on any of the distributed elements. To do so, it relies on checkpointing and on mechanisms of the underlying middleware. To further improve checkpointing efficiency, in both size and frequency, the programmer can introduce intermediate user-defined checkpoint data and code within the task processing program. When used without fault tolerance, the framework accelerates application development, introduces no runtime overhead, and yields the expected speedup. With fault tolerance enabled, the framework completes applications correctly despite failures, with limited runtime and data storage overheads. Experiments run with up to 128 workers study the impact of some user-related and implementation-related parameters on overall performance, and show good performance for classical JavaSpace-based master-worker application profiles.
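
The bag-of-tasks pattern with intermediate checkpoints described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework and does not use the JavaSpaces API: an in-memory `BlockingQueue` stands in for the JavaSpace, a `ConcurrentHashMap` stands in for checkpoint storage, and a caught exception followed by re-insertion of the task stands in for a worker crash and the middleware returning the task to the space. All names (`BagOfTasksSketch`, `Task`, `Checkpoint`, `run`, `process`) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class BagOfTasksSketch {
    // Hypothetical task: sum the integers in [lo, hi).
    static final class Task {
        final int id; final long lo; final long hi;
        Task(int id, long lo, long hi) { this.id = id; this.lo = lo; this.hi = hi; }
    }

    // Hypothetical user-defined checkpoint: next value to process and the partial sum so far.
    static final class Checkpoint {
        final long next; final long partial;
        Checkpoint(long next, long partial) { this.next = next; this.partial = partial; }
    }

    // Master: fills the bag, starts the workers, and aggregates the results.
    static long run(int nTasks, long chunk, int nWorkers, boolean injectCrash)
            throws InterruptedException {
        BlockingQueue<Task> bag = new LinkedBlockingQueue<>();        // stand-in for the JavaSpace
        Map<Integer, Checkpoint> checkpoints = new ConcurrentHashMap<>();
        Map<Integer, Long> results = new ConcurrentHashMap<>();
        AtomicBoolean crashPending = new AtomicBoolean(injectCrash);  // at most one simulated crash

        for (int i = 0; i < nTasks; i++) {
            bag.add(new Task(i, i * chunk, (i + 1) * chunk));
        }

        Runnable worker = () -> {
            Task t;
            while ((t = bag.poll()) != null) {
                try {
                    results.put(t.id, process(t, checkpoints, crashPending));
                } catch (IllegalStateException simulatedCrash) {
                    bag.add(t); // assumption: models the task reappearing in the space after a failure
                }
            }
        };

        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        for (int w = 0; w < nWorkers; w++) {
            pool.submit(worker);
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return results.values().stream().mapToLong(Long::longValue).sum();
    }

    // Worker-side processing: resumes from the last checkpoint if one exists.
    static long process(Task t, Map<Integer, Checkpoint> checkpoints, AtomicBoolean crashPending) {
        Checkpoint cp = checkpoints.getOrDefault(t.id, new Checkpoint(t.lo, 0L));
        long sum = cp.partial;
        long mid = t.lo + (t.hi - t.lo) / 2;
        for (long x = cp.next; x < t.hi; x++) {
            if (x == mid) {
                // Intermediate checkpoint taken halfway through the task.
                checkpoints.put(t.id, new Checkpoint(x, sum));
                if (crashPending.compareAndSet(true, false)) {
                    throw new IllegalStateException("simulated transient crash");
                }
            }
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        long withoutCrash = run(8, 1000L, 4, false);
        long withCrash = run(8, 1000L, 4, true);
        System.out.println(withoutCrash + " " + withCrash);
    }
}
```

In this sketch the task interrupted by the simulated crash restarts from its midpoint checkpoint rather than from the beginning, which is the kind of saving the abstract attributes to intermediate user-defined checkpoints; the final sum is the same with or without the crash.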
Contributor : Constantinos Makassikis
Submitted on : Wednesday, November 9, 2011 - 2:46:22 PM
Last modification on : Wednesday, February 2, 2022 - 3:59:39 PM
Long-term archiving on : Friday, February 10, 2012 - 2:30:47 AM


  • HAL Id : inria-00548951, version 2


Constantinos Makassikis, Virginie Galtier, Stéphane Vialle. A Javaspace-based Framework for Efficient Fault-Tolerant Master-Worker Distributed Applications. [Research Report] RR-7496, INRIA. 2010, pp.18. ⟨inria-00548951v2⟩


