
A Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System

Christine Morin 1, Renaud Lottiaux 1, Anne-Marie Kermarrec 2
1 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : A Parallel Single Level Store (PSLS) system integrates a shared virtual memory and a parallel file system. By managing data globally, such systems give programmers of scientific applications the attractive shared-memory programming model combined with a large and efficient file system in a cluster. In this paper, we present a cheap and efficient two-level checkpointing approach that enables a PSLS to tolerate failures. The first-level checkpointing algorithm is very efficient and saves data in memory, but it requires a large amount of memory space. When memories are saturated, an alternative algorithm that saves a checkpoint on disk is used. Performance results show the impact of different variants of the checkpointing algorithms.
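
As a rough illustration of the two-level idea summarized in the abstract (checkpoint in memory while space allows, fall back to disk once memory is saturated), here is a minimal single-process sketch. It is not the algorithm from the report, whose details the abstract does not give. The class `TwoLevelCheckpointer`, the `memory_budget_bytes` limit, the per-page granularity, and the file naming are all hypothetical choices made for this example.

```python
import os
import pickle


class TwoLevelCheckpointer:
    """Toy two-level checkpoint store: memory first, disk when memory is full.

    Assumption: a fixed byte budget stands in for "memories are saturated".
    """

    def __init__(self, memory_budget_bytes, disk_dir):
        self.memory_budget_bytes = memory_budget_bytes  # hypothetical limit
        self.disk_dir = disk_dir
        self.in_memory = {}   # page_id -> serialized copy (level 1)
        self.memory_used = 0  # bytes currently held in level 1

    def _disk_path(self, page_id):
        return os.path.join(self.disk_dir, f"page_{page_id}.ckpt")

    def save(self, page_id, data):
        """Checkpoint one page and return the level used: 'memory' or 'disk'."""
        blob = pickle.dumps(data)
        old = self.in_memory.get(page_id)
        freed = len(old) if old is not None else 0

        # Level 1: keep the copy in memory if it fits within the budget.
        if self.memory_used - freed + len(blob) <= self.memory_budget_bytes:
            self.in_memory[page_id] = blob
            self.memory_used += len(blob) - freed
            path = self._disk_path(page_id)
            if os.path.exists(path):
                os.remove(path)  # drop a stale level-2 copy of this page
            return "memory"

        # Level 2: memory is saturated, so write the copy to disk instead.
        if old is not None:
            del self.in_memory[page_id]
            self.memory_used -= freed
        with open(self._disk_path(page_id), "wb") as f:
            f.write(blob)
        return "disk"

    def restore(self, page_id):
        """Return the last checkpointed value of a page from either level."""
        blob = self.in_memory.get(page_id)
        if blob is None:
            with open(self._disk_path(page_id), "rb") as f:
                blob = f.read()
        return pickle.loads(blob)
```

With a small budget, a first small page lands in memory and a later large page is written to disk, while `restore` reads back from whichever level holds the copy. A real cluster system would also have to coordinate checkpoints across nodes and survive the loss of a node's memory, which this sketch does not attempt.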

Cited literature : 17 references

Contributor : Rapport de Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 10:15:06 AM
Last modification on : Friday, February 4, 2022 - 3:12:50 AM
Long-term archiving on : Sunday, April 4, 2010 - 11:12:52 PM


  • HAL Id : inria-00072547, version 1


Christine Morin, Renaud Lottiaux, Anne-Marie Kermarrec. A Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System. [Research Report] RR-4086, INRIA. 2000. ⟨inria-00072547⟩


