Active Optimistic Message Logging for Reliable Execution of MPI Applications

Thomas Ropars (1), Christine Morin (1)
(1) PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract: To execute MPI applications reliably, fault tolerance mechanisms are needed. Message logging is a well-known solution for providing fault tolerance to MPI applications. It has been proved to tolerate a higher failure rate than coordinated checkpointing. However, pessimistic and causal message logging can induce high overhead on failure-free execution. In this paper, we present O2P, a new optimistic message logging protocol based on active optimistic message logging. Contrary to existing optimistic message logging protocols, which save dependency information on reliable storage periodically, O2P logs dependency information as soon as possible to reduce the amount of data piggybacked on application messages. It thus reduces the overhead of the protocol on failure-free execution, making it more scalable and simplifying recovery. O2P is implemented as a module of the Open MPI library. Experiments show that active message logging is a promising way to improve the scalability and performance of optimistic message logging.
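The contrast the abstract draws between periodic and as-soon-as-possible logging can be illustrated with a toy model. The sketch below is not O2P and does not use the Open MPI API: the names `Process`, `flush_every`, and `run` are hypothetical, and logging is assumed to be synchronous and instantaneous, whereas a real protocol would log asynchronously and wait for an acknowledgement. It only counts how many not-yet-logged dependency records a sender would have to piggyback on its outgoing messages under each policy.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy sender that piggybacks every dependency record not yet on stable storage."""
    flush_every: int  # hypothetical knob: 1 = log after each delivery, N = log every N deliveries
    unlogged: list = field(default_factory=list)
    piggybacked: int = 0
    deliveries: int = 0

    def deliver(self, record):
        # Each delivered message creates one dependency record to be logged.
        self.unlogged.append(record)
        self.deliveries += 1
        if self.deliveries % self.flush_every == 0:
            # Assumption: stands in for an acknowledged write to reliable storage.
            self.unlogged.clear()

    def send(self):
        # Every record that is still unlogged travels with the outgoing message.
        self.piggybacked += len(self.unlogged)


def run(flush_every, rounds=12):
    """Alternate one delivery and one send; return the total records piggybacked."""
    p = Process(flush_every)
    for i in range(rounds):
        p.deliver(("src", i))
        p.send()
    return p.piggybacked


if __name__ == "__main__":
    print("log after every delivery:", run(1))
    print("log every 4 deliveries:  ", run(4))
```

In this simplified setting, logging after every delivery leaves nothing to piggyback, while periodic logging lets unlogged records accumulate between flushes. Real workloads sit between these two extremes, since logging takes time to complete.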
Document type:
Conference paper
15th International Euro-Par Conference, Aug 2009, Delft, Netherlands. 2009

https://hal.inria.fr/inria-00424002
Contributor: Thomas Ropars
Submitted on: Tuesday, October 13, 2009 - 16:39:07
Last modified on: Thursday, January 11, 2018 - 06:20:10

Identifiers

  • HAL Id: inria-00424002, version 1

Citation

Thomas Ropars, Christine Morin. Active Optimistic Message Logging for Reliable Execution of MPI Applications. 15th International Euro-Par Conference, Aug 2009, Delft, Netherlands. 2009. 〈inria-00424002〉
