Improving Message Logging Protocols Scalability through Distributed Event Logging

Thomas Ropars 1, Christine Morin 1
1 MYRIADS - Design and Implementation of Autonomous Distributed Systems
IRISA-D1 - SYSTÈMES LARGE ÉCHELLE, Inria Rennes – Bretagne Atlantique
Abstract : Message logging is an attractive solution for providing fault tolerance to message passing applications because it is more scalable than coordinated checkpointing. Sender-based message logging is a well-known optimization that saves message payloads in the sender's memory, so that only the events corresponding to message receptions have to be logged reliably using an event logger. In existing work on message logging, the event logger has always been considered a centralized process, which limits the scalability of message logging protocols. In this paper, we propose a distributed event logger. This new event logger takes advantage of multi-core processors to run in parallel with application processes. It uses the nodes' volatile memory to save events reliably. We propose a simple gossip-based dissemination protocol to make application processes aware of new stable events. We evaluated our distributed event logger in the Open MPI library with an optimistic and a pessimistic message logging protocol. Experiments show that distributed event logging improves the scalability of message logging protocols.
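As a rough illustration of the sender-based logging idea mentioned in the abstract, the following minimal Python sketch keeps each payload in the sender's volatile memory and sends only a small reception event to an event logger. It is not the paper's implementation or the Open MPI code: the names ReceptionEvent, EventLogger, Process, payload_log and the sequence-number fields are all hypothetical, the event logger is a single in-process object standing in for reliable storage, and message delivery is simulated by a direct method call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceptionEvent:
    """Hypothetical record describing one message reception."""
    sender: int
    send_seq: int
    receiver: int
    recv_seq: int


class EventLogger:
    """Stand-in for reliable event storage (assumption: a plain list)."""

    def __init__(self):
        self.stable = []

    def log(self, event):
        # Once stored here, the event is treated as stable.
        self.stable.append(event)
        return True  # acknowledgement back to the receiver


class Process:
    """Toy application process using sender-based message logging."""

    def __init__(self, rank, logger):
        self.rank = rank
        self.logger = logger
        self.send_seq = 0
        self.recv_seq = 0
        # Payloads stay in the sender's volatile memory,
        # keyed by (destination rank, send sequence number).
        self.payload_log = {}

    def send(self, dest, payload):
        self.send_seq += 1
        self.payload_log[(dest.rank, self.send_seq)] = payload
        dest.receive(self.rank, self.send_seq, payload)

    def receive(self, src, send_seq, payload):
        self.recv_seq += 1
        # Only the reception event is logged reliably, not the payload.
        event = ReceptionEvent(src, send_seq, self.rank, self.recv_seq)
        self.logger.log(event)


logger = EventLogger()
p0 = Process(0, logger)
p1 = Process(1, logger)
p0.send(p1, b"first")
p0.send(p1, b"second")
```

After the two sends, p0 holds both payloads while the logger holds only two small events recording the order in which p1 received them. In this sketch the logger is a single object; the abstract's contribution is precisely to replace such a centralized logger with a distributed one.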
Document type :
Conference papers

https://hal.inria.fr/inria-00526097
Contributor : Thomas Ropars
Submitted on : Wednesday, October 13, 2010 - 4:30:29 PM
Last modification on : Friday, November 16, 2018 - 1:40:30 AM

Identifiers

  • HAL Id : inria-00526097, version 1

Citation

Thomas Ropars, Christine Morin. Improving Message Logging Protocols Scalability through Distributed Event Logging. 16th International Euro-Par Conference, Aug 2010, Ischia, Italy. ⟨inria-00526097⟩
