Conference papers

CCK: An Improved Coordinated Checkpoint/Rollback Protocol for Dataflow Applications in KAAPI

Xavier Besseron 1 Samir Jafar 1 Thierry Gautier 1 Jean-Louis Roch 1
1 MOAIS - PrograMming and scheduling design fOr Applications in Interactive Simulation
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract : Fault tolerance protocols play an important role in today's long-running scientific parallel applications, because the probability of failure can be significant given the number of unreliable components involved in a simulation. In this paper we present our approach and preliminary results for a new checkpoint/recovery protocol based on a coordinated scheme. The protocol is tightly coupled to the availability of an abstract representation of the execution.

https://hal.inria.fr/hal-00684864
Contributor : Ist Rennes
Submitted on : Tuesday, April 3, 2012 - 1:24:36 PM
Last modification on : Thursday, October 21, 2021 - 3:53:33 AM

Citation

Xavier Besseron, Samir Jafar, Thierry Gautier, Jean-Louis Roch. CCK: An Improved Coordinated Checkpoint/Rollback Protocol for Dataflow Applications in KAAPI. ICTTA'06 IEEE Conference on Information and Communication Technologies: from Theory to Applications, Apr 2006, Damascus, Syria. ⟨10.1109/ICTTA.2006.1684955⟩. ⟨hal-00684864⟩
