Conference paper, Year: 2004

Hybrid Checkpointing for Parallel Applications in Cluster Federations

Abstract

Cluster federations are well suited to applications such as large-scale code coupling. Because faults may occur very frequently, we use checkpoints so that applications can be restarted. To take into account the constraints introduced by the cluster federation architecture, we propose a hierarchical checkpointing protocol. It uses synchronization inside clusters but only quasi-synchronous methods between clusters. Our protocol has been evaluated by simulation and is a good fit for applications that can be divided into modules with many communications inside modules but few between them.
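
To make the two levels concrete, here is a minimal sketch of the general idea, not the protocol from the paper. It assumes one common index-based, communication-induced rule for the inter-cluster part: each inter-cluster message carries the sender cluster's checkpoint sequence number, and a receiving cluster takes a forced checkpoint before delivery when that number is newer than the last one it saw from that sender. The `Cluster` class, its fields, and the integer "states" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy cluster: all of its processes checkpoint together (coordinated)."""
    cid: int          # this cluster's index in the federation
    n_clusters: int   # number of clusters in the federation
    states: list      # one integer "application state" per process (assumption)
    sn: int = 0       # sequence number of the latest cluster-level checkpoint
    seen: list = field(init=False)         # highest SN seen from each cluster
    checkpoints: list = field(init=False)  # stand-in for stable storage

    def __post_init__(self):
        self.seen = [0] * self.n_clusters
        self.checkpoints = []
        self._save(forced=False)  # initial checkpoint, sequence number 0

    def _save(self, forced):
        # Inside a cluster the checkpoint is synchronized: every process
        # state is captured in the same cluster-level checkpoint.
        self.checkpoints.append({
            "sn": self.sn,
            "forced": forced,
            "states": list(self.states),
            "seen": list(self.seen),
        })

    def checkpoint(self, forced=False):
        """Take a new cluster-level checkpoint and advance the sequence number."""
        self.sn += 1
        self.seen[self.cid] = self.sn
        self._save(forced)

    def send(self):
        """Build an inter-cluster message that piggybacks the current SN."""
        return {"src": self.cid, "sn": self.sn}

    def receive(self, msg, proc, delta):
        """Deliver an inter-cluster message to process `proc`."""
        if msg["sn"] > self.seen[msg["src"]]:
            # Quasi-synchronous part: checkpoint BEFORE delivery, so the
            # saved state does not depend on the sender's newer interval.
            self.checkpoint(forced=True)
            self.seen[msg["src"]] = msg["sn"]
        self.states[proc] += delta  # toy effect of handling the message
```

In this sketch, clusters that exchange few messages rarely force checkpoints on each other, which matches the abstract's observation that the approach suits applications with heavy communication inside modules and little between them.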
Main file
MonMorBad04CCGrid.pdf (105.05 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00000991, version 1 (2006-01-10)
inria-00000991, version 2 (2006-01-11)
inria-00000991, version 3 (2016-03-04)

Identifiers

  • HAL Id: inria-00000991, version 3

Cite

Sébastien Monnet, Christine Morin, Ramamurthy Badrinath. Hybrid Checkpointing for Parallel Applications in Cluster Federations. 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, Apr 2004, Chicago, IL, United States. ⟨inria-00000991v3⟩
355 views
168 downloads
