Resilient Workflows for High-Performance Simulation Platforms

Toan Nguyen (1), Laurentiu Trifan (1), Jean-Antoine Désidéri (1)
(1) OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE
CRISAM - Inria Sophia Antipolis - Méditerranée, JAD - Laboratoire Jean Alexandre Dieudonné: UMR6621
Abstract: Workflow systems are considered here to support large-scale multiphysics simulations. Because large distributed and parallel multi-core infrastructures are prone to software and hardware failures, the paper addresses the need for error recovery procedures. It presents a new mechanism based on asymmetric checkpointing and details a rule-based implementation for a distributed workflow platform.
Document type: Conference paper
The 2010 International Conference on High Performance Computing & Simulation (HPCS 2010), Jun 2010, Caen, France. 2010
https://hal.inria.fr/inria-00524612
Contributor: Toan Nguyen
Submitted on: Friday, October 8, 2010 - 12:32:15
Last modified on: Thursday, May 3, 2018 - 13:32:55
Document(s) archived on: Monday, January 10, 2011 - 11:43:08

File

HPCS2010final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00524612, version 1

Citation

Toan Nguyen, Laurentiu Trifan, Jean-Antoine Désidéri. Resilient Workflows for High-Performance Simulation Platforms. The 2010 International Conference on High Performance Computing & Simulation (HPCS 2010), Jun 2010, Caen, France. 2010. 〈inria-00524612〉

Metrics

Record views: 295

File downloads: 967