Robust Workflows for Large-Scale Multiphysics Simulation

Toan Nguyen 1, *, Laurentiu Trifan 1, Jean-Antoine Desideri 1
* Corresponding author
1 OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE
CRISAM - Inria Sophia Antipolis - Méditerranée , JAD - Laboratoire Jean Alexandre Dieudonné : UMR6621
Abstract: Large-scale simulations, e.g., fluid-structure interactions and aeroacoustic noise generation, require substantial computing power, visualization systems and high-end storage capacity. Because 3D multiphysics simulations also run long processes on large datasets, an important issue is the robustness of the computing systems involved, i.e., the ability to resume inadvertently aborted computations. A new approach is presented here to handle application failures. It is based on extensions of the bracketing checkpoints usually implemented in database and transactional systems. An asymmetric scheme is devised to reduce the number of checkpoints required to safely restart aborted applications when unexpected failures occur. The tasks are controlled by a workflow graph that can be deployed on various distributed platforms and high-performance infrastructures. An automated bracketing process inserts checkpoints at critical execution points in the workflow graph. The checkpoints are inserted using a heuristic process based on an evolving set of rules. Preliminary tests show that the number of checkpoints, and hence the overhead incurred by the checkpointing mechanism, can be significantly reduced, improving application performance while preserving its resilience.
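The abstract does not list the rules used to place checkpoints. As a purely illustrative sketch of rule-based checkpoint placement in a workflow graph, the Python below assumes two hypothetical rules (checkpoint after expensive tasks, and after tasks that join several branches). The names `Task`, `is_expensive`, `is_join` and `select_checkpoints`, the cost values and the threshold are all assumptions made for illustration, not the authors' implementation or their asymmetric bracketing scheme.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """A workflow node; 'cost' is a hypothetical run-time estimate."""
    name: str
    cost: float
    successors: List[str] = field(default_factory=list)


Workflow = Dict[str, Task]
Rule = Callable[[Task, Workflow], bool]


def is_expensive(threshold: float) -> Rule:
    """Hypothetical rule: checkpoint after tasks costing at least 'threshold'."""
    def rule(task: Task, workflow: Workflow) -> bool:
        return task.cost >= threshold
    return rule


def is_join(task: Task, workflow: Workflow) -> bool:
    """Hypothetical rule: checkpoint after tasks with several predecessors."""
    predecessors = sum(task.name in other.successors for other in workflow.values())
    return predecessors > 1


def select_checkpoints(workflow: Workflow, rules: List[Rule]) -> List[str]:
    """Return the tasks after which at least one rule requests a checkpoint."""
    return [
        name
        for name, task in workflow.items()
        if any(rule(task, workflow) for rule in rules)
    ]


# Toy fluid-structure workflow (illustrative values only).
workflow = {
    "mesh": Task("mesh", 1.0, ["fluid", "structure"]),
    "fluid": Task("fluid", 50.0, ["couple"]),
    "structure": Task("structure", 20.0, ["couple"]),
    "couple": Task("couple", 2.0, ["post"]),
    "post": Task("post", 0.5),
}

selected = select_checkpoints(workflow, [is_expensive(10.0), is_join])
print(selected)  # ['fluid', 'structure', 'couple']
```

In this toy graph, three of the five tasks receive a checkpoint instead of all five, which is the kind of reduction the abstract refers to. Because the rules are plain functions held in a list, the rule set can be changed without touching the selection logic.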
Document type:
Conference paper
Fifth European Conference on Computational Fluid Dynamics, Jun 2010, Lisbon, Portugal. 2010

Cited literature [31 references]
Contributor: Toan Nguyen
Submitted on: Friday, October 8, 2010 - 14:33:14
Last modified on: Thursday, May 3, 2018 - 13:32:55
Document(s) archived on: Monday, January 10, 2011 - 11:08:17


Files produced by the author(s)


  • HAL Id: inria-00524660, version 1



Toan Nguyen, Laurentiu Trifan, Jean-Antoine Desideri. Robust Workflows for Large-Scale Multiphysics Simulation. Fifth European Conference on Computational Fluid Dynamics, Jun 2010, Lisbon, Portugal. 2010. 〈inria-00524660〉


