Conference papers

Robust Workflows for Large-Scale Multiphysics Simulation

Toan Nguyen 1,*, Laurentiu Trifan 1, Jean-Antoine Desideri 1
* Corresponding author
1 OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE
CRISAM - Inria Sophia Antipolis - Méditerranée , JAD - Laboratoire Jean Alexandre Dieudonné : UMR6621
Abstract : Large-scale simulations, e.g. fluid-structure interactions and aeroacoustic noise generation, require substantial computing power, visualization systems and high-end storage capacity. Because 3D multi-physics simulations also run long processes on large datasets, an important issue is the robustness of the computing systems involved, i.e., the ability to resume inadvertently aborted computations. A new approach is presented here to handle application failures. It is based on extensions of the bracketing checkpoints usually implemented in database and transactional systems. An asymmetric scheme is devised to reduce the number of checkpoints required to safely restart aborted applications when unexpected failures occur. The tasks are controlled by a workflow graph that can be deployed on various distributed platforms and high-performance infrastructures. An automated bracketing process inserts checkpoints at critical execution points in the workflow graph. The checkpoints are inserted using a heuristic process based on an evolving set of rules. Preliminary tests show that the number of checkpoints, and hence the overhead incurred by the checkpointing mechanism, can be significantly reduced, improving application performance while preserving resilience.
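As a rough illustration of rule-driven checkpoint placement in a workflow graph, the sketch below marks only those tasks that satisfy at least one rule, instead of checkpointing after every task. It is not the authors' bracketing algorithm or their asymmetric scheme: the `Task` structure, the two rules (`long_running`, `is_join`), the runtime threshold and the example workflow are all hypothetical, chosen only to show how a small, extensible rule set can reduce the number of checkpoints.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Task:
    """A node of the workflow graph (hypothetical structure)."""
    name: str
    est_runtime_s: float                      # assumed runtime estimate, in seconds
    successors: List[str] = field(default_factory=list)


Graph = Dict[str, Task]
Rule = Callable[[Task, Graph], bool]


def long_running(task: Task, graph: Graph, threshold_s: float = 3600.0) -> bool:
    """Hypothetical rule: losing a long task is costly, so save its result."""
    return task.est_runtime_s >= threshold_s


def is_join(task: Task, graph: Graph) -> bool:
    """Hypothetical rule: a task with several predecessors merges branches."""
    predecessors = sum(task.name in other.successors for other in graph.values())
    return predecessors > 1


def place_checkpoints(graph: Graph, rules: List[Rule]) -> Set[str]:
    """Return the tasks after which a checkpoint is taken.

    A task is selected as soon as any rule fires; the rule list can grow
    over time without changing this function.
    """
    return {
        name
        for name, task in graph.items()
        if any(rule(task, graph) for rule in rules)
    }


# Hypothetical fluid-structure workflow: mesh -> (cfd, structure) -> coupling -> post
WORKFLOW: Graph = {
    "mesh": Task("mesh", 120.0, ["cfd", "structure"]),
    "cfd": Task("cfd", 7200.0, ["coupling"]),
    "structure": Task("structure", 1800.0, ["coupling"]),
    "coupling": Task("coupling", 600.0, ["post"]),
    "post": Task("post", 60.0),
}

RULES: List[Rule] = [long_running, is_join]
CHECKPOINTS = place_checkpoints(WORKFLOW, RULES)
```

With these assumed rules, only `cfd` (long-running) and `coupling` (join of two branches) are checkpointed, compared with five checkpoints if every task were followed by one.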

https://hal.inria.fr/inria-00524660
Contributor : Toan Nguyen
Submitted on : Friday, October 8, 2010 - 2:33:14 PM
Last modification on : Saturday, June 25, 2022 - 11:04:54 PM
Long-term archiving on : Monday, January 10, 2011 - 11:08:17 AM

File

CFD2010paper_Nguyen.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00524660, version 1

Citation

Toan Nguyen, Laurentiu Trifan, Jean-Antoine Desideri. Robust Workflows for Large-Scale Multiphysics Simulation. Fifth European Conference on Computational Fluid Dynamics, Jun 2010, Lisbonne, Portugal. ⟨inria-00524660⟩
