Conference papers

Resilient Workflows for High-Performance Simulation Platforms

Toan Nguyen 1 Laurentiu Trifan 1 Jean-Antoine Désidéri 1 
1 OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE
CRISAM - Inria Sophia Antipolis - Méditerranée, JAD - Laboratoire Jean Alexandre Dieudonné : UMR6621
Abstract : Workflow systems are considered here as support for large-scale multiphysics simulations. Because large distributed and parallel multi-core infrastructures are prone to software and hardware failures, the paper addresses the need for error recovery procedures. A new mechanism based on asymmetric checkpointing is presented, and a rule-based implementation for a distributed workflow platform is detailed.

https://hal.inria.fr/inria-00524612
Contributor : Toan Nguyen
Submitted on : Friday, October 8, 2010 - 12:32:15 PM
Last modification on : Saturday, June 25, 2022 - 11:04:54 PM
Long-term archiving on : Monday, January 10, 2011 - 11:43:08 AM

File

HPCS2010final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00524612, version 1

Citation

Toan Nguyen, Laurentiu Trifan, Jean-Antoine Désidéri. Resilient Workflows for High-Performance Simulation Platforms. The 2010 International Conference on High Performance Computing & Simulation (HPCS 2010), Jun 2010, Caen, France. ⟨inria-00524612⟩

Metrics

  • Record views : 118
  • File downloads : 812