Towards resilient parallel linear Krylov solvers: recover-restart strategies - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Rapport (Rapport De Recherche) Année : 2013

Towards resilient parallel linear Krylov solvers: recover-restart strategies

Résumé

The advent of extreme scale machines will require the use of parallel resources at an unprecedented scale, probably leading to a high rate of hardware faults. High Performance Computing (HPC) applications that aim at exploiting all these resources will thus need to be resilient, \emph{i.e.}, be able to compute a correct solution in presence of faults. In this work, we investigate possible remedies in the framework of the solution of large sparse linear systems that is often the inner most numerical kernel in many scientific and engineering applications and also one of the most time consuming part. More precisely, we present recovery followed by restarting strategies in the framework of Krylov subspace solvers where lost entries of the iterate are interpolated to define a new initial guess before restarting. In particular, we consider two interpolation policies that preserve key numerical properties of well-known solvers, namely the monotony decrease of the A-norm of the error of the conjugate gradient (CG) or the residual norm decrease of GMRES. We assess the impact of the recovery method, the fault rate and the number of processors on the robustness of the resulting linear solvers. We consider experiments with CG, GMRES and Bi-CGStab.
Fichier principal
Vignette du fichier
RR-8324.pdf (1.5 Mo) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-00843992 , version 1 (12-07-2013)

Identifiants

  • HAL Id : hal-00843992 , version 1

Citer

Emmanuel Agullo, Luc Giraud, Abdou Guermouche, Jean Roman, Mawussi Zounon. Towards resilient parallel linear Krylov solvers: recover-restart strategies. [Research Report] RR-8324, INRIA. 2013, pp.36. ⟨hal-00843992⟩
489 Consultations
537 Téléchargements

Partager

Gmail Facebook X LinkedIn More