Experiments on Checkpointing Adjoint MPI Programs

Abstract: Checkpointing is a classical strategy to reduce the peak memory consumption of the adjoint. It is vital for codes with long run times, which is the case for most MPI parallel applications. However, for MPI codes this question has always been addressed by ad hoc hand manipulation of the differentiated code, with no formal assurance of correctness. In previous work, we investigated the assumptions implicitly made during past experiments, in order to clarify and generalize them. On the one hand, we proposed an adaptation of checkpointing to MPI parallel programs with point-to-point communications, such that the semantics of the adjoint program is preserved for any choice of the checkpointed part. On the other hand, we proposed an alternative adaptation of checkpointing that is more efficient but places a number of restrictions on the choice of the checkpointed part. In this work, we consider checkpointing of MPI parallel programs from a practical point of view. We propose an implementation of the adapted techniques inside the AMPI library. We discuss practical questions about the choice of technique to apply within a checkpointed part and the choice of the checkpointed part itself. Finally, we validate our theoretical results on representative CFD codes.
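
As an illustration of the memory trade-off the abstract refers to, the following is a minimal sequential sketch, not the authors' method and not the AMPI implementation. It contains no MPI communication, so it does not address the communication issues that are the subject of the paper. The model `step`, its adjoint `step_adjoint`, and both driver functions are hypothetical names introduced only for this example. The sketch compares an adjoint that stores every intermediate state with one that keeps a snapshot every `k` steps and recomputes each segment during the reverse sweep.

```python
import math

def step(x):
    """One forward time step of a hypothetical model: x -> sin(x)."""
    return math.sin(x)

def step_adjoint(x, xbar):
    """Adjoint of `step`: propagate the sensitivity xbar back through x -> sin(x)."""
    return math.cos(x) * xbar

def adjoint_store_all(x0, n):
    """Reference adjoint: keep every intermediate state (n + 1 stored values)."""
    states = [x0]
    for _ in range(n):
        states.append(step(states[-1]))
    xbar = 1.0
    for i in reversed(range(n)):
        xbar = step_adjoint(states[i], xbar)
    return xbar, len(states)

def adjoint_checkpointed(x0, n, k):
    """Adjoint with a snapshot every k steps; segments are recomputed on the way back."""
    snapshots = {}
    x = x0
    for i in range(n):
        if i % k == 0:
            snapshots[i] = x  # assumption: only the state is needed to restart a segment
        x = step(x)
    xbar = 1.0
    peak = 0
    for start in sorted(snapshots, reverse=True):
        end = min(start + k, n)
        # Recompute the states x_start .. x_(end-1) from the snapshot.
        segment = [snapshots[start]]
        for _ in range(start, end - 1):
            segment.append(step(segment[-1]))
        # Simple accounting: all snapshots plus the current segment.
        peak = max(peak, len(snapshots) + len(segment))
        for x_i in reversed(segment):
            xbar = step_adjoint(x_i, xbar)
    return xbar, peak
```

Calling `adjoint_store_all(0.5, 100)` and `adjoint_checkpointed(0.5, 100, 10)` should return the same derivative of the final state with respect to `x0`, with the second call holding fewer values at its peak at the cost of one extra forward recomputation per segment.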
Document type:
Conference paper
11th ASMO UK/ISSMO/NOED2016: International Conference on Numerical Optimisation Methods for Engineering Design, Jul 2016, Munich, Germany. 〈aboutflow.sems.qmul.ac.uk/events/munich2016/〉

Cited literature [3 references]

https://hal.inria.fr/hal-01413402
Contributor: Laurent Hascoet
Submitted on: Friday, December 9, 2016 - 17:07:51
Last modified on: Thursday, January 11, 2018 - 16:48:46
Document(s) archived on: Monday, March 27, 2017 - 15:05:42

File

ExperimentMPIPaper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01413402, version 1


Citation

Ala Taftaf, Laurent Hascoët. Experiments on Checkpointing Adjoint MPI Programs. 11th ASMO UK/ISSMO/NOED2016: International Conference on Numerical Optimisation Methods for Engineering Design, Jul 2016, Munich, Germany. 〈aboutflow.sems.qmul.ac.uk/events/munich2016/〉. 〈hal-01413402〉


Metrics

Record views

93

File downloads

41