Journal articles

H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms

Julien Herrmann 1 Guillaume Pallez 1
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: We study the problem of checkpointing strategies for adjoint computation on synchronous hierarchical platforms, specifically computational platforms with several levels of storage with different writing and reading costs. When reversing a large adjoint chain, choosing which data to checkpoint and where is a critical decision for the overall performance of the computation. We introduce H-Revolve, an optimal algorithm for this problem. We make it available in a public Python library along with the implementation of several state-of-the-art algorithms for the variant of the problem with two levels of storage. We provide a detailed description of how one can use this library in adjoint computation software in the field of automatic differentiation or backpropagation. Finally, we evaluate the performance of H-Revolve and other checkpointing heuristics through an extensive simulation campaign.
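
To make the trade-off in the abstract concrete, the following is a minimal sketch of checkpointed chain reversal. It is not H-Revolve and does not use the authors' library: it assumes a single storage level, a uniform checkpoint period, and a toy forward step (`sin`). All names here (`forward_step`, `backward_step`, `reverse_chain`, `every`) are hypothetical and chosen only for illustration.

```python
import math


def forward_step(x):
    # Toy forward operation standing in for one step of the chain (assumption).
    return math.sin(x)


def backward_step(x, grad_out):
    # Adjoint of the toy forward step: needs the forward input x.
    return math.cos(x) * grad_out


def reverse_chain(x0, n, every):
    """Reverse an n-step chain while storing only every `every`-th state.

    Returns (d x_n / d x_0, number of forward steps executed).
    """
    checkpoints = {}
    x = x0
    forward_count = 0

    # Forward sweep: keep a checkpoint at steps 0, every, 2*every, ...
    for i in range(n):
        if i % every == 0:
            checkpoints[i] = x
        x = forward_step(x)
        forward_count += 1

    # Backward sweep: step i needs x_i, recomputed from the nearest
    # earlier checkpoint when it was not stored.
    grad = 1.0
    for i in reversed(range(n)):
        base = (i // every) * every
        xi = checkpoints[base]
        for _ in range(i - base):
            xi = forward_step(xi)
            forward_count += 1
        grad = backward_step(xi, grad)

    return grad, forward_count


if __name__ == "__main__":
    print(reverse_chain(0.5, 12, 1))  # store every state
    print(reverse_chain(0.5, 12, 4))  # store one state in four
```

With 12 steps, storing every state costs 12 forward steps and 12 stored states, while storing one state in four costs 30 forward steps and 3 stored states, for the same gradient. The problem studied in the paper is choosing such a schedule optimally when several storage levels with different read and write costs are available.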

https://hal.inria.fr/hal-02080706
Contributor: Guillaume Pallez (aupy)
Submitted on: Friday, February 28, 2020 - 12:12:06 PM
Last modification on: Friday, March 6, 2020 - 2:46:29 PM
Long-term archiving on: Friday, May 29, 2020 - 2:15:58 PM

File

toms.pdf
Files produced by the author(s)

Citation

Julien Herrmann, Guillaume Pallez. H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms. ACM Transactions on Mathematical Software, Association for Computing Machinery, 2020, ⟨10.1145/3378672⟩. ⟨hal-02080706v2⟩
