Conference papers

Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training

Olivier Beaumont (1, 2), Lionel Eyraud-Dubois (1, 2), Alena Shilova (2, 1)
(2) HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: Training Deep Neural Networks is known to be an expensive operation, both in terms of computational cost and memory load. Indeed, during training, all intermediate layer outputs (called activations) computed during the forward phase must be stored until the corresponding gradient has been computed in the backward phase. These memory requirements can rule out larger batch sizes and deeper networks, and can thereby limit both convergence speed and accuracy. Recent works have proposed offloading some of the computed forward activations from GPU memory to CPU memory. This requires determining which activations should be offloaded and when the transfers to and from GPU memory should take place. We prove that this problem is NP-hard in the strong sense, and we propose two heuristics based on relaxations of the problem. We perform an extensive experimental evaluation on standard Deep Neural Networks. We compare the performance of our heuristics against previous approaches from the literature, showing that they achieve much better performance in a wide variety of situations.
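To make the memory argument concrete, the following is a minimal sketch of how offloading can lower the peak GPU memory held by activations on a simple chain of layers. It is a toy model, not one of the heuristics from the paper: the function name `peak_activation_memory`, the activation sizes, and the assumption that transfers are instantaneous and free are all illustrative. The paper's problem is hard precisely because real transfers take time and must be scheduled.

```python
def peak_activation_memory(sizes, offloaded=frozenset()):
    """Peak GPU memory (arbitrary units) held by activations on a chain.

    Toy assumptions (hypothetical, not taken from the paper):
    - activation i is produced by forward step i and freed by backward step i;
    - an offloaded activation leaves the GPU once the next layer has used it;
    - it is brought back right before its own backward step;
    - transfers are instantaneous and cost nothing.
    """
    resident = 0  # activation memory currently on the GPU
    peak = 0
    last = len(sizes) - 1

    # Forward phase: activations accumulate on the GPU.
    for i, size in enumerate(sizes):
        resident += size
        peak = max(peak, resident)
        # Layer i has consumed activation i-1, so it can be evicted now.
        if i > 0 and (i - 1) in offloaded:
            resident -= sizes[i - 1]

    # Backward phase: process layers from last to first.
    for i in range(last, -1, -1):
        # The last activation never left the GPU, so no prefetch is needed.
        if i in offloaded and i != last:
            resident += sizes[i]  # prefetch from CPU memory
            peak = max(peak, resident)
        resident -= sizes[i]  # gradient computed, activation freed

    return peak


sizes = [4, 2, 3, 1]
print(peak_activation_memory(sizes))                    # no offloading
print(peak_activation_memory(sizes, offloaded={0, 1}))  # offload first two
```

In this toy setting, keeping everything on the GPU gives a peak equal to the sum of all activation sizes, while offloading the first two activations lowers it. Deciding which activations to offload, and when, once transfer times are taken into account is the problem the abstract describes.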

Cited literature: 41 references
Contributor: Lionel Eyraud-Dubois
Submitted on: Friday, February 21, 2020 - 4:59:54 PM
Last modification on: Thursday, December 3, 2020 - 3:19:16 PM
Long-term archiving on: Friday, May 22, 2020 - 5:40:46 PM


Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova. Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training. Euro-Par 2020 - 26th International Conference on Parallel and Distributed Computing, Aug 2020, Warsaw / Virtual, Poland. pp.151-166, ⟨10.1007/978-3-030-57675-2_10⟩. ⟨hal-02316266v3⟩


