Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training

Olivier Beaumont; Lionel Eyraud-Dubois; Alena Shilova

doi:10.1007/978-3-030-57675-2_10

Conference paper, Year: 2020

Olivier Beaumont

Role: Author
PersonId: 181224
IdHAL: olivier-beaumont
ORCID: 0000-0003-2741-6228
IdRef: 124577083

Université de Bordeaux

High-End Parallel Algorithms for Challenging Numerical Simulations

Lionel Eyraud-Dubois

Role: Author
PersonId: 174911
IdHAL: lioneleyraud-dubois
ORCID: 0000-0003-2475-3309
IdRef: 172645301

Université de Bordeaux

High-End Parallel Algorithms for Challenging Numerical Simulations

Alena Shilova

Role: Author

High-End Parallel Algorithms for Challenging Numerical Simulations

Université de Bordeaux

Abstract

Training deep neural networks is expensive in both computation and memory. During training, all intermediate layer outputs (called activations) computed during the forward phase must be stored until the corresponding gradient has been computed in the backward phase. These memory requirements can rule out larger batch sizes and deeper networks, and can therefore limit both convergence speed and accuracy. Recent works have proposed offloading some of the forward activations from GPU memory to CPU memory. This requires determining which activations should be offloaded and when the transfers to and from GPU memory should take place. We prove that this problem is NP-hard in the strong sense, and we propose two heuristics based on relaxations of the problem. We perform an extensive experimental evaluation on standard deep neural networks and compare our heuristics against previous approaches from the literature, showing that they achieve much better performance in a wide variety of situations.
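To make the memory trade-off described in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' heuristics or model: it assumes a simple chain of layers, instantaneous GPU-CPU transfers (no bandwidth limit), and that the backward step of layer i only needs the activation of layer i. The function name `peak_gpu_memory` and its parameters are hypothetical, introduced only for illustration.

```python
def peak_gpu_memory(sizes, offloaded=frozenset()):
    """Peak GPU memory over one forward + backward pass of a layer chain.

    sizes[i] is the size of the activation produced by layer i.
    Toy assumptions (not from the paper): transfers are instantaneous,
    an offloaded activation leaves the GPU once the next layer has
    consumed it, and returns just before its own backward step.
    """
    n = len(sizes)
    resident = 0  # total size of activations currently on the GPU
    peak = 0

    # Forward phase: each layer produces one activation.
    for i in range(n):
        resident += sizes[i]
        peak = max(peak, resident)
        # The previous activation has now been consumed; evict it if chosen.
        if i > 0 and (i - 1) in offloaded:
            resident -= sizes[i - 1]

    # Backward phase: layers are processed in reverse order.
    for i in reversed(range(n)):
        # The last activation is never evicted, so it needs no prefetch.
        if i in offloaded and i != n - 1:
            resident += sizes[i]  # bring the activation back from the CPU
            peak = max(peak, resident)
        resident -= sizes[i]  # gradient computed, activation released

    return peak


baseline = peak_gpu_memory([4, 3, 2, 1])
with_offload = peak_gpu_memory([4, 3, 2, 1], offloaded={0, 1})
```

In this toy model, offloading the first two activations lowers the peak from 10 to 7 units. The problem studied in the paper is harder because real transfers take time and share a limited-bandwidth link, so both the choice of activations and the timing of the transfers matter.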

Domains

Distributed, Parallel, and Cluster Computing [cs.DC]; Machine Learning [cs.LG]; Neural and Evolutionary Computing [cs.NE]; Optimization and Control [math.OC]

Main file

report.pdf (757.81 KB)

Origin: Files produced by the author(s)

Contributor: Lionel Eyraud-Dubois

https://inria.hal.science/hal-02316266

Submitted on: Friday, February 21, 2020, 16:59:54

Last modified on: Thursday, March 21, 2024, 03:13:43

Long-term archiving on: Friday, May 22, 2020, 17:40:46

Dates and versions

hal-02316266, version 1 (15-10-2019)

hal-02316266, version 2 (21-10-2019)

hal-02316266, version 3 (21-02-2020)

Identifiers

HAL Id: hal-02316266, version 3
DOI: 10.1007/978-3-030-57675-2_10

Cite

Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova. Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training. Euro-Par 2020 - 26th International Conference on Parallel and Distributed Computing, Aug 2020, Warsaw / Virtual, Poland. pp.151-166, ⟨10.1007/978-3-030-57675-2_10⟩. ⟨hal-02316266v3⟩

Collections

CNRS INRIA INRIA2 TDS-MACS

609 views

1363 downloads
