Distributed deep learning on edge-devices: feasibility via adaptive compression

Corentin Hardy; Erwan Le Merrer; Bruno Sericola

Preprint, Working Paper, Year: 2017

Corentin Hardy

Role: Author
PersonId: 968504

Dependability Interoperability and perfOrmance aNalYsiS Of networkS

Technicolor R & I [Cesson Sévigné]

Erwan Le Merrer

Role: Author
PersonId: 874498

Technicolor R & I [Cesson Sévigné]

Bruno Sericola

Role: Author
PersonId: 9377
IdHAL: sericola
ORCID: 0000-0002-8201-0071
IdRef: 077485343

Dependability Interoperability and perfOrmance aNalYsiS Of networkS

Abstract

A large portion of data mining and analytic services use modern machine learning techniques, such as deep learning. The state-of-the-art results achieved by deep learning come at the price of an intensive use of computing resources. The leading frameworks (e.g., TensorFlow) are executed on GPUs or on high-end servers in datacenters. At the other end, there is a proliferation of personal devices with possibly free CPU cycles; this can enable services to run in users' homes, embedding machine learning operations. In this paper, we ask the following question: Is distributed deep learning computation on WAN-connected devices feasible, in spite of the traffic caused by learning tasks? We show that such a setup raises some important challenges, most notably the ingress traffic that the servers hosting the up-to-date model have to sustain. In order to reduce this stress, we propose AdaComp, a novel algorithm for compressing worker updates to the model on the server. Applicable to stochastic gradient descent-based approaches, it combines efficient gradient selection and learning rate modulation. We then experiment and measure the impact of compression, device heterogeneity and reliability on the accuracy of learned models, with an emulator platform that embeds TensorFlow into Linux containers. We report a reduction of the total amount of data sent by workers to the server by two orders of magnitude (e.g., a 191-fold reduction for a convolutional network on the MNIST dataset), when compared to a standard asynchronous stochastic gradient descent, while preserving model accuracy.
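
The abstract names two ingredients, gradient selection and learning rate modulation, without giving their exact rules. The sketch below is only an illustration of that general idea and should not be read as AdaComp itself: it assumes a top-k magnitude selection and a learning rate divided by the staleness of the update, and all function names are hypothetical.

```python
# Illustrative sketch only; NOT the AdaComp algorithm from the paper.
# Assumptions (not from the source): top-k magnitude selection on the worker,
# and a server-side learning rate damped by the staleness of the update.

def select_top_k(gradient, k):
    """Keep the k largest-magnitude entries as a sparse {index: value} dict."""
    ranked = sorted(range(len(gradient)),
                    key=lambda i: abs(gradient[i]), reverse=True)
    return {i: gradient[i] for i in ranked[:k]}

def apply_sparse_update(weights, sparse_gradient, base_lr, staleness):
    """Apply a sparse update with a staleness-damped learning rate.

    staleness: number of server updates applied since the worker
    fetched its copy of the model (0 means the copy was fresh).
    """
    lr = base_lr / (1 + staleness)
    updated = list(weights)
    for i, g in sparse_gradient.items():
        updated[i] -= lr * g
    return updated

def compression_ratio(dense_size, sparse_size):
    """Dense entries sent vs. sparse (index, value) pairs, two numbers per pair."""
    return dense_size / (2 * sparse_size)

# Worker side: compute a gradient, send only its 2 largest entries.
gradient = [0.5, -0.01, 0.02, -2.0, 0.0, 0.3]
sparse = select_top_k(gradient, k=2)

# Server side: the update arrives one step late, so its step size is halved.
weights = [1.0] * 6
new_weights = apply_sparse_update(weights, sparse, base_lr=0.1, staleness=1)
```

In this toy setting only entries 3 and 0 of the gradient are transmitted, and the remaining weights are left untouched by the update.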

Domains

Multimedia [cs.MM]; Information Retrieval [cs.IR]; Artificial Intelligence [cs.AI]; Distributed, Parallel, and Cluster Computing [cs.DC]; Neural and Evolutionary Computing [cs.NE]

Main file

Deep_Learning_on_edge_devices__NCA_.pdf (631.91 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01622580

Submitted on: Tuesday, October 24, 2017, 15:01:03

Last modified on: Monday, June 5, 2023, 12:32:03

Dates and versions

hal-01622580, version 2 (24-10-2017)

hal-01622580, version 1 (25-10-2017)

Identifiers

HAL Id: hal-01622580, version 2

Cite

Corentin Hardy, Erwan Le Merrer, Bruno Sericola. Distributed deep learning on edge-devices: feasibility via adaptive compression. 2017. ⟨hal-01622580v2⟩

Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

532 views

255 downloads
