ImageNet classification with deep convolutional neural networks, NIPS, 2012.
URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf
Deep Face Recognition, Proceedings of the British Machine Vision Conference (BMVC), 2015.
DOI : 10.5244/C.29.41
Distributed representations of words and phrases and their compositionality, NIPS, 2013. [Online]. Available: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality
Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.90
ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.
DOI : 10.1007/s11263-015-0816-y
URL : http://arxiv.org/pdf/1409.0575
TensorFlow: A system for large-scale machine learning, OSDI, 2016.
Benchmarking State-of-the-Art Deep Learning Software Tools, 2016 7th International Conference on Cloud Computing and Big Data (CCBD), arXiv:1608.07249, 2016.
DOI : 10.1109/CCBD.2016.029
Federated Learning: Strategies for Improving Communication Efficiency, arXiv:1610.05492, 2016.
Privacy-preserving deep learning, CCS, 2015.
DOI : 10.1145/2810103.2813687
Efficient BackProp, Neural Networks: Tricks of the Trade, pp.9-50, 1998.
Online algorithms and stochastic approximations, Online Learning and Neural Networks, 1998.
Staleness-aware async-SGD for distributed deep learning
Model accuracy and runtime tradeoff in distributed deep learning: A systematic study, ICDM, 2016.
Asynchronous parallel stochastic gradient for nonconvex optimization, NIPS, 2015. [Online]. Available: http://papers.nips.cc/paper/5751-asynchronous-parallel-stochastic-gradient-for-nonconvex-optimization.pdf
A. Odena, Faster asynchronous SGD, arXiv:1601.04033, 2016.
Hogwild!: A lock-free approach to parallelizing stochastic gradient descent, NIPS, 2011.
The MNIST database of handwritten digits, 1998.
Keras, 2015.
Distributed deep learning on edge-devices: feasibility via adaptive compression
URL : https://hal.archives-ouvertes.fr/hal-01622580
Learning representations by back-propagating errors, Nature, vol.323, issue.6088, pp.533-536, 1986.
DOI : 10.1038/323533a0
Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, vol.12, pp.2121-2159, 2011.
Adam: A method for stochastic optimization, arXiv:1412.6980, 2014.
Project Adam: Building an efficient and scalable deep learning training system, OSDI, 2014.
Revisiting distributed synchronous SGD, International Conference on Learning Representations Workshop Track, arXiv:1604.00981, 2016.
Federated optimization: Distributed optimization beyond the datacenter
Federated learning of deep networks using model averaging, 2016.