P. Beckman, R. Sankaran, C. Catlett, N. Ferrier, R. Jacob et al., Waggle: An open sensor platform for edge computing, SENSORS, pp.1-3, 2016.

A. Brunetti, D. Buongiorno, G. F. Trotta, and V. Bevilacqua, Computer vision and deep learning techniques for pedestrian detection and tracking: A survey, Neurocomputing, vol.300, pp.17-33, 2018.

C. E. Catlett, P. H. Beckman, R. Sankaran, and K. K. Galvin, Array of things: a scientific research instrument in the public way: platform design and early lessons learned, Proceedings of the 2nd International Workshop on Science of Smart City Operations and Platforms Engineering, pp.26-33, 2017.

T. Chen, M. Li, Y. Li, M. Lin, N. Wang et al., MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems, 2015.

T. Chen, B. Xu, C. Zhang, and C. Guestrin, Training deep nets with sublinear memory cost, 2016.

E. J. Crowley, G. Gray, and A. J. Storkey, Moonshine: Distilling with cheap convolutions, Advances in Neural Information Processing Systems, pp.2893-2903, 2018.

J. Feng and D. Huang, Cutting down training memory by re-fowarding, 2018.

P. G. Lopez, A. Montresor, D. Epema, A. Datta, T. Higashino et al., Edge-centric computing: Vision and challenges, ACM SIGCOMM Computer Communication Review, vol.45, issue.5, pp.37-42, 2015.

A. Griewank and A. Walther, Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation, ACM Transactions on Mathematical Software (TOMS), vol.26, issue.1, pp.19-45, 2000.

A. Gruslys, R. Munos, I. Danihelka, M. Lanctot, and A. Graves, Memory-efficient backpropagation through time, Advances in Neural Information Processing Systems, pp.4125-4133, 2016.

J. Hanlon, How to solve the memory challenges of deep neural networks, 2018.

Y. Huang, Y. Cheng, D. Chen, H. Lee, J. Ngiam et al., Efficient training of giant neural networks using pipeline parallelism, 2018.

N. S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy, and P. T. Tang, On large-batch training for deep learning: Generalization gap and sharp minima, 2016.

N. Kukreja, J. Hückelheim, M. Lange, M. Louboutin, A. Walther et al., High-level Python abstractions for optimal checkpointing in inversion problems, 2018.

G. Paul and J. Irvine, Privacy implications of wearable health devices, Proceedings of the 7th International Conference on Security of Information and Networks, p.117, 2014.

M. Wang, C. Huang, and J. Li, Supporting very large models using automatic dataflow graph partitioning, 2018.

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory ("Argonne"). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.