Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, pp.1097-1105, 2012.

O. Russakovsky, J. Deng, H. Su, J. Krause, and S. Satheesh, Imagenet large scale visual recognition challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.

G. Hinton, L. Deng, D. Yu, G. E. Dahl, and A. Mohamed, Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal processing magazine, vol.29, issue.6, pp.82-97, 2012.

X. Xu, Y. Ding, S. X. Hu, M. Niemier, and J. Cong, Scaling for edge inference of deep neural networks, Nature Electronics, vol.1, issue.4, p.216, 2018.

L. Bottou, Large-scale machine learning with stochastic gradient descent, Proceedings of COMPSTAT'2010, pp.177-186, 2010.

A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, and W. Wang, Mobilenets: Efficient convolutional neural networks for mobile vision applications, 2017.

V. Sze, Y. Chen, T. Yang, and J. Emer, Efficient processing of deep neural networks: A tutorial and survey, 2017.

M. Grimaldi, V. Tenace, and A. Calimera, Layer-wise compressive training for convolutional neural networks, Future Internet, vol.11, issue.1, 2018.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, and S. Reed, Going deeper with convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.1-9, 2015.

Y. Chen, T. Krishna, J. S. Emer, and V. Sze, Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks, IEEE Journal of Solid-State Circuits, vol.52, issue.1, pp.127-138, 2017.

M. Courbariaux, Y. Bengio, and J. David, Binaryconnect: Training deep neural networks with binary weights during propagations, Advances in Neural Information Processing Systems, pp.3123-3131, 2015.

E. Flamand, D. Rossi, F. Conti, I. Loi, and A. Pullini, Gap-8: A risc-v soc for ai at the edge of the iot, 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors, pp.1-4, 2018.

B. Moons and M. Verhelst, A 0.3-2.6 tops/w precision-scalable processor for real-time large-scale convnets, VLSI Circuits (VLSI-Circuits), 2016 IEEE Symposium on, pp.1-2, 2016.

J. Albericio, A. Delmás, P. Judd, S. Sharify, and G. O'Leary, Bit-pragmatic deep neural network computing, Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, pp.382-394, 2017.

B. Moons, B. De Brabandere, L. Van Gool, and M. Verhelst, Energy-efficient convnets through approximate computing, Applications of Computer Vision (WACV), pp.1-8, 2016.

M. Shafique, R. Hafiz, M. U. Javed, S. Abbas, and L. Sekanina, Adaptive and energy-efficient architectures for machine learning: Challenges, opportunities, and research roadmap, VLSI (ISVLSI), 2017 IEEE Computer Society Annual Symposium on, pp.627-632, 2017.

M. Alioto, V. De, and A. Marongiu, Energy-quality scalable integrated circuits and systems: Continuing energy scaling in the twilight of Moore's law, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol.8, issue.4, pp.653-678, 2018.

V. Peluso and A. Calimera, Weak-mac: Arithmetic relaxation for dynamic energy-accuracy scaling in convnets, Circuits and Systems (ISCAS), pp.1-5, 2018.

V. Peluso and A. Calimera, Energy-driven precision scaling for fixed-point convnets, Very Large Scale Integration (VLSI-SoC), pp.1-6, 2018.

L. Lai and N. Suda, Enabling deep learning at the iot edge, Proceedings of the International Conference on Computer-Aided Design, p.135, 2018.

N. P. Jouppi, C. Young, N. Patil, D. Patterson, and G. , In-datacenter performance analysis of a tensor processing unit, Proceedings of the 44th Annual International Symposium on Computer Architecture, ser. ISCA '17, pp.1-12, 2017.

B. Moons and M. Verhelst, An energy-efficient precision-scalable convnet processor in 40-nm cmos, IEEE Journal of Solid-State Circuits, vol.52, issue.4, pp.903-914, 2017.

A. Krizhevsky and G. Hinton, Learning multiple layers of features from tiny images, Citeseer, Tech. Rep, 2009.

P. Warden, Challenges in representation learning: Facial expression recognition challenge, vol.26, 2018.

R. Andri, L. Cavigelli, D. Rossi, and L. Benini, Yodann: An architecture for ultra-low power binary-weight cnn acceleration, IEEE Transactions, 2017.

J. Gu, Z. Wang, J. Kuen, L. Ma, and A. Shahroudy, Recent advances in convolutional neural networks, Pattern Recognition, 2017.

T. J. Yang, Y. H. Chen, and V. Sze, Designing energy-efficient convolutional neural networks using energy-aware pruning, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6071-6079, 2017.

F. Fleuret and D. Geman, Coarse-to-fine face detection, International Journal of Computer Vision, vol.41, issue.1, pp.85-107, 2001.

P. Panda, A. Sengupta, and K. Roy, Conditional deep learning for energy-efficient and enhanced pattern recognition, Proceedings of the 2016 Conference on Design, Automation & Test in Europe, ser. DATE '16, pp.475-480, 2016.

Z. Yan, H. Zhang, R. Piramuthu, V. Jagadeesh, and D. Decoste, Hd-cnn: hierarchical deep convolutional neural networks for large scale visual recognition, Proceedings of the IEEE international conference on computer vision, pp.2740-2748, 2015.

K. Neshatpour, F. Behnia, H. Homayoun, and A. Sasan, Icnn: An iterative implementation of convolutional neural networks to enable energy and computational complexity aware dynamic approximation, Design, Automation & Test in Europe Conference & Exhibition (DATE), pp.551-556, 2018.

B. Moons, R. Uytterhoeven, W. Dehaene, and M. Verhelst, 14.5 envision: A 0.26-to-10tops/w subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28nm fdsoi, Solid-State Circuits Conference, pp.246-247, 2017.

V. Peluso and A. Calimera, Scalable-effort convnets for multilevel classification, 2018 IEEE/ACM International Conference on Computer-Aided Design, pp.1-8, 2018.

D. Lin, S. Talathi, and S. Annapureddy, Fixed point quantization of deep convolutional networks, International Conference on Machine Learning, pp.2849-2858, 2016.

L. Shan, M. Zhang, L. Deng, and G. Gong, A Dynamic Multi-precision Fixed-Point Data Quantization Strategy for Convolutional Neural Network, pp.102-111, 2016.

S. R. Jahnke and H. Hamakawa, Micro-controller direct memory access (dma) operation with adjustable word size transfers and address alignment/incrementing, US Patent, vol.6, p.921, 2004.

M. Courbariaux, Y. Bengio, and J. David, Training deep neural networks with low precision multiplications, 2014.

G. Desoli, N. Chawla, T. Boesch, S. Singh, and E. Guidetti, 14.1 a 2.9 tops/w deep convolutional neural network soc in fd-soi 28nm for intelligent embedded systems, Solid-State Circuits Conference, pp.238-239, 2017.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, and J. Long, Caffe: Convolutional architecture for fast feature embedding, Proceedings of the 22nd ACM international conference on Multimedia, pp.675-678, 2014.

T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, Convolutional, long short-term memory, fully connected deep neural networks, Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp.4580-4584, 2015.

D. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.