M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis et al., TensorFlow: A system for large-scale machine learning, Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 265-283, 2016.

T. Chen, M. Li, Y. Li, M. Lin, N. Wang et al., MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems, arXiv preprint arXiv:1512.01274, 2015.

T. Chen, Z. Du, N. Sun, J. Wang, C. Wu et al., DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning, Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 269-284, 2014.

Y. Chen, T. Luo, S. Liu, S. Zhang, L. He et al., DaDianNao: A machine-learning supercomputer, Proceedings of the 47th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 609-622, 2014.

Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li et al., ShiDianNao: Shifting vision processing closer to the sensor, ACM SIGARCH Computer Architecture News, vol. 43, no. 3, pp. 92-104, 2015.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional architecture for fast feature embedding, arXiv preprint arXiv:1408.5093, 2014.

D. Liu, T. Chen, S. Liu, J. Zhou, S. Zhou et al., PuDianNao: A polyvalent machine learning accelerator, Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 369-381, 2015.

S. Liu, Z. Du, J. Tao, D. Han, T. Luo et al., Cambricon: An instruction set architecture for neural networks, Proceedings of the 43rd International Symposium on Computer Architecture (ISCA), pp. 393-405, 2016.

B. Reagen, P. Whatmough, R. Adolf, S. Rama, H. Lee et al., Minerva: Enabling low-power, highly-accurate deep neural network accelerators, ACM SIGARCH Computer Architecture News, vol. 44, pp. 267-278, 2016.

S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu et al., Cambricon-X: An accelerator for sparse neural networks, Proceedings of the 49th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1-12, 2016.