T. Chen, Z. Du, N. Sun, J. Wang, C. Wu et al., DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning, Proceedings of the 19th international conference on Architectural support for programming languages and operating systems (ASPLOS), pp.269-284, 2014.

Y. Chen, T. Luo, S. Liu, S. Zhang, L. He et al., DaDianNao: A Machine-Learning Supercomputer, Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp.609-622, 2014.

S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu et al., Cambricon-X: An Accelerator for Sparse Neural Networks, Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49), 2016.

D. Liu, T. Chen, S. Liu, J. Zhou, S. Zhou et al., PuDianNao: A polyvalent machine learning accelerator, Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '15, pp.369-381, 2015.

Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li et al., ShiDianNao: Shifting vision processing closer to the sensor, Proceedings of the 42nd Annual International Symposium on Computer Architecture, pp.92-104, 2015.

S. Liu, Z. Du, J. Tao, D. Han, T. Luo et al., Cambricon: An instruction set architecture for neural networks, 43rd ACM/IEEE Annual International Symposium on Computer Architecture, pp.393-405, 2016.

M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis et al., TensorFlow: A system for large-scale machine learning, p.18, 2016.

R. Collobert, K. Kavukcuoglu, and C. Farabet, Torch7: A Matlab-like environment for machine learning.

N. System, , 2016.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional architecture for fast feature embedding, 2014.

T. Chen, T. Moreau, Z. Jiang, H. Shen, E. Q. Yan et al., TVM: end-to-end optimization stack for deep learning, 2018.