K. Hazelwood, A. Kalro, J. Law, K. Lee, J. Lu et al., Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective, IEEE International Symposium on High Performance Computer Architecture (HPCA), pp.620-629, 2018.

B. Tang, S. O. Information, and Y. N. University, Case study of the application of field programmable gate array (FPGA) in the smart skill, Application of IC, 2018.

N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal et al., In-datacenter performance analysis of a tensor processing unit, ACM/IEEE International Symposium on Computer Architecture (ISCA), pp.1-12, 2017.

H. T. Kung and C. E. Leiserson, Systolic arrays (for VLSI), Proc. Sparse Matrix Conf., pp.256-282, 1978.

C. Farabet, B. Martini, B. Corda, and P. Akselrod, NeuFlow: A runtime reconfigurable dataflow processor for vision, Computer Vision and Pattern Recognition Workshops, vol.9, pp.109-116, 2011.

E. Chung, J. Fowers, K. Ovtcharov, M. Papamichael, A. Caulfield et al., Serving DNNs in real time at datacenter scale with Project Brainwave, IEEE Micro, vol.38, pp.8-20, 2018.

Y. Chen, N. Sun, O. Temam, T. Luo, S. Liu et al., DaDianNao: A Machine-Learning Supercomputer, IEEE/ACM International Symposium on Microarchitecture, vol.5, pp.609-622, 2014.

M. Alwani, H. Chen, M. Ferdman, and P. Milder, Fused-layer CNN accelerators, IEEE/ACM International Symposium on Microarchitecture, pp.1-12, 2016.

Z. Li, Laius: An 8-Bit Fixed-Point CNN Hardware Inference Engine, 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017.