J. Kepner, S. Alford, V. Gadepally, M. Jones, L. Milechin et al., Sparse Deep Neural Network Graph Challenge, 2019.

A. Buluç, T. Mattson, S. McMillan, J. Moreira, and C. Yang, The GraphBLAS C API specification, GraphBLAS.org, Tech. Rep., 2017.

B. van der Lugt and R. H. Bisseling, Banded sparse neural networks and their parallel computation, 2018.

S. Alford, R. Robinett, L. Milechin, and J. Kepner, Pruned and structurally sparse neural networks, CoRR, 2018.

Y. LeCun and C. Cortes, MNIST handwritten digit database, 2010.

T. Ben-Nun and T. Hoefler, Demystifying parallel and distributed deep learning: An in-depth concurrency analysis, ACM Computing Surveys (CSUR), vol.52, issue.4, pp.1-43, 2019.

M. Bisson and M. Fatica, A GPU implementation of the sparse deep neural network graph challenge, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

T. Davis, M. Aznaveh, and S. Kolodziej, Write quick, run fast: Sparse deep neural network in 20 minutes of development time in SuiteSparse:GraphBLAS, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

J. A. Ellis and S. Rajamanickam, Scalable inference for sparse deep neural networks using Kokkos kernels, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

X. Wang, Z. Lin, C. Yang, and J. D. Owens, Accelerating DNN inference with GraphBLAS and the GPU, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

J. Wang, Z. Huang, L. Kong, J. Xiao, P. Wang et al., Performance of training sparse deep neural networks on GPUs, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

M. H. Mofrad, R. Melhem, Y. Ahmad, and M. Hammoud, Multithreaded layer-wise training of sparse deep neural networks using compressed sparse column, IEEE High Performance Extreme Computing Conference (HPEC), 2019.

J. Kepner, S. Alford, V. Gadepally, M. Jones, L. Milechin et al., Graphchallenge.org sparse deep neural network performance, 2020.

Y. Nagasaka, S. Matsuoka, A. Azad, and A. Buluç, High-performance sparse matrix-matrix products on Intel KNL and multicore architectures, Proceedings of the 47th International Conference on Parallel Processing Companion, ser. ICPP '18, 2018.

T. Lengauer, Combinatorial Algorithms for Integrated Circuit Layout, 1990.

Ü. V. Çatalyürek and C. Aykanat, A fine-grain hypergraph model for 2D decomposition of sparse matrices, IPDPS, vol.1, p.118, 2001.

Ü. V. Çatalyürek and C. Aykanat, PaToH: A Multilevel Hypergraph Partitioning Tool, 1999.

A. N. Yzelman and D. Roose, High-level strategies for parallel shared-memory sparse matrix-vector multiplication, IEEE Transactions on Parallel and Distributed Systems, vol.25, issue.1, pp.116-125, 2013.

S. Acer, O. Selvitopi, and C. Aykanat, Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems, Parallel Computing, vol.59, pp.71-96, 2016.

B. Uçar and C. Aykanat, Partitioning sparse matrices for parallel preconditioned iterative methods, SIAM Journal on Scientific Computing, vol.29, issue.4, pp.1683-1709, 2007.