Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems, Parallel Computing, vol.59, pp.71-96, 2016. ,

Pruned and structurally sparse neural networks. CoRR, 2018. ,

Demystifying parallel and distributed deep learning: An in-depth concurrency analysis, ACM Computing Surveys (CSUR), vol.52, issue.4, pp.1-43, 2019. ,

A GPU implementation of the sparse deep neural network graph challenge, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

The graphBLAS C API specification, GraphBLAS. org, Tech. Rep, 2017. ,

PaToH: A Multilevel Hypergraph Partitioning Tool, 1999. ,

A fine-grain hypergraph model for 2d decomposition of sparse matrices, IPDPS, vol.1, p.118, 2001. ,

Write quick, run fast: Sparse deep neural network in 20 minutes of development time in SuiteSparse:GraphBLAS, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

Scalable inference for sparse deep neural networks using Kokkos kernels, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

, Sparse Deep Neural Network Graph Challenge. arXiv e-prints, art, 2019.

, Graphchallenge.org sparse deep neural network performance, 2020.

MNIST handwritten digit database, 2010. ,

Combinatorial Algorithms for Integrated Circuit Layout, 1990. ,

Multithreaded layer-wise training of sparse deep neural networks using compressed sparse column, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures, Proceedings of the 47th International Conference on Parallel Processing Companion, ICPP '18, 2018. ,

Partitioning sparse matrices for parallel preconditioned iterative methods, SIAM Journal on Scientific Computing, vol.29, issue.4, pp.1683-1709, 2007. ,

Banded Sparse Neural Networks and their parallel computation, 2018. ,

Performance of training sparse deep neural networks on GPUs, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

Accelerating DNN inference with GraphBLAS and the GPU, IEEE High Performance Extreme Computing Conference (HPEC), 2019. ,

High-level strategies for parallel shared-memory sparse matrixvector multiplication, IEEE Transactions on Parallel and Distributed Systems, vol.25, issue.1, pp.116-125, 2013. ,