, XONN: XNOR-based Oblivious Deep Neural Network Inference, USENIX Security, 2019.

M. Abadi, A. Chu, I. Goodfellow, H. B. Mcmahan, I. Mironov et al., Deep Learning with Differential Privacy, pp.308-318, 2016.

A. Canziani, A. Paszke, and E. Culurciello, An Analysis of Deep Neural Network Models for Practical Applications, 2016.

N. Carlini, C. Liu, Ú. Erlingsson, J. Kos, and D. Song, The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks, USENIX Security, pp.267-284, 2019.

S. Han, X. Liu, H. Mao, J. Pu, A. Pedram et al., EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016.

S. Han, H. Mao, and W. J. Dally, Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, ICLR, 2016.

S. Han, J. Pool, S. Narang, H. Mao, E. Gong et al., DSD: Dense-Sparse-Dense Training for Deep Neural Networks. ICLR, 2017.

S. Han, J. Pool, J. Tran, and W. J. Dally, Learning Both Weights and Connections for Efficient Neural Networks, NIPS, pp.1135-1143, 2015.

J. Hayes, L. Melis, G. Danezis, and E. Cristofaro, LOGAN: Membership Inference Attacks Against Generative Models. PETS, vol.1, pp.133-152, 2019.

G. Hinton, O. Vinyals, and J. Dean, Distilling the Knowledge in a Neural Network, NIPS Deep Learning and Representation Learning Workshop, 2015.

M. Horowitz, Computing's energy problem (and what we can do about it), ISSCC, pp.10-14, 2014.

I. Hubara, M. Courbariaux, and D. Soudry, Ran El-Yaniv, and Yoshua Bengio, NIPS, pp.4107-4115, 2016.

I. Hubara, M. Courbariaux, and D. Soudry, Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, J. Mach. Learn. Res, vol.18, pp.6869-6898, 2017.

F. N. Iandola, M. W. Moskewicz, K. Ashraf, S. Han, W. J. Dally et al., SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016.

J. Jia, A. Salem, M. Backes, Y. Zhang, and N. Gong, MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples, CCS, 2019.

F. Li and B. Liu, Ternary Weight Networks, 2017.

L. Melis, C. Song, V. Emiliano-de-cristofaro, and . Shmatikov, Exploiting unintended feature leakage in collaborative learning, SP, 2019.

M. Nasr, R. Shokri, and A. Houmansadr, Machine Learning with Membership Privacy using Adversarial Regularization, CCS, pp.634-646, 2018.

M. Nasr, R. Shokri, and A. Houmansadr, Comprehensive Privacy Analysis of Deep Learning: Stand-alone and Federated Learning under Passive and Active White-box Inference Attacks, SP, 2019.

M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016.

A. Sablayrolles, M. Douze, C. Schmid, Y. Ollivier, and H. Jegou, White-box vs Black-box: Bayes Optimal Strategies for Membership Inference (PMLR), vol.97, pp.5558-5567, 2019.

A. Salem, Y. Zhang, M. Humbert, M. Fritz, and M. Backes, ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on, Machine Learning Models. NDSS, 2018.

M. Sandler, A. G. Howard, M. Zhu, A. Zhmoginov, and L. Chen, MobileNetV2: Inverted Residuals and Linear Bottlenecks. In CVPR, pp.4510-4520, 2018.

V. Shejwalkar and A. Houmansadr, Reconciling Utility and Membership Privacy via Knowledge Distillation, 2019.

R. Shokri, M. Stronati, C. Song, and V. Shmatikov, Membership inference attacks against machine learning models, SP, 2017.

C. Song, T. Ristenpart, and V. Shmatikov, Machine Learning Models That Remember Too Much, CCS, p.587601, 2017.

L. Song and P. Mittal, Systematic Evaluation of Privacy Risks of, Machine Learning Models, 2020.

V. Sze, Designing Hardware for Machine Learning: The Important Role Played by Circuit Designers, pp.46-54, 2017.

V. Sze, Y. Chen, T. Yang, and J. S. Emer, Efficient Processing of Deep Neural Networks: A Tutorial and Survey, Proc. IEEE, pp.2295-2329, 2017.

W. Tang, G. Hua, and L. Wang, How to Train a Compact Binary Neural Network with High Accuracy?, 2017.

Y. Umuroglu, N. J. Fraser, G. Gambardella, M. Blott, and P. Leong, Magnus Jahre, and Kees A. Vissers. 2017. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. In FPGA

T. Yang, Y. Chen, and V. Sze, Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning, 2017.

S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha, Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting, CSF, pp.268-282, 2018.