K. He, X. Zhang, S. Ren, and J. Sun, Delving deep into rectifiers: Surpassing humanlevel performance on imagenet classification, Proceedings of the IEEE international conference on computer vision, pp.1026-1034, 2015.

G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. R. Mohamed et al., Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal processing magazine, vol.29, issue.6, pp.82-97, 2012.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, pp.1097-1105, 2012.

Z. C. Lipton, The mythos of model interpretability, 2016.

G. Litjens, T. Kooi, B. E. Bejnordi, A. A. Setio, F. Ciompi et al., A survey on deep learning in medical image analysis, Medical image analysis, vol.42, pp.60-88, 2017.

A. Van-den-oord, S. Dieleman, and B. Schrauwen, Deep content-based music recommendation, Advances in neural information processing systems, pp.2643-2651, 2013.

S. Pereira, R. Meier, R. Mckinley, R. Wiest, V. Alves et al., Enhancing interpretability of automatically extracted machine learning features: application to a rbm-random forest system on brain lesion segmentation, Medical image analysis, vol.44, pp.228-244, 2018.

M. T. Ribeiro, S. Singh, and C. Guestrin, Why should i trust you?: Explaining the predictions of any classifier, Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp.1135-1144, 2016.

O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, International Conference on Medical image computing and computer-assisted intervention, pp.234-241, 2015.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., Imagenet large scale visual recognition challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.

R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh et al., Gradcam: Why did you say that? visual explanations from deep networks via gradientbased localization, 2016.

J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, Striving for simplicity: The all convolutional net, 2014.

I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, Advances in neural information processing systems, pp.3104-3112, 2014.

O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, Show and tell: A neural image caption generator, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.3156-3164, 2015.

M. D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, European conference on computer vision, pp.818-833, 2014.

Q. S. Zhang and S. C. Zhu, Visual interpretability for deep learning: a survey, Frontiers of Information Technology & Electronic Engineering, vol.19, issue.1, pp.27-39, 2018.