M. D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, 2014.

R. R. Selvaraju, Grad-CAM: Visual explanations from deep networks via gradient-based localization, Proceedings of the IEEE International Conference on Computer Vision, 2017.

A. Ekin, A. M. Tekalp, and R. Mehrotra, Automatic soccer video analysis and summarization, IEEE Transactions on Image Processing, vol.12, issue.7, pp.796-807, 2003.

Y. Mohamed, B. Eldib, H. Zaid, M. Mzawbaa, M. El-zahar et al., Soccer video summarization using enhanced logo detection, 2009 16th ICIP, pp.4345-4348, 2009.

H. Tang, V. Kwatra, M. E. Sargin, and U. Gargi, Detecting highlights in sports videos: Cricket as a test case, 2011 IEEE International Conference on Multimedia and Expo, pp.1-6, 2011.

K. Zhang, K. Grauman, and F. Sha, Retrospective encoders for video summarization, in ECCV, pp.383-399, 2018.

B. Zhao, X. Li, and X. Lu, Hsa-rnn: Hierarchical structure-adaptive rnn for video summarization, Proceedings of the IEEE CVPR, pp.7405-7414, 2018.

M. Rochan, L. Ye, and Y. Wang, Video summarization using fully convolutional sequence networks, Proceedings of ECCV, pp.347-363, 2018.

J. Wang, W. Wang, Z. Wang, L. Wang, D. Feng et al., Stacked memory network for video summarization, Proceedings of the 27th ACM MM, pp.836-844, 2019.

K. Zhang, W. Chao, F. Sha, and K. Grauman, Video summarization with long short-term memory, ECCV, 2016.

M. Rochan and Y. Wang, Video summarization by learning from unpaired data, Proceedings of the IEEE CVPR, pp.7902-7911, 2019.

X. Li, B. Zhao, and X. Lu, A general framework for edited video and raw video summarization, IEEE Transactions on Image Processing, vol.26, issue.8, pp.3652-3664, 2017.

M. Gygli, H. Grabner, H. Riemenschneider, and L. Van Gool, Creating summaries from user videos, in ECCV, pp.505-520, 2014.

Y. Song, J. Vallmitjana, A. Stent, and A. Jaimes, Tvsum: Summarizing web videos using titles, Proceedings of the IEEE CVPR, pp.5179-5187, 2015.

S. E. F. de Avila, A. P. B. Lopes, A. da Luz Jr., and A. de Albuquerque Araújo, VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method, Pattern Recognition Letters, vol.32, issue.1, pp.56-68, 2011.

Open video project, OVP, 2011.

K. Zeng, T. Chen, J. C. Niebles, and M. Sun, Title generation for user generated videos, in ECCV, pp.609-625, 2016.

H. Xu, A. Das, and K. Saenko, R-C3D: Region convolutional 3D network for temporal activity detection, pp.5794-5803, 2017.

T. Decroos, L. Bransen, J. Van Haaren, and J. Davis, Actions speak louder than goals: Valuing player actions in soccer, Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.1851-1861, 2019.

G. Liu and O. Schulte, Deep reinforcement learning in ice hockey for context-aware player evaluation, 2018.

T. Decroos, V. Dzyuba, J. Van Haaren, and J. Davis, Predicting soccer highlights from spatio-temporal match event streams, Thirty-First AAAI Conference on Artificial Intelligence, 2017.

L. Bransen and J. Van Haaren, Measuring football players' on-the-ball contributions from passes during games, International Workshop on Machine Learning and Data Mining for Sports Analytics, pp.3-15, 2018.

L. Pappalardo, A public data set of spatio-temporal match events in soccer competitions, Scientific data, vol.6, pp.1-15, 2019.

H. Mathien, , 2016.

T. Bergmann, I-SEMANTICS (Posters and Demos), 2013.

M. Luong, H. Pham, and C. D. Manning, Effective approaches to attention-based neural machine translation, 2015.

S. Mirsamadi, E. Barsoum, and C. Zhang, Automatic speech emotion recognition using recurrent neural networks with local attention, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017.

K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville et al., Show, attend and tell: Neural image caption generation with visual attention, ICML, pp.2048-2057, 2015.