Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks, vol.5, issue.2, pp.157-166, 1994.
Fast and accurate deep network learning by exponential linear units (ELUs), 2015.
A theoretically grounded application of dropout in recurrent neural networks, Advances in Neural Information Processing Systems, vol.29, pp.1019-1027, 2016.
The MovieLens datasets: History and context, ACM Trans. Interact. Intell. Syst., vol.5, issue.4, 2015.
Batch normalization: Accelerating deep network training by reducing internal covariate shift, 2015.
Stochastic estimation of the maximum of a regression function, Ann. Math. Statist., vol.23, issue.3, 1952.
Adam: A method for stochastic optimization, 2014.
Self-normalizing neural networks, 2017.
The BellKor Solution to the Netflix Grand Prize, 2009.
Matrix factorization techniques for recommender systems, Computer, vol.42, pp.30-37, 2009.
Learning multiple layers of features from tiny images, 2009.
Training Deep AutoEncoders for Collaborative Filtering, 2017.
Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
Variational autoencoders for collaborative filtering, Proceedings of the 2018 World Wide Web Conference, WWW '18, pp.689-698, 2018.
Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems, vol.26, pp.3111-3119, 2013.
Rectified linear units improve restricted Boltzmann machines, Proceedings of the 27th International Conference on Machine Learning, ICML'10, pp.807-814, 2010.
The Pragmatic Theory Solution to the Netflix Grand Prize, 2009.
TensorFlow for Deep Learning: From Linear Regression to Reinforcement Learning, 2018.
Restricted Boltzmann machines for collaborative filtering, Proceedings of the International Conference on Machine Learning, vol.24, pp.791-798, 2007.
Improving predictive inference under covariate shift by weighting the log-likelihood function, 2000.
Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, vol.15, pp.1929-1958, 2014.
On the Gravity recommendation system, Proc. of KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining, pp.22-30, 2007.
RMSprop Gradient Optimization.
The Big Chaos Solution to the Netflix Grand Prize, 2009.
Individual Comparisons by Ranking Methods, pp.196-202, 1992.