D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, Computer Science, 2014.

Y. Bai, W. Yu, T. Xiao, C. Xu, K. Yang et al., Bag-of-words based deep neural network for image retrieval, ACM International Conference on Multimedia, pp.229-232, 2014.

Y. Bengio, R. Ducharme, P. Vincent, C. Jauvin, J. Kandola et al., A neural probabilistic language model, 2006.

K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation, Computer Science, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01433235

J. Chung, C. Gulcehre, K. H. Cho, and Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, 2014.

R. Collobert and J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, International Conference on Machine Learning, pp.160-167, 2008.

L. Dong, F. Wei, C. Tan, D. Tang, M. Zhou et al., Adaptive recursive neural network for target-dependent Twitter sentiment classification, Meeting of the Association for Computational Linguistics, pp.49-54, 2014.

Y. Goldberg, A primer on neural network models for natural language processing, 2015.

T. Hofmann, Probabilistic latent semantic analysis, Proc Uncertainty in Artificial Intelligence, vol.41, issue.6, pp.289-296, 2013.

Y. Kim, Convolutional neural networks for sentence classification, 2014.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, Computer Science, 2014.

Q. V. Le and T. Mikolov, Distributed representations of sentences and documents, vol.4, p.1188, 2014.

Q. Li, S. Shah, R. Fang, A. Nourbakhsh, and X. Liu, Tweet sentiment analysis by incorporating sentiment-specific word embedding and weighted text features, IEEE/WIC/ACM International Conference on Web Intelligence, pp.568-571, 2017.

A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng et al., Learning word vectors for sentiment analysis, Meeting of the Association for Computational Linguistics: Human Language Technologies, pp.142-150, 2011.

T. Mikolov, W. T. Yih, and G. Zweig, Linguistic regularities in continuous space word representations, HLT-NAACL, 2013.

T. Mikolov, A. Joulin, S. Chopra, M. Mathieu, and M. A. Ranzato, Learning longer memory in recurrent neural networks, Computer Science, 2014.

T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, Recurrent neural network based language model, INTERSPEECH 2010, Conference of the International Speech Communication Association, pp.1045-1048, 2010.

M. Nadeem, Survey on Opinion Mining and Sentiment Analysis, 2015.

T. Nasukawa and J. Yi, Sentiment analysis: capturing favorability using natural language processing, International Conference on Knowledge Capture, pp.70-77, 2003.

B. Pang and L. Lee, Opinion mining and sentiment analysis, Foundations and Trends in Information Retrieval, vol.2, issue.1-2, pp.1-135, 2008.

J. Pennington, R. Socher, and C. Manning, GloVe: Global vectors for word representation, Conference on Empirical Methods in Natural Language Processing, pp.1532-1543, 2014.

L. Prechelt, Automatic early stopping using cross validation: quantifying the criteria, Neural Networks, vol.11, issue.4, p.761, 1998.

I. Sheikh, I. Illina, D. Fohr, and G. Linarès, Learning word importance with the neural bag-of-words model, The Workshop on Representation Learning for NLP, pp.222-229, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01331720

Y. Ma and X. Sun, Bag-of-words as target for neural machine translation, 2018.

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, vol.15, issue.1, pp.1929-1958, 2014.

P. Wang, J. Xu, B. Xu, C. Liu, H. Zhang et al., Semantic clustering and convolutional neural network for short text categorization, 2015.

Z. Yang, D. Yang, C. Dyer, X. He, A. Smola et al., Hierarchical attention networks for document classification, Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1480-1489, 2017.

W. Zaremba, I. Sutskever, and O. Vinyals, Recurrent neural network regularization, 2014.

M. D. Zeiler and R. Fergus, Stochastic pooling for regularization of deep convolutional neural networks, 2013.