Abstract: Twitter sentiment classification is attracting considerable interest from the research community, since user posts and opinions yield interesting conclusions and information. In this paper, a pre-processing tool was developed in Python. The tool processes text and natural language data in order to remove erroneous values and noise. The main motivation for developing it is to make sentiment analysis more effective and efficient. Its most notable characteristic is the use of emojis and emoticons in sentiment analysis. Moreover, supervised machine learning techniques were used to analyze users' posts. In our experiments, we evaluate the performance of the classifiers involved, namely Naive Bayes and SVM, under specific parameters such as the size of the training data and the feature selection method employed (unigrams, bigrams and trigrams). Finally, performance was assessed on independent datasets using k-fold cross-validation.
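As a rough illustration of the steps the abstract names (noise removal, keeping emoticons, n-gram features and k-fold splitting), a minimal Python sketch follows. It is not the authors' tool: the noise patterns, the small emoticon set and the function names `preprocess`, `ngrams` and `k_fold_indices` are all assumptions made for this example, and only the standard library is used.

```python
import re
import string

# Assumed noise patterns; the abstract does not list the paper's actual rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# A tiny illustrative emoticon set (hypothetical, not the authors' lexicon).
EMOTICONS = {":)", ":-)", ":(", ":-(", ":D", ";)"}


def preprocess(tweet):
    """Return tokens with URLs and mentions removed and emoticons kept."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    tokens = []
    for raw in text.split():
        if raw in EMOTICONS:
            tokens.append(raw)  # keep emoticons verbatim as sentiment cues
            continue
        # Lowercase and strip surrounding ASCII punctuation; emoji are untouched.
        word = raw.lower().strip(string.punctuation)
        if word:
            tokens.append(word)
    return tokens


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


tokens = preprocess("@bob I LOVE this!!! :) http://t.co/x #Happy")
unigrams, bigrams, trigrams = (ngrams(tokens, n) for n in (1, 2, 3))
```

The resulting n-gram tuples could then be counted and passed to a Naive Bayes or SVM classifier, with each fold from `k_fold_indices` serving once as the held-out test set. In practice the folds would usually be shuffled or stratified by class, which this sketch omits.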
https://hal.inria.fr/hal-02363858
Contributor: Hal Ifip
Submitted on: Thursday, November 14, 2019 - 3:51:18 PM
Last modification on: Thursday, November 14, 2019 - 3:56:00 PM
Long-term archiving on: Saturday, February 15, 2020 - 4:24:18 PM
Elias Dritsas, Gerasimos Vonitsanos, Ioannis E. Livieris, Andreas Kanavos, Aristidis Ilias, et al. Pre-processing Framework for Twitter Sentiment Classification. 15th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), May 2019, Hersonissos, Greece. pp.138-149, ⟨10.1007/978-3-030-19909-8_12⟩. ⟨hal-02363858⟩