Active Learning Enhanced Document Annotation for Sentiment Analysis

Abstract : Sentiment analysis is a popular research area devoted to methods for automatically analyzing subjectivity in textual content. Many of these methods are based on machine learning and usually depend on manually annotated training corpora. However, creating such corpora is time-consuming, which creates a need for methods that facilitate this process. Active learning methods, which aim to select the most informative examples for a given classification task, can be used to make annotation more effective. There is currently a lack of systematic research on applying active learning to the creation of corpora for sentiment analysis. Hence, the aim of this work is to survey some of the active learning strategies applicable in annotation tools used in the context of sentiment analysis. We evaluated the compared strategies in the domain of product reviews. The experimental results confirmed an increase in corpus quality, measured as higher classification accuracy on the test set, for most of the evaluated strategies (more than 20% higher accuracy than the random strategy).
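
The idea of selecting the most informative examples for annotation can be illustrated with a minimal sketch. The abstract does not name the strategies that were evaluated, so the example below assumes one common strategy, least-confidence uncertainty sampling, and uses a hypothetical word-counting function (`toy_predict_proba`) in place of a trained sentiment classifier. All function names and the sample reviews are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, List, Sequence


def least_confidence(probabilities: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probabilities)


def select_for_annotation(
    pool: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
    batch_size: int,
) -> List[str]:
    """Return the batch_size unlabeled documents the model is least sure about."""
    ranked = sorted(
        pool,
        key=lambda doc: least_confidence(predict_proba(doc)),
        reverse=True,  # most uncertain documents first
    )
    return ranked[:batch_size]


def toy_predict_proba(doc: str) -> List[float]:
    """Hypothetical stand-in for a classifier: smoothed counts of cue words."""
    positive_words = {"great", "good", "love"}
    negative_words = {"bad", "poor", "hate"}
    tokens = doc.lower().split()
    pos = sum(token in positive_words for token in tokens)
    neg = sum(token in negative_words for token in tokens)
    p_positive = (pos + 1) / (pos + neg + 2)  # add-one smoothing
    return [p_positive, 1.0 - p_positive]


# Illustrative unlabeled pool of product reviews (made up for this sketch).
pool = [
    "great phone love it",
    "good screen but bad battery",
    "poor build hate it",
]

# Pick the single review that would be sent to the annotator next.
selected = select_for_annotation(pool, toy_predict_proba, batch_size=1)
print(selected)
```

In this sketch the mixed review is chosen because the stand-in classifier assigns it equal probability for both classes. A random strategy, the baseline mentioned in the abstract, would instead draw from the pool without consulting the model.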

https://hal.inria.fr/hal-01506776
Contributor : Hal Ifip
Submitted on : Wednesday, April 12, 2017 - 11:19:08 AM
Last modification on : Thursday, April 13, 2017 - 1:06:48 AM
Document(s) archived on : Thursday, July 13, 2017 - 12:44:13 PM

File

978-3-642-40511-2_24_Chapter.p...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id : hal-01506776, version 1

Citation

Peter Koncz, Ján Paralič. Active Learning Enhanced Document Annotation for Sentiment Analysis. Alfredo Cuzzocrea; Christian Kittl; Dimitris E. Simos; Edgar Weippl; Lida Xu. 1st Cross-Domain Conference and Workshop on Availability, Reliability, and Security in Information Systems (CD-ARES), Sep 2013, Regensburg, Germany. Springer, Lecture Notes in Computer Science, LNCS-8127, pp.345-353, 2013, Availability, Reliability, and Security in Information Systems and HCI. 〈hal-01506776〉
