Brussels, Belgium
Dec. 10, 2012
This paper compares corpus-based methods for estimating word sentiment. The evaluated algorithms represent varying degrees of supervision and range from regression-like approaches to more heavily supervised classification. The main idea is to explore the opportunities arising from mining medium-sized, balanced corpora, as opposed to the web-as-a-corpus paradigm. The comparisons were carried out using sentiment estimator benchmarks designed to take into account classification and regression problems as well as the varying granularity of predicted sentiment scores, from simple to complex scales. Overall, the results turn out to be very promising and indicate the superiority of supervised algorithms, especially for predictions at lower sentiment granularity. However, unsupervised methods can still be considered an interesting alternative in the most fine-grained, regression-like scenarios of sentiment estimation. In these cases, heavy supervision and a large number of features are less attractive than simple unsupervised methods.
Estimation, Context, Humans, Vectors, Optimization, Support vector machines, Vegetation, unsupervised, lexeme sentiment estimation, optimization, supervised
Aleksander Wawer, Dominika Rogozinska, "How Much Supervision? Corpus-Based Lexeme Sentiment Estimation", ICDMW, 2012, 2013 IEEE 13th International Conference on Data Mining Workshops, pp. 724-730, doi:10.1109/ICDMW.2012.119