2012 IEEE 12th International Conference on Data Mining Workshops (2012)
Brussels, Belgium
Dec. 10, 2012
This paper compares corpus-based methods for estimating word sentiment. The evaluated algorithms represent varying degrees of supervision and range from regression-like approaches to more heavily supervised classification. The main idea is to explore the opportunities arising from mining medium-sized, balanced corpora, as opposed to the web-as-a-corpus paradigm. The comparisons were carried out using sentiment estimator benchmarks designed to take into account both classification and regression problems, as well as varying granularity of predicted sentiment scores, from simple to complex scales. Overall, the results are very promising and indicate the superiority of supervised algorithms, especially for lower-granularity sentiment predictions. However, unsupervised methods can still be considered an interesting alternative in the most fine-grained, regression-like scenarios of sentiment estimation. In these cases, heavy supervision and a large number of features are less attractive than simple unsupervised methods.
Estimation, Context, Humans, Vectors, Optimization, Support vector machines, Vegetation, unsupervised, lexeme sentiment estimation, optimization, supervised
A. Wawer and D. Rogozinska, "How Much Supervision? Corpus-Based Lexeme Sentiment Estimation," 2012 IEEE 12th International Conference on Data Mining Workshops (ICDMW), Brussels, Belgium, 2012, pp. 724-730.