Issue No. 04 - April (2012 vol. 24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.269
Jimmy Xiangji Huang, York University, Toronto
Yang Liu, Shandong University, Jinan
Xiaohui Yu, Shandong University, Jinan and York University, Toronto
Aijun An, York University, Toronto
Posting reviews online has become an increasingly popular way for people to express opinions and sentiments toward the products bought or services received. Analyzing the large volume of online reviews available would produce useful actionable knowledge that could be of economic value to vendors and other interested parties. In this paper, we conduct a case study in the movie domain, and tackle the problem of mining reviews for predicting product sales performance. Our analysis shows that both the sentiments expressed in the reviews and the quality of the reviews have a significant impact on the future sales performance of the products in question. For the sentiment factor, we propose Sentiment PLSA (S-PLSA), in which a review is considered as a document generated by a number of hidden sentiment factors, in order to capture the complex nature of sentiments. Training an S-PLSA model enables us to obtain a succinct summary of the sentiment information embedded in the reviews. Based on S-PLSA, we propose ARSA, an Autoregressive Sentiment-Aware model for sales prediction. We then seek to further improve the accuracy of prediction by considering the quality factor, with a focus on predicting the quality of a review in the absence of user-supplied indicators, and present ARSQA, an Autoregressive Sentiment and Quality Aware model, to utilize sentiments and quality for predicting product sales performance. Extensive experiments conducted on a large movie data set confirm the effectiveness of the proposed approach.
Review mining, sentiment analysis, prediction.
Jimmy Xiangji Huang, Yang Liu, Xiaohui Yu, Aijun An, "Mining Online Reviews for Predicting Sales Performance: A Case Study in the Movie Domain", IEEE Transactions on Knowledge & Data Engineering, vol. 24, no. 4, pp. 720-734, April 2012, doi:10.1109/TKDE.2010.269