Issue No.10 - Oct. (2012 vol.24)

pp: 1774-1788

Yu Peng, The Hong Kong University of Science and Technology, Hong Kong

Raymond Chi-Wing Wong, The Hong Kong University of Science and Technology, Hong Kong

Qian Wan, University of Wisconsin - Madison, Madison

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.52

ABSTRACT

The importance of dominance and skyline analysis has been well recognized in multicriteria decision-making applications. Most previous studies focus on how to help customers find a set of “best” possible products from a pool of given products. In this paper, we identify an interesting problem, finding top-k preferable products, which has not been studied before. Given a set of products in the existing market, we want to find a set of k “best” possible products such that these new products are not dominated by the products in the existing market. We study two problem instances of finding top-k preferable products. In the first problem instance, we need to set the prices of these products such that the total profit is maximized. We refer to such products as top-k profitable products. In the second problem instance, we want to find k products such that these k products can attract the greatest number of customers. We refer to these products as top-k popular products. In both problem instances, a straightforward solution is to enumerate all possible subsets of size k and find the subset that gives the greatest profit (for the first problem instance) or attracts the greatest number of customers (for the second problem instance). However, the number of possible subsets is exponential. In this paper, we propose solutions to find the top-k profitable products and the top-k popular products efficiently. An extensive performance study using both synthetic and real data sets is reported to verify the effectiveness and efficiency of the proposed algorithms.
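The straightforward enumeration mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: products are tuples of numeric attributes where smaller is better, the function names `dominates`, `not_dominated`, and `brute_force_top_k` are hypothetical, and the `score` callback (here simply the sum of prices) stands in for whatever profit or popularity measure is being maximized.

```python
from itertools import combinations
from math import comb


def dominates(p, q):
    """Return True if p dominates q.

    Assumption: smaller is better on every attribute, so p dominates q
    when p is no worse everywhere and strictly better somewhere.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def not_dominated(candidates, market):
    """Keep only candidates that no existing market product dominates."""
    return [c for c in candidates if not any(dominates(m, c) for m in market)]


def brute_force_top_k(candidates, market, k, score):
    """Enumerate every size-k subset of the feasible candidates and
    return the one with the highest score (None if no subset exists)."""
    feasible = not_dominated(candidates, market)
    return max(combinations(feasible, k), key=score, default=None)


# Toy data (hypothetical): each product is (price, distance_to_beach).
market = [(100, 5), (150, 2)]
candidates = [(90, 6), (120, 3), (160, 2), (140, 1), (110, 5)]


def total_price(subset):
    # Toy stand-in for a profit measure; not the paper's objective.
    return sum(price for price, _ in subset)


best = brute_force_top_k(candidates, market, k=2, score=total_price)
print(best)

# The number of size-k subsets grows quickly with the candidate count.
print(comb(5, 2), comb(100, 10))
```

Even in this toy setting the enumeration examines every size-k combination of feasible candidates, which is why the brute-force approach does not scale and more efficient algorithms are needed.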

INDEX TERMS

Vectors, Greedy algorithms, Dynamic programming, Companies, Complexity theory, Correlation, Heuristic algorithms, Spatial database, Skyline

CITATION

Yu Peng, Raymond Chi-Wing Wong, Qian Wan, "Finding Top-k Preferable Products",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 24, no. 10, pp. 1774-1788, Oct. 2012, doi:10.1109/TKDE.2012.52