Issue No.07 - July (2009 vol.21)
pp: 959-973
Gautam Das, University of Texas at Arlington, Arlington
Vagelis Hristidis, Florida International University, Miami
Muhammed Miah, University of Texas at Arlington, Arlington
ABSTRACT
In recent years, there has been significant interest in the development of ranking functions and efficient top-k retrieval algorithms to help users in ad hoc search and retrieval in databases (e.g., buyers searching for products in a catalog). We introduce a complementary problem: how to guide a seller in selecting the best attributes of a new tuple (e.g., a new product) to highlight so that it stands out in the crowd of existing competitive products and is widely visible to the pool of potential buyers. We develop several formulations of this problem. Although the problems are NP-complete, we give several exact and approximation algorithms that work well in practice. One class of exact algorithms is based on Integer Programming (IP) formulations of the problems. Another class of exact methods is based on maximal frequent item set mining algorithms. The approximation algorithms are based on greedy heuristics. A detailed performance study illustrates the benefits of our methods on real and synthetic data.
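To make the problem statement concrete, the sketch below illustrates one plausible formulation. It is an assumption for illustration, not necessarily the paper's exact definition: attributes are Boolean, each past buyer query is a conjunction of required attributes, and the seller may advertise at most m attributes of the new product. The function names (`visibility`, `greedy_select`, `exact_select`) and the sample data are hypothetical. The greedy routine is a simple heuristic in the spirit of the approximation algorithms mentioned above, and the brute-force routine stands in for an exact method on tiny inputs only.

```python
from itertools import combinations


def visibility(selected, queries):
    """Number of conjunctive queries whose attributes are all in `selected`."""
    return sum(1 for q in queries if q <= selected)


def greedy_select(tuple_attrs, queries, m):
    """Greedy heuristic (illustrative): add one attribute at a time, up to m."""
    # Queries asking for an attribute the product lacks can never match it.
    reachable = [q for q in queries if q <= tuple_attrs]
    selected = set()
    for _ in range(min(m, len(tuple_attrs))):
        def score(attr):
            trial = selected | {attr}
            # Primary key: queries satisfied after adding attr.
            # Secondary key: how often attr occurs in reachable queries,
            # which breaks ties when no query is completed yet.
            return (visibility(trial, reachable),
                    sum(1 for q in reachable if attr in q))
        remaining = sorted(tuple_attrs - selected)  # sorted for determinism
        selected.add(max(remaining, key=score))
    return frozenset(selected)


def exact_select(tuple_attrs, queries, m):
    """Brute-force optimum; feasible only for small attribute counts."""
    k = min(m, len(tuple_attrs))
    best = max(combinations(sorted(tuple_attrs), k),
               key=lambda c: visibility(set(c), queries))
    return frozenset(best)


if __name__ == "__main__":
    # Hypothetical product and query log, invented for this example.
    new_car = frozenset({"ac", "four_door", "turbo", "power_brakes"})
    query_log = [
        frozenset({"ac", "four_door"}),
        frozenset({"ac"}),
        frozenset({"turbo"}),
        frozenset({"ac", "power_brakes"}),
        frozenset({"sunroof"}),
    ]
    chosen = greedy_select(new_car, query_log, m=2)
    print(sorted(chosen), visibility(chosen, query_log))
```

The greedy choice is not guaranteed to be optimal in general, which is consistent with the NP-completeness noted in the abstract; comparing it against `exact_select` on small inputs is one way to gauge the gap.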
INDEX TERMS
Data mining, knowledge and data engineering tools and techniques, marketing, mining methods and algorithms, retrieval models.
CITATION
Gautam Das, Vagelis Hristidis, Muhammed Miah, "Determining Attributes to Maximize Visibility of Objects", IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 7, pp. 959-973, July 2009, doi:10.1109/TKDE.2009.72