Predicting Missing Items in Shopping Carts
July 2009 (vol. 21 no. 7)
pp. 985-998
Kasun Wickramaratna, University of Miami, Coral Gables
Miroslav Kubat, University of Miami, Coral Gables
Kamal Premaratne, University of Miami, Coral Gables
Existing research in association mining has focused mainly on how to expedite the search for frequently co-occurring groups of items in “shopping cart” type transactions; less attention has been paid to methods that exploit these “frequent itemsets” for prediction purposes. This paper contributes to the latter task by proposing a technique that uses partial information about the contents of a shopping cart to predict what else the customer is likely to buy. Using the recently proposed data structure of itemset trees (IT-trees), we obtain, in a computationally efficient manner, all rules whose antecedents contain at least one item from the incomplete shopping cart. Then, we combine these rules using uncertainty processing techniques, including classical Bayesian decision theory and a new algorithm based on the Dempster-Shafer (DS) theory of evidence combination.
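
To make the evidence-combination step more concrete, the following is a minimal sketch of the standard Dempster rule of combination, applied to two rules that both bear on whether a single item is in the cart. It is not the new DS-based algorithm the paper proposes, which the abstract does not detail. The function name `combine_masses`, the frame `{"present", "absent"}`, and all mass values are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical frame of discernment for one candidate item:
# the item is either present in the cart or absent from it.
PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
FRAME = PRESENT | ABSENT  # mass placed here expresses ignorance


def combine_masses(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two focal sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; combination is undefined")
    # Renormalize so that the remaining masses sum to 1.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Assumed masses derived from two matching rules (illustrative values only).
rule_1 = {PRESENT: 0.6, FRAME: 0.4}
rule_2 = {PRESENT: 0.5, ABSENT: 0.2, FRAME: 0.3}

result = combine_masses(rule_1, rule_2)
for focal, mass in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 4))
```

In this sketch, mass assigned to the whole frame represents a rule's uncertainty rather than evidence against the item, which is the main way a belief-theoretic combination differs from multiplying plain probabilities.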

Index Terms:
Frequent itemsets, uncertainty processing, Dempster-Shafer theory.
Kasun Wickramaratna, Miroslav Kubat, Kamal Premaratne, "Predicting Missing Items in Shopping Carts," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 7, pp. 985-998, July 2009, doi:10.1109/TKDE.2008.229