Issue No.09 - September (2005 vol.17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2005.145
Grouping customer transactions into segments may help understand customers better. The marketing literature has concentrated on identifying important segmentation variables (e.g., customer loyalty) and on using cluster analysis and mixture models for segmentation. The data mining literature has provided various clustering algorithms for segmentation without focusing specifically on clustering customer transactions. Building on the notion that observable customer transactions are generated by latent behavioral traits, in this paper we investigate a pattern-based clustering approach to grouping customer transactions. We define an objective function that we maximize in order to achieve a good clustering of customer transactions and present an algorithm, GHIC, that groups customer transactions such that the itemsets generated from each cluster, while similar to each other, differ from those generated from other clusters. We present experimental results on user-centric Web usage data that demonstrate that GHIC generates a highly effective clustering of transactions.
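The pattern-based idea in the abstract, that itemsets from one cluster should differ from itemsets generated from others, can be illustrated with a small Python sketch. This is not GHIC and not the paper's objective function, which the abstract does not specify. The score used here (the summed absolute difference in itemset support between two clusters), the brute-force two-way split, and the names `itemset_supports`, `support_difference`, and `best_bisection` are all assumptions made for illustration.

```python
from itertools import combinations


def itemset_supports(transactions, max_size=2):
    """Map each itemset (up to max_size items) to the fraction of transactions containing it."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items()} if n else {}


def support_difference(cluster_a, cluster_b, max_size=2):
    """Assumed toy objective: sum of absolute support differences over all observed itemsets."""
    sa = itemset_supports(cluster_a, max_size)
    sb = itemset_supports(cluster_b, max_size)
    return sum(abs(sa.get(s, 0.0) - sb.get(s, 0.0)) for s in set(sa) | set(sb))


def best_bisection(transactions, max_size=2):
    """Exhaustively search two-way splits for the highest score (toy-sized inputs only)."""
    n = len(transactions)
    best_score, best_split = -1.0, None
    # The last transaction always stays on the right, which skips mirrored splits
    # and guarantees that both sides are non-empty.
    for mask in range(1, 2 ** (n - 1)):
        left = [transactions[i] for i in range(n) if (mask >> i) & 1]
        right = [transactions[i] for i in range(n) if not (mask >> i) & 1]
        score = support_difference(left, right, max_size)
        if score > best_score:
            best_score, best_split = score, (left, right)
    return best_score, best_split


if __name__ == "__main__":
    # Hypothetical Web sessions, each a tuple of visited page categories.
    sessions = [("news", "sports"), ("news", "sports"), ("shop", "cart"), ("shop", "cart")]
    score, (left, right) = best_bisection(sessions)
    print(score, left, right)
```

On the toy sessions above, the highest-scoring split separates the two news/sports sessions from the two shop/cart sessions, because no itemset is shared across the two groups. A hierarchical method could apply such a split recursively to each resulting cluster; the exhaustive search shown here grows exponentially with the number of transactions and only serves to make the objective concrete.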
Index Terms: Data mining, clustering, classification, association rules, Web mining.
Yinghui Yang, Balaji Padmanabhan, "GHIC: A Hierarchical Pattern-Based Clustering Algorithm for Grouping Web Transactions", IEEE Transactions on Knowledge & Data Engineering, vol. 17, no. 9, pp. 1300-1304, September 2005, doi:10.1109/TKDE.2005.145