Issue No.03 - March (2009 vol.21)
pp: 305-320
Tianyi Jiang, Stern School of Business/New York University, New York
Alexander Tuzhilin, New York University, New York
ABSTRACT
On the Web, where search costs are low and the competition is just a mouse click away, it is crucial to segment customers intelligently in order to offer them more personalized products and services. Traditionally, customer segmentation is achieved using statistics-based methods that compute a set of statistics from the customer data and group customers into segments by applying distance-based clustering algorithms in the space of these statistics. In this paper, we present a direct grouping-based approach to computing customer segments that groups customers by optimally combining the transactional data of several customers to build a predictive model of customer behavior for each group. We consider customer segmentation as a combinatorial optimization problem of finding the best partitioning of the customer base into disjoint groups and show that finding an optimal customer partition is NP-hard. We propose several suboptimal direct grouping segmentation methods and empirically compare them against traditional statistics-based hierarchical segmentation, affinity propagation-based segmentation, and 1-to-1 methods across multiple experimental conditions. We show that the best direct grouping method builds mostly small customer segments and significantly dominates the statistics-based and 1-to-1 approaches across most of the experimental conditions, while still being computationally tractable.
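As a rough illustration of the search space behind the partitioning problem described in the abstract: the number of ways to split n customers into disjoint, non-empty groups is the Bell number B(n). The Python sketch below computes these counts with the Bell triangle. It is not taken from the paper, the function name `bell_numbers` is hypothetical, and the count only suggests why exhaustive enumeration is impractical; it does not by itself establish the NP-hardness result.

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)], where B(n) counts the partitions
    of an n-element set into disjoint non-empty groups."""
    bells = [1]  # B(0) = 1: the empty set has exactly one partition
    row = [1]    # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # each later entry adds the entry to its left and the one above that.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    # Number of candidate partitions for small customer bases.
    for n, count in enumerate(bell_numbers(10)):
        print(f"{n} customers -> {count} possible partitions")
```

Already at 10 customers there are 115,975 candidate partitions, which is consistent with the abstract's emphasis on suboptimal but tractable grouping methods.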
INDEX TERMS
Personalization; Clustering, classification, and association rules; Data mining; Clustering
CITATION
Tianyi Jiang, Alexander Tuzhilin, "Improving Personalization Solutions through Optimal Segmentation of Customer Bases", IEEE Transactions on Knowledge & Data Engineering, vol.21, no. 3, pp. 305-320, March 2009, doi:10.1109/TKDE.2008.163