Issue No. 06 - November/December (2004 vol. 19)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2004.59
Dmitry Pavlov, Yahoo
Eren Manavoglu, Pennsylvania State University
David M. Pennock, Yahoo Research Labs
C. Lee Giles, Pennsylvania State University
The authors describe a novel maximum-entropy (maxent) approach for generating online recommendations as a user navigates through a collection of documents. They show how to handle high-dimensional sparse data by representing it as a collection of ordered sequences of document requests. This representation and the maxent approach have several advantages: (1) long-term interactions and dependencies in the data sequences can be modeled naturally; (2) the model can be queried quickly once it is learned, which makes the method applicable to high-volume Web servers; and (3) the resulting recommendations are empirically of high quality. Although maxent learning is computationally infeasible if implemented in the straightforward way, the authors use data clustering and several algorithmic techniques to make learning practical even in high dimensions. They present several methods for combining the predictions of maxent models learned in different clusters. They conducted offline tests using over six months' worth of data from ResearchIndex, a popular online repository of over 470,000 computer science documents. They show that their maxent algorithm is one of the most accurate recommenders when compared with techniques such as correlation, a mixture of Markov models, a mixture of multinomial models, the individual similarity-based recommenders currently available on ResearchIndex, and even various combinations of the current ResearchIndex recommenders.
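As a rough illustration of the idea summarized above, the following is a minimal sketch of a conditional maxent model that predicts the next requested document from the documents seen so far in a session. It is not the authors' implementation: the feature set (one indicator per pair of history document and candidate document, plus a per-document bias), the plain gradient-ascent training loop, and the names `train_maxent` and `predict` are all assumptions made for this example, and it omits the clustering and speed-up techniques the abstract mentions.

```python
import math
from collections import defaultdict


def predict(w, hist, n_docs):
    """Return P(d | hist) for every document d under the maxent (softmax) model."""
    scores = [
        w.get(("bias", d), 0.0) + sum(w.get((h, d), 0.0) for h in hist)
        for d in range(n_docs)
    ]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train_maxent(sequences, n_docs, epochs=200, lr=0.5):
    """Fit feature weights by gradient ascent on the conditional log-likelihood.

    Assumed features (hypothetical, for illustration only): an indicator for
    each (document in history, next document) pair, plus a bias per document.
    """
    # Turn each ordered sequence into (history set, next document) examples.
    examples = []
    for seq in sequences:
        for i in range(1, len(seq)):
            examples.append((frozenset(seq[:i]), seq[i]))

    w = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for hist, target in examples:
            probs = predict(w, hist, n_docs)
            for d in range(n_docs):
                # Gradient = observed feature count minus expected feature count.
                err = (1.0 if d == target else 0.0) - probs[d]
                grad[("bias", d)] += err
                for h in hist:
                    grad[(h, d)] += err
        for key, g in grad.items():
            w[key] += lr * g / len(examples)
    return w


# Toy usage: four short sessions over documents 0..3.
sessions = [[0, 1, 2], [0, 1, 2], [3, 1, 2], [0, 3]]
weights = train_maxent(sessions, n_docs=4)
ranking = predict(weights, frozenset({0, 1}), n_docs=4)
```

Once the weights are learned, a query is a single pass over the candidate documents, which is consistent with the abstract's point that the model can be queried quickly after training.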
recommender systems, maximum entropy model, sequence modeling, mixture models
D. Pavlov, E. Manavoglu, D. M. Pennock and C. L. Giles, "Collaborative Filtering with Maximum Entropy," in IEEE Intelligent Systems, vol. 19, no. 6, pp. 40-48, 2004.