Web-Log Mining for Predictive Web Caching
July/August 2003 (vol. 15 no. 4)
pp. 1050-1053

Abstract—Caching is a well-known strategy for improving the performance of Web-based systems. The heart of a caching system is its page replacement policy, which selects the pages to be replaced in a cache when a request arrives. In this paper, we present a Web-log mining method for caching Web objects and use this algorithm to enhance the performance of Web caching systems. In our approach, we develop an n-gram-based prediction algorithm that can predict future Web requests. The prediction model is then used to extend the well-known GDSF caching policy. We empirically show that the system performance is improved using the predictive-caching approach.
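
As an illustration of the two ingredients named in the abstract, the following minimal Python sketch pairs a simple n-gram next-request predictor with a GDSF-style priority key. It is not the authors' implementation. The function names (train_ngram, predict_next, gdsf_key), the session format (lists of page identifiers), and in particular the way the predicted value is added to the observed frequency are assumptions made for this example. The base key K = L + F * C / S follows the usual statement of GDSF, where L is the cache's inflation (clock) value, F the access frequency, C the retrieval cost, and S the object size.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, n=2):
    """Count which page follows each run of n consecutive requests."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - n):
            context = tuple(session[i:i + n])
            model[context][session[i + n]] += 1
    return model


def predict_next(model, recent, n=2):
    """Return {page: estimated probability} for the request after `recent`."""
    counts = model.get(tuple(recent[-n:]))
    if not counts:
        return {}  # context never seen in the training log
    total = sum(counts.values())
    return {page: c / total for page, c in counts.items()}


def gdsf_key(clock, freq, cost, size, predicted=0.0):
    """GDSF priority K = L + F * C / S.

    `predicted` is a hypothetical extra term (an assumption of this sketch)
    that raises the frequency of pages the model expects to be requested.
    """
    return clock + (freq + predicted) * cost / size


# Toy log: each inner list is one user session (hypothetical data).
sessions = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["x", "a", "b", "c"],
]
model = train_ngram(sessions, n=2)
probs = predict_next(model, ["a", "b"], n=2)

# Page "c" is likely to be requested next, so its key rises above plain GDSF.
plain_key = gdsf_key(clock=0.0, freq=2, cost=1.0, size=4.0)
boosted_key = gdsf_key(clock=0.0, freq=2, cost=1.0, size=4.0,
                       predicted=probs.get("c", 0.0))
```

In a cache built on this sketch, the object with the smallest key would be evicted first, so a page with a high predicted probability is retained longer than its observed frequency alone would justify.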

Index Terms:
Web log mining, Web caching, prediction models, classification.
Qiang Yang, Haining Henry Zhang, "Web-Log Mining for Predictive Web Caching," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 4, pp. 1050-1053, July-Aug. 2003, doi:10.1109/TKDE.2003.1209022