Characterizing Web Usage Regularities with Information Foraging Agents
May 2004 (vol. 16 no. 5)
pp. 566-584

Abstract: Researchers have recently discovered several interesting, self-organized regularities in the World Wide Web, ranging from the structure and growth of the Web to the access patterns in Web surfing. A major remaining challenge in Web log mining is to explain the user behavior underlying the observed Web usage regularities. In this paper, we address the issue of how to characterize the strong regularities in Web surfing in terms of user navigation strategies, and present an information foraging agent-based approach to describing user behavior. By experimenting with agent-based decision models of Web surfing, we aim to explain how certain Web design factors, as well as user cognitive factors, may affect the overall behavioral patterns in Web usage.

Index Terms:
Web log, Web mining, power law, regularities, user behavior, decision models, information foraging, autonomous agents, agent-based simulation.
Citation:
Jiming Liu, Shiwu Zhang, Jie Yang, "Characterizing Web Usage Regularities with Information Foraging Agents," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 5, pp. 566-584, May 2004, doi:10.1109/TKDE.2004.1277818