CDNs Content Outsourcing via Generalized Communities
January 2009 (vol. 21 no. 1)
pp. 137-151
Dimitrios Katsaros, University of Thessaly, Volos, and Aristotle University of Thessaloniki, Thessaloniki
George Pallis, Aristotle University of Thessaloniki, Thessaloniki
Konstantinos Stamos, Aristotle University of Thessaloniki, Thessaloniki
Athena Vakali, Aristotle University of Thessaloniki, Thessaloniki
Antonis Sidiropoulos, Aristotle University of Thessaloniki, Thessaloniki
Yannis Manolopoulos, Aristotle University of Thessaloniki, Thessaloniki
Content Distribution Networks (CDNs) balance cost and quality in content delivery services. Devising an efficient content outsourcing policy is crucial, since such policies allow CDN providers to deliver client-tailored content, improve performance, and achieve significant economic gains. Earlier content outsourcing approaches often prove ineffective because they drive prefetching decisions by assuming knowledge of content popularity statistics, which are not always available and are extremely volatile. This work addresses the issue by proposing a novel self-adaptive technique, under a CDN framework, in which outsourced content is identified with no a priori knowledge of (earlier) request statistics. This is achieved with a structure-based approach that identifies coherent clusters of "correlated" Web server content objects, the so-called Web page communities. These communities are the core outsourcing unit. Detailed simulation experiments show that the proposed technique is robust and effective in reducing user-perceived latency compared with competing approaches, namely two community-based approaches, Web caching, and a non-CDN setup.
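
The idea of outsourcing structurally related pages as a unit, without any request statistics, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: connected components of the hyperlink graph are used here only as a crude stand-in for Web page communities, and the function names, page paths, and surrogate names (`edge-1`, `edge-2`) are hypothetical.

```python
from collections import defaultdict, deque


def link_communities(links):
    """Group pages by link structure only (no request logs).

    Assumption: a "community" is approximated by a connected component
    of the undirected hyperlink graph, a simplification for illustration.
    """
    neighbours = defaultdict(set)
    for src, dst in links:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    communities = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects every page reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            page = queue.popleft()
            members.append(page)
            for nxt in sorted(neighbours[page]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        communities.append(sorted(members))
    return communities


def assign_to_surrogates(communities, surrogates):
    """Place each community, as a whole, on one surrogate (round-robin)."""
    placement = {name: [] for name in surrogates}
    for i, community in enumerate(communities):
        placement[surrogates[i % len(surrogates)]].extend(community)
    return placement


# Hypothetical site structure: (source page, linked page) pairs.
links = [("/index", "/news"), ("/news", "/news/a"), ("/shop", "/shop/cart")]
communities = link_communities(links)
placement = assign_to_surrogates(communities, ["edge-1", "edge-2"])
```

In this sketch the community, rather than the individual page, is the unit that gets replicated, which mirrors the abstract's description of communities as the core outsourcing unit. A real policy would also have to account for surrogate storage capacity and for communities that overlap.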

Index Terms:
Communication/Networking and Information Technology, Information Search and Retrieval, Information Storage, Systems and Software
Citation:
Dimitrios Katsaros, George Pallis, Konstantinos Stamos, Athena Vakali, Antonis Sidiropoulos, Yannis Manolopoulos, "CDNs Content Outsourcing via Generalized Communities," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 1, pp. 137-151, Jan. 2009, doi:10.1109/TKDE.2008.92