
Issue No.06 - June (2012 vol.23)

pp: 977-984

Eric M. Heien, Laboratoire LIG, ENSIMAG, Antenne de Montbonnot, Montbonnot Saint Martin

Derrick Kondo, Laboratoire LIG, ENSIMAG, Antenne de Montbonnot, Montbonnot Saint Martin

David P. Anderson, University of California Berkeley, Berkeley

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.251

ABSTRACT

Understanding and modeling resources of Internet end hosts are essential for the design of desktop software and Internet-distributed applications. In this paper, we develop a correlated resource model of Internet end hosts based on real-trace data taken from several volunteer computing projects, including SETI@home. These data cover a five-year period with statistics for 6.7 million hosts. Our resource model is based on statistical analysis of host computational power, memory, and storage as well as how these resources change over time and the correlations among them. We find that resources with few discrete values (core count, memory) are well modeled by approximations governing the change of relative resource quantities over time. Resources with a continuous range of values (cache, processor speed, and available disk space) are well modeled by correlated log-normal distributions. We validate and show the utility of the model by applying it to a resource allocation problem for Internet-distributed applications, and compare it to other models. We also make our trace data and tool for automatically generating realistic Internet end hosts publicly available.
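To make the idea of correlated log-normal resources concrete, the following is a minimal sketch of one common way to draw such samples: generate independent standard normal values, correlate them with a Cholesky factor of a correlation matrix, then exponentiate. It is not the authors' tool. The function names (`cholesky`, `sample_hosts`) and every number in `MU`, `SIGMA`, and `CORR` are hypothetical placeholders chosen for illustration, not values fitted in the paper.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L such that L * L^T equals the given
    symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_hosts(n, mu, sigma, corr, seed=0):
    """Draw n synthetic hosts; each host is a list of positive resource
    values whose logarithms are jointly normal with correlation corr."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    dim = len(mu)
    hosts = []
    for _ in range(n):
        # Independent standard normal draws.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Mix them so the components have the requested correlation.
        y = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        # Shift, scale, and exponentiate to obtain log-normal values.
        hosts.append([math.exp(mu[i] + sigma[i] * y[i]) for i in range(dim)])
    return hosts


# Hypothetical parameters (NOT from the paper): log-scale means and
# standard deviations for cache, processor speed, and available disk.
MU = [math.log(512.0), math.log(2000.0), math.log(50.0)]
SIGMA = [0.8, 0.4, 1.2]
CORR = [
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

if __name__ == "__main__":
    for cache, speed, disk in sample_hosts(5, MU, SIGMA, CORR):
        print(f"cache={cache:.1f} speed={speed:.1f} disk={disk:.1f}")
```

Correlation is imposed on the logarithms of the resources here, which is an assumption of this sketch; a fitted model would supply its own parameters and could define the dependence differently.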

INDEX TERMS

Resource model, host model, Internet end host, resource scheduling, Internet computing, volunteer computing.

CITATION

Eric M. Heien, Derrick Kondo, David P. Anderson, "A Correlated Resource Model of Internet End Hosts",

*IEEE Transactions on Parallel & Distributed Systems*, vol. 23, no. 6, pp. 977-984, June 2012, doi:10.1109/TPDS.2011.251