Design, Implementation, and Performance of an Extensible Toolkit for Resource Prediction in Distributed Systems
February 2006 (vol. 17 no. 2)
pp. 160-173

Abstract—RPS is a publicly available toolkit that allows a practitioner to straightforwardly create flexible online and offline resource prediction systems in which resources are represented by independent, periodically sampled, scalar-valued measurement streams. The systems predict the future values of such streams from past values and are composed at runtime out of a large and extensible set of communicating components that are in turn constructed using RPS's extensible sensor, prediction, wavelet, and communication libraries. This paper describes the design, implementation, and performance of RPS. We have used RPS extensively to evaluate predictive models and build online prediction systems for host load, Windows performance data, and network bandwidth. The computation and communication overheads involved in such systems are quite low.
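As a rough illustration of the kind of prediction the abstract describes (forecasting future values of one periodically sampled, scalar-valued stream from its past values), the following minimal sketch fits a first-order autoregressive model over a sliding window. It is not RPS code and does not use the RPS API; the class name, the `window` parameter, and the choice of an AR(1) model are assumptions made purely for illustration.

```python
from collections import deque


class AR1StreamPredictor:
    """Toy online predictor for one periodically sampled scalar stream.

    Hypothetical illustration only; not part of the RPS toolkit.
    """

    def __init__(self, window=64):
        # Keep only the most recent `window` samples (assumed parameter).
        self.samples = deque(maxlen=window)

    def observe(self, value):
        """Feed the next measurement from the stream."""
        self.samples.append(float(value))

    def predict(self, horizon=1):
        """Return predictions for the next `horizon` sample periods."""
        xs = list(self.samples)
        if not xs:
            raise ValueError("no samples observed yet")
        mean = sum(xs) / len(xs)
        # Least-squares AR(1) coefficient on the mean-removed window.
        num = sum((a - mean) * (b - mean) for a, b in zip(xs[1:], xs[:-1]))
        den = sum((b - mean) ** 2 for b in xs[:-1])
        phi = num / den if den > 0 else 0.0
        last = xs[-1] - mean
        # k-step-ahead forecast decays toward the window mean as phi**k.
        return [mean + (phi ** k) * last for k in range(1, horizon + 1)]
```

In use, a sensor loop would call `observe` once per sample period and `predict` whenever a forecast is needed; refitting on every call keeps the sketch short at the cost of repeated work.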

Index Terms:
Distributed systems, performance of systems.
Citation:
Peter A. Dinda, "Design, Implementation, and Performance of an Extensible Toolkit for Resource Prediction in Distributed Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 2, pp. 160-173, Feb. 2006, doi:10.1109/TPDS.2006.24