Low-Cost Static Performance Prediction of Parallel Stochastic Task Compositions
January 2006 (vol. 17 no. 1)
pp. 78-91
Hasyim Gautama
Arjan J.C. van Gemund, IEEE Computer Society

Abstract: Current analytic solutions for the execution time distribution of a parallel composition of tasks with stochastic execution times are computationally complex, except for a limited number of distributions. In this paper, we present an analytical solution based on approximating execution time distributions in terms of their first four statistical moments. This low-cost approach allows the parallel execution time distribution to be approximated at ultra-low solution complexity for a wide range of execution time distributions. The accuracy of our method is experimentally evaluated for synthetic distributions as well as for task execution time distributions found in real parallel programs and kernels (NAS-EP, SSSP, APSP, Splash2-Barnes, PSRS, and WATOR). Our experiments show that the prediction error of the mean parallel execution time for N-ary parallel composition is on the order of a few percent, provided the task execution time distributions are sufficiently independent and unimodal.
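
To make the quantities in the abstract concrete, the following Python sketch estimates the first four moments of the execution time of an N-ary parallel composition, taken here as the maximum of N independent task times. It is not the analytical method of the paper: it is a brute-force Monte Carlo baseline of the kind a moment-based prediction could be checked against. The exponential task time distribution, the helper names, and the sample sizes are assumptions chosen for illustration. Exponential tasks with rate 1 are used because the mean of the maximum of N such tasks is known exactly (the N-th harmonic number), which gives a reference value for the relative error.

```python
import random


def four_moments(samples):
    """Return (mean, variance, skewness, kurtosis) of a sample.

    Variance, skewness, and kurtosis are population (biased) estimates
    built from the central moments m2, m3, and m4.
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2


def parallel_time_samples(n_tasks, draw_task_time, n_runs, rng):
    """Sample the parallel execution time: the slowest of n_tasks tasks.

    draw_task_time is a hypothetical callback that returns one task
    execution time drawn with the supplied random generator.
    """
    return [
        max(draw_task_time(rng) for _ in range(n_tasks))
        for _ in range(n_runs)
    ]


def harmonic(n):
    """Exact mean of the maximum of n i.i.d. exponential(1) variables."""
    return sum(1.0 / k for k in range(1, n + 1))


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 8                   # assumed number of parallel tasks
runs = parallel_time_samples(N, lambda r: r.expovariate(1.0), 20000, rng)

mean, var, skew, kurt = four_moments(runs)
exact_mean = harmonic(N)
rel_error = abs(mean - exact_mean) / exact_mean

print(f"moments of parallel time: mean={mean:.3f} var={var:.3f} "
      f"skew={skew:.3f} kurt={kurt:.3f}")
print(f"exact mean={exact_mean:.3f}, relative error={rel_error:.2%}")
```

Replacing the lambda with another sampler shows how the four moments of the parallel time shift with the shape of the task time distribution. The independence assumption is built into the sampling loop, which matches the condition stated at the end of the abstract.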

Index Terms:
Performance prediction, stochastic graphs, workload distribution.
Citation:
Hasyim Gautama, Arjan J.C. van Gemund, "Low-Cost Static Performance Prediction of Parallel Stochastic Task Compositions," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 1, pp. 78-91, Jan. 2006, doi:10.1109/TPDS.2006.13