Computing Performance Bounds of Fork-Join Parallel Programs Under a Multiprocessing Environment
March 1998 (vol. 9 no. 3)
pp. 295-311

Abstract: We study a multiprocessing computer system that accepts parallel programs with a fork-join computational paradigm. The system is modeled as K homogeneous servers, each with an infinite-capacity queue. Parallel programs arrive according to a series-parallel phase-type interarrival process with mean arrival rate λ. Upon arrival, a program forks into K independent tasks, and each task is assigned to a unique server. Each task's service time has a k-stage Erlang distribution with mean 1/μ. A parallel program completes when its last task completes. This kind of queueing model has no known closed-form solution in the general (K ≥ 2) case. In this paper, we show that by carefully modifying the arrival and service distributions at some imbedded points in time, we can obtain tight performance bounds. We also provide a computationally efficient algorithm for obtaining upper and lower bounds on the expected response time. The methodology is flexible and allows one to trade off the tightness of the bounds against computational cost.
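The model described in the abstract can be illustrated with a small discrete-event simulation. The sketch below is not the bounding algorithm from the paper; it only estimates the mean response time of the fork-join system by simulation. It assumes Poisson arrivals (the simplest special case of a phase-type interarrival process) and first-come-first-served service at each server. The function name `simulate_fork_join` and its parameters are hypothetical, chosen for illustration.

```python
import random


def simulate_fork_join(lam, mu, K, k, n_jobs=50_000, seed=0):
    """Estimate the mean response time of a K-server fork-join queue.

    Assumptions (not taken from the paper): Poisson arrivals with rate lam
    and FCFS service at every server. Each task's service time is k-stage
    Erlang with mean 1/mu, as in the abstract.
    """
    rng = random.Random(seed)
    arrival = 0.0
    last_departure = [0.0] * K  # departure time of the previous task at each server
    total_response = 0.0

    for _ in range(n_jobs):
        # Next program arrives after an exponential interarrival time.
        arrival += rng.expovariate(lam)
        job_done = arrival
        for j in range(K):
            # Erlang(k) with mean 1/mu is Gamma(shape=k, scale=1/(k*mu)).
            service = rng.gammavariate(k, 1.0 / (k * mu))
            # The task starts once it has arrived and the server is free.
            start = max(arrival, last_departure[j])
            last_departure[j] = start + service
            # The program completes when its last task completes (the join).
            job_done = max(job_done, last_departure[j])
        total_response += job_done - arrival

    return total_response / n_jobs


if __name__ == "__main__":
    for K in (1, 2, 4):
        est = simulate_fork_join(lam=0.5, mu=1.0, K=K, k=2)
        print(f"K={K}: estimated mean response time = {est:.3f}")
```

With K = 1 and k = 1 the model reduces to an M/M/1 queue, whose mean response time is 1/(μ − λ), so that case gives a quick sanity check on the simulator. An estimate of this kind could also be compared against computed upper and lower bounds, though it carries sampling error that the bounds do not.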


Index Terms:
High performance computing, performance evaluation, performance modeling methodology, analysis of multiprocessing systems.
John C.S. Lui, Richard R. Muntz, Don Towsley, "Computing Performance Bounds of Fork-Join Parallel Programs Under a Multiprocessing Environment," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 3, pp. 295-311, March 1998, doi:10.1109/71.674321