A Tool to Help Tune where Computation Is Performed
July 2001 (vol. 27 no. 7)
pp. 618-629

Abstract—We introduce a new performance metric, called Load Balancing Factor (LBF), to assist programmers with evaluating different tuning alternatives. The LBF metric differs from traditional performance metrics since it is intended to measure the performance implications of a specific tuning alternative rather than quantifying where time is spent in the current version of the program. A second unique aspect of the metric is that it provides guidance about moving work within a distributed or parallel program rather than reducing it. A variation of the LBF metric can also be used to predict the performance impact of changing the underlying network. The LBF metric is computed incrementally and online during the execution of the program to be tuned. We also present a case study that shows that our metric can accurately predict the actual performance gains for a test suite of six programs.

Index Terms:
Parallel and distributed computing, performance prediction, measurement, tools, online analysis.
Hyeonsang Eom, Jeffrey K. Hollingsworth, "A Tool to Help Tune where Computation Is Performed," IEEE Transactions on Software Engineering, vol. 27, no. 7, pp. 618-629, July 2001, doi:10.1109/32.935854