Modeling and Evaluating Design Alternatives for an On-Line Instrumentation System: A Case Study
June 1998 (vol. 24 no. 6)
pp. 451-470

Abstract: This paper demonstrates the use of a model-based evaluation approach for instrumentation systems (ISs). The overall objective of this study is to provide early feedback to tool developers regarding IS overhead and performance; such feedback helps developers make appropriate design decisions about alternative system configurations and task scheduling policies. We consider three types of system architectures: network of workstations (NOW), symmetric multiprocessors (SMP), and massively parallel processing (MPP) systems. We develop a Resource OCCupancy (ROCC) model for the on-line IS of an existing tool and parameterize it for an IBM SP-2 platform. This model is simulated to answer several "what if" questions regarding two policies for scheduling the forwarding of instrumentation data: collect-and-forward (CF) and batch-and-forward (BF). In addition, this study investigates two alternatives for forwarding the instrumentation data on an MPP system: direct forwarding and binary tree forwarding. Simulation results indicate that the BF policy can significantly reduce the overhead and that the tree forwarding configuration exhibits desirable scalability characteristics for MPP systems. Initial measurement-based testing results indicate a reduction of more than 60 percent in the direct IS overhead when the BF policy was added to the Paradyn parallel performance measurement tool.
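
As a rough illustration of the trade-offs named in the abstract, the sketch below counts forwarding operations under the CF and BF policies and compares fan-in for direct and binary tree forwarding. It is not the paper's ROCC model. The function names, the cost parameters, and the reading of CF as "one forwarding call per sample" and BF as "one call per batch" are assumptions made only for this example, and the costs are in arbitrary units.

```python
import math

def forward_calls(samples: int, batch_size: int = 1) -> int:
    """Forwarding calls needed to ship `samples` samples.

    batch_size == 1 stands in for collect-and-forward (CF);
    batch_size > 1 stands in for batch-and-forward (BF).
    """
    if samples < 0 or batch_size < 1:
        raise ValueError("samples must be >= 0 and batch_size >= 1")
    return math.ceil(samples / batch_size)

def forwarding_cost(samples: int, batch_size: int,
                    per_call_cost: float, per_sample_cost: float) -> float:
    """Hypothetical cost: a fixed cost per call plus a cost per sample."""
    calls = forward_calls(samples, batch_size)
    return calls * per_call_cost + samples * per_sample_cost

def max_fan_in(senders: int, tree: bool) -> int:
    """Largest number of senders any single receiver must serve.

    Direct forwarding: one collector receives from every sender.
    Binary tree forwarding: each node receives from at most two children.
    """
    return min(2, senders) if tree else senders

def tree_depth(nodes: int) -> int:
    """Depth of a complete binary tree holding `nodes` nodes (root = 0)."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return nodes.bit_length() - 1

if __name__ == "__main__":
    cf = forwarding_cost(1000, 1, per_call_cost=10.0, per_sample_cost=1.0)
    bf = forwarding_cost(1000, 32, per_call_cost=10.0, per_sample_cost=1.0)
    print(f"CF cost: {cf}, BF cost: {bf}")
    print(f"direct fan-in: {max_fan_in(255, tree=False)}, "
          f"tree fan-in: {max_fan_in(255, tree=True)}, "
          f"tree depth: {tree_depth(255)}")
```

Under these assumed costs, batching helps only to the extent that the fixed per-call cost dominates the per-sample cost. Tree forwarding bounds each receiver's fan-in, at the price of a number of hops that grows with the logarithm of the node count.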

Index Terms:
Instrumentation system, resource occupancy model, workload characterization, parallel tools, parallel and distributed system, monitoring, intrusion.
Abdul Waheed, Diane T. Rover, Jeffrey K. Hollingsworth, "Modeling and Evaluating Design Alternatives for an On-Line Instrumentation System: A Case Study," IEEE Transactions on Software Engineering, vol. 24, no. 6, pp. 451-470, June 1998, doi:10.1109/32.689402