A Subsystem-Oriented Performance Analysis Methodology for Shared-Bus Multiprocessors
July 1996 (vol. 7 no. 7)
pp. 755-767

Abstract—A methodology, called Subsystem Access Time (SAT) modeling, is proposed for the performance modeling and analysis of shared-bus multiprocessors. The methodology is subsystem-oriented because it is based on a Subsystem Access Time Per Instruction (SATPI) concept, in which we treat major components other than processors (e.g., off-chip cache, bus, memory, I/O) as subsystems and model for each of them the mean access time per instruction from each processor.

The SAT modeling methodology is derived from the Customized Mean Value Analysis (CMVA) technique, which is request-oriented in the sense that it models the weighted total mean delay for each type of request processed in the subsystems. The subsystem-oriented view of the proposed methodology facilitates divide-and-conquer modeling and bottleneck analysis, which have rarely been addressed in previous work. These distinguishing features lead to a simple, general, and systematic approach to the analytical modeling and analysis of complex multiprocessor systems.

To illustrate the key ideas and the features that differ from CMVA, an example performance model of a particular shared-bus multiprocessor architecture is presented. The model is used to conduct a performance evaluation for throughput prediction. In this evaluation, the SATPIs of the subsystems are used directly to identify the bottleneck subsystem and to find the requests or subsystem components that cause the bottleneck. Furthermore, the SATPIs of the subsystems are employed to explore the impact of several performance-influencing factors, including memory latency, number of processors, data bus width, and DMA transfer.
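As a rough illustration of how per-subsystem SATPIs can point to a bottleneck, the sketch below assumes that the mean time per instruction is the processor's own execution time plus the sum of the SATPIs, and that the bottleneck is the subsystem with the largest SATPI. This additive form, the function name `analyze`, and all numeric values are assumptions made for illustration; they are not taken from the paper's model, in which the SATPIs would themselves depend on contention among processors.

```python
def analyze(cpu_time, satpi, n_processors):
    """Combine per-subsystem access times into simple summary figures.

    cpu_time     -- assumed mean processor cycles per instruction,
                    excluding time spent waiting on subsystems
    satpi        -- dict mapping subsystem name to its mean access
                    time per instruction (cycles), taken as given
    n_processors -- number of processors in the system
    """
    # Assumed decomposition: processor time plus the sum of SATPIs.
    time_per_instruction = cpu_time + sum(satpi.values())
    # System throughput in instructions per cycle, over all processors.
    throughput = n_processors / time_per_instruction
    # The subsystem contributing the largest SATPI is flagged as bottleneck.
    bottleneck = max(satpi, key=satpi.get)
    # Fraction of each instruction's time attributed to each subsystem.
    shares = {name: t / time_per_instruction for name, t in satpi.items()}
    return time_per_instruction, throughput, bottleneck, shares


# Hypothetical SATPI values in cycles per instruction (not measured data).
satpi = {"cache": 0.25, "bus": 0.5, "memory": 1.0, "io": 0.25}
tpi, thr, bn, shares = analyze(cpu_time=2.0, satpi=satpi, n_processors=4)
print(tpi, thr, bn, shares["memory"])
```

With fixed inputs like these, rerunning `analyze` with a different memory SATPI or processor count gives only a first-order feel for the factors listed above; a faithful study would recompute the SATPIs from the queueing model for each configuration.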

Index Terms:
Bottleneck analysis, DMA transfer, performance analysis, separated address bus and data bus, shared-bus multiprocessor system, subsystem access time modeling, subsystem interferences.
Chiung-San Lee, Tai-Ming Parng, "A Subsystem-Oriented Performance Analysis Methodology for Shared-Bus Multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 7, pp. 755-767, July 1996, doi:10.1109/71.508254