Evaluating Design Choices for Shared Bus Multiprocessors in a Throughput-Oriented Environment
March 1992 (vol. 41 no. 3)
pp. 297-317

The authors evaluate design choices for multiprocessors with a single shared-bus interconnect operating in a throughput-oriented environment, in which each task executes on a single processor and the performance of the multiprocessor is measured by its overall throughput. To evaluate the design choices, they develop analytical models based on mean value analysis and validate them by comparing their results against those of a trace-driven simulation for 5376 multiprocessor configurations. The trace-driven simulation uses actual programs and simulates their execution in a throughput-oriented environment. The findings are: (1) the cache block sizes that yield the best performance in a multiprocessor differ from the block sizes that yield the best uniprocessor performance metrics; (2) a larger cache set associativity might be warranted in a multiprocessor even though it might not be warranted in a uniprocessor; (3) a split-transaction, pipelined bus yields much higher multiprocessor throughput than a circuit-switched bus, especially for larger main memory latencies; and (4) increasing the bus width appears to be an effective way of improving multiprocessor throughput.
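
For readers unfamiliar with mean value analysis, the following is a minimal sketch of the textbook exact MVA recursion for a closed, single-class queueing network, applied to a toy shared-bus system. It is not the authors' model, which the abstract does not specify in detail. The function name and the parameters `think_time` (mean processor time between bus requests) and `bus_demand` (mean bus service time per request) are assumptions introduced for illustration, and the bus is assumed to be a single queueing center with no cache, memory, or protocol effects.

```python
def bus_mva(num_processors, think_time, bus_demand):
    """Exact MVA for one delay center (the processors) and one
    queueing center (the shared bus). Hypothetical toy model.

    Returns (throughput, bus_response_time, mean_bus_queue_length),
    where throughput is bus requests completed per unit time.
    """
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")

    queue_len = 0.0  # mean number of requests at the bus with 0 processors
    throughput = 0.0
    response = 0.0
    for n in range(1, num_processors + 1):
        # Arrival theorem: an arriving request sees the mean queue
        # length of the same network with one fewer customer.
        response = bus_demand * (1.0 + queue_len)
        # Little's law applied to the whole network.
        throughput = n / (think_time + response)
        # Little's law applied to the bus.
        queue_len = throughput * response
    return throughput, response, queue_len


if __name__ == "__main__":
    # Illustrative values only: 9 time units of computation between
    # bus requests, 1 time unit of bus service per request.
    for n in (1, 2, 4, 8, 16, 32):
        x, r, q = bus_mva(n, think_time=9.0, bus_demand=1.0)
        print(f"N={n:2d}  throughput={x:.4f}  bus utilization={x * 1.0:.4f}")
```

In this toy model throughput grows almost linearly while the bus is lightly loaded and then flattens as bus utilization approaches 1, which is the kind of saturation behavior that makes bus design choices matter in a throughput-oriented setting.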


Index Terms:
shared bus multiprocessors; throughput-oriented environment; performance; overall throughput; design choices; mean value analysis analytical models; trace-driven simulation analysis; cache block sizes; cache set associativity; multiprocessor throughput; digital simulation; multiprocessing systems; performance evaluation.
M.-C. Chiang, G.S. Sohi, "Evaluating Design Choices for Shared Bus Multiprocessors in a Throughput-Oriented Environment," IEEE Transactions on Computers, vol. 41, no. 3, pp. 297-317, March 1992, doi:10.1109/12.127442