Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622) (2000)
Oct. 15, 2000 to Oct. 19, 2000
Babak Falsafi, Purdue University
Ilanthiraiyan Pragaspathy, Compaq Computer Corporation
Recent research suggests that DSM clusters can benefit from parallel coherence controllers. Parallel controllers require address partitioning and synchronization so that multiple coherence events for the same memory address are never handled simultaneously. This paper evaluates a spectrum of address partitioning schemes that vary in performance, hardware complexity, and cost. Dynamic partitioning minimizes load imbalance in controllers by using hardware address synchronizers to distribute the load among multiple protocol engines at runtime. Static partitioning assigns memory addresses to protocol engines at design time, which obviates the need for hardware synchronization but may lead to load imbalance among engines. We present simulation results indicating that: (i) dynamic partitioning performs best, speeding up application execution on an 8 × 8-way cluster by 62% on average with four-engine controllers compared to single-engine controllers; (ii) block-interleaved static partitioning using low-order address bits is an attractive alternative and performs close to dynamic partitioning when protocol occupancies are low or there is little queuing; and (iii) previously proposed static schemes that partition memory pages, either into home and remote engines or by low-order page address bits, result in high load imbalance in parallel controllers.
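The difference between block-interleaved and page-interleaved static partitioning can be illustrated with a minimal sketch. It is not the authors' simulator: the 64-byte block size, the 4 KB page size, and all function names are assumptions chosen for illustration, and only the four-engine count comes from the abstract. Each scheme selects a protocol engine from address bits, and a simple counter shows how a sequential sweep through one page spreads across engines.

```python
# Illustrative sketch only; sizes and names below are assumptions,
# not parameters taken from the paper.
BLOCK_BITS = 6    # assumed 64-byte coherence blocks
PAGE_BITS = 12    # assumed 4 KB pages
NUM_ENGINES = 4   # four-engine controller, as in the abstract


def block_interleaved_engine(addr, num_engines=NUM_ENGINES):
    """Pick an engine from the low-order bits of the block address."""
    return (addr >> BLOCK_BITS) % num_engines


def page_interleaved_engine(addr, num_engines=NUM_ENGINES):
    """Pick an engine from the low-order bits of the page address."""
    return (addr >> PAGE_BITS) % num_engines


def load_per_engine(addrs, engine_of, num_engines=NUM_ENGINES):
    """Count how many coherence events each engine would receive."""
    counts = [0] * num_engines
    for addr in addrs:
        counts[engine_of(addr)] += 1
    return counts


# One event per block while sweeping a single page.
sweep = list(range(0, 1 << PAGE_BITS, 1 << BLOCK_BITS))
block_load = load_per_engine(sweep, block_interleaved_engine)
page_load = load_per_engine(sweep, page_interleaved_engine)
```

Under these assumptions, `block_load` is `[16, 16, 16, 16]` while `page_load` is `[64, 0, 0, 0]`: accesses with page-level locality land on a single engine under page interleaving, which is consistent with the load imbalance the abstract reports for page-based static schemes.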
B. Falsafi and I. Pragaspathy, "Address Partitioning in DSM Clusters with Parallel Coherence Controllers," Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622) (PACT), Philadelphia, Pennsylvania, 2000, pp. 47.