Instruction Replication for Reducing Delays Due to Inter-PE Communication Latency
December 2005 (vol. 54 no. 12)
pp. 1496-1507
Aneesh Aggarwal, IEEE Computer Society
Manoj Franklin, IEEE Computer Society
As feature sizes shrink, wire delays are becoming critical. Clustering is a popular decentralization approach for reducing the impact of shrinking technologies on clock speed. In this approach, the centralized instruction window is replaced with multiple smaller windows, called clusters or processing elements (PEs). The performance of a clustered processor depends on the amount of inter-PE communication and load imbalance incurred by the algorithm that distributes instructions among the PEs. In this paper, we investigate a novel approach to reducing the impact of inter-PE communication latency while preserving good load balance. The basic idea is to selectively replicate instructions in those PEs where their results are required. Replication is guided by heuristics that weigh its potential benefits. We found that, with instruction replication, the IPC of a clustered processor is significantly higher than without it and is within just 8 percent of that of a superscalar configuration with a centralized instruction scheduler.
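The replication idea can be illustrated with a toy timing model. The sketch below is not the heuristic from the paper, which the abstract does not specify. It assumes a fixed inter-PE forwarding latency and a fixed execution latency, ignores load balance entirely, and uses hypothetical names (`Producer`, `should_replicate`, `COMM_LATENCY`, `EXEC_LATENCY`). It replicates a producer instruction in the consumer's PE only when re-executing it locally would deliver the result earlier than forwarding it.

```python
from dataclasses import dataclass

COMM_LATENCY = 2  # assumed cycles to forward a result between PEs
EXEC_LATENCY = 1  # assumed cycles to execute any instruction


@dataclass
class Producer:
    """Hypothetical record for an instruction whose result another PE needs."""
    home_pe: int         # PE the instruction was originally assigned to
    done_cycle: int      # cycle at which its result is ready in home_pe
    operand_ready: dict  # PE id -> cycle when all its source operands are in that PE


def arrival_without_replication(p: Producer, consumer_pe: int) -> int:
    """Cycle at which the result reaches consumer_pe if it is forwarded."""
    if consumer_pe == p.home_pe:
        return p.done_cycle
    return p.done_cycle + COMM_LATENCY


def arrival_with_replication(p: Producer, consumer_pe: int):
    """Cycle at which a local copy would finish, or None if operands never arrive."""
    ready = p.operand_ready.get(consumer_pe)
    if ready is None:
        return None
    return ready + EXEC_LATENCY


def should_replicate(p: Producer, consumer_pe: int) -> bool:
    """Toy rule: replicate only if the local copy beats the forwarded value."""
    if consumer_pe == p.home_pe:
        return False
    replicated = arrival_with_replication(p, consumer_pe)
    if replicated is None:
        return False
    return replicated < arrival_without_replication(p, consumer_pe)


# Operands are available in both PEs at cycle 3, so a copy in PE 1 finishes at
# cycle 4, while the forwarded result would arrive at cycle 4 + 2 = 6.
early = Producer(home_pe=0, done_cycle=4, operand_ready={0: 3, 1: 3})

# Operands reach PE 1 only at cycle 5, so a copy finishes at cycle 6, which is
# no better than forwarding.
late = Producer(home_pe=0, done_cycle=4, operand_ready={0: 3, 1: 5})
```

In this model `should_replicate(early, 1)` holds and `should_replicate(late, 1)` does not. A realistic heuristic would also have to account for the extra issue slots that each replica consumes in the consumer's PE, which is where the load-balance concern raised in the abstract comes in.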
Index Terms:
Clustered processors, instruction replication, interconnection latency, load balancing, task assignment.
Citation:
Aneesh Aggarwal, Manoj Franklin, "Instruction Replication for Reducing Delays Due to Inter-PE Communication Latency," IEEE Transactions on Computers, vol. 54, no. 12, pp. 1496-1507, Dec. 2005, doi:10.1109/TC.2005.197