Compiler-Assisted Data Distribution and Network Configuration for Chip Multiprocessors
Nov. 2012 (vol. 23 no. 11)
pp. 2058-2066
Yong Li, Comput. Eng. Program, Univ. of Pittsburgh, Pittsburgh, PA, USA
A. Abousamra, Dept. of Comput. Sci., Univ. of Pittsburgh, Pittsburgh, PA, USA
R. Melhem, Dept. of Comput. Sci., Univ. of Pittsburgh, Pittsburgh, PA, USA
A. K. Jones, Electr. & Comput. Eng., Univ. of Pittsburgh, Pittsburgh, PA, USA
Data access latency, a limiting factor in the performance of chip multiprocessors, grows significantly with the number of cores in nonuniform cache architectures with distributed cache banks. To mitigate this effect, we use a compiler-based approach to leverage data access locality, choose an optimized data placement, and efficiently configure the on-chip network. The proposed experimental compiler framework employs novel compilation techniques to discover and represent multithreaded memory access patterns (MMAPs). At runtime, symbolic MMAPs are resolved and used by a partitioning algorithm to choose a partition of allocated memory blocks among the forked threads in the analyzed application. This partition is used to enforce data ownership by associating the data with the core that executes the thread owning the data. Based on the partition, the communication pattern of the application can be extracted. We demonstrate how this information can be used in an experimental architecture to accelerate applications. In particular, our compiler-assisted data partitioning approach shows a 20 percent speedup over shared caching and a 5 percent speedup over the closest runtime approximation, first touch. By leveraging the communication pattern, we can achieve performance comparable to that of a system that uses a complex centralized network configuration system at runtime. Thus, our final system saves significant runtime complexity and achieves a 5.1 percent additional speedup through the addition of the reconfigurable network.
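The relationship between a data partition and the communication pattern derived from it can be illustrated with a small sketch. This is not the algorithm from the paper: it assumes that resolved access patterns are available as per-thread access counts for each memory block, and it uses a simple "most frequent accessor owns the block" rule as a stand-in for the partitioning step. The names partition_blocks, communication_pattern, and the block labels are hypothetical.

```python
from collections import defaultdict

def partition_blocks(access_counts):
    """Assign each block to the thread that accesses it most.

    Assumed stand-in for the paper's partitioning algorithm.
    Ties are broken in favor of the lowest thread id.
    """
    owner = {}
    for block, per_thread in access_counts.items():
        owner[block] = min(per_thread, key=lambda t: (-per_thread[t], t))
    return owner

def communication_pattern(access_counts, owner):
    """Count accesses each thread makes to blocks owned by another thread.

    Returns a dict mapping (requesting_thread, owning_thread) to the
    number of remote accesses, which a network configuration step
    could use to decide which core pairs deserve fast paths.
    """
    traffic = defaultdict(int)
    for block, per_thread in access_counts.items():
        home = owner[block]
        for thread, count in per_thread.items():
            if thread != home and count > 0:
                traffic[(thread, home)] += count
    return dict(traffic)

# Hypothetical access counts: block label -> {thread id: access count}
access_counts = {
    "A[0:64]": {0: 90, 1: 10},
    "A[64:128]": {0: 5, 1: 80, 2: 15},
    "B[0:64]": {2: 40, 1: 40},
}

owner = partition_blocks(access_counts)
traffic = communication_pattern(access_counts, owner)
print(owner)
print(traffic)
```

In this toy input, the heaviest remote traffic flows from thread 2 to the core running thread 1, which is the kind of pairing a reconfigurable network could prioritize.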
Index Terms:
Arrays, Instruction sets, Runtime, Benchmark testing, data partition, Circuit switching, network-on-chip, communication, data access pattern
Citation:
Yong Li, A. Abousamra, R. Melhem, A. K. Jones, "Compiler-Assisted Data Distribution and Network Configuration for Chip Multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 11, pp. 2058-2066, Nov. 2012, doi:10.1109/TPDS.2011.279