International Symposium on High Performance Computer Architecture (HPCA) (2001)
Nuevo León, Mexico
Jan. 20, 2001 to Jan. 24, 2001
Jaejin Lee, Michigan State University
Yan Solihin, Michigan State University
Josep Torrellas, University of Illinois at Urbana-Champaign
Abstract: This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high performance with this type of architecture, code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we show average speedups of 1.7 for numerical applications and 1.2 for non-numerical applications over a single host with plain memory. The speedups are very close to, and often higher than, the ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on intelligent memory systems.
Jaejin Lee, Yan Solihin, Josep Torrellas, "Automatically Mapping Code on an Intelligent Memory Architecture", International Symposium on High Performance Computer Architecture (HPCA), p. 121, 2001, doi:10.1109/HPCA.2001.903257