2007 IEEE International Symposium on Performance Analysis of Systems & Software
Simplifying Active Memory Clusters by Leveraging Directory Protocol Threads
San Jose, CA
April 25-April 27
ISBN: 1-4244-1081-9
D.D. Kalamkar, Syst. Res. Center, Intel Technol. India Pvt. Ltd., Bangalore
Address re-mapping techniques in so-called active memory systems have been shown to dramatically increase the performance of applications with poor cache and/or communication behavior on shared memory multiprocessors. However, these systems require custom hardware in the memory controller for cache line assembly/disassembly, address translation between re-mapped and normal addresses, and coherence logic. In this paper we make the important observation that on a traditional flexible distributed shared memory (DSM) multiprocessor node, equipped with a coherence protocol thread context as in SMTp or a simple dedicated in-order protocol processing core as in a CMP, the address re-mapping techniques can be implemented in software running on the protocol thread or core, without custom hardware in the memory controller, while still delivering high performance. We implement the active memory address re-mapping techniques of parallel reduction and matrix transpose (two popular kernels in scientific, multimedia, and data mining applications) on these systems, outline the novel coherence protocol extensions needed to make them run efficiently in software protocols, and evaluate these protocols on four different DSM multiprocessor architectures with multi-threaded and/or dual-core nodes. The proposed protocol extensions yield a speedup of 1.45 for parallel reduction and 1.29 for matrix transpose on a 16-node DSM multiprocessor when compared to non-active memory baseline systems, and achieve performance comparable to that of existing active memory architectures that rely on custom hardware in the memory controller.
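The general idea behind address re-mapping for matrix transpose can be illustrated with a minimal sketch. It is not the authors' protocol code: the function names, the flat row-major layout, and the word-indexed memory model are all assumptions made for illustration. The sketch translates an index in a re-mapped (transposed) view to the normal index in the original matrix, then assembles one cache line of the re-mapped view by gathering the corresponding words.

```python
from typing import List, Sequence


def remap_transpose(shadow_index: int, rows: int, cols: int) -> int:
    """Translate a re-mapped index to a normal index (hypothetical helper).

    The original matrix is rows x cols, stored row-major. The re-mapped view
    is its transpose (cols x rows), also row-major.
    """
    # Position (i, j) in the transposed view, which has `rows` columns.
    i, j = divmod(shadow_index, rows)
    # Element (i, j) of the transpose is element (j, i) of the original.
    return j * cols + i


def assemble_shadow_line(memory: Sequence[int], line_start: int,
                         line_words: int, rows: int, cols: int) -> List[int]:
    """Gather one cache line of the transposed view from normal memory.

    This stands in for the cache line assembly step; a real system would do
    it in the memory controller or, as the abstract proposes, in software on
    a protocol thread or core.
    """
    return [memory[remap_transpose(k, rows, cols)]
            for k in range(line_start, line_start + line_words)]


if __name__ == "__main__":
    # A 2 x 3 matrix stored row-major: [[10, 11, 12], [13, 14, 15]].
    matrix = [10, 11, 12, 13, 14, 15]
    print(assemble_shadow_line(matrix, 0, 6, rows=2, cols=3))
```

Reading through the re-mapped view gives the processor contiguous access to what would otherwise be strided data, which is the cache behavior improvement the abstract refers to. Keeping the re-mapped and normal copies coherent is the part that needs the protocol extensions, and it is not modeled here.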
Index Terms:
memory controller, active memory cluster, directory protocol thread, active memory address remapping, parallel reduction, matrix transpose, coherence protocol extension, software protocol, multiprocessor architecture, distributed shared memory, multi-threaded node, dual-core node, active memory architecture
Citation:
D.D. Kalamkar, M. Chaudhuri, M. Heinrich, "Simplifying Active Memory Clusters by Leveraging Directory Protocol Threads," ISPASS, pp. 242-253, 2007 IEEE International Symposium on Performance Analysis of Systems & Software, 2007