Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU
Found in: Parallel Architectures and Compilation Techniques, International Conference on
By Ziyu Guo, Eddy Zheng Zhang, Xipeng Shen
Issue Date: October 2011
pp. 310-319
Automatic compilation for multiple types of devices is important, especially given the current trends towards heterogeneous computing. This paper concentrates on some issues in compiling fine-grained SPMD-threaded code (e.g., GPU CUDA code) for multicore C...
The Significance of CMP Cache Sharing on Contemporary Multithreaded Applications
Found in: IEEE Transactions on Parallel and Distributed Systems
By Eddy Zheng Zhang, Yunlian Jiang, Xipeng Shen
Issue Date: February 2012
pp. 367-374
Cache sharing on modern Chip Multiprocessors (CMPs) reduces communication latency among corunning threads, and also causes interthread cache contention. Most previous studies on the influence of cache sharing have concentrated on the design or management o...
Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU
Found in: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming (PPoPP '13)
By Bo Wu, Eddy Zheng Zhang, Xipeng Shen, Yunlian Jiang, Zhijia Zhao
Issue Date: February 2013
pp. 57-68
The performance of Graphics Processing Units (GPUs) is sensitive to irregular memory references. Some recent work shows the promise of data reorganization for eliminating non-coalesced memory accesses that are caused by irregular references. However, all pre...