Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Cache-Optimal Methods for Bit-Reversals
Portland, Oregon, USA
November 13-18, 1999
ISBN: 1-58113-091-0
Zhao Zhang, College of William and Mary
Xiaodong Zhang, College of William and Mary
Bit-reversals are representative and important data reordering operations in many scientific computations, and they are often used repeatedly as fundamental subroutines in scientific programs. Their performance degradation is mainly caused by cache conflict misses. To obtain the best performance, cache-optimal methods must therefore be implemented carefully and precisely at the programming level. For some special programs, such as data reorderings, this type of performance programming may significantly outperform the optimization done by an automatic tool such as a compiler. In this paper, we examine different methods that use the techniques of blocking, buffering, and padding for efficient implementations. We evaluate the merits and limits of each technique, along with the application-dependent and architecture-dependent conditions for developing cache-optimal methods. We present two contributions in this paper: (1) Our integrated blocking methods, which match cache associativity and TLB cache size and which fully use the available registers, are cache-optimal and fast. (2) We show that our padding methods outperform other software-oriented methods, and we believe they are the fastest in terms of minimizing both CPU and memory access cycles. Since the padding methods are almost independent of hardware, they could be widely used on many uniprocessor workstations and SMP multiprocessors.
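For readers unfamiliar with the operation, the following is a minimal sketch of a bit-reversal reordering and of the general idea behind blocking. It is not the authors' implementation: the function names (`bit_reverse`, `bit_reversal_copy`, `blocked_bit_reversal_copy`) and the `tile_bits` parameter are hypothetical, the blocked variant only illustrates tiling and does not model cache associativity, TLB size, registers, buffering, or padding, and Python will not reproduce the cache behavior the abstract discusses.

```python
def bit_reverse(i, bits):
    """Return the value of i with its lowest `bits` bits reversed."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def _log2_exact(n):
    """Return log2(n), requiring n to be a power of two."""
    bits = n.bit_length() - 1
    if n == 0 or n != 1 << bits:
        raise ValueError("length must be a power of two")
    return bits


def bit_reversal_copy(x):
    """Naive out-of-place reordering: y[reverse(i)] = x[i]."""
    n = len(x)
    bits = _log2_exact(n)
    y = [None] * n
    for i in range(n):
        y[bit_reverse(i, bits)] = x[i]
    return y


def blocked_bit_reversal_copy(x, tile_bits):
    """Illustrative tiled variant (an assumption, not the paper's method).

    Split each index into a | b | c, where a is the top `tile_bits` bits,
    c is the low `tile_bits` bits, and b is the middle. The reversed index
    is rev(c) | rev(b) | rev(a), so for a fixed b the copy touches one
    2^tile_bits by 2^tile_bits tile of source and destination elements.
    """
    n = len(x)
    bits = _log2_exact(n)
    if tile_bits < 0 or 2 * tile_bits > bits:
        raise ValueError("tile_bits must satisfy 0 <= 2 * tile_bits <= log2(n)")
    mid_bits = bits - 2 * tile_bits
    tile = 1 << tile_bits
    rev_t = [bit_reverse(k, tile_bits) for k in range(tile)]
    shift_hi = mid_bits + tile_bits
    y = [None] * n
    for b in range(1 << mid_bits):
        rb = bit_reverse(b, mid_bits)
        for a in range(tile):
            for c in range(tile):
                src = (a << shift_hi) | (b << tile_bits) | c
                dst = (rev_t[c] << shift_hi) | (rb << tile_bits) | rev_t[a]
                y[dst] = x[src]
    return y
```

For an 8-element input `[0, 1, ..., 7]`, `bit_reversal_copy` returns `[0, 4, 2, 6, 1, 5, 3, 7]`. Both functions compute the same permutation; the tiled loop order only changes which elements are touched close together in time, which is the property that blocking techniques exploit in a compiled language.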
Citation:
Zhao Zhang, Xiaodong Zhang, "Cache-Optimal Methods for Bit-Reversals," sc, pp.26, Proceedings of the 1999 ACM/IEEE conference on Supercomputing, 1999