A Heuristic Storage for Minimizing Access Time of Arbitrary Data Patterns
April 1997 (vol. 8 no. 4)
pp. 441-447

Abstract—The serialization of memory accesses is a major limiting factor in high performance SIMD computers. The data patterns or templates that are accessed by a program can be perceived by the compiler, and, therefore, the design of dynamic storage schemes that minimize conflicts may dramatically improve performance.

The problem of finding storage schemes that minimize the access time of arbitrary sets of power-of-two data patterns is proved to be NP-complete. We propose linear address transformations that can be dynamically applied by each processing element for mapping array references onto memories. An efficient approach for combining the constraints of different access patterns into one single linear address transformation is presented. We prove that finding the transformation that minimizes the access time is reducible to N-coloring, where N is the number of parallel memories. Using coloring heuristics, storage schemes are investigated with respect to minimizing the implementation cost (perfect storage) and overall access conflicts (semiperfect storage).
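
As an illustration of what a linear address transformation does, the following minimal sketch maps each address to a memory module by taking the parity of selected address bits, which is a matrix-vector product over GF(2). It then measures the degree of conflict of an access pattern, taken here as the largest number of references that land in the same module. The function names, the 16-memory configuration, and the particular XOR masks are assumptions chosen for the example; they are not the transformations synthesized in the paper.

```python
from collections import Counter

M_BITS = 4                 # log2 of the number of memories (assumed: 16 memories)
N_MEMORIES = 1 << M_BITS


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def module_of(address: int, rows: list[int]) -> int:
    """Apply a Boolean linear transformation to an address.

    Each entry of `rows` is a bit mask standing for one row of the matrix;
    bit i of the module number is the parity of the selected address bits.
    """
    module = 0
    for i, mask in enumerate(rows):
        module |= parity(address & mask) << i
    return module


def degree_of_conflict(addresses: list[int], rows: list[int]) -> int:
    """Largest number of references that fall in the same memory module."""
    counts = Counter(module_of(a, rows) for a in addresses)
    return max(counts.values())


def stride_pattern(base: int, stride: int, count: int) -> list[int]:
    """Addresses base, base + stride, ..., for `count` elements."""
    return [base + k * stride for k in range(count)]


# Low-order interleaving: module = address mod 16 (identity on the low bits).
interleaved = [1 << i for i in range(M_BITS)]

# Hypothetical XOR scheme: module bit i = address bit i XOR address bit i + 4.
xor_scheme = [(1 << i) | (1 << (i + M_BITS)) for i in range(M_BITS)]

row_access = stride_pattern(0, 1, N_MEMORIES)      # stride 1
column_access = stride_pattern(0, 16, N_MEMORIES)  # stride 16

if __name__ == "__main__":
    for name, rows in (("interleaved", interleaved), ("xor", xor_scheme)):
        print(name,
              degree_of_conflict(row_access, rows),
              degree_of_conflict(column_access, rows))
```

Under these assumptions, interleaving serves the stride-1 pattern in one memory cycle but sends all 16 references of the stride-16 pattern to module 0, while the XOR masks give a degree of conflict of 1 for both patterns. Choosing one set of masks that works well for many patterns at once is the part the paper addresses with coloring heuristics, which this sketch does not attempt.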

Results show that perfect storage may deviate, on average, by 20% from the optimum access time in the case of 10 arbitrary data patterns and 16 memories. However, semiperfect schemes lead to a dramatic reduction in the degree of conflict compared to perfect schemes. The proposed heuristic storage largely outperforms interleaving and row-column-diagonal storage schemes. The method can be implemented as a compiler procedure for synthesizing storage schemes that promote parallel access to arbitrary sets of data patterns.


Index Terms:
Boolean matrices, heuristics, memory organization, NP-complete, parallel memories, performance evaluation, storage schemes.
Citation:
Mayez A. Al-Mouhamed, Steven S. Seiden, "A Heuristic Storage for Minimizing Access Time of Arbitrary Data Patterns," IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 4, pp. 441-447, April 1997, doi:10.1109/71.588625