Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems
January 1991 (vol. 2 no. 1)
pp. 43-51

A discussion is presented of the use of dynamic storage schemes to improve parallel memory performance during three important classes of data accesses: vector accesses in which multiple strides are used to access a single vector, block accesses, and constant-geometry FFT accesses. The schemes investigated are based on linear address transformations, also known as XOR schemes. It has been shown that this class of schemes can be implemented more efficiently in hardware and has more flexibility than schemes based on row rotations or other techniques. Several analytical results are shown. These include: quantitative analysis of buffering effects in pipelined memory systems; design rules for storage schemes that provide conflict-free access using multiple strides, blocks, and FFT access patterns; and an analysis of the effects of memory bank cycle time on storage scheme capabilities.
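
As a rough illustration of what a linear address transformation (XOR scheme) does, the sketch below compares plain low-order interleaving with one simple XOR mapping on a system of 8 banks. The mapping shown, bank = (low address bits) XOR (next-higher address bits), is an assumed textbook-style example and is not taken from the article; the function names and the parameter `n_bits` are likewise hypothetical.

```python
def interleaved_bank(addr: int, n_bits: int) -> int:
    """Low-order interleaving: the bank is the low n_bits of the address."""
    return addr & ((1 << n_bits) - 1)


def xor_bank(addr: int, n_bits: int) -> int:
    """Illustrative XOR scheme (an assumption, not the article's design):
    XOR the low n_bits of the address with the next n_bits above them.
    Each bank bit is a sum over GF(2) of address bits, so the mapping
    is a linear address transformation."""
    mask = (1 << n_bits) - 1
    return (addr & mask) ^ ((addr >> n_bits) & mask)


def banks_touched(bank_fn, start: int, stride: int, count: int, n_bits: int):
    """Banks referenced by `count` accesses beginning at `start` with `stride`."""
    return [bank_fn(start + i * stride, n_bits) for i in range(count)]


def is_conflict_free(banks) -> bool:
    """True if no bank appears twice in the access sequence."""
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    N_BITS = 3            # 2**3 = 8 memory banks
    N = 1 << N_BITS

    # Stride equal to the number of banks: every access hits bank 0
    # under plain interleaving, but the XOR mapping spreads them out.
    print(banks_touched(interleaved_bank, 0, N, N, N_BITS))  # [0, 0, 0, 0, 0, 0, 0, 0]
    print(banks_touched(xor_bank, 0, N, N, N_BITS))          # [0, 1, 2, 3, 4, 5, 6, 7]

    # Unit stride from an aligned start stays conflict-free under both.
    print(is_conflict_free(banks_touched(xor_bank, 0, 1, N, N_BITS)))  # True
```

The example only shows that a single XOR mapping can serve two different strides without bank conflicts; the multistride, block, and FFT design rules summarized in the abstract are not reproduced here.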

Index Terms:
fast Fourier transform; dynamic storage schemes; parallel memory performance; vector accesses; block accesses; constant-geometry FFT accesses; linear address transformations; XOR schemes; analytical results; quantitative analysis; buffering effects; pipelined memory systems; conflict-free access; memory bank cycle time; fast Fourier transforms; memory architecture
D.T. Harper, III, "Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 2, no. 1, pp. 43-51, Jan. 1991, doi:10.1109/71.80188