Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems
April 1999 (vol. 48 no. 4)
pp. 449-456

Abstract: The performance of a vector processor accessing vectors stored in memory depends strongly on the conflicts that arise in the memory subsystem, because these conflicts delay the work of the functional units. Conflicts can occur between elements of the same vector and between elements of different vector streams; the latter kind is known to be the main cause of lost cycles. This paper proposes an order for accessing the elements of a vector stream that reduces the average memory access time in vector processors when several vector streams are accessed concurrently. Under the proposed order, the memory system observes the same stride for all vector streams of a stride family. Conflicts between concurrent vector streams of the same family are eliminated completely if the rate at which memory modules are requested is less than or equal to their service rate. In other cases, the number of cycles lost to conflicts is dramatically reduced.
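To make the two kinds of conflict concrete, the following is a minimal sketch of a toy interleaved memory. It does not implement the access order proposed in the paper. It assumes low-order interleaving (module = address mod number of modules), in-order element access, one request per stream per cycle, and fixed priority between streams. The function names and the parameters `num_modules` and `busy_time` are hypothetical choices made for this illustration.

```python
from math import gcd


def modules_visited(base, stride, length, num_modules):
    """Module index of each element, assuming low-order interleaving."""
    return [(base + i * stride) % num_modules for i in range(length)]


def distinct_modules(stride, num_modules):
    """Number of distinct modules a stream with this stride touches."""
    return num_modules // gcd(num_modules, stride)


def lost_cycles(streams, num_modules, busy_time):
    """Count stall cycles for streams issued in order, one request per
    stream per cycle, with earlier streams in the list taking priority.

    Each stream is a list of module indices. A module that accepts a
    request stays busy for `busy_time` cycles (assumed model).
    """
    free_at = [0] * num_modules        # first cycle each module is free
    next_elem = [0] * len(streams)     # next element index per stream
    stalls = 0
    cycle = 0
    while any(n < len(s) for n, s in zip(next_elem, streams)):
        for k, stream in enumerate(streams):
            if next_elem[k] == len(stream):
                continue               # this stream has finished
            module = stream[next_elem[k]]
            if free_at[module] <= cycle:
                free_at[module] = cycle + busy_time
                next_elem[k] += 1
            else:
                stalls += 1            # module busy: the stream loses a cycle
        cycle += 1
    return stalls


if __name__ == "__main__":
    M, T = 8, 4  # 8 modules, each busy 4 cycles per access (assumed values)
    unit = modules_visited(0, 1, 16, M)
    same_module = modules_visited(0, 8, 4, M)
    print(distinct_modules(6, M))               # modules touched by stride 6
    print(lost_cycles([unit], M, T))            # single stream, stride 1
    print(lost_cycles([same_module], M, T))     # single stream, stride 8
    print(lost_cycles([unit, unit], M, T))      # two concurrent streams
```

In this toy model a single stride-8 stream hits the same module on every access, which is a conflict between elements of one vector, while two concurrent stride-1 streams starting at the same module interfere with each other, which is a conflict between vector streams.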


Index Terms:
Interleaved memory system, memory bandwidth, vector stream, concurrent access of vector streams, inter-vector-conflicts, hardware support.
Citation:
A.M. del Corral, J.M. Llaberia, "Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems," IEEE Transactions on Computers, vol. 48, no. 4, pp. 449-456, April 1999, doi:10.1109/12.762540