Conflict-Free Accesses to Strided Vectors on a Banked Cache
July 2005 (vol. 54 no. 7)
pp. 913-916
With the advance of integration technology, it has become feasible to implement a microprocessor, a vector unit, and a multimegabyte bank-interleaved L2 cache on a single die. Parallel access to strided vectors in the L2 cache is a major performance issue on such vector microprocessors. The main difficulty is that one would like to interleave the cache at block granularity, in order to benefit from spatial locality and to keep the tag volume low, while strided vector accesses naturally work at word granularity. In this paper, we address this issue. Considering a parallel vector unit with $2^n$ independent lanes, a $2^n$-bank interleaved cache, and a cache line size of $2^k$ words, we show that any slice of $2^{n+k}$ consecutive elements of any strided vector with stride $2^r R$, with $R$ odd and $r \leq k$, can be accessed in the L2 cache and routed back to the lanes in $2^k$ subslices of $2^n$ elements.
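
The counting side of this claim can be checked numerically. The sketch below is only an illustration under stated assumptions, not the scheme from the paper: it assumes a plain block-interleaved bank mapping (bank = word address divided by the line size, modulo the number of banks), and the names `bank_of` and `conflict_free_subslices` are hypothetical. It groups the elements of a slice by bank and, when every bank holds exactly 2^k of them, returns 2^k subslices that each touch every bank once. It does not model how elements are routed back to the lanes.

```python
from collections import defaultdict


def bank_of(addr, n, k):
    """Bank index of a word address, ASSUMING simple block interleaving:
    the low k bits select the word within a line, the next n bits the bank."""
    return (addr >> k) & ((1 << n) - 1)


def conflict_free_subslices(base, stride, n, k):
    """Split a slice of 2^(n+k) strided elements into 2^k subslices of 2^n
    element indices, each subslice hitting every bank exactly once.

    Returns None when the elements are not spread evenly over the banks
    under the assumed mapping.
    """
    per_bank = defaultdict(list)
    for i in range(1 << (n + k)):
        per_bank[bank_of(base + i * stride, n, k)].append(i)

    # An even spread means all 2^n banks are used, with 2^k elements each.
    if len(per_bank) != (1 << n):
        return None
    if any(len(elems) != (1 << k) for elems in per_bank.values()):
        return None

    # Subslice j takes the j-th element found in each bank.
    return [[per_bank[b][j] for b in range(1 << n)] for j in range(1 << k)]


if __name__ == "__main__":
    # 4 lanes/banks (n = 2), 4-word lines (k = 2), stride 6 = 2^1 * 3.
    for subslice in conflict_free_subslices(base=5, stride=6, n=2, k=2):
        print(subslice)
```

With n = 2 and k = 2, a stride of 6 (r = 1, R = 3) yields four subslices, while a stride of 8 (r = 3 > k) makes the function return `None`, because the 16 elements then land in only two of the four banks.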


Index Terms:
Vector microprocessor, strided vectors, conflict-free access, L2 caches.
Citation:
André Seznec, Roger Espasa, "Conflict-Free Accesses to Strided Vectors on a Banked Cache," IEEE Transactions on Computers, vol. 54, no. 7, pp. 913-916, July 2005, doi:10.1109/TC.2005.110