A Graphics Parallel Memory Organization Exploiting Request Correlations
June 2010 (vol. 59 no. 6)
pp. 762-775
George Lentaris, National Kapodistrian University of Athens, Athens
Dionysios Reisis, National Kapodistrian University of Athens, Athens
Real-time graphics applications require memory organizations that offer parallel pixel access at low implementation cost. This work builds on a nonlinear skew mapping scheme and exploits the correlation between consecutive pixel requests to design an efficient parallel memory organization. The mapping provides parallel access to mn pixels, in various shapes, from a memory organized in mn banks. The proposed design technique combines the properties of the mapping with the spatial correlations among pixel requests to eliminate conflicts, spending at most one extra cycle for every mn consecutive parallel pixel accesses. Consequently, the technique ensures that any pixel pattern, among those commonly used in graphics, can be accessed in a single cycle from any image location. Address computation is straightforward because the number of requested pixels and the number of banks, besides being equal, can both be powers of 2.
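To make the notion of a bank conflict concrete, the following is a minimal sketch of a classic linear skew mapping, which is not the nonlinear scheme of this paper. The values M = N = 4, the formula bank = (x + N*y) mod MN, and all function names are assumptions chosen for illustration. Under these assumptions, rows and M-by-N blocks of MN pixels fall into distinct banks from any image location, while columns collide; with a power-of-2 bank count the modulo reduces to a bit mask.

```python
# Illustrative linear skew only; NOT the nonlinear mapping proposed in the paper.
M, N = 4, 4              # assumed block height and width (powers of 2)
BANKS = M * N            # mn banks serve mn pixels per access

def bank(x, y):
    """Bank index of pixel (x, y); the mask replaces 'mod BANKS'."""
    return (x + N * y) & (BANKS - 1)

def conflict_free(pixels):
    """True if every requested pixel lies in a different bank."""
    banks = [bank(x, y) for x, y in pixels]
    return len(set(banks)) == len(banks)

def row(x0, y0):
    """1 x MN horizontal run starting at (x0, y0)."""
    return [(x0 + i, y0) for i in range(BANKS)]

def column(x0, y0):
    """MN x 1 vertical run starting at (x0, y0)."""
    return [(x0, y0 + j) for j in range(BANKS)]

def block(x0, y0):
    """M x N rectangle with top-left corner at (x0, y0)."""
    return [(x0 + i, y0 + j) for j in range(M) for i in range(N)]

if __name__ == "__main__":
    print("row   ", conflict_free(row(3, 5)))
    print("block ", conflict_free(block(7, 9)))
    print("column", conflict_free(column(2, 1)))
```

In this sketch a column touches only M distinct banks, so it would need several cycles to fetch. Access patterns of this kind are where a plain linear skew falls short.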

Index Terms:
Parallel processing, graphics processors, storage devices, interleaved memories.
George Lentaris, Dionysios Reisis, "A Graphics Parallel Memory Organization Exploiting Request Correlations," IEEE Transactions on Computers, vol. 59, no. 6, pp. 762-775, June 2010, doi:10.1109/TC.2010.48