Area Time Trade-Offs in Micro-Grain VLSI Array Architectures
October 1994 (vol. 43 no. 10)
pp. 1121-1128

We study the relative performance of three massively parallel, fine-grain, control-flow VLSI architectures: an associative memory architecture, a Mux-based SIMD architecture, and a modification of the Mux-based architecture that uses RAMs, making it suitable for systolic MIMD/MISD computation. All three are organized as two-dimensional, near-neighbor, mesh-connected arrays of processors, and they are very similar in their construction and in their control and data-flow requirements. The custom hardware for all three was built using the same technology. We compare and contrast the performance of the three architectures on a select set of applications, using the area-time product, AT, as the metric of computational power. The three designs are known to perform well in their niche applications, and we find that for non-niche applications all three are comparable in power to within a small constant factor. In terms of speed, the Mux-based SIMD architecture generally performs better than the other two; in the AT sense, however, the associative architecture outperforms the SIMD architecture for certain numeric applications such as the FFT and matrix multiplication.
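
As a minimal illustration of why a ranking by speed and a ranking by AT can disagree, the sketch below computes the area-time product for two designs. The design names and all area and time values are hypothetical placeholders in arbitrary units; they are not measurements from the paper and do not correspond to any of the three architectures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """A design point for one application (all values are hypothetical)."""
    name: str
    area: float  # silicon area, arbitrary units
    time: float  # execution time, arbitrary units

    @property
    def at(self) -> float:
        # Area-time product: lower is better.
        return self.area * self.time


def rank_by_time(designs):
    """Return designs sorted from fastest to slowest."""
    return sorted(designs, key=lambda d: d.time)


def rank_by_at(designs):
    """Return designs sorted from lowest to highest AT product."""
    return sorted(designs, key=lambda d: d.at)


# Placeholder numbers chosen only to show that the two rankings can differ.
designs = [
    Design("design_a", area=4.0, time=2.0),  # faster, but larger: AT = 8.0
    Design("design_b", area=1.5, time=3.0),  # slower, but smaller: AT = 4.5
]

fastest = rank_by_time(designs)[0]
best_at = rank_by_at(designs)[0]
print(f"fastest: {fastest.name}, best AT: {best_at.name}")
```

With these placeholder values the faster design is not the one with the lower AT product, which is the kind of distinction the abstract draws between the speed comparison and the AT comparison.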

Index Terms:
VLSI; parallel architectures; performance evaluation; area time trade-offs; micro-grain VLSI array architectures; performance; massively parallel control-flow architectures; associative memory architecture; Mux-based SIMD architecture; RAMs; systolic MIMD/MISD computation; data-flow requirements; FFT; matrix multiplication.
R. Singh Bajwa, R.M. Owens, M.J. Irwin, "Area Time Trade-Offs in Micro-Grain VLSI Array Architectures," IEEE Transactions on Computers, vol. 43, no. 10, pp. 1121-1128, Oct. 1994, doi:10.1109/12.324538