Mapping Nested Loop Algorithms into Multidimensional Systolic Arrays
January 1990 (vol. 1 no. 1)
pp. 64-76

Consideration is given to transforming depth p-nested for loop algorithms into q-dimensional systolic VLSI arrays, where 1 ≤ q ≤ p-1. Previously, there existed complete characterizations of correct transformation only for the cases where q = p-1 or q = 1. This gap is filled by giving formal necessary and sufficient conditions for correct transformation of a p-nested loop algorithm into a q-dimensional systolic array for any q, 1 ≤ q ≤ p-1. Practical methods are presented. The techniques developed are applied to the automatic design of special-purpose and programmable systolic arrays. The results also contribute toward automatic compilation onto more general-purpose programmable arrays. Synthesis of linear and planar systolic array implementations for a three-dimensional cube-graph algorithm and a reindexed Warshall-Floyd path-finding algorithm is used to illustrate the method.
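
As a rough illustration of the kind of mapping the abstract refers to, the sketch below takes a 3-nested loop (p = 3) with matrix-multiplication-style uniform dependences and assigns each iteration a time step and a processor coordinate through linear functions. It then checks two commonly used conditions: every dependence advances in time, and no two iterations land on the same processor at the same time. It does not reproduce the paper's necessary and sufficient conditions. The names SCHEDULE_2D, ALLOCATION_2D, SCHEDULE_1D, ALLOCATION_1D, the dependence vectors, and the problem size N are assumptions chosen for the example.

```python
from itertools import product

N = 4  # loop bounds 0..N-1 in each dimension (assumed for illustration)

# Uniform dependence vectors of a matrix-multiplication-like loop nest (assumed).
DEPENDENCES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Planar array (q = 2): time t = i + j + k, processor (i, j).
SCHEDULE_2D = (1, 1, 1)
ALLOCATION_2D = [(1, 0, 0), (0, 1, 0)]

# Linear array (q = 1): time t = i + j + N*k, processor i.
SCHEDULE_1D = (1, 1, N)
ALLOCATION_1D = [(1, 0, 0)]


def dot(u, v):
    """Inner product of two integer vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def map_point(point, schedule, allocation):
    """Return (time, processor) assigned to one loop iteration."""
    time = dot(schedule, point)
    processor = tuple(dot(row, point) for row in allocation)
    return time, processor


def respects_dependences(schedule, dependences):
    """Each dependence must be executed strictly later than its source."""
    return all(dot(schedule, d) > 0 for d in dependences)


def is_conflict_free(schedule, allocation, n):
    """No two iterations may share the same processor at the same time."""
    seen = set()
    for point in product(range(n), repeat=len(schedule)):
        image = map_point(point, schedule, allocation)
        if image in seen:
            return False
        seen.add(image)
    return True


if __name__ == "__main__":
    for name, sched, alloc in [
        ("planar", SCHEDULE_2D, ALLOCATION_2D),
        ("linear", SCHEDULE_1D, ALLOCATION_1D),
    ]:
        ok = respects_dependences(sched, DEPENDENCES) and is_conflict_free(sched, alloc, N)
        print(name, "mapping valid under these two checks:", ok)
```

The exhaustive conflict check is only practical for small N. It is used here because it makes the condition explicit without relying on any closed-form test.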

Index Terms:
necessary conditions; algorithm transformations; data dependence; matrix multiplication; parallel processing; nested loop algorithms; multidimensional systolic arrays; sufficient conditions; correct transformation; programmable systolic arrays; automatic compilation; general purpose programmable arrays; planar systolic array implementations; three-dimensional cube-graph algorithm; reindexed Warshall-Floyd path-finding algorithm; cellular arrays; graph theory; matrix algebra; parallel algorithms
P.Z. Lee, Z.M. Kedem, "Mapping Nested Loop Algorithms into Multidimensional Systolic Arrays," IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 1, pp. 64-76, Jan. 1990, doi:10.1109/71.80125