Optimal Synthesis of Algorithm-Specific Lower-Dimensional Processor Arrays
March 1996 (vol. 7 no. 3)
pp. 274-287

Abstract: Processor arrays are frequently used to deliver high performance in many applications with computationally intensive operations. This paper presents the General Parameter Method (GPM), a systematic parameter-based approach for synthesizing such algorithm-specific architectures. GPM can synthesize processor arrays of any lower dimension from a uniform-recurrence description of the algorithm. The design objective is a general nonlinear and nonmonotonic user-specified function that depends on attributes such as computation time of the recurrence on the processor array, completion time, load time, and drain time. In addition, bounds on some or all of these attributes can be specified. GPM performs an efficient search of polynomial complexity to find the optimal design satisfying the user-specified design constraints. As an illustration, we show how GPM can be used to find optimal linear processor arrays for computing transitive closures. We consider design objectives that minimize computation time, processor count, or completion time (including load and drain times), and user-specified constraints on the number of processing elements and/or computation/completion times. We show that GPM can be used to obtain optimal designs that trade off the number of processing elements against completion time, thereby allowing the designer to choose a design that best meets the specified design objectives. We also show the equivalence between the model assumed in GPM and that in the popular dependence-based methods [1], [2]. Consequently, GPM can be used to find optimal designs for both models.
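
For readers unfamiliar with the example problem named in the abstract, the sketch below shows transitive closure computed sequentially with Warshall's algorithm. It is the standard textbook form, not the uniform-recurrence formulation or the processor-array mapping developed in the paper. The function name `transitive_closure` and the list-of-lists matrix representation are assumptions made for illustration. The triply nested loop gives a three-dimensional iteration space, which is the kind of computation a lower-dimensional (for example, linear) array has to schedule.

```python
def transitive_closure(adj):
    """Return the transitive closure of a boolean adjacency matrix.

    Illustrative sequential version (Warshall's algorithm); `adj` is a
    square list of lists of booleans, where adj[i][j] means an edge i -> j.
    """
    n = len(adj)
    # Work on a copy so the caller's matrix is left unchanged.
    reach = [row[:] for row in adj]
    for k in range(n):          # intermediate vertex
        for i in range(n):      # source vertex
            for j in range(n):  # destination vertex
                # i reaches j if it already did, or if i reaches k and k reaches j.
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


# Usage: a three-vertex chain 0 -> 1 -> 2.
chain = [
    [False, True, False],
    [False, False, True],
    [False, False, False],
]
closure = transitive_closure(chain)
```

In this usage example the closure gains the path from vertex 0 to vertex 2, while the input matrix is not modified.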

[1] R.H. Kuhn, "Optimization and Interconnection Complexity for: Parallel Processors, Single-Stage Networks, and Decision Trees," PhD dissertation, Dept. of Computer Science, Univ. of Illinois, Urbana-Champaign, 1980.
[2] D.I. Moldovan, "On the Analysis and Synthesis of VLSI Algorithms," IEEE Trans. Computers, vol. 31, no. 11, pp. 1,121-1,126, Nov. 1982.
[3] H.T. Kung, "Why Systolic Architectures?" Computer, vol. 15, no. 1, pp. 37-46, Jan. 1982.
[4] J.A.B. Fortes, K.-S. Fu, and B.W. Wah, "Systematic Design Approaches for Algorithmically Specified Systolic Arrays," Computer Architecture: Concepts and Systems, V.M. Milutinovic, ed., pp. 454-494. North-Holland, 1988.
[5] Z. Chen and W. Shang, "On Uniformization of Affine Dependence Algorithms," Proc. Fourth Symp. Parallel and Distributed Systems, vol. 3, pp. 128-137, Dec. 1992.
[6] W. Shang and J.A.B. Fortes, "On Time Mapping of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 5, pp. 350-363, May 1992.
[7] P.Z. Lee and Z.M. Kedem, "Mapping Nested Loop Algorithms into Multidimensional Systolic Arrays," IEEE Trans. Parallel and Distributed Systems, vol. 1, no. 1, pp. 64-76, Jan. 1990.
[8] P.Z. Lee and Z.M. Kedem, "Synthesizing Linear Array Algorithms from Nested For Loop Algorithms," IEEE Trans. Computers, vol. 37, pp. 1,578-1,598, Dec. 1988.
[9] V.P. Roychowdhury and T. Kailath, "Subspace Scheduling and Parallel Implementation of Non-Systolic Regular Iterative Algorithms," J. VLSI Signal Processing, vol. 1. Kluwer Academic, 1989.
[10] G.-J. Li and B.W. Wah, "The Design of Optimal Systolic Arrays," IEEE Trans. Computers, vol. 34, no. 1, pp. 66-77, Jan. 1985.
[11] M.T. O'Keefe, J.A.B. Fortes, and B.W. Wah, "On the Relationship Between Systolic Array Design Methodologies," IEEE Trans. Computers, vol. 41, no. 12, pp. 1,589-1,593, Dec. 1991.
[12] J.A.B. Fortes, B.W. Wah, W. Shang, and K.N. Ganapathy, "Algorithm-Specific Parallel Processing with Linear Processor Arrays," Advances in Computers, M. Yovits, ed. Academic Press, 1994.
[13] K. Ganapathy, "Mapping Regular Recursive Algorithms to Fine-Grained Processor Arrays," PhD thesis, Univ. of Illinois, Urbana-Champaign, May 1994.
[14] K.N. Ganapathy and B.W. Wah, "Synthesizing Optimal Lower Dimensional Processor Arrays," Proc. Int'l Conf. Parallel Processing, pp. 96-103. Pennsylvania State Univ. Press, Aug. 1992.
[15] J. Xue, "A New Formulation of the Mapping Conditions for the Synthesis of Linear Systolic Arrays," Proc. Application Specific Array Processors, pp. 297-308. IEEE CS Press, 1993.
[16] K.N. Ganapathy and B.W. Wah, "Optimal Design of Lower Dimensional Processor Arrays for Uniform Recurrences," Proc. Application Specific Array Processors, pp. 636-648. IEEE CS Press, Aug. 1992.
[17] S.Y. Kung, S.C. Lo, and P.S. Lewis, "Optimal Systolic Design for the Transitive Closure and the Shortest Path Problems," IEEE Trans. Computers, vol. 36, pp. 603-614, May 1987.
[18] G. Rote, "A Systolic Array for Algebraic Path Problem," Computing, vol. 34, pp. 192-219. Springer-Verlag, 1985.
[19] K.N. Ganapathy and B.W. Wah, "Designing a Coprocessor for Regular Recurrent Computations," Proc. Fifth IEEE Symp. Parallel and Distributed Systems, pp. 806-813, Dec. 1993.

Index Terms:
Design constraints, objective function, optimal design, polynomial-time search, processor arrays, transitive closure, uniform recurrence equations.
Kumar N. Ganapathy, Benjamin W. Wah, "Optimal Synthesis of Algorithm-Specific Lower-Dimensional Processor Arrays," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 3, pp. 274-287, March 1996, doi:10.1109/71.491581