Temporal Partitioning and Scheduling Data Flow Graphs for Reconfigurable Computers
June 1999 (vol. 48 no. 6)
pp. 579-590

Abstract: FPGA-based configurable computing machines are evolving rapidly. They offer the ability to deliver very high performance at a fraction of the cost when compared to supercomputers. The first generation of configurable computers (those with multiple FPGAs connected using a specific interconnect) used statically reconfigurable FPGAs. On these configurable computers, computations are performed by partitioning an entire task into spatially interconnected subtasks. Such configurable computers are used in logic emulation systems and for functional verification of hardware. In general, configurable computers provide the ability to reconfigure rapidly to any desired custom form. Hence, the available resources can be reused effectively to cut down the hardware costs and also improve the performance. In this paper, we introduce the concept of temporal partitioning to partition a task into temporally interconnected subtasks. Specifically, we present algorithms for temporal partitioning and scheduling data flow graphs for configurable computers. We are given a configurable computing unit (RPU) with a logic capacity of $S_{RPU}$ and a computational task represented by an acyclic data flow graph $G=(V,E)$. Computations with logic area requirements that exceed $S_{RPU}$ cannot be completely mapped on a configurable computer (using traditional spatial mapping techniques). However, a temporal partitioning of the data flow graph followed by proper scheduling can facilitate the configurable-computer-based execution. Temporal partitioning of the data flow graph is a $k$-way partitioning of $G=(V,E)$ such that each partitioned segment will not exceed $S_{RPU}$ in its logic requirement. Scheduling assigns an execution order to the partitioned segments so as to ensure proper execution. Thus, for each segment in $\{s_1, s_2, \cdots, s_k\}$, scheduling assigns a unique ordering $s_i \rightarrow j$, $1 \leq i \leq k$, $1 \leq j \leq k$, such that the computation would execute in proper sequential order as defined by the flow graph $G=(V,E)$.
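
The problem statement in the abstract can be illustrated with a small sketch. The code below is not the algorithm from the paper, whose details are not given here; it is a simple greedy heuristic, assuming each node of $G=(V,E)$ carries a known logic area. The names `temporal_partition`, `nodes`, `edges`, and `s_rpu` are hypothetical and chosen only for this illustration. Nodes are visited in topological order and packed into the current segment until adding the next node would exceed $S_{RPU}$, at which point a new segment is opened. Because segments are filled in topological order, every edge runs from a segment to the same or a later one, so executing the segments in list order respects the dependencies of the graph.

```python
from collections import deque


def temporal_partition(nodes, edges, s_rpu):
    """Greedy illustrative partitioner (an assumption, not the paper's method).

    nodes: dict mapping node name -> logic area
    edges: list of (source, destination) pairs of an acyclic graph
    s_rpu: logic capacity available to one segment
    Returns a list of segments (lists of node names) in execution order.
    """
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Kahn's algorithm: nodes become ready once all predecessors are placed.
    ready = deque(sorted(v for v in nodes if indegree[v] == 0))
    segments, current, used, placed = [], [], 0, 0

    while ready:
        v = ready.popleft()
        area = nodes[v]
        if area > s_rpu:
            raise ValueError(f"node {v!r} alone exceeds the capacity")
        if used + area > s_rpu:
            # Current segment is full: close it and open a new one.
            segments.append(current)
            current, used = [], 0
        current.append(v)
        used += area
        placed += 1
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)

    if placed != len(nodes):
        raise ValueError("graph contains a cycle")
    if current:
        segments.append(current)
    return segments


# Hypothetical four-node graph with areas chosen for illustration only.
example_nodes = {"a": 2, "b": 2, "c": 3, "d": 1}
example_edges = [("a", "c"), ("b", "c"), ("c", "d")]
example_segments = temporal_partition(example_nodes, example_edges, s_rpu=4)
```

Here the position of a segment in the returned list plays the role of the ordering $s_i \rightarrow j$: segment $j$ is loaded and executed only after segments $1, \ldots, j-1$ have finished. A greedy packing like this makes no attempt to minimize the number of segments or the data passed between them, which a practical partitioner would likely need to consider.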

Index Terms:
Configurable computing, field programmable gate arrays, spatial partitioning, temporal partitioning, scheduling, data flow graphs, reconfigurable computers, high performance computing.
Citation:
Karthikeya M. Gajjala Purna, Dinesh Bhatia, "Temporal Partitioning and Scheduling Data Flow Graphs for Reconfigurable Computers," IEEE Transactions on Computers, vol. 48, no. 6, pp. 579-590, June 1999, doi:10.1109/12.773795