A Pipeline-Based Approach for Scheduling Video Processing Algorithms on NOW
February 2003 (vol. 14 no. 2)
pp. 119-130

Abstract: Network of Workstations (NOW) platforms, assembled from off-the-shelf workstations and networking hardware, have become a cost-effective, scalable, and flexible platform for video processing applications. To use the resources efficiently, however, a programmer still has to schedule each algorithm manually onto the available processors of the NOW. This approach is time-consuming and impractical for a video processing system that must run a variety of algorithms, with new ones constantly being developed. Improved support for program development is necessary before the full benefits of parallel architectures can be realized for video processing applications. Toward this goal, an automatic compile-time scheduler has been developed that schedules the tasks of a video processing application, subject to precedence constraints, onto the available processors. The scheduler exploits both spatial (parallelism) and temporal (pipelining) concurrency to make the best use of machine resources. Two important scheduling problems are addressed. First, given a task graph and a desired throughput, a schedule is constructed that achieves the desired throughput with the minimum number of processors. Second, given a task graph and a finite set of available resources, a schedule is constructed that maximizes throughput while meeting the resource constraints. Simulation results show that the scheduler and the proposed optimization techniques effectively tackle these problems by maximizing processor utilization. A code generator has also been developed to generate parallel programs automatically. Together, these tools make it much easier for a programmer to develop video processing applications on these parallel architectures.
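The two scheduling problems named in the abstract can be illustrated with a deliberately simplified sketch. It is not the scheduler described in the paper: it assumes a linear chain of tasks rather than a general task graph, ignores communication cost, uses pipelining only (no data-parallel replication of a task), and gives each pipeline stage one processor. The function names `pack_stages` and `best_period` and the per-task times are hypothetical. Throughput is expressed through the pipeline period, so a period of 40 ms per frame corresponds to 25 frames per second.

```python
from typing import List


def pack_stages(task_ms: List[int], period_ms: int) -> List[List[int]]:
    """Problem 1 (simplified): fewest stages that sustain the given period.

    Consecutive tasks are grouped greedily into a stage until adding the
    next task would push the stage time past the period. Each stage is
    assumed to run on its own processor.
    """
    stages: List[List[int]] = []
    current: List[int] = []
    load = 0
    for index, cost in enumerate(task_ms):
        if cost > period_ms:
            # Without replicating the task, this period cannot be met.
            raise ValueError(f"task {index} needs {cost} ms, more than the period")
        if current and load + cost > period_ms:
            stages.append(current)
            current, load = [], 0
        current.append(index)
        load += cost
    if current:
        stages.append(current)
    return stages


def best_period(task_ms: List[int], processors: int) -> int:
    """Problem 2 (simplified): smallest period achievable with the processors.

    Binary search over the period; a shorter period means higher throughput.
    The search relies on the stage count never increasing as the period grows.
    """
    low, high = max(task_ms), sum(task_ms)
    while low < high:
        mid = (low + high) // 2
        if len(pack_stages(task_ms, mid)) <= processors:
            high = mid
        else:
            low = mid + 1
    return low


# Hypothetical per-frame task times in milliseconds.
task_ms = [10, 20, 15, 5, 30]
print(pack_stages(task_ms, 40))  # [[0, 1], [2, 3], [4]] -> 3 processors
print(best_period(task_ms, 3))   # 30 ms per frame
print(best_period(task_ms, 2))   # 45 ms per frame
```

In this toy setting the two problems are duals: the second is solved by repeatedly asking the first whether a candidate period fits within the processor budget. A general task graph with communication costs and spatial parallelism, as targeted by the paper, requires considerably more than this greedy packing.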

Index Terms:
Pipelined scheduling, throughput optimization, resource optimization, network of workstations.
Mau-Tsuen Yang, Rangachar Kasturi, Anand Sivasubramaniam, "A Pipeline-Based Approach for Scheduling Video Processing Algorithms on NOW," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 2, pp. 119-130, Feb. 2003, doi:10.1109/TPDS.2003.1178876