On Network CoProcessors for Scalable, Predictable Media Services
July 2003 (vol. 14 no. 7)
pp. 655-670

Abstract: This paper presents the embedded realization and experimental evaluation of a media stream scheduler on Network Interface (NI) CoProcessor boards. When media frames are used as scheduling units, the scheduler operates in real time on streams traversing the CoProcessor and can therefore stream video to remote clients at real-time rates. The contribution of this paper is its detailed evaluation of the effects of placing application- or kernel-level functionality, such as packet scheduling, on NIs rather than on the host machines to which they are attached. The main benefits of such placement are 1) that traffic is eliminated from the host bus and memory subsystem, thereby allowing increased host CPU utilization for other tasks, and 2) that NI-based scheduling is immune to host-CPU loading, unlike host-based media schedulers, which are easily affected even by transient load conditions. An outcome of this work is a proposed cluster architecture for building scalable media servers by distributing schedulers and media stream producers across the multiple NIs used by a single server and by clustering a number of such servers using commodity network hardware and software.

Index Terms:
Cluster machines, multimedia services, embedded systems, quality of service, operating systems, real-time systems, data streaming, packet scheduling.
Citation:
Raj Krishnamurthy, Karsten Schwan, Richard West, Marcel-Catalin Rosu, "On Network CoProcessors for Scalable, Predictable Media Services," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 7, pp. 655-670, July 2003, doi:10.1109/TPDS.2003.1214318