Issue No.10 - Oct. (2012 vol.23)
pp: 1934-1943
Qiang Wu, Juniper Networks, Inc., Sunnyvale
Tilman Wolf, University of Massachusetts, Amherst
ABSTRACT
Computer networks require increasingly complex packet processing functions in the data plane to adapt to new requirements. To meet performance demands, packet processing systems on routers employ multiple processor cores. To efficiently utilize processing resources in such systems, we propose a novel methodology for allocating tasks to processors. The main idea is to obtain runtime profiling information and to duplicate tasks with heavy processing requirements. Using our duplication algorithm, a balanced workload can be obtained and the complexity of packing tasks with different processing requirements can be reduced. By translating traffic characteristics into processing requirements, the system is able to adapt to dynamic changes in the workload and balance the utilization of all processing resources to maximize system throughput. Our approach can adapt to any traffic change in a single iteration, whereas existing adaptive approaches may require multiple steps. Results from our prototype implementation based on the Click modular router show that our system only requires on average 5.3-31.5 percent of the adaptation steps that are necessary in iterative systems. In addition, our system achieves a throughput that is 1.32 times higher than the throughput achieved with symmetric multiprocessing support with general-purpose task allocation.
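The idea of duplicating heavy tasks before packing them onto cores can be illustrated with a minimal sketch. This is not the paper's algorithm: the splitting rule (replicate any task whose profiled load exceeds the per-core fair share, dividing its load evenly among the replicas), the largest-first greedy packing step, and all names (`duplicate_heavy_tasks`, `pack_greedy`, the task names and load values) are assumptions chosen for illustration.

```python
import math


def duplicate_heavy_tasks(loads, num_cores):
    """Split tasks heavier than the per-core fair share into equal replicas.

    `loads` maps a task name to its profiled processing load (hypothetical
    units). The splitting rule is an assumption, not the paper's method.
    """
    fair_share = sum(loads.values()) / num_cores
    replicas = []
    for name, load in loads.items():
        # A task at or below the fair share is kept as a single instance.
        copies = max(1, math.ceil(load / fair_share)) if fair_share > 0 else 1
        for i in range(copies):
            replicas.append((f"{name}#{i}", load / copies))
    return replicas


def pack_greedy(replicas, num_cores):
    """Place each replica, largest first, on the currently least-loaded core."""
    core_loads = [0.0] * num_cores
    assignment = [[] for _ in range(num_cores)]
    for name, load in sorted(replicas, key=lambda r: r[1], reverse=True):
        target = min(range(num_cores), key=lambda c: core_loads[c])
        assignment[target].append(name)
        core_loads[target] += load
    return assignment, core_loads


if __name__ == "__main__":
    # Hypothetical profile: one heavy task dominates the workload.
    profile = {"ipsec": 60, "lookup": 25, "classify": 10, "ttl": 5}
    replicas = duplicate_heavy_tasks(profile, num_cores=4)
    assignment, core_loads = pack_greedy(replicas, num_cores=4)
    for core, (tasks, load) in enumerate(zip(assignment, core_loads)):
        print(f"core {core}: load={load:.1f} tasks={tasks}")
```

In this sketch, splitting the heavy task means no single replica exceeds the fair share, so the packing step works with pieces of similar size. Without duplication, the core hosting the heaviest task would carry that task's full load regardless of how the remaining tasks were placed.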
INDEX TERMS
Decision support systems, Silicon, Mercury (metals), scheduling, Network router, multicore processor, network processor, task allocation
CITATION
Qiang Wu, Tilman Wolf, "Runtime Task Allocation in Multicore Packet Processing Systems", IEEE Transactions on Parallel & Distributed Systems, vol.23, no. 10, pp. 1934-1943, Oct. 2012, doi:10.1109/TPDS.2012.56