Issue No.06 - June (2010 vol.59)
pp: 748-761
Itamar Cohen, Jerusalem College of Engineering, Israel
Ori Rottenstreich, Technion, Haifa
Isaac Keslassy, Technion, Haifa
Chip multiprocessors (CMPs) combine increasingly many general-purpose processor cores on a single chip. These cores run several tasks with unpredictable communication needs, resulting in uncertain and often-changing traffic patterns. This unpredictability leads network-on-chip (NoC) designers to plan for the worst-case traffic patterns and to significantly overprovision link capacities. In this paper, we provide NoC designers with an alternative statistical approach. We first present traffic-load distribution plots (T-Plots), which illustrate how much capacity overprovisioning is needed to service 90, 99, or 100 percent of all traffic patterns. We prove that in the general case, plotting T-Plots is #P-complete, and therefore extremely complex. We then show how to determine the exact mean and variance of the traffic load on any edge, and use these to provide Gaussian-based models for the T-Plots, as well as guaranteed performance bounds. We also explain how to practically approximate T-Plots using random-walk-based methods. Finally, we use T-Plots to reduce the network power consumption by providing an efficient capacity allocation algorithm with predictable performance guarantees.
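The idea behind a T-Plot, namely reading off the link capacity that serves a given fraction of traffic patterns, can be illustrated with a small Monte Carlo sketch. This is not the paper's method: it assumes an n x n mesh with dimension-ordered (XY) routing, draws random permutation traffic patterns instead of sampling the full set of admissible traffic matrices, and fits the Gaussian model from sample statistics rather than from the exact mean and variance. All function names (`xy_route`, `link_load`, `sample_loads`, `capacity_for_fraction`, `gaussian_capacity`) are hypothetical.

```python
import math
import random
import statistics


def xy_route(src, dst, n):
    """Directed links used by XY routing on an n x n mesh (node id = row * n + col)."""
    r, c = divmod(src, n)
    dr, dc = divmod(dst, n)
    links = []
    while c != dc:  # move along the row first
        nc = c + (1 if dc > c else -1)
        links.append((r * n + c, r * n + nc))
        c = nc
    while r != dr:  # then along the column
        nr = r + (1 if dr > r else -1)
        links.append((r * n + c, nr * n + c))
        r = nr
    return links


def link_load(perm, link, n):
    """Load on `link` when node i sends one unit of traffic to node perm[i]."""
    return sum(1 for s, d in enumerate(perm) if link in xy_route(s, d, n))


def sample_loads(link, n, samples, seed=0):
    """Loads on `link` under random permutation patterns (a simplifying assumption)."""
    rng = random.Random(seed)
    nodes = list(range(n * n))
    loads = []
    for _ in range(samples):
        perm = nodes[:]
        rng.shuffle(perm)
        loads.append(link_load(perm, link, n))
    return loads


def capacity_for_fraction(loads, fraction):
    """Smallest capacity that covers at least `fraction` of the sampled patterns."""
    ordered = sorted(loads)
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


def gaussian_capacity(loads, fraction):
    """Gaussian estimate of the same quantile; requires 0 < fraction < 1."""
    mu = statistics.fmean(loads)
    sigma = statistics.pstdev(loads)
    if sigma == 0:  # degenerate case: every sampled pattern gives the same load
        return mu
    return statistics.NormalDist(mu, sigma).inv_cdf(fraction)


if __name__ == "__main__":
    loads = sample_loads(link=(5, 9), n=4, samples=2000)
    for p in (0.90, 0.99, 1.00):
        print(p, capacity_for_fraction(loads, p))
    print("Gaussian 0.99:", gaussian_capacity(loads, 0.99))
```

In this sketch, the gap between the 1.00 row (the sampled worst case) and the 0.90 or 0.99 rows is the kind of overprovisioning a statistical design might avoid. The Gaussian estimate is only meaningful for fractions strictly below 1, since a normal distribution has no finite maximum.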
Networks-on-chip, chip multiprocessors, capacity allocation, traffic-load distribution plot.
Itamar Cohen, Ori Rottenstreich, Isaac Keslassy, "Statistical Approach to Networks-on-Chip," IEEE Transactions on Computers, vol. 59, no. 6, pp. 748-761, June 2010, doi:10.1109/TC.2010.35