Issue No.01 - Jan. (2013 vol.62)
pp: 59-73
Michel A. Kinsy, Massachusetts Institute of Technology, Cambridge
Myong Hyon Cho, Massachusetts Institute of Technology, Cambridge
Keun Sup Shim, Massachusetts Institute of Technology, Cambridge
Mieszko Lis, Massachusetts Institute of Technology, Cambridge
G. Edward Suh, Cornell University, Ithaca
Srinivas Devadas, Massachusetts Institute of Technology, Cambridge
Conventional oblivious routing algorithms do not take into account resource requirements (e.g., bandwidth, latency) of various flows in a given application. As they are not aware of flow demands that are specific to the application, network resources can be poorly utilized and cause serious local congestion. Also, flows, or packets, may share virtual channels in an undetermined way; the effects of head-of-line blocking may result in throughput degradation. In this paper, we present a framework for application-aware routing that assures deadlock freedom under one or more virtual channels by forcing routes to conform to an acyclic channel dependence graph. In addition, we present methods to statically and efficiently allocate virtual channels to flows or packets, under oblivious routing, when there are two or more virtual channels per link. Using the application-aware routing framework, we develop and evaluate a bandwidth-sensitive oblivious routing scheme that statically determines routes considering an application's communication characteristics. Given bandwidth estimates for flows, we present a mixed integer-linear programming (MILP) approach and a heuristic approach for producing deadlock-free routes that minimize maximum channel load. Our framework can be used to produce application-aware routes that target the minimization of latency, number of flows through a link, bandwidth, or any combination thereof. Our results show that it is possible to achieve better performance than traditional deterministic and oblivious routing schemes on popular synthetic benchmarks using our bandwidth-sensitive approach. We also show that, when oblivious routing is used and there are more flows than virtual channels per link, the static assignment of virtual channels to flows can help mitigate the effects of head-of-line blocking, which may impede packets that are dynamically competing for virtual channels. 
We experimentally explore the performance tradeoffs of static and dynamic virtual channel allocation on bandwidth-sensitive and traditional oblivious routing methods.
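The two properties the abstract relies on, an acyclic channel dependence graph and a low maximum channel load, can be illustrated with a small checker. This is a minimal sketch, not the paper's MILP or heuristic: it assumes one virtual channel per link (so a channel is simply a directed link), represents each route as a list of node IDs, and uses hypothetical names (`channels`, `max_channel_load`, `cdg_is_acyclic`) and a made-up 2x2 mesh with unit bandwidth demands.

```python
from collections import defaultdict

def channels(route):
    """Directed links (channels) traversed by a route given as a node list."""
    return list(zip(route, route[1:]))

def max_channel_load(routes, demands):
    """Sum each flow's bandwidth demand on every channel it uses; return the peak."""
    load = defaultdict(float)
    for flow, route in routes.items():
        for ch in channels(route):
            load[ch] += demands[flow]
    return max(load.values(), default=0.0)

def cdg_is_acyclic(routes):
    """Build the channel dependence graph induced by the routes and test for cycles.

    An edge a -> b exists when some route uses channel b immediately after a.
    """
    deps = defaultdict(set)
    for route in routes.values():
        chs = channels(route)
        for a, b in zip(chs, chs[1:]):
            deps[a].add(b)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = defaultdict(int)
    for start in list(deps):
        if color[start] != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(deps[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GRAY:  # back edge: dependence cycle
                    return False
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(deps.get(nxt, ()))))
                    break
            else:
                color[node] = BLACK
                stack.pop()
    return True

# Hypothetical 2x2 mesh: nodes 0 and 1 on the top row, 2 and 3 on the bottom row.
demands = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}

# Routes that all turn the same way around the ring create a dependence cycle.
ring_routes = {"A": [0, 1, 3], "B": [1, 3, 2], "C": [3, 2, 0], "D": [2, 0, 1]}

# Dimension-ordered (X first, then Y) routes for the same source/destination pairs.
xy_routes = {"A": [0, 1, 3], "B": [1, 0, 2], "C": [3, 2, 0], "D": [2, 3, 1]}

print(cdg_is_acyclic(ring_routes), max_channel_load(ring_routes, demands))
print(cdg_is_acyclic(xy_routes), max_channel_load(xy_routes, demands))
```

In this toy case the ring-style routes share every channel between two flows and form a cyclic dependence, while the dimension-ordered routes for the same flows are cycle-free and use each channel once. A route-selection method of the kind the abstract describes would search among candidate route sets that pass the acyclicity check for one that minimizes the peak load.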
Routing, System recovery, Computer architecture, Bandwidth, Channel allocation, Switches, Heuristic algorithms, virtual channel allocation, Systems-on-chip, on-chip interconnection networks, oblivious routing
Michel A. Kinsy, Myong Hyon Cho, Keun Sup Shim, Mieszko Lis, G. Edward Suh, Srinivas Devadas, "Optimal and Heuristic Application-Aware Oblivious Routing", IEEE Transactions on Computers, vol. 62, no. 1, pp. 59-73, Jan. 2013, doi:10.1109/TC.2011.219