Issue No.08 - August (2009 vol.20)
pp: 1126-1141
Michihiro Koibuchi , National Institute of Informatics, Tokyo
Yutaka Yamada , Keio University, Yokohama and Toshiba Corporation, Kawasaki
D. Frank Hsu , Fordham University, New York
Hiroki Matsutani , Keio University, Yokohama
Topological exploration of on-chip networks is important for using their enormous wire resources efficiently, so that low-latency, high-throughput communication can be achieved within a modest silicon budget. In this paper, we propose a novel tree-based interconnection network called Fat H-Tree that meets these requirements. A Fat H-Tree provides a torus structure by combining two folded H-Tree networks and is an attractive alternative to tree-based networks such as Fat Trees in the microarchitecture domain. We introduce chip layout schemes for Fat H-Tree based on a folding technique for 2D and 3D ICs, and we propose three deadlock-free routing schemes. We evaluate the performance of Fat H-Tree and other tree-based networks using real application traces. In addition, we compare the network logic area, wire resources, and energy consumption of Fat H-Tree with those of other topologies, based on a typical implementation of on-chip routers synthesized with a 90-nm standard cell library. The results show that 1) a Fat H-Tree outperforms a Fat Tree with two upward and four downward connections in terms of throughput and average hop count, 2) a Fat H-Tree requires 19.8 to 27.8 percent less network logic area than the Fat Tree, 3) a Fat H-Tree consumes slightly less energy than the Fat Tree, and 4) a Fat H-Tree uses slightly more wire resources than the Fat Tree, but current process technology provides sufficient wire resources for implementing Fat-H-Tree-based on-chip networks.
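As a rough illustration of how two tree networks over the same cores can yield a torus-like structure, the following sketch attaches every core to one leaf router in each of two trees. It assumes that the second tree groups cores into 2x2 blocks shifted diagonally by one core with wraparound. That offset rule, the "red" and "black" labels, and the function names are assumptions made for this example; the abstract does not specify the construction.

```python
from collections import defaultdict

def leaf_groups(n, offset):
    """Map each core (x, y) of an n x n grid to the 2x2 leaf group containing it.

    `offset` shifts the block boundaries diagonally, with wraparound
    (an assumed rule, used here only for illustration).
    """
    return {
        (x, y): (((x - offset) % n) // 2, ((y - offset) % n) // 2)
        for x in range(n)
        for y in range(n)
    }

def router_neighbors(n):
    """Return, for each leaf router, the leaf routers of the other tree
    that it can reach through a shared core."""
    red = leaf_groups(n, 0)    # first tree: aligned 2x2 blocks
    black = leaf_groups(n, 1)  # second tree: blocks shifted by one core
    nbrs = defaultdict(set)
    for core in red:
        r = ("red", red[core])
        b = ("black", black[core])
        nbrs[r].add(b)
        nbrs[b].add(r)
    return nbrs

if __name__ == "__main__":
    nbrs = router_neighbors(8)
    # Every leaf router reaches four routers of the other tree, including
    # across the grid edges, which is the wraparound a torus provides.
    print(sorted(nbrs[("red", (0, 0))]))
```

Under these assumptions, an 8x8 grid of cores gives 16 leaf routers per tree, and each leaf router shares one core with each of four routers in the other tree.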
Interconnection networks, on-chip networks, network topology, tree, routing algorithm.
Michihiro Koibuchi, Yutaka Yamada, D. Frank Hsu, Hiroki Matsutani, "Fat H-Tree: A Cost-Efficient Tree-Based On-Chip Network", IEEE Transactions on Parallel & Distributed Systems, vol.20, no. 8, pp. 1126-1141, August 2009, doi:10.1109/TPDS.2008.233