
Issue No.12 - December (2010 vol.21)

pp: 1765-1778

Miquel Moretó, Universitat Politècnica de Catalunya, Barcelona

Enrique Vallejo, Universidad de Cantabria, Santander

Ramón Beivide, Universidad de Cantabria, Santander

José Miguel-Alonso, University of the Basque Country, San Sebastian

Carmen Martínez, Universidad de Cantabria, Santander

Javier Navaridas, University of the Basque Country, San Sebastian

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.30

ABSTRACT

Many current parallel computers are built around a torus interconnection network. Machines from Cray, HP, and IBM, among others, use this topology. In terms of topological advantages, square (2D) or cubic (3D) tori would be the topologies of choice. However, for various practical reasons, 2D and 3D tori with different numbers of nodes per dimension have been used. These mixed-radix topologies are not edge symmetric, which translates into poor performance due to an unbalanced use of network resources. In this work, we analyze twisted 2D and 3D mixed-radix tori that remove the network bottlenecks present in nontwisted ones. Such topologies recover edge symmetry and, consequently, balance the utilization of their links. This paper describes the distance-related properties of twisted tori, together with a full characterization of their bisection bandwidth. A simulation-based performance evaluation has been carried out to assess network performance under synthetic and trace-driven workloads. The results show noticeable and consistent performance gains (up to a 74 percent increase in accepted load). In addition, we propose scalable and practicable packet routing mechanisms and wiring layouts for these interconnection systems. The complexity of the architectural proposals is similar to that of the routing and folding mechanisms in standard tori.
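The distance benefit of twisting can be illustrated with a small sketch. It is not the authors' code: it assumes a 2a x a torus in which each vertical wraparound link is shifted by a columns, a construction that the abstract itself does not spell out. The function names and the `twist` parameter are hypothetical and chosen only for this illustration.

```python
from collections import deque


def torus_neighbors(x, y, width, height, twist=0):
    """Return the four neighbors of node (x, y).

    twist=0 gives a standard width x height torus. A nonzero twist shifts
    the column by `twist` whenever a vertical wraparound link is crossed
    (an assumed construction, not taken from the abstract).
    """
    neighbors = [((x + 1) % width, y), ((x - 1) % width, y)]
    # Upward link: wrap from the last row to row 0, shifting by +twist.
    if y + 1 < height:
        neighbors.append((x, y + 1))
    else:
        neighbors.append(((x + twist) % width, 0))
    # Downward link: wrap from row 0 to the last row, shifting by -twist.
    if y - 1 >= 0:
        neighbors.append((x, y - 1))
    else:
        neighbors.append(((x - twist) % width, height - 1))
    return neighbors


def distances_from(source, width, height, twist=0):
    """Breadth-first search: hop count from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in torus_neighbors(*node, width, height, twist):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter_and_mean_distance(width, height, twist=0):
    """Diameter and mean hop count over all ordered pairs of distinct nodes."""
    total, longest, pairs = 0, 0, 0
    for x in range(width):
        for y in range(height):
            dist = distances_from((x, y), width, height, twist)
            total += sum(dist.values())
            longest = max(longest, max(dist.values()))
            pairs += len(dist) - 1  # exclude the source itself
    return longest, total / pairs


if __name__ == "__main__":
    a = 4
    print("standard:", diameter_and_mean_distance(2 * a, a, twist=0))
    print("twisted: ", diameter_and_mean_distance(2 * a, a, twist=a))
```

For the 8 x 4 case, this sketch reports a diameter of 6 for the standard torus and 4 for the twisted one, with a correspondingly lower mean distance. It covers only distance properties; it says nothing about link utilization, bisection bandwidth, or the routing mechanisms evaluated in the paper.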

INDEX TERMS

Multiprocessor interconnection, parallel architectures, routing, supercomputers.

CITATION

Miquel Moretó, Enrique Vallejo, Ramón Beivide, José Miguel-Alonso, Carmen Martínez, Javier Navaridas, "Twisted Torus Topologies for Enhanced Interconnection Networks",

*IEEE Transactions on Parallel & Distributed Systems*, vol. 21, no. 12, pp. 1765-1778, December 2010, doi:10.1109/TPDS.2010.30