Issue No. 01, Jan.-Feb. 2014 (vol. 11)
pp: 16-29
Leandro Fiorin, Faculty of Lugano, Lugano
Mariagiovanna Sami, Politecnico di Milano, Milano
ABSTRACT
As design complexity increases and technology scales down into the deep-submicron domain, the probability of malfunctions and failures in network-on-chip (NoC) components increases. In this work, we study and evaluate techniques for increasing the reliability and resilience of network interfaces (NIs) within NoC-based multiprocessor system-on-chip architectures. NIs act as interfaces between intellectual property cores and the communication infrastructure; the faulty behavior of a single NI could therefore affect the overall system. We propose a functional fault model for the NI components by evaluating their susceptibility to faults. We present a two-level fault-tolerant solution that can be employed to mitigate the effects of both permanent and temporary faults in the NI. Experimental simulations show that, with limited overhead, we can obtain an NI reliability comparable to that of a system implemented with standard triple modular redundancy techniques, while saving up to 48 percent in area and obtaining a significant energy reduction.
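For readers unfamiliar with the triple modular redundancy (TMR) baseline mentioned in the abstract, the following is a minimal sketch of bitwise majority voting over three replicated outputs. It is illustrative only and is not the authors' design; the function names `tmr_vote` and `replicas_disagree` and the 8-bit example values are assumptions introduced here.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three replica outputs.

    Each output bit is 1 when at least two of the three replicas
    agree on 1, so a fault confined to one replica is masked.
    """
    return (a & b) | (a & c) | (b & c)


def replicas_disagree(a: int, b: int, c: int) -> bool:
    """Report whether any replica differs (hypothetical error flag)."""
    return not (a == b == c)


# Hypothetical 8-bit word produced by three copies of the same logic.
golden = 0b10110010
faulty = golden ^ 0b00001000  # one replica suffers a single bit flip

voted = tmr_vote(golden, faulty, golden)
flagged = replicas_disagree(golden, faulty, golden)
```

In this sketch the voted word equals the fault-free word because only one replica is corrupted. The cost of that masking is three copies of the logic plus a voter, which is the area overhead the abstract compares against.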
INDEX TERMS
Nickel, Circuit faults, Table lookup, Routing, Registers, Fault tolerance, Fault tolerant systems, high-level error models, Networks-on-chip, network interface, fault tolerance, reliability, online fault detection
CITATION
Leandro Fiorin, Mariagiovanna Sami, "Fault-Tolerant Network Interfaces for Networks-on-Chip", IEEE Transactions on Dependable and Secure Computing, vol. 11, no. 1, pp. 16-29, Jan.-Feb. 2014, doi:10.1109/TDSC.2013.28