3D-Integrated SRAM Components for High-Performance Microprocessors
October 2009 (vol. 58 no. 10)
pp. 1369-1381
Kiran Puttaswamy, Georgia Institute of Technology, Atlanta
Gabriel H. Loh, Georgia Institute of Technology, Atlanta
3D integration is an emerging technology that has the potential to greatly increase device density while simultaneously providing faster on-chip communication. 3D fabrication involves stacking two or more dies connected by a very high-density, low-latency interface. The die-to-die vias that make up this interface can be treated as regular on-chip metal due to their small size (on the order of $1\;\mu{\rm m}$) and high speed (sub-FO4 die-to-die communication delay). The increased device density and the ability to place and route in the third dimension provide new opportunities for microarchitecture design. In this paper, we focus on 3D-integrated designs of SRAM structures. We show that the dense die-to-die vias enable 3D-integrated SRAM components that are partitioned at the level of individual wordlines or bitlines. This partitioning reduces wire length within the SRAM arrays and shrinks the area footprint, which in turn reduces the wiring required for global routing. The wire length reduction provides simultaneous latency and energy benefits, e.g., a 47 percent latency reduction and an 18 percent energy reduction for a 4 MB, 4-die stacked 3D SRAM array. A 3D implementation of a 128-entry multiported SRAM array achieves a 36 percent latency improvement with a simultaneous energy reduction of 55 percent. As planar designs adopt high-performance techniques such as hierarchical wordlines, 3D integration provides even larger benefits, making it a desirable technology for high-performance designs. For the 4 MB SRAM array, the 3D-integrated designs provide an additional latency reduction over the planar designs when hierarchical wordlines are implemented in both.
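
To make the wire-length argument concrete, the following is a minimal first-order sketch, not the paper's model. It assumes a wordline (or bitline) is split evenly across the stacked dies and that an unbuffered wire behaves as a distributed RC line, so its delay grows with the square of its length. The function names, the 1024-cell wire, and the 1.0 um cell pitch are hypothetical values chosen for illustration; via delay, decoders, and sense amplifiers are ignored, so the output is not expected to match the latency figures quoted above.

```python
def folded_wire_length(cells_on_wire: int, cell_pitch_um: float, num_dies: int) -> float:
    """Per-die wire length when one wire's cells are split evenly across dies.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if num_dies < 1:
        raise ValueError("num_dies must be at least 1")
    # Ceiling division: the longest per-die segment holds this many cells.
    cells_per_die = -(-cells_on_wire // num_dies)
    return cells_per_die * cell_pitch_um


def relative_rc_delay(length_um: float, baseline_length_um: float) -> float:
    """Delay ratio of an unbuffered distributed-RC wire versus a baseline.

    Assumes delay is proportional to length squared (first-order model only).
    """
    return (length_um / baseline_length_um) ** 2


# Hypothetical 1024-cell wordline with a 1.0 um cell pitch.
planar = folded_wire_length(1024, 1.0, 1)
stacked = folded_wire_length(1024, 1.0, 4)
print(stacked / planar)                    # 0.25
print(relative_rc_delay(stacked, planar))  # 0.0625
```

Under these assumptions, a 4-die stack cuts the partitioned wire to a quarter of its planar length. The overall array latency improves by less than the wire term alone suggests, because the rest of the access path does not shrink in the same way.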


Index Terms:
3D-integrated technology, die stacking, SRAM arrays.
Kiran Puttaswamy, Gabriel H. Loh, "3D-Integrated SRAM Components for High-Performance Microprocessors," IEEE Transactions on Computers, vol. 58, no. 10, pp. 1369-1381, Oct. 2009, doi:10.1109/TC.2009.92