
Bibliographic References  
ASCII Text  
Sameh Galal, Mark Horowitz, "Energy-Efficient Floating-Point Unit Design," IEEE Transactions on Computers, vol. 60, no. 7, pp. 913-922, July 2011.  
BibTeX  
@article{10.1109/TC.2010.121,
  author = {Sameh Galal and Mark Horowitz},
  title = {Energy-Efficient Floating-Point Unit Design},
  journal = {IEEE Transactions on Computers},
  volume = {60},
  number = {7},
  issn = {0018-9340},
  year = {2011},
  pages = {913-922},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TC.2010.121},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}  
RefWorks Procite/RefMan/Endnote  
TY  - JOUR
JO  - IEEE Transactions on Computers
TI  - Energy-Efficient Floating-Point Unit Design
IS  - 7
SN  - 0018-9340
SP  - 913
EP  - 922
EPD  - 913-922
A1  - Sameh Galal
A1  - Mark Horowitz
PY  - 2011
KW  - Arithmetic and logic structures
KW  - high-speed arithmetic
KW  - floating point
KW  - fused multiply-add
KW  - throughput/mm^2 optimization
VL  - 60
JA  - IEEE Transactions on Computers
ER  -  
[1] R.H. Dennard, F.H. Gaensslen, L. Kuhn, and H.N. Yu, "Design of Micron MOS Switching Devices," Proc. IEEE Int'l Electron Devices Meeting, pp. 168-170, 1972.
[2] D. Patil, O. Azizi, and M. Horowitz, "Robust Energy-Efficient Adder Topologies," Proc. 18th IEEE Symp. Computer Arithmetic (ARITH '07), pp. 16-28, 2007.
[3] P.M. Seidel and G. Even, "Delay-Optimized Implementation of IEEE Floating-Point Addition," IEEE Trans. Computers, vol. 53, no. 2, pp. 97-113, Feb. 2004.
[4] T. Lang and J.D. Bruguera, "Floating-Point Fused Multiply-Add: Reduced Latency for Floating-Point Addition," Proc. 17th IEEE Symp. Computer Arithmetic (ARITH '05), pp. 42-51, 2005.
[5] P.M. Seidel, "Multiple Path IEEE Floating-Point Fused Multiply-Add," Proc. 46th Int'l IEEE Midwest Symp. Circuits and Systems (MWSCAS), 2003.
[6] E. Hokenek, R.K. Montoye, and P.W. Cook, "Second-Generation RISC Floating Point with Multiply-Add Fused," IEEE J. Solid-State Circuits, vol. 25, no. 5, pp. 1207-1213, Oct. 1990.
[7] H.J. Oh et al., "A Fully Pipelined Single-Precision Floating-Point Unit in the Synergistic Processor Element of a CELL Processor," IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 759-771, Apr. 2006.
[8] S. Dao Trong, M.S. Schmookler, E.M. Schwarz, and M. Kroener, "P6 Binary Floating-Point Unit," Proc. 18th IEEE Symp. Computer Arithmetic (ARITH '07), pp. 77-86, 2007.
[9] N. Ide et al., "2.44-GFLOPS 300-MHz Floating-Point Vector-Processing Unit for High-Performance 3D Computer Graphics Computing," IEEE J. Solid-State Circuits, vol. 35, no. 7, pp. 1025-1033, July 2000.
[10] D.R. Lutz and C.N. Hinds, "A New Floating-Point Architecture for Wireless 3D Graphics," Proc. 38th Asilomar Conf. Signals, Systems and Computers (ACSSC '04), vol. 2, pp. 1879-1883, Nov. 2004.
[11] E.M. Schwarz, "Binary Floating-Point Unit Design: The Fused Multiply-Add Dataflow," High-Performance Energy-Efficient Microprocessor, V.G. Oklobdzija and R.K. Krishnamurthy, eds., Springer, 2006.
[12] K. Johguchi, Y. Mukuda, K. Aoyama, H.J. Mattausch, and T. Koide, "A 2-Stage-Pipelined 16 Port SRAM with 590 Gbps Random Access Bandwidth and Large Noise Margin," IEICE Electronics Express, vol. 4, no. 2, pp. 21-25, 2007.
[13] J.E. Lindholm, M.Y. Siu, S.S. Moy, S. Liu, and J.R. Nickolls, "Simulating Multiported Memories Using Lower Port Count Memories," US Patent US 7,339,592 B2, Nvidia Corporation, Mar. 2008.
[14] L. Yue, J.W. Berendsen, K.M. Abdalla, R.M. Bastos, and R. Danilak, "Architecture for Compact Multi-Ported Register File," US Patent US 7,339,592 B2, Nvidia Corporation, Mar. 2008.
[15] S. Thoziyoor, N. Muralimanohar, and N.P. Jouppi, "CACTI 5.0: An Integrated Cache Timing, Power, and Area Model," technical report, HP Laboratories, 2007.
[16] J.D. Owens, M. Houston, D. Luebke, S. Green, J.E. Stone, and J.C. Phillips, "GPU Computing," Proc. IEEE, vol. 96, no. 5, pp. 879-899, May 2008.
[17] C. Patel et al., "Cost Model for Planning, Development and Operation of a Datacenter," http://www.hpl.hp.com/techreports/2005/HPL-2005-107R1.pdf, 2009.
[18] Predictive Transistor Models, http://ptm.asu.edu/, 2010.
[19] Hynix 1 Gb (32Mx32) GDDR5 SGRAM H5GQ1H24AFR Datasheet, http://www.hynix.com/datasheet/pdf/graphics H5GQ1H24AFR(Rev1.0).pdf, 2010.
[20] ATI Radeon HD 5870 GPU Feature Summary, http://www.amd.com, 2010.
[21] R.W. Brodersen, M.A. Horowitz, D. Markovic, B. Nikolic, and V. Stojanovic, "Methods for True Power Minimization," Proc. IEEE/ACM Int'l Conf. Computer Aided Design (ICCAD), pp. 35-42, Nov. 2002.