14th IEEE Symposium on Computer Arithmetic (ARITH-14 '99)
Series Approximation Methods for Divide and Square Root in the Power3(TM) Processor
Adelaide, Australia, April 14-April 16
ISBN: 0-7695-0116-8
The Power3 processor is a 64-bit implementation of the PowerPC(TM) architecture and is the successor to the Power2(TM) processor for workstations and servers that require high-performance floating-point capability. The previous processors used Newton-Raphson algorithms for their implementations of divide and square root. The Power3 processor has a longer pipeline latency, which would substantially increase the latency of these instructions if the same algorithms were used. Instead, new algorithms based on power series approximations were developed, which provide significantly better performance than the Newton-Raphson algorithm on this processor. This paper describes the algorithms and then shows how both the series-based algorithms and the Newton-Raphson algorithms are affected by pipeline length. For the Power3, the power series algorithms reduce the divide latency by over 20% and the square root latency by 35%.
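To make the contrast between the two families of algorithms concrete, here is a minimal textbook sketch of a reciprocal computed both ways. It is not the Power3 algorithm, whose details are not given in this abstract. The function names, the initial estimate `y0` (standing in for a hypothetical table lookup), and the term counts are all assumptions chosen for illustration.

```python
def recip_newton(b, y0, iterations):
    """Newton-Raphson reciprocal: y <- y * (2 - b*y).

    Each iteration needs the result of the previous one,
    so the steps form a single dependent chain.
    """
    y = y0
    for _ in range(iterations):
        y = y * (2.0 - b * y)
    return y


def recip_series(b, y0, terms):
    """Power series reciprocal: 1/b = y0 * (1 + e + e^2 + ...), e = 1 - b*y0.

    The series is truncated after `terms` terms and summed with
    Horner's rule here purely for brevity.
    """
    e = 1.0 - b * y0
    s = 1.0
    for _ in range(terms - 1):
        s = 1.0 + e * s
    return y0 * s


def divide(a, b, y0, terms=8):
    """Illustrative quotient a/b built on the series reciprocal (no final rounding step)."""
    return a * recip_series(b, y0, terms)


if __name__ == "__main__":
    b, y0 = 3.0, 0.3  # y0 is an assumed rough estimate of 1/3
    print(recip_newton(b, y0, 3))
    print(recip_series(b, y0, 8))
    print(divide(1.0, b, y0))
```

In exact arithmetic, k Newton-Raphson iterations from `y0` produce the same value as the first 2^k terms of the series, so the two functions agree closely for 3 iterations and 8 terms. The difference that matters for a pipelined unit lies in how the operations depend on one another, and the series form leaves more freedom in how its terms are grouped and scheduled.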
Citation:
Martin S. Schmookler, Ramesh C. Agarwal, Fred G. Gustavson, "Series Approximation Methods for Divide and Square Root in the Power3(TM) Processor," p. 116, 14th IEEE Symposium on Computer Arithmetic (ARITH-14 '99), 1999