<p><b>Abstract</b>—We present techniques for accelerating the floating-point computation of <tmath>x/y</tmath> when <tmath>y</tmath> is known before <tmath>x</tmath>. The proposed algorithms target architectures that provide fused multiply-accumulate (fused-mac) operations. The goal is to obtain exactly the same result as ordinary division with rounding to nearest. It is known that computing <tmath>1/y</tmath> in advance allows correctly rounded division to be performed in one multiplication plus two fused-macs. We show algorithms that reduce this latency to one multiplication and one fused-mac. This is achieved if a precision of at least <tmath>n+1</tmath> bits is available, where <tmath>n</tmath> is the number of mantissa bits in the target format, or if <tmath>y</tmath> satisfies some properties that can be easily checked at compile time. The algorithms require a double-word approximation of <tmath>1/y</tmath>, and we also show how to obtain it. These techniques can be used by compilers to accelerate some numerical programs without loss of accuracy.</p>
Computer arithmetic, floating-point arithmetic, division by software, division with fused-mac, compilation optimization.
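
The two operation counts mentioned in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's stated algorithms: the function names are hypothetical, the fused-mac is emulated with exact rational arithmetic, the three-operation sequence is one common arrangement of "one multiplication plus two fused-macs", and the two-operation sequence is only one plausible reading of "one multiplication and one fused-mac" with a double-word reciprocal. The sketch does not check the conditions on <tmath>y</tmath> under which the shorter sequence is correctly rounded.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused-mac: a*b + c computed exactly, then rounded once.

    Fraction arithmetic is exact and float(Fraction) rounds to nearest,
    so this behaves like a hardware fused-mac for finite double inputs
    (it raises on NaN or infinity, and ignores the sign of zero).
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def precompute_reciprocal(y):
    """Return a double-word approximation (z_h, z_l) of 1/y.

    z_h is 1/y rounded to nearest; z_l is the rounded residual 1/y - z_h.
    This stands in for the work a compiler could do ahead of time.
    """
    z_h = 1.0 / y
    z_l = float(1 / Fraction(y) - Fraction(z_h))
    return z_h, z_l


def div_three_ops(x, y, z_h):
    """Assumed arrangement: one multiplication plus two fused-macs."""
    q = x * z_h            # initial quotient estimate
    r = fma(-q, y, x)      # remainder x - q*y with a single rounding
    return fma(r, z_h, q)  # corrected quotient


def div_two_ops(x, z_h, z_l):
    """Assumed arrangement: one multiplication plus one fused-mac."""
    q1 = x * z_l            # contribution of the low word of 1/y
    return fma(x, z_h, q1)  # x*z_h + q1 with a single rounding


def count_mismatches(y, xs):
    """Count the x values for which div_two_ops differs from x / y."""
    z_h, z_l = precompute_reciprocal(y)
    return sum(1 for x in xs if div_two_ops(x, z_h, z_l) != x / y)
```

A helper such as `count_mismatches` can be used to compare the shorter sequence against the built-in division for a fixed divisor. A nonzero count for some divisor would be consistent with the abstract's statement that the shorter sequence needs extra precision or a suitable <tmath>y</tmath>.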

J. Muller, S. K. Raina and N. Brisebarre, "Accelerating Correctly Rounded Floating-Point Division when the Divisor Is Known in Advance," in IEEE Transactions on Computers, vol. 53, no. , pp. 1069-1072, 2004.