A Fast, Efficient Parallel-Acting Method of Generating Functions Defined by Power Series, Including Logarithm, Exponential, and Sine, Cosine
Issue No.01 - January (1996 vol.7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.481596
<p><b>Abstract</b>: Trees and arrays are a fundamental means of implementing certain algorithms in parallel [<ref rid="bibl00331" type="bib">1</ref>]. A method of generating any function defined by a power series in a fast, efficient, parallel-acting manner using trees and arrays is described. The power series considered can be written as f(Y) = a<sub>0</sub> + a<sub>1</sub>Y + a<sub>2</sub>Y<super>2</super> + ..., where Y = v<sub>1</sub>x + v<sub>2</sub>x<super>2</super> + ... + v<sub>k</sub>x<super>k</super>, with each v<sub>i</sub> equal to 0 or 1, is a binary fraction when x = ½. The power series must be expanded into individual terms cx<super>i</super>. These terms are then transformed into weighted binary terms. Two methods are given to obtain all the individual terms (including coefficients) associated with each power of x. The hardware required for implementation is a tree similar to the Wallace or Dadda tree used for parallel multiplication of two binary numbers. Despite the multiplicity of terms required, Boolean logic methods reduce the tree dimensions in many cases, so that the total tree required is smaller than an existing multiplier tree. In that case, Schwarz and Flynn [<ref rid="bibl003313" type="bib">13</ref>], [<ref rid="bibl003315" type="bib">15</ref>] have shown that the required tree can be superimposed on the existing multiplier tree in a multiplexed manner with relatively little increase in hardware. The generation of the logarithmic function is described in detail. Comparisons with other methods are made for the case of 11-bit accuracy of the logarithm. Using a figure of merit of latency times area (number of transistors), estimates show that the superposition scheme gives the best (smallest) figure of merit. For 11-bit accuracy, the superposition scheme requires only about 480 additional gates to be superimposed upon a 41-bit or larger multiplier, and the speed of operation is that of the multiplier.</p>
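The expansion step in the abstract can be illustrated with a small software sketch. It is not the authors' hardware method or either of their two term-collection methods. It only shows what "expanding the power series into individual terms c x^i" means when Y = v_1 x + ... + v_k x^k and each v_i is 0 or 1. The function names (`expand_series`, `evaluate`, `direct`), the choice k = 3, and the truncated series Y - Y^2/2 + Y^3/3 for ln(1 + Y) are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def expand_series(coeffs, k):
    """Expand sum_n coeffs[n] * Y**n, where Y = v_1*x + ... + v_k*x**k.

    Returns {power_of_x: {frozenset_of_bit_indices: coefficient}}.
    Each v_i is 0 or 1, so v_i**2 == v_i and a product of bits is
    fully described by the set of distinct indices it involves.
    """
    # Current power of Y as {(power_of_x, bits): coefficient}; Y**0 == 1.
    power_of_y = {(0, frozenset()): Fraction(1)}
    terms = defaultdict(lambda: defaultdict(Fraction))
    for a in coeffs:
        if a != 0:
            for (p, bits), c in power_of_y.items():
                terms[p][bits] += a * c
        # Multiply the current power of Y by Y to get the next one.
        nxt = defaultdict(Fraction)
        for (p, bits), c in power_of_y.items():
            for i in range(1, k + 1):
                nxt[(p + i, bits | {i})] += c
        power_of_y = dict(nxt)
    return terms


def evaluate(terms, v, x=Fraction(1, 2)):
    """Sum the expanded terms for a concrete bit vector v = (v_1, ..., v_k)."""
    total = Fraction(0)
    for p, monomials in terms.items():
        for bits, c in monomials.items():
            # A product of bits is 1 only if every bit in it is 1.
            if all(v[i - 1] for i in bits):
                total += c * x**p
    return total


def direct(coeffs, v, x=Fraction(1, 2)):
    """Reference value: form Y first, then evaluate the truncated series."""
    y = sum(Fraction(b) * x**i for i, b in enumerate(v, start=1))
    return sum(a * y**n for n, a in enumerate(coeffs))


# Assumed example: ln(1 + Y) truncated after the cubic term, with k = 3 bits.
LOG_COEFFS = [Fraction(0), Fraction(1), Fraction(-1, 2), Fraction(1, 3)]
LOG_TERMS = expand_series(LOG_COEFFS, k=3)
```

In this sketch, `LOG_TERMS[2]` holds the terms attached to x^2: v_2 with coefficient 1 (from Y) and v_1 with coefficient -1/2 (from Y^2, since v_1 · v_1 = v_1). Collapsing repeated bits this way is one simple instance of the Boolean simplification the abstract credits with shrinking the tree. Turning the fractional coefficients into weighted binary terms, as the paper does, is not attempted here.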
Arrays, cosine, exponential, functions, logarithm, multinomials, multiplier tree, partitions, power series, sine.
David M. Mandelbaum, Stefanie G. Mandelbaum, "A Fast, Efficient Parallel-Acting Method of Generating Functions Defined by Power Series, Including Logarithm, Exponential, and Sine, Cosine", IEEE Transactions on Parallel & Distributed Systems, vol.7, no. 1, pp. 33-45, January 1996, doi:10.1109/71.481596