Issue No.03 - March (2008 vol.57)
pp: 404-417
ABSTRACT
The unfolded and pipelined CORDIC is a high-performance hardware element that produces a wide variety of one- and two-argument functions with high throughput. Because this module demands substantial resources, reductions in its delay, power, and area (cost) are of significant interest. The linear approach to rotation has been proposed to achieve such reductions; however, the schemes for rotation (multiplication) and vectoring (division) complicate the implementation in a single unit. In this work, we improve the linear approximation scheme, leading to a unified implementation for rotation and vectoring in which fully parallel tree multipliers replace the second half of the CORDIC iterations. We also combine the linear approximation to rotation with the scale factor compensation so that the compensation is performed concurrently with the rotation process. We then extend the method to 3D CORDIC. Such an extension is not straightforward because no analytical expressions exist for the convergence of the algorithm. A comparison using a rough area-time model and synthesis results shows that our proposals may achieve significant reductions in delay with no increase in area in actual implementations.
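The idea of replacing the second half of the iterations with a multiplication can be illustrated with a small software sketch. This is not the authors' hardware design: it is a minimal floating-point model of 2D circular rotation, assuming that after the first half of the microrotations the residual angle z is small enough that cos z ≈ 1 and sin z ≈ z. The function name `cordic_rotate_hybrid` and the parameter `n_bits` are hypothetical, and the scale factor is compensated by a separate final division for simplicity, rather than concurrently with the rotation as the abstract describes.

```python
import math


def cordic_rotate_hybrid(x, y, angle, n_bits=32):
    """Rotate (x, y) by `angle` radians (|angle| below roughly 1.74).

    Illustrative model only: the first n_bits // 2 steps are standard
    CORDIC microrotations; the remaining rotation is done with one
    linear (multiplier-based) step instead of further iterations.
    """
    half = n_bits // 2
    z = angle      # residual angle still to be rotated
    scale = 1.0    # accumulated CORDIC gain of the microrotations

    for i in range(half):
        d = 1.0 if z >= 0 else -1.0
        t = 2.0 ** -i
        # Shift-and-add microrotation by d * atan(2^-i)
        x, y = x - d * y * t, y + d * x * t
        z -= d * math.atan(t)
        scale *= math.sqrt(1.0 + t * t)

    # Linear approximation for the small residual angle:
    # cos(z) ~ 1 and sin(z) ~ z, so the rotation needs two multiplications.
    x, y = x - y * z, y + x * z

    # Assumption of this sketch: gain removed by a plain division at the end.
    return x / scale, y / scale
```

With `n_bits = 32`, the residual angle after 16 microrotations is at most about 2^-15, so the error of the linear step is on the order of z²/2, roughly 2^-31, which is why half of the iterations suffice for the target precision in this model.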
INDEX TERMS
Arithmetic and Logic Structures, High-Speed Arithmetic, Algorithms, Computer arithmetic
CITATION
Elisardo Antelo, Julio Villalba, Emilio L. Zapata, "Low-Latency Pipelined 2D and 3D CORDIC Processors", IEEE Transactions on Computers, vol. 57, no. 3, pp. 404-417, March 2008, doi:10.1109/TC.2007.70796