5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM '97)
Implementation of single precision floating point square root on FPGAs
Napa Valley, CA
April 16-18, 1997
ISBN: 0-8186-8159-4
Yamin Li, Comput. Archit. Lab., Univ. of Aizu, Aizu-Wakamatsu, Japan
Wanming Chu, Comput. Archit. Lab., Univ. of Aizu, Aizu-Wakamatsu, Japan
The square root operation is hard to implement on FPGAs because of the complexity of the algorithms. In this paper, we present a non-restoring square root algorithm and two very simple single precision floating point square root implementations on FPGAs based on that algorithm. The first is a low-cost iterative implementation that uses a traditional adder/subtracter; its operation latency is 25 clock cycles and its issue rate is one instruction every 24 clock cycles. The second is a high-throughput pipelined implementation that uses multiple adder/subtracters; its operation latency is 15 clock cycles and its issue rate is one clock cycle. In other words, the pipelined implementation can accept a new square root instruction on every clock cycle.
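To make the idea of a non-restoring square root concrete, the following is a minimal Python sketch of a textbook non-restoring integer square root. It is an illustration only and is not taken from the paper: the function name `nonrestoring_isqrt` and the `bits` parameter are assumptions, and the sketch does not model the floating point exponent handling, rounding, or the FPGA datapath. Each loop iteration produces one result bit with a single add or subtract, which is the property that lets a hardware version reuse one adder/subtracter per step; a negative partial remainder is carried into the next step instead of being restored.

```python
import math


def nonrestoring_isqrt(d, bits=32):
    """Return (q, r) with q = floor(sqrt(d)) and r = d - q*q.

    Textbook non-restoring formulation; `bits` (hypothetical parameter)
    is the even width of the radicand, so q has bits // 2 bits.
    """
    if bits % 2 != 0 or not (0 <= d < (1 << bits)):
        raise ValueError("bits must be even and 0 <= d < 2**bits")

    q = 0  # partial root, grows by one bit per iteration
    r = 0  # partial remainder, allowed to go negative
    for i in range(bits // 2 - 1, -1, -1):
        pair = (d >> (2 * i)) & 3      # next two radicand bits
        if r >= 0:
            # Previous step succeeded: bring down two bits, subtract (q, 01).
            r = r * 4 + pair - ((q << 2) | 1)
        else:
            # Previous step overshot: do not restore, add (q, 11) instead.
            r = r * 4 + pair + ((q << 2) | 3)
        # The sign of the new remainder decides the next root bit.
        q = (q << 1) | (1 if r >= 0 else 0)

    if r < 0:
        # A single final correction yields the true remainder.
        r += (q << 1) | 1
    return q, r


# Cross-check a few values against Python's exact integer square root.
for d in (0, 1, 2, 8, 9, 2**32 - 1):
    q, r = nonrestoring_isqrt(d)
    assert q == math.isqrt(d) and r == d - q * q
```

In this sketch the only per-iteration arithmetic is one addition or subtraction chosen by the sign of the remainder. How the two implementations in the paper map such steps onto clock cycles and pipeline stages is described in the paper itself, not here.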
Index Terms:
field programmable gate arrays; single precision floating point; square root; FPGAs; non-restoring square root algorithm; low-cost iterative implementation; adder/subtracter; pipelined implementation
Citation:
Yamin Li, Wanming Chu, "Implementation of single precision floating point square root on FPGAs," fccm, pp.226, 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM '97), 1997