Pages: pp. 145-147
COMPUTATION is as old as recorded human history. Even floating-point computation can be traced back that long: Sumerian and then Babylonian sexagesimal numbers of more than 4000 years ago did not show trailing zeros, and could thus be considered floating-point values with just the significand represented explicitly and the exponent inferred from the context. The word “computation” itself comes from Latin (computare), and is more than 2000 years old.
Computer Arithmetic is one of the first subfields of Computer Architecture. It has evolved considerably since the early days of modern computers 40 or 50 years ago, when arithmetic-logic units (ALUs) processed only integer numbers and represented a significant fraction of a CPU's hardware. Floating-point arithmetic was added later and gradually, at first according to rules established by each manufacturer. A major milestone was the adoption of the IEEE Standard 754-1985 for Floating-Point Arithmetic.
For almost four decades now, many of the best ideas in computer arithmetic have been presented at, and recorded in the proceedings of, the IEEE Symposia on Computer Arithmetic, more commonly known as the ARITH Conferences. A total of 19 ARITH conferences have taken place so far, alternating every two years between Europe and the United States, with one exception (ARITH 14, which was held in Australia). In recent years it has also become a tradition to dedicate a special section of the Transactions on Computers to Computer Arithmetic following each conference.
In the beginning, the ARITH conferences focused on studies and innovations concerning hardware and software designs and algorithms for integer and floating-point calculations. The latter were initially aimed mostly at scientific and engineering applications, but in recent years multimedia and graphics applications have grown in importance as well. Because of the compute-intensive nature of encryption/decryption algorithms, cryptography has also emerged as a distinct focus area of computer arithmetic.
While there are still advances to be made in integer arithmetic, most computer arithmetic research today is in the floating-point domain. A catalyst in this direction was, and still is, the IEEE Standard 754-2008 for Floating-Point Arithmetic, which replaced the old IEEE 754-1985 after a lengthy revision process. Another important factor influencing computer arithmetic research is the extraordinary increase in the performance and complexity of processors, which we are witnessing today and will continue to witness for the foreseeable future. Multi-core and many-core CPUs, multiple floating-point units, vector floating-point units, and the need to lower cost and power while increasing performance and maintaining simplicity and programmability all place constraints on today's computer architecture solutions.
Efficient computer arithmetic algorithms, fast implementations, reliable designs utilizing the best-suited number systems and representations, and thorough verification and validation are all necessary for any modern computing device. The computer arithmetic field remains a fertile and interesting area for research and innovation. The present issue of the Transactions on Computers is proof of this fact.
This special section hosts 11 high-quality research papers in computer arithmetic. They are the result of a selection from more than 50 manuscripts submitted in response to an open call for papers that followed the 19th IEEE International Symposium on Computer Arithmetic, which took place in Portland, Oregon, in June 2009. The papers, selected through a peer review process, represent a wide range of topics, and all received positive evaluations from expert referees. The papers in this special section are grouped into three categories: Basic Arithmetic Operations and Number Systems; Polynomial Evaluation and Elementary Functions; and Cryptography.
Basic Arithmetic Operations and Number Systems. The first paper in this category is “Reducing the Computation Time in (Short Bit-Width) Two's Complement Multipliers”, authored by Fabrizio Lamberti, Nikos Andrikos, Elisardo Antelo, and Paolo Montuschi. The authors present a technique that reduces by one the maximum height of the radix-4 partial product array generated by modified Booth encoding, without increasing the delay of the partial product generation stage. This may allow for a faster compression of the partial product array and regular layouts, which is of interest in all multiplier designs, but especially in short bit-width two's complement multipliers for high-performance embedded cores.
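As background for readers less familiar with the starting point of this paper, the classical radix-4 modified Booth recoding (a schematic sketch, not the authors' height-reduction technique) can be illustrated in a few lines of Python:

```python
def booth_radix4(y, n):
    """Recode the n-bit two's-complement integer y (n even) into n/2
    radix-4 digits in {-2, -1, 0, 1, 2}, so that only n/2 partial
    products are needed instead of n."""
    bits = y & ((1 << n) - 1)   # two's-complement bit pattern of y
    ext = bits << 1             # append the implicit bit y_{-1} = 0
    # digit = -2*b2 + b1 + b0 for each overlapping 3-bit window
    table = {0: 0, 1: 1, 2: 1, 3: 2, 4: -2, 5: -1, 6: -1, 7: 0}
    digits = []
    for i in range(0, n, 2):
        window = (ext >> i) & 0b111   # bits y_{i+1}, y_i, y_{i-1}
        digits.append(table[window])
    return digits                # y == sum(d * 4**k for k, d ...)
```

Each digit selects a partial product from {0, ±x, ±2x}, all obtainable by shift and negation; the paper's contribution concerns how the resulting array of n/2 rows is then generated and compressed.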
The second paper is “Exact and Approximated Error of the FMA”, by Sylvie Boldo and Jean-Michel Muller. The fused multiply-add operation (FMA) was introduced in the IEEE Standard 754-2008 for Floating-Point Arithmetic. The paper extends earlier work on the computation of the exact error of the FMA, now under more general conditions and with a formal proof. A new algorithm is also presented that computes both an approximation to, and bounds on, the error of an FMA.
The third paper is “Improved Division by Invariant Integers”, by Niels Möller and Torbjörn Granlund. The problem of dividing a two-word integer by a single-word integer is solved as a multiplication by a precomputed single-word approximation of the reciprocal of the divisor, followed by a couple of adjustment steps, using cheaper multiplication operations than earlier methods. The algorithm gives a speedup of roughly 30 percent on AMD and Intel processors in the x86_64 family.
The fourth paper of this group is “Simulation-Based Verification of Floating-Point Division”, by Elena Guralnik, Merav Aharoni, Ariel J. Birnbaum, and Anatoly Koyfman. The authors present a simulation-based method for the verification of division - an operation with an exceptionally wide array of corner cases. FPgen, a test generation framework targeted at the floating-point datapath, was created and has been successfully used in the verification of a variety of hardware designs. The relevant verification tasks supplied with FPgen and the underlying algorithms used to target them are presented.
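A toy illustration of the spirit of corner-case-driven generation (far simpler than FPgen itself, whose tasks and algorithms are described in the paper): rather than sampling operands at random, one constructs operands guaranteed to land on a case of interest, here divisions whose quotient is exact - a class random operands essentially never hit. With 26-bit factors, the dividend fits exactly in binary64's 53-bit significand:

```python
import random

def exact_division_case(rng):
    """Build (x, d) whose binary64 quotient x / d is exact: pick a
    26-bit quotient q and divisor d, so x = q * d fits in 53 bits
    and the correctly rounded quotient must be exactly q."""
    q = float(rng.randrange(1, 1 << 26))
    d = float(rng.randrange(1, 1 << 26))
    x = q * d                  # exact product, no rounding occurs
    return x, d, q
```

A verification suite would compare the hardware's result on such directed cases against the known exact answer.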
The fifth paper is “Area-Efficient Multipliers Based on Multiple-Radix Representations”, authored by Vassil S. Dimitrov, Kimmo U. Järvinen, and Jithra Adikari. New algorithms for integer multiplication are introduced, based on a specific multiple-radix representation of one of the multiplicands. Theoretical analysis and experimental results are provided for such multipliers in a 0.18 µm CMOS technology, showing the advantages of the new method in 64-bit hardware implementations - better area and power consumption compared to reference multipliers.
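To give a concrete feel for multiple-radix representations, here is a greedy double-base decomposition, one simple member of the family (the authors' representation is more refined than this sketch): a multiplicand is written as a short sum of terms of the form 2^a * 3^b, so multiplication by it reduces to a few shifted and tripled partial products.

```python
def dbns_greedy(n):
    """Greedily decompose n > 0 as a sum of terms 2^a * 3^b, at each
    step subtracting the largest such term not exceeding n."""
    terms = []
    while n > 0:
        best = 1
        p3 = 1
        while p3 <= n:          # for each power of 3 up to n ...
            t = p3
            while t * 2 <= n:   # ... pack in as many factors of 2
                t *= 2
            best = max(best, t)
            p3 *= 3
        terms.append(best)
        n -= best
    return terms
```

The sums produced are typically very short, which is what translates into fewer partial products and smaller multiplier area.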
The sixth and last paper of this set is “A Real/Complex Logarithmic Number System ALU”, by Mark G. Arnold and Sylvain Collange. The paper shows how to reuse real Logarithmic Number System (LNS) hardware for the complex CLNS (which represents complex values in log-polar form), with specialized hardware, including a novel logsin unit that overcomes singularity problems and is smaller than the real-valued LNS ALU to which it is attached. FPGA synthesis shows the new CLNS ALU to be smaller than prior fast CLNS units. Accuracy trade-offs are also considered.
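For readers unfamiliar with LNS, the core trade-off can be sketched in a few lines (positive values only; sign handling and the table-based evaluation of the nonlinear function are where real implementations, including this paper's, differ): a value x is stored as log2(x), so multiplication becomes a cheap addition, while addition requires evaluating a nonlinear function.

```python
import math

def lns_mul(la, lb):
    """Product in LNS: log2(a * b) = la + lb, a plain addition."""
    return la + lb

def lns_add(la, lb):
    """Sum of positive values in LNS: log2(a + b) = la + sb(d),
    where sb(d) = log2(1 + 2**d) and d = lb - la <= 0."""
    if la < lb:
        la, lb = lb, la
    return la + math.log2(1.0 + 2.0 ** (lb - la))
```

In the complex CLNS of the paper, the analogous addition function develops singularities, which is what the proposed logsin unit addresses.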
Polynomial Evaluation and Elementary Functions. The first paper of this group is “Computing Floating-Point Square Roots via Bivariate Polynomial Evaluation”, by Claude-Pierre Jeannerod, Hervé Knochel, Christophe Monat, and Guillaume Revy. The authors show how to reduce the computation of correctly rounded square roots of binary floating-point values to fixed-point evaluation of some particular integer polynomials in two variables. This leads to high instruction-level parallelism and potentially low-latency implementations. Experiments carried out on an integer processor (ST231) demonstrated low latency, as expected.
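The flavor of the reduction can be hinted at with a toy univariate version (the paper uses carefully constructed bivariate integer polynomials; the quadratic interpolant and fixed-point format below are illustrative choices, not the authors'): since sqrt(m * 2^e) = 2^(e//2) * sqrt(m * 2^(e % 2)), it suffices to approximate sqrt on a fixed interval, and the polynomial can be evaluated entirely in integer arithmetic.

```python
import math

F = 30                                   # fixed-point fraction bits

def fx(x):
    return int(round(x * (1 << F)))      # real -> fixed point

def fx_mul(a, b):
    return (a * b) >> F                  # fixed-point multiply

# Quadratic interpolant of sqrt at the nodes 1, 1.5, 2 (coefficients
# derived here by divided differences; max error ~3e-3 on [1, 2]).
x0, x1, x2 = 1.0, 1.5, 2.0
y0, y1, y2 = math.sqrt(x0), math.sqrt(x1), math.sqrt(x2)
f01 = (y1 - y0) / (x1 - x0)
f12 = (y2 - y1) / (x2 - x1)
c2 = (f12 - f01) / (x2 - x0)
c1 = f01 - c2 * (x0 + x1)
c0 = y0 - c1 * x0 - c2 * x0 * x0

def sqrt_fx(x):
    """Approximate sqrt(x) for x in [1, 2) by Horner's rule, with all
    intermediate arithmetic in integer fixed point."""
    xf = fx(x)
    acc = fx(c2)
    acc = fx_mul(acc, xf) + fx(c1)
    acc = fx_mul(acc, xf) + fx(c0)
    return acc / (1 << F)
```

The paper shows how to organize the real (bivariate, higher-degree, correctly rounded) version of this computation so that the integer operations expose high instruction-level parallelism.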
The second paper of this group is “Midpoints and Exact Points of Some Algebraic Functions in Floating-Point Arithmetic”, by Claude-Pierre Jeannerod, Nicolas Louvet, Jean-Michel Muller, and Adrien Panhaleux. When implementing a function $\bf f$ in floating-point arithmetic with correct IEEE rounding and good performance, it is important to know whether there are input floating-point values $\bf x$ such that $\bf f(x)$ is either the middle of the interval between two consecutive floating-point numbers (a midpoint) or a floating-point number (an exact point). The paper proves whether or not such midpoints and exact points exist for several common algebraic functions and for various floating-point formats. The points are listed or characterized whenever possible. The results and techniques presented can be used with both the binary and the decimal formats defined in the IEEE Standard 754-2008 for Floating-Point Arithmetic.
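One of the classical results in this line can be checked exhaustively for a toy precision-p binary format (the paper's analysis is general and covers many more functions and formats): the square root of a floating-point number is never a midpoint, because a midpoint y has an odd (p+1)-bit significand, so x = y^2 would need an odd significand of more than p bits and cannot be representable.

```python
def sqrt_has_midpoint(p):
    """Search a toy precision-p binary format: is any midpoint y (odd
    significand of exactly p+1 bits) the square root of a p-bit
    float?  Equivalently, can y*y fit into a p-bit significand?
    (Since the significand of y*y is odd, no trailing zeros can be
    stripped, so its bit length is the test.)"""
    for m in range(2**p + 1, 2**(p + 1), 2):
        if (m * m).bit_length() <= p:
            return True
    return False
```

The same counting argument underlies several of the paper's proofs; its listed and characterized points are the cases where such obstructions fail.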
The third and last paper of the group is “Certifying the Floating-Point Implementation of an Elementary Function Using Gappa”, by Florent de Dinechin, Christoph Lauter, and Guillaume Melquiond. Certifying floating-point programs by hand is tedious and error-prone. The paper presents the Gappa tool, a proof assistant designed to make this task easier and more secure by automating the evaluation of rounding errors using interval arithmetic. Its input format is very close to the actual code that needs to be validated. It generates a formal proof of the results, which can be checked independently by a lower level proof assistant such as Coq. Examples are chosen from a widely used class of floating-point programs - elementary functions in a mathematical library.
Cryptography. The first paper in this last category is “Hybrid Binary-Ternary Number System for Elliptic Curve Cryptosystems”, by Jithra Adikari, Vassil S. Dimitrov, and Laurent Imbert. Single and double scalar multiplications are the most computationally intensive operations in elliptic-curve-based cryptosystems. The hybrid binary-ternary number system provides both short representations and small density. The authors present three novel algorithms for single and double scalar multiplication. A detailed theoretical analysis is provided, together with timings and fair comparisons over both tripling-oriented Doche-Icart-Kohel curves and generic Weierstrass curves. Experiments show that the new algorithms are almost always faster than their widely used counterparts.
The second paper of this group and also the closing one of the special section on Computer Arithmetic is “Fast Architectures for the $\eta$ T Pairing over Small-Characteristic Supersingular Elliptic Curves”, by Jean-Luc Beuchat, Jérémie Detrey, Nicolas Estibals, Eiji Okamoto, and Francisco Rodríguez-Henríquez. The paper is devoted to the design of fast parallel accelerators for the cryptographic $\eta$ T pairing on supersingular elliptic curves over finite fields of characteristics two and three. A novel hardware implementation of Miller's algorithm is proposed, based on a parallel pipelined Karatsuba multiplier. The authors present the careful choice of algorithms for the tower field arithmetic associated with the $\eta$ T pairing. A final exponentiation is still required to obtain a unique value, and the pairing accelerators are supplemented with a coprocessor responsible for this task. According to place-and-route results on Xilinx FPGAs, these designs improve both the computation time and the area-time trade-off compared to previous coprocessors.
Debjit Das Sarma