A Hardware Algorithm for Modular Multiplication/Division
January 2005 (vol. 54 no. 1)
pp. 12-21
A mixed radix-4/2 algorithm for modular multiplication/division suitable for VLSI implementation is proposed. The algorithm is based on the Montgomery method for modular multiplication and on the extended binary GCD algorithm for modular division. Both algorithms are modified and combined into the proposed algorithm so that almost all the hardware components are shared. The new algorithm carries out both calculations using simple operations such as shifts, additions, and subtractions. The radix-2 signed-digit representation is used to avoid carry propagation in all additions and subtractions. A modular multiplier/divider based on the algorithm performs an n-bit modular multiplication/division in O(n) clock cycles, where the length of the clock cycle is constant and independent of n. The modular multiplier/divider has a linear array structure with a bit-slice feature and can be implemented with much less hardware than is needed to implement a multiplier and a divider separately.
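
As a rough software illustration of the two building blocks named in the abstract, the sketch below shows a textbook radix-2 Montgomery multiplication and a textbook extended binary GCD modular division, each using only shifts, additions, and subtractions. It is not the paper's mixed radix-4/2 algorithm and does not model the signed-digit representation or the shared hardware. The function names (montgomery_multiply, modular_divide, _halve_mod) and their parameters are hypothetical, chosen here for illustration, and the modulus m is assumed to be odd.

```python
def montgomery_multiply(x, y, m, n):
    """Return x * y * 2^(-n) mod m (illustrative radix-2 Montgomery step).

    Assumes m is odd, m < 2^n, and 0 <= x, y < m.
    """
    acc = 0
    for i in range(n):
        if (x >> i) & 1:      # add y when bit i of x is set
            acc += y
        if acc & 1:           # make acc even by adding the odd modulus
            acc += m
        acc >>= 1             # exact division by 2
    if acc >= m:              # acc < 2m here, so one subtraction suffices
        acc -= m
    return acc


def _halve_mod(u, m):
    """Return u / 2 mod m for odd m and 0 <= u < m."""
    return u >> 1 if u & 1 == 0 else (u + m) >> 1


def modular_divide(x, y, m):
    """Return z with y * z = x (mod m), via the extended binary GCD.

    Assumes m is odd and gcd(y, m) == 1.
    Invariants: a * z = u (mod m) and b * z = v (mod m).
    """
    a, b = y % m, m
    u, v = x % m, 0
    while a != 0:
        while a & 1 == 0:     # strip factors of 2 from a
            a >>= 1
            u = _halve_mod(u, m)
        if a < b:             # keep a >= b so the subtraction stays non-negative
            a, b, u, v = b, a, v, u
        a -= b                # both odd, so the difference is even
        u -= v
        if u < 0:
            u += m
    if b != 1:
        raise ValueError("y is not invertible modulo m")
    return v
```

For example, modular_divide(7, 5, 13) yields 4, since 5 * 4 = 20 and 20 mod 13 = 7. Both loops run for a number of iterations linear in the operand length, which is consistent with the O(n) cycle count stated in the abstract, although the constant-length clock cycle there depends on the redundant representation that this sketch omits.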

Index Terms:
Computer arithmetic, hardware algorithm, modular multiplication, modular division, redundant representation, cryptography.
Marcelo E. Kaihara, Naofumi Takagi, "A Hardware Algorithm for Modular Multiplication/Division," IEEE Transactions on Computers, vol. 54, no. 1, pp. 12-21, Jan. 2005, doi:10.1109/TC.2005.1