A Novel Unified Algorithm and Hardware Architecture for Integrated Modular Division and Multiplication in GF(p) and GF(2n) Suitable for Public-Key Cryptography

L. A. Tawalbeh
Ph.D. Thesis, School of Electrical Engineering & Computer Science, Oregon State University, October 28, 2004.


The spread of the internet and communications techniques increases the necessity for security in applications that involves sharing or exchange of secret or private information. Public-key cryptography is widely used in establishing secure communication channels between the users on the Internet, for E-commerce transactions, and in network security protocols. Public-key cryptography relies on algorithms from computer arithmetic, number theory and algebra. The modular arithmetic operations, modular division, and modular multiplication over finite fields (GF(p) and GF(2^n)) are extensively used in many public-key cryptosystems, such as RSA, ElGamal cryptosystem, Diffie-Hellman key exchange algorithm, elliptic curve cryptography (ECC), and the Digital Signature Standard including the Elliptic Curve Digital Signature Algorithm. In our research, we have mainly concentrated on hardware realization of the ECC since it seems to provide similar amount of security using smaller key size.

The modular multiplication operation with a large modulus is very important in many public-key cryptosystems. One of the most efficient ways to compute modular multiplication is the Montgomery algorithm. Many efficient Montgomery multiplier designs were proposed up to now. On the other hand, computing modular division (inverse) is a time-consuming process and cannot be avoided completely. It was claimed that a gain in performance can be obtained when implementing the division (inverse) in hardware.

In this work, we propose, with a mathematical proof, an efficient unified division algorithm to compute the modular division operation in GF(p) and GF(2^n). The algorithm uses a counter to keep track of the difference between two field elements and this way eliminates the need for comparisons which are usually expensive and time-consuming. An hardware architecture implementing the algorithm is also proposed.

The unified division algorithm is integrated with a unified Montgomery multiplication algorithm to obtain a novel Unified Division/Multiplication Algorithm (UDMA). The UDMA computes division (inverse) and multiplication in a very efficient way in both GF(p) and GF(2^n) fields. Also, we propose a unified hardware architecture that efficiently supports all operations in the UDMA and uses carry-save unified adders for reduced critical path delay, making the proposed architecture faster than other previously proposed designs.

Experimental results obtained by synthesizing the hardware design for AMI 0.5 micron CMOS technology and FPGA VertixII chip (xc2vp50-7ff148 technology) are shown and compared with other proposed dividers and multipliers.

04 May 2005