Radix-4 ASIC Design of a Scalable Montgomery Modular Multiplier Using Encoding Techniques
L. A. Tawalbeh
Ms. Thesis, School of Electrical Engineering & Computer Science, Oregon State University, October 24, 2002.

Abstract

Modular arithmetic operations (i.e., inversion, multiplication and exponentiation) are used in several cryptography applications, such as decipherment operation of RSA algorithm, Diffie-Hellman key exchange algorithm, elliptic curve cryptography, and the Digital Signature Standard including the Elliptic Curve Digital Signature Algorithm. The most important of these arithmetic
operations is the modular multiplication operation since it is the core operation in many cryptographic functions.

Given the increasing demands on secure communications, cryptographic algorithms will be embedded in almost every application involving exchange of information. Some of theses
applications such as smart cards and hand-helds require hardware restricted in area and power
resources.

Cryptographic applications use a large number of bits in order to be considered secure. While some of these applications use 256-bit precision operands, others use precision values up to 2048 or 4096 such as in some exponentiation-based cryptographic applications. Based on this characteristics, a scalable multiplier that operates on any bit-size of the input values (variable precision) was recently proposed. It is replicated in order to generate long-precision results independently of the data path precision for which it was originally designed.

The multiplier presented in this work is based on the Montgomery multiplication algorithm. This thesis work contributes by presenting a modified radix-4 Montgomery multiplication algorithm
with new encoding technique for the multiples of the modulus. This work also describes the scalable hardware design and analyzes the synthesis results for a 0.5 micro m CMOS technology. The results are compared with two other proposed scalable Montgomery multiplier designs, namely, the radix-2 design, and the radix-8 design. The comparison is done in terms of area, total
computational time and complexity.


Since modular exponentiation can be generated by successive multiplication, we include in this thesis an analysis of the boundaries for inputs and outputs. Conditions are identified to allow the use of one multiplication output as the input of another one without adjustments (or reduction).

High-radix multipliers exhibit higher complexity of the design. This thesis shows that radix-4 hardware architectures does not add significant complexity to radix-2 design and has a significant
performance gain.
 

04 May 2005