Radix4 ASIC Design of a Scalable Montgomery
Modular Multiplier Using Encoding Techniques
L. A. Tawalbeh
Ms. Thesis, School of Electrical Engineering & Computer Science, Oregon State
University, October 24, 2002.
Abstract
Modular arithmetic operations (i.e., inversion, multiplication and
exponentiation) are used in several cryptography applications, such as
decipherment operation of RSA algorithm, DiffieHellman key exchange algorithm,
elliptic curve cryptography, and the Digital Signature Standard including the
Elliptic Curve Digital Signature Algorithm. The most important of these
arithmetic
operations is the modular multiplication operation since it is the core
operation in many cryptographic functions.
Given the increasing demands on secure communications, cryptographic algorithms
will be embedded in almost every application involving exchange of information.
Some of theses
applications such as smart cards and handhelds require hardware restricted in
area and power
resources.
Cryptographic applications use a large number of bits in order to be considered
secure. While some of these applications use 256bit precision operands, others
use precision values up to 2048 or 4096 such as in some exponentiationbased
cryptographic applications. Based on this characteristics, a scalable multiplier
that operates on any bitsize of the input values (variable precision) was
recently proposed. It is replicated in order to generate longprecision results
independently of the data path precision for which it was originally designed.
The multiplier presented in this work is based on the Montgomery multiplication
algorithm. This thesis work contributes by presenting a modified radix4
Montgomery multiplication algorithm
with new encoding technique for the multiples of the modulus. This work also
describes the scalable hardware design and analyzes the synthesis results for a
0.5 micro m CMOS technology. The results are compared with two other proposed
scalable Montgomery multiplier designs, namely, the radix2 design, and the
radix8 design. The comparison is done in terms of area, total
computational time and complexity.
Since modular exponentiation can be generated by successive multiplication, we
include in this thesis an analysis of the boundaries for inputs and outputs.
Conditions are identified to allow the use of one multiplication output as the
input of another one without adjustments (or reduction).
Highradix multipliers exhibit higher complexity of the design. This thesis
shows that radix4 hardware architectures does not add significant complexity to
radix2 design and has a significant
performance gain.
