1. INTRODUCTION
The intense drive for low power-delay product has been at the
forefront of the escalated development for the leading-edge VLSI
and Giga-scale-integration (GSI) circuits. Residue Number
System (RNS) is popular in high performance arithmetic
applications such as digital signal processing systems because of
its inherent carry-free operations, parallelism and fault-tolerance
properties [1]. By decomposing large binary numbers into
smaller residues, addition and subtraction in RNS arithmetic have
no inter-digit carries or borrows, and multiplication can be
performed without the need to generate partial products.
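The carry-free property can be illustrated with a minimal Python sketch. The moduli (7, 8, 9), i.e. $2^3 - 1$, $2^3$, $2^3 + 1$, and the helper names `to_rns`, `rns_add` and `rns_mul` are illustrative assumptions, not part of the paper.

```python
from math import prod

# Illustrative 3-moduli set {2^3 - 1, 2^3, 2^3 + 1}; pairwise relatively prime.
MODULI = (7, 8, 9)

def to_rns(x, moduli=MODULI):
    """Binary-to-residue conversion: one small residue per modulus."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli=MODULI):
    """Digit-wise modular addition; no carry crosses between residue channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Digit-wise modular multiplication; each channel works independently."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))

M = prod(MODULI)  # dynamic range of this toy system
a, b = 19, 23
# The channel-wise results agree with the residues of the ordinary results.
assert rns_add(to_rns(a), to_rns(b)) == to_rns((a + b) % M)
assert rns_mul(to_rns(a), to_rns(b)) == to_rns((a * b) % M)
```

Each channel only handles operands a few bits wide, which is where the parallelism comes from.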
A typical RNS consists of three parts: the binary-to-residue
(B/R) converter which converts the binary data into their residue
representations, the residue arithmetic units (RAUs) which
perform the necessary arithmetic operations required by the
applications, and the residue-to-binary (R/B) converter, which
converts the RNS-represented results into their weighted
representations. The efficiency of each part largely depends on
the moduli set. The architectures of the RNS for general moduli
sets have been widely studied. Because of the lack of special
number theoretic properties of the general moduli sets, the R/B
converters and RAUs for the general moduli set RNS are usually
area-consuming [2] or implemented based on lookup tables
(LUTs) [3]. Although the cost of memory has been driven low
nowadays, the number of LUTs and the access time incurred by
the need to read these LUTs iteratively have made the
implementations inefficient for ASIC realization for RNS with
large dynamic ranges.
Special moduli sets have been used extensively to reduce
the hardware complexity in the implementation of RNS,
especially for R/B converters [4-14]. The moduli sets in the form
of $2^n \pm 1$ are most important, not only because of their efficient
B/R and R/B converters, but also due to the existence of efficient
modular adders and modular multipliers for their RAUs.
There are some efficient R/B converters for the 3-moduli
set [4-7], but the granularity of its dynamic range and parallelism
is limited and insufficient for the contemporary high performance
and fault-tolerant applications. Moduli sets obtained by
extending the popular 3-moduli set through the addition of
moduli in the form of $2^n \pm 1$ are called supersets. The 4-moduli
superset $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} + 1\}$ was proposed by Bhardwaj
et al. [8], but two of the moduli are in the form of $2^n + 1$, which
leaves part of the dynamic range unused. Vinod and
Premkumar proposed a more efficient 4-moduli superset $\{2^n - 1,
2^n, 2^n + 1, 2^{n+1} - 1\}$ [9], and its R/B converters were improved by
Cao et al. [10] using the efficient R/B conversion algorithm for
the 3-moduli set $\{2^n - 1, 2^n, 2^n + 1\}$. Skavantzos and Abdallah
proposed a class of conjugate moduli sets [11]. Although it is a
high-cardinality set, the moduli are not pairwise relatively prime.
Consequently, its dynamic range is reduced, the moduli are
unbalanced and the conversion delay is long. Skavantzos also
proposed a 5-moduli set $\{2^{n+1}, 2^n - 1, 2^n + 1, 2^n + 2^{(n+1)/2} + 1,
2^n - 2^{(n+1)/2} + 1\}$, valid for odd $n$ [12]. As the two extended
moduli are not in the form of $2^n \pm 1$, the B/R and R/B
converters and the RAUs will be less efficient when compared to
those of supersets.
In this paper, we propose a new 5-moduli superset $\{2^n - 1,
2^n, 2^n + 1, 2^{n+1} - 1, 2^{n-1} - 1\}$, which is valid for even values of $n$.
Our proposed algorithm for its R/B converter is based on the
mixed-radix conversion (MRC) technique and the R/B
conversion algorithm for the 4-moduli superset [10] wherein an
efficient R/B converter for the popular 3-moduli set proposed in
either [6] or [7] has been used. The derived R/B converter is
based on full adders (FAs). Such FA-based design has the
advantage of being design automation friendly and can be readily
pipelined to suit the throughput rate constrained by the
application.
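The restriction to even $n$ can be checked numerically. The sketch below is only an illustration; the function names `superset5` and `pairwise_coprime` are hypothetical. It tests pairwise relative primality of the proposed moduli over a range of $n$.

```python
from itertools import combinations
from math import gcd

def superset5(n):
    """Moduli {2^n - 1, 2^n, 2^n + 1, 2^(n+1) - 1, 2^(n-1) - 1} for a given n."""
    return (2**n - 1, 2**n, 2**n + 1, 2**(n + 1) - 1, 2**(n - 1) - 1)

def pairwise_coprime(moduli):
    """True when every pair of moduli has gcd 1, as an RNS moduli set requires."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))

even_ok = all(pairwise_coprime(superset5(n)) for n in range(4, 41, 2))
odd_ok = any(pairwise_coprime(superset5(n)) for n in range(5, 41, 2))
print(even_ok, odd_ok)
```

For odd $n$ the moduli $2^{n+1} - 1$ and $2^{n-1} - 1$ share the factor 3, since both exponents are then even, which is why the set is only usable for even $n$.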
2. BACKGROUND
In an RNS, an integer $X$ can be represented by an $n$-tuple of
residues $(x_1, x_2, \ldots, x_n)$ defined over a set of pairwise relatively
prime moduli $S = \{P_1, P_2, \ldots, P_n\}$, where $\gcd(P_i, P_j) = 1$ for
$1 \le i, j \le n$ and $i \ne j$. The set $S$ is called the moduli set, while the
dynamic range of the system is defined by the product $M$ of all
the moduli. Any integer $X$ belonging to the ring $Z_M = \{0, 1, 2, \ldots,
M - 1\}$ has a unique RNS representation. The decomposition of
the binary number $X$ into an $n$-tuple of residues is called the B/R
conversion. The reverse process of combining all the residues
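to recover $X$ is called the R/B conversion. It can be performed with the Chinese Remainder Theorem (CRT):

$$X = \left| \sum_{i=1}^{n} M_i \left| M_i^{-1} \right|_{P_i} x_i \right|_M, \qquad (1)$$

where $M = \prod_{i=1}^{n} P_i$, $M_i = M / P_i$, and $\left| M_i^{-1} \right|_{P_i}$ denotes the
multiplicative inverse of $M_i$ modulo $P_i$.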
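Lemma 1: For even $n$, the multiplicative inverse $k_0$ of 18 modulo $2^{n-1} - 1$, i.e., the solution of

$$\left| 18 k_0 \right|_{2^{n-1}-1} = 1, \qquad (3)$$

is given by:
when $n = 6k - 2$, $k = 2, 3, 4, \ldots$,

$$k_0 = 2^{6k-5} + \sum_{i=0}^{k-2} 2^{6i+4} - \sum_{i=0}^{k-2} 2^{6i+1} = 2^{n-3} + 2^{4} + 2^{6 \cdot 1 + 4} + \cdots + 2^{6(k-2)+4} - 2^{1} - 2^{6 \cdot 1 + 1} - \cdots - 2^{6(k-2)+1}; \qquad (4)$$

when $n = 6k$, $k = 1, 2, 3, \ldots$,

$$k_0 = 2^{6k-2} + \sum_{i=0}^{k-1} 2^{6i+2} - 2^{0} - \sum_{i=0}^{k-2} 2^{6i+5} = 2^{n-2} + 2^{2} + 2^{6 \cdot 1 + 2} + \cdots + 2^{6(k-1)+2} - 2^{0} - 2^{5} - 2^{6 \cdot 1 + 5} - \cdots - 2^{6(k-2)+5}; \qquad (5)$$

when $n = 6k + 2$, $k = 1, 2, 3, \ldots$,

$$k_0 = 2^{6k} + \sum_{i=1}^{k} 2^{6i} - \sum_{i=1}^{k} 2^{6i-3} = 2^{n-2} + 2^{6 \cdot 1} + 2^{6 \cdot 2} + \cdots + 2^{6k} - 2^{6 \cdot 1 - 3} - 2^{6 \cdot 2 - 3} - \cdots - 2^{6k-3}. \qquad (6)$$

Proof: Consider $n = 6k - 2$. Summing the geometric series in (4) gives $k_0 = 2^{6k-5} + 2(2^{6k-6} - 1)/9$, so that

$$\left| 18 k_0 \right|_{2^{n-1}-1} = \left| 18 \left( 2^{6k-5} + \frac{2(2^{6k-6} - 1)}{9} \right) \right|_{2^{n-1}-1} = \left| 5 \cdot 2^{6k-3} - 4 \right|_{2^{n-1}-1} = \left| 5 (2^{6k-3} - 1) + 1 \right|_{2^{n-1}-1} = 1,$$

since $2^{6k-3} - 1 = 2^{n-1} - 1$. The cases $n = 6k$ and $n = 6k + 2$ can be proved similarly.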
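The closed forms of Lemma 1 can be cross-checked numerically. The sketch below is a verification aid under stated assumptions: the function name `k0_closed_form` is hypothetical, the signs of the power-of-two terms are those for which the congruence $|18 k_0|_{2^{n-1}-1} = 1$ actually holds, and the reference value comes from Python's built-in modular inverse `pow(18, -1, m)` (Python 3.8 or later).

```python
def k0_closed_form(n):
    """Inverse of 18 modulo 2^(n-1) - 1 as a signed sum of powers of two (even n >= 6)."""
    if n % 2 != 0 or n < 6:
        raise ValueError("this sketch covers even n >= 6 only")
    if n % 6 == 4:                      # n = 6k - 2, k >= 2
        k = (n + 2) // 6
        return (2**(6 * k - 5)
                + sum(2**(6 * i + 4) for i in range(k - 1))
                - sum(2**(6 * i + 1) for i in range(k - 1)))
    if n % 6 == 0:                      # n = 6k, k >= 1
        k = n // 6
        return (2**(6 * k - 2)
                + sum(2**(6 * i + 2) for i in range(k))
                - 2**0
                - sum(2**(6 * i + 5) for i in range(k - 1)))
    k = (n - 2) // 6                    # n = 6k + 2, k >= 1
    return (2**(6 * k)
            + sum(2**(6 * i) for i in range(1, k + 1))
            - sum(2**(6 * i - 3) for i in range(1, k + 1)))

for n in range(6, 41, 2):
    m = 2**(n - 1) - 1
    # Compare against the modular inverse computed directly.
    assert k0_closed_form(n) == pow(18, -1, m)
```

Because $k_0$ has only a few nonzero signed digits per six bit positions, multiplying by $k_0$ modulo $2^{n-1} - 1$ needs only shifts and additions.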
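The derivation also relies on two properties of modulo $2^k - 1$ arithmetic. For a $k$-bit number $x$ and an integer $r \ge 0$,

$$\left| 2^r x \right|_{2^k-1} = \mathrm{CLS}(x, r), \qquad (7)$$

$$\left| -2^r x \right|_{2^k-1} = \mathrm{CLS}(\bar{x}, r), \qquad (8)$$

where $\mathrm{CLS}(x, r)$ denotes the circular left shift of $x$ by $r$ bits and $\bar{x}$ is the one's complement of $x$.

Let $X^{(2)}$ be the integer recovered from $(x_1, x_2, x_3, x_4)$ by the R/B converter for the 4-moduli superset [10], which has the form

$$X^{(2)} = x_2 + 2^n T, \qquad T = (2^{2n} - 1) Z + 2^n Y_2 + Y_1. \qquad (9)$$

Applying the MRC to $X^{(2)}$ and $x_5$ gives

$$X = X^{(2)} + 2^n (2^{2n} - 1)(2^{n+1} - 1) \left| k_0 \left( x_5 - X^{(2)} \right) \right|_{2^{n-1}-1}, \qquad (10)$$

where $k_0$ satisfies

$$\left| k_0 \cdot 2^n (2^{2n} - 1)(2^{n+1} - 1) \right|_{2^{n-1}-1} = 1. \qquad (11)$$

Since $\left| 2^{n-1} \right|_{2^{n-1}-1} = 1$, (11) reduces to

$$\left| k_0 \cdot 2 \cdot 3 \cdot 3 \right|_{2^{n-1}-1} = \left| 18 k_0 \right|_{2^{n-1}-1} = 1,$$

so that the solution of (11) can be obtained by Lemma 1.
To simplify (10), let

$$L = \left| x_5 - X^{(2)} \right|_{2^{n-1}-1} \qquad (12)$$

and

$$R = \left| k_0 L \right|_{2^{n-1}-1}. \qquad (13)$$

By substituting the expression of $X^{(2)}$ given by (9) into (12),
$L$ can be calculated as follows:

$$L = \left| x_5 - x_2 - 2^n T \right|_{2^{n-1}-1} = \left| x_5 - x_2 - 2 \left( (2^{2n} - 1) Z + 2^n Y_2 + Y_1 \right) \right|_{2^{n-1}-1} = \left| x_5 - x_2 - 6Z - 4Y_2 - 2Y_1 \right|_{2^{n-1}-1}. \qquad (14)$$

Let

$$x_2 = 2^{n-1} X_{2,n-1} + x_2', \quad Y_2 = 2^{n-1} Y_{2,n-1} + Y_2', \quad Y_1 = 2^{n-1} Y_{1,n-1} + Y_1' \quad \text{and} \quad Z = 2^n Z_n + 2^{n-1} Z_{n-1} + Z',$$

where the primed quantities are the $n - 1$ least significant bits of the respective numbers. By substituting these
terms into the expression of $L$, and with the aid of Properties (7)
and (8), $L$ can be simplified as follows: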
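$$L = \left| \sum_{i=1}^{13} L_i \right|_{2^{n-1}-1}, \qquad (15)$$

where juxtaposition denotes bit concatenation, $1_m$ denotes a string of $m$ ones, and

$$L_1 = X_{5,n-2} X_{5,n-3} \cdots X_{5,1} X_{5,0} \qquad (16a)$$
$$L_2 = \bar{X}_{2,n-2} \bar{X}_{2,n-3} \cdots \bar{X}_{2,1} \bar{X}_{2,0} \qquad (16b)$$
$$L_3 = \bar{Z}_{n-3} \cdots \bar{Z}_1 \bar{Z}_0 \bar{Z}_{n-2} \qquad (16c)$$
$$L_4 = \bar{Z}_{n-4} \cdots \bar{Z}_0 \bar{Z}_{n-2} \bar{Z}_{n-3} \qquad (16d)$$
$$L_5 = \bar{Y}_{2,n-4} \cdots \bar{Y}_{2,0} \bar{Y}_{2,n-2} \bar{Y}_{2,n-3} \qquad (16e)$$
$$L_6 = \bar{Y}_{1,n-3} \cdots \bar{Y}_{1,1} \bar{Y}_{1,0} \bar{Y}_{1,n-2} \qquad (16f)$$
$$L_7 = 1_{n-2} \bar{X}_{2,n-1} \qquad (16g)$$
$$L_8 = 1_{n-5} \bar{Z}_n 1_3 \qquad (16h)$$
$$L_9 = 1_{n-4} \bar{Z}_n 1_2 \qquad (16i)$$
$$L_{10} = 1_{n-4} \bar{Z}_{n-1} 1_2 \qquad (16j)$$
$$L_{11} = 1_{n-3} \bar{Z}_{n-1} 1_1 \qquad (16k)$$
$$L_{12} = 1_{n-4} \bar{Y}_{2,n-1} 1_2 \qquad (16l)$$
$$L_{13} = 1_{n-3} \bar{Y}_{1,n-1} 1_1 \qquad (16m)$$

As there are some embedded constant strings of 1s in $L_7$ to
$L_{13}$, the modular summation $M$ of $L_7$ to $L_{13}$ can be simplified
substantially. After the simplification, $L$ can be calculated by (17),
where $C_M$ and $S_M$ are the carry and sum of $M$, respectively.

$$L = \left| \sum_{i=1}^{6} L_i + 2 \times C_M + S_M \right|_{2^{n-1}-1} \qquad (17)$$

[Fig. 1 Calculation of L: a tree of six (n-1)-bit carry-save adders, CSA 1 to CSA 6, with inputs $L_1$ to $L_6$, $c_M$ and $s_M$, and outputs $c_L$ and $s_L$]

With $k_0$ given by Lemma 1 and with Properties (7) and (8), $R = |k_0 L|_{2^{n-1}-1}$ becomes a modular sum of circularly shifted copies of $L$ and $\bar{L}$.
When $n = 6k - 2$,

$$R = \left| \mathrm{CLS}(L, n-3) + \mathrm{CLS}(L, 4) + \mathrm{CLS}(L, 6 \cdot 1 + 4) + \cdots + \mathrm{CLS}(L, 6(k-2)+4) + \mathrm{CLS}(\bar{L}, 1) + \mathrm{CLS}(\bar{L}, 6 \cdot 1 + 1) + \cdots + \mathrm{CLS}(\bar{L}, 6(k-2)+1) \right|_{2^{n-1}-1}. \qquad (18)$$

When $n = 6k$,

$$R = \left| \mathrm{CLS}(L, n-2) + \mathrm{CLS}(L, 2) + \mathrm{CLS}(L, 6 \cdot 1 + 2) + \cdots + \mathrm{CLS}(L, 6(k-1)+2) + \mathrm{CLS}(\bar{L}, 0) + \mathrm{CLS}(\bar{L}, 5) + \mathrm{CLS}(\bar{L}, 6 \cdot 1 + 5) + \cdots + \mathrm{CLS}(\bar{L}, 6(k-2)+5) \right|_{2^{n-1}-1}. \qquad (19)$$

When $n = 6k + 2$,

$$R = \left| \mathrm{CLS}(L, n-2) + \mathrm{CLS}(L, 6 \cdot 1) + \mathrm{CLS}(L, 6 \cdot 2) + \cdots + \mathrm{CLS}(L, 6k) + \mathrm{CLS}(\bar{L}, 6 \cdot 1 - 3) + \mathrm{CLS}(\bar{L}, 6 \cdot 2 - 3) + \cdots + \mathrm{CLS}(\bar{L}, 6k-3) \right|_{2^{n-1}-1}. \qquad (20)$$

[Fig. 2 Calculation of R]

Substituting (9) and (13) into (10), the term multiplying $2^n$ besides $T$ is

$$V = (2^{2n} - 1)(2^{n+1} - 1) R \qquad (21)$$

$$\;\; = 2^{3n+1} R + R - 2^{2n} R - 2^{n+1} R, \qquad (22)$$

where $2^{3n+1} R + R$ is simply the bit concatenation

$$\left( R_{n-2} R_{n-3} \cdots R_1 R_0 \; 0_{2n+2} \; R_{n-2} R_{n-3} \cdots R_1 R_0 \right)_2, \qquad (23)$$

with $0_{2n+2}$ a string of $2n + 2$ zeros, so that $V$ is obtained with subtractors only. Finally,

$$X = x_2 + 2^n (T + V). \qquad (24)$$

A 4n-bit binary adder is required to sum the values of $T$ and
$V$ in (24) and the resultant sum is concatenated to the residue $x_2$
to obtain $X$. Fig. 3 shows the final architecture of the R/B
converter for the proposed 5-moduli superset.

[Fig. 3 R/B converter for the proposed 5-moduli superset: the residue-to-binary converter for the 4-moduli set (built on the residue-to-binary converter for the 3-moduli set, producing $Y_2$, $Y_1$, and the calculation of $Z$), the calculation of $R$ from $x_2$ and $x_5$, and the 4n-bit adder computing $T + V$ to form $X$]

4. PERFORMANCE COMPARISONS

[Figure: area (NANDs) of the R/B converters, [12] versus Proposed]
[Figure: total conversion delay (ns) of the R/B converters, [12] versus Proposed]
[Figure: power (mW) of the R/B converters, [12] versus Proposed]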
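Independently of the hardware figures above, the conversion flow of (10) to (24) can be cross-checked with a small functional model. This is a sketch under stated assumptions, not the paper's architecture: all function names are hypothetical, a generic CRT routine stands in for the 4-moduli R/B converter of [10], and the shift amounts are those of the signed power-of-two form of the inverse of 18 modulo $2^{n-1} - 1$.

```python
from math import prod

def cls(x, r, width):
    """Circular left shift of the width-bit value x by r positions."""
    r %= width
    return ((x * 2**r) | (x // 2**(width - r))) & (2**width - 1)

def shift_terms(n):
    """Shift amounts applied to L (first list) and to its complement (second list)."""
    if n % 6 == 4:                                   # n = 6k - 2
        k = (n + 2) // 6
        return [n - 3] + [6*i + 4 for i in range(k - 1)], [6*i + 1 for i in range(k - 1)]
    if n % 6 == 0:                                   # n = 6k
        k = n // 6
        return [n - 2] + [6*i + 2 for i in range(k)], [0] + [6*i + 5 for i in range(k - 1)]
    k = (n - 2) // 6                                 # n = 6k + 2
    return [n - 2] + [6*i for i in range(1, k + 1)], [6*i - 3 for i in range(1, k + 1)]

def r_by_shifts(L, n):
    """R = |k0 * L| modulo 2^(n-1) - 1 using only circular shifts and additions."""
    w, m = n - 1, 2**(n - 1) - 1
    pos, neg = shift_terms(n)
    total = sum(cls(L, r, w) for r in pos) + sum(cls(m - L, r, w) for r in neg)
    return total % m

def crt(residues, moduli):
    """Generic CRT; a stand-in for the dedicated 4-moduli converter (assumption)."""
    M = prod(moduli)
    return sum(x * (M // p) * pow(M // p, -1, p) for x, p in zip(residues, moduli)) % M

def convert(residues, n):
    """Residue-to-binary conversion for the 5-moduli superset, even n >= 4."""
    x1, x2, x3, x4, x5 = residues
    four = (2**n - 1, 2**n, 2**n + 1, 2**(n + 1) - 1)
    m5 = 2**(n - 1) - 1
    X2 = crt((x1, x2, x3, x4), four)       # X^(2) = x2 + 2^n * T
    T = (X2 - x2) // 2**n
    L = (x5 - X2) % m5
    R = r_by_shifts(L, n)
    V = (2**(2*n) - 1) * (2**(n + 1) - 1) * R
    return x2 + 2**n * (T + V)

n = 6
moduli = (2**n - 1, 2**n, 2**n + 1, 2**(n + 1) - 1, 2**(n - 1) - 1)
X = 123456789
assert convert(tuple(X % p for p in moduli), n) == X
```

The complement `m - L` plays the role of $\bar{L}$ in Property (8), so no explicit multiplication by $k_0$ appears in `r_by_shifts`.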
5. CONCLUSION
In this paper, a new 5-moduli superset $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1,
2^{n-1} - 1\}$ RNS has been proposed, which is valid for even $n$. The
new R/B converter incorporates the R/B conversion algorithm of
the 4-moduli superset $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}$, which is in turn
based on the most efficient algorithms reported in the literature for the
popular 3-moduli set. Being a fundamentally FA-based design,
the proposed architecture of the R/B converter can be easily
pipelined to achieve a high throughput rate and is more amenable to
optimization by silicon compilers for different dynamic ranges.
Performance comparisons show that the proposed R/B converter
achieves better performance in area, delay and power
consumption than the existing advanced R/B converter of the
5-moduli set RNS [12]. The proposed 5-moduli set will also be
more efficient for the B/R conversion and the RAUs than its
counterpart, as all moduli are in the form of $2^n \pm 1$.
6. REFERENCES
[1] M. A. Soderstrand, W. K. Jenkins, G. A. Jullien and F. J.
Taylor, Residue Number System Arithmetic: Modern
Applications in Digital Signal Processing. New York: IEEE
Press, 1986.
[2] R. M. Capocelli and R. Giancarlo, "Efficient VLSI networks
for converting an integer from binary system to residue number
system and vice versa," IEEE Trans. Circuits Syst., vol. 35, no.
11, pp. 1425-1430, 1988.
[3] C. H. Huang, "A fully parallel mixed radix conversion
algorithm for residue number applications," IEEE Trans.
Comput., vol. 32, no. 4, pp. 398-402, 1983.
[4] S. Andraos and H. Ahmad, "A new efficient memoryless
residue to binary converter," IEEE Trans. Circuits Syst., vol. 35,
no. 11, pp. 1441-1444, 1988.
[5] A. A. Hiasat and H. S. Abdel-Aty-Zohdy, "Residue-to-binary
arithmetic converter for the moduli set $(2^k, 2^k - 1, 2^{k-1} - 1)$,"
IEEE Trans. Circuits Syst. -II, vol. 45, no. 2, pp. 204-209,
Feb. 1998.
[6] Z. Wang, G. A. Jullien and W. C. Miller, "An improved
residue-to-binary converter," IEEE Trans. Circuits Syst. -I, vol.
47, no. 9, pp. 1437-1440, Sep. 2000.
[7] Y. Wang, X. Song, M. Aboulhamid and H. Shen, "Adder
based residue to binary number converters for $(2^n - 1, 2^n, 2^n + 1)$,"
IEEE Trans. Signal Processing, vol. 50, no. 7, pp. 1772-1779, 2002.
[8] M. Bhardwaj, T. Srikanthan and C. T. Clarke, "A reverse
converter for the 4-moduli superset $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} + 1\}$,"
in Proc. of 14th IEEE Symp. on Computer Arithmetic, Adelaide,
Australia, pp. 168-175, Apr. 1999.
[9] A. P. Vinod and A. B. Premkumar, "A memoryless reverse
converter for the 4-moduli superset $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}$,"
J. of Circuits, Systems, and Computers, vol. 10, no. 1&2, pp. 85-99, 2000.
[10] B. Cao, C. H. Chang and T. Srikanthan, "New efficient
residue-to-binary converters for 4-moduli set $\{2^n - 1, 2^n, 2^n + 1,
2^{n+1} - 1\}$," in Proc. of IEEE Symp. on Circuits and Systems
(ISCAS-2003), vol. 4, pp. 536-539, May 2003.
[11] A. Skavantzos and M. Abdallah, "Implementation issues of
the two-level residue number system with pairs of conjugate
moduli," IEEE Trans. Signal Processing, vol. 47, no. 3, pp. 826-838, Mar. 1999.
[12] A. Skavantzos, "An efficient residue to weighted converter
for a new residue number system," in Proc. of the 8th Great
Lakes Symp. VLSI, LA, no. 9, pp. 185-191, Feb. 1998.
[13] S. J. Piestrak, "A high speed realization of residue to binary
number system converter," IEEE Trans. Circuits Syst. -II, vol.
42, no. 10, pp. 661-663, 1995.
[14] B. Cao, C. H. Chang and T. Srikanthan, "An efficient
reverse converter for the 4-moduli set $\{2^n - 1, 2^n, 2^n + 1, 2^{2n} + 1\}$
based on the New Chinese Remainder Theorem," IEEE Trans.
Circuits Syst. -I, vol. 50, no. 10, pp. 1296-1303, Oct. 2003.