828 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 3, APRIL 2008
Arithmetic Unit for Finite Field $GF(2^m)$

Tung-Chou Chen, Member, IEEE, Shyue-Win Wei, Senior Member, IEEE, and Hung-Jen Tsai
Abstract—An arithmetic unit (AU) that performs all basic arithmetic operations in the finite field $GF(2^m)$ is presented, where $m$ is an arbitrary integer. The presented finite field AU consists of an arithmetic processor, an arithmetic logic unit, and a control unit. The proposed AU has low circuit complexity and is programmable, so that any error-correcting decoder that operates in $GF(2^m)$ can be easily implemented with this AU.
Index Terms—Arithmetic unit (AU), error-correcting codes, finite field.
I. INTRODUCTION
THE FORWARD error-correction codes can be employed to improve system performance and thus have been widely used in computer and digital communication systems, such as the compact disc (CD), third-generation (3G) mobile systems, digital audio broadcasting (DAB), digital video broadcasting (DVB), and wireless metropolitan area networks (WMAN) [1]–[5]. Finite fields play an important role in the structure of error-correction codes [5]–[7]. Codes with symbols from the binary field $GF(2)$ or its extension $GF(2^m)$ are most widely used because information in computer and digital communication systems is universally coded in binary form for practical reasons.
To design an error-correction decoder with low circuit complexity, it is necessary to have a multifunction arithmetic circuit. The trend in designing such a circuit is therefore to reduce its complexity, shorten its computation delay, and increase its operating speed. Addition, multiplication, division, exponentiation, and inverse multiplication are the most basic arithmetic operations in a finite field, and several kinds of circuits, e.g., multipliers [8]–[14], dividers [15]–[17], inverters [15]–[18], exponentiators [19], and power-sum circuits [20], have been proposed to perform them. However, each of these circuits is designed for one specific arithmetic operation, which is not enough for, say, a forward error-correction decoder. An arithmetic circuit for the finite field $GF(2^m)$ that combines high speed, low complexity, and versatility is required. Therefore, an arithmetic unit (AU) that can perform all basic arithmetic operations in the finite field $GF(2^m)$ is proposed in this paper. The AU has low circuit complexity, so that an error-correction decoder built on it can be greatly simplified.

Manuscript received March 1, 2006; revised October 23, 2006, and March 12, 2007. This work was supported by the National Science Council of the Republic of China under Grant NSC87-2218-E-216-006. This paper was recommended by Associate Editor B. Zhao.
T.-C. Chen is with the Department of Communication Engineering, Chung Hua University, Hsinchu, Taiwan 300, R.O.C. (e-mail: tcchen@chu.edu.tw).
S.-W. Wei is with the Department of Electrical Engineering, National Chi-Nan University, Nantou, Taiwan 545, R.O.C. (e-mail: will@ncnu.edu.tw).
H.-J. Tsai is with the Beam Dynamics Group, National Synchrotron Radiation Research Center, Hsinchu, Taiwan 300, R.O.C. (e-mail: jacky@nsrrc.org.tw).
Digital Object Identifier 10.1109/TCSI.2008.919757
1549-8328/$25.00 © 2008 IEEE

II. AU FOR FINITE FIELD $GF(2^m)$
Let $F(x)$ be an irreducible polynomial over $GF(2)$ of degree $m$, i.e.,
$$F(x) = x^m + f_{m-1}x^{m-1} + \cdots + f_1 x + f_0 \tag{1}$$
where $f_i \in \{0, 1\}$ for $0 \le i \le m-1$. The finite field $GF(2^m)$ is an extension field of the prime field $GF(2)$ and consists of $2^m$ elements. Let the element $\alpha$ be a root of the irreducible polynomial $F(x)$. Then, because $F(\alpha) = 0$, the modulo polynomial $\alpha^m = f_{m-1}\alpha^{m-1} + \cdots + f_1\alpha + f_0$ is obtained. By this, all nonzero elements of $GF(2^m)$ can be represented as polynomials in $\alpha$ of degree at most $m-1$ with respect to the basis $\{1, \alpha, \ldots, \alpha^{m-1}\}$ [6].
The proposed AU, which can perform all basic arithmetic operations in the finite field $GF(2^m)$, includes an arithmetic processor (AP), an arithmetic logic unit (ALU), and control circuits. The AP is built on a calculating processor (CP) that can perform the $AB$ and $AB^2$ operations in the finite field $GF(2^m)$, where $A$ and $B$ are elements of $GF(2^m)$. Based on this CP, multiplication, division, exponentiation, and inverse multiplication can be performed. The main job of the ALU is to perform addition in the finite field $GF(2^m)$. With the control circuits added, all arithmetic operations in the finite field $GF(2^m)$ can be completed using this AU. The architecture of the AU is shown in Fig. 1.
Fig. 1. Structural diagram of AU.
Fig. 2. Structural diagram of CP.
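To make the polynomial-basis representation concrete, the following is a minimal software sketch, not part of the proposed hardware. It assumes the illustrative field $GF(2^4)$ with $F(x) = x^4 + x + 1$ (this polynomial is primitive, so the powers of $\alpha$ enumerate every nonzero element); the names `times_alpha`, `powers_of_alpha`, and `add` are hypothetical helpers introduced only for this example.

```python
# Sketch: elements of GF(2^m) stored as m-bit integers in the polynomial basis.
# Assumption: M = 4 and F(x) = x^4 + x + 1 are illustrative choices only.

M = 4
F = 0b10011  # bit i holds the coefficient of x^i, so this is x^4 + x + 1


def times_alpha(a, f=F, m=M):
    """Multiply element a by alpha, then reduce using F(alpha) = 0."""
    a <<= 1
    if (a >> m) & 1:   # a degree-m term appeared
        a ^= f         # replace alpha^m by f_{m-1} alpha^{m-1} + ... + f_0
    return a


def powers_of_alpha(f=F, m=M):
    """Return [alpha^0, alpha^1, ..., alpha^(2^m - 2)] as bit vectors."""
    out, a = [], 1
    for _ in range((1 << m) - 1):
        out.append(a)
        a = times_alpha(a, f, m)
    return out


def add(a, b):
    """Field addition is coefficient-wise XOR."""
    return a ^ b
```

In this sketch `powers_of_alpha()[4]` is `0b0011`, i.e., $\alpha^4 = \alpha + 1$, which is exactly the modulo polynomial of the assumed $F(x)$.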
A. CP
Based on the cellular-array power-sum circuit and the cellular-array multiplier [8], a CP is constructed. The CP performs $AB$ and $AB^2$ in the finite field $GF(2^m)$ and consists of an array of $m \times m$ identity cells, as shown in Fig. 2. Each identity cell includes three two-input AND gates, one two-input XOR gate, one three-input XOR gate, and a multiplexer, as shown in Fig. 3. Which of the two operations the CP performs is decided by a control signal. Assume the two input elements, $A$ and $B$, are, respectively, expressed in the basis $\{1, \alpha, \ldots, \alpha^{m-1}\}$ as
$$A = a_{m-1}\alpha^{m-1} + \cdots + a_1\alpha + a_0 \tag{2}$$
$$B = b_{m-1}\alpha^{m-1} + \cdots + b_1\alpha + b_0 \tag{3}$$
Fig. 3. Circuit of $(i, j)$ identity cell in CP.
Then, according to the irreducible polynomial and its modulo polynomial, we have
(4)
where
(5)
By this, a CP is designed to perform the $AB$ operation and the $AB^2$ operation. A signal … is used to control the operations. When the control signal …, … and the … operation is performed. This CP performs the … operation when …. At this time, … for … and …. The outcome of this CP is
(6)
For the $AB$ operation, the calculation result can be represented as follows [8]:
(7)
where …. Let … and …, and then let … and … for …. Finally, …. Therein, the computation of … can be performed as follows:
(8)
From (8), we obtain the following relation:
(9)
According to (7)–(9), the algorithm of $AB$ can be summarized in the following algorithm.
…;
for … to …,
    for … to …,
        …;
    end;
    …;
    for … to …,
        …;
    end;
end;
….
For the $AB^2$ operation, the algorithm can be represented as follows:
(10)
where …. Let … and …, and then let … and … for …. Finally, …. Therein, the computation of … can be performed as follows:
(11)
(12)
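The two CP operations can be modeled in software as follows. This is a hedged sketch of the arithmetic only: it does not reproduce the cell-level recurrences (7)–(12), and the names `xtime`, `gf_mul`, and `gf_mul_sq`, as well as the example field $GF(2^4)$ with $F(x) = x^4 + x + 1$, are assumptions made for illustration. The $AB^2$ routine relies on the fact that squaring is linear in characteristic 2, so $B^2 = \sum_j b_j \alpha^{2j}$.

```python
def xtime(a, f, m):
    """Multiply a by alpha in GF(2^m); f holds F(x) including its x^m bit."""
    a <<= 1
    return a ^ f if (a >> m) & 1 else a


def gf_mul(a, b, f, m):
    """Return A*B: accumulate A*alpha^j for every set bit b_j of B."""
    p = 0
    for j in range(m):
        if (b >> j) & 1:
            p ^= a              # a currently holds A * alpha^j
        a = xtime(a, f, m)
    return p


def gf_mul_sq(a, b, f, m):
    """Return A*B^2: accumulate A*alpha^(2j) for every set bit b_j of B."""
    p = 0
    for j in range(m):
        if (b >> j) & 1:
            p ^= a              # a currently holds A * alpha^(2j)
        a = xtime(xtime(a, f, m), f, m)
    return p


# Illustrative field (assumption): GF(2^4) with F(x) = x^4 + x + 1.
F16, M16 = 0b10011, 4
```

For instance, `gf_mul(0b1000, 0b0010, F16, M16)` evaluates $\alpha^3 \cdot \alpha = \alpha^4 = \alpha + 1$, and `gf_mul_sq(a, b, F16, M16)` agrees with `gf_mul(a, gf_mul(b, b, F16, M16), F16, M16)` for every pair of elements.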
In summary, the algorithm of $AB^2$ is as follows.
…;
for … to …,
    for … to …,
        …;
    end;
    …;
    …;
    for … to …,
        …;
    end;
end;
….
The complete algorithm of the CP can then be obtained by combining the algorithms of $AB$ and $AB^2$, as shown in the following algorithm.
…;
for … to …,
    for … to …,
        …;
    end;
    if …, then
        …;
        for … to …,
            …;
        end;
    else
        …;
        …;
        for … = 2 to …,
            …;
        end;
    end;
end;
….
Figs. 2 and 3 show the circuit diagram of the CP. It performs the $AB$ operation when the control signal is 0. At this time, the input signal …. As a result, the gate AND3 outputs 0, and the multiplexer MUX outputs … in each identity cell. Accordingly, for the $(i, j)$ identity cell
(13)
(14)
(15)
Then the identity cells in the $j$th column of the CP perform the computations of … and …. Finally, the output of the CP is …. The CP performs the $AB^2$ operation when the control signal is 1. At this time, the multiplexer MUX of each identity cell outputs …. Accordingly
(16)
(17)
For the $(i, j)$ identity cell
(18)
(19)
Therefore, the identity cells in the $j$th column of the CP perform the computations of … and …. Then the output of the CP is ….
The CP mentioned above can be modified to be a general CP of a finite field … (see Fig. 4). Like the above CP, the general CP is also an array of identity cells. Assuming this general CP is built from … identity cells, it can perform the $AB$ and $AB^2$ operations in all finite fields … for …, where $A$ and $B$ are elements of the finite field …. Furthermore, to adapt to the different-sized finite fields …, each identity cell is further provided with two two-input multiplexers, MUX1 and MUX2, and a control signal …. The control signal … is determined by the size of the finite field … and controls the multiplexers MUX1 and MUX2. The control signal … only for the …th row of identity cells, so that the multiplexer MUX1 can pass … to …, in the same row, and the other multiplexer MUX2 can pass … to …, in the same row. The other control signals … for …, so that the multiplexer MUX1 in all identity cells for … can receive … of the upper identity cell to its …, and the other multiplexer MUX2 in all identity cells for … can receive … of the upper identity cell to its …. Thus, the identity cells in the lower left part of this general CP perform the same arithmetic operations as the above-mentioned CP. Furthermore, the outputs …, … are redundant since the input signal … for …, that is determined by the maximum size of the finite field …
(20)
Therefore, the output of the general CP is
(21)
Thus, a general CP that can perform $AB$ and $AB^2$ in different-sized finite fields … can be designed.
Fig. 4. General CP. (a) Structural diagram. (b) Circuit of $(i, j)$ identity cell.
The I/O ports of a general CP include two input elements, $A$ and $B$; a control signal …; three control parameters …, …, and …; and an output element …. To reduce the number of I/O ports, the irreducible polynomial generator and the field-size controller can be designed with simple logic gates. By inputting several bits of control signals, parameters … and … can be obtained from the irreducible polynomial generator, and parameter … can be obtained from the field-size controller. Thus, the total number of I/O ports can be reduced. For example, a general CP for finite fields from … to $GF(2^{10})$ includes an array of $10 \times 10$ identity cells, an irreducible polynomial generator, and a field-size controller, as shown in Fig. 5. The irreducible polynomial generator and the field-size controller are controlled by three control signals …, …, and ….
Fig. 5. General CP for finite fields from … to $GF(2^{10})$.
B. AP
The AP is built on the CP and performs all arithmetic operations except addition. These arithmetic operations can be composed from four basic operations: loading, multiplication, exponentiation, and inverse multiplication. For example, division is implemented by combining multiplication and inverse multiplication. The detailed structural diagram of the AP is shown in Fig. 6, which includes a CP and additional control circuits and storage memories. For a finite field $GF(2^m)$, these control circuits and storage memories include five $m$-bit multiplexers, two groups of $m$-bit D-type flip-flops, an $m$-bit switch, and some logic gates generating control signals for the multiplexers. The input of the AP includes: …, control signal …, and …, while the output of the AP includes: ….
Below, the basic arithmetic operations (loading, multiplication, exponentiation, and inverse multiplication), which are controlled by two control signals, … and …, are, respectively, described.
1) Loading: When the control signals …, the AP performs the loading operation. This stores the input … in the register Register1 of the AP to serve as an initial value for the next instruction. At this time, the control signals for the multiplexers MUX1–MUX4 are, respectively, 0, 1, 1, 1. If the input …, then the two input elements input to the CP are … and …. Because the control signal … is 0, the CP performs the … operation and the outcome … is then loaded to the register Register1.
2) Multiplication: When the control signals …, the AP performs multiplication, multiplying the input … or the data stored in the register Register2 (determined by the multiplexer MUX5 controlled by the switch signal …) by the data stored in the register Register1. When …, the AP multiplies the input … by the data stored in Register1. When …, the AP multiplies the data stored in Register2 by the data stored in Register1. When executing this instruction, the CP performs the $AB$ operation because …, and the control signals of the multiplexers MUX2–MUX4 are, respectively, 0, 1, 1. The outcome is then stored back in Register1.
Fig. 6. Structural diagram of AP.
3) Exponentiation: When the control signals …, the AP performs exponentiation …, where … and …. Here … is an element in the finite field $GF(2^m)$, which is input from the input …, where … is between 0 and … and can be divided as …, then … can be expressed as
(21)
Clearly, exponentiation … can be implemented by … $AB^2$ operations of the CP. Therefore, the control signal … for … cycles so that the CP performs the $AB^2$ operation … times. The outcome of the …th cycle is stored in the register Register1 so as to feed back to the CP for the next operation
(22)
Furthermore, the outcome of the exponentiation is selected from … or … according to …. The control signal of the multiplexer MUX5 is … for … cycles, the control signal of the multiplexer MUX1 is … for the first cycle, the control signal of the multiplexer MUX2 is … for … cycles, the control signal of the multiplexer MUX3 is 0 for … cycles, and the control signal of the multiplexer MUX4 is …. Thus, the outcome of the exponentiation operation can be obtained in … cycles and stored in the register Register1. Moreover, while the exponentiation operation is executing, the outcome of each cycle is stored in the register Register1. Therefore, the data of the previous instruction stored in the register Register1 has to be transferred to the register Register2 (controlled by the signal …) for later use.
4) Inverse Multiplication: When the control signal …, the AP performs inverse multiplication $A^{-1}$, where …. In fact, $A^{-1} = A^{2^m-2}$ for the finite field $GF(2^m)$. Therefore, to perform $A^{-1}$ is to perform the exponentiation with exponent $2^m - 2$. Clearly, inverse multiplication, like exponentiation, is implemented by … $AB^2$ operations of the CP. Therefore, the control signal … for … cycles, so that the CP performs the $AB^2$ operation … times. The outcome of the …th cycle is stored in the register Register1 so as to feed back to the CP for the next operation. The control signal of the multiplexer MUX5 is … for … cycles, the control signal of the multiplexer MUX1 is … for the first cycle, the control signal of the multiplexer MUX2 is … for … cycles, the control signal of the multiplexer MUX3 is 1 for … cycles, and the control signal of the multiplexer MUX4 is …. Thus, the outcome of the inverse multiplication operation can be obtained in … cycles and stored in Register1. Furthermore, while the inverse multiplication operation is executing, the outcome of each cycle is stored in Register1, so the data of the previous instruction stored in Register1 has to be transferred to Register2 (controlled by the signal …) for later use.
Since it is able to perform loading, multiplication, exponentiation, and inverse multiplication, the AP can perform all arithmetic operations in the finite field $GF(2^m)$ except addition (accumulation), which can be implemented by the ALU.
C. ALU
Addition in the finite field $GF(2^m)$ can be simply implemented by XOR gates, and another register is provided to store the previous data when performing accumulation. When the accumulation is completed, the register is also refreshed. The overall ALU can be seen in Fig. 7. This circuit is designed to perform one accumulation in each cycle: it adds the data from the AP to the data stored in the register and writes the result back to the register. Whether or not accumulation is performed is determined by the control signal …. When …, the ALU receives the output of the AP and performs accumulation. When …, a zero element (0) in the finite field $GF(2^m)$ is sent to the ALU, so the output of the ALU remains the same.
Fig. 7. Circuit diagram of ALU.
Fig. 8. Verilog simulation result of example.
TABLE I
OPERATION PROCEDURE OF EXAMPLE
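The way exponentiation and inverse multiplication reduce to repeated $AB^2$ operations, and accumulation to XOR, can be sketched in software as below. This is an illustrative model under stated assumptions, not the AP's control sequence: the names `gf_mul_sq`, `gf_pow`, `gf_inv`, and `accumulate` are hypothetical, the exponent is assumed to fit in $m$ bits, one $AB^2$ operation is spent per exponent bit, and the example field is $GF(2^4)$ with $F(x) = x^4 + x + 1$.

```python
def xtime(a, f, m):
    """Multiply a by alpha in GF(2^m); f holds F(x) including its x^m bit."""
    a <<= 1
    return a ^ f if (a >> m) & 1 else a


def gf_mul_sq(a, b, f, m):
    """Return A*B^2, using B^2 = sum of b_j * alpha^(2j) in characteristic 2."""
    p = 0
    for j in range(m):
        if (b >> j) & 1:
            p ^= a
        a = xtime(xtime(a, f, m), f, m)
    return p


def gf_pow(a, n, f, m):
    """Return A^N for 0 <= N < 2^m with one AB^2 operation per exponent bit."""
    p = 1
    for i in reversed(range(m)):            # scan N from its most significant bit
        operand = a if (n >> i) & 1 else 1  # multiply by A only when the bit is set
        p = gf_mul_sq(operand, p, f, m)     # P <- operand * P^2
    return p


def gf_inv(a, f, m):
    """Return A^(-1) = A^(2^m - 2) for nonzero A."""
    return gf_pow(a, (1 << m) - 2, f, m)


def accumulate(values):
    """Model of the accumulation: XOR each incoming value into a register."""
    register = 0
    for v in values:
        register ^= v
    return register


# Illustrative field (assumption): GF(2^4) with F(x) = x^4 + x + 1.
F16, M16 = 0b10011, 4
```

Division can then be modeled as a multiplication by the inverse; multiplying by a product $X$ is the same as `gf_mul_sq(X, 1, F16, M16)` in this sketch, since $X \cdot 1^2 = X$.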
D. AU
Combining the AP, the ALU, and the control circuits, the overall arithmetic circuit for the finite field $GF(2^m)$ can be obtained. The proposed AU can perform all arithmetic operations in the finite field $GF(2^m)$. The proposed AU has low circuit complexity, so that an error-correcting decoder applying this AU can be greatly simplified.

III. SIMULATION AND CHIP IMPLEMENTATION
Based on the proposed architecture of the AU, a general AU for finite fields … was implemented. Register transfer level (RTL) code was written in Verilog to describe the circuit using gate-level descriptions. A Verilog test bench was written for initial code verification and functional simulation. For example, the function …, where … and … are elements of … with irreducible polynomial …, is executed by the AU. The Verilog simulation result is shown in Fig. 8, where “… clock” is the system global clock with a period of 40 ns. The signals … and … are the control signals used to control the multiplexers and registers in the AP circuit. The 2-bit signal … is used to select the basic arithmetic operation, that is, … for loading, … for multiplication, … for exponentiation, and … for inverse multiplication. The signal … is used to control the operation size of the finite field, and size … is used for this example. The signal … represents the input signals of the AU, used to input the elements …, and … in serial order. The signals … and … are the outputs of the AP and ALU, respectively. Table I shows the complete operation procedure of the example in the AU and can also be used to verify the simulation waveforms in Fig. 8. Finally, the RTL code was synthesized, compiled, and laid out using Synopsys and Cadence design tools. The TSMC (Taiwan Semiconductor Manufacturing Company) 0.6-μm SPDM CMOS technology cell library was adopted to implement the proposed design. The chip occupies an area of 2069.375 μm × 2069.375 μm including 40 bonding pads. The core utilization is 22%, and the core occupies an area of 967.875 μm × 967.875 μm. This chip has been fully tested and functionally performed as expected. This chip was fully functional at a clock speed of 25 MHz.

IV. CONCLUSION
This paper proposes an AU which can execute all basic arithmetic operations in the finite field $GF(2^m)$. This AU includes an AP, an ALU, and a control unit. Herein, the AP is built on a CP that can perform the $AB$ and the $AB^2$ operations in the finite field $GF(2^m)$. Based on the CP, this unit can perform multiplications, divisions, exponentiations, and inverses in one circuit. The main job of the presented ALU is to perform additions in the finite field $GF(2^m)$. By adding the control circuit, all arithmetic operations can be completed using this AU. Since the proposed AU can perform all arithmetic operations in the finite field, any error-correcting decoder working in the finite field $GF(2^m)$ can be implemented using this AU, including the well-developed Reed–Solomon decoder, the Bose–Chaudhuri–Hocquenghem (BCH) decoder, and others. Furthermore, the AU has low circuit complexity and is suitable to be combined with existing digital signal processing chips to extend their applications from real arithmetic to finite fields.

ACKNOWLEDGMENT
The authors would also like to thank the referees for their valuable comments and some important corrections.

REFERENCES
[1] W. Hoeg and T. Lauterbach, Digital Audio Broadcasting: Principles and Applications of Digital Radio. Chichester, U.K.: Wiley, 2003.
[2] R. D. Bruin and J. Smits, Digital Video Broadcasting: Technology, Standards, and Regulations. Natick, MA: Artech House, 1999.
[3] C. Smith and J. Meyer, 3G Wireless with WiMAX and WiFi: 802.16 and 802.11. New York: McGraw-Hill, 2005.
[4] T. R. N. Rao and E. Fujiwara, Error-Control Coding for Computer Systems. Upper Saddle River, NJ: Prentice-Hall, 1989.
[5] A. M. Michelson and A. H. Levesque, Error-Control Techniques for Digital Communication. New York: Wiley, 1985.
[6] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications. Englewood Cliffs, NJ: Prentice-Hall, 2004.
[7] S. B. Wicker and V. K. Bhargava, Reed-Solomon Codes and Their Applications. Piscataway, NJ: IEEE Press, 1994.
[8] B. A. Laws, Jr. and C. K. Rushforth, “A cellular-array multiplier for finite fields $GF(2^m)$,” IEEE Trans. Comput., vol. C-20, no. 12, pp. 1573–1578, Dec. 1971.
[9] C. S. Yeh, I. S. Reed, and T. K. Truong, “Systolic multipliers for finite fields $GF(2^m)$,” IEEE Trans. Comput., vol. C-33, no. 4, pp. 357–360, Apr. 1984.
[10] W. C. Tsai and S. J. Wang, “Two systolic architectures for multiplication in $GF(2^m)$,” Proc. IEE Comput. Digit. Tech., vol. 147, pp. 375–382, Nov. 2000.
[11] C. H. Kim, C. P. Hong, and S. Kwon, “A digit-serial multiplier for finite field $GF(2^m)$,” IEEE Trans. Very Large Scale Integr. (VLSI) Circuits Syst., vol. 13, no. 4, pp. 476–483, Apr. 2005.
[12] H. Wu, “Bit-parallel finite field multiplier and squarer using polynomial basis,” IEEE Trans. Comput., vol. 51, no. 7, pp. 750–758, Jul. 2002.
[13] C. Y. Lee, “Low complexity bit parallel systolic multiplier over $GF(2^m)$ using irreducible trinomials,” Proc. IEE Comput. Digit. Tech., vol. 150, no. 1, pp. 39–42, Jan. 2003.
[14] K. Y. Chang, D. Hong, and H. S. Cho, “Low complexity bit-parallel multiplier for $GF(2^m)$ defined by all-one polynomials using redundant representation,” IEEE Trans. Comput., vol. 54, no. 12, pp. 1628–1630, Dec. 2005.
[15] C. L. Wang and J. L. Lin, “A systolic architecture for computing inverses and divisions in finite field $GF(2^m)$,” IEEE Trans. Comput., vol. 42, no. 9, pp. 1141–1146, Sep. 1993.
[16] A. V. Dinh, R. J. Bolton, and R. Mason, “A low latency architecture for computing multiplicative inverses and divisions in $GF(2^m)$,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 48, no. 8, pp. 789–793, Aug. 2001.
[17] Z. Yan and D. V. Sarwate, “New systolic architectures for inversion and division in $GF(2^m)$,” IEEE Trans. Comput., vol. 52, no. 11, pp. 1514–1519, Nov. 2003.
[18] C. C. Wang, T. K. Truong, H. M. Shao, L. J. Deutsch, J. K. Omura, and I. S. Reed, “VLSI architectures for computing multiplications and inverses in $GF(2^m)$,” IEEE Trans. Comput., vol. C-34, no. 7, pp. 709–716, Jul. 1985.
[19] P. A. Scott, S. J. Simmons, S. E. Tavares, and L. E. Peppard, “Architectures for exponentiation in $GF(2^m)$,” IEEE J. Sel. Areas Commun., vol. 6, no. 3, pp. 578–586, Apr. 1988.
[20] S. W. Wei, “A systolic power-sum circuit for $GF(2^m)$,” IEEE Trans. Comput., vol. 43, no. 2, pp. 226–229, Feb. 1994.

Tung-Chou Chen (S’98–M’02) was born in Hsinchu, Taiwan, R.O.C., on July 2, 1969. He received the B.S. degree in communication engineering from National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in 1991, the M.S. degree in electrical engineering from Chung Hua University, Hsinchu, Taiwan, R.O.C., in 1996, and the Ph.D. degree in electronics from National Chiao Tung University in 2003.
Since 2003, he has been an Assistant Professor in the Department of Communication Engineering, Chung Hua University. His research interests include digital transmission systems, coding theory, and related VLSI circuit design.

Shyue-Win Wei (S’85–M’86–SM’97) was born in Changhua, Taiwan, R.O.C., on June 9, 1958. He received the M.S. degree in communications and the Ph.D. degree in electronics from National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in 1986 and 1990, respectively.
From 1980 to 1984, he was with the Institute of Police Telecommunications, Taiwan, R.O.C. In 1990, he joined Telecommunication Laboratories, Taiwan, R.O.C., where he worked on the development of a high-bit-rate digital subscriber line transmission system. From 1992 to 2000, he was with the Department of Electrical Engineering, Chung Hua University, Taiwan, R.O.C. Since 2000, he has been a Professor in the Department of Electrical Engineering, National Chi-Nan University, Nantou, Taiwan, R.O.C. His research interests include digital transmission systems, digital subscriber lines, coding theory, and related VLSI circuit design.

Hung-Jen Tsai was born on April 28, 1968. He received the M.S. degree in electrical engineering from Chung Hua University, Hsinchu, Taiwan, R.O.C., in 1997.
Since 1989, he has been a member of the Beam Dynamics Group, National Synchrotron Radiation Research Center, Hsinchu, Taiwan, R.O.C.
