accumulate structures, e.g. high throughput and smaller energy consumption. The
energy consumption is reduced as the switching activity introduced by memory fetches is also reduced. The principle of DA starts from the general
description of an FIR filter as a sum of products:
$$y = \sum_{k=0}^{N-1} h[k]\,x[k] \qquad (3.1)$$
If the coefficients h[k] are known a priori, then the partial product term h[k] x[k]
becomes a multiplication by a constant. This property makes it possible to
use the Distributed Arithmetic technique.
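As a reference point, the sum of products in (3.1) can be evaluated directly with one multiplication per coefficient. The sketch below is only a baseline for comparison; the function name and the sample values are illustrative assumptions, not taken from the text.

```python
def fir_inner_product(h, x):
    """Direct evaluation of y = sum_k h[k] * x[k], one multiply per tap."""
    if len(h) != len(x):
        raise ValueError("h and x must have the same length")
    return sum(hk * xk for hk, xk in zip(h, x))


# Illustrative values only (assumed, not from the text).
h = [3, -1, 4, 2]   # constant coefficients, known a priori
x = [5, 7, 0, 6]    # input samples
y = fir_inner_product(h, x)
print(y)
```

Because every h[k] is fixed, each product here is a multiplication by a constant, which is the property DA exploits.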
To understand this paradigm, let's start by unfolding the FIR equation:
$$y = \sum_{k=0}^{N-1} h[k]\,x[k] = h[0]x[0] + h[1]x[1] + h[2]x[2] + \cdots + h[N-1]x[N-1]$$
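where each unsigned sample $x[k]$ is written in terms of its bits $x_b[k] \in \{0, 1\}$, with $B$ the number of bits per sample:
$$x[k] = \sum_{b=0}^{B-1} x_b[k]\,2^b \qquad (3.2)$$
Then
$$y = \sum_{k=0}^{N-1} \left( h[k] \sum_{b=0}^{B-1} x_b[k]\,2^b \right) \qquad (3.3)$$
$$\begin{aligned}
y = {} & h[0]\left(x_{B-1}[0]\,2^{B-1} + x_{B-2}[0]\,2^{B-2} + \cdots + x_0[0]\,2^0\right) + \\
& h[1]\left(x_{B-1}[1]\,2^{B-1} + x_{B-2}[1]\,2^{B-2} + \cdots + x_0[1]\,2^0\right) + \\
& \qquad \vdots \\
& h[N-1]\left(x_{B-1}[N-1]\,2^{B-1} + x_{B-2}[N-1]\,2^{B-2} + \cdots + x_0[N-1]\,2^0\right)
\end{aligned} \qquad (3.4)$$
Regrouping the terms by bit weight gives
$$\begin{aligned}
y = {} & \left(h[0]x_{B-1}[0] + h[1]x_{B-1}[1] + \cdots + h[N-1]x_{B-1}[N-1]\right) 2^{B-1} + \\
& \left(h[0]x_{B-2}[0] + h[1]x_{B-2}[1] + \cdots + h[N-1]x_{B-2}[N-1]\right) 2^{B-2} + \\
& \qquad \vdots \\
& \left(h[0]x_0[0] + h[1]x_0[1] + \cdots + h[N-1]x_0[N-1]\right) 2^0
\end{aligned}$$
$$y = \sum_{b=0}^{B-1} \left( 2^b \sum_{k=0}^{N-1} h[k]\,x_b[k] \right) = \sum_{b=0}^{B-1} \left( 2^b f(h[k], x_b[k]) \right)$$
where
$$f(h[k], x_b[k]) = \sum_{k=0}^{N-1} h[k]\,x_b[k], \qquad x_b = \left[\, x_b[0],\ x_b[1],\ \ldots,\ x_b[N-1] \,\right] \qquad (3.5)$$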
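The regrouping can be checked numerically. The following is a minimal sketch, assuming unsigned samples of B bits; the helper names `bit`, `f`, and `da_unsigned` and the sample values are hypothetical and chosen only for illustration.

```python
def bit(value, b):
    """Return bit b (0 = least significant) of a non-negative integer."""
    return (value >> b) & 1


def f(h, x, b):
    """f(h[k], x_b[k]) = sum_k h[k] * x_b[k], using only bit b of each sample."""
    return sum(hk * bit(xk, b) for hk, xk in zip(h, x))


def da_unsigned(h, x, num_bits):
    """y = sum_b 2^b * f(h[k], x_b[k]) for unsigned num_bits-bit samples."""
    return sum((1 << b) * f(h, x, b) for b in range(num_bits))


# Illustrative values only (assumed): four coefficients, 3-bit unsigned samples.
h = [3, -1, 4, 2]
x = [5, 7, 0, 6]
direct = sum(hk * xk for hk, xk in zip(h, x))
print(da_unsigned(h, x, num_bits=3), direct)
```

Both values agree, since the bit-level form is only a reordering of the same sum.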
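For signed two's complement samples of $B+1$ bits, the most significant bit $x_B[k]$ carries a negative weight:
$$x[k] = -2^B x_B[k] + \sum_{b=0}^{B-1} x_b[k]\,2^b$$
$$y = \sum_{k=0}^{N-1} \left( h[k] \left( -2^B x_B[k] + \sum_{b=0}^{B-1} x_b[k]\,2^b \right) \right)$$
Using a similar procedure as in the previous case, the inner product becomes
$$y = -2^B \sum_{k=0}^{N-1} h[k]\,x_B[k] + \sum_{b=0}^{B-1} \left( 2^b \sum_{k=0}^{N-1} h[k]\,x_b[k] \right)
= -2^B f(h[k], x_B[k]) + \sum_{b=0}^{B-1} \left( 2^b f(h[k], x_b[k]) \right)$$
The preferred implementation of $f(h[k], x_b[k])$ is to accept an N-bit input vector
$x_b = \left[\, x_b[0],\ x_b[1],\ \ldots,\ x_b[N-1] \,\right]$ and to produce the
output $f(h[k], x_b[k])$. Then each $f(h[k], x_b[k])$ is weighted by $2^b$, and finally all of them
are accumulated.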
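The signed case can be sketched in the same way. The code below assumes two's complement samples of B + 1 bits, where the sign bit has weight -2^B; the function name `da_signed` and the test values are illustrative assumptions.

```python
def da_signed(h, x, B):
    """Bit-level inner product for signed samples of B + 1 bits.

    Bits 0..B-1 carry weights 2^b; the sign bit B carries weight -2^B.
    """
    mask = (1 << (B + 1)) - 1
    # Masking a Python int yields its (B + 1)-bit two's complement pattern.
    words = [xk & mask for xk in x]

    def f(b):
        # f(h[k], x_b[k]) = sum_k h[k] * x_b[k]
        return sum(hk * ((w >> b) & 1) for hk, w in zip(h, words))

    return -(1 << B) * f(B) + sum((1 << b) * f(b) for b in range(B))


# Illustrative values only (assumed): 4-bit signed samples, so B = 3.
h = [3, -1, 4, 2]
x = [-5, 7, -8, 6]
print(da_signed(h, x, B=3), sum(hk * xk for hk, xk in zip(h, x)))
```

The only difference from the unsigned form is the subtraction of the sign-bit term.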
3.2
scaling accumulator. In DA, all the cumulative partial product outcomes are
precomputed and stored in a Look Up Table (LUT), which is addressed by the multiplier
bits. For a filter with N coefficients, the LUT has $2^N$ values.
We make use of a shift-adder as shown in Fig. 3.2.
The term $\sum_{k=0}^{N-1} h[k]\,x_b[k]$ has only $2^N$ possible
values. Hence it can be precalculated for all values and stored in a look-up
table of $2^N$ words addressed by N bits. For example, if the number of inputs is 4, then the
LUT will have $2^4 = 16$ memory words.
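Filling such a table amounts to summing the coefficients selected by each address. This is a minimal sketch; the function name `build_lut`, the coefficient values, and the bit ordering (most significant address bit selects h[0]) are assumptions made for illustration.

```python
def build_lut(h):
    """Precompute sum_k h[k] * a_k for every N-bit address a.

    Assumed ordering: the most significant address bit selects h[0],
    the least significant address bit selects h[N-1].
    """
    n = len(h)
    lut = []
    for addr in range(1 << n):
        total = 0
        for k in range(n):
            if (addr >> (n - 1 - k)) & 1:
                total += h[k]
        lut.append(total)
    return lut


# Illustrative coefficients only (assumed, not from the text).
lut = build_lut([2, 1, -2, 1])
print(len(lut), lut[0b0001], lut[0b1000], lut[0b1111])
```

For four coefficients the table has 16 words, matching the count given above.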
During the first iteration, the least significant bits of the K input samples form a K-bit address to the Lookup
Table for f(x,0), and that table's output becomes the initial value of the accumulator.
During the second iteration, the next to least significant bits
of the K input samples form another K-bit address to the Lookup Table for
f(x,1), and the adder sums the Lookup Table output to the contents of the
accumulator shifted by one bit. This process continues until the last iteration, where
the most significant bits of the K input samples form the last K-bit
address to the Lookup Table for f(x, N-1), and the adder sums the Lookup Table
output to the contents of the accumulator after shifting it to the corresponding
position. (In this description, K denotes the number of input samples and N the number of bits per sample.)
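The iteration described above can be sketched as a loop over bit positions, starting from the least significant bit. In this software sketch the LUT output is shifted left by the bit position before being added, which gives the same relative alignment as shifting the accumulator in hardware; that substitution, the unsigned-sample restriction, and all names and values are assumptions for illustration.

```python
def build_lut(h):
    """LUT of sum_k h[k] * a_k; the MSB of the address selects h[0] (assumed order)."""
    n = len(h)
    return [sum(h[k] for k in range(n) if (addr >> (n - 1 - k)) & 1)
            for addr in range(1 << n)]


def da_shift_accumulate(h, x, num_bits):
    """Bit-serial DA for unsigned samples: one LUT read and one add per bit."""
    lut = build_lut(h)
    acc = 0
    for b in range(num_bits):             # least significant bit first
        addr = 0
        for xk in x:                      # bit b of every sample forms the address
            addr = (addr << 1) | ((xk >> b) & 1)
        acc += lut[addr] << b             # weight the LUT output by 2^b
    return acc


# Illustrative values only (assumed): four taps, 3-bit unsigned samples.
h = [3, -1, 4, 2]
x = [5, 7, 0, 6]
print(da_shift_accumulate(h, x, num_bits=3))
```

No multiplier appears in the loop; each iteration costs one table read, one shift, and one addition.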
3.3
FIR filter is
$$y[n] = \sum_{k=0}^{3} h[k]\,x[n-k]$$
Address    Data
0000       0
0001       h3
0010       h2
0011       h2 + h3
0100       h1
0101       h1 + h3
0110       h1 + h2
0111       h1 + h2 + h3
1000       h0
1001       h0 + h3
1010       h0 + h2
1011       h0 + h2 + h3
1100       h0 + h1
1101       h0 + h1 + h3
1110       h0 + h1 + h2
1111       h0 + h1 + h2 + h3
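The table contents follow a simple rule: each set bit in the address selects one coefficient, with the leftmost address bit selecting h0. The sketch below regenerates the symbolic entries under that reading; the function name `lut_expressions` is a hypothetical helper.

```python
def lut_expressions(num_taps):
    """Return (address, expression) rows; the leftmost address bit selects h0."""
    rows = []
    for addr in range(1 << num_taps):
        terms = [f"h{k}" for k in range(num_taps)
                 if (addr >> (num_taps - 1 - k)) & 1]
        address = format(addr, f"0{num_taps}b")
        rows.append((address, " + ".join(terms) if terms else "0"))
    return rows


for address, expression in lut_expressions(4):
    print(address, expression)
```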
Step 1:
Store the values in the input buffer.
$x_3[0]\,x_2[0]\,x_1[0]\,x_0[0] = 1101$
$x_3[1]\,x_2[1]\,x_1[1]\,x_0[1] = 1010$
$x_3[2]\,x_2[2]\,x_1[2]\,x_0[2] = 0100$
$x_3[3]\,x_2[3]\,x_1[3]\,x_0[3] = 1111$
Step 2:
Read the values from the LUT for the corresponding values in the buffer.
Output of LUT:
O[0] = 0011 = 3
O[1] = 0010 = 2
O[2] = 0001 = 1
O[3] = 0100 = 4
Step 3:
Multiplying a value by 2 corresponds to a left shift by one bit.
Output = O[0] + (O[1] shifted left once) + (O[2] shifted left twice)
+ (O[3] shifted left three times).
Output = 3 + 4 + 4 + 32 = 43.
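The arithmetic of the three steps can be reproduced in a few lines. The text does not state the coefficient values; h = [2, 1, -2, 1] below is a hypothetical choice that happens to reproduce the quoted LUT outputs when the buffer contents are read as unsigned 4-bit samples, and it is used only as a consistency check.

```python
# LUT outputs quoted in Step 2, least significant bit position first.
lut_outputs = [3, 2, 1, 4]

# Step 3: weight the output for bit b by 2^b (left shift by b) and accumulate.
output = sum(o << b for b, o in enumerate(lut_outputs))

# Consistency check under ASSUMED coefficients (not given in the text).
h = [2, 1, -2, 1]
x = [0b1101, 0b1010, 0b0100, 0b1111]   # buffer contents from Step 1
per_bit = [sum(hk * ((xk >> b) & 1) for hk, xk in zip(h, x)) for b in range(4)]
direct = sum(hk * xk for hk, xk in zip(h, x))

print(output, per_bit, direct)
```

Under that assumption the per-bit sums match the quoted LUT outputs and the direct inner product equals the shift-and-add result.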
Disadvantage:
A filter with N coefficients requires a LUT with $2^N$ values. For higher-order filters,
the LUT size grows exponentially with N and requires more memory space.
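To give a sense of the growth, the number of LUT words can be tabulated for a few filter lengths. This is plain arithmetic on the $2^N$ figure above; the function name and the chosen values of N are illustrative.

```python
def lut_words(num_coefficients):
    """Number of LUT entries for a single table addressed by N bits."""
    return 1 << num_coefficients


sizes = {n: lut_words(n) for n in (4, 8, 16, 32)}
for n, words in sizes.items():
    print(f"N = {n:2d}: {words} words")
```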