
Shift/Add Unsigned Multiplication Algorithms
a   Multiplicand      $a_{k-1} a_{k-2} \cdots a_1 a_0$

x   Multiplier        $x_{k-1} x_{k-2} \cdots x_1 x_0$

p   Product (a × x)   $p_{2k-1} p_{2k-2} \cdots p_1 p_0$

Dec 2012

Shift partial products

Elementary School Algorithm
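      0110     multiplicand
      1001     multiplier
      ----
      0110     partial product (x0 = 1)
  +  0000      partial product (x1 = 0), shifted left by 1
     00110     running sum
  + 0000       partial product (x2 = 0), shifted left by 2
    000110     running sum
  +0110        partial product (x3 = 1), shifted left by 3
   0110110     product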
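The elementary school procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a hardware description; the function name `shift_add_multiply` and its parameters are assumptions made for this example.

```python
def shift_add_multiply(a, x, k):
    """Multiply two k-bit unsigned integers by shifting and adding."""
    p = 0
    for i in range(k):
        if (x >> i) & 1:   # multiplier bit x_i decides whether to add
            p += a << i    # add the multiplicand shifted left by i
    return p


if __name__ == "__main__":
    # Same operands as the worked example: 0110 times 1001
    print(format(shift_add_multiply(0b0110, 0b1001, 4), "07b"))  # 0110110
```

Each loop iteration corresponds to one row of the worked example: a zero multiplier bit contributes a row of zeros, a one contributes the shifted multiplicand.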

Combinational Multiplier

Each bit of the multiplier controls whether an addition occurs.

Array Multiplier
Regular layout
An n × m cell layout
Easy to pipeline
Used frequently in FPGAs and ASICs

Critical path: less than (n+m-1) bit-adder delays

Handles unsigned multiplication ONLY

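A 4 × 4 Unsigned Array Multiplier

Inputs: X3 X2 X1 X0 and Y3 Y2 Y1 Y0. The array is skewed for a rectangular layout.

Partial-product terms summed in each output column (each column also receives the carries from the column to its right):

P0: X0Y0
P1: X1Y0, X0Y1
P2: X2Y0, X1Y1, X0Y2
P3: X3Y0, X2Y1, X1Y2, X0Y3
P4: X3Y1, X2Y2, X1Y3
P5: X3Y2, X2Y3
P6: X3Y3
P7: carry out of the P6 column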
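The grouping of partial-product terms by output column can be reproduced with a short Python sketch. The helper names `partial_products_by_column` and `multiply_by_columns` are hypothetical and exist only for this illustration.

```python
def partial_products_by_column(n):
    """Group the terms XiYj of an n x n multiplier by weight i + j."""
    columns = {k: [] for k in range(2 * n - 1)}
    for i in range(n - 1, -1, -1):      # highest X index first within a column
        for j in range(n):
            columns[i + j].append(f"X{i}Y{j}")
    return columns


def multiply_by_columns(x, y, n):
    """Sum every bit product x_i AND y_j at weight 2^(i + j)."""
    total = 0
    for i in range(n):
        for j in range(n):
            bit = ((x >> i) & 1) & ((y >> j) & 1)
            total += bit << (i + j)
    return total


if __name__ == "__main__":
    for k, terms in partial_products_by_column(4).items():
        print(f"column {k}: {', '.join(terms)}")
```

The column with weight 3 holds four terms, the most of any column in the 4 × 4 case, which is why the middle of the array has the tallest stack of adders.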

Unsigned Array Multiplier


[Figure: 4 × 4 unsigned array multiplier. Each adder cell (+) has a carry-in (Ci), a carry-out (Cout), and a Sum output; the cells combine the partial-product bits x_i y_j, for i, j = 0..3, into the product bits P0 to P7.]
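A bit-level model of such an array can be sketched as rows of full-adder cells. The sketch below assumes each row is a ripple-carry chain that adds one row of partial-product bits to the running sum; the wiring in the slide's figure may differ in detail, and the names `full_adder` and `array_multiply` are illustrative.

```python
def full_adder(a, b, cin):
    """One adder cell: returns (sum, carry_out) for three input bits."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def array_multiply(x, y, n=4):
    """Unsigned n x n multiply using rows of full-adder cells."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(n)]
    # Row 0 is just the AND gates x_i y_0; the extra 0 is the top carry slot.
    acc = [xb[i] & yb[0] for i in range(n)] + [0]
    product = [acc[0]]                       # P0 leaves the array directly
    for j in range(1, n):
        upper = acc[1:]                      # bits aligned with x_i y_j
        carry = 0
        row = []
        for i in range(n):
            s, carry = full_adder(upper[i], xb[i] & yb[j], carry)
            row.append(s)
        row.append(carry)                    # carry out of the row
        acc = row
        product.append(acc[0])               # next low-order product bit
    product.extend(acc[1:])                  # remaining high-order bits
    return sum(bit << k for k, bit in enumerate(product))
```

Only AND gates and full adders appear in the model, which matches the claim that each multiplier bit merely gates whether a row of additions contributes.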

Signed Multiplication
Signed number representation

$$X = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i$$

Signed n × n multiplication

$(1110)_2 \times (0011)_2 = (1010)_2$, that is, $(-2) \times 3 = (-6)$


There is no difference from unsigned multiplication if the result has the same bit-width as the inputs.

But what if we want the result to be 2n bits wide?
Use sign-bit extension.
This needs a 2n × 2n array multiplier.
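Both cases can be checked with a small Python sketch that treats integers as n-bit two's complement patterns. The helper names (`to_signed`, `sign_extend`, `signed_multiply_2n`) are assumptions for this example, and Python's built-in `*` stands in for the unsigned array multiplier.

```python
def to_signed(v, bits):
    """Interpret a bit pattern as a two's complement number."""
    return v - (1 << bits) if (v >> (bits - 1)) & 1 else v


def sign_extend(v, from_bits, to_bits):
    """Replicate the sign bit of a from_bits pattern out to to_bits."""
    return to_signed(v, from_bits) & ((1 << to_bits) - 1)


def signed_multiply_2n(x, y, n):
    """2n-bit signed product via sign extension and an unsigned multiply."""
    w = 2 * n
    xe = sign_extend(x, n, w)
    ye = sign_extend(y, n, w)
    return (xe * ye) & ((1 << w) - 1)   # keep the low 2n bits


if __name__ == "__main__":
    x, y, n = 0b1110, 0b0011, 4
    # Same width as the inputs: the plain unsigned product already works.
    print(format((x * y) & 0b1111, "04b"))               # 1010
    # Double width: extend the sign bits first.
    print(format(signed_multiply_2n(x, y, n), "08b"))    # 11111010
```

Without sign extension the 8-bit result would be 00101010 (42), which is the unsigned product 14 × 3 rather than -6.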

Parallel Multiplication Algorithms

$$P = Y \times X = \left( \sum_{j=0}^{M-1} y_j 2^j \right) \left( \sum_{i=0}^{N-1} x_i 2^i \right) = \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} x_i y_j 2^{i+j}$$
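As a small worked instance of the double sum, take M = N = 2 with Y = (11)_2 = 3 and X = (10)_2 = 2. The operand values are chosen only for illustration.

```latex
\begin{aligned}
P &= \sum_{i=0}^{1} \sum_{j=0}^{1} x_i y_j 2^{i+j} \\
  &= x_0 y_0 2^0 + x_0 y_1 2^1 + x_1 y_0 2^1 + x_1 y_1 2^2 \\
  &= 0 \cdot 1 \cdot 1 + 0 \cdot 1 \cdot 2 + 1 \cdot 1 \cdot 2 + 1 \cdot 1 \cdot 4 \\
  &= 6 = (110)_2
\end{aligned}
```

Each term $x_i y_j$ is a single AND gate, and its weight $2^{i+j}$ fixes the column in which it is added.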

A dot diagram is a convenient way to illustrate a large array multiplication.

[Figure: dot diagram of the partial products, with multiplier bits x0 through x15.]


The most obvious way of adding k N-bit numbers is by cascading k-1 CPAs (carry-propagate adders).
This is slow and area-consuming, taking $O(kN)$ time and area.

Observation:
A full adder has three inputs x, y, and z. It produces an output s of weight 1 and an output c of weight 2.
The inputs are symmetric with respect to s and c.

Carry-Save Adder

The sum X+Y+Z can therefore be obtained by first summing xi+yi+zi in parallel for every bit position, producing the words C and S.

S and the left-shifted C are then summed by a CPA. This arrangement is called a Carry-Save Adder (CSA).
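The carry-save step can be sketched on whole words with bitwise operators, since every bit position is an independent full adder. The function names `carry_save_add` and `add_three` are assumptions for this example, and Python's `+` plays the role of the final CPA.

```python
def carry_save_add(x, y, z):
    """One CSA level: compress three words into a sum word and a carry word."""
    s = x ^ y ^ z                        # per-bit sum, weight 1
    c = (x & y) | (x & z) | (y & z)      # per-bit carry, weight 2
    return s, c


def add_three(x, y, z):
    """X + Y + Z as one CSA followed by one carry-propagate addition."""
    s, c = carry_save_add(x, y, z)
    return s + (c << 1)                  # shift C left to apply its weight


if __name__ == "__main__":
    print(add_three(5, 6, 7))  # 18
```

No carry moves between bit positions inside `carry_save_add`, so its delay does not depend on the word length; only the final addition propagates carries.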

Summation of k numbers requires stacking k-2 CSAs and a single CPA.

The resulting delay is $O(k+N)$ rather than $O(kN)$ if CPAs were used (the comparison is not exact).
The CSA was invented by von Neumann for an early digital computer (1946).
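The count of k-2 CSAs can be seen in a short Python sketch that chains carry-save levels and finishes with one ordinary addition. The name `multi_operand_add` and the returned CSA counter are assumptions made for this illustration.

```python
def carry_save_add(x, y, z):
    """Compress three words into (sum, carry); carry has weight 2."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def multi_operand_add(nums):
    """Add k numbers with a chain of CSAs and one final CPA.

    Returns (total, number_of_csa_levels_used).
    """
    if len(nums) < 3:
        return sum(nums), 0              # nothing to compress
    s, c = nums[0], nums[1]
    levels = 0
    for z in nums[2:]:
        s, carry = carry_save_add(s, c, z)
        c = carry << 1                   # realign the carry word
        levels += 1
    return s + c, levels                 # the single CPA


if __name__ == "__main__":
    print(multi_operand_add([3, 5, 7, 9, 11]))  # (35, 3)
```

With five operands the loop runs three times, matching k-2, and carries are propagated only once at the end.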

Unsigned Array Multiplication

The critical path has N CSAs and an M-bit CPA, yielding $O(N+M)$ delay.
The N LSBs are obtained directly from the sum outputs of the CSAs.
The M MSBs are obtained from the CPA.
The array can be squashed in layout to occupy a rectangle.
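The split between LSBs taken from the CSA rows and MSBs taken from the CPA can be modeled at word level. The sketch below assumes Y is the M-bit multiplicand and X the N-bit multiplier, with one CSA row per multiplier bit; the function name `csa_array_multiply` is hypothetical.

```python
def csa_array_multiply(x, y, n, m):
    """Unsigned multiply: n carry-save rows, then one m-bit carry-propagate add."""
    s = 0        # sum word carried between rows
    c = 0        # carry word carried between rows
    low = 0      # the n least significant product bits
    for i in range(n):
        pp = y if (x >> i) & 1 else 0            # partial product row x_i * Y
        new_s = s ^ c ^ pp                        # CSA sum outputs
        new_c = (s & c) | (s & pp) | (c & pp)     # CSA carry outputs
        low |= (new_s & 1) << i                   # bit i leaves via the sum output
        s = new_s >> 1                            # realign for the next row
        c = new_c                                 # weight 2 here is weight 1 there
    high = (s + c) & ((1 << m) - 1)               # final m-bit CPA gives the MSBs
    return (high << n) | low
```

After row i the invariant is that the partial result equals `low + 2**(i + 1) * (s + c)`, which is why a single addition of `s` and `c` at the end completes the product.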

Signed (two's complement) array multiplication, shown for inputs y5..y0 and x5..x0:

$$P = \sum_{i=0}^{N-2} \sum_{j=0}^{M-2} x_i y_j 2^{i+j} + x_{N-1} y_{M-1} 2^{M+N-2} - \left( \sum_{i=0}^{N-2} x_i y_{M-1} 2^{i+M-1} + \sum_{j=0}^{M-2} x_{N-1} y_j 2^{j+N-1} \right)$$

Partial products added directly:

x0y4 x0y3 x0y2 x0y1 x0y0
x1y4 x1y3 x1y2 x1y1 x1y0
x2y4 x2y3 x2y2 x2y1 x2y0
x3y4 x3y3 x3y2 x3y1 x3y0
x4y4 x4y3 x4y2 x4y1 x4y0
x5y5

Partial products subtracted, implemented by adding their bit complement together with the constant 1s shown in the figure:

x5y4 x5y3 x5y2 x5y1 x5y0
x4y5 x3y5 x2y5 x1y5 x0y5

Product bits: P11 P10 P9 P8 P7 P6 P5 P4 P3 P2 P1 P0
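The signed partial-product terms on this slide can be checked numerically against ordinary signed multiplication. The sketch below assumes X is an N-bit and Y an M-bit two's complement number; the function names are illustrative, and the code evaluates the terms arithmetically rather than modeling the bit-complement hardware trick.

```python
def to_signed(v, bits):
    """Interpret a bit pattern as a two's complement number."""
    return v - (1 << bits) if (v >> (bits - 1)) & 1 else v


def signed_product_from_terms(x, y, n, m):
    """Evaluate the signed product from its four groups of bit products."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(m)]
    # Terms with positive weight: magnitude bits times magnitude bits,
    # plus sign bit times sign bit.
    positive = sum((xb[i] & yb[j]) << (i + j)
                   for i in range(n - 1) for j in range(m - 1))
    positive += (xb[n - 1] & yb[m - 1]) << (m + n - 2)
    # Terms with negative weight: one sign bit times the other magnitude bits.
    negative = sum((xb[i] & yb[m - 1]) << (i + m - 1) for i in range(n - 1))
    negative += sum((xb[n - 1] & yb[j]) << (j + n - 1) for j in range(m - 1))
    return positive - negative


if __name__ == "__main__":
    # -2 times 3 with 6-bit operands
    print(signed_product_from_terms(0b111110, 0b000011, 6, 6))  # -6
```

In hardware the subtraction is avoided by adding the complemented rows plus constants, which is where the extra 1s in the figure come from.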

Physical layout

Notice how all the 1s were summed and propagated leftward.

[Figure: physical layout of the array, with inputs y5..y0 and x5..x0 and outputs P11..P0. Rows of partial-product cells, top to bottom:]

x5y0 x0y4 x0y3 x0y2 x0y1 x0y0
x5y1 x1y4 x1y3 x1y2 x1y1 x1y0
x5y2 x2y4 x2y3 x2y2 x2y1 x2y0
x5y3 x3y4 x3y3 x3y2 x3y1 x3y0
x5y4 x4y4 x4y3 x4y2 x4y1 x4y0
x5y5 x4y5 x3y5 x2y5 x1y5 x0y5
