R.Lauwereins
Imec 2001
Course contents
• Digital design
• Combinatorial circuits: without state
• Sequential circuits: with state
• FSMD design: hardwired processors
• Language based HW design: VHDL
FSMD design
• FSMDs
• Models
• Synthesis techniques
FSMD
• FSMD: Finite State Machine with Datapath
• FSMD = hardcoded processor
  - Consists of a datapath that performs the computations and a controller that tells the datapath which operations have to be carried out on which data
  - The controller always executes the same algorithm: hardcoded
• A traditional ASIC consists of multiple interconnected FSMDs
FSMD
[Figure: FSMD block diagram showing the datapath with its data inputs and data outputs]
FSMD design
• FSMDs
  - Datapath design
  - Controller design
• Models
• Synthesis techniques
Datapath design
• Datapath
  - Temporary storage: registers, register files, FIFOs, …
  - Functional units: arithmetic and logic units, shifters
  - Connections: buses, multiplexers, tri-state bus drivers
Datapath design
Task: $\sum_{i=1}^{2} x_i$
Algorithm (the assignments are processing, the FOR loop is control):
  sum = 0
  FOR i = 1 TO 2
    sum = sum + xi
  ENDFOR
  y = sum
Datapath construction rules:
• each variable and constant corresponds to a register
• each operator corresponds to a functional unit
• connect the outputs of registers to the inputs of functional units; when multiple outputs connect to the same input: MUX or bus with tri-state drivers
• connect the outputs of functional units to the inputs of registers; when multiple outputs connect to the same input: MUX or bus with tri-state drivers
Datapath design
Variables: sum
Operators: add
[Figure: datapath with input xi, register SUM (Reset, Load, Clk), adder Add and output y]
Controller states and control words (output order ‘Reset’, ‘Load’, ‘Out’):
  State    Reset Load Out   Next state
  Wait       1    0    0    Wait while Start=0, Add1 when Start=1
  Add1       0    1    0    Add2
  Add2       0    1    0    Output
  Output     0    0    1
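The controller and datapath of the summation example can be tried out in software. The following is a minimal sketch, not the course's own implementation: the function name `run_sum_fsmd`, the `start_cycle` parameter and the return from Output to Wait are assumptions made for illustration; the control words are the ones listed on the slide.

```python
# Control words per state in the slide's output order: (Reset, Load, Out).
CONTROL = {
    "Wait":   (1, 0, 0),
    "Add1":   (0, 1, 0),
    "Add2":   (0, 1, 0),
    "Output": (0, 0, 1),
}
# Unconditional successors; Output -> Wait is an assumption, not on the slide.
NEXT = {"Add1": "Add2", "Add2": "Output", "Output": "Wait"}


def run_sum_fsmd(xs, start_cycle=0):
    """Simulate the FSMD until it outputs y; xs holds the inputs x1, x2."""
    inputs = iter(xs)
    state, sum_reg, y, cycle = "Wait", 0, None, 0
    while y is None:
        reset, load, out = CONTROL[state]
        start = 1 if cycle >= start_cycle else 0
        if out:
            y = sum_reg                       # drive the output y
        # Register update on the active clock edge.
        if reset:
            sum_reg = 0
        elif load:
            sum_reg = sum_reg + next(inputs)  # the adder feeds register SUM
        # Next-state decision.
        if state == "Wait":
            state = "Add1" if start else "Wait"
        else:
            state = NEXT[state]
        cycle += 1
    return y
```

Calling `run_sum_fsmd([3, 4])` walks through Wait, Add1, Add2 and Output and returns the accumulated sum.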
Datapath design
Task: count the number of ‘1’s in a word
Algorithm:
  Data = Inport || OCnt = 0 || Mask = 1
  WHILE Data <> 0 DO
    Temp = Data AND Mask
    OCnt = OCnt + Temp || Data = Data >> 1
  ENDWHILE
  Outport = OCnt
[Figure: datapath with registers Data, OCnt, Mask and Temp and functional units <>0, AND, Add and >>1 between Inport and Outport; controller state diagram with states Wait, Load, Comp, Temp, Update and Out and their 6-bit control words, driven by the start input s and the zero flag z]
Datapath design
• Possible optimisations:
  - Register sharing: when the lifetimes of two variables do not overlap, both can be stored in the same register
  - Functional unit sharing: when two operations are never executed concurrently, they can be assigned to the same functional unit
  - Connection sharing: when two connections are never used concurrently, they can be shared
  - Register port sharing: when two registers are never read from (respectively written to) concurrently, they can be combined into a single register file
  - Operations that could be executed concurrently may also be executed sequentially, which facilitates the four previous optimisations
Datapath design
• Generic structure of the datapath:
[Figure: external input, temporary storage, functional units, external output]
Datapath design
• Typical datapath:
[Figure: Inport enters through a selector (S) into a register file with one write port (WA, WE) and two read ports (RA1, RE1 and RA2, RE2); a counter (R, L, C) and a register (R, L) sit next to it; output enables COE, RFOE1, RFOE2, ROE and OOE gate the buses; the result leaves through Outport]
Datapath design
• In the datapath of the previous slide a few decisions have been taken:
  - Only 1 instead of 2 result buses: ALU and barrel shifter cannot be used concurrently
  - Only 2 instead of 4 operand buses: e.g. Compare and ALU work on the same set of data
  - 9 registers with only 2 write ports and 3 read ports
  - Inport can only feed the register file
Datapath design
Instruction format: 32-bit instruction word
  Bits 31-28: R, L, C, COE (Counter)
  Bit  27:    S
  Bits 26-23: WA2, WA1, WA0, WE (Register File Write Port)
  Bits 22-18: RA2, RA1, RA0, RE1, RFOE1 (Register File Read Port 1)
  Bits 17-13: RA2, RA1, RA0, RE2, RFOE2 (Register File Read Port 2)
  Bits 12-10: R, L, ROE (Register)
  Bits 9-6:   F2, F1, F0, AOE
  Bits 5-1:   SH2, SH1, SH0, D, SOE
  Bit  0:     OOE
For reasons of simplicity, clarity and correctness, it is possible to assign a mnemonic to a certain bit pattern (e.g. ADD): assembly instruction
Datapath design
• The size of the instruction word may be reduced, since several operations cannot be executed concurrently
  - Either Register File Read Port 2 or the Register Read Port connects to the 1st Operand Bus (-1)
  - Either Register File Read Port 1 or the Counter Read Port connects to the 2nd Operand Bus (-1)
  - ALU and shift cannot occur concurrently: 1 bit is needed to select the operator and 4 bits control the operator (-2)
  - When the ALU operator is active, its output may immediately be placed on the result bus; likewise for the barrel shifter (-2)
  - For the counter the ‘Count’ and ‘Load’ operations are exclusive (-1)
• Additional limitations to concurrency may be introduced at the cost of increased execution time
Datapath design
• Design freedom
  Type          Fixed                   To be designed
  custom proc.  fixed algo              custom DP, custom Ctrl
  soft IP       fixed algo, DP          DP ext., custom Ctrl
  ASIP          algo class, DP, Ctrl    DP ext., Ctrl ext.
  Proc          any algo, DP, Ctrl      -
  [The speed, cost and design time columns of the slide are graphical]
Controller design
• So far the controller has been designed each time using the design method for FSMs discussed before
• For a large number of states this is a tedious job
• The next slides present alternative designs
Controller design
Standard FSM
[Figure: state flip-flops (D, Q, Clk) between the next state combinatorial logic S* = F(S, I) and the output combinatorial logic O = H(S, I)]
Controller design
Redrawn
[Figure: the same FSM redrawn; the next state logic takes the control inputs (CI) and status signals (SS) and feeds the state register; the output logic takes the current state, CI and SS and produces the control signals (CS) and control outputs (CO)]
Size of the state register:
  - ⌈log2 n⌉ bits for n states with straightforward or minimum-bit-change encoding
  - n bits for n states with one-hot encoding
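As a quick numeric check of the state register sizes above, here is a small sketch; the function name `state_register_bits` and the encoding labels are chosen here for illustration only.

```python
def state_register_bits(n_states, encoding="binary"):
    """Number of flip-flops in the state register for n_states states.

    'binary' covers straightforward and minimum-bit-change encoding
    (ceil(log2 n) bits); 'one-hot' uses one flip-flop per state.
    """
    if n_states < 1:
        raise ValueError("need at least one state")
    if encoding == "one-hot":
        return n_states
    # (n - 1).bit_length() equals ceil(log2 n) for n >= 2; keep at least 1 bit.
    return max(1, (n_states - 1).bit_length())
```

For the four-state summation controller this gives 2 flip-flops with binary encoding and 4 with one-hot encoding.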
Controller design
Critical path delay:
Find the longest combinatorial path from clock to clock:
ClkOutStateReg + OutputLogic + AddressToOutRegFile + BusDriver + BarrelShifter + BusDriver + Mux + SetupInPortRegFile
[Figure: the controller (next state logic, state register, output logic) connected to the typical datapath with register file, counter, register and Outport]
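The clock period follows directly from the sum of the delays along the critical path. The sketch below uses the path segments named on the slide; all delay values are hypothetical placeholders, since the slide gives no numbers.

```python
# Hypothetical delays in nanoseconds for each segment of the critical path.
CRITICAL_PATH_NS = [
    ("ClkOutStateReg", 1.0),
    ("OutputLogic", 2.0),
    ("AddressToOutRegFile", 3.0),
    ("BusDriver", 0.5),
    ("BarrelShifter", 4.0),
    ("BusDriver", 0.5),
    ("Mux", 1.0),
    ("SetupInPortRegFile", 1.0),
]


def min_clock_period_ns(path):
    """The clock period must cover the whole clock-to-clock path."""
    return sum(delay for _, delay in path)


def max_clock_mhz(path):
    """Highest clock frequency allowed by the critical path."""
    return 1000.0 / min_clock_period_ns(path)
```

With these placeholder values the minimum clock period is the plain sum of the eight segments; replacing any entry by a measured delay updates the bound.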
Controller design
Modification 1
[Figure: the log2 n-bit state register is followed by a log2 n-to-n decoder, so that the next state logic and output logic see the current state as a one-hot code]
Properties:
  * simple design and small next state and output logic, as with one-hot encoding
  * small number of flip-flops, as with straightforward and minimum-bit-change encoding
Controller design
• Modification 2
  - Often the state diagram shows an unconditional sequence of states, with only a few exceptions
  - E.g. Wait (100) loops while Start=0; once Start=1, the states Add1 (010), Add2 (010) and Output (001) follow unconditionally
Controller design
Modification 2
[Figure: a MUX in front of the state register selects the next state either from an incrementer (INC) on the current state or from the next state logic]
Controller design
• Advantage of modification 2:
  - The next state logic is very simple: for an unconditional next state, select the INC; only for a conditional next state does the hardware have to generate the next state
• Implementation of the INC:
  - ripple carry chain of half adders
  - INC and State Reg together form a synchronous counter
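The incrementer built from a ripple carry chain of half adders can be modelled bit by bit. This is an illustrative sketch; the names `half_adder` and `increment` and the LSB-first bit list are choices made here.

```python
def half_adder(a, b):
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def increment(bits):
    """Add 1 to a state code given as a list of bits, LSB first.

    The carry input of the first half adder is tied to 1; each carry
    output ripples into the next half adder. The final carry is dropped,
    so the result wraps around modulo 2**len(bits).
    """
    carry = 1
    result = []
    for bit in bits:
        s, carry = half_adder(bit, carry)
        result.append(s)
    return result
```

Feeding the output of `increment` back into the state register on every clock edge yields the synchronous counter mentioned above.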
Controller design
• Modification 3
  - Often the state diagram contains a part that is repeated several times: subroutine
  - [Figure: a state diagram of 7 states (s0 to s6) in which the same sequence occurs twice, next to an equivalent diagram of 5 states (s0 to s4) that reuses the sequence as a subroutine]
  - Only at run-time is it known which state follows the end of a subroutine: stack
Controller design
Modification 3
[Figure: a stack controlled by a Push/Pop' signal stores the return state; a MUX in front of the state register selects the next state from the next state logic or from the stack]
Controller design
Combination
[Figure: controller combining the three modifications: stack with Push/Pop', MUX, INC, log2 n-bit state register and log2 n-to-n decoder, next state logic and output logic]
Controller design
• Implementation of the next state logic and the output logic
  - Either construct a minimal AND-OR implementation via Karnaugh maps
  - Or put the truth table in a ROM table (this method is called microprogrammed control)
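To make the ROM-table option concrete, the sketch below stores the truth table of the small summation controller (states Wait, Add1, Add2, Output) in a list that plays the role of the ROM. The address layout (state bits followed by the Start input), the word layout and the return from Output to Wait are assumptions for illustration.

```python
WAIT, ADD1, ADD2, OUTPUT = range(4)


def build_rom():
    """Return the ROM as a list indexed by address = (state << 1) | start.

    Each word is (next_state, Reset, Load, Out).
    """
    rom = []
    for state in (WAIT, ADD1, ADD2, OUTPUT):
        for start in (0, 1):
            if state == WAIT:
                word = (ADD1 if start else WAIT, 1, 0, 0)
            elif state in (ADD1, ADD2):
                word = (state + 1, 0, 1, 0)
            else:
                word = (WAIT, 0, 0, 1)  # assumed return to Wait
            rom.append(word)
    return rom


def step(rom, state, start):
    """One controller cycle: look up the word addressed by state and input."""
    next_state, reset, load, out = rom[(state << 1) | start]
    return next_state, (reset, load, out)
```

Changing the controller's behaviour then only requires rewriting ROM words, not redesigning logic.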
Controller design
ROM table
[Figure: the combined controller with the next state logic and output logic replaced by a ROM table; stack with Push/Pop', MUX, INC and state register address the ROM, which delivers Next State, CS and CO]
Controller design
Be careful about timing!
Example:
  ReadFromExternal(A); || sum := 0;
  WHILE A <> 1
    sum := sum + A; || ReadFromExternal(A);
Each iteration of the WHILE loop (body, test and decision) should be executed in just one clock cycle!
[Figure: datapath with register A (load LA), register sum (load LS, reset RS), comparator Comp producing C, and adder Add; C=1 when A<>1]
No 3-state drivers: each bus only has one source
Controller design
Can the controller be state based?
Example:
  ReadFromExternal(A); || sum := 0;
  WHILE A <> 1
    sum := sum + A; || ReadFromExternal(A);
Animated sequence: A=5,2,1, expected sum=7. Reset is asynchronous.
State diagram (C=1 when A<>1; transitions labelled C=1 and C=0):
  s0: LA=1, RS=1, LS=0
  s1: LA=1, RS=0, LS=1
Animation: A = ?, 5, 2, 1; sum = ?, 0, 5, 7, 8
One count too many: sum=8 instead of 7
Controller design
Can the controller be input based?
Example:
  ReadFromExternal(A); || sum := 0;
  WHILE A <> 1
    sum := sum + A; || ReadFromExternal(A);
Animated sequence: A=5,2,1, expected sum=7. Reset is asynchronous.
State diagram (C=1 when A<>1):
  s0: LA=1, RS=1, LS=0
  s1: RS=0; when C=1: LA=1, LS=1; when C=0: LA=0, LS=0
Animation: A = ?, 5, 2, 1; sum = ?, 0, 5, 7
Result is correct. Always check timing!
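The difference between the state based and the input based controller can be reproduced with a small cycle-level simulation. This is a sketch under assumptions: the function name `run_controller`, the unconditional step from s0 to s1, and feeding 0 once the input stream is exhausted are illustrative choices not taken from the slides.

```python
def run_controller(inputs, mealy):
    """Simulate the WHILE-loop FSMD and return the final value of sum.

    mealy=False: state based, LA and LS are 1 whenever the FSM is in s1.
    mealy=True:  input based, LA and LS follow C (C=1 when A<>1) in s1.
    """
    stream = iter(inputs)
    # State s0: LA=1, RS=1, LS=0 -> load the first A, clear sum.
    a = next(stream)
    total = 0
    # State s1: one loop iteration per clock cycle.
    while True:
        c = 1 if a != 1 else 0
        if mealy:
            la = ls = c
        else:
            la = ls = 1
        # Clock edge: both registers update from the values before the edge.
        new_total = total + a if ls else total
        new_a = next(stream, 0) if la else a
        total, a = new_total, new_a
        if not c:
            return total
```

For the sequence 5, 2, 1 the state based variant still has LS=1 in the cycle where C drops to 0, so the terminating value 1 is added as well; the input based variant suppresses that load.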
FSMD design
• FSMDs
• Models
  - State-action table
  - Algorithmic-state-machine chart
• Synthesis techniques
State-action table
• The specification of an FSMD could be done using the traditional next state & output table
• However, for large designs this becomes impractical
• The next slide shows the next state & output table
State-action table
• Next state and output table
State-action table
• The next state and output table does not offer a good overview
  - often the next state depends on only a few of the inputs
  - often the data path variables do not change
• Hence, the same information as in the next state and output table is presented in a more condensed form: the state-action table (see next slide)
State-action table
  Present   Next state                Control and data path actions
  state     Condition     State      Actions
  S0        Start=0       S0         Output=Z
            Start=1       S1
  S1                      S2         Data=Inport
  S2                      S3         Ocount=0
  S3                      S4         Mask=1
  S4                      S5         Temp=Data AND Mask
  S5                      S6         Ocount=Ocount+Temp
  S6        Data <> 0     S4         Data=Data >> 1
            Data = 0      S7
  S7                      S0         Output=OCount
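A state-action table maps directly onto a table-driven interpreter. The sketch below encodes rows S1 to S7 of the ones-counter table as pairs of a next-state function and an action; the names `TABLE` and `run`, and starting in S1 (that is, assuming Start=1 has already been seen in S0), are assumptions for illustration.

```python
# Each row: (next-state function, action). Both see the variable values
# as they are before the clock edge; the action then updates them.
TABLE = {
    "S1": (lambda v: "S2", lambda v: v.update(Data=v["Inport"])),
    "S2": (lambda v: "S3", lambda v: v.update(Ocount=0)),
    "S3": (lambda v: "S4", lambda v: v.update(Mask=1)),
    "S4": (lambda v: "S5", lambda v: v.update(Temp=v["Data"] & v["Mask"])),
    "S5": (lambda v: "S6", lambda v: v.update(Ocount=v["Ocount"] + v["Temp"])),
    "S6": (lambda v: "S4" if v["Data"] != 0 else "S7",
           lambda v: v.update(Data=v["Data"] >> 1)),
    "S7": (lambda v: "S0", lambda v: v.update(Output=v["Ocount"])),
}


def run(inport):
    """Execute the table from S1 until control returns to S0."""
    v = {"Inport": inport, "Data": 0, "Ocount": 0, "Mask": 0,
         "Temp": 0, "Output": None}
    state = "S1"
    while state != "S0":
        next_state, action = TABLE[state]
        following = next_state(v)  # condition tested on the old values
        action(v)                  # registers updated on the clock edge
        state = following
    return v["Output"]
```

Each loop iteration corresponds to one clock cycle, so the number of iterations also shows how many cycles the straightforward table needs per input bit.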
Algorithmic-state-machine chart
• An algorithmic-state-machine chart (ASM chart) is an alternative visualization method for the state-action table
• It shows loops, conditions and next states in a way that is easier for a human to understand
• Each row in the state-action table translates to an ASM block
• ASM blocks are constructed out of three types of elements: state boxes, decision boxes and condition boxes
Algorithmic-state-machine chart
• State box: state name, state encoding, unconditional variable assignment
• Decision box: condition, with exits 1 and 0
• Condition box: conditional variable assignment
Algorithmic-state-machine chart
[Figure: example ASM block for state s0 with state box Done = 0, decision box Start = 0 (exits 0 and 1) and condition box Data = Inport]
Algorithmic-state-machine chart
• An ASM block has to obey the following rule: each input combination should lead to exactly one next state
• Example 1 of an invalid ASM block:
[Figure: ASM block of state s0 with decision boxes Cond1 and Cond2 and next states s1 and s2]
When Cond2=1 there are two next states
Algorithmic-state-machine chart
• Example 2 of an invalid ASM block:
[Figure: ASM block of state s0 with decision box Cond1 followed by decision box Cond2 and next states s1 and s2]
When Cond1=0 and Cond2=0 there is no next state
Algorithmic-state-machine chart
• An ASM chart representing a state-based (Moore type) FSMD has no condition boxes, since all outputs depend only on the state; all assignments to variables are done in state boxes
• An ASM chart representing an input-based (Mealy type) FSMD has state boxes as well as condition boxes; variable assignments that depend only on the state are done in state boxes; variable assignments that depend on input conditions are done in condition boxes
Algorithmic-state-machine chart
State based (Moore)
[Figure: ASM chart with 6 states]
  s0: decision Start=1; 0 stays in s0, 1 goes to s1
  s1: Data=Inport, OCount=0
  s2: decision DataLSB; 1 goes to s3, 0 goes to s4
  s3: Ocount=Ocount+1
  s4: Data=Data>>1; decision Data<>0; 1 goes back to s2, 0 goes to s5
  s5: Output=OCount
Algorithmic-state-machine chart
Input based (Mealy)
[Figure: ASM chart with 4 states]
  s0: decision Start=1; 0 stays in s0, 1 goes to s1
  s1: Data=Inport, OCount=0
  s2: decision DataLSB; when 1, condition box Ocount=Ocount+1; condition box Data=Data>>1; decision Data<>0; 1 goes back to s2, 0 goes to s3
  s3: Output=OCount
Only 4 states instead of the 6 for a state based approach
FSMD design
• FSMDs
• Models
• Synthesis techniques
  - Basic principles
  - Merging
    Register sharing (variable merging)
    Functional-unit sharing (operator merging)
    Bus sharing (connection merging)
    Register port sharing (register merging)
Basic synthesis principles
• An FSMD represented by a state-action table or an ASM chart could be implemented using the methodology we used before:
  - every variable corresponds to a register
  - every operation corresponds to a functional unit
  - every reading of a variable corresponds to a connection from a register to a functional unit
  - every writing of a variable corresponds to a connection from a functional unit to a register
  - every row of the state-action table or every ASM block of the ASM chart corresponds to a state of the controller
• This method, however, leads to expensive realisations
Basic synthesis principles
• Minimization requires two steps:
  - First, the controller can be minimized by
    minimizing the number of states via combining equivalent states
    choosing the best state encoding scheme
    selecting the appropriate flip-flop type
    minimizing the next state and output logic
  - Second, the data path should be minimized according to the principles already mentioned:
    Register sharing: when the lifetimes of two variables do not overlap, both can be stored in the same register
    Functional unit sharing: when two operations are never executed concurrently, they can be assigned to the same functional unit
    Connection sharing: when two connections are never used concurrently, they can be shared
    Register port sharing: when two registers are never read from (respectively written to) concurrently, they can be combined into a single register file
Basic synthesis principles
• We are going to show the data path minimizations using an approximation for a square root calculation (SRA: Square Root Approximation):
  $\sqrt{a^2 + b^2} \approx \max(0.875x + 0.5y,\ x)$
  with $x = \max(|a|, |b|)$ and $y = \min(|a|, |b|)$
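The approximation can be checked numerically with integer shifts standing in for the multiplications by 0.125 and 0.5. This sketch uses the intermediate names t1 to t7 that the following slides assign to the individual operations; the function name `sra` is chosen here.

```python
def sra(a, b):
    """Square Root Approximation of sqrt(a*a + b*b) for integers a, b."""
    t1 = abs(a)
    t2 = abs(b)
    x = max(t1, t2)
    y = min(t1, t2)
    t3 = x >> 3      # 0.125 * x
    t4 = y >> 1      # 0.5 * y
    t5 = x - t3      # 0.875 * x
    t6 = t4 + t5     # 0.875 * x + 0.5 * y
    t7 = max(t6, x)
    return t7
```

Because the shifts truncate, results for small operands are slightly coarser than the real-valued formula; for larger operands `sra(a, b)` can be compared against `math.hypot(a, b)` to see the approximation error.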
Basic synthesis principles
ASM chart for $\sqrt{a^2 + b^2} \approx \max(0.875x + 0.5y,\ x)$ with $x = \max(|a|, |b|)$ and $y = \min(|a|, |b|)$:
  S0: a=In1, b=In2; decision Start: 0 stays in S0, 1 goes to S1
  S1: t1=|a|, t2=|b|
  S2: x=max(t1,t2), y=min(t1,t2)
  S3: t3=x>>3 (t3=0.125x), t4=y>>1 (t4=0.5y)
  S4: t5=x-t3 (t5=0.875x)
  S5: t6=t4+t5
  S6: t7=max(t6,x)
  S7: Out=t7
Basic synthesis principles
Liveness of variables: a variable is alive in the first state following the active clock edge that assigns its new value, and in all states between this first state and the last state that uses it.
     S1  S2  S3  S4  S5  S6  S7
a    X
b    X
t1       X
t2       X
x            X   X   X   X
y            X
t3               X
t4               X   X
t5                   X
t6                       X
t7                           X
#    2   2   2   3   3   2   1
Basic synthesis principles
Operation usage:
      S1  S2  S3  S4  S5  S6  S7  #
abs   2                           2
min       1                       1
max       1               1       2
>>            2                   2
-                 1               1
+                     1           1
#     2   2   2   1   1   1
Basic synthesis principles
Connectivity table (I: variable is an input of the operator, O: variable is its output):
      a    b    t1   t2   x    y    t3   t4   t5   t6   t7
abs1  I         O
abs2       I         O
min             I    I         O
max             I    I    O/I                      I    O
>>3                       I         O
>>1                            I         O
-                         I         I         O
+                                        I    I    O
Basic synthesis principles
• The straightforward approach would allocate 20 connections (11 register outputs and 9 FU outputs)
• In state S2, the largest number of connections is needed: 4 inputs and 2 outputs
• We should hence try to merge multiple connections into one bus
• In a further section, the algorithm is presented to accomplish this: connection merging
Register sharing
• Definition of the lifetime of a variable: the set of states in which the variable is alive
  - starting at the state following the state in which it is assigned a new value (write state)
  - ending at every state in which its value is used (read state)
  - including all the states on each path between the write state and a read state
  - Note that a variable may be written more than once (multiple assignments) and that a single written value may be read multiple times
• After determining the lifetimes of the variables, we have to group variables with non-overlapping lifetimes and assign each group to a single register. We should hence find the smallest number of groups.
Register sharing
Left-edge algorithm
  1. Determine variable lifetimes
  2. Sort by write state & life length
  3. Allocate new register
  4. Remove all assigned variables from list
  5. Empty? no: go back to step 3; yes: done
Register sharing
Determine variable lifetimes
     S1  S2  S3  S4  S5  S6  S7
a    X
b    X
t1       X
t2       X
x            X   X   X   X
y            X
t3               X
t4               X   X
t5                   X
t6                       X
t7                           X
Register sharing
Sort variables by write state and lifetime
t4 has a longer lifetime than t3, so t4 moves in front of t3.
Sorted order: a, b, t1, t2, x, y, t4, t3, t5, t6, t7
Register sharing
Allocate new register and assign non-overlapping variables
  R1: a, t1, x, t7
  R2: b, t2, y, t4, t6
  R3: t3, t5
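The left-edge steps can be sketched in a few lines. The lifetimes below are the first and last live state of each variable in the SRA example (state S1 is 1, S7 is 7); the names `LIFETIMES` and `left_edge` are chosen for illustration.

```python
# (first live state, last live state) per variable, from the lifetime table.
LIFETIMES = {
    "a": (1, 1), "b": (1, 1), "t1": (2, 2), "t2": (2, 2),
    "x": (3, 6), "y": (3, 3), "t3": (4, 4), "t4": (4, 5),
    "t5": (5, 5), "t6": (6, 6), "t7": (7, 7),
}


def left_edge(lifetimes):
    """Assign variables with non-overlapping lifetimes to shared registers."""
    # Sort by write state, then by life length (longest first).
    pending = sorted(
        lifetimes.items(),
        key=lambda item: (item[1][0], -(item[1][1] - item[1][0])),
    )
    registers = []
    while pending:
        register, last_end, remaining = [], 0, []
        for name, (start, end) in pending:
            if start > last_end:          # does not overlap what is assigned
                register.append(name)
                last_end = end
            else:
                remaining.append((name, (start, end)))
        registers.append(register)        # allocate new register
        pending = remaining               # remove assigned variables
    return registers
```

Running `left_edge(LIFETIMES)` reproduces the three registers R1, R2 and R3 listed above.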
Register sharing
[Figure: inputs In1 and In2 and three MUXes feeding the registers R1: a,t1,x,t7, R2: b,t2,y,t4,t6 and R3: t3,t5; output Out]
Register sharing
• The left-edge algorithm finds an assignment with the smallest number of registers
• There exist, however, multiple possible variable-to-register assignments with the smallest number of registers
• We can hence use a second cost criterion to find the best assignment
  - First criterion: smallest number of registers
  - Second criterion: minimize the number of ports of the MUX and DEMUX circuits
    preferably map to the same register two variables that are the same (e.g. left) input of the same functional unit
    preferably map to the same register two variables that are the same output of the same FU
Register sharing
• Why does this register sharing reduce the cost of MUX and DEMUX?
[Figure: registers R1: t1 and R2: t2 feeding an FU through a MUX, compared with a single register R1: t1,t2 feeding the FU directly; the FU outputs go to R3: t3 and R4: t4]
Register sharing
• We should hence determine which variables are the same input of the same functional unit and which variables are the same output of the same FU
• However, at this stage of the design, before operator merging, each operator is implemented in a different FU, such that no variables share the same input or output
Register sharing
• Does this mean that we should do operator merging before register sharing?
  - Register sharing: (1) minimize registers and (2) minimize the size of MUX/DEMUX
    The latter is only known after operator merging
  - Operator merging: merge operators where the combined cost of MUX/DEMUX/CombinedFU is smaller than the cost of two FUs
    The cost of the MUX/DEMUX is only known after register merging
  - This deadlock situation is typical for all optimization steps in hardware synthesis (and software compilation)! Solution:
    First optimize those things that give the largest cost improvement; use quick-and-dirty estimates for the next optimization steps
    Next optimize the things with less cost influence
    Iterate till satisfied with the outcome
Register sharing
• What has the biggest cost influence: register sharing or operator merging?
  - In most cases, register sharing has a higher cost impact:
    there are more variables than FUs
    merging two registers into one does not increase the cost of the register; merging two different FUs into one makes this single FU more expensive than each of the original FUs separately
    it is easier to quickly estimate which operators will be merged than to see which variables will be merged
  - We hence mostly do register sharing first
  - For some applications (e.g. when they contain only one type of FU) and some target platforms (e.g. where the cost of a register is negligible compared to the cost of an FU), we do operator merging first
    In an FPGA, a register at the FU output is free!
Register sharing
• We choose to do register sharing first
• We hence have to estimate operator merging
      S1  S2  S3  S4  S5  S6  S7  #
abs   2                           2
min       1                       1
max       1               1       2
>>            2                   2
-                 1               1
+                     1           1
#     2   2   2   1   1   1
Register sharing
• Method for register sharing, combined with MUX/DEMUX cost reduction:
  - Build a compatibility graph
  - Perform a max-cut graph partitioning
Register sharing
• Build a compatibility graph
  - Nodes are variables
    Hint: sort the nodes graphically according to the left-edge merging, since this will already separate incompatible variables with overlapping lifetimes
  - Incompatibility edges are drawn between two variables with overlapping lifetimes: they cannot be merged
  - Priority edges are drawn between two variables that are the same input of the same FU or the same output of the same FU. A weight on this edge indicates how many times the two variables drive the same input of the same FU plus how many times they are the same output of the same FU.
Register sharing
[Figure: compatibility graph with the nodes laid out per left-edge register: a, t1, x, t7 / b, t2, y, t4, t6 / t3, t5]
Incompatibility edges: variables with overlapping lifetimes
Register sharing
[Figure: the compatibility graph extended with priority edges of weight 1, derived from the connectivity table]
Priority edges: variables with the same input to an FU or the same output from an FU
x and t4 however have overlapping lifetimes: no priority edge
Register sharing
• Perform a max-cut graph partitioning
  - Divide the graph into the minimum number of clusters of compatible nodes, such that the total weight is maximized
  - Total weight is computed by summing all weights of priority edges within a cluster (a priority edge crossing cluster boundaries is not counted)
• We are going to do this optimization visually
• See the course on optimization techniques for the max-cut graph partitioning optimization algorithm
Register sharing
[Figure: max-cut partitioning of the compatibility graph, shown step by step; intermediate result Cut=2, final result Cut=5]
The three other variables do not have priority edges and can be assigned to any register as long as they are compatible with all other variables assigned to the same register
Result of max-cut algorithm:
  R1: a, t1, x, t7
  R2: b, t2, t3, t5, t6
  R3: y, t4
Result of left-edge algorithm:
  R1: a, t1, x, t7
  R2: b, t2, y, t4, t6
  R3: t3, t5
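Both assignments must respect the incompatibility edges, i.e. no two variables in one register may be alive in the same state. The following sketch checks that property; the lifetimes are the ones from the SRA lifetime table, and the names `LIFETIMES`, `is_valid_assignment`, `MAX_CUT` and `LEFT_EDGE` are illustrative.

```python
# (first live state, last live state) per variable, states S1..S7 as 1..7.
LIFETIMES = {
    "a": (1, 1), "b": (1, 1), "t1": (2, 2), "t2": (2, 2),
    "x": (3, 6), "y": (3, 3), "t3": (4, 4), "t4": (4, 5),
    "t5": (5, 5), "t6": (6, 6), "t7": (7, 7),
}

MAX_CUT = [["a", "t1", "x", "t7"], ["b", "t2", "t3", "t5", "t6"], ["y", "t4"]]
LEFT_EDGE = [["a", "t1", "x", "t7"], ["b", "t2", "y", "t4", "t6"], ["t3", "t5"]]


def is_valid_assignment(registers, lifetimes=LIFETIMES):
    """True when no register holds two variables with overlapping lifetimes."""
    for variables in registers:
        occupied = set()
        for name in variables:
            start, end = lifetimes[name]
            states = set(range(start, end + 1))
            if occupied & states:
                return False
            occupied |= states
    return True
```

Both results pass the check, while a register holding x and t4 together would be rejected, which matches the remark that x and t4 cannot be joined by a priority edge.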
Register sharing
[Figure: inputs In1 and In2 and three MUXes feeding the registers R1: a,t1,x,t7, R2: b,t2,t3,t5,t6 and R3: y,t4; output Out]
Register sharing
• Register cost computation
  Cost of 1-bit register with CE and asynchronous preset or clear
    1/2 CLB
    7 gates
    34 TOR
  Cost of 1-bit 2-to-1 MUX
    1/2 CLB
    3 gates
    14 TOR
  Cost of 1-bit 4-to-1 MUX
    1 CLB
    5 gates
    36 TOR
  In FPGA, register and MUX share CLB
  5-to-1 MUX is 4-to-1 MUX followed by 2-to-1 MUX
Register sharing
• Register cost computation for original FSMD implementation (32-bit data path):
  11 registers of 32 bits
    11 reg * 32 bit/reg * 1/2 CLB/bit = 176 CLB
    11 reg * 32 bit/reg * 7 gates/bit = 2464 gates
    11 reg * 32 bit/reg * 34 TOR/bit = 11968 TOR
Register sharing
• Register cost computation for current FSMD implementation:
  1 register of 32 bits with 4-to-1 MUX
    1 CLB/MUXREGbit * 32 bit = 32 CLB
    (5 gates/MUXbit + 7 gates/REGbit) * 32 bit = 384 gates
    (36 TOR/MUXbit + 34 TOR/REGbit) * 32 bit = 2240 TOR
  1 register of 32 bits with 5-to-1 MUX
    (1 CLB/4MUXbit + 1/2 CLB/2MUXREGbit) * 32 bit = 48 CLB
    (5 gates/4MUXbit + 3 gates/2MUXbit + 7 gates/REGbit) * 32 bit = 480 gates
    (36 TOR/4MUXbit + 14 TOR/2MUXbit + 34 TOR/REGbit) * 32 bit = 2688 TOR
Register sharing
• Register cost computation for current FSMD implementation:
  1 register of 32 bits with 2-to-1 MUX
    1/2 CLB/MUXREGbit * 32 bit = 16 CLB
    (3 gates/MUXbit + 7 gates/REGbit) * 32 bit = 320 gates
    (14 TOR/MUXbit + 34 TOR/REGbit) * 32 bit = 1536 TOR
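The register cost arithmetic of the last slides can be replayed with a short Python sketch. The per-bit costs are those quoted on the cost slide; the function names and the rule that a register bit is absorbed in the CLB of its MUX are this sketch's reading of "register and MUX share CLB", not something the slides spell out as a formula.

```python
WIDTH = 32  # bits in the data path

# Per-bit cost as (CLB, gates, TOR), as quoted on the cost slide.
REG = (0.5, 7, 34)
MUX2 = (0.5, 3, 14)
MUX4 = (1.0, 5, 36)

def register_cost(muxes=()):
    """Cost of one WIDTH-bit register preceded by the given input MUX stages."""
    # Assumption: in an FPGA the register bit sits in the same CLB as its MUX,
    # so only a register without any MUX pays its own 1/2 CLB.
    clb = sum(m[0] for m in muxes) if muxes else REG[0]
    gates = REG[1] + sum(m[1] for m in muxes)
    tor = REG[2] + sum(m[2] for m in muxes)
    return (clb * WIDTH, gates * WIDTH, tor * WIDTH)

def total(costs):
    """Add up a list of (CLB, gates, TOR) tuples column by column."""
    return tuple(sum(column) for column in zip(*costs))

# Original implementation: 11 plain registers.
original = total([register_cost()] * 11)

# After register sharing: 4-to-1, 5-to-1 (4-to-1 followed by 2-to-1) and 2-to-1 MUX.
shared = total([
    register_cost([MUX4]),
    register_cost([MUX4, MUX2]),
    register_cost([MUX2]),
])
```

`original` and `shared` correspond to the "Original" and "Reg share" register columns of the summary table.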
Register sharing
            CLB              gates              TOR                 Conn
            Reg  FU   Tot    Reg   FU    Tot    Reg    FU    Tot
Original    176              2464               11968               20
Reg share   96               1184               6464                12
FU share
Bus share
Port share
• FSMDs
• Models
• Synthesis techniques
  Basic principles
  Merging
    Register sharing (variable merging)
    Functional-unit sharing (operator merging)
    Bus sharing (connection merging)
    Register port sharing (register merging)
Functional-unit sharing
• Basic principle:
  Replace two FUs that are not used at the same time by a single FU with combined functionality, with a MUX at each input and a DEMUX at each output
  Do this only when MUX/CombinedFU/DEMUX is cheaper than two FUs
[Figure: two FUs with inputs a, b and c, d and outputs x and y, replaced by one combined FU with input pairs a/c and b/d and a DEMUX driving x and y]
Functional-unit sharing
• When register sharing guessed correctly which FUs would be shared, the cost of the extra MUX and DEMUX will be small, since input and output variables of both FUs will often be assigned to the same register
• Which units can be shared:
  identical units (cf. 2 MAX units)
  different units (cf. ADD and SUBTRACT)
Functional-unit sharing
• Build a compatibility graph
  Nodes are operators
  Incompatibility edges are drawn between two operators that are used in the same state: they cannot be merged
  Priority edges are drawn between two (or a group of n) operators that can be merged into the same FU. A weight on this edge indicates how large the cost saving is by merging the two (or n) operators.
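The incompatibility rule can be sketched as follows. The schedule is the operator-per-state assignment as read from the operator/state table and the state listing in these slides; the labels `abs1`, `abs2`, `max1` and `max2` for the duplicated operators and the helper names are assumptions of this sketch.

```python
from itertools import combinations

# Operators used in each state (labels for duplicates are hypothetical).
SCHEDULE = {
    "S1": ["abs1", "abs2"],
    "S2": ["max1", "min"],
    "S3": [">>3", ">>1"],
    "S4": ["-"],
    "S5": ["+"],
    "S6": ["max2"],
}

def incompatibility_edges(schedule):
    """Two operators used in the same state cannot be merged."""
    edges = set()
    for operators in schedule.values():
        for a, b in combinations(operators, 2):
            edges.add(frozenset((a, b)))
    return edges

EDGES = incompatibility_edges(SCHEDULE)

def can_merge(operators):
    """True when no two operators of the group are needed in the same state."""
    return all(frozenset(pair) not in EDGES for pair in combinations(operators, 2))
```

For example, `can_merge(["abs1", "max1", "max2", "+", "-"])` holds, while the two ABS operators of state S1 cannot share one FU.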
Functional-unit sharing
[Figure: compatibility graph whose nodes are the operators ABS, MIN, SUB, >>3 and ABS, MAX, MAX, ADD, >>1]
Nodes are operators
Functional-unit sharing
[Figure: the same graph with incompatibility edges]
Incompatibility edge: two operators needed in same state
       S1  S2  S3  S4  S5  S6  S7   #
abs     2                           2
min         1                       1
max         1               1       2
>>              2                   2
-                   1               1
+                       1           1
#       2   2   2   1   1   1
Functional-unit sharing
[Figure: the same graph with a first priority edge whose weight is still unknown (?)]
Priority edge: weight indicates saving by sharing
Functional-unit sharing
• Cost model for the MAX
[Figure: MAX unit: a subtractor on a and b whose sign output steers a MUX that delivers max(a,b); bit slice with inputs ai, bi, ci and carry output ci+1]
Cost per bit:
- 1 CLB
- 8 gates
- 34 TOR
Functional-unit sharing
• Cost model for one FU (MAX&MAX)
Two separate FUs, R1=MAX(R1,R2) & R1=MAX(R1,R2)
  Cost: 2 CLB, 16 gates, 68 TOR
One shared FU, R1=MAX(R1,R2)
  Cost: 1 CLB, 8 gates, 34 TOR
  Savings: 1 CLB, 8 gates, 34 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edge MAX&MAX with weight 1/8/34 (CLB/gates/TOR saved); next edge still unknown (?)]
Functional-unit sharing
• Cost model for the ABS
[Figure: ABS unit: a negator on a and a MUX steered by the sign bit an-1 that delivers |a|; the negator is a chain of half adders (HA) with carry-in 1, each followed by a MUX]
Cost per bit:
- 1/2 CLB (using carry chain)
- 6 gates
- 34 TOR
Figure annotation: 2 gates (AND & XOR), 18 TOR (6 + 12)
Functional-unit sharing
• Structure of an ABS&MAX unit
[Figure: full adder (FA) with carry-in 1 producing S from R1 and R2, control input MAX/ABS', and an output MUX with select inputs M1 M0 choosing F from R1 (00), S (01) or R2 (1x)]
MAX/ABS'  R2n-1  Sn-1   F    M1M0
   0        0     0     R2   1x
   0        0     1     R2   1x
   0        1     0     S    01
   0        1     1     S    01
   1        0     0     R1   00
   1        0     1     R2   1x
   1        1     0     R1   00
   1        1     1     R2   1x
R2 appears most in the table: giving it the select code with the most don't cares is best
Cost per bit:
• 1/2 CLB (FA&INV) + 1/2 CLB (AND) + 1 CLB (MUX) = 2 CLB
• 5 gates (FA) + 1 (AND) + 1 (INV) + 4 (MUX) = 11 gates
• 36 TOR (FA) + 6 (AND) + 2 (INV) + 22 (MUX) = 66 TOR
Functional-unit sharing
• Cost model for one FU (ABS&MAX&MAX)
Three separate FUs, R2=ABS(R2) & R1=MAX(R1,R2) & R1=MAX(R1,R2)
  Cost: 2.5 CLB, 22 gates, 102 TOR
One shared FU, R2=ABS(R2) or R1=MAX(R1,R2)
  Cost: 2 CLB, 11 gates, 66 TOR
  Savings: 0.5 CLB, 11 gates, 36 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36; next edge still unknown (?)]
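The select table of the ABS&MAX unit can be checked with a small behavioural model. This is a sketch under assumptions: it works on unbounded Python integers (so overflow of the subtractor is ignored), it assumes that in ABS mode the adder computes 0 - R2, and the names `SELECT` and `abs_max_unit` are invented here.

```python
# (MAX/ABS', sign of R2, sign of S) -> which value the output MUX passes to F.
SELECT = {
    (0, 0, 0): "R2", (0, 0, 1): "R2",
    (0, 1, 0): "S",  (0, 1, 1): "S",
    (1, 0, 0): "R1", (1, 0, 1): "R2",
    (1, 1, 0): "R1", (1, 1, 1): "R2",
}

def abs_max_unit(r1, r2, max_mode):
    """Behavioural model: MAX(R1, R2) when max_mode is true, else ABS(R2)."""
    # Assumption: in ABS mode the R1 input of the adder is forced to 0,
    # so that S = -R2; in MAX mode S = R1 - R2.
    s = (r1 if max_mode else 0) - r2
    key = (int(max_mode), int(r2 < 0), int(s < 0))
    return {"R1": r1, "R2": r2, "S": s}[SELECT[key]]
```

Comparing the model against Python's own `max` and `abs` over a range of small integers is a quick way to confirm that the table rows are consistent.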
Functional-unit sharing
• Cost model for the MIN
[Figure: MIN unit: a subtractor on a and b whose sign output steers a MUX that delivers min(a,b); bit slice with inputs ai, bi, ci and carry output ci+1]
Cost per bit:
- 1 CLB
- 8 gates
- 34 TOR
Functional-unit sharing
• Structure of an ABS&MIN unit
[Figure: full adder (FA) with carry-in 1, input MUX and control input MIN/ABS']
MIN/ABS'  R1n-1  Sn-1   F    M1M0
   0        0     0     R1   1x
   0        0     1     R1   1x
   0        1     0     S    01
   0        1     1     S    01
   1        0     0     R2   00
   1        0     1     R1   1x
   1        1     0     R2   00
   1        1     1     R1   1x
Functional-unit sharing
• Cost model for one FU (ABS&MIN)
Two separate FUs, R1=ABS(R1) & R3=MIN(R1,R2)
  Cost: 1.5 CLB, 14 gates, 68 TOR
One shared FU, R1=ABS(R1) or R3=MIN(R1,R2)
  Cost: 2.5 CLB, 13 gates, 80 TOR
  Savings: -1 CLB, 1 gate, -12 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36; next edge still unknown (?)]
Functional-unit sharing
• Cost model for the ADD
[Figure: full-adder bit slice with inputs xi, yi, ci and outputs si, ci+1]
Cost per bit:
- 1/2 CLB
- 5 gates
- 36 TOR
Functional-unit sharing
• Cost model for the SUB
[Figure: 4-bit subtractor: a chain of full adders (FA) with inputs a3..a0 and b3..b0, carry-in 1, carries c1..c4 and outputs f3..f0]
Cost per bit:
- 1/2 CLB
- 6 gates
- 38 TOR
Functional-unit sharing
• Structure of an ADD&SUB unit
[Figure: adder/subtractor (FA) producing S, with control input A/S' and an input MUX selecting R1 or R3 next to R2]
It is not clear whether MUX fits in same CLB
Cost per bit:
• 1/2 CLB (FAS&MUX)
• 6 gates (FAS) + 3 (MUX) = 9 gates
• 48 TOR (FAS) + 14 (MUX) = 62 TOR
Functional-unit sharing
• Cost model for one FU (ADD&SUB)
Two separate FUs, R2=ADD(R3,R2) & R2=SUB(R1,R2)
  Cost: 1 CLB, 11 gates, 74 TOR
One shared FU, R2=ADD(R3,R2) or R2=SUB(R1,R2)
  Cost: 1/2 CLB, 9 gates, 62 TOR
  Savings: 0.5 CLB, 2 gates, 12 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36; next edge still unknown (?)]
Functional-unit sharing
• Structure of an ADD&MAX unit
[Figure: adder/subtractor (FA) with carry-in 1 producing S, control input A/M' and an input MUX selecting R1 or R3 next to R2; output MUX with select inputs M1 M0 choosing F from R1 (00), S (1x) or R2 (01)]
ADD/MAX'  Sn-1   F    M1M0
   0       0     R1   00
   0       1     R2   01
   1       0     S    1x
   1       1     S    1x
M1 = ADD/MAX'
M0 = Sn-1
It is not clear whether MUX fits in same CLB
Cost per bit:
• 1/2 CLB (FAS&MUX) + 1 CLB (MUX) = 1.5 CLB
• 6 gates (FAS) + 3 (MUX) + 4 (MUX) = 13 gates
• 48 TOR (FAS) + 12 (MUX) + 22 (MUX) = 82 TOR
Functional-unit sharing
• Cost model for one FU (MAX&MAX&ADD)
Three separate FUs, R1=MAX(R1,R2) & R1=MAX(R1,R2) & R2=ADD(R3,R2)
  Cost: 2.5 CLB, 21 gates, 104 TOR
One shared FU, R1=MAX(R1,R2) or R2=ADD(R3,R2)
  Cost: 1.5 CLB, 13 gates, 82 TOR
  Savings: 1 CLB, 8 gates, 22 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36, MAX&MAX&ADD 1/8/22; next edge still unknown (?)]
Functional-unit sharing
• Structure of an ABS&MAX&ADD unit
[Figure: adder/subtractor (FA) producing S, an input MUX selecting R1, R3 or the constant 0 next to R2, control inputs A/M', Else/ABS' and ADD/MAX'; output MUX with select inputs M1 M0 choosing F from R1 (00), S (1x) or R2 (01)]
Cost per bit:
• 1/2 CLB (FAS) + 1/2 CLB (MUX) + 1 CLB (MUX) = 2 CLB
• 6 gates (FAS) + 3 (MUX) + 4 (MUX) = 13 gates
• 48 TOR (FAS) + 16 (MUX) + 22 (MUX) = 86 TOR
Functional-unit sharing
• Cost model FU(ABS&MAX&MAX&ADD)
Four separate FUs, R2=ABS(R2) & R1=MAX(R1,R2) & R1=MAX(R1,R2) & R2=ADD(R3,R2)
  Cost: 3 CLB, 27 gates, 138 TOR
One shared FU, R2=ABS(R2) or R1=MAX(R1,R2) or R2=ADD(R3,R2)
  Cost: 2 CLB, 13 gates, 86 TOR
  Savings: 1 CLB, 14 gates, 52 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36, MAX&MAX&ADD 1/8/22, ABS&MAX&MAX&ADD 1/14/52; next edge still unknown (?)]
Functional-unit sharing
• Structure of an ABS&MAX&ADD&SUB unit
[Figure: adder/subtractor (FA) producing S, an input MUX selecting R1, R3 or the constant 0 next to R2; output MUX with select inputs M1 M0 choosing F from R1 (00), S (1x) or R2 (01)]
Cost per bit:
• 1/2 CLB (FAS) + 1/2 CLB (MUX) + 1 CLB (MUX) = 2 CLB
• 6 gates (FAS) + 3 (MUX) + 4 (MUX) = 13 gates
• 48 TOR (FAS) + 16 (MUX) + 22 (MUX) = 86 TOR
Functional-unit sharing
• FU(ABS&MAX&MAX&ADD&SUB)
Five separate FUs, R2=ABS(R2) & R1=MAX(R1,R2) & R1=MAX(R1,R2) & R2=ADD(R3,R2) & R2=SUB(R1,R2)
  Cost: 3.5 CLB, 33 gates, 176 TOR
One shared FU, R2=ABS(R2) or R1=MAX(R1,R2) or R2=ADD(R3,R2) or R2=SUB(R1,R2)
  Cost: 2 CLB, 13 gates, 86 TOR
  Savings: 1.5 CLB, 20 gates, 90 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36, MAX&MAX&ADD 1/8/22, ABS&MAX&MAX&ADD 1/14/52, ABS&MAX&MAX&ADD&SUB 1.5/20/90; next edge still unknown (?)]
Functional-unit sharing
• Structure of a MIN&SUB unit
[Figure: full adder (FA) with carry-in 1 producing S from R1 and R2; output MUX with select inputs M1 M0 choosing F from R1 (00), S (01) or R2 (1x)]
Cost per bit:
• 1/2 CLB (FA&INV) + 1 CLB (MUX) = 1.5 CLB
• 5 gates (FA) + 1 (INV) + 4 (MUX) = 10 gates
• 36 TOR (FA) + 2 (INV) + 22 (MUX) = 60 TOR
Functional-unit sharing
• FU(MIN&SUB)
Two separate FUs, R3=MIN(R1,R2) & R2=SUB(R1,R2)
  Cost: 1.5 CLB, 14 gates, 72 TOR
One shared FU, R3=MIN(R1,R2) or R2=SUB(R1,R2)
  Cost: 1.5 CLB, 10 gates, 60 TOR
  Savings: 0 CLB, 4 gates, 12 TOR
Functional-unit sharing
[Figure: compatibility graph; priority edges so far: ABS&MIN -1/1/-12, MIN&SUB 0/4/12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36, MAX&MAX&ADD 1/8/22, ABS&MAX&MAX&ADD 1/14/52, ABS&MAX&MAX&ADD&SUB 1.5/20/90; next edge still unknown (?)]
Functional-unit sharing
• Structure of an ABS&MIN&SUB unit
[Figure: full adder (FA) with carry-in 1 producing S, an input MUX in front of it; output MUX with select inputs M1 M0 choosing F from R1 (00), S (01) or R2 (1x)]
Cost per bit:
• 1/2 CLB (FA) + 1/2 (AND) + 1/2 (MUX&INV) + 1 (MUX) = 2.5 CLB
• 5 gates (FA) + 1 (AND) + 3 (MUX&INV) + 4 (MUX) = 13 gates
• 36 TOR (FA) + 6 (AND) + 16 (MUX&INV) + 22 (MUX) = 80 TOR
Functional-unit sharing
• Cost model for one FU (ABS&MIN&SUB)
Three separate FUs, R1=ABS(R1) & R3=MIN(R1,R2) & R2=SUB(R1,R2)
  Cost: 2 CLB, 20 gates, 106 TOR
One shared FU, R1=ABS(R1) or R3=MIN(R1,R2) or R2=SUB(R1,R2)
  Cost: 2.5 CLB, 13 gates, 80 TOR
  Savings: -0.5 CLB, 7 gates, 26 TOR
Functional-unit sharing
[Figure: compatibility graph with all priority edges: ABS&MIN&SUB -0.5/7/26, ABS&MIN -1/1/-12, MIN&SUB 0/4/12, ADD&SUB 0.5/2/12, MAX&MAX 1/8/34, ABS&MAX&MAX 0.5/11/36, MAX&MAX&ADD 1/8/22, ABS&MAX&MAX&ADD 1/14/52, ABS&MAX&MAX&ADD&SUB 1.5/20/90]
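Every priority-edge weight above is "cost of the separate FUs minus cost of the shared FU". The sketch below replays that subtraction for all merged units, using the per-bit costs quoted on the cost-model slides; the dictionary layout and the function name `savings` are choices of this sketch.

```python
# Per-bit cost (CLB, gates, TOR) of the stand-alone operators, from the slides.
OP_COST = {
    "ABS": (0.5, 6, 34),
    "MIN": (1, 8, 34),
    "MAX": (1, 8, 34),
    "ADD": (0.5, 5, 36),
    "SUB": (0.5, 6, 38),
}

# Per-bit cost of each merged FU, from the "Structure of ..." slides.
SHARED_COST = {
    ("MAX", "MAX"): (1, 8, 34),
    ("ABS", "MAX", "MAX"): (2, 11, 66),
    ("ABS", "MIN"): (2.5, 13, 80),
    ("ADD", "SUB"): (0.5, 9, 62),
    ("MAX", "MAX", "ADD"): (1.5, 13, 82),
    ("ABS", "MAX", "MAX", "ADD"): (2, 13, 86),
    ("ABS", "MAX", "MAX", "ADD", "SUB"): (2, 13, 86),
    ("MIN", "SUB"): (1.5, 10, 60),
    ("ABS", "MIN", "SUB"): (2.5, 13, 80),
}

def savings(operators):
    """Priority-edge weight: separate cost minus shared cost, per column."""
    separate = [sum(OP_COST[op][i] for op in operators) for i in range(3)]
    return tuple(s - c for s, c in zip(separate, SHARED_COST[operators]))

WEIGHTS = {ops: savings(ops) for ops in SHARED_COST}
```

A negative entry, as in the CLB column for ABS&MIN, means that merging makes that cost measure worse.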
Functional-unit sharing
• Cost models for the FUs: SHIFT
[Figure: the >>1 and >>3 shifters]
Since the SHIFTs do not cost anything, cost can only increase by combining them with other operators
Functional-unit sharing
[Figure: compatibility graph with all priority edges; weights are the savings in CLB/gates/TOR]
Priority edge          CLB    gates   TOR
ABS&MIN&SUB            -0.5   7       26
ABS&MIN                -1     1       -12
MIN&SUB                0      4       12
ABS&MAX&MAX&ADD&SUB    1.5    20      90
ADD&SUB                0.5    2       12
MAX&MAX                1      8       34
ABS&MAX&MAX            0.5    11      36
MAX&MAX&ADD            1      8       22
ABS&MAX&MAX&ADD        1      14      52
Functional-unit sharing
Cost minimization for FPGA
[Figure: the same graph with only the CLB savings as edge weights]
Functional-unit sharing
Cost minimization for gate arrays
[Figure: the same graph with only the gate savings as edge weights, followed by the same graph with only the TOR savings]
Functional-unit sharing
We select solution 1 for FPGA
[Figure: the graph with CLB savings and the selected clusters]
[Figure: datapath after functional-unit sharing: input MUXes in front of the registers R1 (a, t1, x, t7), R2 (b, t2, t3, t5, t6) and R3 (y, t4), the functional units, and an output MUX driving Out]
Functional-unit sharing
• Note that functional-unit sharing reduced the number of ports of the register MUXes; we already guided register sharing with this in mind
• We should hence recalculate the register cost
  Cost of 1-bit 3-to-1 MUX
    1 CLB
    4 gates
    28 TOR
  Cost of 1-bit 2-to-1 MUX
    1/2 CLB
    3 gates
    14 TOR
  Cost of 1-bit register
    1/2 CLB
    7 gates
    34 TOR
Functional-unit sharing
• Register cost computation for current FSMD implementation:
  2 registers of 32 bits with 3-to-1 MUX; each register costs:
    1 CLB/MUXREGbit * 32 bit = 32 CLB
    (4 gates/MUXbit + 7 gates/REGbit) * 32 bit = 352 gates
    (28 TOR/MUXbit + 34 TOR/REGbit) * 32 bit = 1984 TOR
  1 register of 32 bits with 2-to-1 MUX
    0.5 CLB/MUXREGbit * 32 bit = 16 CLB
    (3 gates/MUXbit + 7 gates/REGbit) * 32 bit = 320 gates
    (14 TOR/MUXbit + 34 TOR/REGbit) * 32 bit = 1536 TOR
Functional-unit sharing
            CLB              gates              TOR                 Conn
            Reg  FU   Tot    Reg   FU    Tot    Reg    FU    Tot
Original    176  160  336    2464  1408  3872   11968  7616  19584  20
Reg share   96   160  256    1184  1408  2592   6464   7616  14080  12
FU share    80   112  192    1024  832   1856   5504   4864  10368  8
Bus share
Port share
• FSMDs
• Models
• Synthesis techniques
  Basic principles
  Merging
    Register sharing (variable merging)
    Functional-unit sharing (operator merging)
    Bus sharing (connection merging)
    Register port sharing (register merging)
Bus sharing
• Basic principle:
  Replace two connections that are not used at the same time by a single connection
  This reduces wiring, which has become the predominant cost in today's circuits
    at the cost of requiring tri-state drivers each time two different sources drive the same bus
    but also saving MUXes each time two different connections driving the same destination are replaced by a single bus
[Figure: R1 and R2 feeding FU1 through a MUX, next to R1 and R2 driving FU1 over one shared bus]
Bus sharing
• Since wiring cost is so high for buses, we search for the absolute minimum number of buses, without looking at the increased cost for drivers
• When several solutions lead to the same number of buses, we choose the combination that has the minimum number of tri-state drivers at the sources and MUXes at the destinations
Bus sharing
• Build a compatibility graph for the connections from registers to functional units and a second compatibility graph for the connections from functional units to registers
  Nodes are connections
  Incompatibility edges are drawn between two connections that are used in the same state and have different sources
  Priority edges are drawn between two connections that have the same source (saves on tri-state drivers) or the same destination (saves on input MUXes)
Bus sharing
[Figure: datapath with registers R1 (a, t1, x, t7), R2 (b, t2, t3, t5, t6) and R3 (y, t4); the connections from the registers to the FUs and to Out are labelled A to I]
Name all input connections for the FUs
Bus sharing
• Build the compatibility graph: nodes are connections
[Figure: graph with the nodes A to I]
Bus sharing
• In which state is each connection used? From which source and to which destination do they go?
              S0  S1  S2  S3  S4  S5  S6  S7
A R1→Out                                  X
B R1→FU1          X
C R1→FU2_1            X
D R2→FU2_2            X
E R1→FU3_1            X       X       X
F R3→FU3_1                        X
G R2→FU3_2        X   X       X   X   X
H R1→FU4                  X
I R3→FU5                  X
Bus sharing
[Figure: state-action diagram of the FSMD; the first state waits for Start]
Register assignment: R1: a, t1, x, t7; R2: b, t2, t3, t5, t6; R3: y, t4
Functional units: FU1: ABS; FU2: MIN; FU3: ABS, MAX, MAX, ADD, SUB; FU4: >>3; FU5: >>1
State  Register transfers              Original operations
S0     R1=In1, R2=In2                  a=In1, b=In2
S1     R1=F1(R1), R2=F3(R2)            t1=|a|, t2=|b|
S2     R1=F3(R1,R2), R3=F2(R1,R2)      x=max(t1,t2), y=min(t1,t2)
S3     R2=F4(R1), R3=F5(R3)            t3=x>>3, t4=y>>1
S4     R2=F3(R1,R2)                    t5=x-t3
S5     R2=F3(R3,R2)                    t6=t4+t5
S6     R1=F3(R1,R2)                    t7=max(t6,x)
S7     Out=R1                          Out=t7
Bus sharing
Incompatible connections are those that are used in the same state and come from a different register
Bus sharing
Incompatibility edges: B-G, C-D, C-G, D-E, E-G, H-I, F-G
[Figure: graph on the nodes A to I with these incompatibility edges]
Bus sharing
Priority edges: same source or same destination
A R1→Out
B R1→FU1
C R1→FU2_1
D R2→FU2_2
E R1→FU3_1
F R3→FU3_1
G R2→FU3_2
H R1→FU4
I R3→FU5
[Figure: graph on the nodes A to I with incompatibility and priority edges]
Bus sharing
Bus 1: A, B, C, E, F, H
Bus 2: D, G, I
[Figure: graph on the nodes A to I partitioned into the two buses]
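The incompatibility rule for register-to-FU connections, and the two-bus result, can be checked with a short sketch. The source of each connection comes from the connection list; the state sets are reconstructed here from the register-transfer listing (an assumption, since the X marks did not survive on the state table), and the helper names are invented.

```python
from itertools import combinations

# Connection -> (source register, states in which it is used).
CONNECTIONS = {
    "A": ("R1", {7}),
    "B": ("R1", {1}),
    "C": ("R1", {2}),
    "D": ("R2", {2}),
    "E": ("R1", {2, 4, 6}),
    "F": ("R3", {5}),
    "G": ("R2", {1, 2, 4, 5, 6}),
    "H": ("R1", {3}),
    "I": ("R3", {3}),
}

def incompatible(a, b):
    """Used in the same state and coming from different sources."""
    (src_a, states_a), (src_b, states_b) = CONNECTIONS[a], CONNECTIONS[b]
    return src_a != src_b and bool(states_a & states_b)

EDGES = {frozenset(p) for p in combinations(CONNECTIONS, 2) if incompatible(*p)}

def is_valid_bus(members):
    """A bus may only carry connections that are pairwise compatible."""
    return all(frozenset(p) not in EDGES for p in combinations(members, 2))
```

Under these assumptions `EDGES` reproduces the seven incompatibility edges listed on the slide, and both `is_valid_bus("ABCEFH")` and `is_valid_bus("DGI")` hold.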
Bus sharing
[Figure: datapath with registers R1 (a, t1, x, t7), R2 (b, t2, t3, t5, t6) and R3 (y, t4); the connections into the register MUXes are labelled A to H]
Name all input connections for the registers
Bus sharing
• Build the compatibility graph: nodes are connections
[Figure: graph with the nodes A to H]
Bus sharing
• In which state is each connection used? From which source and to which destination do they go?
            S0  S1  S2  S3  S4  S5  S6  S7
A In1→R1    X
B FU1→R1        X
C FU3→R1            X               X
D In2→R2    X
E FU3→R2        X           X   X
F FU4→R2                X
G FU2→R3            X
H FU5→R3                X
Bus sharing
Incompatible connections are those that are used in the same state and come from a different functional unit
Incompatibility edges: A-D, B-E, C-G, F-H
Bus sharing
• Incompatibility edges: A-D, B-E, C-G, F-H
[Figure: graph on the nodes A to H with these incompatibility edges]
Bus sharing
• Priority edges:
A In1→R1
B FU1→R1
C FU3→R1
D In2→R2
E FU3→R2
F FU4→R2
G FU2→R3
H FU5→R3
[Figure: graph on the nodes A to H with incompatibility and priority edges]
Bus sharing
Bus 1: A, B, C, H
Bus 2: D, E, F, G
[Figure: graph on the nodes A to H partitioned into the two buses]
Bus sharing
[Figure: datapath after bus sharing: inputs In1 and In2, a MUX in front of each of the registers R1 (a, t1, x, t7), R2 (b, t2, t3, t5, t6) and R3 (y, t4), output Out]
Bus sharing
• Cost calculation
  Register cost
    Before bus sharing: 2 3-to-1 MUXes and 1 2-to-1 MUX
    After bus sharing: 3 2-to-1 MUXes and 4 tri-state drivers
Bus sharing
• Cost of a tri-state driver
  FPGA
    each CLB has a tri-state driver to a horizontal long line
    cost is hence included in the CLB
    long lines are scarce: highest priority is reducing the number of connections
Bus sharing
• Cost of a tri-state driver
  Gate array & CMOS
[Figure: CMOS tri-state driver between Vcc and Vss with data input I, enable E and output F]
    F is driven high when E=1 and I=1
    F is driven low when E=1 and I=0
    4 gates, 12 TOR
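The behaviour of the tri-state driver, and why two sources on one bus must never be enabled together, can be sketched as follows. Representing the undriven (high-impedance) state by `None` and raising an error on a conflict are modelling choices of this sketch, not something stated on the slides.

```python
def tri_state(enable, data):
    """Drive the data value when enabled, otherwise leave the line undriven (None)."""
    return data if enable else None

def bus_value(drivers):
    """Resolve a shared bus from (enable, data) pairs, one pair per source."""
    active = [tri_state(enable, data) for enable, data in drivers if enable]
    if len(active) > 1:
        # Two enabled sources would fight over the line.
        raise ValueError("bus conflict: more than one driver enabled")
    return active[0] if active else None
```

For instance, `bus_value([(0, 1), (1, 0)])` yields 0, because only the second source is enabled. The controller has to guarantee that at most one enable is active per state, which is exactly what the incompatibility edges encode.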
Bus sharing
• Recalculation of register cost
  Cost of tri-state driver
    0 CLB
    4 gates
    12 TOR
  Cost of 1-bit 2-to-1 MUX
    1/2 CLB
    3 gates
    14 TOR
  Cost of 1-bit register
    1/2 CLB
    7 gates
    34 TOR
• Recalculation of functional unit cost
  One 2-to-1 MUX less
  6 tri-state drivers more
Bus sharing
• Register cost computation for current FSMD implementation:
  3 registers of 32 bits with 2-to-1 MUX; each register costs:
    0.5 CLB/MUXREGbit * 32 bit = 16 CLB
    (3 gates/MUXbit + 7 gates/REGbit) * 32 bit = 320 gates
    (14 TOR/MUXbit + 34 TOR/REGbit) * 32 bit = 1536 TOR
  4 tri-state drivers of 32 bits; each tri-state driver costs:
    0 CLB/TRIStatebit * 32 bit = 0 CLB
    4 gates/TRIStatebit * 32 bit = 128 gates
    12 TOR/TRIStatebit * 32 bit = 384 TOR
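The register and functional-unit costs after bus sharing follow from the per-bit figures above. The sketch below recomputes them; the FU cost before bus sharing is taken from the "FU share" row of the summary table, and the helper names are assumptions.

```python
WIDTH = 32  # bits in the data path

# Per-bit cost (CLB, gates, TOR) from the recalculation slide.
REG = (0.5, 7, 34)
MUX2 = (0.5, 3, 14)
TRI = (0, 4, 12)

def add(*costs):
    """Add (CLB, gates, TOR) tuples column by column."""
    return tuple(sum(column) for column in zip(*costs))

def times(cost, factor):
    return tuple(c * factor for c in cost)

# A register with a 2-to-1 MUX shares one half CLB per bit with that MUX.
reg_mux2 = (0.5 * WIDTH, (REG[1] + MUX2[1]) * WIDTH, (REG[2] + MUX2[2]) * WIDTH)
tri_driver = times(TRI, WIDTH)
mux2 = times(MUX2, WIDTH)

# Registers: 3 registers with 2-to-1 MUX plus 4 tri-state drivers.
reg_after_bus = add(times(reg_mux2, 3), times(tri_driver, 4))

# Functional units: one 2-to-1 MUX less, 6 tri-state drivers more.
fu_before_bus = (112, 832, 4864)  # "FU share" row of the summary table
fu_after_bus = add(fu_before_bus, times(mux2, -1), times(tri_driver, 6))
```

The two results match the "Bus share" row of the summary table. Note that the gate and TOR counts rise compared with the "FU share" row, while CLBs and connections drop.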
Functional-unit sharing
            CLB              gates              TOR                 Conn
            Reg  FU   Tot    Reg   FU    Tot    Reg    FU    Tot
Original    176  160  336    2464  1408  3872   11968  7616  19584  20
Reg share   96   160  256    1184  1408  2592   6464   7616  14080  12
FU share    80   112  192    1024  832   1856   5504   4864  10368  8
Bus share   48   96   144    1472  1504  2976   6144   6720  12864  4
Port share
• FSMDs
• Models
• Synthesis techniques
  Basic principles
  Merging
    Register sharing (variable merging)
    Functional-unit sharing (operator merging)
    Bus sharing (connection merging)
    Register port sharing (register merging)
Register port sharing
• Basic principle:
  Combine several registers into one register file to reduce the number of read ports (fewer input MUXes) and the number of write ports (fewer tri-state drivers)
• Methodology: build the Register Access Table, indicating reads and writes to registers in each state
Register port sharing
Reuse Reg→FU table used for connection merging
              S0  S1  S2  S3  S4  S5  S6  S7
A R1→Out                                  X
B R1→FU1          X
C R1→FU2_1            X
D R2→FU2_2            X
E R1→FU3_1            X       X       X
F R3→FU3_1                        X
G R2→FU3_2        X   X       X   X   X
H R1→FU4                  X
I R3→FU5                  X

      S0  S1  S2  S3  S4  S5  S6  S7
R1        R   R   R   R       R   R
R2        R   R       R   R   R
R3                R       R
Register port sharing
Reuse FU→Reg table used for connection merging
            S0  S1  S2  S3  S4  S5  S6  S7
A In1→R1    X
B FU1→R1        X
C FU3→R1            X               X
D In2→R2    X
E FU3→R2        X           X   X
F FU4→R2                X
G FU2→R3            X
H FU5→R3                X

      S0  S1  S2  S3  S4  S5  S6  S7
R1    W   RW  RW  R   R       RW  R
R2    W   RW  R   W   RW  RW  R
R3            W   RW      R
Register port sharing
      S0  S1  S2  S3  S4  S5  S6  S7
R1    W   RW  RW  R   R       RW  R
R2    W   RW  R   W   RW  RW  R
R3            W   RW      R
• When implemented as three registers, we need 3 write ports and 3 read ports
• In the next slides, we do an exhaustive search (i.e. we enumerate all possibilities and compute their cost) for merging 2 or more registers into 1 register file
• For large designs, we would need an optimization technique
Register port sharing
• How many ports are needed for a register file sharing 2 registers?
  Combine R1 and R2
    2 read ports (S1, S2, S4, S6)
    2 write ports (S0, S1)
  Combine R1 and R3
    2 read ports (S3)
    2 write ports (S2)
  Combine R2 and R3
    2 read ports (S5)
    2 write ports (S3)
  No saving is obtained
Register port sharing
• How many ports are needed for a register file sharing 3 registers?
  Combine R1, R2 and R3
    2 read ports (S1, S2, S3, S4, S5, S6)
    2 write ports (S0, S1, S2, S3)
  We save 2 ports
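The exhaustive search over register groupings can be sketched in a few lines. The read and write state sets below are this sketch's reading of the register access table (state numbers 0 to 7 stand for S0 to S7); the function names are assumptions.

```python
from itertools import combinations

# States in which each register is read or written.
READS = {"R1": {1, 2, 3, 4, 6, 7}, "R2": {1, 2, 4, 5, 6}, "R3": {3, 5}}
WRITES = {"R1": {0, 1, 2, 6}, "R2": {0, 1, 3, 4, 5}, "R3": {2, 3}}

def ports_needed(access, group):
    """Largest number of registers of the group accessed in any single state."""
    return max(sum(state in access[reg] for reg in group) for state in range(8))

def port_count(group):
    """(read ports, write ports) of a register file holding the given registers."""
    return ports_needed(READS, group), ports_needed(WRITES, group)

# Enumerate every register file of 2 registers, then the one holding all 3.
pairs = {group: port_count(group) for group in combinations(READS, 2)}
all_three = port_count(("R1", "R2", "R3"))

# Three separate registers need 3 read and 3 write ports in total.
ports_saved = (3 + 3) - sum(all_three)
```

Every pair still needs 2 read and 2 write ports, which, together with the single ports of the remaining register, gives no saving. Only the file of all three registers reduces the total, by 2 ports.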
Register port sharing
[Figure: datapath after register port sharing: R1 (a, t1, x, t7), R2 (b, t2, t3, t5, t6) and R3 (y, t4) combined into one register file between In1/In2 and Out]
Register port sharing
• Recalculation of register cost
  Before register port sharing: 3 2-to-1 MUXes and 4 tri-state drivers
  After register port sharing: 4 tri-state drivers
  Saving:
    0 CLB (the small MUXes fitted in the same CLB as the register bits)
    3 gates/MUXbit * 32 bit = 96 gates
    14 TOR/MUXbit * 32 bit = 448 TOR
Register port sharing
            CLB              gates              TOR                 Conn
            Reg  FU   Tot    Reg   FU    Tot    Reg    FU    Tot
Original    176  160  336    2464  1408  3872   11968  7616  19584  20
Reg share   96   160  256    1184  1408  2592   6464   7616  14080  12
FU share    80   112  192    1024  832   1856   5504   4864  10368  8
Bus share   48   96   144    1472  1504  2976   6144   6720  12864  4
Port share  48   96   144    1376  1504  2880   5696   6720  12416  4
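The summary table can be cross-checked mechanically: each Tot column should equal Reg + FU. The sketch below does that and also reports the reduction relative to the original design. The data are copied from the table; the layout and function names are assumptions of this sketch.

```python
# Row -> ((Reg, FU) for CLB, (Reg, FU) for gates, (Reg, FU) for TOR, connections).
ROWS = {
    "Original":   ((176, 160), (2464, 1408), (11968, 7616), 20),
    "Reg share":  ((96, 160),  (1184, 1408), (6464, 7616),  12),
    "FU share":   ((80, 112),  (1024, 832),  (5504, 4864),  8),
    "Bus share":  ((48, 96),   (1472, 1504), (6144, 6720),  4),
    "Port share": ((48, 96),   (1376, 1504), (5696, 6720),  4),
}

def totals(name):
    """Tot columns (CLB, gates, TOR) of one row, computed as Reg + FU."""
    return tuple(reg + fu for reg, fu in ROWS[name][:3])

def reduction(name):
    """Fraction saved versus the original design, per cost measure."""
    return tuple(1 - new / old for new, old in zip(totals(name), totals("Original")))

for name in ROWS:
    clb, gates, tor = reduction(name)
    print(f"{name:10s} CLB {clb:6.1%}  gates {gates:6.1%}  TOR {tor:6.1%}")
```

The printout makes the trade-off visible: bus sharing lowers the CLB count and the number of connections, but its tri-state drivers raise the gate and TOR totals compared with the "FU share" row.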