Dr Shankar Balachandran
Indian Institute of Technology Madras
shankar@cse.iitm.ernet.in
14 October 2006
Datapath
Adders, multipliers, dividers
Shifters, Registers
Anything that changes or stores data
Control Unit
Controls the data
How is data stored?
Where is it stored?
When should data be available?
Control Unit
Correct sequencing of control signals
Much like the human brain controlling various parts of the body
Sequencing and timing are the key
Any
Control Unit
Decode Unit
Execute
Execution Unit
Write Back
Write Back Unit
A Possible Implementation
Mod-4 Counter
CLK
2 to 4 Decoder
Timing Diagram
CLK
Fetch   Decode   Execute   Write Back
1       0        0         0
0       1        0         0
0       0        1         0
0       0        0         1
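The one-hot pattern above can be sketched in a few lines of Python. This is only an illustration, assuming the counter wraps after four clocks so that each decoder output lines up with one phase; the function names are hypothetical and not part of the original slides.

```python
PHASES = ("Fetch", "Decode", "Execute", "Write Back")

def decoder_2to4(count):
    """One-hot decode of a 2-bit count: 0 -> (1, 0, 0, 0), ..., 3 -> (0, 0, 0, 1)."""
    return tuple(1 if count == i else 0 for i in range(4))

def phase_sequence(num_clocks):
    """Advance a 2-bit counter once per clock and decode it on every cycle."""
    count = 0
    rows = []
    for _ in range(num_clocks):
        rows.append(decoder_2to4(count))
        count = (count + 1) % 4  # assumed wrap-around after the fourth phase
    return rows

if __name__ == "__main__":
    for row in phase_sequence(4):
        print(row, PHASES[row.index(1)])
```

Exactly one output is high in every clock cycle, which is what lets each unit be enabled in turn.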
Hardwired vs Microprogrammed
Hardwired
Use
Microprogrammed
Store
A Model Computer
(Richard Eckert, SIGCSE Bulletin, Vol. 20, No. 3, September 1988)
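[Figure: block diagram of the model computer. A shared bus connects the PC, MAR, RAM, MDR, IR, Accumulator, ALU, and Register B. The control unit drives the signals IP, LP, EP, LM, R, W, LD, ED, LI, EI, LA, EA, A, S, EU, and LB. Width labels shown on the paths: 12, 8, and 4.]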
More Details
L = Load
E = Copy to bus
A,S = Add and Subtract
Sign bit to control unit
IP = Increment PC
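[Figure: the model computer again, with the control signals IP, LP, EP, LM, R, W, LD, ED, LA, EA, A, EU, LB, LI, and EI marked on the PC, MAR, RAM, MDR, ACC, ALU, IR, bus, and control unit.]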
Mnemonic  Opcode  Action                               Register Transfers         Active Controls
LDA               Load Accumulator: A <- (Mem)         1. MAR <- IR               EI, LM
                                                       2. MDR <- M(MAR)           R
                                                       3. A <- MDR                ED, LA
STA               Store Accumulator: (Mem) <- A        1. MAR <- IR               EI, LM
                                                       2. MDR <- A                EA, LD
                                                       3. M(MAR) <- MDR           W
ADD               A <- A+B                             1. A <- ALU(Add)           A, EU, LA
SUB               A <- A-B                             1. A <- ALU(Sub)           S, EU, LA
MBA               B <- A                               1. B <- A                  EA, LB
JMP               PC <- Mem                            1. PC <- IR                EI, LP
JN                PC <- Mem if negative flag is set    1. PC <- IR if NF is set   NF: EI, LP
HLT       8-15    Stop Clock
Fetch             IR <- Next Instruction               1. MAR <- PC               EP, LM
                                                       2. MDR <- M(MAR)           R
                                                       3. IR <- MDR               ED, LI, IP
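To see how the active controls drive the register transfers, here is a minimal Python sketch of the datapath. It makes several assumptions that are not stated on the slides: a 12-bit word whose low 8 bits hold the operand address, a register named ACC for the accumulator (to avoid clashing with the ALU signal A), and the helper names `bus_value`, `step`, and `run`. The opcode nibble in the sample instruction word is a placeholder, not a real encoding.

```python
WORD_MASK = 0xFFF  # assumed 12-bit data path
ADDR_MASK = 0xFF   # assumed 8-bit address field in the low bits of the IR

# L* signals load the named register from the bus.
LOADS = {"LP": "PC", "LM": "MAR", "LD": "MDR", "LI": "IR", "LA": "ACC", "LB": "B"}

def bus_value(state, controls):
    """Return the value driven onto the bus by the active E* signal, if any."""
    if "EP" in controls:
        return state["PC"]
    if "ED" in controls:
        return state["MDR"]
    if "EA" in controls:
        return state["ACC"]
    if "EI" in controls:
        return state["IR"] & ADDR_MASK
    if "EU" in controls:
        if "A" in controls:
            return (state["ACC"] + state["B"]) & WORD_MASK
        if "S" in controls:
            return (state["ACC"] - state["B"]) & WORD_MASK
    return None

def step(state, controls):
    """Apply one clock step for the given set of active control signals."""
    new = dict(state, RAM=dict(state["RAM"]))
    bus = bus_value(state, controls)
    for signal, register in LOADS.items():
        if signal in controls:
            new[register] = bus
    if "R" in controls:   # memory read: MDR <- M(MAR)
        new["MDR"] = state["RAM"].get(state["MAR"], 0)
    if "W" in controls:   # memory write: M(MAR) <- MDR
        new["RAM"][state["MAR"]] = state["MDR"]
    if "IP" in controls:  # increment PC
        new["PC"] = (state["PC"] + 1) & ADDR_MASK
    return new

def run(state, steps):
    for controls in steps:
        state = step(state, controls)
    return state

FETCH = [{"EP", "LM"}, {"R"}, {"ED", "LI", "IP"}]
LDA = [{"EI", "LM"}, {"R"}, {"ED", "LA"}]

# 0x120: placeholder opcode nibble, operand address 0x20 (both illustrative).
start = {"PC": 0, "MAR": 0, "MDR": 0, "IR": 0, "ACC": 0, "B": 0,
         "RAM": {0x00: 0x120, 0x20: 0x02A}}
after_lda = run(start, FETCH + LDA)
```

After the six steps, `after_lda["ACC"]` holds the word stored at address 0x20 and the PC has advanced by one. The same `step` function handles ADD, SUB, MBA, and STA when given their control sets from the table.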
Hardwired Unit
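[Figure: hardwired control unit. CLK drives a ring counter that produces the timing signals T0 to T5. The opcode field of the IR feeds a decoder with outputs LDA, STA, ADD, SUB, MBA, JMP, JN, and Halt. The timing signals, the decoder outputs, and NF enter the control matrix.]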
Control Signals
[Table: the timing states T0 to T5 in which each control signal (IP, LP, EP, LM, R, W, LD, ED, LI, EI, LA, EA, A, S, EU, LB) is active during Fetch, LDA, STA, MBA, ADD, SUB, JMP, and JN. The JN entries are conditional on the flag (*F).]
IP = T2;
LP = T3*JMP + T3*JN*NF;
EP = T0;
LM = T0 + T3*LDA + T3*STA;
R = T1 + T4*LDA;
W = T5*STA;
LD = T4*STA;
ED = T2 + T5*LDA;
LI = T2;
A = T3*ADD;
S = T3*SUB;
..
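The equations can be read as plain Boolean functions of the timing state, the decoded opcode, and NF. The sketch below transcribes only the equations listed above (the signals elided by ".." are left out rather than guessed); the function name `control_matrix` and its argument conventions are assumptions made for illustration.

```python
def control_matrix(t, op, nf):
    """Return the set of listed control signals active in timing state t (0 to 5).

    op is the decoded opcode name, e.g. "LDA"; nf is the negative flag.
    Only the equations given in the text are modeled here.
    """
    T = [t == i for i in range(6)]

    def is_op(name):
        return op == name

    signals = {
        "IP": T[2],
        "LP": (T[3] and is_op("JMP")) or (T[3] and is_op("JN") and nf),
        "EP": T[0],
        "LM": T[0] or (T[3] and is_op("LDA")) or (T[3] and is_op("STA")),
        "R": T[1] or (T[4] and is_op("LDA")),
        "W": T[5] and is_op("STA"),
        "LD": T[4] and is_op("STA"),
        "ED": T[2] or (T[5] and is_op("LDA")),
        "LI": T[2],
        "A": T[3] and is_op("ADD"),
        "S": T[3] and is_op("SUB"),
    }
    return {name for name, active in signals.items() if active}

if __name__ == "__main__":
    for t in range(6):
        print("T%d" % t, sorted(control_matrix(t, "LDA", False)))
```

T0 to T2 give the same signals for every opcode (the fetch), and the opcode only matters from T3 onward, which is why the fetch terms appear without an opcode factor.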
Control Matrix
Implement using discrete gates
Usually done using PLAs
Large control matrices are implemented hierarchically
For speed
An Alternate Implementation
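[Figure: microprogrammed control unit. The 4-bit opcode from the IR enters the MAP starting address generator. The CD field (codes shown: 1*, 01, 00) together with NF selects the next uPC value: map from IR, unconditional branch within the microprogram, or conditional branch (NF=0 => increment, NF=1 => branch). The uPC, clocked by CLK and with a +1 incrementer, addresses the 32 x 24 control ROM (control store). The microinstruction register supplies the jump address, CD, the control signals, and HLT.]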
Control Store (CD values not listed)
Instruction  uInstruction Address  Control Signals    Jump Address
Fetch        00                    0011000000000000   01
             01                    0000100000000000   02
             02                    1000000110000000   XX
LDA          03                    0001000001000000   04
             04                    0000100000000000   05
             05                    0000000100100000   00
STA          06                    0001000001000000   07
             07                    0000001000010000   08
             08                    0000010000000000   00
ADD          09                    0000000000101010   00
SUB          0A                    0000000000100110   00
MBA          0B                    0000000000010001   00
JMP          0C                    0100000001000000   00
JN           0D                    0000000000000000   0F
             0E                    0000000000000000   00
             0F                    0100000001000000   00
Expansion    8-E                   10-1E
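Each 16-bit control word is a horizontal microinstruction: one bit per control signal. The sketch below decodes a word into signal names. The bit order (IP LP EP LM R W LD ED LI EI LA EA A S EU LB, left to right) is an inference made by matching the control words against the Active Controls column of the instruction table, so treat it as an assumption; `decode` is a hypothetical helper name.

```python
# Assumed bit order, left to right, inferred from the instruction table.
SIGNALS = "IP LP EP LM R W LD ED LI EI LA EA A S EU LB".split()

def decode(word):
    """Return the names of the control signals whose bit is '1' in the word."""
    if len(word) != len(SIGNALS) or set(word) - {"0", "1"}:
        raise ValueError("expected a 16-character binary string")
    return [name for name, bit in zip(SIGNALS, word) if bit == "1"]

if __name__ == "__main__":
    for address, word in [("00", "0011000000000000"),
                          ("02", "1000000110000000"),
                          ("09", "0000000000101010")]:
        print(address, word, decode(word))
```

Under this bit order, the three fetch words decode to EP+LM, R, and IP+ED+LI, which agrees with the Fetch row of the instruction table.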
Control Word
IP LP EP LM R W LD ED LI EI LA EA A S EU LB
Example 1 MBA followed by ADD
After fetch (00, 01, 02), the mapped address XX is 0B for MBA, then 09 for ADD. The rest of the control store is unchanged.
MBA
1. MAR <- PC         0011000000000000
2. MDR <- M(MAR)     0000100000000000
3. IR <- MDR         1000000110000000
B <- A               0000000000010001
ADD
1. MAR <- PC         0011000000000000
2. MDR <- M(MAR)     0000100000000000
3. IR <- MDR         1000000110000000
A <- ALU(Add)        0000000000101010
Example 2 JN with Flag Set
After fetch (00, 01, 02), XX maps to 0D.
At 0D no control signals are active. NF=1, so the conditional branch to 0F is taken.
At 0F the control word 0100000001000000 asserts LP and EI (PC <- IR), then control returns to 00.
Example 3 JN with Flag Not Set
After fetch (00, 01, 02), XX maps to 0D.
At 0D no control signals are active. NF=0, so the uPC increments to 0E.
At 0E no control signals are active either, and control returns to 00 with the PC unchanged.
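The three examples can be replayed with a small model of the microsequencer. The control words, addresses, and jump addresses below come from the control store; the branch kinds ("JUMP", "MAP", "COND") stand in for the CD field, whose bit codes are not reproduced here, and are assumptions, as are the names `STORE`, `START`, and `trace`.

```python
# address -> (control word, branch kind, jump address)
STORE = {
    0x00: ("0011000000000000", "JUMP", 0x01),
    0x01: ("0000100000000000", "JUMP", 0x02),
    0x02: ("1000000110000000", "MAP", None),   # next address comes from the IR
    0x03: ("0001000001000000", "JUMP", 0x04),
    0x04: ("0000100000000000", "JUMP", 0x05),
    0x05: ("0000000100100000", "JUMP", 0x00),
    0x06: ("0001000001000000", "JUMP", 0x07),
    0x07: ("0000001000010000", "JUMP", 0x08),
    0x08: ("0000010000000000", "JUMP", 0x00),
    0x09: ("0000000000101010", "JUMP", 0x00),
    0x0A: ("0000000000100110", "JUMP", 0x00),
    0x0B: ("0000000000010001", "JUMP", 0x00),
    0x0C: ("0100000001000000", "JUMP", 0x00),
    0x0D: ("0000000000000000", "COND", 0x0F),  # branch if NF, else increment
    0x0E: ("0000000000000000", "JUMP", 0x00),
    0x0F: ("0100000001000000", "JUMP", 0x00),
}

# Starting address of each instruction's microroutine, as labeled in the store.
START = {"LDA": 0x03, "STA": 0x06, "ADD": 0x09, "SUB": 0x0A,
         "MBA": 0x0B, "JMP": 0x0C, "JN": 0x0D}

def trace(op, nf=False):
    """Return [(address, control word), ...] for fetching and executing op."""
    upc = 0x00
    visited = []
    while True:
        word, kind, target = STORE[upc]
        visited.append((upc, word))
        if kind == "MAP":
            upc = START[op]
        elif kind == "COND":
            upc = target if nf else upc + 1
        else:
            upc = target
        if upc == 0x00:  # back at the start of fetch: instruction finished
            return visited

if __name__ == "__main__":
    for op, nf in [("MBA", False), ("ADD", False), ("JN", True), ("JN", False)]:
        print(op, "NF=%d" % nf, ["%02X" % a for a, _ in trace(op, nf)])
```

Running it lists 00, 01, 02, 0B for MBA and 00, 01, 02, 09 for ADD, while JN visits 0D and then either 0F (flag set) or 0E (flag not set).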
What is Microcode?
Thought Experiment
Why is the design a little clumsy?
What can we do about it?
Real Life
A little American Football Story
Theory vs. Practice
In
A General Approach
[Figure: the IR, external inputs, and condition codes feed the starting and branch address generator, which loads the uPC. The uPC addresses the control store, which outputs the control word.]
Format of Microinstructions
Pick yours
Your
What we did :
One
Don't matter
Can
Vertical
Microprogram
Vertical Microprogram
Encode the bits by grouping similar elements together
General Idea :
Group
Some
Design Issues
requires decoders
Another Idea
Group concurrently active signals
Every meaningful combination gets a code
Complex decoder to interpret every code
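As a sketch of the first idea (grouping similar elements), consider the model computer's bus signals. The grouping below is hypothetical and not taken from the slides: it assumes that at most one E* signal drives the bus and at most one L* signal loads from it in any microinstruction, which holds for the microprogram shown earlier. Eleven one-hot bits then shrink to two 3-bit fields, at the cost of a decoder per field.

```python
# Hypothetical encoded fields; code 0 means "no signal in this group".
BUS_SOURCES = [None, "EP", "ED", "EI", "EA", "EU"]      # 3-bit field
BUS_LOADS = [None, "LP", "LM", "LD", "LI", "LA", "LB"]  # 3-bit field

def encode(signals):
    """Pack a set of bus signals into a 6-bit vertical code."""
    sources = [s for s in signals if s in BUS_SOURCES]
    loads = [s for s in signals if s in BUS_LOADS]
    if len(sources) > 1 or len(loads) > 1:
        raise ValueError("signals in the same group cannot be active together")
    source_code = BUS_SOURCES.index(sources[0]) if sources else 0
    load_code = BUS_LOADS.index(loads[0]) if loads else 0
    return (source_code << 3) | load_code

def decode(code):
    """Model the two field decoders: recover the active signals from the code."""
    source = BUS_SOURCES[(code >> 3) & 0b111]
    load = BUS_LOADS[code & 0b111]
    return {s for s in (source, load) if s is not None}

if __name__ == "__main__":
    code = encode({"EP", "LM"})
    print(format(code, "06b"), decode(code))
```

The encoding forbids combinations such as EP with ED, which is exactly the flexibility a vertical format gives up in exchange for a narrower control word.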
Vertical vs Horizontal
Horizontal
Faster
More area
More common currently
Cheap transistors
Vertical
Slower
More microinstructions
Microsequencing
Other ways to save on hardware
Every instruction had its own
microprogram sequence
Also, instructions have several addressing
modes
Only
Bit-ORing
Example
Two instructions share some microcode
Eventually, must branch
The default branch (one instruction) is stored at X0
The other branch is stored at X1
Change the least significant bit(s) to get the new address
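A minimal sketch of the bit-ORing step, assuming a single select bit ORed into the least significant bit of the stored target. The function name and the example address are illustrative only.

```python
def next_address(stored_target, take_other_branch):
    """OR a one-bit condition into the least significant bit of the target.

    The stored target must end in 0 (the X0 address) so that the OR can
    turn it into the X1 address without touching the other bits.
    """
    if stored_target & 1:
        raise ValueError("stored target must have its least significant bit clear")
    return stored_target | (1 if take_other_branch else 0)

if __name__ == "__main__":
    shared_tail_target = 0x10  # hypothetical X0 address
    print(hex(next_address(shared_tail_target, False)))
    print(hex(next_address(shared_tail_target, True)))
```

No extra branch microinstruction is needed: both instructions run the same shared microcode and diverge only in the final address.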
Thought Experiment :
What if we provided explicit branch
instead of storing next field in our
microprogram?
A typical instruction set will need a lot of branches
A lot of time will be wasted on branching
Caution :
Microinstruction
Solution :
There
is no free lunch.
A neat idea :
Caveat :
Commonly used
Historical Perspectives
Hardwired Logic
Popular now
Speed Benefits
Microprogram
Popular in 70s
Shades of gray :
Hardwired
Any
Microcoding
Small
Hardwired vs Microcoding
Hardwired units are faster and smaller
Emulation is easy with microcoding
Hardwired design is complex if large
Bugs in a hardwired design cannot be fixed in the field
Hardwired control is not suited for loops
Looping
RISC
Simpler instruction set
Hardwired Implementation
Store
Difference :
Contents
Microprogram vs Software
process
Error prone
Many repeated fetches from memory for the given sequence of operations
Solution 2 : Microcode
Long
Emulation
32 bit architecture
16-bit registers
Secret :
Heavy microcoding
Programmers oblivious
Implemented in
VAX
8800
PDP-11/60
IBM System/370
Current Trends
Microcode Update
Linux Utility - microcode_ctl
Companion
Intel Said..
The Pentium(R) Pro processor and Pentium(R) II processor may
contain design defects or errors known as errata that may cause the
product to deviate from published specifications. Many times, the
effects of the errata can be avoided by implementing hardware or
software work-arounds, which are documented in the Pentium Pro
Processor Specification Update and the Pentium II Processor
Specification Update. Pentium Pro and Pentium II processors include a
feature called "reprogrammable microcode", which allows certain types
of errata to be worked around via microcode updates. The microcode
updates reside in the system BIOS and are loaded into the processor
by the system BIOS during the Power-On Self Test, or POST.
Current Trends
Hyperthreading in P4
A second logical CPU
Complete state of the system in both CPUs
Microcoding in P4
Two
Thank You