Pipeline: Hazards
Lecturer: Prof. Hong Jiang
Courtesy of Prof. Yifeng Zhu, U. of Maine Fall, 2006
CSCE430/830
Pipeline Hazards
Pipelining Outline
Introduction
  Defining Pipelining
  Pipelining Instructions
Hazards
  Structural Hazards
  Data Hazards
  Control Hazards
Pipeline Hazards
Where one instruction cannot immediately follow another
Types of hazards
  Structural hazards - attempt to use the same resource by two or more instructions
  Control hazards - attempt to make branching decisions before branch condition is evaluated
  Data hazards - attempt to use data before it is ready
Structural Hazards
Attempt to use the same resource by two or more instructions at the same time
Example: Single memory for instructions and data
  Accessed by IF stage
  Accessed at same time by MEM stage
Solutions
  Delay the second access by one clock cycle, OR
  Provide separate memories for instructions and data
    This is what the book does
    This is called a Harvard Architecture
    Real pipelined processors have separate caches
[Figure: pipeline diagram, CC 1 to CC 8. lw $r0, 10($r1), sw $r3, 20($r4) and two following instructions each pass through IM, REG, ALU, DM, REG.]
[Figure: pipeline diagram, CC 1 to CC 8, for lw $r0, 10($r1), sw $r3, 20($r4) and two following instructions sharing a single memory. A "Memory Conflict" is marked where a data memory access (DM) and an instruction fetch (IM) fall in the same clock cycle.]
[Figure: resolving the structural hazard with a stall. Instructions are listed in order (Instr 2, Stall, Instr 3) and pass through Ifetch, Reg, ALU, DMem, Reg; the stall appears as a row of bubbles, and Instr 3 is fetched after it.]
Structural Hazards
Some common structural hazards:
Memory:
  We've already mentioned this one.
Floating point:
  Since many floating point instructions require many cycles, it's easy for them to interfere with each other.
Structural Hazards
Dealing with Structural Hazards
Stall
  low cost, simple
  increases CPI
  use for rare case since stalling has performance effect
Pipeline hardware resource
  useful for multi-cycle resources
  good performance
  sometimes complex, e.g., RAM
Replicate resource
  good performance
  increases cost (+ maybe interconnect delay)
  useful for cheap or divisible resources
Structural Hazards
Many RISC ISAs are designed with this in mind.
Sometimes it is very difficult to do this.
For example, memory of necessity is used in the IF and MEM stages.
Structural Hazards
We want to compare the performance of two machines. Which machine is faster?
Machine A: Dual ported memory, so there are no memory stalls
Machine B: Single ported memory, but its pipelined implementation has a clock rate that is 1.05 times faster
Assume:
  Ideal CPI = 1 for both
  Loads are 40% of instructions executed
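$$\text{Speedup} = \frac{\text{Ideal CPI} \times \text{Pipeline depth}}{\text{Ideal CPI} + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$

With Ideal CPI = 1:

$$\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$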
Structural Hazards
Solution:
SpeedUpA = Pipeline Depth / (1 + 0) x (clock_unpipe / clock_pipe) = Pipeline Depth
SpeedUpB = Pipeline Depth / (1 + 0.4 x 1) x (clock_unpipe / (clock_unpipe / 1.05)) = (Pipeline Depth / 1.4) x 1.05 = 0.75 x Pipeline Depth
SpeedUpA / SpeedUpB = Pipeline Depth / (0.75 x Pipeline Depth) = 1.33
Machine A is 1.33 times faster
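The comparison of Machine A and Machine B can be checked numerically. The following is a minimal sketch, assuming ideal CPI = 1 and one stall cycle per load as in the slide; the function name, the parameter names and the pipeline depth of 5 are illustrative choices (the depth cancels out of the ratio).

```python
def pipeline_speedup(depth, stall_cpi, clock_ratio=1.0):
    """Speedup over the unpipelined machine, assuming ideal CPI = 1.

    clock_ratio is cycle_time_unpipelined / cycle_time_pipelined,
    normalized so that Machine A has clock_ratio = 1.0.
    """
    return depth / (1.0 + stall_cpi) * clock_ratio


DEPTH = 5             # hypothetical depth; it cancels in the ratio
LOAD_FRACTION = 0.4   # loads are 40% of instructions
STALL_PER_LOAD = 1    # one stall cycle per load on the single-ported memory

speedup_a = pipeline_speedup(DEPTH, stall_cpi=0.0)
speedup_b = pipeline_speedup(
    DEPTH, stall_cpi=LOAD_FRACTION * STALL_PER_LOAD, clock_ratio=1.05
)
ratio = speedup_a / speedup_b
print(f"Machine A is {ratio:.2f} times faster")
```

Changing DEPTH leaves the ratio unchanged, which is why the slide can state the result without fixing a pipeline depth.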
Pipelining Summary
Review
Speedup of pipeline

$$\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$
Pipeline Hazards
Where one instruction cannot immediately follow another
Types of hazards
  Structural hazards - attempt to use same resource twice
  Control hazards - attempt to make decision before condition is evaluated
  Data hazards - attempt to use data before it is ready
Data Hazards
Data hazards occur when data is used before it is ready
[Figure: pipeline diagram, time in clock cycles CC 1 to CC 9. Value of register $2: 10 in CC 1 through CC 4, 10/20 in CC 5, 20 in CC 6 through CC 9. Each instruction passes through IM, Reg, DM, Reg.]
Program execution order (in instructions):
  sub $2, $1, $3
  and $12, $2, $5
  or $13, $6, $2
  add $14, $2, $2
  sw $15, 100($2)
The use of the result of the SUB instruction in the next three instructions causes a data hazard, since the register $2 is not written until after those instructions read it.
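The timing argument can be sketched in code. This is an illustrative model, not the lecture's own: it assumes a five-stage pipeline with no stalls, where instruction i (counting from 0) is fetched in cycle i + 1, reads its registers in ID in cycle i + 2 and writes in WB in cycle i + 5. The function names are hypothetical, and the sequence is the sub/and/or/add/sw example.

```python
import re

PROGRAM = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]


def dest_and_sources(instr):
    """Split an instruction into (destination, sources); sw writes no register."""
    op = instr.split()[0]
    regs = re.findall(r"\$\d+", instr)
    if op == "sw":
        return None, regs
    return regs[0], regs[1:]


def hazards(program, producer=0, regfile_forwards=False):
    """Later instructions that read the producer's destination too early.

    Ignores any later instruction that overwrites the same register
    (none does in this example).
    """
    dest, _ = dest_and_sources(program[producer])
    write_cycle = producer + 5              # WB stage of the producer
    found = []
    for i in range(producer + 1, len(program)):
        _, sources = dest_and_sources(program[i])
        read_cycle = i + 2                  # ID stage of the consumer
        too_early = read_cycle < write_cycle or (
            read_cycle == write_cycle and not regfile_forwards
        )
        if dest in sources and too_early:
            found.append(program[i])
    return found


print(hazards(PROGRAM))                         # and, or, add
print(hazards(PROGRAM, regfile_forwards=True))  # and, or
```

With regfile_forwards=True the model assumes a register file that passes a value written in a cycle to a read in the same cycle, which removes the hazard for the add instruction.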
Data Hazards
Execution order is: InstrI, then InstrJ
Read after write (RAW): caused by a "dependence" (in compiler nomenclature). This hazard results from an actual need for communication.
Data Hazards
Execution order is: InstrI, then InstrJ
Write after write (WAW): called an "output dependence" by compiler writers. This also results from the reuse of name "r1".
Can't happen in MIPS 5 stage pipeline because:
  All instructions take 5 stages, and
  Writes are always in stage 5
[Figure: pipeline diagram for and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), drawn with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers. The dependences on $2 are labeled "EX hazard" and "MEM hazard".]
Data Hazards
Solutions for Data Hazards
  Stalling
  Forwarding: connect new value directly to next stage
  Reordering
[Figure: stalling. Timeline with ticks from 8 to 18. An earlier instruction writes s0 (W s0); two STALL rows of bubbles follow; a sub instruction using $s0, $t2 and $t3 then goes through IF, ID, EX, MEM, WB and reads s0 (R s0) in ID.]
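[Figure: load followed by a dependent instruction. lw $s0,20($t1) goes through IF, ID, EX, MEM, WB and writes s0 (W s0); an arrow labeled "new value of s0" runs to the following instruction, which waits through one STALL row of bubbles before it reads s0 (R s0).]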
Data Hazards
[Figure: pipeline timing table for LW R1, 0(R2) and the instructions that follow it; LW passes through IF, ID, EX, MEM while each later instruction is one stage behind the previous one.]
Forwarding
Key idea: connect data internally before it's stored
[Figure: "No Forwarding". Pipeline diagram over CC 1 to CC 9 for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), drawn with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers. Value of register $2: 10 in CC 1 through CC 4, 10/20 in CC 5, 20 in CC 6 through CC 9.]
[Figure: pipeline diagram for and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2); each instruction passes through IM, Reg, DM, Reg.]
Assumption: The register file forwards values that are read and written during the same cycle.
A stall is needed if an instruction reads a register immediately after a load instruction that writes the same register.
Reordering
Forwarding
EX Hazard: the SUB result is not written until its WB stage; it is ready at the end of its EX stage and needed at the start of ADD's EX stage.
EX/MEM Forwarding: forward $s0 from EX/MEM to the ALU input in the ADD EX stage (CC4).
EX Hazard Detection - EX/MEM Forwarding Conditions:
  If ((EX/MEM.RegWrite = 1) & (EX/MEM.RegRD = ID/EX.RegRS)), or
  If ((EX/MEM.RegWrite = 1) & (EX/MEM.RegRD = ID/EX.RegRT))
  Then forward EX/MEM result to EX stage
MEM Hazard: the SUB result is not written until its WB stage; it is stored in MEM/WB and needed at the start of OR's EX stage.
MEM/WB Forwarding: forward $s0 from MEM/WB to the ALU input in the OR EX stage (CC5).
MEM Hazard Detection - MEM/WB Forwarding Conditions:
  If ((MEM/WB.RegWrite = 1) & (MEM/WB.RegRD = ID/EX.RegRS)), or
  If ((MEM/WB.RegWrite = 1) & (MEM/WB.RegRD = ID/EX.RegRT))
  Then forward MEM/WB result to EX stage
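The EX/MEM and MEM/WB forwarding conditions can be combined into one selection function per ALU operand. This is a minimal sketch, not the lecture's hardware: the PipeReg class and the function name are hypothetical, the register numbers are example values, and giving EX/MEM priority when both registers match is an added assumption (it holds the more recent result).

```python
from dataclasses import dataclass


@dataclass
class PipeReg:
    """The two fields of a pipeline register that forwarding looks at."""
    reg_write: bool  # instruction writes the register file
    rd: int          # destination register number


def forward_source(src_reg, ex_mem, mem_wb):
    """Choose where one ALU operand (rs or rt of the instruction in EX) comes from."""
    # EX hazard: the instruction one ahead is about to write this register.
    if ex_mem.reg_write and ex_mem.rd == src_reg:
        return "EX/MEM"
    # MEM hazard: the instruction two ahead is about to write this register.
    if mem_wb.reg_write and mem_wb.rd == src_reg:
        return "MEM/WB"
    # No match: use the value read from the register file.
    return "REG"


S0 = 16  # example register number for $s0

# CC4: SUB (writes $s0) sits in EX/MEM while ADD reads $s0 in EX.
print(forward_source(S0, PipeReg(True, S0), PipeReg(False, 0)))  # EX/MEM

# CC5: SUB has moved to MEM/WB, ADD (writes register 8) is in EX/MEM, OR reads $s0.
print(forward_source(S0, PipeReg(True, 8), PipeReg(True, S0)))   # MEM/WB
```

The function is called once for rs and once for rt, matching the two conditions listed for each hazard.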
[Figure: pipeline diagram for and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2), drawn with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers.]
Hazard conditions:
  EX/MEM.RegisterRd = ID/EX.RegisterRs
  EX/MEM.RegisterRd = ID/EX.RegisterRt
  MEM/WB.RegisterRd = ID/EX.RegisterRs
  MEM/WB.RegisterRd = ID/EX.RegisterRt
Some instructions do not write a register.
Forwarding
[Figure: forwarding datapath; two sets of selector labels 00, 01, 10 at the ALU inputs.]
Add hardware to feed back ALU and MEM results to both ALU inputs
Controlling Forwarding
Need to test when register numbers match in rs, rt, and rd fields stored in pipeline registers
"EX" hazard:
  EX/MEM - test whether instruction writes register file and examine rd register
  ID/EX - test whether instruction reads rs or rt register and matches rd register in EX/MEM
"MEM" hazard:
  MEM/WB - test whether instruction writes register file and examine rd (rt) register
  ID/EX - test whether instruction reads rs or rt register and matches rd (rt) register in MEM/WB
LW doesn't write $s0 to the Reg File until the end of CC5, but ADD reads $s0 from the Reg File in CC3.
EX/MEM forwarding won't work, because the data isn't loaded from memory until CC4 (so it's not in the EX/MEM register).
MEM/WB forwarding won't work either, because ADD executes in CC4.
We must handle this hazard by stalling the pipeline for 1 clock cycle (bubble).
We can then use MEM/WB forwarding, but of course there is still a performance loss.
We do this by preserving the current values in IF/ID for use on the next clock cycle.
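The load-use stall can be sketched as a small check plus a rule for the IF/ID register. This is an illustration under stated assumptions: the function names are hypothetical, the mem_read flag stands for "the instruction in EX is a load", and the register numbers are example values.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register that a load in EX will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def advance_if_id(stall, if_id, fetched):
    """On a stall keep the current IF/ID contents; otherwise latch the new fetch."""
    return if_id if stall else fetched


S0, T2 = 16, 10  # example register numbers

# LW (destination $s0) is in EX while ADD, which reads $s0, is in ID.
stall = load_use_stall(True, S0, S0, T2)
held = advance_if_id(stall, "ADD", "next instruction")
print(stall, held)  # True ADD
```

Holding IF/ID means ADD is decoded again in the next clock cycle, by which time the loaded value can be forwarded from MEM/WB.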
Control Hazards
A control hazard is when we need to find the destination of a branch, and can't fetch any new instructions until we know that destination.
A branch is either
  Taken: PC <= PC + 4 + Immediate
  Not Taken: PC <= PC + 4
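The two outcomes can be written as one next-PC function. This is a minimal sketch with a hypothetical function name; it treats the immediate as a byte offset, as the formula above does. In MIPS encodings the 16-bit branch field counts words, so the example multiplies the field by 4 before adding it.

```python
def next_pc(pc, taken, immediate):
    """Next PC for a branch at address pc; immediate is a byte offset."""
    if taken:
        return pc + 4 + immediate
    return pc + 4


# Example values: a branch at address 40 whose offset field is 7 words (28 bytes).
print(next_pc(40, True, 7 * 4))   # 72
print(next_pc(40, False, 7 * 4))  # 44
```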
[Figure: control hazard pipeline diagram; five consecutive instructions each pass through Ifetch, Reg, ALU, DMem, Reg.]
Branch Hazards
Just stalling for each branch is not practical
Common assumption: branch not taken
When assumption fails: flush three instructions
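The cost of the two policies can be compared with a short CPI estimate. This is a sketch under assumptions: the branch frequency (20%) and taken rate (60%) are hypothetical numbers chosen for illustration, the base CPI is 1, and stalling on every branch is assumed to cost the same three cycles as a flush.

```python
def cpi_always_stall(branch_fraction, penalty=3, base_cpi=1.0):
    """Every branch pays the full penalty."""
    return base_cpi + branch_fraction * penalty


def cpi_predict_not_taken(branch_fraction, taken_fraction, penalty=3, base_cpi=1.0):
    """Only taken branches (wrong guesses) flush the pipeline."""
    return base_cpi + branch_fraction * taken_fraction * penalty


BRANCH_FRACTION = 0.2  # hypothetical: 20% of instructions are branches
TAKEN_FRACTION = 0.6   # hypothetical: 60% of branches are taken

print(f"{cpi_always_stall(BRANCH_FRACTION):.2f}")
print(f"{cpi_predict_not_taken(BRANCH_FRACTION, TAKEN_FRACTION):.2f}")
```

Under these assumed numbers, assuming not taken lowers the CPI from 1.60 to 1.36; the gain grows as fewer branches are taken.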
[Figure: pipeline diagram, time in clock cycles CC 1 to CC 9; each instruction passes through IM, Reg, DM, Reg.]
Program execution order (in instructions):
  40 beq $1, $3, 7
  44 and $12, $2, $5
  48 or $13, $6, $2
  52 add $14, $2, $2
  72 lw $4, 50($7)
(Fig. 6.37)
Predict
assume an outcome and continue fetching (undo if prediction is wrong)
lose cycles only on mis-prediction
Delayed branch
specify in architecture that the instruction immediately following branch is always executed
Why are branches (especially backward branches) more likely to be taken than not taken?
[Figure: stall on branch. add $r4,$r5,$r6 and beq $r0,$r1,tgt go through IF, ID, EX, MEM, WB; a STALL row of bubbles follows the branch, and sw $s4,200($t5) is fetched after the point marked "beq writes PC here".]
[Figure: add $r4,$r5,$r6; beq $r0,$r1,tgt; tgt: sw $s4,200($t5); each instruction passes through IF, ID, EX, MEM, WB.]
[Figure: add $r4,$r5,$r6 and beq $r0,$r1,tgt go through IF, ID, EX, MEM, WB; the next instruction is fetched (IF) and then squashed into bubbles ("Squashed instruction"); or $r8,$r8,$r9 follows through IF, ID, EX, MEM, WB.]
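[Figure: 2-bit branch prediction state diagram; states labeled 10, 01 and 00, transitions labeled T (taken) and NT (not taken), with a "Predict Taken" label.]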
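A 2-bit predictor can be sketched as a saturating counter. This is one common form of 2-bit prediction and is an assumption here, since the exact transitions of the state diagram are not recoverable from the text; the class and function names are hypothetical. States 0 and 1 predict not taken, states 2 and 3 predict taken.

```python
class TwoBitPredictor:
    """2-bit saturating counter: 0, 1 predict not taken; 2, 3 predict taken."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a list of branch outcomes (True = taken)."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


# A loop branch: taken 9 times, then not taken at loop exit; the loop runs twice.
outcomes = ([True] * 9 + [False]) * 2
mispredictions = count_mispredictions(TwoBitPredictor(), outcomes)
print(mispredictions)  # 4
```

After the warm-up in the first pass, the counter mispredicts only the loop exit, because a single not-taken outcome does not flip the prediction.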
Prediction accuracy of 4K-entry 2-bit prediction buffer on SPEC89 benchmarks: accuracy is lower for integer programs (gcc, espresso, eqntott, li) than for FP
Prediction accuracy of 4K-entry 2-bit prediction buffer vs. infinite 2-bit buffer: increasing buffer size from 4K does not significantly improve performance
Delayed branches: code is rearranged by the compiler to place an independent instruction after every branch (in the delay slot).
Original order:
  add $R4,$R5,$R6
  beq $R1,$R2,20
  lw $R3,400($R0)
Rearranged (add moves into the delay slot):
  beq $R1,$R2,20
  add $R4,$R5,$R6
  lw $R3,400($R0)
MIPS Instructions
All instructions exactly 32 bits wide
Different formats for different purposes
Similarities in formats ease implementation
Fields from bit 31 (left) to bit 0 (right):
R-Format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-Format: op (6 bits) | rs (5 bits) | rt (5 bits) | offset (16 bits)
J-Format: op (6 bits) | address (26 bits)
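Because the three formats share field positions, one routine can extract every field from a 32-bit instruction word. The following is a minimal sketch with a hypothetical function name; the example word is the encoding of an R-format add with rs = 9, rt = 10, rd = 8 and funct = 0x20.

```python
def decode(word):
    """Extract all MIPS instruction fields from a 32-bit word.

    Which fields are meaningful depends on the format selected by op.
    """
    return {
        "op": (word >> 26) & 0x3F,      # bits 31..26, all formats
        "rs": (word >> 21) & 0x1F,      # bits 25..21, R and I formats
        "rt": (word >> 16) & 0x1F,      # bits 20..16, R and I formats
        "rd": (word >> 11) & 0x1F,      # bits 15..11, R format
        "shamt": (word >> 6) & 0x1F,    # bits 10..6, R format
        "funct": word & 0x3F,           # bits 5..0, R format
        "offset": word & 0xFFFF,        # bits 15..0, I format
        "address": word & 0x3FFFFFF,    # bits 25..0, J format
    }


fields = decode(0x012A4020)
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

The op, rs and rt fields sit in the same place in every format that has them, so the registers can be read before the format is fully decoded.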