Manual Pipelining

BITS
Investigating Instruction Pipelines
Objectives
At the end of this lab students should be able to:
1. Demonstrate the difference between pipelined and sequential processing of CPU instructions
2. Understand pipeline data dependencies and data hazards
3. Explain a pipeline technique that eliminates data hazards
4. Demonstrate the benefits of the compiler's “loop unrolling” optimization for instruction pipelining
5. Describe how the compiler re‐arranges instructions to minimize data dependencies
6. Use the jump‐predict table for pipeline optimization
Exercise 1 – Difference between the sequential and the pipelined execution of CPU instructions
Enter the following source code, compile it and load it into the simulator’s memory:
program Ex1
for n = 1 to 20
p = p + 1
next
end
Make sure the No instruction pipeline check box is selected. In the CPU simulator window, bring the speed slider down to a reading of around 30. Run the program, observe the pipeline and wait for the program to complete. Then make a note of the following values:
CPI (Clocks Per Instruction):
SF (Speed‐up Factor):
Clocks:
Instruction count:
Next, uncheck the No instruction pipeline check box. Reset the program, run it again and wait for it to complete. Watch the pipeline to see how its behavior differs from the sequential run, then enter the same values in the table for the pipelined run:
CPI (Clocks Per Instruction):
SF (Speed‐up Factor):
Clocks:
Instruction count:
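The recorded values can be cross-checked with a few lines of Python. This is a minimal sketch, assuming that CPI is the clock count divided by the instruction count and that the speed-up factor is the ratio of sequential clocks to pipelined clocks; the simulator may compute SF differently. The function names and the readings are placeholders, not measured values.

```python
def cpi(clocks: int, instructions: int) -> float:
    """Average number of clocks spent per executed instruction."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return clocks / instructions


def speed_up(sequential_clocks: int, pipelined_clocks: int) -> float:
    """Assumed definition: sequential clocks divided by pipelined clocks."""
    if pipelined_clocks <= 0:
        raise ValueError("clock count must be positive")
    return sequential_clocks / pipelined_clocks


# Placeholder readings, NOT measured values: substitute your own table entries.
sequential = {"clocks": 500, "instructions": 100}
pipelined = {"clocks": 125, "instructions": 100}

print("sequential CPI:", cpi(**sequential))
print("pipelined CPI: ", cpi(**pipelined))
print("speed-up:      ", speed_up(sequential["clocks"], pipelined["clocks"]))
```

Because both runs execute the same instructions, the instruction count should be identical in the two tables; only the clocks, and therefore the CPI, should change.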
Exercise 2 – CPU pipeline data hazards, bubbles and the NOP instruction
A data hazard occurs when an operand value is not yet available at the point an instruction needs it. To demonstrate this, create a program (call it Ex2) and enter the following set of instructions.
MOV #3, R02
MOV #4, R03
MOV #5, R02
ADD R02, R03
HLT
Before running the program, make a note of the result you expect:
R03 = ?
Make sure the No instruction pipeline check box is NOT checked and the Do not insert bubbles check box is checked. Reset the program and run the above instructions. Make a note of the value in register R03 below:
R03 = ?
Now insert a NOP instruction (use the Miscellaneous tab) after the third instruction, i.e.
MOV #3, R02
MOV #4, R03
MOV #5, R02
NOP
ADD R02, R03
HLT
Reset the program, run the above set of instructions and note down the value:
R03 = ?
Explanation?
There are now three recorded values of R03. Briefly explain the result in each case.
Now delete the NOP instruction from the above program and uncheck the Do not insert bubbles option. Reset the program and run the instructions. Observe the value in register R03 when the program completes and make a note of it below:
R03= ? 9
Briefly explain why the same result is obtained without the NOP instruction.
Is the “bubble” still there? What colour is it?
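The three cases above can be reasoned about with a toy timing model. This is a minimal sketch, not the simulator's implementation: it assumes one instruction issues per cycle, that a result becomes readable `WRITE_DELAY` cycles after its instruction issues, and that `ADD R02, R03` adds R02 into R03. The function `run`, the tuple encoding and the delay of 2 are all hypothetical.

```python
WRITE_DELAY = 2  # assumption: a result is readable 2 cycles after its instruction issues


def run(program, insert_bubbles):
    """Run a tiny MOV/ADD/NOP/HLT program; return (registers, stall_cycles)."""
    regs = {}      # values already written to the register file
    pending = []   # (ready_cycle, register, value) writes still in flight
    cycle = 0
    stalls = 0

    def commit(now):
        # Write back every result whose delay has elapsed, oldest first.
        for entry in sorted(p for p in pending if p[0] <= now):
            regs[entry[1]] = entry[2]
            pending.remove(entry)

    for op, *args in program:
        commit(cycle)
        if op in ("NOP", "HLT"):
            cycle += 1
            continue
        sources = list(args) if op == "ADD" else []
        if insert_bubbles:
            # Hardware interlock: stall while a source register has a write in flight.
            while any(reg in sources for _, reg, _ in pending):
                cycle += 1
                stalls += 1
                commit(cycle)
        if op == "MOV":            # ("MOV", value, dest)
            value, dest = args
        else:                      # ("ADD", src, dest): dest = dest + src
            src, dest = args
            value = regs.get(src, 0) + regs.get(dest, 0)
        pending.append((cycle + WRITE_DELAY, dest, value))
        cycle += 1

    commit(cycle + WRITE_DELAY)    # drain the pipeline
    return regs, stalls


EX2 = [("MOV", 3, "R02"), ("MOV", 4, "R03"), ("MOV", 5, "R02"),
       ("ADD", "R02", "R03"), ("HLT",)]
EX2_NOP = EX2[:3] + [("NOP",)] + EX2[3:]

print(run(EX2, insert_bubbles=False))      # ADD reads R02 before MOV #5 has landed
print(run(EX2_NOP, insert_bubbles=False))  # the NOP gives the write time to land
print(run(EX2, insert_bubbles=True))       # the hardware stalls instead of the programmer
```

Under these assumed timings the unprotected run adds the stale value of R02, while both the NOP and the bubble delay the ADD by one cycle. Compare this with the values you recorded; the simulator's real timing may differ.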
Prepare the table:
CPI (Clocks Per Instruction):
Clocks:
Instruction count:
Data hazards:
Exercise 3 – A pipeline technique to eliminate data hazards

“Operand forwarding” is a hardware short‐cut that passes a result directly to the instruction waiting for it, so the operand is available sooner and the data hazard is removed.
To demonstrate this, check the box titled Enable operand forwarding and run the above code again.
Observation:
Is the bubble still there? Explain.
The simulator keeps a count of the pipeline hazards it detects as the instructions go through the
pipeline. These can be seen near the bottom of the pipeline window.
Prepare the table:
CPI (Clocks Per Instruction):
Clocks:
Instruction count:
Data hazards:
Explain the relationship between CPI, clocks and instruction count.
Has there been an improvement? Justify your answer.
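One way to reason about the improvement is the textbook model of an idealised pipeline, in which the clock count is the fill time plus one clock per remaining instruction plus any stall cycles. The sketch below assumes a five-stage pipeline and one stall cycle for the bubble; both numbers are assumptions, and the simulator's actual stage count and stall length may differ.

```python
def pipeline_clocks(instructions: int, stages: int, stall_cycles: int = 0) -> int:
    """Idealised pipeline: fill time, then one instruction per clock, plus stalls."""
    return stages + (instructions - 1) + stall_cycles


STAGES = 5  # assumption: not taken from the simulator
N = 5       # instructions in Ex2, counting HLT

for label, stalls in (("with bubble", 1), ("with forwarding", 0)):
    clocks = pipeline_clocks(N, STAGES, stalls)
    print(f"{label}: clocks={clocks}, CPI={clocks / N:.2f}")
```

The instruction count is the same in both cases, so every stall cycle removed lowers the clock count and the CPI directly.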

Exercise 4 – Loop unrolling optimization for minimizing control dependencies

This method duplicates the body of a loop once for each loop iteration, removing some redundant code as well as the loop’s compare and jump instructions. The cost is an increase in the code size of the program. “Loop unrolling” is considered well suited to instruction pipelining, which can take full advantage of it and so improve CPU performance. In this exercise we will check whether this is the case.
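The trade-off can be illustrated with a short Python sketch that counts executed instructions for a rolled and an unrolled version of the same eight-iteration loop. The cost model is an assumption made for illustration: one instruction for the loop body and three for the loop overhead (counter increment, compare, jump). The code generated by the simulator's compiler will have different counts.

```python
def rolled(iterations: int):
    """Loop kept as a loop: every pass pays for the body plus the loop overhead."""
    t = 0
    executed = 0
    for _ in range(iterations):
        t += 1          # loop body
        executed += 1   # assumed cost of the body: one instruction
        executed += 3   # assumed overhead: increment counter, compare, jump
    return t, executed


def unrolled_8():
    """The same 8 iterations written out; no counter, compare or jump remains."""
    t = 0
    t += 1
    t += 1
    t += 1
    t += 1
    t += 1
    t += 1
    t += 1
    t += 1
    executed = 8        # one assumed instruction per copy of the body
    return t, executed


print(rolled(8))     # same result, more instructions executed
print(unrolled_8())  # same result, fewer instructions, no jumps to disturb the pipeline
```

Note that the unrolled version grows with the iteration count, which is why the exercise asks you to compare code sizes as well as instructions executed.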
Enter the following code in the program compiler. In the optimization options, check Enable optimizer and check the Redundant Code option, then compile the program.
program Ex4_1
for n = 1 to 8
t=t+1
next
end
Make a note of the size of the code generated.
Next, make sure the Loop Unrolling optimization option is selected in addition to the Redundant Code option. Change the program name to Ex4_2 and compile it again. Load this code into memory as well, starting from location 100. This can be done by unchecking the base address option in the assembly code section of the program compiler.
There should now be two versions of the code: Ex4_1 without “loop unrolling” optimization and Ex4_2 with “loop unrolling” optimization. Both can be seen in the program window of the CPU simulator, as shown below.
Make a note of the size of the code generated for Ex4_2 here:
Click on show pipeline and check the keep on top box. Make sure that the Enable operand forwarding and Enable jump prediction boxes are both unchecked.
First, select program Ex4_1 from the PROGRAM LIST frame in the CPU simulator window, as shown in the figure below, and then click the RESET button.
Now run the program Ex4_1 at full speed and observe the pipeline. Prepare the following table:
CPI (Clocks Per Instruction):
SF (Speed‐up Factor):
No of instructions executed:
Data hazards:
Busy stage:
Control hazards:
Do the same with program Ex4_2 and make a note of the following values:
CPI (Clocks Per Instruction):
SF (Speed‐up Factor):
No of instructions executed:
Data hazards:
Busy stage:
Control hazards:
Briefly comment on the observations making references to the code sizes and the number of
instructions executed:
Exercise 5 – Compiler re‐arranging instruction sequence to help minimize data dependencies
The compiler re‐arranges instructions to minimize pipeline hazards such as the “data hazard” studied in Exercise 3. Make sure that the Show dependencies check box is checked and ONLY the Redundant Code optimization is selected. Enter the following source code, compile it and load it into memory.
Explain the redundancies shown in the figure below.
Copy the CPU instruction sequence generated below (do not include the instruction addresses):
Load it in memory and run. Observe the pipeline.
Next, select the optimization option Code Dependencies. Change the program name to Ex5_2,
compile it and load in memory.
Copy the CPU instruction sequence generated below:
Load it in memory and run. Observe the pipeline.

How do the two sequences differ? Does the change affect the logic of the program? Briefly explain the rationale for the change, and explain how the pipeline behavior differs between the two versions.
Next, measure the performance gained when the instructions are re‐arranged (executed out of their original sequence), using the following program:

program Ex5_3
for n = 1 to 30
a=2
b=a
c=5
next
end
Now compile and load two versions of the above program: one without the Code Dependencies optimization (Ex5_3) and one with this optimization (call this one program Ex5_4).
Repeat all the steps carried out for Ex5_1 and Ex5_2.
Do you see any improvement in program Ex5_4 over program Ex5_3? Express it as a percentage.
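The idea behind the re-arrangement can be sketched in Python by counting how often a statement reads a value written by the statement immediately before it. This is an illustration at source level only: the `(destination, sources)` encoding, the function names and the reordered sequence are assumptions, and the compiler's actual output should be taken from the simulator. The clock counts passed to `improvement_percent` are placeholders, not measurements.

```python
def adjacent_raw_hazards(sequence):
    """Count statements that read a value written by the statement just before them."""
    hazards = 0
    for (prev_dest, _), (_, sources) in zip(sequence, sequence[1:]):
        if prev_dest in sources:
            hazards += 1
    return hazards


def improvement_percent(before_clocks, after_clocks):
    """Reduction in clocks, as a percentage of the unoptimized clock count."""
    return 100.0 * (before_clocks - after_clocks) / before_clocks


# Loop body of Ex5_3, one (destination, sources) pair per statement.
original = [("a", ()), ("b", ("a",)), ("c", ())]    # a=2; b=a; c=5
reordered = [("a", ()), ("c", ()), ("b", ("a",))]   # a=2; c=5; b=a (assumed ordering)

print(adjacent_raw_hazards(original))   # b=a directly follows a=2
print(adjacent_raw_hazards(reordered))  # c=5 now separates the dependent pair
print(improvement_percent(200, 150))    # placeholder clock counts
```

Moving the independent statement c=5 between the dependent pair changes the order of execution but not the final values of a, b and c, which is why the program logic is unaffected.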
Exercise 6 – Jump predict table

Enter the following program and compile it with ONLY the Enable optimizer and Remove redundant code check boxes selected. Load the compiled program into the CPU simulator.
program Ex6
i=0
for p = 1 to 40
i=i+1
if i = 10 then
i=0
r=i
end if
next
end
Run the program and make a note of the following pipeline statistics:
Now, in the pipeline window select the Enable jump prediction check box. Reset the
program and run it again. Make a note of the following pipeline statistics
Explain the difference.

Click the SHOW JUMP TABLE… button and observe the Jump Predict Table window that appears. This table keeps an entry for each conditional jump instruction. Each entry has the following fields. What does each field stand for? Enter your suggestions in the table below:
V
JInstAddr
JTarget
PStat
Count
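To see why a prediction table can help with Ex6, the sketch below replays the outcome of one conditional jump over the 40 loop passes and counts mispredictions. It is not the simulator's algorithm: it assumes the `if i = 10` test compiles to a jump that skips the if-body whenever i is not 10, and it uses a common textbook scheme, a 2-bit saturating counter kept per jump address. The function names and the jump address are hypothetical.

```python
def ex6_jump_outcomes(iterations=40, reset_at=10):
    """Outcome of the assumed 'skip the if-body' jump on each pass of the Ex6 loop."""
    outcomes = []
    i = 0
    for _ in range(iterations):
        i += 1
        taken = i != reset_at   # jump over the body unless i = 10
        if not taken:
            i = 0               # the if-body resets i
        outcomes.append(taken)
    return outcomes


def count_mispredictions(outcomes, jump_address=0):
    """Predict each outcome with a 2-bit saturating counter stored per jump address."""
    table = {}                  # jump address -> counter in the range 0..3
    misses = 0
    for taken in outcomes:
        counter = table.get(jump_address, 0)  # unseen jumps start at 'strongly not taken'
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move the counter towards the actual outcome, saturating at 0 and 3.
        table[jump_address] = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


outcomes = ex6_jump_outcomes()
print("jump taken on", sum(outcomes), "of", len(outcomes), "passes")
print("mispredictions:", count_mispredictions(outcomes))
```

Because the jump behaves the same way on most passes, a table that remembers its recent behavior guesses correctly most of the time. Compare this with the change in control hazards you recorded after enabling jump prediction.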
