ARCHITECTURES - 1

Mariagiovanna Sami

31/07/2013

Architecture: which definition?

Abstract architecture: the functional specification of a computer. Concrete architecture: an implementation of an abstract architecture. An abstract architecture is a black-box specification; the specification of a machine can be seen from two points of view:

Architecture definition (2)

From the programmer's point of view we deal with a programming model, equivalent to a description of the machine language; from the designer's point of view we deal with a hardware model (a black-box description for the designer: it must include additional information, e.g., interface protocols).

Architecture definition (3)

Usually, "architecture" denotes abstract architecture. Concrete architecture is often called microarchitecture (a term originally created for microprogrammed CPUs, later extended more generally to the structural description in terms of functional units and interconnections).

Where do we start from?

Background: the Von Neumann paradigm (and the Harvard alternative); extension to a reactive paradigm (still Von Neumann!).

An Architectural Paradigm:

A composition of hardware and program execution mode; it does not include the software itself, but it implies the execution mode of object code!

The classical V.N. abstract architecture:


[Diagram: CPU, comprising ALU and Control Unit, connected to Memory and I/O]

Programming style: imperative, control-flow dominated

One address space in memory: information is identified by its address. Machine instructions are stored sequentially: the natural order of fetching and execution is by increasing address values, and execution follows the same sequential order. Variables are identified by names, translated into addresses.

The Control Flow:

The C.U. determines the address of the next instruction to be executed, as contained in the Program Counter (PC), and fetches it from memory; the C.U. then decodes the instruction and controls its execution by issuing proper commands to the ALU and to memory. Simultaneously, the address of the next instruction is computed: as a rule, the next instruction is the one immediately following the current one (address computed by incrementing the PC), unless otherwise explicitly stated by a control instruction.
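The cycle above can be sketched as follows. The toy instruction set (LOAD/ADD/JUMP/HALT over a single accumulator) is invented for illustration and is not part of the lecture:

```python
# Hypothetical toy machine illustrating the fetch-decode-execute cycle.
# Instructions and data share one memory (Von Neumann): data words are
# stored as ("DATA", value) entries in the same address space.

def run(memory, pc=0):
    acc = 0                            # accumulator standing in for the ALU
    while True:
        op, arg = memory[pc]           # fetch: instruction addressed by PC
        next_pc = pc + 1               # default: immediately sequential
        if op == "LOAD":
            acc = memory[arg][1]       # read a data word from memory
        elif op == "ADD":
            acc += memory[arg][1]
        elif op == "JUMP":             # control instruction: explicit transfer
            next_pc = arg
        elif op == "HALT":
            return acc
        pc = next_pc                   # "increment PC unless stated otherwise"

# One address space holds both the program and its variables:
program = [("LOAD", 3), ("ADD", 4), ("HALT", 0), ("DATA", 2), ("DATA", 5)]
run(program)  # returns 7
```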

Control-dominated execution:

Control is implicitly determined by the ordering of instructions in the program, or explicitly modified by jump/branch instructions: execution is inherently sequential and serial.

The basic approach: C.U. the only active unit

All transfers to/from memory are controlled by the C.U.; I/O is initiated by instructions in the program (program-controlled I/O, polling): the C.U. activates the transfer channels. All actions are de facto synchronized by execution of the program.
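Program-controlled I/O can be sketched as below; the Device class and its ready/data registers are hypothetical stand-ins for a real device interface:

```python
class Device:
    """Hypothetical I/O device exposing a status flag and a data register."""
    def __init__(self, data):
        self.ready = True              # for this sketch, the device is ready
        self.data = data

def poll_and_read(device):
    # Busy-wait (polling): the program itself tests the status flag, and
    # the CPU performs no other useful work inside this loop; the transfer
    # happens only when program execution reaches this point.
    while not device.ready:
        pass
    return device.data

poll_and_read(Device(42))  # returns 42
```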

The Harvard variant...

Basically, it separates program and data memory:


[Diagram: CPU (ALU + Control Unit) connected to separate Program Memory and Data Memory, plus I/O]

Performance Evaluation...

Made with reference to a set of benchmark programs (often synthetic). For every instruction in the machine's Instruction Set (IS) the total time required (fetch + execute) is known; profiling (execution of the program with suitable sets of data) gives the dynamic sequence of instructions executed.

Performance Evaluation (2)


Total time required by execution of the program = sum of the times required by all instructions in the dynamic sequence of execution. (Instructions may have different latencies depending on the specific operations, the necessity of accessing memory to read/write data, or even the length of the instruction itself.) Performance is optimized through the choice of the best algorithm plus the least time-consuming instructions.
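This evaluation amounts to a weighted sum over the dynamic trace; the latency table below uses invented cycle counts, not figures from any real machine:

```python
# Per-instruction latencies in clock cycles (assumed values): the total
# fetch + execute time for each entry of the Instruction Set.
latency = {"LOAD": 5, "STORE": 5, "ADD": 1, "BRANCH": 2}

def total_cycles(dynamic_trace):
    # Total time = sum of the times of all instructions actually executed,
    # in the order produced by profiling.
    return sum(latency[instr] for instr in dynamic_trace)

trace = ["LOAD", "ADD", "ADD", "STORE", "BRANCH", "LOAD", "ADD", "STORE"]
total_cycles(trace)  # 5+1+1+5+2+5+1+5 = 25 cycles
```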

(Some of) the bottlenecks

Memory is slower than logic: a larger (and less costly) memory means a wider gap, but an ever larger addressable memory space is requested! Execution is totally serial: an instruction must be completed before its successor is fetched from memory; technology dominates instruction latency and overall performance.

Bottlenecks (2)

If a reactive system is designed (typically, an application-specific or embedded system), an external event created by an I/O device is serviced only when the device is polled by the program: real-time behaviour is only as good as the programmer can make it!

So, how to achieve better performances?

Modify the memory structure so that the programmer sees a very large addressable space while the CPU sees a fast equivalent memory; achieve better efficiency in the execution of the instruction sequence; allow servicing of external events as they arise, asynchronously with respect to program execution.

Starting from the bottom...

Servicing external events? The solution was born with the first minicomputers (early '60s): the interrupt (an external unit may initiate an action; execution of the servicing routine is then controlled by the C.U.).
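The contrast with polling can be sketched by extending the fetch-execute loop with an interrupt-request check between instructions; all names here are illustrative:

```python
def run_with_interrupts(program, irq_pending, service_routine):
    # Between instructions the C.U. samples the interrupt-request line;
    # if raised, the servicing routine runs under C.U. control before
    # normal sequential execution resumes.
    pc = 0
    while pc < len(program):
        if irq_pending():              # request raised asynchronously
            service_routine()
        program[pc]()                  # execute the current instruction
        pc += 1

log = []
prog = [lambda: log.append("i1"), lambda: log.append("i2")]
requests = iter([False, True])         # a device raises a request mid-run
run_with_interrupts(prog, lambda: next(requests), lambda: log.append("irq"))
# log is now ["i1", "irq", "i2"]: the event was serviced without any
# polling code inside the program itself.
```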

Getting better efficiency for instruction execution?

A first approach: create instructions capable of executing complex operations (the object code becomes more compact; one instruction fetched from memory executes actions previously performed by a sequence of instructions). Drawbacks: a more complex C.U. (longer clock period); identifying useful complex instructions for general-purpose CPUs is difficult.

Complex instructions
Still, the solution has been widely adopted: CISC machines were a winning approach for a long time. It may be very useful when specialized tasks are widely used (e.g., DSP or image processing) or for application-specific CPUs.

Getting better efficiency for instruction execution the alternative

Modify the structure of the CPU and the execution paradigm to introduce parallelism and overcome the serial-execution bottleneck. But which kind of parallelism? Parallelism has to be detected within the application: at which level?

What about the memory problem?

Introduce a hierarchy of memories: large, slow (and cheap) ones at the bottom; fast, small (and costly) ones at the top (nearest the CPU). Allow a wider memory bandwidth: more than one unit of information at a time is transferred from memory to CPU (or between memories).
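The effect of the hierarchy can be quantified with the usual average-memory-access-time relation; the hit rate and timings below are illustrative, not from the lecture:

```python
def average_access_time(hit_rate, hit_time, miss_penalty):
    # AMAT = hit_time + miss_rate * miss_penalty: with a high hit rate
    # the CPU mostly sees the fast top level of the hierarchy, while the
    # programmer addresses the large bottom level.
    return hit_time + (1.0 - hit_rate) * miss_penalty

average_access_time(0.95, 1, 100)  # 1 + 0.05 * 100 = 6.0 cycles on average
```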

Memory (2)
In fact: a hierarchy does not imply any assumption on the mode of execution other than serial, but it requires extensions to the hardware structure controlling memory access; a larger bandwidth is meaningful only if some form of parallelism is adopted.

What these lectures will be about:

Memory hierarchy: it is assumed that the basic points are already known (e.g., virtual memory and its hardware supports; the scope of cache memory...). Attention will be given to cache organization and performance; technological aspects are not discussed here (other courses...).

What these lectures will be about (2):

Parallelism: from within the CPU up to system level, taking into account the characteristics of application-specific systems:

Pipelining; Instruction-Level Parallelism (ILP); multi-threading; multi-processor systems.



Course organization

Lectures; exercises; use of tools for architecture evaluation and design:

Analysis of an application's behaviour given a fixed architecture; design of a specific architecture for a given application.

Texts:

Slides are available in the Master's repository. Suggested readings: a list will be circulated (books available in the Library, papers accessible via the Internet or provided in hardcopy). Manuals of the software tools are available in the repository.