ARCHITECTURES - 1

Mariagiovanna Sami

31/07/2013

Architecture: which definition?

Abstract architecture: the functional specification of a computer. Concrete architecture: an implementation of an abstract architecture. An abstract architecture is a black-box specification; the specification of a machine can be seen from two points of view:

Architecture definition (2)

From the programmer's point of view we deal with a programming model, equivalent to a description of the machine language; from the designer's point of view we deal with a hardware model (a black-box description for the designer: it must include additional information, e.g., interface protocols).

Architecture definition (3)

Usually, "architecture" denotes abstract architecture. Concrete architecture is often called microarchitecture (a term originally created for microprogrammed CPUs, later extended more generally to the structural description in terms of functional units and interconnections).

Where do we start from?

Background: the Von Neumann paradigm (and the Harvard alternative); extension to a reactive paradigm (still Von Neumann!).

An Architectural Paradigm:

A composition of hardware and program execution mode; it does not include the software itself, but it implies the execution mode of object code!

The classical V.N. abstract architecture:


[Diagram: CPU, comprising ALU and Control Unit, connected to Memory and I/O]

Programming style: imperative, control-flow dominated

One address space in memory: information is identified by its address. Machine instructions are stored sequentially: the natural order of fetching and execution is by increasing address values, and execution follows the same sequential order. Variables are identified by names, translated into addresses.

The Control Flow:

The C.U. determines the address of the next instruction to be executed, as contained in the Program Counter (PC), and fetches it from memory; the C.U. then decodes the instruction and controls its execution by issuing proper commands to the ALU and to memory. Simultaneously, the address of the next instruction is computed: as a rule, the next instruction is the one immediately following the current one (address computed by incrementing the PC), unless otherwise explicitly stated by a control instruction.
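The cycle above can be sketched as follows. The toy instruction set (LOAD/ADD/JUMP/HALT over a single accumulator) is invented for illustration and is not part of the lecture:

```python
# Hypothetical toy machine illustrating the fetch-decode-execute cycle.
# Instructions and data share one memory (Von Neumann): data words are
# stored as ("DATA", value) entries in the same address space.

def run(memory, pc=0):
    acc = 0                            # accumulator standing in for the ALU
    while True:
        op, arg = memory[pc]           # fetch: instruction addressed by PC
        next_pc = pc + 1               # default: immediately sequential
        if op == "LOAD":
            acc = memory[arg][1]       # read a data word from memory
        elif op == "ADD":
            acc += memory[arg][1]
        elif op == "JUMP":             # control instruction: explicit transfer
            next_pc = arg
        elif op == "HALT":
            return acc
        pc = next_pc                   # "increment PC unless stated otherwise"

# One address space holds both the program and its variables:
program = [("LOAD", 3), ("ADD", 4), ("HALT", 0), ("DATA", 2), ("DATA", 5)]
run(program)  # returns 7
```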

Control-dominated execution:

Control is implicitly determined by the ordering of instructions in the program, or explicitly modified by jump/branch instructions: execution is inherently sequential and serial.

The basic approach: C.U. the only active unit

All transfers to/from memory are controlled by the C.U.; I/O is initiated by instructions in the program (program-controlled I/O, polling): the C.U. activates the transfer channels. All actions are de facto synchronized by execution of the program.
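Program-controlled I/O can be sketched as below; the Device class and its ready/data registers are hypothetical stand-ins for a real device interface:

```python
class Device:
    """Hypothetical I/O device exposing a status flag and a data register."""
    def __init__(self, data):
        self.ready = True              # for this sketch, the device is ready
        self.data = data

def poll_and_read(device):
    # Busy-wait (polling): the program itself tests the status flag, and
    # the CPU performs no other useful work inside this loop; the transfer
    # happens only when program execution reaches this point.
    while not device.ready:
        pass
    return device.data

poll_and_read(Device(42))  # returns 42
```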

The Harvard variant...

Basically, it separates program and data memory:


[Diagram: CPU (ALU + Control Unit) connected to separate Program Memory and Data Memory, plus I/O]

Performance Evaluation...

Made with reference to a set of benchmark programs (often synthetic). For every instruction in the machine's Instruction Set (IS) the total time required (fetch + execute) is known; profiling (execution of the program with suitable sets of data) gives the dynamic sequence of instructions executed.

Performance Evaluation (2)


Total time required by execution of the program = sum of the times required by all instructions in the dynamic sequence of execution. (Instructions may have different latencies depending on the specific operations, the necessity of accessing memory to read/write data, or even the length of the instruction itself.) Performance is optimized through the choice of the best algorithm plus the least time-consuming instructions.
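This evaluation amounts to a weighted sum over the dynamic trace; the latency table below uses invented cycle counts, not figures from any real machine:

```python
# Per-instruction latencies in clock cycles (assumed values): the total
# fetch + execute time for each entry of the Instruction Set.
latency = {"LOAD": 5, "STORE": 5, "ADD": 1, "BRANCH": 2}

def total_cycles(dynamic_trace):
    # Total time = sum of the times of all instructions actually executed,
    # in the order produced by profiling.
    return sum(latency[instr] for instr in dynamic_trace)

trace = ["LOAD", "ADD", "ADD", "STORE", "BRANCH", "LOAD", "ADD", "STORE"]
total_cycles(trace)  # 5+1+1+5+2+5+1+5 = 25 cycles
```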

(Some of) the bottlenecks

Memory is slower than logic: a larger (and less costly) memory means a wider gap, but an ever larger addressable memory space is requested! Execution is totally serial: an instruction must be completed before its successor is fetched from memory; technology dominates instruction latency and overall performance.

Bottlenecks (2)

If a reactive system is designed (typically, an application-specific or embedded system), an external event created by an I/O device is serviced only when the device is polled by the program: real-time behaviour is only as good as the programmer can make it!

So, how to achieve better performances?

Modify the memory structure so that the programmer sees a very large addressable space while the CPU sees a fast equivalent memory; achieve better efficiency in the execution of the instruction sequence; allow servicing of external events as they arise, asynchronously with respect to program execution.

Starting from the bottom...

Servicing external events? The solution was born with the first minicomputers (early '60s): the interrupt (an external unit may initiate an action; execution of the servicing routine is then controlled by the C.U.).
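The contrast with polling can be sketched by extending the fetch-execute loop with an interrupt-request check between instructions; all names here are illustrative:

```python
def run_with_interrupts(program, irq_pending, service_routine):
    # Between instructions the C.U. samples the interrupt-request line;
    # if raised, the servicing routine runs under C.U. control before
    # normal sequential execution resumes.
    pc = 0
    while pc < len(program):
        if irq_pending():              # request raised asynchronously
            service_routine()
        program[pc]()                  # execute the current instruction
        pc += 1

log = []
prog = [lambda: log.append("i1"), lambda: log.append("i2")]
requests = iter([False, True])         # a device raises a request mid-run
run_with_interrupts(prog, lambda: next(requests), lambda: log.append("irq"))
# log is now ["i1", "irq", "i2"]: the event was serviced without any
# polling code inside the program itself.
```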

Getting better efficiency for instruction execution?

A first approach: create instructions capable of executing complex operations (the object code becomes more compact; one instruction fetched from memory executes actions previously performed by a sequence of instructions). Drawbacks: a more complex C.U. (longer clock period); identifying useful complex instructions for general-purpose CPUs is difficult.

Complex instructions
Still, the solution has been widely adopted: CISC machines were a winning approach for a long time. It may be very useful when specialized tasks are widely used (e.g., DSP or image processing) or for application-specific CPUs.

Getting better efficiency for instruction execution the alternative

Modify the structure of the CPU and the execution paradigm to introduce parallelism and overcome the serial-execution bottleneck. But which kind of parallelism? Parallelism has to be detected within the application: at which level?

What about the memory problem?

Introduce a hierarchy of memories: large, slow (and cheap) ones at the bottom; fast, small (and costly) ones at the top (nearest the CPU). Allow a wider memory bandwidth: more than one unit of information at a time is transferred from memory to CPU (or between memories).
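The effect of the hierarchy can be quantified with the usual average-memory-access-time relation; the hit rate and timings below are illustrative, not from the lecture:

```python
def average_access_time(hit_rate, hit_time, miss_penalty):
    # AMAT = hit_time + miss_rate * miss_penalty: with a high hit rate
    # the CPU mostly sees the fast top level of the hierarchy, while the
    # programmer addresses the large bottom level.
    return hit_time + (1.0 - hit_rate) * miss_penalty

average_access_time(0.95, 1, 100)  # 1 + 0.05 * 100 = 6.0 cycles on average
```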

Memory (2)
In fact: a hierarchy does not imply any assumption on the mode of execution other than serial, but it requires extensions to the hardware structure controlling memory access; a larger bandwidth is meaningful only if some form of parallelism is adopted.

What these lectures will be about:

Memory hierarchy: it is assumed that the basic points are already known (e.g., virtual memory and its hardware supports; the scope of cache memory...). Attention will be given to cache organization and performance; technological aspects are not discussed here (other courses...).

What these lectures will be about (2):

Parallelism: from within the CPU up to system level, taking into account the characteristics of application-specific systems:

Pipelining; Instruction-Level Parallelism (ILP); multi-threading; multi-processor systems.



Course organization

Lectures; exercises; use of tools for architecture evaluation and design:

Analysis of an application's behaviour given a fixed architecture; design of a specific architecture for a given application.

Texts:

Slides are available in the Master's repository. Suggested readings: a list will be circulated (books available in the Library, papers accessible via the Internet or provided in hardcopy). Manuals of the software tools are available in the repository.