Course practicalities
Literature:
Gajski book. I will find additional reading in the form of papers.
Course Web
(http://kom.aau.dk/~abo/Teaching/DSP_alg_arch/index.htm)
1947
MM1 DSP Algorithms and Architectures
Today's processors
2008: > 300M transistors, > 3000 MHz operation, ~150 mm²
The A3 Paradigm
Application
LP-filter (specification)
Algorithm
FIR, IIR (parallel), IIR (cascade)
Architecture
DSP, µ-controller, ASIC/FPGA
Design dedicated architectures that fit our algorithmic demands. CAD tools typically help us, but we need to know why and how.
The A3 Paradigm
Application
LP-filter (specification)
Specifications
This course
Design representation
[Figure: levels of design representation: module, gate, circuit, device (transistor with G, S, D terminals and n+ regions)]
Mixed strategies
Mostly top-down, but also bits of bottom-up
Reality: need to know both top-level and bottom-level constraints
[Figure: real-time DSP system: digital bit stream, analog reconstructor, output Y(n)]
FIR:
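The FIR equation on this slide did not survive in the text. As a minimal sketch, assuming the usual convolution-sum definition y(n) = sum over k of h(k)·x(n-k) with zero initial state, a direct-form FIR filter could look like the following. The function name `fir_filter` and the 4-tap moving-average coefficients are illustrative assumptions, not taken from the course material.

```python
def fir_filter(h, x):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]  # one multiply-accumulate (MAC) per tap
        y.append(acc)
    return y


# Hypothetical 4-tap moving average applied to a unit step.
h = [0.25, 0.25, 0.25, 0.25]
x = [1.0] * 6
y = fir_filter(h, x)
print(y)
```

Each output sample costs one multiply-accumulate per tap, which is why the MAC operation is central to the architectures discussed next.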
General architectures
µ-controllers
General Purpose Processors (GPP)
Application Specific Instruction-set Processor (ASIP)
Digital Signal Processors (DSP)
Application Specific Integrated Circuit (ASIC)
Field Programmable Gate Array (FPGA)
[Figure: processor block diagram: Control, Mem, ALU]
[Figure: processor block diagram with multiplier: MUL, Control, Mem, ALU]
ARM7
[Figure: block diagram: Control Path, Mem, Control, Mem]
TMS32010
[Figure: block diagram: Mem, Control, ALU]
Blackfin architecture
Using the address arithmetic unit, the core of the above algorithm becomes a single line of parallel instructions:
    . .
    A0 += data1*data2 || A1 += data3*data4;
    . .
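The algorithm that the slide calls "the above algorithm" is not reproduced in the text. As a rough Python model of what such a dual-MAC line computes, assume that A0 and A1 are two accumulators that each receive one product per cycle, and that a dot product is split into even and odd terms. The function name `dual_mac_dot` and the even/odd split are assumptions for illustration only.

```python
def dual_mac_dot(a, b):
    """Dot product using two accumulators, two products per loop iteration."""
    assert len(a) == len(b) and len(a) % 2 == 0
    a0 = 0  # models accumulator A0
    a1 = 0  # models accumulator A1
    cycles = 0
    for i in range(0, len(a), 2):
        # Models: A0 += data1*data2 || A1 += data3*data4;
        a0 += a[i] * b[i]
        a1 += a[i + 1] * b[i + 1]
        cycles += 1
    return a0 + a1, cycles


result, cycles = dual_mac_dot([1, 2, 3, 4], [5, 6, 7, 8])
print(result, cycles)
```

In this model a four-term dot product takes two loop iterations instead of four, at the cost of a second multiplier and accumulator.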
FPGA
Customized for a particular use, using programmable logic components and programmable interconnecting buses
From an algorithmic point of view, the design methodologies are more or less similar for the two
Multiplexed (HW-sharing)
Cost T, A (+ Control)
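To illustrate the trade-off, assume that T and A denote time and area, and consider an N-tap FIR filter whose multiplications are either done in parallel or shared on fewer multipliers. The cost model below (unit area per multiplier, one multiplication per multiplier per cycle, control and multiplexer overhead ignored) is a simplifying assumption, not a figure from the slides.

```python
def datapath_cost(num_taps, num_multipliers):
    """Return (area, cycles_per_output) for an FIR with shared multipliers.

    Assumes unit area per multiplier and one multiplication per multiplier
    per cycle; control and multiplexer overhead are ignored.
    """
    cycles = -(-num_taps // num_multipliers)  # ceiling division
    return num_multipliers, cycles


for m in (4, 2, 1):
    area, cycles = datapath_cost(4, m)
    print(f"{m} multiplier(s): area={area}, cycles per output={cycles}")
```

Under these assumptions the product of area and time stays constant, while the real multiplexed design pays an additional cost for control.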
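[Figure: single transfer function with output Y[n], operation time T1]
[Figure: cascade X[n] -> Ha -> Hb -> Hc -> Hd -> Y[n]]
T2 = Ta + Tb + Tc + Td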
The operation time of a given transfer function is obviously dependent on the algorithmic complexity, but also on the implementation technology used.
[Figure: pipelined cascade: Ha, Hb, Hc, Hd separated by latches, output Y[n-3]]
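The labels on this slide (Ha, Latch, Hb, Hc, Hd, Y[n-3]) suggest a pipelined version of the cascade. A minimal sketch, assuming latches between all four stages: the clock period is then set by the slowest stage rather than by the sum Ta+Tb+Tc+Td, at the price of three samples of latency. The stage functions and the delay values are hypothetical, and latch overhead is ignored.

```python
def simulate_pipeline(stages, samples):
    """Clock samples through stages separated by latches.

    Latches start out empty (None). On each clock tick every stage
    processes the value held in the latch in front of it.
    """
    latches = [None] * (len(stages) - 1)  # latch i sits between stage i and i+1
    outputs = []
    for x in samples:
        # The last stage reads the last latch before any latch is updated.
        last_in = latches[-1]
        outputs.append(None if last_in is None else stages[-1](last_in))
        # Update latches back to front so each one sees last tick's value.
        for i in range(len(latches) - 1, 0, -1):
            prev = latches[i - 1]
            latches[i] = None if prev is None else stages[i](prev)
        latches[0] = stages[0](x)
    return outputs


# Hypothetical stage functions standing in for Ha, Hb, Hc, Hd.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]
outputs = simulate_pipeline(stages, [1, 2, 3, 4, 5])
print(outputs)

# Hypothetical stage delays in ns.
delays = {"Ta": 2, "Tb": 5, "Tc": 3, "Td": 4}
t_unpipelined = sum(delays.values())  # T2 = Ta + Tb + Tc + Td
t_pipelined = max(delays.values())    # clock period set by the slowest stage
print(t_unpipelined, t_pipelined)
```

The first three outputs are empty because the pipeline has to fill; after that, one result leaves the pipeline per clock tick.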
[Figure: X[n] feeding blocks Ha, Hb, Hc, Hd, output Y[n]]
Graphical Representations
Data Flow Graphs (DFG)
Control Flow Graphs (CFG)
Control Data Flow Graphs (CDFG)
State Transition Graphs (STG)
nodes (or vertices)
edges (or arcs)
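As a small illustration of nodes and edges, here is a sketch of a data flow graph for a hypothetical 2-tap FIR, y = h0*x0 + h1*x1, stored as a dictionary and evaluated in dependency order. The graph encoding and the `evaluate` helper are assumptions for illustration, not a notation used in the course.

```python
import operator

# Each node: name -> (operation, list of input node names).
# The input lists are the edges (arcs) of the graph.
dfg = {
    "m0": (operator.mul, ["h0", "x0"]),
    "m1": (operator.mul, ["h1", "x1"]),
    "y": (operator.add, ["m0", "m1"]),
}


def evaluate(graph, inputs, node):
    """Evaluate a node by recursively evaluating the nodes feeding it."""
    if node in inputs:
        return inputs[node]
    op, preds = graph[node]
    return op(*(evaluate(graph, inputs, p) for p in preds))


values = {"h0": 2, "h1": 3, "x0": 10, "x1": 20}
print(evaluate(dfg, values, "y"))
```

The two multiplication nodes have no edge between them, so the graph itself shows that they can run in parallel.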
Cost Functions
Implementation quality is determined by cost functions
noise, power, area, time
Noise: wordlength
Power: technology
Area: circuit
Time: the three above
Interaction
Architecture
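[Figure: switching waveform v(t) between 0 and 1, from t0 to t1]
Short Circuit:
[Figure: inverter current I versus Vin and Vout]
Leakage:
[Figure: Ids versus Vgs with Ioff and Vth marked; Vin = 0, Vout = Vdd]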
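For the switching contribution on this slide, a commonly used first-order model is P = alpha * C * Vdd^2 * f, where alpha is the switching activity, C the switched capacitance, Vdd the supply voltage and f the clock frequency. The sketch below only evaluates this model; the parameter values are hypothetical, and short-circuit and leakage power are not included.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order CMOS switching power: P = alpha * C * Vdd^2 * f (watts)."""
    return alpha * c_load * vdd ** 2 * freq


# Hypothetical values: activity 0.1, 1 nF switched capacitance, 1 GHz clock.
p_high = dynamic_power(0.1, 1e-9, 1.2, 1e9)  # Vdd = 1.2 V
p_low = dynamic_power(0.1, 1e-9, 0.6, 1e9)   # Vdd = 0.6 V
print(p_high, p_low, p_high / p_low)
```

Because the supply voltage enters quadratically, halving Vdd in this model cuts the switching power to roughly a quarter, all else being equal.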
Circuit Delay:
Design Representation
Behavioral or functional representation
Specifies the behavior or the functions of a design without any implementation information
Structural representation
Specifies the implementation of a design in terms of components and their interactions
Physical representation
Specifies the physical characteristics of the design (Blueprint for manufacturing)
Plain English
Algorithm
State machine, ALU, Regs
Gate level netlist
Transistor list
Product
Implementation Technologies
HW Design Abstraction
Processor-Memory Level
RT Level
Logic Gates
Transistors
Polygons of Silicon
Y-Chart
[Figure: Y-chart with labels Algorithm, RT Language, Boolean Eqn, Transistor, Differential Eqn]
[Figure: cost versus performance; annotation: only SW, low cost and low performance]
Constraints
System-level HW-SW Co-design
Interconnect and buses
SW behavior, RTOS, schedule policy and processors
Performance analysis (timing, power, area)
Design and optimization (timing, power, area)
Architecture selection: processing elements, memory units and interconnect
RTOS and scheduling scheme
Design flow
Algorithm
FIR, IIR (parallel), IIR (cascade)
Architecture
DSP, µ-controller, ASIC
Summary
Algorithms and Architectures
Data path
Control path
Algorithmic properties
Following courses
Architectural optimization (mm2-mm3)
Scheduling concepts (mm4-mm5)
Exercises
Gajski: 1.1, 1.4, and 1.8
Cost functions: Discuss power vs. energy optimization
Why is there a difference?
How can you optimize energy, only taking the dynamic contribution into account?
Take as your starting point the paper by C.H. Wang, Algorithmic Implementation of Low-Power High Performance FIR Filtering IP Cores (Hint: only sections 1 and 2).
For these exercises you should prepare a few notes such that you can present your findings next Thursday (no more than 5 minutes).
Gr840: Find the various representation forms of the FIR filter used, write them in mathematical form, and make a block-diagram representation.
Gr841: Discuss or verify that the data-path in figure 2 is reasonable and try to map the algorithms onto it.
Gr842: Make a 1:1 mapping and propose an architecture for a four-tap FIR filter.