Architecture
Outline
CPU Architecture
Instruction Set Overview
Internal Buses & Memory
C6000 Peripherals Overview
Device Family Review
CPU
[Figure: digital sampling of an analog signal into code; sample period T = 1/fs]
$Y = \sum_{n=1}^{40} a_n * x_n$
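As a point of reference before looking at any assembly, the sum of products can be sketched in C. This is only an illustrative model: the function name, the 16-bit input type, and the 32-bit accumulator are assumptions made here, not something the slides specify.

```c
#include <stdint.h>

/* Reference model of Y = sum of a[n] * x[n] over `count` terms.
 * Assumption: 16-bit inputs and a 32-bit running sum. */
int32_t sum_of_products(const int16_t *a, const int16_t *x, int count)
{
    int32_t y = 0;
    for (int n = 0; n < count; n++) {
        /* One multiply and one add per term. */
        y += (int32_t)a[n] * (int32_t)x[n];
    }
    return y;
}
```

Calling it with count = 40 corresponds to the 40-term sum shown on the slide.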
Multiply
$Y = \sum_{n=1}^{40} a_n * x_n$
        MPY  .?   a, x, prod
The multiply executes on the .M unit:
        MPY  .M   a, x, prod
Add
$Y = \sum_{n=1}^{40} a_n * x_n$
        MPY  .M   a, x, prod
        ADD  .?   Y, prod, Y
The add executes on the .L unit:
        MPY  .M   a, x, prod
        ADD  .L   Y, prod, Y
Where are the variables stored?
Register File - A
Register File A holds registers A0 to A31, each 32 bits wide:
        A0    a
        A1    x
        A2
        A3    prod
        A4    Y
        ...
        A31
$Y = \sum_{n=1}^{40} a_n * x_n$
        MPY  .M   a, x, prod
        ADD  .L   Y, prod, Y
With the register names in place of the variable names:
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
Loop?
Creating a Loop
1. Add branch instruction (B) and a label
2. Create a loop counter (= 40)
3. Add an instruction to decrement the loop counter
4. Make the branch conditional based on the value in the loop counter
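The four steps above can be mirrored in C with a label and a goto, which is close in shape to a branch instruction. This is a rough sketch only: the function name and the pass counter are hypothetical stand-ins for the real loop body, and the start value is assumed to be at least 1.

```c
/* Returns how many times the loop body runs for a given start value.
 * Assumption: start >= 1, as with a loop count of 40. */
int run_counted_loop(int start)
{
    int counter = start;   /* step 2: create the loop counter */
    int passes = 0;

loop:                      /* step 1: label that the branch targets */
    passes++;              /* stand-in for the loop body (multiply, add) */
    counter--;             /* step 3: decrement the loop counter */
    if (counter != 0)      /* step 4: branch only while the counter is nonzero */
        goto loop;

    return passes;
}
```

With a start value of 40 the body runs 40 times, and the branch falls through once the counter reaches zero.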
Branching (1)
$Y = \sum_{n=1}^{40} a_n * x_n$
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        B    .?   loop
The branch executes on the .S unit:
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        B    .S   loop
MVK loads the loop count into A2:
        MVK  .S   40, A2          ; A2 = 40
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        B    .S   loop
SUB decrements the loop counter on each pass:
        MVK  .S   40, A2
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
        B    .S   loop
Conditional Instructions
To minimize branching, all instructions are conditional
[condition]   B   loop
Execute instruction if:
        [cond]     true:    cond ≠ 0
        [!cond]    false:   cond = 0
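The effect of a condition on an instruction can be modeled in C as a guarded update. The sketch below uses an add as the example; the function names and the choice of returning the destination value are assumptions for illustration, not TI syntax.

```c
#include <stdint.h>

/* Model of "[cond] ADD src1, src2, dst":
 * the result is written only when cond is nonzero,
 * otherwise dst keeps its old value. */
int32_t add_if(int32_t cond, int32_t src1, int32_t src2, int32_t dst)
{
    return (cond != 0) ? (src1 + src2) : dst;
}

/* Model of "[!cond] ADD src1, src2, dst":
 * the result is written only when cond is zero. */
int32_t add_if_not(int32_t cond, int32_t src1, int32_t src2, int32_t dst)
{
    return (cond == 0) ? (src1 + src2) : dst;
}
```

The same guard applied to a branch gives a loop that keeps going while the counter is nonzero and falls through when it reaches zero.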
Making the branch conditional on the loop counter in A2:
$Y = \sum_{n=1}^{40} a_n * x_n$
        MVK  .S   40, A2
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
[A2]    B    .S   loop
Creating a Loop
1. Add branch instruction (B) and a label
2. Create a loop counter with proper value
3. Add an instruction to decrement the loop counter
4. Make the branch conditional based on the value in the loop counter
a, x, Y located in memory
Memory holds a[40], x[40], and Y. Registers A5, A6, and A7 hold pointers to them:
        *A5    a[40]
        *A6    x[40]
        *A7    Y
Load/Store Options
Because the 'C6000 provides byte addressability, the instruction
set supports several types of load/store instructions:
Load instructions    C Data Type    Not Supported
LDB                  char
LDH                  short
LDW                  int
LDDW                                C62x

Store instructions   C Data Type    Not Supported
STB                  char
STH                  short
STW                  int
STDW                                C62x, C67x
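To make the byte, halfword, and word access sizes concrete, here is a small C model of a byte-addressable memory. The helper names are hypothetical, the fixed-width types stand in for char, short, and int, and the sign extension on the narrower loads is an assumption about how the plain LDB and LDH forms behave.

```c
#include <stdint.h>
#include <string.h>

/* Like LDB: read 8 bits at a byte address, sign-extended to 32 bits. */
int32_t load_byte(const uint8_t *mem, size_t addr)
{
    int8_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like LDH: read 16 bits at a byte address, sign-extended to 32 bits. */
int32_t load_half(const uint8_t *mem, size_t addr)
{
    int16_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like LDW: read 32 bits at a byte address. */
int32_t load_word(const uint8_t *mem, size_t addr)
{
    int32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like STH: write the low 16 bits of value (assumed to fit in 16 bits). */
void store_half(uint8_t *mem, size_t addr, int32_t value)
{
    int16_t v = (int16_t)value;
    memcpy(mem + addr, &v, sizeof v);
}

/* Like STW: write all 32 bits of value. */
void store_word(uint8_t *mem, size_t addr, int32_t value)
{
    memcpy(mem + addr, &value, sizeof value);
}
```

The byte order inside a halfword or word follows the host machine here, so the sketch says nothing about the endianness of a real device.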
Load/Store
Register File A (32-bit registers):
        A0    a
        A1    x
        A2    loop count
        A3    prod
        A4    Y
        A5    &a[n]
        A6    &x[n]
        A7    &Y
        ...
        A31
$Y = \sum_{n=1}^{40} a_n * x_n$
        MVK  .S   40, A2
loop:   LDH  .?   *A5, A0
        LDH  .?   *A6, A1
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
[A2]    B    .S   loop
        STH  .?   A4, *A7
Load/Store - .D Unit
The .D unit performs the loads and stores between the register file and data memory:
        MVK  .S   40, A2
loop:   LDH  .D   *A5, A0
        LDH  .D   *A6, A1
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
[A2]    B    .S   loop
        STH  .D   A4, *A7
Data memory holds a0, a1, a2, ... and x0, x1, x2, ... A5 holds &a and A6 holds &x.
$Y = \sum_{n=1}^{40} a_n * x_n$
The first pass through the loop computes a0 * x0.
How do you access a1 and x1 on the second loop?
Post-increment the pointers (++) on each load:
        LDH  .D   *A5++, A0
        LDH  .D   *A6++, A1
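The post-increment addressing has a direct counterpart in C pointer syntax. The sketch below follows the register roles from the slides in its comments; the function name is hypothetical, the running sum is assumed to start at zero, and count is assumed to be at least 1.

```c
#include <stdint.h>

/* Multiply-accumulate loop with post-incremented pointers.
 * Comments name the instruction each line stands in for. */
int32_t mac_loop(const int16_t *a_ptr, const int16_t *x_ptr, int count)
{
    int32_t y = 0;                            /* A4: running sum (assumed 0 at start) */
    do {
        int16_t a = *a_ptr++;                 /* LDH *A5++, A0 */
        int16_t x = *x_ptr++;                 /* LDH *A6++, A1 */
        int32_t prod = (int32_t)a * x;        /* MPY A0, A1, A3 */
        y += prod;                            /* ADD A4, A3, A4 */
        count--;                              /* SUB A2, 1, A2 */
    } while (count != 0);                     /* [A2] B loop */
    return y;                                 /* value that STH would write to *A7 */
}
```

Each pass reads the current element and then advances the pointer, so the second pass sees a1 and x1 without any separate address arithmetic.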
$Y = \sum_{n=1}^{40} a_n * x_n$
        MVK  .S   40, A2
loop:   LDH  .D   *A5++, A0
        LDH  .D   *A6++, A1
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
[A2]    B    .S   loop
        STH  .D   A4, *A7
Adding Side B
Register File A: A0 to A31, 32 bits each, with units .S1, .M1, .L1, .D1
Register File B: B0 to B31, 32 bits each, with units .S2, .M2, .L2, .D2
Data Memory
$Y = \sum_{n=1}^{40} a_n * x_n$
        MVK  .S1   40, A2
loop:   LDH  .D1   *A5++, A0
        LDH  .D1   *A6++, A1
        MPY  .M1   A0, A1, A3
        ADD  .L1   A3, A4, A4
        SUB  .L1   A2, 1, A2
[A2]    B    .S1   loop
        STH  .D1   A4, *A7
Outline
CPU Architecture
Instruction Set Overview
Classic C6x Devices (C62x, C67x)
Introducing SIMD (C64x)
Brand New (C64x+, C674x, C66x)
Classic C6x Devices (C62x, C67x)
Arithmetic: ABS, ADD, ADDA, ADDK, ADD2, MPY, MPYH, NEG, SMPY, SMPYH, SADD, SAT, SSUB, SUB, SUBA, SUBC, SUB2, ZERO
Logical: AND, CMPEQ, CMPGT, CMPLT, NOT, OR, SHL, SHR, SSHL, XOR
Bit Mgmt: CLR, EXT, LMBD, NORM, SET
Data Mgmt: LDB/H/W, MV, MVC, MVK, MVKL, MVKH, MVKLH, STB/H/W
Program Ctrl: B, IDLE, NOP
Note: Refer to the 'C6000 CPU Reference Guide for more details
.S Unit: ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKL, MVKH, NEG, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO
.L Unit: ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NEG, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO
.M Unit: MPY, MPYH, MPYLH, MPYHL, SMPY, SMPYH
.D Unit: ADD, ADDAB (B/H/W), LDB (B/H/W), MV, NEG, STB (B/H/W), SUB, SUBAB (B/H/W), ZERO
No Unit Used: NOP, IDLE
.S Unit: ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKL, MVKH, NEG, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO, ABSSP, ABSDP, CMPGTSP, CMPEQSP, CMPLTSP, CMPGTDP, CMPEQDP, CMPLTDP, RCPSP, RCPDP, RSQRSP, RSQRDP, SPDP
.L Unit: ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NEG, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO, ADDSP, ADDDP, SUBSP, SUBDP, INTSP, INTDP, SPINT, DPINT, SPTRUNC, DPTRUNC, DPSP
.M Unit: MPY, MPYH, MPYLH, MPYHL, SMPY, SMPYH, MPYSP, MPYDP, MPYI, MPYID
.D Unit: ADD, ADDAB (B/H/W), ADDAD, LDB (B/H/W), LDDW, MV, NEG, STB (B/H/W), SUB, SUBAB (B/H/W), ZERO
No Unit Used: NOP, IDLE
New Instructions
.S Units enhanced with FP Adder
ADDSP, ADDDP, SUBSP, SUBDP
Along with the .L units, you can have 4 float adds/subtracts in parallel
Introducing SIMD (C64x)
[Figure: CPU block diagram showing Advanced Instruction Packing, Instruction Decode, Advanced Emulation, Interrupt Control, the Control Registers, and the functional units L1, S1, M1, D1, D2, M2, S2, L2]
.S
        Dual/Quad Arith: SADD2, SADDUS2, SADD4
        Bitwise Logical: ANDN
        Shifts & Merge: SHR2, SHRU2, SHLMB, SHRMB
        Data Pack/Un: PACK2, PACKH2, PACKLH2, PACKHL2, UNPKHU4, UNPKLU4, SWAP2, SPACK2, SPACKU4
        Compares: CMPEQ2, CMPEQ4, CMPGT2, CMPGT4
        Branches/PC: BDEC, BPOS, BNOP, ADDKPC
.D
        Dual Arithmetic: ADD2, SUB2
        Bitwise Logical: AND, ANDN, OR, XOR
        Mem Access: LDDW, LDNW, LDNDW, STDW, STNW, STNDW
        Load Constant: MVK (5-bit)
        Address Calc.: ADDAD
.L
        Dual/Quad Arith: ABS2, ADD2, ADD4, MAX, MIN, SUB2, SUB4, SUBABS4
        Bitwise Logical: ANDN
        Shift & Merge: SHLMB, SHRMB
        Load Constant: MVK (5-bit)
        Data Pack/Un: PACK2, PACKH2, PACKLH2, PACKHL2, PACKH4, PACKL4, UNPKHU4, UNPKLU4, SWAP2/4
.M
        Average: AVG2, AVG4
        Shifts: ROTL, SSHVL, SSHVR
        Multiplies: MPYHI, MPYLI, MPYHIR, MPYLIR, MPY2, SMPY2, DOTP2, DOTPN2, DOTPRSU2, DOTPNRSU2, DOTPU4, DOTPSU4, GMPY4, XPND2/4
        Bit Operations: BITC4, BITR, DEAL, SHFL
        Move: MVD
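Many of the mnemonics above end in 2 or 4, meaning the operation is applied to two 16-bit halves or four 8-bit quarters of a 32-bit register at once. As an illustration of the idea, the sketch below models a dual 16-bit add and a dual 16-bit dot product in plain C. The function names are hypothetical and the exact behavior is an assumption based on the instruction names; the CPU reference guide is the authority on the real semantics.

```c
#include <stdint.h>

/* Interpret the low 16 bits of a value as a signed halfword. */
static int32_t signed_half(uint32_t half)
{
    half &= 0xFFFFu;
    return (half & 0x8000u) ? (int32_t)half - 0x10000 : (int32_t)half;
}

/* Model of a dual 16-bit add (ADD2-style): the two halves are added
 * independently, with no carry from the low half into the high half. */
uint32_t add2_model(uint32_t src1, uint32_t src2)
{
    uint32_t lo = (src1 + src2) & 0x0000FFFFu;
    uint32_t hi = ((src1 >> 16) + (src2 >> 16)) << 16;
    return hi | lo;
}

/* Model of a dual 16-bit dot product (DOTP2-style): multiply the signed
 * low halves and the signed high halves, then add the two products.
 * Assumption: inputs are chosen so the 32-bit sum does not overflow. */
int32_t dotp2_model(uint32_t src1, uint32_t src2)
{
    int32_t lo = signed_half(src1) * signed_half(src2);
    int32_t hi = signed_half(src1 >> 16) * signed_half(src2 >> 16);
    return hi + lo;
}
```

In the dual add, a carry out of the low half is dropped rather than passed into the high half, which is the difference from an ordinary 32-bit ADD. For the sum of products, a dual dot product handles two terms per instruction instead of one.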