Uploaded by Simo Guermoud

Architecture of TI's C6XXX Digital Signal Processor


Architecture

Outline

CPU Architecture

Instruction Set Overview

Internal Buses & Memory

C6000 Peripherals Overview

Device Family Review

[Figure: block diagram of the CPU and its internal memories]

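Digital sampling of an analog signal

Sampling period: $T = 1/f_s$

Sum of products:

$$Y = \sum_{n=1}^{N} a_n \, x_n$$

With N = 40:

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

We write code for this algorithm and develop the architecture along the way.

Which instructions are required by this algorithm?

Multiply

Add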
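Before any assembly is written, the sum of products can be sketched in C. This is only an illustration, not code from the slides: the function name `sum_of_products`, the 16-bit input type, and the 32-bit accumulator are assumptions, and C indexes from 0 where the slides index from 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Y = sum of a[n] * x[n]. Each term needs one multiply and one add,
   the two operations the slides identify. */
int32_t sum_of_products(const int16_t *a, const int16_t *x, size_t n)
{
    assert(a != NULL && x != NULL);
    int32_t y = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t prod = (int32_t)a[i] * (int32_t)x[i];  /* multiply */
        y += prod;                                     /* add */
    }
    return y;
}
```

Calling it with `n = 40` corresponds to the 40-term sum used throughout the slides.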

Multiply

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

Which functional unit performs the multiply? The .M unit:

    MPY  .M   a, x, prod

Add

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

Which functional unit performs the add? The .L unit:

    MPY  .M   a, x, prod
    ADD  .L   Y, prod, Y

Where are the variables stored?

Register File A

The variables are stored in Register File A, registers A0 to A31, each 32 bits wide:

    A0    a
    A1    x
    A2
    A3    prod
    A4    Y
    ...
    A31

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

With register operands, the code becomes:

    MPY  .M   A0, A1, A3
    ADD  .L   A4, A3, A4

How do we turn this into a loop?

Creating a Loop

1. Add branch instruction (B) and a label

2. Create a loop counter (= 40)

3. Add an instruction to decrement the loop counter

4. Make the branch conditional based on the value in the loop counter
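The four steps above can be sketched in C as a count-down loop. This is a minimal sketch that models only the control flow, not instruction timing; the function name `run_counted_loop` and the `passes` variable are hypothetical and stand in for the real loop body.

```c
#include <assert.h>
#include <stdint.h>

/* Returns how many times the loop body ran for a given starting count. */
uint32_t run_counted_loop(uint32_t start)
{
    assert(start > 0);            /* a count of 0 would wrap around */
    uint32_t counter = start;     /* step 2: create the loop counter */
    uint32_t passes = 0;
    do {                          /* step 1: the label marks this point */
        passes++;                 /* stand-in for the loop body */
        counter--;                /* step 3: decrement the loop counter */
    } while (counter != 0);       /* step 4: branch back only while counter != 0 */
    return passes;
}
```

With `start = 40` the body runs 40 times, once per term of the sum.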

Branching (1)

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

Which functional unit executes the branch? The .S unit:

loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        B    .S   loop

Step 2: Create a loop counter (= 40)

    MVK  .S   40, A2        ; A2 = 40

Register File A now holds the loop count in A2:

    A0    a
    A1    x
    A2    loop count
    A3    prod
    A4    Y
    ...
    A31

        MVK  .S   40, A2
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        B    .S   loop

Step 3: Add an instruction to decrement the loop counter

        MVK  .S   40, A2
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
        B    .S   loop

Conditional Instructions

To minimize branching, all instructions are conditional.

    [condition]  B  loop

    Code syntax    Execute instruction if
    [cond]         cond is true (cond ≠ 0)
    [!cond]        cond is false (cond = 0)
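The effect of a condition prefix can be sketched in C as a select between the new result and the old destination value. The function names `cond_add` and `cond_add_inverted` are hypothetical, and ADD is used only as an example instruction; the sketch assumes that a skipped instruction leaves its destination unchanged.

```c
#include <stdint.h>

/* Models "[cond] ADD src1, src2, dst": dst is written only if cond != 0. */
int32_t cond_add(int32_t cond, int32_t src1, int32_t src2, int32_t dst)
{
    return (cond != 0) ? src1 + src2 : dst;
}

/* Models "[!cond] ADD src1, src2, dst": dst is written only if cond == 0. */
int32_t cond_add_inverted(int32_t cond, int32_t src1, int32_t src2, int32_t dst)
{
    return (cond == 0) ? src1 + src2 : dst;
}
```

In the loop, the same idea applies to the branch: `[A2] B loop` is taken only while the counter in A2 is nonzero.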

Step 4: Make the branch conditional based on the value in the loop counter (A2)

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

        MVK  .S   40, A2
loop:   MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
  [A2]  B    .S   loop

a, x, and Y are located in memory. Registers A5 to A7 hold pointers to them:

    A5 = &a
    A6 = &x
    A7 = &Y

Register File A:

    A0    a
    A1    x
    A2    loop count
    A3    prod
    A4    Y
    A5    &a[n]
    A6    &x[n]
    A7    &Y
    ...
    A31

Use the pointers with load/store instructions:

    LD   *A5, A0
    LD   *A6, A1
    ST   A4, *A7

Memory holds a[40], x[40], and Y, accessed through *A5, *A6, and *A7.

Load/Store Options

Because the 'C6000 provides byte addressability, the instruction set supports several types of load/store instructions:

    Load    Store   C data type
    LDB     STB     char
    LDH     STH     short
    LDW     STW     int
    LDDW    STDW    (64-bit doubleword)

Not supported: LDDW on C62x; STDW on C62x and C67x.

Which load instruction should be used?
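The width of each access can be sketched in C against a byte-addressable buffer. The helper names below are hypothetical; the sketch assumes that LDB and LDH sign-extend into the 32-bit register and that stores write only the low bytes of the register. Values are round-tripped through the same helpers, so the result does not depend on byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Like LDB (char): read 1 byte and sign-extend to 32 bits. */
int32_t load_byte(const uint8_t *mem, size_t addr)
{
    int8_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like LDH (short): read 2 bytes and sign-extend to 32 bits. */
int32_t load_half(const uint8_t *mem, size_t addr)
{
    int16_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like LDW (int): read 4 bytes. */
int32_t load_word(const uint8_t *mem, size_t addr)
{
    int32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Like STH: write the low 16 bits of the register value. */
void store_half(uint8_t *mem, size_t addr, int32_t value)
{
    uint16_t v = (uint16_t)value;
    memcpy(mem + addr, &v, sizeof v);
}

/* Like STW: write all 32 bits of the register value. */
void store_word(uint8_t *mem, size_t addr, int32_t value)
{
    memcpy(mem + addr, &value, sizeof value);
}
```

Since a and x are treated as 16-bit values in the slides that follow, the halfword versions are the ones the loop uses.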

Load/Store - .D Unit

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

Which functional unit executes the loads and stores? The .D unit, which accesses data memory. The code uses the halfword versions, LDH and STH:

        MVK  .S   40, A2
loop:   LDH  .D   *A5, A0
        LDH  .D   *A6, A1
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
  [A2]  B    .S   loop
        STH  .D   A4, *A7

Data memory holds a0, a1, a2, ... starting at &a (pointed to by A5) and x0, x1, x2, ... starting at &x (pointed to by A6). The first pass through the loop computes a0 * x0.

How do you access a1 and x1 on the second loop? Post-increment the pointers (A5++, A6++) in the load instructions:

    LDH  .D   *A5++, A0
    LDH  .D   *A6++, A1

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

        MVK  .S   40, A2
loop:   LDH  .D   *A5++, A0
        LDH  .D   *A6++, A1
        MPY  .M   A0, A1, A3
        ADD  .L   A4, A3, A4
        SUB  .L   A2, 1, A2
  [A2]  B    .S   loop
        STH  .D   A4, *A7

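Adding Side B

The CPU has two register files, each with its own set of functional units:

    Register File A (A0 to A31, 32 bits each): units .S1, .M1, .L1, .D1
    Register File B (B0 to B31, 32 bits each): units .S2, .M2, .L2, .D2

Data memory is reached through the .D1 and .D2 units.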

$$Y = \sum_{n=1}^{40} a_n \, x_n$$

        MVK  .S1   40, A2         ; A2 = 40
loop:   LDH  .D1   *A5++, A0      ; A0 = a(n)
        LDH  .D1   *A6++, A1      ; A1 = x(n)
        MPY  .M1   A0, A1, A3     ; A3 = a(n) * x(n)
        ADD  .L1   A3, A4, A4     ; Y = Y + A3
        SUB  .L1   A2, 1, A2      ; decrement loop count
  [A2]  B    .S1   loop           ; if A2 != 0, branch
        STH  .D1   A4, *A7        ; *A7 = Y
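The listing can be mirrored line by line in C, with one local variable per register. This is a sketch of the data flow only and ignores pipeline timing. The function name `sop_listing` is hypothetical, and clearing the accumulator A4 before the loop is an assumption, since the listing does not show how A4 is initialized.

```c
#include <assert.h>
#include <stdint.h>

/* a5, a6, a7 play the roles of the pointer registers A5, A6, A7. */
void sop_listing(const int16_t *a5, const int16_t *a6, int16_t *a7,
                 uint32_t count)
{
    assert(count > 0);
    int32_t a0, a1, a3;
    int32_t a4 = 0;               /* assumption: Y starts at zero */
    uint32_t a2 = count;          /* MVK  count, A2 */
    do {
        a0 = *a5++;               /* LDH  *A5++, A0  (sign-extended) */
        a1 = *a6++;               /* LDH  *A6++, A1  (sign-extended) */
        a3 = a0 * a1;             /* MPY  A0, A1, A3 */
        a4 = a3 + a4;             /* ADD  A3, A4, A4 */
        a2 = a2 - 1;              /* SUB  A2, 1, A2 */
    } while (a2 != 0);            /* [A2] B loop */
    *a7 = (int16_t)a4;            /* STH  A4, *A7 (halfword store of Y) */
}
```

Note that STH writes a halfword, so a sum that no longer fits in 16 bits would be truncated by the final store; the sketch inherits that behavior from the listing.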

Outline

CPU Architecture

Instruction Set Overview

Classic C6x Devices (C62x, C67x)

Introducing SIMD (C64x)

Brand New (C64x+, C674x, C66x)

C6000 Peripherals Overview

Device Family Review

Exam 1

Arithmetic: ABS, ADD, ADDA, ADDK, ADD2, MPY, MPYH, NEG, SMPY, SMPYH, SADD, SAT, SSUB, SUB, SUBA, SUBC, SUB2, ZERO

Logical: AND, CMPEQ, CMPGT, CMPLT, NOT, OR, SHL, SHR, SSHL, XOR

Bit Mgmt: CLR, EXT, LMBD, NORM, SET

Data Mgmt: LDB/H/W, MV, MVC, MVK, MVKL, MVKH, MVKLH, STB/H/W

Program Ctrl: B, IDLE, NOP

Note: Refer to the 'C6000 CPU Reference Guide for more details.

.S Unit: ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKL, MVKH, NEG, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO

.L Unit: ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NEG, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO

.M Unit: MPY, MPYH, MPYLH, MPYHL, SMPY, SMPYH

.D Unit: ADD, ADDAB (B/H/W), LDB (B/H/W), MV, NEG, STB (B/H/W), SUB, SUBAB (B/H/W), ZERO

No Unit Used: NOP, IDLE

.S Unit: ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKL, MVKH, NEG, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO
Floating point: ABSSP, ABSDP, CMPGTSP, CMPEQSP, CMPLTSP, CMPGTDP, CMPEQDP, CMPLTDP, RCPSP, RCPDP, RSQRSP, RSQRDP, SPDP

.L Unit: ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NEG, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO
Floating point: ADDSP, ADDDP, SUBSP, SUBDP, INTSP, INTDP, SPINT, DPINT, SPTRUNC, DPTRUNC, DPSP

.M Unit: MPY, MPYH, MPYLH, MPYHL, SMPY, SMPYH
Floating point and integer: MPYSP, MPYDP, MPYI, MPYID

.D Unit: ADD, ADDAB (B/H/W), ADDAD, LDB (B/H/W), LDDW, MV, NEG, STB (B/H/W), SUB, SUBAB (B/H/W), ZERO

No Unit Used: NOP, IDLE

CPU Enhancements

Number of registers doubled to 64

Cross-path operand sourcing ability doubled to 2

Execution Packets can now Span Fetch Packets (for better code size!)

All changes are backwards compatible to 67x CPU

New Instructions

.S Units enhanced with FP Adder

ADDSP

ADDDP

SUBSP

SUBDP

Along with the .L units, you can have 4 float adds/subtracts in parallel

Mixed-precision multiply instructions

MPYSPDP: SP x DP into DP

MPYSP2DP: SP x SP into DP

Many applications may benefit from these mixed-precision floating-point multiplies

These provide faster alternatives to the full double-precision MPYDP
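The arithmetic meaning of the two mixed-precision multiplies can be sketched in C, where `float` stands for single precision (SP) and `double` for double precision (DP). The function names are hypothetical, and the sketch says nothing about instruction latency; it only shows which operand widths go in and which width comes out.

```c
/* SP x DP into DP, in the spirit of MPYSPDP. */
double mpy_sp_dp(float a, double b)
{
    return (double)a * b;
}

/* SP x SP into DP, in the spirit of MPYSP2DP: both inputs are widened
   first, so the product keeps bits a float result would round away. */
double mpy_sp_sp_to_dp(float a, float b)
{
    return (double)a * (double)b;
}
```

For example, 4097 x 4097 = 16785409 needs 25 significant bits, which is more than a float's 24-bit significand holds, but it is exact as a double.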

[Figure: CPU block diagram. Instruction fetch, instruction dispatch with advanced instruction packing, and instruction decode feed the functional units .L1, .S1, .M1, .D1 and .D2, .M2, .S2, .L2. Also shown: control registers, interrupt control, and advanced emulation.]

Dual 64-bit buses for loads/stores

Packed data processing: dual 16-bit (4000 MMACs) or quad 8-bit (8000 MMACs), which is great for imaging applications

Increased code density

100% object code compatible with C62x
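Packed data processing means one 32-bit register is treated as two independent 16-bit values (or four 8-bit values). The C sketch below models the dual 16-bit case for an add and a dot product. The function names borrow the ADD2 and DOTP2 mnemonics from the instruction lists, but the bodies are an illustrative model of the arithmetic, not TI's definition of those instructions; `sext16` is a hypothetical helper.

```c
#include <stdint.h>

/* Interpret the low 16 bits of v as a signed 16-bit value. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Dual 16-bit add: the two halves are added independently, so a carry
   out of the low half never reaches the high half. */
uint32_t add2(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0xFFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* Dual 16-bit signed dot product: hi(a)*hi(b) + lo(a)*lo(b). */
int32_t dotp2(uint32_t a, uint32_t b)
{
    return sext16(a >> 16) * sext16(b >> 16) + sext16(a) * sext16(b);
}
```

A dot product of this kind performs two multiply-accumulates per instruction, which is how a packed format raises the MAC rate for the sum of products.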

.S Unit
    Dual/Quad Arith:  SADD2, SADDUS2, SADD4
    Data Pack/Un:     PACK2, PACKH2, PACKLH2, PACKHL2, UNPKHU4, UNPKLU4, SWAP2, SPACK2, SPACKU4
    Bitwise Logical:  ANDN
    Shifts & Merge:   SHR2, SHRU2, SHLMB, SHRMB
    Compares:         CMPEQ2, CMPEQ4, CMPGT2, CMPGT4
    Branches/PC:      BDEC, BPOS, BNOP, ADDKPC

.D Unit
    Dual Arithmetic:  ADD2, SUB2
    Bitwise Logical:  AND, ANDN, OR, XOR
    Mem Access:       LDDW, LDNW, LDNDW, STDW, STNW, STNDW
    Load Constant:    MVK (5-bit)
    Address Calc.:    ADDAD

.L Unit
    Dual/Quad Arith:  ABS2, ADD2, ADD4, MAX, MIN, SUB2, SUB4, SUBABS4
    Bitwise Logical:  ANDN
    Shift & Merge:    SHLMB, SHRMB
    Load Constant:    MVK (5-bit)
    Data Pack/Un:     PACK2, PACKH2, PACKLH2, PACKHL2, PACKH4, PACKL4, UNPKHU4, UNPKLU4, SWAP2/4

.M Unit
    Average:          AVG2, AVG4
    Shifts:           ROTL, SSHVL, SSHVR
    Multiplies:       MPYHI, MPYLI, MPYHIR, MPYLIR, MPY2, SMPY2, DOTP2, DOTPN2, DOTPRSU2, DOTPNRSU2, DOTPU4, DOTPSU4, GMPY4, XPND2/4
    Bit Operations:   BITC4, BITR, DEAL, SHFL
    Move:             MVD
