
STM32F3 Technical Training

Microcontrollers Division
2012 - Week 16 , Tunis

STM32F3 Training Agenda (1/4)


Day 1
Cortex M4 presentation with Focus on FPU and DSP + Hands-on
STM32F3 common parts

Block Diagrams
Memory and System architecture
Flash
Power Control (PWR) + Hands-on
Direct memory access controller
General Purpose I/Os
Extended interrupts and events controller

STM32F3 Training Agenda (2/4)


Day 2
STM32F3 Ecosystem
Standard firmware Library
Tools (STLink utility, STVP, etc)
ULINK PRO and TRACE presentation

Continue with STM32F3 common parts

Reset and clock control (RCC)


CRC
Digital-to-analog converter (DAC)
System window watchdog (WWDG)
Independent window watchdog (IWDG)
Serial peripheral interface (SPI)
Universal synchronous asynchronous receiver transmitter (USART) + Hands-on
Inter-integrated circuit (I2C) interface + Hands-on
Inter-IC sound (I2S) (simplex in STM32F37x and full duplex in STM32F30x)

STM32F3 Training Agenda (3/4)


Day 3:
Continue with STM32F3 common parts

Controller area network (CAN)


Real Time Clock (RTC)
General Purpose Timers
Basic Timers 6 and 7
Universal serial bus full-speed device interface (USB)
Touch sensing controller (TSC)
STM32F3xx Minimum External Components

STM32F30x specific parts


Analog-to-Digital Converter ADC 5MSPS + Hands-on
STM32F30x Timers new functionalities + Hands-on

STM32F3 Training Agenda (4/4)


Day 4:
Continue with STM32F30x Specific parts
Comparators (COMP) + Hands-on
Operational amplifiers (OPAMP) + Hands-on

STM32F37x specific parts

Analog-to-Digital Converter ADC sigma delta + Hands-on


Comparators(COMP) (Only differences vs STM32F30x comparator).
Analog-to-Digital Converter ADC 1 MSPS
CEC

STM32F30x Motor Control kit - Complete development platform with all the
hardware and software required to get STM32-based motor control applications
started quickly + STM32F30x new features/peripherals easing motor control

STM32F3 Training Objectives


Gain an understanding of the Cortex-M4 core, its Floating Point Unit (FPU)
and its Digital Signal Processing (DSP) extensions + Hands-on.
Become familiar with STM32F30x and STM32F37x peripherals and
features.
Hands-on based on the STM32F3xx Firmware Libraries
Become familiar with the STM32F30x Motor Control Kit
At the end of the training you will be able to
Understand all the STM32F3 features
Install dev tools and run demos and peripheral examples
Have the background to promote the STM32F3xx series and
provide first level support.

Day 1
Cortex M4 presentation with Focus on FPU and DSP + Hands-on
STM32F3 common parts

Block Diagrams
Memory and System architecture
Flash
Power Control (PWR), including the differences between the power supply schemes of the two products
Direct memory access controller (DMA), including the differences in DMA request mapping between the two products
General Purpose I/Os
Extended interrupts and events controller (EXTI), including the differences in the EXTI internal line connections

ARM Cortex M4 in few words

Cortex-M processors

Forget traditional 8/16/32-bit classifications
Seamless architecture across all applications
Every product optimised for ultra low power and ease of use

Cortex-M0: 8/16-bit applications
Cortex-M3: 16/32-bit applications
Cortex-M4: 32-bit/DSC applications

Cortex-M processors are binary and tool compatible

ARM Cortex M4 Core

What is Cortex-M4?

MCU
Ease of use of C programming
Interrupt handling
Ultra-low power

DSP
Harvard architecture
Single-cycle MAC
Barrel shifter

FPU
Single precision
Ease of use
Better code efficiency
Faster time to market
Eliminate scaling and saturation
Easier support for meta-language tools

Cortex-M4 processor microarchitecture


ARMv7ME Architecture

Thumb-2 Technology
DSP and SIMD extensions
Single cycle MAC (Up to 32 x 32 + 64 -> 64)
Optional single precision FPU
Integrated configurable NVIC
Compatible with Cortex-M3

Microarchitecture
3-stage pipeline with branch speculation
3x AHB-Lite Bus Interfaces

Configurable for ultra low power


Deep Sleep Mode, Wakeup Interrupt Controller
Power down features for Floating Point Unit

Flexible configurations for wider applicability


Configurable Interrupt Controller (1-240 Interrupts and Priorities)
Optional Memory Protection Unit
Optional Debug & Trace

Cortex-M feature set comparison

Feature                        | Cortex-M0                          | Cortex-M3    | Cortex-M4
-------------------------------|------------------------------------|--------------|-------------------------------
Architecture version           | v6M                                | v7M          | v7ME
Instruction set architecture   | Thumb, Thumb-2 System Instructions | Thumb + Thumb-2 | Thumb + Thumb-2, DSP, SIMD, FP
DMIPS/MHz                      | 0.9                                | 1.25         | 1.25
Bus interfaces                 |                                    |              |
Integrated NVIC                | Yes                                | Yes          | Yes
Number interrupts              | 1-32 + NMI                         | 1-240 + NMI  | 1-240 + NMI
Interrupt priorities           |                                    | 8-256        | 8-256
Breakpoints, Watchpoints       | 4/2/0, 2/1/0                       | 8/4/0, 2/1/0 | 8/4/0, 2/1/0
Memory Protection Unit (MPU)   | No                                 | Yes (Option) | Yes (Option)
Integrated trace option (ETM)  | No                                 | Yes (Option) | Yes (Option)
Fault Robust Interface         | No                                 | Yes (Option) | No
Single Cycle Multiply          | Yes (Option)                       | Yes          | Yes
Hardware Divide                | No                                 | Yes          | Yes
WIC Support                    | Yes                                | Yes          | Yes
Bit banding support            | No                                 | Yes          | Yes
Single cycle DSP/SIMD          | No                                 | No           | Yes
Floating point hardware        | No                                 | No           | Yes
Bus protocol                   | AHB Lite                           | AHB Lite, APB | AHB Lite, APB
CMSIS Support                  | Yes                                | Yes          | Yes

Cortex M4 DSP features

Cortex-M processors binary compatible


15

Cortex-M4 overview
Main Cortex-M4 processor features
ARMv7-ME architecture revision
Fully compatible with Cortex-M3 instruction set

Single-cycle multiply-accumulate (MAC) unit


Optimized single instruction multiple data (SIMD) instructions
Saturating arithmetic instructions
Optional single precision Floating-Point Unit (FPU)
Hardware Divide (2-12 Cycles), same as Cortex-M3
Barrel shifter (same as Cortex-M3)

16

Single-cycle multiply-accumulate unit


The multiplier unit allows any MUL or MAC instructions to be
executed in a single cycle
Signed/Unsigned Multiply
Signed/Unsigned Multiply-Accumulate
Signed/Unsigned Multiply-Accumulate Long (64-bit)

Benefits : Speed improvement vs. Cortex-M3


4x for 16-bit MAC (dual 16-bit MAC)
2x for 32-bit MAC
up to 7x for 64-bit MAC

17

Cortex-M4 extended single cycle MAC

OPERATION                        | INSTRUCTIONS                       | CM3 | CM4
---------------------------------|------------------------------------|-----|----
16 x 16 = 32                     | SMULBB, SMULBT, SMULTB, SMULTT     | n/a | 1
16 x 16 + 32 = 32                | SMLABB, SMLABT, SMLATB, SMLATT     | n/a | 1
16 x 16 + 64 = 64                | SMLALBB, SMLALBT, SMLALTB, SMLALTT | n/a | 1
16 x 32 = 32                     | SMULWB, SMULWT                     | n/a | 1
(16 x 32) + 32 = 32              | SMLAWB, SMLAWT                     | n/a | 1
(16 x 16) ± (16 x 16) = 32       | SMUAD, SMUADX, SMUSD, SMUSDX       | n/a | 1
(16 x 16) ± (16 x 16) + 32 = 32  | SMLAD, SMLADX, SMLSD, SMLSDX       | n/a | 1
(16 x 16) ± (16 x 16) + 64 = 64  | SMLALD, SMLALDX, SMLSLD, SMLSLDX   | n/a | 1
32 x 32 = 32                     | MUL                                | 1   | 1
32 ± (32 x 32) = 32              | MLA, MLS                           | 2   | 1
32 x 32 = 64                     | SMULL, UMULL                       | 5-7 | 1
(32 x 32) + 64 = 64              | SMLAL, UMLAL                       | 5-7 | 1
(32 x 32) + 32 + 32 = 64         | UMAAL                              | n/a | 1
32 ± (32 x 32) = 32 (upper)      | SMMLA, SMMLAR, SMMLS, SMMLSR       | n/a | 1
(32 x 32) = 32 (upper)           | SMMUL, SMMULR                      | n/a | 1

All the above operations are single cycle on the Cortex-M4 processor

Saturated arithmetic
Intrinsically prevents overflow of a variable by clipping it to the min/max
boundaries, and removes the CPU burden of software range checks
Benefits
Audio applications
[Figure: signal amplitude over the range -1.5 to 1.5, shown without saturation and with saturation]

Control applications
The integral term of a PID controller is continuously accumulated over time. Saturation
automatically limits its value and saves several CPU cycles per regulator
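
To make the clipping behaviour concrete, here is a minimal portable C sketch of a saturating 16-bit addition next to a wrapping one. The function names `sat_add_q15` and `wrap_add_q15` are hypothetical and only illustrate the idea; on the Cortex-M4 a single saturating instruction such as QADD16 does the clipping in hardware.

```c
#include <stdint.h>

/* Saturating 16-bit add: clip to [-32768, 32767] instead of wrapping. */
static int16_t sat_add_q15(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;   /* exact in 32 bits */
    if (sum > INT16_MAX) return INT16_MAX;   /* clip at the upper bound */
    if (sum < INT16_MIN) return INT16_MIN;   /* clip at the lower bound */
    return (int16_t)sum;
}

/* Wrapping add for comparison. The final conversion to int16_t is
   implementation-defined in C, but wraps on common two's complement targets. */
static int16_t wrap_add_q15(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}
```

With these helpers, adding 30000 and 10000 saturates to 32767, whereas the wrapping version produces a large negative value, which is the kind of discontinuity that is audible in audio and destabilising in a control loop.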

19

Single-cycle SIMD instructions


Stands for Single Instruction Multiple Data
Performs several operations simultaneously on 8-bit or 16-bit
data
Ex: dual 16-bit MAC (Result = 16x16 + 16x16 + 32)
Ex: Quad 8-bit SUB / ADD

Benefits
Parallelizes operations (2x to 4x speed gain)
Minimizes the number of Load/Store instructions for exchanges between memory
and register file (2 or 4 data transferred at once), if 32-bit is not necessary
Maximizes register file use (1 register holds 2 or 4 values)
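
The dual 16-bit MAC quoted above (Result = 16x16 + 16x16 + 32) can be modelled in portable C as a sketch. The helper names `pack16` and `dual_mac16` are assumptions made for illustration; they only mimic what a single SIMD instruction does on packed halfwords and are not a library API.

```c
#include <stdint.h>

/* Pack two signed 16-bit values into one 32-bit word (lo in bits 15:0). */
static uint32_t pack16(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Portable model of a dual 16-bit MAC:
   acc + lo(x) * lo(y) + hi(x) * hi(y).
   The narrowing conversions assume a two's complement target. */
static int32_t dual_mac16(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t xl = (int16_t)(x & 0xFFFFu);
    int16_t xh = (int16_t)(x >> 16);
    int16_t yl = (int16_t)(y & 0xFFFFu);
    int16_t yh = (int16_t)(y >> 16);
    return acc + (int32_t)xl * yl + (int32_t)xh * yh;
}
```

For example, `dual_mac16(pack16(3, -4), pack16(5, 6), 10)` evaluates 10 + 3*5 + (-4)*6. The C model needs several shifts and multiplies; the point of the SIMD extension is that the hardware does the same work in one instruction.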

20

SIMD operation example


SIMD extensions perform multiple operations in one cycle
Sum = Sum + (A x C) + (B x D)

SIMD techniques operate with packed data
[Figure: data-path diagram with 32-bit operand registers and a 64-bit sum]

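Cortex-M4 DSP instructions compared

Cycle counts

CLASS          | INSTRUCTION                        | Cortex-M3 | Cortex-M4
---------------|------------------------------------|-----------|----------
Arithmetic     | ALU operation (not PC)             | 1         | 1
Arithmetic     | ALU operation to PC                | 3         | 3
Arithmetic     | CLZ                                | 1         | 1
Arithmetic     | QADD, QDADD, QSUB, QDSUB           | n/a       | 1
Arithmetic     | QADD8, QADD16, QSUB8, QSUB16       | n/a       | 1
Arithmetic     | QDADD, QDSUB                       | n/a       | 1
Arithmetic     | QASX, QSAX, SASX, SSAX             | n/a       | 1
Arithmetic     | SHASX, SHSAX, UHASX, UHSAX         | n/a       | 1
Arithmetic     | SADD8, SADD16, SSUB8, SSUB16       | n/a       | 1
Arithmetic     | SHADD8, SHADD16, SHSUB8, SHSUB16   | n/a       | 1
Arithmetic     | UQADD8, UQADD16, UQSUB8, UQSUB16   | n/a       | 1
Arithmetic     | UHADD8, UHADD16, UHSUB8, UHSUB16   | n/a       | 1
Arithmetic     | UADD8, UADD16, USUB8, USUB16       | n/a       | 1
Arithmetic     | UQASX, UQSAX, USAX, UASX           | n/a       | 1
Arithmetic     | UXTAB, UXTAB16, UXTAH              | n/a       | 1
Arithmetic     | USAD8, USADA8                      | n/a       | 1
Multiplication | MUL, MLA                           | 1 - 2     | 1
Multiplication | MULS, MLAS                         | 1 - 2     | 1
Multiplication | SMULL, UMULL, SMLAL, UMLAL         | 5 - 7     | 1
Multiplication | SMULBB, SMULBT, SMULTB, SMULTT     | n/a       | 1
Multiplication | SMLABB, SMLABT, SMLATB, SMLATT     | n/a       | 1
Multiplication | SMULWB, SMULWT, SMLAWB, SMLAWT     | n/a       | 1
Multiplication | SMLALBB, SMLALBT, SMLALTB, SMLALTT | n/a       | 1
Multiplication | SMLAD, SMLADX, SMLALD, SMLALDX     | n/a       | 1
Multiplication | SMLSD, SMLSDX                      | n/a       | 1
Multiplication | SMLSLD, SMLSLDX                    | n/a       | 1
Multiplication | SMMLA, SMMLAR, SMMLS, SMMLSR       | n/a       | 1
Multiplication | SMMUL, SMMULR                      | n/a       | 1
Multiplication | SMUAD, SMUADX, SMUSD, SMUSDX       | n/a       | 1
Multiplication | UMAAL                              | n/a       | 1
Division       | SDIV, UDIV                         | 2 - 12    | 2 - 12

Single cycle MAC: every multiplication instruction above executes in one cycle on the Cortex-M4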

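Cortex-M4 non-DSP instructions

Cycle counts

CLASS        | INSTRUCTION                      | Cortex-M3 | Cortex-M4
-------------|----------------------------------|-----------|----------
Load/Store   | Load single byte to R0-R14       | 1 - 3     | 1 - 3
Load/Store   | Load single halfword to R0-R14   | 1 - 3     | 1 - 3
Load/Store   | Load single word to R0-R14       | 1 - 3     | 1 - 3
Load/Store   | Load to PC                       | 5         | 5
Load/Store   | Load double-word                 | 3         | 3
Load/Store   | Store single word                | 1 - 2     | 1 - 2
Load/Store   | Store double word                | 3         | 3
Load/Store   | Load-multiple registers (not PC) | N+1       | N+1
Load/Store   | Load-multiple registers plus PC  | N+5       | N+5
Load/Store   | Store-multiple registers         | N+1       | N+1
Load/Store   | Load/store exclusive             | 2         | 2
Load/Store   | SWP                              | n/a       | n/a
Branch       | B, BL, BX, BLX                   | 2 - 3     | 2 - 3
Branch       | CBZ, CBNZ                        | 3         | 3
Branch       | TBB, TBH                         | 5         | 5
Branch       | IT                               | 0 - 1     | 0 - 1
Special      | MRS                              | 1         | 1
Special      | MSR                              | 1         | 1
Special      | CPS                              | 1         | 1
Manipulation | BFI, BFC                         | 1         | 1
Manipulation | RBIT, REV, REV16, REVSH          | 1         | 1
Manipulation | SBFX, UBFX                       | 1         | 1
Manipulation | UXTH, UXTB, SXTH, SXTB           | 1         | 1
Manipulation | SSAT, USAT                       | 1         | 1
Manipulation | SEL                              | n/a       | 1
Manipulation | SXTAB, SXTAB16, SXTAH            | n/a       | 1
Manipulation | UXTB16, SXTB16                   | n/a       | 1
Manipulation | SSAT16, USAT16                   | n/a       | 1
Manipulation | PKHTB, PKHBT                     | n/a       | 1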

Packed data types


Several instructions operate on packed data types
Byte or halfword quantities packed into words
Allows more efficient access to packed structure types
SIMD instructions can act on packed data
Instructions to extract and pack data
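
As a rough illustration of extracting and packing, the following C sketch models the operations on a 32-bit word holding two halfwords. The helper names are hypothetical; `pack_bottom_top` mirrors what PKHBT does when no shift is applied (bottom halfword from the first operand, top halfword from the second).

```c
#include <stdint.h>

/* Extract the low or high signed halfword from a packed 32-bit word.
   The narrowing conversions assume a two's complement target. */
static int16_t extract_lo(uint32_t w) { return (int16_t)(w & 0xFFFFu); }
static int16_t extract_hi(uint32_t w) { return (int16_t)(w >> 16); }

/* Pack: bottom halfword of a combined with top halfword of b. */
static uint32_t pack_bottom_top(uint32_t a, uint32_t b)
{
    return (a & 0x0000FFFFu) | (b & 0xFFFF0000u);
}
```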
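[Figure: Extract isolates halfword B from a packed word (A, B) and fills the remaining bits with 00......00; Pack combines halfwords into one word]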

DSP performance for control applications

Example based on a complex formula used for sensorless motor drive
Gain comes from load operations and SIMD instructions
Total gain on this part is 25 to 35%

Cortex M3 (28-38 c.)

Cortex M4 (18-28 c.)

LDRSH R12,[R4, #+12]

LDR

LDRSH

(1 single 32-bit load replacing two 16-bit load


with sign extension. Gain: 2 cycles

R0,[SP, #+20]

SXTH

LR,R8

MUL

R8,LR,R0

LDR

R1,[R4, #+44]

SDIV

R0,R1,R7

LDRSH

R2,[R4, #+24]

LDRSH

R3,[R4, #+26]

LDRSH

R10,[R4, #+22]

SXTH

LDR

R2,[R4, #+22]

(1 single 32-bit load replacing to 16-bit with sign


extension. Gain: 2 cycles)

R6,R6

MLS

R5,R6,R10,R5

MLA

R5,R9,R12,R5

ASR

R6,R8,#+15

MLA

R5,R6,R3,R5

SXTH

R10,[R4, #+12]

SMLSD R5, R10, R6, R5


(1 SIMD instruction replacing two multiplyaccumulate. Gain: 3 cycles)

R0,R0

MLS

R5,R0,R2,R5

STR

R5,[SP, #+12]

SMLSD R5, R0, R2


(1 SIMD instruction replacing two multiplyaccumulate. Gain: 3 cycles)

25

DSP application example: MP3 audio playback


26

MHz required for MP3 decode (smaller is better !)


DSP concept from ARM (*)

DSP lib provided for free by ARM


The benefits of software libraries for Cortex-M4
Enables end user to develop applications faster
Keeps end user abstracted from low level programming
Benchmarking vehicle during system development
Clear competitive positioning against incumbent DSP/DSC offerings
Accelerate third party software development

Keeping it easy to access for end user


Minimal entry barrier - very easy to access and use

One standard library no duplicated efforts


ARM channels effort/resources with software partner
Value add through another level of software eg: filter config tools

27

DSP lib function list snapshot

Basic math: vector mathematics
PID Controller
Fast math: sin, cos, sqrt etc
Interpolation: linear, bilinear
Complex math
Statistics: max, min, RMS etc
Filtering: IIR, FIR, LMS etc
Transforms: FFT (real and complex), Cosine transform etc
Matrix functions
Support functions: copy/fill arrays, data type conversions etc

Tools
Matlab / Simulink
Embedded coder for code generation
Mathworks
Demo being developed (availability end of year)

Aimagin (Rapidstm32)

Filter design tools


Lot of tools available, most of them commercial product, some with low-cost offer,
few free
http://www.dspguru.com/dsp/links/digital-filter-design-software

29

Main DSP operations

Finite impulse response (FIR) filters

Data communications
Echo cancellation (adaptive versions)
Smoothing data
Infinite impulse response (IIR) filters

Audio equalization
Motor control
Fast Fourier transforms (FFT)

Audio compression
Spread spectrum communication
Noise removal

30

Assembly or C ?
Assembly ?
Pros
Can result in highest performance
Cons
Difficult learning curve, longer development cycles
Code reuse difficult, not portable
C?
Pros
Easy to write and maintain code, faster development cycles
Code reuse possible, using third party software is easier
Cons
Highest performance might not be possible
Get to know your compiler !

31

Mathematical details

FIR Filter

$$y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k]$$

IIR or recursive filter

$$y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] + a_1 y[n-1] + a_2 y[n-2]$$

FFT Butterfly (radix-2)

$$Y[k_1] = X[k_1] + X[k_2]$$
$$Y[k_2] = (X[k_1] - X[k_2])\, e^{-j\omega}$$

Most operations are dominated by MACs
These can be on 8, 16 or 32 bit operations

Computing Coefficients
Variables in a DSP algorithm can be classified as coefficients or state
Coefficients: parameters that determine the response of the filter (e.g., lowpass, highpass, bandpass, etc.)
State: intermediate variables that update based on the input signal

Coefficients may be computed in a number of different ways
Simple design equations running on the MCU
External tools such as MATLAB or QED Filter Design

This structure is called a Direct Form 1 Biquad. It has 5 coefficients and 4 state variables.
http://www.dsprelated.com/dspbooks/filters/Four_Direct_Forms.html
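
The split between 5 coefficients and 4 state variables can be sketched in C as follows. This is an illustrative floating-point version, not the library implementation; the type `biquad_df1_t` and the function `biquad_df1_step` are hypothetical names, and the feedback terms are added, following the sign convention of the IIR formula on the "Mathematical details" slide.

```c
/* Direct Form 1 biquad: 5 coefficients, 4 state variables. */
typedef struct {
    float b0, b1, b2, a1, a2;   /* coefficients: define the filter response */
    float x1, x2, y1, y2;       /* state: x[n-1], x[n-2], y[n-1], y[n-2]    */
} biquad_df1_t;

/* Process one input sample and return one output sample. */
static float biquad_df1_step(biquad_df1_t *f, float x)
{
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            + f->a1 * f->y1 + f->a2 * f->y2;

    /* Update the state for the next sample. */
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}
```

Feeding a unit impulse into a filter with b0 = 1 and a1 = 0.5 (all other coefficients zero) yields 1, 0.5, 0.25, ..., which shows how the state variables carry the filter's memory from one sample to the next.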

33

IIR single cycle MAC benefit


34

Operation              | Cortex-M3 cycle count | Cortex-M4 cycle count
-----------------------|-----------------------|----------------------
xN = *x++;             | 2                     | 2
yN = xN * b0;          | 3-7                   | 1
yN += xNm1 * b1;       | 3-7                   | 1
yN += xNm2 * b2;       | 3-7                   | 1
yN -= yNm1 * a1;       | 3-7                   | 1
yN -= yNm2 * a2;       | 3-7                   | 1
*y++ = yN;             | 2                     | 2
xNm2 = xNm1;           | 1                     | 1
xNm1 = xN;             | 1                     | 1
yNm2 = yNm1;           | 1                     | 1
yNm1 = yN;             | 1                     | 1
Decrement loop counter | 1                     | 1
Branch                 | 2                     | 2

Only looking at the inner loop, making these assumptions
Function operates on a block of samples
Coefficients b0, b1, b2, a1, and a2 are in registers
Previous states, x[n-1], x[n-2], y[n-1], and y[n-2] are in registers

Inner loop on Cortex-M3 takes 27-47 cycles per sample
Inner loop on Cortex-M4 takes 16 cycles per sample

The loop implements (feedback terms subtracted, matching the code in the table):

$$y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] - a_1 y[n-1] - a_2 y[n-2]$$

Further optimization strategies

Circular addressing alternatives

Loop unrolling

Caching of intermediate variables

Extensive use of SIMD and intrinsics


These will be illustrated by looking at a
Finite Impulse Response (FIR) Filter

35

FIR Filter

Occurs frequently in communications, audio, and video applications

$$y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k]$$

A filter of length N requires
N coefficients h[0], h[1], ..., h[N-1]
N state variables x[n], x[n-1], ..., x[n-(N-1)]
N multiply accumulates

Classic function in signal processing

[Figure: tapped delay line. x[n] passes through a chain of z^-1 delays giving x[n-1] ... x[n-4]; the taps are weighted by h[0] ... h[4] and summed into y[n]]

Circular Addressing

Data in the delay chain is right shifted every sample. This is very wasteful. How can we avoid this?
Circular addressing avoids this data movement

[Figure: coefficients h[N-1], h[N-2], ..., h[1], h[0] accessed through coeffPtr with linear addressing; states x[n], x[n-1], x[n-2], ..., x[n-(N-1)] accessed through statePtr with circular addressing]

FIR Filter Standard C Code

    /* state[] is the circular delay line (filtLen entries), defined at file scope */
    void fir(q31_t *in, q31_t *out, q31_t *coeffs, int *stateIndexPtr,
             int filtLen, int blockSize)
    {
        int sample;
        int k;
        q31_t sum;
        int stateIndex = *stateIndexPtr;

        for(sample=0; sample < blockSize; sample++)
        {
            state[stateIndex] = in[sample];     /* store the newest sample x[n] */
            sum=0;
            for(k=0;k<filtLen;k++)
            {
                sum += coeffs[k] * state[stateIndex];
                stateIndex--;
                if (stateIndex < 0)
                {
                    stateIndex = filtLen-1;     /* circular wrap */
                }
            }
            out[sample]=sum;
            /* the inner loop ends back on the newest sample: advance to the oldest slot */
            stateIndex++;
            if (stateIndex >= filtLen)
            {
                stateIndex = 0;
            }
        }
        *stateIndexPtr = stateIndex;
    }

Block based processing

Inner loop consists of:
Dual memory fetches
MAC
Pointer updates with circular addressing

FIR Filter DSP Code

32-bit DSP processor assembly code
Only the inner loop is shown, executes in a single cycle
Optimized assembly code, cannot be achieved in C
Zero overhead loop

    lcntr=r2, do FIRLoop until lce;
    FIRLoop: f12=f0*f4, f8=f8+f12, f4=dm(i1,m4), f0=pm(i12,m12);

The loop instruction performs in parallel:
Multiply and accumulate previous
Coeff fetch with linear addressing
State fetch with circular addressing

Cortex-M inner loop

    for(k=0;k<filtLen;k++)
    {
        sum += coeffs[k] * state[stateIndex];
        stateIndex--;
        if (stateIndex < 0)
        {
            stateIndex = filtLen-1;
        }
    }

Fetch coeffs[k]            2 cycles
Fetch state[stateIndex]    1 cycle
MAC                        1 cycle
stateIndex--               1 cycle
Circular wrap              4 cycles
Loop overhead              3 cycles
                           ---------
Total                      12 cycles

Even though the MAC executes in 1 cycle, there is overhead compared to a DSP.
How can this be improved on the Cortex-M4 ?

Circular addressing alternative

Create a circular buffer of length N + blockSize-1 and shift this once per block
Example. N = 6, blockSize = 4. Size of state buffer = 9.

[Figure, three animation steps: state buffer x[0] ... x[8] with the coefficient window h[5] ... h[0] sliding by one position per step, across the boundary between consecutive blocks (Block 1/Block 2, Block 2/Block 3, Block 3/Block 4)]

Cortex-M4 code with change

    for(k=0; k<filtLen; k++)
    {
        sum += coeffs[k] * state[stateIndex];
        stateIndex++;
    }

Fetch coeffs[k]            2 cycles
Fetch state[stateIndex]    1 cycle
MAC                        1 cycle
stateIndex++               1 cycle
Loop overhead              3 cycles
                           --------
Total                      8 cycles
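
The buffer management that makes this wrap-free inner loop possible can be sketched as below, using the slide's example sizes (N = 6, blockSize = 4, state buffer of 9). This is an illustrative float version under stated assumptions: the names `fir_state` and `fir_block` are hypothetical, and the coefficients are assumed to be stored time-reversed so that the inner loop can walk both arrays forward.

```c
#include <string.h>

#define NUM_TAPS   6
#define BLOCK_SIZE 4

/* NUM_TAPS-1 history samples followed by the current block (9 entries). */
static float fir_state[NUM_TAPS + BLOCK_SIZE - 1];

/* coeffs are time-reversed: coeffs[0] = h[N-1], ..., coeffs[N-1] = h[0]. */
static void fir_block(const float *in, float *out, const float *coeffs)
{
    /* Append the new block after the history samples. */
    memcpy(&fir_state[NUM_TAPS - 1], in, BLOCK_SIZE * sizeof(float));

    for (int n = 0; n < BLOCK_SIZE; n++) {
        const float *s = &fir_state[n];   /* window start for output n */
        float sum = 0.0f;
        for (int k = 0; k < NUM_TAPS; k++) {
            sum += coeffs[k] * s[k];      /* no wrap test in the inner loop */
        }
        out[n] = sum;
    }

    /* Shift once per block: keep the last NUM_TAPS-1 samples as history. */
    memmove(fir_state, &fir_state[BLOCK_SIZE],
            (NUM_TAPS - 1) * sizeof(float));
}
```

The cost of the data movement is paid once per block instead of once per tap, which is why the per-tap cycle count drops while the state buffer grows by blockSize-1 entries.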

Improvement in performance
DSP assembly code = 1 cycle
Cortex-M4 standard C code takes 12 cycles
Using circular addressing alternative = 8 cycles

33% better but still not comparable to the DSP
Let's try loop unrolling

Loop unrolling

This is an efficient language-independent optimization technique and
makes up for the lack of a zero overhead loop on the Cortex-M4

There is overhead inherent in every loop for checking the loop counter
and incrementing it for every iteration (3 cycles on the Cortex-M.)

Loop unrolling processes n loop indexes in one loop iteration,
reducing the overhead by n times.

Unroll Inner Loop by 4

    for(k=0; k<filtLen; k+=4)    /* filtLen assumed to be a multiple of 4 */
    {
        sum += coeffs[k]   * state[stateIndex];
        stateIndex++;
        sum += coeffs[k+1] * state[stateIndex];
        stateIndex++;
        sum += coeffs[k+2] * state[stateIndex];
        stateIndex++;
        sum += coeffs[k+3] * state[stateIndex];
        stateIndex++;
    }

Fetch coeffs[k]            2 x 4 = 8 cycles
Fetch state[stateIndex]    1 x 4 = 4 cycles
MAC                        1 x 4 = 4 cycles
stateIndex++               1 x 4 = 4 cycles
Loop overhead              3 x 1 = 3 cycles
                           ----------------
Total                      23 cycles for 4 taps
                           = 5.75 cycles per tap

Improvement in performance
DSP assembly code = 1 cycle
Cortex-M4 standard C code takes 12 cycles
Using circular addressing alternative = 8 cycles
After loop unrolling < 6 cycles

25% further improvement
But a large gap still exists
Let's try SIMD

Apply SIMD
Many image, video processing, and communications applications use
8- or 16-bit data types.
SIMD speeds these up
16-bit data yields a 2x speed improvement over 32-bit
8-bit data yields a 4x speed improvement

Access to SIMD is via compiler intrinsics
Example dual 16-bit MAC
SUM=__SMLALD(C, S, SUM)

[Figure: two 32-bit registers, each holding a high (H) and a low (L) 16-bit value; the products are accumulated into the 64-bit Sum]

CMSIS Files

50

Data organization with SIMD

16-bit example
Access two neighboring values using a single 32-bit memory read

[Figure: state buffer x[0] ... x[8] and coefficients h[5] ... h[0], each read as pairs of neighboring 16-bit values]

Inner Loop with 16-bit SIMD

    filtLen = filtLen >> 3;          // each iteration computes 8 MACs
    for(k = 0; k < filtLen; k++)
    {
        c = *coeffs++;               // 2 cycles
        s = *state++;                // 1 cycle
        sum = __SMLALD(c, s, sum);   // 1 cycle
        c = *coeffs++;               // 2 cycles
        s = *state++;                // 1 cycle
        sum = __SMLALD(c, s, sum);   // 1 cycle
        c = *coeffs++;               // 2 cycles
        s = *state++;                // 1 cycle
        sum = __SMLALD(c, s, sum);   // 1 cycle
        c = *coeffs++;               // 2 cycles
        s = *state++;                // 1 cycle
        sum = __SMLALD(c, s, sum);   // 1 cycle
    }                                // 3 cycles (loop overhead)

19 cycles total. Computes 8 MACs
2.375 cycles per filter tap

Improvement in performance
DSP assembly code = 1 cycle
Cortex-M4 standard C code takes 12 cycles
Using circular addressing alternative = 8 cycles
After loop unrolling < 6 cycles
After using SIMD instructions < 2.5 cycles
That's much better!
But is there anything more?
One more idea left

53

Caching Intermediate Values


FIR filter is extremely memory intensive. 12 out of 19 cycles in the
last code portion deal with memory accesses
2 consecutive loads take
4 cycles on Cortex-M3, 3 cycles on Cortex-M4

MAC takes
3-7 cycles on Cortex-M3, 1 cycle on Cortex-M4

When operating on a block of data, memory bandwidth can be


reduced by simultaneously computing multiple outputs and caching
several coefficients and state variables

54

Data Organization with Caching

statePtr++: increment by 16 bits
coeffsPtr++: increment by 32 bits

[Figure: state buffer x[0] ... x[8] read as overlapping 32-bit pairs x0, x1, x2, x3; coefficients h[5] ... h[0] read as 32-bit pairs c0]

Compute 4 Outputs Simultaneously:

    sum0 = __SMLALD(x0, c0, sum0)
    sum1 = __SMLALD(x1, c0, sum1)
    sum2 = __SMLALD(x2, c0, sum2)
    sum3 = __SMLALD(x3, c0, sum3)

Final FIR Code

    sample = blockSize/4;
    do
    {
        sum0 = sum1 = sum2 = sum3 = 0;
        statePtr = stateBasePtr;
        coeffPtr = (q31_t *)(S->coeffs);
        x0 = *(q31_t *)(statePtr++);          /* statePtr advances by 16 bits */
        x1 = *(q31_t *)(statePtr++);
        i = numTaps>>2;
        do
        {
            c0 = *(coeffPtr++);               /* first coefficient pair */
            x2 = *(q31_t *)(statePtr++);
            x3 = *(q31_t *)(statePtr++);
            sum0 = __SMLALD(x0, c0, sum0);
            sum1 = __SMLALD(x1, c0, sum1);
            sum2 = __SMLALD(x2, c0, sum2);
            sum3 = __SMLALD(x3, c0, sum3);
            c0 = *(coeffPtr++);               /* next coefficient pair */
            x0 = *(q31_t *)(statePtr++);
            x1 = *(q31_t *)(statePtr++);
            sum0 = __SMLALD(x2, c0, sum0);    /* states advance by two samples */
            sum1 = __SMLALD(x3, c0, sum1);
            sum2 = __SMLALD(x0, c0, sum2);
            sum3 = __SMLALD(x1, c0, sum3);
        } while(--i);
        *pDst++ = (q15_t) (sum0>>15);
        *pDst++ = (q15_t) (sum1>>15);
        *pDst++ = (q15_t) (sum2>>15);
        *pDst++ = (q15_t) (sum3>>15);
        stateBasePtr = stateBasePtr + 4;
    } while(--sample);

Uses loop unrolling, SIMD intrinsics, caching of states and coefficients,
and works around circular addressing by using a large state buffer.
Inner loop is 26 cycles for a total of 16 16-bit MACs.
Only 1.625 cycles per filter tap!

Cortex-M4 FIR performance


DSP assembly code = 1 cycle
Cortex-M4 standard C code takes 12 cycles
Using circular addressing alternative = 8 cycles
After loop unrolling < 6 cycles
After using SIMD instructions < 2.5 cycles
After caching intermediate values ~ 1.6 cycles

Cortex-M4 C code now comparable in performance

57

Summary of optimizations
Basic Cortex-M4 C code quite reasonable performance for simple
algorithms

Through simple optimizations, you can get to high performance on the


Cortex-M4

You DO NOT have to write Cortex-M4 assembly, all optimizations can


be done completely in C

58

Quick introduction to fixed point data format
Fixed point format can be integer, fractional or a mix of integer and
fractional.
Fixed point uses Qx.y notation
x: number of integer bits
y: number of fractional bits

Q2.13 denotes a fixed point data type with 2 bits for the integer part and
13 bits for the fractional part.
Fixed point formats used in the CMSIS DSP library are Q0.7 (Q7), Q0.15
(Q15) and Q0.31 (Q31)
Only fractional bits are used, to represent numbers between -1.0 and 1.0.
For Q15: $\text{Value} = -b_{15} + \sum_{i=1}^{15} b_{15-i} \cdot 2^{-i}$, where $b_{15}$ is the sign bit
Example: 0.25 is represented as 0x2000 in Q15 format.
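
The Q15 example can be checked with a small conversion sketch. The helpers `float_to_q15` and `q15_to_float` are hypothetical names written for this illustration (the CMSIS DSP library provides its own conversion functions); `q15_t` is assumed here to be a 16-bit signed integer.

```c
#include <stdint.h>

typedef int16_t q15_t;   /* assumption: 16-bit signed container for Q15 */

/* Convert a float in [-1.0, 1.0) to Q15, rounding to nearest and saturating. */
static q15_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;              /* 2^15 steps per unit */
    if (scaled >= 32767.0f) return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (q15_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a Q15 value back to float. */
static float q15_to_float(q15_t q)
{
    return (float)q / 32768.0f;
}
```

Here 0.25 maps to 0.25 * 32768 = 8192 = 0x2000, matching the example above. Note that +1.0 itself is not representable and saturates to 0x7FFF.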

59

Cortex-M4F benefits
Cortex-M4F benefits vs. Cortex-M3
Complex FFT 64 points (CFFT-64), Q15 data

                                             | Cortex-M3 | Cortex-M4F | Improvement
---------------------------------------------|-----------|------------|------------
(A) CFFT-64 code size (bytes)                | 3738      | 2034       | 1.8x
(B) CFFT-64 execution time (# cycles)        | 7374      | 3300       | 2.23x

Fixed point DSP examples


We will provide an overview on the new ARM CMSIS DSP library &
give example of performance of FIR (Finite impulse response) filtering
and FFT (Fast Fourier transform) with STM32F2, STM32F3 and
STM32F4.

FIR & FFT Examples


Benchmarking setup
Benchmarking results

61

FFT: Hardware setup & data flow

[Figure: hardware block diagram in four stages: Signal Generation, Signal Sampling, Signal Processing, Signal Output. Blocks shown: lookup table and buffers in SRAM, DMA1, DMA2, DAC2, TIM2, TIM6, ADC1, ADC3, potentiometer (update frequency, new scaling, 50Hz<F<Fs/2), CPU (DSP processing), LCD]

[Figure: data-flow timeline: transfer of lookup table, DAC start (start to generate sine wave), ADC1 start to sample data, DMA (circular mode) filling Buffer1 and Buffer2, convert to good input format, FFT processing of each buffer, start display input signal, start display FFT magnitude, potentiometer activated, DAC stop; stopped if key or joystick used]

FIR: Hardware setup & data flow

[Figure: hardware block diagram in four stages: Signal Generation, Signal Sampling, Signal Processing, Signal Output. Blocks shown: lookup table and buffers in SRAM, DMA1, DMA2, DAC1, DAC2, TIM2, TIM6, ADC1, ADC3, potentiometer (update frequency, new scaling, 50Hz<F<Fs/2), CPU (DSP processing), oscilloscope]

[Figure: data-flow timeline: transfer of lookup table, DAC start (start to generate sine wave), ADC1 start to sample data, DMA (circular mode) filling Buffer1 and Buffer2, convert to good input format, FIR processing of each buffer, start display input signal to the oscilloscope, start display FIR result to the oscilloscope (DAC1), potentiometer activated, DAC stop; stopped if key or joystick used]

FFT benchmark results

Setup
Input signal characteristics
Input signal: a sine wave with a frequency F, 50 Hz < F < 4 kHz
Sampling frequency: 8 kHz

Measurements done with the MDK-ARM (4.23.00.0) toolchain
Level 3 (-O3) for time optimization, without MicroLib

FFT benchmark results

Cortex-M3 vs Cortex-M4F
FFT 64-points and 1024-points average processing time at 0 WS (# cycles)

                      | F2 (30 MHz, 0 WS) | F3 (24 MHz, 0 WS) | Gain (F2 vs F3) | F4 (30 MHz, 0 WS)
----------------------|-------------------|-------------------|-----------------|------------------
FFT (64 points)   Q15 | 7374              | 3300              | x2.23           | 3307
FFT (64 points)   Q31 | 8022              | 6522              | x1.23           | 6410
FFT (1024 points) Q15 | 190028            | 80608             | x2.36           | 80252
FFT (1024 points) Q31 | 215505            | 158022            | x1.36           | 166406

FFT benchmarking results

F2/F3/F4
FFT 64-points and 1024-points average processing time (µs)

                      | F2 (120 MHz, 3 WS) | F3 (72 MHz, 2 WS) | F4 (168 MHz, 5 WS) | Gain F4/F3
----------------------|--------------------|-------------------|--------------------|-----------
FFT (64 points)   Q15 | 64.847             | 63.442            | 22.101             | x 2.9
FFT (64 points)   Q31 | 69.683             | 115.694           | 40.679             | x 2.8
FFT (1024 points) Q15 | 1600.067           | 1532.139          | 496.952            | x 3.08
FFT (1024 points) Q31 | 1825.642           | 2765.861          | 1021.208           | x 2.7

FIR benchmarking results

Setup
Filter & input signal characteristics
Filter type: Stop Band
Filter order: 165
Filter coefficients: 166
Cut-off frequency: FSTOP1 = 1.9 kHz, FSTOP2 = 2.1 kHz
Sampling frequency: 48 kHz
Number of samples: 128
Input signal: a sine wave with a frequency F, 50 Hz < F < 4 kHz

Measurements done with the MDK-ARM (4.23.00.0) toolchain
Level 3 (-O3) for time optimization, without MicroLib

FIR benchmarking results

F2/F3/F4
FIR average processing time (# cycles)

             | F2 (30 MHz, 0 WS) | F3 (24 MHz, 0 WS) | Gain (F2 vs F3) | F4 (30 MHz, 0 WS)
-------------|-------------------|-------------------|-----------------|------------------
FIR      Q15 | 167284            | 36339             | x4.60           | 36374
FIR      Q31 | 195537            | 99861             | x1.96           | 103745
Fast FIR Q15 | 90955             | 34916             | x2.60           | 35079
Fast FIR Q31 | 177917            | 44599             | x3.99           | 44736

FIR benchmarking results

F2/F3/F4
FIR average processing time (µs) and processing time per tap

             | F2 (120 MHz, 3 WS) | per tap | F3 (72 MHz, 2 WS) | per tap | F4 (168 MHz, 5 WS) | per tap
-------------|--------------------|---------|-------------------|---------|--------------------|--------
FIR      Q15 | 1396.99 µs         | 66 ns   | 605.81 µs         | 28.5 ns | 218.28 µs          | 10 ns
FIR      Q31 | 1636.66 µs         | 77 ns   | 1613.72 µs        | 76 ns   | 618.27 µs          | 29 ns
Fast FIR Q15 | 760.68 µs          | 36 ns   | 566.33 µs         | 26 ns   | 209.76 µs          | 10 ns
Fast FIR Q31 | 1488.29 µs         | 70 ns   | 907.73 µs         | 42 ns   | 267.46 µs          | 13 ns

CONTENTS
Cortex-M4F (DSP and Floating point Unit)
Cortex-M4 and DSP features
Floating point unit

70

Floating Point Unit

Overview
FPU : Floating Point Unit
Handles real number computation
Standardized by IEEE.754-2008

Number format
Arithmetic operations
Number conversion
Special values
4 rounding modes
5 exceptions and their handling

ARM Cortex-M FPU ISA


Supports
Add, subtract, multiply, divide
Multiply and accumulate
Square root operations

72

Floating Point Unit


Introduction
FPU usage
Historical perspective
Benefit of floating point arithmetic
Example & performances
Rounding issues

IEEE 754
ARM FPv4-SP Single Precision FPU

73

FPU usage
High level approach: matrix, mathematical equations
Meta language tools: Matlab, Scilab, etc.
C code generation: floating point numbers (float)

FPU
Direct mapping
No code modification
High performance
Optimal code efficiency

No FPU, usage of SW lib
No code modification
Low performance
Medium code efficiency

No FPU, usage of integer based format
Code modification
Corner case behavior to be checked (saturation, scaling)
Medium/high performance
Medium code efficiency

Historical perspective
Floating point has been a need for computers since
the beginning (Konrad Zuse - 1935)
But the complexity of implementation kept it out of use for
decades (IBM 704 - 1956)
Floating point units were implemented in mainframes with various
coding techniques depending on the manufacturer
The IBM PC was designed to have floating point capabilities through
optional arithmetic coprocessors (80x87 series)
The standardization of floating point coding was done in the 80s
through the IEEE 754 standard in 1985
The Intel 80387 was the first Intel coprocessor to implement the full
IEEE 754 standard in 1987

Benefits of a Floating-Point Unit

FPU handles real numbers (C float) without penalty
If no FPU
Need to emulate it by software
Need to rework the whole algorithm into a fixed point implementation to handle
scaling and saturation issues

FPU eases usage of high-level design tools (MatLab/Simulink)
Now part of the microcontroller development flow for advanced applications.
Deriving code directly using native floating point leads to:
quicker time to market (faster development)
easy code maintenance
more reliable application code, as no post modifications are needed (no critical
scaling operations to move to fixed point)

C language example
77

float function1(float number1, float number2)


{
float temp1, temp2;
temp1 = number1 + number2;
temp2 = number1/temp1;
return temp2;
}

Same code compiled on Cortex-M4F

Code compiled on Cortex-M3


# float function1()
# {
#
temp1 = number1 + number2;
MOVS
R1,R4
BL
__aeabi_fadd
MOVS
R1,R0
#
temp2 = number1/temp1;
MOVS
R0,R4
BL
__aeabi_fdiv
#
return temp2;
POP
{R4,PC}
# }

float function1()
# {
#
temp1 = number1 + number2;
VADD.F32 S1,S0,S1
#
temp2 = number1/temp1;
VDIV.F32 S0,S0,S1
#
#
return temp2;
BX
LR
# }

Call Soft-FPU (keils software library)

FPU assembly instructions

Binary library example


Library compiled for Cortex-M3
MOVS
BL
MOVS
MOVS
BL
POP

__aeabi_fadd on Cortex-M3
# __aeabi_fadd ()
TEQ
R0,R1
IT
MI
EORMI
R1,R1,#0x80000000
BMI.W
0x0800xxxx
SUBS
R2, R0, R1
ITT
CC
SUBCC
...
...

R1,R4
__aeabi_fadd
R1,R0
R0,R4
__aeabi_fdiv
{R4,PC}

__aeabi_fadd on Cortex-M4F
# __aeabi_fadd ()
VMOV
S0,R0
VMOV
S1,R1
VADD.F32 S0,S0,S1
VMOV
R0,S0
BX
LR

Reduced code size & Enhanced performances


Benefits of a Floating-Point Unit


Comparison for a 166 coefficient FIR on float 32 with and without FPU
(CMSIS library)
Improvement in code size (A)
Improvement in performance (B)

Cortex-M4F FPU Benefits

                                             Cortex-M3    Cortex-M4F    Improvement
(A) FIR float code size in bytes             1074         696           1.5x
(B) FIR float execution time (# cycles)      1593604      89136         17.8x

Best compromise: development time vs. performance

Cortex-M4 : Floating point unit Features


Single precision FPU
Conversion between
Integer numbers
Single precision floating point numbers
Half precision floating point numbers

Handling floating point exceptions (Untrapped)


Dedicated registers
32 single precision registers (S0-S31) which can be viewed as 16
Doubleword registers for load/store operations (D0-D15)
FPSCR for status & configuration


Rounding issues
The precision has some limits
Rounding errors can accumulate along the various operations and may
provide inaccurate results (do not do financial operations with floats)

Few examples
If you are working on two numbers with different exponents, the hardware
automatically denormalizes one of the two numbers to make the
calculation with the same exponent
If you are subtracting two numbers that are very close, you lose the
relative precision (also called cancellation error)

If you reorganize the various operations, you may not
obtain the same result because of the rounding errors
Value1 = ((2.0f - 1.99f) - 0.01f); /* Value1 = -9.313226E-9 */
Value2 = (2.0f - (1.99f + 0.01f)); /* Value2 = 0 */
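
The two expressions above can be tried on any IEEE 754 single precision implementation. The sketch below is a minimal illustration, not code from the training material: the function names are hypothetical, and the volatile temporaries are only there to force every intermediate result to be rounded to a 32-bit float.

```c
/* Evaluate (a - b) - c, rounding each intermediate to single precision. */
float sub_left_first(float a, float b, float c)
{
    volatile float t = a - b;   /* volatile forces a 32-bit float store */
    volatile float r = t - c;
    return r;
}

/* Evaluate a - (b + c) with the same operands. */
float sub_right_first(float a, float b, float c)
{
    volatile float t = b + c;   /* 1.99f + 0.01f rounds to exactly 2.0f */
    volatile float r = a - t;
    return r;
}
```

Called with (2.0f, 1.99f, 0.01f), the first function returns a small negative value of about -9.3E-9 and the second returns exactly 0, although both are the same expression on paper.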


IEEE 754

Floating Point Unit


Introduction
IEEE 754
Number format
Arithmetic operations
Number conversion
Special values
4 rounding modes
5 exceptions and their handling

ARM FPv4-SP Single Precision FPU


Number format
3 fields
Sign
Biased exponent (sum of an exponent plus a constant bias)
Fractions (or mantissa)

Single precision : 32-bit coding
    1-bit Sign | 8-bit Exponent | 23-bit Mantissa

Double precision : 64-bit coding
    1-bit Sign | 11-bit Exponent | 52-bit Mantissa

Number format
Half precision : 16-bit coding
    1-bit Sign | 5-bit Exponent | 10-bit Mantissa

Can also be used for storage in higher precision FPU


ARM has an alternative coding for Half precision


Normalized number value


Normalized number
Code a number as :
A sign + Fixed point number between 1.0 and 2.0 multiplied by $2^N$

Sign field (1-bit)
0 : positive
1 : negative

Single precision exponent field (8-bit)
Exponent range : 1 to 254 (0 and 255 reserved)
Bias : 127
Exponent - bias range : -126 to +127

Single precision fraction (or mantissa) (23-bit)
Fraction : value between 0 and 1 : $\sum_{i=1}^{23} N_i \cdot 2^{-i}$
The 23 $N_i$ values are stored in the fraction field

$(-1)^s \times \left(1 + \sum_{i=1}^{23} N_i \cdot 2^{-i}\right) \times 2^{exp - bias}$


Number value
Single precision coding of -7
Sign bit = 1
$7 = 1.75 \times 4 = (1 + \tfrac{1}{2} + \tfrac{1}{4}) \times 2^2 = (1 + 2^{-1} + 2^{-2}) \times 2^2$
Exponent = 2 + bias = 2 + 127 = 129 = 0b10000001
Mantissa = $2^{-1} + 2^{-2}$ = 0b11000000000000000000000

Result
Binary coding : 0b 1 10000001 11000000000000000000000
Hexadecimal value : 0xC0E00000
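
The coding of -7 can be checked on a host PC by splitting a float into its three fields. This is a minimal sketch with hypothetical helper names; it assumes the compiler stores float as IEEE 754 single precision, which holds for the ARM toolchains discussed here and for common desktop compilers.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit IEEE 754 pattern of a single precision value. */
uint32_t sp_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* well-defined type punning */
    return bits;
}

/* Field extraction: 1-bit sign, 8-bit biased exponent, 23-bit fraction. */
uint32_t sp_sign(uint32_t bits)     { return bits >> 31; }
uint32_t sp_exponent(uint32_t bits) { return (bits >> 23) & 0xFFu; }
uint32_t sp_fraction(uint32_t bits) { return bits & 0x7FFFFFu; }
```

For -7.0f, sp_bits() gives 0xC0E00000, with sign 1, biased exponent 129 and fraction 0x600000 (the two top fraction bits set).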


Special values
Denormalized (Exponent field all 0, Mantissa non 0)
Too small to be normalized (but some can be normalized afterward)
$(-1)^s \times \left(\sum_{i=1}^{23} N_i \cdot 2^{-i}\right) \times 2^{1 - bias}$

Infinity (Exponent field all 1, Mantissa all 0)


Signed
Created by an overflow or a division by 0
Can be used as an operand (e.g. x + oo = oo); invalid cases such as oo - oo give a NaN

Not a Number : NaN (Exponent field all 1, Mantissa non 0)

Quiet NaN : propagated through the next operations (ex: 0/0)
Signaling NaN : generates an error

Signed zero
Signed because of saturation


Summary of IEEE 754 number coding


Sign    Exponent      Mantissa      Number
0       0             0             +0
1       0             0             -0
0       Max           0             +oo
1       Max           0             -oo
-       Max           !=0 MSB=1     QNaN
-       Max           !=0 MSB=0     SNaN
-       0             !=0           Denormalized number
-       [1, Max-1]    -             Normalized number

("-" : any value)
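
The coding summary maps directly onto a small classification routine. The sketch below is illustrative only: the enum and function names are hypothetical, and it works on the raw 32-bit pattern of a single precision number.

```c
#include <stdint.h>

typedef enum {
    SP_ZERO, SP_DENORMAL, SP_NORMAL, SP_INFINITY, SP_QNAN, SP_SNAN
} sp_class_t;

/* Classify a single precision bit pattern following the summary table. */
sp_class_t sp_classify(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;   /* Max = 0xFF */
    uint32_t mantissa = bits & 0x7FFFFFu;

    if (exponent == 0u) {
        return (mantissa == 0u) ? SP_ZERO : SP_DENORMAL;
    }
    if (exponent == 0xFFu) {
        if (mantissa == 0u) {
            return SP_INFINITY;
        }
        /* The mantissa MSB separates quiet from signaling NaNs. */
        return (mantissa & 0x400000u) ? SP_QNAN : SP_SNAN;
    }
    return SP_NORMAL;
}
```

The sign bit is ignored on purpose: it only selects between +0 and -0, or between +oo and -oo.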


Floating-point rounding
Round to nearest
Default rounding mode
If the two nearest are equally near : select the one with the LSB
equal to 0

Directed rounding
3 user-selectable directed rounding modes
Round toward +oo, -oo or 0

Usage
Program through FPU configuration registers


Floating-point operations
Add
Subtract
Multiply
Divide
Remainder
Square root


Floating-point format conversion


Floating-point and Integer
Round-floating point number to integer value
Binary-Decimal
Comparison


Exceptions
Invalid operation
Resulting in a NaN

Division by zero
Overflow
The result depends on the rounding mode and can produce +/-oo
or the +/-Max value to be written in the destination register

Underflow
Write the denormalized number in the destination register

Inexact result
Caused by rounding


Exception handling
A TRAP can be requested by the user for any of the 5
exceptions, with a specific handler

The TRAP handler can return a value to be used
instead of the exceptional operation result

ARM Cortex-M FPU

Floating Point Unit


Introduction
IEEE 754
ARM FPv4-SP Single Precision FPU
Introduction
FPUv4-SP vs IEEE 754-2008
FP Status & Control Register
FPU instructions
Exception management
FPU programmers model


Introduction
Single precision FPU
Conversion between
Integer numbers
Single precision floating point numbers
Half precision floating point numbers

Handling floating point exceptions (Untrapped)

Dedicated registers
32 single precision registers (S0-S31) which can be viewed as 16
Doubleword registers for load/store operations (D0-D15)
FPSCR for status & configuration


Modifications vs IEEE 754


Full Compliance mode
Process all operations according to IEEE 754

Alternative Half-Precision format
An exponent field with all bits set encodes a normalized number,
$(-1)^s \times \left(1 + \sum_{i=1}^{10} N_i \cdot 2^{-i}\right) \times 2^{16}$, so infinities and NaNs are not supported

Flush-to-zero mode
De-normalized numbers are treated as zero
Associated flags for input and output flush

Default NaN mode
Any operation with a NaN as an input or that generates a NaN
returns the default NaN


Complete implementation
Cortex-M4F does NOT support all operations of IEEE
754-2008

Full implementation is done by software

Unsupported operations

Remainder
Round FP number to integer-value FP number
Binary to decimal conversions
Decimal to binary conversions
Direct comparison of SP and DP values


Floating-Point Status & Control Register


Condition code bits
negative, zero, carry and overflow (update on compare operations)

ARM special operating mode configuration


half-precision, default NaN and flush-to-zero mode

The rounding mode configuration


nearest, zero, plus infinity or minus infinity

The exception flags are routed to the interrupt controller
Masks enable or disable FPU interrupt generation for each exception
The Inexact result flag is masked by default
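
The mode and rounding fields of the FPSCR can be handled with plain bit operations. The sketch below assumes the ARMv7-M FPSCR layout (AHP = bit 26, DN = bit 25, FZ = bit 24, RMode = bits 23:22); the macro and function names are hypothetical and operate on a plain 32-bit value, so they can be exercised on a host as well.

```c
#include <stdint.h>

/* Assumed ARMv7-M FPSCR bit positions. */
#define FPSCR_AHP         (1u << 26)   /* alternative half-precision */
#define FPSCR_DN          (1u << 25)   /* default NaN mode           */
#define FPSCR_FZ          (1u << 24)   /* flush-to-zero mode         */
#define FPSCR_RMODE_POS   22u
#define FPSCR_RMODE_MASK  (3u << FPSCR_RMODE_POS)

typedef enum {
    FP_ROUND_NEAREST   = 0,
    FP_ROUND_PLUS_INF  = 1,
    FP_ROUND_MINUS_INF = 2,
    FP_ROUND_ZERO      = 3
} fp_round_t;

/* Return a copy of fpscr with the rounding mode field replaced. */
uint32_t fpscr_with_rounding(uint32_t fpscr, fp_round_t mode)
{
    return (fpscr & ~FPSCR_RMODE_MASK) | ((uint32_t)mode << FPSCR_RMODE_POS);
}

/* Extract the rounding mode currently encoded in fpscr. */
fp_round_t fpscr_rounding(uint32_t fpscr)
{
    return (fp_round_t)((fpscr & FPSCR_RMODE_MASK) >> FPSCR_RMODE_POS);
}

/* Non-zero when none of the ARM special operating modes is selected. */
int fpscr_special_modes_off(uint32_t fpscr)
{
    return (fpscr & (FPSCR_AHP | FPSCR_DN | FPSCR_FZ)) == 0u;
}
```

On the target, the value would typically be read with __get_FPSCR() and written back with __set_FPSCR(), for example __set_FPSCR(fpscr_with_rounding(__get_FPSCR(), FP_ROUND_ZERO)).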


FPU instructions


FPU arithmetic instructions


Operation          Description                          Assembler    Cycle
Absolute value     of float                             VABS.F32     1
Negate             float                                VNEG.F32     1
                   and multiply float                   VNMUL.F32    1
Addition           floating point                       VADD.F32     1
Subtract           float                                VSUB.F32     1
Multiply           float                                VMUL.F32     1
                   then accumulate float                VMLA.F32     3
                   then subtract float                  VMLS.F32     3
                   then accumulate then negate float    VNMLA.F32    3
                   then subtract then negate float      VNMLS.F32    3
Multiply (fused)   then accumulate float                VFMA.F32     3
                   then subtract float                  VFMS.F32     3
                   then accumulate then negate float    VFNMA.F32    3
                   then subtract then negate float      VFNMS.F32    3
Divide             float                                VDIV.F32     14
Square-root        of float                             VSQRT.F32    14

FPU Load/Store/Compare/Convert
Operation   Description                                           Assembler    Cycle
Load        multiple doubles (N doubles)                          VLDM.64      1+2*N
            multiple floats (N floats)                            VLDM.32      1+N
            single double                                         VLDR.64      3
            single float                                          VLDR.32      2
Store       multiple double registers (N doubles)                 VSTM.64      1+2*N
            multiple float registers (N floats)                   VSTM.32      1+N
            single double register                                VSTR.64      3
            single float register                                 VSTR.32      2
Move        top/bottom half of double to/from core register       VMOV         1
            immediate/float to float-register                     VMOV         1
            two floats/one double to/from core registers          VMOV         2
            one float to/from core register                       VMOV         1
            floating-point control/status to core register        VMRS         1
            core register to floating-point control/status        VMSR         1
Pop         double registers from stack                           VPOP.64      1+2*N
            float registers from stack                            VPOP.32      1+N
Push        double registers to stack                             VPUSH.64     1+2*N
            float registers to stack                              VPUSH.32     1+N
Compare     float with register or zero                           VCMP.F32     1
            float with register or zero                           VCMPE.F32    1
Convert     between integer, fixed-point, half precision and      VCVT.F32     1
            float

Exception management
No TRAP function : exception through interrupt
controller

FP register saving modes (when FPU is enabled)


No FP registers saving
Lazy saving/restoring (only space allocation in the stack)
Automatic FP registers saving/restoring

Stack frame
17 entries in the stack (FPSCR + S0-S15)


IEEE754 compliancy
The Cortex-M4 Floating Point Unit is IEEE754 compliant :
The rounding mode is selected in the FPSCR register (nearest even value by default)

Sign  Exponent  Mantissa     Compliant options                 Non compliant options
                             FZ=0 and AHP=0 and DN=0           FZ=1 or AHP=1 or DN=1
-     0         !=0          De-normalized number              Flush to zero
0     Max       0            +infinity                         Alternate Half Precision
1     Max       0            -infinity                         Alternate Half Precision
-     Max       !=0 MSB=1    QNaN (Quiet Not a Number)         Default NaN / Alternate Half Precision
-     Max       !=0 MSB=0    SNaN (Signaling Not a Number)     Default NaN / Alternate Half Precision

Some non compliant options are available in the FPSCR Register:
Flush to zero (FZ bit) :
de-normalized numbers are flushed to zero
Alternate Half Precision format (AHP bit):
special numbers (exp = all 1) = normalized numbers
Default NaN (DN bit):
Different way to handle the Not A Number values


STM32 - Floating point exceptions


The FPU supports the 5 IEEE754 exceptions +1
Invalid operation (IEEE754)
Underflow (IEEE754)
Division by zero (IEEE754)
Inexact (IEEE754)
Overflow (IEEE754)
Input denormal (Flush to zero mode only)

Comments
These flags are in the FPSCR register
When flush to zero mode is used:
the FPU adds a specific exception : input denormal
the FPU handles the Underflow and Inexact exceptions in a non-IEEE754 way

The exceptions are not trapped
This is compliant with IEEE754
The value returned by the instruction generating an exception is a default result.

Examples
1234 / 0 => division by zero flag is set / the returned value is +infinity
Sqrt(-1) => Invalid Operation flag is set / the returned value is QNaN

Note: For details on each exception as well as the default value returned when such an exception occurs,
please refer to the ARMv7-M Architecture Reference Manual
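
Because the exceptions are not trapped, software has to test and clear the cumulative flags itself. The sketch below assumes the ARMv7-M FPSCR flag positions (IOC = bit 0, DZC = bit 1, OFC = bit 2, UFC = bit 3, IXC = bit 4, IDC = bit 7); the macro and function names are hypothetical and work on a plain 32-bit copy of the register.

```c
#include <stdint.h>

/* Assumed ARMv7-M FPSCR cumulative exception flags. */
#define FPSCR_IOC  (1u << 0)   /* invalid operation */
#define FPSCR_DZC  (1u << 1)   /* division by zero  */
#define FPSCR_OFC  (1u << 2)   /* overflow          */
#define FPSCR_UFC  (1u << 3)   /* underflow         */
#define FPSCR_IXC  (1u << 4)   /* inexact           */
#define FPSCR_IDC  (1u << 7)   /* input denormal    */
#define FPSCR_EXC_MASK (FPSCR_IOC | FPSCR_DZC | FPSCR_OFC | \
                        FPSCR_UFC | FPSCR_IXC | FPSCR_IDC)

/* Return only the exception flags that are set in fpscr. */
uint32_t fpscr_pending(uint32_t fpscr)
{
    return fpscr & FPSCR_EXC_MASK;
}

/* Return a copy of fpscr with the selected exception flags cleared. */
uint32_t fpscr_clear(uint32_t fpscr, uint32_t flags)
{
    return fpscr & ~(flags & FPSCR_EXC_MASK);
}
```

On the target, a division by zero could then be acknowledged with __set_FPSCR(fpscr_clear(__get_FPSCR(), FPSCR_DZC)).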


FPU programmers model


Address       Name      Type   Description
0xE000EF34    FPCCR     RW     FP Context Control Register
0xE000EF38    FPCAR     RW     FP Context Address Register
0xE000EF3C    FPDSCR    RW     FP Default Status Control Register
0xE000EF40    MVFR0     RO     Media and VFP Feature Register 0
0xE000EF44    MVFR1     RO     Media and VFP Feature Register 1

Floating-Point Context Control Register
Indicates the context when the FP stack frame has been allocated
Context preservation setting

Floating-Point Context Address Register
Points to the stack location reserved for S0
Valid only for lazy context saving mode

Floating-Point Default Status Control Register
Details default values for Alternative half-precision mode, Default NaN mode, Flush to zero mode and Rounding
mode

Media & FP Feature Register 0 & 1
Details supported modes, instruction precision and additional hardware support

About the Stack Frame

There is a difference between the stack frame with or without FPU

Offset    Frame without FPU    Frame with FPU
          (Basic Frame)        (Extended Frame)
0x64                           Reserved
0x60                           FPSCR
0x5C                           S15
...                            ...
0x20                           S0
0x1C      xPSR                 xPSR
0x18      ReturnAddress        ReturnAddress
0x14      LR (R14)             LR (R14)
0x10      R12                  R12
0x0C      R3                   R3
0x08      R2                   R2
0x04      R1                   R1
0x00      R0                   R0

Note : the FPU registers S16 to S31 are not in the frame
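
The two frame layouts can be written down as C structures, which is convenient when a handler has to inspect stacked registers. This is a minimal sketch: the type and member names are hypothetical, and the offsets are the ones listed in the frame table.

```c
#include <stddef.h>
#include <stdint.h>

/* Basic frame: 8 words pushed on exception entry without FP context. */
typedef struct {
    uint32_t r0, r1, r2, r3;      /* 0x00 - 0x0C */
    uint32_t r12;                 /* 0x10 */
    uint32_t lr;                  /* 0x14 */
    uint32_t return_address;      /* 0x18 */
    uint32_t xpsr;                /* 0x1C */
} basic_frame_t;

/* Extended frame: basic frame followed by S0-S15, FPSCR and a reserved word. */
typedef struct {
    basic_frame_t core;           /* 0x00 - 0x1C */
    uint32_t s[16];               /* 0x20 - 0x5C : raw S0..S15 */
    uint32_t fpscr;               /* 0x60 */
    uint32_t reserved;            /* 0x64 */
} extended_frame_t;
```

The extended frame is 0x68 bytes (26 words) against 0x20 bytes (8 words) for the basic frame, which is the stack cost of preserving the FP context.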

About the Stack Frame

Depending on the Floating-Point Context Control Register configuration,
the core handles the stack in different ways

Frame area               ASPEN = 0      ASPEN = 1, LSPEN=1               ASPEN = 1, LSPEN=0
Reserved word            Not stacked    Reserved                         Reserved
FPSCR, S15 ... S0        Not stacked    Area reserved, but registers     Registers are pushed
                                        are not pushed automatically     automatically
xPSR, ReturnAddress,     Stacked        Stacked                          Stacked
LR (R14), R12, R3-R0
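
The three configurations can be decoded from the FPCCR value. The sketch below assumes the ARMv7-M bit positions ASPEN = bit 31 and LSPEN = bit 30; the enum and function names are hypothetical.

```c
#include <stdint.h>

#define FPCCR_ASPEN  (1u << 31)   /* automatic state preservation enable */
#define FPCCR_LSPEN  (1u << 30)   /* lazy state preservation enable      */

typedef enum { FP_SAVE_NONE, FP_SAVE_LAZY, FP_SAVE_AUTOMATIC } fp_save_mode_t;

/* Map an FPCCR value to one of the three FP register saving modes. */
fp_save_mode_t fp_save_mode(uint32_t fpccr)
{
    if ((fpccr & FPCCR_ASPEN) == 0u) {
        return FP_SAVE_NONE;              /* FP registers are never stacked */
    }
    return (fpccr & FPCCR_LSPEN) ? FP_SAVE_LAZY : FP_SAVE_AUTOMATIC;
}
```

With both bits set, the function reports the lazy mode, where space is reserved but the registers are only written when an FP instruction is executed in the handler.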

Lazy context save (default after reset)

In Lazy mode (ASPEN = 1, LSPEN=1), the FP context is not saved : the area for
FPSCR and S0-S15 is reserved in the frame but left unstacked, while xPSR,
ReturnAddress, LR (R14), R12 and R3-R0 are pushed as usual
This reduces the exception latency.
This keeps it simple for the user to push the values if needed

If a floating point instruction is needed when lazy context
save is active, the processor first :
Retrieves the address of the reserved area from the FPCAR register
Saves the FP state, S0-S15 and the FPSCR,
Sets the FPCCR.LSPACT bit to 0, to indicate that lazy state
preservation is no longer active,
It can then process the FPU instruction.

Enabling FPU exception interruption


Six exception flags (IDC, IXC, UFC, OFC, DZC, IOC) are ORed and
connected to the interrupt controller.
There is an individual mask to enable/disable the FPU interrupt for each
exception.
The FPU exception masks are set at product level : System configuration
controller configuration register 1 (SYSCFG_CFGR1).
FPU interrupt enable/disable is done at interrupt controller level.

Clearing FPU exception interruption flags

Clearing the FPU exception interruption source flags depends on the FPU
context save/restore configuration

FP registers save/restore mode : None
    How to clear : Interrupt source must be cleared in FP Status and
    Control Register (FPSCR).
    Comment : Using CMSIS functions: __get_FPSCR(), __set_FPSCR()

FP registers save/restore mode : Lazy
    How to clear : Interrupt source must be cleared in the stack.
    FPSCR register address : FPU->FPCAR + 0x40
    Comment : A dummy read access should be made to FP
    register to force context saving.

FP registers save/restore mode : Automatic
    How to clear : Interrupt source must be cleared in the stack.
    Comment : Check LR value to determine which stack was
    used to preserve context.

Note : Please refer to Cortex-M4F programming manual for more details

Exception entry & LR values

Depending on the CPU mode and configuration, context format &
destination stack vary.
LR register value gives details on which mode/configuration was active when entering
the exception.

LR Values      Return to (Mode)    Return Stack    Frame Type
0xFFFF_FFF1    Handler Mode        Main            Basic
0xFFFF_FFE1    Handler Mode        Main            Extended
0xFFFF_FFF9    Thread mode         Main            Basic
0xFFFF_FFE9    Thread mode         Main            Extended
0xFFFF_FFFD    Thread mode         Process         Basic
0xFFFF_FFED    Thread mode         Process         Extended
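
The LR values in the table differ only in bits 4, 3 and 2, so a handler can decode them with three masks. This is a minimal sketch with hypothetical type and function names; the bit meanings are read off the table itself.

```c
#include <stdint.h>

typedef struct {
    int extended_frame;   /* 1: FP context space is part of the frame */
    int thread_mode;      /* 1: return to Thread mode, 0: Handler mode */
    int process_stack;    /* 1: frame is on the Process stack, 0: Main */
} exc_return_info_t;

/* Decode the LR (EXC_RETURN) value seen on exception entry. */
exc_return_info_t decode_exc_return(uint32_t lr)
{
    exc_return_info_t info;

    info.extended_frame = (lr & (1u << 4)) == 0u;   /* bit 4 clear = extended */
    info.thread_mode    = (lr & (1u << 3)) != 0u;
    info.process_stack  = (lr & (1u << 2)) != 0u;
    return info;
}
```

For example, 0xFFFFFFED decodes as an extended frame, Thread mode, Process stack, which tells the handler where the stacked FPSCR has to be looked for.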

FPU exception interruption benefits


Boost the priority of FPU exception handler (via NVIC software
interruption priorities configuration)
Optimize overall performance
Example handling Divide-by-Zero Exception:

With polling

    float x = 2.5f;
    for(index = 0; index < 0xFFFF; index++)
    {
        x = 1.0f/(x*x);
        if(__get_FPSCR() & 0x00000002)   /* DZC flag */
        {
            DivZeroExc_Handler();
        }
    }

With Divide-by-Zero Interruption

    float x = 2.5f;
    SYSCFG_ITConfig(SYSCFG_IT_DZC, ENABLE);
    for(index = 0; index < 0xFFFF; index++)
    {
        x = 1.0f/(x*x);
    }
    SYSCFG_ITConfig(SYSCFG_IT_DZC, DISABLE);

    void FPU_IRQHandler(void)
    {
        DivZeroExc_Handler();
    }

What can reduce FPU performances?


Accidentally used double precision data/functions
The compiler will use the double precision software library instead of the single
precision hardware FPU
Explicitly cast your constant data to float type (or use the f suffix).

Compiler/library settings (e.g. hard VFP vs soft VFP)

Bad instruction scheduling (when writing in assembly)
Pipelining instruction execution between the Cortex-M4 core and the FPU co-processor
improves overall performance.
Basic example : a float division (VDIV, 14 cycles) can hide load (LDR, LDM, ...)
overhead.
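
The first pitfall is easy to hit with a plain constant. The sketch below uses hypothetical function names: in the first function the constant 0.1 has type double, so the C promotion rules make the whole multiplication a double precision operation, while the second one stays in single precision.

```c
/* 0.1 is a double constant: x is promoted and the product is a double,
   which a single precision FPU cannot compute in hardware. */
float scale_with_double_constant(float x)
{
    return (float)(x * 0.1);
}

/* 0.1f is a float constant: the product stays in single precision. */
float scale_with_float_constant(float x)
{
    return x * 0.1f;
}
```

The difference is visible from the expression types alone: sizeof(x * 0.1) is the size of a double, sizeof(x * 0.1f) the size of a float. The same applies to math functions, where the f-suffixed variants (sinf, sqrtf, ...) keep the computation in single precision.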


Floating point DSP examples


Comparison of DSP (ARM CMSIS DSP library) algorithm execution
time on Cortex-M4F:
with and without FPU
version using FPU instructions vs. version using DSP instructions

FFT benchmarking results


Setup
Input signal characteristics
    Input signal          a sine wave with a frequency F, 50 Hz < F < 4 kHz
    Sampling frequency    8 kHz

Measures done with MDK-ARM (4.23.00.0) toolchain
Level 3 (-O3) for time optimization, without MicroLib

FFT benchmark results

Cortex-M3 vs Cortex-M4F (2/2)

[Charts: FFT 64-points and FFT 1024-points average processing time at 0 WS (# cycles), for Q15, Q31 and Float on F2 (30 MHz, 0 WS), F3 (24 MHz, 0 WS) and F4 (30 MHz, 0 WS)]

                         F2 (30 MHz, 0 WS)   F3 (24 MHz, 0 WS)   F4 (30 MHz, 0 WS)   Gain
                         (# cycles)          (# cycles)          (# cycles)          (F2 vs F3)
FFT (64-points)   Q15    7374                3300                3307                x2.23
                  Q31    8022                6522                6410                x1.23
                  Float  52763               4725                4793                x11.17
FFT (1024-points) Q15    190028              80608               80252               x2.36
                  Q31    215505              158022              166406              x1.36
                  Float  1544676             116576              118633              x13.25

FIR benchmarking results


Setup
Filter & input signal characteristics
    Filter type            Stop Band
    Filter order           165
    Filter coefficients    166
    Cut-off frequency      FSTOP1 = 1.9 kHz, FSTOP2 = 2.1 kHz
    Sampling frequency     48 kHz
    Number of samples      128
    Input signal           a sine wave with a frequency F, 50 Hz < F < 4 kHz

Measures done with MDK-ARM (4.23.00.0) toolchain
Level 3 (-O3) for time optimization, without MicroLib

FIR benchmarking results

F2/F3/F4

[Chart: FIR average processing time (µs) for FIR Q15, FIR Q31, FIR Float, Fast FIR Q15 and Fast FIR Q31 on F2 (120 MHz, 3 WS), F3 (72 MHz, 2 WS) and F4 (168 MHz, 5 WS)]

                 F2 (120 MHz, 3 WS)          F3 (72 MHz, 2 WS)           F4 (168 MHz, 5 WS)
                 Processing     Per tap      Processing     Per tap      Processing     Per tap
                 time                        time                        time
FIR Q15          1396.992 µs    66 ns        605.819 µs     28.5 ns      218.280 µs     10 ns
FIR Q31          1636.658 µs    77 ns        1613.722 µs    76 ns        618.273 µs     29 ns
FIR Float        14782.510 µs   696 ns       1338.528 µs    63 ns        531.160 µs     25 ns
Fast FIR Q15     760.675 µs     36 ns        566.333 µs     26 ns        209.761 µs     10 ns
Fast FIR Q31     1488.292 µs    70 ns        907.736 µs     42 ns        267.464 µs     13 ns

Summary
FPU is a key benefit for many application tasks that require precision
(to name just a few) :
loop control,
audio processing,
sensor signal conditioning,
motor control,
digital filtering,

FPU gives developers unparalleled flexibility:
Applications that apply mathematical models to data benefit from the FPU,
as math formulas are translated directly to C code with no pain
Code is easier to maintain compared to fixed point.

Floating point numbers (Half precision) have a larger dynamic range than
fixed-point

Cortex-M4 Hands-on

Preliminary: The purpose of this hands-on is to get familiar
with the ARM CMSIS DSP library and use one algorithm (FIR Filter) on
STM32F3xx devices.

Example Using FIR Filter on STM32F3xx devices

The aim of this LAB is to use the Using_STM32_In_DSP_Application
firmware example to perform :
Signal generation,
Signal sampling,
Signal processing (signal filtering),
Signal output.

Complete Signal_FIR_Processing.c file to :
Initialize FIR Filter
Apply FIR Filter on block of data
Use Scilab tool to generate new coefficients

[Diagram: ADC -> Low Pass FIR Filter (H) -> DAC]

Filter design
On Scilab console write:
wfir to get access to the Scilab filter designer wizard
Choose filter type : low pass filter.
Set filter characteristics
Cut-off frequency : 1 kHz (Sampling Frequency = 48 kHz)
Filter length : 64

Set input window type : Rectangular

Scale the generated coefficients to Q15 data type
coefficient = ans * 32768;

Set the number of coefficients
N=64

Write the coefficients to file:

fd = mopen('/fir_coefficient.txt','wt');
for i=1:N
mfprintf(fd,'%d, ',coefficient(i));
end;
mclose(fd);
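
The Q15 scaling done in Scilab (multiplication by 32768) can also be done on the C side. The sketch below is an illustration with a hypothetical function name; unlike the plain multiplication, it rounds to the nearest integer and saturates, because +1.0 has no exact Q15 representation.

```c
#include <stdint.h>

/* Convert a value in [-1.0, 1.0) to Q15 (1 sign bit, 15 fractional bits),
   with rounding to nearest and saturation at the Q15 limits. */
int16_t float_to_q15(float value)
{
    float scaled = value * 32768.0f;

    if (scaled >= 32767.0f) {
        return INT16_MAX;          /* saturate: +1.0 is not representable */
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    /* Round half away from zero before truncating to an integer. */
    return (int16_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}
```

For example, 0.5 maps to 16384 and -1.0 to -32768, while 1.0 saturates to 32767.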


Firmware update
Copy the coefficients to Signal_FIR_Processing.c
Set NUM_TAPS to the number of coefficients you have generated
Initialize FIR Filter
Call the FIR filter initialization function for Q15 data format from the ARM CMSIS DSP lib
(function to modify: FIR_PROCESSING_Q15Init)

Apply FIR Filter on block of data
Call the FIR filter processing function for Q15 data format from the ARM CMSIS DSP lib
(function to modify: FIR_PROCESSING_Q15Process)

CMSIS DSP library documentation :
Libraries\CMSIS\Documentation\DSP_Lib\html\index.html
Modules > Filtering Functions > Finite Impulse Response (FIR) Filters
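
To check the generated coefficients before running on the board, a portable reference filter is handy. The sketch below is not the CMSIS implementation (on the target the lab uses the library's Q15 FIR functions, arm_fir_init_q15 and arm_fir_q15); it is a plain C stand-in with a hypothetical name and a simplified state buffer, assuming at least one tap and an arithmetic right shift for negative values.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference Q15 FIR. state holds the num_taps most recent samples
   (newest first) and must be zeroed by the caller before the first block. */
void fir_q15_reference(const int16_t *coeffs, size_t num_taps, int16_t *state,
                       const int16_t *src, int16_t *dst, size_t block_size)
{
    for (size_t n = 0; n < block_size; n++) {
        /* Shift the delay line and insert the new sample. */
        for (size_t k = num_taps - 1; k > 0; k--) {
            state[k] = state[k - 1];
        }
        state[0] = src[n];

        /* Accumulate Q15 x Q15 products (Q30) in a 64-bit accumulator. */
        int64_t acc = 0;
        for (size_t k = 0; k < num_taps; k++) {
            acc += (int32_t)coeffs[k] * (int32_t)state[k];
        }

        /* Back to Q15, then saturate to the 16-bit range. */
        acc >>= 15;
        if (acc > INT16_MAX) acc = INT16_MAX;
        if (acc < INT16_MIN) acc = INT16_MIN;
        dst[n] = (int16_t)acc;
    }
}
```

With two taps of 0.5 (16384 in Q15) and a constant input of 1000, the output is 500 for the first sample and 1000 afterwards, as expected from a two-point moving average.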


STM32F3xx Block Diagrams




STM32F30x Series

[Block diagram: Cortex-M4F CPU (max 72 MHz) with nested vectored interrupt controller, SW debug and 1 x SysTick timer; ARM Lite Hi-Speed Bus matrix / arbiter (max 72 MHz); Flash interface with 256 kB Flash memory w/ ROP level 2 protection; 40 KB SRAM + 8 KB CCM RAM; 2 x DMA, 12 channels; reset and clock control with HSI 8 MHz 1%, PLL, XTAL 4~32 MHz, XTAL 32 kHz and LSI 32 kHz; power supply with POR/PDR/PVD and STBY/VBAT; RTC with 16 backup registers (64 bytes); ARM Peripheral Buses (one max 72 MHz, one max 36 MHz) connecting the timers, analog and communication peripherals; fast I/O interface with up to 36 external interrupts]

ARM 32-bit Cortex-M4F CPU
Operating Voltage:
    VDD = 2.0 V to 3.6 V or 1.8 V +/- 8%
    VBAT = 1.65 V to 3.6 V
Safe Reset System (Integrated Power On Reset
(POR)/Power Down Reset (PDR) +
Programmable voltage detector (PVD))
Embedded Memories:
    FLASH: up to 256 KB
    SRAM: up to 40 KB SRAM + 8 KB CCM RAM
CRC calculation unit
2 x DMA: 12 channels
Power Supply with software configurable
internal regulator and low power modes.
Low Power Modes with Auto Wake-up
Low power calendar RTC with 64 bytes of
backup registers
Up to 72 MHz frequency managed & monitored
by the Clock Control w/ Clock Security System
Rich set of peripherals & IOs
    2 x 12-bit DAC channels with output buffer
    7 general purpose comparators (Window
    mode and wakeup from low-power mode)
    4 operational amplifiers
    Dual Watchdog Architecture
    13 Timers w/ advanced control features
    (including 1 Cortex SysTick timer and 2
    WDGs timers)
    14 communication Interfaces
    Up to 87 fast I/Os all mappable on external
    interrupts/event
    4 x 12-bit 5 Msps ADC w/ up to 39 external
    channels + Temperature sensor/ voltage
    reference/VBAT measurement
STM32F37x Series

[Block diagram: Cortex-M4F CPU (max 72 MHz) with nested vectored interrupt controller, SW debug and 1 x SysTick timer; ARM Lite Hi-Speed Bus matrix / arbiter (max 72 MHz); Flash interface with 256 KB Flash memory w/ ROP level 2 protection; up to 32 KB SRAM; 2 x DMA, 12 channels; reset and clock control with HSI 8 MHz 1%, PLL, XTAL 4~32 MHz, XTAL 32 kHz and LSI 32 kHz; power supply with POR/PDR/PVD and STBY/VBAT; RTC with backup registers (128 bytes); ARM Peripheral Buses (one max 72 MHz, one max 36 MHz) connecting the timers, analog and communication peripherals; fast I/O interface with up to 29 external interrupts]

ARM 32-bit Cortex-M4F CPU
Operating Voltage:
    VDD = 2.0 V to 3.6 V or 1.8 V +/- 8%
    VBAT = 1.65 V to 3.6 V
Safe Reset System (Integrated Power On
Reset (POR)/Power Down Reset (PDR) +
Programmable voltage detector (PVD))
Embedded Memories:
    FLASH: up to 256 Kbytes
    SRAM: up to 32 Kbytes
CRC calculation unit
2 x DMA: 12 Channels
Power Supply with software configurable
internal regulator and low power modes.
Low Power Modes with Auto Wake-up
Low power calendar RTC with 128 bytes of
backup registers
Up to 72 MHz frequency managed &
monitored by the Clock Control w/ Clock
Security System
Rich set of peripherals & IOs
    3 x 12-bit DAC channels with output
    buffer
    2 general purpose comparators
    (Window mode and wakeup from low-power mode)
    Dual Watchdog Architecture
    17 Timers (including Cortex SysTick and
    WDGs)
    14 communication Interfaces
    Up to 84 fast I/Os all mappable on
    external interrupts/event
    1 x 12-bit SAR ADC w/ up to 18
    external channels.
    3 x 16-bit Sigma-Delta ADC with
    conversion speed up to 50 ksps and up
    to 19 single/ 10 diff channels

Memory and System Architecture

System Architecture

In STM32F30x
Five masters:
    Cortex-M4F core I-bus
    Cortex-M4F core D-bus
    Cortex-M4F core S-bus
    GP-DMA1 and GP-DMA2 (general-purpose DMAs)
Seven slaves:
    Internal SRAM on the Dcode
    Internal SRAM on the ICode (CCM RAM)
    Internal Flash memory
    AHB to APBx (APB1 or APB2), which connect all the APB peripherals
    AHB dedicated to GPIO ports
    ADCs 1,2,3 and 4.

In STM32F37x
Five masters:
    Cortex-M4F core I-bus
    Cortex-M4F core D-bus
    Cortex-M4F core S-bus
    GP-DMA1 and GP-DMA2 (general-purpose DMAs)
Five slaves:
    Internal SRAM
    Internal Flash memory
    AHB to APBx (APB1 or APB2), which connect all the APB peripherals
    AHB dedicated to GPIO ports


Memory Mapping and Boot Modes

Addressable memory space of 4 Gbytes
FLASH : up to 256 Kbytes
RAM:
    Up to 40 (F30x) and 32 (F37x) Kbytes SRAM with HW parity check (*)
    Up to 8 Kbytes CCM RAM with HW parity check (STM32F30x only)
    4 bits per word for parity check
    (*) In STM32F30x, only the first 16 Kbytes support the hardware parity check.

[Memory map, boundaries as they appear on the slide from top to bottom: 0xFFFF FFFF; Reserved; 0xE010 0000; Cortex-M4 internal peripherals; 0xE000 0000; Reserved; 0x4800 17FF; Peripherals; 0x4000 0000; Reserved; SRAM at 0x2000 0000; 0x1FFF FFFF; Reserved; 0x1FFF F80C; Option Bytes; 0x1FFF F800; System Memory; 0x1FFF EC00; Reserved; 0x0804 0000; Flash; 0x0800 0000; Reserved; 0x0001 0000; CODE (memory type depending on boot configuration); 0x0000 0000]

Boot modes
The boot configuration is defined with BOOT0 pin and BOOT1 bit in
USER Option Byte.
Depending on the Boot configuration, Embedded Flash memory,
System memory or Embedded SRAM memory is aliased at @0x00
thanks to memory remapping bits in SYSCFG registers.
Even when aliased, these memories are still accessible from their
original memory space.

Boot Mode        Aliasing
User Flash       User Flash is selected as boot space
System memory    System memory is selected as boot space
Embedded SRAM    Embedded SRAM is selected as boot space

System memory : contains the Bootloader used to re-program
the FLASH through USART or USB.

Boot from SRAM : In the application initialization code you have
to relocate the Vector Table in SRAM using the NVIC Exception
Table and Offset register.

Embedded Flash Memory

Flash Features Overview


Flash general features:

Up to 256 KBytes
128 pages of 2 KBytes size
Access time: 35 ns
Half word (16-bit) program time: 52.5 µs (Typ)
Page erase time and Mass erase time: 20 ms (Min), 40 ms (Max)

Flash interface features:

Read Interface with pre-fetch buffer


Option Bytes loader
Flash program/erase operations
Types of Protection:
Readout protection: Level 0, Level 1 and Level 2
Write Protection

Flash Memory Organization


Main memory block containing 128 pages of 2 Kbytes each.
Information block containing the system memory and the option
bytes, divided into two parts:
System Memory
    8 KB size
    contains the bootloader which is used to reprogram the Flash memory through USART1,
    USART2 or USB.
    used to boot the device in System memory boot mode.
    programmed by ST when the device is manufactured, and protected against unwanted
    write/erase operations.
8 Option bytes : can be read from the memory location starting from 0x1FFFF800 or from
the Option byte register (FLASH_OBR) in the Flash memory interface register area.
    4 for write protection
    1 for read protection
    1 for device configuration:
        IWDG HW/SW mode
        Reset when entering STANDBY mode
        Reset when entering STOP mode
        VDDA supervisor
        BOOT1
        SRAM parity check
    2 For User Data (To store Security IDs, etc.)

Flash Operations (1/2)


The embedded Flash memory can be programmed using
in-circuit programming (ICP) or in-application programming (IAP).

The Flash program and erase operations are handled by the


Flash program and erase controller (FPEC).
After reset, the FPEC is protected against unwanted write or erase operations.
The FLASH_CR register is not accessible in write mode. An unlocking sequence of
KEYs should be written to the FLASH_KEYR register to open the access to the
FLASH_CR register.
The Main Flash can be programmed by writing a half-word (16-bits) at a
time. Any attempt to write data that are not half-word long will result in a bus error
generating a Hard Fault interrupt.
The Main Flash can be erased page-wise or completely (Mass Erase)
I-bus stalled during program/erase : the Flash cannot be accessed during
these operations, and software has to wait until the BSY bit is reset in the
FLASH_SR register to perform the next operation.

Flash Operations (2/2)


The 8 option bytes are programmed differently from main Flash:
After unlocking the Flash access, authorize the programming of option bytes by
writing same set of KEYS to FLASH_OPTKEYR register to set the OPTWRE bit in
the FLASH_CR register.
Then set the OPTPG bit in the FLASH_CR register.
When the Flash memory read protection option is changed from protected to
unprotected, a Mass Erase of the main Flash memory is performed.
On POR reset, the option bytes loader performs a read of option bytes and stores
the data into the FLASH registers (when programmed, the option bytes are taken
into account only after POR reset). User can also use the FORCE_OPTLOAD bit
from FLASH_CR register to initiate the option bytes loader (generating SYSTEM
reset).

The Read access can be performed with the following


configuration options:
Latency: number of wait states for a correct read operation (from 0 to
2).
Prefetch buffer of 2x64bit: For faster CPU execution, can be enabled and
disabled on the fly.
Half Cycle: Flash access can be made on a half cycle of the HCLK to
reduce power consumption, enabled by software.
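
The latency setting has to follow the HCLK frequency. The sketch below uses a hypothetical helper; the 24 MHz and 48 MHz thresholds are an assumption taken from the usual STM32F3 reference manual recommendation, which is consistent with the benchmark settings shown earlier (24 MHz at 0 WS, 72 MHz at 2 WS) but should be checked against the reference manual of the exact device.

```c
#include <stdint.h>

/* Return the number of Flash wait states for a given HCLK frequency,
   or -1 if the frequency is above the 72 MHz maximum.
   Thresholds are assumed: 0 WS up to 24 MHz, 1 WS up to 48 MHz, 2 WS up to 72 MHz. */
int flash_wait_states(uint32_t hclk_hz)
{
    if (hclk_hz <= 24000000u) {
        return 0;
    }
    if (hclk_hz <= 48000000u) {
        return 1;
    }
    if (hclk_hz <= 72000000u) {
        return 2;
    }
    return -1;
}
```

The returned value is what would be programmed in the latency field of the Flash access control register before switching the system clock to the higher frequency.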


Flash memory prefetch controller

Mission: Support 72 MHz operation directly from Flash memory

64-bits wide Flash with Prefetch (2 x 64bits buffers).

[Block diagram: the Cortex-M4 CPU Instructions-BUS (16-bit Thumb and 16/32-bit Thumb-2 instructions) and Data/Debug-BUS (8-bit, 16-bit and 32-bit data) go through an arbiter to the memory accelerator (2 x 64-bit prefetch buffers), which reads the 64-bit wide Flash memory array]

Flash error/status flags and interrupts


The Flash program and erase controller provides error and status flags
with possible interrupts:
WRPRTERR (write protection error flag): Set by hardware when an address to be
erased/programmed belongs to a write-protected part of the Flash memory.
PGERR (programming error flag): Set by hardware when the address to be programmed contains a
value different from 0xFFFF before programming.
EOP (end of operation): Set by hardware when a Flash or option byte operation
(program/erase) is completed.
BSY (busy): Set while a write/erase operation is in progress.

Interrupt event       Event flag          Enable control bit
End of operation      EOP                 EOPIE
Error                 WRPRTERR, PGERR     ERRIE
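
The flags above can be decoded from a status register value as sketched below. The bit positions are an assumption based on the usual FLASH_SR layout of this family, and the enum and function names are hypothetical; this is a host-side model, not driver code.

```c
#include <stdint.h>

/* Assumed FLASH_SR bit positions (verify against the reference manual). */
#define SR_BSY      (1u << 0)
#define SR_PGERR    (1u << 2)
#define SR_WRPRTERR (1u << 4)
#define SR_EOP      (1u << 5)

typedef enum {
    FLASH_OP_IDLE,        /* no flag set                         */
    FLASH_OP_BUSY,        /* BSY: operation still in progress    */
    FLASH_OP_DONE,        /* EOP: operation completed            */
    FLASH_OP_PROG_ERROR,  /* PGERR: target was not erased        */
    FLASH_OP_WRP_ERROR    /* WRPRTERR: target is write-protected */
} flash_op_status;

/* Hypothetical helper: BSY is checked first, then errors, then EOP. */
static flash_op_status flash_decode_status(uint32_t sr)
{
    if (sr & SR_BSY)      return FLASH_OP_BUSY;
    if (sr & SR_WRPRTERR) return FLASH_OP_WRP_ERROR;
    if (sr & SR_PGERR)    return FLASH_OP_PROG_ERROR;
    if (sr & SR_EOP)      return FLASH_OP_DONE;
    return FLASH_OP_IDLE;
}
```

Checking errors before EOP is a design choice of this sketch, so that a failed operation is never reported as completed.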

Flash protections (1/6)


Two kinds of protection are available:
Write protection to avoid unwanted writes
Readout protection to avoid piracy: Level 0, Level 1 and Level 2 (No debug)
Both are activated by setting option bytes

Write protection
The write protection is implemented with a granularity of 2 pages (4 KB) at a time
4 option bytes are used to protect all of the 256 KB main Flash program memory
Any programming or erase of a protected page is discarded and the Flash returns a
protection error flag in the FLASH_SR status register
Un-protection
Erase the corresponding bit in the WRPx option bytes, x = 0..3.
Reset the device (POR reset) or set the FORCE_OPTLOAD bit to reload the option bytes and
disable the write protection.

The write protection bit values are also visible through the FLASH_WRPR write protection
register.

Flash Protections (2/6)

Read protection

The read protection is activated by setting the RDP option byte and then applying a
POR reset or using the FORCE_OPTLOAD bit of the FLASH_CR register to reload the
new RDP option byte.
Three levels of protection, from no protection (Level 0) to maximum protection (Level
2, or No debug)

RDP byte value               RDP complement value                                       Read protection level
0xAA                         0x55                                                       Level 0
Any value but 0xAA or 0xCC   Any value (not necessarily complement) but 0x55 and 0x33   Level 1
0xCC                         0x33                                                       Level 2 (No debug)
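
The RDP byte decoding can be summarised in a few lines of C. This is a minimal sketch with a hypothetical function name; it only looks at the RDP byte, not at its complement.

```c
#include <stdint.h>

/* Hypothetical helper: maps the RDP option byte to the read protection
   level (0, 1 or 2) following the RDP value table. */
static int rdp_level(uint8_t rdp)
{
    if (rdp == 0xAAu) return 0;  /* Level 0: no protection         */
    if (rdp == 0xCCu) return 2;  /* Level 2: no debug, irreversible */
    return 1;                    /* any other value: Level 1        */
}
```

Note that Level 1 is the fallback: any RDP value other than the two magic values protects the device.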

Readout protection Level 0

No read protection

All operations (if no write protection is set) from/to the Flash, option byte or the
RTC Backup registers are possible in all boot configurations (Flash user boot,
boot RAM, boot loader or debug).

Flash Protections (3/6)


Readout protection Level 1


When this protection is enabled:

User mode: Code executing in user mode can access main Flash memory and
option bytes with all operations.

Debug, boot RAM and boot loader modes: The main Flash memory and
backup registers (RTC_BKPxR in RTC) are totally inaccessible in these modes. A
simple read access generates a bus error and a Hard Fault interrupt. Any
attempted program/erase operation sets the PGERR flag.

Un-protection:

When the RDP is reprogrammed to the value 0xAA to move back to Level 0, a
mass erase of the main Flash memory is performed and the backup registers
(RTC_BKPxR in RTC) are reset.

Flash Protections (4/6)

Readout protection Level 2 (No debug)

When this protection is enabled:

All protections provided by Level 1 are active.
Boot from RAM, boot from system memory and all debug features (serial wire) are disabled.
Option bytes can only be changed in user mode, and only partially: the RDP
option byte cannot be programmed or erased, and the other option bytes can only be
programmed (not erased).

Un-protection:
Not possible: Level 2 cannot be removed at all; it is an irreversible operation.

Flash Protections (5/6)


[Figure: RDP level transitions]
Level 0 (RDP = 0xAA)
  stays at Level 0: RDP = 0xAA, other option(s) modified
  to Level 1: write options including RDP != 0xAA and RDP != 0xCC
  to Level 2: write options including RDP = 0xCC
Level 1 (RDP != 0xAA and RDP != 0xCC)
  stays at Level 1: RDP != 0xAA and RDP != 0xCC, other option(s) modified
  to Level 0: write options including RDP = 0xAA
  to Level 2: write options including RDP = 0xCC
Level 2 (RDP = 0xCC)
  no transition out of Level 2

Option byte write (RDP level increase) includes: option byte erase and new option byte programming
Option byte write (RDP level decrease) includes: option byte erase, new option byte programming and mass erase
Option byte write (RDP level identical) includes: option byte erase and new option byte programming

Flash Protections (6/6)

Access status versus protection level and execution modes:

Area               Protection   User execution            Debug, boot from RAM or boot
                   level                                  from system memory (loader)
                                Read    Write   Erase     Read    Write   Erase
Main memory        1            Yes     Yes     Yes       No      No      No
                   2            Yes     Yes     Yes       N/A     N/A     N/A
System memory      1            Yes     No      No        Yes     No      No
                   2            Yes     No      No        N/A     N/A     N/A
Option bytes       1            Yes     Yes     Yes       Yes     Yes     Yes
                   2            Yes     Yes     No        N/A     N/A     N/A
Backup registers   1            Yes     Yes     N/A       No      No      N/A
                   2            Yes     Yes     N/A       N/A     N/A     N/A

Quiz
List all supported protections and how to enable/disable them.
____________

What is the maximum Flash read frequency?
____________

Power Control (PWR)

STM32F30x Power Supply

Power Supply Schemes
VDD = 2.0 to 3.6 V: external power supply for I/Os (or VDD = 1.8 V +/- 8%: external power
supply for I/Os when the internal regulator is OFF).
VDDA = 2.0 to 3.6 V: external analog power supply for ADC, DAC, Reset blocks, RCs and PLL.
DAC working only if VDDA >= 2.4 V
VBAT = 1.65 to 3.6 V: for the Backup domain when VDD is not present.

Power pins connection:
VDD and VDDA can be provided by separate power supply sources.
VSS and VSSA must be tied to ground.

[Figure: power supply overview]
VDDA domain (VDDA, VSSA): A/D converter, D/A converter, COMP, Temp. sensor, Reset block, PLL
VDD domain (VDD, VSS): I/O rings, STANDBY circuitry (wake-up logic, IWDG, RTC, LSE crystal 32K osc, RCC CSR), voltage regulator, low voltage detector
V18 domain: core, memories, digital peripherals
Backup domain (VBAT): LSE crystal 32K osc, BKP registers, RCC BDCR register, RTC

STM32F37x Power Supply

Power Supply Schemes
VDD = 2.0 to 3.6 V: external power supply for I/Os (or VDD = 1.8 V +/- 8%: external power
supply for I/Os when the internal regulator is OFF).
VDDA = 2.0 to 3.6 V: external analog power supply for ADC, DAC, Reset blocks, RCs and PLL.
ADC and DAC working only if VDDA >= 2.4 V
VBAT = 1.65 to 3.6 V: for the Backup domain when VDD is not present.
SDADCx_VDD = 2.2 to 3.6 V: external analog power supplies for the SDADCs.
The SD1_SD2_VDD and SD3_VDD supplies can be different from VDD, from VDDA and from one another.

Power pins connection:
VDD and VDDA can be provided by separate power supply sources.
VSS, VSSA and SDADCx_VSS must be tied to ground.

[Figure: power supply overview]
VDDA domain (VDDA, VSSA): A/D converter, D/A converter, COMP, Temp. sensor, Reset block, PLL
SDADCs (SDADC1_2_VDD, SDADC3_VDD, SDADC1_2_3_VSS)
VDD domain (VDD, VSS): I/O rings, STANDBY circuitry (wake-up logic, IWDG, RTC, LSE crystal 32K osc, RCC CSR), voltage regulator, low voltage detector
V18 domain: core, memories, digital peripherals
Backup domain (VBAT): LSE crystal 32K osc, BKP registers, RCC BDCR register, RTC

Power Sequence

When the VDD power supply source is different from the VDDA power supply source (VDD < VDDA):
The VDDA voltage level must always be greater than or equal to the VDD voltage.
During power-on, VDDA must be provided first (before VDD).
During power-off, it is allowed to have VDD > VDDA temporarily, but the voltage difference
must be < 0.4 V; this can be maintained by an external Schottky diode.

When the SDADCx power supplies are different from VDDA, from VDD and from one another:
SDADCx_VDD <= VDDA
SDADC1_VDD/SDADC2_VDD <= SDADC3_VDD
SDADC3_VDD must start before or at the same time as SD12_VDD
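
The supply rules of this slide can be written as simple checks, for instance in a board bring-up test. The sketch below works on voltages in millivolts; all function names are hypothetical and the code models only the inequalities quoted above, not the start-up ordering.

```c
#include <stdbool.h>

/* Steady state: VDDA must be greater than or equal to VDD. */
static bool vdda_ok_steady(int vdd_mv, int vdda_mv)
{
    return vdda_mv >= vdd_mv;
}

/* Power-off: VDD may temporarily exceed VDDA by less than 0.4 V. */
static bool vdda_ok_power_off(int vdd_mv, int vdda_mv)
{
    return (vdd_mv - vdda_mv) < 400;
}

/* STM32F37x SDADC supplies: each SDADC supply must not exceed VDDA, and
   the SDADC1/SDADC2 supply must not exceed the SDADC3 supply. */
static bool sdadc_supplies_ok(int vdda_mv, int sdadc12_mv, int sdadc3_mv)
{
    return sdadc12_mv <= vdda_mv
        && sdadc3_mv  <= vdda_mv
        && sdadc12_mv <= sdadc3_mv;
}
```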

Supply monitoring and Reset circuitry


The STM32F3xx POR / PDR circuits are always active and monitor two supply voltages: VDD and VDDA.
The POR supervisor circuit monitors only VDD
The PDR supervisor circuit monitors VDD and VDDA
The PDR supervisor on VDDA can be disabled by programming an option byte.

Power On Reset / Power Down Reset


Two integrated POR / PDR circuits guarantee a proper product reset when the
voltage is not in the product guaranteed voltage range (2 V to 3.6 V)
No need for an external reset circuit
POR and PDR have a typical hysteresis of 40 mV
Vtrl min 1.8 V / Vtrh max 2 V
The PDR detector monitors VDD and also VDDA (if kept enabled in the
option bytes). The POR detector monitors only VDD.

[Figure: VDD and VDDA waveform crossing the POR threshold (Vtrh) and the PDR threshold (Vtrl), with 40 mV hysteresis; Reset is released after a 2.5 ms temporization.]

Programmable Voltage Detector (PVD)

Programmable Voltage Detector
Enabled by software
Monitors the VDDA power supply by comparing it to a threshold
Threshold configurable from 2.1 V to 2.9 V by step of 90 mV
Generates an interrupt through EXTI Line16 (if enabled) when VDDA < Threshold and/or
VDDA > Threshold
Can be used to generate a warning message and/or put the MCU into a safe state

[Figure: VDDA waveform crossing the PVD threshold with 100 mV hysteresis, and the resulting PVD output.]
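
The effect of the hysteresis on the PVD output can be modelled on a host as follows. This is an illustrative sketch only: it assumes the programmed threshold is the falling one and that the output clears once VDDA rises 100 mV above it; the function name is hypothetical.

```c
#include <stdbool.h>

#define PVD_HYSTERESIS_MV 100  /* hysteresis value quoted on the slide */

/* Hypothetical model: returns the new PVD output (true = VDDA considered
   below threshold) from the previous output and the current VDDA, in mV. */
static bool pvd_update(bool prev_output, int vdda_mv, int threshold_mv)
{
    if (vdda_mv < threshold_mv)
        return true;                      /* supply dropped below threshold */
    if (vdda_mv > threshold_mv + PVD_HYSTERESIS_MV)
        return false;                     /* supply clearly recovered       */
    return prev_output;                   /* inside the hysteresis window   */
}
```

Inside the hysteresis window the output keeps its previous value, which avoids repeated interrupts when VDDA hovers around the threshold.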

Backup Domain

Backup Domain contains
Low power calendar RTC (Alarm, periodic wakeup from Stop/Standby)
64 and 128 bytes of RTC data registers in STM32F30x and STM32F37x respectively.
Separate 32 kHz oscillator (LSE) for RTC
RCC BDCR register: RTC clock source selection and enable + LSE config
Reset only by RTC domain RESET

VBAT independent voltage supply
Automatic switch-over to VBAT when VDD goes lower than PDR level
No current sunk on VBAT when VDD present.
2 x tamper event detection: resets all user backup registers
TimeStamp event detection.

[Figure: power switch selecting VDD or VBAT for the backup domain; blocks shown: RTC + 64 (or 128) bytes of data, 32 kHz OSC (LSE), RCC BDCR, RTC_TAMPx, wakeup logic, IWDG.]

Low Power Modes (1/4)


SLEEP Mode: Core stopped, peripherals kept running

Entered by executing special instructions
WFI (Wait For Interrupt)
Exit: any peripheral interrupt acknowledged by the Nested Vectored Interrupt Controller (NVIC)
WFE (Wait For Event)
An event can be an interrupt enabled in the peripheral control register but NOT in the NVIC, or an EXTI
line configured in event mode
Exit: as soon as the event occurs. No time is wasted in interrupt entry/exit

Two mechanisms to enter this mode
Sleep Now: MCU enters SLEEP mode as soon as a WFI/WFE instruction is executed
Sleep on Exit: MCU enters SLEEP mode as soon as it exits the lowest priority ISR

To further reduce power consumption, gate the clocks of unused peripherals

Low Power Modes (2/4)

STOP Mode: all peripheral clocks, PLL, HSI and HSE are disabled; SRAM and
register contents are preserved.
If the RTC and IWDG are running, they are not stopped in STOP mode (and neither are their clock
sources)
To further reduce power consumption, the voltage regulator can be put in Low Power mode
Wake-up sources:
WFI was used for entry: any EXTI line configured in interrupt mode (the corresponding EXTI interrupt
vector must be enabled in the NVIC)
WFE was used for entry: any EXTI line configured in event mode
The EXTI line source can be: one of the 16 external lines, PVD output, RTC alarm, COMPx, I2Cx,
USARTx or the CEC (*).
The I2Cx, USARTx and CEC (*) can be configured to enable the HSI RC oscillator for processing
incoming data. If this is used, the voltage regulator should not be put in low-power mode but kept
in normal mode.
After resuming from STOP, the clock configuration returns to its reset state (HSI used as system
clock).
(*): CEC is available in STM32F37x only.

Low Power Modes (3/4)


STANDBY Mode: Voltage Regulator off, the entire V18 domain is powered off.
SRAM and register contents are lost except registers in the Backup domain and STANDBY
circuitry
PLL, the HSI RC and the HSE crystal oscillators are also switched off.
RTC and IWDG are kept running in STANDBY (if enabled)
In STANDBY mode all IO pins are high impedance and non-active except:
Reset pad (still available)
RTC pins (if configured)
PC14 & PC15 could be forced to output high/low in RTC registers
WKUPx pins (if enabled)

Wake-up sources:
WKUPx pins rising edge
RTC alarm and tamper events
External reset on NRST pin
IWDG reset
After wake-up from STANDBY mode, program execution restarts in the same way as after a RESET.

STM32F3xx Low Power modes


SLEEP (SLEEP now or SLEEP on exit)
  Entry: WFI or WFE
  Wakeup: any interrupt (entry by WFI); wake-up event (entry by WFE)
  Effect on 1.8 V domain clocks: CPU clock OFF, no effect on other clocks or analog clock sources
  Effect on VDD domain clocks: none
  Voltage regulator: ON
  I/O state: all I/O pins keep the same state as in the Run mode
  Wakeup latency: none

STOP
  Entry: PDDS and LPDS bits + SLEEPDEEP bit + WFI or WFE
  Wakeup: any EXTI line (configured in the EXTI registers, internal and external lines)
  Effect on 1.8 V domain clocks: all 1.8 V domain clocks OFF
  Effect on VDD domain clocks: HSI and HSE oscillators OFF
  Voltage regulator: ON or in low power mode (depending on PWR_CR)
  I/O state: all I/O pins keep the same state as in the Run mode
  Wakeup latency: HSI RC wakeup time + regulator wakeup time from Low-power mode

STANDBY
  Entry: PDDS bit + SLEEPDEEP bit + WFI or WFE
  Wakeup: WKUP pin rising edge, RTC alarm, RTC tamper event, external reset on NRST pin, IWDG reset
  Effect on 1.8 V domain clocks: all 1.8 V domain clocks OFF
  Effect on VDD domain clocks: HSI and HSE oscillators OFF
  Voltage regulator: OFF
  I/O state: all I/O pins are high impedance (*)
  Wakeup latency: reset phase

(*): Standby mode: all I/O pins are high impedance except:
- Reset pad (still available)
- RTC pins PC14 and PC15 if configured in the RTC registers.
- WKUP pin 1 (PA0) and WKUP pin 2 (PC13), if enabled.
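
The entry conditions of the three modes reduce to a small decision on three control bits. The sketch below is a host-side model with hypothetical names; `lpds` stands for the low-power regulator bit of PWR_CR, `pdds` for the power-down deepsleep bit and `sleepdeep` for the Cortex-M4 SLEEPDEEP bit.

```c
#include <stdbool.h>

typedef enum {
    LP_SLEEP,          /* core stopped, peripherals running   */
    LP_STOP_MAIN_REG,  /* STOP, regulator in normal mode      */
    LP_STOP_LP_REG,    /* STOP, regulator in low power mode   */
    LP_STANDBY         /* regulator off, V18 domain unpowered */
} low_power_mode;

/* Hypothetical helper: which mode a WFI/WFE instruction would enter. */
static low_power_mode mode_on_wfi_wfe(bool sleepdeep, bool pdds, bool lpds)
{
    if (!sleepdeep) return LP_SLEEP;
    if (pdds)       return LP_STANDBY;
    return lpds ? LP_STOP_LP_REG : LP_STOP_MAIN_REG;
}
```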

Hands-on: Power Consumption on STM32F30x
02/04/2012

Aim of the Hands-on
This hands-on determines the STM32F30x power consumption in RUN, SLEEP, STOP and STANDBY modes.

Low Power Modes (4/4)


STM32F303 Low Power modes: uses Cortex-M4 Sleep modes
SLEEP, STOP and STANDBY

Feature                                                                  typ IDD/IDDA (*)
RUN mode w/ execute from Flash at 72 MHz, all peripherals clock ON
RUN mode w/ execute from Flash at 24 MHz, all peripherals clock ON
RUN mode w/ execute from Flash at 8 MHz, all peripherals clock ON
Sleep mode w/ execute from Flash at 48 MHz, all peripherals clock ON
STOP w/ voltage regulator in low power, all oscillators OFF, PDR on VDDA is OFF
STANDBY w/ LSI and IWDG OFF, PDR on VDDA is OFF

(*) Typical values are measured at TA = 25 °C, VDD = 3.3 V, VDDA = 3.3 V.

Quiz

How many power supply domains are available?
____________

What is the power sequence recommendation?
____________

What are the wake-up sources from STOP mode?
____________

Direct memory access controller (DMA)

DMA Features
12 independently configurable channels: hardware requests or software trigger on each
channel.
DMA1: 7 channels
DMA2: 5 channels

Software programmable priorities: Very high, High, Medium or Low (hardware priority in
case of equality).
Programmable and independent source and destination transfer data size: Byte,
Halfword or Word.
3 event flags for each channel: DMA Half Transfer, DMA Transfer Complete and DMA
Transfer Error.
Memory-to-memory, peripheral-to-memory, memory-to-peripheral and
peripheral-to-peripheral transfers.
A faulty channel is automatically disabled by hardware in case of a bus access error.
Programmable number of data to be transferred: up to 65535.
Support for circular buffer management.
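
The priority scheme can be illustrated with a small arbitration model. It assumes, as is usual for this DMA controller, that the hardware tie-break favours the lowest channel number; the types and function name are hypothetical.

```c
enum { DMA_PRIO_LOW = 0, DMA_PRIO_MEDIUM, DMA_PRIO_HIGH, DMA_PRIO_VERY_HIGH };

typedef struct {
    int pending;   /* non-zero when the channel has an active request */
    int priority;  /* one of the DMA_PRIO_* values                    */
} dma_channel_state;

/* Hypothetical model: returns the 1-based number of the channel served
   next, or 0 if no request is pending. Software priority is compared
   first; on equality the lower channel number wins (assumed hardware rule). */
static int dma_arbitrate(const dma_channel_state *ch, int count)
{
    int winner = 0;
    for (int i = 0; i < count; i++) {
        if (!ch[i].pending)
            continue;
        if (winner == 0 || ch[i].priority > ch[winner - 1].priority)
            winner = i + 1;   /* strictly higher priority takes over */
    }
    return winner;
}
```

Because the comparison is strict, an equal-priority request on a higher-numbered channel never displaces an earlier one.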

DMA1 Request Mapping (1/2)


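Peripheral        Requests by DMA1 channel
ADC               Channel 1: ADC1
SPI               Channel 2: SPI1_RX; Channel 3: SPI1_TX; Channel 4: SPI2_RX; Channel 5: SPI2_TX
USART             Channel 2: USART3_TX; Channel 3: USART3_RX; Channel 4: USART1_TX; Channel 5: USART1_RX; Channel 6: USART2_RX; Channel 7: USART2_TX
I2C               Channel 4: I2C2_TX; Channel 5: I2C2_RX; Channel 6: I2C1_TX; Channel 7: I2C1_RX
TIM1 (*)          Channel 2: TIM1_CH1; Channel 3: TIM1_CH2; Channel 4: TIM1_CH4, TIM1_TRIG, TIM1_COM; Channel 5: TIM1_UP; Channel 6: TIM1_CH3
TIM2              Channel 1: TIM2_CH3; Channel 2: TIM2_UP; Channel 5: TIM2_CH1; Channel 7: TIM2_CH2, TIM2_CH4
TIM3              Channel 2: TIM3_CH3; Channel 3: TIM3_CH4, TIM3_UP; Channel 6: TIM3_CH1, TIM3_TRIG
TIM4              Channel 1: TIM4_CH1; Channel 4: TIM4_CH2; Channel 5: TIM4_CH3; Channel 7: TIM4_UP
TIM6 / DAC (*)    Channel 3: TIM6_UP, DAC_CH1 (1)

(*) Available on STM32F30x only.
(1) DMA request mapped on this DMA channel only if the
corresponding remapping bit is set in the SYSCFG_CFGR1 register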

DMA1 Request Mapping (2/2)


Peripheral                    Requests by DMA1 channel
TIM7 / DAC (*)                Channel 4: TIM7_UP, DAC_CH2 (1)
TIM16                         Channel 3: TIM16_CH1, TIM16_UP; Channel 6: TIM16_CH1, TIM16_UP (*) (1)
TIM17                         Channel 1: TIM17_CH1, TIM17_UP; Channel 7: TIM17_CH1, TIM17_UP (*) (1)
TIM18 / DAC channel 3 (**)    Channel 5: TIM18_UP, DAC_CH3
TIM19 (**)                    Channel 1: TIM19_CH3, TIM19_CH4; Channel 2: TIM19_CH1; Channel 3: TIM19_CH2; Channel 4: TIM19_UP

(*) Available on STM32F30x only
(**) Available on STM32F37x only.
(1) DMA request mapped on this DMA channel only if the corresponding remapping bit
is set in the SYSCFG_CFGR1 register

DMA2 Request Mapping


Peripheral                    Requests by DMA2 channel
ADC                           Channel 1: ADC2; Channel 2: ADC4; Channel 3: ADC2 (1), SDADC1; Channel 4: ADC4 (1), SDADC2; Channel 5: ADC3, SDADC3
SPI3                          Channel 1: SPI3_RX; Channel 2: SPI3_TX
UART4 (*)                     Channel 3: UART4_RX; Channel 5: UART4_TX
TIM6 / DAC channel 1          Channel 3: TIM6_UP, DAC_CH1
TIM7 / DAC channel 2          Channel 4: TIM7_UP, DAC_CH2
TIM8 (*)                      Channel 1: TIM8_CH3, TIM8_UP; Channel 2: TIM8_CH4, TIM8_TRIG, TIM8_COM; Channel 3: TIM8_CH1; Channel 5: TIM8_CH2
TIM18 / DAC channel 3 (**)    Channel 5: TIM18_UP, DAC_CH3

(*) Available on STM32F30x only
(**) Available on STM32F37x only.

Quiz
How many DMA channels are available in the STM32F3xx?
____________

How many interrupts can be generated for each channel?
____________

Which channel is able to perform a memory-to-memory transfer?
____________

General Purpose IOs

GPIO features

Up to 84 (in STM32F37x) and 87 (in STM32F30x) multifunction bidirectional I/O ports available on the biggest package (100 pins).
Several I/Os are 5 V tolerant (ADC, OPAMP and comparator pins are not).
All standard I/Os are shared among 6 ports: GPIOA, GPIOB, GPIOC,
GPIOD, GPIOE, GPIOF.
Atomic bit set and bit reset using the BSRR and BRR registers
GPIO connected to AHB bus, max toggling frequency 18 MHz
Configurable output slew rate speed up to 50 MHz
Locking mechanism (GPIOx_LCKR) provided to freeze the I/O
configuration
When the LOCK sequence has been applied on a port bit, it is no longer possible to
modify the configuration of that port bit until the next reset (no write access to the
corresponding configuration register bits).

Up to 84 (in STM32F37x) and 87 (in STM32F30x) GPIOs can be set up
as external interrupts (up to 16 lines at a time) able to wake up the MCU
from low power modes.
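
The atomic set/reset mechanism can be sketched as a host-side model of a BSRR write. The layout used here (set bits in the lower half-word, reset bits in the upper half-word, set taking priority) is the usual one for this GPIO block and is stated as an assumption; the function names are hypothetical.

```c
#include <stdint.h>

/* Value to write to BSRR to set or reset one pin (0..15). */
static uint32_t bsrr_set(unsigned pin)   { return 1u << pin; }
static uint32_t bsrr_reset(unsigned pin) { return 1u << (pin + 16u); }

/* Hypothetical model of the hardware effect of a BSRR write on the
   16-bit output data register: no read-modify-write is needed in software. */
static uint32_t odr_after_bsrr_write(uint32_t odr, uint32_t bsrr)
{
    uint32_t set_bits   = bsrr & 0xFFFFu;
    uint32_t reset_bits = (bsrr >> 16) & 0xFFFFu;
    return ((odr & ~reset_bits) | set_bits) & 0xFFFFu;  /* set wins over reset */
}
```

Since a single write changes only the selected bits, an interrupt occurring between a read and a write of the output register cannot corrupt other pins.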

GPIO Configuration Modes

MODER(i)[1:0]   OTYPER(i)   PUPDR(i)[1:0]   I/O configuration
01              0           00              Output Push Pull
01              0           01              Output Push Pull with Pull-up
01              0           10              Output Push Pull with Pull-down
01              1           00              Output Open Drain
01              1           01              Output Open Drain with Pull-up
01              1           10              Output Open Drain with Pull-down
10              0           00              Alternate Function Push Pull
10              0           01              Alternate Function PP Pull-up
10              0           10              Alternate Function PP Pull-down
10              1           00              Alternate Function Open Drain
10              1           01              Alternate Function OD Pull-up
10              1           10              Alternate Function OD Pull-down
00              x           00              Input floating
00              x           01              Input with Pull-up
00              x           10              Input with Pull-down
11              x           xx              Analog mode

* In output mode, the I/O speed is configurable through the OSPEEDR register:
2 MHz, 10 MHz or 50 MHz
(1) VDD_FT is a potential specific to five-volt tolerant I/Os and different from VDD.

[Figure: I/O pin structure. Input driver: Schmitt trigger (on/off), input data register (read), alternate function input to on-chip peripherals, analog path. Output driver: output control (push-pull or open drain), output data register (read/write) written through the bit set/reset register, alternate function output from on-chip peripherals. Pull-up to VDD and pull-down to VSS (on/off); protection to VDD or VDD_FT (1) and VSS.]
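
The three configuration fields can be packed per pin as sketched below. This is a host-side model with hypothetical type and function names: two bits per pin in MODER and PUPDR, one bit per pin in OTYPER, which is the usual layout for this GPIO block.

```c
#include <stdint.h>

typedef struct { uint32_t moder, otyper, pupdr; } gpio_cfg_regs;

enum { GPIO_MODE_INPUT = 0, GPIO_MODE_OUTPUT = 1, GPIO_MODE_AF = 2, GPIO_MODE_ANALOG = 3 };
enum { GPIO_OTYPE_PP = 0, GPIO_OTYPE_OD = 1 };
enum { GPIO_PUPD_NONE = 0, GPIO_PUPD_UP = 1, GPIO_PUPD_DOWN = 2 };

/* Hypothetical helper: returns a copy of the register set with the fields
   of one pin (0..15) replaced; other pins are left untouched. */
static gpio_cfg_regs gpio_configure_pin(gpio_cfg_regs r, unsigned pin,
                                        uint32_t mode, uint32_t otype, uint32_t pupd)
{
    r.moder  = (r.moder  & ~(3u << (2u * pin))) | (mode  << (2u * pin));
    r.otyper = (r.otyper & ~(1u << pin))        | (otype << pin);
    r.pupdr  = (r.pupdr  & ~(3u << (2u * pin))) | (pupd  << (2u * pin));
    return r;
}
```

For example, configuring pin 5 as open-drain output with pull-up sets bits [11:10] of MODER to 01, bit 5 of OTYPER to 1 and bits [11:10] of PUPDR to 01.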

Alternate Functions features

Most of the peripherals share the same pins (like USARTx_TX, TIMx_CH2,
I2Cx_SCL, SPIx_MISO, EVENTOUT)
Alternate function multiplexers prevent several peripheral functions from being
connected to a specific I/O at the same time.

[Figure: alternate function multiplexer selecting one of AF0, AF1, AF2, ... AF7 for pin x.]

I/Os special considerations



During and just after reset, the alternate functions are not active and the
I/O ports are configured in input floating mode. But, the debug pins
(JTAG/SWD) are in AF pull-up/pull-down after reset:
PA13: JTMS/SWDIO
PA14: JTCK/SWCLK
PA15: JTDI
PB3: JTDO
PB4: NJTRST

Using the HSE or LSE oscillator pins as GPIOs


When the HSE or LSE oscillator is switched OFF (default state after reset), the
related oscillator pins can be used as general purpose IOs.
When the oscillator is configured in a user external clock mode, only the
OSC_IN or OSC32_IN pin is reserved for clock input and the OSC_OUT or
OSC32_OUT pin can still be used as general purpose IOs.

Using the GPIO pins in the backup supply domain


The PC13/PC14/PC15 GPIO functionality is lost when the device enters
Standby mode. In this case, if their GPIO configuration is not bypassed by the
RTC configuration, these pins are set in an analog input mode.

Quiz

How many I/Os and ports are there in the STM32F3xx microcontroller?
____________

List all the I/O configuration modes
____________

How many external interrupts and wake-up pins exist in the STM32F3xx
microcontroller?
____________

Extended interrupts and events controller (EXTI)
Same as STM32F1xx, but with new features and a new name

EXTI Features
Manages the external and internal asynchronous events/interrupts and
generates the event request to the CPU/Interrupt Controller and a wake-up
request to the Power Manager
Some communication peripherals (UART, I2C, CEC (*), comparators) are able to
generate events when the system is in run/sleep mode and also when the system
is in stop mode, allowing the system to be woken up from stop mode.
These peripherals are able to generate both a synchronized (to the system APB
clock) and an asynchronous version of the event.
All other features are the same as in the STM32F1xx series
Up to 36 (F30x) or 29 (F37x) interrupt/event requests. Up to 88 (in STM32F30x) and 84 (in STM32F37x)
GPIOs can be used as EXTI lines (0..15)

(*) The CEC is available on STM32F37x only.

[Figure: EXTI block diagram. EXTI[15:0] inputs feed the edge detect circuit, controlled by the rising and falling trigger selection registers and the software interrupt event register; the pending request register and the interrupt mask register gate the request to the NVIC, and the event mask register gates the pulse generator.]

EXTI line 16 is connected to the PVD output
EXTI line 17 is connected to the RTC Alarm event
EXTI line 18 is connected to the USB Device FS wakeup event
EXTI line 19 is connected to RTC tamper and Timestamps
EXTI line 20 is connected to RTC wakeup
EXTI line 21 is connected to Comparator 1 output
EXTI line 22 is connected to Comparator 2 output
EXTI line 23 is connected to I2C1 wakeup
EXTI line 24 is connected to I2C2 wakeup
EXTI line 25 is connected to USART1 wakeup
EXTI line 26 is connected to USART2 wakeup
EXTI line 27 is connected to CEC wakeup (STM32F37x only)
EXTI line 28 is connected to USART3 wakeup
EXTI line 29 is connected to Comparator 3 output (F30x only)
EXTI line 30 is connected to Comparator 4 output (F30x only)
EXTI line 31 is connected to Comparator 5 output (F30x only)
EXTI line 32 is connected to Comparator 6 output (F30x only)
EXTI line 33 is connected to Comparator 7 output (F30x only)
EXTI line 34 is connected to UART4 wakeup (F30x only)
EXTI line 35 is connected to UART5 wakeup (F30x only)
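
With more than 32 lines, an EXTI line number has to be split into a register bank and a bit mask, and a GPIO pin has to be routed to its line. The sketch below assumes two 32-bit register banks and the usual SYSCFG_EXTICR layout (four registers, one 4-bit port field per line, port A = 0); neither detail is stated on the slides, and all names are hypothetical.

```c
#include <stdint.h>

/* Register bank (0 or 1) and bit mask of an EXTI line, assuming lines
   0..31 live in the first bank and lines 32..35 in the second one. */
static unsigned exti_bank(unsigned line) { return line / 32u; }
static uint32_t exti_mask(unsigned line) { return 1u << (line % 32u); }

/* Index (0..3) of the assumed EXTICR register holding the port selection
   of a GPIO pin; pin n of any port always maps to EXTI line n. */
static unsigned exticr_index(unsigned pin) { return pin / 4u; }

/* Returns the EXTICR value with the 4-bit field of `pin` set to `port`
   (0 = GPIOA, 1 = GPIOB, ...); other fields are preserved. */
static uint32_t exticr_select_port(uint32_t exticr, unsigned pin, unsigned port)
{
    unsigned shift = 4u * (pin % 4u);
    return (exticr & ~(0xFu << shift)) | ((uint32_t)port << shift);
}
```

Because only one port can be selected per line, PA13 and PC13 cannot both be used as EXTI line 13 at the same time.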

Quiz

How many lines does the Extended interrupt controller support?
____________

Which lines are mapped to special asynchronous events?
____________

Which lines can be used for system wake-up?
____________

STM32F3 Training Agenda (2/4)


Day 2
STM32F3 Ecosystem
Standard firmware Library
Tools (STLink utility, STVP, etc)
ULINK PRO and TRACE presentation

Continue with STM32F3 common parts


Reset and clock control (RCC), including the differences between the clock schemes of
both products
CRC
Digital-to-analog converter (DAC)
System window watchdog (WWDG)
Independent window watchdog (IWDG)
Serial peripheral interface (SPI)
Universal synchronous asynchronous receiver transmitter (USART) + Hands-on
Inter-integrated circuit (I2C) interface + Hands-on
inter-IC sound I2S (Simplex in STM32F37x and Full duplex in STM32F30x)


STM32F3 Eco-system
Standard Peripheral Library

What is a Standard Peripherals Library? (1/2)


A complete register address mapping with all bits, bit fields and
registers declared in C.
A collection of routines and data structures which covers all peripheral
functions (drivers with common API).
A set of examples covering all available peripherals with template
projects for the most common development toolchains.
Evaluation board drivers to allow getting started rapidly with a new
micro within a few hours.

16/04/2012

What is a Standard Peripherals Library (2/2)


Libraries folder
CMSIS subfolder: Cortex-M4 CMSIS files:
Core Peripheral Access Layer
CMSIS DSP Software Library

STM32F3xx_StdPeriph_Driver subfolder:
Standard Peripherals drivers

Project folder
STM32F3xx_StdPeriph_Templates
subfolder
STM32F3xx_StdPeriph_Examples subfolder

Utilities folder
STM32_EVAL subfolder for the abstraction
layer of the supported evaluation board

Libraries: CMSIS
CMSIS files
Cortex-M core access layer
STM32F3xx device CMSIS files:
stm32f3xx.h file
system_stm32f3xx.c/.h files
startup_stm32f3xx.s
ARM DSP Library:
A suite of common signal processing
functions for use on Cortex-M processor
based devices. Written in C and CMSIS
compliant

Libraries: stm32f3xx.h
Configuration section
Used device
Std_Periph_Lib use
Specific parameters

Data structures and the address mapping
for all peripherals
Peripheral register declarations and bit
definitions

Libraries: system_stm32f3xx.c
SystemInit()
This function is called at startup just after reset and before the branch to the main
program. This call is made inside the "startup_stm32f3xx.s" file.
Sets up the system clock (system clock source, PLL multiplier and divider factors,
AHB/APBx prescalers and Flash settings)
Can be generated depending on the configuration made in the clock xls tool

Libraries: startup_stm32f3xx.s
Main Characteristics
Contains the vector table for the device
Initializes the stack pointer
Sets the PC to the Reset handler
Calls the SystemInit() function
Branches to main()

Libraries: Std_Periph_Drivers
STM32F3xx_StdPeriph_Driver subfolder
Contains all the subdirectories and files that make up
the core of the library:
inc sub-folder : the Peripheral's Drivers header files.
stm32f3xx_ppp.h (one header file per peripheral):
Function prototypes, data structures and enumeration.

src sub-folder: the peripheral drivers' source files.
stm32f3xx_ppp.c (one source file per peripheral):
Function bodies of each peripheral.

Driver files don't need to be modified by the user.

Drivers are:
Strict ANSI-C coded
Software toolchain independent
Register manipulation abstraction
Standard API for peripheral functions access

Projects:Std_Periph_Templates
Standard template projects for all the
supported toolchains that compile the
STM32F3xx Standard Peripheral's drivers
All the user-modifiable files that are
necessary to create a new project
stm32f3xx_conf.h
stm32f3xx_it.c/.h
main.c/.h(optional)
system_stm32f3xx.h

Projects:stm32f3xx_conf.h

Projects:stm32f3xx_it.c
Contains Cortex-M4 Processor Exception Handlers (ISRs)
void NMI_Handler(void);
void HardFault_Handler(void);
void MemManage_Handler(void);
void BusFault_Handler(void);
void UsageFault_Handler(void);
void SVC_Handler(void);
void DebugMon_Handler(void);
void PendSV_Handler(void);
void SysTick_Handler(void);

Contains the STM32F3xx peripheral interrupt handlers (default is empty)
Add the interrupt handler for the used peripheral(s) (PPP); for the
available peripheral interrupt handler names, please refer to the
startup file (startup_stm32f3xx.s)
void PPP_IRQHandler(void) {}

Projects:main.c
main()
Standard C main() function entry
Start of application program

Projects:Std_Periph_Examples
Provides for each peripheral sub-folder, the minimum set of files
needed to run a typical example on how to use this peripheral:
readme.txt: brief text file describing the example and how to make it work.
stm32f3xx_conf.h: header file allowing to enable/disable the peripheral's drivers
header files inclusion.
stm32f3xx_it.c: source file containing the interrupt handlers
stm32f3xx_it.h: header file including all interrupt handler prototypes.
main.c: example of code.
system_stm32f3xx.c: this file provides functions to setup the STM32 system

To execute any of the examples:
1. Copy and paste the above files into the Std_Periph_Templates folder
2. Choose the preferred supported toolchain and build the project
3. Then load your image into target memory and run it; it should work

Utilities
STM32_EVAL: abstraction layer to interact with the Human Interface
resources; buttons, LEDs, LCD and COM ports (USARTs) available on
STMicroelectronics evaluation boards.
Common: contains common drivers (lcd_log.c and fonts.c)
STM32303C_EVAL:Contains board specific functions

void STM_EVAL_LEDInit(Led_TypeDef Led);


void STM_EVAL_LEDOn(Led_TypeDef Led);
void STM_EVAL_LEDOff(Led_TypeDef Led);
void STM_EVAL_LEDToggle(Led_TypeDef Led);
void STM_EVAL_PBInit(Button_TypeDef Button, ButtonMode_TypeDef Button_Mode);
uint32_t STM_EVAL_PBGetState(Button_TypeDef Button);

Full API compatibility is maintained between the different STM32_EVAL
board drivers: users only have to include the appropriate eval board files
in their project.

How to use the Standard Peripheral Library?
Create a project and set up all the start-up files of the toolchain used, or simply
use the project templates provided within the Library package.
Library programming model:
Direct register access using the CMSIS layer
(+) compact and efficient generated code
(-) requires detailed knowledge of the peripheral operation, the meaning of registers and bits, and the
configuration procedures
Procedure:
Comment the line #define USE_STDPERIPH_DRIVER in the stm32f3xx.h file
Use the peripheral register structures and bit definitions available within stm32f3xx.h to build the application

Peripheral driver access through the API to control the peripheral configuration and
operation
(+) no need for an in-depth study of each peripheral specification; saves development time and
integration cost
(-) the generic nature of the drivers may lead to non-optimized size and/or speed of the application code.
Procedure:
Uncomment the line #define USE_STDPERIPH_DRIVER in the stm32f3xx.h file
Select the peripherals to include in the stm32f3xx_conf.h file
Use the peripheral drivers API provided by the STM32F3xx_Std_Periph_Drivers
Reuse or adapt the rich set of examples provided within the Library package
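
The difference between the two programming models can be shown on a host with a mock peripheral. Everything below is hypothetical: `mock_gpio` stands in for a register block that the CMSIS header would map to a real address, and `mock_gpio_write_bit` only imitates the style of a driver API; neither is a function of the actual library.

```c
#include <stdint.h>

/* Hypothetical stand-in for a peripheral register block. */
typedef struct { volatile uint32_t ODR; } mock_gpio;

/* Model 1: direct register access, the caller manipulates the bits. */
static void led_on_direct(mock_gpio *port, unsigned pin)
{
    port->ODR |= (1u << pin);
}

/* Model 2: driver-style API, the bit manipulation is hidden behind a call. */
typedef enum { MOCK_BIT_RESET = 0, MOCK_BIT_SET = 1 } mock_bit_action;

static void mock_gpio_write_bit(mock_gpio *port, unsigned pin, mock_bit_action action)
{
    if (action == MOCK_BIT_SET)
        port->ODR |= (1u << pin);
    else
        port->ODR &= ~(1u << pin);
}
```

Both paths end in the same register state; the driver call costs an extra function call and a branch, which is the size/speed trade-off mentioned above.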

STM32F3 vs. STM32F1 FW compatibility (1/3)

Peripheral   F1 series   F37x family   F30x family   Comment        FW compatibility
ADC          YES         YES           YES++         New design     Full for F37x; not compatible for F30x
CAN          YES         YES           YES           Same feature   Full
CEC          YES         YES+          NA            Enhancement    Not compatible
COMP         NA          YES           YES           NA             NA
CRC          YES         YES+          YES+          New feature    Partial
DAC          YES         YES           YES           Same feature   Full for F30x; not compatible for F37x
DBGMCU       YES         YES           YES           Same feature   Full
DMA          YES         YES           YES           Same feature   Full
EXTI         YES         YES           YES           Same feature   Full

STM32F3 vs. STM32F1 FW compatibility (2/3)

Peripheral   F1 series   F37x family   F30x family   Comment          FW compatibility
GPIO         YES         YES++         YES++         New design       Not compatible
I2C          YES         YES++         YES++         New design       Not compatible
I2S          YES         YES+          YES+          New feature      Partial
IWDG         YES         YES+          YES+          New feature      Partial
OPAMP        NA          NA            YES           NA               NA
PWR          YES         YES+          YES+          Enhancement      Partial
RCC          YES         YES+          YES+          New feature      Partial
RTC          YES         YES++         YES++         New peripheral   Not compatible
SDADC        NA          YES++         NA            New peripheral   NA
SDIO         YES         NA            NA            NA               NA

STM32F3 vs. STM32F1 FW compatibility (3/3)

Peripheral   F1 series   F37x family   F30x family   Comment                          FW compatibility
SPI          YES         YES+          YES+          New feature                      Partial
SYSCFG       NA          YES           YES           NA                               NA
TIM          YES         YES           YES++
USART        YES         YES+          YES+          New feature                      Partial
WWDG         YES         YES           YES           Same feature                     Full
FSMC         YES         NA            NA            NA                               NA
FLASH        YES         YES           YES           Compatible for common feature

New feature or new architecture (Yes++)


Same feature, but specification change or enhancement (Yes+)

Not compatible

STM32F3 Eco-system
Tools

3rd party Tools contribution on F3 workflow

[Figure: F3 tools workflow timeline. Milestones: 1st FPGA, F3 basic support, F3 advanced support, synch with 3rd parties. Activities: validation & verification, firmware development, porting on toolchains, tech/tools support maintenance. Customers: internal, then internal & alpha customers, then 3rd parties.]

How to get started?

http://gnbproject7mms.gnb.st.com/tools/1632toolssupport/default.aspx

Basic Support


Basic Support: SW Patch (Available)
1. Connection
2. Flash algorithm

Advanced Support


Advanced Support: SW Patch (Available)
1. Connection
2. Flash algorithm
3. SFR Viewer (SVD-CMSIS)

Advanced Support: ST-LINK Utility (1/4)

STM32F37x and STM32F30x are supported since v2.3RC1 (Available).

Features:
Display, modify, program and erase the target memory
Save the memory content to different formats (Hex, SREC and Bin)
Display and modify the option bytes
Flash blank check
Compare the device memory content with a file
MCU core registers display
Automatic mode

Advanced Support: ST-LINK Utility (2/4)

Readout Protection:
Flash protection against read (3 levels).

User option bytes:
nSRAM_Parity
nBoot1
...

Tooltip windows:
Further description on the user option bit purpose

Write Protection:
Flash protection against write operations.

Advanced Support: ST-LINK Utility (3/4)

STM32F30x User Option bytes
STM32F37x User Option bytes

nBOOT1
Together with the BOOT0 pin, selects the Boot mode:
nBOOT1 checked/unchecked, BOOT0=0 => Boot from Main Flash memory.
nBOOT1 checked, BOOT0=1 => Boot from System memory.
nBOOT1 unchecked, BOOT0=1 => Boot from Embedded SRAM.
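
The boot mode selection listed above can be written as a small decision function. This is only an illustrative sketch: the type and function names are made up for this example and are not part of the ST-LINK Utility or the firmware library.

```c
#include <stdbool.h>

/* Boot source selected by the BOOT0 pin and the nBOOT1 option bit. */
typedef enum {
    BOOT_MAIN_FLASH,
    BOOT_SYSTEM_MEMORY,
    BOOT_EMBEDDED_SRAM
} boot_source_t;

/* boot0_high:     level sampled on the BOOT0 pin
 * nboot1_checked: state of the nBOOT1 check box in the option bytes dialog */
boot_source_t boot_source(bool boot0_high, bool nboot1_checked)
{
    if (!boot0_high) {
        return BOOT_MAIN_FLASH;      /* nBOOT1 is ignored when BOOT0 = 0 */
    }
    return nboot1_checked ? BOOT_SYSTEM_MEMORY : BOOT_EMBEDDED_SRAM;
}
```

For example, boot_source(true, false) corresponds to the "Boot from Embedded SRAM" case.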

nSRAM_Parity:
Enable/Disable the SRAM memory Parity
check.


Synchronization with 3rd parties


Documentation (RM, PM, DS and ES)
STM32F3xx FW (StdPeriph Library)
Advanced support

3rd parties: Targeting official support

What are the new ST-LINK µVision debug features (1/3) (Available)

1. Flash algorithm and connection mechanism are the same as for ULINK

What are the new ST-LINK µVision debug features (2/3)

2. Add the support of "Connect under reset", "With Pre-Reset", "Normal", "HW Reset", "SW Reset" options

3. Add the support of "Flash Download", "Flash Erase" from Flash menu

What are the new ST-LINK µVision debug features (3/3)

4. Add the support of SWV feature (SWO freq: 2Mbit/sec)

5. Add the support of System and FPU registers display

What is planned for the next period

1. Ensure the official support for STM32F3xx on:
EWARM
MDK-ARM
RIDE7
Tasking
TrueSTUDIO
RedSuite (CodeRed), Crossworks (Rowley Associates), Multi (GreenHills)

2. Publish ST-LINK Utility V2.3 (W18)
3. Contribute to starter kit definition and integration with 3rd parties
4. Ensure the porting of the different FW deliverables to SW toolchains
5. Provide 2nd level support on SW toolchains

Lab Session:
Using ETM to identify
root cause of Hardfault

Requirements
Software Tools:
MDK-ARM v4.50
MDK-ARM STM32F3 Add-on Installer

Hardware Tools:
ULINKpro Debug Adaptor

Target Hardware:
STM3240G-EVAL
or

STM3230C-EVAL or STM3237C-EVAL

Objective
To demonstrate using the ETM interface to quickly and easily identify
the root cause of a hard fault condition.

Which windows to open in MDK to see the trace data


How to interpret the data

Notes
The procedure for the Lab is the same for both target platforms:
STM3240G-EVAL
STM3230C-EVAL or STM3237C-EVAL

There may be some slight differences in the appearance of the screen captures depending upon your target platform.

If you are using the STM3240G-EVAL you need to enable the ETM by
moving two jumpers:
JP1 and JP2 (just below the boot switches)
Set to 1-2 (labelled as Trace)

Open the Project


For the F3 Eval Board:
C:\Blinky_ULp_Hardfault_F3xx\Blinky_Ulp\Blinky.uvproj

For the F4 Eval Board:


C:\Blinky_ULp_Hardfault_F4xx\Blinky_Ulp\Blinky.uvproj

What are we doing?

1 of 2

Inside the project, open the file blinky.c

Inside Blinky.c, locate main()

At the beginning of main() you will see calls to initialise the ADC, LEDs, serial port and systick timer.

What are we doing?

2 of 2

Inside the while(1) loop that is part of main() we repeatedly sample the ADC and use the result to control the toggling speed of the LEDs in the application.

Every 1 second we use printf (through the ITM debug channel) to write the ADC data out on a serial port.
This 1 second delay is determined using the systick timer.

All project configuration (for debug adaptor, trace settings etc) is done
for you.

Entering Debug
Before we enter the Debug view, we must rebuild the application.

Press the Rebuild All Icon


Near the top left of the uVision window

The Build Output window should report 0 Errors 0 Warnings


Enter Debug using the debug button

The Debug View


When you enter debug on the right hand
side you will find the System Viewer
window for ADC3.
This will update dynamically with the ADC data
as the application runs.

Open the Instruction Trace window.


View\Trace\Trace Data
Or

Trace Window
There will already be some content in the Trace window
This is because the application is set to Run to Main on entering debug and the
trace interface collects ALL instructions from device reset.

Run the application:

Press the Run button, or use the F5 key

Turn the POT on the target board to the right (clockwise)

You should be able to observe in the System Viewer for ADC3 that the data changes.
Now turn the POT all the way to the left (anti-clockwise) and back again: data stops coming into the System Viewer window, something is wrong!
Click the Trace Data tab to show the trace view
It's full of Hard Fault Handler entries....
In the code window you can see the Program Counter showing the Hard Fault Handler

Working out what is wrong...

1 of 4

We can work out what is wrong by putting a Breakpoint onto the HardFault Handler and then reviewing the trace information when we hit the Breakpoint.

Left click in the margin of the startup file to add a breakpoint on the hardfault handler.

We need to reset the MCU to remove the hardfault, which means we also need to re-sync the ETM interface.
To re-sync the ETM you must exit and re-enter debug.
Exit Debug with the debug button and then re-enter debug.

Working out what is wrong...

2 of 4

When back in debug, all the windows from the previous session and the Breakpoint will be preserved.
Run the application with the Run button

Turn the POT all the way to the left


The hardfault occurs again, but this time the Trace Window has useful
data in it:

And in the source code window you can see we are at the Hardfault
Breakpoint.

Working out what is wrong...

3 of 4

The last instruction in the trace window shows an SDIV instruction:

Change the display in the Trace window to High Level Language


Use the drop down box near the top of the trace window

Now you can see the last line of source code that was executed

Double click that line and the source code window will update (as will the disassembly window)

Working out what is wrong... 4 of 4


Now we can clearly see what happened before the hardfault (in both
dis-assembly and in C-Source).

This makes it easy to determine the cause of the fault.

In this case, when the ADC reaches zero, the offending line of code generates a
hardfault by doing a divide by 0 operation.
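
The lab itself fixes the fault by commenting out the offending line. As an alternative illustration, a divisor that comes from the ADC can be guarded before the division. The sketch below is hypothetical: blink_period(), SCALE and MAX_PERIOD are not names or values from the lab project. Note also that on Cortex-M an integer divide by zero only raises a fault when the divide-by-zero trap (DIV_0_TRP in the Configuration and Control Register) is enabled, which is assumed to be the case in the lab project.

```c
#include <stdint.h>

/* Illustrative constants, not taken from the lab project. */
#define SCALE       4095u   /* numerator used to derive the period      */
#define MAX_PERIOD  4095u   /* period used when the ADC reading is zero */

/* Derive an LED blink period from an ADC reading without ever
 * executing a division by zero. */
uint32_t blink_period(uint32_t adc_value)
{
    if (adc_value == 0u) {
        return MAX_PERIOD;          /* POT fully left: avoid SCALE / 0 */
    }
    return SCALE / adc_value;
}
```

With this guard, turning the POT all the way to the left selects the slowest blink rate instead of ending in the hard fault handler.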

Now we know what is wrong we can modify the source code to fix it...
Exit debug using the debug button

Fix and Retest


A simple fix for this lab...
Open IRQ.c in the uVision editor
In the systick handler, locate the offending line and comment it out
Re-build all using the Rebuild All button
Enter debug using the debug button
Run the application using the Run button
Twist the POT, no more hardfault!!

Debug and Trace Adapters

ULINK2 (JTAG, SWD, SWO at 1Mb/s):
Programming + Run-Control
Memory + Breakpoint Access
Serial Wire Trace Capturing (SWO)
1Mbit/sec (UART mode)

ULINKpro (JTAG, SWD, SWO at 100Mb/s, ETM Streaming):
ULINK2 +
Serial Wire Trace (SWO)
100Mbit/sec (Manchester Mode)
ETM Streaming Trace
Up to 800Mbit/sec
100% Code Coverage and Performance Analysis

What is Streaming Trace?

Trace data transferred in real-time to debug host
Trace for minutes, hours, or longer
Required for full code-coverage and timing analysis
Today's workstations can present trace data instantly

Alternative analysis methods


Other vendors offer Code Coverage and Profiling, but...
Code Coverage
Only in simulation. Not sufficient for certification

Execution Profiling
SWO
Statistical sampling samples 1 in 1,000 cycles
Only gives an approximation of application performance

ETM
Limited to size of debug adapter trace buffer (typically 4MB)
Can only profile small parts of application (~10Secs)
Cannot soak test application for long periods

ULINKpro Fastest Data Trace


100 times faster than most other MCU solutions
Real-Time data trace analysis
CPU operates at full speed
No overflows or lost data

MDK gives clear visibility into application behaviour


ULINKpro Streaming Instruction Trace


Only solution to stream trace directly to PC, which delivers unique capabilities:
Search trace data
Save trace data
Function trace
Synchronised to source code

ULINKpro Unlimited Trace

Trace Navigation

ULINKpro 100% Code Coverage


100% accurate Code Coverage on silicon
Identifies every executed instruction
Colour code in left margin of source shows execution level

Essential for software verification & certification


ULINKpro Performance Analysis


Performance Analysis
Optimize and Profile Applications
Quickly identify hot-spots with the performance analysis view
Use the Execution analysis data to determine the exact line of source code that is the best target for optimisation

ULINKpro Advanced Debug Capability

Streaming Instruction Trace
Debug historical sequences
Full details of execution history

Application Soak testing over long periods of time

Performance Analysis
Optimize and Profile Applications
Identify hotspots quickly

Code Coverage
Implement 100% accurate Code Coverage on silicon
Essential for validation and verification

Fastest Data Trace


100 times faster than any other solution
CPU at full speed
No overflows or lost data

Further Information
Visit www.keil.com/arm
Product Overview
Users Manual
Application Notes

Continue with F3 common features/peripherals

Reset and Clock control RCC

RCC introduction
Reset:
Initialize the device
Wakeup device
Safety functions (watchdog)

Clocks:
Select appropriate clock source:
Internal
External
Select appropriate speed:
High speed
Low speed
Speed regulation
Modify clock parameters for:
Core
Peripherals
Security functions:
In case of clock source malfunction

Reset sources

System RESET
Resets all registers except some RCC registers and the Backup domain
Sources:
Low level on the NRST pin (External Reset)
WWDG & IWDG end of count condition
A software reset (through NVIC)
Low power management reset
Option byte loader reset (FORCE_OBL bit)

Power RESET
Resets all registers except the Backup domain
Sources:
Power On/Power down Reset (POR/PDR)
Exit from STANDBY

Backup domain RESET
Resets the Backup domain: RTC registers + Backup Registers + RCC_BDCR register
Sources:
BDRST bit in RCC_BDCR register
POWER Reset
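
After a reset, firmware can tell these sources apart from the reset flags kept in RCC_CSR. The sketch below decodes a value read from that register. The bit positions are an assumption to be checked against the reference manual of the device used, and the CSR_* macro names and reset_cause() are made up for this example (they are not the firmware library definitions).

```c
#include <stdint.h>

/* Assumed RCC_CSR reset flag positions; verify against the reference manual. */
#define CSR_OBLRSTF   (1u << 25)   /* option byte loader reset   */
#define CSR_PINRSTF   (1u << 26)   /* NRST pin reset             */
#define CSR_PORRSTF   (1u << 27)   /* POR/PDR reset              */
#define CSR_SFTRSTF   (1u << 28)   /* software reset             */
#define CSR_IWDGRSTF  (1u << 29)   /* independent watchdog reset */
#define CSR_WWDGRSTF  (1u << 30)   /* window watchdog reset      */
#define CSR_LPWRRSTF  (1u << 31)   /* low power management reset */

typedef enum {
    RESET_UNKNOWN, RESET_PIN, RESET_POWER, RESET_SOFTWARE,
    RESET_IWDG, RESET_WWDG, RESET_LOW_POWER, RESET_OBL
} reset_cause_t;

/* The pin flag is tested last because internal reset sources also
 * pulse the NRST line, so it is usually set together with another flag. */
reset_cause_t reset_cause(uint32_t csr)
{
    if (csr & CSR_PORRSTF)  return RESET_POWER;
    if (csr & CSR_LPWRRSTF) return RESET_LOW_POWER;
    if (csr & CSR_WWDGRSTF) return RESET_WWDG;
    if (csr & CSR_IWDGRSTF) return RESET_IWDG;
    if (csr & CSR_SFTRSTF)  return RESET_SOFTWARE;
    if (csr & CSR_OBLRSTF)  return RESET_OBL;
    if (csr & CSR_PINRSTF)  return RESET_PIN;
    return RESET_UNKNOWN;
}
```

On the target, the function would typically be called once, early in main(), with the value read from RCC_CSR, before the flags are cleared.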

Reset block diagram


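Reset sources in STM32F3 family and their relation to RESET pin:

[Figure: reset block diagram. The external reset on the NRST pin (pull-up RPU to VDD) passes through a filter to generate SYSTEM RESET. The internal sources (WWDG reset, IWDG reset, software reset, low power management reset, option byte loader reset, and power reset from Standby exit or POR/PDR) go through a pulse generator (min 20 µs).]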

Clock features (1/2)


System Clock (SYSCLK) sources:
HSE (High Speed External oscillator or crystal):
4MHz to 32MHz,
can be bypassed by user clock
HSI (High Speed Internal RC):
factory trimmed internal RC oscillator 8MHz +/- 1%
PLL x2, x3, .. x16:
From HSE or HSI/2
16MHz to 72MHz output
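
As a worked example of the PLL figures above: an 8 MHz HSE multiplied by 9 gives 72 MHz, while the HSI path (8 MHz / 2) tops out at 4 MHz x 16 = 64 MHz. The helper below is a sketch with a made-up name; it only checks the multiplier and output range quoted on this slide and does not touch any RCC register.

```c
#include <stdint.h>
#include <stdbool.h>

#define PLL_OUT_MIN_HZ  16000000u
#define PLL_OUT_MAX_HZ  72000000u

/* pll_in_hz: PLL input frequency (HSE, or HSI/2 = 4 MHz)
 * mul:       PLL multiplier, x2 .. x16
 * Returns true and stores the output frequency when the setting is
 * inside the 16 MHz to 72 MHz output range, false otherwise. */
bool pll_output_hz(uint32_t pll_in_hz, uint32_t mul, uint32_t *out_hz)
{
    if (mul < 2u || mul > 16u) {
        return false;
    }
    uint32_t f = pll_in_hz * mul;   /* at most 32 MHz * 16, fits in 32 bits */
    if (f < PLL_OUT_MIN_HZ || f > PLL_OUT_MAX_HZ) {
        return false;
    }
    *out_hz = f;
    return true;
}
```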

Additional clock sources:
LSI (Low Speed Internal RC):
~40kHz internal RC
LSE (Low Speed External oscillator):
32.768kHz
can be bypassed by user clock
Configurable driving strength (power/robustness compromise)

Clock features (2/2)


Clock-out capability on the MCO:
LSI, LSE, SYSCLK, HSI, HSE, PLL/2

Clock Security System (CSS) to switch to backup clock:
In case of HSE clock failure
Enabled by SW w/ interrupt capability linked to NMI
Could generate BREAK for Timers

RTC Clock sources:
LSE, LSI and HSE/32

USART, I2C & CEC have multiple possible clock sources:
Possibility to wakeup device if there is no system clock:
For USART: HSI, LSE
For I2C: HSI
For HDMI-CEC: LSE, HSI

Clock scheme STM32F37x


[Figure: STM32F37x clock scheme. Oscillators: HSI RC (8MHz), HSE Osc (4-32 MHz on OSC_IN/OSC_OUT, monitored by CSS), LSE Osc (32.768KHz on OSC32_IN/OSC32_OUT) and LSI RC (~40kHz). The PLL (x2, x3, .. x16) is fed from HSI/2 or from HSE through a /2, 3, ..16 divider and produces PLLCLK. SYSCLK (72 MHz max) feeds the AHB prescaler (/1, 2, ..512) to produce HCLK, with a /8 divider towards SysTick. The APB1 and APB2 prescalers (/1,2,4,8,16) produce PCLK1 and PCLK2; the timer clocks TIMxCLKAPB1 and TIMxCLKAPB2 are x1 if the APB prescaler is 1, else x2. Also shown: RTCCLK (LSE, LSI or HSE/32), IWDGCLK (LSI), USB, ADC (/2,4,6,8), SDADC1, SDADC2, SDADC3, USART1, USART2, USART3, I2C1, I2C2, CEC, SPI1/I2S1, SPI2, SPI3, I2S2, I2S3 and MCO clock selections.]

Clock scheme STM32F30x

[Figure: STM32F30x clock scheme. Oscillators: HSI RC 8 MHz (also FLITFCLK to the Flash programming interface and clock to I2Cx, x = 1,2), HSE OSC 4-32 MHz (OSC_IN/OSC_OUT, monitored by CSS), LSE OSC 32.768kHz (OSC32_IN/OSC32_OUT) and LSI RC 40 KHz (IWDGCLK to IWDG). PLLSRC selects the PLL input and PLLMUL (x2, x3.. x16) produces PLLCLK; the USB prescaler (/1, 1.5) produces USBCLK to the USB interface. SYSCLK feeds the AHB prescaler (/1,2,..512), giving HCLK to the AHB bus, core, memory and DMA, the Cortex System Timer (systick, through /8) and the Cortex free running clock. The APB1 prescaler (/1,2,4,8,16) gives PCLK1 to APB1 peripherals and, x1 if the prescaler is 1 else x2, the clock to TIM 2,3,4,6,7. The APB2 prescaler (/1,2,4,8,16) gives PCLK2 to APB2 peripherals and, x1 if the prescaler is 1 else x2, the clock to TIM 15,16,17. Also shown: RTCCLK to RTC selected by RTCSEL[1:0] (LSE, LSI, HSE/32), I2S_CKIN pin for I2Sx (x = 2,3), USART1 and USARTx (x = 2..5) clock selections (PCLK, SYSCLK, HSI, LSE), ADCxy (xy = 12, 34) clocks through the ADC prescalers (/1,2,4 and /1,2,4,6,8,10,12,16,32,64,128,256), the x2 clock option for TIM1/8, and MCO (PLLCLK/2, HSI, LSI, HSE, SYSCLK, LSE).]

HSI/LSI/ext. clock measurement


TIM14 (in F37x) and TIM16 (in F30x) input capture can be triggered by:
GPIO pin
RTCCLK
HSE/32
MCO output
The source is selected by TI1_RMP[1:0] in TIM14_OR/TIM16_OR.

[Figure: TIM14 or TIM16 TI1 input multiplexer selecting GPIO, RTCCLK, HSE/32 or MCO; the MCO sources are LSI, LSE, SYSCLK, HSI, HSE and PLLCLK/2.]

Purposes:
Measure HSI frequency using the precise LSE clock. HSI is used as system clock. Knowing the (more precise) LSE frequency we can determine the HSI frequency.
Measure the LSI frequency using HSE or HSI, to fine tune IWDG and/or RTC timing (if LSI is used as RTC clock).
Have a rough indication of the frequency of the external crystal by comparing HSI and HSE/32
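
A sketch of the arithmetic behind the first purpose: with the timer counting the HSI-derived clock and TI1 capturing LSE edges, the number of timer ticks counted over a known number of LSE periods gives the timer clock frequency. The function name is hypothetical, and the code assumes the timer runs unprescaled and does not overflow between the two captures.

```c
#include <stdint.h>

#define LSE_HZ 32768u   /* nominal LSE crystal frequency */

/* counts:  timer ticks between the first and the last captured LSE edge
 * periods: number of LSE periods between those two edges
 * Returns the estimated timer clock frequency in Hz (0 if periods is 0). */
uint32_t measured_timer_clock_hz(uint32_t counts, uint32_t periods)
{
    if (periods == 0u) {
        return 0u;
    }
    /* 64-bit intermediate so counts * LSE_HZ cannot overflow */
    return (uint32_t)(((uint64_t)counts * LSE_HZ) / periods);
}
```

Capturing over more LSE periods reduces the quantisation error: 244 ticks over 1 period gives about 7.995 MHz, while 1953 ticks over 8 periods gives about 7.999 MHz for a nominal 8 MHz HSI.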

Quiz
What are the maximum AHB, APB1 and APB2 clock frequencies?
What is the purpose of connecting the LSE clock to the TIM14/16 CH1 input capture, and how can it be done?
What is the purpose of the CSS?

CRC calculation unit

CRC Introduction 1/2


CRC-based techniques are used to verify data integrity (communications)
In functional safety standards (such as EN/IEC 60335-1), the CRC peripheral offers a means of verifying the embedded Flash memory integrity
Single input/output 32-bit data register, but handles 8-, 16- and 32-bit input data sizes
CRC computation done in 4 AHB clock cycles (HCLK) maximum
General-purpose 8-bit register (can be used for temporary storage)

CRC Introduction 2/2


New features:
Programmable parameters:
Programmable polynomial:
By default uses CRC-32 (Ethernet) polynomial: 0x4C11DB7
Alternatively uses a fully programmable polynomial with programmable size (7, 8, 16, 32 bits)
Programmable CRC initial value (default = 0xFFFF_FFFF)

Reversibility option on I/O data:
Input data can be bit-reversed by 8, 16 or 32 bits
Example: input data 0x1A2B3C4D is used for CRC calculation as:
0x58D43CB2 with bit-reversal done by byte
0xD458B23C with bit-reversal done by half-word
0xB23CD458 with bit-reversal done on the full word
Output data can be bit-reversed over 32 bits (output register)
Example on output data 0x11223344:
0x11223344 = 0001 0001 0010 0010 0011 0011 0100 0100 is read as
0x22CC4488 = 0010 0010 1100 1100 0100 0100 1000 1000
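
The reversal examples above can be reproduced in software. The sketch below is a plain C model of the bit reversal only, not a driver for the peripheral; crc_reverse() and rev_bits() are names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the lowest `width` bits of v. */
static uint32_t rev_bits(uint32_t v, unsigned width)
{
    uint32_t r = 0u;
    for (unsigned i = 0u; i < width; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}

/* Bit-reverse a 32-bit word in independent chunks of `width` bits
 * (8 = by byte, 16 = by half-word, 32 = full word). */
uint32_t crc_reverse(uint32_t word, unsigned width)
{
    assert(width == 8u || width == 16u || width == 32u);
    uint32_t out = 0u;
    for (unsigned shift = 0u; shift < 32u; shift += width) {
        uint32_t chunk = (width == 32u)
                       ? word
                       : (word >> shift) & ((1u << width) - 1u);
        out |= rev_bits(chunk, width) << shift;   /* chunk stays in place */
    }
    return out;
}
```

crc_reverse(0x1A2B3C4D, 8), crc_reverse(0x1A2B3C4D, 16) and crc_reverse(0x1A2B3C4D, 32) give the three input examples, and crc_reverse(0x11223344, 32) gives the output example.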


CRC Operation

Operation:

Each write operation to the data register creates a combination of the previous CRC value (stored in CRC_DR) and the new one. CRC computation is done on the whole 32-bit data word or byte by byte, depending on the format of the data being written.
The duration of the CRC computation depends on input data width:
4 AHB clock cycles for 32-bit
2 AHB clock cycles for 16-bit
1 AHB clock cycle for 8-bit

Polynomial can be changed after finishing current CRC calculation (or after CRC reset)
The input and output data can be bit reversed, to manage the various endianness schemes (REV_IN[1:0], REV_OUT bits).
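
For checking results on a host, the default configuration can be modelled in software. The sketch below assumes the default settings described in this section (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, 32-bit writes, no input or output reversal) and processes each word MSB first; it is a bitwise reference model with made-up function names, not ST code.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC_POLY 0x04C11DB7u   /* CRC-32 (Ethernet) polynomial */
#define CRC_INIT 0xFFFFFFFFu   /* default initial value        */

/* Combine the previous CRC value with one 32-bit data word. */
uint32_t crc32_word(uint32_t crc, uint32_t data)
{
    crc ^= data;
    for (int i = 0; i < 32; i++) {
        crc = (crc & 0x80000000u) ? (crc << 1) ^ CRC_POLY : (crc << 1);
    }
    return crc;
}

/* CRC over a buffer of 32-bit words, starting from the default initial value. */
uint32_t crc32_block(const uint32_t *words, size_t count)
{
    uint32_t crc = CRC_INIT;
    for (size_t i = 0; i < count; i++) {
        crc = crc32_word(crc, words[i]);
    }
    return crc;
}
```

Each call to crc32_word() corresponds to one 32-bit write to the data register followed by a read of the result.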
[Figure: CRC block diagram. AHB bus with 32-bit read access to the data register (output) and 32-bit write access to the data register (input); the CRC computation block uses the initial value and the polynomial.]

Quiz
What are the new programmable parameters in CRC?
How many cycles are required to compute a CRC of 15 bytes from RAM?
What is the value taken into CRC computation if the data input is 0x11223344 and the reversal mode is set to half word?

0x44332211
0x22114433
0x448822CC

Digital to Analog Converter DAC

DAC introduction
Interfaces:
Two 12-bit DAC converters inside STM32F37x:
DAC1 with 2 DAC output channels
DAC2 with 1 output channel
One 12-bit DAC converter inside STM32F30x:
DAC1 with 2 DAC output channels

Features and differences:
8-bit or 12-bit mode (left or right data alignment in 12-bit mode)
Synchronized update capability
Noise-wave or Triangular-wave generation
DMA capability for each channel (with DMA underrun error detection)
External triggers for conversion (Timers, ext. pin, SW trigger)
Programmable output buffer to drive more current
Input voltage reference VREF+
DAC supply requirement: VDDA = 2.4V to 3.6 V
DAC outputs range: 0 ≤ DAC_OUTx ≤ VREF+
Dual DAC channel mode supported by DAC1 only:
Two channels can be used independently or simultaneously when both channels are grouped together for synchronous update operations (dual mode).

DACx channel block diagram


[Figure: DACx channel block diagram. DAC control register fields DMAENx, TENx, MAMPx[3:0], WAVEx[1:0], TSELx[2:0] and BOFF; trigger inputs SWTRIGx, TIM2_TRGO, TIM3/8_TRGO, TIM4_TRGO, TIM5/15_TRGO, TIM6_TRGO, TIM7_TRGO and Ext_IT_9 into control logic x, which also issues DMA request x; 12-bit DHRx, noise/triangle generator and 12-bit DORx feeding the digital to analog converter x (VREF+, VDDA, VSSA) that drives DAC_OUTx.]

DAC analog output


Output voltage:
Analog output voltage is given by formula:
DAC Output = VREF+ * (DOR / 4095)
VREF+: reference voltage (shared with ADC, input pin or shared with VDDA)
DOR: Data output register
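
The formula can be applied in both directions with integer arithmetic. The sketch below follows the formula exactly as given on this slide (divisor 4095) and works in millivolts; the function names are hypothetical.

```c
#include <stdint.h>

/* Output voltage in mV for a 12-bit DOR code: VREF+ * DOR / 4095. */
uint32_t dac_output_mv(uint32_t vref_mv, uint16_t dor)
{
    uint32_t code = dor & 0x0FFFu;          /* keep the 12 data bits */
    return (vref_mv * code) / 4095u;
}

/* Nearest 12-bit code for a requested output voltage in mV. */
uint16_t dac_code_for_mv(uint32_t vref_mv, uint32_t target_mv)
{
    if (target_mv >= vref_mv) {
        return 0x0FFFu;                     /* clamp at full scale */
    }
    return (uint16_t)((target_mv * 4095u + vref_mv / 2u) / vref_mv);
}
```

With VREF+ = 3300 mV, code 0xFFF maps to 3300 mV and a request for 1650 mV maps to code 2048.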

Output current:
Optional output analog buffer (booster) to improve current capability (BOFF bit)
Without output analog buffer (BOFF bit = 1):
Rail to rail output: Vout = (VREF+ + 1LSB) to (VREF+ - 1LSB)
Output impedance: 15kΩ
Min. load for 1% error: >1.5MΩ

With output analog buffer (BOFF bit = 0):
Limited output near edges: Vout = (200mV) to (VDDA - 200mV)
Min. load for 1LSB error: >5kΩ

[Figure: DAC_Channel_x with DOR = 0xFFF (3.3V) driving DAC_OUT into a load RLOAD >= 5 kΩ connected to VSS.]

DAC data format


8-bit mode:
Always right alignment (in register DAC_DHR8Rx)
Also in dual channel mode (register DAC_DHR8RD)

12-bit mode:
Right alignment (in register DAC_DHR12Rx)
Left alignment (in register DAC_DHR12Lx)
Also in dual channel mode (registers DAC_DHR12RD, DAC_DHR12LD)
8 bits Right alignment: Load DAC_DHR8Rx [7:0]
12 bits Right alignment: Load DAC_DHR12Rx [11:0]
12 bits Left alignment: Load DAC_DHR12Lx [15:4]
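
The alignments above come down to where the data bits sit in the holding register. The helpers below are an illustrative sketch with made-up names. The dual-channel layout (channel 1 in the lower half-word, channel 2 in the upper half-word, right-aligned data at [11:0] and [27:16]) is an assumption to be checked against the reference manual; the left-aligned positions [15:4] and [31:20] are the ones quoted in this training.

```c
#include <stdint.h>

/* Single channel: value to write for each alignment. */
uint32_t dhr8r(uint8_t v)    { return v; }                             /* [7:0]  */
uint32_t dhr12r(uint16_t v)  { return v & 0x0FFFu; }                   /* [11:0] */
uint32_t dhr12l(uint16_t v)  { return (uint32_t)(v & 0x0FFFu) << 4; }  /* [15:4] */

/* Dual channel: channel 1 in the low half-word, channel 2 in the high one. */
uint32_t dhr12rd(uint16_t ch1, uint16_t ch2)
{
    return dhr12r(ch1) | (dhr12r(ch2) << 16);    /* [11:0] and [27:16] */
}

uint32_t dhr12ld(uint16_t ch1, uint16_t ch2)
{
    return dhr12l(ch1) | (dhr12l(ch2) << 16);    /* [15:4] and [31:20] */
}
```

Left alignment is convenient when the samples are signed or 16-bit audio-style data, since the 12 most significant bits can be written without shifting in the application.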


DAC conversion triggers


Conversion started (load data to the DORx register):
Automatically (if external trigger disabled, TENx = 0):
From DAC_DHRx after one APB1 clock cycle
Triggered conversion (if external trigger enabled, TENx = 1):
Three APB1 clock cycles after the trigger is generated by an external trigger (except SWTRIG)

Triggers:
Timers:
Timer 6 TRGO event
Timer 3 TRGO event (or Timer 8 as option for STM32F30x)
Timer 7 TRGO event
Product dependent:
For STM32F37x: Timer 5 (for DAC1) / Timer 18 (for DAC2) TRGO event
For STM32F30x: Timer 15
Timer 2 TRGO event
Timer 4 TRGO event
External pin:
EXTI line9
Software:
SWTRIG bit

DAC noise wave generation


Noise generation:
Based on LFSR (linear feedback shift register):
Initial value = 0xAAAA
The 12-bit LFSR value can be masked partially or totally
Anti lock-up mechanism: if the LFSR is equal to 0 then a 1 is injected into it
Calculated noise value (updated through external trigger) is added to the DAC_DHRx content without overflow

DAC triangle wave generation


Triangle generation:
Add a small-amplitude triangular waveform on a DC or slowly varying signal: used
as basic waveform generator for example
Calculated triangle value (updated through external trigger) is added to the
DAC_DHRx content without overflow to reach the configurable max amplitude
Up-Down triangle counter:
Incremented to reach defined max amplitude value
Decremented to return to the initial base value

Triangle max. amplitude is configurable: (2^N - 1) with N=[1..12]

MAMPx[3:0]: Max amplitude

DAC_DHRx: Base value
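
A short sketch of the amplitude arithmetic: the maximum amplitude is 2^N - 1 for N = 1..12, and the base value plus the amplitude has to stay within the 12-bit range. The encoding MAMPx[3:0] = N - 1 is an assumption to be confirmed in the reference manual, and all function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Maximum triangle amplitude for N = 1..12: 2^N - 1. */
uint16_t triangle_amplitude(unsigned n)
{
    assert(n >= 1u && n <= 12u);
    return (uint16_t)((1u << n) - 1u);
}

/* Assumed field encoding: MAMPx[3:0] = N - 1. */
uint8_t mamp_for_n(unsigned n)
{
    assert(n >= 1u && n <= 12u);
    return (uint8_t)(n - 1u);
}

/* True when base value + max amplitude still fits in 12 bits. */
bool triangle_fits(uint16_t base, unsigned n)
{
    return (uint32_t)base + triangle_amplitude(n) <= 0x0FFFu;
}
```

For example, a base value of 2048 leaves room for N up to 11 (amplitude 2047), since 2048 + 2047 = 4095.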


DAC dual channel mode


For DAC1 only (2 channel outputs):
Both DAC channels can be used together to generate differential or stereo signals in simultaneous conversion mode

11 DAC dual modes:
Independent trigger or Simultaneous trigger or Software start
without or with wave generation
LFSR or Triangle wave generation

All modes listed:


1. Independent trigger without wave generation
2. Independent trigger with single noise generation
3. Independent trigger with different noise generation
4. Independent trigger with single triangle generation
5. Independent trigger with different triangle generation
6. Simultaneous trigger without wave generation
7. Simultaneous trigger with single LFSR generation
8. Simultaneous trigger with different LFSR generation
9. Simultaneous trigger with single triangle generation
10. Simultaneous trigger with different triangle generation
11. Simultaneous software start

12 bits Left alignment in dual channel mode: Load DAC_DHR12LD bits [31:20] and [15:4]

DAC with DMA


A DAC DMA request is generated when an external trigger occurs:
The value of the DAC_DHR register is then transferred to the DAC_DOR register.

DMA underrun error detection with interrupt capability

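[Figure: DAC with DMA. The CPU prepares patterns in RAM (Pattern 1, Pattern 2); on each DAC trigger the DMA transfers the next value to DACx, which drives the channel x output.]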

Quiz
How many DAC channels are in the STM32F3x microcontroller ?
What are the possibilities to start a DAC channel conversion?
What are the different generated waves?
What is the min. load of DAC channel output?


Independent and System Watchdogs

System window watchdog (WWDG) features (1/2)

Configurable time-window can be programmed to detect abnormally late or early application behavior
Conditional reset
Reset (if watchdog activated) when the down counter value becomes less than 40h (T6=0)
Reset (if watchdog activated) if the down counter is reloaded outside the time-window

To prevent WWDG reset:
write T[6:0] bits (with T6 equal to 1) at regular intervals while the counter value is lower than the time-window value (W[6:0])
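
The refresh rule can be expressed as a predicate on the current counter value and the window value. This sketch uses a hypothetical function name and follows the comparator condition used in this training (a write while T[6:0] > W[6:0] causes a reset), so a counter value equal to the window value is treated as allowed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* t: current down counter value T[6:0]
 * w: window value W[6:0]
 * Returns true when reloading the counter now would not cause a reset. */
bool wwdg_refresh_allowed(uint8_t t, uint8_t w)
{
    uint8_t counter = t & 0x7Fu;
    uint8_t window  = w & 0x7Fu;

    if (counter <= 0x3Fu) {
        return false;            /* T6 already cleared: reset has occurred */
    }
    return counter <= window;    /* too early while the counter is above W */
}
```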
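[Figure: WWDG block diagram and refresh timing. WWDG_CFR holds the window value W6..W0; WWDG_CR holds WDGA and the down counter bits T6..T0 (CNT), clocked from PCLK through the prescaler (WDGTB). The comparator output is 1 when T6:0 > W6:0; a write to WWDG_CR at that time, or T6 becoming 0, generates the WWDG reset. On the time axis, refresh is not allowed while T[6:0] is above W[6:0]; the refresh window lasts from W[6:0] down to 3Fh.]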

WWDG features (2/2)


Early Wakeup Interrupt (EWI):
occurs whenever the counter reaches 40h
can be used to reload the down counter, or to run recovery code or store state before reset (in special cases)

WWDG reset flag (in RCC_CSR) to inform when a WWDG reset occurs (after device reset)
Prescaled from the APB1 clock (36MHz):
4 predividers:
4096 * 1
4096 * 2
4096 * 4
4096 * 8

Min-max timeout value: 113.8µs / 58.25ms
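
The min and max values follow from the counter clock: one counter step lasts 4096 x 2^WDGTB APB1 clock periods, and the counter takes (T[5:0] + 1) steps to go from its programmed value to 3Fh. The sketch below reproduces the two figures for a 36 MHz APB1 clock; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* pclk1_hz: APB1 clock frequency
 * wdgtb:    prescaler selection 0..3 (4096 * 1, 2, 4, 8)
 * t:        value written to T[6:0] (T6 = 1)
 * Returns the time until reset in nanoseconds. */
uint64_t wwdg_timeout_ns(uint32_t pclk1_hz, unsigned wdgtb, uint8_t t)
{
    assert(pclk1_hz > 0u && wdgtb <= 3u);
    uint64_t steps     = (uint64_t)(t & 0x3Fu) + 1u;       /* T[5:0] + 1     */
    uint64_t pclk_tick = 4096ull * (1ull << wdgtb) * steps;
    return pclk_tick * 1000000000ull / pclk1_hz;
}
```

wwdg_timeout_ns(36000000, 0, 0x40) is about 113.8 µs and wwdg_timeout_ns(36000000, 3, 0x7F) is about 58.25 ms.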

Best suited to applications which require the watchdog to react within an accurate timing window

Independent window watchdog (IWDG) features (1/2)

[Figure: IWDG block diagram. Prescaler, Status, Reload, Key and Window registers in the 1.8V voltage domain; in the VDD voltage domain, LSI (40KHz) feeds the 8-bit prescaler and the 12-bit down counter, with a 12-bit reload value and a 12-bit window comparator, generating the IWDG reset.]

Best suited to applications which require the watchdog to run as a totally independent process outside the main application

Selectable HW/SW start through option byte
Clocked from an independent RC oscillator (LSI)
can operate in Standby and Stop modes
Once enabled the IWDG can't be disabled (LSI can't be disabled either)
Implemented in the VDD voltage domain:
Is still functional in STOP and STANDBY mode

Independent window watchdog (IWDG) features (2/2)

Conditional reset
Reset (if IWDG activated) when the downcounter value reaches 0x000
Reset (if IWDG activated) if the downcounter is reloaded outside the window

To prevent IWDG reset:
write to the IWDG_KR register with the 0xAAAA key value at regular intervals before the counter reaches 0, while respecting the defined refresh window
Safe Reload Sequence (key) + window

IWDG reset flag (in RCC_CSR) to inform when an IWDG reset occurs (after device reset)
Prescaled from the LSI clock (40kHz):
8-bit predivider: 4-256 (and 12-bit watchdog counter):
Min-max timeout value: 100µs / 26.2s
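
The two limits can be checked with a few lines of arithmetic: the timeout is the predivider times the number of counter steps (reload value + 1), divided by the LSI frequency. The function name below is hypothetical, and the nominal 40 kHz LSI value is used even though the real LSI frequency varies from part to part.

```c
#include <assert.h>
#include <stdint.h>

/* lsi_hz:    LSI frequency (nominally 40000)
 * prescaler: predivider value, 4 .. 256
 * reload:    12-bit reload value, 0x000 .. 0xFFF
 * Returns the watchdog timeout in microseconds. */
uint64_t iwdg_timeout_us(uint32_t lsi_hz, uint32_t prescaler, uint32_t reload)
{
    assert(lsi_hz > 0u);
    uint64_t steps = (uint64_t)(reload & 0x0FFFu) + 1u;
    return (uint64_t)prescaler * steps * 1000000ull / lsi_hz;
}
```

iwdg_timeout_us(40000, 4, 0) gives the 100 µs minimum and iwdg_timeout_us(40000, 256, 0xFFF) gives the 26.2 s maximum.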


Quiz
Which clock feeds the IWDG down counter?
How can the IWDG be started?
In which WDG is the window option implemented?
How to detect that the device was reset by a watchdog?

Serial peripheral interface SPI

SPI Features (1/2)

Full duplex synchronous transfers (3 lines)
Half duplex/Simplex synchronous transfers (2 lines, bi-directional data line at half duplex)
Programmable clock polarity & phase, data MSB/LSB first
Master/multi Master/Slave operation
Dynamic software/hardware NSS management (Master/Slave)
Hardware CRC feature (8-bit & 16-bit data frames checking)
Flags with IT capability (TxE, RxNE, MODF, OVR, CRCERR)
Programmable bit rate: up to fPCLK/2 (up to 18 MHz bit rate)
BSY flag (ongoing communication check)
DMA capability (separated Rx/Tx channels, automatic CRC & Tx/Rx access/threshold handling)

SPI Features (2/2)

NEW!
New enhanced NSS control:
NSS pulse mode (NSSP)
TI mode
Programmable data frame from 4-bit to 16-bit
Two 32-bit Tx/Rx FIFO buffers with DMA capability
Data packed mode control

SPI block scheme

[Figure: SPI block scheme; the I2S signal names of the shared pins are shown in brackets: (SD), (MCLK), (CK), (WS).]

Data Frame Format (Motorola mode)

Data frame format:
Programmable size of transfer data frame format from 4-bit up to 16-bit
Programmable clock polarity & phase, data bit order MSB/LSB first

NSS management

NSS input: SSM selects HW control (NSS pin) or SW control (SSI bit):
Slave mode: selects the slave for communication (optionally can be used to synchronize the beginning of a transaction)
Master mode: signals a conflict between masters

NSS output: HW control in master mode only (SSM=0, SSOE=1 or in specific modes: TI, NSSP)

Communication modes 1/3

Full Duplex mode (Single master & single slave)
At least three lines are necessary: MISO, MOSI, SCK (NSS is optional)

[Figure: master and slave connections, shown with NSS HW management and with NSS SW management.]

Communication modes 2/3

Simplex mode (Single master & single slave)
At least two lines are necessary (NSS is optional):
MISO & SCK (Master Rx only, Slave Tx only)
MOSI & SCK (Master Tx only, Slave Rx only, as in the figure below)

[Figure: master and slave connections in simplex mode.]

Communication modes 3/3

Half duplex mode (Single master & single slave)
At least two lines are necessary: the bidirectional cross data line MOSI-MISO and SCK (NSS is optional)

[Figure: master and slave connections in half duplex mode.]

Multi slave system (duplex/simplex)

Separated NSSs
At full duplex only one slave (the one selected by NSS) can communicate with the master at a time

Common NSS
Slaves in simplex Rx-only mode can receive the same data sent in parallel from the master
The MISO pin can't be used in this case

[Figure: one master connected to two slaves.]

Multi slave circular duplex chain


Data lines are connected into a closed loop
Master must shift the data through all of the slaves
One common NSS is used

32-bit Rx and Tx FIFOs


Two separated 32-bit FIFOs for receive and transmission
8-bit or 16-bit read/write access to FIFOs.
Occupancy level flags FTLVL[1:0], FRLVL[1:0], TxE, RxNE
Different capability of Tx and Rx FIFOs

Packed mode & FIFOs access


When the data frame fits into one byte (from 4 up to 8 bits), two patterns can be accessed in parallel by a single write/read of the FIFOs.

Example:
4-bit data frame length, MSB first, 16-bit threshold is set for RxFIFO,
both FIFOs can be accessed by a single 16-bit read or write
1x TxE event at transmitter, 1x RxNE event at receiver
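
In software, packed access means combining two short frames into one 16-bit value before writing the data register, and splitting one 16-bit read back into two frames. The sketch below assumes that the least significant byte of the 16-bit access holds the frame that goes on the wire first; this ordering should be confirmed in the reference manual, and the helper names are made up for this example.

```c
#include <stdint.h>

/* Pack two frames (each up to 8 bits) into one 16-bit data register access.
 * Assumption: the low byte is transferred first. */
uint16_t spi_pack_frames(uint8_t first, uint8_t second)
{
    return (uint16_t)((uint16_t)first | ((uint16_t)second << 8));
}

/* Split one 16-bit data register read into the two received frames. */
void spi_unpack_frames(uint16_t dr, uint8_t *first, uint8_t *second)
{
    *first  = (uint8_t)(dr & 0x00FFu);
    *second = (uint8_t)(dr >> 8);
}
```

Under that assumption, sending the 4-bit frames 0xA then 0x4 takes a single 16-bit write of 0x040A instead of two 8-bit writes, which halves the number of TxE and RxNE events.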


DMA
DMA handles automatically:
all the TXE and RXNE events
CRC is sent after last data frame
Initialization of CRC calculation
change of data register access and Rx threshold control of the last data frame in case it is an odd frame in packed mode (user must set LDMA_TX & LDMA_RX bits in this case!)

Packed mode is used by DMA:
when the data frame fits into one byte
when peripheral size (PSIZE) is set to 16-bit for the SPI DMA channel
Notes:
OVR flag is set in transmit only mode (SPI continues to receive)
BSY & FTLVL must be checked before SPI is disabled

BSY flag
Communication activity checking (to prevent corruption of ongoing transfer)
Before SPI or its clock are disabled (in some modes)
Before entry to Halt

Cleared under any one of the following conditions:
When the SPI is properly disabled
When a master mode fault occurs (MODF=1)
In master mode between transactions when no next pattern is ready to transfer (FTLVL=00)
In slave mode between each data pattern transfer
Note:
BSY & FTLVL must be checked before SPI is disabled

CRC calculation (1/3)

Hardware CRC feature for reliable communication:
Separated CRC calculators implemented for transmitted and received data
CRC value is transmitted as the last transaction(s)
CRC error checking for last received byte and interrupt generation on error
Programmable CRC polynomial calculation (odd polynomials only)
Available for 8-bit or 16-bit data patterns only
Two possible CRC calculations: CRC8, CRC16-CCITT standard

CRC calculation (2/3)

Example of n data transfer between two SPIs followed by the CRC transmission of each one in Full-duplex mode
SPI_TXCRCR and SPI_RXCRCR separated registers for CRC calculation
Transmitter puts calculated CRC value into TxFIFO
Receiver compares last frame(s) at RxFIFO with calculated CRC

[Figure: timing diagram with NSS, SCK, and MOSI and MISO each carrying Data 1, Data 2, ... Data n followed by CRC[1..n]; CRCNXT=1 marks the CRC phase and a CRCERR interrupt is raised on mismatch.]

CRC calculation (3/3)

Basic SD/MMC support (SPI protocol):
Performance: speed up to 18MHz
Error checking: hardware CRC calculation

[Figure: SD/MMC card (pins 1 to 8) connected to the SPI master through MOSI, CS, SCK and MISO, with resistors R = 4.7 kΩ to VDD.]

NSS enhanced modes (1/2)


Pulse mode (NSSP=1)
At master and Motorola mode with CPHA=0 only
NSS output is managed by HW

NSS enhanced modes (2/2)

TI mode (FRF=1)
Clock and NSS are managed by HW
At slave, baud rate setting defines trelease (MISO's HiZ)
Format frame error interrupt (FRE)

Be careful!
When data packed mode is used
Keep the Rx threshold and the read access of the Rx FIFO always in line, either 8-bit or 16-bit (16-bit is preferable: limited number of events)
Change the Rx threshold just before the last odd data frame is received

When going to Halt or when disabling the SPI
check the FIFO occupancy and bus activity (FxLVL[1:0] = 00 & BSY = 0)

When communication is continuous (e.g. master Rx-only)
Perform the Rx threshold change, CRC control or Stop within the Control window

[Figure: timing diagram (CPHA=0) showing the Control window before the Dummy/Odd/CRC frame, where CRCNXT=1, RXONLY=0 and FRXTH=1 are applied.]

Quiz
What is the maximum SPI speed?
When is data packed mode used?
What should be kept aligned when accessing the RxFIFO?
Does a master take any account of the NSS signal level if it is
managed by SW?
How many bytes (maximum) can be stored in the TxFIFO before it
is full when 8-bit access is used?
What must be checked before the SPI is disabled?

Universal Synchronous Asynchronous Receiver Transmitter (USART)

USART Features (1/3)


Fully-programmable serial interface characteristics:
Data can be 8 or 9 bits
Even, odd or no-parity bit generation and detection
1, 1.5 or 2 stop bit generation
Programmable baud rate generator
Configurable oversampling method by 16 or by 8
Up to 9Mbps when the clock frequency is 72 MHz and oversampling by 8 is selected.

Programmable data order with MSB or LSB first.


Swappable Tx/Rx pin configuration
Tx/Rx pins active level inversion & Binary data inversion
Support hardware flow control (CTS and RTS)
Dual clock domain allowing
UART functionality and wakeup from Stop mode
Convenient baud rate programming independent from the PCLK reprogramming

Dedicated transmission and reception flags (TxE and RxNE) with interrupt capability
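The relation between the USART clock, the oversampling mode and the baud rate register can be sketched as follows. This is a minimal illustration, not the library implementation: the function name `usart_brr` is hypothetical, and the BRR encoding for oversampling by 8 (low nibble shifted right by one, bit 3 kept at 0) is an assumption taken from memory of the reference manual and should be verified there.

```c
#include <stdint.h>

/* Compute a USART_BRR value for a given kernel clock and baud rate.
 *
 * Oversampling by 16: USARTDIV = fCK / baud, written to BRR as is.
 * Oversampling by 8 : USARTDIV = 2 * fCK / baud, and (assumption, check the
 *                     reference manual) BRR[15:4] = USARTDIV[15:4],
 *                     BRR[2:0] = USARTDIV[3:0] >> 1, BRR[3] = 0.
 * The division is rounded to the nearest integer. */
uint32_t usart_brr(uint32_t fck_hz, uint32_t baud, int over8)
{
    if (!over8) {
        return (fck_hz + baud / 2u) / baud;
    }

    uint32_t usartdiv = (2u * fck_hz + baud / 2u) / baud;
    return (usartdiv & 0xFFF0u) | ((usartdiv & 0x000Fu) >> 1);
}
```

For example, `usart_brr(72000000u, 9000000u, 1)` yields a divider of 16, which corresponds to the 9 Mbps limit quoted above for a 72 MHz clock with oversampling by 8.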


USART Features (2/3)


Support for DMA
Receive DMA request
Transmit DMA request

LIN Master compatible


Synchronous Mode: Master mode only
IrDA SIR Encoder Decoder
Smartcard Capability T = 0, T = 1 (using the Address/character match,
End of block, receiver timeout etc)
Basic support for Modbus communication (using Address/character
match and Receiver timeout features).
Single wire Half Duplex Communication


USART Features (3/3)


Multi-Processor communication
USART can enter Mute mode
Mute mode: disable receive interrupts
Wake up from mute mode (by idle line detection or address mark detection)

Auto-baudrate detection using various character patterns.


Driver enable (for RS485) signal sharing the same pin as nRTS.
14 interrupt sources


STM32F30x USART Implementation


USART features                               USART1/2/3   UART4   UART5
Hardware Flow Control                        YES          NO      NO
Continuous communication using DMA           YES          YES     NO
Multiprocessor communication                 YES          YES     YES
Synchronous mode                             YES          NO      NO
Smartcard mode                               YES          NO      NO
Single wire half duplex mode                 YES          YES     YES
IrDA                                         YES          YES     YES
LIN                                          YES          YES     YES
Dual clock domain and wake up from STOP mode YES          YES     YES
Receiver timeout                             YES          YES     YES
Modbus Communication                         YES          YES     YES
Autobaudrate detection                       YES          YES     YES
Driver enable                                YES          NO      NO

STM32F37x USART Implementation


USART features                               USART1   USART2   USART3
Hardware Flow Control                        YES      YES      YES
Continuous communication using DMA           YES      YES      YES
Multiprocessor communication                 YES      YES      YES
Synchronous mode                             YES      YES      YES
Smartcard mode                               YES      YES      YES
Single wire half duplex mode                 YES      YES      YES
IrDA                                         YES      YES      YES
LIN                                          YES      YES      YES
Dual clock domain and wake up from STOP mode YES      YES      YES
Receiver timeout                             YES      YES      YES
Modbus Communication                         YES      YES      YES
Autobaudrate detection                       YES      YES      YES
Driver enable                                YES      YES      YES

Wakeup from STOP

When the USART_CLK clock is HSI or LSE, the USART is able to wake up the
MCU from STOP mode.
Wakeup from STOP is enabled by setting the UESM bit in USART_CR1.
The source of wakeup from STOP mode can be the standard RXNE interrupt
(the RXNEIE bit must be set before entering STOP mode), or a specific
interrupt selected through the WUS bit field in USART_CR3:
Wake up on address match
Wake up on Start bit detection
Wake up on RXNE
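At register level, the configuration described above amounts to selecting the wakeup source in USART_CR3 and setting UESM in USART_CR1. The sketch below computes the new register values as pure functions so it can be tried off-target. All names are hypothetical, and the bit positions (UE, UESM, WUS[1:0], WUFIE) and WUS encodings are assumptions recalled from the reference manual that must be verified before use.

```c
#include <stdint.h>

/* Assumed bit positions; verify against the reference manual. */
#define USART_CR1_UE_BIT     (1u << 0)   /* USART enable                    */
#define USART_CR1_UESM_BIT   (1u << 1)   /* USART enable in STOP mode       */
#define USART_CR3_WUS_POS    20u         /* wakeup source selection field   */
#define USART_CR3_WUS_MASK   (3u << USART_CR3_WUS_POS)
#define USART_CR3_WUFIE_BIT  (1u << 22)  /* wakeup from STOP interrupt enable */

/* Assumed WUS encodings for the three sources listed on the slide. */
typedef enum {
    USART_WUS_ADDRESS_MATCH = 0,
    USART_WUS_START_BIT     = 2,
    USART_WUS_RXNE          = 3
} usart_wus_t;

/* Return the CR3 value with the wakeup source selected and the wakeup
 * interrupt enabled; all other bits of 'cr3' are preserved. */
uint32_t usart_cr3_with_wakeup(uint32_t cr3, usart_wus_t source)
{
    cr3 &= ~USART_CR3_WUS_MASK;
    cr3 |= (uint32_t)source << USART_CR3_WUS_POS;
    return cr3 | USART_CR3_WUFIE_BIT;
}

/* Return the CR1 value with the USART enabled and allowed to run in STOP. */
uint32_t usart_cr1_with_stop_mode(uint32_t cr1)
{
    return cr1 | USART_CR1_UESM_BIT | USART_CR1_UE_BIT;
}
```

On a target, the returned values would be written back to the corresponding registers; the wakeup source is normally programmed while the USART is still disabled, so CR3 would be written before CR1.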

299

How to calculate the maximum baud rate that allows a proper wakeup from STOP

When M = 0, Max baudrate = (DWU(max) * 10) / T(WU)
When M = 1, Max baudrate = (DWU(max) * 11) / T(WU)
where T(WU) is the wakeup time from STOP mode
and DWU(max) is the receiver tolerance
Example: M = 1, OVER8 = 0, ONEBIT = 0:
Max baudrate = (3.03% * 11) / 4 µs = 83.32 kBaud
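The formula can be checked numerically with a few lines of C. This is only a worked-arithmetic sketch: the function name is hypothetical, and the wakeup time is taken as 4 µs, which is the value that reproduces the 83.32 kBaud result quoted in the example.

```c
/* Maximum baud rate that still allows a correct wakeup from STOP.
 *
 * tolerance  : receiver tolerance DWU(max) as a fraction (3.03% -> 0.0303)
 * frame_bits : 10 when M = 0, 11 when M = 1
 * t_wu_s     : wakeup time from STOP mode, in seconds */
double usart_max_wakeup_baud(double tolerance, int frame_bits, double t_wu_s)
{
    return (tolerance * (double)frame_bits) / t_wu_s;
}
```

Calling `usart_max_wakeup_baud(0.0303, 11, 4e-6)` gives roughly 83.3 kBaud, in line with the example; the same tolerance with M = 0 (10 bits) gives a lower limit of about 75.8 kBaud.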

300

DMA Capability

Each USART has DMA Tx and Rx requests

Each USART request (except for UART5 in STM32F30x) is mapped to a
different DMA channel, so DMA can be used for all USARTs in both transfer directions at
the same time.

Synchronous Mode
USART supports full-duplex synchronous communication mode

Full-duplex, three-wire synchronous transfer
USART Master mode only
Programmable clock polarity (CPOL) and phase (CPHA)
Programmable Last Bit Clock Pulse (LBCL) generation
Transmitter Clock output (SCLK)

[Figure: USART (master) connected in full duplex to an SPI slave: SCLK to SCK, Rx to MISO, Tx to MOSI; the SPI slave also has an NSS pin.]

Smart Card mode (1/2)


USART supports Smart Card Emulation ISO 7816-3
Half-Duplex, Clock Output (SCLK)
9-bit data, 1.5 stop bits in transmit and receive
T=0 and T=1 support
Programmable clock prescaler to guarantee a wide range of clock input

[Figure: USART with its Tx and SCLK pins.]

Smart Card mode (2/2)


STM32F3 vs STM32F1xx

Feature: Maximum USART baudrate in Smartcard mode
  STM32F3xx: 3 Mbit/s
  STM32F1xx: 4.5 Mbit/s

T=0

Feature: In case of transmission error, according to the protocol specification, the USART should resend the character.
  STM32F3xx: The USART can handle automatic resending of data. The number of retries is programmable (8 max).

Feature: In case of reception error, according to the protocol specification, the smartcard must resend the same character.
  STM32F3xx: The maximum number of retries is programmable (8 max). If the received character is still erroneous after the programmed number of retries, the USART will stop transmitting the NACK and will signal the error as a parity error.

  STM32F1xx (transmission and reception errors): The data retry should be done by software.

Feature: A programmable guard time is automatically inserted between two consecutive characters in transmission.
  STM32F3xx: Yes
  STM32F1xx: No

T=1

Features: Character Wait Time (CWT), Block Wait Time (BWT), Block length and end of block detection, Direct/Inverse convention
  STM32F3xx: New. CWT and BWT are implemented using the new timeout feature; the direct/inverse convention is implemented using some new features (MSB/LSBFIRST, binary data inversion, etc.)
  STM32F1xx: All T=1 features should be implemented by software.

Single Wire Half Duplex mode


USART supports single-wire half-duplex communication mode
Only the Tx pin is used (Rx is no longer used)

Used to follow a single-wire half-duplex protocol.

[Figure: USART1 Tx and USART2 Tx connected together on one half-duplex line, with R = 10 K to VDD.]

IrDA SIR Encoder Decoder


USART supports the IrDA Specifications
Half-duplex, NRZ modulation,
Max bit rate 115200 bps
The pulse width is 3/16 bit duration in normal mode

306

Modbus communication

Basic support for the implementation of Modbus/RTU and Modbus/ASCII protocols.
Modbus/RTU: In this mode, the end of one block is recognized by a silence (idle line)
for more than 2 character times. This function is implemented through the programmable
timeout function.
Modbus/ASCII: In this mode, the end of a block is recognized by a specific (CR/LF)
character sequence. The USART manages this mechanism using the character match
function.

Auto-baudrate detection


4 patterns for auto-baudrate detection:

Any character starting with a bit at 1: the USART measures the duration of the START bit
(falling edge to rising edge).

Any character starting with a 10xx bit pattern: the USART measures the duration of the START
bit and of the first data bit (falling edge to falling edge).

A 0x7F character frame (it may be a 0x7F character in LSB first mode or a 0xFE in MSB first mode).
In this case, the USART measures the duration of the start bit and the duration of bit 6.

A 0x55 character frame. In this case, the USART measures the duration of the start bit, the duration
of bit 0 and the duration of bit 6. In parallel, another check is performed for each intermediate
transition of the RX line.

Once automatic baud rate detection is activated, the USART waits for the first character on the RX
line. Completion of the auto-baudrate detection is indicated by the setting of the ABRF flag.

The clock source frequency must be compatible with the expected communication speed: when
oversampling by 16, the baud rate is between fCK/65535 and fCK/16; when oversampling by 8, the
baud rate is between fCK/65535 and fCK/8.

If the line is noisy, correct baud rate detection is not guaranteed (the BRR content may be corrupted).

USART Interrupts
Interrupt event                                                           Interrupt flag
Transmit Data Register Empty                                              TXE
Transmission Complete                                                     TC
CTS                                                                       CTSIF
Receive Data Register Not Empty                                           RXNE
Overrun Error                                                             ORE
Idle line detection                                                       IDLE
Parity Error                                                              PE
LIN break                                                                 LBDF
Noise Flag, Overrun error and Framing Error in multibuffer communication  NE, ORE, FE
Character Match                                                           CMF
Receiver timeout error                                                    RTOF
End of Block                                                              EOBF
Wakeup from STOP mode                                                     WUF

STM32F1/F2/L USART vs STM32F3 USART: Main features (1/2)

Feature                                              STM32F3            STM32F1/2/L
Programmable data length (8 or 9 bits)               Yes                Yes
Configurable stop bits                               1, 1.5, 2          0.5, 1, 1.5, 2
Synchronous mode (Master only)                       Yes                Yes
Single wire Half duplex                              Yes                Yes
Programmable parity                                  Yes                Yes
Hardware flow control (nCTS/nRTS)                    Yes                Yes
Driver Enable (for RS485)                            Yes                No
Swappable Tx/Rx pin                                  Yes                No
IrDA                                                 Yes                Yes
Basic support for Modbus                             Yes                No
LIN                                                  Yes                Yes
Smartcard                                            Yes (T=0, T=1)     Yes (T=0)
Dual Clock domain and wake up from STOP mode         Yes                No
Programmable data order with MSB first or LSB first  Yes                No

STM32F1/F2/L USART vs STM32F3 USART: Main features (2/2)

Feature                                STM32F3   STM32F1/2/L
Receiver timeout                       Yes       No
Auto-baudrate detection                Yes       No
Continuous communication using DMA     Yes       Yes
Address/character match interrupt      Yes       No
End of Block interrupt                 Yes       No
Multiprocessor communication           Yes       Yes

Quiz

How many USART interfaces are in the STM32F30x/F37x microcontrollers?
____________
What is the maximum USART baud rate?
____________
What are the features that are not supported by UART4/5 in the STM32F30x?
____________
What are the different USART DMA requests?
____________

Hands-on: USART and MCU Wake up from Stop Mode on START bit detection
02/04/2012

Aim of the Hands-on

This lab illustrates the use of the USART to wake up the MCU
from STOP mode. The wake up method is the START bit
detection.

F3 Alpha Training


USART configuration
Select the USART clock: LSE or HSI.
Configure the USART's Init Structure with the appropriate values:
BaudRate = 9600 baud if HSI is the clock source, or 1200 baud if LSE is the clock
source
Word Length = 8 Bits
Stop Bit = 1 Stop Bit
Parity = No Parity
Hardware flow control disabled (RTS and CTS signals)
Receive and transmit enabled.

The USART is configured to wake up the system from STOP mode on Start Bit detection

Complete missing code


Complete the missing code in the file main.c:
Configure the source of wakeup from STOP: Start Bit detection
Use the USART_StopModeWakeUpSourceConfig function

Put the adequate condition to ensure that USART RX is ready by checking that
the REACK flag is set:
Use the USART_GetFlagStatus function

Complete the function call to enable the Wake up from Stop Mode Interrupt:
USART_ITConfig(USART1, .... , ENABLE);

Check that the USART is not performing any transfer before putting it in Stop Mode, by
checking the BUSY flag:
Use the USART_GetFlagStatus function

Complete the missing code in the file stm32f30x_it.c

Inter-Integrated Circuit (I2C)

I2C Features (1/2)

I2C specification rev03 compatibility
SMBus 2.0 HW support
PMBus 1.1 Compatibility
Multi Master and slave capability
Controls all I2C bus specific sequencing, protocol, arbitration and
timing
Standard, Fast and Fast mode Plus (FM+) I2C modes (up to 1 MHz)
20 mA output drive capability for FM+ mode

I2C Features (2/2)

7-bit and 10-bit addressing modes
Multiple 7-bit Addressing Capability with configurable mask
Programmable setup and hold time
Easy to use event management
Programmable analog and digital noise filter
Wakeup from STOP mode on address match
Optional clock stretching
Independent clock
1-byte buffer with DMA capability

I2C Block Diagram


[Figure: I2C1 block diagram. I2CCLK is selected between SYSCLK and HSI through RCC_CFGR3 / I2C1SW and feeds the Clock Control and Data Control blocks. SCL and SDA each pass through GPIO logic, an Analog Noise Filter and a Digital Noise Filter; the GPIO logic on each line is controlled by SYSCFG_CFGR1 / I2Cx_FM+ and SYSCFG_CFGR1 / I2C_PBx_FM+. The registers are accessed from the APB bus with PCLK. SMBA pin.]

I2C SDA and SCL noise filter

Analog noise filter in SDA and SCL I/O
Can filter spikes with a length up to 50 ns
This filter can be enabled or disabled by SW (enabled by default)

Digital noise filter for SDA and SCL
Suppresses spikes with a programmable length from 0 to 15 I2CCLK periods.

Only the analog filter can be enabled when the Wakeup from STOP feature is enabled.

The filter configuration must be programmed when the I2C is disabled.

I2C Programmable timings

Setup and Hold timings between SDA and SCL in transmission are programmable by
SW with the PRESC, SDADEL and SCLDEL fields in the I2C Timing Register
(I2Cx_TIMINGR).

SDADEL is used to generate the Data Hold time: TSDADEL = SDADEL * (PRESC+1) * TI2CCLK
SCLDEL is used to generate the Data Setup time: TSCLDEL = (SCLDEL+1) * (PRESC+1) * TI2CCLK

[Figure: example of Data Hold time on SCL and SDA. TSYNC1 is the delay to the internal detection of the SCL falling edge; SDADEL follows it; Th(SDA) is the data hold time.]

The Setup and Hold configuration must be programmed when the I2C is disabled.

I2C_Timing_Config_Tool will be available to calculate the I2C_TIMINGR value for your
application.
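As a worked illustration of the two delay formulas, the helpers below evaluate them in nanoseconds. The function names are hypothetical; the I2CCLK period is passed in directly (for instance 125 ns for an 8 MHz I2CCLK, which is an example value and not taken from the slide).

```c
#include <stdint.h>

/* Data hold delay: TSDADEL = SDADEL * (PRESC + 1) * TI2CCLK, in ns. */
uint32_t i2c_sdadel_ns(uint32_t presc, uint32_t sdadel, uint32_t t_i2cclk_ns)
{
    return sdadel * (presc + 1u) * t_i2cclk_ns;
}

/* Data setup delay: TSCLDEL = (SCLDEL + 1) * (PRESC + 1) * TI2CCLK, in ns. */
uint32_t i2c_scldel_ns(uint32_t presc, uint32_t scldel, uint32_t t_i2cclk_ns)
{
    return (scldel + 1u) * (presc + 1u) * t_i2cclk_ns;
}
```

With PRESC = 1, SDADEL = 2, SCLDEL = 4 and a 125 ns I2CCLK period, these give a 500 ns data hold delay and a 1250 ns data setup delay.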

I2C Master clock generation

SCL Low and High durations are programmable by SW with the PRESC, SCLL and SCLH fields in the I2C
Timing Register (I2Cx_TIMINGR).

The SCL Low counter is (SCLL+1) * (PRESC+1) * TI2CCLK. It starts counting after the internal detection of the SCL falling edge. After
counting, SCL is released.
The SCL High counter is (SCLH+1) * (PRESC+1) * TI2CCLK. It starts counting after the internal detection of the SCL rising edge.
After counting, SCL is driven low.

The total SCL period is:

TSYNC1 + TSYNC2 + [(SCLL+1) + (SCLH+1)] * (PRESC+1) * TI2CCLK

[Figure: SCL period on SCL and SDA, made of TSYNC1 (internal detection of the SCL falling edge), SCLL, TSYNC2 and SCLH.]

The SCLL and SCLH configuration must be programmed when the I2C is disabled.

I2C_Timing_Config_Tool is available in FWLib to calculate the I2C_TIMINGR value for your
application.
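The SCL period formula and the packing of the timing fields into one register value can be sketched as below. The function names are hypothetical, and the field positions inside I2Cx_TIMINGR (PRESC in bits 31:28, SCLDEL in 23:20, SDADEL in 19:16, SCLH in 15:8, SCLL in 7:0) are an assumption recalled from the reference manual; the timing tool mentioned above remains the intended way to obtain production values.

```c
#include <stdint.h>

/* Total SCL period in ns:
 * TSYNC1 + TSYNC2 + [(SCLL+1) + (SCLH+1)] * (PRESC+1) * TI2CCLK.
 * The two synchronization delays depend on the bus and filters, so they are
 * passed in by the caller (use 0 to get the programmed part only). */
uint32_t i2c_scl_period_ns(uint32_t presc, uint32_t scll, uint32_t sclh,
                           uint32_t t_i2cclk_ns,
                           uint32_t t_sync1_ns, uint32_t t_sync2_ns)
{
    uint32_t counted = ((scll + 1u) + (sclh + 1u)) * (presc + 1u) * t_i2cclk_ns;
    return t_sync1_ns + t_sync2_ns + counted;
}

/* Pack the timing fields into a TIMINGR value (assumed field layout). */
uint32_t i2c_timingr(uint32_t presc, uint32_t scldel, uint32_t sdadel,
                     uint32_t sclh, uint32_t scll)
{
    return ((presc  & 0x0Fu) << 28) |
           ((scldel & 0x0Fu) << 20) |
           ((sdadel & 0x0Fu) << 16) |
           ((sclh   & 0xFFu) << 8)  |
            (scll   & 0xFFu);
}
```

For a 125 ns I2CCLK period with PRESC = 1, SCLL = 0x13 and SCLH = 0x0F, the programmed part of the period is 9 µs (5 µs low, 4 µs high); the synchronization delays bring the total close to the 10 µs of a 100 kHz bus.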

Slave Addressing Mode

The I2C can acknowledge several slave addresses. There are 2 address registers:
I2Cx_OAR1: 7-bit or 10-bit mode.
I2Cx_OAR2: 7-bit mode only. OA2MSK[2:0] allows masking from 0 to 7
LSBs of OAR2:

OA2MSK[2:0]   Address match condition
000           address[7:1] = OA2[7:1]
001           address[7:2] = OA2[7:2] (bit 1 is don't care)
010           address[7:3] = OA2[7:3] (bits 2:1 are don't care)
...
111           All addresses are acknowledged except I2C reserved addresses.
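The masking rule in the table can be modelled in software, which is useful for checking which bus addresses a given OA2/OA2MSK pair will answer to. This is a sketch with hypothetical function names. Addresses are handled as 7-bit values (the table's address[7:1]), the reserved ranges 0000xxx and 1111xxx come from the I2C specification, and reserved addresses are only excluded for OA2MSK = 111 because that is the only case the slide states.

```c
#include <stdbool.h>
#include <stdint.h>

/* I2C reserved 7-bit addresses: 0000xxx and 1111xxx. */
static bool i2c_is_reserved_7bit(uint8_t addr7)
{
    uint8_t top4 = (uint8_t)((addr7 & 0x7Fu) >> 3);
    return top4 == 0x0u || top4 == 0xFu;
}

/* Return true if a received 7-bit address matches OA2 under OA2MSK.
 *
 * oa2msk = 0      : all 7 address bits are compared
 * oa2msk = 1 .. 6 : the 'oa2msk' least significant bits are don't care
 * oa2msk = 7      : every address matches, except reserved addresses */
bool i2c_oa2_matches(uint8_t addr7, uint8_t oa2_7, uint8_t oa2msk)
{
    addr7 &= 0x7Fu;
    oa2_7 &= 0x7Fu;

    if (oa2msk >= 7u)
        return !i2c_is_reserved_7bit(addr7);

    return ((uint8_t)(addr7 ^ oa2_7) >> oa2msk) == 0u;
}
```

For example, with OA2 = 0x50 and OA2MSK = 001 the slave answers to 0x50 and 0x51 but not to 0x52.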

324

Wakeup from STOP on address match

When the I2CCLK clock is HSI, the I2C is able to wake up the MCU from STOP when it
receives its slave address. All addressing modes are supported.
During STOP mode with no address reception: HSI is switched off.
On START detection, the I2C enables HSI, which is used for address reception.

Wakeup from STOP is enabled by setting WUPEN in I2C1_CR1.

Clock stretching must be enabled to ensure proper operation: NOSTRETCH=0.

Easy Master mode management

For payload <= 255 bytes: only 1 write action is needed (apart from the data reads/writes)

I2Cx_CR2 is written with:
START = 1
SADD: slave address
RD_WRN: transfer direction
NBYTES = N: number of bytes to be transferred
AUTOEND = 1: STOP automatically sent after N data.

AUTOEND
0: Software end mode    End of transfer under SW control after NBYTES data transfer:
                        TC flag is set. Interrupt if TCIE=1.
                        TC is cleared when START or STOP is set by SW.
                        If START=1: a RESTART condition is sent.
1: Automatic end mode   STOP condition sent after NBYTES data transfer.

Data transfer is managed by interrupts (TXIS / RXNE) or DMA
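The single write to I2Cx_CR2 can be illustrated by building the register value from its fields. The helper below is a sketch with a hypothetical name; the bit positions (RD_WRN bit 10, START bit 13, NBYTES in bits 23:16, AUTOEND bit 25, 7-bit address in SADD[7:1]) are assumptions recalled from the reference manual and should be checked there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed I2Cx_CR2 bit positions; verify against the reference manual. */
#define I2C_CR2_RD_WRN_BIT   (1u << 10)
#define I2C_CR2_START_BIT    (1u << 13)
#define I2C_CR2_NBYTES_POS   16u
#define I2C_CR2_AUTOEND_BIT  (1u << 25)

/* Build the CR2 value that starts a master transfer of up to 255 bytes.
 *
 * addr8   : slave address in 8-bit form (7-bit address << 1), e.g. 0xA0
 * nbytes  : number of bytes to transfer (NBYTES)
 * read    : true for a read (RD_WRN = 1), false for a write
 * autoend : true to send STOP automatically after NBYTES data */
uint32_t i2c_cr2_master_start(uint8_t addr8, uint8_t nbytes,
                              bool read, bool autoend)
{
    uint32_t cr2 = (uint32_t)(addr8 & 0xFEu);          /* SADD[7:1] */

    cr2 |= (uint32_t)nbytes << I2C_CR2_NBYTES_POS;     /* NBYTES    */
    if (read)
        cr2 |= I2C_CR2_RD_WRN_BIT;
    if (autoend)
        cr2 |= I2C_CR2_AUTOEND_BIT;

    return cr2 | I2C_CR2_START_BIT;                    /* START = 1 */
}
```

On a target, the returned value would be written to I2Cx_CR2 in one access, after which the data bytes are moved through the TXIS / RXNE events or DMA as described above.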

326

Easy to use event management

For payload > 255 bytes: in addition, RELOAD must be set in I2Cx_CR2.

RELOAD
0: No reload      NBYTES data transfer is followed by STOP or ReSTART.
1: Reload mode    NBYTES is reloaded after NBYTES data transfer (data
                  transfer will continue):
                  TCR flag is set. Interrupt if TCIE=1.
                  TCR is cleared when I2Cx_CR2 is written with NBYTES ≠ 0.

AUTOEND has no effect when RELOAD is set
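Splitting a long payload into NBYTES-sized pieces can be sketched as a small helper that is called once at the start and again each time TCR is set. The type and function names are hypothetical; the only rule taken from the slide is that NBYTES is limited to 255 and that RELOAD stays set while more data remains.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values to program for the next part of a transfer. */
typedef struct {
    uint8_t nbytes;   /* value for NBYTES                          */
    bool    reload;   /* true: set RELOAD, more data follows after */
} i2c_chunk_t;

/* Given the number of bytes still to transfer, return the NBYTES value
 * for the next chunk and whether RELOAD must be set. */
i2c_chunk_t i2c_next_chunk(uint32_t remaining)
{
    i2c_chunk_t chunk;

    if (remaining > 255u) {
        chunk.nbytes = 255u;
        chunk.reload = true;      /* TCR will be raised after this chunk */
    } else {
        chunk.nbytes = (uint8_t)remaining;
        chunk.reload = false;     /* last chunk: STOP or ReSTART follows */
    }
    return chunk;
}
```

A 600-byte transfer is therefore programmed as 255 bytes with RELOAD, 255 bytes with RELOAD, then 90 bytes without RELOAD.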

327

Slave mode

By default, the I2C slave uses clock stretching.
This can be disabled by setting NOSTRETCH=1
Reception: Acknowledge control can be done on selected bytes in Slave Byte
Control (SBC) mode with RELOAD=1
SBC = 1 enables the NBYTES counter in slave mode (Tx and Rx modes).
SBC = 1 is allowed only when NOSTRETCH=0.

SBC
0: Slave Byte Control disabled   All received bytes are acknowledged.
1: Slave Byte Control enabled    If RELOAD=1, after NBYTES data are transferred:
                                 TCR is set and SCL is stretched before the ACK pulse in reception.
                                 TCR is cleared when I2Cx_CR2 is written with NBYTES ≠ 0.
                                 If I2Cx_CR2/NACK = 1: the received byte is NOT acknowledged.

I2C events
Interrupt event                    Interrupt flag
Receive Buffer Not Empty           RXNE
Transmit buffer Interrupt Status   TXIS
Stop detection interrupt flag      STOPF
Transfer Complete Reload           TCR
Transfer Complete                  TC
Address matched                    ADDR
NACK reception                     NACKF

SMBUS

ARP (Address resolution protocol) support: Device default address,
Arbitration in slave mode
Host Notify protocol support: host address
Alert support: Alert pin and Alert Response support
Configurable Timeout detection: Clock low timeout, Cumulative clock low
extend time
Configurable bus idle detection
Command and data acknowledge control in SBC mode
PEC HW calculation

SMBUS : PEC (Packet Error Checking)

HW PEC calculation is enabled when PECEN=1 in I2C1_CR1.

NBYTES (data transfer counter) is used to:
automatically check the PEC in reception, after NBYTES-1 bytes are received
(automatic NACK sending in case of failure)
automatically send the PEC in transmission, after NBYTES-1 bytes are sent.
Therefore SBC must be set in Slave mode to enable the NBYTES counter.
Automatic PEC sending/checking is done when PECBYTE=1 in I2C1_CR2.
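For testing a slave or master against known-good values, the PEC can also be computed in software. The sketch below uses the CRC-8 defined by the SMBus specification (polynomial x^8 + x^2 + x + 1, initial value 0); the function name is hypothetical, and which bytes of a transaction are covered (address bytes included) should be taken from the SMBus specification rather than from this example.

```c
#include <stddef.h>
#include <stdint.h>

/* SMBus Packet Error Code: CRC-8 with polynomial 0x07, MSB first,
 * initial value 0, no final XOR. 'data' must hold the message bytes in
 * the order they appear on the bus. */
uint8_t smbus_pec(const uint8_t *data, size_t len)
{
    uint8_t crc = 0u;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80u)
                crc = (uint8_t)((crc << 1) ^ 0x07u);
            else
                crc = (uint8_t)(crc << 1);
        }
    }
    return crc;
}
```

A useful property for receivers: running the same routine over a message followed by its own PEC byte returns 0 when the packet is intact.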

331

Error conditions
Interrupt event                 Interrupt flag
Bus error detection             BERR
Arbitration Loss                ARLO
Over-run / Under-run error      OVR
SMBUS: PEC error                PECERR
SMBUS: Timeout error            TIMEOUT
SMBUS: Alert pin detection      ALERT

Quiz

How many I2C interfaces are in the STM32F3x microcontroller?
____________
What are the different I2C modes supported by I2C1 and
I2C2?
____________
What is the SW sequence to read 3 data bytes from slave address
0xAA in master mode?
____________
What are the error flags of the I2C?
____________

Hands-on: Write/Read operations in RF EEPROM.
02/04/2012

Aim of the Hands-on

This lab illustrates the use of the I2C in Master mode to write and read
data in the RF EEPROM.

F3 Alpha Training

04/04/2012

335

Lab Step1: I2C2 Configuration


I2C2 is configured in master mode; the standard speed is 100 kHz.
Complete the missing code in the main.c file to set I2C2 with the following
configuration: fill in the missing code in the I2C2_Configuration
function
Enable I2C2 with:
Digital Filter OFF
Analog Filter ON
Error, NACK, Receive and Transmit Interrupt Enable
Peripheral Enable
=> The register to be used is I2C2->CR1.

Lab Step2: Write Data into RF EEPROM


Complete the I2C2_Write_TX_Packet1 function in the main.c file to write 4
bytes in the RF EEPROM starting from address 0x0
The EEPROM I2C slave address is I2C_E2PROM_ADDR (0xA0)
The EEPROM page write sequence is the following:
[Figure: EEPROM page write sequence, not reproduced.]
Write I2C2->CR2 to perform this page write operation

Lab Step3: Read Data from RF EEPROM


Complete the missing code in the I2C2_Read_TX_Packet1 function to
read 4 bytes from the EEPROM, starting from address 0
The EEPROM sequential random read sequence is the following:
[Figure: EEPROM sequential random read sequence, not reproduced.]
Write the sequence to perform this sequential random read
operation: write the 2-byte address, followed by a repeated Start, followed
by a read of N bytes, ended by STOP.

I2S Peripheral

SPI / I2S mode switch


The I2S protocol is used for audio data communication between a
microcontroller/DSP and an audio Codec/DAC.

I2S interface is implemented as a mode in the SPI peripheral.

To switch from SPI to I2S mode:


Disable SPI peripheral (reset SPE bit in SPI_CR1 register)
Select I2S mode (set I2SMOD bit in SPI_I2SCFGR register)

340

I2S Features (1/2)


Two I2Ss: available on the SPI2 and SPI3 peripherals.
Two I2S extensions added for full-duplex communication.
Simplex or full-duplex communication (transmitter and receiver)
I2S2 and I2S3 operate in master or slave configuration.
8-bit programmable linear prescaler to support all standard audio
sample frequencies from 8 kHz up to 192 kHz.
Audio-frequency precision same as high-density and XL-density
devices.
Programmable data format (16-, 24- or 32-bit data formats).

I2S Features (2/2)

Underrun flag in slave transmit mode, Overrun flag in receive mode
and new de-synchronization flag in slave transmit/receive mode.
Support for DMA: new DMA requests for I2S2_ext/I2S3_ext allow
full-duplex transfers.
I2S protocols supported:
I2S Philips standard.
MSB Justified standard (Left Justified).
LSB Justified standard (Right Justified).
PCM standard (with short and long frame synchronization on 16-bit channel frame
or 16-bit data frame extended to 32-bit channel frame)

The master clock may be output to drive an external audio component. The ratio is fixed at
256xFs (where Fs is the audio sampling frequency).

The choice of the standard strongly depends on the external
device and the audio data to be transmitted

I2S full duplex block diagram(1/2)


To support I2S full duplex mode, two extra I2S instances called
extended I2Ss (I2S2_ext, I2S3_ext) are available in addition to I2S2
and I2S3. The first I2S full duplex interface is consequently based on
I2S2 and I2S2_ext, and the second one on I2S3 and I2S3_ext.

[Figure: STM32F30xxx I2S full duplex block diagram, where x can be 2 or 3. I2SxCLK is selected by I2SSRC between the external I2S_CKIN input and SYSCLK; SYSCLK is selected by SW among PLLCLK, HSI and HSE. SPI/I2Sx uses I2Sx_SCK, I2Sx_WS and I2Sx_SD(in/out); I2Sx_ext uses I2Sx_extSD(out/in).]

I2Sx_ext can be used only in full duplex mode (always in slave mode).
Both I2Sx and I2Sx_ext can be configured as transmitters or receivers

Half/Full-Duplex Communication

I2S configured in Half/Full-Duplex communication mode:

[Figure: STM32F30xxx connected to an audio codec (digital interface and analog interface). Signals: I2Sx_SCK to CK, I2Sx_WS to WS, I2Sx_MCLK to MCLK, I2Sx_SD(in/out) and I2Sx_extSD(out/in) to the codec SDin/SDout, plus I2C controls*. The diagram marks the half-duplex paths and the full-duplex synchronous audio transmission.]
* Depends on the Codec control method

The master and slave configuration is managed only by software. The master device is
the CK and WS generator.

The master/slave modes and transmit/receive directions can be switched dynamically by
software.

2 x Full-Duplex Communication

[Figure: Bluetooth headset. The STM32F30xxx exchanges I2S in / I2S out with a Bluetooth module (UART controls *) and I2S in / I2S out with an audio codec (I2C controls **); battery powered.]
* Depends on the Codec control method
** Depends on the Bluetooth control method

STM32F37xxx vs STM32F30xxx

Features             STM32F37xxx            STM32F30xxx
Instance             3 (I2S1, I2S2, I2S3)   2 (I2S2, I2S3)
Communication mode   Simplex                Simplex/full-duplex
External clock       No                     Yes

Quiz
How many I2Ss are available in the STM32F30xxx and STM32F37xxx microcontroller?
____________
How to use I2Ss available in the STM32F30xxx in Full duplex mode?
____________
What are the standard audio frequencies supported by I2Ss?
____________
What are the different I2S error flags?
____________

347

STM32F3 Training Agenda (3/4)


Day 3:
Continue with STM32F3 common parts

Controller area network (CAN)
Real Time Clock (RTC)
General Purpose Timers
Basic Timers 6 and 7
Universal serial bus full-speed device interface (USB)
Touch sensing controller (TSC)
STM32F3xx Minimum External Components

STM32F30x specific parts

Analog-to-Digital Converter ADC 5MSPS + Hands-on
Advanced Timers TIM1 and TIM8 new functionalities

CAN Peripheral

CAN Features (1/2)


350

Supports CAN protocol version 2.0 A, B Active


Bit rates up to 1Mbit/s
Transmission
Three transmit mailboxes
Configurable transmit priority
Time Stamp on SOF transmission

Reception
Two receive FIFOs with three stages
14 scalable filter banks
Configurable FIFO overrun
Time Stamp on SOF reception

CAN Features (2/2)


351

Time Triggered Communication option


Disable automatic retransmission mode
16-bit free running timer
Time Stamp sent in last two data bytes
Management
Maskable interrupts
512 bytes reserved RAM size (No longer shared with USB)
4 dedicated interrupt vectors: transmit interrupt, FIFO0
interrupt, FIFO1 interrupt and status change error interrupt

BxCAN operating modes


[Figure: state diagram with Reset, Sleep Mode, Initialization Mode and Normal Mode.]

Operation mode

Test mode
- Silent mode
- LoopBack mode
- Loop back combined with silent mode

Block Diagram BxCAN


[Figure: bxCAN block diagram, same as the STM32F10x product. A CAN 2.0B Active Core is connected to: Control/Status/Configuration Registers (Master Control, Master Status, Transmit Status, Receive FIFO0 Status, Receive FIFO1 Status, Interrupt Enable, Bit Timing, Error Status, Filter Master, Filter Mode, Filter Scale, Filter FIFO Assignment, Filter Activation, Filter Bank x[13:0]); Tx Mailboxes 0, 1 and 2 with a Transmission Scheduler; Receive FIFO 0 and Receive FIFO 1, each with Mailboxes 0, 1 and 2; Acceptance Filters with filter range 0 .. 13; and a Memory Access Controller.]

Quiz
How many transmit mailboxes are in the STM32F3xxx bxCAN?
____________

How many operating modes are in the STM32F3xxx bxCAN?


____________

What is the difference between STM32F3xxx bxCAN and


STM32F10x bxCAN?
____________

354

Real-Time Clock (RTC)

RTC Features (1/2)

Ultra-low power battery supply current.

Calendar with Sub seconds, seconds, minutes, hours, week day, date, month, year.

Daylight saving compensation programmable by software

Two programmable alarms with interrupt function. The alarms can be triggered by
any combination of the calendar fields.

A periodic flag triggering an automatic wakeup interrupt. This flag is issued by a 16-bit
auto-reload timer with programmable resolution. This timer is also called wakeup
timer.

A second clock source (50 or 60Hz) can be used to update the calendar.

Maskable interrupts/events:

Alarm A, Alarm B, Wakeup interrupt, Time-stamp, Tamper detection

Digital calibration circuit (periodic counter correction) to achieve 0.95 ppm accuracy

Time-stamp function for event saving with sub second precision (1 event)

Backup registers which are reset when a tamper detection event occurs.

64 bytes for STM32F30x

128 bytes for STM32F37x

356

RTC Features (2/2)

Alternate function outputs:

RTC_CALIB: 512 Hz or 1 Hz clock output (with an LSE frequency of
32.768 kHz). It is routed to the device RTC_OUT output.
RTC_ALARM: Alarm A, B flag output. It is routed to the device
RTC_OUT output.

Alternate function inputs:


RTC_TAMP1: tamper1 event detection.
RTC_TAMP2: tamper2 event detection.
RTC_TAMP3: tamper3 event detection.
RTC_TS: timestamp event detection.
RTC_REFIN: reference clock input.

The RTC clock source could be any of the following three:

LSE oscillator clock.


LSI oscillator clock.
HSE divided by 32 in clock controller.

357

RTC overview across families (1/2)


Columns: STM32F2x | STM32F0x | STM32F4x, STM32F30x, STM32F37x

RTC in VBAT: YES (all families)

Calendar in BCD: YES (all families)

Calendar Sub seconds access:
  STM32F2x: NO
  Others: YES, resolution down to RTCCLK

Calendar synchronization on the fly:
  STM32F2x: NO
  Others: YES

Alarm on calendar:
  STM32F2x: 2 wo/ subseconds
  STM32F0x: 1 w/ subseconds
  STM32F4x, STM32F30x, STM32F37x: 2 w/ subseconds

Calendar Calibration:
  STM32F2x: Calib window: 64 min; Calibration step: -2 ppm / +4 ppm; Range [-63 ppm, +126 ppm]
  Others: Calib window: 8 s / 16 s / 32 s; Calibration step: 3.81 ppm / 1.91 ppm / 0.95 ppm; Range [-480 ppm, +480 ppm]

RTC overview across families (2/2)


STM32F2x

STM32F0x

STM32F4x

Synchronization
on mains

STM32F30x

YES

NO

Timestamp

YES
Sec, Min,
Hour, Date

YES
Sec, Min, Hour, Date, Sub seconds

Tamper

YES
2 pins/1
event
Edge
detection
only

YES
2 pins/2 event
Level Detection
Configurable
filtering

YES
3 pins/ 3 events
Level Detection
with Configurable filtering

32-bit Backup
registers

20

20

16

PC13-14-15
output state kept
in Standby

NO

YES

NO

YES

(if not used by RTC/LSE)

STM32F37x

YES

Periodic wakeup

359
359

YES

32

RTC Block Diagram


[Figure: RTC block diagram. RTCCLK is selected by RTCSEL[1:0] among HSE / 32, LSE and LSI and passes through the Smooth Calibration block. The Asynchronous 7-bit Prescaler (PREDIV_A[6:0]) and the Synchronous 15-bit Prescaler (PREDIV_S[14:0]) produce the 1 Hz clock for the Calendar (Day/date/month/year, HH:mm:ss, 12/24 format), with ssr in binary format. Alarm A and Alarm B compare ssr, ss, mm, HH/date and set the Alarm A Flag and Alarm B Flag, output on RTC_ALARM. RTC_CALIB outputs 512 Hz or 1 Hz, selected by COSEL. The Wake-Up 16-bit autoreload Timer, clocked according to WUCKSEL[2:0], sets the Periodic wake up Flag. RTC_TAMP1, RTC_TAMP2 and RTC_TAMP3 feed the Backup Registers and RTC Tamper Control registers (Tamper Flag); RTC_TS feeds the TimeStamp Registers (TimeStamp Flag); RTC_REFIN is the reference clock input.]

RTC registers write protection


By default and after reset, the RTC registers are write protected to
avoid possible parasitic write accesses.
The DBP bit must be set in PWR_CR to enable RTC write access.
A key must be written to the RTC_WPR register.

To unlock write protection on all RTC registers:
1. Write 0xCA into the RTC_WPR register
2. Write 0x53 into the RTC_WPR register
* Except for the clearing of the Alarm and Wakeup timer interrupt flags
Writing a wrong key reactivates the write protection.

RTC Clock Sources

The RTC has two clock sources:
RTCCLK, used for the RTC timer/counter, can be either the HSE/32, LSE or
LSI clock.
PCLK1, used for RTC register read/write access.

Before using the RTC, program the clock
controller:
Configure and enable the RTCCLK source in the RCC_BDCR register

RTC in Low Power Modes and in Reset

The RTC remains active whatever the low power mode: Sleep, STOP, STANDBY

When enabled, 5 events can wake the device from low power modes:

Alarm A
Alarm B
Wakeup
Tamper 1 / 2 / 3
TimeStamp

The RTC remains active in VBAT mode (VDD off) when clocked by LSE
The RTC remains active under Reset, except at Power-on Reset

The RTC configuration registers, including prescaler programming, are not affected by
a system Reset other than Power-on Reset.
When clocked by LSE, the RTC clock is not stopped under Reset, except at Power-on
Reset.

RTC Alternate function configuration (1/2)

364
364

RTC pin (PC13) :


RTC_ALARM
enabled

RTC_CALIB
enabled

Tamper
enabled

Time
stamp
enabled

PC13MODE

PC13VALUE

Alarm out
output OD

Dont care

Dont care

Dont care

Dont care

Alarm out
output PP

Dont care

Dont care

Dont care

Dont care

Calibration out
output PP

Dont care

Dont care

Dont care

Dont care

TAMPER input
floating

Dont care

Dont care

TIMESTAMP
and TAMPER
input floating

Dont care

Dont care

TIMESTAMP
input floating

Dont care

Dont care

Output PP forced

PC13 output
data value

Standard GPIO

Dont care

Pin configuration
and function

OD: open drain; PP: push-pull.

PC13 is available in VBAT mode


364

RTC Alternate function configuration (2/2)

365
365

LSE pin PC14 configuration (1) :


Pin configuration
and function

LSEON

LSEBYP

PC14MODE

PC14VALUE

LSE oscillator

Dont care

Dont care

LSE BYPASS

Dont care

Dont care

Output PP forced

Dont care

PC14 output
data value

Standard GPIO

Dont care

Dont care

1. OD: open drain; PP: push-pull.

LSE pin PC15 configuration (1) :


Pin configuration
and function
LSE oscillator

LSE ON

LSEBYP

PC15MODE

PC15VALUE

Dont care

Dont care

1
1

PC15 output
data value

Dont care

Output PP forced

Standard GPIO

Dont care

Dont care

1. OD: open drain; PP: push-pull.

365

RTC Calendar (1/4)

The calendar is initialized and read through three shadow registers: SSR, TR and DR.
The RTC TR and DR registers are in BCD format.
The SSR register is the RTC sub-second register.
The calendar supports 12h or 24h format.

[Figure: the actual calendar registers and their shadow registers. The time is held in TR (HH : mm : ss) and SSR (ssr); the date is held in DR (Day : Month : Date : Year).]
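As an illustration of the BCD format used by TR, the following minimal sketch packs and unpacks a time value on a host machine. The helper names are hypothetical, and the field positions (PM at bit 22, hours at bits 21:16, minutes at bits 14:8, seconds at bits 6:0) are an assumption taken from the usual RTC_TR layout; check them against the reference manual before relying on them.

```python
def to_bcd(value):
    """Pack a decimal value in 0..99 into two BCD digits."""
    if not 0 <= value <= 99:
        raise ValueError("BCD field must be in 0..99")
    return ((value // 10) << 4) | (value % 10)


def from_bcd(bcd):
    """Unpack two BCD digits into a decimal value."""
    return ((bcd >> 4) & 0xF) * 10 + (bcd & 0xF)


def encode_rtc_tr(hours, minutes, seconds, pm=False):
    """Build a TR-style word (field positions are an assumption, see lead-in)."""
    return ((int(pm) << 22) | (to_bcd(hours) << 16)
            | (to_bcd(minutes) << 8) | to_bcd(seconds))


def decode_rtc_tr(tr):
    """Return (hours, minutes, seconds, pm) from a TR-style word."""
    return (from_bcd((tr >> 16) & 0x3F), from_bcd((tr >> 8) & 0x7F),
            from_bcd(tr & 0x7F), bool((tr >> 22) & 1))
```

In BCD each decimal digit occupies its own nibble, so 23:59:58 reads directly as 0x235958 in a debugger.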

RTC calendar (2/4)

RTC initialization:
Enter initialization mode by setting the INIT bit in the ISR register.
This mode is confirmed by the INITF flag, also in the ISR register.
Program the prescaler register (PRER) according to the clock source to provide a 1 Hz clock to
the calendar.
Load the initial time and date values in the 2 shadow registers (TR, DR),
and program the other configuration registers such as RTC_CR (hour format, etc.).
Exit the initialization phase by clearing the INIT bit.

The actual calendar registers are then loaded automatically and counting restarts after a few
RTCCLK clock periods.

After reset, the INITS flag in the ISR register indicates whether the calendar is
already initialized (year not at zero) or not (as after power-on).
To manage daylight saving time there are 3 bits in CR:
SUB1H or ADD1H to subtract or add one hour to the calendar
BKP to memorize the above action
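The prescaler step can be sketched as a small host-side calculation. This is only an illustration: the function name is hypothetical, and it assumes a 7-bit asynchronous prescaler followed by a 15-bit synchronous prescaler, each dividing by its programmed value plus one, with the largest possible asynchronous division preferred.

```python
def rtc_prescalers(rtcclk_hz, target_hz=1):
    """Return (async_value, sync_value) so that
    rtcclk_hz / ((async_value + 1) * (sync_value + 1)) == target_hz."""
    total, remainder = divmod(rtcclk_hz, target_hz)
    if remainder:
        raise ValueError("target frequency does not divide the RTC clock")
    # Try the largest asynchronous division first (7 bits: 1..128).
    for async_div in range(128, 0, -1):
        if total % async_div == 0 and total // async_div <= 1 << 15:
            return async_div - 1, total // async_div - 1
    raise ValueError("no prescaler pair reaches the target frequency")
```

For a 32.768 kHz LSE this gives divisions of 128 and 256, which are the power-on reset values quoted in the wakeup slides.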

RTC calendar (3/4)

The shadow registers are automatically updated each time the RTCCLK
clock is synchronized with the system clock.

The calendar can be read in 2 different modes:
BYPSHAD=0 : Read shadow registers
RSF flag in ISR register is used to ensure that the calendar value from shadow
register is the up-to-date one.
Update of DR is frozen after reading TR , and unfrozen when DR is read.
Update of TR and DR is frozen after reading SSR , and unfrozen when DR is read.
BYPSHAD=1 : Bypass shadow registers
Reading calendar makes direct access to the calendar counters
Software must read all calendar registers twice and compare the results to ensure
that the data are coherent and correct.
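The read-twice-and-compare rule for BYPSHAD=1 can be sketched as follows. The `read_calendar` callable is a hypothetical stand-in for whatever reads the calendar registers and returns them as one comparable value; the retry limit is an arbitrary choice for the sketch.

```python
def read_coherent(read_calendar, max_attempts=4):
    """Read the calendar until two consecutive reads agree."""
    previous = read_calendar()
    for _ in range(max_attempts):
        current = read_calendar()
        if current == previous:
            return current  # two identical reads: the value is coherent
        previous = current  # a counter rolled over between reads, try again
    raise RuntimeError("calendar value did not stabilize")
```

A typical use would pass a function returning a tuple such as (SSR, TR, DR), so that a rollover in any of the three forces another read.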


RTC calendar (4/4)

The calendar can be synchronized on the fly, by up to 1 s, by adding or subtracting
an offset with sub-second resolution.
This allows synchronization to a remote clock.

Reference clock detection: a more precise second source clock (such as mains 50
or 60 Hz) can be used to enhance the long-term precision of the calendar:
The second source clock is automatically detected and used to update the calendar
The LSE clock is automatically used to update the calendar whenever the second
source clock becomes unavailable

Timestamp: the calendar value (including sub-seconds) is saved in the
timestamp registers on an external I/O event

RTC Programmable Alarm

2 fully programmable alarms

Able to wake the device from STOP/STANDBY modes.
Alarm events can also be routed to the dedicated output pin
RTC_OUT with configurable polarity.
The alarm flags are set if the calendar sub-seconds, seconds,
minutes, hours or date match the values programmed in the
alarm registers (ALRMASSR & ALRMAR, ALRMBSSR &
ALRMBR).
The calendar sub-second, seconds, minutes, hours and date fields
can each be independently masked or included in the comparison.

WakeUp configuration (1/3)

The periodic wakeup flag is generated by a 16-bit programmable binary auto-reload
down-counter (WUTR register).
Able to wake the device from STANDBY mode.
The wakeup clock source is selected via the WUCKSEL[2:0] bits in the control register
RTC_CR (to program these bits the auto wakeup must be deactivated, WUTE=0).
Three cases are possible:

Case 1: WUCKSEL = 0xx

[Figure: RTCCLK feeds an asynchronous 4-bit prescaler selected by WUCKSEL[2:0] (minimum division 2, maximum division 16). Its output, WakeUpCLK, clocks the 16-bit auto-reload timer (0x0000 to 0xFFFF), which sets the periodic wakeup flag.]

With RTCCLK = 32.768 kHz:
Shortest wakeup period: (2 x (0x0001 + 1)) / RTCCLK => 122 µs
Longest wakeup period: (16 x (0xFFFF + 1)) / RTCCLK => 32 s
Minimum resolution: 2 / RTCCLK = 61 µs

WakeUp configuration (2/3)

Case 2: WUCKSEL = 10x

[Figure: RTCCLK feeds the asynchronous 7-bit prescaler (minimum division 1, maximum division 2^7 = 128, which is the power-on reset value), then the synchronous 15-bit prescaler (minimum division 1, maximum division 2^15, power-on reset value 256). The resulting ck_spre is used as WakeUpCLK for the 16-bit auto-reload timer (0x0000 to 0xFFFF), which sets the periodic wakeup flag.]

Shortest wakeup period: (1 x (0x0000 + 1)) / RTCCLK
Longest wakeup period: (2^22 x (0xFFFF + 1)) / RTCCLK
If ck_spre is 1 Hz (when used for the calendar): 1 s <= wakeup period <= 18.2 h (1 s resolution)

WakeUp configuration (3/3)

Case 3: WUCKSEL = 11x

[Figure: same clock path as case 2: asynchronous 7-bit prescaler (division 1 to 2^7), synchronous 15-bit prescaler (division 1 to 2^15), ck_spre used as WakeUpCLK. The 16-bit auto-reload timer (0x0000 to 0xFFFF) sets the periodic wakeup flag, with 0x10000 added to the reload value.]

Shortest wakeup period: (1 x (0x10000 + 1)) / RTCCLK
Longest wakeup period: (2^22 x (0x1FFFF + 1)) / RTCCLK
If ck_spre is 1 Hz (when used for the calendar): 18.2 h <= wakeup period <= 36.4 h (1 s resolution)
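The three wakeup cases can be summarized in one host-side calculation. This is a sketch under stated assumptions: the function name is hypothetical, and the mapping of WUCKSEL = 000, 001, 010, 011 to divisions by 16, 8, 4 and 2 is assumed from the min/max values quoted in case 1.

```python
def wakeup_period_s(wucksel, wut, rtcclk_hz=32768, ck_spre_hz=1.0):
    """Return the periodic wakeup interval in seconds."""
    if not 0 <= wucksel <= 0b111 or not 0 <= wut <= 0xFFFF:
        raise ValueError("WUCKSEL is 3 bits and the reload value is 16 bits")
    if wucksel & 0b100 == 0:
        # Case 1 (0xx): RTCCLK divided by 16, 8, 4 or 2 (assumed mapping).
        divider = 16 >> (wucksel & 0b011)
        return divider * (wut + 1) / rtcclk_hz
    if wucksel & 0b010 == 0:
        # Case 2 (10x): the timer is clocked by ck_spre.
        return (wut + 1) / ck_spre_hz
    # Case 3 (11x): ck_spre clock, with 0x10000 added to the reload value.
    return (wut + 0x10000 + 1) / ck_spre_hz
```

With the defaults, the results reproduce the limits on the slides: about 122 µs and 32 s for case 1, 1 s to 65536 s (18.2 h) for case 2, and 65537 s to 131072 s (36.4 h) for case 3.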

Smooth digital calibration


Consists of masking or adding N (configurable) 32 kHz
clock pulses, evenly distributed over a configurable
window.
A 1 Hz output is provided to measure the quartz
frequency and the calibration result.

Calibration window   Accuracy    Total range
8 s                  ±1.91 ppm   0 to ±480 ppm
16 s                 ±0.95 ppm   0 to ±480 ppm
32 s                 ±0.48 ppm   0 to ±480 ppm
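The accuracy column can be cross-checked with a short calculation. The sketch below assumes that one masked or added pulse per window is the smallest correction step and that the quoted accuracy is half of that step; this interpretation is an assumption, although it does reproduce the table values. Function names are hypothetical.

```python
def calibration_step_ppm(window_s, rtcclk_hz=32768):
    """Frequency change, in ppm, from masking or adding one pulse per window."""
    return 1e6 / (window_s * rtcclk_hz)


def calibration_accuracy_ppm(window_s, rtcclk_hz=32768):
    """Worst-case residual error, taken as half a correction step."""
    return calibration_step_ppm(window_s, rtcclk_hz) / 2
```

A longer window contains more pulses, so each masked or added pulse is a smaller fraction of the total, which is why the 32 s window gives the finest accuracy.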

Tamper detection
3 tamper pins and events (RTC_TAMPx)
Configurable active level for each event
Configurable use of the I/O pull-up resistors
Configurable pre-charging pulse to support different
capacitance values: 1, 2, 4 or 8 cycles
The capacitor is optional (filtering can be done by software)
Biasing is done using the I/O pull-up resistor

Configurable filter:
Sampling rate: 128 Hz, 64 Hz, 32 Hz, 16 Hz, 8 Hz, 4 Hz, 2 Hz, 1 Hz
Number of consecutive identical events before issuing
an interrupt to wake up the MCU: 1, 2, 4, 8

RTC_TAMP1 is available in VBAT mode.
Backup registers are reset when a tamper event is detected
A tamper event can generate a timestamp event

[Figure: a tamper switch connected to the RTC_TAMPx pin of the STM32.]

Tamper detection - signals


[Figure: timing diagram of the clock and of the voltage on the tamper detect input, for a floating (not connected) input, with the switch opened and with the switch closed, using 1, 2 and 4 cycles of pre-charge (8 cycles not shown). The point at which the input voltage is sampled is marked.]

Quiz
What are the different RTC clock sources?
---------------------------------------------------
What are the different RTC interrupts?
------------------------------------------------------
What is the maximum RTC sub-second (RTC_SSR) resolution?
-------------------------------------------------------
How many RTC backup registers are available?
----------------------------------------------------------

General Purpose Timers


(TIM2/3/4/5 - TIM12/13/14 - TIM15/16/17 - TIM6/7/18)

STM32F30x Timer features overview


Category                            Timer            Counter     Counter type          Prescaler  DMA  Capture/Compare  Master  Slave
                                                     resolution                        factor          channels         config  config
General purpose                     TIM2             32 bit      Up, Down and Up/Down  1...65536  YES  4                YES     YES
General purpose                     TIM3 and TIM4    16 bit      Up, Down and Up/Down  1...65536  YES  4                YES     YES
Basic                               TIM6 and TIM7    16 bit      Up                    1...65536  YES  0                YES     NO
1 channel, 1 complementary output   TIM16 and TIM17  16 bit      Up                    1...65536  NO   1                YES(1)  NO
2 channels, 1 complementary output  TIM15            16 bit      Up                    1...65536  NO   2                YES     YES

(1) TIM16 and TIM17 have no TRGO output, instead OC output is used

STM32F37x Timer features overview

Category                            Timer                 Counter     Counter type          Prescaler  DMA  Capture/Compare  Master  Slave
                                                          resolution                        factor          channels         config  config
General purpose                     TIM2 and TIM5         32 bit      Up, Down and Up/Down  1...65536  YES  4                YES     YES
General purpose                     TIM3, TIM4 and TIM19  16 bit      Up, Down and Up/Down  1...65536  YES  4                YES     YES
Basic                               TIM6, TIM7 and TIM18  16 bit      Up                    1...65536  YES  0                YES     NO
1 channel, 1 complementary output   TIM16 and TIM17       16 bit      Up                    1...65536  NO   1                YES(1)  NO
2 channels, 1 complementary output  TIM15                 16 bit      Up                    1...65536  NO   2                YES     YES
1 channel                           TIM13 and TIM14       16 bit      Up                    1...65536  NO   1                YES(2)  NO
2 channels                          TIM12                 16 bit      Up                    1...65536  NO   2                NO      YES

(1) TIM16 and TIM17 have no TRGO output, instead OC output is used
(2) TIM13 and TIM14 have no TRGO output, instead OC output is used

Features overview (1/3)


Up to 4 16-bit resolution capture/compare channels (TIM3/4/19)
Up to 4 32-bit resolution capture/compare channels (TIM2/5)
Inter-timers synchronization
Up to 6 IT/DMA requests
Encoder interface
Hall sensor interface

[Figure: block diagram. The clock, ETR and ITR1 to ITR4 feed the trigger/clock controller, which also provides the trigger output. A 16-bit prescaler, an auto-reload register and an up/down 16/32-bit counter drive four capture/compare channels (CH1 to CH4).]

Timers: TIM2/5, TIM3/4/19

Features overview (2/3)


Up to 2 16-bit resolution capture/compare channels
Inter-timers synchronization
Encoder interface
Only TIM15 has a complementary output on channel 1

[Figure: block diagram. The clock, ETR and ITR1 to ITR4 feed the trigger/clock controller, which also provides the trigger output. A 16-bit prescaler, an auto-reload register and the counter drive two capture/compare channels (CH1 and CH2), with a complementary output on CH1.]

Timers: TIM12, TIM15

Features overview (3/3)


One 16-bit resolution capture/compare channel
Only TIM16/17 have a complementary output on channel 1

[Figure: block diagram. The clock, ETR and ITR1 to ITR4 feed the trigger/clock controller, which also provides the trigger output. A 16-bit prescaler, an auto-reload register and the counter drive one capture/compare channel (CH1), with a complementary output on CH1.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17, TIM6/7/18

Counting Modes (1/2)


There are three counter modes:
Up counting mode
Down counting mode
Center-aligned mode

[Figure: counter waveforms and update events for up counting, down counting and center-aligned modes.]

Timers: TIM2/5, TIM3/4/19

Counting Modes (2/2)


There is only one counting mode:
Up counting mode

[Figure: counter waveform and update events for up counting mode.]

Timers: TIM12, TIM15, TIM13/14, TIM16/17, TIM6/7/18

Update Event

The transfer of the preload register content into the shadow register
depends on whether the auto-reload preload feature is enabled:
If enabled, the transfer occurs at each update event
If not enabled, the transfer occurs immediately

The update event is generated:
On each counter overflow/underflow
Through software, by setting the UG bit (Update Generation)

The update event (UEV) request source can be configured to be:
Only the counter overflow/underflow event
The counter overflow/underflow event plus the following events:
Setting the UG bit by software
Trigger active edge detection (through the slave mode controller)

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17
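The rate of overflow-generated update events follows from the prescaler and auto-reload values. The sketch below assumes up counting, a prescaler that divides by its programmed value plus one, and a counter period of the auto-reload value plus one; the names `psc` and `arr` and both function names are illustrative only.

```python
def update_event_hz(timer_clk_hz, psc, arr):
    """Update event rate for an up-counting timer."""
    return timer_clk_hz / ((psc + 1) * (arr + 1))


def find_psc_arr(timer_clk_hz, target_hz, counter_bits=16):
    """Pick the smallest prescaler for which the auto-reload value fits."""
    ticks = round(timer_clk_hz / target_hz)
    max_arr = (1 << counter_bits) - 1
    for psc in range(1 << 16):  # the prescaler itself is 16 bits wide
        arr = round(ticks / (psc + 1)) - 1
        if 0 <= arr <= max_arr:
            return psc, arr
    raise ValueError("target rate is out of range for this timer")
```

Choosing the smallest prescaler keeps the counter resolution as fine as possible; for a 32-bit timer, pass `counter_bits=32`.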

Counter Clock Selection

Clock can be selected out of 8 sources:

Internal clock TIMxCLK provided by the RCC
Internal trigger input 1 to 4: ITR1 / ITR2 / ITR3 / ITR4
(using one timer as prescaler for another timer)
External capture/compare pins:
Pin 1: TI1FP1 or TI1F_ED
Pin 2: TI2FP2
External pin ETR:
Enable/disable bit
Programmable polarity
4-bit external trigger filter
External trigger prescaler: prescaler off, division by 2, division by 4, division by 8

[Figure: TIMxCLK, ETR (through polarity selection, edge detector, prescaler and filter), ITR1 to ITR4, TI1F_ED, TI1FP1 and TI2FP2 feed the trigger controller, which clocks the counter and drives TRGO.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Capture Compare Array presentation


Up to 4 channels
TIM2/3/4/5/19 have 4 channels
TIM12/15 have 2 channels
TIM13/14/16/17 have one channel
TIM6/7/18 have no channels

Programmable bidirectional channels
Input direction: channel configured in Capture mode
Output direction: channel configured in Compare mode

Channel main functional blocks
Capture/Compare register
Input stage for capture: 4-bit digital filter, input capture prescaler
Output stage for compare: output control block

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17

Input Capture Mode (1/2)


Capture stage architecture

[Figure: each input TI1 to TI4 passes through an input filter and edge detector to produce IC1 to IC4, with TRC as an alternative source. Each ICx then goes through a prescaler to its own 16-bit capture/compare register (1 to 4).]

Timers: TIM2/5, TIM3/4/19

Input Capture Mode (2/2)


Flexible mapping of TIx inputs to channel inputs ICx
{TI1->IC1}, {TI1->IC2}, {TI2->IC1} and {TI2->IC2} are possible

When an active edge is detected on the ICx input, the counter value is
latched in the corresponding CCR register.
When a capture event occurs, the corresponding CCxIF flag is set
and an interrupt or a DMA request can be sent if they are enabled.
An over-capture flag signals over-capture: it is set
when a capture event occurs while the CCxIF flag is already high

Timers: TIM2/5, TIM3/4/19

PWM Input Mode


The PWM Input functionality enables the measurement of the period and the pulse
width of an external waveform.

IC1 and IC2 must be configured to be connected together to the PWM signal:
IC1 and IC2 are redirected internally to be mapped to the same external pin TI1 or TI2.
IC1 and IC2 active edges must have opposite polarity.
IC1 or IC2 is selected as trigger input and the slave mode controller is configured in reset
mode.

[Figure: timer clock, PWM input signal and counter waveforms. In the example shown, IC1 captures the duty cycle and IC2 captures the period.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15
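Once both captures are available, the frequency and duty cycle follow from two divisions. This is a minimal host-side sketch; the function and parameter names are hypothetical, and it assumes that the period capture counts counter ticks between two resets of the counter.

```python
def pwm_input_measure(period_count, pulse_count, counter_clk_hz):
    """Return (frequency in Hz, duty cycle in 0..1) from the two captures."""
    if period_count <= 0 or not 0 <= pulse_count <= period_count:
        raise ValueError("captured values are not a valid PWM measurement")
    frequency_hz = counter_clk_hz / period_count
    duty_cycle = pulse_count / period_count
    return frequency_hz, duty_cycle
```

In the example on the slide, the period capture would come from IC2 and the pulse capture from IC1.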

Output Compare Mode


The Output Compare is used to control an output waveform or to indicate when a period of
time has elapsed.
When a match is found between the capture/compare register and the counter:
The corresponding output pin is driven according to the programmed mode, it can be:
Set
Reset
Toggle
Remain unchanged
A flag is set in the interrupt status register
An interrupt is generated if the corresponding interrupt mask is set
A DMA request is sent if the corresponding enable bit is set

The CCRx registers can be programmed with or without preload registers

[Figure: timer clock, counter, CCR1 and OC1 waveforms. An interrupt is raised at each compare match, and a new CCR1 value is written between matches.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17

PWM Mode
Available on all channels
Two PWM modes available:
PWM mode 1
PWM mode 2
Each PWM mode behavior (waveform shape) depends on the counting direction

[Figure: edge-aligned mode and center-aligned mode waveforms, showing the timer clock, the counter against the auto-reload and capture/compare values, the update events and the resulting OCx output.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17
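For the edge-aligned case, the auto-reload and compare values can be derived from the wanted frequency and duty cycle. The sketch below assumes up counting, a period of auto-reload value plus one ticks, and an output that is active while the counter is below the compare value; the function name is hypothetical.

```python
def pwm_registers(counter_clk_hz, pwm_hz, duty_cycle):
    """Return (auto_reload, compare) for an edge-aligned PWM output."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    period_ticks = round(counter_clk_hz / pwm_hz)
    if period_ticks < 1:
        raise ValueError("PWM frequency is above the counter clock")
    auto_reload = period_ticks - 1          # counter runs 0..auto_reload
    compare = round(duty_cycle * period_ticks)
    return auto_reload, compare
```

A compare value equal to the full period gives a 100% duty cycle, which is why the compare value may exceed the auto-reload value by one.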

One Pulse Mode (1/2)

One Pulse Mode (OPM) is a particular case of Output Compare mode
It allows the counter to be started in response to a stimulus and to
generate a pulse:
With a programmable length
After a programmable delay

There are two One Pulse Mode waveforms selectable by software:
Single pulse
Repetitive pulse

[Figure: a stimulus on TI2 starts the counter. OC1REF and OC1 change when the counter reaches TIM_CCR1 (end of tDelay) and again when it reaches TIM_ARR (end of tPulse).]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17

One Pulse Mode (2/2)


Exercise:
How to configure One Pulse Mode to generate a repetitive pulse in response to a stimulus?

One Pulse Mode configuration steps

1. Input Capture Module Configuration:
i. Map TIxFPx on the corresponding TIx.
ii. TIxFPx polarity configuration.
iii. TIxFPx configuration as trigger input.
iv. TIxFPx configuration to start the counter (Trigger mode)

2. Output Compare Module Configuration:
i. OCx configuration to generate the corresponding waveform.
ii. OCx polarity configuration.
iii. tDelay and tPulse definition.

3. One Pulse Module Selection: set or reset the corresponding bit (OPM) in the configuration
register (CR1).

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15, TIM13/14, TIM16/17

Encoder Interface (1/2)


Encoders are used to measure the position and
speed of mobile systems (either linear or
angular)
The encoder interface mode acts as an external
clock with direction selection
Encoder and microcontroller connection example:
An encoder can be connected directly to the MCU without
external interface logic.
The third encoder output, which indicates the
mechanical zero position, may be connected to an
external interrupt and trigger a counter reset.

Encoder enhancement
A copy of the Update Interrupt Flag (UIF) is available
in bit 31 of the counter register
Simultaneous read of the counter value and the
UIF flag simplifies the position determination

[Figure: TI1 and TI2 each pass through a polarity select and edge controller stage into the encoder interface of the trigger controller.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Encoder Interface (2/2)


Exercise:
How to configure the Encoder interface to detect the rotation direction of a motion system?

Encoder interface configuration steps:
1. Select the active edges: example counting on TI1 and TI2.
2. Select the polarity of each input: example TI1 and TI2 polarity not inverted.
3. Select the corresponding Encoder Mode.
4. Enable the counter.

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15
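What the encoder interface does in hardware when counting on both inputs can be modeled in software. The following is a behavioral sketch only, not the peripheral's implementation: the function name is hypothetical, and the sign convention (count up when the first signal leads the second) is an assumption.

```python
# Valid transitions of the 2-bit state (a << 1 | b) when signal A leads B.
_UP = {(0b00, 0b10), (0b10, 0b11), (0b11, 0b01), (0b01, 0b00)}


def quadrature_count(samples, start=0):
    """Count every edge of both inputs; samples is a sequence of (a, b) levels."""
    iterator = iter(samples)
    a, b = next(iterator)
    previous = (a << 1) | b
    count = start
    for a, b in iterator:
        current = (a << 1) | b
        if current == previous:
            continue                      # no edge on either input
        if (previous, current) in _UP:
            count += 1
        elif (current, previous) in _UP:
            count -= 1                    # same edge seen in reverse order
        else:
            raise ValueError("both inputs changed at once")
        previous = current
    return count
```

The sign of the count change gives the rotation direction, and one full quadrature cycle moves the count by four.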

Hall sensor Interface (1/2)


[Figure: Hall A, Hall B and Hall C are combined by an XOR gate to form TI1. TI1 passes through the input filter and edge detector, which provides TI1F_ED to the trigger and slave mode controller and IC1 to the first capture channel. TI2 to TI4 have their own input filter and edge detector stages (IC2 to IC4, with TRC as an alternative source). Each ICx goes through a prescaler to its own 16-bit capture/compare register (1 to 4).]

Timers: TIM2/5, TIM3/4/19

Hall sensor Interface (2/2)


Hall sensors are used for:
Speed detection
Position sensing
Brushless DC motor sensing

How to configure the TIM to interface with a Hall sensor?
Select the Hall inputs for TI1: TI1S bit in the CR2 register
The slave mode controller is configured in reset mode
TI1F_ED is used as input trigger

To measure a motor speed:
Use the Capture/Compare channel 1 in Input Capture mode
The capture signal is the TRC signal
The captured value, which corresponds to the time elapsed between 2 changes on
the inputs, gives information about the motor speed

Timers: TIM2/5, TIM3/4/19
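Turning the captured value into a speed needs two motor-specific facts that the slide does not give. The sketch below assumes three Hall sensors producing 6 input changes per electrical revolution and takes the number of pole pairs as a parameter; the function name and both assumptions are illustrative.

```python
def hall_speed_rpm(captured_ticks, counter_clk_hz, pole_pairs,
                   changes_per_electrical_rev=6):
    """Mechanical speed in rpm from the ticks between two Hall input changes."""
    if captured_ticks <= 0 or pole_pairs <= 0:
        raise ValueError("captured ticks and pole pairs must be positive")
    change_rate_hz = counter_clk_hz / captured_ticks
    electrical_rev_per_s = change_rate_hz / changes_per_electrical_rev
    return 60.0 * electrical_rev_per_s / pole_pairs
```

A shorter captured interval means a faster motor, so the counter clock should be chosen so that the slowest speed of interest still fits in the counter range.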

Synchronization Mode Configuration


The trigger output can be controlled on:
Counter reset
Counter enable
Update event
OC1 / OC1Ref / OC2Ref / OC3Ref / OC4Ref signals

The slave timer can be controlled in the following modes:
Triggered mode: only the start of the counter is controlled
Gated mode: both start and stop of the counter are controlled
Reset mode: a rising edge of the selected trigger input (TRGI) reinitializes the counter

[Figure: triggered mode example (clock, master ARR, master CNT, update event, slave CNT) and gated mode example (clock, master CCR1 updated to a new value, master CNT, master CC1, master trigger out, slave CNT).]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Synchronization: Configuration examples (1/3)


Cascade mode:
TIM3 used as master timer for TIM2
TIM2 configured as TIM3 slave, and master for TIM15

[Figure: Timer 3 (master: clock, prescaler, counter) sends its update event through its trigger controller as TRG 1 to an ITR input of Timer 2 (slave/master: prescaler, counter). Timer 2 in turn sends its update event as TRG 2 to an ITR input of Timer 15 (slave: prescaler, counter).]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Synchronization: Configuration examples (2/3)


One master, several slaves: TIM2 used as master for TIM3, TIM4
and TIM15

[Figure: Timer 2 (master: clock, prescaler, counter) sends its update event through its trigger controller as TRG1 to an ITR input of each slave: Timer 3 (slave 1), Timer 4 (slave 2) and TIM15 (slave 3), each with its own prescaler and counter.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Synchronization: Configuration examples (3/3)


Timers and external trigger synchronization
TIM2, TIM3 and TIM4 are slaves for an external signal connected to the respective
timer inputs

[Figure: an external trigger is connected to the trigger controllers of TIM2, TIM3 and TIM4; each timer provides its own TRGO.]

Timers: TIM2/5, TIM3/4/19, TIM12, TIM15

Universal Serial Bus interface


(USB Device)

USB Speeds & bus components


USB 2.0 speeds
Low speed: 1.5 Mbits/s
Full speed: 12 Mbits/s
High speed: 480 Mbits/s

USB keeps high compatibility level between all supported speeds


Bus components
USB host or root hub: initiates all the transactions on the bus
USB device: a set of one or more interfaces that expose capabilities to
the host (e.g. mouse, keyboard)
USB hub: allows multiple devices to be connected to the USB host. It has an
upstream port for communication with the host and multiple downstream
ports for connection to devices

Device address assignment
After detecting a device attachment, the host assigns a unique address
(on 7 bits) to the device

USB 2.0 Bus Tiered-Star Topology

A maximum of 127 devices can be connected on the bus
A maximum of 5 hubs can be connected in series

USB Device attachment & speed detection

Full/high speed: pull-up on D+
Low speed: pull-up on D-

The 1.5 kΩ pull-up allows the host to detect the device attachment and its
supported speed
A high-speed device is first detected as a full-speed device, then its high-speed
capability is detected through a bus handshake called the chirp sequence

USB Device Power


Two possible power configurations
Self-powered: power provided from an external power supply
Bus-powered: power provided from VBUS (5 V)

For a bus-powered device, two options are possible:
Low-power devices: maximum current consumption is 100 mA
High-power devices: maximum current consumption is 100 mA
during bus enumeration and 500 mA after configuration

During device enumeration, the device indicates to the host
its power configuration (self-powered/bus-powered) and
its power consumption in the device configuration
descriptor
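The two power-related fields of the configuration descriptor can be sketched as below. The function name is hypothetical; the field encoding (bit 7 of bmAttributes always set, bit 6 for self-powered, bit 5 for remote wakeup, bMaxPower in units of 2 mA) is taken from the USB 2.0 configuration descriptor definition and should be checked against the specification.

```python
def config_power_fields(self_powered, max_current_ma, remote_wakeup=False):
    """Return (bmAttributes, bMaxPower) for a USB 2.0 configuration descriptor."""
    if not 0 <= max_current_ma <= 500:
        raise ValueError("a USB 2.0 device may draw at most 500 mA from VBUS")
    bm_attributes = 0x80                  # bit 7 is reserved and set to one
    if self_powered:
        bm_attributes |= 0x40
    if remote_wakeup:
        bm_attributes |= 0x20
    b_max_power = (max_current_ma + 1) // 2   # 2 mA units, rounded up
    return bm_attributes, b_max_power
```

A bus-powered low-power device would therefore report (0x80, 50), and a high-power device drawing the full 500 mA would report a bMaxPower of 250.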

USB Suspend mode


A USB device should enter USB Suspend mode when the bus is idle
for more than 3 ms
In suspend mode, the current drawn by the USB device from VBUS
should not exceed 2.5 mA (the old specification was 500 µA)
The USB host prevents the device from entering suspend mode by
periodically issuing a Start of Frame (SOF) or a keep-alive (for low-speed)
For high-speed, SOF is sent every micro-frame: 125 µs +/- 62.5 ns
For full-speed, SOF is sent every frame: 1 ms +/- 500 ns
For low-speed, a keep-alive, which is an EOP (End of Packet), is sent every
1 ms in the absence of low-speed data

Exit from Suspend mode can be
Initiated by the host by issuing resume signaling (Resume or Reset)
Initiated by the device by issuing remote wakeup signaling

USB Transaction
One bus transaction is composed of:
Token packet (SETUP, IN, OUT), always issued by the host
Target device address
Target endpoint number
Direction of transaction (IN: device to host; SETUP/OUT: host to device)

Data packet (DATA0, DATA1, DATA2, MDATA)
Carries the data payload of a transaction sent by the host or device
The DATA PID toggle is used to synchronize host and device and to avoid repeated
packet transfer in case of a corrupted or lost handshake

Handshake packet (ACK, NAK, STALL, NYET)
ACK: packet reception acknowledged (sent from host or device)
NAK: packet reception not acknowledged (sent from device)
STALL: control request not supported or endpoint halted (sent from device)
NYET: device not ready to accept further packets (only from high-speed device)

Packet formats:
Token packet: PID, ADDRESS, ENDPOINT, CRC
Data packet (up to 1023 bytes): PID, DATA, CRC
Handshake packet: PID

Examples of IN/OUT transactions


[Figure: sequence diagrams of OUT and IN transactions between host and device, showing token packets (OUT, IN), DATA0/DATA1 data packets and the ACK or NAK handshakes that answer them.]

USB Transfer
A USB transfer is composed of one or multiple bus transactions
Four types of USB transfers are defined:
Control: used for control and configuration requests (e.g. device
enumeration)
Bulk: used for large data transfers with no guaranteed delivery rate (e.g.
printer, mass-storage drive)
Interrupt: used for interrupt-driven devices that need to be polled
periodically for small data transfers (e.g. mouse, keyboard, joystick)
Isochronous: used for data streaming applications that require a
guaranteed delivery rate but no error checking (e.g. audio, video devices)

During each frame (in LS/FS) or micro-frame (in HS), the host
schedules the needed transfers with a different bandwidth allocation for
each transfer type

USB Control Transfer


Used for standard control requests during the device enumeration process or
during class operation
All devices should support control transfers through endpoint 0 (bidirectional)
10% of the bus bandwidth is reserved for control transfers in FS/LS and 20% in HS
A control transfer has 3 stages
SETUP stage: one SETUP transaction for issuing the control request (e.g. Get
Descriptor)
Optional DATA stage, IN or OUT: one or multiple data transactions
STATUS stage: one IN or OUT transaction with a zero-length data packet to check
whether the control transfer request was executed correctly.

The maximum data packet size during the optional data stage is 8 bytes for
LS and 64 bytes for FS/HS
Transfer error management is done through the handshake packet and the data PID
toggle mechanism

Example of a USB Control Transfer


Get device descriptor standard request:

SETUP stage

DATA stage IN

STATUS stage

USB Bulk Transfer


Used to transfer large amount of data without guaranteed delivery rate
(sending data to printer, drive,..)
Lowest priority transfer with no reserved bus bandwidth but can occupy the full
bandwidth if no other transfer on the bus
Supported only by full-speed and high-speed devices
Can consist of one or more IN or OUT transactions (one pipe/endpoint needed
for each direction) during each frame/micro-frame
The max packet size is 64 bytes for FS and 512 bytes for HS
Used in most cases as a transport layer for a higher application or class
protocol layer (Bulk-Only transfer in mass-storage class)
Transfer error management done through handshake packet and data PID
toggle mechanism
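How a transfer breaks down into data packets with alternating PIDs can be sketched as follows. This is a simplified model with a hypothetical function name: it assumes every packet is acknowledged, so the toggle advances after each packet, and it does not model retries after a lost handshake.

```python
def split_transfer(payload, max_packet_size=64, first_pid="DATA0"):
    """Split a payload into (pid, chunk) data packets with a DATA0/DATA1 toggle."""
    pids = ("DATA0", "DATA1")
    toggle = pids.index(first_pid)
    chunks = [payload[i:i + max_packet_size]
              for i in range(0, len(payload), max_packet_size)]
    if not chunks:
        chunks = [b""]                    # an empty transfer is one zero-length packet
    packets = []
    for chunk in chunks:
        packets.append((pids[toggle], chunk))
        toggle ^= 1                       # toggle only after a successful packet
    return packets
```

If a handshake is lost, the sender repeats the packet with the same PID; the receiver recognizes the unchanged PID and discards the duplicate, which is the purpose of the toggle.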


USB Interrupt Transfer


Useful when data needs to be transferred within a maximum transfer latency
(mouse, keyboard, etc.)
IN or OUT data transfers can occur periodically within a maximum latency
period negotiated during device enumeration
Has a limited reserved bandwidth with a guaranteed maximum latency
For LS the maximum packet length is 8 bytes, with up to 1 packet
every 10 frames
For FS the maximum packet length is 64 bytes, with up to 1
packet every frame
For HS the maximum packet length is 1024 bytes, with up to 3
packets every micro-frame

When bandwidth is available the host is free to schedule extra OUT or IN
interrupt transactions (more often than the predefined maximum latency period)
Transfer error management is done through the handshake packet and the data PID
toggle mechanism

USB Isochronous transfers


Used mainly for streaming real-time data like audio and video
Needs a guaranteed transfer rate with a predefined number of bytes in every
frame/micro-frame, but no transfer error checking
The transfer rate is negotiated between host and device during enumeration
A transfer can consist of one or more data OUT or IN transactions
In full-speed, the maximum packet length is 1023 bytes with a maximum of one
packet per frame
In high-speed, the maximum packet length is 1024 bytes with a maximum of 3 packets
per micro-frame
Isochronous transactions do not include a handshake packet
Not supported by low-speed devices

Clock synchronization between the host and device may be needed (e.g.
audio speaker); it can be done by
The device synchronizing its clock to the SOF packet
Using a feedback pipe for flow control
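The number of bytes per frame for an audio stream can be estimated with a short calculation. The sketch assumes uncompressed PCM samples and one packet per 1 ms full-speed frame; the function and parameter names are hypothetical.

```python
def iso_bytes_per_frame(sample_rate_hz, channels, bytes_per_sample,
                        frames_per_second=1000):
    """Return (usual, largest) packet size in bytes for a PCM audio stream."""
    samples, remainder = divmod(sample_rate_hz, frames_per_second)
    usual = samples * channels * bytes_per_sample
    # When the rate is not a multiple of the frame rate, some frames carry
    # one extra sample per channel.
    extra = channels * bytes_per_sample if remainder else 0
    return usual, usual + extra
```

A 48 kHz, 16-bit stereo stream needs 192 bytes in every frame, well under the 1023-byte full-speed limit, while a 44.1 kHz stream alternates between 176 and 180 bytes.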

Interrupt & Isochronous Transfers Host Constraints

The host may not be able to provide the requested bandwidth to the
device; in this case the host will try other possible configurations with
lower bandwidth requirements (if provided by the device)
If there is still no bandwidth available, the host will refuse the device configuration

Host software may have some latency for processing data and
issuing transfer requests on time, due to other processes taking CPU
time

In order to avoid multiple software calls for handling data to be transmitted
or received, large chunks of data transfers should be scheduled

USB controllers in the STM32 microcontroller


series

USB Device Controllers in STM32 series


A USB device controller is present in almost all STM32 ARM Cortex-M3/M4
series
Three hardware implementations are available
USB 2.0 full-speed device controller
USB 2.0 full-speed dual role host/device OTG controller
USB 2.0 high-speed dual role host/device OTG controller

The choice of controller for an application depends on
Needed USB transfer performance
Needed CPU performance
Available Flash and RAM memory size
Presence of other needed peripherals
Power consumption requirements
External components (BOM)

USB Device Controller in STM32F1/F3/L1

USB 2.0 Full-speed Device


Controller

USB 2.0 Full-speed Device Controller


Features
Available on the following ARM Cortex-M3/M4 platforms:
STM32F102: USB access line (48 MHz MCU, up to 16 KB SRAM and
128 KB of Flash)
STM32F103: Performance line (72 MHz MCU, up to 96 KB SRAM and
1 MB of Flash)
STM32L152: Ultra-low-power series (32 MHz MCU, up to 16 KB SRAM
and 128 KB of Flash)
STM32F3xx: DSP & Analog (72 MHz MCU, up to 32 KB of SRAM)

Main features
USB 2.0 full-speed compliant
Up to 8 bidirectional endpoints (or 16 unidirectional endpoints)
Embedded full-speed analog transceiver
Supports all transfer modes (control, bulk, interrupt and isochronous)
Dedicated SRAM area of 512 bytes as packet memory that can be shared
among the needed endpoints
Double-buffering mechanism for isochronous and bulk transfers
USB Suspend/Resume with system entry/wakeup for low-power mode

USB 2.0 Full-speed Device Controller


Block Diagram
SIE (Serial Interface Engine)
NRZI encoding/decoding
Synchronization & pattern recognition
Bit-stuffing and handshake evaluation
PID & CRC generation and checking
Interrupt generation

Suspend timer
Generates the suspend interrupt when no SOF
is detected for 3 ms

Packet buffer memory
512 bytes of dedicated SRAM
The arbiter allows dual access, either from the
packet buffer interface or from the APB interface

3 interrupt vectors (lines)
Low priority interrupt for managing all
endpoints
High priority interrupt: can be used for
managing isochronous/double-buffered
endpoints only
Suspend/Resume interrupt

[Figure: block diagram of the USB IP, split between a 48 MHz USB clock domain (fed by the PLL) and an APB clock domain (APB_CLK). Blocks shown: analog transceiver (D+, D-), RX-TX, suspend timer, clock recovery, SIE, control registers & logic, endpoint selection, endpoint registers, interrupt registers & logic, packet buffer interface, packet buffer memory, arbiter, register mapper, interrupt mapper and APB interface, connected to the APB bus and the interrupt lines.]

USB 2.0 Full-speed Device Controller


Operation overview
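[Figure: one data packet received on D+/D- is stored by the USB IP, through the arbiter, in the packet memory area (buffers EP0_RX, EP0_TX, EP1_RX, EP1_TX, EP2_RX, EP2_TX). A CTR interrupt is generated (USB interrupt) and the ARM Cortex CPU reaches the packet through the APB interface. The system SRAM is shown separately.]

The PMA size is 512 bytes, and it is no longer shared with the CAN RAM.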

USB 2.0 Full-speed Device Controller


Transactional model handling
After each successful transaction on any configured endpoint, an interrupt
(correct transfer, CTR) is raised
The correct transfer interrupt handler has to:
Check the interrupt status bits to determine the endpoint on which the transaction has
occurred
For OUT/SETUP endpoints: copy the received data packet from the packet memory area
to the application buffer for processing, then re-enable the endpoint to be able to
receive the next incoming packet
For IN endpoints: copy the next data to be transferred from the application buffer to the packet
memory area, then re-enable the endpoint to send the packet when the next IN
token comes from the host

The hardware automatically changes the endpoint to the NAK state at the end of
each transaction, so it is up to the application to re-enable the endpoint for the next
transaction
The transactional model has simple firmware handling, but does not allow multiple-packet transfers without CPU intervention after each transferred packet

USB 2.0 Full-speed Device Controller


Endpoint Configuration/Enabling
Before any transfer starts on an endpoint, the following configuration should be done:
- Endpoint address (only the lower four bits)
- Endpoint transfer type (control, bulk, interrupt or isochronous)
- Endpoint TX or RX packet start address in the packet memory area
- For OUT/SETUP endpoints, the maximum receive packet size

After the configuration, the endpoint can be enabled for a transfer.

IN endpoint:
- Data can be copied from the application buffer to the endpoint PMA buffer
- The TX transfer count should be updated (the maximum is one max packet size)
- The endpoint status should be changed to ACK to allow the data transfer when the IN token arrives

OUT/SETUP endpoint:
- The endpoint status should be changed to ACK to allow OUT/SETUP data packet reception on the endpoint

USB 2.0 Full-speed Device Controller


Packet Memory Area
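[Figure: Packet Memory Area layout. The Buffer Description Table (BTABLE) lies in the packet memory area and holds, for each endpoint n, the entries EPn_TX_ADDR, EPn_TX_COUNT, EPn_RX_ADDR and EPn_RX_COUNT (shown for EP0, EP1 and EP2). Each ADDR entry points to the corresponding packet buffer (EP0 TX, EP0 RX, EP1 TX, EP1 RX, EP2_RX), located elsewhere in the packet memory area.]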
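To make the relation between the BTABLE and the packet buffers concrete, here is a minimal allocation sketch. It assumes that the BTABLE starts at offset 0 of the PMA and that each endpoint entry takes 8 bytes (four 16-bit words: TX_ADDR, TX_COUNT, RX_ADDR, RX_COUNT); both are assumptions of this example, and `layout_pma` is a hypothetical helper, not a library function. Only the 512-byte PMA size comes from the slides.

```python
PMA_SIZE = 512     # bytes, as stated on the slides
ENTRY_SIZE = 8     # assumed: four 16-bit words per endpoint entry

def layout_pma(endpoints):
    """endpoints: list of (tx_size, rx_size) in bytes, indexed by endpoint number.

    Returns {buffer name: (offset in PMA, size)}, with the packet buffers
    packed right after a BTABLE assumed to start at offset 0.
    """
    layout = {}
    next_free = ENTRY_SIZE * len(endpoints)   # first byte after the BTABLE
    for number, (tx_size, rx_size) in enumerate(endpoints):
        for direction, size in (("TX", tx_size), ("RX", rx_size)):
            if size:
                layout[f"EP{number}_{direction}"] = (next_free, size)
                next_free += size
    if next_free > PMA_SIZE:
        raise ValueError(f"PMA overflow: {next_free} bytes needed, {PMA_SIZE} available")
    return layout

# Three endpoints with 64-byte buffers in both directions
layout = layout_pma([(64, 64), (64, 64), (64, 64)])
```

With these numbers the table uses 24 bytes and the six buffers 384 bytes, so the configuration fits; a fourth endpoint of the same size would not.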

USB 2.0 Full-speed Device Controller


Double-Buffering mechanism
Double buffering is used to improve the transfer performance of isochronous and bulk endpoints (in one direction only).
It uses two buffers in the PMA (buffer0 and buffer1): at any time, the CPU accesses one buffer (for read/write) while the USB IP accesses the other.
Swapping between buffer0 and buffer1 on the USB side is done by hardware.
In a double-buffered bulk transfer, if the application (CPU) is too slow to hand its buffer over to the USB, a NAK is sent to the host.

[Figure: the PMA holds EP1 BUFF0 and EP1 BUFF1; the CPU accesses one buffer while the USB accesses the other.]

USB 2.0 Full-speed Device Controller


Packet Memory Area with double-buffering

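[Figure: Packet Memory Area with double buffering on EP1 TX. The Buffer Description Table (BTABLE) holds EP0_TX_ADDR, EP0_TX_COUNT, EP0_RX_ADDR and EP0_RX_COUNT for endpoint 0, and EP1_TX_ADDR_0, EP1_TX_COUNT_0, EP1_TX_ADDR_1 and EP1_TX_COUNT_1 for endpoint 1. The packet buffers are EP0 TX, EP0 RX, EP1 TX Buffer 0 and EP1 TX Buffer 1.]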

USB 2.0 Full-speed Device Controller


Suspend/Resume Interrupt
- When no SOF is detected for 3 ms, a suspend interrupt is generated
- In the suspend interrupt handler, a bus-powered device should put the MCU in low-power mode in order to lower its power consumption
- To achieve the lowest power consumption, the STM32 can enter STOP mode (all peripheral and CPU clocks OFF)
- Detection of host resume/reset signaling can wake the MCU up from STOP mode

USB 2.0 Full-speed Device Controller


External Hardware

[Figure: external hardware on the USB lines, including an optional circuit for forcing a device disconnect/connect.]

USB Library Footprints

Demo | Flash (Code + Const) | RAM usage
Joystick demo (HID) | 7K Bytes | 1400 Bytes
Mass Storage (Bulk) | 10K Bytes | 2100 Bytes
CDC - Virtual Com Port (Bulk + Interrupts) | 7K Bytes | 3400 Bytes

Touch Sensing Controller (TSC)

TSC Features (1/2)


- Proven and robust surface charge transfer acquisition principle, available on the STM32F05x, STM32F30x and STM32F37x families
- Supports up to 24 capacitive sensing channels split over 8 analog I/O groups
  - The number of channels and analog I/O groups depends on the device used
- Up to 8 capacitive sensing channels can be acquired in parallel, offering a very good response time
  - 1 counter per analog I/O group to store the current acquisition result
- One sampling capacitor for up to 3 capacitive sensing channels, to reduce the number of system components
- Full hardware management of the charge transfer acquisition sequence
  - No CPU load during acquisition
- Spread spectrum feature to improve system robustness in noisy environments

TSC Features (2/2)


- Programmable charge transfer frequency
- Programmable sampling capacitor I/O pin
  - Any GPIO of an analog I/O group can be used for the sampling capacitor
- Programmable channel I/O pin
  - Any GPIO of an analog I/O group can be used for the channel
- Programmable max count value to avoid a long acquisition when a channel is faulty
- Dedicated end-of-acquisition and max-count-error flags with interrupt capability
- Compatible with proximity, touchkey, linear and rotary touch sensors
- Designed to operate with the STMTouch touch sensing firmware library

STM32F302/303 TSC Overview


- Supports up to 24 capacitive sensing channels split over 8 analog I/O groups
- 10.2 MHz maximum charge transfer frequency

Number of capacitive sensing channels

Analog I/O group | STM32F30xVx | STM32F30xRx | STM32F30xCx
G1 | 3 | 3 | 3
G2 | 3 | 3 | 3
G3 | 3 | 3 | 2
G4 | 3 | 3 | 3
G5 | 3 | 3 | 3
G6 | 3 | 3 | 3
G7 | 3 | 0 | 0
G8 | 3 | 0 | 0
Number of capacitive sensing channels | 24 | 18 | 17

STM32F372/373 TSC Overview


- Supports up to 24 capacitive sensing channels split over 8 analog I/O groups
- 10.2 MHz maximum charge transfer frequency

Number of capacitive sensing channels

Analog I/O group | STM32F37xVx | STM32F37xRx | STM32F37xCx
G1 | 3 | 3 | 3
G2 | 3 | 3 | 2
G3 | 3 | 3 | 1
G4 | 3 | 3 | 3
G5 | 3 | 3 | 3
G6 | 3 | 2 | 2
G7 | 3 | 0 | 0
G8 | 3 | 0 | 0
Number of capacitive sensing channels | 24 | 17 | 14

TSC Block Diagram


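[Figure: TSC block diagram. fHCLK feeds a clock prescaler, followed by the pulse generator and the spread spectrum block; together with the SYNC input they drive the I/O control logic, which controls the group I/Os (G1_IO1 to G1_IO4, ..., Gx_IO1 to Gx_IO4). The group counters (TSC_IOG1CR, TSC_IOG2CR, ..., TSC_IOGxCR) store the results, and an interrupt output is provided.]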

Charge Transfer Measuring Circuit


- Rs is used to improve ESD robustness (typically 10 kΩ)
- The value of the sampling capacitor Cs depends on the required channel sensitivity
  - The higher the Cs value, the higher the sensitivity, but the longer the acquisition time

[Figure: STM32 device with G1_IO1, G1_IO2 and G1_IO3 each connected through a series resistor Rs to an electrode (Cx, ~20 pF), and G1_IO4 connected to the sampling capacitor Cs.]

Charge Transfer Acquisition Overview


- Charge transfer uses the electrical properties of the capacitor charge Q
- It uses a sampling capacitor (CS) into which the charges of the electrode (CX) are transferred
- The charge transfer is performed through analog switches directly embedded in the GPIOs
- The charge transfer cycle is repeated N times, until the voltage on the sampling capacitor reaches the VIH threshold of the GPIO it is connected to
- The number N of transfer cycles required to reach the threshold represents the size of CX
- The number of transfer cycles decreases when the electrode is touched

[Figure: voltage on the sampling capacitor rising step by step towards VIH (below VDD); each step is made of a charge cycle (electrode capacitor charging) followed by a charge transfer.]
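The acquisition principle can be checked numerically with an idealised model: on every cycle CX is charged to VDD and then shares its charge with CS, and the cycles are counted until the CS voltage reaches VIH. The sketch below is an assumption-laden illustration: the function name, the 3.3 V supply, the 2.0 V VIH and the 47 nF sampling capacitor are example values, not figures from the training; only the ~20 pF electrode capacitance appears on the slides.

```python
def transfer_cycles(cx, cs, vdd=3.3, vih=2.0, max_count=100_000):
    """Count charge-transfer cycles until the voltage on Cs reaches VIH.

    Returns None when max_count is reached first, which mirrors the
    max count error used to detect a faulty channel.
    """
    vcs = 0.0                                   # Cs is discharged before the acquisition
    for n in range(1, max_count + 1):
        # Cx is charged to VDD, then connected to Cs: charge is shared
        vcs = (cs * vcs + cx * vdd) / (cs + cx)
        if vcs >= vih:
            return n
    return None

untouched = transfer_cycles(cx=20e-12, cs=47e-9)   # electrode alone
touched = transfer_cycles(cx=25e-12, cs=47e-9)     # a finger adds capacitance
larger_cs = transfer_cycles(cx=20e-12, cs=100e-9)  # bigger sampling capacitor
```

A touch increases CX, so fewer cycles are needed, while a larger CS needs more cycles for the same electrode: finer resolution at the cost of a longer acquisition.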

Charge Transfer Acquisition Sequence


[Figure: charge transfer circuit with switches S1 to S6, the sampling capacitor Cs and the I/O register.]

S4 is closed for the whole acquisition.
S5 and S6 are open for the whole acquisition.

Steps, in sequence order:

S3 | S2 | S1 | Description
Closed | Open | Closed | Cs discharge
Open | Open | Open | Deadtime
Open | Closed | Open | Charge cycle (Cx charge)
Open | Open | Open | Deadtime
Open | Open | Closed | Transfer cycle (charge transferred to Cs)
Open | Open | Open | Deadtime
Closed | Open | Closed | Cx discharge

Repeat until Vcs is read as a logical 1.

STMTouch Touch Sensing Library


- Complete, free C source code library with firmware examples
- Multifunction capability to combine capacitive sensing functions with traditional MCU features
- Enhanced processing features for optimized sensitivity and immunity
  - Calibration, environment control system (ECS), debounce filtering, detection exclusion system (DxS)
- Complete and simple API for status reporting and application configuration
- Touchkey, proximity, linear and rotary touch sensors support
- Compliant with MISRA
- Compliant with all STM32 C compilers
- STM32F051 support planned for end Q2 2012

GPIO Analog Switch and Hysteresis Control


In addition to managing the charge transfer acquisition, the touch sensing controller provides manual control of both the embedded analog switches and the hysteresis of the GPIOs belonging to the analog I/O groups.

This can be useful to implement a different capacitive sensing acquisition principle, or for other purposes (e.g. an analog multiplexer).

More Information on Touch Sensing Solutions

For further information on touch sensing solutions from MCD:
- Visit the intranet:
  http://mcd.rou.st.com/modules.php?name=mcu&file=familiesdocs&FAM=118
- Visit the sharepoint:
  http://gnbproject7mms.gnb.st.com/mcdappli/touchsensing/default.aspx
- Attend a dedicated training (please contact Thierry GUILHOT)

Quiz
How many channels are supported by STM32F3xx microcontrollers?
____________

Could you briefly describe the charge transfer acquisition principle?
____________

What is the impact of a touch on the number of charge transfer cycles?
____________

What types of sensors are supported by the STMTouch touch sensing library?
____________

STM32F3xx Minimum External Components


- Built-in power supply supervisor reduces the need for external components
  - Filtered reset input, integrated POR/PDR circuitry, programmable voltage detector (PVD)
- Embedded 8 MHz high-speed internal (HSI) RC oscillator can be used as the main clock
- Optional main crystal drives the entire system
  - Inexpensive 4-32 MHz crystal drives the CPU, USB and all peripherals
- Optional 32.768 kHz crystal additionally needed for the RTC, which can otherwise run on the 40 kHz low-speed internal (LSI) RC oscillator
- Only a few mandatory external passive components for the base system on the LQFP100 package

STM32F30x Specific features/peripherals

Analog-to-digital converter (ADC) 5 MSPS

ADC Features (1/2)

- Up to 4 ADCs:
  - ADC1 & ADC2 are tightly coupled and can operate in dual mode (ADC1 is master)
  - ADC3 & ADC4 are tightly coupled and can operate in dual mode (ADC3 is master)
- Programmable conversion resolution: 12, 10, 8 or 6 bits
- External analog input channels for each of the 4 ADCs:
  - 5 fast channels from dedicated GPIO pads
  - Up to 11 slow channels from dedicated GPIO pads
- ADC conversion rate:
  - Fast channels: up to 5.1 Msps with 12-bit resolution in single mode
  - Slow channels: up to 4.8 Msps with 12-bit resolution in single mode
- AHB slave bus interface
- Channel-wise programmable sampling time
- Self-calibration
- Configurable regular and injected channels
- Hardware assistant to prepare the context of the injected channels, allowing fast context switching
- Can manage single-ended or differential inputs

ADC Features (2/2)

- 3 internal channels:
  - Temperature sensor Vsense, connected to ADC1
  - Internal voltage reference VREFINT, connected to all ADCs
  - VBAT/2 power supply, connected to ADC1
- Programmable sampling time
- Single, continuous and discontinuous conversion modes
- Dual ADC mode
- Left or right data alignment with inbuilt data coherency
- Software or hardware start of conversion
- 3 analog watchdogs per ADC
- DMA capability
- Auto delay insertion between conversions
- Interrupt generation

ADC Pins
Name | Signal type | Remarks
VREF+ | Input, analog reference positive | The higher/positive reference voltage for the ADC, 1.8 V ≤ VREF+ ≤ VDDA
VDDA | Input, analog supply | Analog power supply equal to VDD, and 1.8 V ≤ VDDA ≤ VDD (3.6 V)
VREF- | Input, analog reference negative | The lower/negative reference voltage for the ADC, VREF- = VSSA
VSSA | Input, analog supply ground | Ground for the analog power supply, equal to VSS
VINP[18:1] | Positive input analog channels for each ADC | Connected either to external channels (ADC_INi) or to internal channels
VINN[18:1] | Negative input analog channels for each ADC | Connected to VREF- or to external channels (ADC_INi-1)
ADCx_IN[16:1] | External analog input signals | Up to 16 analog input channels (x = ADC number = 1, 2, 3 or 4): 5 fast channels, 11 slow channels

ADC Block Diagram

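[Figure: ADC block diagram. An analog multiplexer selects the inputs VINP[18:1] / VINN[18:1] among the external channels ADC_IN[15:1] and the internal sources (VOPAMPx, VTS, VREFINT, VBAT) and feeds the sample-and-hold stage of the SAR ADC, referenced to VREF+ / VREF- and supplied by VDDA. Control bits: ADEN/ADDIS, ADCAL, ADSTP, AUTDLY. Start and stop control: regular conversions are started by a software trigger or by a hardware trigger (EXTI0 to EXTI15, selected by the EXTSEL[3:0] bits); injected conversions by a software trigger or a hardware trigger (JEXTI0 to JEXTI15, selected by the JEXTSEL[3:0] bits); the hardware triggers come from the timers. Results go to the regular data register (12 bits) and the injected data registers (4 x 12 bits), readable over the address/data bus or through a DMA request. Three analog watchdogs compare the result with the high and low threshold registers (12 bits) and drive AWD1_OUT, AWD2_OUT and AWD3_OUT. The flags ADRDY, EOSMP, EOC, EOS, OVR, JEOS, JQOVF and AWDx, gated by ADRDYIE, EOSMPIE, EOCIE, EOSIE, OVRIE, JEOSIE, JQOVFIE and AWDxIE, generate the ADC interrupt to the NVIC.]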

ADC Clocks
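[Figure: ADC clock tree.
- ADC1 & ADC2: the AHB interface is clocked by HCLK. The analog parts of ADC1 (master) and ADC2 (slave) are clocked either by ADC12_CK, delivered by the reset and clock controller through a /1 ... /256 prescaler, or by HCLK divided by /1, /2 or /4; the selection is made by the CKMODE[1:0] bits.
- ADC3 & ADC4: same scheme with ADC34_CK, ADC3 (master) and ADC4 (slave).]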

How to choose ADC Clock


ADC clock source | Benefits | Drawbacks
ADCxy_CK | Independent and asynchronous ADC clock versus AHB clock | Uncertainty of the trigger instant is added by the resynchronizations between the two clock domains
AHB div 1, 2 or 4 | Bypasses the clock domain resynchronizations: deterministic latency between the trigger event and the start of conversion | ADC clock depends on the AHB clock

Clock constraints when using injected channels:
- FHCLK >= FADC / 4 if the resolution of all channels is 12-bit or 10-bit
- FHCLK >= FADC / 3 if some channels have 8-bit resolution
- FHCLK >= FADC / 2 if some channels have 6-bit resolution
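The injected-channel clock constraints can be turned into a quick check when picking the clocks. This is a minimal sketch: `min_hclk_hz` is a hypothetical helper name, and the rule it encodes is only the three inequalities listed on the slide.

```python
def min_hclk_hz(f_adc_hz, resolutions_bits):
    """Smallest HCLK allowed for a given ADC clock when injected channels are used.

    resolutions_bits: the resolutions (12, 10, 8 or 6) used by the channels.
    """
    if 6 in resolutions_bits:
        divisor = 2          # FHCLK >= FADC / 2
    elif 8 in resolutions_bits:
        divisor = 3          # FHCLK >= FADC / 3
    else:
        divisor = 4          # FHCLK >= FADC / 4 (only 12-bit or 10-bit channels)
    return f_adc_hz / divisor

def hclk_is_valid(f_hclk_hz, f_adc_hz, resolutions_bits):
    """True when HCLK satisfies the constraint for this ADC clock."""
    return f_hclk_hz >= min_hclk_hz(f_adc_hz, resolutions_bits)

only_12_and_10 = min_hclk_hz(72e6, [12, 10])
with_8_bit = min_hclk_hz(72e6, [12, 8])
with_6_bit = min_hclk_hz(72e6, [6])
```

The lower the smallest resolution in use, the faster the AHB clock has to be relative to the ADC clock.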

ADC Deep-Power-Down Mode





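By default, the ADC is placed in deep-power-down mode, where its supply is internally switched off to reduce leakage currents.
To start ADC operations, the following sequence should be applied:

[Figure: timing diagram with the signals DEEPPWD, ADVREGEN, ADC calibration and ADC state. By software, deep-power-down mode is exited and the ADC voltage regulator is enabled (ADVREGEN); after the start-up time TADCVREG_STUP, with the ADC still OFF, the ADC calibration process is run; the ADC state returns to OFF once the calibration is done.]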

ADC Calibration

The calibration factor to be applied for single-ended input conversions is different from the factor to be applied for differential input conversions:
- If ADCALDIF=0, the calibration is applied for single-ended conversions and the value is stored in CALFACT_S
- If ADCALDIF=1, the calibration is applied for differential conversions and the value is stored in CALFACT_D

[Figure: timing diagram with ADCALDIF (0: single-ended input, 1: differential input), ADCAL, ADC state and CALFACT_x[6:0]. ADCAL is set by software while the ADC is OFF; the ADC state goes through startup and calibration and returns to OFF when hardware clears ADCAL; CALFACT_x[6:0] changes from 0x00 to the calibration factor.]

Note: The calibration factor is lost when entering Standby or VBAT mode, or when the ADC enters deep-power-down mode. In this case, it is possible to re-write the calibration factor into the ADC_CALFACT register without recalibrating.

ADC ON OFF control

- To enable the ADC: set ADEN=1, then wait until the ADRDY flag is equal to 1.
- Whatever the digital and analog clocks of the ADC are, the ADRDY signal guarantees that ADC data will be transmitted from one clock domain to the other.
- The ADC cannot be re-programmed unless it is stopped (ADSTART = 0).

[Figure: timing diagram with ADEN, ADRDY, ADDIS and ADC state. Software sets ADEN; the ADC leaves the OFF state, starts up and, after TSTAB, hardware sets ADRDY (ADC ready to convert). Software sets ADDIS to request OFF; the ADC goes through the "Req OFF" state back to OFF, and the bits are cleared by hardware.]

ADC Control bits constraints




When ADEN is equal to 0, the software is allowed to write:
- RCC control bits to configure and enable the ADC clock,
- The control bits DIFSEL in the ADC_DIFSEL register,
- The control bits ADCAL and ADEN in the ADC_CR register.

When ADEN is equal to 1 and ADDIS to 0, the software is allowed to write:
- ADSTART, JADSTART and ADDIS of the ADC_CR register,
- The ADC_JSQR register.

When ADEN=1 and ADSTART=0, the software is allowed to write:
- All control bits related to the configuration of regular conversions.

When ADEN=1 and JADSTART=0, the software is allowed to write:
- All control bits related to the configuration of injected conversions.

When ADSTART=JADSTART=1 and ADDIS=0, the software is allowed to write:
- ADSTP or JADSTP of the ADC_CR register.

Note: There is no hardware protection against forbidden write accesses, and the ADC may end up in an unknown state. To recover from this situation, the ADC must be disabled (clear all ADC_CR register bits).

ADC Channel selection


Up to 16 regular and 4 injected conversions, with programmable order and programmable sampling time.
Example:
- Conversion of channels 0, 2, 8, 4, 7, 3 and 11, in this order
- A different sampling time for each channel

[Figure: conversion sequence Ch.0, Ch.2, Ch.8, Ch.4, Ch.7, Ch.3, Ch.11, each channel drawn with its own sampling time (1.5, 4.5, 19.5, 61.5 or 181.5 cycles).]

ADC Sampling Time (TSampling)


The sampling time is programmable channel by channel through three bits, SMPx[2:0], which select the sample time in ADCCLK cycles:
- 1.5 cycles
- 2.5 cycles
- 4.5 cycles
- 7.5 cycles
- 19.5 cycles
- 61.5 cycles
- 181.5 cycles
- 601.5 cycles

Note: The sampling time to use depends on the type of channel (fast or slow), on the resolution and on the output impedance of the external signal source to be converted.

Total Conversion Time


Total conversion time = TSampling + TConversion

Resolution | TConversion
12 bits | 12.5 cycles
10 bits | 10.5 cycles
8 bits | 8.5 cycles
6 bits | 6.5 cycles

Total conversion time (when FADC = 72 MHz, with the minimum sampling time of 1.5 cycles)

Resolution | Total cycles | Time | Rate
12 bits | 12.5 + 1.5 = 14 cycles | 194.4 ns | 5.1 Msps
10 bits | 10.5 + 1.5 = 12 cycles | 166.7 ns | 6 Msps
8 bits | 8.5 + 1.5 = 10 cycles | 138.9 ns | 7.2 Msps
6 bits | 6.5 + 1.5 = 8 cycles | 111.1 ns | 9 Msps
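The figures of the total conversion time table follow directly from the formula. The sketch below recomputes them; the function name is hypothetical, and it relies on the observation, taken from the TConversion table, that the conversion phase lasts resolution + 0.5 ADC clock cycles.

```python
def total_conversion(resolution_bits, sampling_cycles=1.5, f_adc_hz=72e6):
    """Return (total cycles, time in ns, rate in Msps) for one conversion."""
    conversion_cycles = resolution_bits + 0.5        # 12.5, 10.5, 8.5 or 6.5 cycles
    cycles = sampling_cycles + conversion_cycles     # TSampling + TConversion
    time_ns = cycles / f_adc_hz * 1e9
    rate_msps = f_adc_hz / cycles / 1e6
    return cycles, time_ns, rate_msps

for bits in (12, 10, 8, 6):
    cycles, time_ns, rate_msps = total_conversion(bits)
    print(f"{bits:2d} bits: {cycles:4.1f} cycles, {time_ns:6.1f} ns, {rate_msps:4.2f} Msps")
```

Passing a longer sampling time (for a slow channel or a high-impedance source) shows how quickly the achievable rate drops below the headline 5.1 Msps.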

End of sampling
- The ADC indicates the end of the sampling phase by setting the EOSMP flag, for regular conversions only.
- The EOSMP flag is cleared by software by writing 1 to it.
- An interrupt can be generated if the EOSMPIE bit is set in the ADC_IER register.

[Figure: a conversion made of a sampling phase followed by a conversion phase; the end of channel sampling is flagged between the two.]

As soon as the sampling is completed, the next conversion can be prepared (for instance by switching I/Os) during the conversion phase.

Single-ended & Differential input channels

- Channels can be configured as either single-ended or differential inputs by writing the ADC_DIFSEL register:
  - In single-ended input mode, the analog voltage to be converted for channel i is the difference between the external voltage ADC_INi (positive input) and VREF- (negative input)
  - In differential input mode, the analog voltage to be converted for channel i is the difference between the external voltage ADC_INi (positive input) and ADC_INi+1 (negative input)

Note 1: When channel i is configured in differential input mode, channel i+1 is no longer usable in single-ended or in differential mode and must never be configured to be converted.

ADC conversion modes


[Figure: ADC conversion modes.
- Single channel, single conversion mode: Start, CHx, Stop
- Single channel, continuous conversion mode: Start, CHx converted repeatedly
- Multi-channel (scan), single conversion mode: Start, CHx ... CHn, Stop
- Multi-channel (scan), continuous conversion mode: Start, CHx ... CHn converted repeatedly
- Discontinuous conversion mode: the sequence is converted in sub-groups (CHa, CHb, CHc, ..., CHx, CHy, CHz)]

What is the queue of context

- It is a hardware assistant that prepares the context of the injected channels to allow fast context switching
- A queue of context is implemented to anticipate up to 2 contexts for the next injected sequences of conversions
- The context consists of:
  - The configuration of the injected triggers (JEXTEN[1:0] and JEXTSEL[3:0])
  - The definition of the injected sequence (JSQx[4:0] and JL[1:0])
- Context parameters are defined in the ADC_JSQR register, which implements a queue of 2 buffers

How to configure the queue of context

- The JSQR register can be written at any moment, even when injected conversions are ongoing.
- At the beginning, the queue is empty; the first write access to the JSQR register immediately changes the context, and the ADC is ready to receive injected triggers.
- Once an injected sequence is complete, the queue is consumed and the context changes according to the next JSQR parameters stored in the queue.

Queue overflow





- A queue overflow occurs when the JSQR register is written while the queue is full.
- This overflow is signaled by the assertion of the JQOVF flag.
- When an overflow occurs, the write access to the JSQR register that created the overflow is ignored and the queue of context is unchanged.
- An interrupt can be generated if the JQOVFIE bit is set.

[Figure: timing diagram with Write JSQR, JSQR queue, JQOVF, Trigger, JSQR value, ADC state and JEOS. P1 is a sequence of 3 conversions, P2 a sequence of 1 conversion, P3 a sequence of 2 conversions. Writing P1 then P2 fills the queue (EMPTY, then P1, then P1, P2); writing P3 while the queue is full is ignored and JQOVF is set by hardware (cleared by software). The first trigger converts P1 (CONV1, CONV2, CONV3), the next one converts P2 (CONV1), and the ADC returns to RDY after each sequence.]

Queue empty, JQM=0

- When the queue becomes empty:
  - If JQM=0, the queue is maintained with the last active context.

[Figure: timing diagram with Write JSQR, JSQR queue, Trigger, JSQR value and ADC state. P1 and P2 are each a sequence of 1 conversion. After P1 and P2 have been written, the first trigger converts P1 and the second converts P2. The queue does not become empty: it maintains P2 because JQM=0, so a further trigger converts P2 again.]

Queue empty, JQM=1

- When the queue becomes empty:
  - If JQM=1, the queue becomes empty and triggers are ignored.

[Figure: timing diagram with Write JSQR, JSQR queue, Trigger, JSQR value and ADC state. P1, P2 and P3 are each a sequence of 1 conversion. After P1 and P2 have been converted, the queue becomes empty and triggers are ignored because JQM=1. Once P3 is written, the next trigger converts P3, and the queue becomes empty again.]
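The queue behaviour described on the last slides (depth of 2, overflow, JQM=0 versus JQM=1) can be summarised by a small software model. This is a sketch, not the hardware: the class and method names are hypothetical, each trigger is assumed to run one complete injected sequence, and the rule that a write after the queue has drained under JQM=0 replaces the retained context is an assumption of this model.

```python
class ContextQueue:
    """Toy model of the 2-entry JSQR queue of context."""
    DEPTH = 2

    def __init__(self, jqm):
        self.jqm = jqm
        self.contexts = []      # head of the list = context used by the next trigger
        self.retained = False   # JQM=0 only: head is an already used context kept alive
        self.jqovf = False      # queue overflow flag

    def write_jsqr(self, context):
        """Model a write to JSQR; returns False when the write is ignored."""
        if self.retained:                   # assumption: the stale context is replaced
            self.contexts.clear()
            self.retained = False
        if len(self.contexts) == self.DEPTH:
            self.jqovf = True               # overflow: write ignored, queue unchanged
            return False
        self.contexts.append(context)
        return True

    def trigger(self):
        """Run one injected sequence; returns the context used, or None if ignored."""
        if not self.contexts:
            return None                     # JQM=1 and empty queue: trigger ignored
        used = self.contexts[0]
        if len(self.contexts) > 1 or self.jqm == 1:
            self.contexts.pop(0)            # sequence complete: the queue is consumed
        else:
            self.retained = True            # JQM=0: keep the last active context
        return used

q0 = ContextQueue(jqm=0)
writes = [q0.write_jsqr(p) for p in ("P1", "P2", "P3")]   # the third write overflows
runs_jqm0 = [q0.trigger() for _ in range(3)]

q1 = ContextQueue(jqm=1)
q1.write_jsqr("P1")
q1.write_jsqr("P2")
runs_jqm1 = [q1.trigger() for _ in range(3)]              # the third trigger is ignored
```

With JQM=0 the last context keeps being reused, which suits a fixed periodic injected sequence; with JQM=1 every injected sequence has to be explicitly queued by software.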

ADC Channel offset







- An offset x (x = 1, 2, 3, 4) can be applied to a channel by setting the OFFSETx_EN bit of the ADC_OFRx register.
- The channel to which the offset is applied is programmed into the OFFSETx_CH bits of the ADC_OFRx register.
- In this case, the converted value is decreased by the user-defined offset written in the OFFSETx bits.
- The result may be negative, so the read data is signed and the SEXT bit represents the extended sign value.

Right alignment (bit 15 down to bit 0):
- Offset disabled, unsigned value: 0 0 0 0 D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0
- Offset enabled, signed value: SEXT SEXT SEXT SEXT D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0

Left alignment (bit 15 down to bit 0):
- Offset disabled, unsigned value: D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0 0 0 0 0
- Offset enabled, signed value: SEXT D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0 0 0 0
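The effect of the offset on a right-aligned 12-bit result can be reproduced with plain two's-complement arithmetic, where the upper bits of the 16-bit value play the role of SEXT. The helper names below are hypothetical and the sketch covers right alignment only.

```python
def read_data(raw12, offset=0, offset_enabled=False):
    """16-bit right-aligned register content for a 12-bit raw conversion."""
    if not offset_enabled:
        return raw12 & 0x0FFF               # unsigned value, upper bits read as 0
    return (raw12 - offset) & 0xFFFF        # signed value, upper bits carry the sign

def to_signed16(value):
    """Interpret a 16-bit register content as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value

below_offset = read_data(0x100, offset=0x200, offset_enabled=True)   # 0x100 - 0x200
above_offset = read_data(0x300, offset=0x200, offset_enabled=True)   # 0x300 - 0x200
```

Here `below_offset` is 0xFF00, which `to_signed16` turns into -256: software has to treat the data as signed as soon as an offset is enabled.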

ADC Overrun management

- The overrun flag (OVR) indicates a buffer overrun event.
- An interrupt can be generated if the OVRIE bit is set in the ADC_IER register.
- The OVRMOD bit selects whether the data is preserved or overwritten when an overrun event occurs:
  - OVRMOD=0: an overrun event preserves the data register from being overwritten: the old data is maintained and the new conversion is discarded. If OVR remains at 1, further conversions can be performed but the resulting data is discarded.
  - OVRMOD=1: the data register is overwritten with the last conversion result and the previous unread data is lost. If OVR remains at 1, further conversions can be performed and the ADC_DR register always contains the data from the latest conversion.
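The two OVRMOD policies can be compared with a tiny model of the regular data register. It is only an illustrative sketch; `RegularData` and its methods are hypothetical names, and the model assumes that an overrun is detected when a conversion ends while the previous result is still unread.

```python
class RegularData:
    """Toy model of ADC_DR with the OVR flag and the OVRMOD policy."""
    def __init__(self, ovrmod):
        self.ovrmod = ovrmod
        self.dr = None
        self.unread = False     # True while the last result has not been read
        self.ovr = False

    def conversion_done(self, value):
        if self.unread:
            self.ovr = True                 # previous data was not read in time
        if self.ovr and self.ovrmod == 0:
            return                          # OVRMOD=0: new data is discarded
        self.dr = value                     # OVRMOD=1 (or no overrun): store the result
        self.unread = True

    def read(self):
        self.unread = False
        return self.dr

    def clear_ovr(self):
        self.ovr = False

keep_old = RegularData(ovrmod=0)
keep_old.conversion_done(1)
keep_old.conversion_done(2)     # overrun: 2 is discarded, DR still holds 1
first_read = keep_old.read()
keep_old.conversion_done(3)     # OVR still set: 3 is discarded as well
keep_old.clear_ovr()
keep_old.conversion_done(4)     # stored again once OVR has been cleared

keep_new = RegularData(ovrmod=1)
keep_new.conversion_done(1)
keep_new.conversion_done(2)     # overrun: DR is overwritten with 2
```

OVRMOD=0 protects the oldest unread sample, while OVRMOD=1 always exposes the freshest one.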

Auto delayed conversion (1/2)

- Auto delay mode: when AUTDLY = 1, a new conversion can start only if the previous data has been treated:
  - For regular conversions: once the ADC_DR register has been read or the EOC bit has been cleared.
    [Figure: HW/SW trigger, ADC state and EOC flag for regular channel conversions; a delay is inserted after each conversion until the data is treated.]
  - For injected conversions: when the JEOS bit has been cleared.
    [Figure: HW/SW trigger, ADC state and JEOS flag; each injected sequence (conversions 1 2 3 4) is followed by a delay until JEOS is cleared.]

Note: A trigger event (for the same group of conversions) occurring during an already ongoing sequence or during this delay is ignored.

- This is a way to automatically adapt the speed of the ADC to the speed of the system that reads the data.

Auto delayed conversion (2/2)

- No delay is inserted between conversions of different groups (a regular conversion followed by an injected conversion, or conversely):
  - If an injected trigger occurs during the automatic delay of a regular conversion, the injected conversion starts immediately.
  - Once the injected sequence is complete, the ADC waits for the delay of the previous regular conversion before launching a new regular conversion.
- In auto-injected mode (JAUTO=1), a new regular conversion can start only when the automatic delay of the previous injected sequence of conversions has ended (when JEOS has been cleared).

ADC Analog Watchdogs

- ADC analog watchdog 1
  - 12-bit programmable analog watchdog low and high thresholds
  - Enabled on one or all converted channels
  - Interrupt generation on low or high threshold detection
- ADC analog watchdogs 2 & 3
  - Enabled on selected channels by programming bits in AWDCHx[19:0]
  - Resolution limited to 8 bits: only the 8 MSBs of the thresholds can be programmed, into HTx[7:0] and LTx[7:0]

[Figure: the channels ADC_IN0, ADC_IN1, ..., ADC_IN19 feed the analog watchdog (AWD), which compares the result with the low and high thresholds and reports in the status register.]

Note: The watchdog comparison is performed on the raw converted data, before any alignment calculation and before applying any offsets.
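The difference between the 12-bit watchdog 1 and the 8-bit watchdogs 2 and 3 is easiest to see on a value close to a threshold. The sketch below assumes that the guarded window includes both thresholds and that watchdogs 2 and 3 compare the 8 MSBs of the raw data; the function names are hypothetical.

```python
def awd1_outside(raw12, low12, high12):
    """Analog watchdog 1: full 12-bit comparison on the raw converted data."""
    return raw12 < low12 or raw12 > high12

def awd23_outside(raw12, lt8, ht8):
    """Analog watchdogs 2 and 3: only the 8 MSBs of the raw data are compared."""
    msb8 = raw12 >> 4
    return msb8 < lt8 or msb8 > ht8

sample = 0x80F                                   # slightly above 0x800
fine = awd1_outside(sample, 0x400, 0x800)        # 12-bit thresholds
coarse = awd23_outside(sample, 0x40, 0x80)       # same thresholds on 8 bits
```

Watchdog 1 flags the sample, whereas the 8-bit watchdog still sees it inside the window because the 4 LSBs are not compared.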

Analog WDG signal generation

- Each analog watchdog is associated with an internal hardware signal, ADCy_AWDx_OUT, which is connected to a timer.

[Figure: timing diagram with ADC state, EOC flag, AWDx flag and ADCy_AWDx_OUT over six conversions: CONV1 inside, CONV2 outside, CONV3 inside, CONV4 outside, CONV5 outside, CONV6 inside the watchdog window.]

Note: The AWDx flag has no influence on the generation of ADCy_AWDx_OUT (e.g. ADCy_AWDx_OUT can toggle while the AWDx flag remains at 1 if the software did not clear the flag).

DMACFG bit management

- DMA can be used to manage the regular channel conversions (ADCx_DR register).
- DMA one-shot mode (DMACFG=0): the ADC stops generating DMA requests once the DMA has reached the last DMA transfer, even if a conversion has been started again.
- DMA circular mode (DMACFG=1): the ADC generates a DMA transfer request each time new conversion data is available in the data register, even if the DMA has reached the last DMA transfer.

ADC Dual mode

- ADC1 and ADC2 can be used together in dual mode (ADC1 is the master)
- ADC3 and ADC4 can be used together in dual mode (ADC3 is the master)
- Six possible modes are implemented:
  - Injected simultaneous mode
  - Regular simultaneous mode
  - Interleaved mode
  - Alternate trigger mode
  - Injected simultaneous + regular simultaneous mode
  - Regular simultaneous + alternate trigger mode

Injected simultaneous mode

- Converts an injected channel group.
- The external trigger source comes from the injected group multiplexer of the master ADC.
- A JEOC is generated at the end of all channel conversions.
- Results are stored in the injected data registers of each ADC.
- This mode can be combined with auto-delayed mode.
- Once software sets the JADSTART or JADSTP bit of the master ADC, the corresponding bit of the slave ADC is also automatically set.

[Figure: on the trigger for injected channels, ADC1 converts CH15, CH14, CH13, CH12 while ADC2 converts CH6, CH7, CH8, CH9 (sampling then conversion for each channel); the end of injected conversion is signaled on ADC1 and ADC2.]

Note: Do not convert the same channel on the two ADCs.

Regular simultaneous mode

- Converts a regular channel group.
- The external trigger source comes from the regular group multiplexer of the master ADC.
- An EOC is generated at the end of each channel conversion.
- Results are stored in the common data register ADC_CDR and in each ADCx_DR.
- This mode can be combined with auto-delayed mode.
- Once software sets the ADSTART or ADSTP bit of the master ADC, the corresponding bit of the slave ADC is also automatically set.

[Figure: on the trigger for regular channels, ADC1 converts CH15, CH14, CH13, CH12 while ADC2 converts CH6, CH7, CH8, CH9 (sampling then conversion for each channel); the end of the regular sequence conversion is signaled on ADC1 and ADC2.]

Note: Do not convert the same channel on the two ADCs.

Interleaved mode

- Converts a regular channel group (usually one channel).
- The external trigger source, which starts the conversion, comes from ADC1:
  - ADC1 starts immediately
  - ADC2 starts after a configurable delay
- An EOC is generated at the end of each channel conversion.
- Results are stored in the common data register ADC_CDR and in each ADCx_DR.
- A DMA request is generated every 2 conversions.
- This mode cannot be combined with auto-delayed mode.
- Once software sets the ADSTART or ADSTP bit of the master ADC, the corresponding bit of the slave ADC is also automatically set.

[Figure: on the trigger for regular channels, ADC1 repeatedly converts CH0 (sampling then conversion); ADC2 converts the same CH0, shifted by the delay; the end of conversion is signaled on ADC1, then on ADC2.]

Alternate trigger mode

- Converts an injected channel group.
- The external trigger source comes from the injected group multiplexer of the master ADC.
- This mode cannot be combined with auto-delayed mode.
- Once software sets the JADSTART or JADSTP bit of the master ADC, the corresponding bit of the slave ADC is also automatically set.

If discontinuous mode is enabled:
[Figure: successive triggers (1st to 8th) alternately start one injected conversion on ADC1 and on ADC2; a JEOC is generated after each conversion (JEOC on ADC1, JEOC on ADC2).]

If discontinuous mode is disabled:
[Figure: the 1st trigger converts the whole injected group of ADC1 (CH15, CH14, CH12), the 2nd trigger the whole injected group of ADC2 (CH6, CH7, CH8), and so on for the 3rd and 4th triggers; a JEOC is generated after each conversion and JEOC, JEOS at the end of each sequence.]

Regular simultaneous + Injected simultaneous

- Converts both injected and regular channel groups.
- The external trigger source comes from the master ADC.
- Results of injected channels are stored in the ADCx_JDRy registers, and results of regular channels in each ADCx_DR register and in the ADC_CDR register.
- This mode can be combined with auto-delayed mode.

[Figure: regular simultaneous mode interrupted by an injected simultaneous one. On the trigger for regular channels, ADC1 and ADC2 simultaneously convert their regular channels (CH0 to CH3), with an end of conversion on ADC1 and ADC2 after each pair. On the trigger for injected channels, the regular conversions are interrupted while ADC1 converts CH10, CH11 and ADC2 converts CH15, CH14; the end of injected conversion is signaled on ADC1 and ADC2, then the regular conversions resume.]

Note: Do not convert the same channel on the two ADCs.

Regular simultaneous + Alternate trigger

- Converts both injected and regular channel groups.
- The external trigger source comes from the master ADC.
- Results of injected channels are stored in the ADCx_JDRy registers, and results of regular channels in each ADCx_DR register and in the ADC_CDR register.
- This mode cannot be combined with auto-delayed mode.

[Figure: ADC1 and ADC2 convert their regular channels simultaneously (ADC1 reg, ADC2 reg). The 1st injected trigger starts an injected conversion on ADC1 (CH10, ADC1 inj), with an end of injected conversion on ADC1; the 2nd injected trigger starts an injected conversion on ADC2 (CH11, ADC2 inj). Regular conversions resume after each injected conversion.]

DMA requests in dual ADC mode




MDMA=0b00:
- One DMA channel should be configured for each ADC to transfer the data available in its ADCx_DR register.

MDMA=0b10:
- A single DMA request is generated each time both the master and slave EOC events have occurred.
- Used in interleaved and in regular simultaneous mode when the ADC resolution is 10 or 12 bits.
- Each DMA request (1st, 2nd, ...) transfers ADC_CDR[31:0] = SLV_ADC_DR[15:0] | MST_ADC_DR[15:0]

MDMA=0b11:
- A single DMA request is generated each time both the master and slave EOC events have occurred.
- Used in interleaved and in regular simultaneous mode when the ADC resolution is 6 or 8 bits.
- Each DMA request (1st, 2nd, ...) transfers ADC_CDR[15:0] = SLV_ADC_DR[7:0] | MST_ADC_DR[7:0]
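The way the two results share the common data register can be written down as simple bit packing. In this sketch the slave result is placed in the upper half and the master result in the lower half, following the "SLV | MST" notation of the slide; the function names are hypothetical.

```python
def pack_cdr(master, slave, mdma):
    """Build the ADC_CDR content transferred by one DMA request."""
    if mdma == 0b10:                 # 10-bit or 12-bit resolution: two half-words
        return ((slave & 0xFFFF) << 16) | (master & 0xFFFF)
    if mdma == 0b11:                 # 6-bit or 8-bit resolution: two bytes
        return ((slave & 0xFF) << 8) | (master & 0xFF)
    raise ValueError("MDMA=0b00 uses one DMA channel per ADC, not ADC_CDR packing")

def unpack_cdr(word, mdma):
    """Split a transferred ADC_CDR word back into (master, slave) results."""
    if mdma == 0b10:
        return word & 0xFFFF, (word >> 16) & 0xFFFF
    if mdma == 0b11:
        return word & 0xFF, (word >> 8) & 0xFF
    raise ValueError("MDMA=0b00 uses one DMA channel per ADC, not ADC_CDR packing")

word32 = pack_cdr(master=0x0123, slave=0x0ABC, mdma=0b10)
word16 = pack_cdr(master=0x12, slave=0xAB, mdma=0b11)
```

With MDMA=0b10 the DMA has to move 32-bit words, while MDMA=0b11 only needs 16-bit transfers, so the memory buffer is half the size for low-resolution acquisitions.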

ADC Flags and interrupts


Flag | Interrupt enable bit | Event
ADRDY | ADRDYIE | ADC ready
EOC | EOCIE | Regular end of conversion
EOS | EOSIE | Regular end of sequence
JEOC | JEOCIE | Injected end of conversion
JEOS | JEOSIE | Injected end of sequence
JQOVF | JQOVFIE | Injected context queue overflow
AWD1 | AWD1IE | Analog watchdog 1
AWD2 | AWD2IE | Analog watchdog 2
AWD3 | AWD3IE | Analog watchdog 3
EOSMP | EOSMPIE | End of sampling
OVR | OVRIE | Overrun

Each flag, gated by its interrupt enable bit, contributes to the ADC global interrupt (NVIC).

Quiz
How many ADC external input channels are in the STM32F3 microcontroller?
____________

What is the max ADC frequency?
____________

What is the queue of context?
____________

How to use DMA in single and in dual ADC modes?
____________

ADC Hands-on
- This example describes how to use ADC1 to continuously convert the potentiometer analog signal
- The converted value is displayed on the LCD of the evaluation board

Complete the missing code and run the example.
Code comments may help you!

Timers enhancements in STM32F30X

Channel-level enhancements
Up to 6 channels on Advanced control timers:
Up to 4 channels with input/output stages (as in TIM2/3/4)
Channels remain compatible with those on existing products' timers
New features:
More complex waveform generation
Enhanced triggering capability
More channel modes
Multitude of coupling scenarios between channels

2 extra channels (only on TIM1/8)


Internal channels
Not wired to GPIOs
Used within the Timer itself for complex waveform generation
Routed to the ADC triggering logic (via Timers TRGO output)

Compare-and-PWM-modes-only channels
No capture modes
No DMA channels nor Interrupt request lines

High coupling with channels 1, 2, 3 and 4
Enhanced waveform generation on those channels: more complex waveforms can be generated
Enhanced triggering mechanism: ADC-oriented triggering mechanism
Designed to meet the requirements of many Motor Control applications


Retriggerable One Pulse Mode (1/2)


(Not available in TIM16/17)

[Figure: generated waveform shape, showing TRGI, Counter and Output]

Retriggerable One Pulse Mode (2/2)


(Not available on TIM16/17)

Available on Channel 1, 2, 3 and 4


Different from the existing One Pulse mode:
The output pulse starts as soon as a trigger active edge is detected
The pulse length is extended if a new active edge is detected

Pulse length is set using the ARR register


For Up-counting mode, CCRx register has to be set to zero
For Down-counting mode, CCRx register has to be set to ARR value

Configuration sequence
Set the timer to slave mode: the Combined Reset+Trigger mode shall be used
Select the Retriggerable One Pulse mode through the OCxM[3:0] bit field
Retriggerable OPM mode 1
Retriggerable OPM mode 2


Channels Coupling (1/2)


Two coupling schemes:
Adjacent channels coupling:
Channel1 and Channel2 coupling
Channel3 and channel4 coupling

Enhanced channels coupling (feature used by Motor Control applications)


Channel5 and Channel1
Channel5 and Channel2
Channel5 and Channel3

Flexible coupling mechanism on adjacent channels


Channels coupling output can be directed to one channel or to both of them

Generated Waveforms shape


Frequency control through the TIMx_ARR register value
Phase-shift (delay) control through the TIMx_CCR register of one of the two channels
Pulse-length (duty-cycle) control through the TIMx_CCR register of the second channel


Channels Coupling (2/2)


Available PWM modes
Each channel among the first four channels can be configured in one of the
following PWM modes
Asymmetric and Combined PWM modes are applicable on coupled channels only

Mode | PWM mode 1 | PWM mode 2
Independent | OCxM[3:0] = 0b0110 | OCxM[3:0] = 0b0111
Asymmetric | OCxM[3:0] = 0b1110 | OCxM[3:0] = 0b1111
Combined | OCxM[3:0] = 0b1100 | OCxM[3:0] = 0b1101

Coupling between channels is activated in the Asymmetric and Combined modes
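The OCxM[3:0] codes for the independent, asymmetric and combined PWM modes can be turned into register bits as sketched below. This is an illustration under assumptions: the function names are hypothetical, and the placement of the bit field in TIMx_CCMR1 (OCxM[2:0] at bits 6:4 or 14:12, OCxM[3] at bit 16 or 24) is recalled from the STM32F30x register map and should be checked against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* OCxM[3:0] codes taken from the PWM mode table on this slide. */
typedef enum {
    OCM_PWM1          = 0x6,
    OCM_PWM2          = 0x7,
    OCM_COMBINED_PWM1 = 0xC,
    OCM_COMBINED_PWM2 = 0xD,
    OCM_ASYM_PWM1     = 0xE,
    OCM_ASYM_PWM2     = 0xF
} ocm_mode_t;

/* Bits to OR into TIMx_CCMR1 for channel 1 or 2 (assumed field layout). */
static uint32_t ccmr1_ocm_bits(unsigned channel, ocm_mode_t mode)
{
    assert(channel == 1u || channel == 2u);
    unsigned shift = (channel == 1u) ? 0u : 8u;
    uint32_t low  = ((uint32_t)mode & 0x7u) << (4u + shift);          /* OCxM[2:0] */
    uint32_t high = (((uint32_t)mode >> 3) & 0x1u) << (16u + shift);  /* OCxM[3]   */
    return low | high;
}

/* Asymmetric and combined codes all have the two upper bits set. */
static int ocm_is_coupled(ocm_mode_t mode)
{
    return ((unsigned)mode & 0xCu) == 0xCu;
}
```

For example, ccmr1_ocm_bits(1, OCM_ASYM_PWM1) yields 0x00010060 under the assumed layout; the fourth mode bit lands outside the legacy 3-bit field, which is why existing code using only bits 6:4 keeps working for the independent modes.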


Asymmetric PWM mode (1/3)


(Not available in TIM15/16/17)
Output waveform shape
[Figure: counter in up-counting then down-counting phases against CCR1 and CCR2, with OC1REF (PWM2), OC2REF (PWM2) and the resulting OC1REFC or OC2REFC]

Asymmetric PWM mode (2/3)

(Not available in TIM15/16/17)
Operation mechanism (1/2)
[Block diagram: OCxREF (set by OCxM[3:0]) and OCyREF (set by OCyM[3:0]) are selected according to the Counting Direction to form OCxREFC and OCyREFC, which drive the Output Control stages of TIM_CHx and TIM_CHy]

Asymmetric PWM mode (3/3)


(Not available in TIM15/16/17)
Operation mechanism (2/2)
The counting direction selects which channel output is directed to OCxREFC
The coupled channel has to be configured in the same PWM mode

Center-aligned counting mode required
Asymmetric mode is effective only when the timer is configured to count in center-aligned mode

Available on the following channel couples:


(Channel1, Channel2)
(Channel3, Channel4)

Two Asymmetric PWM modes are available


Asymmetric PWM1 mode
Asymmetric PWM2 mode


Combined PWM mode (1/5)


(Not available in TIM16/17)
Output waveform shape (Logical And)
[Figure: up-counting counter against CCR1 and CCR2, with OC1REF, OC2REF and the resulting OC1REFC or OC2REFC]

Combined PWM mode (2/5)


(Not available in TIM16/17)
Output waveform shape (Logical Or)
[Figure: up-counting counter against CCR1 and CCR2, with OC1REF, OC2REF and the resulting OC1REFC or OC2REFC]

Combined PWM mode (3/5)

(Not available in TIM16/17)
Operation mechanism
[Block diagram: OCxREF (set by OCxM[3:0]) and OCyREF (set by OCyM[3:0]) are logically combined to form OCxREFC and OCyREFC, which drive the Output Control stages of TIM_CHx and TIM_CHy]

Combined PWM mode (4/5)


(Not available in TIM16/17)
Two logical operators coupling modes:
Logical And
Logical Or

Two Combined PWM modes are available


Combined PWM1 mode
Combined PWM2 mode

Different PWM mode on each channel


In order to get the desired output, the two coupled channels have to be configured
with different PWM modes: PWM1 and PWM2
If the same PWM mode is configured on both channels, the output signal waveform
is similar to the waveform of one of the two channels, depending on the Logical Operator
applied


Combined PWM mode (5/5)


(Not available in TIM16/17)
Configuration sequence
Configure the two coupled channels in different PWM modes
Configure one channel or both coupled channels to output a logical combination of
the channels' waveforms

Counting mode independent:


Acts on Edge-aligned counting mode
Acts on Center-aligned counting mode

Available on the following channel couples:


(Channel1, Channel2)
(Channel3, Channel4)


Channels 5&6 features


(Only TIM1 & TIM8)
Channels 5&6 characteristics:
Only available on advanced control Timers: TIM1 & TIM8
Compare-and-PWM-modes-only channels
Internal channels (no external output)

Channel 5&6 use cases:


Can be used to generate more complex waveforms when combined with other
channels (applicable for Channel5 only)
Can be used to trigger ADC conversion (many triggering scenarios)

Compatible with the first four channels implementation


Same control registers (for implemented features)
Same control bit-fields structure (for implemented features)

Typical use case


Used by single-shunt current measurement applications


Enhanced Triggering mechanism


(Only TIM1 & TIM8)
Additional set of triggers dedicated for ADC
Outputted on the new (second) trigger output TRGO2
Controlled through the new bit-field MMS2[3:0]
[Figure: TRGO2 source multiplexer controlled by MMS2[3:0], with its output going to the ADC. Inputs shown: Counter Reset, Update Event, Counter Enable, CC1IF flag, OC1REF to OC6REF, and pairwise combinations of OC4REF, OC5REF and OC6REF. Outputs are marked as pulse-type or level-type]

Combined 3-phase PWM mode (1/3)


(Only TIM1 & TIM8)
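[Figure: timing diagram of the counter against ARR and the OC1 to OC6 compare values, with the resulting OC5REF, OC1REFC, OC2REFC, OC3REFC, OC4REF, OC6REF and TRGO2 signals, and the Preload and Active values of a 3-bit control field (xxx, 100, 001)]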

Combined 3-phase PWM mode (2/3)


(Only TIM1 & TIM8)
Operation mechanism
[Block diagram: Prescaler and Counter feeding the Channel1 to Channel6 Output Stages; OC1REFC, OC2REFC and OC3REFC are formed with the Channel5 output under control of the GC5C1 / GC5C2 / GC5C3 bits]

Combined 3-phase PWM mode (3/3)


(Only TIM1 & TIM8)
Waveform generation on up to three channels
Based on coupling Channel5's output with other channels
Channel1
Channel2
Channel3

Dedicated for Motor Control application


Used by ST's patented Single-shunt current reading application
Can reduce CPU load by 5-10% compared to the current implementation on the F1/F2/F4
families
Frees many MCU resources (DMA channels, Interrupt request lines)


Miscellaneous enhancements (1/2)


Repetition counter width is up to 16 bits (Only TIM1/TIM8)
Gives about 655 ms between updates for a 100 kHz PWM frequency
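The update interval quoted above follows from simple arithmetic, sketched here. The helper name is hypothetical, and the sketch assumes an update event every (RCR + 1) PWM periods with RCR at its maximum value, as in edge-aligned mode.

```c
#include <assert.h>
#include <stdint.h>

/* Longest interval between update events, in microseconds, for a
 * repetition counter of rcr_bits bits (assumes RCR+1 periods per update). */
static uint64_t max_update_interval_us(uint32_t pwm_hz, unsigned rcr_bits)
{
    assert(pwm_hz > 0u && rcr_bits <= 16u);
    uint64_t periods = (uint64_t)1 << rcr_bits;   /* RCR max + 1 */
    return periods * 1000000u / pwm_hz;
}
```

At 100 kHz a 16-bit counter gives 65536 / 100000 s = 655.36 ms, against 2.56 ms for an 8-bit repetition counter.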

Two OCxREF clearing sources:


External OCxREF clearing input: ETRF input
Internal OCxREF clearing input
Connected internally to comparator output: a pseudo-cell for Cycle-by-cycle current control

Timers synchronization enhancement


Introduction of a new synchronization mode: Combined Reset+Trigger Mode
When a trigger active edge is detected, the counter content is Reset and the counting is
started
For configuring the Retriggerable One-Pulse mode, the timer has to be configured in slave
mode: Combined Reset+Trigger mode shall be used


Miscellaneous enhancements (2/2)


Up to two break input sources (Only TIM1/8)
Break input 1 (legacy one)
Idle State programming
Has the highest priority among Break inputs
Multiplexed with internal break signals:

Clock failure event from CSS block
SRAM parity error
Comparator outputs
PVD interrupt
Cortex-M4 lockup (hard fault) output

Built with a digital filter with a flexible set of sampling periods


Asynchronous functioning (unless the filter is enabled)
Typical use case: Over-voltage protection handling

Break input 2 (new one)(only on TIM1/TIM8)

No Idle State programming


Lower priority compared to Break input 1 (legacy one)
Built with a digital filter with a flexible set of sampling periods
Typical use case: Over-current protection handling


Product-level enhancements (1/2)


Two clock sources for Advanced Control Timers (TIM1/TIM8)
APB clock
PLL output
Advanced Control Timers operate with clock frequency up to 144MHz
To reach 144MHz operation frequency the following conditions shall be fulfilled:
SYSCLK/AHB prescaler must be set to 1
AHB/APB prescaler must be set to 1

Clock source selection


Please refer to the TIMxSW (x = 1,8) bit description within the RCC_CFGR3 register
description (RCC chapter)
The TIMxSW (x = 1,8) control bit can be set/reset by software
If one of the above conditions is not fulfilled, the TIMxSW control bit is reset by
hardware


Product-level enhancements (2/2)


Encoder Mode enhancement
Two Timers can share the same Quadrature Encoder output signals
TIM2 IC1 (respectively TIM2 IC2) is connected to TIM15 IC1 (respectively TIM15 IC2)
TIM3 IC1 (respectively TIM3 IC2) is connected to TIM15 IC1 (respectively TIM15 IC2)
TIM4 IC1 (respectively TIM4 IC2) is connected to TIM15 IC1 (respectively TIM15 IC2)

Configuration
Using the ENCODER_MODE bit field within the SYSCFG_CFGR1 register (for more
details refer to the SYSCFG chapter)

Use case
Used with the M/T technique for estimating Velocity and Acceleration over a wide range of
velocity values (especially for low velocity values)


Timers Hands-on
Preliminary: The aim of the following two hands-on exercises is to become familiar with
generating Phase-Shifted Signals using the new PWM modes:
Asymmetric PWM mode
Combined PWM mode

Introduction (1/3)
Phase-Shifted signals have the following properties
Adjustable frequency: through ARR register update
Adjustable delay: through the CCxR register update
Adjustable pulse length: through the CCyR register update

[Figure: phase-shifted signal waveform shape. Period (frequency): ARR; Delay: CCxR; Pulse length: CCyR]

Introduction (2/3)
Hardware requirements
MantaEdge Eval-Board
Two-channel (or more) oscilloscope

How to set up the Hands-on?


Attach the oscilloscope channel1 probe on pin PA.08
Attach the oscilloscope channel2 probe on pin PB.08
Turn-on the oscilloscope
Power-on the STM32F30x Eval-board
Build the Project, then flash the MCU
Press the Auto-Scale button on the oscilloscope front panel
Recommended parameters:
Voltage Scale: 2 V/div
Time Scale: 200 µs/div for Hands-on1 and 100 µs/div for Hands-on2

On the oscilloscope, set the Trigger to be on the channel1 rising edge


Introduction (3/3)
What should be seen on the oscilloscope display?
Channel1: A PWM signal with 50% duty cycle. This waveform is the reference to
which the Phase-shifted signal will be compared
Channel2: The Phase-shifted PWM signal

[Figure: oscilloscope display. Channel1: reference PWM over one PWM Period; Channel2: phase-shifted PWM with Delay set by CCxR and Pulse length set by CCyR]

Hands-on1: Asymmetric PWM mode


Aim
After powering-on the STM32F30x Eval-Board, the channel2 waveform will be
different from the desired one. The goal is to obtain the desired signal waveform:
Phase-Shifted signal
Within the Timers initialization section there are two wrong parameters (they should be
replaced to get the desired waveform)

Issue solving steps (recommendation)


Carefully re-read the Asymmetric PWM mode slides
Read the comments on the firmware code
Try to find the wrong configuration (2 wrong parameters)
Replace the wrong parameters by the correct ones

After solving the issue, you may adjust the output waveform shape
Use the Potentiometer to control the Phase-shift and Pulse-length parameters
Press the key button to switch between Pulse-length and Phase-shift parameter
adjustment


Hands-on2: Combined PWM mode


Aim
After powering-on the STM32F30x Eval-Board, the channel2 waveform will be
different from the desired one. The goal is to obtain the desired signal waveform:
Phase-Shifted signal
Within the Timers initialization section there is one wrong parameter (it should be
replaced to get the desired waveform)

Issue solving steps (recommendation)


Carefully re-read the Combined PWM mode slides
Read the comments on the firmware code
Try to find the wrong configuration (1 wrong parameter)
Replace the wrong parameter by the correct one

After solving the issue, you may adjust the output waveform shape
Use the Potentiometer to control the Phase-shift and Pulse-length parameters
Press the key button to switch between Pulse-length and Phase-shift parameter
adjustment


Conclusions
To get the desired waveform using Asymmetric PWM mode
Counter should be configured in Center-aligned mode
Coupled channels shall be configured in the same PWM mode
The phase-shift using asymmetric mode cannot exceed 180°

To get the desired waveform using Combined PWM mode


The two coupled channels shall be configured in two different PWM
modes
Combined mode is not sensitive to the counting direction


Day 4:
Continue with STM32F30x Specific parts
Comparators (COMP) + Hands-on
Operational amplifiers (OPAMP) + Hands-on

STM32F37x specific parts

Analog-to-Digital Converter ADC sigma delta + Hands-on


Comparators(COMP) (Only differences vs STM32F30x comparator).
Analog-to-Digital Converter ADC 1 MSPS
CEC

STM32F30x Motor Control kit - Complete development platform with all the
hardware and software required to get STM32-based motor control applications
started quickly + STM32F30x new features/peripherals easing motor control

Comparators (COMP)

COMP features (1/2)


7 comparators COMPx, x = 1..7


Rail-to-rail inputs
Programmable speed / consumption: 4 modes
Programmable hysteresis: 4 levels
Inputs and outputs available externally - can be used as a standalone device without MCU
interaction
Comparator pairs can be combined into window comparators
Multiple choices for output redirection
Comparator blanking: the blanking time period is defined by a timer OC event (multiple timer OC
events available),
to avoid reaction of the regulation loop to the current spikes at the beginning of the PWM
period caused by the recovery current in power switches

Can be used for:


Exiting low power modes
Signal conditioning
Cycle-by-cycle current control with blanking (w/ DAC and TIM)

COMP features (2/2)

Comparator characteristics at a glance
Full operating voltage range: 2 V < VDDA < 3.6 V
Propagation time vs consumption:
High speed / full power
Medium speed / medium power
Low speed / low power
Very low speed / ultra-low power
Input offset: +/-4 mV typ, +/-20 mV max
Programmable hysteresis: 0, 8, 15, 31 mV

Fully asynchronous operation
Comparators working in STOP mode
No clock related propagation delay

Functional safety (Class B)
The comparator configuration can be locked with a write-once bit

Block diagram for STM32F30x

BKIN: PWM emergency stop input
OCRefClear: PWM clear for cycle-by-cycle current controller

Blanking function

Purpose: prevent the current regulation from tripping on short current
spikes at the beginning of the PWM period (typically the recovery
current in the power switches' anti-parallel diodes).

Quiz

How many options are there for internal threshold setting if the DAC is used by another
task?
Can the threshold go from 0 to VDDA ?
How can the lock bit be reset once activated ?

Hands-on: COMP and TIM1 break function
02/04/2012

Aim of the Hands-on

This lab illustrates the use of the COMP with the Timer 1 break
function.

F3 Alpha Training


Step1: Complete missing code in COMP_Config routine
Enable the COMP7 clock:
RCC_APB2PeriphClockCmd(----------------------, ENABLE);

Complete the COMP7 configuration


PC1 is the non-inverting input
VREFINT is the inverting input
Output connected to TIM1BKIN
No hysteresis, ultra-low power mode, output polarity non-inverted
Enable the COMP7

COMP_InitStructure.COMP_InvertingInput =--------;
COMP_InitStructure.COMP_NonInvertingInput =-------------------;
COMP_InitStructure.COMP_Hysteresis =----------------------;
COMP_InitStructure.COMP_Mode =-------------------;
COMP_InitStructure.COMP_OutputPol =-----------------;
COMP_Cmd(------------, ENABLE);


Step 2: Hardware set up


Connect TIM1 channel1 PA8 to an oscilloscope to display waveform.
While voltage applied on PC1 is lower than VREFINT (1.22V), PWM
signal is displayed on PA8.
While PC1 is higher than VREFINT, no PWM is output on PA8.
To vary the voltage applied on PC1, use the Potentiometer.


Operational Amplifier

Features (1/2)
Up to 4 operational amplifiers
Rail to Rail input/output
Low Offset voltage
Access to all terminals
Input multiplexer on inverting and non inverting inputs
Input multiplexer can be triggered by a timer and synchronized with a
PWM signal.
4 operating modes:
Standalone mode: External gain setting
Follower mode
PGA mode: internal gain setting (x2, x4, x8, x16)
PGA mode: internal gain setting (x2, x4, x8, x16) with inverting input used for
filtering.


Features (2/2)

Operating conditions
2.4 V < VDDA < 3.6 V
-40 °C < Temp < 105 °C

Input stage
Input: rail to rail
Offset: 10 mV max
Ibias < +/-1 µA max (mostly I/O leakage)

Output stage
Output: rail to rail
Iload < 500 µA (sink and source)
Capacitive load < 50 pF (stable when connected internally on ADC input)
GNDA + 100 mV < Vout < VDDA - 100 mV (Max)

Speed
GBW: 8 MHz
Slew rate: 4.5 V/µs
Unity gain stable


Standalone mode, External Gain setting


[Figure: STM32F30x block diagram showing the OpAmp and the ADC]

Follower mode
[Figure: STM32F30x OpAmp in follower mode with the ADC. Annotations: "These I/Os are available"; "Always connected to OpAmp output"]

PGA Mode, Internal Gain setting (Gain = 2 / 4 / 8 / 16)
[Figure: STM32F30x OpAmp in PGA mode with the ADC. Annotations: "These I/Os are available"; "Always connected to OpAmp output"]

PGA Mode, Internal Gain setting (Gain = 2 / 4 / 8 / 16) with Inverting input used for filtering.
[Figure: STM32F30x OpAmp in PGA mode with the ADC, shown next to its equivalent circuit. Annotation: allows optional low-pass filtering. NB: gain dependent cut-off frequency]

Timer Controlled Multiplexer mode (1/2)


This mode allows switching automatically from one inverting (or non inverting)
input to another inverting (or non inverting) input.

Benefit: useful in dual motor control with a need to measure the currents on the 3 phases on a first
motor and then on the second motor.

The automatic switch is triggered by TIM1 CC6 output arriving on the OPAMP
input multiplexers.

The Timer Controlled Multiplexer mode is enabled by setting TCM_EN bit.

If TCM_EN bit is set, inverting and non inverting input selection is done using
VPS_SEL and VMS_SEL bits.

If TCM_EN bit is reset, inverting and non inverting input selection is done using
VP_SEL and VM_SEL bits.

Timer Controlled Multiplexer mode (2/2)


[Figure: timing diagram showing the T1 and T8 counters with CCR 6, the ADC sampling points, the T1 and T8 outputs (1 out of 3), and the T1 CC6 output on the OpAmp interface (internal signal), which alternates the Op Amp configuration between "Def." and "Sec."]

OPAMP calibration

The offset of every OPAMP can be trimmed.

At startup, trimmed offset values are initialized with the preset factory trimming value

The user can switch from the factory values to the user trimmed values using the
USER_TRIM bit in the OPAMP control register.

The offset of each operational amplifier can be trimmed by programming the
TRIMOFFSETN and TRIMOFFSETP bits in the OPAMP control register.

Quiz

How many operational amplifiers are there in the STM32F30x microcontroller ?
____________

How many OPAMP operating modes are there in the STM32F30x?
____________

What is the benefit of Timer controlled multiplexed mode?
____________

Hands-on: Using OPAMP in PGA mode

Aim of the Hands-on

This lab illustrates the use of the OPAMP to amplify the DAC
output.


Step1: Complete missing code in OPAMP_Config routine
Enable the OPAMP2 clock:
RCC_APB2PeriphClockCmd(----------------------, ENABLE);

Complete the OPAMP configuration


PB0 is the non-inverting input
PGA mode is used
Gain = 2

OPAMP_InitStructure.OPAMP_NonInvertingInput = -----------;
OPAMP_InitStructure.OPAMP_InvertingInput = -----------------;
OPAMP_PGAConfig (-------------, ---------, OPAMP_PGAConnect_No) ;
OPAMP_Cmd(------------------------, ENABLE);


Step 2: Hardware set up


Connect the OPAMP2 non-inverting input (PB0) to the DAC2 output
(PA5).
Connect DAC2 output to an oscilloscope
Connect the OPAMP output (PA6) to an oscilloscope


STM32F37x Specific Features/peripherals

Sigma delta analog to digital converter (SDADC)

SDADC introduction (1/2)


Sigma delta principle inside STM32:
High precision (new applications: medical, metering, gaming)
Excellent linearity (simplifies calibration)
No sample & hold

Main properties:
3 sigma-delta ADCs in all packages (19 single ended and 10 differential inputs max.)
16-bit resolution, ENOB = 14 bits (SNR = 89dB)
Low power modes:
Slow (speed reduced 4x): up to 600uA (instead of 1200uA in run mode)
Standby: up to 200uA, wakeup time 50us
Power down: up to 10uA, wake up time 100us

Internal or external reference voltage usage


Independent power supply pins: SDADCx_VDD
Conversion rates:
Up to 50ksps in fast mode (single channel)
Up to 16.6ksps in normal mode (multiple channels)

7 programmable gains: 1/2, 1, 2, 4, 8, 16*, 32* (* = digital gains)


SDADC introduction (2/2)


Further features:
9 single ended inputs or 5 differential inputs per SDADC (or a combination)
DMA capability to transfer data to RAM (conversion when CPU in sleep mode)
Triggers:

Software
Timer
External pin
Synchronization to first SDADC (SDADC1)

Signed output data format (16-bit signed number)


Zero offset calibration
3 measuring modes per analog channel selection:
Single ended referenced to zero
Single ended offset mode
Differential mode

Interrupts and flags:


Interrupts: EOCAL, REOC, JEOC, ROVR, JOVR
Flags: STABIP, CALIBIP, RCIP, JCIP


SDADC block diagram


[Figure: block diagram showing the configuration of one SDADC]

SDADC pins
Name | Signal type | Remarks
SDADCx_VDD | Input, analog supply | Analog power supply. Must be greater than 2.4 V (or 2.2 V in Slow mode) and less than 3.6 V.
SDADCx_VSS | Input, analog supply ground | Analog ground power supply.
SDADCx_AIN[8:0]P | Analog input | Positive differential analog inputs for the 9 channels.
SDADCx_AIN[8:0]M | Analog input | Negative differential analog inputs for the 9 channels.
SD_VREF+ | Input or In/Out, positive analog reference | When the external reference is selected (REFV=00), this pin must be driven externally to a voltage between 1.1 V and SDADCxVDD (minimum for x=1..3). When an internal reference is selected (REFV is 01, 10, or 11), this pin must have an external capacitance connected to SD_VREF-.
SD_VREF- | Input, negative analog reference | This pin, when present, must be driven to the same voltage level as SDADCxVSS.

SDADC power supply and reference voltages
Power supply:
Independent power supplies:
SDADC1/2 _VDD for SDADC1 and SDADC2
SDADC3_VDD for SDADC3
SDADCx_VSS common for all SDADCs

Voltage range:
Full speed mode operation: 2.4 V to 3.6 V
Slow mode operation: 2.2 V to 3.6 V

Reference voltage selection:


Internal:
Internal bandgap voltage: 1.2V
Internal bandgap voltage amplified by 1.5x : 1.8V
VDDA power supply

External
Dedicated SDADC_VREF+, SDADC_VREF- pins
Voltage range: 1.1 V to SDADCx_VDD


SDADC clock

Clock management:
System clock divided by divider (from 2 to 48, 50% duty cycle)
Clock range:
max. 6MHz standard conversion clock
max. slow mode clock 1.5MHz reduced speed, reduced power, lower voltage operation
min. clock speed = 500kHz
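The clock limits above can be checked with a small helper before a divider is programmed. This is a host-side sketch with hypothetical function names; it only enforces the 2 to 48 divider range and the frequency limits stated on this slide, and assumes nothing about which divider values the hardware actually offers inside that range.

```c
#include <assert.h>
#include <stdint.h>

/* SDADC clock obtained from the system clock and a divider. */
static uint32_t sdadc_clock_hz(uint32_t sysclk_hz, unsigned divider)
{
    assert(divider != 0u);
    return sysclk_hz / divider;
}

/* 1 if the resulting clock respects the limits quoted on the slide:
 * 500 kHz minimum, 6 MHz maximum (1.5 MHz maximum in slow mode). */
static int sdadc_clock_valid(uint32_t sysclk_hz, unsigned divider, int slow_mode)
{
    if (divider < 2u || divider > 48u) {
        return 0;
    }
    uint32_t f = sdadc_clock_hz(sysclk_hz, divider);
    uint32_t max_hz = slow_mode ? 1500000u : 6000000u;
    return f >= 500000u && f <= max_hz;
}
```

With a 72 MHz system clock, a divider of 12 gives the 6 MHz standard conversion clock and a divider of 48 gives the 1.5 MHz slow mode clock.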


Input channel configurations


Measurement modes:
Differential mode:
Uses both SDADC analog channel inputs: SDADCx_AINxP and SDADCx_AINxM
Signed result: 0x8000 to 0x7FFF (-32768 to 32767)

Single ended modes:
Offset mode: as differential mode with the minus input internally grounded (reduced dynamic
range of SDADC, only the positive range: 0x0000 to 0x7FFF)
Referenced to zero: minus input internally grounded but an offset is injected to have full dynamic
range (zero voltage corresponds to code -32768)
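Handling the signed result in software can be sketched as follows. The function names are hypothetical; the sketch only covers interpreting the 16-bit data as a signed code and, for the zero-referenced single ended mode, shifting it so that zero input maps to 0. Scaling to volts is deliberately left out because it depends on the reference and gain.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the raw 16-bit data word as a signed code (-32768..32767). */
static int16_t sdadc_signed_code(uint16_t raw)
{
    return (raw & 0x8000u) ? (int16_t)((int32_t)raw - 65536) : (int16_t)raw;
}

/* Zero-referenced single ended mode: code -32768 means 0 V, so adding
 * 32768 gives an unsigned step count from 0 to 65535. */
static uint16_t sdadc_zero_ref_steps(uint16_t raw)
{
    return (uint16_t)((int32_t)sdadc_signed_code(raw) + 32768);
}
```

In offset mode the signed code can be used directly, keeping in mind that only the positive half of the range is reachable.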

Three SDADC configuration registers (SDADC_CONFxR, x = 0..2) => 3 possible configurations:
Each register holds a channel configuration:
Measurement mode (differential or single ended)
Gain (1/2, 1, 2, 4, 8, 16, 32)
Offset calibration value (stored here after offset calibration)
Common voltage used during offset calibration (VSSA, VDDA, VDDA/2)

Each SDADC analog channel is assigned to one configuration register


Example: 3 analog channels in application
Channel 0 uses SDADC_CONF0R
Channel 1 and channel 2 use SDADC_CONF1R (same gain and measuring mode)


Channels configuration example


Mixed configurations example of input pins connection:
CH2, CH4 and CH8 are used as differential.
CH0, CH6 and CH7 are used in single-ended mode.
REFM is connected to VSSA.
PAD 1 is not used.


Regular and injected conversions


Injected conversions
The injected group is defined as a bit field in a register: each bit corresponds to one
channel
Selected channels in the injected group are always converted sequentially (from the
lowest selected channel): scan mode
Triggers:

Software (writing 1 to the JSWSTART bit)


External pin
Timers
Synchronous with SDADC1

Regular conversions
Channel selection is defined as channel number in register
Cannot run in scan mode
Triggers:
Software (writing 1 to the RSWSTART bit)
Synchronous with SDADC1


Standard, slow, low power conversion modes

Standard mode:
Normal:
Multiplexing more channels
One conversion takes 360 cycles (16.6ksps @ 6MHz)

Fast continuous (FAST = 1):


On one channel only, in continuous mode: a regular channel or one selected injected channel
One conversion takes 120 cycles (50ksps @ 6MHz)

Slow mode (SLOWCK = 1):


Reduced power consumption (~600uA consumption), operation from 2.2V
Limited clock speed up to 1.5MHz (so 4x reduced also conversion rate)

Standby when idle (SBI = 1):


SDADC goes to standby when no conversion (~200uA consumption)
Needed time for wakeup from standby: 50us

Power down when idle (PDI = 1):


SDADC goes to power down when no conversion (~10uA consumption)
Needed time for wakeup from power down 100us
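The conversion rates quoted for the standard and slow modes follow directly from the cycle counts, as this short sketch shows. The function name is hypothetical; the 360 and 120 cycle figures are the ones given above.

```c
#include <assert.h>
#include <stdint.h>

/* Conversions per second for a given SDADC clock.
 * Normal mode: 360 cycles per conversion; fast continuous mode: 120. */
static uint32_t sdadc_rate_sps(uint32_t sdadc_clk_hz, int fast)
{
    uint32_t cycles = fast ? 120u : 360u;
    return sdadc_clk_hz / cycles;   /* truncated to whole samples */
}
```

At 6 MHz this gives 16666 sps (normal) and 50000 sps (fast); at the 1.5 MHz slow mode clock the normal rate drops to 4166 sps, a factor of four lower.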


Request precedence
Priority order of SDADC operations:
1. Calibration sequence
2. Injected conversions
3. Regular conversions

But:
Conversion which is already in progress is never interrupted by the request for
another action (current conversion is finished first)
Request is ignored if a like action is already pending or in progress
No action can start before stabilization has finished (wakeup from power down or
standby mode)


SDADC calibration
General properties for sigma delta converters:
Perfect linearity (due to 1-bit converter and oversampling)
Resolution increases with decreasing data rate
But large offset and gain error (need calibration)

Offset calibration:
Principle:
Short internally both channel inputs (positive and negative)
Perform conversion and store result to configuration register(s)
During standard conversion subtract from result the calibrated value

Implementation in STM32F37x:
Set in configuration registers:
required gain (1/2 .. 32)
common mode for calibration (VSSA, VDDA, VDDA/2)

Set how many configurations to calibrate (CALIBCNT[1:0] bits)


Start calibration by setting bit STARTCALIB
Calibration sequence then executes on given gain(s) :
Calibration values are stored into configuration registers (OFFSETx[11:0] bits)
30720 cycles (5.12 ms at 6 MHz) for one configuration register

Calibration data are automatically subtracted from each conversion data
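The calibration duration can be estimated from the cycle count given above. This is a sketch with a hypothetical function name; it assumes the total time scales linearly with the number of configurations being calibrated, which the slide implies but does not state outright.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated offset calibration time in microseconds:
 * 30720 SDADC clock cycles per configuration register (1 to 3 of them). */
static uint64_t sdadc_calibration_us(uint32_t sdadc_clk_hz, unsigned configs)
{
    assert(sdadc_clk_hz > 0u && configs >= 1u && configs <= 3u);
    return (uint64_t)configs * 30720u * 1000000u / sdadc_clk_hz;
}
```

One configuration at 6 MHz takes 5120 us, matching the 5.12 ms figure; calibrating all three would take about 15.36 ms under the linear assumption.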


Deterministic timing
Application requirements:
Launching conversion in precise intervals (e.g. FFT sampling by timer trigger)
Problem: waiting for some ongoing (regular) conversion

Solution in SDADC:
Each injected conversion starts after a delay during which no regular
conversion can be started
When bit JDS = 1 (Injected Delay Start) the start of each injected conversion is
delayed:
by 500 cycles if PDI = 0 (power down when idle)
by 600 cycles if PDI = 1, SLOWCK = 0 (because wakeup from power down takes 600 cycles)
[Figure: an injected conversion request arrives during a regular conversion; after a wait of 500 cycles the injected conversion starts]

Inputs impedances
Analog inputs impedance:
Depends on:
selected SDADC clock
analog gain (0.5 to 8)
whether a conversion is in progress

Switching capacitance character
Range (examples):
540 kΩ @ 1.5MHz, gain = 0.5
135 kΩ @ 6MHz, gain = 1
47 kΩ @ 6MHz, gain = 8

Reference voltage input impedance:
Depends only on the selected SDADC clock
Switching capacitance character
Range (6MHz to 1.5MHz): ~230 kΩ to 1000 kΩ

Quiz
How many analog input channels are in one SDADC in the
STM32F37x ?
What is the calibration result ?
What are the low power conversion modes ?
Which voltages can be used as reference voltage for SDADC ?
What is the priority regarding conversions order ?


Comparators (COMP)
F37x COMP vs F30x COMP

2 comparators (7 in STM32F30x)
A single register manages both comparators (in STM32F30x: one register per
comparator).
No mux on the non inverting input
No blanking feature

ADC 1 MSPS

ADC Features
Same as in the STM32F1 family:
12-bit, 1Msps
Triggers, self-calibration
Up to 18 input analog channels
Analog watchdog, interrupts, DMA
Programmable sampling time, Vref+ input range
Injected, regular channels, alignment
Continuous, single, scan conversion modes
Temperature sensor, Vrefint measuring
Added feature:
VBAT measuring


HDMI-CEC

HDMI-CEC v2 Controller Features


Fully compatible with HDMI-CEC v1.3a standard
Electrical specifications
Messages (Frame formats, bits timings)
Full Arbitration: Signal Free Time (SFT), Header Arbitration

32kHz kernel running from LSE or HSI/244 with wakeup from STOP
Multiple logical addresses support + listen mode
Configurable error handling with selectable extended timing tolerance
Selectable signal free time (SFT) before transmission: HW or SW
CEC line needs an external 27 kΩ pull-up and optional isolation


HDMI-CEC Controller block diagram


HDMI-CEC Interrupts
An interrupt is triggered:
if a receive block transfer completes
if a transmit block transfer completes
in case of any receive or transmit error
in case of RX or TX buffer overrun or underrun
for transmission or reception end
in case of arbitration lost


RX tolerance margins
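[Figure: nominal bit timings. Start bit: rising edge at 3.7 ms, falling edge at 4.5 ms. Data bit: rising edge at 0.6 ms or 1.5 ms, sampling point at 1.05 ms, falling edge at 2.4 ms]

RxTol bit
0b: Standard tolerance (in line with CEC specification)
Start bit: ±200 µs rise & fall. Data bit: ±200 µs rise, ±350 µs fall

1b: Extended tolerance
Start bit: ±400 µs rise & fall. Data bit: ±300 µs rise, ±500 µs fall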
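The data bit windows implied by the RxTol setting can be modelled as below. This is a simplified host-side sketch, not the controller's logic: the names are hypothetical, window edges are treated as inclusive, the tolerances are taken as microseconds around the nominal 0.6 ms / 1.5 ms rising edges and 2.4 ms falling edge, and the short/long classification is applied to the falling edge only.

```c
#include <assert.h>

typedef enum { CEC_FALL_OK, CEC_FALL_SHORT, CEC_FALL_LONG } cec_fall_t;

/* 1 if t_us lies within nominal_us +/- tol_us (inclusive). */
static int within(unsigned t_us, unsigned nominal_us, unsigned tol_us)
{
    return t_us + tol_us >= nominal_us && t_us <= nominal_us + tol_us;
}

/* Rising edge inside a data bit: accepted near 0.6 ms or 1.5 ms. */
static int cec_rising_edge_ok(unsigned t_us, int rxtol)
{
    unsigned tol = rxtol ? 300u : 200u;
    return within(t_us, 600u, tol) || within(t_us, 1500u, tol);
}

/* Falling edge ending a data bit: nominal 2.4 ms. */
static cec_fall_t cec_classify_falling(unsigned t_us, int rxtol)
{
    unsigned tol = rxtol ? 500u : 350u;
    if (t_us + tol < 2400u) {
        return CEC_FALL_SHORT;   /* bit period too short */
    }
    if (t_us > 2400u + tol) {
        return CEC_FALL_LONG;    /* bit period too long */
    }
    return CEC_FALL_OK;
}
```

A rising edge at 0.85 ms, for instance, is rejected with the standard tolerance but accepted with the extended one.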


Errors handling
The CEC specification says only:
It is the responsibility of all devices acting as followers to detect the existence of spurious
pulses on the control signal line and notify all other devices (primarily the initiator) that a
potential error has occurred.
An error is defined as a period between falling edges that is less than a minimum data bit
period (i.e. too short to be a valid bit).

Other timing errors are not considered in the CEC specification: the user defines the action
The error notification (error bit) is a low period on the CEC line of 1.4 to 1.6 times
the nominal data bit period, that is, 3.6 ms nominally:
[Figure: error bit on the CEC line: low impedance for 3.6 ms ± 0.24 ms, then high impedance]
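The error bit duration follows from the 1.4 to 1.6 factor applied to the nominal data bit period. A minimal sketch with hypothetical function names, using integer microseconds:

```c
#include <assert.h>
#include <stdint.h>

/* Shortest and longest valid error-bit low period for a given nominal
 * data bit period, using the 1.4x and 1.6x factors. */
static uint32_t cec_error_bit_min_us(uint32_t bit_us) { return bit_us * 14u / 10u; }
static uint32_t cec_error_bit_max_us(uint32_t bit_us) { return bit_us * 16u / 10u; }

/* 1 if a measured low period qualifies as an error bit. */
static int cec_is_error_bit(uint32_t low_us, uint32_t bit_us)
{
    return low_us >= cec_error_bit_min_us(bit_us) &&
           low_us <= cec_error_bit_max_us(bit_us);
}
```

For the 2.4 ms nominal data bit this gives 3.36 ms to 3.84 ms, that is 3.6 ms ± 0.24 ms.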

A message is considered lost and therefore may be retransmitted under the
following conditions:
a message is not acknowledged in a directly addressed message
a message is negatively acknowledged in a broadcast message
a low impedance is detected on the CEC line when not expected (line error)

Bit Timing Errors


BRE: Bit Rising Error
BRE is set by HW when a rising edge is detected within a data bit outside of
the Rx-windows configured by RxTol. Upon BRE detection, CEC message
reception is optionally aborted if BRESTP=1 and an Error bit is optionally generated
on the CEC line if BREGEN=1.

SBPE: Short Bit Period Error
SBPE is set by HW when a falling edge is detected that ends the data bit earlier
than allowed by the RxTol margin. Upon SBPE detection, an Error bit is always
generated on the CEC line and reception is aborted. CEC starts waiting for the next
start-bit once the CEC line is idle again.

LBPE: Long Bit Period Error
LBPE is set by HW when a rising or falling edge is detected after the
maximum RxTol margin. Upon LBPE detection, message reception is always
aborted and an Error bit is optionally generated on the CEC line if LBPEGEN=1.
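
The three flags can be pictured as a classification of edge times against the RxTol windows. The sketch below is a simplified software model under stated assumptions: names are hypothetical, times are in microseconds from the start of the data bit, window limits are inclusive, and a late edge is classified on arrival (the hardware may instead flag LBPE when the window expires).

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    CEC_EDGE_OK,
    CEC_ERR_BRE,    /* Bit Rising Error */
    CEC_ERR_SBPE,   /* Short Bit Period Error */
    CEC_ERR_LBPE    /* Long Bit Period Error */
} cec_edge_result_t;

/* Rising edge: nominal 600 us (logical 1) or 1500 us (logical 0),
   tolerance 200 us for RxTol = 0, 300 us for RxTol = 1. */
cec_edge_result_t cec_check_rising(uint32_t t_us, bool rxtol)
{
    uint32_t tol = rxtol ? 300u : 200u;
    bool in_one  = t_us >= 600u - tol  && t_us <= 600u + tol;
    bool in_zero = t_us >= 1500u - tol && t_us <= 1500u + tol;
    return (in_one || in_zero) ? CEC_EDGE_OK : CEC_ERR_BRE;
}

/* Falling edge ending the bit: nominal 2400 us,
   tolerance 350 us for RxTol = 0, 500 us for RxTol = 1. */
cec_edge_result_t cec_check_falling(uint32_t t_us, bool rxtol)
{
    uint32_t tol = rxtol ? 500u : 350u;
    if (t_us < 2400u - tol) return CEC_ERR_SBPE;
    if (t_us > 2400u + tol) return CEC_ERR_LBPE;
    return CEC_EDGE_OK;
}
```

A falling edge at 2.0 ms, for instance, is an SBPE with standard tolerance but is accepted with extended tolerance.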


Bit Timing Error detection


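Data bit nominal timing: rising edge at 0.6 ms or 1.5 ms, sampling at 1.05 ms, falling edge at 2.4 ms

Rising edge (times in ms)
RxTol=0: BRE 0.0 to 0.4 | valid 0.4 to 0.8 | BRE 0.8 to 1.3 | valid 1.3 to 1.7 | BRE after 1.7
RxTol=1: BRE 0.0 to 0.3 | valid 0.3 to 0.9 | BRE 0.9 to 1.2 | valid 1.2 to 1.8 | BRE after 1.8

Falling edge (times in ms)
RxTol=0: SBPE before 2.05 | Ok 2.05 to 2.75 | LBPE after 2.75
RxTol=1: SBPE before 1.9 | Ok 1.9 to 2.9 | LBPE after 2.9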

Bit Timing Error configured action 1/2


BREGEN: Generate Error-Bit on Bit Rising Error
0: BRE detection does not generate Error-Bit on the CEC line.
1: BRE detection generates an Error bit on the CEC line. CEC starts waiting for
next valid Start-Bit at the end of the Error-Bit transmission. SBPE and LBPE errors
are never set in case of BRE detection.
Note: can be set only if BRESTP=1

BRESTP: Stop on Bit Rising Error


0: BRE detection only sets the BRE flag. Rx-data bit is regularly sampled at
1.05ms and stored in the Rx-buffer.
1: BRE detection stops data reception. CEC starts waiting for next valid Start-Bit
immediately after BRE assertion or after Error-Bit generation if BREGEN=1. SBPE
and LBPE errors are never set in case of BRE detection
Note: when this error is detected in a broadcast message, the behavior is more
complex and controlled also by BRDNOGEN bit


Bit Timing Error configured action 2/2


LBPEGEN: Generate Error-Bit on Long Bit Period Error
0: LBPE detection does not generate Error-Bit on the CEC line.
1: LBPE detection generates an Error bit on the CEC line. CEC starts waiting for
next valid Start-Bit at the end of the Error-Bit transmission.
Note: when this error is detected in a broadcast message, the behavior is more
complex and controlled also by BRDNOGEN bit

BRDNOGEN: Do not force Error-Bit generation in broadcast messages

0: BRE or LBPE detection in a broadcast message generates an Error-Bit on the
CEC line, as if BREGEN or LBPEGEN were set.
1: BRE or LBPE detection in a broadcast message does not generate an Error-Bit on
the CEC line unless configured by the BREGEN and LBPEGEN bits.
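
One way to read the BREGEN, BRESTP, LBPEGEN and BRDNOGEN descriptions together is as a small decision function. The sketch below is a simplified interpretation of the slides, not the peripheral's exact behavior: the type and function names are hypothetical, and the broadcast case (described above as "more complex") is reduced to "forced unless BRDNOGEN=1".

```c
#include <stdbool.h>

typedef enum { CEC_BIT_ERR_BRE, CEC_BIT_ERR_SBPE, CEC_BIT_ERR_LBPE } cec_bit_err_t;

typedef struct {
    bool brestp;    /* stop reception on BRE */
    bool bregen;    /* generate Error-Bit on BRE (only valid with brestp = 1) */
    bool lbpegen;   /* generate Error-Bit on LBPE */
    bool brdnogen;  /* 1: do not force Error-Bit in broadcast messages */
} cec_err_cfg_t;

/* True when an Error-Bit is driven on the CEC line for this error. */
bool cec_generates_error_bit(cec_bit_err_t err, const cec_err_cfg_t *cfg, bool broadcast)
{
    bool forced = broadcast && !cfg->brdnogen;   /* simplified broadcast rule */
    switch (err) {
    case CEC_BIT_ERR_SBPE: return true;          /* always generated */
    case CEC_BIT_ERR_BRE:  return forced || (cfg->brestp && cfg->bregen);
    case CEC_BIT_ERR_LBPE: return forced || cfg->lbpegen;
    }
    return false;
}

/* True when reception of the current message is aborted. */
bool cec_aborts_reception(cec_bit_err_t err, const cec_err_cfg_t *cfg)
{
    /* SBPE and LBPE always abort; BRE aborts only when BRESTP = 1. */
    return err != CEC_BIT_ERR_BRE || cfg->brestp;
}
```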


Signal Free Time (SFT) configuration


Upon a transmission command, CEC starts sending the Start-Bit after a
number of nominal data bit periods of inactivity that depends on the
SFT value:
0x0: automatic HW control
3: in case of previous transmission unsuccessful
5: in case of new initiator
7: in case of previous transmission successful

0x1: 1
0x2: 1.5
0x3: 2

0xF: 7.5

Note: SFT can be set only when TXSOM=0
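
For the automatic setting (SFT = 0x0), the waiting time can be worked out from the 2.4 ms nominal data bit period. A minimal sketch, with hypothetical names, covering only the three automatic cases listed above:

```c
#include <stdint.h>

#define CEC_DATA_BIT_US 2400u   /* nominal data bit period, 2.4 ms */

typedef enum {
    CEC_PREV_TX_FAILED,   /* previous transmission unsuccessful */
    CEC_NEW_INITIATOR,    /* new initiator on the bus */
    CEC_PREV_TX_OK        /* previous transmission successful */
} cec_bus_history_t;

/* Number of nominal data bit periods of inactivity for SFT = 0x0. */
unsigned cec_auto_sft_periods(cec_bus_history_t h)
{
    switch (h) {
    case CEC_PREV_TX_FAILED: return 3u;
    case CEC_NEW_INITIATOR:  return 5u;
    case CEC_PREV_TX_OK:     return 7u;
    }
    return 7u;   /* longest wait as a conservative fallback */
}

/* Same quantity expressed in microseconds. */
uint32_t cec_auto_sft_us(cec_bus_history_t h)
{
    return cec_auto_sft_periods(h) * CEC_DATA_BIT_US;
}
```

The three cases correspond to 7.2 ms, 12 ms and 16.8 ms of bus inactivity.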

SFT

SFTOPT=0 counts SFT at TXSOM (TX Start Of Message command)
[Timing diagram: bus states BUSY (RX/TX), IDLE, SFT and TX, with TXSOM and TXEND/RXEND events]

SFTOPT=1 counts SFT at TXEND/RXEND/TXERR/RXERR
[Timing diagrams: bus states BUSY (RX/TX), SFT, IDLE and TX, with RXEND, TXEND and TXSOM events]
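
The effect of SFTOPT on transmission latency can be sketched as follows. This is a simplified model with hypothetical names; in particular, it assumes that when TXSOM is set while the bus is still busy with SFTOPT=0, counting begins once the bus becomes idle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the time (us) at which the Start-Bit begins.
   bus_idle_us: end of the previous bus activity (TXEND/RXEND/TXERR/RXERR)
   txsom_us:    time at which software sets TXSOM
   sft_us:      signal free time to respect */
uint32_t cec_tx_start_us(uint32_t bus_idle_us, uint32_t txsom_us,
                         uint32_t sft_us, bool sftopt)
{
    if (sftopt) {
        /* SFT is counted from the end of the last activity, so it may
           already have elapsed when TXSOM is set. */
        uint32_t sft_done = bus_idle_us + sft_us;
        return sft_done > txsom_us ? sft_done : txsom_us;
    }
    /* SFTOPT = 0: counting starts at TXSOM (assumed: or when the bus
       becomes idle, if TXSOM was set while it was busy). */
    uint32_t start = txsom_us > bus_idle_us ? txsom_us : bus_idle_us;
    return start + sft_us;
}
```

With the bus idle since t = 0, TXSOM at 20 ms and a 12 ms SFT, the model starts at 32 ms for SFTOPT=0 and at 20 ms for SFTOPT=1.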

STM32F37x CEC vs STM32F100 CEC


Features

F100 CEC

F37x CEC

Supports HDMI-CEC v1.3a specification

APB clock
-with PRESC frequency divider

x
x

x
no need

32 kHz CEC kernel with dual clock


- LSE
- HSI/244

x
x

TX missing acknowledge error

RX missing acknowledge error

Reception in Listen Mode

Rx Tolerance Margin
- Standard
- Extended

x
x

Arbitration (Signal Free Time)


- Standard (by HW)
- Aggressive (by SW)

x
x

Arbitration Lost Detected flag/interrupt

Automatic transmission retry supported in
case of arbitration lost

Multi-address configuration


Quiz
Which errors are handled by HDMI-CEC?
What are the possible clock sources in STM32F37x CEC?
What are the new features in STM32F37x CEC compared to the CEC
in STM32F100 devices?


STM32F30x Motor Control


Features
Gianluigi FORTE (SystemLab)

STM32F3 MC kit

Not included

Main Features
Driving Strategy: Vector Control
PMSM motor sensored and
sensorless
Two (34-pin) dedicated motor control
connectors
Encoder sensor input
Hall sensor input
Tachometer sensor input
Current sensing mode:

3 shunt resistors
Single shunt

2nd Power stage

2nd Motor

Key Component
STM32F3xx (32-bit MCU ARM M4 with motor
control dedicated IPs)
L6390D (Gate Drivers)
VIPer16LD (Power Supply down converter)
L7815ABV, L78M05CDT, LD1117S33TR (Voltage
regulators)
STGP10NC60KD (IGBT)
TS391ILT (Comparator)
M74HC14TTR (Logic)


Complementing MC starter kits


STM8/32 Evaluation boards

STM32F100x

STM8/128-EVAL

STM32F103

STM3210E-EVAL

STM32100B-EVAL
STEVAL-IHM033V1

MC connector

Please visit http://www.st.com/evalboards or contact a local ST office

STEVAL-IHM022V1


Complementing MC starter kits


STM8/32 Evaluation boards
1000W

STEVAL-IHM025V1

1KW

3 x PWM smart driver L6390


1 converter based on Viper16
7 x IGBT power switch STGP10NC60KD

1 x IGBT SLLIMM STGIPL14K60


1 converter based on Viper16
1 x IGBT STGP10NC60KD
1000W

STEVAL-IHM027V1

STEVAL-IHM021V2

1 x IGBT SLLIMM STGIPS10K60A


1 converter based on Viper16
1 x IGBT STGP10NC60KD
2000W

STEVAL-IHM028V1

3 x PWM smart driver L6390


1 converter based on Viper12
6 x MOSFET power switch STD5N52U
150W

1 x IGBT SLLIMM STGIPS20K60


1 x PWM SMPS VIPer26LD
1 x IGBT STGW35NB60SD

STEVAL-IHM032V1

3 x PWM smart driver:


2xL6392D and 1x L6391D
1 converter based on Viper12
6 x IGBT power switch: STGD3HF60HD

100W

STEVAL-IHM035V1

1 x IGBT SLLIMM STGIPN3H60


1 x PWM SMPS VIPer16L
SLLIMM (ST IPMs) based

STEVAL-IHM023V2

Gate drivers & Power Transistors based

Please visit System evaluation boards or contact a local ST office

Complementing MC starter kits


Low Voltage Power Stages
STEVAL-IHM031V1
120W

2000W

Power stage up to
3 x dual PowerMOSFETs STS8dnh3l
2 x PWM smart driver L6387E
1x step down converter L4976D

STEVAL-IEM003V1
Power stage up to 48V
3 x PWM smart driver L6388
6x LV Power MOSFET STV250N55F3
1x step down converter L4978D

Low Voltage Power Stages

Please visit System evaluation boards or contact a local ST office

Complete motor drive solutions


45w

STEVAL-IFN003V1

100w

PMSM FOC Motor Drive


1 x 32bit Microcontroller STM32F103C
1 x Motor Drive Ic L6230PD
35W

STEVAL-IFN004V1

STEVAL-IHM036V1
PMSM FOC Motor Drive

1 x 32bit Microcontroller STM32F100C6


1 x IGBT SLLIMM STGIPN3H60
1 converter based on Viper16

BLDC Six-Steps Motor Drive


1 x 8bit-Microcontroller STM8S
1 x Motor Drive Ic L6230Q
2000W

STEVAL-IHM030V1
DC Brushed Motor Drive
1 x 8bit-Microcontroller STM8S
2 x PWM smart driver L6388
4 x LV Power MOSFET STV250N55F3

Low voltage drives

Please visit System evaluation boards or contact a local ST office

High voltage drives

FOC SDK v3.x


STM32 PMSM FOC SDK v3.x:
is a Motor Control Software Development Kit
for 3-phase Permanent Magnet Synchronous Motors
(PMSM) based on Field Oriented Control (FOC)
supporting STM32F103, STM32F100, STM32F2xx,
STM32F4xx, STM32F0xx, STM32F3xx

Key features:
Single/Dual simultaneous vector control (FOC)
Any combination of current reading topologies and/or
speed/position sensors is supported
Wide range of STM32 microcontrollers families
supported
Full customization and real time communication
through PC software ST MC Workbench
Wide range of motor control algorithms implemented
for specific applications
Application example based on FreeRTOS
Increase code safety through
MISRA C 2004 rules compliance
Strict ANSI C compliance
New object oriented FW architecture (better
code encapsulation, abstraction and modularity)

STM32 Family & FOC SDK overview


Cortex-M4F
168 MHz

Cortex-M3
120 MHz
72 MHz
72 MHz

48 MHz

F100 Value line
F103 Performance Line
F103 High density

Cortex-M0

Features set, MCU support


STM32F103x HD/XL, STM32F2xx, STM32F4xx, STM32F3xx
STM32F103x LD/MD
STM32F100x, STM32F0xx
Dual FOC

1shunt

Flux
Weakening

IPMSM MTPA

3shunt

Feed Forward

Sensor-less
(STO + PLL)

Sensor-less
(STO +
Cordic)

FreeRTOS

Encoder

Hall sensors

Debug &
Tuning

ST MC
Workbench
support

USART based
com protocol
add-on

Max FOC
F100 ~11kHz
F0xx T.B.D.

F103, F2xx

ICS

Max FOC
~25kHz

Max FOC
F103 ~25kHz
F2xx T.B.D.
F4xx T.B.D.
F3xx T.B.D.

Max FOC dual


F103 ~20kHz
F2xx T.B.D.
F4xx T.B.D.
F3xx T.B.D.


MC Workbench

Motor
Power Stage

Drive
Management

Control Stage

ST Motor Control Workbench


PC software that reduces the design effort and time needed to configure the STM32 PMSM FOC
firmware library. Through a graphical user interface (GUI), the user
generates all the parameter header files that configure the library according to the
application needs.

Serial communication

RS232 (Available)
SPI (T.B.I.)
I2C (T.B.I.)

Real time communication


Using the ST MC Workbench, it is possible to open a real time
communication to send start/stop commands or to set a speed ramp.
Motor control variables (like speed PI parameters) can be debugged or fine-tuned
using the advanced tab.
Significant motor control variables, like target or measured motor speed, can be
plotted (virtual oscilloscope).

New IP & features dedicated for MC - Overview


Cortex M4 + Floating point unit
RAM on instruction bus
Embedded programmable operational amplifier for current sensing
Embedded comparators for fault management and for cycle by cycle
current regulation (6-step)
ADV Timer 4th, 5th and 6th channels used for single shunt current
reading using ST patented method
ADC context FIFO for dual three shunt current sampling in dual motor
control (ADC sharing)


Cortex M4 + Floating point unit


Execution rate: 10-20 kHz, two times for dual drive
Flexible design

MCU (binary and tool compatible):
Cortex-M0: 8/16-bit applications
Cortex-M3: 16/32-bit applications
Cortex-M4: 32-bit/DSC applications

High level approach: matrix, mathematical equations
Meta language tools: Matlab, Scilab, etc.
C code generation: floating point numbers (float)

FPU: direct mapping
No code modification
High performance
Optimal code efficiency
-10% CPU load (expected)

No FPU: usage of SW lib
No code modification
Low performance
Medium code efficiency

No FPU: usage of integer based format
Code modification
Corner case behavior to be checked (saturation, scaling)
Medium/high performance
Medium code efficiency
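
To make the "integer based format" column concrete, the sketch below contrasts a Q15 fixed-point multiply, where scaling and saturation must be handled by hand, with the float version that maps directly onto the FPU. The Q15 format and the function names are chosen for illustration only; they are not taken from the slides or from the FOC SDK.

```c
#include <stdint.h>

typedef int16_t q15_t;   /* value = raw / 32768, range [-1, 1) */

/* Q15 multiply with explicit rescaling and saturation.
   Note: right-shifting a negative value is implementation-defined in C;
   common compilers for Cortex-M use an arithmetic shift. */
q15_t q15_mul_sat(q15_t a, q15_t b)
{
    int32_t p = ((int32_t)a * (int32_t)b) >> 15;   /* rescale to Q15 */
    if (p > INT16_MAX) p = INT16_MAX;              /* corner case: (-1) * (-1) */
    if (p < INT16_MIN) p = INT16_MIN;
    return (q15_t)p;
}

/* With an FPU the same operation needs no scaling or saturation code. */
float f32_mul(float a, float b)
{
    return a * b;
}
```

The only input pair that overflows the Q15 product is -1 times -1, which is exactly the kind of corner case the slide asks to check.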


RAM on instruction bus


Concept
To execute the FOC algorithm from RAM, exploiting 0 wait states
The RAM placed on the instruction bus has been sized to store the
algorithm

SRAM on Ibus

0 WS
Maximum speed execution
For critical routines (control loops)
8Kbytes
-20% CPU
Load*

* Expected

FOC
Algorithm


Motor phase current measurement

Three shunt

Single shunt


Embedded programmable operational amplifier for current sensing
Concept
To embed the operational amplifier inside the microcontroller

Advantage
Cost reduction
Reduced temperature drift (possible compensation)
Programmable amplifier (x2, x4, ...)

[Schematic: the voltage across RShunt, shifted by an offset derived from +Vdd, feeds the embedded OP-AMP, whose output is sampled by the ADC inside the STM32F3xx]
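
The signal chain in the schematic (shunt, offset, programmable gain, ADC) implies a simple conversion from ADC code back to current. The sketch below assumes an ideal amplifier and a linear ADC; the structure, field names and any values passed to it are hypothetical, not taken from the kit.

```c
/* Hypothetical description of the current-sensing chain. */
typedef struct {
    float vref;        /* ADC reference voltage, in volts */
    unsigned adc_max;  /* full-scale ADC code, e.g. 4095 for 12 bits */
    float gain;        /* programmed amplifier gain (x2, x4, ...) */
    float v_offset;    /* amplifier output voltage at zero current, in volts */
    float r_shunt;     /* shunt resistance, in ohms */
} current_sense_t;

/* I = (V_adc - V_offset) / (gain * R_shunt), assuming an ideal amplifier. */
float shunt_current_amps(const current_sense_t *cs, unsigned adc_code)
{
    float v_adc = cs->vref * (float)adc_code / (float)cs->adc_max;
    return (v_adc - cs->v_offset) / (cs->gain * cs->r_shunt);
}
```

The offset lets the single-supply ADC see both current directions: codes below the zero-current level map to negative currents.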


Embedded comparators for fault management
Concept
To embed the comparator inside the microcontroller and connect it to the PWM
timer for the fault management (Over current management, Over voltage
management)

Advantage
Cost reduction
Smart shutdown or active brake:
On over-current, open the 6 PWM outputs
On over-voltage, close the low sides and open the high sides
Double emergency input (BRK, BRK2) with programmable digital filter and
programmable output behavior

[Schematic: two embedded comparators with internal references monitor the shunt voltage (over current) and the divided bus voltage (over voltage) and drive the BRK and BRK2 inputs of the advanced timer inside the STM32F3xx]
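
The two protective reactions can be summarized as a mapping from fault type to the set of power switches left closed. The sketch below is only a model of that mapping, with hypothetical names and bit assignments; on the device the reaction is carried out by the timer break logic, not by software.

```c
#include <stdint.h>

typedef enum { FAULT_OVER_CURRENT, FAULT_OVER_VOLTAGE } fault_t;

#define HIGH_SIDE_MASK 0x07u   /* bits 0..2: high-side switches of phases A, B, C */
#define LOW_SIDE_MASK  0x38u   /* bits 3..5: low-side switches of phases A, B, C */

/* Returns the switches left closed (bit = 1) after the fault reaction. */
uint8_t fault_closed_switches(fault_t fault)
{
    switch (fault) {
    case FAULT_OVER_VOLTAGE:
        return LOW_SIDE_MASK;   /* low sides closed, high sides open: active brake */
    case FAULT_OVER_CURRENT:
    default:
        return 0u;              /* all six outputs open: smart shutdown */
    }
}
```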


Embedded comparators for cycle by cycle current regulation (6-step)
Concept
To use the embedded operational amplifier, comparator and DAC to perform a
cycle by cycle current regulation

Advantage
Reduction of external components
Reduced temperature drift

[Schematic: the voltage across RShunt, with an offset derived from +Vdd, is amplified by the embedded OP-AMP and compared with a DAC or internal reference; the comparator output is routed to the advanced timer (ETR, OCREFCLR) to act on the PWM inside the STM32F3xx]
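
Cycle by cycle regulation can be pictured as follows: the PWM reference is set at the start of each period and is cleared either at the normal compare match or as soon as the comparator sees the current exceed the reference, staying cleared until the next period. The sketch simulates one period in software; names, units and the per-tick current samples are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulates one PWM period and returns the number of ticks the output
   stayed active. current_mA[t] is the measured current at counter tick t. */
uint32_t pwm_on_ticks(const uint16_t *current_mA, uint32_t period_ticks,
                      uint32_t compare_ticks, uint16_t limit_mA)
{
    bool ocref = true;   /* set at the start of the period */
    uint32_t on = 0u;

    for (uint32_t t = 0u; t < period_ticks; ++t) {
        if (t >= compare_ticks) {
            ocref = false;               /* normal PWM compare match */
        }
        if (current_mA[t] > limit_mA) {
            ocref = false;               /* comparator trips: reference cleared */
        }
        if (ocref) {
            ++on;                        /* output active during this tick */
        }
    }
    return on;   /* ocref is only set again at the next period */
}
```

Because the reference is never re-armed inside the loop, a current that drops back below the limit does not turn the output on again within the same period.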


Single shunt current reading




For each configuration of the switches, the current flowing in the shunt resistor can be one of
the motor phase currents.

ADVANTAGES
Single shunt requires just one sensing network
(reduced number of external components).
ST patented method to exploit the full vector plane.

Single shunt
Three shunts

Active vector insertion
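
The relation between switch configuration and shunt current can be written compactly: the DC-link current is the sum of the currents of the phases whose high-side switch is closed. The sketch below assumes phase currents counted positive into the motor, complementary switching in each leg, and ia + ib + ic = 0; the function name is hypothetical.

```c
#include <stdbool.h>

/* DC-link shunt current for a given high-side switch pattern.
   sa, sb, sc: true when the high-side switch of phase a, b, c is closed
   (the low-side switch of the same leg is then open). */
float shunt_current(bool sa, bool sb, bool sc, float ia, float ib, float ic)
{
    return (sa ? ia : 0.0f) + (sb ? ib : 0.0f) + (sc ? ic : 0.0f);
}
```

With pattern (1,0,0) the shunt carries ia; with (1,1,0) it carries ia + ib, which equals -ic; with the zero vectors (0,0,0) and (1,1,1) it carries no current, which is why phase currents can only be sampled during active vectors.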


ADV Timer 5th channel used for single shunt current reading using ST patented method
Concept
Use the 5th channel to generate the active vector insertion
[Timing diagram: counter up to ARR with compare levels OC5, OC1, OC2 and OC3; resulting OC5ref, OC1ref, OC2ref and OC3ref waveforms; update of GC5C bits]

ADV Timer 4th and 6th channels used for dual ADC triggering in single shunt current reading
Concept
Use the combination of the 4th and 6th channels to generate the dual ADC triggering for
each PWM period
[Timing diagram: counter up to ARR with compare levels OC6 and OC4; OC4ref, OC6ref and TRGO2 waveforms, with the ADC start events]

ADC context FIFO for dual three shunt current sampling in dual motor control (ADC sharing)

Concept
Two/three-shunt topologies require the simultaneous sampling of two
analog quantities. This is currently implemented using two different ADC peripherals.
Dual simultaneous motor driving (2/3 shunt topologies) can be achieved using just
two ADC peripherals if the currents of each motor are sampled at different times
(ADC sharing).
The FOC algorithm of each motor can request ADC conversions while the
previous request has not yet been performed.
The ADC context FIFO has been implemented to handle this mechanism automatically
(saving CPU load).

ADC context FIFO


FOC1 requires a conversion of channel x triggered by signal y (ADC context 1)
FOC2 requires a conversion of channel n triggered by signal m (ADC context 2), but the
ADC is already reserved, so the context is stored in the FIFO
Signal y triggers the conversion and the result is sent to FOC1. The FIFO then
programs context 2
Signal m triggers the conversion and the result is sent to FOC2.
[Diagram: FOC1 and FOC2 submit ADC context 1 and ADC context 2; context 2 waits in the FIFO while the ADC, fed by the analog channels and triggering signals, serves context 1]

For simplicity only one ADC is used in this example
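
The sequence above can be modelled in software as a small queue of (channel, trigger) pairs in which only the oldest context is armed. This is a sketch of the concept only: the type names, the queue depth and the functions are hypothetical, and the actual mechanism is implemented in the ADC hardware.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t channel;   /* analog channel to convert */
    uint8_t trigger;   /* identifier of the triggering signal */
} adc_context_t;

#define CTX_FIFO_DEPTH 4u   /* hypothetical depth */

typedef struct {
    adc_context_t slots[CTX_FIFO_DEPTH];
    unsigned head;     /* index of the active (oldest) context */
    unsigned count;    /* number of stored contexts */
} ctx_fifo_t;

/* Queues a context; returns false when the FIFO is full. */
bool ctx_push(ctx_fifo_t *f, adc_context_t c)
{
    if (f->count == CTX_FIFO_DEPTH) return false;
    f->slots[(f->head + f->count) % CTX_FIFO_DEPTH] = c;
    f->count++;
    return true;
}

/* Called when a trigger fires. If it matches the active context, the
   conversion runs on that channel and the next context becomes active. */
bool ctx_on_trigger(ctx_fifo_t *f, uint8_t trigger, uint8_t *converted_channel)
{
    if (f->count == 0u || f->slots[f->head].trigger != trigger) return false;
    *converted_channel = f->slots[f->head].channel;
    f->head = (f->head + 1u) % CTX_FIFO_DEPTH;
    f->count--;
    return true;
}
```

In the scenario of the slide, FOC1 pushes context 1, FOC2 pushes context 2, and a trigger for context 2 is ignored until context 1 has been served.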
