Zhong Mian Qian - SDE

C10a: Stochastic Differential Equations
Zhongmin QIAN
Mathematical Institute, University of Oxford
22 November 2005
Contents
1 Brownian motion
1.1 Probability space
1.2 Brownian motion
1.2.1 The scaling property
1.2.2 Markov property and finite-dimensional distributions
1.2.3 The reflection principle
1.2.4 Martingale property
1.3 Quadratic variational processes
2 Itô's calculus
2.1 Introduction
2.2 Stochastic integrals for simple processes
2.3 Stochastic integrals for adapted processes
2.3.1 The space of square-integrable martingales
2.3.2 Stochastic integrals as martingales
2.3.3 Summary of main properties
2.4 Stochastic integrals along martingales
2.5 Stopping times, local martingales
2.5.1 Technique of localization
2.5.2 Integration theory for semimartingales
2.6 Itô's formula
2.6.1 Itô's formula for BM
2.6.2 Proof of Itô's formula
2.7 Selected applications of Itô's formula
2.7.1 Lévy's characterization of Brownian motion
2.7.2 Time-changes of Brownian motion
2.7.3 Stochastic exponentials
2.7.4 Exponential inequality
2.7.5 Girsanov's theorem
2.8 The martingale representation theorem
3 Stochastic differential equations
3.1 Introduction
3.1.1 Linear-Gaussian diffusions
3.1.2 Geometric Brownian motion
3.2 Existence and uniqueness
3.2.1 Cameron-Martin formula
3.2.2 Existence and uniqueness theorem: strong solution
3.2.3 Continuity in initial conditions
3.3 Martingales and weak solutions
4 Appendix: martingales in discrete-time
4.1 Several notions
4.2 Doob's inequalities
4.3 The convergence theorem
Introduction
A stochastic differential equation is simply a differential equation perturbed
by random noise whose intensity $\sigma(t, X_t)$ depends on time $t$ and the
position $X_t$, and therefore has the following form
\[ \frac{dX_t}{dt} = A(t, X_t) + \sigma(t, X_t)\,\dot{W}_t . \]
The central limit theorem in probability theory suggests that $\dot{W}_t$ should
have a normal distribution, and for the sake of simplicity the variables
$(\dot{W}_t)_{t\ge 0}$ should be independent. Such random noise can be ideally
modelled by Brownian motion $(W_t)_{t\ge 0}$, a mathematical model describing the
random movements of pollen particles in a liquid, observed by R. Brown in 1827.
The mathematical model for Brownian motion and the description of its
distribution were derived by Albert Einstein in a short paper "On the motion of
small particles suspended in liquids at rest required by the molecular-kinetic
theory of heat", published in 1905, Annalen der Physik 17, 549-560. About the
same time, in 1900, L. Bachelier submitted his Ph.D. thesis, in which he used
Brownian motion to model stock markets. His results were published in a paper
titled "Théorie de la spéculation" in Ann. Sci. École Norm. Sup., 17 (1900),
21-86, which is the first paper devoted to applications of Brownian motion to
finance.
On the other hand, the first mathematical construction of Brownian motion was
achieved in 1923, when N. Wiener published his article "Differential space",
J. Math. Phys. 2, 132-174. Fruitful results and many unusual features of
Brownian motion were revealed mainly by Paul Lévy in the 1930s and 1940s. Among
them, P. Lévy showed that almost surely $t \mapsto W_t$ is nowhere
differentiable, and therefore the time-derivative of Brownian motion,
$\dot{W}_t$, does not exist. It is thus necessary to rewrite the previous
differential equation in differential form
\[ dX_t = A(t, X_t)\,dt + \sigma(t, X_t)\,dW_t \]
which in turn has to be interpreted as an integral equation
\[ X_t - X_0 = \int_0^t A(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dW_s . \]
We are thus required to define integrals like
\[ \int \sigma(t, X_t)\,dW_t \]
which does not exist in the ordinary sense. It was Itô in the 1940s who first
established an integration theory for Brownian motion, and therefore the
theory of stochastic differential equations. Among the manifold applications
and connections with PDE, one of the most remarkable applications of Itô's
theory is in the theory of finance. Although Itô's theory never caught the
attention of the Fields Medal committee, it was brought worldwide recognition
by the award of the 1990 Nobel Prize to H. Markowitz, W. Sharpe and M. Miller,
and of the 1997 Nobel Prize to R. Merton and M. Scholes, both in Economics.
This course covers the core part of Itô's calculus: it provides the necessary
background in stochastic analysis for those who are interested in stochastic
modelling and its applications, including the theory of finance, stochastic
control and filtering. Students who are majoring in (pure and applied)
analysis, differential geometry, functional analysis, harmonic analysis,
mathematical physics and PDEs will find this course relevant to their
interests.
References:
Below is a list of textbooks and monographs on stochastic differential
equations and related topics. Items 3 and 4 are recommended reading for this
course.
1. N. Ikeda and S. Watanabe: Stochastic Differential Equations and Diffusion Processes. North-Holland (1981).
2. I. Karatzas and S. E. Shreve: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics, Springer-Verlag (1988).
3. F. C. Klebaner: Introduction to Stochastic Calculus with Applications. Imperial College Press (1998).
4. B. Øksendal: Stochastic Differential Equations. 6th Edition, Universitext. Springer (2003).
5. D. Revuz and M. Yor: Continuous Martingales and Brownian Motion. Springer-Verlag (1991).
6. L. C. G. Rogers and D. Williams: Diffusions, Markov Processes and Martingales, Volume 2: Itô Calculus. Cambridge Mathematical Library. Cambridge University Press (2000).
7. S. E. Shreve: Stochastic Calculus for Finance II: Continuous-Time Models. Springer Finance Textbook. Springer (2004).
Web page:
http://www.maths.ox.ac.uk/qianz/ private/SDE05.htm
Chapter 1
Brownian motion
Let us first set up our working framework and introduce several notions.
1.1 Probability space

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. That is, $\Omega$
is a sample space of basic events (also called sample points), $\mathcal{F}$ is a
$\sigma$-algebra of events, and $\mathbb{P} : \mathcal{F} \to [0, 1]$ is a
probability measure, namely a function satisfying the following conditions:
1. $\mathbb{P}(\Omega) = 1$, $\mathbb{P}(\emptyset) = 0$ (where $\emptyset$ is the
empty set, representing the impossible event) and $\mathbb{P}(A) \ge 0$ for any
event $A \in \mathcal{F}$.
2. Countably additive: if $\{A_i\}_{i=1,2,\dots}$ is a countable family of
mutually disjoint events, i.e. $A_i \in \mathcal{F}$ and
$A_i \cap A_j = \emptyset$ for $i \ne j$, then
\[ \mathbb{P}\Big( \bigcup_{i=1}^{\infty} A_i \Big) = \sum_{i=1}^{\infty} \mathbb{P}(A_i) . \]
A random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ valued in
$\mathbb{R}^d$ is a measurable (vector-valued) function on
$(\Omega, \mathcal{F})$, that is, $X : \Omega \to \mathbb{R}^d$ such that for any
Borel subset $B$ of $\mathbb{R}^d$, the inverse image of $B$ under the map $X$:
\[ X^{-1}(B) = \{ \omega : X(\omega) \in B \} \]
belongs to the $\sigma$-algebra $\mathcal{F}$. Loosely speaking, a random
variable is a function $X$ on $\Omega$ for which we can talk about, for
example, the probability of the event that $X$ lies in a ball $B$:
$\{ \omega : X(\omega) \in B \}$.
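If
\[ \mathbb{E}|X| \equiv \int_{\Omega} |X(\omega)|\,\mathbb{P}(d\omega) < \infty \]
then we say $X$ is integrable, denoted by
$X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. In this case, the expectation of
$X$, denoted by $\mathbb{E}X$, is the integral of $X$ against the probability
$\mathbb{P}$:
\[ \mathbb{E}X \equiv \int_{\Omega} X(\omega)\,\mathbb{P}(d\omega) . \]
In general, we say a random variable $X$ is in the $L^p$-space, written
$X \in L^p(\Omega, \mathcal{F}, \mathbb{P})$, for $p > 0$, if
\[ \int_{\Omega} |X(\omega)|^p\,\mathbb{P}(d\omega) < \infty . \]
In this case we also say $X$ is $p$-th integrable. For $p \ge 1$, the space
$L^p(\Omega, \mathcal{F}, \mathbb{P})$ of all $p$-th integrable random variables
$X$ is a Banach space under the usual algebraic operations for functions and
the $L^p$-norm
\[ \|X\|_{L^p} \equiv \Big( \int_{\Omega} |X(\omega)|^p\,\mathbb{P}(d\omega) \Big)^{1/p} . \]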
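Remark 1.1.1 If $p \ge q$, then
$L^p(\Omega, \mathcal{F}, \mathbb{P}) \subset L^q(\Omega, \mathcal{F}, \mathbb{P})$
and $\|X\|_q \le \|X\|_p$. Therefore $p \mapsto \|X\|_p$ is increasing in
$p \in (0, \infty]$. Indeed, by a simple use of Hölder's inequality we have
\[
\begin{aligned}
\|X\|_q^q &= \int_{\Omega} |X(\omega)|^q\,\mathbb{P}(d\omega) \\
&\le \Big( \int_{\Omega} |X(\omega)|^{q \cdot \frac{p}{q}}\,\mathbb{P}(d\omega) \Big)^{q/p} \\
&= \Big( \int_{\Omega} |X(\omega)|^{p}\,\mathbb{P}(d\omega) \Big)^{q/p} .
\end{aligned}
\]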
Stochastic processes are mathematical models which are used to describe
random phenomena evolving in time. We thus need to have a set $T$ of time
parameters. In these lectures, $T$ is either the set of non-negative integers
$\mathbb{Z}_+$ or $[0, +\infty)$. $T$ is thus an ordered set endowed with the
natural topology.
Definition 1.1.2 A stochastic process is a parameterized family
$X = (X_t)_{t \in T}$ of random variables valued in a topological space $S$. In
this course, unless otherwise specified, $S$ will be the real line
$\mathbb{R}$, or the Euclidean space $\mathbb{R}^d$ of dimension $d$.
Of course a stochastic process $X = (X_t)_{t \in T}$ can be regarded as a
function from $T \times \Omega$ to $\mathbb{R}^d$, which is the reason why a
stochastic process is also called a random function.
For each sample point $\omega \in \Omega$, the function $t \mapsto X_t(\omega)$
from $T$ to $S$ is called a sample path (trajectory) or a sample function.
Naturally, a stochastic process $X = (X_t)_{t \in T}$ is continuous (resp.
right-continuous, right-continuous with left limits) if its sample paths
$t \mapsto X_t(\omega)$ are continuous (resp. right-continuous,
right-continuous with left limits) for almost all $\omega \in \Omega$.
Remark 1.1.3 A function $f : (a, b) \to \mathbb{R}^d$ is right-continuous at
$t_0 \in (a, b)$ if its right-limit at $t_0$ exists and equals $f(t_0)$.
Similarly $f$ is right-continuous with left limit at $t_0$ if $f$ is
right-continuous at $t_0$ and its left limit at $t_0$ exists. For example,
monotone functions on intervals of the real line have right- and left-limits.
Example 1.1.4 (Poisson process) Let $(\xi_n)$ be a sequence of independent
identically distributed random variables with exponential distribution of
intensity $\lambda > 0$ (these are the waiting times between jumps). Let
\[ T_0 = 0 ; \qquad T_n = \sum_{j=1}^{n} \xi_j \]
and for $t \ge 0$ define
\[ X_t = n \quad \text{if } T_n \le t < T_{n+1} . \]
Then for each sample point $\omega$, $t \mapsto X_t(\omega)$ is a step function,
constant on each interval $[T_n, T_{n+1})$, with jump 1 at $T_n$, and is
right-continuous with left limit $n - 1$ at $T_n$.
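The construction in this example can be simulated directly. The sketch below assumes the waiting times $\xi_n$ are exponentially distributed with rate $\lambda$ (the standard construction of the Poisson process); the function names and parameter values are hypothetical.

```python
import random

def poisson_jump_times(rate, horizon, rng):
    """Return the jump times T_1 < T_2 < ... that fall in [0, horizon]."""
    times = []
    t = rng.expovariate(rate)          # first waiting time xi_1
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)     # add the next waiting time
    return times

def poisson_value(times, t):
    """X_t = n if T_n <= t < T_{n+1}: count the jump times that are <= t."""
    return sum(1 for s in times if s <= t)

rng = random.Random(0)
rate, horizon = 2.0, 10.0
counts = [len(poisson_jump_times(rate, horizon, rng)) for _ in range(5000)]
mean_count = sum(counts) / len(counts)   # should be close to rate * horizon
print(mean_count)
```

Counting jump times with `s <= t` (rather than `s < t`) is what makes the simulated path right-continuous at each $T_n$.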
If $X = (X_t)_{t \ge 0}$ is a stochastic process taking values in
$\mathbb{R}^d$, and $0 \le t_1 < t_2 < \cdots < t_n$, the joint distribution of
the random variables $(X_{t_1}, \dots, X_{t_n})$, given as
\[ \mu_{t_1, t_2, \dots, t_n}(dx_1, \dots, dx_n) = \mathbb{P}\{ X_{t_1} \in dx_1, \dots, X_{t_n} \in dx_n \} , \]
which is a probability measure on the $n$-fold product
$\mathbb{R}^d \times \cdots \times \mathbb{R}^d$, is called a finite-dimensional
distribution of $X = (X_t)_{t \ge 0}$. If $d = 1$ and $(X_t)_{t \ge 0}$ is a real
stochastic process, the distribution $\mu_{t_1, t_2, \dots, t_n}$ is determined
through its distribution function
\[ F_{t_1, t_2, \dots, t_n}(x_1, \dots, x_n) = \mathbb{P}\{ X_{t_1} \le x_1, \dots, X_{t_n} \le x_n \} . \]
We need to overcome some technical difficulties when we deal with stochastic
processes in continuous time. For example, a subset of $\Omega$ like
\[ \{ \omega : X_t(\omega) \in B \text{ for all } t \in [0, 1] \} \]
(where for example $B$ is a ball in $\mathbb{R}^d$) may not be measurable, i.e.
not an event, so that
\[ \mathbb{P}\{ \omega : X_t(\omega) \in B \text{ for all } t \in [0, 1] \} \]
does not make sense unless additional conditions on $(X_t)_{t \ge 0}$ are
imposed. In particular, a function like $\sup_{t \in K} X_t$ may not be a
random variable.
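Exercise 1.1.5 Let $(X_t)_{t \ge 0}$ be a stochastic process in $\mathbb{R}^d$
on $(\Omega, \mathcal{F}, \mathbb{P})$, and let $B$ be a Borel measurable subset
of $\mathbb{R}^d$. If $F$ is a finite or countable subset of $[0, +\infty)$,
then
\[ \{ \omega : X_t(\omega) \in B \text{ for any } t \in F \} \]
and
\[ \sup_{t \in F} |X_t| \]
are measurable.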
To avoid such technical difficulties, a common assumption (which still covers
a large enough class of interesting stochastic processes) is that our process
$X$ is right-continuous almost surely, and that our working probability space
$(\Omega, \mathcal{F}, \mathbb{P})$ is complete in the sense that every subset
of an event of probability zero is itself an event.
The main task of stochastic analysis is to study the probabilistic (or
distributional) properties of random functions determined by their families
of finite-dimensional distributions.
Definition 1.1.6 Two stochastic processes $X = (X_t)_{t \ge 0}$ and
$Y = (Y_t)_{t \ge 0}$ are equivalent (each is a modification of the other) if
for every $t \ge 0$ we have
\[ \mathbb{P}\{ \omega : X_t(\omega) = Y_t(\omega) \} = 1 . \]
In this case, $(Y_t)_{t \ge 0}$ is a version of $(X_t)_{t \ge 0}$.
By definition, equivalent processes have the same family of finite-dimensional
distributions.
1.2 Brownian motion

Brownian motion is a mathematical model of random movements observed
by botanist Robert Brown.
Definition 1.2.1 A stochastic process $B = (B_t)_{t \ge 0}$ on a complete
probability space $(\Omega, \mathcal{F}, \mathbb{P})$ taking values in
$\mathbb{R}^d$ is called a standard Brownian motion (BM) in $\mathbb{R}^d$, if
1. $\mathbb{P}\{ B_0 = 0 \} = 1$.
2. $(B_t)_{t \ge 0}$ possesses independent increments: for any
$0 \le t_0 < t_1 < \cdots < t_n$ the random variables
\[ B_{t_0}, \; B_{t_1} - B_{t_0}, \; \dots, \; B_{t_n} - B_{t_{n-1}} \]
are independent.
3. For any $t > s$, the random variable $B_t - B_s$ has a normal distribution
$N(0, t - s)$, that is, $B_t - B_s$ has pdf (probability density function)
\[ p(t - s, x) = \frac{1}{(2\pi (t - s))^{d/2}} \, e^{-\frac{|x|^2}{2(t - s)}} ; \qquad x \in \mathbb{R}^d . \]
In other words
\[ \mathbb{P}\{ B_t - B_s \in dx \} = p(t - s, x)\,dx . \]
4. Almost all sample paths of $(B_t)_{t \ge 0}$ are continuous.
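Conditions 1 to 3 suggest a simple way to sample a Brownian path on a time grid: start at zero and add independent Gaussian increments. The following is a minimal sketch for $d = 1$ under that assumption; the grid, seed and sample sizes are hypothetical, and a discrete grid can only approximate the continuous path of condition 4.

```python
import math
import random

def brownian_path(t_end, n_steps, rng):
    """Sample (B_0, B_dt, ..., B_t_end) on a uniform grid with dt = t_end / n_steps."""
    dt = t_end / n_steps
    path = [0.0]                                  # condition 1: B_0 = 0
    for _ in range(n_steps):
        # conditions 2 and 3: independent N(0, dt) increments
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

rng = random.Random(1)
t_end, n_steps, n_paths = 2.0, 50, 4000
endpoints = [brownian_path(t_end, n_steps, rng)[-1] for _ in range(n_paths)]
sample_mean = sum(endpoints) / n_paths                    # should be near 0
sample_var = sum(x * x for x in endpoints) / n_paths      # should be near t_end
print(sample_mean, sample_var)
```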
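Let $p(t, x, y) = p(t, x - y)$, and define for every $t > 0$
\[ P_t f(x) = \int_{\mathbb{R}^d} f(y)\,p(t, x, y)\,dy \qquad \forall f \in C_b(\mathbb{R}^d) . \]
Since
\[ p(t + s, x, y) = \int_{\mathbb{R}^d} p(t, x, z)\,p(s, z, y)\,dz , \]
$(P_t)_{t \ge 0}$ is a semigroup on $C_b(\mathbb{R}^d)$. $(P_t)_{t \ge 0}$ is called
the heat semigroup in $\mathbb{R}^d$: if $f \in C_b^2(\mathbb{R}^d)$, then
$u(t, x) = (P_t f)(x)$ solves the heat equation
\[ \Big( \frac{1}{2}\Delta - \frac{\partial}{\partial t} \Big) u(t, x) = 0 ; \qquad u(0, \cdot) = f , \]
where $\Delta = \sum_i \frac{\partial^2}{\partial x_i^2}$ is the Laplace operator.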
The connection between Brownian motion and the Laplace operator (hence
harmonic analysis) is demonstrated through the following identity:
\[
\begin{aligned}
(P_t f)(x) &= \mathbb{E}\big( f(B_t + x) \big) \\
&= \frac{1}{(2\pi t)^{d/2}} \int_{\mathbb{R}^d} f(y)\, e^{-\frac{|y - x|^2}{2t}}\,dy .
\end{aligned}
\]
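The identity $(P_t f)(x) = \mathbb{E} f(B_t + x)$ lends itself to a Monte Carlo check. The sketch below takes $d = 1$ and the test function $f = \cos$, which is an assumption made here only because the Gaussian integral is then known in closed form, $P_t \cos(x) = e^{-t/2}\cos x$.

```python
import math
import random

def heat_semigroup_mc(f, t, x, n_samples, rng):
    """Monte Carlo estimate of (P_t f)(x) = E f(B_t + x), using B_t ~ N(0, t) in R."""
    sd = math.sqrt(t)
    return sum(f(x + rng.gauss(0.0, sd)) for _ in range(n_samples)) / n_samples

rng = random.Random(3)
t, x = 0.8, 0.3
estimate = heat_semigroup_mc(math.cos, t, x, 100_000, rng)
closed_form = math.exp(-t / 2.0) * math.cos(x)   # P_t cos(x) = e^{-t/2} cos(x)
print(estimate, closed_form)
```

For this choice $u(t, x) = e^{-t/2}\cos x$ satisfies $\partial u / \partial t = \frac{1}{2}\,\partial^2 u / \partial x^2$, in line with the heat equation for $P_t f$.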
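Example 1.2.2 If $B = (B_t)_{t \ge 0}$ is a standard BM in $\mathbb{R}$, then
\[ \mathbb{E}|B_t - B_s|^p = c_p |t - s|^{p/2} \quad \text{for all } s, t \ge 0 \tag{1.1} \]
for $p \ge 0$, where $c_p$ is a constant depending only on $p$. Indeed
\[ \mathbb{E}|B_t - B_s|^p = \frac{1}{\sqrt{2\pi |t - s|}} \int_{\mathbb{R}} |x|^p \exp\Big( -\frac{|x|^2}{2|t - s|} \Big)\,dx . \]
Making the change of variable
\[ \frac{x}{\sqrt{|t - s|}} = y ; \qquad dx = \sqrt{|t - s|}\,dy \]
we thus have
\[
\begin{aligned}
\mathbb{E}|B_t - B_s|^p &= \frac{\big( \sqrt{|t - s|} \big)^p}{\sqrt{2\pi}} \int_{\mathbb{R}} |y|^p \exp\Big( -\frac{|y|^2}{2} \Big)\,dy \\
&= c_p |t - s|^{p/2}
\end{aligned}
\]
where
\[ c_p = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} |x|^p \exp\Big( -\frac{|x|^2}{2} \Big)\,dx . \]
(1.1) remains true for BM in $\mathbb{R}^d$ with a constant $c_p$ depending on $p$
and $d$.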
Remark 1.2.3 Since $B_t - B_s \sim N(0, t - s)$, it is an easy exercise to show
that for every $n \in \mathbb{Z}_+$
\[ \mathbb{E}(B_t - B_s)^{2n} = \frac{(2n)!}{2^n\, n!}\,|t - s|^n . \]
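The even-moment formula can be compared with a direct numerical evaluation of the Gaussian integral. This is a sketch only: the midpoint rule, the truncation at 12 standard deviations and the grid size are choices made here for illustration.

```python
import math

def even_moment_formula(n, dt):
    """(2n)! / (2^n n!) * dt^n, the claimed value of E (B_t - B_s)^(2n), dt = |t - s|."""
    return math.factorial(2 * n) / (2 ** n * math.factorial(n)) * dt ** n

def even_moment_quadrature(n, dt, half_width=12.0, n_grid=20_000):
    """Midpoint-rule approximation of the integral of x^(2n) against the N(0, dt) density."""
    sd = math.sqrt(dt)
    a = -half_width * sd                 # left end of the truncated integration range
    h = 2.0 * half_width * sd / n_grid   # grid spacing
    total = 0.0
    for k in range(n_grid):
        x = a + (k + 0.5) * h
        total += x ** (2 * n) * math.exp(-x * x / (2.0 * dt))
    return total * h / math.sqrt(2.0 * math.pi * dt)

for n in (1, 2, 3, 4):
    print(n, even_moment_formula(n, 0.7), even_moment_quadrature(n, 0.7))
```

For $n = 2$ the formula gives $3|t - s|^2$, the fourth moment used later in the proof of Lemma 1.3.1.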
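Example 1.2.4 Let $B = (B_t)_{t \ge 0}$ be a standard BM in $\mathbb{R}$. Then
$B$ is a centered Gaussian process with covariance function
$C(s, t) = s \wedge t$. Indeed, it is almost obvious that any finite-dimensional
distribution of $B$ is Gaussian, so $B$ is a centered Gaussian process, and its
covariance function (if $s < t$) is
\[
\begin{aligned}
\mathbb{E}(B_t B_s) &= \mathbb{E}\big( (B_t - B_s) B_s + B_s^2 \big) \\
&= \mathbb{E}\big( (B_t - B_s) B_s \big) + \mathbb{E} B_s^2 \\
&= \mathbb{E}(B_t - B_s)\,\mathbb{E} B_s + \mathbb{E} B_s^2 \\
&= s .
\end{aligned}
\]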
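The covariance $\mathbb{E}(B_s B_t) = s \wedge t$ can also be observed in simulation. The sketch below samples the pair $(B_s, B_t)$ from independent increments; the times and the sample size are hypothetical.

```python
import math
import random

def sample_pair(s, t, rng):
    """Sample (B_s, B_t) for 0 < s < t using independent Gaussian increments."""
    b_s = rng.gauss(0.0, math.sqrt(s))
    b_t = b_s + rng.gauss(0.0, math.sqrt(t - s))
    return b_s, b_t

rng = random.Random(2)
s, t, n_samples = 0.5, 1.5, 100_000
total = 0.0
for _ in range(n_samples):
    b_s, b_t = sample_pair(s, t, rng)
    total += b_s * b_t
cov_estimate = total / n_samples     # should be close to min(s, t) = 0.5
print(cov_estimate)
```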
Theorem 1.2.5 (Wiener) There is a standard Brownian motion in $\mathbb{R}^d$.
Proof. We may assume that $d = 1$; the proof in higher dimensions is similar.
Observe that a BM $(B_t)$ must be a Gaussian process (i.e. a process whose
finite-dimensional distributions are Gaussian distributions) with mean zero
and covariance function $\mathbb{E}(B_t B_s) = s \wedge t$. Therefore we may
first construct a Gaussian process $(X_t)$ such that $\mathbb{E} X_t = 0$ and
$\mathbb{E}(X_t X_s) = s \wedge t$ on some complete probability space
$(\Omega, \mathcal{F}, \mathbb{P})$. It can be verified that $(X_t)_{t \ge 0}$
satisfies all conditions in the definition of BM, except the continuity of
its sample paths. The Gaussian process $(X_t)$ may not be continuous, and we
thus need to modify the construction of $X_t$ to make it continuous. Let
$D = \{ \frac{j}{2^n} : j \in \mathbb{Z}_+, \, n \in \mathbb{N} \}$ be the set of
dyadic real numbers. The important fact we need to use is that $D$ is dense in
$\mathbb{R}_+$. Define
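\[ H = \bigcup_{N=1}^{\infty} \bigcap_{l=1}^{\infty} \bigcup_{n=l}^{\infty} \bigcup_{j=1}^{N 2^n} \Big\{ \big| X_{\frac{j}{2^n}} - X_{\frac{j-1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} . \]
Let, for fixed $N$,
\[ A_l = \bigcup_{n=l}^{\infty} \bigcup_{j=1}^{N 2^n} \Big\{ \big| X_{\frac{j}{2^n}} - X_{\frac{j-1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} . \]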
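We are going to show that each $\bigcap_{l=1}^{\infty} A_l$ has probability
zero, and therefore, as a union of countably many events with probability
zero, $\mathbb{P}(H) = 0$. By Markov's inequality applied to the fourth moment,
\[
\begin{aligned}
\mathbb{P}\Big( \bigcup_{j=1}^{N 2^n} \Big\{ \big| X_{\frac{j}{2^n}} - X_{\frac{j-1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} \Big)
&\le \sum_{j=1}^{N 2^n} \mathbb{P}\Big\{ \big| X_{\frac{j}{2^n}} - X_{\frac{j-1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} \\
&= N 2^n \, \mathbb{P}\Big\{ \big| X_{\frac{1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} \\
&\le N 2^n \big( 2^{n/8} \big)^4 \, \mathbb{E}\big| X_{\frac{1}{2^n}} \big|^4 \\
&= \big( 2^{n/8} \big)^4 N 2^n \cdot 3 \Big( \frac{1}{2^n} \Big)^2 \\
&= 3N \frac{1}{2^{n/2}}
\end{aligned}
\]
so that
\[
\begin{aligned}
\mathbb{P}(A_l) &\le \sum_{n=l}^{\infty} \mathbb{P}\Big( \bigcup_{j=1}^{N 2^n} \Big\{ \big| X_{\frac{j}{2^n}} - X_{\frac{j-1}{2^n}} \big| \ge \frac{1}{2^{n/8}} \Big\} \Big) \\
&\le \sum_{n=l}^{\infty} 3N \frac{1}{2^{n/2}} \\
&= \frac{3N \sqrt{2}}{\sqrt{2} - 1} \, \frac{1}{(\sqrt{2})^{l}} .
\end{aligned}
\]
Therefore, since the events $A_l$ decrease as $l$ increases,
\[
\begin{aligned}
\mathbb{P}\Big( \bigcap_{l=1}^{\infty} A_l \Big) &= \lim_{l \to \infty} \mathbb{P}(A_l) \\
&\le \frac{3N \sqrt{2}}{\sqrt{2} - 1} \lim_{l \to \infty} \frac{1}{(\sqrt{2})^{l}} \\
&= 0 .
\end{aligned}
\]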
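It follows that $\mathbb{P}(H) = 0$, thus $\mathbb{P}(H^c) = 1$. On the other
hand, by De Morgan's law
\[ H^c = \bigcap_{N=1}^{\infty} \bigcup_{l=1}^{\infty} \bigcap_{n=l}^{\infty} \bigcap_{j=1}^{N 2^n} \Big\{ \omega : \big| X_{\frac{j}{2^n}}(\omega) - X_{\frac{j-1}{2^n}}(\omega) \big| < \frac{1}{2^{n/8}} \Big\} \]
and thus, if $\omega \in H^c$, then for any $N$ there is an $l$ such that for
any $n \ge l$ and for all $j = 1, \dots, N 2^n$ we have
\[ \big| X_{\frac{j}{2^n}}(\omega) - X_{\frac{j-1}{2^n}}(\omega) \big| < \frac{1}{2^{n/8}} . \]
We may thus show that for any $\omega \in H^c$ and $t \ge 0$ the limit of
$X_s(\omega)$ exists as $s \to t$ along the dyadic numbers, i.e. as $s \to t$
and $s \in D$. Moreover, $D$ is dense in $[0, \infty)$, thus for any
$t \in [0, \infty)$ we may define
\[ B_t(\omega) = \lim_{s \in D, \, s \to t} X_s(\omega) \quad \text{if } \omega \in H^c ; \]
otherwise, if $\omega \in H$, we set $B_t(\omega) = 0$. By definition,
$(B_t)_{t \ge 0}$ is a continuous process which coincides with $X_t$ on $H^c$
when $t \in D$. It remains to verify that $(B_t)_{t \ge 0}$ is a Brownian motion
in $\mathbb{R}$, which is left as an exercise.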
1.2.1 The scaling property

Let $B = (B_t)_{t \ge 0}$ be a standard BM in $\mathbb{R}^d$. By definition, the
increments of the BM $B = (B_t)_{t \ge 0}$ are stationary, so that for any fixed
time $S$, $\tilde{B}_t = B_{t+S} - B_S$ is again a standard Brownian motion. This
statement is indeed true for any finite stopping time $S$, see section 1.2.3.
Lemma 1.2.6 (Scaling invariance, self-similarity) For any real number
$\lambda \ne 0$,
\[ M_t \equiv \lambda B_{t/\lambda^2} \]
is a standard BM in $\mathbb{R}^d$.
This statement follows directly from the definition of BM. In particular,
$(-B_t)_{t \ge 0}$ is also a standard BM, so that $(B_t)_{t \ge 0}$ and
$(-B_t)_{t \ge 0}$ have the same distribution.
Lemma 1.2.7 If $U$ is a $d \times d$ orthogonal matrix, then
$UB = (U B_t)_{t \ge 0}$ is a standard BM in $\mathbb{R}^d$. That is, BM is
invariant under the action of the orthogonal group of $\mathbb{R}^d$.
This lemma is an easy corollary of the invariance of Gaussian distributions
under the orthogonal group action.
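Lemma 1.2.8 Let $B = (B_t)_{t \ge 0}$ be a standard BM in $\mathbb{R}$, and
define
\[ M_0 = 0 , \qquad M_t = t B_{1/t} \quad \text{for } t > 0 . \]
Then $(M_t)_{t \ge 0}$ is a standard BM in $\mathbb{R}$.
Proof. Obviously $M_t$ possesses a normal distribution with mean zero, and
\[
\begin{aligned}
\mathbb{E}(M_t M_s) &= ts\,\mathbb{E}\big( B_{1/t} B_{1/s} \big) \\
&= ts \Big( \frac{1}{t} \wedge \frac{1}{s} \Big) = s \wedge t
\end{aligned}
\]
so that $(M_t)$ is a centered Gaussian process with covariance function
$s \wedge t$. Moreover $t \mapsto M_t$ is continuous for $t > 0$. To see the
continuity of $M_t$ at $t = 0$, we use the fact that
\[ \lim_{t \to \infty} \frac{B_t}{t} = 0 \]
which is the law of large numbers for BM. We will not prove this here, but
see the remark below.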
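Remark 1.2.9 To convince yourself that the law of large numbers for BM is
true, we may look at a special way of letting $t \to \infty$, through the
natural numbers, namely
\[ \lim_{n \to \infty} \frac{B_n}{n} = \lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} \]
where $X_i = B_i - B_{i-1}$. Notice that $(X_i)$ is a sequence of independent
random variables with identical distribution $N(0, 1)$, so that by the strong
law of large numbers
\[ \frac{X_1 + \cdots + X_n}{n} \to \mathbb{E} X_1 = 0 \quad \text{almost surely.} \]
In order to handle the general case $t \to \infty$, we may write
$t = [t] + r_t$ where $[t]$ is the integer part of $t$ and $r_t \in [0, 1)$. Then
\[ \frac{B_t}{t} = \frac{B_t - B_{[t]}}{t} + \frac{[t]}{t} \, \frac{B_{[t]}}{[t]} ; \]
the second term tends to 0 since, as $t \to \infty$, $\frac{[t]}{t} \to 1$ and
$\frac{B_{[t]}}{[t]} \to 0$. To see why
\[ \frac{B_t - B_{[t]}}{t} \to 0 \]
as $t \to \infty$, we need the following Gaussian tail estimate for BM (see
section 1.2.3 below)
\[
\begin{aligned}
\mathbb{P}\Big\{ \omega : \sup_{t \in [0, T]} |B_t(\omega)| \ge R \Big\}
&\le 2 \sqrt{\frac{2}{\pi}} \int_{R/\sqrt{T}}^{\infty} e^{-x^2/2}\,dx \\
&\le 2 \exp\Big( -\frac{R^2}{2T} \Big) \quad \text{for all } R > 0 .
\end{aligned}
\]
It follows that for any $\varepsilon > 0$
\[ \sum_{n=1}^{\infty} \mathbb{P}\Big\{ \omega : \sup_{t \in [n, n+1]} \frac{|B_t(\omega) - B_n(\omega)|}{n} \ge \varepsilon \Big\} < \infty \]
and thus by the Borel-Cantelli lemma
\[ \lim_{n \to \infty} \sup_{t \in [n, n+1]} \Big| \frac{B_t}{t} - \frac{B_n}{n} \Big| = 0 \quad \text{almost surely.} \]
For more detail, see D. Stroock: Probability Theory: An Analytic View, pages
180-181.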
1.2.2 Markov property and finite-dimensional distributions

A standard $d$-dimensional Brownian motion $B = (B_t)_{t \ge 0}$ is a Markov
process with transition semigroup $(P_t)_{t \ge 0}$, the heat semigroup, where
\[ P_t f(x) = \int_{\mathbb{R}^d} p(t, x, y) f(y)\,dy \]
and
\[ p(t, x, y) = \frac{1}{(2\pi t)^{d/2}} \exp\Big( -\frac{|x - y|^2}{2t} \Big) \]
is the Gaussian kernel. To make this statement more precise, let us recall
two notions from probability theory.
Conditional Expectation
The concept of conditional expectation is perhaps the most important one in
the theory of probability. The properties studied in the theory of
probability, like independence, the Markov property and the martingale
property, can be rephrased in terms of conditional expectation. It is a good
idea to review what you have learned about conditioning in the Mods and Part A
probability courses: the conditional probability $\mathbb{P}(A|B)$, the
conditional expectation $\mathbb{E}(X|A)$ and the conditional probability
density function $p_Y(x)$ (the probability density function of $X$ given $Y$),
and to relate all these notions to conditional expectations
$\mathbb{E}(X|\mathcal{G})$ (see the definition below).
The formal definition stated below was formulated by J. L. Doob. Let $X$ be
an integrable or non-negative random variable valued in $\mathbb{R}$ on a
probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and let $\mathcal{G}$ be a
sub $\sigma$-algebra of $\mathcal{F}$. Then the conditional expectation
$\mathbb{E}(X|\mathcal{G})$ of $X$ given $\mathcal{G}$ is a random variable
(unique up to almost sure equality) such that
1. $\mathbb{E}(X|\mathcal{G})$ is measurable with respect to $\mathcal{G}$; and
2. For any $A \in \mathcal{G}$ we have
\[ \mathbb{E}\{ \mathbb{E}(X|\mathcal{G}) 1_A \} = \mathbb{E}\{ X 1_A \} . \]
$\mathbb{E}(X|\mathcal{G})$ is the best prediction of the random variable $X$
based on the available information $\mathcal{G}$. The last equation holds not
only for indicator functions but also for any random variable measurable with
respect to $\mathcal{G}$: if $Y$ is measurable with respect to $\mathcal{G}$, then
\[ \mathbb{E}\{ \mathbb{E}(X|\mathcal{G}) Y \} = \mathbb{E}\{ X Y \} \]
provided the integrals make sense.
As in any textbook, $\mathbb{E}(X|Y)$ means $\mathbb{E}(X|\sigma(Y))$, where
$\sigma(Y)$ is the smallest $\sigma$-algebra for which $Y$ is measurable. It can
be shown that $\mathbb{E}(X|Y)$ is a measurable function of $Y$, that is, there
is a function $F$ such that
\[ \mathbb{E}(X|Y) = F(Y) . \]
By definition, if $Y$ is $\mathcal{G}$-measurable, then
\[ \mathbb{E}(Y X|\mathcal{G}) = Y \, \mathbb{E}(X|\mathcal{G}) . \]
If $X$ and $\mathcal{G}$ are independent, then
\[ \mathbb{E}(X|\mathcal{G}) = \mathbb{E} X . \]
Indeed $X$ is independent of the $\sigma$-algebra $\mathcal{G}$ if and only if
for any bounded Borel measurable function $f$
\[ \mathbb{E}(f(X)|\mathcal{G}) = \mathbb{E}(f(X)) . \]
Filtered Probability Spaces

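If $X = (X_t)_{t \ge 0}$ is a stochastic process on
$(\Omega, \mathcal{F}, \mathbb{P})$, for each $t \ge 0$ we set
\[ \mathcal{F}_t^0 = \sigma\{ X_s : s \le t \} . \]
Then $\{\mathcal{F}_t^0\}_{t \ge 0}$ is an increasing family of sub
$\sigma$-algebras of $\mathcal{F}$, and for each $t \ge 0$, $X_t$ is measurable
with respect to $\mathcal{F}_t^0$ (we then say that the process
$(X_t)_{t \ge 0}$ is adapted to $\{\mathcal{F}_t^0\}_{t \ge 0}$). Obviously,
$\{\mathcal{F}_t^0\}_{t \ge 0}$ is the smallest increasing family of sub
$\sigma$-algebras that has the last property. $\{\mathcal{F}_t^0\}_{t \ge 0}$ is
called the filtration generated by the process $X = (X_t)_{t \ge 0}$. In
general, we introduce
Definition 1.2.10 An increasing family $(\mathcal{F}_t)_{t \ge 0}$ of sub
$\sigma$-algebras of $\mathcal{F}$ is called a filtration. $\mathcal{F}_t$
represents the information available up to time $t$. A stochastic process
$X = (X_t)_{t \ge 0}$ is adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$ if
$X_t$ is measurable with respect to $\mathcal{F}_t$ for all $t \ge 0$:
\[ \{ \omega : X_t(\omega) \in B \} \in \mathcal{F}_t \]
for all Borel subsets $B$ of $\mathbb{R}^d$.
A probability space $(\Omega, \mathcal{F}, \mathbb{P})$ together with a
filtration $(\mathcal{F}_t)_{t \ge 0}$ is called a filtered probability space
$(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$.
Let $(\mathcal{F}_t)_{t \ge 0}$ be the natural filtration of Brownian motion
$(B_t)_{t \ge 0}$, that is, $\mathcal{F}_t$ is the completion of the
$\sigma$-algebra $\sigma\{ B_s : s \le t \}$, and let
$\mathcal{F}_\infty = \sigma\{ \bigcup_{t \ge 0} \mathcal{F}_t \}$. It is a fact
that the natural filtration $(\mathcal{F}_t)_{t \ge 0}$ is continuous. We call
$(\mathcal{F}_t)_{t \ge 0}$ together with $\mathcal{F}_\infty$ the Brownian
filtration. It follows from the independence of increments that
Lemma 1.2.11 For any $t > s \ge 0$, the increment $B_t - B_s$ is independent
of $\mathcal{F}_s$.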
Recall that we denote by $p(t, x)$ the Gaussian density
\[ p(t, x) = \frac{1}{(2\pi t)^{d/2}} \, e^{-\frac{|x|^2}{2t}} \]
in $\mathbb{R}^d$, and by $(P_t)_{t \ge 0}$ the heat semigroup
\[ P_t f(x) = \int_{\mathbb{R}^d} f(y)\,p(t, x - y)\,dy \quad \text{for every } t > 0 . \]
Lemma 1.2.12 If $t > s$, then the joint distribution of $B_s$ and $B_t$ is
given by
\[ \mathbb{P}\{ B_s \in dx, \, B_t \in dy \} = p(s, x)\,p(t - s, y - x)\,dx\,dy . \]
Indeed, $B_s$ and $B_t - B_s$ are independent, and thus $(B_s, B_t - B_s)$ has
the pdf
\[ p(s, x_1)\,p(t - s, x_2) . \]
Hence, for any bounded Borel measurable function $f$
\[
\begin{aligned}
\mathbb{E} f(B_s, B_t) &= \mathbb{E} f(B_s, B_t - B_s + B_s) \\
&= \iint f(x_1, x_2 + x_1)\,p(s, x_1)\,p(t - s, x_2)\,dx_1\,dx_2 .
\end{aligned}
\]
Making the change of variables $x_1 = x$ and $x_2 + x_1 = y$ in the last double
integral, the induced Jacobian is 1 so that $dx_1\,dx_2 = dx\,dy$ (as measures),
and therefore
\[ \mathbb{E} f(B_s, B_t) = \iint f(x, y)\,p(s, x)\,p(t - s, y - x)\,dx\,dy \]
which implies that the pdf of $(B_s, B_t)$ is $p(s, x)\,p(t - s, y - x)$.
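Theorem 1.2.13 For any $t > s$ and any bounded Borel measurable function $f$
we have
\[ \mathbb{E}\{ f(B_t) | \mathcal{F}_s \} = P_{t-s} f(B_s) \quad \text{a.s.} \tag{1.2} \]
where $(P_t)_{t > 0}$ is the heat semigroup. In particular
$\mathbb{E}\{ f(B_t) | \mathcal{F}_s \} = \mathbb{E}\{ f(B_t) | B_s \}$, which is
called the Markov property, and $\mathbb{E}\{ f(B_t) | \mathcal{F}_s \}$ equals
$F(B_s)$ where
\[ F(x) = P_{t-s} f(x) \equiv \frac{1}{(2\pi (t - s))^{d/2}} \int_{\mathbb{R}^d} f(y)\, e^{-\frac{|x - y|^2}{2(t - s)}}\,dy . \]
Proof. First we show that
\[ \mathbb{E}\{ f(B_t) | \mathcal{F}_s \} = \mathbb{E}\{ f(B_t) | B_s \} \]
which is called the Markov property of $(B_t)_{t \ge 0}$. Clearly we only need to
prove this for bounded continuous (and smooth) functions $f$. For such a
function, we can show that
\[ f(x + y) = \lim_{n \to \infty} \sum_{k=1}^{N_n} f_{nk}(x)\,g_{nk}(y) \]
for some functions $f_{nk}$, $g_{nk}$ (for example, by taking the Taylor
expansion of $f(x + y)$). Hence
\[
\begin{aligned}
\mathbb{E}\{ f(B_t) | \mathcal{F}_s \} &= \mathbb{E}\{ f(B_t - B_s + B_s) | \mathcal{F}_s \} \\
&= \lim_{n \to \infty} \sum_{k=1}^{N_n} \mathbb{E}\{ f_{nk}(B_t - B_s)\,g_{nk}(B_s) | \mathcal{F}_s \} \\
&= \lim_{n \to \infty} \sum_{k=1}^{N_n} \mathbb{E}\{ f_{nk}(B_t - B_s) | \mathcal{F}_s \}\,g_{nk}(B_s) \\
&= \lim_{n \to \infty} \sum_{k=1}^{N_n} \mathbb{E}\{ f_{nk}(B_t - B_s) \}\,g_{nk}(B_s) \\
&= \mathbb{E}\{ f(B_t) | B_s \} ,
\end{aligned}
\]
where the last equality holds because the preceding expression is a function
of $B_s$ alone. To compute the conditional expectation
$\mathbb{E}\{ f(B_t) | B_s \}$, we use the fact that the pdf of $(B_s, B_t)$ is
\[ p(s, x)\,p(t - s, y - x) \]
so that
\[
\begin{aligned}
\mathbb{E}\{ 1_A(B_s) f(B_t) \} &= \iint 1_A(x) f(y)\,p(s, x)\,p(t - s, y - x)\,dx\,dy \\
&= \int 1_A(x)\,P_{t-s} f(x)\,p(s, x)\,dx \\
&= \mathbb{E}\{ 1_A(B_s)\,P_{t-s} f(B_s) \}
\end{aligned}
\]
as
\[ P_{t-s} f(x) = \int f(y)\,p(t - s, y - x)\,dy . \]
Since $P_{t-s} f(B_s)$ is a function of $B_s$, it follows that
\[ \mathbb{E}( f(B_t) | B_s ) = P_{t-s} f(B_s) . \]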
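The family of finite-dimensional distributions of BM can be calculated in
terms of the Gaussian density function $p(t, x)$.
Proposition 1.2.14 For any $0 < t_1 < t_2 < \cdots < t_n$, the
($\mathbb{R}^{nd}$-valued) random variable $(B_{t_1}, \dots, B_{t_n})$ has a pdf
\[ p(t_1, x_1)\,p(t_2 - t_1, x_2 - x_1) \cdots p(t_n - t_{n-1}, x_n - x_{n-1}) \]
where
\[ p(t, x) = \frac{1}{(2\pi t)^{d/2}} \, e^{-|x|^2/(2t)} \]
is a standard Gaussian pdf in $\mathbb{R}^d$. That is, the joint distribution of
$(B_{t_1}, \dots, B_{t_n})$ is given by
\[
\begin{aligned}
&\mathbb{P}\{ B_{t_1} \in dx_1, \dots, B_{t_n} \in dx_n \} \\
&\quad = p(t_1, x_1)\,p(t_2 - t_1, x_2 - x_1) \cdots p(t_n - t_{n-1}, x_n - x_{n-1})\,dx_1 \cdots dx_n .
\end{aligned}
\tag{1.3}
\]
Proof. Let $f$ be a bounded, continuous function. We want to calculate
\[ \mathbb{E}\big( f(B_{t_1}, \dots, B_{t_n}) \big) . \]
One can use the fact that $B_{t_1}, B_{t_2} - B_{t_1}, \dots, B_{t_n} - B_{t_{n-1}}$
are independent, and have the joint distribution with pdf
\[ p(t_1, z_1)\,p(t_2 - t_1, z_2) \cdots p(t_n - t_{n-1}, z_n) . \]
(1.3) follows after a change of variables. Below we present an induction
argument which uses only the Markov property. Indeed, by the Markov property
\[
\begin{aligned}
&\mathbb{E}\big( f_1(B_{t_1}) \cdots f_n(B_{t_n}) \big) \\
&\quad = \mathbb{E}\big\{ \mathbb{E}\big( f_1(B_{t_1}) \cdots f_n(B_{t_n}) \big| \mathcal{F}_{t_{n-1}} \big) \big\} \\
&\quad = \mathbb{E}\big\{ f_1(B_{t_1}) \cdots f_{n-1}(B_{t_{n-1}})\,\mathbb{E}\big( f_n(B_{t_n}) \big| \mathcal{F}_{t_{n-1}} \big) \big\} \\
&\quad = \mathbb{E}\big\{ f_1(B_{t_1}) \cdots f_{n-1}(B_{t_{n-1}})\,P_{t_n - t_{n-1}} f_n(B_{t_{n-1}}) \big\} \\
&\quad = \mathbb{E}\big\{ f_1(B_{t_1}) \cdots f_{n-2}(B_{t_{n-2}})\,\big( f_{n-1}\,P_{t_n - t_{n-1}} f_n \big)(B_{t_{n-1}}) \big\}
\end{aligned}
\]
which reduces the number of times $t_i$ to $n - 1$, so the conclusion follows
by induction immediately.
Corollary 1.2.15 Let $B_t = (B_t^1, \dots, B_t^d)$ be a $d$-dimensional standard
Brownian motion. Then for each $j$, $B_t^j$ is a standard BM in $\mathbb{R}$,
and $(B_t^j)_{t \ge 0}$ ($j = 1, \dots, d$) are mutually independent.
Therefore a $d$-dimensional BM consists of $d$ independent copies of BM in
$\mathbb{R}$.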
1.2.3 The reflection principle

Brownian motion starts afresh at a stopping time, i.e. the Markov property
for Brownian motion remains true at stopping times. Therefore Brownian
motion possesses the strong Markov property, a very important property
which had been used by Paul Lévy in the form of the reflection principle long
before the concept of the strong Markov property had been properly defined.
We will exhibit this principle by computing the distribution of the running
maximum of a Brownian motion.
In many applications, especially in statistics, we would like to estimate
distributions of running maxima of a stochastic process. For Brownian motion
$B = (B_t)_{t \ge 0}$, the distribution of $\sup_{s \in [0, t]} B_s$ can be derived
by use of the reflection principle.
Let $B = (B_t)_{t \ge 0}$ be a standard Brownian motion on
$(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$ in $\mathbb{R}$. Let $b > 0$
and $b > a$, and let
\[ T_b = \inf\{ t > 0 : B_t = b \} . \]
Then $T_b$ is a stopping time, and the Brownian motion starts afresh as a
standard Brownian motion after hitting $b$, and therefore
\[
\begin{aligned}
\mathbb{P}\Big\{ \sup_{s \in [0, t]} B_s \ge b, \; B_t \le a \Big\}
&= \mathbb{P}\Big\{ \sup_{s \in [0, t]} B_s \ge b, \; B_t \ge 2b - a \Big\} \\
&= \mathbb{P}\{ B_t \ge 2b - a \}
\end{aligned}
\]
where the first equality follows from the fact that the Brownian motion
starting at $T_b$ (in position $b$: $B_{T_b} = b$) runs afresh like a Brownian
motion, so that it moves with equal probability on either side of the line
$y = b$. The second equality follows from $2b - a = b + (b - a) > b$.
The above equation may be written as
\[
\begin{aligned}
\mathbb{P}\{ T_b \le t, \; B_t \le a \} &= \mathbb{P}\{ T_b \le t, \; B_t \ge 2b - a \} \\
&= \mathbb{P}\{ B_t \ge 2b - a \} ,
\end{aligned}
\]
which can be justified by the strong Markov property of Brownian motion, a
topic that we will not pursue here. Therefore
\[ \mathbb{P}\Big\{ \sup_{s \in [0, t]} B_s \ge b, \; B_t \le a \Big\} = \frac{1}{\sqrt{2\pi t}} \int_{2b - a}^{+\infty} e^{-\frac{x^2}{2t}}\,dx , \]
which gives us the joint distribution of a Brownian motion and its maximum
at a fixed time $t$. By differentiating in $a$ and in $b$ we conclude the
following
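Theorem 1.2.16 Let $B = (B_t)_{t \ge 0}$ be a standard BM in $\mathbb{R}$, and
let $t > 0$. Then the pdf of the joint distribution of the random variables
$(\sup_{s \in [0, t]} B_s, B_t)$ is given as
\[ \mathbb{P}\Big\{ \sup_{s \in [0, t]} B_s \in db, \; B_t \in da \Big\} = \frac{2(2b - a)}{\sqrt{2\pi t^3}} \exp\Big( -\frac{(2b - a)^2}{2t} \Big)\,da\,db \]
over the region $\{ (b, a) : a \le b, \; b \ge 0 \}$ in $\mathbb{R}^2$.
In particular, for any $b > 0$,
\[
\begin{aligned}
\mathbb{P}\Big\{ \sup_{s \in [0, t]} B_s \ge b \Big\} &= \mathbb{P}\{ T_b \le t \} \\
&= \frac{2}{\sqrt{2\pi t^3}} \iint_{\{ a \le c, \; c \ge b \}} (2c - a) \exp\Big( -\frac{(2c - a)^2}{2t} \Big)\,da\,dc \\
&= \frac{2}{\sqrt{2\pi t^3}} \int_b^{+\infty} \int_{-\infty}^{c} (2c - a) \exp\Big( -\frac{(2c - a)^2}{2t} \Big)\,da\,dc \\
&= \frac{2}{\sqrt{2\pi t^3}} \int_b^{+\infty} \int_c^{+\infty} x \exp\Big( -\frac{x^2}{2t} \Big)\,dx\,dc \\
&= \frac{2}{\sqrt{2\pi t}} \int_b^{+\infty} \exp\Big( -\frac{x^2}{2t} \Big)\,dx
\end{aligned}
\]
which is the exact distribution function of $\sup_{s \in [0, t]} B_s$ (and of the
stopping time $T_b$) and leads to an exact formula for the tail probability of
the Brownian motion.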
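The tail formula says that $\mathbb{P}\{\sup_{s \in [0,t]} B_s \ge b\} = 2\,\mathbb{P}\{B_t \ge b\}$, which can be compared with a simulation. The sketch below is only an approximation: it monitors the path on a discrete grid (an assumption of this illustration), so it slightly underestimates the true maximum; the grid size, seed and sample size are hypothetical.

```python
import math
import random

def running_max_exceeds(b, t_end, n_steps, rng):
    """Return True if a discretized Brownian path reaches level b before time t_end."""
    sd = math.sqrt(t_end / n_steps)
    x = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0.0, sd)
        if x >= b:
            return True
    return False

def exact_tail(b, t_end):
    """2 P{B_t >= b}, written with the complementary error function."""
    return math.erfc(b / math.sqrt(2.0 * t_end))

rng = random.Random(7)
b, t_end, n_steps, n_paths = 1.0, 1.0, 400, 4000
hits = sum(running_max_exceeds(b, t_end, n_steps, rng) for _ in range(n_paths))
estimate = hits / n_paths
print(estimate, exact_tail(b, t_end))
```

Refining the grid reduces the discretization bias, at the cost of more Gaussian draws per path.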
1.2.4 Martingale property

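You may find a summary of martingales in discrete-time in the Appendix.
Recall the definition of martingale properties.
Definition 1.2.17 Let $X = (X_t)_{t \ge 0}$ be a stochastic process adapted to a
filtration $(\mathcal{F}_t)_{t \ge 0}$ such that each random variable $X_t$ is
integrable. Then $X = (X_t)_{t \ge 0}$ is a martingale (resp. super-martingale;
resp. sub-martingale) if for any $t > s$ we have
\[ \mathbb{E}(X_t | \mathcal{F}_s) = X_s \]
resp.
\[ \mathbb{E}(X_t | \mathcal{F}_s) \le X_s \quad \text{(super-martingale)} \]
resp.
\[ \mathbb{E}(X_t | \mathcal{F}_s) \ge X_s \quad \text{(sub-martingale)} . \]
Remark 1.2.18 If $X = (X_t)_{t \ge 0}$ is a martingale (resp. super-martingale;
resp. sub-martingale), then $t \mapsto \mathbb{E}(X_t)$ is the constant function
$\mathbb{E}(X_0)$ (resp. a decreasing function; resp. an increasing function).
Let $B = (B_t^i)_{t \ge 0}$ ($i = 1, \dots, d$) be a standard BM in
$\mathbb{R}^d$, with its natural filtration $(\mathcal{F}_t)_{t \ge 0}$. We have
the following
Proposition 1.2.19 1) Each $B_t$ is $p$-th integrable for any $p > 0$, and for
$t > s$
\[ \mathbb{E}(|B_t - B_s|^p) = c_{p,d}\,|t - s|^{p/2} . \tag{1.4} \]
2) $(B_t)$ is a continuous, square-integrable martingale.
3) For each pair $i, j$, $M_t = B_t^i B_t^j - \delta_{ij} t$ is a continuous
martingale.
Proof. The first part was proved before. Since $B_t - B_s$ is independent of
$\mathcal{F}_s$ when $t > s$, we have
\[ \mathbb{E}(B_t - B_s | \mathcal{F}_s) = \mathbb{E}(B_t - B_s) = 0 \]
so that
\[ \mathbb{E}(B_t | \mathcal{F}_s) = \mathbb{E}(B_s | \mathcal{F}_s) = B_s \]
and hence $(B_t)$ is a continuous martingale.
Obviously we only need to show 3) for BM in $\mathbb{R}$. In this case
\[
\begin{aligned}
\mathbb{E}(B_t^2 - B_s^2 | \mathcal{F}_s) &= \mathbb{E}\big( (B_t - B_s)^2 | \mathcal{F}_s \big) + \mathbb{E}\big( 2 B_s (B_t - B_s) | \mathcal{F}_s \big) \\
&= \mathbb{E}\big( (B_t - B_s)^2 \big) + 2 B_s\,\mathbb{E}\big( (B_t - B_s) | \mathcal{F}_s \big) \\
&= \mathbb{E}(B_t - B_s)^2 \\
&= t - s
\end{aligned}
\]
so that
\[
\begin{aligned}
\mathbb{E}(B_t^2 - t | \mathcal{F}_s) &= \mathbb{E}(B_s^2 - s | \mathcal{F}_s) \\
&= B_s^2 - s
\end{aligned}
\]
which shows that $B_t^2 - t$ is a martingale.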
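Theorem 1.2.20 Let $B = (B_t)_{t \ge 0}$ be a continuous stochastic process in
$\mathbb{R}$ such that $B_0 = 0$. Then $(B_t)_{t \ge 0}$ is a standard BM in
$\mathbb{R}$ if and only if for any $\xi \in \mathbb{R}$ and $t > s$
\[ \mathbb{E}\{ \exp( i \langle \xi, B_t - B_s \rangle ) | \mathcal{F}_s \} = \exp\Big( -\frac{(t - s)|\xi|^2}{2} \Big) . \tag{1.5} \]
Proof. We observe that (1.5) implies that $B_t - B_s$ is independent of
$\mathcal{F}_s$ and has a normal distribution with variance $t - s$.
Conversely, if $(B_t)_{t \ge 0}$ is a standard BM in $\mathbb{R}$, then
$B_t - B_s$ is independent of $\mathcal{F}_s$, and $B_t - B_s$ has a normal
distribution of mean zero and variance $t - s$, so that
\[
\begin{aligned}
\mathbb{E}\{ \exp( i \langle \xi, B_t - B_s \rangle ) | \mathcal{F}_s \}
&= \mathbb{E}\{ \exp( i \langle \xi, B_t - B_s \rangle ) \} \\
&= \frac{1}{\sqrt{2\pi (t - s)}} \int_{\mathbb{R}} e^{i \langle \xi, x \rangle}\, e^{-\frac{|x|^2}{2(t - s)}}\,dx \\
&= \exp\Big( -\frac{(t - s)|\xi|^2}{2} \Big) .
\end{aligned}
\]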
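Corollary 1.2.21 Let $(B_t)$ be a BM in $\mathbb{R}$. If $\xi \in \mathbb{R}$,
then
\[ M_t \equiv \exp\Big( i \langle \xi, B_t \rangle + \frac{|\xi|^2}{2} t \Big) \]
is a martingale.
Remark 1.2.22 Note that both sides of (1.5) are analytic in $\xi$, so that the
identity continues to hold for any complex vector $\xi$. In particular, by
replacing $\xi$ by $-i\xi$ we obtain that
\[ \mathbb{E}\{ \exp( \langle \xi, B_t - B_s \rangle ) | \mathcal{F}_s \} = \exp\Big( \frac{(t - s)|\xi|^2}{2} \Big) \]
so that for any vector $\xi$
\[ \exp\Big( \langle \xi, B_t \rangle - \frac{|\xi|^2}{2} t \Big) \]
is a continuous martingale. This statement will be extended to vector fields
in $\mathbb{R}$. The resulting identity is called the Cameron-Martin formula.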
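A martingale has constant expectation, so the exponential martingale above must satisfy $\mathbb{E}\exp(\xi B_t - \xi^2 t/2) = 1$ for every $t$. The sketch below checks this one consequence by Monte Carlo in dimension one; the values of $\xi$, $t$ and the sample size are hypothetical, and the check does not by itself establish the martingale property.

```python
import math
import random

def exp_martingale_mean(xi, t, n_samples, rng):
    """Monte Carlo estimate of E exp(xi * B_t - xi^2 * t / 2) for a standard BM in R."""
    sd = math.sqrt(t)
    total = 0.0
    for _ in range(n_samples):
        b_t = rng.gauss(0.0, sd)                       # B_t ~ N(0, t)
        total += math.exp(xi * b_t - 0.5 * xi * xi * t)
    return total / n_samples

rng = random.Random(4)
estimate = exp_martingale_mean(xi=0.5, t=1.0, n_samples=100_000, rng=rng)
print(estimate)   # should be close to 1
```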
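BM is the basic example of Lévy processes: right-continuous stochastic
processes in $\mathbb{R}^d$ which possess stationary independent increments,
and (1.5) is the Lévy-Khinchin formula for BM. In general, if $(X_t)$ is a
Lévy process in $\mathbb{R}^d$, then
\[ \mathbb{E}\{ \exp( i \langle \xi, X_t - X_s \rangle ) | \mathcal{F}_s \} = \exp( \psi(\xi)(t - s) ) \]
for $t > s$ and $\xi \in \mathbb{R}^d$, where
\[
\begin{aligned}
\psi(\xi) &= -\frac{1}{2} \langle A A^T \xi, \xi \rangle + i \langle b, \xi \rangle \\
&\quad + \int_{\mathbb{R}^d \setminus \{0\}} \Big( e^{i \langle \xi, x \rangle} - 1 - i 1_{\{|x| < 1\}} \langle \xi, x \rangle \Big)\,\nu(dx)
\end{aligned}
\]
for some $d \times r$ matrix $A$, vector $b$ and Lévy measure $\nu(dx)$ of
$(X_t)$, which is a $\sigma$-finite measure on $\mathbb{R}^d \setminus \{0\}$
satisfying the following integrability condition
\[ \int_{\mathbb{R}^d \setminus \{0\}} \frac{|x|^2}{1 + |x|^2}\,\nu(dx) < +\infty . \tag{1.6} \]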
1.3 Quadratic variational processes

As we have seen, both the (one-dimensional) Brownian motion $B_t$ and $M_t \equiv B_t^2 - t$ are martingales, and thus
$$B_t^2 = M_t + A_t$$
where of course $A_t = t$. Therefore, the continuous sub-martingale $B_t^2$ is the sum of a martingale and an adapted increasing process. We will see that this decomposition of $B_t^2$ is the key to establishing Itô's integration theory.
Lemma 1.3.1 Let
$$D = \{0 = t_0 < t_1 < \cdots < t_n = t\}$$
be a finite partition of the interval $[0, t]$, and let
$$V_D = \sum_{l=1}^{n} |B_{t_l} - B_{t_{l-1}}|^2$$
be the quadratic variation of $B$ over the partition $D$, which is a non-negative random variable. Then
$$E V_D = t$$
and the variance of $V_D$ is
$$E\left\{(V_D - E V_D)^2\right\} = 2\sum_{l=1}^{n} (t_l - t_{l-1})^2 .$$
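Proof. Indeed
$$E V_D = \sum_{l=1}^{n} E|B_{t_l} - B_{t_{l-1}}|^2 = \sum_{l=1}^{n} (t_l - t_{l-1}) = t.$$
To prove the second formula we proceed as follows. Write $Y_l = |B_{t_l} - B_{t_{l-1}}|^2 - (t_l - t_{l-1})$ for brevity. Then
$$\begin{aligned}
E\left\{(V_D - E V_D)^2\right\} &= E\left(\sum_{l=1}^{n} |B_{t_l} - B_{t_{l-1}}|^2 - t\right)^2 \\
&= E\left(\sum_{l=1}^{n} Y_l\right)^2 \\
&= \sum_{k,l=1}^{n} E\left(Y_k Y_l\right) \\
&= \sum_{l=1}^{n} E\left\{Y_l^2\right\} + \sum_{k\ne l} E\left(Y_k Y_l\right).
\end{aligned}$$
Since the increments over different intervals are independent, the expectation of each product in the last sum on the right-hand side equals the product of the expectations $E Y_k \, E Y_l$, which is zero. Therefore
$$\begin{aligned}
E\left\{(V_D - E V_D)^2\right\} &= \sum_{l=1}^{n} E\left\{\left(|B_{t_l} - B_{t_{l-1}}|^2 - (t_l - t_{l-1})\right)^2\right\} \\
&= \sum_{l=1}^{n} E\left\{|B_{t_l} - B_{t_{l-1}}|^4 - 2(t_l - t_{l-1})|B_{t_l} - B_{t_{l-1}}|^2 + (t_l - t_{l-1})^2\right\} \\
&= \sum_{l=1}^{n} \left\{E|B_{t_l} - B_{t_{l-1}}|^4 - 2(t_l - t_{l-1})\,E|B_{t_l} - B_{t_{l-1}}|^2 + (t_l - t_{l-1})^2\right\} \\
&= 2\sum_{l=1}^{n} (t_l - t_{l-1})^2,
\end{aligned}$$
where we have used the integral
$$E|B_{t_l} - B_{t_{l-1}}|^4 = 3(t_l - t_{l-1})^2 .$$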
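The fourth-moment identity used in the last step can be checked by a short computation. The sketch below writes the increment as a scaled standard normal variable $Z$ (a symbol introduced here only for illustration) and integrates by parts once.

```latex
\begin{aligned}
B_{t_l}-B_{t_{l-1}} &\overset{d}{=} \sqrt{t_l-t_{l-1}}\;Z, \qquad Z\sim N(0,1),\\
E Z^4 &= \frac{1}{\sqrt{2\pi}}\int_{\mathbb R} x^3\cdot x\,e^{-x^2/2}\,dx
       = \frac{3}{\sqrt{2\pi}}\int_{\mathbb R} x^2 e^{-x^2/2}\,dx = 3\,E Z^2 = 3,\\
E|B_{t_l}-B_{t_{l-1}}|^4 &= (t_l-t_{l-1})^2\,E Z^4 = 3\,(t_l-t_{l-1})^2 .
\end{aligned}
```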
We are now in a position to prove the following.
Theorem 1.3.2 Let $B = (B_t)_{t\ge 0}$ be a standard BM in $\mathbb{R}$. Then
$$\lim_{m(D)\to 0} \sum_{l} |B_{t_l} - B_{t_{l-1}}|^2 = t \quad \text{in } L^2(\Omega, P)$$
for any $t$, where $D$ runs over all finite partitions of the interval $[0, t]$, and
$$m(D) = \max_{l} |t_l - t_{l-1}| .$$
Therefore
$$\lim_{m(D)\to 0} \sum_{l} |B_{t_l} - B_{t_{l-1}}|^2 = t \quad \text{in probability.}$$
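Proof. According to the above Lemma we have
$$\begin{aligned}
E\left|\sum_{l} |B_{t_l} - B_{t_{l-1}}|^2 - t\right|^2 &= E|V_D - E(V_D)|^2 \\
&= 2\sum_{l=1}^{n} (t_l - t_{l-1})^2 \\
&\le 2\,m(D)\sum_{l=1}^{n} (t_l - t_{l-1}) \\
&= 2\,t\,m(D),
\end{aligned}$$
and therefore
$$\lim_{m(D)\to 0} E\left|\sum_{l} |B_{t_l} - B_{t_{l-1}}|^2 - t\right|^2 = 0 .$$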
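The convergence can be observed numerically. The following is a minimal sketch, assuming a Brownian path is approximated by independent Gaussian increments on a uniform grid; the function names, the seed and the grid size are illustrative choices, not part of the notes. It evaluates $V_D$ for coarser and finer partitions of the same simulated path on $[0, 1]$.

```python
import math
import random

def brownian_increments(t, n, rng):
    """Return n independent N(0, t/n) increments of a BM on [0, t]."""
    sd = math.sqrt(t / n)
    return [rng.gauss(0.0, sd) for _ in range(n)]

def quadratic_variation(increments, stride):
    """Sum of squared increments over the coarser partition that keeps
    every `stride`-th point of the fine grid."""
    total = 0.0
    for start in range(0, len(increments), stride):
        block = sum(increments[start:start + stride])
        total += block * block
    return total

rng = random.Random(2005)      # fixed seed, chosen arbitrarily
t, n = 1.0, 2 ** 14
dB = brownian_increments(t, n, rng)

# V_D for partitions with 2^4, 2^8 and 2^14 intervals of the same path
for k in (4, 8, 14):
    stride = n // 2 ** k
    print(k, quadratic_variation(dB, stride))

v_fine = quadratic_variation(dB, 1)
```

By Lemma 1.3.1 the standard deviation of $V_D$ for $2^{14}$ equal intervals is $\sqrt{2/2^{14}} \approx 0.011$, so the finest value should sit close to $t = 1$, while the coarse values fluctuate much more.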
For good partitions the convergence in the above theorem takes place almost surely. To this end, we recall the Borel-Cantelli lemma: if $\sum_n P(A_n) < \infty$, then $P(\limsup_n A_n) = 0$, where
$$\limsup_n A_n = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n = \{\omega : \omega \text{ belongs to infinitely many } A_n\} .$$
If in addition $\{A_n\}$ are independent, then $\sum_n P(A_n) = \infty$ if and only if
$$P\left(\limsup_n A_n\right) = 1 .$$
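Proposition 1.3.3 Let $(B_t)_{t\ge 0}$ be a standard BM in $\mathbb{R}$. Then for any $t > 0$ we have
$$\sum_{j=1}^{2^n} \left|B_{\frac{j}{2^n}t} - B_{\frac{j-1}{2^n}t}\right|^2 \to t \quad \text{a.s.} \tag{1.7}$$
as $n \to \infty$.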
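Proof. Let $D_n$ be the dyadic partition of $[0, t]$,
$$D_n = \left\{0 = \frac{0}{2^n}t < \frac{1}{2^n}t < \cdots < \frac{2^n}{2^n}t = t\right\},$$
and let $V_n$ denote $V_{D_n}$. Then, according to Lemma 1.3.1, $E V_n = t$ and
$$\begin{aligned}
E|V_n - E V_n|^2 &= 2\sum_{l=1}^{2^n} \left(\frac{l}{2^n}t - \frac{l-1}{2^n}t\right)^2 \\
&= 2\cdot 2^n \left(\frac{1}{2^n}t\right)^2 \\
&= \frac{1}{2^{n-1}}\,t^2 .
\end{aligned}$$
Therefore, by Markov's inequality,
$$P\left(|V_n - E V_n| \ge \frac{1}{n}\right) \le n^2\,E|V_n - E V_n|^2 = \frac{n^2}{2^{n-1}}\,t^2,$$
so that
$$\sum_{n=1}^{\infty} P\left(|V_n - E V_n| \ge \frac{1}{n}\right) \le t^2 \sum_{n=1}^{\infty} \frac{n^2}{2^{n-1}} < +\infty .$$
By the Borel-Cantelli lemma, almost surely $|V_n - t| < 1/n$ for all but finitely many $n$, and it follows that $V_n \to t$ almost surely.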
Remark 1.3.4 Indeed the conclusion is true for monotone partitions. More precisely, for each $n$ let
$$D_n = \{0 = t_{0,n} < t_{1,n} < \cdots < t_{n_k,n} = t\}$$
be a finite partition of $[0, t]$. Suppose $D_{n+1} \supset D_n$ and
$$\lim_{n\to\infty} m(D_n) = \lim_{n\to\infty} \max_i |t_{i,n} - t_{i-1,n}| = 0 .$$
Then
$$\sum_{i=1}^{n_k} \left|B_{t_{i,n}} - B_{t_{i-1,n}}\right|^2 \to t \quad \text{a.s.} \tag{1.8}$$
as $n \to \infty$. Indeed, in this case, if we denote by $M_n$ the left-hand side of (1.8), then $(M_n)_{n\ge 1}$ is a non-negative martingale with respect to the filtration $\mathcal{G}_n = \sigma\{B_{t_{i,n}} : i = 0, \cdots, n_k\}$, so (1.8) follows from the martingale convergence theorem.
It can be shown (not easily) that
$$\sup_D \sum_{l} |B_{t_l} - B_{t_{l-1}}|^p < \infty \quad \text{a.s.}$$
if $p > 2$, where the sup is taken over all finite partitions of $[0, 1]$, and
$$\sup_D \sum_{l} |B_{t_l} - B_{t_{l-1}}|^2 = \infty \quad \text{a.s.}$$
That is to say, Brownian motion has finite $p$-variation for any $p > 2$, but not for $p = 2$. Indeed almost all Brownian motion sample paths are $\alpha$-Hölder continuous for any $\alpha < 1/2$ but not for $\alpha = 1/2$. It follows that almost all Brownian motion paths are nowhere differentiable. We will not go into a deep study of the sample paths of BM, which is not needed in order to develop Itô's calculus for Brownian motion.
In view of these results, let us introduce the following.
Definition 1.3.5 A path (a function on $[0, T]$) $f(t)$ in $\mathbb{R}^d$ is said to have finite $p$-variation (where $p > 0$ is a constant) on $[0, T]$, if
$$\sup_D \sum_{l} |f(t_l) - f(t_{l-1})|^p < +\infty$$
where $D$ runs over all finite partitions of $[0, T]$. $f(t)$ in $\mathbb{R}^d$ has finite (total) variation if it has finite 1-variation.
A function with finite variation must be a difference of two increasing functions (and therefore has at most countably many points of discontinuity).
A stochastic process $V = (V_t)_{t\ge 0}$ is called a variational process if for almost all $\omega \in \Omega$, the sample path $t \to V_t(\omega)$ possesses finite variation on any finite interval. A Brownian motion is not a variational process.
If $M = (M_t)_{t\ge 0}$ is a continuous, square-integrable martingale, then $(M_t^2)_{t\ge 0}$ is no longer a martingale but a sub-martingale, except in trivial cases. As in the case of Brownian motion, the following limit
$$\langle M\rangle_t = \lim_{m(D)\to 0} \sum_{l} \left|M_{t_l} - M_{t_{l-1}}\right|^2$$
exists both in probability and in the $L^2$-norm, where the limit is taken over all finite partitions $D$ of the interval $[0, t]$. By definition, $t \to \langle M\rangle_t$ is an adapted, continuous, increasing stochastic process (and therefore has finite variation). The following theorem demonstrates the importance of the adapted increasing process $\langle M\rangle_t$.
Theorem 1.3.6 (The quadratic variational process) Let $M = (M_t)_{t\ge 0}$ be a continuous, square-integrable martingale. Then $\langle M\rangle_t$ is the unique continuous, adapted and increasing process with initial value zero, such that $M_t^2 - \langle M\rangle_t$ is a martingale.
The process $\langle M\rangle$ is called the quadratic variational process associated with the martingale $M$. The theorem is a special case of the Doob-Meyer decomposition for sub-martingales: any sub-martingale can be decomposed into a sum of a martingale and a predictable, increasing process with initial value zero. The decomposition was conjectured by J. L. Doob, and proved by P. A. Meyer in the 1960s, which opened a new era of stochastic calculus.
Remark 1.3.7 If $M = (M_t)_{t\ge 0}$ is a continuous martingale, and $A = (A_t)_{t\ge 0}$ is an adapted, continuous and integrable increasing process, then $X = M + A$ is a continuous sub-martingale. The converse is also true; this is the content of the Doob-Meyer decomposition theorem. Consider a sub-martingale in discrete time: $X = (X_n)_{n\in\mathbb{Z}_+}$ with respect to the filtration $(\mathcal{F}_n)_{n\in\mathbb{Z}_+}$. An increasing sequence $(A_n)_{n\in\mathbb{Z}_+}$ may be defined by
$$A_0 = 0 ; \qquad A_n = A_{n-1} + E(X_n - X_{n-1} \mid \mathcal{F}_{n-1}), \quad n = 1, 2, \cdots .$$
Then
1. Since $E(X_n - X_{n-1} \mid \mathcal{F}_{n-1}) = E(X_n \mid \mathcal{F}_{n-1}) - X_{n-1} \ge 0$, $(A_n)_{n\in\mathbb{Z}_+}$ is increasing.
2. $A_n \in \mathcal{F}_{n-1}$, so that $(A_n)_{n\in\mathbb{Z}_+}$ is predictable.
3. By definition
$$E(X_n - A_n \mid \mathcal{F}_{n-1}) = X_{n-1} - A_{n-1}, \quad n = 1, 2, \cdots,$$
therefore $M_n = X_n - A_n$ is a martingale.
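The discrete-time construction can be carried out exactly for a concrete example. The sketch below assumes a symmetric simple random walk $S_n$ and the sub-martingale $X_n = S_n^2$ (both chosen here only for illustration), computes $A_n$ from the recursion above with exact rational arithmetic, and checks the martingale property of $M_n = X_n - A_n$ by enumerating the two continuations of every path prefix.

```python
from fractions import Fraction
from itertools import product

def f(x):
    # X_n = f(S_n) with f convex, so X is a sub-martingale
    return x * x

def doob_decomposition(steps):
    """Given the +/-1 steps of one walk path, return (X, A, M) along it."""
    s = 0
    X, A = [f(0)], [Fraction(0)]
    for step in steps:
        # E(X_n | F_{n-1}) for the symmetric walk: average over the two moves
        cond_exp = Fraction(f(s + 1) + f(s - 1), 2)
        # A_n = A_{n-1} + E(X_n - X_{n-1} | F_{n-1})
        A.append(A[-1] + cond_exp - f(s))
        s += step
        X.append(f(s))
    M = [x - a for x, a in zip(X, A)]
    return X, A, M

n = 4
# Martingale check: averaging M_n over the two continuations of each
# prefix of length n-1 must return M_{n-1}.
ok = True
for prefix in product((-1, 1), repeat=n - 1):
    m_prev = doob_decomposition(prefix)[2][-1]
    m_next = [doob_decomposition(prefix + (e,))[2][-1] for e in (-1, 1)]
    ok = ok and (sum(m_next) / 2 == m_prev)

X, A, M = doob_decomposition((1, 1, -1, 1))
print(ok, A)   # for f(x) = x^2 the compensator is A_n = n
```

For this choice the compensator is deterministic, $A_n = n$, which is the discrete analogue of $B_t^2 = M_t + t$.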
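Theorem 1.3.8 Let $(M_t)_{t\ge 0}$ and $(N_t)_{t\ge 0}$ be two continuous, square-integrable martingales, and let
$$\langle M, N\rangle_t = \frac{1}{4}\left(\langle M + N\rangle_t - \langle M - N\rangle_t\right),$$
called the bracket process of $M$ and $N$. Then $\langle M, N\rangle_t$ is the unique adapted, continuous, variational process with initial value zero, such that $M_t N_t - \langle M, N\rangle_t$ is a martingale. Moreover
$$\lim_{m(D)\to 0} \sum_{l=1}^{n} (M_{t_l} - M_{t_{l-1}})(N_{t_l} - N_{t_{l-1}}) = \langle M, N\rangle_t \quad \text{in prob.} \tag{1.9}$$
where $D = \{0 = t_0 < \cdots < t_n = t\}$ and $m(D) = \max_l (t_l - t_{l-1})$.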
If $(B_t)_{t\ge 0}$ is a Brownian motion in $\mathbb{R}^d$, then for any $f \in C_b^2(\mathbb{R}^d)$
$$M_t^f \equiv f(B_t) - f(B_0) - \int_0^t \frac{1}{2}\Delta f(B_s)\,ds$$
is a continuous martingale (with respect to the natural filtration generated by the Brownian motion $(B_t)_{t\ge 0}$), and
$$\langle M^f, M^g\rangle_t = \int_0^t \langle \nabla f, \nabla g\rangle(B_s)\,ds .$$
These claims will be proven below after we have established Itô's lemma for Brownian motion.
Chapter 2
Itô's calculus
In this part we develop Itô's integration theory in a traditional way, that is, we first define the stochastic integral $\int_0^t F_s\,dB_s$ for simple adapted processes $(F_t)_{t\ge 0}$, and then extend the definition to a large class of adapted integrands by exploiting the martingale property of Itô's integrals.
2.1 Introduction
Let $B = (B_t)_{t\ge 0}$ be a standard Brownian motion in $\mathbb{R}$ on a complete probability space $(\Omega, \mathcal{F}, P)$ and let $(\mathcal{F}_t)_{t\ge 0}$ be its natural filtration, that is, for each $t \ge 0$
$$\mathcal{F}_t = \sigma\{B_s \text{ for } s \le t;\ \text{null events}\},$$
the history of the Brownian motion $B = (B_t)_{t\ge 0}$ up to time $t$, in other words the information accumulated by the Brownian motion $B$ up to time $t$.
We are going to define Itô's integrals of the following form
$$\int_0^t F_s\,dB_s \quad \text{for } t \ge 0$$
where $F = (F_t)_{t\ge 0}$ is a stochastic process satisfying certain conditions that will be described later. For example, we are very much interested in defining integrals such as
$$\int_0^t f(B_s)\,dB_s$$
for Borel measurable functions $f$.
Since, for almost all $\omega \in \Omega$, the sample path of Brownian motion $t \to B_t(\omega)$ is nowhere differentiable, the obvious definition via Riemann sums
$$\sum_i F_{t_i^*}\,(B_{t_i} - B_{t_{i-1}})$$
does not work: the limit of Riemann sums does not exist. The limit exists however in a probability sense, if for any finite partition we properly choose $t_i^* \in [t_{i-1}, t_i]$ and if the integrand process $(F_t)_{t\ge 0}$ is adapted to the Brownian filtration $(\mathcal{F}_t)_{t\ge 0}$, that is, for each $t \ge 0$, $F_t$ is measurable with respect to $\mathcal{F}_t$. This approach works because both $(B_t)_{t\ge 0}$ and $(B_t^2 - t)_{t\ge 0}$ are continuous martingales.
In summary, Itô's integral $\int_0^t F_s\,dB_s$ of an adapted process $F = (F_t)_{t\ge 0}$ (such that $F$ is measurable with respect to the product $\sigma$-algebra $\mathcal{B}([0,\infty)) \otimes \mathcal{F}$, a condition you are advised to forget at the first reading) with respect to the Brownian motion $B = (B_t)_{t\ge 0}$ is simply defined to be the limit of a special sort of Riemann sums:
$$\int_0^t F_s\,dB_s = \lim_{m(D)\to 0} \sum_i F_{t_{i-1}}\,(B_{t_i} - B_{t_{i-1}})$$
where the limit takes place in the $L^2$-sense (with respect to the product measure $P(d\omega)\otimes dt$: you are forgiven for ignoring its meaning), over finite partitions
$$D = \{0 = t_0 < t_1 < \cdots < t_n = t\}$$
of the interval $[0, t]$ such that $m(D) = \max_i (t_i - t_{i-1}) \to 0$, and through the special Riemann sums
$$\sum_i F_{t_{i-1}}\,(B_{t_i} - B_{t_{i-1}}),$$
i.e. sums of increments of the Brownian motion $B$ over $[t_{i-1}, t_i]$ multiplied by the value of $F$ at the left end point $t_{i-1}$. The reason for choosing $F_{t_{i-1}}(B_{t_i} - B_{t_{i-1}})$ rather than other forms is the following: only with this choice do we have
$$E\left(F_{t_{i-1}}(B_{t_i} - B_{t_{i-1}})\right) = 0 \tag{2.1}$$
and
$$E\left(F_{t_{i-1}}^2 (B_{t_i} - B_{t_{i-1}})^2 - F_{t_{i-1}}^2 (t_i - t_{i-1})\right) = 0 . \tag{2.2}$$
It will become clear that it is because of these features that Riemann sums of this sort converge to a martingale. These equations also imply that not only is the resulting Itô integral
$$\int_0^t F_s\,dB_s$$
a martingale, but so is the process
$$\left(\int_0^t F_s\,dB_s\right)^2 - \int_0^t F_s^2\,ds .$$
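Exercise 2.1.1 Prove equations (2.1) and (2.2). Indeed, since $F_{t_{i-1}}$ is $\mathcal{F}_{t_{i-1}}$-measurable,
$$\begin{aligned}
E\left(F_{t_{i-1}}(B_{t_i} - B_{t_{i-1}})\right) &= E\left\{E\left(F_{t_{i-1}}(B_{t_i} - B_{t_{i-1}}) \mid \mathcal{F}_{t_{i-1}}\right)\right\} \\
&= E\left\{F_{t_{i-1}}\,E\left(B_{t_i} - B_{t_{i-1}} \mid \mathcal{F}_{t_{i-1}}\right)\right\} \\
&= E\left\{F_{t_{i-1}}\,E\left(B_{t_i} - B_{t_{i-1}}\right)\right\} \\
&= 0,
\end{aligned}$$
and similarly, since $B_t^2 - t$ is a martingale,
$$\begin{aligned}
E\left((B_{t_i} - B_{t_{i-1}})^2 - (t_i - t_{i-1}) \mid \mathcal{F}_{t_{i-1}}\right) &= E\left(B_{t_i}^2 - t_i \mid \mathcal{F}_{t_{i-1}}\right) - 2E\left(B_{t_{i-1}}B_{t_i} \mid \mathcal{F}_{t_{i-1}}\right) + B_{t_{i-1}}^2 + t_{i-1} \\
&= 2B_{t_{i-1}}^2 - 2B_{t_{i-1}}\,E\left(B_{t_i} \mid \mathcal{F}_{t_{i-1}}\right) \\
&= 0,
\end{aligned}$$
and therefore
$$\begin{aligned}
E\left(F_{t_{i-1}}^2 (B_{t_i} - B_{t_{i-1}})^2 - F_{t_{i-1}}^2 (t_i - t_{i-1})\right) &= E\left\{E\left(F_{t_{i-1}}^2 (B_{t_i} - B_{t_{i-1}})^2 - F_{t_{i-1}}^2 (t_i - t_{i-1}) \mid \mathcal{F}_{t_{i-1}}\right)\right\} \\
&= E\left\{F_{t_{i-1}}^2\,E\left((B_{t_i} - B_{t_{i-1}})^2 - (t_i - t_{i-1}) \mid \mathcal{F}_{t_{i-1}}\right)\right\} \\
&= 0 .
\end{aligned}$$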
Itô's integration theory can be established for a continuous, square-integrable martingale by using exactly the same approach. That is, if $M = (M_t)_{t\ge 0}$ is a continuous, square-integrable martingale on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and $F = (F_t)_{t\ge 0}$ is an adapted stochastic process, then
$$\int_0^t F_s\,dM_s = \lim_{m(D)\to 0} \sum_{l=1}^{n} F_{t_{l-1}}\,(M_{t_l} - M_{t_{l-1}})$$
exists under certain integrability conditions.

The characteristic property of Itô's integrals is again that the Itô integrals $\int_0^t F_s\,dM_s$ are martingales.
The Stratonovich integral $\int_0^t F_s \circ dM_s$, which was discovered later than Itô's, is defined by
$$\int_0^t F_s \circ dM_s = \lim_{m(D)\to 0} \sum_{l=1}^{n} \frac{F_{t_{l-1}} + F_{t_l}}{2}\,(M_{t_l} - M_{t_{l-1}}),$$
which in general is different from the Itô integral $\int_0^t F_s\,dM_s$.
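Example 2.1.2 Let $M = (M_t)_{t\ge 0}$ be a continuous, square-integrable martingale. According to the definition,
$$\begin{aligned}
\int_0^t M_s\,dM_s &= \lim_{m(D)\to 0} \sum_{l=1}^{n} M_{t_{l-1}}(M_{t_l} - M_{t_{l-1}}) \\
&= \lim_{m(D)\to 0} \sum_{l=1}^{n} \left\{-\frac{1}{2}(M_{t_l} - M_{t_{l-1}})^2 + \frac{1}{2}\left(M_{t_l}^2 - M_{t_{l-1}}^2\right)\right\} \\
&= -\frac{1}{2}\lim_{m(D)\to 0} \sum_{l=1}^{n} (M_{t_l} - M_{t_{l-1}})^2 + \frac{1}{2}\lim_{m(D)\to 0} \sum_{l=1}^{n} \left(M_{t_l}^2 - M_{t_{l-1}}^2\right) \\
&= -\frac{1}{2}\langle M\rangle_t + \frac{1}{2}\left(M_t^2 - M_0^2\right).
\end{aligned}$$
In other words
$$M_t^2 - M_0^2 = 2\int_0^t M_s\,dM_s + \langle M\rangle_t .$$
In particular, for BM we have
$$B_t^2 - B_0^2 = 2\int_0^t B_s\,dB_s + t,$$
which is a special case of the Itô formula. The Stratonovich integral is
$$\begin{aligned}
\int_0^t M_s \circ dM_s &= \lim_{m(D)\to 0} \sum_{l=1}^{n} \frac{M_{t_l} + M_{t_{l-1}}}{2}\,(M_{t_l} - M_{t_{l-1}}) \\
&= \frac{1}{2}\lim_{m(D)\to 0} \sum_{l=1}^{n} (M_{t_l} - M_{t_{l-1}})^2 + \lim_{m(D)\to 0} \sum_{l=1}^{n} M_{t_{l-1}}(M_{t_l} - M_{t_{l-1}}) \\
&= \frac{1}{2}\langle M\rangle_t + \int_0^t M_s\,dM_s \\
&= \frac{1}{2}\left(M_t^2 - M_0^2\right),
\end{aligned}$$
so that
$$M_t^2 - M_0^2 = 2\int_0^t M_s \circ dM_s,$$
which coincides with the fundamental theorem of calculus.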
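The two algebraic identities behind this example already hold for every finite partition, which makes them easy to check on a simulated path. The following is a minimal sketch, assuming a Brownian path approximated by Gaussian increments on a uniform grid of $[0, 1]$ (seed and grid size are arbitrary illustrative choices). It compares the left-point (Itô) sum and the averaged (Stratonovich) sum with their closed forms.

```python
import math
import random

rng = random.Random(7)          # arbitrary fixed seed
t, n = 1.0, 10_000
sd = math.sqrt(t / n)

# Brownian path sampled on the uniform grid t_l = l * t / n
B = [0.0]
for _ in range(n):
    B.append(B[-1] + rng.gauss(0.0, sd))

# Left-point Riemann sum (Ito) and averaged-end-point sum (Stratonovich)
ito = sum(B[l - 1] * (B[l] - B[l - 1]) for l in range(1, n + 1))
strat = sum(0.5 * (B[l] + B[l - 1]) * (B[l] - B[l - 1]) for l in range(1, n + 1))
# Quadratic variation of the path over the same partition
qv = sum((B[l] - B[l - 1]) ** 2 for l in range(1, n + 1))

print("Ito sum          :", ito)
print("(B_t^2 - QV) / 2 :", 0.5 * (B[-1] ** 2 - qv))
print("Stratonovich sum :", strat)
print("B_t^2 / 2        :", 0.5 * B[-1] ** 2)
```

The first pair and the second pair of printed numbers agree up to rounding, and the two sums differ by half the quadratic variation, which is close to $t/2$ for a fine grid.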
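Below is a simple lemma which relates Stratonovich integrals to Itô integrals.
Lemma 2.1.3 Let $N, M$ be two continuous, square-integrable martingales, and let $F_t = N_t + A_t$ where $A_t$ is an adapted process with finite variation. Define Stratonovich's integral
$$\int_0^t F_s \circ dM_s = \lim_{m(D)\to 0} \sum_{l} \frac{F_{t_l} + F_{t_{l-1}}}{2}\,(M_{t_l} - M_{t_{l-1}}) .$$
Then
$$\int_0^t F_s \circ dM_s = \int_0^t F_s\,dM_s + \frac{1}{2}\langle N, M\rangle_t .$$
Indeed,
$$\begin{aligned}
\int_0^t F_s \circ dM_s &= \lim_{m(D)\to 0} \sum_{l=1}^{n} \frac{N_{t_l} + N_{t_{l-1}}}{2}\,(M_{t_l} - M_{t_{l-1}}) + \lim_{m(D)\to 0} \sum_{l=1}^{n} \frac{A_{t_l} + A_{t_{l-1}}}{2}\,(M_{t_l} - M_{t_{l-1}}) \\
&= \frac{1}{2}\lim_{m(D)\to 0} \sum_{l=1}^{n} (N_{t_l} - N_{t_{l-1}})(M_{t_l} - M_{t_{l-1}}) + \lim_{m(D)\to 0} \sum_{l=1}^{n} F_{t_{l-1}}(M_{t_l} - M_{t_{l-1}}) \\
&\quad + \frac{1}{2}\lim_{m(D)\to 0} \sum_{l=1}^{n} (A_{t_l} - A_{t_{l-1}})(M_{t_l} - M_{t_{l-1}}) \\
&= \frac{1}{2}\langle M, N\rangle_t + \int_0^t F_s\,dM_s,
\end{aligned}$$
where the last sum tends to zero because $A$ has finite variation and $M$ is continuous.
In particular, if $F = (F_t)_{t\ge 0}$ is a process with finite variation, then
$$\int_0^t F_s \circ dM_s = \int_0^t F_s\,dM_s .$$
We will concentrate on Itô's integrals only.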

2.2 Stochastic integrals for simple processes

An adapted stochastic process $F = (F_t)_{t\ge 0}$ is called a simple process, if it possesses a representation
$$F_t(\omega) = f_0(\omega)\,1_{\{0\}}(t) + \sum_{i=0}^{\infty} f_i(\omega)\,1_{(t_i, t_{i+1}]}(t) \tag{2.3}$$
where $0 = t_0 < t_1 < \cdots < t_i \to \infty$, such that for any finite time $T \ge 0$ there are only finitely many $t_i \in [0, T]$; each $f_i \in \mathcal{F}_{t_i}$, i.e. $f_i$ is measurable with respect to $\mathcal{F}_{t_i}$; and $F$ is a bounded process. The space of all simple (adapted) stochastic processes will be denoted by $\mathcal{L}_0$. If $F = (F_t)_{t\ge 0} \in \mathcal{L}_0$, then Itô's integral of $F$ against the Brownian motion $B = (B_t)_{t\ge 0}$ is defined as
$$I(F)_t \equiv \sum_{i=0}^{\infty} f_i\,(B_{t\wedge t_{i+1}} - B_{t\wedge t_i}),$$
where the sum makes sense because only finitely many terms are non-zero. It is obvious that $I(F) = (I(F)_t)_{t\ge 0}$ is continuous, square-integrable and adapted to $(\mathcal{F}_t)_{t\ge 0}$.
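The defining sum is easy to evaluate along a single path. The sketch below is only an illustration of the formula: the function name `ito_integral_simple` is hypothetical, and the path passed in is a smooth toy curve used to exercise the bookkeeping with $t \wedge t_i$, not a Brownian sample path.

```python
def ito_integral_simple(times, coeffs, B, t):
    """I(F)_t = sum_i f_i * (B(min(t, t_{i+1})) - B(min(t, t_i))).

    times  : [t_0, t_1, ..., t_m], increasing, with t_0 = 0
    coeffs : [f_0, ..., f_{m-1}], f_i being the value of F on (t_i, t_{i+1}]
    B      : callable giving the value of one fixed path at a given time
    """
    total = 0.0
    for i, f_i in enumerate(coeffs):
        total += f_i * (B(min(t, times[i + 1])) - B(min(t, times[i])))
    return total

# Toy path, used only to exercise the formula
path = lambda s: s * s
times = [0.0, 1.0, 2.0, 3.0]
# f_i may depend on the path only up to time t_i (adaptedness)
coeffs = [path(t_i) + 1.0 for t_i in times[:-1]]   # [1.0, 2.0, 5.0]

value = ito_integral_simple(times, coeffs, path, 2.5)
early = ito_integral_simple(times, coeffs, path, 0.5)
print(value, early)
```

Intervals that start after $t$ contribute zero because both truncated end points collapse to $t$, which is why only finitely many terms of the sum are non-zero.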
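Lemma 2.2.1 Let $M = (M_t)_{t\ge 0}$ be a continuous and square-integrable martingale, and let $s < t \le u < v$, $f \in \mathcal{F}_s$, $g \in \mathcal{F}_t$. Then
$$E\left(g(M_v - M_u)(M_t - M_s) \mid \mathcal{F}_s\right) = 0$$
and
$$E\left(f(M_t - M_s)^2 \mid \mathcal{F}_s\right) = E\left(f(\langle M\rangle_t - \langle M\rangle_s) \mid \mathcal{F}_s\right).$$
Proof. By the tower property of conditional expectation, we have
$$\begin{aligned}
E\left(g(M_v - M_u)(M_t - M_s) \mid \mathcal{F}_s\right) &= E\left\{E\left(g(M_v - M_u)(M_t - M_s) \mid \mathcal{F}_u\right) \mid \mathcal{F}_s\right\} \\
&= E\left\{g(M_t - M_s)\,E(M_v - M_u \mid \mathcal{F}_u) \mid \mathcal{F}_s\right\} \\
&= 0 .
\end{aligned}$$
The second equality is trivial: since $f \in \mathcal{F}_s$, it can be moved out of the conditional expectation, and $E((M_t - M_s)^2 \mid \mathcal{F}_s) = E(M_t^2 - M_s^2 \mid \mathcal{F}_s) = E(\langle M\rangle_t - \langle M\rangle_s \mid \mathcal{F}_s)$ because $M_t^2 - \langle M\rangle_t$ is a martingale.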
Lemma 2.2.2 $(I(F)_t)_{t\ge 0}$ is a martingale:
$$E\left(I(F)_t - I(F)_s \mid \mathcal{F}_s\right) = 0, \quad t > s .$$
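Proof. Assume that $t_j < t \le t_{j+1}$, $t_k < s \le t_{k+1}$ for some $k, j \in \mathbb{N}$. Then $k \le j$ and
$$I(F)_t = \sum_{i=0}^{j-1} f_i\,(B_{t_{i+1}} - B_{t_i}) + f_j\,(B_t - B_{t_j}) ;$$
$$I(F)_s = \sum_{i=0}^{k-1} f_i\,(B_{t_{i+1}} - B_{t_i}) + f_k\,(B_s - B_{t_k}) .$$
If $k < j - 1$, then
$$I(F)_t - I(F)_s = \sum_{i=k+1}^{j-1} f_i\,(B_{t_{i+1}} - B_{t_i}) + f_j\,(B_t - B_{t_j}) + f_k\,(B_{t_{k+1}} - B_s) . \tag{2.4}$$
If $k + 1 \le i \le j - 1$, then $s \le t_i$ so that $\mathcal{F}_s \subset \mathcal{F}_{t_i}$. Hence
$$\begin{aligned}
E\left(f_i\,(B_{t_{i+1}} - B_{t_i}) \mid \mathcal{F}_s\right) &= E\left\{E\left(f_i\,(B_{t_{i+1}} - B_{t_i}) \mid \mathcal{F}_{t_i}\right) \mid \mathcal{F}_s\right\} \\
&= E\left\{f_i\,E\left(B_{t_{i+1}} - B_{t_i} \mid \mathcal{F}_{t_i}\right) \mid \mathcal{F}_s\right\} \\
&= 0 .
\end{aligned}$$
The first equality is the tower property, in the second we have used the fact that $f_i \in \mathcal{F}_{t_i}$, and in the last the martingale property of $(B_t)$. Similarly
$$E\left(f_j\,(B_t - B_{t_j}) \mid \mathcal{F}_s\right) = 0, \quad t > t_j \ge s, \ f_j \in \mathcal{F}_{t_j},$$
$$E\left(f_k\,(B_{t_{k+1}} - B_s) \mid \mathcal{F}_s\right) = 0, \quad t_{k+1} \ge s > t_k, \ f_k \in \mathcal{F}_{t_k} \subset \mathcal{F}_s .$$
Putting these equations together we have
$$E\left(I(F)_t - I(F)_s \mid \mathcal{F}_s\right) = 0 .$$
If $k = j - 1$, then $t_{j-1} < s \le t_j < t \le t_{j+1}$ and
$$I(F)_t - I(F)_s = f_{j-1}\,(B_{t_j} - B_s) + f_j\,(B_t - B_{t_j}),$$
so we again have
$$E\left(I(F)_t - I(F)_s \mid \mathcal{F}_s\right) = 0 .$$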
Lemma 2.2.3 $\left(I(F)_t^2 - \int_0^t F_s^2\,ds\right)_{t\ge 0}$ is a martingale. Therefore $I(F) \in \mathcal{M}_c^2$ and
$$\langle I(F)\rangle_t = \int_0^t F_s^2\,ds .$$
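Proof. We want to prove that for any $t \ge s$
$$E\left(I(F)_t^2 - \int_0^t F_u^2\,du \,\Big|\, \mathcal{F}_s\right) = I(F)_s^2 - \int_0^s F_u^2\,du .$$
In other words, we have to prove that
$$E\left(I(F)_t^2 - I(F)_s^2 - \int_s^t F_u^2\,du \,\Big|\, \mathcal{F}_s\right) = 0 .$$
Obviously
$$\begin{aligned}
I(F)_t^2 - I(F)_s^2 &= (I(F)_t - I(F)_s)^2 + 2I(F)_t I(F)_s - 2I(F)_s^2 \\
&= (I(F)_t - I(F)_s)^2 + 2(I(F)_t - I(F)_s)\,I(F)_s,
\end{aligned}$$
and $(I(F)_t)_{t\ge 0}$ is a martingale, so that
$$E\left(I(F)_t - I(F)_s \mid \mathcal{F}_s\right) = 0 .$$
Moreover $I(F)_s \in \mathcal{F}_s$, so that
$$E\left\{I(F)_s\,(I(F)_t - I(F)_s) \mid \mathcal{F}_s\right\} = I(F)_s\,E\left\{I(F)_t - I(F)_s \mid \mathcal{F}_s\right\} = 0 .$$
We therefore only need to show
$$E\left((I(F)_t - I(F)_s)^2 - \int_s^t F_u^2\,du \,\Big|\, \mathcal{F}_s\right) = 0 .$$
Now we use the same notation as in the proof of Lemma 2.2.2. It is clear from eqn (2.4) that if $k < j - 1$, then
$$\begin{aligned}
(I(F)_t - I(F)_s)^2 &= \sum_{i,l=k+1}^{j-1} f_i f_l\,(B_{t_{i+1}} - B_{t_i})(B_{t_{l+1}} - B_{t_l}) \\
&\quad + 2\sum_{i=k+1}^{j-1} f_i f_j\,(B_{t_{i+1}} - B_{t_i})(B_t - B_{t_j}) \\
&\quad + 2\sum_{i=k+1}^{j-1} f_i f_k\,(B_{t_{i+1}} - B_{t_i})(B_{t_{k+1}} - B_s) \\
&\quad + f_j^2\,(B_t - B_{t_j})^2 + f_k^2\,(B_{t_{k+1}} - B_s)^2 \\
&\quad + 2f_k f_j\,(B_t - B_{t_j})(B_{t_{k+1}} - B_s) .
\end{aligned}$$
Using Lemma 2.2.1 above and the fact that both $(B_t)_{t\ge 0}$ and $(B_t^2 - t)_{t\ge 0}$ are martingales, we get
$$E\left((I(F)_t - I(F)_s)^2 \mid \mathcal{F}_s\right) = E\left(\sum_{i=k+1}^{j-1} f_i^2\,(t_{i+1} - t_i) + f_j^2\,(t - t_j) + f_k^2\,(t_{k+1} - s) \,\Big|\, \mathcal{F}_s\right)$$
so that
$$E\left((I(F)_t - I(F)_s)^2 \mid \mathcal{F}_s\right) = E\left(\int_s^t F_u^2\,du \,\Big|\, \mathcal{F}_s\right).$$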
Lemma 2.2.4 $F \to I(F)$ is linear, and for any $T \ge 0$
$$E\left(I(F)_T^2\right) = E\left(\int_0^T F_s^2\,ds\right).$$
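The isometry can be checked by simulation. The following is a rough Monte Carlo sketch under stated assumptions: the integrand is taken to be $f_i = B_{t_i}$ on a uniform grid of $[0, 1]$ with ten intervals, which is adapted but not bounded, so it lies outside $\mathcal{L}_0$ strictly speaking and is used here only as an illustration; the sample size and seed are arbitrary.

```python
import math
import random

def simulate_once(rng, T=1.0, m=10):
    """One path of the piecewise constant integrand F = B_{t_i} on (t_i, t_{i+1}].

    Returns (I(F)_T, int_0^T F_s^2 ds) for a uniform grid with m intervals.
    """
    dt = T / m
    sd = math.sqrt(dt)
    b = 0.0           # current value B_{t_i}
    integral = 0.0    # running Ito sum
    energy = 0.0      # running value of the time integral of F^2
    for _ in range(m):
        db = rng.gauss(0.0, sd)
        integral += b * db        # f_i (B_{t_{i+1}} - B_{t_i}) with f_i = B_{t_i}
        energy += b * b * dt      # f_i^2 (t_{i+1} - t_i)
        b += db
    return integral, energy

rng = random.Random(12345)        # arbitrary fixed seed
paths = 20_000
lhs = rhs = 0.0
for _ in range(paths):
    i_T, e_T = simulate_once(rng)
    lhs += i_T * i_T / paths      # estimate of E(I(F)_T^2)
    rhs += e_T / paths            # estimate of E(int_0^T F_s^2 ds)

# Exact common value for this integrand: sum_i t_i * dt
exact = sum((i / 10) * (1 / 10) for i in range(10))
print(lhs, rhs, exact)
```

Both estimates should land near the exact value $\sum_i t_i\,(t_{i+1} - t_i) = 0.45$, with the left-hand side noticeably noisier because it averages a fourth-order quantity.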
Proposition 2.2.5 If $N \in \mathcal{M}_c^2$, then
$$\left(I(F)_t N_t - \int_0^t F_s\,d\langle B, N\rangle_s\right)_{t\ge 0}$$
is a martingale.
Exercise 2.2.6 Show Prop. 2.2.5 by proving the following.
1. First note that Prop. 2.2.5 is equivalent to the following equality:
$$E\left(I(F)_t N_t - I(F)_s N_s \mid \mathcal{F}_s\right) = E\left(\int_s^t F_u\,d\langle B, N\rangle_u \,\Big|\, \mathcal{F}_s\right), \quad t > s .$$
2. Prove
$$E\left(I(F)_t N_t - I(F)_s N_s \mid \mathcal{F}_s\right) = E\left((I(F)_t - I(F)_s)(N_t - N_s) \mid \mathcal{F}_s\right)$$
for any $t > s$.
3. If $t_j < t \le t_{j+1}$, $t_k < s \le t_{k+1}$, then $k \le j$. If indeed $k < j - 1$, then
$$N_t - N_s = \sum_{i=k+1}^{j-1} (N_{t_{i+1}} - N_{t_i}) + (N_t - N_{t_j}) + (N_{t_{k+1}} - N_s) .$$
4. Follow the same procedure as in the proof of Lemma 2.2.3 to complete the proof.
Proposition 2.2.7 If $N \in \mathcal{M}_c^2$, then
$$\langle I(F), N\rangle_t = \int_0^t F_s\,d\langle B, N\rangle_s \quad \text{for all } t \ge 0 . \tag{2.5}$$
2.3 Stochastic integrals for adapted processes

We next aim to extend stochastic integrals against Brownian motion $\int_0^t F_s\,dB_s$ to a class of adapted processes $F_s$. The main tool we need is a maximal inequality for martingales.
2.3.1 The space of square-integrable martingales

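Let $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ be a filtered probability space, and let $\mathcal{M}_c^2$ denote the vector space of all continuous, square-integrable martingales $M = (M_t)_{t\ge 0}$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with initial value zero: $M_0 = 0$. If $M = (M_t)_{t\ge 0} \in \mathcal{M}_c^2$, then $M_t^2 - \langle M\rangle_t$ is a martingale. In particular, for every $T > 0$ we have
$$E\left(M_T^2 - \langle M\rangle_T\right) = E\left(M_0^2 - \langle M\rangle_0\right) = 0,$$
and therefore for every $T > 0$
$$E\left(M_T^2\right) = E\langle M\rangle_T . \tag{2.6}$$
If $M$ and $N$ belong to $\mathcal{M}_c^2$, then
$$M_t N_t - \langle M, N\rangle_t$$
is a martingale, where
$$\langle M, N\rangle_t = \frac{1}{4}\left(\langle M + N\rangle_t - \langle M - N\rangle_t\right)$$
is called the bracket process of $M$ and $N$. According to the definition
$$\langle M, N\rangle_t = \lim_{m(D)\to 0} \sum_{l=1}^{k} \left(M_{t_l} - M_{t_{l-1}}\right)\left(N_{t_l} - N_{t_{l-1}}\right) \quad \text{in probability}$$
where the limit is taken over all finite partitions $\{0 = t_0 < t_1 < \cdots < t_k = t\}$ of $[0, t]$.
The space of square-integrable martingales $\mathcal{M}_c^2$ is endowed with the following distance
$$d(M, N) = \sum_{n=1}^{\infty} \frac{1}{2^n}\left\{\sqrt{E|M_n - N_n|^2} \wedge 1\right\} \quad \text{for } M, N \in \mathcal{M}_c^2 .$$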
A sequence of square-integrable martingales $(M(k)_t)_{t\ge 0}$ ($k = 1, 2, \cdots$) converges to $M$ in $\mathcal{M}_c^2$ if and only if for every $T > 0$
$$M(k)_T \to M_T \quad \text{in } L^2(\Omega, \mathcal{F}, P)$$
as $k \to \infty$. According to (2.6) applied to $M(k) - M$, this is also equivalent to the statement that for every $T > 0$
$$\langle M(k) - M\rangle_T \to 0 \quad \text{in } L^1(\Omega, \mathcal{F}, P) .$$
We need the following maximal inequality, which is the martingale version of the Markov inequality in elementary probability.
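Theorem 2.3.1 (Kolmogorov's inequality) Let $M \in \mathcal{M}_c^2$. Then for any $T > 0$ and $\lambda > 0$
$$P\left\{\sup_{0\le t\le T} |M_t| \ge \lambda\right\} \le \frac{1}{\lambda^2}\,E\left(M_T^2\right).$$
Proof. Since $(M_t)_{t\ge 0}$ is continuous,
$$\sup_{0\le t\le T} |M_t| = \sup_{t\in D} |M_t|$$
for any countable dense subset $D$ of $[0, T]$, so that $\sup_{0\le t\le T} |M_t|$ is a random variable. For each $n \in \mathbb{N}$, we may apply the Kolmogorov inequality for martingales in discrete time (see the Appendix) to $\{M_{Tk/2^n}; \mathcal{F}_{Tk/2^n}\}_{k\ge 0}$ to obtain
$$P\left\{\sup_{0\le k\le 2^n} |M_{Tk/2^n}| \ge \lambda\right\} \le \frac{1}{\lambda^2}\,E\left(M_T^2\right).$$
However, since $D = \{Tk/2^n : n \in \mathbb{N},\ 0 \le k \le 2^n\}$ is dense in $[0, T]$, we have
$$\sup_{0\le k\le 2^n} |M_{Tk/2^n}| \uparrow \sup_{0\le t\le T} |M_t|$$
as $n \to \infty$. We therefore have
$$P\left\{\sup_{0\le t\le T} |M_t| \ge \lambda\right\} = \lim_{n\to\infty} P\left\{\sup_{0\le k\le 2^n} |M_{Tk/2^n}| \ge \lambda\right\} \le \frac{1}{\lambda^2}\,E\left(M_T^2\right).$$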

Theorem 2.3.2 $(\mathcal{M}_c^2, d)$ is a complete metric space.
Proof. Let $M(k) \in \mathcal{M}_c^2$ ($k = 1, 2, \cdots$) be a Cauchy sequence in $\mathcal{M}_c^2$. Then for any $T > 0$,
$$E|M(k)_T - M(l)_T|^2 \to 0 \quad \text{as } k, l \to \infty .$$
By Kolmogorov's inequality, for any $\lambda > 0$,
$$P\left\{\sup_{0\le t\le T} |M(k)_t - M(l)_t| \ge \lambda\right\} \le \frac{1}{\lambda^2}\,E|M(k)_T - M(l)_T|^2,$$
so that, for any fixed $T > 0$, $M(k)$ converges uniformly on $[0, T]$ in probability, and therefore there exists a stochastic process $M \equiv (M_t)_{t\ge 0}$ such that
$$\sup_{0\le t\le T} |M(k)_t - M_t| \to 0 \quad \text{in prob.}$$
Obviously $(M_t)_{t\ge 0}$ is a continuous and square-integrable martingale, as the uniform limit of a sequence of continuous martingales.
2.3.2 Stochastic integrals as martingales

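Let $(B_t)_{t\ge 0}$ be a standard Brownian motion in $\mathbb{R}$, and let $(\mathcal{F}_t)_{t\ge 0}$ be the natural filtration generated by $(B_t)_{t\ge 0}$, called the Brownian filtration. Then for any simple (adapted) process $(F_t)_{t\ge 0}$, the Itô integral $I(F)_t$ belongs to $\mathcal{M}_c^2$ and $\langle I(F)\rangle_t = \int_0^t F_s^2\,ds$. Thus, since $F \to I(F)$ is linear, if $F$ and $G$ are both simple processes, then
$$\begin{aligned}
d(I(F), I(G)) &= \sum_{j=1}^{\infty} \frac{1}{2^j}\left\{\sqrt{E|I(F)_j - I(G)_j|^2} \wedge 1\right\} \\
&= \sum_{j=1}^{\infty} \frac{1}{2^j}\left\{\sqrt{E\int_0^j (F_s - G_s)^2\,ds} \wedge 1\right\}.
\end{aligned}$$
Therefore, by Theorem 2.3.2, we have the following.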
Corollary 2.3.3 Let $F = (F_t)_{t\ge 0}$ be an adapted stochastic process which is the limit of simple (adapted) processes in the following sense: there is a sequence of simple processes $F(n) \in \mathcal{L}_0$ such that for every $T > 0$
$$\lim_{n\to\infty} E\int_0^T |F_s - F(n)_s|^2\,ds = 0 . \tag{2.7}$$
The space of all such adapted stochastic processes $(F_t)_{t\ge 0}$ will be denoted by $\mathcal{L}^2$. If $F \in \mathcal{L}^2$ and $F(n) \in \mathcal{L}_0$ are such that (2.7) holds, then $\lim_{n\to\infty} I(F(n))$ exists in $\mathcal{M}_c^2$; it is denoted again by $I(F)$ and called the Itô integral of $F = (F_t)_{t\ge 0}$ along the Brownian motion $(B_t)_{t\ge 0}$.
By definition, if $F = (F_t)_{t\ge 0} \in \mathcal{L}^2$, then as a function on $[0, +\infty) \times \Omega$, $F(t, \omega) = F_t(\omega)$, $F$ is measurable with respect to the product $\sigma$-algebra $\mathcal{B}([0, +\infty)) \otimes \mathcal{F}$, and for every $T > 0$
$$E\int_0^T F_s^2\,ds < +\infty .$$
The linearity of the Itô integral $F \to I(F)$ from $\mathcal{L}^2$ into $\mathcal{M}_c^2$, and the martingale property of both processes $(I(F)_t)_{t\ge 0}$ and
$$I(F)_t^2 - \int_0^t F_s^2\,ds$$
are preserved for any $F \in \mathcal{L}^2$. Thus $\langle I(F)\rangle_t = \int_0^t F_s^2\,ds$.
The stochastic integral $I(F)_t$ for $F \in \mathcal{L}^2$ is also denoted by $\int_0^t F_s\,dB_s$. We may also use $F.B$ to denote the Itô integral $(I(F)_t)_{t\ge 0}$. Thus
$$I(F)_t = (F.B)_t = \int_0^t F_s\,dB_s$$
and
$$\langle I(F)\rangle_t = \langle F.B\rangle_t = \int_0^t F_s^2\,ds .$$
We next describe a class of stochastic processes $F = (F_t)_{t\ge 0}$ in $\mathcal{L}^2$. Let $\mathcal{L}$ denote the vector space of all adapted, left-continuous stochastic processes $F = (F_t)_{t\ge 0}$ such that for all $T > 0$
$$E\int_0^T F_s^2\,ds < +\infty . \tag{2.8}$$
Remark 2.3.4 The condition that $F = (F_t)_{t\ge 0}$ is adapted to the Brownian filtration $(\mathcal{F}_t)_{t\ge 0}$, i.e. each $F_t$ is measurable with respect to $\mathcal{F}_t$, is essential in the definition of Itô's integrals. On the other hand, left-continuity of $t \to F_t$ is a technical condition that can be replaced by some sort of Borel measurability (e.g. right-continuous, continuous, measurable in $(t, \omega)$ etc.). Left-continuity becomes the correct condition if we attempt to define stochastic integrals of $F = (F_t)_{t\ge 0}$ against martingales which may have jumps. The reason is that the left-limit of $F$ at time $t$ happens before time $t$, and if $t \to F_t$ is left-continuous, then for any time $t$, the value $F_t$ can be predicted by the values taken strictly before time $t$:
$$F_t = \lim_{s\uparrow t} F_s .$$
Remark 2.3.5 We should point out that some kind of measurability of the random function $(t, \omega) \to F_t(\omega)$ is necessary in order to ensure that (2.8) makes sense. Note that (2.8) may be written as
$$\int_{\Omega}\int_0^T F_s(\omega)^2\,ds\,P(d\omega) < +\infty,$$
so the natural measurability condition should be that the function
$$F(t, \omega) \equiv F_t(\omega)$$
is measurable with respect to $\mathcal{B}([0, T]) \otimes \mathcal{F}_T$ for any $T > 0$, where $\mathcal{B}([0, T])$ is the Borel $\sigma$-algebra generated by open subsets of $[0, T]$, and $\mathcal{B}([0, T]) \otimes \mathcal{F}_T$ is the product $\sigma$-algebra on $[0, T] \times \Omega$.
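Lemma 2.3.6 Let $F = (F_t)_{t\ge 0}$ be in $\mathcal{L}$. For $n > 0$, let
$$D_n \equiv \{0 = t_0^n < t_1^n < \cdots < t_{n_k}^n = n\}$$
be any sequence of finite partitions of $[0, n]$ such that
$$m(D_n) = \sup_j |t_j^n - t_{j-1}^n| \to 0 \quad \text{as } n \to \infty$$
and let
$$F(n)_t = F_0\,1_{\{0\}}(t) + \sum_{l=1}^{n_k} F_{t_{l-1}^n}\,1_{(t_{l-1}^n, t_l^n]}(t) \quad \text{for } t \ge 0 . \tag{2.9}$$
Then $F(n) \in \mathcal{L}_0$ and for any $T > 0$
$$E\int_0^T |F(n)_s - F_s|^2\,ds \to 0 \quad \text{as } n \to \infty .$$
Therefore $\mathcal{L} \subset \mathcal{L}^2$.
Proof. Since $t \to F_t$ is left-continuous, for all $t \le T$
$$F(n)_t \to F_t .$$
All other conclusions follow easily.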

In particular, if $F = (F_t)_{t\ge 0} \in \mathcal{L}$, then, for every $t \ge 0$,
$$I(F)_t = \lim_{m(D)\to 0} \sum_{l=1}^{n} F_{t_{l-1}}\,(B_{t_l} - B_{t_{l-1}}) \quad \text{in } L^2(\Omega, \mathcal{F}, P)$$
where the limit is taken over all finite partitions $D$ of the interval $[0, t]$.
One thing to keep in mind is that the class $\mathcal{L}^2$ of integrands $F = (F_t)_{t\ge 0}$ which we may integrate against Brownian motion is large enough to include many interesting stochastic processes. On the other hand, there is a serious restriction on these processes: we require our integrands to be adapted to the Brownian filtration, that is, for each $t \ge 0$, $F_t$ is measurable with respect to $\mathcal{F}_t$.
Let us give some examples.
If $X = (X_t)_{t\ge 0}$ is a continuous stochastic process which is adapted to $(\mathcal{F}_t)_{t\ge 0}$, if $f$ is a Borel function, and if for every $T > 0$
$$E\int_0^T f(X_t)^2\,dt < \infty,$$
then the stochastic process $(f(X_t))_{t\ge 0}$ belongs to $\mathcal{L}^2$. In particular, for any Borel measurable function $f$ such that
$$E\int_0^T f(B_t)^2\,dt < \infty, \tag{2.10}$$
$(f(B_t))_{t\ge 0}$ is in $\mathcal{L}^2$. What does condition (2.10) mean? Note that
$$E\int_0^T f(B_t)^2\,dt = \int_0^T E f(B_t)^2\,dt = \int_0^T P_t(f^2)(0)\,dt$$
where
$$\begin{aligned}
P_t(f^2)(0) &= \frac{1}{(2\pi t)^{d/2}}\int_{\mathbb{R}^d} f(x)^2\,e^{-|x|^2/2t}\,dx \\
&= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} f(\sqrt{t}\,x)^2\,e^{-|x|^2/2}\,dx .
\end{aligned}$$
Therefore, if $f$ is a polynomial, then $f(B_t)$ is in $\mathcal{L}^2$, and for any constant $\alpha$ the process $(e^{\alpha B_t})_{t\ge 0}$ belongs to $\mathcal{L}^2$ as well. How about the stochastic process $F_t = e^{\alpha B_t^2}$? In this case
$$E\int_0^T F_t^2\,dt = \frac{1}{(2\pi)^{d/2}}\int_0^T\int_{\mathbb{R}^d} e^{2\alpha t|x|^2}\,e^{-|x|^2/2}\,dx\,dt$$
and therefore
$$E\int_0^T F_t^2\,dt < \infty \quad \text{if } \alpha \le 0 .$$
In the case $\alpha > 0$,
$$E\int_0^T F_t^2\,dt < \infty \quad \text{iff } T < \frac{1}{4\alpha} .$$
If $F \in \mathcal{L}^2$, then $I(F)$ is called the Itô integral of $F = (F_t)_{t\ge 0}$ against the Brownian motion $B = (B_t)_{t\ge 0}$, and we will denote it by
$$\int_0^t F_s\,dB_s ; \quad t \ge 0 .$$
2.3.3 Summary of main properties

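If $F = (F_t)_{t\ge 0} \in \mathcal{L}^2$, then both
$$\int_0^t F_s\,dB_s \quad \text{and} \quad \left(\int_0^t F_s\,dB_s\right)^2 - \int_0^t F_s^2\,ds$$
are continuous martingales with initial value zero, and therefore $\langle F.B\rangle_t = \int_0^t F_s^2\,ds$. In general
$$\langle F.B, G.B\rangle_t = \int_0^t F_s G_s\,ds$$
if $F$ and $G$ belong to $\mathcal{L}^2$.
For any $T \ge 0$
$$E\left(\int_0^T F_s\,dB_s\right)^2 = E\int_0^T F_s^2\,ds,$$
and for any $t \ge s$,
$$E\left\{\left(\int_s^t F_u\,dB_u\right)^2 \,\Big|\, \mathcal{F}_s\right\} = E\left\{\int_s^t F_u^2\,du \,\Big|\, \mathcal{F}_s\right\}.$$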
2.4 Stochastic integrals along martingales

We may apply the same procedure used to define Itô's integrals along Brownian motion to any continuous, square-integrable martingale. Indeed, if $M \in \mathcal{M}_c^2$ and if $F = (F_t)_{t\ge 0}$ is a bounded, adapted, simple process
$$F_t = f_0\,1_{\{0\}}(t) + \sum_i f_i\,1_{(t_i, t_{i+1}]}(t),$$
then define
$$I^M(F)_t = \sum_{i=0}^{\infty} f_i\,(M_{t\wedge t_{i+1}} - M_{t\wedge t_i}) .$$
By the same arguments as before, we have
1. $I^M(F) \in \mathcal{M}_c^2$.
2. For any $N \in \mathcal{M}_c^2$, the stochastic process
$$\left(I^M(F)_t N_t - \int_0^t F_s\,d\langle M, N\rangle_s\right)_{t\ge 0}$$
is a martingale.
3. For any $T > 0$,
$$E\left(\int_0^T F_t\,dM_t\right)^2 = E\int_0^T F_t^2\,d\langle M\rangle_t .$$
Definition 2.4.1 An adapted stochastic process $F = (F_t)_{t\ge 0}$ belongs to $\mathcal{L}^2(M)$ if there is a sequence $\{F(n)\}$ of adapted, simple and bounded stochastic processes such that for any $T > 0$
$$E\int_0^T F(n)_t^2\,d\langle M\rangle_t < \infty$$
and
$$E\int_0^T |F(n)_t - F_t|^2\,d\langle M\rangle_t \to 0 \quad \text{as } n \to \infty .$$
If $F \in \mathcal{L}^2(M)$, then we define
$$I^M(F) = \lim_{n\to\infty} I^M(F(n)) \quad \text{in } \mathcal{M}_c^2 .$$
We use either $F.M$ or $\int_0^t F_s\,dM_s$ to denote $I^M(F)$.
If $M = (M_t)_{t\ge 0}$ is a continuous, square-integrable martingale, and $F = (F_t)_{t\ge 0}$ belongs to $\mathcal{L}^2(M)$, then both the stochastic integral
$$\int_0^t F_s\,dM_s$$
and
$$\left(\int_0^t F_s\,dM_s\right)^2 - \int_0^t F_s^2\,d\langle M\rangle_s$$
are martingales.
If $M$ and $N$ are two continuous, square-integrable martingales, and if $F \in \mathcal{L}^2(M)$ and $G \in \mathcal{L}^2(N)$, then
$$\langle F.M, G.N\rangle_t = \int_0^t F_s G_s\,d\langle M, N\rangle_s .$$
Finally we have $F.(G.M) = (FG).M$ as long as these stochastic integrals make sense, that is,
$$\int_0^t F_s\,d\left(\int_0^{\cdot} G_u\,dM_u\right)_s = \int_0^t F_s G_s\,dM_s .$$
2.5 Stopping times, local martingales

Itô's theory of stochastic integration can be further extended to an even larger class of stochastic processes, by a so-called localization technique. The technique involves a very important concept in the modern theory of probability, called stopping times.
Let $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ be a filtered probability space.
Definition 2.5.1 A random variable $T : \Omega \to [0, +\infty]$ (note that the value $+\infty$ is allowed) is called a stopping time (a random time) if for each $t \ge 0$ the event
$$\{\omega : T(\omega) \le t\} \in \mathcal{F}_t .$$
Given a stopping time $T$, we may define a $\sigma$-algebra
$$\mathcal{F}_T = \{A \in \mathcal{F} : A \cap \{T \le t\} \in \mathcal{F}_t \text{ for all } t \ge 0\}$$
which represents the information available up to the random time $T$. For technical reasons, we will require the following conditions to be satisfied, unless otherwise specified.
1. $(\Omega, \mathcal{F}, P)$ is a complete probability space.
2. The filtration $(\mathcal{F}_t)_{t\ge 0}$ is right-continuous, that is, for each $t \ge 0$
$$\mathcal{F}_t = \mathcal{F}_{t+} \equiv \bigcap_{s>t} \mathcal{F}_s .$$
3. Each $\mathcal{F}_t$ contains all null sets in $\mathcal{F}$.
In this case, we say the filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ satisfies the usual conditions.
For example, if $B = (B_t)_{t\ge 0}$ is a BM in $\mathbb{R}^d$, then its natural filtration $(\mathcal{F}_t)_{t\ge 0}$ satisfies conditions 1-3, and as a matter of fact, $(\mathcal{F}_t)_{t\ge 0}$ is continuous: not only do we have $\mathcal{F}_t = \mathcal{F}_{t+}$ but also
$$\mathcal{F}_t = \mathcal{F}_{t-} \equiv \sigma\{\mathcal{F}_s : s < t\} \quad \text{for all } t > 0 .$$
Remark 2.5.2 If $X = (X_t)_{t\ge 0}$ is a right-continuous stochastic process on a complete probability space $(\Omega, \mathcal{F}, P)$, then its natural filtration $(\mathcal{F}_t)_{t\ge 0}$ satisfies the usual conditions.
The following is a result we will not prove in this course.
Theorem 2.5.3 If $X = (X_t)_{t\ge 0}$ is a right-continuous stochastic process adapted to $(\mathcal{F}_t)_{t\ge 0}$ (recall that our filtration $(\mathcal{F}_t)_{t\ge 0}$ satisfies the usual conditions), and if $T : \Omega \to [0, +\infty]$ is a stopping time, then the random variable $X_T 1_{\{T<\infty\}}$ is measurable with respect to the $\sigma$-algebra $\mathcal{F}_T$, where
$$X_T 1_{\{T<\infty\}}(\omega) = X_{T(\omega)}(\omega)\,1_{\{\omega : T(\omega)<\infty\}}(\omega) = \begin{cases} X_{T(\omega)}(\omega) & \text{if } T(\omega) < +\infty, \\ 0 & \text{if } T(\omega) = +\infty . \end{cases}$$
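Remark 2.5.4 If $X = (X_n)_{n\in\mathbb{Z}_+}$ and $T : \Omega \to \mathbb{Z}_+ \cup \{+\infty\}$, then
$$X_T 1_{\{T<\infty\}} = \sum_{n\in\mathbb{Z}_+} X_n 1_{\{T=n\}} = \sum_{n=0}^{\infty} X_n 1_{\{T=n\}} .$$
Therefore, if $X$ is adapted to $\{\mathcal{F}_n\}_{n\in\mathbb{Z}_+}$ and $T$ is a stopping time, then for any $n \in \mathbb{Z}_+$,
$$X_T 1_{\{T<\infty\}}\,1_{\{T\le n\}} = \sum_{k=0}^{n} X_k 1_{\{T=k\}}$$
which is measurable with respect to $\mathcal{F}_n$; thus by definition $X_T 1_{\{T<\infty\}}$ is $\mathcal{F}_T$-measurable.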
The following theorem provides us with a class of interesting stopping times.
Theorem 2.5.5 Let $X = (X_t)_{t\ge 0}$ be an $\mathbb{R}^d$-valued, adapted stochastic process that is right-continuous and has left-limits. Then for any Borel subset $D \subset \mathbb{R}^d$ and $t_0 \ge 0$
$$T = \inf\{t \ge t_0 : X_t \in D\}$$
is a stopping time, where $\inf \emptyset = +\infty$. $T$ is called the hitting time of $D$ by the process $X$.
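Remark 2.5.6 Let us look at the discrete-time case. Let $X = (X_n)_{n\in\mathbb{Z}_+}$ be adapted to $\{\mathcal{F}_n\}_{n\in\mathbb{Z}_+}$, taking values in $\mathbb{R}^d$. Then for a Borel subset $D \subset \mathbb{R}^d$ and $k \in \mathbb{Z}_+$
$$T = \inf\{n \ge k : X_n \in D\}$$
is a stopping time. Indeed, if $n \le k - 1$ then $\{T = n\} = \emptyset$, and for $n \ge k$ we have
$$\{T = n\} = \bigcap_{j=k}^{n-1} \{X_j \in D^c\} \cap \{X_n \in D\}$$
which belongs to $\mathcal{F}_n$.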
Example 2.5.7 If $X = (X_t)_{t\ge 0}$ is an adapted, continuous process on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and if $D \subset \mathbb{R}^d$ is a bounded closed subset of $\mathbb{R}^d$, then
$$T = \inf\{t \ge 0 : X_t \in D\}$$
is a stopping time. If $X_0 \in D^c$, then
$$X_T 1_{\{T<+\infty\}} \in \partial D .$$
In particular, if $d = 1$ and $b$ is a real number, then
$$T_b = \inf\{t \ge 0 : X_t = b\}$$
is a stopping time. In this case $\sup_{t\in[0,N]} X_t$ is a random variable,
$$\left\{\sup_{t\in[0,N]} X_t < b\right\} = \{T_b > N\}$$
and
$$\left\{\sup_{t\in[0,N]} X_t \ge b\right\} = \{T_b \le N\} .$$
2.5.1 Technique of localization

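The concept of stopping times provides us with a means of localizing quantities. Suppose $(X_t)_{t\ge 0}$ is a stochastic process and $T$ is a stopping time; then $X^T = (X_{t\wedge T})_{t\ge 0}$ is the stochastic process stopped at the (random) time $T$, where
$$X_t^T(\omega) = \begin{cases} X_t(\omega) & \text{if } t \le T(\omega) ; \\ X_{T(\omega)}(\omega) & \text{if } t \ge T(\omega) . \end{cases}$$
Another interesting stopped process at the random time $T$ associated with $X$ is the process $X 1_{[0,T]}$, which is by definition
$$\left(X 1_{[0,T]}\right)_t(\omega) = X_t 1_{\{t\le T\}}(\omega) = \begin{cases} X_t(\omega) & \text{if } t \le T(\omega) ; \\ 0 & \text{if } t > T(\omega) . \end{cases}$$
It is obvious that
$$X_t^T = X_t 1_{\{t\le T\}} + X_T 1_{\{t>T\}} .$$
If $(X_t)_{t\ge 0}$ is adapted to the filtration $(\mathcal{F}_t)_{t\ge 0}$, so are the process $(X_{t\wedge T})_{t\ge 0}$ stopped at the stopping time $T$ and the process $X_t 1_{\{t\le T\}}$.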
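In discrete time these constructions reduce to a few lines. The sketch below assumes a path observed at integer times over a finite horizon and stored as a list (a hand-picked walk path here); the function names are hypothetical, and `None` stands for the value $+\infty$ of a hitting time that does not occur within the observed horizon.

```python
def hitting_time(path, target_set, start=0):
    """T = inf{n >= start : path[n] in target_set}; None stands for +infinity."""
    for n in range(start, len(path)):
        if path[n] in target_set:
            return n
    return None

def stopped_path(path, T):
    """X^T_n = X_{min(n, T)}: follow the path up to T, then freeze it."""
    if T is None:
        return list(path)
    return [path[min(n, T)] for n in range(len(path))]

def killed_path(path, T):
    """(X 1_{[0,T]})_n = X_n if n <= T, and 0 afterwards."""
    if T is None:
        return list(path)
    return [x if n <= T else 0 for n, x in enumerate(path)]

walk = [0, 1, 0, 1, 2, 3, 2, 1]        # one hand-picked walk path
T = hitting_time(walk, {2})            # first time the walk reaches level 2
stopped = stopped_path(walk, T)
killed = killed_path(walk, T)
print(T, stopped, killed)
```

Deciding whether $T = n$ only inspects the path up to time $n$, which is the stopping-time property; the stopped and killed paths agree up to $T$ and differ afterwards by the frozen value $X_T$.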
Definition 2.5.8 An adapted stochastic process $X = (X_t)_{t\ge 0}$ on the filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ is called a local martingale if there is an increasing family $\{T_n\}$ of finite stopping times such that
$$T_n \uparrow +\infty \quad \text{as } n \to +\infty$$
and such that for each $n$, $(X_{t\wedge T_n})_{t\ge 0}$ is a martingale.
Similarly, we may define local super- or sub-martingales etc.

Itô integration can be extended to local martingales. Let me briefly describe the idea. Suppose $M = (M_t)_{t\ge 0}$ is a continuous local martingale with initial value zero; then we may choose a sequence $\{T_n\}$ of stopping times such that $T_n \uparrow \infty$ a.s. and for each $n$, $M^{T_n} = (M_{t\wedge T_n})_{t\ge 0}$ is a continuous, square-integrable martingale with initial value zero. In this case we may define
$$\langle M\rangle_t = \langle M^{T_n}\rangle_t \quad \text{if } t \le T_n,$$
which is an adapted, continuous, increasing process with initial value zero such that
$$M_t^2 - \langle M\rangle_t$$
is a local martingale.
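Let $F = (F_t)_{t\ge 0}$ be a left-continuous, adapted process such that for each
$T > 0$
\[ \int_0^T F_s^2 \, d\langle M\rangle_s < \infty \quad \text{a.s.} \tag{2.11} \]
and define
\[ S_n = \inf\Big\{ t \ge 0 : \int_0^t F_s^2 \, d\langle M\rangle_s \ge n \Big\} \wedge n \]
which is a sequence of stopping times. The condition (2.11) ensures that
$S_n \uparrow \infty$. Let $\tilde T_n = T_n \wedge S_n$. Then $\tilde T_n \uparrow \infty$ almost surely and, for each $n$, $M^{\tilde T_n} \in \mathcal{M}_c^2$. Let
\[ F(n)_t = F_t 1_{\{t \le \tilde T_n\}} . \]
Then
\[ \int_0^\infty F(n)_s^2 \, d\langle M\rangle_s = \int_0^{\tilde T_n} F_s^2 \, d\langle M\rangle_s \le n \]
so that $F(n) \in L^2(M^{\tilde T_n})$. We may define
\[ (F.M)_t = \int_0^t F(n)_s \, dM_s^{\tilde T_n} \quad \text{if } t \le \tilde T_n \]
for $n = 1, 2, 3, \dots$, which is called the Itô integral of $F$ with respect to the local
martingale $M$. We can show that $F.M$ does not depend on the choice of
stopping times $T_n$. By definition, both $F.M$ and
\[ (F.M)_t^2 - \int_0^t F_s^2 \, d\langle M\rangle_s \]
are continuous local martingales with initial value zero.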
2.5.2 Integration theory for semimartingales

Finally let us extend the theory of stochastic integrals to the most useful
class of (continuous) semimartingales. An adapted, continuous stochastic
process $X = (X_t)_{t\ge 0}$ is a semimartingale if $X$ possesses a decomposition
\[ X_t = M_t + V_t \]
where $(M_t)_{t\ge 0}$ is a continuous local martingale, and $(V_t)_{t\ge 0}$ is a stochastic
process with finite variation on any finite interval.
If $f(t)$ is a function on $[0, T]$ having finite variation:
\[ \sup_D \sum_l |f(t_l) - f(t_{l-1})| < +\infty \]
where $D$ runs over all finite partitions of $[0, t]$ (for any fixed $t$), then
\[ \int_0^t g(s) \, df(s) \]
is understood as the Lebesgue-Stieltjes integral. If in addition $s \to f(s)$ is
continuous, then
\[ \int_0^t g(s) \, df(s) = \lim_{m(D)\to 0} \sum_l g(t_{l-1}) \big(f(t_l) - f(t_{l-1})\big) . \]
Therefore, if $V = (V_t)_{t\ge 0}$ is a continuous stochastic process with finite variation, then
\[ \int_0^t F_s \, dV_s \]
is a stochastic process defined path-wise as the Lebesgue-Stieltjes integral
\[ \Big( \int_0^t F_s \, dV_s \Big)(\omega) \equiv \int_0^t F_s(\omega) \, dV_s(\omega)
 = \lim_{m(D)\to 0} \sum_l F_{t_{l-1}}(\omega) \big(V_{t_l}(\omega) - V_{t_{l-1}}(\omega)\big) . \]
The definition of stochastic integrals may be extended to any continuous
semimartingale in an obvious way, namely
\[ \int_0^t F_s \, dX_s = \int_0^t F_s \, dM_s + \int_0^t F_s \, dV_s \]
where the first term on the right-hand side is the Itô integral with respect
to the local martingale $M$, defined in the probability sense, which is again a local
martingale, and the second term is the usual Lebesgue-Stieltjes integral, which is
defined path-wise. Moreover
\[ \int_0^t F_s \, dX_s = \lim_{m(D)\to 0} \sum_l F_{t_{l-1}} \big(X_{t_l} - X_{t_{l-1}}\big) \quad \text{in probability.} \]
2.6 Itô's formula

Itô's formula (also called Itô's lemma) is the fundamental theorem in stochastic
calculus. Let us begin with a special case of Itô's formula: integration
by parts for stochastic integrals. Recall Abel's summation formula
\[ a_n b_n = a_0 b_0 + \sum_{i=0}^{n-1} (a_{i+1} b_{i+1} - a_i b_i)
 = a_0 b_0 + \sum_{i=0}^{n-1} b_{i+1} (a_{i+1} - a_i) + \sum_{i=0}^{n-1} a_i (b_{i+1} - b_i) \]
which yields immediately an integration by parts formula in the context of
Lebesgue-Stieltjes integration. Namely, if $a$ and $b$ are two continuous and
increasing functions, then
\[ a(t) b(t) - a(0) b(0) = \int_0^t a(s) \, db(s) + \int_0^t b(s) \, da(s) . \tag{2.12} \]
Applying this integration by parts (path-wise) to continuous processes
$(A_t)_{t\ge 0}$ and $(B_t)_{t\ge 0}$ which have finite variations, we obtain an integration by
parts formula for finite variation processes
\[ A_t B_t = A_0 B_0 + \int_0^t A_s \, dB_s + \int_0^t B_s \, dA_s . \]
The story for local martingales is different, basically due to the fact
that, if $(M_t)_{t\ge 0}$ is a continuous local martingale, then
\[ \lim_{m(D)\to 0} \sum_l \big| M_{t_l} - M_{t_{l-1}} \big|^2 = \langle M\rangle_t \quad \text{in prob.} \]
where $D = \{0 = t_0 < t_1 < \cdots < t_r = t\}$ and $m(D) = \max_l (t_l - t_{l-1})$. Indeed, in
terms of Itô's integration, this fact can be restated as the following
Lemma 2.6.1 If $M$, $N$ are two continuous local martingales, then we have
the integration by parts
\[ M_t N_t = M_0 N_0 + \int_0^t N_s \, dM_s + \int_0^t M_s \, dN_s + \langle M, N\rangle_t . \]
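Proof. If $D = \{0 = t_0 < t_1 < \cdots < t_k = t\}$ is a finite partition, then
\[ \sum_i (M_{t_{i+1}} - M_{t_i})^2 = M_t^2 - M_0^2 - 2 \sum_i M_{t_i} (M_{t_{i+1}} - M_{t_i}) \]
so that, letting $m(D) \to 0$,
\[ \langle M\rangle_t = M_t^2 - M_0^2 - 2 \int_0^t M_s \, dM_s . \]
Applying this equation to $M + N$ and $M - N$, and using
$\langle M, N\rangle = \frac{1}{4}\big(\langle M+N\rangle - \langle M-N\rangle\big)$, we obtain the integration by
parts formula.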
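For Brownian motion, where $\langle B\rangle_t = t$, both the convergence of the sums of squared increments and the algebraic identity used in the proof can be checked numerically. The sketch below is only an illustration on one simulated path; the function name, the grid size and the seed are arbitrary assumptions.

```python
import math
import random


def brownian_sums(n: int = 100_000, t: float = 1.0, seed: int = 0):
    """Simulate B on a uniform grid of [0, t] and return
    (B_t, sum of squared increments, left-point Ito sum of B dB)."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n)       # standard deviation of one increment
    b = 0.0                     # current value B_{t_i}
    sum_sq = 0.0                # sum (B_{t_{i+1}} - B_{t_i})^2
    ito_sum = 0.0               # sum B_{t_i} (B_{t_{i+1}} - B_{t_i})
    for _ in range(n):
        inc = rng.gauss(0.0, sd)
        ito_sum += b * inc
        sum_sq += inc * inc
        b += inc
    return b, sum_sq, ito_sum


b_t, sum_sq, ito_sum = brownian_sums()
# sum_sq should be close to <B>_1 = 1, and B_t^2 - 2 * ito_sum equals sum_sq
# up to floating point error (here B_0 = 0).
```

The identity `b_t**2 - 2 * ito_sum == sum_sq` holds for every partition, while `sum_sq` being close to $t$ is the probabilistic statement and only holds approximately on a finite grid.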
Corollary 2.6.2 (Integration by parts) Let $X = M + A$ and $Y = N + B$ be
two continuous semimartingales: $M$ and $N$ are continuous local martingales,
and $A$, $B$ are continuous, adapted processes with finite variations. Then
\[ X_t Y_t - X_0 Y_0 = \int_0^t X_s \, dY_s + \int_0^t Y_s \, dX_s + \langle M, N\rangle_t . \]
In particular
\[ X_t^2 - X_0^2 = 2 \int_0^t X_s \, dX_s + \langle M\rangle_t . \]
We next state Itô's formula.
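Theorem 2.6.3 (Itô's formula) Let $X_t = (X_t^1, \dots, X_t^d)$ be a continuous
semimartingale in $\mathbb{R}^d$ with decompositions $X_t^i = M_t^i + A_t^i$: $M_t^1, \dots, M_t^d$ are
continuous local martingales, and $A_t^1, \dots, A_t^d$ are continuous, locally integrable,
adapted processes with finite variations. Let $f \in C^2(\mathbb{R}^d, \mathbb{R})$. Then
\[ f(X_t) - f(X_0) = \sum_{i=1}^d \int_0^t \frac{\partial f}{\partial x_i}(X_s) \, dX_s^i
 + \frac{1}{2} \sum_{i,j=1}^d \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \, d\langle M^i, M^j\rangle_s . \tag{2.13} \]
The Itô formula may be written in a vector form
\[ f(X_t) - f(X_0) = \int_0^t \nabla f(X_s) . dX_s
 + \frac{1}{2} \sum_{i,j=1}^d \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \, d\langle M^i, M^j\rangle_s \]
where $\nabla f = \big( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_d} \big)$ is the gradient vector field of $f$. As a consequence,
$f(X_t) - f(X_0)$ is again a continuous semimartingale, with martingale
part
\[ \int_0^t \nabla f(X_s) . dM_s = \sum_{i=1}^d \int_0^t \frac{\partial f}{\partial x_i}(X_s) \, dM_s^i \]
and
\[ \langle f(X_\cdot)\rangle_t = \int_0^t \sum_{i,j=1}^d \frac{\partial f}{\partial x_i}(X_s) \frac{\partial f}{\partial x_j}(X_s) \, d\langle M^i, M^j\rangle_s . \]
If $f$ and $g$ belong to $C^2(\mathbb{R}^d, \mathbb{R})$, then
\[ \langle f(X_\cdot), g(X_\cdot)\rangle_t = \int_0^t \sum_{i,j=1}^d \frac{\partial f}{\partial x_i}(X_s) \frac{\partial g}{\partial x_j}(X_s) \, d\langle M^i, M^j\rangle_s . \]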
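2.6.1 Itô's formula for BM

If $B = (B_t^1, \dots, B_t^d)_{t\ge 0}$ is Brownian motion in $\mathbb{R}^d$, then, for $f \in C^2(\mathbb{R}^d, \mathbb{R})$,
\[ f(B_t) - f(B_0) = \int_0^t \nabla f(B_s) . dB_s + \int_0^t \frac{1}{2} \Delta f(B_s) \, ds . \]
Let
\[ M_t^f = f(B_t) - f(B_0) - \int_0^t \frac{1}{2} \Delta f(B_s) \, ds . \]
Then $M^f$ is a local martingale for every $f \in C^2(\mathbb{R}^d, \mathbb{R})$ and
\[ \langle M^f, M^g\rangle_t = \int_0^t \langle \nabla f, \nabla g\rangle (B_s) \, ds . \]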
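As a numerical sanity check of Itô's formula for one-dimensional Brownian motion, one may take $f(x) = x^3$, so that $f'(x) = 3x^2$ and $\frac{1}{2} f''(x) = 3x$, and compare $f(B_t) - f(B_0)$ with the discretized integrals. The sketch below is only an illustration on one simulated path: the function name, grid size and seed are assumptions, and the residual is small but not zero on a finite grid.

```python
import math
import random


def ito_residual_cubic(n: int = 100_000, t: float = 1.0, seed: int = 1) -> float:
    """Return f(B_t) - f(B_0) - (sum f'(B) dB + sum (1/2) f''(B) dt) for f(x) = x^3."""
    rng = random.Random(seed)
    dt = t / n
    sd = math.sqrt(dt)
    b = 0.0            # B_0 = 0
    stochastic = 0.0   # left-point sum approximating int 3 B_s^2 dB_s
    drift = 0.0        # Riemann sum approximating int 3 B_s ds
    for _ in range(n):
        db = rng.gauss(0.0, sd)
        stochastic += 3.0 * b * b * db
        drift += 3.0 * b * dt
        b += db
    return b ** 3 - (stochastic + drift)


residual = ito_residual_cubic()
```

Dropping the `drift` term would leave a residual of order one, which is the numerical signature of the second-order correction in Itô's formula.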
2.6.2 Proof of Itô's formula

Let us prove the Itô formula for the one-dimensional case. For simplicity let us
just do it for a continuous, square-integrable martingale $M = (M_t)_{t\ge 0}$. In
this case we need to show
\[ f(M_t) - f(M_0) = \int_0^t f'(M_s) \, dM_s + \frac{1}{2} \int_0^t f''(M_s) \, d\langle M\rangle_s . \tag{2.14} \]
The formula is true for $f(x) = x^2$ ($f'(x) = 2x$ and $f''(x) = 2$) as we have
seen
\[ M_t^2 - M_0^2 = 2 \int_0^t M_s \, dM_s + \langle M\rangle_t . \]
Suppose the formula is true for $f(x) = x^n$:
\[ M_t^n - M_0^n = n \int_0^t M_s^{n-1} \, dM_s + \frac{n(n-1)}{2} \int_0^t M_s^{n-2} \, d\langle M\rangle_s . \]
Applying the integration by parts formula to $M^n$ and $M$ we obtain
\[ \begin{aligned}
M_t^{n+1} - M_0^{n+1} &= \int_0^t M_s^n \, dM_s + \int_0^t M_s \, dM_s^n + \langle M, M^n\rangle_t \\
 &= \int_0^t M_s^n \, dM_s + \int_0^t M_s \, d\Big( n \int_0^s M_r^{n-1} \, dM_r + \frac{n(n-1)}{2} \int_0^s M_r^{n-2} \, d\langle M\rangle_r \Big) \\
 &\quad + \int_0^t n M_s^{n-1} \, d\langle M\rangle_s \\
 &= (n+1) \int_0^t M_s^n \, dM_s + \frac{(n+1)n}{2} \int_0^t M_s^{n-1} \, d\langle M\rangle_s
\end{aligned} \]
which implies (2.14) for the power function $x^{n+1}$. Therefore Itô's formula
is true for any polynomial, and hence for any $C^2$ function $f$, by approximation via Taylor's
expansion.
2.7 Selected applications of Itô's formula

In this section, we present several applications of Itô's lemma.
2.7.1 Lévy's characterization of Brownian motion

Our first application is Lévy's martingale characterization of Brownian motion.
Let $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ be a filtered probability space satisfying the usual
condition.
Theorem 2.7.1 Let $M_t = (M_t^1, \dots, M_t^d)$ be an adapted, continuous stochastic
process on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ taking values in $\mathbb{R}^d$ with initial value zero. Then
$(M_t)_{t\ge 0}$ is a Brownian motion if and only if
1. Each $M_t^i$ is a square integrable, continuous martingale.
2. For any $i$ and $j$, the process $M_t^i M_t^j - \delta_{ij} t$ is a martingale, that is,
$\langle M^i, M^j\rangle_t = \delta_{ij} t$.
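Proof. We only need to prove the sufficient part. For any $\xi = (\xi_i) \in \mathbb{R}^d$,
we first show that
\[ Z_t = \exp\Big( \sqrt{-1} \langle \xi, M_t\rangle + \frac{|\xi|^2}{2} t \Big)
 = \exp\Big( \sqrt{-1} \sum_{i=1}^d \xi_i M_t^i + \frac{|\xi|^2}{2} t \Big) \]
is a martingale. To this end, we apply Itô's formula to $f(x) = e^x$ (in this
case $f' = f'' = f$) and the semimartingale
\[ X_t = \sqrt{-1} \sum_{i=1}^d \xi_i M_t^i + \frac{|\xi|^2}{2} t , \]
and obtain
\[ \begin{aligned}
Z_t &= Z_0 + \int_0^t Z_s \, d\Big( \sqrt{-1} \sum_{i=1}^d \xi_i M_s^i + \frac{|\xi|^2}{2} s \Big)
 + \frac{1}{2} \int_0^t Z_s \, d\Big\langle \sqrt{-1} \sum_{i=1}^d \xi_i M^i \Big\rangle_s \\
 &= 1 + \sqrt{-1} \sum_{i=1}^d \xi_i \int_0^t Z_s \, dM_s^i + \frac{|\xi|^2}{2} \int_0^t Z_s \, ds
 - \frac{1}{2} \int_0^t \sum_{i,j=1}^d \xi_i \xi_j Z_s \, d\langle M^i, M^j\rangle_s \\
 &= 1 + \sqrt{-1} \sum_{i=1}^d \xi_i \int_0^t Z_s \, dM_s^i ;
\end{aligned} \]
the last equality follows from the assumption that $\langle M^i, M^j\rangle_s = \delta_{ij} s$, so that
\[ \frac{1}{2} \int_0^t \sum_{i,j=1}^d \xi_i \xi_j Z_s \, d\langle M^i, M^j\rangle_s = \frac{1}{2} |\xi|^2 \int_0^t Z_s \, ds . \]
Moreover, since $|Z_s| = e^{|\xi|^2 s/2}$, for any $T > 0$
\[ E \int_0^T |Z_s|^2 \, ds = \int_0^T e^{|\xi|^2 s} \, ds < +\infty \]
and therefore $(Z_t) \in L^2(M^i)$ for $i = 1, \dots, d$, as $\langle M^i\rangle_t = t$. It follows thus that
\[ \int_0^t Z_s \, dM_s^i \in \mathcal{M}_c^2 . \]
That is, $Z$ is a continuous, square-integrable martingale with initial value
1, so that $E(Z_t | \mathcal{F}_s) = Z_s$, i.e.
\[ E\Big( e^{\sqrt{-1} \langle \xi, M_t - M_s\rangle} \Big| \mathcal{F}_s \Big) = \exp\Big( -\frac{|\xi|^2}{2} (t - s) \Big) . \]
We may then conclude that $M_t - M_s$ and $\mathcal{F}_s$ are independent. After taking
expectation of both sides we obtain
\[ E\, e^{\sqrt{-1} \langle \xi, M_t - M_s\rangle} = \exp\Big( -\frac{|\xi|^2}{2} (t - s) \Big) , \]
that is, $\langle \xi, M_t - M_s\rangle$ has a normal distribution with mean zero and variance
$|\xi|^2 (t - s)$. Therefore $(M_t)_{t\ge 0}$ is a Brownian motion in $\mathbb{R}^d$.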
2.7.2 Time-changes of Brownian motion

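Theorem 2.7.2 (Dambis, Dubins and Schwarz) Let $M = (M_t)_{t\ge 0}$ be a
continuous local martingale on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with initial value zero satisfying
$\langle M\rangle_\infty = \infty$, and let
\[ T_t = \inf\{s : \langle M\rangle_s > t\} . \]
Then $T_t$ is a stopping time for each $t \ge 0$, $B_t = M_{T_t}$ is an $(\mathcal{F}_{T_t})$-Brownian
motion, and $M_t = B_{\langle M\rangle_t}$.
Proof. The family $T = (T_t)_{t\ge 0}$ is called a time-change, because each $T_t$
is a stopping time (exercise), and obviously $t \to T_t$ is increasing (another
exercise). Each $T_t$ is finite $P$-a.e. because $\langle M\rangle_\infty = \infty$ $P$-a.e. (exercise). By
continuity of $\langle M\rangle_t$
\[ \langle M\rangle_{T_t} = t , \quad P\text{-a.s.} \]
Applying Doob's stopping theorem to the square integrable martingale
$(M_{s\wedge T_t})_{s\ge 0}$ and stopping times $T_t \ge T_s$ ($t \ge s$), we obtain that
\[ E(M_{T_t} | \mathcal{F}_{T_s}) = M_{T_s} \]
i.e. $B_t$ is an $(\mathcal{F}_{T_t})$-local martingale. By the same argument applied to the
martingale $(M_{s\wedge T_t}^2 - \langle M\rangle_{s\wedge T_t})_{s\ge 0}$ we have
\[ E\big( M_{T_t}^2 - \langle M\rangle_{T_t} \big| \mathcal{F}_{T_s} \big) = M_{T_s}^2 - \langle M\rangle_{T_s} . \]
Hence $(B_t^2 - t)$ is an $(\mathcal{F}_{T_t})$-local martingale. We can prove that $t \to B_t$ is
continuous, so that, by Lévy's characterization (Theorem 2.7.1), $B = (B_t)_{t\ge 0}$ is an $(\mathcal{F}_{T_t})$-Brownian motion.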
2.7.3 Stochastic exponentials

Consider the differential equation that defines the exponential of a constant
square matrix $A$
\[ \frac{df(t)}{dt} = A f(t) ; \quad f(0) = I . \]
Its solution is given by
\[ f(t) = \exp(tA) = \sum_{n=0}^\infty \frac{A^n}{n!} t^n . \]
The differential equation may be written as an integral equation, namely
\[ f(t) = I + \int_0^t f(s) \, d(As) \]
where I deliberately write $A \, ds$ as $d(As)$ in order to emphasize that $f(t)$ is
really the exponential of $At$. It is thus very natural to define the stochastic
exponential of a semimartingale $X_t = M_t + A_t$ (where $M$ is a continuous
local martingale, $A$ is an adapted continuous process with finite
total variation, and $X_0 = 0$) to be the solution of the following integral equation
\[ Y_t = 1 + \int_0^t Y_s \, dX_s \tag{2.15} \]
where we use the Itô integral. To find the solution to (2.15) we may try
\[ Y_t = \exp(X_t + V_t) \]
where $(V_t)_{t\ge 0}$, to be determined later, is a correction term (which has finite
variation) due to the quadratic variation of $X$. Applying Itô's formula we
obtain
\[ Y_t = 1 + \int_0^t Y_s \, d(X_s + V_s) + \frac{1}{2} \int_0^t Y_s \, d\langle M\rangle_s \]
and therefore, in order to match the equation (2.15), we must choose $V_t = -\frac{1}{2} \langle M\rangle_t$.
Lemma 2.7.3 Let $X_t = M_t + A_t$ (where $M$ is a continuous local martingale,
$A$ is an adapted continuous process with finite total variation) with $X_0 = 0$.
Then (2.15) has a unique solution
\[ Y_t = \exp\Big( X_t - \frac{1}{2} \langle M\rangle_t \Big) , \]
which is called the stochastic exponential of $X = (X_t)_{t\ge 0}$, denoted by $\mathcal{E}(X)$.
If $M = (M_t)_{t\ge 0}$ is a continuous local martingale, so is its stochastic
exponential $\mathcal{E}(M)$, as
\[ \mathcal{E}(M)_t = \mathcal{E}(M)_0 + \int_0^t \mathcal{E}(M)_s \, dM_s . \]
Proposition 2.7.4 Let $(M_t)_{t\ge 0}$ be a continuous local martingale with $M_0 = 0$,
and let
\[ \mathcal{E}(M)_t = \exp\Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) \quad \text{for all } t \ge 0 . \tag{2.16} \]
Then the stochastic exponential $\mathcal{E}(M)$ is the solution to
\[ X_t = 1 + \int_0^t X_s \, dM_s . \tag{2.17} \]
In particular, $\mathcal{E}(M)$ is a continuous, non-negative local martingale.
Remark 2.7.5 According to the definition of Itô's integration, if $T > 0$
is such that
\[ E \int_0^T e^{2M_t - \langle M\rangle_t} \, d\langle M\rangle_t < +\infty \tag{2.18} \]
then the stochastic exponential
\[ \mathcal{E}(M)_t = \exp\Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) \]
is a non-negative, continuous martingale up to time $T$.
The remarkable fact is that, although $\mathcal{E}(M)$ may fail to be a martingale,
it is nevertheless a super-martingale.
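Lemma 2.7.6 Let $X = (X_t)_{t\ge 0}$ be a non-negative, continuous local martingale
such that $X_0 \in L^1(\Omega, \mathcal{F}, P)$. Then $X = (X_t)_{t\ge 0}$ is a super-martingale:
$E(X_t | \mathcal{F}_s) \le X_s$ for any $s < t$. In particular, $t \to E X_t$ is decreasing, and
therefore $E X_t \le E X_0$ for any $t > 0$.
Proof. Recall Fatou's lemma: if $\{f_n\}$ is a sequence of non-negative,
integrable functions on a probability space $(\Omega, \mathcal{F}, P)$ such that
\[ \liminf_{n\to\infty} E(f_n) < +\infty , \]
then $\liminf_{n\to\infty} f_n$ is integrable and
\[ E\big( \liminf_{n\to\infty} f_n \big| \mathcal{G} \big) \le \liminf_{n\to\infty} E(f_n | \mathcal{G}) \]
for any sub $\sigma$-algebra $\mathcal{G}$ (see page 88, D. Williams: Probability with Martingales).
By definition, there is a sequence of finite stopping times $T_n \uparrow +\infty$ $P$-a.e.
such that $X^{T_n} = (X_{t\wedge T_n})_{t\ge 0}$ is a martingale for each $n$. Hence
\[ E(X_{t\wedge T_n} | \mathcal{F}_s) = X_{s\wedge T_n} , \quad \forall t \ge s , \ n = 1, 2, \dots . \]
In particular
\[ E(X_{t\wedge T_n}) = E X_0 \]
so that, by Fatou's lemma, $X_t = \lim_{n\to\infty} X_{t\wedge T_n}$ is integrable. Applying
Fatou's lemma to $X_{t\wedge T_n}$ and $\mathcal{G} = \mathcal{F}_s$ for $t > s$ we have
\[ \begin{aligned}
E(X_t | \mathcal{F}_s) &= E\big( \lim_{n\to\infty} X_{t\wedge T_n} \big| \mathcal{F}_s \big) \\
 &\le \liminf_{n\to\infty} E(X_{t\wedge T_n} | \mathcal{F}_s) \\
 &= \liminf_{n\to\infty} X_{s\wedge T_n} \\
 &= X_s .
\end{aligned} \]
According to the definition, $X = (X_t)_{t\ge 0}$ is a super-martingale.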
Corollary 2.7.7 Let $M = (M_t)_{t\ge 0}$ be a continuous local martingale with
$M_0 = 0$. Then its stochastic exponential $\mathcal{E}(M)$ defined by (2.16) is a super-martingale.
In particular,
\[ E \exp\Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) \le 1 \quad \text{for all } t \ge 0 . \]
It is obvious that a continuous super-martingale $X = (X_t)_{t\ge 0}$ is a martingale
if and only if its expectation $t \to E(X_t)$ is constant. Therefore
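Corollary 2.7.8 Let $M = (M_t)_{t\ge 0}$ be a continuous local martingale with
$M_0 = 0$. Then
\[ \mathcal{E}(M)_t \equiv \exp\Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) \]
is a martingale up to time $T$, that is,
\[ E\big( \mathcal{E}(M)_t \big| \mathcal{F}_s \big) = \mathcal{E}(M)_s \quad \text{for } 0 \le s < t \le T \]
if and only if
\[ E \exp\Big( M_T - \frac{1}{2} \langle M\rangle_T \Big) = 1 . \tag{2.19} \]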
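For the simplest case $M_t = \sigma B_t$, with $B$ a standard Brownian motion, one has $\langle M\rangle_T = \sigma^2 T$ and condition (2.19) reads $E \exp(\sigma B_T - \frac{1}{2}\sigma^2 T) = 1$. The following Monte Carlo sketch checks this numerically; the function name, sample size, seed and the choice $\sigma = T = 1$ are assumptions made only for illustration.

```python
import math
import random


def mean_stochastic_exponential(sigma: float = 1.0, T: float = 1.0,
                                n_samples: int = 200_000, seed: int = 2) -> float:
    """Monte Carlo estimate of E exp(sigma * B_T - sigma^2 * T / 2)."""
    rng = random.Random(seed)
    sd = math.sqrt(T)  # B_T is N(0, T)
    total = 0.0
    for _ in range(n_samples):
        b_T = rng.gauss(0.0, sd)
        total += math.exp(sigma * b_T - 0.5 * sigma * sigma * T)
    return total / n_samples


estimate = mean_stochastic_exponential()
```

The estimate should be close to 1, up to a Monte Carlo error of order $n^{-1/2}$. Without the correction term $-\frac{1}{2}\sigma^2 T$ the mean would instead be close to $e^{\sigma^2 T/2}$.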
Stochastic exponentials of local martingales play an important role in
probability transformations. It is vital in many applications to know whether
the stochastic exponential of a given martingale $M = (M_t)_{t\ge 0}$ is indeed a
martingale. A simple sufficient condition to ensure (2.19) is the so-called
Novikov's condition stated in Theorem 2.7.14 below (A. A. Novikov: On
moment inequalities and identities for stochastic integrals, Proc. second
Japan-USSR Symp. Prob. Theor., Lecture Notes in Math., 330, 333-339,
Springer-Verlag, Berlin 1973).
Uniform integrability
To prove Novikov's theorem, we need the concept of uniform integrability
of a family of integrable random variables and a theorem about local martingales.
The concept of uniform integrability was introduced to handle
the convergence of random variables in $L^1(\Omega, \mathcal{F}, P)$; in spirit it is very
close to what you have learned in analysis: uniform convergence, uniform
continuity.
If $X$ is integrable: $E|X| < +\infty$, then
\[ \lim_{N\to\infty} \int_{\{|X| \ge N\}} |X| \, dP = 0 . \]
Definition 2.7.9 Let $\mathcal{A}$ be a family of integrable random variables on $(\Omega, \mathcal{F}, P)$.
$\mathcal{A}$ is uniformly integrable if
\[ \lim_{N\to\infty} \sup_{X\in\mathcal{A}} \int_{\{|X| \ge N\}} |X| \, dP = 0 \]
that is, $E\big( 1_{\{|X| \ge N\}} |X| \big)$ tends to zero uniformly on $\mathcal{A}$ as $N \to \infty$.
In terms of $\varepsilon$-$\delta$ language, $\mathcal{A}$ is uniformly integrable if for any $\varepsilon > 0$ there
is an $N > 0$ depending only on $\varepsilon$ such that
\[ \int_{\{|X| \ge N\}} |X| \, dP < \varepsilon \]
for all $X \in \mathcal{A}$.
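According to the definition, we have the following.
1. Any finite family of integrable random variables is uniformly integrable.
2. Let $\mathcal{A} \subset L^1(\Omega, \mathcal{F}, P)$ be a family of integrable random variables. If
there is an integrable random variable $Y$ such that $|X| \le Y$ for every
$X \in \mathcal{A}$, then $\mathcal{A}$ is uniformly integrable. In fact
\[ \sup_{X\in\mathcal{A}} \int_{\{|X| \ge N\}} |X| \, dP \le \int_{\{Y \ge N\}} Y \, dP \to 0 \quad \text{as } N \to \infty . \]
3. If $\mathcal{A} \subset L^p(\Omega, \mathcal{F}, P)$ for some $p > 1$ and
\[ \sup_{X\in\mathcal{A}} E|X|^p < \infty \]
(that is, if $\mathcal{A}$ is a bounded subset of $L^p(\Omega, \mathcal{F}, P)$), then $\mathcal{A}$ is uniformly
integrable. Indeed
\[ \sup_{X\in\mathcal{A}} \int_{\{|X| \ge N\}} |X| \, dP \le \sup_{X\in\mathcal{A}} \int_{\{|X| \ge N\}} \frac{1}{N^{p-1}} |X|^p \, dP
 \le \frac{1}{N^{p-1}} \sup_{X\in\mathcal{A}} E|X|^p \to 0 \]
as $N \to \infty$.
4. If $X \in L^1(\Omega, \mathcal{F}, P)$, and $\{\mathcal{G}_\alpha\}_{\alpha\in\Lambda}$ is a collection of sub $\sigma$-algebras of
$\mathcal{F}$, then the family
\[ \mathcal{A} = \{ X_\alpha \equiv E(X | \mathcal{G}_\alpha) : \alpha \in \Lambda \} \]
is uniformly integrable. In fact, since $\{X_\alpha \ge N\}$ and $\{X_\alpha \le -N\}$ are
$\mathcal{G}_\alpha$-measurable,
\[ \begin{aligned}
\int_{\{|X_\alpha| \ge N\}} |X_\alpha| \, dP &= \int_{\{X_\alpha \ge N\}} X_\alpha \, dP - \int_{\{X_\alpha \le -N\}} X_\alpha \, dP \\
 &= \int_{\{X_\alpha \ge N\}} X \, dP - \int_{\{X_\alpha \le -N\}} X \, dP \\
 &\le \int_{\{|X_\alpha| \ge N\}} |X| \, dP .
\end{aligned} \]
Since $P(|X_\alpha| \ge N) \le E|X_\alpha| / N \le E|X| / N \to 0$ uniformly in $\alpha$, and $X$ is
integrable, the right-hand side tends to zero uniformly in $\alpha$ as $N \to \infty$,
and the claim follows immediately.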
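To see that boundedness in $L^1$ alone is not enough, while boundedness in $L^2$ is (item 3 with $p = 2$), one can compute the tail integrals exactly for indicator-type random variables. The sketch below uses two hypothetical families on $([0,1], \text{Lebesgue})$, which are not taken from the text: $X_n = n 1_{(0,1/n)}$ and $Y_n = \sqrt{n} 1_{(0,1/n)}$. The supremum over $n$ is truncated at `n_max`, so it is only a finite stand-in for the true supremum.

```python
import math
from typing import Callable, Tuple


def tail_integral(height: float, width: float, N: float) -> float:
    """E[|X| 1_{|X| >= N}] for X = height * 1_(0, width) under Lebesgue measure on [0, 1]."""
    return height * width if height >= N else 0.0


def sup_tail(family: Callable[[int], Tuple[float, float]],
             N: float, n_max: int = 10_000) -> float:
    """Maximum of the tail integrals over the first n_max members of the family."""
    return max(tail_integral(*family(n), N) for n in range(1, n_max + 1))


# X_n = n 1_(0,1/n): E|X_n| = 1 for all n, bounded in L^1 only.
spike = lambda n: (float(n), 1.0 / n)
# Y_n = sqrt(n) 1_(0,1/n): E|Y_n|^2 = 1 for all n, bounded in L^2.
mild = lambda n: (math.sqrt(n), 1.0 / n)
```

For the first family the tail integrals stay equal to 1 however large $N$ is (as long as `n_max` exceeds $N$), so it is not uniformly integrable; for the second family `sup_tail(mild, N)` is at most $1/N$.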
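Theorem 2.7.10 Let $\mathcal{A} \subset L^1(\Omega, \mathcal{F}, P)$. Then $\mathcal{A}$ is uniformly integrable if
and only if
1. $\mathcal{A}$ is a bounded subset of $L^1(\Omega, \mathcal{F}, P)$, that is, $\sup_{X\in\mathcal{A}} E|X| < \infty$.
2. For any $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ \int_A |X| \, dP \le \varepsilon ; \quad \forall X \in \mathcal{A} \]
for any $A \in \mathcal{F}$ such that $P(A) \le \delta$.
Proof. Necessity. For any $A \in \mathcal{F}$ and $N > 0$
\[ \begin{aligned}
\int_A |X| \, dP &= \int_{A \cap \{|X| < N\}} |X| \, dP + \int_{A \cap \{|X| \ge N\}} |X| \, dP \\
 &\le N P(A) + \int_{\{|X| \ge N\}} |X| \, dP .
\end{aligned} \]
Given $\varepsilon > 0$, choose $N > 0$ such that
\[ \sup_{X\in\mathcal{A}} \int_{\{|X| \ge N\}} |X| \, dP \le \frac{\varepsilon}{2} . \]
Then
\[ \sup_{X\in\mathcal{A}} \int_A |X| \, dP \le N P(A) + \frac{\varepsilon}{2} \]
for any $A \in \mathcal{F}$. In particular
\[ \sup_{X\in\mathcal{A}} \int_\Omega |X| \, dP \le N + \frac{\varepsilon}{2} , \]
and by setting $\delta = \varepsilon / (2N)$ we also have
\[ \sup_{X\in\mathcal{A}} \int_A |X| \, dP \le \varepsilon \]
as long as $P(A) \le \delta$.
Sufficiency. Let $\beta = \sup_{X\in\mathcal{A}} E|X|$. By the Markov inequality
\[ P(|X| \ge N) \le \frac{\beta}{N} \]
for any $N > 0$. For any $\varepsilon > 0$, there is a $\delta > 0$ such that the inequality in 2
holds. Choose $N = \beta / \delta$. Then $P(|X| \ge N) \le \delta$ so that
\[ \int_{\{|X| \ge N\}} |X| \, dP \le \varepsilon \]
for any $X \in \mathcal{A}$.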
Corollary 2.7.11 Let $\mathcal{A} \subset L^1(\Omega, \mathcal{F}, P)$ and $\eta \in L^1(\Omega, \mathcal{F}, P)$ such that for
any $D \in \mathcal{F}$
\[ E(1_D |X|) \le E(1_D |\eta|) ; \quad \forall X \in \mathcal{A} . \]
Then $\mathcal{A}$ is uniformly integrable.
The following theorem demonstrates the importance of uniform integrability.
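Theorem 2.7.12 Let $\{X_n\}_{n\in\mathbb{Z}_+}$ be a sequence of integrable random variables
on $(\Omega, \mathcal{F}, P)$. Then $X_n \to X$ in $L^1(\Omega, \mathcal{F}, P)$ for some random variable $X$
as $n \to \infty$:
\[ \int_\Omega |X_n - X| \, dP \to 0 \quad \text{as } n \to \infty , \]
if and only if $\{X_n\}_{n\in\mathbb{Z}_+}$ is uniformly integrable and $X_n \to X$ in probability
as $n \to \infty$.
Proof. Necessity. For any $\varepsilon > 0$ there is a natural number $m$ such that
\[ \int_\Omega |X_n - X| \, dP \le \frac{\varepsilon}{2} \quad \text{for all } n > m . \]
Therefore for every measurable subset $A$
\[ \int_A |X_n| \, dP \le \int_A |X| \, dP + \int_\Omega |X_n - X| \, dP \]
so that
\[ \sup_n \int_A |X_n| \, dP \le \int_A |X| \, dP + \sup_{k \le m} \int_A |X_k| \, dP + \frac{\varepsilon}{2} . \]
In particular
\[ \sup_n E|X_n| \le E|X| + \sup_{k \le m} E|X_k| + \frac{\varepsilon}{2} \]
i.e. $\{X_n : n \ge 1\}$ is bounded in $L^1(\Omega, \mathcal{F}, P)$. Moreover, since $X, X_1, \dots, X_m$
belong to $L^1$, there is $\delta > 0$ such that, if $P(A) \le \delta$, then
\[ \int_A |X| \, dP + \sum_{k=1}^m \int_A |X_k| \, dP \le \frac{\varepsilon}{2} \]
and therefore
\[ \sup_n \int_A |X_n| \, dP \le \varepsilon \]
as long as $P(A) \le \delta$.
Sufficiency. By Fatou's lemma (applied along a subsequence converging almost surely)
\[ \int_\Omega |X| \, dP \le \sup_n \int_\Omega |X_n| \, dP < +\infty \]
so that $X \in L^1(\Omega, \mathcal{F}, P)$. Therefore $\{X_n - X : n \ge 1\}$ is uniformly integrable,
thus, by Theorem 2.7.10, for any $\varepsilon > 0$ there is $\delta > 0$ such that
\[ \int_A |X_n - X| \, dP < \varepsilon \]
for any $A \in \mathcal{F}$ such that $P(A) \le \delta$. Since $X_n \to X$ in probability, there is
an $N > 0$ such that
\[ P(|X_n - X| \ge \varepsilon) \le \delta \quad \forall n \ge N . \]
Therefore for $n \ge N$ we have
\[ \begin{aligned}
\int_\Omega |X_n - X| \, dP &\le \int_{\{|X_n - X| \ge \varepsilon\}} |X_n - X| \, dP + \varepsilon P(|X_n - X| < \varepsilon) \\
 &\le \varepsilon + \varepsilon P(|X_n - X| < \varepsilon) \\
 &\le 2\varepsilon .
\end{aligned} \]
Hence
\[ \lim_{n\to\infty} E|X_n - X| = 0 . \]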
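Corollary 2.7.13 Let $X = (X_t)_{t\ge 0}$ be a continuous local martingale with
$X_0 \in L^1(\Omega, \mathcal{F}, P)$, and let $T > 0$. If
\[ \{X_S : S \text{ a stopping time}, \ S \le T\} \]
is uniformly integrable, then $(X_t)_{t\ge 0}$ is a martingale up to time $T$.
Proof. Let $T_n \uparrow \infty$ be a sequence of stopping times such that each
$(X_{T_n\wedge t})_{t\ge 0}$ is a bounded martingale, so that for any $t > s$ we have
\[ E\{X_{T_n\wedge t} | \mathcal{F}_s\} = X_{T_n\wedge s} \]
and $\lim_{n\to\infty} X_{T_n\wedge t} = X_t$. Since $\{X_{T_n\wedge t}\}_{n\in\mathbb{Z}_+}$ (for any fixed $t \le T$) is uniformly
integrable, we have (Theorem 2.7.12)
\[ \lim_{n\to\infty} X_{T_n\wedge t} = X_t \quad \text{in } L^1(\Omega, \mathcal{F}, P) . \]
Therefore
\[ \begin{aligned}
E\{X_t | \mathcal{F}_s\} &= E\big\{ \lim_{n\to\infty} X_{T_n\wedge t} \big| \mathcal{F}_s \big\} \\
 &= \lim_{n\to\infty} E\{X_{T_n\wedge t} | \mathcal{F}_s\} \\
 &= \lim_{n\to\infty} X_{T_n\wedge s} = X_s
\end{aligned} \]
for any $0 \le s < t \le T$.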
Novikov's theorem
Now we are in a position to prove Novikov's theorem.
Theorem 2.7.14 (A. A. Novikov) Let $M = (M_t)_{t\ge 0}$ be a continuous local
martingale with $M_0 = 0$. If
\[ E \exp\Big( \frac{1}{2} \langle M\rangle_T \Big) < +\infty , \tag{2.20} \]
then (2.19) holds, and therefore
\[ \mathcal{E}(M)_t \equiv \exp\Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) \]
is a martingale up to time $T$.
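Proof. The following proof is due to J. A. Yan: Critères d'intégrabilité
uniforme des martingales exponentielles, Acta. Math. Sinica 23, 311-318
(1980). The idea is the following: first show that under the Novikov condition
(2.20), for any $0 < \alpha < 1$,
\[ \mathcal{E}(\alpha M)_t \equiv \exp\Big( \alpha M_t - \frac{1}{2} \alpha^2 \langle M\rangle_t \Big) \]
is a uniformly integrable martingale up to time $T$.
For any $\alpha$, $\mathcal{E}(\alpha M)_t$ is the stochastic exponential of the local martingale
$\alpha M_t$, so that $\mathcal{E}(\alpha M)$ is a non-negative, continuous local martingale and
$E\{\mathcal{E}(\alpha M)_t\} \le 1$. We also have the following scaling property
\[ \begin{aligned}
\mathcal{E}(\alpha M)_t &\equiv \exp\Big( \alpha \Big( M_t - \frac{1}{2} \langle M\rangle_t \Big) - \frac{1}{2} \alpha (\alpha - 1) \langle M\rangle_t \Big) \\
 &= \big( \mathcal{E}(M)_t \big)^\alpha \exp\Big( \frac{1}{2} \alpha (1 - \alpha) \langle M\rangle_t \Big) .
\end{aligned} \]
For any finite stopping time $S \le T$ and for any $A \in \mathcal{F}_T$
\[ E\{1_A \mathcal{E}(\alpha M)_S\} = E\Big\{ 1_A \big( \mathcal{E}(M)_S \big)^\alpha \exp\Big( \frac{1}{2} \alpha (1 - \alpha) \langle M\rangle_S \Big) \Big\} . \tag{2.21} \]
By use of Hölder's inequality with exponents $\frac{1}{\alpha} > 1$ and $\frac{1}{1-\alpha}$
in (2.21), and since $E\{\mathcal{E}(M)_S\} \le 1$ for the non-negative super-martingale $\mathcal{E}(M)$,
\[ \begin{aligned}
E\{1_A \mathcal{E}(\alpha M)_S\} &= E\Big\{ 1_A \big( \mathcal{E}(M)_S \big)^\alpha \exp\Big( \frac{1}{2} \alpha (1 - \alpha) \langle M\rangle_S \Big) \Big\} \\
 &\le \big\{ E\big( \mathcal{E}(M)_S \big) \big\}^\alpha \Big\{ E\Big( 1_A \exp\Big( \frac{1}{2} \alpha \langle M\rangle_S \Big) \Big) \Big\}^{1-\alpha} \\
 &\le \big\{ E\big( \mathcal{E}(M)_S \big) \big\}^\alpha \Big\{ E\Big( 1_A \exp\Big( \frac{1}{2} \alpha \langle M\rangle_T \Big) \Big) \Big\}^{1-\alpha} \\
 &\le \Big\{ E\Big( 1_A \exp\Big( \frac{1}{2} \alpha \langle M\rangle_T \Big) \Big) \Big\}^{1-\alpha} \\
 &\le \Big\{ E\Big( 1_A \exp\Big( \frac{1}{2} \langle M\rangle_T \Big) \Big) \Big\}^{1-\alpha} .
\end{aligned} \tag{2.22} \]
According to Theorem 2.7.10 (compare Corollary 2.7.11)
\[ \{ \mathcal{E}(\alpha M)_S : S \text{ any stopping time}, \ S \le T \} \]
is uniformly integrable, so that (by Corollary 2.7.13) $\mathcal{E}(\alpha M)$ must be a
martingale on $[0, T]$. Therefore
\[ E\{\mathcal{E}(\alpha M)_T\} = E\{\mathcal{E}(\alpha M)_0\} = 1 , \quad \forall \alpha \in (0, 1) . \]
Set $A = \Omega$ and $S = t \le T$ in (2.22); the first two inequalities of (2.22) give
\[ 1 = E\{\mathcal{E}(\alpha M)_t\} \le \big( E\big( \mathcal{E}(M)_t \big) \big)^\alpha \Big\{ E \exp\Big( \frac{1}{2} \alpha \langle M\rangle_T \Big) \Big\}^{1-\alpha} \]
for every $\alpha \in (0, 1)$. Letting $\alpha \uparrow 1$ we thus obtain
\[ E\big( \mathcal{E}(M)_t \big) \ge 1 \]
so that $E(\mathcal{E}(M)_t) = 1$ for any $t \le T$ (recall that $E(\mathcal{E}(M)_t) \le 1$ by Corollary 2.7.7). It follows that $\mathcal{E}(M)_t$ is a
martingale up to $T$.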
Consider a standard Brownian motion $B = (B_t)$, and $F = (F_t)_{t\ge 0} \in L^2$.
If
\[ E \exp\Big( \frac{1}{2} \int_0^T F_t^2 \, dt \Big) < \infty \]
then
\[ X_t = \exp\Big( \int_0^t F_s \, dB_s - \frac{1}{2} \int_0^t F_s^2 \, ds \Big) \tag{2.23} \]
is a positive martingale on $[0, T]$. For example, for any bounded process
$F = (F_t)_{t\ge 0} \in L^2$: $|F_t(\omega)| \le C$ (for all $t \le T$ and $\omega \in \Omega$), where $C$ is a
constant, we have
\[ E \exp\Big( \frac{1}{2} \int_0^T F_t^2 \, dt \Big) \le \exp\Big( \frac{1}{2} C^2 T \Big) < \infty \]
so that, in this case, $X = (X_t)$ defined by (2.23) is a martingale up to time
$T$.
Novikov's condition is very nice; it is however not easy to verify in many
interesting cases. For example, consider the stochastic exponential of the
martingale $\int_0^t B_s \, dB_s$: the Novikov condition requires us to estimate the integral
\[ E \exp\Big( \frac{1}{2} \int_0^T B_t^2 \, dt \Big) \]
which is already not an easy task.
2.7.4 Exponential inequality

We are going to present three significant applications of stochastic exponentials:
a sharp improvement of Doob's maximal inequality for martingales,
Girsanov's theorem, and the martingale representation theorem (in the next
section). Additional applications will be discussed in the next chapter.
Theorem 2.7.15 (Doob's maximal inequality) Let $(X_t)_{t\ge 0}$ be a continuous
super-martingale on $[0, T]$. Then for any $\lambda > 0$
\[ P\Big\{ \sup_{t\in[0,T]} |X_t| \ge \lambda \Big\} \le \frac{1}{\lambda} \big( E(X_0) + 2 E(X_T^-) \big) \]
where $x^- = -x$ if $x < 0$ and $x^- = 0$ if $x \ge 0$.
In particular, if $(X_t)_{t\ge 0}$ is a non-negative, continuous super-martingale
on $[0, T]$, then
\[ P\Big\{ \sup_{t\in[0,T]} X_t \ge \lambda \Big\} \le \frac{1}{\lambda} E(X_0) . \tag{2.24} \]
The discrete-time version of Doob's maximal inequality has been (and
should be) proved in B10, and the above generalization to super-martingales
in continuous time can be proved by the continuity technique that we
used to show Kolmogorov's inequality.
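Theorem 2.7.16 Let $M = (M_t)_{t\ge 0}$ be a continuous square-integrable martingale
with $M_0 = 0$. Suppose there is a (deterministic) continuous, increasing
function $a = a(t)$ such that $a(0) = 0$ and $\langle M\rangle_t \le a(t)$ for all $t \in [0, T]$.
Then, for any $\lambda > 0$,
\[ P\Big\{ \sup_{t\in[0,T]} M_t \ge \lambda a(T) \Big\} \le e^{-\frac{\lambda^2}{2} a(T)} . \tag{2.25} \]
Proof. For every $\alpha > 0$ and $t \le T$
\[ \alpha M_t - \frac{\alpha^2}{2} \langle M\rangle_t \ge \alpha M_t - \frac{\alpha^2}{2} \langle M\rangle_T \ge \alpha M_t - \frac{\alpha^2}{2} a(T) \]
so that
\[ \mathcal{E}(\alpha M)_t \ge e^{\alpha M_t - \frac{\alpha^2}{2} a(T)} \quad \text{for } \alpha > 0 . \]
Hence, by applying Doob's maximal inequality to the non-negative super-martingale
$\mathcal{E}(\alpha M)$ we obtain
\[ \begin{aligned}
P\Big\{ \sup_{t\in[0,T]} M_t \ge \lambda a(T) \Big\} &\le P\Big\{ \sup_{t\in[0,T]} \mathcal{E}(\alpha M)_t \ge e^{\alpha\lambda a(T) - \frac{\alpha^2}{2} a(T)} \Big\} \\
 &\le e^{-\alpha\lambda a(T) + \frac{\alpha^2}{2} a(T)} E\{\mathcal{E}(\alpha M)_0\} \\
 &= e^{-\alpha\lambda a(T) + \frac{\alpha^2}{2} a(T)}
\end{aligned} \]
for any $\alpha > 0$. The exponential inequality follows by setting $\alpha = \lambda$.
In particular, by applying the exponential inequality to a standard Brownian
motion $B = (B_t)_{t\ge 0}$,
\[ P\Big\{ \sup_{t\in[0,T]} B_t \ge \lambda T \Big\} \le e^{-\frac{\lambda^2}{2} T} . \tag{2.26} \]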
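For Brownian motion the left-hand side of (2.26) is known exactly from the reflection principle (Section 1.2.3): $P\{\sup_{t\le T} B_t \ge b\} = 2 P\{B_T \ge b\}$ for $b \ge 0$. This allows a deterministic comparison with the exponential bound. The sketch below is only an illustration; the function names and the tested values of $\lambda$ and $T$ are arbitrary choices.

```python
import math


def sup_tail_exact(lam: float, T: float) -> float:
    """P{ sup_{t<=T} B_t >= lam*T } = 2 P{ B_T >= lam*T } by the reflection principle.

    For a standard normal Z, 2 P{Z >= x} = erfc(x / sqrt(2)); here x = lam * sqrt(T).
    """
    return math.erfc(lam * math.sqrt(T) / math.sqrt(2.0))


def exponential_bound(lam: float, T: float) -> float:
    """Right-hand side of (2.26)."""
    return math.exp(-0.5 * lam * lam * T)


# Compare the exact probability with the bound on a small grid of parameters.
comparison = [(lam, T, sup_tail_exact(lam, T), exponential_bound(lam, T))
              for lam in (0.1, 0.5, 1.0, 2.0, 3.0)
              for T in (0.5, 1.0, 4.0)]
```

In every row of `comparison` the exact probability lies below the bound, and both share the same Gaussian decay rate $e^{-\lambda^2 T/2}$.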
2.7.5 Girsanov's theorem

Let $T > 0$. Let $Z = (Z_t)_{t\ge 0}$ be a continuous, positive martingale up to time
$T$, with $Z_0 = 1$, on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ that satisfies
the usual condition. Then $Z_t$ is the stochastic exponential of
some continuous local martingale. Indeed, since $Z_t > 0$ we may apply the
Itô formula to $\log Z_t$, and therefore
\[ \log Z_t - \log Z_0 = \int_0^t \frac{1}{Z_s} \, dZ_s - \frac{1}{2} \int_0^t \frac{1}{Z_s^2} \, d\langle Z\rangle_s . \]
Hence $Z_t = \mathcal{E}(N)_t$ where
\[ N_t = \int_0^t \frac{1}{Z_s} \, dZ_s \]
is a continuous local martingale (note that $\langle N\rangle_t = \int_0^t Z_s^{-2} \, d\langle Z\rangle_s$).
We define a probability measure $Q$ on the measurable space $(\Omega, \mathcal{F}_T)$ by
\[ Q(A) = E(Z_T 1_A) \quad \text{if } A \in \mathcal{F}_T , \tag{2.27} \]
where $E$ denotes the expectation under $P$.
We notice that $Q$ is a probability measure as $E(Z_t) = 1$ for any $t \le T$, but it
may not be well defined on $\mathcal{F}$. It is clear, however, that $Q$ may be extended
to be a probability measure on $\mathcal{F}_\infty \equiv \sigma\{\mathcal{F}_t : t \ge 0\}$ if $(Z_t)$ is a martingale
for all $t \ge 0$. Moreover, since $(Z_t)_{t\ge 0}$ is a martingale
\[ Q(A) = E(Z_t 1_A) \quad \text{if } A \in \mathcal{F}_t \text{ and } t \le T \tag{2.28} \]
therefore the Radon-Nikodym derivative of $Q$ with respect to the probability
measure $P$ (as measures restricted to the $\sigma$-algebra $\mathcal{F}_t$, where $t \le T$ if $T$ is
finite, and $t < +\infty$ otherwise) is
\[ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = Z_t . \tag{2.29} \]
We are now in a position to prove Girsanov's theorem.
Theorem 2.7.17 (Girsanov's theorem) Let $(M_t)_{t\ge 0}$ be a continuous local
martingale on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ up to time $T$. Then
\[ X_t = M_t - \int_0^t \frac{1}{Z_s} \, d\langle M, Z\rangle_s \]
is a continuous local martingale on $(\Omega, \mathcal{F}, \mathcal{F}_t, Q)$ up to time $T$.
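Proof. Using the stopping time technique, we may assume that $M$, $Z$, $1/Z$
are bounded. In this case $M$, $Z$ are bounded martingales. We want to prove
that $X$ is a martingale under the probability $Q$:
\[ E^Q\{X_t | \mathcal{F}_s\} = X_s \quad \text{for all } s < t \le T , \]
that is,
\[ E^Q\{1_A (X_t - X_s)\} = 0 \quad \text{for all } s < t \le T , \ A \in \mathcal{F}_s . \]
By definition
\[ E^Q\{1_A (X_t - X_s)\} = E^P\{(Z_t X_t - Z_s X_s) 1_A\} \]
thus we only need to show that $(Z_t X_t)$ is a martingale up to time $T$ under the
probability measure $P$. By use of integration by parts, and since $\langle Z, X\rangle_t = \langle Z, M\rangle_t$, we have
\[ \begin{aligned}
Z_t X_t &= Z_0 X_0 + \int_0^t Z_s \, dX_s + \int_0^t X_s \, dZ_s + \langle Z, X\rangle_t \\
 &= Z_0 X_0 + \int_0^t Z_s \Big( dM_s - \frac{1}{Z_s} \, d\langle M, Z\rangle_s \Big) + \int_0^t X_s \, dZ_s + \langle Z, X\rangle_t \\
 &= Z_0 X_0 + \int_0^t Z_s \, dM_s + \int_0^t X_s \, dZ_s
\end{aligned} \]
which is a local martingale.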

If $Z_t = \mathcal{E}(N)_t$ is the stochastic exponential of a continuous local martingale
with $N_0 = 0$, and if $Z_t$ is a martingale up to time $T$, then $(Z_t)$ satisfies Itô's
integral equation
\[ Z_t = 1 + \int_0^t Z_s \, dN_s \]
so that
\[ \langle M, Z\rangle_t = \Big\langle \int_0^\cdot dM_s , \int_0^\cdot Z_s \, dN_s \Big\rangle_t = \int_0^t Z_s \, d\langle N, M\rangle_s . \]
Therefore
\[ \int_0^t \frac{1}{Z_s} \, d\langle M, Z\rangle_s = \langle N, M\rangle_t . \]
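Corollary 2.7.18 Let $N_t$ be a continuous local martingale on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$,
$N_0 = 0$. Assume that
\[ E \exp\Big( \frac{1}{2} \langle N\rangle_T \Big) < +\infty , \]
so that $\mathcal{E}(N)_t$ is a continuous martingale up to $T$, and $\mathcal{E}(N)_0 = 1$. Define
a probability measure $Q$ on the measurable space $(\Omega, \mathcal{F}_T)$ by
\[ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \mathcal{E}(N)_t \quad \text{for all } t \le T . \]
If $M = (M_t)_{t\ge 0}$ is a continuous local martingale under the probability $P$,
then
\[ X_t = M_t - \langle N, M\rangle_t \]
is a continuous local martingale under $Q$ up to time $T$.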
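A simple special case of Corollary 2.7.18 can be checked by Monte Carlo: take $N_t = \theta B_t$ and $M_t = B_t$ for a standard Brownian motion $B$, so that $\langle N, M\rangle_t = \theta t$ and $X_t = B_t - \theta t$ should be a Brownian motion under $Q$. The sketch below estimates the first two moments of $X_T$ under $Q$ by weighting samples drawn under $P$ with $\mathcal{E}(N)_T$. The function name, the value $\theta = 0.5$, the sample size and the seed are assumptions for illustration only.

```python
import math
import random


def girsanov_moments(theta: float = 0.5, T: float = 1.0,
                     n_samples: int = 200_000, seed: int = 3):
    """Estimate E^Q[X_T] and E^Q[X_T^2] for X_T = B_T - theta*T,
    using E^Q[Y] = E^P[Z_T Y] with Z_T = exp(theta*B_T - theta^2*T/2)."""
    rng = random.Random(seed)
    sd = math.sqrt(T)
    first = 0.0
    second = 0.0
    for _ in range(n_samples):
        b = rng.gauss(0.0, sd)                               # B_T sampled under P
        z = math.exp(theta * b - 0.5 * theta * theta * T)    # density dQ/dP on F_T
        x = b - theta * T                                    # X_T = B_T - <N, B>_T
        first += z * x
        second += z * x * x
    return first / n_samples, second / n_samples


mean_q, second_moment_q = girsanov_moments()
```

If $X$ is a Brownian motion under $Q$, then `mean_q` should be close to 0 and `second_moment_q` close to $T$, up to Monte Carlo error.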
2.8 The martingale representation theorem

The martingale representation theorem is a deep result about Brownian motion.
There is a natural version for multi-dimensional Brownian motion; for
simplicity of notation, however, we concentrate on one-dimensional Brownian
motion.
Let $B = (B_t)_{t\ge 0}$ be a standard BM in $\mathbb{R}$ on a complete probability space
$(\Omega, \mathcal{F}, P)$, and let $(\mathcal{F}_t)_{t\ge 0}$ (together with $\mathcal{F}_\infty = \sigma\{\mathcal{F}_t : t \ge 0\}$) be the natural filtration
of $(B_t)_{t\ge 0}$. Then, as a matter of fact, $(\mathcal{F}_t)_{t\ge 0}$ is continuous.
Theorem 2.8.1 Let $M = (M_t)_{t\ge 0}$ be a square-integrable martingale on
$(\Omega, \mathcal{F}, \mathcal{F}_t, P)$. Then there is a stochastic process $F = (F_t)_{t\ge 0}$ in $L^2$, such
that
\[ M_t = E(M_0) + \int_0^t F_s \, dB_s \quad \text{a.s.} \]
for any $t \ge 0$. In particular, any martingale with respect to the Brownian
filtration $(\mathcal{F}_t)_{t\ge 0}$ has a continuous version.
The proof of this theorem relies on the following several lemmata. Let
$T > 0$ be any fixed time.
Lemma 2.8.2 The following collection of random variables on $(\Omega, \mathcal{F}_T, P)$
\[ \{ \varphi(B_{t_1}, \dots, B_{t_n}) : n \in \mathbb{Z}_+ , \ t_j \in [0, T] \text{ and } \varphi \in C_0^\infty(\mathbb{R}^n) \} \]
is dense in $L^2(\Omega, \mathcal{F}_T, P)$.
Lemma 2.8.2 follows from the following facts:
1. The $\sigma$-algebra $\mathcal{F}_T$ is generated by $B_{t_1}, \dots, B_{t_n}$ for $n \in \mathbb{Z}_+$, $t_j \in [0, T]$
and events with probability zero.
2. $C_0^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$.
Lemma 2.8.3 For any $h \in L^2([0, T])$ (functions on the interval $[0, T]$ which
are square-integrable: $\int_0^T h(s)^2 \, ds < +\infty$) we associate an exponential
martingale up to time $T$:
\[ M(h)_t = \exp\Big( \int_0^t h(s) \, dB_s - \frac{1}{2} \int_0^t h(s)^2 \, ds \Big) ; \quad t \in [0, T] . \tag{2.30} \]
Then the vector space spanned by all these $M(h)_T$
\[ L = \mathrm{span}\{ M(h)_T : h \in L^2([0, T]) \} \]
is dense in $L^2(\Omega, \mathcal{F}_T, P)$.
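Proof. Let $H \in L^2(\Omega, \mathcal{F}_T, P)$ be such that
\[ \int_\Omega H \Phi \, dP = 0 \quad \text{for all } \Phi \in L . \]
We need to show that $H = 0$ almost surely. For any $0 \le t_1 < \cdots < t_n \le T$
and any $c_i \in \mathbb{R}$, we choose a step function $h(t) = c_i$ for $t \in (t_i, t_{i+1}]$. Then
\[ M(h)_T = \exp\Big\{ \sum_i c_i (B_{t_{i+1}} - B_{t_i}) - \frac{1}{2} \sum_i c_i^2 (t_{i+1} - t_i) \Big\} \]
and therefore
\[ \int_\Omega H \exp\Big\{ \sum_i c_i (B_{t_{i+1}} - B_{t_i}) - \frac{1}{2} \sum_i c_i^2 (t_{i+1} - t_i) \Big\} \, dP = 0 \]
so that
\[ \int_\Omega H \exp\Big\{ \sum_i c_i (B_{t_{i+1}} - B_{t_i}) \Big\} \, dP = 0 . \]
Since the $c_i$ are arbitrary numbers, it thus follows that
\[ \int_\Omega H \exp\Big\{ \sum_i c_i B_{t_i} \Big\} \, dP = 0 \]
for any $c_i$ and $t_i \in [0, T]$. Since the left-hand side is analytic in $c_i$, the
last equality is true for any complex numbers $c_i$. We may conclude that, for
any $\varphi \in C_0^\infty(\mathbb{R}^n)$,
\[ \int_\Omega H \varphi(B_{t_1}, \dots, B_{t_n}) \, dP = 0 . \tag{2.31} \]
Indeed (2.31) can be shown via the Fourier integral theorem: if $\varphi \in C_0^\infty(\mathbb{R}^n)$
then
\[ \varphi(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat\varphi(z) e^{\sqrt{-1} \langle z, x\rangle} \, dz \]
where
\[ \hat\varphi(z) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \varphi(x) e^{-\sqrt{-1} \langle z, x\rangle} \, dx \]
is the Fourier transform of $\varphi$. We thus may rewrite the left-hand side of
(2.31) via the Fourier integral formula
\[ \begin{aligned}
\int_\Omega H \varphi(B_{t_1}, \dots, B_{t_n}) \, dP &= \int_\Omega H \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat\varphi(z) \exp\Big\{ \sqrt{-1} \sum_j z_j B_{t_j} \Big\} \, dz \, dP \\
 &= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat\varphi(z) \Big\{ \int_\Omega H \exp\Big( \sqrt{-1} \sum_j z_j B_{t_j} \Big) \, dP \Big\} \, dz \\
 &= 0 .
\end{aligned} \]
By Lemma 2.8.2, the collection of all functions like $\varphi(B_{t_1}, \dots, B_{t_n})$ is dense
in $L^2(\Omega, \mathcal{F}_T, P)$, so that
\[ \int_\Omega H G \, dP = 0 \quad \text{for any } G \in L^2(\Omega, \mathcal{F}_T, P) . \]
In particular, $\int_\Omega H^2 \, dP = 0$ so that $H = 0$ almost surely.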
Lemma 2.8.4 (Itô's representation theorem) Let $H \in L^2(\Omega, \mathcal{F}_T, P)$. Then
there is a stochastic process $F = (F_t)_{t\ge 0}$ in $L^2$, such that
\[ H = E(H) + \int_0^T F_t \, dB_t . \]
Proof. By Lemma 2.8.3 we only need to show this lemma for $M(h)_T$
(where $h \in L^2([0, T])$) defined by (2.30). Since $M(h)_t$ is an exponential
martingale, it must satisfy the following integral equation
\[ \begin{aligned}
M(h)_T &= 1 + \int_0^T M(h)_t \, d\Big( \int_0^t h(s) \, dB_s \Big) \\
 &= E(M(h)_T) + \int_0^T M(h)_t h(t) \, dB_t .
\end{aligned} \]
Therefore $F_t = M(h)_t h(t)$ will do.
The martingale representation theorem now follows easily from the martingale
property and Itô's representation theorem.
Chapter 3
Stochastic differential equations

Key concepts: stochastic differential equation, infinitesimal generator, linear
(Gaussian) equation, geometric Brownian motion, Ornstein-Uhlenbeck process,
strong solution, weak solution, uniqueness in law, path-wise uniqueness
Key theorems: Existence and uniqueness theorem for SDE with Lipschitz
coefficients, Cameron-Martin formula
3.1 Introduction
Stochastic differential equations (SDE) are ordinary differential equations
perturbed by noise. In this course, we will only consider noise which can
be modelled by Brownian motion.
The stochastic differential equations we will consider thus have the following
form
\[ dX_t^j = \sum_{i=1}^n f_i^j(t, X_t) \, dB_t^i + f_0^j(t, X_t) \, dt ; \quad j = 1, \dots, N \tag{3.1} \]
where $B_t = (B_t^1, \dots, B_t^n)_{t\ge 0}$ is a standard Brownian motion in $\mathbb{R}^n$ on a
filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and
\[ f_i = (f_i^1, \dots, f_i^N) : [0, +\infty) \times \mathbb{R}^N \to \mathbb{R}^N , \quad i = 0, 1, \dots, n , \]
are Borel measurable functions.
An adapted, continuous, $\mathbb{R}^N$-valued stochastic process $X_t \equiv (X_t^1, \dots, X_t^N)$
is a solution to SDE (3.1), if
\[ X_t^j = X_0^j + \sum_{k=1}^n \int_0^t f_k^j(s, X_s) \, dB_s^k + \int_0^t f_0^j(s, X_s) \, ds . \tag{3.2} \]
Since we are only interested in the distribution determined by the solution
$(X_t)_{t\ge 0}$ of SDE (3.1), we may expect that, for any Brownian motion
$B = (B_t)_{t\ge 0}$, SDE (3.1) should give the same solution in the distribution
sense. This leads to different concepts of solutions and uniqueness: strong
solutions and weak solutions, path-wise uniqueness and uniqueness in law.
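To get a concrete feeling for the integral equation (3.2), one may discretize it on a time grid. The sketch below uses the Euler-Maruyama scheme in dimension $n = N = 1$; this scheme is not discussed in the text above and is introduced here only as an illustration, with a hypothetical function name and arbitrary parameters.

```python
import math
import random
from typing import Callable, List


def euler_maruyama(drift: Callable[[float, float], float],
                   diffusion: Callable[[float, float], float],
                   x0: float, T: float, n_steps: int,
                   rng: random.Random) -> List[float]:
    """Approximate dX_t = diffusion(t, X_t) dB_t + drift(t, X_t) dt on [0, T].

    Returns the list [X_0, X_{dt}, ..., X_T] of approximate values.
    """
    dt = T / n_steps
    sd = math.sqrt(dt)
    t, x = 0.0, x0
    path = [x]
    for _ in range(n_steps):
        dB = rng.gauss(0.0, sd)  # Brownian increment over [t, t + dt]
        # Coefficients are evaluated at the left end point, as in the Ito integral.
        x = x + diffusion(t, x) * dB + drift(t, x) * dt
        t += dt
        path.append(x)
    return path


# Example: dX = -X dt + 0.5 dB started at X_0 = 1 (an Ornstein-Uhlenbeck type equation).
example_path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5,
                              x0=1.0, T=1.0, n_steps=1000, rng=random.Random(4))
```

With the diffusion coefficient set to zero the scheme reduces to the explicit Euler method for the ordinary differential equation $dX_t = f_0(t, X_t) \, dt$, which gives a simple consistency check.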
Definition 3.1.1 1) An adapted, continuous, $\mathbb{R}^N$-valued stochastic process
$X = (X_t)_{t\ge 0}$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ is a (weak) solution of (3.1), if there is a
standard Brownian motion $W = (W_t)_{t\ge 0}$ in $\mathbb{R}^n$, adapted to the filtration
$(\mathcal{F}_t)$, such that
\[ X_t^j = X_0^j + \sum_{l=1}^n \int_0^t f_l^j(s, X_s) \, dW_s^l + \int_0^t f_0^j(s, X_s) \, ds ; \quad j = 1, \dots, N . \]
In this case we also call the pair $(X, W)$ a solution of (3.1).
2) Given a standard Brownian motion $B = (B_t)_{t\ge 0}$ in $\mathbb{R}^n$ on $(\Omega, \mathcal{F}, P)$
with its natural filtration $(\mathcal{F}_t)_{t\ge 0}$, an adapted, continuous stochastic process
$X = (X_t)_{t\ge 0}$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ is a strong solution of (3.1), if
\[ X_t^j = X_0^j + \sum_{i=1}^n \int_0^t f_i^j(s, X_s) \, dB_s^i + \int_0^t f_0^j(s, X_s) \, ds . \]
We also have two concepts of uniqueness.
Definition 3.1.2 Consider SDE (3.1).
1. We say that path-wise uniqueness holds for (3.1) if, whenever $(X, B)$
and $(\tilde X, B)$ are two solutions defined on the same filtered space with the
same Brownian motion $B$, and $X_0 = \tilde X_0$, then $X = \tilde X$.
2. It is said that uniqueness in law holds for (3.1) if, whenever $(X, B)$
and $(\tilde X, \tilde B)$ are two solutions with possibly different Brownian motions
$B$ and $\tilde B$, and $X_0$ and $\tilde X_0$ possess the same distribution, then $X$ and $\tilde X$
have the same distribution.
Theorem 3.1.3 (Yamada-Watanabe) Path-wise uniqueness implies uniqueness
in law.
The following is a simple example of an SDE which has no strong solution,
but possesses weak solutions and for which uniqueness in law holds.
Example 3.1.4 (H. Tanaka) Consider the 1-dimensional stochastic differential equation
\[ X_t = \int_0^t \operatorname{sgn}(X_s)\,dB_s , \quad 0 \le t < \infty \]
where $\operatorname{sgn}(x) = 1$ if $x \ge 0$, and equals $-1$ for negative values of $x$.
1. Uniqueness in law holds, since $X$ is a standard Brownian motion (Lévy's Theorem).
2. If $(X, B)$ is a weak solution, then so is $(-X, B)$.
3. There is a weak solution. Let $W_t$ be a one-dimensional Brownian motion, and let $B_t = \int_0^t \operatorname{sgn}(W_s)\,dW_s$. Then $B$ is a one-dimensional Brownian motion, and
\[ W_t = \int_0^t \operatorname{sgn}(W_s)\,dB_s , \]
so that $(W, B)$ is a solution.
4. Path-wise uniqueness does not hold.
5. There is no strong solution.
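The construction in item 3 can be illustrated on a time grid. The following is only a sketch under stated assumptions: the uniform grid, the left-endpoint (Itô) sums and the names `sgn`, `tanaka_pair` are illustrative choices, not part of the notes. It builds the increments of $B$ from those of $W$ and then recovers $W_T$ as the discretised integral $\int_0^T \operatorname{sgn}(W_s)\,dB_s$.

```python
import math
import random

def sgn(x):
    """Sign convention of Example 3.1.4: +1 for x >= 0, -1 for x < 0."""
    return 1.0 if x >= 0 else -1.0

def tanaka_pair(n_steps=1000, horizon=1.0, seed=0):
    """Discretise W and B_t = int_0^t sgn(W_s) dW_s on a uniform grid.

    Returns (W_T, dB, signs): the terminal value of W, the increments of B,
    and sgn(W) evaluated at the left endpoint of every step.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0
    db, signs = [], []
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        s = sgn(w)            # left-endpoint value, as in an Ito sum
        signs.append(s)
        db.append(s * dw)     # increment of B
        w += dw
    return w, db, signs

w_T, db, signs = tanaka_pair()
# Discretised int_0^T sgn(W_s) dB_s, using the same left-endpoint rule.
reconstructed = sum(s * d for s, d in zip(signs, db))
# Quadratic variation of B along the grid; it should be close to T = 1.
quadratic_variation = sum(d * d for d in db)
```

Because $\operatorname{sgn}(W_s)^2 = 1$, the reconstructed sum agrees with $W_T$ up to rounding, which mirrors the identity $W_t = \int_0^t \operatorname{sgn}(W_s)\,dB_s$.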
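3.1.1 Linear-Gaussian diffusions

Linear stochastic differential equations can be solved explicitly. Consider
\[ dX_t^j = \sum_{i=1}^{n} \sigma_i^j\,dB_t^i + \sum_{k=1}^{N} \beta_k^j X_t^k\,dt \tag{3.3} \]
($j = 1, \dots, N$), where $B$ is a Brownian motion in $\mathbb{R}^n$, $\sigma = (\sigma_i^j)$ a constant $N \times n$ matrix, and $\beta = (\beta_k^j)$ a constant $N \times N$ matrix. (3.3) may be written as
\[ dX_t = \sigma\,dB_t + \beta X_t\,dt . \]
Let
\[ e^{\beta t} = \sum_{k=0}^{\infty} \frac{t^k}{k!}\,\beta^k \]
be the exponential of the square matrix $\beta$. Using Itô's formula, we have
\begin{align*}
e^{-\beta t} X_t - X_0 &= \int_0^t e^{-\beta s}\,dX_s - \int_0^t e^{-\beta s}\beta X_s\,ds \\
&= \int_0^t e^{-\beta s}\left( dX_s - \beta X_s\,ds \right) \\
&= \int_0^t e^{-\beta s}\sigma\,dB_s
\end{align*}
so that
\[ X_t = e^{\beta t} X_0 + \int_0^t e^{\beta(t-s)}\sigma\,dB_s . \]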
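In particular, if $n = N = 1$ and $X_0 = x$, then $X_t$ has a normal distribution with mean $e^{\beta t}x$ and variance
\begin{align*}
\mathbb{E}\left( X_t - e^{\beta t}x \right)^2 &= \mathbb{E}\left( \int_0^t e^{\beta(t-s)}\sigma\,dB_s \right)^2 \\
&= \sigma^2 e^{2\beta t}\,\mathbb{E}\left( \int_0^t e^{-\beta s}\,dB_s \right)^2 \\
&= \sigma^2 \int_0^t e^{2\beta(t-s)}\,ds \\
&= \frac{\sigma^2}{2\beta}\left( e^{2\beta t} - 1 \right) .
\end{align*}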
If $B = (B_t^1, \dots, B_t^n)_{t \ge 0}$ is a Brownian motion in $\mathbb{R}^n$, then the solution $X_t$ of the SDE
\[ dX_t = dB_t - A X_t\,dt \]
is called the Ornstein-Uhlenbeck process, where $A \ge 0$ is an $n \times n$ matrix called the drift matrix. Hence we have
\[ X_t = e^{-At} X_0 + \int_0^t e^{-(t-s)A}\,dB_s . \]
Exercise 3.1.5 If $X_0 = x \in \mathbb{R}^n$, compute $\mathbb{E} f(X_t)$, where $X_t$ is the Ornstein-Uhlenbeck process with drift matrix $A$.
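The one-dimensional variance formula of this subsection can be checked numerically. The sketch below is an illustration under assumptions: it uses the Euler-Maruyama scheme (not discussed in the notes), writes the scalar coefficients as `sigma` and `beta`, and all function names are hypothetical. It compares a Monte Carlo estimate of $\mathbb{E}(X_t - e^{\beta t}x)^2$ with the closed form $\frac{\sigma^2}{2\beta}(e^{2\beta t} - 1)$; taking $\beta = -1$ and $\sigma = 1$ gives a one-dimensional Ornstein-Uhlenbeck process.

```python
import math
import random

def exact_variance(sigma, beta, t):
    """Closed form sigma^2 / (2 beta) * (exp(2 beta t) - 1) for n = N = 1."""
    return sigma ** 2 / (2.0 * beta) * (math.exp(2.0 * beta * t) - 1.0)

def euler_maruyama_terminal(x0, sigma, beta, t, n_steps, rng):
    """One Euler-Maruyama path of dX = sigma dB + beta X dt; returns X_t."""
    dt = t / n_steps
    x = x0
    for _ in range(n_steps):
        x += beta * x * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    return x

def sample_variance_about_mean(x0, sigma, beta, t,
                               n_paths=4000, n_steps=100, seed=1):
    """Monte Carlo estimate of E (X_t - exp(beta t) x0)^2."""
    rng = random.Random(seed)
    mean = math.exp(beta * t) * x0
    total = 0.0
    for _ in range(n_paths):
        x_t = euler_maruyama_terminal(x0, sigma, beta, t, n_steps, rng)
        total += (x_t - mean) ** 2
    return total / n_paths

closed_form = exact_variance(1.0, -1.0, 1.0)
estimate = sample_variance_about_mean(1.0, 1.0, -1.0, 1.0)
```

The estimate carries both sampling error and a small time-discretisation bias, so only approximate agreement with the closed form should be expected.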
3.1.2 Geometric Brownian motion

The stock price in the Black-Scholes model satisfies the stochastic differential equation
\[ dS_t = S_t\left( \mu\,dt + \sigma\,dB_t \right) \tag{3.4} \]
so that the solution to SDE (3.4) is the stochastic exponential of
\[ \int_0^t \mu\,ds + \int_0^t \sigma\,dB_s . \]
Hence
\[ S_t = S_0 \exp\left( \int_0^t \sigma\,dB_s + \int_0^t \left( \mu - \frac{1}{2}\sigma^2 \right) ds \right) . \]
In the case that $\mu$ and $\sigma$ are constants,
\[ S_t = S_0 \exp\left( \sigma B_t + \left( \mu - \frac{1}{2}\sigma^2 \right) t \right) \]
which is called the geometric Brownian motion.
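As a sanity check of the closed form with constant coefficients, one can verify numerically that $\mathbb{E} S_t = S_0 e^{\mu t}$, which is what the correction term $-\frac{1}{2}\sigma^2 t$ is there to guarantee. The sketch below is illustrative only: the function names and the trapezoid quadrature against the $N(0,t)$ density of $B_t$ are assumptions of this example.

```python
import math

def gbm_closed_form(s0, mu, sigma, t, b_t):
    """S_t = S_0 exp(sigma B_t + (mu - sigma^2 / 2) t) for constant mu, sigma."""
    return s0 * math.exp(sigma * b_t + (mu - 0.5 * sigma ** 2) * t)

def gbm_mean_by_quadrature(s0, mu, sigma, t, n=20001, width=10.0):
    """E[S_t], integrating the closed form against the N(0, t) density of B_t."""
    sd = math.sqrt(t)
    lo = -width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        density = math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end weights
        total += weight * gbm_closed_form(s0, mu, sigma, t, x) * density
    return total * step

mean_price = gbm_mean_by_quadrature(100.0, 0.05, 0.2, 1.0)
expected = 100.0 * math.exp(0.05 * 1.0)
```

With $\sigma = 0$ the closed form reduces to the deterministic growth $S_0 e^{\mu t}$, which gives a second quick check.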
3.2 Existence and uniqueness

In this section we present two fundamental theorems about solving stochastic
differential equations.
3.2.1 Cameron-Martin's formula

Consider a simple stochastic differential equation
\[ dX_t = dB_t + b(t, X_t)\,dt \tag{3.5} \]
where $b(t, x)$ is a bounded, Borel measurable function on $[0, +\infty) \times \mathbb{R}$. We may solve (3.5) by means of a change of probability.
Let $(W_t)_{t \ge 0}$ be a standard Brownian motion on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$, and define a probability measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}_\infty)$ by
\[ \left. \frac{d\mathbb{Q}}{d\mathbb{P}} \right|_{\mathcal{F}_t} = \mathcal{E}(N)_t \quad \text{for all } t \ge 0 \]
where $N_t = \int_0^t b(s, W_s)\,dW_s$ is a martingale (under the probability $\mathbb{P}$), with $\langle N \rangle_t = \int_0^t b(s, W_s)^2\,ds$, which is bounded on any finite interval, thus
\[ \mathcal{E}(N)_t = \exp\left( \int_0^t b(s, W_s)\,dW_s - \frac{1}{2} \int_0^t b(s, W_s)^2\,ds \right) \]
is a martingale. According to Girsanov's theorem
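\[ B_t \equiv W_t - W_0 - \langle W, N \rangle_t \]
is a martingale under the new probability $\mathbb{Q}$, and $\langle B \rangle_t = \langle W \rangle_t = t$. By Lévy's martingale characterization of Brownian motion, $(B_t)_{t \ge 0}$ is a Brownian motion. Moreover
\begin{align*}
\langle W, N \rangle_t &= \left\langle \int_0^t dW_s , \int_0^t b(s, W_s)\,dW_s \right\rangle \\
&= \int_0^t b(s, W_s)\,ds
\end{align*}
and therefore
\[ W_t - W_0 - \int_0^t b(s, W_s)\,ds = B_t \]
is a standard Brownian motion on $(\Omega, \mathcal{F}_\infty, \mathbb{Q})$. Thus
\[ W_t = W_0 + B_t + \int_0^t b(s, W_s)\,ds \tag{3.6} \]
so that $(W_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}_\infty, \mathbb{Q})$ is a solution of SDE (3.5). The solution we have just constructed is a weak solution of SDE (3.5).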
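For a constant drift $b(t, x) \equiv b$ and $W_0 = 0$ the change of measure can be checked by direct integration, since then $N_t = bW_t$ and $\mathcal{E}(N)_T = \exp(bW_T - \frac{1}{2}b^2 T)$ depends on $W_T$ only. The sketch below is an illustration under these assumptions (the function names and the trapezoid quadrature are choices of this example): it confirms that the density has total mass one and that, under $\mathbb{Q}$, $W_T$ has mean $bT$ and variance $T$, as (3.6) predicts.

```python
import math

def expect_under_p(g, horizon, n=20001, width=12.0):
    """E_P[g(W_T)] for W_T ~ N(0, T), by the trapezoid rule on a wide grid."""
    sd = math.sqrt(horizon)
    lo = -width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        w = lo + i * step
        density = math.exp(-w * w / (2.0 * horizon)) / math.sqrt(2.0 * math.pi * horizon)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * g(w) * density
    return total * step

def density_ratio(w, drift, horizon):
    """E(N)_T = exp(b W_T - b^2 T / 2) for a constant drift b."""
    return math.exp(drift * w - 0.5 * drift ** 2 * horizon)

b, T = 0.7, 2.0
mass = expect_under_p(lambda w: density_ratio(w, b, T), T)            # Q(Omega)
mean_q = expect_under_p(lambda w: w * density_ratio(w, b, T), T)      # E_Q[W_T]
var_q = expect_under_p(lambda w: (w - b * T) ** 2 * density_ratio(w, b, T), T)
```

Expectations under $\mathbb{Q}$ are computed as $\mathbb{P}$-expectations weighted by the density, which is exactly how $\mathbb{Q}$ is defined on $\mathcal{F}_T$.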
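Theorem 3.2.1 (Cameron-Martin's formula) Let $b(t, x) = (b^1(t, x), \dots, b^n(t, x))$ be bounded, Borel measurable functions on $[0, +\infty) \times \mathbb{R}^n$. Let $W_t = (W_t^1, \dots, W_t^n)$ be a standard Brownian motion on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$, and let $\mathcal{F}_\infty = \sigma\{\mathcal{F}_t , t \ge 0\}$. Define a probability measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}_\infty)$ by
\[ \left. \frac{d\mathbb{Q}}{d\mathbb{P}} \right|_{\mathcal{F}_t} = \exp\left( \sum_{k=1}^{n} \int_0^t b^k(s, W_s)\,dW_s^k - \frac{1}{2} \sum_{k=1}^{n} \int_0^t |b^k(s, W_s)|^2\,ds \right) \quad \text{for } t \ge 0 . \]
Then $(W_t)_{t \ge 0}$ under the probability measure $\mathbb{Q}$ is a solution to the SDE
\[ dX_t^j = dB_t^j + b^j(t, X_t)\,dt \tag{3.7} \]
for some Brownian motion $(B_t^1, \dots, B_t^n)_{t \ge 0}$ under the probability $\mathbb{Q}$.
On the other hand, if $(X_t)$ is a solution of SDE (3.7) on some probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$ and we define $\tilde{\mathbb{P}}$ by
\[ \left. \frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} \right|_{\mathcal{F}_t} = \exp\left\{ - \sum_{k=1}^{n} \int_0^t b^k(s, X_s)\,dB_s^k - \frac{1}{2} \sum_{k=1}^{n} \int_0^t |b^k(s, X_s)|^2\,ds \right\} \quad \text{for } t \ge 0 \]
we may show that $(X_t)_{t \ge 0}$ under the probability measure $\tilde{\mathbb{P}}$ is a Brownian motion. Therefore solutions to SDE (3.7) are unique in law: all solutions have the same distribution.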
3.2.2 Existence and uniqueness theorem: strong solution

By definition, any strong solution is a weak solution. We next prove a basic existence and uniqueness theorem for a stochastic differential equation under a global Lipschitz condition. Our proof will rely on two inequalities: the Gronwall inequality and Doob's $L^p$-inequality (Theorem 4.2.5).
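Lemma 3.2.2 (The Gronwall inequality) If a non-negative function $g$ satisfies the integral inequality
\[ g(t) \le h(t) + \alpha \int_0^t g(s)\,ds , \quad 0 \le t \le T \]
where $\alpha \ge 0$ is a constant and $h : [0, T] \to \mathbb{R}$ is an integrable function, then
\[ g(t) \le h(t) + \alpha \int_0^t e^{\alpha(t-s)} h(s)\,ds , \quad 0 \le t \le T . \]
Proof. Let $F(t) = \int_0^t g(s)\,ds$. Then $F(0) = 0$ and
\[ F'(t) \le h(t) + \alpha F(t) \]
so that
\[ \left( e^{-\alpha t} F(t) \right)' \le e^{-\alpha t} h(t) . \]
Integrating the differential inequality we obtain
\[ \int_0^t \left( e^{-\alpha s} F(s) \right)' ds \le \int_0^t e^{-\alpha s} h(s)\,ds \]
and therefore
\[ F(t) \le \int_0^t e^{\alpha(t-s)} h(s)\,ds \]
which, inserted into $g(t) \le h(t) + \alpha F(t)$, yields Gronwall's inequality.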
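A quick numerical illustration of the lemma, as a sketch only: the helper name `gronwall_bound` and the trapezoid rule are assumptions of this example. For $h \equiv 1$ the function $g(t) = e^{\alpha t}$ satisfies the hypothesis with equality, and the bound $h(t) + \alpha \int_0^t e^{\alpha(t-s)} h(s)\,ds$ then equals $e^{\alpha t}$, so the inequality is sharp.

```python
import math

def gronwall_bound(h, alpha, t, n=2000):
    """h(t) + alpha * int_0^t exp(alpha (t - s)) h(s) ds, by the trapezoid rule."""
    if t == 0:
        return h(0.0)
    step = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(alpha * (t - s)) * h(s)
    return h(t) + alpha * step * total

alpha, t = 2.0, 1.0
bound = gronwall_bound(lambda s: 1.0, alpha, t)
extremal = math.exp(alpha * t)   # g(t) = exp(alpha t) gives equality when h = 1
```

Any other non-negative $g$ satisfying the hypothesis with $h \equiv 1$, for instance $g \equiv 1$, stays below this bound.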
Consider the following SDE
\[ dX_t^j = \sum_{l=1}^{n} f_l^j(t, X_t)\,dB_t^l + f_0^j(t, X_t)\,dt ; \quad j = 1, \dots, N \tag{3.8} \]
where $f_k^j(t, x)$ are Borel measurable functions on $\mathbb{R}_+ \times \mathbb{R}^N$, which are bounded on any compact subset in $\mathbb{R}^N$.
We will use a special case of Doob's $L^p$-inequality: if $(M_t)_{t \ge 0}$ is a square-integrable, continuous martingale with $M_0 = 0$, then for any $t > 0$
\[ \mathbb{E} \sup_{s \le t} |M_s|^2 \le 4 \sup_{s \le t} \mathbb{E}|M_s|^2 = 4\,\mathbb{E}\langle M \rangle_t . \tag{3.9} \]
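Lemma 3.2.3 Let $(B_t)_{t \ge 0}$ be a standard BM in $\mathbb{R}$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$, and let $(Z_t)_{t \ge 0}$ and $(\tilde{Z}_t)_{t \ge 0}$ be two continuous, adapted processes. Let $f(t, x)$ be a Lipschitz function
\[ |f(t, x) - f(t, y)| \le C|x - y| ; \quad t \ge 0 , \; x, y \in \mathbb{R} \]
for some constant $C$.
1. Let
\[ M_t = \int_0^t f(s, Z_s)\,dB_s - \int_0^t f(s, \tilde{Z}_s)\,dB_s , \quad t \ge 0 . \]
Then
\[ \mathbb{E} \sup_{s \le t} |M_s|^2 \le 4C^2 \int_0^t \mathbb{E}\left| Z_s - \tilde{Z}_s \right|^2 ds \]
for all $t \ge 0$.
2. If
\[ N_t = \int_0^t f(s, Z_s)\,ds - \int_0^t f(s, \tilde{Z}_s)\,ds , \quad t \ge 0 \]
then
\[ \mathbb{E} \sup_{s \le t} |N_s|^2 \le C^2 t \int_0^t \mathbb{E}\left| Z_s - \tilde{Z}_s \right|^2 ds , \quad t \ge 0 . \]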
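Proof. To prove the first statement, we notice that
\[ \sup_{s \le t} |M_s|^2 = \sup_{s \le t} \left| \int_0^s \left( f(u, Z_u) - f(u, \tilde{Z}_u) \right) dB_u \right|^2 \]
so that, by Doob's $L^2$-inequality
\begin{align*}
\mathbb{E} \sup_{s \le t} |M_s|^2 &= \mathbb{E} \sup_{s \le t} \left| \int_0^s \left( f(u, Z_u) - f(u, \tilde{Z}_u) \right) dB_u \right|^2 \\
&\le 4\,\mathbb{E} \left| \int_0^t \left( f(s, Z_s) - f(s, \tilde{Z}_s) \right) dB_s \right|^2 \\
&= 4\,\mathbb{E} \int_0^t \left( f(s, Z_s) - f(s, \tilde{Z}_s) \right)^2 ds \\
&\le 4C^2\,\mathbb{E} \int_0^t \left| Z_s - \tilde{Z}_s \right|^2 ds .
\end{align*}
Next we prove the second claim. Indeed
\begin{align*}
\sup_{s \le t} |N_s|^2 &= \sup_{s \le t} \left| \int_0^s \left( f(u, Z_u) - f(u, \tilde{Z}_u) \right) du \right|^2 \\
&\le \left( \int_0^t \left| f(s, Z_s) - f(s, \tilde{Z}_s) \right| ds \right)^2 \\
&\le t \int_0^t \left| f(s, Z_s) - f(s, \tilde{Z}_s) \right|^2 ds \\
&\le C^2 t \int_0^t \left| Z_s - \tilde{Z}_s \right|^2 ds
\end{align*}
where the second inequality follows from the Cauchy-Schwarz inequality.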
Theorem 3.2.4 Consider SDE (3.8). Suppose that $f_i^j$ satisfy the Lipschitz condition:
\[ \left| f_i^j(t, x) - f_i^j(t, y) \right| \le C|x - y| \tag{3.10} \]
and the linear-growth condition that
\[ \left| f_i^j(t, x) \right| \le C(1 + |x|) \tag{3.11} \]
for $t \in \mathbb{R}_+$ and $x, y \in \mathbb{R}^N$. Then for any $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P})$ and a standard Brownian motion $B_t = (B_t^i)$ in $\mathbb{R}^n$, there is a unique strong solution $(X_t)$ of (3.8) with $X_0 = \xi$.
Proof. Let us only prove the one-dimensional case; the proof of the multi-dimensional case is similar. We are thus going to construct a strong solution to the SDE
\[ dX_t = f_1(t, X_t)\,dB_t + f_0(t, X_t)\,dt . \]
We will apply Picard's iteration to the corresponding integral equation
\[ X_t = \xi + \int_0^t f_1(s, X_s)\,dB_s + \int_0^t f_0(s, X_s)\,ds \]
where $B = (B_t)_{t \ge 0}$ is a standard Brownian motion in $\mathbb{R}$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$.
Define
\begin{align*}
Y_0(t) &= \xi ; \\
Y_{n+1}(t) &= \xi + \int_0^t f_1(s, Y_n(s))\,dB_s + \int_0^t f_0(s, Y_n(s))\,ds
\end{align*}
$n = 0, 1, 2, \dots$. We will show that, for every $T > 0$, the sequence $\{Y_n(t)\}$ converges to a solution $Y(t)$ uniformly on $[0, T]$ almost surely. Introduce the notations
\[ D(n)_t = Y_n(t) - Y_{n-1}(t) ; \quad n = 1, 2, \dots \]
and
\begin{align*}
d(n)_t &= \mathbb{E} \sup_{s \le t} |D(n)_s|^2 \\
&= \mathbb{E} \sup_{s \le t} |Y_n(s) - Y_{n-1}(s)|^2 .
\end{align*}
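Then
\[ D(1)_t = \int_0^t f_1(s, \xi)\,dB_s + \int_0^t f_0(s, \xi)\,ds \]
and for $n \ge 1$ we have
\begin{align*}
D(n+1)_t = {} & \int_0^t \left( f_1(s, Y_n(s)) - f_1(s, Y_{n-1}(s)) \right) dB_s \\
& + \int_0^t \left( f_0(s, Y_n(s)) - f_0(s, Y_{n-1}(s)) \right) ds .
\end{align*}
Using the elementary inequality
\[ (x + y)^2 \le 2x^2 + 2y^2 \]
and Lemma 3.2.3 we deduce that, for all $t \le T$,
\begin{align*}
d(n+1)_t &\le 2C^2 (4 + t) \int_0^t d(n)_s\,ds \\
&\le 2C^2 (4 + T) \int_0^t d(n)_s\,ds .
\end{align*}
Iterating this estimate, we therefore have
\[ d(n+1)_t \le \frac{2^n C^{2n} (4 + T)^n}{n!}\, d(1)_T\, t^n . \]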
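Moreover
\begin{align*}
d(1)_t &= \mathbb{E} \sup_{0 \le s \le t} |Y_1(s) - Y_0(s)|^2 \\
&\le 2\,\mathbb{E} \left\{ \sup_{0 \le s \le t} \left| \int_0^s f_1(u, \xi)\,dB_u \right|^2 \right\} + 2\,\mathbb{E} \left\{ \sup_{0 \le s \le t} \left| \int_0^s f_0(u, \xi)\,du \right|^2 \right\} \\
&\le 8\,\mathbb{E} \int_0^t f_1(s, \xi)^2\,ds + 2t\,\mathbb{E} \int_0^t f_0(s, \xi)^2\,ds \\
&\le \left( 16t + 4t^2 \right) C^2 \left( 1 + \mathbb{E}\xi^2 \right) .
\end{align*}
Therefore, for some constants $C_1$ and $C_2$ depending only on $C$, $T$ and $\mathbb{E}\xi^2$,
\[ \mathbb{E} \left\{ \sup_{0 \le t \le T} |Y_{n+1}(t) - Y_n(t)|^2 \right\} \le C_1 \frac{(C_2 T)^n}{n!} . \]
By the Markov inequality
\[ \mathbb{P} \left\{ \sup_{0 \le t \le T} |Y_{n+1}(t) - Y_n(t)| \ge \frac{1}{2^n} \right\} \le C_1 \frac{(4 C_2 T)^n}{n!} , \]
and thus, by the Borel-Cantelli lemma,
\[ Y_n(t) \to X_t \quad \text{uniformly on } [0, T] , \; \mathbb{P}\text{-a.s.} \]
It is easy to see that $(X_t)$ is a strong solution of the stochastic differential equation.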
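The Picard scheme used in the existence proof can be tried out on a fixed time grid. The sketch below makes several assumptions that are not in the notes: integrals are replaced by left-endpoint sums along one simulated Brownian path, the coefficients `f1` and `f0` are hypothetical Lipschitz examples, and all function names are illustrative. On a grid with $m$ steps the discrete iterates agree with the Euler scheme after $m$ iterations, because $Y_{n+1}(t_k)$ only uses $Y_n(t_j)$ for $j < k$.

```python
import math
import random

def brownian_increments(n_steps, horizon, seed=0):
    rng = random.Random(seed)
    dt = horizon / n_steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)], dt

def picard_step(y, xi, f1, f0, db, dt):
    """Y_{n+1}(t_k) = xi + sum_{j<k} [f1(t_j, Y_n(t_j)) dB_j + f0(t_j, Y_n(t_j)) dt]."""
    out, acc = [xi], 0.0
    for j in range(len(db)):
        t_j = j * dt
        acc += f1(t_j, y[j]) * db[j] + f0(t_j, y[j]) * dt
        out.append(xi + acc)
    return out

def picard_iterates(xi, f1, f0, db, dt, n_iter):
    y = [xi] * (len(db) + 1)            # Y_0(t) = xi
    gaps = []                           # sup_t |Y_{n+1}(t) - Y_n(t)| on the grid
    for _ in range(n_iter):
        y_next = picard_step(y, xi, f1, f0, db, dt)
        gaps.append(max(abs(u - v) for u, v in zip(y_next, y)))
        y = y_next
    return y, gaps

def euler(xi, f1, f0, db, dt):
    x = [xi]
    for j in range(len(db)):
        x.append(x[-1] + f1(j * dt, x[-1]) * db[j] + f0(j * dt, x[-1]) * dt)
    return x

# Hypothetical Lipschitz coefficients with linear growth.
f1 = lambda t, x: 0.5 + 0.3 * math.sin(x)
f0 = lambda t, x: -x

db, dt = brownian_increments(n_steps=50, horizon=1.0)
picard_path, gaps = picard_iterates(1.0, f1, f0, db, dt, n_iter=50)
euler_path = euler(1.0, f1, f0, db, dt)
max_diff = max(abs(u - v) for u, v in zip(picard_path, euler_path))
```

The list `gaps` records the sup-distance between consecutive iterates, the pathwise analogue of the quantities $d(n)_t$ estimated in the proof.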
Next we prove the uniqueness. Let $Y$ and $Z$ be two solutions with the same Brownian motion $B$. Then
\[ Y_t = \xi + \int_0^t f_1(s, Y_s)\,dB_s + \int_0^t f_0(s, Y_s)\,ds \]
and
\[ Z_t = \xi + \int_0^t f_1(s, Z_s)\,dB_s + \int_0^t f_0(s, Z_s)\,ds . \]
Again, by Lemma 3.2.3
\[ \mathbb{E}|Y_t - Z_t|^2 \le 2C^2 (4 + T) \int_0^t \mathbb{E}|Y_s - Z_s|^2\,ds . \]
The Gronwall inequality thus implies that
\[ \mathbb{E}|Y_t - Z_t|^2 = 0 . \]
Remark 3.2.5 The iteration $Y_n$ constructed in the proof of Theorem 3.2.4 is a function of the Brownian motion $B$, and $Y_n(t)$ only depends on $\xi$ and $B_s$, $0 \le s \le t$.
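3.2.3 Continuity in initial conditions
Theorem 3.2.6 Assume the same conditions as in Theorem 3.2.4. Given a BM $B = (B_t)_{t \ge 0}$ in $\mathbb{R}^n$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$, let $(X^x(t))_{t \ge 0}$ be the unique strong solution of (3.8) with $X^x(0) = x$. Then $x \to X^x$ is uniformly continuous almost surely on any finite interval $[0, T]$:
\[ \lim_{\delta \downarrow 0} \sup_{|x - y| < \delta} \mathbb{E} \left\{ \sup_{0 \le t \le T} |X^x(t) - X^y(t)|^2 \right\} = 0 . \tag{3.12} \]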
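Proof. Let us only consider the 1-dimensional case. Thus
\[ X^x(t) = x + \int_0^t f_1(s, X^x(s))\,dB_s + \int_0^t f_0(s, X^x(s))\,ds \]
and
\[ X^y(t) = y + \int_0^t f_1(s, X^y(s))\,dB_s + \int_0^t f_0(s, X^y(s))\,ds . \]
Therefore, by Doob's maximal inequality,
\begin{align*}
\mathbb{E} \left\{ \sup_{0 \le t \le T} |X^x(t) - X^y(t)|^2 \right\}
\le {} & 3|x - y|^2 + 3\,\mathbb{E} \left\{ \sup_{0 \le t \le T} \left| \int_0^t \left( f_1(s, X^x(s)) - f_1(s, X^y(s)) \right) dB_s \right|^2 \right\} \\
& + 3\,\mathbb{E} \left\{ \sup_{0 \le t \le T} \left| \int_0^t \left( f_0(s, X^x(s)) - f_0(s, X^y(s)) \right) ds \right|^2 \right\} \\
\le {} & 3|x - y|^2 + 12\,\mathbb{E} \left\{ \left| \int_0^T \left( f_1(s, X^x(s)) - f_1(s, X^y(s)) \right) dB_s \right|^2 \right\} \\
& + 3T\,\mathbb{E} \int_0^T \left| f_0(s, X^x(s)) - f_0(s, X^y(s)) \right|^2 ds \\
\le {} & 3|x - y|^2 + 12\,\mathbb{E} \int_0^T \left| f_1(s, X^x(s)) - f_1(s, X^y(s)) \right|^2 ds \\
& + 3T C^2\,\mathbb{E} \int_0^T \left| X^x(s) - X^y(s) \right|^2 ds \\
\le {} & 3|x - y|^2 + 3C^2 (4 + T) \int_0^T \mathbb{E} \left| X^x(t) - X^y(t) \right|^2 dt .
\end{align*}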
Setting
\[ \varphi(t) = \mathbb{E} \sup_{0 \le s \le t} |X^x(s) - X^y(s)|^2 , \]
we then have
\[ \varphi(T) \le 3|x - y|^2 + 3C^2 (4 + T) \int_0^T \varphi(t)\,dt \]
and therefore by Gronwall's inequality
\[ \varphi(T) \le 3|x - y|^2 \exp\left( 3C^2 (4 + T) T \right) \]
which yields (3.12).
3.3 Martingales and weak solutions

For simplicity, let us consider the following one-dimensional, homogeneous SDE
\[ dX_t = \sigma(X_t)\,dB_t + b(X_t)\,dt \tag{3.13} \]
where $\sigma \in C^\infty(\mathbb{R})$ is a positive smooth function with at most linear growth, and $b \in C^\infty(\mathbb{R})$ has at most linear growth. Let $X = (X_t)_{t \ge 0}$ be the strong solution with initial value $X_0$ on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$. If $f \in C_b^2(\mathbb{R})$, then by Itô's formula
\begin{align*}
f(X_t) - f(X_0) &= \int_0^t f'(X_s)\,dX_s + \frac{1}{2} \int_0^t f''(X_s)\,d\langle X \rangle_s \\
&= \int_0^t f'(X_s) \left( \sigma(X_s)\,dB_s + b(X_s)\,ds \right) + \frac{1}{2} \int_0^t f''(X_s)\,\sigma^2(X_s)\,ds \\
&= \int_0^t \sigma(X_s) f'(X_s)\,dB_s + \int_0^t \left( \frac{1}{2}\sigma^2 f'' + b f' \right)(X_s)\,ds .
\end{align*}
Let us introduce
\[ L = \frac{1}{2}\sigma(x)^2 \frac{d^2}{dx^2} + b(x) \frac{d}{dx} \tag{3.14} \]
which is an elliptic differential operator of second order. Then the previous formula may be written as
\[ f(X_t) - f(X_0) = \int_0^t \sigma(X_s) f'(X_s)\,dB_s + \int_0^t (Lf)(X_s)\,ds . \]
If we set
\[ M_t^f = f(X_t) - f(X_0) - \int_0^t (Lf)(X_s)\,ds , \]
then
\[ M_t^f = \int_0^t \sigma(X_s) f'(X_s)\,dB_s \]
is a martingale on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$, and
\[ \langle M^f , M^g \rangle_t = \int_0^t \left( \sigma^2 f' g' \right)(X_s)\,ds . \]
Lemma 3.3.1 If $(X_t)_{t \ge 0}$ is a strong solution to SDE (3.13) on $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$ (with a given Brownian motion) then for any $f \in C_b^2(\mathbb{R})$
\[ M_t^f = f(X_t) - f(X_0) - \int_0^t (Lf)(X_s)\,ds \]
is a martingale under the probability $\mathbb{P}$, where $L$ is defined by (3.14).

For example, if $\sigma = 1$ and $b = 0$ (in this case $L = \frac{1}{2}\frac{d^2}{dx^2} = \frac{1}{2}\Delta$), then $(B_t)_{t \ge 0}$ itself is a strong solution to
\[ dX_t = dB_t \]
so that
\[ M_t^f = f(B_t) - f(B_0) - \int_0^t \frac{1}{2}(\Delta f)(B_s)\,ds \]
is a martingale under $\mathbb{P}$. On the other hand, Lévy's martingale characterization shows that the property that
\[ f(B_t) - f(B_0) - \int_0^t \frac{1}{2}(\Delta f)(B_s)\,ds \]
is a martingale completely characterizes Brownian motion. Therefore we may expect that the martingale property of all $M^f$ should completely characterize the distribution of a solution $(X_t)_{t \ge 0}$ to SDE (3.13), and hence that of any weak solution of (3.13). Thus we give
Definition 3.3.2 Let $L$ be a linear operator on $C^\infty(\mathbb{R})$. Let $(X_t)_{t \ge 0}$ be a stochastic process on a filtered space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$. Then we say that $(X_t)_{t \ge 0}$ together with the probability $\mathbb{P}$ is a solution to the $L$-martingale problem, if for every $f \in C_b^\infty(\mathbb{R})$
\[ M_t^f \equiv f(X_t) - f(X_0) - \int_0^t Lf(X_s)\,ds \]
is a local martingale under the probability $\mathbb{P}$.
Therefore a strong solution $(X_t)_{t \ge 0}$ of SDE (3.13) on $(\Omega, \mathcal{F}, \mathbb{P})$ is a solution to the $L$-martingale problem, where $L$ is given by (3.14):
\[ M_t^f = f(X_t) - f(X_0) - \int_0^t Lf(X_s)\,ds \]
is a martingale under $\mathbb{P}$. Moreover, since
\[ L(fg) - f(Lg) - g(Lf) = \sigma^2 f' g' \]
we thus have
\[ \langle M^f , M^g \rangle_t = \int_0^t \left\{ L(fg) - f(Lg) - g(Lf) \right\}(X_s)\,ds . \]
Conversely, we can show that any solution to the $L$-martingale problem is a weak solution to the SDE.
Theorem 3.3.3 Let $b$, $\sigma$ be Borel measurable functions on $\mathbb{R}$ which are bounded on any compact subset, and let
\[ L = \frac{1}{2}\sigma(x)^2 \frac{d^2}{dx^2} + b(x) \frac{d}{dx} . \]
If $(X_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is a continuous process solving the $L$-martingale problem: for any $f \in C_b^2(\mathbb{R})$
\[ M_t^f = f(X_t) - f(X_0) - \int_0^t Lf(X_s)\,ds \]
is a continuous local martingale, then $(X_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is a weak solution to the SDE
\[ dX_t = \sigma(X_t)\,dB_t + b(X_t)\,dt . \tag{3.15} \]
Let us only outline the proof; a detailed proof will be given in the next section, which handles the multi-dimensional case. To show that $(X_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is a weak solution, we need to construct a Brownian motion $B = (B_t)_{t \ge 0}$ such that
\[ X_t = X_0 + \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds . \tag{3.16} \]
The key of the proof is to compute $\langle X \rangle_t$, and the result is
\begin{align*}
\left\langle M^f , M^g \right\rangle_t &= \int_0^t \left( L(fg) - fLg - gLf \right)(X_s)\,ds \\
&= \int_0^t \left( \sigma^2 \frac{\partial f}{\partial x} \frac{\partial g}{\partial x} \right)(X_s)\,ds .
\end{align*}
In particular, if we choose $f(x) = x$, the coordinate function (and write in this case $M^f$ as $M$), then
\[ \langle M \rangle_t = \int_0^t \left( \sigma(X_s) \right)^2 ds \]
so that
\[ B_t = \int_0^t \frac{1}{\sigma(X_s)}\,dM_s \]
is a Brownian motion (Lévy's martingale characterization for Brownian motion). It is then obvious that $(X_t, B_t)$ satisfies the stochastic integral equation (3.16), so that $(X_t)_{t \ge 0}$ is a weak solution to (3.15).
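The identity $L(fg) - fLg - gLf = \sigma^2 f' g'$ behind the bracket computation can be checked numerically. The sketch below is illustrative only: the coefficients `sigma`, `b` and the test functions `f`, `g` are hypothetical choices, and derivatives are approximated by central finite differences rather than computed symbolically.

```python
import math

def d1(f, x, h=1e-4):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def generator(sigma, b):
    """Return f -> Lf with L = (1/2) sigma(x)^2 d^2/dx^2 + b(x) d/dx."""
    def apply(f):
        return lambda x: 0.5 * sigma(x) ** 2 * d2(f, x) + b(x) * d1(f, x)
    return apply

# Hypothetical smooth coefficients and test functions.
sigma = lambda x: 1.0 + 0.5 * math.sin(x)
b = lambda x: -x
f = math.sin
g = lambda x: math.exp(-x * x)

L = generator(sigma, b)

def bracket_density(x):
    """(L(fg) - f Lg - g Lf)(x), the integrand of <M^f, M^g>."""
    fg = lambda y: f(y) * g(y)
    return L(fg)(x) - f(x) * L(g)(x) - g(x) * L(f)(x)

def sigma2_fprime_gprime(x):
    """sigma(x)^2 f'(x) g'(x) with the exact derivatives of f and g."""
    return sigma(x) ** 2 * math.cos(x) * (-2.0 * x * math.exp(-x * x))

x0 = 0.7
lhs, rhs = bracket_density(x0), sigma2_fprime_gprime(x0)
```

The first-order terms cancel by the product rule, so the drift $b$ does not enter the bracket; only the diffusion coefficient survives.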
Chapter 4
Appendix: martingales in discrete-time
In this appendix we collect several fundamental results about martingales in discrete-time, including Doob's martingale inequalities and the convergence theorem for martingales. A good reference is the book Probability with Martingales by D. Williams, which is the textbook for the third year course B10a: Martingales through measure theory.
4.1 Several notions

In probability theory, we study those properties of random variables which are determined by their distributions. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $\mathbb{Z}_+$ denote the set of all non-negative integers. An increasing family $\{\mathcal{F}_n\}_{n \in \mathbb{Z}_+}$ of sub $\sigma$-algebras of $\mathcal{F}$ is called a filtration, and a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ together with a filtration $\{\mathcal{F}_n\}_{n \in \mathbb{Z}_+}$ is called a filtered probability space, denoted by $(\Omega, \mathcal{F}_n, \mathcal{F}, \mathbb{P})$. Given a sequence $X \equiv \{X_n\}_{n \in \mathbb{Z}_+}$ of random variables on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the natural filtration of the sequence $\{X_n\}$ is defined to be $\mathcal{F}_n^X = \sigma\{X_m : m \le n\}$, the smallest $\sigma$-algebra such that $X_0, \dots, X_n$ are measurable. If $X = \{X_n\}$ represents the process of a random phenomenon evolving in discrete-time, then $\mathcal{F}_n^X$ is the information up to time $n$.
Definition 4.1.1 A sequence $\{X_n : n \in \mathbb{Z}_+\}$ of random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ is adapted to $\{\mathcal{F}_n\}$ if for every $n \in \mathbb{Z}_+$, $X_n$ is $\mathcal{F}_n$-measurable. In this case we say $\{X_n\}$ is an adapted sequence, or adapted process with respect to $\{\mathcal{F}_n\}$.
If $X_n$ is $\mathcal{F}_{n-1}$-measurable for any $n \in \mathbb{N}$ and $X_0 \in \mathcal{F}_0$, then we say $\{X_n\}$ is predictable.
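Definition 4.1.2 Let $\{\mathcal{F}_n : n \in \mathbb{Z}_+\}$ be a filtration on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then a measurable function $T : \Omega \to \mathbb{Z}_+ \cup \{+\infty\}$ (allowed to take the value $+\infty$) is called a stopping time (or a random time) with respect to the filtration $\{\mathcal{F}_n\}$ if $\{\omega : T(\omega) = n\} \in \mathcal{F}_n$ for every $n$.
Remark 4.1.3 1. By definition, if $T$ is a stopping time, then $\{\omega : T(\omega) = +\infty\} \in \mathcal{F}$.
2. A random variable $T : \Omega \to \mathbb{Z}_+ \cup \{+\infty\}$ is a stopping time if and only if $\{\omega : T(\omega) \le n\} \in \mathcal{F}_n$ for every $n$. Indeed
\[ \{\omega : T(\omega) \le n\} = \bigcup_{k=0}^{n} \{\omega : T(\omega) = k\} . \]
Of course, a constant time $T = n$ for some $n \in \mathbb{N}$ or $+\infty$ is a stopping time.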
Example 4.1.4 A basic example of stopping times is the following. Let $\{X_n\}$ be an adapted process on a filtered probability space $(\Omega, \mathcal{F}_n, \mathcal{F}, \mathbb{P})$, and let $B \in \mathcal{B}(\mathbb{R})$. Then the first time $T$ that the process $\{X_n\}$ hits the domain $B$:
\[ T(\omega) = \inf\{n \ge 0 : X_n(\omega) \in B\} \]
(with the convention that $\inf \emptyset = +\infty$) is a stopping time with respect to the filtration $\{\mathcal{F}_n\}$. $T$ is called a hitting time.
To see why such a hitting time is really a stopping time, we observe that
\[ \{T = n\} = \bigcap_{k=0}^{n-1} \{X_k \in B^c\} \cap \{X_n \in B\} . \]
Since $\{X_n\}$ is adapted, $\{X_k \in B^c\} \in \mathcal{F}_k$ and $\{X_n \in B\} \in \mathcal{F}_n$. Hence $\{T = n\} \in \mathcal{F}_n$ and therefore $T$ is a stopping time.
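The decomposition of $\{T = n\}$ above says that whether the hitting time equals $n$ can be decided from $X_0, \dots, X_n$ alone. A minimal sketch for a single sample path follows; the function names and the example path are hypothetical, chosen only for illustration.

```python
import math

def hitting_time(path, in_b):
    """inf{n >= 0 : X_n in B}, with the convention inf(empty set) = +infinity."""
    for n, x in enumerate(path):
        if in_b(x):
            return n
    return math.inf

def event_t_equals_n(path, in_b, n):
    """Decide the event {T = n} using X_0, ..., X_n only."""
    prefix = path[: n + 1]
    if len(prefix) < n + 1:
        return False
    # X_k in B^c for all k < n, and X_n in B
    return all(not in_b(x) for x in prefix[:n]) and in_b(prefix[n])

in_b = lambda x: x >= 2              # B = [2, +infinity)
path = [0, 1, -1, 2, 3, 1]
T = hitting_time(path, in_b)
# Changing the path after time 3 does not change the event {T = 3}.
altered = path[:4] + [-5, -5]
```

The helper `event_t_equals_n` never looks beyond index `n`, which is the computational counterpart of $\{T = n\} \in \mathcal{F}_n$.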
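Given a stopping time $T$ on the filtered probability space $(\Omega, \mathcal{F}_n, \mathcal{F}, \mathbb{P})$, the $\sigma$-algebra that represents the information available up to the random time $T$ is defined by
\[ \mathcal{F}_T = \{A \in \mathcal{F} : A \cap \{T \le n\} \in \mathcal{F}_n \text{ for any } n \in \mathbb{Z}_+\} . \]
It is obvious that $\mathcal{F}_T = \mathcal{F}_n$ if $T = n$ is a constant time $n$.
Theorem 4.1.5 Let $\{X_n\}$ be an adapted process on $(\Omega, \mathcal{F}_n, \mathcal{F}, \mathbb{P})$, let $\xi$ be a random variable, and set $X_\infty = \xi$. Let $T$ be a stopping time with respect to $\{\mathcal{F}_n\}$, and define $X_T(\omega) = X_{T(\omega)}(\omega)$ for any $\omega \in \Omega$. Then $X_T$ is $\mathcal{F}_T$-measurable. In particular, $X_T 1_{\{T < \infty\}}$ is $\mathcal{F}_T$-measurable.
Proof. For any $r \in \mathbb{R}$, we have
\[ \{X_T \le r\} \cap \{T \le n\} = \bigcup_{k=0}^{n} \{X_k \le r\} \cap \{T = k\} \]
which belongs to $\mathcal{F}_n$ as $\{X_k \le r\} \cap \{T = k\} \in \mathcal{F}_k$, $k = 0, 1, \dots, n$, so that $X_T$ is $\mathcal{F}_T$-measurable.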
The concept of a martingale originated from a fair game which once was (and perhaps still is) popular, in which, regardless of the whims of chance in deciding the past and present outcomes, a gamer's expected fortune after the next play is exactly the gamer's current capital.
Definition 4.1.6 Let $\{X_n\}$ be an adapted process on a filtered probability space $(\Omega, \mathcal{F}_n, \mathcal{F}, \mathbb{P})$. Suppose each $X_n \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ (i.e. $X_n$ is integrable).
1. $\{X_n\}$ is a martingale, if $\mathbb{E}(X_{n+1} | \mathcal{F}_n) = X_n$ for all $n$.
2. $\{X_n\}$ is a supermartingale (resp. a submartingale), if $\mathbb{E}(X_{n+1} | \mathcal{F}_n) \le X_n$ (resp. $\mathbb{E}(X_{n+1} | \mathcal{F}_n) \ge X_n$) for every $n$.
Example 4.1.7 (Martingale transform) Let $\{H_n\}$ be a bounded predictable process and let $\{X_n\}$ be a martingale, and let
\[ (H.X)_n = \sum_{k=1}^{n} H_k (X_k - X_{k-1}) , \quad (H.X)_0 = 0 . \]
Then $\{(H.X)_n\}$ is a martingale.
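The martingale transform is the discrete-time model of a betting strategy: $H_k$ is the stake placed on the $k$-th play, chosen from the past only. The sketch below is an illustration under assumptions: $X$ is taken to be the simple symmetric random walk, the capped doubling strategy `double_after_loss` is a hypothetical example of a bounded predictable process, and expectations are computed exactly by enumerating all $2^n$ paths.

```python
from itertools import product

def martingale_transform(h, x):
    """(H.X)_n = sum_{k=1}^n H_k (X_k - X_{k-1}) along one path x = (X_0, ..., X_n).

    h(k, past) may only use past = (X_0, ..., X_{k-1}), i.e. H is predictable.
    """
    total = 0.0
    for k in range(1, len(x)):
        total += h(k, x[:k]) * (x[k] - x[k - 1])
    return total

def double_after_loss(k, past):
    """Hypothetical bounded strategy: stake 1, doubled after each consecutive loss, capped at 8."""
    stake = 1.0
    for j in range(len(past) - 1, 0, -1):
        if past[j] < past[j - 1]:
            stake *= 2.0
        else:
            break
    return min(stake, 8.0)

def expected_transform(h, n):
    """E (H.X)_n for the simple symmetric random walk, by exact enumeration."""
    total = 0.0
    for steps in product((-1, 1), repeat=n):
        x = [0]
        for s in steps:
            x.append(x[-1] + s)
        total += martingale_transform(h, tuple(x))
    return total / 2 ** n

mean_gain = expected_transform(double_after_loss, 8)
```

Since $(H.X)$ is a martingale started at $0$, its expectation stays $0$ at every time, whatever bounded predictable strategy is used.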

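Recall Jensen's inequality for conditional expectation: if $\varphi : \mathbb{R} \to \mathbb{R}$ is a convex function, $\xi, \varphi(\xi) \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, and $\mathcal{G}$ is a sub $\sigma$-field of $\mathcal{F}$, then
\[ \varphi\left( \mathbb{E}(\xi | \mathcal{G}) \right) \le \mathbb{E}\left( \varphi(\xi) | \mathcal{G} \right) . \]
The functions $\varphi(t) = (t \ln t) 1_{(1, \infty)}(t)$, $t 1_{(0, \infty)}(t)$ and $|t|^p$ (for $p \ge 1$) are all convex.
Theorem 4.1.8 1. Let $\{X_n\}$ be a martingale, and let $\varphi : \mathbb{R} \to \mathbb{R}$ be a convex function. If $\varphi(X_n)$ is integrable for any $n$, then $\{\varphi(X_n)\}$ is a submartingale.
2. Let $\{X_n\}$ be a submartingale, and let $\varphi : \mathbb{R} \to \mathbb{R}$ be an increasing and convex function. If $\varphi(X_n)$ is integrable for any $n$, then $\{\varphi(X_n)\}$ is a submartingale.
Proof. For example, let us prove the first claim. Indeed
\begin{align*}
\varphi(X_n) &= \varphi\left( \mathbb{E}(X_{n+1} | \mathcal{F}_n) \right) \quad \text{(martingale property)} \\
&\le \mathbb{E}\left( \varphi(X_{n+1}) | \mathcal{F}_n \right) \quad \text{(Jensen's inequality).}
\end{align*}
Corollary 4.1.9 If $X = (X_n)$ is a sub-martingale, so is $(X_n^+)$. If, in addition, each $X_n \log^+ X_n$ is integrable, then $(X_n \log^+ X_n)$ is a sub-martingale, where $\log^+ x = 1_{\{x \ge 1\}} \log x$.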
4.2 Doob's inequalities

The most important result about martingales is Doob's optional sampling theorem, which says that the (super-, sub-) martingale property holds at random times.
Theorem 4.2.1 Let $\{X_n\}$ be a martingale (resp. supermartingale), and let $S$, $T$ be two bounded stopping times. Suppose $S \le T$. Then $\mathbb{E}(X_T | \mathcal{F}_S) = X_S$ (resp. $\mathbb{E}(X_T | \mathcal{F}_S) \le X_S$).
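Proof. We prove this theorem for the case that $\{X_n\}$ is a supermartingale. Let $T \le n$ (as it is bounded by our assumption). Then
\begin{align*}
\mathbb{E}|X_T| &= \sum_{j=0}^{n} \mathbb{E}\left( |X_T| : T = j \right) \\
&= \sum_{j=0}^{n} \mathbb{E}\left( |X_j| : T = j \right) \\
&\le \sum_{j=0}^{n} \mathbb{E}|X_j| ,
\end{align*}
which implies that $X_T$ is integrable. Similarly $X_S \in L^1(\Omega, \mathcal{F}_S, \mathbb{P})$.
Let $A \in \mathcal{F}_S$, $j \ge 0$. Then $A \cap \{S = j\} \in \mathcal{F}_j$ and $\{T > j\} \in \mathcal{F}_j$ as $S$ and $T$ are stopping times. We consider several cases.
1. If $0 \le T - S \le 1$, then
\[ \int_A (X_S - X_T)\,d\mathbb{P} = \sum_{j=0}^{n} \int_{A \cap \{S = j\} \cap \{T > j\}} (X_j - X_{j+1})\,d\mathbb{P} \ge 0 \]
as each term in the above sum is non-negative.
2. General case. Set $R_j = T \wedge (S + j)$, $j = 1, \dots, n$. Then each $R_j$ is a stopping time,
\[ S \le R_1 \le \dots \le R_n = T \]
and
\[ R_1 - S \le 1 , \quad R_{j+1} - R_j \le 1 , \quad \text{for } 1 \le j \le n - 1 . \]
Let $A \in \mathcal{F}_S$. Then $A \in \mathcal{F}_{R_j}$ (as $S \le R_j$). Therefore, applying the first case to $R_j$, we have
\[ \int_A X_S\,d\mathbb{P} \ge \int_A X_{R_1}\,d\mathbb{P} \ge \dots \ge \int_A X_T\,d\mathbb{P} \]
so that
\[ \mathbb{E}(1_A X_S) \ge \mathbb{E}(1_A X_T) \quad \text{for any } A \in \mathcal{F}_S . \]
Since $X_S \in \mathcal{F}_S$ we may thus conclude that
\[ X_S \ge \mathbb{E}(X_T | \mathcal{F}_S) . \]
Thus we have proved the theorem.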
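Corollary 4.2.2 Let $\{X_n\}$ be a supermartingale, and let $T$ be a stopping time. Then
\[ \mathbb{E}|X_{T \wedge k}| \le \mathbb{E}(X_0) + 2\,\mathbb{E}\left( X_k^- \right) \quad \text{for any } k = 0, 1, 2, \dots \]
and therefore
\[ \mathbb{E}\left( |X_T| 1_{\{T < \infty\}} \right) \le 3 \sup_n \mathbb{E}|X_n| . \]
Proof. Since $\{X_n\}$ is a supermartingale, $\{X_n^-\}$ is a submartingale, so by the previous theorem we have
\begin{align*}
\mathbb{E}|X_{T \wedge k}| &= \mathbb{E} X_{T \wedge k} + 2\,\mathbb{E}\left( X_{T \wedge k}^- \right) \\
&\le \mathbb{E} X_0 + 2\,\mathbb{E}\left( X_k^- \right) .
\end{align*}
This is the first inequality. Moreover
\begin{align*}
\mathbb{E}\left( |X_{T \wedge k}| 1_{\{T < \infty\}} \right) &\le \mathbb{E} X_0 + 2\,\mathbb{E}\left( X_k^- \right) \\
&\le 3 \sup_n \mathbb{E}\left( |X_n| \right)
\end{align*}
and the second inequality thus follows from the Fatou lemma.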
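Theorem 4.2.3 (Doob's maximal inequality). Let $\{X_n\}$ be a super-martingale, and let $n \ge 0$. Then for any $\lambda > 0$, we have
\begin{align*}
\lambda\,\mathbb{P}\left\{ \sup_{k \le n} X_k \ge \lambda \right\} &\le \mathbb{E} X_0 - \int_{\{\sup_{k \le n} X_k < \lambda\}} X_n\,d\mathbb{P} \\
&= \mathbb{E} X_0 - \mathbb{E}\left( X_n ; \sup_{k \le n} X_k < \lambda \right) ;
\end{align*}
\begin{align*}
\lambda\,\mathbb{P}\left\{ \inf_{k \le n} X_k \le -\lambda \right\} &\le - \int_{\{\inf_{k \le n} X_k \le -\lambda\}} X_n\,d\mathbb{P} \\
&= - \mathbb{E}\left( X_n ; \inf_{k \le n} X_k \le -\lambda \right)
\end{align*}
and
\[ \lambda\,\mathbb{P}\left\{ \sup_{k \le n} |X_k| \ge \lambda \right\} \le \mathbb{E} X_0 + 2\,\mathbb{E}\left( X_n^- \right) . \]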
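Proof. Let us prove the first inequality. Let $R = \inf\{k \ge 0 : X_k \ge \lambda\}$ and $T = R \wedge n$. Then $T$ is a bounded stopping time. By definition,
\[ X_R \ge \lambda , \quad \text{on } \{R < \infty\} , \]
so that
\[ \left\{ \sup_{k \le n} X_k \ge \lambda \right\} \subset \{X_T \ge \lambda\} , \quad \left\{ \sup_{k \le n} X_k < \lambda \right\} \subset \{T = n\} . \]
By Doob's optional sampling theorem,
\begin{align*}
\mathbb{E} X_0 &\ge \mathbb{E} X_T \\
&= \int_{\{\sup_{k \le n} X_k \ge \lambda\}} X_T\,d\mathbb{P} + \int_{\{\sup_{k \le n} X_k < \lambda\}} X_T\,d\mathbb{P} \\
&\ge \lambda\,\mathbb{P}\left\{ \sup_{k \le n} X_k \ge \lambda \right\} + \int_{\{\sup_{k \le n} X_k < \lambda\}} X_n\,d\mathbb{P} \\
&= \lambda\,\mathbb{P}\left\{ \sup_{k \le n} X_k \ge \lambda \right\} + \mathbb{E}\left( X_n ; \sup_{k \le n} X_k < \lambda \right) .
\end{align*}
In order to prove the second inequality, we set $Y_k = -X_k$. Then $\{Y_n\}$ is a submartingale. Define
\[ R = \inf\{k \ge 0 : Y_k \ge \lambda\} , \quad T = R \wedge n . \]
Then $T$ is a stopping time and $T \le n$. Again we have
\[ \left\{ \sup_{k \le n} Y_k \ge \lambda \right\} \subset \{Y_T \ge \lambda\} ; \quad \left\{ \sup_{k \le n} Y_k < \lambda \right\} \subset \{T = n\} . \]
Therefore by applying Doob's stopping theorem to $Y$ we have
\begin{align*}
\mathbb{E} Y_n &\ge \mathbb{E} Y_T \\
&= \int_{\{\sup_{k \le n} Y_k \ge \lambda\}} Y_T\,d\mathbb{P} + \int_{\{\sup_{k \le n} Y_k < \lambda\}} Y_T\,d\mathbb{P} \\
&\ge \lambda\,\mathbb{P}\left\{ \sup_{k \le n} Y_k \ge \lambda \right\} + \int_{\{\sup_{k \le n} Y_k < \lambda\}} Y_n\,d\mathbb{P} \\
&= \lambda\,\mathbb{P}\left\{ \inf_{k \le n} X_k \le -\lambda \right\} + \int_{\{\sup_{k \le n} Y_k < \lambda\}} Y_n\,d\mathbb{P} .
\end{align*}
Therefore
\begin{align*}
\lambda\,\mathbb{P}\left\{ \sup_{k \le n} Y_k \ge \lambda \right\} &= \lambda\,\mathbb{P}\left\{ \inf_{k \le n} X_k \le -\lambda \right\} \\
&\le \mathbb{E} Y_n - \int_{\{\sup_{k \le n} Y_k < \lambda\}} Y_n\,d\mathbb{P} \\
&= \int_{\{\sup_{k \le n} Y_k \ge \lambda\}} Y_n\,d\mathbb{P} \\
&= - \int_{\{\inf_{k \le n} X_k \le -\lambda\}} X_n\,d\mathbb{P} .
\end{align*}
The third inequality follows from the first two inequalities.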

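As a consequence we have
Theorem 4.2.4 (Kolmogorov's inequality) Let $\{X_n\}$ be a martingale and let $X_n \in L^2(\Omega, \mathcal{F}, \mathbb{P})$. Then for any $\lambda > 0$,
\[ \mathbb{P}\left\{ \sup_{k \le n} |X_k| \ge \lambda \right\} \le \frac{1}{\lambda^2}\,\mathbb{E}\left( X_n^2 \right) . \]
Proof. By Jensen's inequality, for any $k \le n$ we have
\[ \mathbb{E}\left( X_k^2 \right) = \mathbb{E}\left\{ \left( \mathbb{E}(X_n | \mathcal{F}_k) \right)^2 \right\} \le \mathbb{E}\left( X_n^2 \right) < \infty . \]
Therefore $(-X_k^2)$ ($k = 0, 1, \dots, n$) is a supermartingale. By the second inequality in the above theorem, we have
\[ \lambda^2\,\mathbb{P}\left\{ \inf_{k \le n} \left( -X_k^2 \right) \le -\lambda^2 \right\} \le \int_{\{\inf_{k \le n} (-X_k^2) \le -\lambda^2\}} X_n^2\,d\mathbb{P} \]
and therefore
\begin{align*}
\lambda^2\,\mathbb{P}\left\{ \sup_{k \le n} X_k^2 \ge \lambda^2 \right\} &\le \int_{\{\inf_{k \le n} (-X_k^2) \le -\lambda^2\}} X_n^2\,d\mathbb{P} \\
&\le \int_\Omega X_n^2\,d\mathbb{P} = \mathbb{E}\left( X_n^2 \right) .
\end{align*}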
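Next we establish Doob's $L^p$-inequality. Let $X_n^* = \max_{k \le n} X_k$. If $\varphi : \mathbb{R}_+ \to [0, \infty)$ is a continuous and increasing function such that $\varphi(0) = 0$, then
\[ \mathbb{E}\varphi(X_n^*) = \mathbb{E}\left\{ \int_0^{X_n^*} d\varphi(\lambda) \right\} = \int_\Omega \int_{[0, X_n^*]} d\varphi(\lambda)\,d\mathbb{P} . \]
Theorem 4.2.5 (Doob's $L^p$-inequality) 1. If $(X_n)$ is a sub-martingale, then for any $p > 1$
\[ \mathbb{E}\max_{k \le n} \left( X_k^+ \right)^p \le \left( \frac{p}{p-1} \right)^p \mathbb{E}|X_n^+|^p . \]
2. If $(X_n)$ is a martingale, then for any $p > 1$,
\[ \mathbb{E}\max_{k \le n} |X_k|^p \le \left( \frac{p}{p-1} \right)^p \mathbb{E}|X_n|^p . \]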
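Proof. If $(X_n)$ is a martingale, then $(|X_n|)$ is a submartingale, so 2) follows from 1). Let us prove the first conclusion. By replacing $(X_n)$ by $(X_n^+)$, we may, without loss of generality, assume that $(X_n)$ is a non-negative sub-martingale. By Fubini's theorem
\begin{align*}
\mathbb{E}\left\{ \varphi(X_n^*) \right\} &= \mathbb{E}\left\{ \int_0^{X_n^*} d\varphi(\lambda) \right\} \\
&= \int_0^\infty \mathbb{P}(X_n^* \ge \lambda)\,d\varphi(\lambda)
\end{align*}
which, together with Doob's maximal inequality
\[ \mathbb{P}(X_n^* \ge \lambda) \le \frac{1}{\lambda}\,\mathbb{E}\left\{ X_n ; X_n^* \ge \lambda \right\} , \]
gives
\begin{align*}
\mathbb{E}\left\{ \varphi(X_n^*) \right\} &\le \int_0^\infty \frac{1}{\lambda}\,\mathbb{E}\left\{ X_n ; X_n^* \ge \lambda \right\} d\varphi(\lambda) \\
&= \int_0^\infty \frac{1}{\lambda} \int_{\{X_n^* \ge \lambda\}} X_n\,d\mathbb{P}\,d\varphi(\lambda) \\
&= \mathbb{E}\left\{ X_n \left( \int_0^{X_n^*} \frac{1}{\lambda}\,d\varphi(\lambda) \right) \right\} . \tag{4.1}
\end{align*}
Choose $\varphi(\lambda) = \lambda^p$, then $\varphi'(\lambda) = p\lambda^{p-1}$, and therefore
\begin{align*}
\mathbb{E}|X_n^*|^p &\le \mathbb{E}\left\{ X_n \left( \int_0^{X_n^*} \frac{1}{\lambda}\,p\lambda^{p-1}\,d\lambda \right) \right\} \\
&= \frac{p}{p-1}\,\mathbb{E}\left\{ X_n (X_n^*)^{p-1} \right\} \\
&\le \frac{p}{p-1} \left( \mathbb{E}|X_n|^p \right)^{\frac{1}{p}} \left( \mathbb{E}|X_n^*|^p \right)^{\frac{1}{q}}
\end{align*}
where $\frac{1}{p} + \frac{1}{q} = 1$ and the last inequality follows from Hölder's inequality. Dividing both sides by $\left( \mathbb{E}|X_n^*|^p \right)^{1/q}$ and raising to the power $p$ yields the claim.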
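For $p = 2$ the martingale version of the inequality reads $\mathbb{E}\max_{k \le n} |X_k|^2 \le 4\,\mathbb{E}|X_n|^2$. The sketch below checks it for the simple symmetric random walk by exact enumeration of all $2^n$ paths; the choice of this martingale and the function name `doob_l2_sides` are assumptions made for illustration.

```python
from itertools import product

def doob_l2_sides(n):
    """Return (E max_{k<=n} |S_k|^2, 4 E |S_n|^2) for the simple symmetric random walk."""
    lhs = 0
    rhs = 0
    for steps in product((-1, 1), repeat=n):
        s = 0
        running_max = 0          # max_{k<=n} |S_k|, with S_0 = 0
        for step in steps:
            s += step
            running_max = max(running_max, abs(s))
        lhs += running_max ** 2
        rhs += s ** 2
    scale = 2 ** n               # all paths are equally likely
    return lhs / scale, 4 * rhs / scale

lhs, rhs = doob_l2_sides(10)
```

Here $\mathbb{E} S_n^2 = n$, so the right-hand side equals $4n$, while the left-hand side lies between $\mathbb{E} S_n^2$ and that bound.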
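Exercise 4.2.6 1. Prove $\log x \le x/e$ for all $x > 0$, and hence prove that
\[ a \log^+ b \le a \log^+ a + \frac{b}{e} . \]
Theorem 4.2.7 (Doob's inequality) Let $(X_n)$ be a non-negative sub-martingale. Then
\[ \mathbb{E}\max_{k \le n} X_k \le \frac{e}{e-1} \left\{ 1 + \max_{k \le n} \mathbb{E}\left( X_k \log^+ X_k \right) \right\} . \]
Proof. We may use the same argument as in the proof of the previous theorem, but with the choice $\varphi(\lambda) = (\lambda - 1)^+$. We thus obtain (by (4.1))
\begin{align*}
\mathbb{E}\left\{ \varphi(X_n^*) \right\} &\le \mathbb{E}\left\{ X_n \left( \int_0^{X_n^*} \frac{1}{\lambda}\,d\varphi(\lambda) \right) \right\} \\
&= \mathbb{E}\left\{ X_n 1_{\{X_n^* \ge 1\}} \left( \int_1^{X_n^*} \frac{1}{\lambda}\,d\lambda \right) \right\} \\
&= \mathbb{E}\left( X_n \log^+ X_n^* \right) ,
\end{align*}
which thus implies that
\[ \mathbb{E}(X_n^* - 1) \le \mathbb{E}(X_n^* - 1)^+ \le \mathbb{E}\left( X_n \log^+ X_n^* \right) . \]
Together with the inequality (see the previous exercise)
\[ X_n \log^+ X_n^* \le X_n \log^+ X_n + \frac{1}{e} X_n^* \]
it thus follows that
\begin{align*}
\mathbb{E}(X_n^* - 1) &\le \mathbb{E}\left( X_n \log^+ X_n^* \right) \\
&\le \mathbb{E}\left( X_n \log^+ X_n \right) + \frac{1}{e}\,\mathbb{E} X_n^*
\end{align*}
so that
\[ \mathbb{E} X_n^* \le \frac{1}{1 - 1/e} \left\{ 1 + \mathbb{E}\left( X_n \log^+ X_n \right) \right\} . \]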
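4.3 The convergence theorem

Let $\{X_n : n \in \mathbb{Z}_+\}$ be an adapted sequence of random variables, and let $[a, b]$ be a closed interval. Define
\begin{align*}
T_0 &= \inf\{n \ge 0 : X_n \le a\} ; \\
T_1 &= \inf\{n > T_0 : X_n \ge b\} ; \\
&\;\;\vdots \\
T_{2j} &= \inf\{n > T_{2j-1} : X_n \le a\} ; \\
T_{2j+1} &= \inf\{n > T_{2j} : X_n \ge b\} .
\end{align*}
Then $\{T_k\}$ is a sequence of stopping times, which is increasing: $T_k \uparrow$. If $T_{2j-1}(\omega) < \infty$, then the sequence
\[ X_0(\omega), \dots, X_{T_{2j-1}}(\omega) \]
upcrosses the interval $[a, b]$ $j$ times. Denote by $U_a^b(X; n)$ the number of upcrossings of $[a, b]$ by $\{X_k\}$ up to time $n$. Then we have
\[ \{U_a^b(X; n) = j\} = \{T_{2j-1} \le n < T_{2j+1}\} \in \mathcal{F}_n . \]
Note that by definition,
\begin{align*}
X_{T_{2j}} &\le a , \quad \text{on } \{T_{2j} < \infty\} ; \\
X_{T_{2j+1}} &\ge b , \quad \text{on } \{T_{2j+1} < \infty\} .
\end{align*}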
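The stopping times $T_0, T_1, \dots$ translate directly into a small state machine that counts completed upcrossings along one path. The sketch below is illustrative; the function name `upcrossings` and the example path are hypothetical.

```python
def upcrossings(path, a, b):
    """Number of completed upcrossings of [a, b] by path[0], ..., path[n].

    An upcrossing is completed at the first time the path is >= b after
    having been <= a, which mirrors the alternating stopping times T_k.
    """
    count = 0
    below = False   # True once the path has visited (-inf, a] since the last upcrossing
    for x in path:
        if not below:
            if x <= a:
                below = True      # a time of the form T_{2j}
        elif x >= b:
            count += 1            # a time of the form T_{2j+1}
            below = False
    return count

path = [0, -1, 2, 0.5, -2, 3, 1, -1]
total = upcrossings(path, a=0, b=1)
# The count up to time n is non-decreasing in n.
counts = [upcrossings(path[: n + 1], 0, 1) for n in range(len(path))]
```

The count up to time $n$ only inspects $X_0, \dots, X_n$, in line with $\{U_a^b(X; n) = j\} \in \mathcal{F}_n$.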
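Theorem 4.3.1 (Doob's upcrossing theorem) 1. If $X = \{X_n\}$ is a supermartingale, then for any $n \ge 1$, $k \ge 0$, we have
\[ \mathbb{P}\left\{ U_a^b(X; n) \ge k \right\} \le \frac{1}{b - a}\,\mathbb{E}\left\{ (X_n - a)^- : U_a^b(X; n) = k \right\} \]
and
\[ \mathbb{E} U_a^b(X; n) \le \frac{1}{b - a}\,\mathbb{E}(X_n - a)^- . \]
2. Similarly, if $X = \{X_n\}$ is a submartingale, then
\[ \mathbb{P}\left\{ U_a^b(X; n) \ge k \right\} \le \frac{1}{b - a}\,\mathbb{E}\left\{ (X_n - a)^+ : U_a^b(X; n) = k \right\} \]
and
\[ \mathbb{E} U_a^b(X; n) \le \frac{1}{b - a}\,\mathbb{E}(X_n - a)^+ . \]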
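Proof. We first prove the inequalities for a supermartingale. Since $X$ is a supermartingale, by Doob's optional sampling theorem,
\begin{align*}
0 &\ge \mathbb{E}\left( X_{T_{2k+1} \wedge n} - X_{T_{2k} \wedge n} \right) \\
&= \mathbb{E}\left\{ \left( X_{T_{2k+1} \wedge n} - X_{T_{2k} \wedge n} \right) 1_{\{T_{2k} \le n < T_{2k+1}\}} \right\} + \mathbb{E}\left\{ \left( X_{T_{2k+1} \wedge n} - X_{T_{2k} \wedge n} \right) 1_{\{T_{2k+1} \le n\}} \right\} \\
&\ge \mathbb{E}\left\{ (X_n - a) 1_{\{T_{2k} \le n < T_{2k+1}\}} \right\} + \mathbb{E}\left\{ (b - a) 1_{\{T_{2k+1} \le n\}} \right\}
\end{align*}
(as $X_{T_{2k} \wedge n} = X_{T_{2k}} \le a$ on $\{T_{2k} \le n\}$, and $X_{T_{2k+1} \wedge n} = X_{T_{2k+1}} \ge b$ on $\{T_{2k+1} \le n\}$). However
\[ \{U_a^b(X; n) \ge k\} \subset \{T_{2k-1} \le n\} , \quad \{U_a^b(X; n) = k\} = \{T_{2k-1} \le n < T_{2k}\} \]
so that
\begin{align*}
0 &\ge \mathbb{E}\left\{ (X_n - a) 1_{\{U_a^b(X; n) = k\}} \right\} + \mathbb{E}\left\{ (b - a) 1_{\{U_a^b(X; n) \ge k\}} \right\} \\
&= \mathbb{E}\left\{ (X_n - a) 1_{\{U_a^b(X; n) = k\}} \right\} + (b - a)\,\mathbb{P}\{U_a^b(X; n) \ge k\} .
\end{align*}
Thus we get the first inequality. By adding up over all $k \ge 0$ we get the second inequality.
Now we prove the inequalities for a submartingale $X$. The argument is very similar. Again by Doob's stopping theorem,
\begin{align*}
0 &\ge \mathbb{E}\left( X_{T_{2k-1} \wedge n} - X_{T_{2k} \wedge n} \right) \\
&= \mathbb{E}\left\{ \left( X_{T_{2k-1} \wedge n} - X_{T_{2k} \wedge n} \right) 1_{\{T_{2k-1} \le n < T_{2k}\}} \right\} + \mathbb{E}\left\{ \left( X_{T_{2k-1} \wedge n} - X_{T_{2k} \wedge n} \right) 1_{\{T_{2k} \le n\}} \right\} \\
&\ge \mathbb{E}\left\{ (b - X_n) 1_{\{T_{2k-1} \le n < T_{2k}\}} \right\} + \mathbb{E}\left\{ (b - a) 1_{\{T_{2k} \le n\}} \right\} \\
&= \mathbb{E}\left\{ (a - X_n) 1_{\{T_{2k-1} \le n < T_{2k}\}} \right\} + \mathbb{E}\left\{ (b - a) 1_{\{T_{2k-1} \le n\}} \right\}
\end{align*}
which yields the wanted inequality.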
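Theorem 4.3.2 (The martingale convergence theorem). Let $\{X_n\}$ be a supermartingale. If $\sup_n \mathbb{E}|X_n| < +\infty$, then
\[ X_n \to X_\infty \quad \text{exists almost surely.} \]
Moreover if in addition $\{X_n\}$ is non-negative, then
\[ \mathbb{E}(X_\infty | \mathcal{F}_n) \le X_n \quad \text{for any } n . \]
Proof. For any rationals $a, b \in \mathbb{Q}$, $a < b$, we set
\[ U_a^b(X) = \lim_{n \to \infty} U_a^b(X; n) . \]
Then by the Fatou lemma
\begin{align*}
\mathbb{E} U_a^b(X) &\le \frac{1}{b - a} \sup_n \mathbb{E}(X_n - a)^- \\
&\le \frac{|a|}{b - a} + \frac{1}{b - a} \sup_n \mathbb{E}|X_n| < \infty .
\end{align*}
Therefore
\[ U_a^b(X) < \infty , \quad \text{almost surely.} \]
Let
\[ W_{(a,b)} = \left\{ \liminf_{n \to \infty} X_n < a , \; \limsup_{n \to \infty} X_n > b \right\} \]
and
\[ W = \bigcup_{(a,b)} W_{(a,b)} \]
the union over all rational pairs $(a, b)$, $a < b$, which is a countable union. Clearly
\[ W_{(a,b)} \subset \{U_a^b(X) = \infty\} \]
so that
\[ \mathbb{P}(W_{(a,b)}) = 0 . \]
Hence $\mathbb{P}(W) = 0$. However if $\omega \notin W$, then $\lim_{n \to \infty} X_n(\omega)$ exists, and we denote it by $X_\infty(\omega)$, and on $W$ we let $X_\infty(\omega) = 0$. Then we have $X_n \to X_\infty$ almost surely. Moreover by the Fatou lemma,
\[ \mathbb{E}|X_\infty| \le \sup_n \mathbb{E}|X_n| < \infty , \]
i.e. $X_\infty \in L^1(\Omega, \mathcal{F}, \mathbb{P})$.
If in addition $\{X_n\}$ is non-negative, then
\[ \mathbb{E}(X_m | \mathcal{F}_n) \le X_n , \quad \text{for any } m \ge n , \]
and by letting $m \to \infty$, the Fatou lemma then yields that
\[ \mathbb{E}(X_\infty | \mathcal{F}_n) \le X_n . \]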
