© M. Isabel Ribeiro, 2004
February 2004
Contents

1 Introduction
2 The Filtering Problem
3 Estimation of Random Parameters. General Results
4 The Kalman Filter
5 The Extended Kalman Filter
Bibliography
Chapter 1
Introduction
This report presents and derives the Kalman filter and the Extended Kalman filter dynamics. The general filtering problem is formulated and it is shown that, under linearity and Gaussian conditions on the system dynamics, the general filter particularizes to the Kalman filter. It is shown that the Kalman filter is a linear, discrete-time, finite-dimensional, time-varying system that evaluates the state estimate that minimizes the mean-square error.
The Kalman filter dynamics results from consecutive cycles of prediction and filtering. The dynamics of these cycles is derived and interpreted in the framework of Gaussian probability density functions. Under additional conditions on the system dynamics, the Kalman filter dynamics converges to a steady-state filter, and the steady-state gain is derived. The innovation process associated with the filter, which represents the novel information conveyed to the state estimate by the last system measurement, is introduced. The filter dynamics is interpreted in terms of the error ellipsoids associated with the Gaussian pdfs involved in the filter dynamics.
When either the system state dynamics or the observation dynamics is nonlinear, the conditional probability density functions that provide the minimum mean-square estimate are no longer Gaussian. The optimal nonlinear filter propagates these non-Gaussian functions and evaluates their means, which represents a high computational burden. A suboptimal approach to solve the problem, within the framework of linear filters, is the Extended Kalman filter (EKF). The EKF implements a Kalman filter for a system dynamics that results from the linearization of the original nonlinear system dynamics around the previous state estimates.
Chapter 2
The Filtering Problem
This section formulates the general filtering problem and explains the conditions
under which the general filter simplifies to a Kalman filter (KF).
The general filtering problem may be formulated along the following lines. Let

x(k + 1) = f(x(k), u(k), w(k))  (2.1)
y(k) = h(x(k), v(k))  (2.2)

be the state and observation dynamics, where x(k) is the state, u(k) the input, y(k) the measurement, and w(k) and v(k) the state and observation noises. The filtering problem is the propagation of the conditional pdf

p(x(k) | Y_1^k, U_0^{k-1})  (2.3)

for increasing values of k. This pdf conveys the amount of certainty on the knowledge of the value of x(k).
Consider that, for a given time instant k, the sequence of past inputs and the sequence of past measurements are denoted by

U_0^{k-1} = {u_0, u_1, ..., u_{k-1}}  (2.4)
Y_1^k = {y_1, y_2, ..., y_k}.  (2.5)
The entire system evolution and filtering process may be stated in the following steps, [1], considering the system dynamics (2.1)-(2.2):

Given x_0:
- Nature applies w_0,
- We apply u_0,
- The system moves to state x_1,
- We make a measurement y_1.
Question: what is the best estimate of x_1?
Answer: it is obtained from p(x_1 | Y_1^1, U_0^0).
- Nature applies w_1,
- We apply u_1,
- The system moves to state x_2,
- We make a measurement y_2.
Question: what is the best estimate of x_2?
Answer: it is obtained from p(x_2 | Y_1^2, U_0^1).
- ...
Question: what is the best estimate of x_{k-1}?
Answer: it is obtained from p(x_{k-1} | Y_1^{k-1}, U_0^{k-2}).
- Nature applies w_{k-1},
- We apply u_{k-1},
- The system moves to state x_k,
- We make a measurement y_k.
Question: what is the best estimate of x_k?
Answer: it is obtained from p(x_k | Y_1^k, U_0^{k-1}).
- ...
Therefore, aiming at obtaining the best state estimate, the filter propagates the conditional pdf for increasing values of k, and for each k it obtains the estimate x̂_k that optimizes a chosen criterion, as represented in the following diagram:

p(x_0) → p(x_1 | Y_1^1, U_0^0) → p(x_2 | Y_1^2, U_0^1) → ...
                  ↓                        ↓
                 x̂_1                      x̂_2
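The recursion above can be sketched numerically. The following minimal grid-based filter propagates the conditional pdf of a scalar state through a prediction step (convolution with the motion kernel) and a filtering step (multiplication by the measurement likelihood); the random-walk model and all numerical values are assumed purely for illustration and do not come from the report.

```python
import numpy as np

grid = np.linspace(-10.0, 10.0, 401)          # discretized scalar state space
dx = grid[1] - grid[0]

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

pdf = gauss(grid, 0.0, 4.0)                    # initial belief p(x_0)
q_var, r_var = 0.5, 1.0                        # motion / measurement noise variances
true_x, rng = 2.0, np.random.default_rng(0)

for k in range(10):
    # prediction: p(x_{k+1}|Y_1^k) = integral of p(x_{k+1}|x) p(x|Y_1^k) dx
    kernel = gauss(grid[:, None], grid[None, :], q_var)
    pdf = kernel @ pdf * dx
    # measurement, then filtering: multiply by the likelihood and renormalize
    y = true_x + rng.normal(0.0, np.sqrt(r_var))
    pdf *= gauss(y, grid, r_var)
    pdf /= pdf.sum() * dx

estimate = (grid * pdf).sum() * dx             # conditional mean, the best estimate
```

The Kalman filter derived in Chapter 4 replaces this expensive pdf propagation by the propagation of a mean and a covariance only.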
Chapter 3
Estimation of Random Parameters.
General Results
This section presents basic results on the estimation of a random parameter vector
based on a set of observations. This is the framework in which the Kalman filter
will be derived, given that the state vector of a given dynamic system is interpreted
as a random vector whose estimation is required. Deeper presentations of the
issues of parameter estimation may be found, for example, in [3], [5], [10].
Let
θ ∈ R^n  (3.1)
be a random vector, for which the available information is given by a finite set of observations
Y_1^k = [y(1), y(2), ..., y(k-1), y(k)]  (3.2)
with no assumption on the dependency between y(i) and θ.
Denote by
p(θ, Y_1^k), p(θ | Y_1^k) and p(Y_1^k)
the joint probability density function (pdf) of θ and Y_1^k, the conditional pdf of θ given Y_1^k, and the pdf of Y_1^k, respectively.
3.1 Problem Formulation

Given the set of observations Y_1^k, the problem addressed is the derivation of an estimator
θ̂(k) = f(Y_1^k)  (3.3)
that minimizes the mean-square error
J[θ̃(k)] = E[θ̃^T(k) θ̃(k)]  (3.4)
where θ̃(k) is the estimation error,
θ̃(k) = θ − θ̂(k).  (3.5)
According to the above formulated problem, the estimate θ̂(k) is given by
θ̂(k) = arg min E[(θ − θ̂(k))^T (θ − θ̂(k))].  (3.6)
To carry out this minimization, it is enough to minimize the conditional mean E[θ̃^T(k) θ̃(k) | Y_1^k] relative to θ̂(k). In fact, from the definition of the mean operator, we have
E[θ̃^T(k) θ̃(k)] = ∫∫ θ̃^T(k) θ̃(k) p(θ, Y_1^k) dθ dY_1^k  (3.7)
where dθ = dθ_1 dθ_2 ... dθ_n and dY_1^k = dy_1 dy_2 ... dy_k. Using the result obtained from the Bayes law (see, e.g., [8]),
p(θ, Y_1^k) = p(θ | Y_1^k) p(Y_1^k),  (3.8)
in (3.7) yields
E[θ̃^T(k) θ̃(k)] = ∫ [ ∫ θ̃^T(k) θ̃(k) p(θ | Y_1^k) dθ ] p(Y_1^k) dY_1^k.
Moreover, reasoning about the meaning of the integral inside the square brackets yields
E[θ̃^T(k) θ̃(k)] = ∫ E[θ̃^T(k) θ̃(k) | Y_1^k] p(Y_1^k) dY_1^k.
Therefore, minimizing the mean value on the left hand side of the previous equality relative to θ̂(k) is equivalent to minimizing, relative to the same vector, the conditional mean E[θ̃^T(k) θ̃(k) | Y_1^k] inside the integral on the right hand side. Consequently, the estimation of the random parameter vector can be formulated in a different way, as stated in the following subsection.
3.2 Problem Reformulation

Given the set of observations y(1), y(2), ..., y(k), the addressed problem is the derivation of an estimator of θ that minimizes the conditional mean-square error, i.e.,
θ̂(k) = arg min E[θ̃^T(k) θ̃(k) | Y_1^k].  (3.9)
Result 3.2.1: The estimator that minimizes the conditional mean-square error is the conditional mean, [5], [10],
θ̂(k) = E[θ | Y_1^k].  (3.10)
Proof: From the definition of the estimation error in (3.5), the cost function in (3.9) can be rewritten as
J = E[(θ − θ̂(k))^T (θ − θ̂(k)) | Y_1^k]  (3.11)
or else,
J = E[θ^T θ − θ^T θ̂(k) − θ̂^T(k) θ + θ̂^T(k) θ̂(k) | Y_1^k]  (3.12)
  = E[θ^T θ | Y_1^k] − E[θ^T | Y_1^k] θ̂(k) − θ̂^T(k) E[θ | Y_1^k] + θ̂^T(k) θ̂(k).  (3.13)
The last equality results from the fact that, by definition (see (3.3)), θ̂(k) is a function of Y_1^k and thus
E[θ̂(k) | Y_1^k] = θ̂(k).
Adding and subtracting E[θ^T | Y_1^k] E[θ | Y_1^k] in (3.13) yields
J = E[θ^T θ | Y_1^k] − E[θ^T | Y_1^k] E[θ | Y_1^k] + [θ̂(k) − E[θ | Y_1^k]]^T [θ̂(k) − E[θ | Y_1^k]]
where the first two terms on the right hand side do not depend on θ̂(k). The dependency of J on θ̂(k) results from a quadratic term, and therefore it is immediate that J achieves a minimum when the quadratic term is zero, and thus
θ̂(k) = E[θ | Y_1^k],
which concludes the proof. □
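Result 3.2.1 can be illustrated numerically. In the sketch below, the toy model (θ ~ N(0,1) observed through y = θ + v, v ~ N(0,1)) is an assumption introduced only for this example; for this jointly Gaussian pair the conditional mean is E[θ|y] = y/2, and its mean-square error is compared with that of a competing estimator, the raw observation itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
theta = rng.normal(0.0, 1.0, n)        # random parameter theta ~ N(0,1)
y = theta + rng.normal(0.0, 1.0, n)    # observation y = theta + v

cond_mean = y / 2                      # E[theta|y] for this jointly Gaussian pair
mse_cond = np.mean((theta - cond_mean) ** 2)   # theoretical value: 0.5
mse_raw = np.mean((theta - y) ** 2)            # theoretical value: 1.0
```

As expected from (3.10), the conditional mean achieves a strictly smaller mean-square error than the alternative estimator.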
Corollary 3.2.1: Consider that f(Y_1^k) is a given function of the observations Y_1^k. Then the estimation error is orthogonal to f(Y_1^k), θ̃(k) ⊥ f(Y_1^k), this meaning that
E[(θ − θ̂(k)) f^T(Y_1^k)] = 0.  (3.14)
Proof: For the proof we use the following result on jointly distributed random variables. Let x and y be jointly distributed random variables and g(y) a function of y. It is known that, [8],
E[x g(y)] = E[E(x|y) g(y)]  (3.15)
where the outer mean-value operator on the right hand side is defined relative to the random variable y. Using (3.15) in the left hand side of (3.14) results in
E[θ̃(k) f^T(Y_1^k)] = E[E(θ̃(k) | Y_1^k) f^T(Y_1^k)].  (3.16)
Evaluating the mean value inside the square brackets in (3.16) leads to
E[θ̃(k) | Y_1^k] = E[θ | Y_1^k] − θ̂(k)  (3.17)
because θ̂(k) is known when Y_1^k is given. Therefore, (3.17) is zero, from where (3.14) holds, thus concluding the proof. □
The particularization of the corollary for the case where f(Y_1^k) = θ̂(k) yields
E[θ̃(k) θ̂^T(k)] = 0.  (3.18)
3.3 The Gaussian Case

Result 3.2.1 is valid for any joint distribution of θ and Y_1^k, i.e., it does not particularize the joint pdf of these variables.
It is well known in estimation and filtering theory that many results simplify when the involved variables are assumed Gaussian. This subsection discusses the simplifications resulting from considering that θ and Y_1^k in Result 3.2.1 are jointly Gaussian.
Result 3.3.1: If θ and Y_1^k are jointly Gaussian random vectors, then
E[θ | Y_1^k] = E[θ] + R_{θY} R_{YY}^{-1} [Y_1^k − E[Y_1^k]],  (3.19)
where
R_{θY} = E[(θ − E[θ])(Y_1^k − E[Y_1^k])^T]  (3.20)
R_{YY} = E[(Y_1^k − E[Y_1^k])(Y_1^k − E[Y_1^k])^T].  (3.21)
The previous result is very important. It states that, when θ and Y_1^k are jointly Gaussian, the estimator of θ that minimizes the conditional mean-square error is a linear combination of the observations. In fact, note that (3.19) may be rewritten as
E[θ | Y_1^k] = f(E[θ], E[Y_1^k]) + Σ_{i=1}^k W_i y_i  (3.22)
where the W_i are weighting matrices. Moreover, the estimator is unbiased,
E[θ̂(k)] = E[θ].  (3.23)
Note that the minimization in Result 3.3.5 is subject to the constraint of having a linear estimator, while in Result 3.3.1 no constraint is considered. If the linear estimator constraint in Result 3.3.5 were not considered, the minimum mean-square error estimator would be the conditional mean of Result 3.2.1, which is, in general, a nonlinear function of the observations.
Chapter 4
The Kalman Filter
Section 2 presented the filtering problem for a general nonlinear system dynamics. Consider now that the system represented in Figure 2.1 has a linear time-varying dynamics, i.e., that (2.1)-(2.2) particularizes to
x_{k+1} = A_k x_k + B_k u_k + G_k w_k  (4.1)
y_k = C_k x_k + v_k,  k ≥ 0,  (4.2)
where {w_k} and {v_k} are zero-mean, white Gaussian noise sequences with covariances
E[w_k w_j^T] = Q δ_{kj}  (4.3)
E[v_k v_j^T] = R δ_{kj}  (4.4)
and the initial state x_0 is a Gaussian random vector with mean
E[x_0] = x̄_0  (4.5)
and covariance
E[(x_0 − x̄_0)(x_0 − x̄_0)^T] = Σ_0.  (4.6)
Under these assumptions, the conditional pdf of the state given the past observations and inputs remains Gaussian,
p(x(k) | Y_1^k, U_0^{k-1}) ∼ N(x̂(k|k), P(k|k)),  (4.7)
and it is therefore enough to represent the conditional pdf as a Gaussian pdf. The state estimate x̂(k|k) is the conditional mean of the pdf and the covariance matrix P(k|k) quantifies the uncertainty of the estimate,
x̂(k|k) = E[x(k) | Y_1^k, U_0^{k-1}]
P(k|k) = E[(x(k) − x̂(k|k))(x(k) − x̂(k|k))^T | Y_1^k, U_0^{k-1}].
Therefore, rather than propagating the entire conditional pdf, the Kalman filter
only propagates the first and second moments. This is illustrated in Figure 4.1.
Subsection 4.1 derives the filter dynamics in terms of the mean and covariance matrix of the conditional pdf, i.e., it shows how the filter propagates the mean and the covariance matrix. This dynamics is recursive in the sense that, to evaluate x̂(k+1|k+1), the Kalman filter only requires the previous estimate x̂(k|k) and the new observation y(k+1).
4.1 Filter Dynamics

When v_k, w_k and x_0 are Gaussian vectors, the random vectors x_k, x_{k+1} and Y_1^k are jointly Gaussian. As discussed before, the Kalman filter propagates the Gaussian
Figure 4.1: Propagation of the conditional pdf: general filter and Kalman filter

pdf p(x_k | Y_1^k, U_0^{k-1}), and therefore the filter dynamics defines the transition from p(x_k | Y_1^k, U_0^{k-1}) to p(x_{k+1} | Y_1^{k+1}, U_0^k), where both pdfs are Gaussian and the input and observation information available at time instants k and k+1 are displayed. Rather than being done directly, this transition is implemented as a two-step procedure, a prediction cycle and a filtering or update cycle, as represented in the diagram of Figure 4.2.

Figure 4.2: Prediction and filtering cycles in the Kalman filter dynamics
Figure 4.3: Consecutive prediction and filtering cycles in the Kalman filter dynamics
The prediction cycle evaluates p(x_{k+1} | Y_1^k, U_0^k) which, defined for time instant k+1, represents what can be said about x(k+1) before making the observation y(k+1). The filtering cycle states how to improve the information on x(k+1) after making the observation y(k+1). In summary, the Kalman filter dynamics results from a recursive application of prediction and filtering cycles, as represented in Figure 4.3.
Let
p(x_k | Y_1^k, U_0^{k-1}) ∼ N(x̂(k|k), P(k|k))  (4.8)
p(x_{k+1} | Y_1^k, U_0^k) ∼ N(x̂(k+1|k), P(k+1|k))  (4.9)
where
x̂(k|k) = E[x_k | Y_1^k, U_0^{k-1}]  (4.10)
x̂(k+1|k) = E[x_{k+1} | Y_1^k, U_0^k]  (4.11)
and
P(k|k) = E[(x_k − x̂(k|k))(x_k − x̂(k|k))^T | Y_1^k, U_0^{k-1}]  (4.12)
P(k+1|k) = E[(x_{k+1} − x̂(k+1|k))(x_{k+1} − x̂(k+1|k))^T | Y_1^k, U_0^k].  (4.13)
For the derivation of the filter dynamics, assume, at this stage, that p(x_k | Y_1^k, U_0^{k-1}) is known, i.e., that x̂(k|k) and P(k|k) are given.
Step 1: Evaluation of p(x_{k+1} | Y_1^k, U_0^k) — state PREDICTION

From the linear dynamics (4.1), the predicted state estimate is
x̂(k+1|k) = E[x_{k+1} | Y_1^k, U_0^k] = A_k x̂(k|k) + B_k u_k.  (4.15)
Defining the prediction error as
x̃(k+1|k) ≜ x(k+1) − x̂(k+1|k)  (4.16)
and replacing in this expression the values of x(k+1) and x̂(k+1|k) yields
x̃(k+1|k) = A_k x_k + B_k u_k + G_k w_k − A_k x̂(k|k) − B_k u_k = A_k x̃(k|k) + G_k w_k  (4.17)
where the filtering error was defined similarly to (4.16),
x̃(k|k) ≜ x(k) − x̂(k|k).  (4.18)
The covariance of the prediction error is therefore
P(k+1|k) = A_k P(k|k) A_k^T + G_k Q G_k^T.  (4.20)
The predicted estimate of the system state and the associated covariance matrix in (4.15) and (4.20) correspond to the best knowledge of the system state at time instant k+1 before making the observation at this time instant. Notice that the prediction dynamics in (4.15) follows exactly the system dynamics in (4.1), which is the expected result given that the system noise has zero mean.
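The prediction step (4.15) and (4.20) can be sketched as follows; the constant-velocity model and all numerical values are assumed only for illustration and do not come from the report.

```python
import numpy as np

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])     # state transition A_k
B = np.array([[0.0], [dt]])               # input matrix B_k
G = np.eye(2)                             # noise input matrix G_k
Q = np.diag([1e-4, 1e-2])                 # process noise covariance

def predict(x_filt, P_filt, u):
    """x(k+1|k) = A x(k|k) + B u_k ;  P(k+1|k) = A P(k|k) A^T + G Q G^T."""
    x_pred = A @ x_filt + B @ u
    P_pred = A @ P_filt @ A.T + G @ Q @ G.T
    return x_pred, P_pred

x_filt = np.array([0.0, 1.0])             # current filtered estimate x(k|k)
P_filt = np.diag([0.5, 0.5])              # current covariance P(k|k)
x_pred, P_pred = predict(x_filt, P_filt, np.array([0.0]))
```

Note that the predicted covariance is larger than the filtered one: prediction only propagates uncertainty, it never reduces it.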
Step 2: Evaluation of p(y_{k+1} | Y_1^k, U_0^k) — measurement PREDICTION

From the observation dynamics (4.2), the predicted measurement is
ŷ(k+1|k) = E[y_{k+1} | Y_1^k, U_0^k] = C_{k+1} x̂(k+1|k)  (4.22)
and the measurement prediction error is
ỹ(k+1|k) ≜ y(k+1) − ŷ(k+1|k) = C_{k+1} x̃(k+1|k) + v_{k+1},  (4.24)
with covariance
E[ỹ(k+1|k) ỹ^T(k+1|k)] = C_{k+1} P(k+1|k) C_{k+1}^T + R.  (4.25)
Multiplying x_{k+1} on the right by ỹ^T(k+1|k) and using (4.24) we obtain
x_{k+1} ỹ^T(k+1|k) = x_{k+1} x̃^T(k+1|k) C_{k+1}^T + x_{k+1} v_{k+1}^T,
from where
E[x_{k+1} ỹ^T(k+1|k)] = P(k+1|k) C_{k+1}^T.  (4.26)
Given the predicted estimate of the state at time instant k+1 knowing all the observations until k, x̂(k+1|k) in (4.15), and taking into account that in the linear observation dynamics (4.2) the noise has zero mean, it is clear that the predicted measurement (4.22) follows the same observation dynamics as the real system.
Step 3: Evaluation of p(x_{k+1} | Y_1^{k+1}, U_0^k) — FILTERING

The vectors x_{k+1} and Y_1^{k+1} = {Y_1^k, y_{k+1}} are jointly Gaussian, so Result 3.3.1 applies to the computation of the conditional mean. On the other hand, Y_1^k and ỹ(k+1|k) are independent (see Corollary 3.2.1 in Section 3) and therefore
x̂(k+1|k+1) = E[x(k+1) | Y_1^k] + E[x_{k+1} ỹ^T(k+1|k)] P_{ỹ(k+1|k)}^{-1} ỹ(k+1|k).
Defining the Kalman gain
K(k+1) = P(k+1|k) C_{k+1}^T [C_{k+1} P(k+1|k) C_{k+1}^T + R]^{-1}  (4.29)
and replacing (4.25) and (4.26), this becomes
x̂(k+1|k+1) = x̂(k+1|k) + K(k+1)[y(k+1) − C_{k+1} x̂(k+1|k)],  (4.30)
from where we can conclude that the filtered state estimate is obtained from the predicted estimate as

filtered state estimate = predicted state estimate + Gain × Error.

The Gain is the Kalman gain defined in (4.29). The gain multiplies the error. The error is given by [y(k+1) − C_{k+1} x̂(k+1|k)], i.e., it is the difference between the real measurement obtained at time instant k+1 and the measurement prediction obtained from the predicted value of the state. It states the novelty, the new information, that the new observation y(k+1) brings to the filter relative to the state x(k+1).
Defining the filtering error as
x̃(k+1|k+1) ≜ x(k+1) − x̂(k+1|k+1),
the associated covariance matrix results as
P(k+1|k+1) = P(k+1|k) − P(k+1|k) C_{k+1}^T [C_{k+1} P(k+1|k) C_{k+1}^T + R]^{-1} C_{k+1} P(k+1|k).  (4.32)
Summary:

Prediction:
x̂(k+1|k) = A_k x̂(k|k) + B_k u_k  (4.33)
P(k+1|k) = A_k P(k|k) A_k^T + G_k Q G_k^T  (4.34)

Filtering:
x̂(k+1|k+1) = x̂(k+1|k) + K(k+1)[y(k+1) − C_{k+1} x̂(k+1|k)]  (4.35)
K(k+1) = P(k+1|k) C_{k+1}^T [C_{k+1} P(k+1|k) C_{k+1}^T + R]^{-1}  (4.36)
P(k+1|k+1) = [I − K(k+1) C_{k+1}] P(k+1|k)  (4.37)

Initial conditions:
x̂(0|−1) = x̄_0  (4.38)
P(0|−1) = Σ_0  (4.39)

Note that the error covariance P(k|k−1) is actually independent of Y_1^{k-1}, which means that no one set of measurements helps any more than another to eliminate some uncertainty about x(k). The filter gain, K(k), is also independent of Y_1^{k-1}. Because of this, the error covariance P(k|k−1) and the filter gain K(k) can be computed before the filter is actually run. This is not generally the case in nonlinear filters.
Some other useful properties will be discussed in the following sections.
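A complete predict-and-filter loop can be sketched as below on an assumed two-state constant-velocity system in which only the position is measured; the matrices and noise levels are hypothetical choices for illustration. The loop also records the gain sequence, illustrating the property above: the gains depend only on the model matrices and converge regardless of the measured data.

```python
import numpy as np

rng = np.random.default_rng(4)
dt, nsteps = 0.1, 200
A = np.array([[1.0, dt], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])                 # only the position is measured
Q = np.diag([1e-4, 1e-2])                  # process noise covariance
R = np.array([[0.25]])                     # measurement noise covariance

x_true = np.array([0.0, 1.0])
x_hat, P = np.zeros(2), np.eye(2)          # filter initialization
gains = []
for _ in range(nsteps):
    # simulate the linear system (4.1)-(4.2), with u_k = 0
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = C @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), 1)
    # prediction cycle, (4.33)-(4.34)
    x_hat = A @ x_hat
    P = A @ P @ A.T + Q
    # filtering cycle, (4.35)-(4.37)
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
    x_hat = x_hat + K @ (y - C @ x_hat)
    P = (np.eye(2) - K @ C) @ P
    gains.append(K.copy())

pos_error = abs(x_hat[0] - x_true[0])
```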
4.2 One-Step Prediction Dynamics

Using (4.33) and (4.35) simultaneously, the filter dynamics can be written in terms of the predicted state estimate,
x̂(k+1|k) = A_k [I − K(k) C_k] x̂(k|k−1) + B_k u_k + A_k K(k) y_k  (4.40)
where
K(k) = P(k|k−1) C_k^T [C_k P(k|k−1) C_k^T + R]^{-1}  (4.42)
P(k+1|k) = A_k P(k|k−1) A_k^T − A_k K(k) C_k P(k|k−1) A_k^T + G_k Q G_k^T  (4.43)
P(0|−1) = Σ_0.  (4.44)
Equation (4.43) may be rewritten differently by replacing the gain K(k) by its value given by (4.42),
P(k+1|k) = A_k P(k|k−1) A_k^T + G_k Q G_k^T − A_k P(k|k−1) C_k^T [C_k P(k|k−1) C_k^T + R]^{-1} C_k P(k|k−1) A_k^T  (4.45)
or else,
P(k+1|k) = A_k P(k|k−1) A_k^T + G_k Q G_k^T − A_k K(k) [C_k P(k|k−1) C_k^T + R] K^T(k) A_k^T,  (4.46)
which is a Riccati equation.
From the definition of the predicted and filtered errors in (4.16) and (4.18), and the above recursions, it is immediate that
x̃(k+1|k) = A_k x̃(k|k) + G_k w_k  (4.47)
x̃(k|k) = [I − K(k) C_k] x̃(k|k−1) − K(k) v_k  (4.48)
and therefore
x̃(k+1|k) = A_k [I − K(k) C_k] x̃(k|k−1) + G_k w_k − A_k K(k) v_k.  (4.49)
Evaluating the mean of the above equation, and taking into account that w_k and v_k are zero-mean sequences, yields
E[x̃(k+1|k)] = A_k [I − K(k) C_k] E[x̃(k|k−1)].  (4.50)
4.3 The Kalman Filter for Time-Invariant Systems

Consider now that the system (4.1)-(4.2) is time-invariant, i.e.,
x_{k+1} = A x_k + B u_k + G w_k  (4.51)
y_k = C x_k + v_k,  k ≥ 0,  (4.52)
where {w_k} and {v_k} are stationary, zero-mean white Gaussian noises with covariances Q and R, respectively. The Kalman filter equations (4.33)-(4.37) hold with the constant matrices A, B, C and G.
4.4 Steady-State Kalman Filter

Consider the system dynamics (4.51)-(4.52) and the following additional assumptions:
1. The matrix Q = Q^T > 0, i.e., Q is positive definite,
2. The matrix R = R^T > 0, i.e., R is positive definite,
3. The pair (A, G) is controllable, i.e.,
rank[G | AG | A^2 G | ... | A^{n-1} G] = n,
4. The pair (A, C) is observable, i.e.,
rank[C^T | A^T C^T | (A^T)^2 C^T | ... | (A^T)^{n-1} C^T] = n.

Result 4.4.1: Under the above conditions, the prediction error covariance converges to a constant matrix,
lim_{k→∞} P(k|k−1) = P̄,  (4.58)
i.e., in steady-state the Kalman gain is constant and the filter dynamics is time-invariant.
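Result 4.4.1 can be observed by simply iterating the Riccati recursion (4.45) until it stops changing. The system below is an assumed time-invariant example satisfying the four conditions (Q, R positive definite, (A, G) controllable, (A, C) observable); the fixed point P̄ then yields the constant steady-state gain.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
G = np.eye(2)
Q = np.diag([0.01, 0.01])                  # Q = Q^T > 0
R = np.array([[1.0]])                      # R = R^T > 0

P = np.eye(2)                              # P(0|-1)
for _ in range(500):
    S = C @ P @ C.T + R
    # Riccati recursion (4.45) on the one-step prediction covariance
    P = A @ P @ A.T + G @ Q @ G.T - A @ P @ C.T @ np.linalg.inv(S) @ C @ P @ A.T

# constant steady-state Kalman gain computed from the fixed point
K_bar = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
```

In practice the steady-state covariance can also be obtained directly by solving the discrete algebraic Riccati equation, which is what the iteration above converges to.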
4.5 Initial conditions
In this subsection we discuss the initial conditions considered both for the system
and for the Kalman filter. With no loss of generality, we will particularize the
discussion for null control inputs, uk = 0.
System

Let the system dynamics, with u_k = 0, be
x_{k+1} = A x_k + G w_k,  (4.59)
where
E[x_0] = x̄_0  (4.60)
Σ_0 = E[(x_0 − x̄_0)(x_0 − x̄_0)^T]  (4.61)
and the sequences {v_k} and {w_k} have the statistical characterization presented in Section 2. With null inputs, the state mean evolves as
E[x_k] = A^k x̄_0.  (4.62)
Thus, if x̄_0 ≠ 0, {x_k} is not a stationary process. Assume that the following hypothesis holds:
Hypothesis: x̄_0 = 0.
The variation of constants formula applied to (4.59) yields
x(l) = A^{l-k} x(k) + Σ_{j=k}^{l-1} A^{l-1-j} G w_j.  (4.63)
Multiplying (4.63) on the right by x^T(k) and evaluating the mean value results in
E[x(l) x^T(k)] = A^{l-k} E[x(k) x^T(k)],  l ≥ k.
Consequently, for x(k) to be stationary, E[x(l) x^T(k)] should not depend on k. Evaluating E[x(k) x^T(k)] for increasing values of k we obtain:
E[x(0) x^T(0)] = Σ_0  (4.64)
E[x(1) x^T(1)] = E[(A x(0) + G w(0))(x^T(0) A^T + w^T(0) G^T)] = A Σ_0 A^T + G Q G^T  (4.65)
E[x(2) x^T(2)] = A E[x(1) x^T(1)] A^T + G Q G^T = A^2 Σ_0 (A^T)^2 + A G Q G^T A^T + G Q G^T  (4.66)
from where
E[x(k) x^T(k)] = A E[x(k−1) x^T(k−1)] A^T + G Q G^T.  (4.67)
The filter initial conditions, given, for example, in terms of the one-step prediction, are
x̂(0|−1) = x̄_0  (4.68)
P(0|−1) = Σ_0,  (4.69)
which means that the first state prediction has the same statistics as the initial condition of the system. The above conditions have an intuitive explanation. In the absence of system measurements (i.e., formally at time instant k = −1), the best that can be said in terms of the state prediction at time instant 0 is that this prediction coincides with the mean value of the random vector that is the system initial state.
As will be proved in the sequel, the choice of (4.68) and (4.69) leads to unbiased state estimates for all k. When the values of x̄_0 and Σ_0 are not a priori known, the filter initialization cannot reflect the system initial conditions. A possible choice is
x̂(0|−1) = 0  (4.70)
P(0|−1) = P_0 = I.  (4.71)

4.6 Innovation Process
The process
e(k) = y(k) − ŷ(k|k−1)  (4.72)
is known as the innovation process. It represents the component of y(k) that cannot be predicted at time instant k−1. In other words, it represents the innovation, the novelty that y(k) brings to the system at time instant k. This process has some important characteristics, which we list herein.
Property 4.6.1: The innovation process has zero mean.
Proof: The innovation can be written as e(k) = C x̃(k|k−1) + v_k, and so E[e(k)] = C E[x̃(k|k−1)], given that {v_k} is zero mean. The dynamics of the prediction error mean is given by (4.50), which is a homogeneous dynamics; the same conclusion holds for a time-invariant system. For k = 0, E[x̃(0|−1)] = 0 given that
E[x̃(0|−1)] = E[x_0] − x̂(0|−1)
and we choose x̂(0|−1) = x̄_0 (see (4.38)). Therefore, the mean value of the prediction error is zero and, in consequence, the innovation process has zero mean. □

The above proof raises a comment relative to the initial conditions chosen for the Kalman filter. According to (4.50), the prediction error has a homogeneous mean dynamics, and therefore an initial null mean error leads to a null mean error for every k. If x̂(0|−1) ≠ x̄_0, the initial prediction error mean is not zero. However, under the conditions for which there exists a steady solution of the discrete Riccati equation, the error asymptotically converges to zero.
Property 4.6.2: The innovation process is white.
Proof: In this proof we consider that x̂(0|−1) = x̄_0, and thus E[e(k)] = 0, i.e., the innovation process is zero mean. We want to prove that
E[e(k) e^T(j)] = 0
for k ≠ j. For simplicity we consider the situation in which j = k+1; this is not the entire proof, but rather a first step towards it. From the definition of the innovation process, we have
e(k) = C x̃(k|k−1) + v_k
and thus
E[e(k) e^T(k+1)] = C E[x̃(k|k−1) x̃^T(k+1|k)] C^T + C E[x̃(k|k−1) v_{k+1}^T] + E[v_k x̃^T(k+1|k)] C^T + E[v_k v_{k+1}^T].  (4.73)
As {v_k} has zero mean and is a white process, the second and fourth terms in (4.73) are zero. We invite the reader to replace (4.33) and (4.35) in the above equality and to conclude the demonstration.
Property 4.6.3:
E[e(k) e^T(k)] = C P(k|k−1) C^T + R
Property 4.6.4:
lim_{k→∞} E[e(k) e^T(k)] = C P̄ C^T + R
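Properties 4.6.1-4.6.3 can be checked empirically on a small assumed scalar system x_{k+1} = a x_k + w_k, y_k = x_k + v_k (all parameter values below are hypothetical): the innovations should be zero mean, uncorrelated at nonzero lags, and have variance C P(k|k−1) C^T + R.

```python
import numpy as np

rng = np.random.default_rng(5)
a, q, r, n = 0.9, 0.1, 0.5, 50_000
x, x_hat, p = 0.0, 0.0, 1.0
innovations, variances = [], []
for _ in range(n):
    x = a * x + rng.normal(0.0, np.sqrt(q))
    y = x + rng.normal(0.0, np.sqrt(r))
    # prediction
    x_hat, p = a * x_hat, a * a * p + q
    # innovation and filtering
    e_k = y - x_hat
    innovations.append(e_k)
    variances.append(p + r)                 # C P(k|k-1) C^T + R, with C = 1
    k_gain = p / (p + r)
    x_hat, p = x_hat + k_gain * e_k, (1 - k_gain) * p

e = np.array(innovations)
mean_e = e.mean()                           # Property 4.6.1: ~0
lag1 = np.mean(e[1:] * e[:-1])              # Property 4.6.2: ~0 at lag 1
var_ratio = e.var() / np.mean(variances)    # Property 4.6.3: ~1
```

The whiteness of the innovations is a standard practical consistency test: correlated innovations indicate a mismatch between the filter model and the data.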
4.7 The Error Ellipsoids

Figure 4.6: Error ellipsoid propagation in the Kalman filter prediction cycle
Figure 4.7: Error ellipsoid propagation in the Kalman filter filtering cycle
Chapter 5
The Extended Kalman Filter
In this section we address the filtering problem in case the system dynamics (state and observations) is nonlinear. With no loss of generality we consider that the system has no external inputs. Consider the nonlinear dynamics
x_{k+1} = f_k(x_k) + w_k  (5.1)
y_k = h_k(x_k) + v_k  (5.2)
where
x_k ∈ R^n,  f_k(·) : R^n → R^n,
y_k ∈ R^r,  h_k(·) : R^n → R^r,
w_k ∈ R^n,  v_k ∈ R^r,  (5.3)
and {v_k}, {w_k} are white, Gaussian, independent random processes with zero mean and covariance matrices
E[v_k v_k^T] = R_k,  E[w_k w_k^T] = Q_k.  (5.4)
Only in very particular cases does the optimal filter propagate a finite set of statistics rather than the entire conditional pdf. One of these particular cases, referred to in Section 4, is the one in which the system dynamics is linear, the initial condition is a Gaussian random vector, and the system and measurement noises are mutually independent white Gaussian processes with zero mean. As a consequence, the conditional pdfs p(x(k) | Y_1^k), p(x(k+1) | Y_1^k) and p(x(k+1) | Y_1^{k+1}) are Gaussian.
With the nonlinear dynamics (5.1)-(5.2), these pdfs are non-Gaussian. To evaluate their first and second moments, the optimal nonlinear filter has to propagate the entire pdf which, in the general case, represents a heavy computational burden.
The Extended Kalman filter (EKF) gives an approximation of the optimal estimate. The nonlinearities of the system dynamics are approximated by a linearized version of the nonlinear system model around the last state estimate. For this approximation to be valid, the linearization should be a good approximation of the nonlinear model in the whole uncertainty domain associated with the state estimate.
5.1 Derivation of the EKF Dynamics

Prediction

Assume that the conditional pdf of the state at time instant k is Gaussian, with mean
μ_{F_k}  (5.5)
and covariance
V_{F_k},  (5.6)
i.e., p(x_k | Y_1^k) = N(x_k − μ_{F_k}, V_{F_k}). Using the state dynamics (5.1) and the Bayes law, the conditional pdf of x_{k+1} given Y_1^k is given by
p(x_{k+1} | Y_1^k) = ∫ p(x_{k+1} | x_k) p(x_k | Y_1^k) dx_k,  (5.7)
or also,
p(x_{k+1} | Y_1^k) = ∫ p_{w_k}(x_{k+1} − f_k(x_k)) p(x_k | Y_1^k) dx_k,
where
p_{w_k}(x_{k+1} − f_k(x_k)) = (2π)^{-n/2} [det Q_k]^{-1/2} exp[−(1/2)(x_{k+1} − f_k(x_k))^T Q_k^{-1} (x_{k+1} − f_k(x_k))].  (5.8)
(5.8)
z
}|k
{
k
k
= fk (F ) 5fk |Fk F + 5 fk |Fk xk .
where 5fk is the Jacobian matrix of f (.),
5fk =
1
f (x(k))
|k
x(k) F
F - refers filtering
34
(5.9)
(5.10)
sk
(5.11)
Note that (5.11) represents a linear dynamics, in which sk is known, has a null
conditional expected value and depends on previous values of the state estimate.
According to (5.9), the pdf in (5.7) can be written as
p(x_{k+1} | Y_1^k) = ∫ p_{w_k}(x_{k+1} − ∇f_k x_k − s_k) p(x_k | Y_1^k) dx_k.  (5.12)
To simplify the computation of the previous pdf, consider the following variable transformation
z_k = ∇f_k x_k,  (5.13)
where we use, for the sake of simplicity, the notation ∇f_k to represent ∇f_k|_{F_k}.
Evaluating the mean and the covariance matrix of the random vector (5.13) results in
E[z_k] = ∇f_k E[x_k] = ∇f_k μ_{F_k}  (5.14)
E[(z_k − E[z_k])(z_k − E[z_k])^T] = ∇f_k V_{F_k} ∇f_k^T.  (5.15)
From the previous result, the pdf of x_k in (5.5)-(5.6) may be written as
N(x_k − μ_{F_k}, V_{F_k})
= (2π)^{-n/2} (det V_{F_k})^{-1/2} exp[−(1/2)(x_k − μ_{F_k})^T V_{F_k}^{-1} (x_k − μ_{F_k})]
= (2π)^{-n/2} (det V_{F_k})^{-1/2} exp[−(1/2)(∇f_k x_k − ∇f_k μ_{F_k})^T (∇f_k V_{F_k} ∇f_k^T)^{-1} (∇f_k x_k − ∇f_k μ_{F_k})]
= det(∇f_k) (2π)^{-n/2} (det ∇f_k V_{F_k} ∇f_k^T)^{-1/2} exp[−(1/2)(∇f_k x_k − ∇f_k μ_{F_k})^T (∇f_k V_{F_k} ∇f_k^T)^{-1} (∇f_k x_k − ∇f_k μ_{F_k})]
= det(∇f_k) N(∇f_k x_k − ∇f_k μ_{F_k}, ∇f_k V_{F_k} ∇f_k^T),  (5.16)
where the second equality uses x_k − μ_{F_k} = ∇f_k^{-1}(∇f_k x_k − ∇f_k μ_{F_k}) and the third uses det(∇f_k V_{F_k} ∇f_k^T) = (det ∇f_k)^2 det V_{F_k}.
Replacing (5.16) in (5.12), the integral becomes a convolution, denoted ⋆, of two Gaussian functions. We finally conclude that
p(x_{k+1} | Y_1^k) = N(x_{k+1} − ∇f_k|_{F_k} μ_{F_k} − s_k, Q_k + ∇f_k|_{F_k} V_{F_k} ∇f_k|_{F_k}^T).  (5.17)
This pdf is Gaussian with mean and covariance
μ_{P_{k+1}} = ∇f_k|_{F_k} μ_{F_k} + s_k = f_k(μ_{F_k})  (5.18)
V_{P_{k+1}} = Q_k + ∇f_k|_{F_k} V_{F_k} ∇f_k|_{F_k}^T,  (5.19)
where the subscript P refers to prediction. These values are taken as the predicted state estimate and the associated covariance obtained by the EKF, i.e.,
x̂(k+1|k) = μ_{P_{k+1}}  (5.21)
P(k+1|k) = V_{P_{k+1}},  (5.22)
representing the prediction dynamics
x̂(k+1|k) = f_k(x̂(k|k))
P(k+1|k) = ∇f_k|_{F_k} P(k|k) ∇f_k^T|_{F_k} + Q_k.
Filtering

In the filtering cycle, we use the system measurement at time instant k+1, y_{k+1}, to update the pdf p(x_{k+1} | Y_1^k). By the Bayes law,
p(x_{k+1} | Y_1^{k+1}) = [p(Y_1^k) / p(Y_1^{k+1})] [p(y_{k+1} | x_{k+1}) p(x_{k+1} | Y_1^k)].  (5.23)
Given that
y_{k+1} = h_{k+1}(x_{k+1}) + v_{k+1},  (5.24)
the pdf of y_{k+1} conditioned on the state x_{k+1} is given by
p(y_{k+1} | x_{k+1}) = (2π)^{-r/2} (det R_{k+1})^{-1/2} exp[−(1/2)(y_{k+1} − h_{k+1}(x_{k+1}))^T R_{k+1}^{-1} (y_{k+1} − h_{k+1}(x_{k+1}))].  (5.25)
With a similar argument to the one used in the prediction cycle, the previous pdf may be simplified through the linearization of the observation dynamics. Linearizing h_{k+1}(x_{k+1}) around μ_{P_{k+1}} and neglecting higher order terms results in
h_{k+1}(x_{k+1}) ≃ h_{k+1}(μ_{P_{k+1}}) + ∇h_{k+1}|_{P_{k+1}} (x_{k+1} − μ_{P_{k+1}})  (5.26)
= ∇h_{k+1}|_{P_{k+1}} x_{k+1} + r_{k+1},  (5.27)
with
r_{k+1} = h_{k+1}(μ_{P_{k+1}}) − ∇h_{k+1}|_{P_{k+1}} μ_{P_{k+1}}  (5.28)
being a known term in the linearized observation dynamics (5.27). After the linearization around the predicted state estimate, which corresponds to μ_{P_{k+1}} = x̂(k+1|k) (see (5.21)), the observation dynamics may be considered linear, and the computation of p(y_{k+1} | x_{k+1}) in (5.25) is immediate. We have
p(y_{k+1} | x_{k+1}) = N(y_{k+1} − r_{k+1} − ∇h_{k+1}|_{P_{k+1}} x_{k+1}, R_{k+1}).  (5.29)
Using a variable transformation similar to the one used in the prediction cycle, the pdf p(x_{k+1} | Y_1^k) may be expressed as
p(x_{k+1} | Y_1^k) = det(∇h_{k+1}|_{P_{k+1}}) N(∇h_{k+1}|_{P_{k+1}} x_{k+1} − ∇h_{k+1}|_{P_{k+1}} μ_{P_{k+1}}, ∇h_{k+1} V_{P_{k+1}} ∇h_{k+1}^T).  (5.32)
Note that (5.32) expresses the pdf of ∇h_{k+1}|_{P_{k+1}} x_{k+1} and not that of x_{k+1}, as desired. In fact, the goal is to evaluate the mean and covariance matrix in
p(x_{k+1} | Y_1^{k+1}) = N(x_{k+1} − μ_{F_{k+1}}, V_{F_{k+1}}).  (5.39)
Carrying out this evaluation (see, e.g., [9] for the manipulation of Gaussian pdfs), the conditional mean results as
μ_{F_{k+1}} = μ_{P_{k+1}} + V_{P_{k+1}} ∇h_{k+1}^T (∇h_{k+1} V_{P_{k+1}} ∇h_{k+1}^T + R_{k+1})^{-1} [y_{k+1} − h_{k+1}(μ_{P_{k+1}})].  (5.41)
The covariance matrix V_{F_{k+1}} satisfies the equality
∇h_{k+1} V_{P_{k+1}} ∇h_{k+1}^T = ∇h_{k+1} V_{F_{k+1}} ∇h_{k+1}^T R_{k+1}^{-1} (∇h_{k+1} V_{P_{k+1}} ∇h_{k+1}^T + R_{k+1}),  (5.42)
that has to be solved relative to V_{F_{k+1}}. From the above equality, we successively obtain:
V_{P_{k+1}} = V_{F_{k+1}} ∇h_{k+1}^T R_{k+1}^{-1} ∇h_{k+1} V_{P_{k+1}} + V_{F_{k+1}}
V_{P_{k+1}} = V_{F_{k+1}} [I + ∇h_{k+1}^T R_{k+1}^{-1} ∇h_{k+1} V_{P_{k+1}}]
V_{F_{k+1}} = V_{P_{k+1}} [I + ∇h_{k+1}^T R_{k+1}^{-1} ∇h_{k+1} V_{P_{k+1}}]^{-1},
or, by the matrix inversion lemma,
V_{F_{k+1}} = V_{P_{k+1}} − V_{P_{k+1}} ∇h_{k+1}^T [R_{k+1} + ∇h_{k+1} V_{P_{k+1}} ∇h_{k+1}^T]^{-1} ∇h_{k+1} V_{P_{k+1}}.  (5.43)
Therefore, if we consider that p(x_{k+1} | Y_1^k) is a Gaussian pdf, have access to the measurement y_{k+1}, and linearize the system observation dynamics around μ_{P_{k+1}} = x̂(k+1|k), we obtain a Gaussian pdf p(x_{k+1} | Y_1^{k+1}) with mean μ_{F_{k+1}} and covariance matrix V_{F_{k+1}} given by (5.41) and (5.43), respectively.
Finally, we summarize the previous results and interpret the Extended Kalman filter as a Kalman filter applied to a linear time-varying dynamics.
Let:
μ_{P_{k+1}} = x̂(k+1|k)
V_{P_{k+1}} = P(k+1|k)
μ_{F_{k+1}} = x̂(k+1|k+1)
V_{F_{k+1}} = P(k+1|k+1)
and consider
∇f_k|_{F_k} = ∇f_k|_{x̂(k|k)} = F(k)
∇h_{k+1}|_{P_{k+1}} = ∇h_{k+1}|_{x̂(k+1|k)} = H(k+1)
s(k) = f_k(x̂(k|k)) − F(k) x̂(k|k)
r(k+1) = h_{k+1}(x̂(k+1|k)) − H(k+1) x̂(k+1|k).
Consider the linear system whose dynamics includes the just evaluated quantities,
x(k+1) = F(k) x(k) + w_k + s(k)  (5.44)
y(k+1) = H(k+1) x(k+1) + v_{k+1} + r(k+1)  (5.45)
where w_k and v_{k+1} are white Gaussian noises, and s(k) and r(k+1) are known quantities.
The EKF applies the Kalman filter dynamics to (5.44)-(5.45), where the matrices F(k) and H(k+1) depend on the previous state estimates, yielding
x̂(k+1|k) = f_k(x̂(k|k))
x̂(k+1|k+1) = x̂(k+1|k) + K(k+1)[y_{k+1} − h_{k+1}(x̂(k+1|k))]
where K(k+1) is the filter gain,
K(k+1) = P(k+1|k) H^T(k+1) [H(k+1) P(k+1|k) H^T(k+1) + R(k+1)]^{-1}  (5.46)
P(k+1|k) = F(k) P(k|k) F^T(k) + Q(k)  (5.47)
P(k+1|k+1) = P(k+1|k) − P(k+1|k) H^T(k+1) [H(k+1) P(k+1|k) H^T(k+1) + R(k+1)]^{-1} H(k+1) P(k+1|k)
           = [I − K(k+1) H(k+1)] P(k+1|k).  (5.48)
Bibliography
[1] Michael Athans, Dynamic Stochastic Estimation, Prediction and Smoothing, Series of Lectures, Spring 1999.
[2] T. Kailath, Lecture Notes on Wiener and Kalman Filtering, Springer-Verlag, 1981.
[3] Thomas Kailath, Ali H. Sayed, Babak Hassibi, Linear Estimation, Prentice Hall, 2000.
[4] Peter S. Maybeck, The Kalman Filter: An Introduction to Concepts, in Autonomous Robot Vehicles, I.J. Cox, G. T. Wilfong (eds), Springer-Verlag, 1990.
[5] J. Mendel, Lessons in Digital Estimation Theory, Prentice-Hall, 1987.
[6] K. S. Miller, Multidimensional Gaussian Distributions, John Wiley &
Sons, 1963.
[7] J. M. F. Moura, Linear and Nonlinear Stochastic Filtering, NATO Advanced Study Institute, Les Houches, September 1985.
[8] A. Papoulis, Probability, Random Variables and Stochastic Processes,
McGraw-Hill, 1965.
[9] M. Isabel Ribeiro, Gaussian Probability Density Functions: Properties and Error Characterization, Institute for Systems and Robotics, Technical Report, February 2004.
[10] H. L. Van Trees, Detection, Estimation and Modulation Theory, John Wiley & Sons, 1968.
ERRATA
The equation (4.37) has an error. The correct version is
P(k|k) = [I − K(k)C] P(k|k−1)
Thanks to Sergio Trimboli, who pointed out the error in a preliminary version.
23 March 2008