Copyright (c) 2005 IFAC.

All rights reserved


16th Triennial World Congress, Prague, Czech Republic

UNSCENTED KALMAN FILTER FOR FAULT DETECTION

K. Xiong*, C. W. Chan** and H. Y. Zhang*

*School of Automation Science and Electrical Engineering, Beihang University


Beijing, 100083 China
e-mail: tobelove2001@tom.com

** Department of Mechanical Engineering, The University of Hong Kong


Pokfulam Road, Hong Kong, China
e-mail: mechan@hkucc.hku.hk

Abstract: In this paper, the approximation of nonlinear systems using the unscented Kalman
filter (UKF) is discussed, and the conditions for the convergence of the UKF are derived.
The detection of faults from residuals generated by the UKF is presented. As fault
detection is often reduced to detecting irregularities in the residuals, such as a change in
the mean, the local approach, a powerful statistical technique to detect such changes, is
used to detect faults from the residuals generated by the UKF. The properties of the
proposed method are also presented. To illustrate the performance of the proposed method,
it is applied to detect faults in the attitude sensors of a satellite. Copyright 2005 IFAC

Keywords: fault detection, nonlinear filters, Kalman filters, unscented transformation

1. INTRODUCTION

Fault detection for nonlinear systems is an important research area attracting considerable interest. Model-based fault detection techniques are popular. For nonlinear systems with additive Gaussian noise, extended Kalman filters (EKF) are used to generate residuals for fault detection (Del Gobbo, et al., 2001). However, the EKF suffers from two well-known drawbacks: 1) it is a first-order approximation of the nonlinear system, introducing large errors in the mean and covariance of the state vector, and even divergence of the filter, and 2) the derivation of the Jacobian matrices is nontrivial and can often lead to significant implementation difficulties.

Unscented Kalman filters (UKF) have been proposed recently for estimating the state of nonlinear systems. Another important method in state estimation for nonlinear systems is presented by (Norgaard, et al., 2000). The UKF is derived using the unscented transformation (UT) involving a set of carefully chosen sample points, called the sigma points. It has been shown that the UKF outperforms the EKF (Julier, et al., 2000), as it is able to approximate the posterior mean and covariance of the output variable with second-order accuracy instead of the first-order accuracy of the EKF. Further, as it is not necessary to compute the Jacobians or Hessians, it is being widely used in applications, such as target tracking (Julier, et al., 2000) and multi-sensor fusion (Hall and Llinas, 2001).

Very few results are available in the literature on the convergence of the UKF. In this paper, sufficient conditions for the convergence of the UKF are derived based on a new formulation of the unscented transform. Based on this result, fault detection for nonlinear systems is derived using the local approach, a statistical tool that transforms the fault detection problem into one that detects changes in the mean of a Gaussian random variable (Zhang, et al., 1998). The performance of the proposed technique is demonstrated on the attitude sensors of a satellite.

The paper is organized as follows. In Section 2, a brief review of the UT is presented, followed by the derivation of the UKF, and the conditions for it to converge. In Section 3, the detection of faults from the residuals generated by the UKF using the local approach is derived, together with the miss-detection probability. In Section 4, the performance of the proposed method is illustrated by applying it to detect attitude sensor faults of a satellite.

2. THE UNSCENTED KALMAN FILTER

2.1 The unscented transform

The UT is a method for calculating the statistics of a random variable that undergoes a nonlinear transformation (Hall and Llinas, 2001). A discrete distribution composed of a number of samples, referred to as the sigma points, is computed based on the known initial mean and covariance of the state variable. Then the nonlinear transformation is applied to each sample. As an example, the UT of a variable with dimension 2 is shown in Fig. 1. The sample mean and covariance of the transformed ensemble can then be used to compute the estimate of the nonlinear transformation of the original distribution. The computed mean and covariance are accurate up to second order (Julier, et al., 2000).

Fig. 1. The unscented transformation

Consider a random variable $x \in R^L$. Let

$$ y = f(x) \in R^L \tag{1} $$

where y is a nonlinear mapping of x, and f(x) is a known nonlinear function. Denote the mean of x by $\bar{x}$, and the covariance by $P_x \in R^{L \times L}$. The statistics of y are computed from x using the sigma points $\{\chi_i,\ i = 0, 1, \ldots, 2L\}$, as given below.

$$ \chi_i = \begin{cases} \bar{x}, & i = 0 \\ \bar{x} + a\,(\sqrt{L P_x})_i, & i = 1, \ldots, L \\ \bar{x} - a\,(\sqrt{L P_x})_{i-L}, & i = L+1, \ldots, 2L \end{cases} \tag{2} $$

where a is the spread of the sigma points around $\bar{x}$, and $(\sqrt{L P_x})_i$ is the ith column of the matrix square root of $L P_x$. The parameter a is to provide an extra degree of freedom to "fine tune" the higher order moments of the approximation, and is usually set to a small positive value. The sample mean and covariance of $\chi$ are (Wan and Merwe, 2000):

$$ \bar{\chi} = \sum_{i=0}^{2L} w_i \chi_i ; \qquad P_\chi = \sum_{i=0}^{2L} w_i (\chi_i - \bar{\chi})(\chi_i - \bar{\chi})^T \tag{3} $$

where

$$ w_i = \begin{cases} 1 - \dfrac{1}{a^2}, & i = 0 \\ \dfrac{1}{2 L a^2}, & i = 1, \ldots, 2L \end{cases} $$

Let $\gamma_i = f(\chi_i) \in R^L$, for $i = 0, 1, \ldots, 2L$. The mean and covariance for y can be approximated by the sample mean and covariance of $\gamma$, as given below.

$$ \bar{\gamma} = \sum_{i=0}^{2L} w_i \gamma_i ; \qquad P_\gamma = \sum_{i=0}^{2L} w_i (\gamma_i - \bar{\gamma})(\gamma_i - \bar{\gamma})^T \tag{4} $$

The properties of the UT are (Julier, et al., 2000):

Property 1: The mean and covariance of the set of sigma points given by (2) are identical to those of x.

Property 2: The approximation of the mean and covariance of y by $\bar{\gamma}$ and $P_\gamma$ has second-order accuracy.

2.2 The unscented Kalman filter (UKF)

Consider the nonlinear system:

$$ \begin{aligned} x(k) &= f(x(k-1)) + w(k) \\ y(k) &= h(x(k)) + v(k) \end{aligned} \tag{5} $$

where f(.) and h(.) are known nonlinear functions, x(k) is the state vector, y(k) is the output vector, w(k) and v(k) are normally distributed white noise with zero mean and covariance matrices: $E[w(k)w(k)^T] = Q(k)$ and $E[v(k)v(k)^T] = R(k)$. It is assumed that the output can be measured, but not the state. Similar to the Kalman filter, the UKF is obtained by minimizing the mean-squared error. The new state of the system $\hat{x}(k|k-1)$, the estimated output $\hat{y}(k)$ and the corresponding covariance matrices are computed recursively using the state after applying the UT. The procedure for implementing the UKF is as follows (Wan and Merwe, 2000),

Step 1 Calculate the sigma points from (2),

$$ \chi_i(k-1) = \begin{cases} \hat{x}(k-1), & i = 0 \\ \hat{x}(k-1) + a\,[\sqrt{L P(k-1)}]_i, & i = 1, \ldots, L \\ \hat{x}(k-1) - a\,[\sqrt{L P(k-1)}]_{i-L}, & i = L+1, \ldots, 2L \end{cases} $$

Step 2 Compute the predicted mean, from (4),

$$ \chi_i(k|k-1) = f(\chi_i(k-1)) \tag{6} $$

$$ \hat{x}(k|k-1) = \sum_{i=0}^{2L} w_i^m \chi_i(k|k-1) \tag{7} $$

and the predicted covariance from (4),

$$ P(k|k-1) = \sum_{i=0}^{2L} w_i^c\, [\chi_i(k|k-1) - \hat{x}(k|k-1)][\chi_i(k|k-1) - \hat{x}(k|k-1)]^T + Q(k) \tag{8} $$

Step 3 The predicted observation is computed by

$$ \gamma_i(k) = h(\chi_i(k|k-1)), \qquad \hat{y}(k) = \sum_{i=0}^{2L} w_i^m \gamma_i(k) $$

the covariance and the cross correlation matrix by,

$$ P_\gamma = \sum_{i=0}^{2L} w_i^c\, [\gamma_i(k) - \hat{y}(k)][\gamma_i(k) - \hat{y}(k)]^T + R(k) $$

$$ P_{xy} = \sum_{i=0}^{2L} w_i^c\, [\chi_i(k|k-1) - \hat{x}(k|k-1)][\gamma_i(k) - \hat{y}(k)]^T $$

And the predicted state is computed using the classical Kalman filter,

$$ \begin{aligned} \hat{x}(k) &= \hat{x}(k|k-1) + P_{xy} P_\gamma^{-1} [y(k) - \hat{y}(k)] \\ P(k) &= P(k|k-1) - P_{xy} P_\gamma^{-1} P_{xy}^T \end{aligned} \tag{9} $$

Step 4 Repeat steps 1 to 3 for the next sample.

Since the mean and covariance of x(k) are accurate up to second order, and the same also applies to the computed mean and covariance of y(k), the UKF can
predict with second-order accuracy, but without the need to compute the Jacobian or Hessian matrix. In contrast, the state vector computed by the EKF is only a first-order approximation of the nonlinear system, and hence can only achieve first-order accuracy. Further, the computational load of the UKF is of the same order as that of the EKF.

2.3 Convergence analysis of the UKF

The convergence analysis of the UKF is derived using an approach similar to that of the EKF (Boutayeb, et al., 1997). Denote the error of the estimated state by

$$ \tilde{x}(k) = x(k) - \hat{x}(k) \tag{10} $$

and the prediction error of the state by

$$ \tilde{x}(k|k-1) = x(k) - \hat{x}(k|k-1) \tag{11} $$

Assuming that w(k) is negligible, expanding x(k) given by (5) by a Taylor series about $\hat{x}(k-1)$ gives,

$$ x(k) = f(\hat{x}(k-1)) + \nabla f(\hat{x}(k-1))\,\tilde{x}(k-1) + \frac{1}{2} \nabla^2 f(\hat{x}(k-1))\,\tilde{x}(k-1)^2 + \cdots \tag{12} $$

Similarly, $\hat{x}(k|k-1)$ can be expressed as,

$$ \hat{x}(k|k-1) = f(\hat{x}(k-1)) + \frac{1}{2} \nabla^2 f(\hat{x}(k-1))\,P(k-1) + \cdots \tag{13} $$

hence, $\tilde{x}(k|k-1)$ can be approximated by,

$$ \tilde{x}(k|k-1) \approx F \tilde{x}(k-1) \tag{14} $$

where $F = \nabla f(\hat{x}(k-1))$. Assuming that v(k) is small, expanding y(k) and $\hat{y}(k)$ about $\hat{x}(k|k-1)$ gives,

$$ \begin{aligned} y(k) &= h(\hat{x}(k|k-1)) + \nabla h(\hat{x}(k|k-1))\,\tilde{x}(k|k-1) + \frac{1}{2} \nabla^2 h(\hat{x}(k|k-1))\,\tilde{x}(k|k-1)^2 + \cdots \\ \hat{y}(k) &= h(\hat{x}(k|k-1)) + \frac{1}{2} \nabla^2 h(\hat{x}(k|k-1))\,P(k|k-1) + \cdots \end{aligned} \tag{15} $$

Similarly, $\varepsilon(k) = y(k) - \hat{y}(k)$ can be approximated by,

$$ \varepsilon(k) \approx H \tilde{x}(k|k-1) \tag{16} $$

where $H = \nabla h(\hat{x}(k|k-1))$. In general, the error of these approximations is not identically zero, as they neglect terms of second and higher order in $\tilde{x}(k|k-1)$. Hence, (14) and (16) are modified as,

$$ \tilde{x}(k|k-1) = \beta(k) F \tilde{x}(k-1) \tag{17} $$

$$ \varepsilon(k) = \alpha(k) H \tilde{x}(k|k-1) \tag{18} $$

where $\alpha(k) = \mathrm{diag}(\alpha_1(k), \alpha_2(k), \ldots, \alpha_N(k))$ and $\beta(k) = \mathrm{diag}(\beta_1(k), \beta_2(k), \ldots, \beta_L(k))$ are unknown diagonal matrices. The sufficient condition for the convergence of the UKF is given below.

Theorem 1: Assuming F is a nonsingular matrix, and $\beta(k)$ satisfies the following condition:

$$ \lambda_{\max}(\beta(k)) < \left( \frac{\lambda_{\min}(F^{-T} F^{-1})}{\lambda_{\max}(U)} \right)^{1/2} \tag{19} $$

where $U = I - H^T \alpha(k) P_\gamma^{-1} P_{xy}^T - P_{xy} P_\gamma^{-1} \alpha(k) H + H^T \alpha(k) P_\gamma^{-1} P_{xy}^T P_{xy} P_\gamma^{-1} \alpha(k) H$, and $\lambda_{\max}(\cdot)$ and $\lambda_{\min}(\cdot)$ denote the maximum and minimum eigenvalues of a matrix, then

$$ \lim_{k \to \infty} \tilde{x}(k) = 0 \tag{20} $$

Proof: Let $V(k) = \tilde{x}(k)^T \tilde{x}(k)$. From (9), (11), (17) and (18),

$$ \begin{aligned} V(k) - V(k-1) &= \tilde{x}(k)^T \tilde{x}(k) - \tilde{x}(k-1)^T \tilde{x}(k-1) \\ &= [\tilde{x}(k|k-1) - P_{xy} P_\gamma^{-1} \varepsilon(k)]^T [\tilde{x}(k|k-1) - P_{xy} P_\gamma^{-1} \varepsilon(k)] - \tilde{x}(k-1)^T \tilde{x}(k-1) \\ &= \tilde{x}(k-1)^T [F^T \beta(k) U \beta(k) F - I]\, \tilde{x}(k-1) \end{aligned} \tag{21} $$

From the Rayleigh-Ritz theorem (Yu and Shi, 2004), for any vector $z \neq 0$,

$$ \lambda_{\max}(U) = \max_{z \neq 0} \frac{z^T U z}{z^T z}, \qquad \lambda_{\min}(F^{-T} F^{-1}) = \min_{z \neq 0} \frac{z^T F^{-T} F^{-1} z}{z^T z} $$

If assumption (19) holds, then

$$ \left( \frac{z^T F^{-T} F^{-1} z}{z^T U z} \right)^{1/2} \geq \left( \frac{\min_{z \neq 0} \left( z^T F^{-T} F^{-1} z / (z^T z) \right)}{\max_{z \neq 0} \left( z^T U z / (z^T z) \right)} \right)^{1/2} = \left( \frac{\lambda_{\min}(F^{-T} F^{-1})}{\lambda_{\max}(U)} \right)^{1/2} > \lambda_{\max}(\beta(k)) \geq \beta_j(k) \tag{22} $$

where the subscript j denotes the jth component of the diagonal matrix $\beta(k)$. From (22), the following inequality is obtained:

$$ z^T [\beta_j(k) U \beta_j(k) - F^{-T} F^{-1}]\, z < 0 \tag{23} $$

As $z \neq 0$, hence

$$ \beta(k) U \beta(k) - F^{-T} F^{-1} < 0 \tag{24} $$

For $\tilde{x}(k-1) \neq 0$, it follows that

$$ \tilde{x}(k-1)^T [F^T \beta(k) U \beta(k) F - I]\, \tilde{x}(k-1) < 0 \tag{25} $$

From (25) and (21), $V(k) - V(k-1) < 0$, V(k) is a decreasing sequence, and hence $\lim_{k \to \infty} V(k) = 0$. It follows that $\lim_{k \to \infty} \tilde{x}(k) = 0$.

The unknown diagonal matrices $\alpha(k)$ and $\beta(k)$ are introduced to evaluate the UT of the state variables that propagate through the nonlinear model. If the magnitude of the eigenvalues of $\beta(k)$ is sufficiently small, the convergence of the UKF is ensured. If the magnitudes of $\alpha_i(k)$ are small enough, the convergence of the UKF may be improved in the sense that the admissible domain of $\lambda_{\max}(\beta(k))$ will be enlarged. Indeed, the sufficient condition (19) means that if the error introduced by the UT is small enough, V(k) is a decreasing sequence. As $\alpha(k)$ and $\beta(k)$ are unknown factors, sigma points should be chosen properly to decrease the error of the UT so that (20) is fulfilled.

3. FAULT DETECTION BY LOCAL APPROACH

The measurement equation (5) can be rewritten as,

$$ y(k) = \hat{h}(x(k)) + \delta(k) + v(k) \tag{26} $$

where $\hat{h}(x(k))$ is a measurement model, and $\delta(k) = h(x(k)) - \hat{h}(x(k))$ is the modelling error. Consider the predicted observation $\hat{y}(k)$ obtained from the UKF. Under normal operating conditions, the residual of the UKF is,

$$ \varepsilon(k) = y(k) - \hat{y}(k) = \hat{h}(x(k)) + \delta(k) + v(k) - \hat{h}(\hat{x}(k)) = \xi(k) + \delta(k) + v(k) \tag{27} $$

where $\hat{y}(k) = \hat{h}(\hat{x}(k))$ is the predicted observation and $\xi(k) = \hat{h}(x(k)) - \hat{h}(\hat{x}(k))$ is the estimation error. When there is a sensor fault, the residual becomes,

$$ \varepsilon(k) = y(k) + b_f - \hat{y}(k) = b_f + \xi(k) + \delta(k) + v(k) \tag{28} $$

where $b_f \neq 0$ is the output arising from the sensor fault. However, faults can only be detected if this term is large compared with the modelling errors and the system noise. For small faults, it is difficult to detect $b_f$ from $\varepsilon(k)$.

The local approach is now applied to the residuals generated by the UKF. In the local approach, the cumulative sum of the residual $D^m$ is computed for a window size of m samples (Wang and Chan, 2002),

$$ D^m = \frac{1}{\sqrt{m}} \sum_{k=1}^{m} \varepsilon(k) = \frac{1}{\sqrt{m}} \left[ \sum_{k=1}^{m} \xi(k) + \sum_{k=1}^{m} \delta(k) + \sum_{k=1}^{m} v(k) \right] \tag{29} $$

Assuming the model is accurate and $\delta(k) = 0$, then the residual $D^m$ can now be approximated by

$$ D^m = \frac{1}{\sqrt{m}} \sum_{k=1}^{m} \xi(k) + \frac{1}{\sqrt{m}} \sum_{k=1}^{m} v(k) \tag{30} $$

From Theorem 1, $\lim_{k \to \infty} (x(k) - \hat{x}(k)) = 0$ holds under certain conditions. Assuming $\hat{h}(.)$ is a continuous function, then

$$ \lim_{k \to \infty} [\hat{h}(x(k)) - \hat{h}(\hat{x}(k))] = \lim_{k \to \infty} \xi(k) = 0 \tag{31} $$

Consequently, if the sufficient condition (19) is satisfied and k is sufficiently large, $D^m$ is Gaussian distributed with zero mean. If there is a sensor fault, (30) becomes,

$$ D^m = \frac{1}{\sqrt{m}} \left[ \sum_{k=1}^{m} b_f + \sum_{k=1}^{m} \xi(k) + \sum_{k=1}^{m} v(k) \right] \tag{32} $$

As $b_f$ is non-zero, the mean of $D^m$ is also non-zero.

3.1 Fault detection method

The proposed fault detection scheme can be implemented on-line as follows:

Step 1 Select m, the window size for computing the cumulative sum of residual.

Step 2 Compute the mean of the residual generated from the UKF. This is necessary, as $\delta(k)$ is ignored in the above analysis.

$$ b_i(0) = \frac{1}{N} \sum_{k=1}^{N} \varepsilon_i(k) \tag{33} $$

where N is a large positive integer, and the subscript i denotes the ith component of a vector.

Step 3 At the kth sampling period, the cumulative sum of residuals is computed from (29) as given below.

$$ D_i^m(k) = \frac{1}{\sqrt{m}} \sum_{t=k-m+1}^{k} (\varepsilon_i(t) - b_i(0)) \tag{34} $$

where $k > m$. Normalizing the cumulative sum of the residual by its variance gives,

$$ S_i^m(k) = [D_i^m(k)]^2\, [P_{\gamma,ii}(k-m)]^{-1} \tag{35} $$

where $P_{\gamma,ii}$ is the ith diagonal element of $P_\gamma$.

Step 4 If $S_i^m(k) \leq \eta_i$, then there is no fault, but a fault otherwise. As $S_i^m(k)$ is $\chi^2$-distributed, $\eta_i$ can be obtained from the $\chi^2$-table for a given confidence level.

Step 5 Repeat steps 3 and 4.

3.2 Properties of the fault detection method

If there is no fault, $\varepsilon(k)$ is Gaussian distributed: $N(0, P_\gamma)$. From (34), the expectation and the covariance matrix of $D^m$ are respectively:

$$ E(D^m) = \frac{1}{\sqrt{m}} \sum_{t=k-m+1}^{k} E(\varepsilon(t)) = 0 \tag{36} $$

$$ \mathrm{Cov}(D^m) = \frac{1}{m} \sum_{t=k-m+1}^{k} \mathrm{Cov}(\varepsilon(t)) = P_\gamma \tag{37} $$

where E(.) and Cov(.) are respectively the expectation and the covariance. Hence $D^m$ is also Gaussian distributed: $N(0, P_\gamma)$. If there is a fault, the distribution of $\varepsilon(k)$ is: $N(b_f, P_\gamma)$, and the mean and covariance of $D^m$ are:

$$ E(D^m) = \sqrt{m}\, b_f, \qquad \mathrm{Cov}(D^m) = P_\gamma \tag{38} $$

The distribution of $D^m$ is: $N(\sqrt{m}\, b_f, P_\gamma)$. The miss-detection probability of the proposed fault detection scheme is given in the following theorem. This result provides a guideline for choosing m and the probability of miss-detection. The argument t and the subscript i are omitted for simplicity.

Theorem 2: Let $\eta$ be obtained for a given confidence level. A fault is detected, if $S^m = (D^m)^2 P_\gamma^{-1} > \eta$. The false alarm probability $P_F$ is independent of m, while the miss-detection probability $P_M$ depends on m.

Proof: Let the null hypothesis denoting no fault be $H_0$. If there is no fault, the distribution of $D^m$ is $N(0, P_\gamma)$, and the probability density function (pdf) of $D^m$ is:

$$ p(D^m | H_0) = \frac{1}{\sqrt{2 \pi P_\gamma}} \exp\left( -\frac{(D^m)^2}{2 P_\gamma} \right) \tag{39} $$

From (35), $S^m = (D^m)^2 P_\gamma^{-1}$, and the pdf of $S^m$ is given by,

$$ p(S^m | H_0) = \frac{2}{\sqrt{2 \pi P_\gamma}} \exp\left( -\frac{S^m}{2} \right) \frac{\sqrt{P_\gamma}}{2 \sqrt{S^m}} = \frac{1}{\sqrt{2 \pi S^m}} \exp\left( -\frac{S^m}{2} \right) \tag{40} $$

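As a numerical illustration of Steps 3 and 4 and of Theorem 2, the following Python sketch evaluates the windowed statistic of (34) and (35), the false-alarm probability implied by the one-degree-of-freedom chi-square density (40), and the miss-detection probability that follows from the Gaussian distribution of the cumulative sum under a fault stated in (38). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the residual is treated as a scalar, and its variance is assumed to be a known constant rather than a value taken from the filter at each step.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def false_alarm(eta):
    """P(S > eta) when S is chi-square with one degree of freedom, cf. (40)."""
    return math.erfc(math.sqrt(eta / 2.0))


def miss_detection(m, b_f, variance, eta):
    """P(S <= eta) when the cumulative sum is N(sqrt(m) * b_f, variance), cf. (38)."""
    mu = math.sqrt(m / variance) * b_f  # mean of the cumulative sum in standard deviations
    root = math.sqrt(eta)
    return normal_cdf(root - mu) - normal_cdf(-root - mu)


def window_statistic(residuals, bias, variance, m):
    """Statistic of (34)-(35) over the last m residuals (scalar, constant variance assumed)."""
    recent = residuals[-m:]
    d = sum(r - bias for r in recent) / math.sqrt(m)
    return d * d / variance


if __name__ == "__main__":
    # Threshold, fault size and residual variance as quoted in Section 4.2.
    eta, b_f, variance = 10.8, 0.02, 0.01 ** 2
    print("false alarm:", false_alarm(eta))
    for m in (1, 6):
        print("m =", m, "miss-detection:", miss_detection(m, b_f, variance, eta))

    # Hypothetical noise-free residual record with a step fault after sample 100.
    residuals = [0.0] * 100 + [b_f] * 20
    first = next(k for k in range(6, len(residuals) + 1)
                 if window_statistic(residuals[:k], 0.0, variance, 6) > eta)
    print("first alarm after", first, "samples")
```

With the values quoted in Section 4.2 (threshold 10.8, fault size 0.02, residual variance 0.01 squared), this gives a false-alarm probability of about 0.1% and miss-detection probabilities of roughly 90% for m = 1 and 5% for m = 6. The false-alarm function does not take m as an argument, whereas the miss-detection function does, which mirrors the statement of Theorem 2.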
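The false alarm probability $P_F$ is defined by,

$$ P_F = \int_{\eta}^{\infty} p(S^m | H_0)\, dS^m \tag{41} $$

Since $p(S^m | H_0)$ is independent of m, $P_F$ is also independent of m. If there is a fault, the distribution of $D^m$ becomes $N(\sqrt{m}\, b_f, P_\gamma)$, and the pdf is:

$$ p(D^m | H_1) = \frac{1}{\sqrt{2 \pi P_\gamma}} \exp\left( -\frac{(D^m - \sqrt{m}\, b_f)^2}{2 P_\gamma} \right) \tag{42} $$

Let $H_1$ be the hypothesis that there is a fault. Then the pdf of $S^m$ can be expressed as,

$$ p(S^m | H_1) = \frac{1}{2 \sqrt{2 \pi S^m}} \left\{ \exp\left[ -\frac{1}{2} \left( \sqrt{S^m} - \sqrt{m / P_\gamma}\, b_f \right)^2 \right] + \exp\left[ -\frac{1}{2} \left( \sqrt{S^m} + \sqrt{m / P_\gamma}\, b_f \right)^2 \right] \right\} \tag{43} $$

From (43), the miss-detection probability $P_M$ is given by,

$$ \begin{aligned} P_M(m) &= \int_0^{\eta} p(S^m | H_1)\, dS^m \\ &= \int_0^{\sqrt{\eta}} \frac{1}{\sqrt{2\pi}} \left\{ \exp\left[ -\frac{1}{2} \left( x - \sqrt{m / P_\gamma}\, b_f \right)^2 \right] + \exp\left[ -\frac{1}{2} \left( x + \sqrt{m / P_\gamma}\, b_f \right)^2 \right] \right\} dx \\ &= \Phi\left( \sqrt{\eta} - \sqrt{m / P_\gamma}\, b_f \right) - \Phi\left( -\sqrt{m / P_\gamma}\, b_f \right) + \Phi\left( \sqrt{m / P_\gamma}\, b_f + \sqrt{\eta} \right) - \Phi\left( \sqrt{m / P_\gamma}\, b_f \right) \end{aligned} \tag{44} $$

where $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp(-t^2 / 2)\, dt$. Therefore $P_M$ depends on m.

The relation between $P_M$ and m is shown in Fig. 2, where the shaded part of the curve is $P_M$. Clearly, if m is large, the miss-detection probability from (44) is small,

$$ \lim_{m \to \infty} P_M(m) = 0 \tag{45} $$

However, if m is large, a longer time is required before faults are detected (Wang and Chan, 2002). If the false alarm and the miss-detection probabilities are chosen to be small, $\eta$ and m can be determined from (41) and (44), as illustrated in the example presented below.

Fig. 2 The relation between $P_M$ and m

4. SIMULATION EXAMPLE

4.1 Satellite attitude determination system

The satellite attitude determination system consists of the sun sensor, the earth sensor and the gyroscope, described by the following equations (Yang, 2002):

$$ \frac{d}{dt} \begin{bmatrix} \varphi \\ \theta \\ \psi \\ b_{Ix} \\ b_{Iy} \\ b_{Iz} \\ d_{Ix} \\ d_{Iy} \\ d_{Iz} \end{bmatrix} = \begin{bmatrix} A_{11} & I_{3 \times 3} & I_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & A_{33} \end{bmatrix} \begin{bmatrix} \varphi \\ \theta \\ \psi \\ b_{Ix} \\ b_{Iy} \\ b_{Iz} \\ d_{Ix} \\ d_{Iy} \\ d_{Iz} \end{bmatrix} + \begin{bmatrix} \omega_{Ix} \\ \omega_{Iy} + \omega_0 \\ \omega_{Iz} \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + W $$

$$ \begin{bmatrix} \alpha_m \\ \beta_m \\ \gamma_m \\ \varphi_h \\ \theta_h \end{bmatrix} = g(\varphi, \theta, \psi;\ S_x^0, S_y^0, S_z^0) + V \tag{46} $$

where the first three components of g, corresponding to the sun sensor, are ratios of linear combinations of $S_x^0$, $S_y^0$ and $S_z^0$ whose coefficients depend on the attitude angles, the last two components correspond to the earth sensor, and

$$ A_{11} = \begin{bmatrix} 0 & 0 & \omega_0 \\ 0 & 0 & 0 \\ -\omega_0 & 0 & 0 \end{bmatrix}, \qquad A_{33} = \begin{bmatrix} -1/\tau_{Ix} & 0 & 0 \\ 0 & -1/\tau_{Iy} & 0 \\ 0 & 0 & -1/\tau_{Iz} \end{bmatrix} $$

$I_{3 \times 3}$ is the identity matrix, $0_{3 \times 3}$ the zero matrix, $\varphi$, $\theta$ and $\psi$ are the roll, the pitch and the heading of the satellite, $\omega_0$ is the orbit angular velocity, $\omega_{Ix}$, $\omega_{Iy}$, $\omega_{Iz}$ are the measurements from the gyroscope, $b_{Ix}$, $b_{Iy}$, $b_{Iz}$, $d_{Ix}$, $d_{Iy}$, $d_{Iz}$ are the drifting errors of the gyroscope, $\tau_{Ix}$, $\tau_{Iy}$, $\tau_{Iz}$ are the first order Markov time constants, $S_x^0$, $S_y^0$, $S_z^0$ are the projections of the sun vector onto the coordinate of the spacecraft, $\alpha_m$, $\beta_m$, $\gamma_m$ are the measurements of the sun sensor, $\varphi_h$, $\theta_h$ are the measurements of the earth sensor, and W and V are zero mean Gaussian white noise.

4.2 Simulation results

It is assumed in the simulation that the satellite is being stabilized relative to the earth. The initial values of $\varphi$, $\theta$ and $\psi$ are set to zero. For a sampling interval of 0.1 second, the satellite given by (46) is simulated for 50 seconds. The drifting error of the gyroscope is 10°/h, and the measurement noises of the sun sensor and the earth sensor are zero-mean, uncorrelated noises with covariance given by constants $R_{ii} = 0.01^2$, for $i = 1, \ldots, 5$. The proposed fault detection scheme is applied to detect the following fault in the sun sensor, which occurs at 30 s.

$$ y_{f,1}(k) = y_1(k) + 0.02, \quad \text{for } k \geq 300; $$

where the constant $b_f = 0.02$ represents a sensor fault added to the observation $\alpha_m$, and from (46) and (32), there is a drift in $\varepsilon_1(k)$. Following the procedure described in section 2.2, residuals are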
obtained from the UKF. The false alarm probability is set to $P_F = 0.1\%$, and the miss-detection probability $P_M$ is expected to be not larger than 6%. From (41), $\eta_i$ obtained from the $\chi^2$-table for a 0.1% false alarm is: $\eta_i = 10.8$. When the fault occurs, $b_f = 0.02$, the miss-detection rate can be computed by (44). For m = 6, $P_M \leq 6\%$ from the statistical table on the Gaussian distribution, as $P_{\gamma,11}$ is set to $0.01^2$. So the requirement on miss-detection can be satisfied. If m = 1, the miss-detection is about 90%, and hence the requirement on the miss-detection is not satisfied. In this case, it is necessary to increase m to reduce the miss-detection.

When the fault occurs, the residuals $\varepsilon_1(k)$ and $\varepsilon_4(k)$ are shown in Fig. 3, showing a small step change in the mean of $\varepsilon_1(k)$, for k > 300. For m = 6, $S_1^m(k)$ and $S_4^m(k)$ are shown in Fig. 4. As only $S_1^m(k)$ is greater than the threshold for k > 302, a fault is detected in the component of the sun sensor, which corresponds to $S_1^m(k)$. This result agrees with the properties of the fault detection method presented in section 3.2, illustrating the ability of the local approach in detecting faults.

Fig. 3 $\varepsilon_1(k)$ and $\varepsilon_4(k)$

Fig. 4 $S_1^m(k)$ and $S_4^m(k)$

5. CONCLUSION

In this paper, a fault detection scheme for nonlinear systems is derived based on the UKF and the local approach. Since the UKF can approximate the mean and the covariance of a Gaussian random variable up to second-order accuracy, it is used here to generate residuals for detecting faults. The sufficient condition for the convergence of the UKF is presented. The local approach is applied to detect faults from the residuals, and properties of this method are derived. These properties are then used to devise guidelines for choosing the window size in the statistical test. The proposed method has been applied successfully to detect faults in the satellite attitude determination system.

ACKNOWLEDGEMENT

This work was supported in part by China Natural Science Foundation (No. 60234010), and the HKSAR RGC Grant (HKU 7050/02E).

REFERENCES

Boutayeb, M., H. Rafaralahy, and M. Darouach (1997). Convergence analysis of the extended Kalman filter used as an observer for nonlinear deterministic discrete-time systems. IEEE Trans. Auto. Contr., 42, 581-586.
Del Gobbo, D., M. Napolitano, P. Famouri and M. Innocenti (2001). Experimental application of extended Kalman filtering for sensor validation. IEEE Trans. Contr. Sys. Technology, 9, 376-380.
Hall, D. L. and J. Llinas (2001). Handbook of Multisensor Data Fusion. CRC Press.
Julier, S., J. Uhlmann and H. F. Durrant-Whyte (2000). A new method for the nonlinear transformation of means and covariances in filters and estimators. IEEE Trans. Auto. Contr., 45, 477-482.
Norgaard, M., N. K. Poulsen and O. Ravn (2000). New developments in state estimation for nonlinear systems. Automatica, 36, 1627-1638.
Wan, E. A. and R. van der Merwe (2000). The unscented Kalman filter for nonlinear estimation. Adaptive Systems for Signal Processing, Communications, and Control Symposium, 153-158.
Wang, Y. and C. W. Chan (2002). Asymptotic local approach in fault detection with faults modeled by neurofuzzy networks. Proceedings 15th Triennial World Congress, Barcelona, Spain.
Yang, J. (2002). Attitude determination, positioning and fault diagnosis of GPS. Ph.D. Dissertation, Beihang University.
Yu, S. X. and J. Shi (2004). Segmentation given partial grouping constraints. IEEE Trans. on Pattern Analy. and Mach. Intellig., 26, 173-183.
Zhang, Q., M. Basseville and A. Benveniste (1994). Early warning of slight changes in systems. Automatica, 30, 95-115.
Zhang, Q., M. Basseville and A. Benveniste (1998). Fault detection and isolation in nonlinear dynamic systems: a combined input-output and local approach. Automatica, 34, 1359-1373.
