
Adaptive Gaussian Mixture Estimation and Its Application to Unsupervised Classification of Remotely Sensed Images

Sumit Chakravarty 1, Qian Du 1, Hsuan Ren 2
1 Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville, Texas 78363
2 National Research Council, Washington DC 20001

Abstract: This paper addresses unsupervised statistical classification of remotely sensed images based on mixture estimation. The application of the well-known Expectation Maximization (EM) algorithm to multidimensional image data is investigated, where a Gaussian mixture is assumed. The number of classes can be estimated via a Neyman-Pearson detection theory-based eigen-thresholding approach, which is used as a reference value in the learning process. Since most remotely sensed images are nonstationary, an adaptive EM (AEM) algorithm is also explored by localizing the estimation process. Remote sensing data are used in the experiments for performance analysis. In particular, a comparative study is conducted to quantify the improvement obtained from the adaptive EM algorithm.

Index Terms: Remote sensing imagery, classification, EM algorithm, adaptive EM algorithm.

I. INTRODUCTION

Due to recent advances in remote sensing instruments, the spectral resolution of remotely sensed images has improved significantly, as has the image quality. Such improvement provides the possibility of precise object identification. Since the spatial resolution, i.e., the area covered by a single pixel, is very large (typically several square meters for images acquired by an airborne sensor, and several hundred square meters for images acquired by a spaceborne sensor), many materials are embedded in this area. The radiance of a pixel is usually considered a mixture of contributions from all these materials. Image analysis in remote sensing therefore deals with mixed-pixel processing instead of the pure-pixel processing of standard digital images.

In most cases of remote sensing, prior information about the image scene is unavailable, and unsupervised classification has to be implemented [1]. An intuitive way to deal with the involved mixing problem is to assume a Gaussian mixture and estimate the mixing parameters via a maximum-likelihood estimation process, which results in the well-known EM algorithm [2]. The membership assignment (i.e., hard classification) of a pixel can then be achieved by using the maximum-likelihood criterion, or soft classification can be implemented by generating membership probabilities. The EM algorithm can be applied to multispectral and hyperspectral images by assuming a multivariate Gaussian distribution. The algorithm can also be made adaptive in order to capture local statistics and fit the nonstationary case, which is referred to as adaptive EM (AEM) [3]. Since the number of classes is unknown in unsupervised classification, the EM and AEM algorithms are modified to select the class number automatically.

II. EXPECTATION MAXIMIZATION (EM) ALGORITHM

The EM algorithm is an iterative technique for maximum-likelihood estimation that is widely used in signal processing. Assume that a multispectral/hyperspectral remote sensing image of size N1 × N2 with L spectral bands contains K classes with prior probabilities denoted by π_k, 1 ≤ k ≤ K. The probability density function (pdf) of the i-th pixel vector x_i is given by

p(x_i) = \sum_{k=1}^{K} \pi_k p_k(x_i)    (1)

where i = 1, ..., N_1 N_2. If each class is Gaussian distributed,

p_k(x_i) = (2\pi)^{-L/2} |\Sigma_k|^{-1/2} \exp\left[ -\frac{1}{2} (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k) \right]    (2)


where μ_k and Σ_k are the mean vector and covariance matrix of the k-th class, respectively. The prior probability π_k is non-negative and satisfies

\sum_{k=1}^{K} \pi_k = 1.    (3)

The whole image can be well approximated by an independent and identically distributed random field X, and the corresponding joint pdf is

p(X) = \prod_{i=1}^{N_1 N_2} \sum_{k=1}^{K} \pi_k p_k(x_i).    (4)
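As a concrete illustration of Eqs. (1)-(4), the sketch below evaluates the class-conditional pdf, the mixture pdf of a single pixel, and the log of the joint pdf over all pixels. The function names and the synthetic parameter values are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Class-conditional pdf p_k(x) of Eq. (2) for an L-band pixel vector x."""
    L = x.shape[0]
    d = x - mu
    maha = d @ np.linalg.solve(cov, d)          # (x - mu)^T Sigma^{-1} (x - mu)
    norm = (2.0 * np.pi) ** (L / 2.0) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def mixture_pdf(x, pis, mus, covs):
    """Mixture pdf p(x) of Eq. (1): sum over classes of pi_k * p_k(x)."""
    return sum(pi * gaussian_pdf(x, mu, cov)
               for pi, mu, cov in zip(pis, mus, covs))

def log_joint_pdf(X, pis, mus, covs):
    """Log of the joint pdf p(X) of Eq. (4), accumulated over all pixels."""
    return float(sum(np.log(mixture_pdf(x, pis, mus, covs)) for x in X))

# Tiny synthetic example: K = 2 classes, L = 4 bands, 10 pixel vectors.
rng = np.random.default_rng(0)
pis = np.array([0.6, 0.4])                      # priors satisfy Eq. (3)
mus = rng.normal(size=(2, 4))                   # class mean vectors
covs = np.stack([np.eye(4), 2.0 * np.eye(4)])   # class covariance matrices
X = rng.normal(size=(10, 4))                    # pixel vectors
print(log_joint_pdf(X, pis, mus, covs))
```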

The task is to estimate π_k, μ_k and Σ_k, 1 ≤ k ≤ K, such that p(X) is maximized. The resulting iterative algorithm is summarized as follows.

1) Initialization: initialize the algorithm with π_k^(0), μ_k^(0) and Σ_k^(0), 1 ≤ k ≤ K.

2) Estimation step: compute the membership probability

z_{ik}^{(m)} = \frac{\pi_k^{(m)} p_k(x_i)}{\sum_{j=1}^{K} \pi_j^{(m)} p_j(x_i)}    (5)

for 1 ≤ k ≤ K, where m is the iteration index.

3) Maximization step: update the parameter estimates by

\pi_k^{(m+1)} = \frac{1}{N_1 N_2} \sum_{i=1}^{N_1 N_2} z_{ik}^{(m)}    (6)

\mu_k^{(m+1)} = \frac{1}{N_1 N_2 \, \pi_k^{(m+1)}} \sum_{i=1}^{N_1 N_2} z_{ik}^{(m)} x_i    (7)

\Sigma_k^{(m+1)} = \frac{1}{N_1 N_2 \, \pi_k^{(m+1)}} \sum_{i=1}^{N_1 N_2} z_{ik}^{(m)} (x_i - \mu_k^{(m+1)})(x_i - \mu_k^{(m+1)})^T    (8)

4) If the difference between the parameters in the m-th and (m+1)-th iterations is less than a prescribed threshold ε, the algorithm is terminated. Otherwise, set m = m + 1 and go to Step 2.

The number of classes K is initialized with a relatively large value. If the prior probability π_k of the k-th class falls below a threshold ε, this class is removed and K becomes K − 1. The estimate obtained with the Neyman-Pearson detection theory-based eigen-thresholding approach in [4] can be adopted as the initial value of K.
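For concreteness, here is a minimal NumPy/SciPy sketch of Steps 2-4 together with the class-pruning rule just described, with the image arranged as an (N1·N2) × L array. The function names (e_step, m_step, prune_classes, fit_em) and the default thresholds are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pis, mus, covs):
    """Step 2, Eq. (5): membership probabilities z_ik for all pixels."""
    num = np.column_stack([
        pis[k] * multivariate_normal.pdf(X, mean=mus[k], cov=covs[k])
        for k in range(len(pis))
    ])
    return num / num.sum(axis=1, keepdims=True)

def m_step(X, z):
    """Step 3, Eqs. (6)-(8): update priors, means and covariances."""
    N, L = X.shape
    Nk = z.sum(axis=0)                        # N1*N2 * pi_k^(m+1)
    pis = Nk / N                              # Eq. (6)
    mus = (z.T @ X) / Nk[:, None]             # Eq. (7)
    covs = np.empty((len(Nk), L, L))
    for k in range(len(Nk)):                  # Eq. (8)
        d = X - mus[k]
        covs[k] = (z[:, k, None] * d).T @ d / Nk[k]
    return pis, mus, covs

def prune_classes(pis, mus, covs, prior_floor=1e-3):
    """Drop classes whose prior falls below the threshold (K <- K - 1)."""
    keep = pis >= prior_floor
    return pis[keep] / pis[keep].sum(), mus[keep], covs[keep]

def fit_em(X, pis, mus, covs, tol=1e-4, max_iter=100, prior_floor=1e-3):
    """Alternate Steps 2-4 until the parameter change falls below tol."""
    for _ in range(max_iter):
        z = e_step(X, pis, mus, covs)
        new_pis, new_mus, new_covs = m_step(X, z)
        # Step 4: stop once the mean update is below the prescribed threshold.
        if len(new_mus) == len(mus) and np.max(np.abs(new_mus - mus)) < tol:
            return new_pis, new_mus, new_covs
        pis, mus, covs = prune_classes(new_pis, new_mus, new_covs, prior_floor)
    return pis, mus, covs
```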

III. ADAPTIVE EXPECTATION MAXIMIZATION (AEM)

Since most remotely sensed images are nonstationary, an adaptive EM (AEM) algorithm is also explored by localizing the estimation process. The basic idea is to use a small window w moving around the image, so that the prior estimation in Eq. (6) depends only on the pixels covered by the window. This keeps all the unsupervised procedures valid in the nonstationary case [3]. The small image cube (for a multispectral/hyperspectral image) overlapped by the window w is treated as a multidimensional image, and the estimation step (E-step) of the EM algorithm is applied to it. This yields the probabilistic membership z_ik of each pixel in the window. From the obtained memberships, the prior probability π_k of each class is computed and assigned as the local prior π̃_k to the center pixel of the window. The window is then moved to the next position in the image and the process is repeated. The local prior obtained for each pixel is then used in the global EM algorithm to iteratively estimate μ_k and Σ_k such that p(X) is maximized.
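The window-based prior estimation described above might be sketched as follows, assuming a square window of odd side win clipped at the image borders; the function name local_priors and the border handling are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

def local_priors(cube, pis, mus, covs, win=7):
    """Estimate a local prior vector for every pixel (the AEM idea).

    cube : (N1, N2, L) image cube; win : odd window size.
    Returns priors : (N1, N2, K), the local prior assigned to each center pixel.
    """
    N1, N2, L = cube.shape
    K = len(pis)
    r = win // 2
    priors = np.zeros((N1, N2, K))
    for i in range(N1):
        for j in range(N2):
            # Pixels covered by the window w centered on pixel (i, j).
            patch = cube[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1]
            X = patch.reshape(-1, L)
            # E-step of Eq. (5) restricted to the window.
            num = np.column_stack([
                pis[k] * multivariate_normal.pdf(X, mean=mus[k], cov=covs[k])
                for k in range(K)
            ])
            z = num / num.sum(axis=1, keepdims=True)
            # Average membership inside the window, i.e. Eq. (6) computed over
            # w only, becomes the local prior assigned to the center pixel.
            priors[i, j] = z.mean(axis=0)
    return priors
```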

IV. EXPERIMENTS

The image used in the experiments is a view of the Michigan Technological University (MTU) campus area acquired with its Visible Fourier Transform Hyperspectral Imager. This instrument records hundreds of narrow spectral bands in the visible to near-infrared part of the spectrum. The false color composite in Fig. 1 displays trees as red, indicating their high reflectance in the near-infrared region. The light blue area of the image corresponds to urban presence, and the darker blue area corresponds to water bodies.

When the EM classification was applied, the three major groups in the image could easily be segmented out. If K was set to any value higher than three, the value of K was reduced to three after several iterations. The output of the maximum-likelihood classifier shows the three major groups in Fig. 2. The first group in Fig. 2(a) represents the segmented vegetation region as white, while the rest of the image is left dark. The second group in Fig. 2(b) represents the urban region. The third group in Fig. 2(c) depicts the water bodies, including the lake and rivers.

When the adaptive EM algorithm was applied, more land cover patterns could be classified. In addition to the urban area and water bodies classified as in Fig. 2, further patterns could be distinguished. For instance, if K was initialized as four, four classes resulted, as shown in Fig. 3. The vegetation in Fig. 2(a) was further classified into two classes in Fig. 3(a) and Fig. 3(c), denoted as vegetation 1 and vegetation 2. Comparing them with Fig. 1, we find that Fig. 3(c) corresponds to an area with a shade of red distinct from the rest of the brown-red vegetation, and thus represents another vegetation pattern. Since this area is small and represents a local phenomenon, the global EM was unable to capture it and simply treated it as part of the vegetation. Comparing the water bodies of the global EM in Fig. 2(c) with those of the adaptive EM in Fig. 3(b), Fig. 2(c) contains many tiny granules, while the background in Fig. 3(b) is cleaner, owing to the finer classification achieved by the adaptive EM. Comparing the urban classification in Fig. 2(b) and Fig. 3(d), the distinction is also salient in the adaptive EM result in Fig. 3(d), where the contours of the periphery closely resemble the blue areas in the original image in Fig. 1; the global EM was unable to provide a similar degree of accuracy. It is worth noting that this improvement in classification is achieved at the cost of additional computation time.

Fig. 1. The original image.

Fig. 2. Classification result using the EM algorithm: (a) vegetation, (b) urban area, (c) water bodies.

Fig. 3. Classification result using the AEM algorithm: (a) vegetation 1, (b) water bodies, (c) vegetation 2, (d) urban area.

V. CONCLUSIONS

The application of the EM and adaptive EM algorithms to remote sensing images has been investigated. The number of resulting classes in the EM and adaptive EM algorithms can be selected automatically. Using the adaptive EM algorithm, local statistics in an image scene can be captured and modeled. As a result, small classes, which are prone to being merged into large classes when the EM algorithm relies on global statistics, can be detected and classified more accurately, at the cost of additional computation time.

REFERENCES
[1] R. A. Schowengerdt, Remote Sensing: Models and Methods for Image Processing, Academic Press, 1997.
[2] A. P. Dempster, N. M. Laird and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, Vol. 39, No. 1, pp. 1-21, 1977.
[3] A. Peng and W. Pieczynski, "Adaptive mixture estimation and unsupervised local Bayesian image segmentation," Graphical Models and Image Processing, Vol. 57, No. 5, pp. 389-399, 1995.
[4] C.-I Chang and Q. Du, "A noise subspace projection approach to determination of intrinsic dimensionality for hyperspectral imagery," Proceedings of the 1999 European Symposium on Image and Signal Processing for Remote Sensing V, pp. 34-44, Florence, Italy, September 1999.
