
Hyperspectral Image Classification Based on Spectral-Spatial Features Using Probabilistic SVM and Locally Weighted Markov Random Fields
Mostafa Borhani, Hassan Ghassemian
Faculty of Electrical & Computer Engineering
Tarbiat Modares University
Tehran, Iran
m.borhani@modares.ac.ir, ghassemi@modares.ac.ir

Abstract: The approach proposed in this paper integrates locally weighted Markov Random Fields (MRF) into a support vector machine (SVM) framework for hyperspectral spectral-spatial classification. The method performs probabilistic SVM classification followed by a spatial regularization based on the MRF. One important innovation of this paper is the use of an edge (margin) weighting function in the MRF energy function, which preserves the edges of regions. The proposed spectral-spatial classification was examined on four real hyperspectral images, namely aerial images of urban, agricultural and volcanic areas, with different spatial resolutions (1.3 m and 20 m), different numbers of spectral channels (from 102 to 200 bands) and different sensors (AVIRIS and ROSIS). The approach was compared with previous spectral-spatial methods such as ECHO and EMP. Experimental results are presented and compared through class map visualization and measures such as average accuracy, overall accuracy and the Kappa coefficient. The proposed method improves classification accuracy, especially in cases where the additional spatial information is significant (such as forest structure).

Keywords: Hyperspectral spectral-spatial classification, Markov random fields, probabilistic SVM, local weighted marginal, remote sensing

I. INTRODUCTION
Pixelwise classifiers for hyperspectral image classification operate solely on spectral features, regardless of how the neighboring pixels are classified. In a real image, however, adjacent pixels are connected and interdependent [1]. There are two reasons for the interdependence of neighboring pixels: first, imaging sensors receive considerable energy from adjacent pixels; second, similar structures in the scene are usually larger than one pixel. This local information should help to interpret the landscape properly. To improve classification accuracy, spectral-spatial methods must therefore be developed that assign the correct class to each pixel based on the following:

1. The spectral characteristics of the pixel (spectral features).
2. The information extracted from its neighbors (spatial features).

Landgrebe and his research group pioneered the introduction of spatial context in multi-band image classification. They introduced the well-known ECHO (Extraction and Classification of Homogeneous Objects) classifier [2]. We use ECHO in this paper as a standard technique for spectral-spatial classification. ECHO was originally designed to identify objects in multispectral data, gather the statistics of the identified objects and, where possible, classify the data on an object-by-object basis. ECHO includes spatial as well as spectral information in the classification algorithm and thereby increases the classification accuracy.

In this paper, a novel spectral-spatial classification method is proposed that uses a fixed nearest neighborhood to explore and analyze the dependencies between pixels. Probabilistic SVM and MRF are two powerful tools for classifying hyperspectral data and for context analysis, respectively.

Bovolo [3] and Liu [4] developed methods based on SVM and MRF for SAR and multispectral (four-band) image classification, respectively. Both papers used the MAP decision rule to reach the final decision, employing SVM to estimate the class-conditional PDF and MRF to estimate the location-based class. We extend this approach to hyperspectral data and propose a new method based on MRF and SVM for hyperspectral image classification. In the first step of the proposed method, probabilistic SVM classification is applied [5] [6]. The second step uses spatial information to refine the classification results obtained in the first step; this is achieved with Markov Random Fields. The significant difference from the previously proposed methods [7] [3] [4] lies in the definition and integration of a weighting function in the MRF energy function, which preserves edges during the spatial regularization. The operational scheme of the proposed classification method is shown in Figure 1. The input is a B-band hyperspectral image, which can be seen as a set of n pixel vectors $X = \{x_j \in \mathbb{R}^B, \; j = 1, 2, \dots, n\}$. Recall that classification involves assigning each pixel to one of the K classes $\omega_1, \omega_2, \dots, \omega_K$.
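The probabilistic SVM step relies on turning pairwise (one-against-one) class probabilities into a single multi-class estimate, as in the pairwise coupling method of [5]. The following is a minimal sketch of that coupling under stated assumptions: the function name `couple_pairwise` and the plain-Python linear solver are illustrative and are not taken from LIBSVM or from the paper. It assumes `r[i][j]` approximates P(y = i | y = i or j, x) and solves the equality-constrained least-squares problem through its linear (KKT) system.

```python
def couple_pairwise(r):
    """Combine pairwise estimates r[i][j] ~ P(y=i | y=i or j, x) into class
    probabilities p by minimising sum_i sum_{j!=i} (r[j][i]*p[i] - r[i][j]*p[j])**2
    subject to sum(p) == 1 (illustrative helper, not a library function)."""
    k = len(r)
    # Quadratic form of the objective:
    # Q[i][i] = sum_{s != i} r[s][i]^2, Q[i][j] = -r[j][i] * r[i][j]
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                q[i][i] = sum(r[s][i] ** 2 for s in range(k) if s != i)
            else:
                q[i][j] = -r[j][i] * r[i][j]
    # KKT system [Q e; e^T 0] [p; b] = [0; 1], stored as an augmented matrix.
    n = k + 1
    a = [q[i] + [1.0, 0.0] for i in range(k)]
    a.append([1.0] * k + [0.0, 1.0])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][n] / a[i][i] for i in range(k)]


# Usage: pairwise estimates that are exactly consistent with a known p.
p_true = [0.5, 0.3, 0.2]
r = [[0.0 if i == j else p_true[i] / (p_true[i] + p_true[j]) for j in range(3)]
     for i in range(3)]
p = couple_pairwise(r)  # should match p_true up to rounding error
```

The sketch drops the non-negativity constraint and solves only the equality-constrained problem; in practice a library implementation such as LIBSVM would normally be used instead of a hand-written solver.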

978-1-4799-3351-8/14/$31.00 2014 IEEE


Figure 1. The proposed scheme based on probabilistic SVM and MRF

II. PROBABILISTIC SVM CLASSIFICATION
The first step of the proposed process performs pixelwise probabilistic SVM classification of the hyperspectral image [8] [9]. Other classifiers can also be used [10]; however, SVM is well suited to the classification of hyperspectral data [10] [9]. The results are as follows:
1. A classification map, in which each pixel has its own class label.
2. An estimate of the probability of each pixel belonging to each of the classes.
The standard SVM does not provide estimates of class probabilities. Two techniques for computing probability estimates in multi-class classification are described in [6] and [5]. We recommend using one of the methods implemented in the LIBSVM library [6]. Our goal is to estimate the probability of each pixel belonging to each class:

$$p_k = P(y = k \mid x), \quad k = 1, \dots, K \qquad (1)$$

To this end, the 'one against one' multi-class SVM classification strategy with a Gaussian RBF kernel is used. Pairwise class probabilities $r_{ij} \approx P(y = i \mid y = i \text{ or } j, x)$ are estimated using an improved implementation, as in [11] and [8]:

$$r_{ij} \approx \frac{1}{1 + e^{A f + B}} \qquad (2)$$

where A and B are obtained by minimizing the negative log-likelihood on the training data and f is the estimated decision value. The probabilities in (1) are then computed by solving the following optimization problem:

$$\min_{p} \sum_{i=1}^{K} \sum_{j : j \ne i} (r_{ji} p_i - r_{ij} p_j)^2 \quad \text{subject to} \quad \sum_{i=1}^{K} p_i = 1, \; p_i \ge 0 \qquad (3)$$

This problem has a unique solution that can be obtained by solving a simple linear system, as described in [5].

III. THE WEIGHTED AVERAGE DIRECTIONAL GRADIENT
A single-band gradient image is computed from the hyperspectral image in a separate step. Gradients mark the transitions between regions that define the boundaries between objects, where they take high values. The average of the directional gradients is then used to define the weighting function. Approaches for computing the gradient of a B-band image are analyzed in [12] and [13].
In the proposed method we use the following approach. First, the horizontal, vertical and diagonal gradients (corresponding to the directions 0°, 90°, 45° and 135°) are computed using Sobel masks [14]; each directional gradient is the sum of the gradients of the individual spectral channels. The overall gradient of the B-band image, $|\nabla x_j|$, $j = 1, 2, \dots, n$, is taken as the average of these four directional gradients.

IV. FIXED WEIGHTED MRF REGULATION
In the last step, the SVM classification map is regularized using the MAP-MRF framework. This framework is based on the assumption of dependence among the class labels of neighboring pixels: if a pixel belongs to class $\omega_k$, its adjacent pixels are likely to belong to the same class. In this paper, we implement the generalized Metropolis-Hastings [20] stochastic relaxation algorithm with annealing to compute the MAP estimate from the pixelwise classification maps [15] [16]. This method is based on the Bayesian approach and aims to minimize the overall image energy by repeatedly minimizing local energies in the spatial domain [21]. Let $L = \{L_j, \; j = 1, 2, \dots, n\}$ be the set of class labels for the image X. We propose that the local energy associated with a given pixel $x_i$ be defined as follows:

$$U(x_i) = U_{\text{spectral}}(x_i) + U_{\text{spatial}}(x_i) \qquad (4)$$

where $U_{\text{spectral}}(x_i)$ and $U_{\text{spatial}}(x_i)$ are the spectral and spatial energy functions, respectively, computed on the local neighborhood $N_i$. ($N_i$ is the set of neighboring pixels of $x_i$; the 8-neighborhood, a square with side 3, is used in our implementation.) The spectral energy function is defined as:

$$U_{\text{spectral}}(x_i) = -\ln P(x_i \mid L_i) \qquad (5)$$

where $P(x_i \mid L_i)$ is estimated using the binary probability estimates from the multi-class "one against one" SVM results [4] [5]. Two different equations are proposed for the spatial energy function. We start with the standard spatial energy formula [3], which is calculated as:

$$U_{\text{spatial}}(x_i) = \sum_{x_j \in N_i} \beta \left(1 - \delta(L_i, L_j)\right) \qquad (6)$$

where $\delta(\cdot, \cdot)$ is the Kronecker delta and the parameter $\beta$ controls the importance of the spatial energy against the spectral energy in
equation (4). The spatial energy in equation (6) is proportional to the number of neighboring pixels assigned to classes other than $L_i$.

This spatial energy works well, especially for images with large spatial structures. However, if an object is small and appears as a single pixel in the image, the model tends to assign this pixel to the class of the objects around it. To reduce this weakness of the standard spatial energy function and to preserve edges and small structures in the classification map, we propose to use a weighting function such as (7). Computing an accurate edge map for a hyperspectral image is a challenging task. For example, it may be built by applying a threshold to the gradient image $|\nabla x_j|$, $j = 1, 2, \dots, n$, in which case an appropriate threshold must be chosen. Instead of computing the edge map, we suggest defining the following function:

[Equation (7), the definition of the weighting function $\varepsilon(x_j)$, is not legible in the source.]

where the parameter of the function controls the approximate edge threshold. When $|\nabla x_j| = 0$ (no edge), $\varepsilon(x_j) = 1$; increasing $|\nabla x_j|$ leads to a smaller $\varepsilon(x_j)$, closer to zero. Thus, as the innovation of this article, the spatial energy term is defined as:

$$U_{\text{spatial}}(x_i) = \sum_{x_j \in N_i} \beta \, \varepsilon(x_j) \left(1 - \delta(L_i, L_j)\right) \qquad (8)$$

Both approaches for computing the spatial energy function (equations (6) and (8)) are used in the experimental implementation of this paper.

Here, we summarize the proposed algorithm for optimizing the energy function. In each iteration, an image location (i.e., a pixel $x_i$) is chosen at random. The local energy $U(x_i)$ is computed by equation (4). Then a new class label $L_i^N$ is chosen at random for the pixel and the new local energy $U^N(x_i)$ is computed. If the energy change $\Delta U = U^N(x_i) - U(x_i) < 0$, the new class label is accepted: $L_i = L_i^N$. Otherwise, the new class label is accepted with probability $\exp(-\Delta U / T)$. Here T is the global control parameter (temperature) [15]. The proposed algorithm extends the generalized Metropolis-Hastings algorithm to the edge-weighted energy.

V. EXPERIMENTAL EVALUATION

Figures 2-5 and Tables I, II and III give the results of an empirical evaluation and comparison of the proposed spectral-spatial classification of hyperspectral remotely sensed images on four different datasets (urban, agricultural and volcanic), with different spatial resolutions (1.3 m and 20 m), different numbers of spectral channels (from 102 to 200 bands) and different sensors (the AVIRIS imaging spectrometer and ROSIS).

The Indian Pines hyperspectral image was recorded by the AVIRIS sensor over a vegetated area in northwest Indiana. The spatial dimensions of this image are 145 × 145 pixels, and the spatial resolution is 20 m per pixel. Twenty water absorption bands [17] were removed and the remaining 200 bands were used in our experiments. Ten classes were investigated; the number of labeled samples for each class and the accuracy per class for the various methods are given in Table I. A false-color composite image and the classification results of the different methods are shown in Figure 2.

The Hekla image was acquired by the AVIRIS sensor over the area around the central volcano Hekla in Iceland [18]. The AVIRIS sensor collects reflected radiation at wavelengths from 0.4 μm to 2.4 μm using four spectrometers with 224 channels in total. During data collection, spectrometer #4 did not work properly. All data recorded by the failed spectrometer, together with the first channel of each of the other spectrometers (those channels were empty), were excluded from the dataset. The remaining 157 data channels were used for our tests. The investigated image is 560 × 600 pixels. Twelve land cover classes were examined; the number of samples of each class and the accuracy of the different methods are given in Table II. A three-band false-color image and the classification results of the different methods are shown in Figure 3. For the Indian Pines and Hekla images, 50 samples per class were randomly selected from the reference data as training samples and the remaining samples formed the test set.

The University of Pavia dataset was recorded by the ROSIS optical sensor over the urban area of the University of Pavia, Italy. The image is 340 × 610 pixels, with a spatial resolution of 1.3 m per pixel. The number of recorded data channels is 115 (with a spectral range from 0.43 μm to 0.86 μm). The 12 most degraded channels were removed and the remaining 103 bands were used for the experiments in this paper. A false-color composite image and the classification results of the different methods are shown in Figure 4.

The Center of Pavia dataset was recorded over an urban area by the ROSIS sensor. The image used for the experiments is 300 × 900 pixels with 102 spectral channels (the 13 noisiest channels were excluded). The nine class labels of the reference ground truth map and the number of labeled samples in each class are given in Table III. Classification results are shown in Figure 5. Thirty samples per class were randomly selected as training samples.

TABLE I. ACCURACY (%) ON THE INDIAN PINES DATASET FOR EACH CLASS FOR DIFFERENT METHODS. YELLOW BOXES IN ALL TABLES MARK THE BEST ACCURACY AMONG ALL OF THE APPROACHES

                                                  Proposed Approach
Class      #     3-NN      ML     SVM    ECHO   Without weighting   With weighting
1       1434    41.84   71.39   78.18   83.45               93.28            98.48
2        834    62.24   63.01   69.64   75.13               83.93            90.82
3        234    73.37   85.87   91.85   92.39               99.46            98.37
4        968    67.43   79.43   82.03   90.1                98.58            98.91
5       2468    53.91   52.65   58.95   64.14               82.09            76.92
6        614    64.72   85.99   87.94   89.89               97.7             97.34
7         54    84.62   48.72   74.36   48.72               97.44            97.44
8        497    86.35   93.51   92.17   94.18               97.54            97.54
9        747    91.97   94.69   91.68   96.27               97.7             97.56
10        26   100      36.36  100      36.36              100              100
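The label-update procedure summarized at the end of Section IV can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `prob[i][j][k]` is assumed to hold the step-one probability estimate of class k at pixel (i, j), `weight` is a hypothetical precomputed map of $\varepsilon$ values in [0, 1] (passing `None` gives the unweighted energy of equation (6)), and the geometric cooling schedule, the defaults and all function names are assumptions, since the paper does not specify them.

```python
import math
import random


def local_energy(i, j, label, labels, prob, beta, weight=None):
    """Local energy of eq. (4) for pixel (i, j) if it carried `label`."""
    height, width = len(labels), len(labels[0])
    # Spectral term, eq. (5); the floor avoids log(0).
    spectral = -math.log(max(prob[i][j][label], 1e-12))
    spatial = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ni, nj = i + di, j + dj
            if (di or dj) and 0 <= ni < height and 0 <= nj < width:
                if labels[ni][nj] != label:  # 1 - delta(L_i, L_j)
                    # eq. (6) when weight is None, eq. (8) otherwise
                    eps = 1.0 if weight is None else weight[ni][nj]
                    spatial += beta * eps
    return spectral + spatial


def metropolis_mrf(labels, prob, beta=1.0, weight=None,
                   t0=1.0, cooling=0.98, sweeps=50, seed=0):
    """Refine a label map by Metropolis updates with simulated annealing."""
    rng = random.Random(seed)
    height, width = len(labels), len(labels[0])
    n_classes = len(prob[0][0])
    labels = [row[:] for row in labels]  # work on a copy
    temperature = t0
    for _ in range(sweeps):
        for _ in range(height * width):
            # Pick a random pixel and a random candidate label.
            i, j = rng.randrange(height), rng.randrange(width)
            new = rng.randrange(n_classes)
            old = labels[i][j]
            if new == old:
                continue
            delta_u = (local_energy(i, j, new, labels, prob, beta, weight)
                       - local_energy(i, j, old, labels, prob, beta, weight))
            # Accept if the energy drops, else with probability exp(-dU/T).
            if delta_u < 0 or rng.random() < math.exp(-delta_u / temperature):
                labels[i][j] = new
        temperature *= cooling  # assumed geometric cooling schedule
    return labels
```

With `weight=None` an isolated pixel surrounded by another class tends to be relabeled, which is the behavior the text attributes to equation (6); a weight map that is close to zero around that pixel removes most of the spatial penalty and lets the pixel keep its spectral label.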



Figure 2. Indian Pines: (middle) color combination of three bands at 837, 636 and 537 nm; classification results of (a) 3-NN, (b) ML, (c) SVM, (d) ECHO, (e) SVM with MRF, (f) SVM classifier with locally weighted MRF.

Figure 5. Hekla image: (middle) three-band color combination at 1125, 636 and 576 nm; classification results of (a) 3-NN, (b) ML, (c) SVM, (d) ECHO, (e) SVM classifier with MRF, (f) SVM classifier with locally weighted MRF.
TABLE II. ACCURACY (%) ON THE HEKLA DATASET FOR EACH CLASS FOR DIFFERENT METHODS

                                                  Proposed Approach
Class      #     3-NN      ML     SVM    ECHO   Without weighting   With weighting
1        342    87.67   98.97   88.36   99.66              100              100
2        708    93.02   98.94   87.25   99.24               99.85           100
3       1496    92.39   94.26   88.24   94.26              100              100
4       2739    97.1    94.01   84.94   94.38               96.24            96.47
5        410    68.33   96.39   93.33   96.94              100              100
6       1023    91.98   98.15   94.24   98.25              100               98.97
7        684    87.07   98.74   87.54   99.37              100              100
8        700    82      96.15   91.69   96.31               99.38            99.38
9        404    86.44   92.37   85.88   95.48              100              100
10       550    55      97.6    74.2    99.6               100               97.8

Figure 3. Indian Pines dataset: comparison of the accuracy, reliability and Kappa factor of the proposed method with previous methods.

Figure 4. Hekla dataset: comparison of the accuracy, validity and Kappa factor of the proposed method with previous methods.

Figure 6. Comparison of the accuracy, validity and Kappa factor of the proposed method with previous methods, EMP [19] and ECHO.
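The summary measures named in these captions and in the abstract (overall accuracy, average accuracy and the Kappa coefficient) can be computed from a confusion matrix. The paper does not state its formulas, so the sketch below assumes the standard definitions; the function name and the small example matrix are illustrative only and are not taken from the reported experiments.

```python
def accuracy_measures(confusion):
    """Return (overall accuracy, average accuracy, kappa).

    confusion[i][j] is the number of test pixels of true class i that were
    assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    overall = correct / total
    # Average accuracy: mean of the per-class accuracies (row-wise recall).
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(k)]
    average = sum(per_class) / k
    # Chance agreement from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, average, kappa


# Usage with a made-up two-class confusion matrix.
overall, average, kappa = accuracy_measures([[45, 5], [10, 40]])
```

The per-class values computed inside the function correspond to the kind of class-wise accuracies listed in Tables I to III, again assuming the standard definition.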



Figure 7. University of Pavia dataset: (middle) false-color combination at 650, 558 and 478 nm; classification maps of (a) 3-NN, (b) ML, (c) SVM, (d) ECHO, (e) SVM with MRF, (f) SVM classifier with locally weighted MRF.

Figure 9. Center of Pavia dataset: (right) SVM, (left) proposed classifier with locally weighted MRF.

TABLE III. ACCURACY (%) ON THE CENTER OF PAVIA DATASET FOR EACH CLASS FOR DIFFERENT METHODS

                                                  Proposed Approach
Class      #     3-NN      ML     SVM    ECHO   Without weighting   With weighting
1      34352    98.87   99.35   99.78   99.35              100              100
2       2627    70.54   92.72   90.26   93.88               96.03            96.03
3       1788    90.16   87.14   96.42   88.17               99.66           100
4       2140    74.55   79.34   64.03   82.56               62.89            64.17
5       5365    79.16   90.03   88.38   91.13               94.64            95.11
6       5568    68.04   89.67   90.45   90.45               95.03            96.01
7        972    60.62   87.9    87.47   92.36               98.94           100
8       1112    93.16   95.66   98.71   95.93              100              100
9       2146    89.46   98.39   99.95   98.39              100              100
Figure 8. University of Pavia dataset: comparison of the accuracy, validity and Kappa factor of the proposed method with previous methods.

VI. DISCUSSIONS AND CONCLUSIONS

The strategy proposed in this paper applies SVM techniques within a fixed nearest neighborhood by integrating a weighted MRF into the spectral classification. The proposed method performs probabilistic SVM classification followed by a weighted MRF-based spatial regularization. One of the important innovations is the way the weighting function is integrated into the energy function of the Markov random field, which seeks to preserve edges during spatial
regularization. The proposed strategy, with and without the weighting function, was examined on four real datasets and compared with pixelwise methods such as 3-NN, ML and SVM and with spectral-spatial classification approaches such as ECHO and EMP [22].

As the experiments show, the proposed method produces good classification results for different types of images. However, as expected, the fixed nearest-neighborhood method did not extract small objects in the image correctly. Furthermore, because the neighborhood contains only a small number of pixels, the proposed method is useful only if the initial classification map does not contain a large area of wrongly labeled pixels. If there is such an area, MRF-based methods cannot recover the correct class labels.

Experimental evaluations of the proposed method for spectral-spatial classification were provided in Figures 2-5 and Tables I, II and III for the four hyperspectral images. The test results can be summarized as follows:

• SVM classification is a suitable pixelwise method for large datasets with limited training samples.
• Taking the spatial dependence between pixels into account is useful for classification.
• In all cases, the proposed method gives better results than the pixelwise methods.
• The spectral-spatial classification methods developed in this paper perform remarkably well in terms of accuracy and reliability when compared with the established ECHO [2] and EMP [19] methods.
• The uniformity-based MRF proved to be a powerful tool for analyzing the texture information in hyperspectral images.
• MRF-based approaches yield efficient and highly capable methods for classifying different types of remote sensing images.
• The proposed method with the weighting function largely prevents borders from being smoothed away and provides a better classification map. Therefore, these techniques are recommended for the classification of hyperspectral images, especially images containing large areas with unknown spectral features.
• In all cases, the proposed method with the edge-weighted Markov random field gives better results than the proposed method without weighting.

REFERENCES
[1] J. A. Richards, X. Jia, "Remote Sensing Digital Image Analysis: An Introduction," Springer-Verlag Berlin Heidelberg, 2006.
[2] Kettig, R. L.; Landgrebe, D.A., "Classification of Multispectral Image Data by Extraction and Classification of Homogeneous Objects," Geoscience Electronics, IEEE Transactions on, vol. 14, no. 1, pp. 19-26, Jan. 1976, doi: 10.1109/TGE.1976.294460.
[3] F. Bovolo and L. Bruzzone, "A context-sensitive technique based on support vector machines for image classification," First International Conference, PReMI 2005, Kolkata, India, December 20-22, 2005. doi: 10.1007/11590316_36.
[4] Desheng Liu, Maggi Kelly, Peng Gong, "A spatial-temporal approach to monitoring forest disease spread using multi-temporal high spatial resolution imagery," Remote Sensing of Environment, Volume 101, Issue 2, 30 March 2006, Pages 167-180, ISSN 0034-4257, http://dx.doi.org/10.1016/j.rse.2005.12.012.
[5] Ting-Fan Wu, Chih-Jen Lin, and Ruby C. Weng, "Probability Estimates for Multi-class Classification by Pairwise Coupling," J. Mach. Learn. Res. 5 (December 2004), 975-1005.
[6] C. Chang and C. Lin, "LIBSVM: A library for SVM," http://www.csie.ntu.edu.tw/~cjlin/libsvm/, 2014.
[7] Farag, A.A.; Mohamed, R.M.; El-Baz, A., "A unified framework for MAP estimation in remote sensing image segmentation," Geoscience and Remote Sensing, IEEE Transactions on, vol. 43, no. 7, pp. 1617-1634, July 2005, doi: 10.1109/TGRS.2005.849059.
[8] J. Platt, "Probabilistic outputs for Support Vector Machines and comparison to regularized likelihood methods," in A. Smola, P. Bartlett, B. Scholkopf, and D. Schuurmans, Advances in Large Margin Classifiers, Cambridge, MA: MIT Press, 2000.
[9] Vladimir N. Vapnik, Statistical Learning Theory, Wiley, 1st edition, New York, September 1998.
[10] Camps-Valls, G.; Bruzzone, L., "Kernel-based methods for hyperspectral image classification," Geoscience and Remote Sensing, IEEE Transactions on, vol. 43, no. 6, pp. 1351-1362, June 2005, doi: 10.1109/TGRS.2005.846154.
[11] Licciardi, G.; Pacifici, F.; Tuia, D.; Prasad, S.; West, T.; Giacco, F.; Thiel, C.; Inglada, J.; Christophe, E.; Chanussot, J.; Gamba, P., "Decision Fusion for the Classification of Hyperspectral Data: Outcome of the 2008 GRS-S Data Fusion Contest," Geoscience and Remote Sensing, IEEE Transactions on, vol. 47, no. 11, pp. 3857-3865, Nov. 2009, doi: 10.1109/TGRS.2009.2029340.
[12] G. Noyel, J. Angulo, and D. Jeulin, "Morphological segmentation of hyperspectral images," Image Analysis and Stereology, 26:101-109, 2007. doi: 10.5566/ias.v26.p101-109.
[13] Tarabalka, Y.; Chanussot, J.; Benediktsson, J.A.; Angulo, J.; Fauvel, M., "Segmentation and Classification of Hyperspectral Data using Watershed," Geoscience and Remote Sensing Symposium, 2008. IGARSS 2008. IEEE International, vol. 3, pp. III-652-III-655, 7-11 July 2008, doi: 10.1109/IGARSS.2008.4779432.
[14] R.C. Gonzalez and R.E. Woods, "Digital Image Processing," Second Edition. Prentice Hall, 2002.
[15] Geman, Stuart; Geman, D., "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. PAMI-6, no. 6, pp. 721-741, Nov. 1984, doi: 10.1109/TPAMI.1984.4767596.
[16] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," Journal of Chemical Physics, 21(6):1087-1092, 1953. doi: 10.1063/1.1699114.
[17] Tadjudin, S.; Landgrebe, D.A., "Covariance estimation with limited training samples," Geoscience and Remote Sensing, IEEE Transactions on, vol. 37, no. 4, pp. 2113-2118, Jul 1999, doi: 10.1109/36.774728.
[18] Benediktsson, J.A.; Kanellopoulos, I., "Classification of multisource and hyperspectral data based on decision fusion," Geoscience and Remote Sensing, IEEE Transactions on, vol. 37, no. 3, pp. 1367-1377, May 1999, doi: 10.1109/36.763301.
[19] Antonio Plaza, Jon Atli Benediktsson, Joseph W. Boardman, Jason Brazile, Lorenzo Bruzzone, Gustavo Camps-Valls, Jocelyn Chanussot, Mathieu Fauvel, Paolo Gamba, Anthony Gualtieri, Mattia Marconcini, James C. Tilton, Giovanna Trianni, "Recent advances in techniques for hyperspectral image processing," Remote Sensing of Environment, Volume 113, Supplement 1, September 2009, Pages S110-S122, ISSN 0034-4257, http://dx.doi.org/10.1016/j.rse.2007.07.028.
[20] Tierney, Luke, "A note on Metropolis-Hastings kernels for general state spaces," Annals of Applied Probability, 1-9, 1998.
[21] Pandolfi, Silvia, Francesco Bartolucci, and Nial Friel, "A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection," International Conference on Artificial Intelligence and Statistics, 2010. doi:10.1.1.207.2806.
[22] M. Borhani, H. Ghassemian, "Novel Spatial Approaches for Classification of Hyperspectral Remotely Sensed Landscapes," Symposium on Artificial Intelligence and Signal Processing, December 2013.

