PROJECT REPORT
On
Hyperspectral Signal Processing to Identify Land Cover Pattern
Submitted in partial fulfilment of the requirements for the award of the degree of
Bachelor of Engineering
in
Electrical and Electronics
by
D.R.RAGHURAM
1AR09EE010
MEGHANA SUDHINDRA
1AR09EE026
SUSHANT KULKARNI
1AR09EE044
External Guide:
Dr S.N.OMKAR
Principal Research Scientist
Dept. of AE, IISc
Acknowledgements
For the successful completion of this project, many people have taken time out of their busy
schedules and helped us. We would like to acknowledge their help and contribution.
We express our sincere gratitude to Dr A.Prabhakar, Principal, AIeMS and Prof H.L.Dinakar,
Head of Department, Electrical & Electronics Engineering, for encouraging us to carry out
the project at a premier institute.
We are indebted to our internal guide, Smt. Sharmila R.S, Assistant Professor for her
guidance, and constant feedback and support from the very beginning of this project.
We wholeheartedly thank our external guide, Dr. S.N.Omkar, Principal Research Scientist, Department of Aerospace Engineering, Indian Institute of Science for giving us an
opportunity to do our project on such an interesting and upcoming field of research.
We would also like to thank our Project Co-ordinators, Smt. Sulochana Akkalkot, Assistant Professor and Ms.Rashmi S, Lecturer, for their valuable suggestions and encouragement.
We are grateful to J.Senthilnath, Doctoral Student, Department of Aerospace Engineering, Indian Institute of Science for his invaluable guidance and Ashoka Vanjare, Research
Assistant, Department of Aerospace Engineering, Indian Institute of Science for his constant and generous help. We would also like to thank Nikil Rao, BITS, Pilani, a fellow member of our project group. We are also thankful to our friends in the Computational Intelligence Lab, Department of Aerospace Engineering, IISc and all the teaching and non-teaching staff at our college who have directly or indirectly contributed to the successful completion of this project.
Above all, we are thankful to our parents, whose tireless efforts have culminated in us
reaching this juncture of life.
Contents

List of Figures
List of Tables
Abstract
1 Introduction
  1.1 Methodology
2 Literature Survey
3 Image Acquisition
  3.1.1 Scanner
4 Spectral Profiles
  4.1 Water
  4.2 Vegetation
  4.3 Land
  4.4 Urban areas
5 Data Preparation
6 Atmospheric Correction
7 Dimensionality Reduction
8 Classification
9 Results and Conclusions
  9.1 Results
  9.2 Conclusions
10 Future Work
References
APPENDIX - A
List of Figures

Figure 1.1 Block diagram illustrating the process flow.
Figure 1.3 The difference between hyperspectral images and other images.
Figure 9.8 Graph illustrating the variation of SAM classification accuracy with number of principal components for QUAC corrected image.
List of Tables

Table 1.1 The three colours and their wavelengths.
Table 5.1 Removed bands and the reason for their removal.
Table 5.2 The various bands used and their uses.
Table 6.1 Details of various parameters required to perform FLAASH.
Table 8.1 SAM classification efficiency for FLAASH corrected image.
Table 8.2 SAM classification efficiency for QUAC corrected image.
Table 8.3 Mahalanobis Distance classification efficiency for FLAASH corrected image.
Table 8.4 Mahalanobis Distance classification efficiency for QUAC corrected image.
Table 8.5 Minimum Distance classification efficiency for FLAASH corrected image.
Table 8.6 Minimum Distance classification efficiency for QUAC corrected image.
Table 8.7 Maximum Likelihood classification efficiency for FLAASH corrected image.
Table 8.8 Maximum Likelihood classification efficiency for QUAC corrected image.
2012-13
Abstract
Land cover assessment plays a vital role in several issues pertaining to policy making, directly
and indirectly affecting the lives of many people. Accurate information pertaining to land cover
is therefore of great importance.
The aim of this project is to automate the process of classifying land cover into different classes using images acquired by a hyperspectral sensor. The four classes of interest are: water, vegetation, barren land, and built-up areas.
This is a Level 1 classification, meaning that the main classes are not further subclassified into other classes. The hyperspectral images are passively sensed using optical signals. The sensed signals are processed to remove the effects of the atmosphere.
Atmospheric correction is performed using two techniques, namely QUAC (Quick Atmospheric Correction) and FLAASH (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes). Dimensionality reduction is done through Principal Components Analysis (PCA).
Automatic classification of the image into the target classes was done through the use of
supervised as well as unsupervised algorithms. Some of the supervised algorithms used were:
Mahalanobis Distance Classifier, Spectral Angle Mapper, etc. The unsupervised algorithm
used was the k-means algorithm.
The results indicate that supervised methods are the best classifiers and that, among the atmospheric correction techniques, the FLAASH correction module is the best one.
Chapter 1
Introduction
Land cover assessment and analysis plays a vital role in framing national policies of governments across the globe. Land cover analysis yields information pertaining to the environment, agricultural patterns, and forest cover, supports the preparation of digital maps, and is also used for military purposes.
Policies and government decisions regarding the aforementioned domains are based on the information available at hand. Accurate realisation of land cover is therefore of vital importance, as wrong or inaccurate information results in bad policies and decisions, jeopardising the security of the nation and endangering the livelihoods of millions.
Land cover refers to what is actually present on the ground, describing the physical state of the earth's surface and immediate subsurface in terms of the natural environment (water, vegetation, rocks, etc.) and man-made objects.
Land cover shouldn't be confused with the term land use, even though the two are often used interchangeably. Land cover provides information as to what features are present on the ground. For example, land cover might describe an area as grassland, whereas its land use could be pasture or a recreational park.
Prior to automatic classification, land cover and land use assessment was done manually by analysing aerial photographs of the area to be studied. As sensing technology improved and the volume of imagery grew, the need for automatic classification techniques was felt.
There are several advantages to using automatic classification techniques. Previously, multispectral sensors were used to acquire images; these contain on average around ten bands of the same area. This is not the case with hyperspectral images, which contain hundreds of bands of the same area, each band separated by a very small wavelength interval. Manually analysing hundreds of bands is an impossible task. Moreover, humans are prone to errors and bias, which negatively influence the accuracy of classification. Therefore, the only viable solution left is automatic classification.
1.1
Methodology
Prior to classification, various other procedures have to be performed; these are illustrated in the form of a block diagram in Fig 1.1. The steps, written in logical order, are as follows:
1. Data Preparation
2. Pre-Processing and Atmospheric Correction
3. Dimensionality Reduction
4. Classification
5. Validation using Ground Truth
The flow illustrated in the block diagram was arrived at by referring to the literature[1], along with other requirements suited to the specifications of the dataset we possessed.
1.2
Normal images are acquired in a very small portion of the electromagnetic spectrum, namely the visible region. The visible region comprises wavelengths corresponding to three colours: red, blue and green. The wavelengths corresponding to the three colours are given in Table 1.1.

Colour    Wavelength
Red       700 nm
Blue      400 nm
Green     500 nm

Table 1.1: The three colours and their wavelengths.
Figure 1.3: The difference between hyperspectral images and other images.
Hyperspectral imaging is also referred to as imaging spectroscopy, due to the fact that different materials can be reasonably identified using their spectra. Mathematically, the spectrum of a continuous signal x(t) can be written as [2]:

X(ω) = ∫_{−∞}^{+∞} x(t) e^{−jωt} dt    (1.1)
Hyperspectral: continuous coverage of the spectrum; spectral resolution around 10-40 nm; greater information content.
Multispectral: spectral resolution around 100 nm; low information content compared to hyperspectral signals.
1.3
The area of study is an image of Bangalore acquired in the year 2002 by the Hyperion sensor onboard the EO-1 satellite. The image ranges from Nandi Hills in the north to the outskirts of Kanakapura in the south, with spatial dimensions of 911x3191 pixels and a spectral dimension of 196 bands.
A subset of the image was chosen, primarily due to the lack of ground truth for the outlying areas of the image; any subsequently prepared ground truth would not have reflected the outlying areas as of 2002. The subset varies from the outskirts of Yelahanka in North Bangalore to the areas adjoining Bannerghatta National Park in South Bangalore. Referring to Google Earth, we observed no significant changes in these areas, and hence these particular areas were chosen to form the subset image.
Details of the subset image are given in Table 1.3, along with the original image (Fig 1.4) and the subset image (Fig 1.5). The table lists various parameters, such as latitude, longitude, and the spatial and spectral dimensions. The black portion in Fig 1.4 is indicative of areas which do not fall under the sensor's view.

Parameter                  Details
Latitude of UL corner      13.07721667 N
Longitude of UL corner     77.57472500 E
Latitude of LR corner      12.91586889 N
Longitude of LR corner     77.61036142 E
Rows                       750
Columns                    500
Spatial resolution         30 m x 30 m
1.4
Chapter 2 describes the previous attempts at land cover classification, along with a section
on other applications of hyperspectral signals.
Chapter 3 describes in detail the methods of image acquisition.
Spectral profiles are the pillars of imaging spectroscopy, and we discuss them in Chapter 4.
Chapter 5 describes the way the acquired data is prepared for pre-processing.
In Chapter 6, the need for atmospheric correction, the techniques used, and their effects on the image are discussed.
Chapter 7 talks about dimensionality reduction in general, and Principal Components Analysis in particular.
Chapter 8 deals with the various algorithms used for classification and the mathematical background of those algorithms.
The results obtained for various classification algorithms are summarised and discussed
in Chapter 9.
Chapter 10 indicates the road ahead for hyperspectral imaging research.
Chapter 2
Literature Survey
In this chapter, we take a look at previous attempts at land cover classification using satellite
images and later on in the chapter, look at the many uses of hyperspectral images and gauge
the broad spectrum of areas across which hyperspectral signals are being used.
Uttam Kumar et al. have given a clear picture of the detailed history behind land cover classification using satellite images in [3].
Land cover classification using satellite images has a long history. The first use of satellite images for land use classification began in the late 1970s and early 1980s as part of the National Mapping Program by the United States Geological Survey, based on images acquired by NASA. The images were acquired using aerial photography, and classification was carried out manually by analysts[3].
Later on, in 1991, the entire country of China was mapped to produce a digital map containing 20 different land use/land cover classes. This was done using field surveys and satellite and aerial images to understand land cover change[4].
Further attempts at land use/land cover classification include the National Land Cover Database, which provides a standardised land cover database for South Africa, Swaziland and Lesotho using Landsat images acquired in 1994-95, and the Global Vegetation Monitoring unit of the JRC, ISPRA, Italy, which generated a global land cover/land use map in the year 2000[3].
In India, land use and land cover study using remote sensing was initiated by the Indian Space Research Organisation's remote sensing satellite Resourcesat and the National Remote Sensing Agency, Department of Space. The major classes of interest were agricultural areas, surface water bodies, waste lands, forests, etc. This was carried out on a national level using multi-temporal IRS (Indian Remote Sensing) AWiFS (Advanced Wide Field Sensor) datasets to provide, on an annual basis, net sown areas for different cropping seasons and an integrated land cover map[4]. Senthilnath et al. have used multi-spectral satellite images[5] to determine land cover over the city of Bangalore in the year 2010. Uttam Kumar et al.[3] have used hyperspectral data acquired by the MODIS sensor to assess land use pattern over the district of Chikkaballapur in the state of Karnataka, India.
Other uses of hyperspectral signals include assessing the impact of climate change[6], detecting the type of crops and their growth stage[7],[8], medical imaging[9], and face recognition[10]. The applications are not limited to the aforementioned areas; hyperspectral imaging can be applied in many other areas, limited only by our imagination.
To the best of our knowledge, assessing land cover using hyperspectral data acquired by the Hyperion sensor hasn't been done till now.
Chapter 3
Image Acquisition
The sensor used to acquire the image is the Hyperion sensor onboard the Earth Observing-1 (EO-1) satellite, launched by NASA. The Hyperion sensor is a hyperspectral sensor.
3.1
Sensor Models
The sensor used in any satellite-based or airborne imager may be modelled as depicted in the figure below, which has been adapted from [11].
3.1.1
Scanner
There are basically two types of scanning methods: a) along-track scanning and b) across-track scanning.
D = h β    (3.1)

where h = altitude of the sensor and β = IFOV in radians.
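Eq 3.1 can be checked numerically. The IFOV value below is an assumed, illustrative figure, chosen so that an EO-1-like altitude of 705 km reproduces roughly a 30 m ground pixel:

```python
# Ground resolution element (Eq 3.1): D = h * beta.
h = 705_000.0       # sensor altitude in metres (EO-1 orbits at ~705 km)
beta = 42.55e-6     # IFOV in radians -- an assumed, illustrative value
D = h * beta        # ground resolution element in metres
print(round(D, 1))  # roughly a 30 m pixel
```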
3.1.2
Imaging Optics
Optics usually describes the behaviour of the visible, ultraviolet, and infrared light used in imaging. Here, the optical system is used to project the information collected by the sensor onto detectors. Before the optical signals are projected onto the detectors, they are split into their constituent regions of the electromagnetic spectrum with the help of prisms and a dichroic grating. The dichroic grating splits the optical signal into thermal and non-thermal regions. A prism is then used to split the non-thermal optical signals into the visible, ultraviolet (UV) and near-infrared regions.
3.1.3
Detectors
Due to the sensor platform and scan mirror velocity, the various sample timings for bands and pixels, the need to physically separate different spectral bands, and the limited space available on the focal plane, the detectors are often arranged in different patterns as required. Each detector integrates the energy that strikes its surface (irradiance) to form a measurement at each pixel. The integrated irradiance at each pixel is then converted into an electrical signal and quantized as an integer value, called the Digital Number (DN). A finite number of bits, Q, is used to code the continuous data measurements as binary numbers. The number of Digital Numbers is given by

N_DN = 2^Q    (3.2)

DN_range = [0, 2^Q − 1]    (3.3)

The larger the value of Q, the more closely the quantized data approximates the original continuous signal generated by the detectors, and the higher the radiometric resolution of that sensor[11].
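The quantization described by Eqs 3.2 and 3.3 can be sketched as follows. The 12-bit word size matches Hyperion, while the irradiance calibration range is an assumed value used only for illustration:

```python
import numpy as np

# Quantization of integrated irradiance into Digital Numbers (Eqs 3.2-3.3).
Q = 12                                   # bits per sample (Hyperion uses 12-bit DNs)
n_dn = 2 ** Q                            # number of discrete levels, Eq 3.2
dn_max = 2 ** Q - 1                      # top of the DN range, Eq 3.3

# Map a continuous irradiance measurement onto the DN range (linear model,
# with e_min/e_max an assumed calibration range).
e_min, e_max = 0.0, 50.0                 # irradiance range, arbitrary units
irradiance = np.array([0.0, 12.5, 25.0, 50.0])
dn = np.round((irradiance - e_min) / (e_max - e_min) * dn_max).astype(int)
print(n_dn, dn)                          # 4096 levels; DNs span 0..4095
```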
3.2
No instrument can measure the signal it has sensed with 100% accuracy, because the signal is always varying as a function of some parameter. This parameter, in the case of remote sensing, is usually time, wavelength, or space. So, to obtain the output signal, the instrument must integrate the signal over a non-zero range of the parameter. This can be written as[11]:

o(z₀) = ∫_W i(z) r(z₀ − z) dz    (3.4)

where i is the input signal, r is the instrument response (inverted and shifted by z₀), o(z₀) is the output of the instrument at z = z₀, and W is the range over which the integral is significant; W depends on the parameter of integration.
What the above equation means is that the output signal of the instrument is the convolution of the input signal (the signal being sensed) and the instrument response. Written more concisely, Eq 3.4 can be rewritten as

o = i ∗ r    (3.5)
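A discrete sketch of this convolution relationship: a narrow input feature is smeared by the instrument response. The three-tap smoothing kernel below is an assumed, illustrative response:

```python
import numpy as np

# Discrete analogue of the output = input * response relation: the
# instrument output is the convolution of the input with its response.
signal = np.array([0.0, 0.0, 1.0, 0.0, 0.0])    # an impulse-like input
response = np.array([0.25, 0.5, 0.25])          # assumed smoothing response
output = np.convolve(signal, response, mode="same")
print(output)  # the impulse is spread out by the instrument response
```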
3.3
Hyperion Sensor
As discussed above, the Hyperion sensor is a pushbroom sensor. Each pushbroom image frame captures an area of approximately 30 m along track by 7.7 km cross track[12]. The Hyperion optics is a three-mirror anastigmat design with a 12-cm primary aperture and an f-number of 11. The sensor acquires the signals reflected off the earth's surface from an altitude of 705 km. Hyperion also has two onboard spectrometers, a VNIR spectrometer and a SWIR spectrometer. The spectrometers are temperature controlled at 293 K and 283 K for the SWIR and VNIR spectrometers respectively.
Chapter 4
Spectral Profiles
4.1
Water
The variation between the spectral profiles of the four classes is explained as follows. Water acts as an absorber in the IR (infrared) region, which spans wavelengths from 700 nm to 1 mm. Maximum water absorption occurs at 1450 nm, in the SWIR portion of the IR region, as can be seen from Fig 4.1.
4.2
Vegetation
The spectral profile of vegetation is a function of the chlorophyll content present. In this case, a minor peak occurs at the wavelength corresponding to the colour green. This indicates that the vegetation is photosynthetically active. The troughs at the red and blue wavelengths occur because the wavelengths in these regions are absorbed to satisfy the energy requirement for photosynthesis. The reflectance remains very high in the NIR region because of interaction between the leaf tissues and electromagnetic radiation.
A dip can be seen in the SWIR region, because water absorption, due to the water present in the leaves and stems, predominates at these wavelengths.
4.3
Land
The reflectivity of barren land depends on the soil type, moisture content, soil texture, and organic matter present in the soil. From Fig 4.3, it can be seen that there is maximum reflectivity in the NIR region and very low reflectivity in the SWIR region. This indicates that the land under study has a very high moisture content.
4.4
Urban areas
Urban areas and barren land have similar spectral properties. The reflectivity of built-up areas is again dependent on various factors, such as the type of material used and the moisture content present in the materials. Here, the spectral profile of urban areas indicates a high moisture content in the materials.
Chapter 5
Data Preparation
For a Level 1 radiometric image, out of the 242 bands acquired by the sensor, only 198 bands have been calibrated. The calibrated channels are 8-57 for VNIR and 77-224 for SWIR. Not all channels are calibrated, because of the detectors' low response; the bands that are not calibrated are set to zero and contain no useful information. The bands which are removed are listed in Table 5.1 below.
Removed bands
Bands 1-7
Bands 58-76
Bands 77-78
Bands 56-57
Bands 225-242

Table 5.1: Removed bands and the reason for their removal.
The digital values of Level 1 images are radiances stored as 16-bit signed integers. After all the unwanted bands are removed, the remaining 196 bands are stacked one on top of another. The stacked image is stored in the Band Interleaved by Line (BIL) format using ENVI 4.7. The wavelengths and the Full Width at Half Maximum (FWHM) for each band are specified using data from the Hyperion Data Format Control Book[13].
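The BIL layout stores, for each image line, the corresponding line from every band. This can be sketched with a toy cube (the 196-band count comes from the text; the row and column counts here are arbitrary):

```python
import numpy as np

# Band-interleaved-by-line (BIL) ordering: a (bands, rows, cols) cube is
# written to disk as (rows, bands, cols), i.e. all bands for line 0, then
# all bands for line 1, and so on.
bands, rows, cols = 196, 4, 5
cube = np.arange(bands * rows * cols, dtype=np.int16).reshape(bands, rows, cols)
bil = cube.transpose(1, 0, 2)   # (rows, bands, cols) = BIL ordering
print(bil.shape)                # (4, 196, 5)
```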
The human eye can detect only three colours and their various combinations; hence the image is visualised in three bands.
Band Numbers                                      Use
Band 29 (Red), Band 23 (Green), Band 16 (Blue)    True Colour Composite
Band 50, Band 23, Band 16                         False Colour Composite (vegetation appears as red)

Table 5.2: The various bands used and their uses.
5.1
Ground Truth
Ground truth refers to what is actually present on the ground. The ground truth is used as a reference to validate the classification accuracy of the different algorithms used.
The ground truth was prepared by dividing the subset image into 40 clusters using the k-means technique. Using Google Earth images of Bangalore acquired in 2002, and False Colour Composites of the subset image, the 40 clusters were merged into four classes corresponding to water, vegetation, built-up area and land.
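The clustering step above can be sketched with a minimal k-means on toy two-band spectra; in the actual workflow the full subset image was divided into 40 clusters and merged by hand. All values below are synthetic:

```python
import numpy as np

# Minimal k-means: group toy two-band "pixel spectra" into k clusters.
rng = np.random.default_rng(0)
pixels = np.vstack([rng.normal(0.1, 0.02, (50, 2)),    # dark, water-like spectra
                    rng.normal(0.6, 0.02, (50, 2))])   # bright, vegetation-like spectra

k = 2
centers = pixels[[0, 50]].copy()            # deterministic init: one seed per group
for _ in range(10):
    # Assign every pixel to its nearest center, then recompute the centers.
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    centers = np.array([pixels[labels == i].mean(axis=0) for i in range(k)])

print(np.round(np.sort(centers[:, 0]), 1))  # cluster means near 0.1 and 0.6
```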
The false colour composite was used to find the areas containing vegetation, since vegetation appears as red in it. This image is shown below.
Figure 5.2: Vegetation appears as red in the false colour composite image.
Chapter 6
Atmospheric Correction
The process which transforms the data from spectral radiance to spectral reflectance is known as Atmospheric Correction, Compensation or Removal[14]. Hyperion images are a rich source of information contained in hundreds of narrow, contiguous spectral bands. There are a number of atmospheric agents which contaminate the information contained in the various bands. Therefore, to take complete advantage of Hyperion data, it is required to remove the effects of such atmospheric agents on the earth observation data.
6.1
The Earth's atmosphere is not clear; it contains dust particles, aerosols, water vapour molecules, carbon particles, etc., which alter both the path and the amount of radiation travelling between the source (the sun) and the pixel under observation, and between the pixel under observation and the sensor. Hence the actual parameters are not acquired, and these defects have to be corrected in order to recover the original information. This is illustrated in Fig 6.1.
Atmospheric correction is a critical pre-processing step, since most approaches have been implemented using spectral libraries or field spectra[14]. If atmospheric correction is not performed, there is a marked difference between the observed spectral irradiance and the spectral library or field spectra. These differences will negatively influence the accuracy of any classification carried out based on field spectra or an independent spectral library.
When a sensor records reflected solar energy, the atmosphere affects the brightness, or radiance, recorded over any point on the ground in two almost contradictory ways. First, it attenuates the energy illuminating a ground object at particular wavelengths, thus decreasing the radiance that can be measured. Second, the atmosphere acts as a reflector itself, adding a scattered, extraneous path radiance to the signal detected by the sensor which is unrelated to the properties of the surface. Expressing these two atmospheric effects mathematically, the total radiance recorded by the sensor may be related to the reflectance of the ground object and the incoming radiation, or irradiance, using the following equation:

L_tot = ρ E T / π + L_p    (6.1)

where ρ is the reflectance of the ground object, E is the irradiance on the object, T is the transmittance of the atmosphere, and L_p is the path radiance. All these factors depend on wavelength. The irradiance (E) stems from two sources: directly reflected sunlight and diffuse skylight (sunlight scattered by the atmosphere). The relative dominance of sunlight versus skylight in a given image is strongly dependent on weather conditions. The irradiance also varies with the seasonal changes in solar elevation angle and the changing distance between the earth and the sun[3].
The magnitude of absorption and scattering varies from place to place and time to time, depending on the concentrations and particle sizes of the various atmospheric constituents. The end result is that the raw radiance values observed by a hyperspectral sensor cannot be directly compared to laboratory spectra or to remotely sensed hyperspectral imagery acquired at other times or places. Before such comparisons can be performed, an atmospheric correction process must be used to compensate for the transient effects of atmospheric absorption and scattering.
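The two effects described above, multiplicative attenuation of the surface-leaving signal plus an additive path radiance, give a simple linear model, L_tot = ρET/π + L_p, which can be inverted for the surface reflectance ρ once the atmospheric terms are estimated. A sketch with purely illustrative numbers, not real calibration values:

```python
import math

# Invert L_tot = rho * E * T / pi + L_p for the surface reflectance rho.
E = 1500.0      # irradiance on the ground object (assumed)
T = 0.8         # atmospheric transmittance (assumed)
L_p = 20.0      # additive path radiance (assumed)
L_tot = 120.0   # radiance recorded at the sensor (assumed)

rho = math.pi * (L_tot - L_p) / (E * T)
print(round(rho, 3))  # about 0.26
```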
6.1.1
In all the atmospheric correction techniques described in this report, radiance is converted to reflectance. Information, when first acquired, is in the form of radiance.
Radiance, also known as spectral radiance, is a measure of the quantity of radiation that passes through, or is emitted from, a surface and falls within a given solid angle in a specified direction. Spectral radiance has units of W·sr⁻¹·m⁻³. The radiation of an object can be affected by the radiance of other objects in its surroundings. Energy transfer occurs between different objects, and thus the signal acquired does not truly reflect the object under observation. Fig ?? gives a pictorial representation of this explanation.
6.2
Atmospheric correction may be applied by collecting information from the scene (Scene Based Empirical Approaches) or by modelling radiation transmission through the atmosphere (Radiation Transport Modelling).
6.2.1
These approaches are based on the radiance values present in the image (the scene), hence the name Scene Based Empirical Approaches. IAR (Internal Average Reflectance), ELM (Empirical Line Method) and QUAC (Quick Atmospheric Correction) are some of the major examples of Scene Based Empirical Approaches.
6.2.2
Scene Based Empirical Approaches do not generally produce good results, because they rest on a linearity assumption which presumes uniform atmospheric transmission, scattering and adjacency effects throughout the atmosphere, and this may not be accurate. A Radiation Transport Model, by contrast, tries to model and remove the effects of the major interactions of the atmosphere with radiation, such as absorption and scattering. Very effective and recent models are MODTRAN (MODerate resolution atmospheric TRANsmission) and FLAASH (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes).
The QUAC and FLAASH atmospheric correction modules were applied to the stacked image. These two methods were chosen because they are effective for atmospheric correction of multispectral and hyperspectral data. The two correction modules are implemented using ENVI 4.7.
6.3
QUAC
QUAC (Quick Atmospheric Correction) is a scene based empirical approach. It determines the atmospheric compensation parameters directly from the information contained within the scene using the observed pixel spectra, requiring only an approximate specification of the sensor band locations (i.e., central wavelengths) and their radiometric calibration; no additional information is required[15]. The approach is based on the empirical finding that the spectral standard deviation of a collection of diverse material spectra, such as the constituent material spectra in a scene, is essentially spectrally flat.
QUAC allows the retrieval of reasonably accurate reflectance spectra even when the sensor does not have a proper radiometric or wavelength calibration, or when the solar illumination intensity is unknown. It is significantly faster than first-principles, physics-based approaches, making it potentially suitable for real-time applications. Its aerosol optical depth retrieval method, unlike most prior methods, does not require the presence of dark pixels.
QUAC creates an image of retrieved surface reflectance, scaled into two-byte signed integers using a reflectance scale factor of 10,000.
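Recovering physical reflectance from such a scaled-integer product is then a matter of dividing by the scale factor; a sketch with made-up DN values:

```python
import numpy as np

# QUAC output: reflectance stored as 2-byte signed integers, scale factor 10,000.
scaled = np.array([[2500, 431], [10000, 35]], dtype=np.int16)   # made-up values
reflectance = scaled.astype(np.float32) / 10000.0               # back to [0, 1]
print(reflectance[0, 0])  # 0.25
```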
6.4
FLAASH
The main objectives of FLAASH are to provide an accurate, physics-based derivation of atmospheric properties such as surface pressure, water vapour column, and aerosol and cloud overburdens; to incorporate those same quantities into a correction matrix; and, finally, to invert radiance-at-detector measurements into reflectance-at-surface values. Atmospheric correction serves a critical role in the processing of remotely sensed image data, particularly with respect to the identification of pixel content.
Efficient and accurate realization of images in units of reflectance, rather than radiance, is essential for building consistency into the development, maintenance, distribution, and analysis of any library of such images acquired under a variety of measurement conditions. Unlike other atmospheric correction algorithms that interpolate radiation transfer properties from a pre-calculated database of modelling results, FLAASH incorporates the MODTRAN-4 radiation transfer code. The user is provided the option of choosing any one of the standard MODTRAN models of atmosphere and aerosol types to represent the scene, and a unique MODTRAN solution is computed for each image.
FLAASH processes radiance images with spectral coverage from the mid-IR through UV wavelengths, where thermal emission can be neglected. For this situation the spectral radiance L* at a sensor pixel may be parameterized as

L* = A ρ / (1 − ρₑ S) + B ρₑ / (1 − ρₑ S) + Lₐ    (6.2)

where ρ is the pixel surface reflectance, ρₑ is an average surface reflectance for the pixel and its surrounding region, S is the spherical albedo of the atmosphere, Lₐ is the radiance backscattered by the atmosphere, and A and B are coefficients that depend on atmospheric and geometric conditions but not on the surface.
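Under the common simplification ρₑ ≈ ρ, this radiance model collapses to a single surface term and can be inverted in closed form: with x = ρ/(1 − ρS), L* = (A + B)x + Lₐ, so x = (L* − Lₐ)/(A + B) and ρ = x/(1 + Sx). A sketch with illustrative coefficients (not MODTRAN-derived values):

```python
# Solve the FLAASH-style radiance model for surface reflectance, assuming
# the pixel reflectance rho equals the spatially averaged reflectance rho_e.
# With x = rho / (1 - rho * S):
#   L = (A + B) * x + L_a  =>  x = (L - L_a) / (A + B),  rho = x / (1 + S * x)
A, B = 300.0, 60.0    # atmospheric/geometric coefficients (assumed)
S = 0.15              # spherical albedo of the atmosphere (assumed)
L_a = 15.0            # radiance backscattered by the atmosphere (assumed)
L = 105.0             # radiance at the sensor (assumed)

x = (L - L_a) / (A + B)
rho = x / (1.0 + S * x)
print(round(rho, 3))  # about 0.241
```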
Various options corresponding to different parameters, such as the sensor type, the altitude of the sensor above the ground, the pixel size, and the type of environment the image was acquired in, can be specified in FLAASH. These parameters, and others essential to carry out FLAASH for the image under study, are listed in Table 6.1.
Spectral polishing was carried out since it reduces the noise of the obtained spectra. This was done using the average of neighbouring bands; for Hyperion data, a polishing width of 9 was chosen[16].
Parameter                    Details
Sensor Type                  Hyperion
Sensor Altitude              705.00 km
Pixel Size                   30.00 m
Latitude of scene center     12.41810036 N
Longitude of scene center    77.57769775 E
Flight date                  22nd March, 2002
Atmospheric Model            Tropical
Aerosol Model                Urban
Zenith Angle                 148.655303
Azimuth Angle                110.924797

Table 6.1: Details of various parameters required to perform FLAASH.
6.5
A comparison between the atmospheric correction techniques used in our project is made in this section. Fig 6.2 illustrates the subset image after performing atmospheric correction, and Fig 6.3 the subset image prior to atmospheric correction.
6.5.1
Prior to atmospheric correction, the apparent reflectance of water is very high in the VNIR and SWIR regions. As mentioned in the previous chapter, this should not be the case. After atmospheric correction, the reflectance of water is reduced in these bands. This might seem counter-intuitive, but the unexpected result can be accounted for. Pure water, viz. water without impurities, or with an almost negligible amount of impurities, has a low reflectance in the IR region. But the water body in our area of study is not pure; it contains a large amount of organic matter, contributing to higher values of reflectance in the IR region. After atmospheric correction, the reflectance reduces to a minimum in the IR region, as expected for the spectral profile of water.
Figure 6.4: Spectral profile of Ulsoor Lake before and after atmospheric correction.
It should also be noted that the spectral profile after QUAC is not a continuous curve: it is not defined for certain wavelengths. This is expected, since QUAC is a scene-independent, empirical method of atmospheric correction, and hence such anomalies are observed.
Chapter 7
Dimensionality Reduction
To overcome the challenges posed by the high dimensionality of hyperspectral images, the number of dimensions is reduced to an acceptable number, where "acceptable" is defined by the classifier accuracy.
There are many methods of dimensionality reduction. Examples include Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Vertex Component Analysis (VCA).
ICA is based on the fact that a signal (in any number of dimensions) is composed of statistically independent signals. A major drawback of ICA is that it does not work on Gaussian datasets. Since our dataset is very large, we work under the assumption that it approximately follows a Gaussian distribution, and hence ICA is not suitable.
VCA assumes that the spectrally pure endmembers constituting the hyperspectral image lie at the vertices of a simplex in n-dimensional space.
7.1
Principal Component Analysis
The method of principal components, also known as the Karhunen-Loève Transform or the Hotelling Transform, is a data-analytic technique that obtains linear transformations of a group of correlated variables such that certain optimal conditions are satisfied. The most important of these conditions is that the transformed variables are uncorrelated[19].
The number of principal components is less than or equal to the number of original components. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space, PCA provides a lower-dimensional picture, using only the first few principal components.
7.1.1
Mathematical Description
For hyperspectral images, PCA is illustrated below. The figure has been adapted from [20].

x_i = [x_1, x_2, \ldots, x_n]_i^T, \quad \text{for } i = 1, 2, \ldots, M    (7.1)

where x_1, x_2, \ldots, x_n represent the pixel values of the hyperspectral image at a pixel location, T denotes the transpose, and M represents the number of pixel vectors of the hyperspectral image (m \times n, m being the number of rows and n being the number of columns).

The mean vector m of all the image vectors [x_1, x_2, \ldots, x_n]_i^T, for i = 1, 2, \ldots, M, is calculated as follows:
m = \frac{1}{M} \sum_{i=1}^{M} [x_1, x_2, \ldots, x_n]_i^T    (7.2)

The covariance matrix is then

C_X = \frac{1}{M} \sum_{i=1}^{M} (x_i - m)(x_i - m)^T    (7.4)

The covariance matrix can be decomposed as

C_X = A D A^T    (7.5)

where D is the diagonal matrix of eigenvalues of C_X and A = (a_1, a_2, \ldots, a_N) is the orthogonal matrix whose columns are the corresponding eigenvectors, so that

A^T = A^{-1}    (7.6)
The linear transformation which gives the principal components is described as:
y_i = A^T x_i    (7.7)

In expanded form,

\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_K \\ \vdots \\ y_N \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1K} & \cdots & a_{1N} \\
a_{21} & a_{22} & \cdots & a_{2K} & \cdots & a_{2N} \\
\vdots & \vdots & \ddots & \vdots &        & \vdots \\
a_{K1} & a_{K2} & \cdots & a_{KK} & \cdots & a_{KN} \\
\vdots & \vdots &        & \vdots & \ddots & \vdots \\
a_{N1} & a_{N2} & \cdots & a_{NK} & \cdots & a_{NN}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \\ \vdots \\ x_N \end{pmatrix}    (7.8)
If only the first K eigenvectors are chosen, the first K principal components of the image are obtained. Usually K ≪ N, and hence this leads to a reduction in the dimensionality of the image.
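As an illustrative sketch (Python/NumPy, outside the ENVI/MATLAB toolchain actually used in this project), the transformation of Eqs. (7.2)-(7.7) can be written as follows; the data here are synthetic:

```python
import numpy as np

def pca_reduce(X, K):
    """Project M pixel vectors (rows of X, shape M x N bands) onto the
    first K principal components, following Eqs. (7.2)-(7.7)."""
    m = X.mean(axis=0)                 # mean vector, Eq. (7.2)
    Xc = X - m                         # mean-centred data
    C = (Xc.T @ Xc) / X.shape[0]       # covariance matrix, Eq. (7.4)
    w, A = np.linalg.eigh(C)           # C = A D A^T, Eq. (7.5)
    order = np.argsort(w)[::-1]        # eigenvalues in descending order
    A = A[:, order[:K]]                # keep the K leading eigenvectors
    return Xc @ A                      # y = A^T x per pixel, Eq. (7.7)

# 100 'pixels' with 6 correlated 'bands' reduced to 3 components.
rng = np.random.default_rng(1)
base = rng.normal(size=(100, 3))
X = np.hstack([base, base + 0.01 * rng.normal(size=(100, 3))])
Y = pca_reduce(X, 3)
print(Y.shape)
```

The projected components come out mutually uncorrelated, which is the defining property of the transform noted above.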
Fig 7.3, Fig 7.4 and Fig 7.5 show the first six principal components of the study image. As can be seen from these figures, the first three principal components contain the majority of the information; the information content reduces to a bare minimum in the remaining components. It is evident that the SNR (Signal to Noise Ratio) progressively reduces for principal components beyond the first three.
Chapter 8
Classification Algorithms
8.1
Methodology of classification
Classification can be primarily of two types: supervised classification and unsupervised classification. Supervised classification is a learning method wherein a training set of correctly identified observations is available. An unsupervised procedure involves grouping data into categories based on some measure of inherent similarity; this procedure is also called clustering.
There are three steps involved in any classification process[21]: training, classification and accuracy assessment. Training sites are needed for supervised classification, and in this study the training areas were taken from the ground truth image. The satellite image is then classified using four supervised classifiers and one unsupervised classifier. Two measures of accuracy were computed for each of the classifier methods mentioned earlier: the overall accuracy and the error matrix (confusion matrix). Accuracy assessment
was carried out to compute the probability of error for the classified map. The error matrix describes the measure of agreement between the classified image and the training sites of the same image. The term accuracy here refers to the degree of correctness of the classification.
8.1.1
Training
Training class descriptions should be:

- independent, i.e. a change in the description of one training class should not change the value of another;
- discriminatory, i.e. different image features should have significantly different descriptions;
- reliable, i.e. all image features within a training group should share the common definitive descriptions of that group.
A convenient way of building a parametric description of this sort is via a feature vector (v1 , v2 , ...vn ), where n is the number of attributes which describe each image feature and
training class. This allows us to consider each image feature as occupying a point, and each
training class as occupying a sub-space (i.e. a representative point surrounded by some spread,
or deviation), within the n-dimensional classification space. Viewed as such, the classification
problem is that of determining to which sub-space class each feature vector belongs.
8.1.2
Confusion matrix
A confusion matrix is a specific table layout that allows visualization of the performance of
an algorithm, typically a supervised learning one (in unsupervised learning it is usually called
a matching matrix). Each column of the matrix represents the instances in a predicted class,
while each row represents the instances in an actual class. The name stems from the fact that it
makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling one as another). A confusion matrix is also called an error matrix.
There are many image classification methods available to extract the rich information present in hyperspectral imagery. Most methods compare the study image spectra with reference spectra (from the ground truth image). The reference spectra can be obtained by defining Regions of Interest, building spectral libraries, making field measurements, or by extracting them directly from image pixels.
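The confusion matrix and overall accuracy can be sketched as follows (illustrative Python with toy labels; ENVI computes these internally). Rows hold actual classes and columns predicted classes, matching the convention described above:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Rows = instances of an actual class, columns = instances of a
    predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

actual    = [0, 0, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 2, 0]
cm = confusion_matrix(actual, predicted, 3)
overall_accuracy = np.trace(cm) / cm.sum()  # correctly classified fraction
print(cm)
print(overall_accuracy)
```

Off-diagonal entries show exactly which pairs of classes the classifier confuses, which is the diagnostic value of the matrix.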
8.2
Preparation of training datasets
In order to apply supervised classification algorithms to the images, a training dataset has to be created from the ground truth data. This is used to train the classifier; after training, the remaining pixels of the image are classified. Half of the data from the ground truth was used to create the training datasets for the supervised classification algorithms. The following steps were followed to prepare the training datasets:
- A copy of the ground truth is saved as an Excel file, as it is easier to group similar values in Excel.
- Each pixel in the ground truth is assigned to one of 4 classes, with an additional class for unclassified data, so each pixel has one of 5 values: 1 - Water, 2 - Vegetation, 3 - Barren Land, 4 - Urban/Built-up Area, and 0 - Unclassified.
- Four new Excel sheets containing the co-ordinates of each pixel in the class are created for the four different classes.
- The first Excel sheet, for the Water class, is taken and an extra column is added using the random number generator (Rand()) function in Excel.
- The entire sheet is sorted from lowest to highest according to the values in the new column.
- Half the number of pixels in the class, thus selected randomly, is stored in a separate ASCII file.
- This procedure is repeated for the remaining three classes, giving four files, each containing a randomly selected half of the ground truth pixels of one class.
- Using ENVI 4.7, these pixels are read into four different Regions of Interest (ROIs), and thereby half of the data in the ground truth is randomly selected as the training dataset.
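The spreadsheet procedure above amounts to a random 50% split per class, which can be sketched programmatically (illustrative Python; the dictionary layout and function name are assumptions, not the actual Excel/ENVI workflow):

```python
import random

def split_training(ground_truth, seed=42):
    """Randomly pick half the ground-truth pixels of each class as
    training data, mirroring the per-class Rand()-and-sort procedure."""
    by_class = {}
    for (row, col), label in ground_truth.items():
        if label == 0:          # 0 = Unclassified, skipped
            continue
        by_class.setdefault(label, []).append((row, col))
    rng = random.Random(seed)
    train = {}
    for label, pixels in by_class.items():
        rng.shuffle(pixels)     # same effect as sorting on a random column
        train[label] = pixels[:len(pixels) // 2]
    return train

# Toy ground truth: {(row, col): class}, classes 1-4 as in the report.
gt = {(r, c): (r % 5) for r in range(10) for c in range(10)}
train = split_training(gt)
print({k: len(v) for k, v in sorted(train.items())})
```

Splitting within each class (rather than over the whole image) keeps every class represented in the training set even when class sizes are very unequal, as they are here.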
8.3
Spectral Angle Mapper
The Spectral Angle Mapper (SAM) is a classification method that permits rapid mapping by calculating the spectral similarity between image spectra and reference reflectance spectra[22]. The reference reflectance spectra can be taken from laboratory or field measurements or
extracted directly from the image. This technique was developed by Roberta et al.[22].
SAM is a very powerful classification method because it is relatively unaffected by surrounding illumination conditions and highlights the reflectance characteristics of the target. The drawback of this method is the spectral mixture problem: the most erroneous assumption made with SAM is that the endmembers chosen to classify the image represent pure spectra of the reference material, whereas in actual practice pixels are mixtures due to various physical phenomena.
SAM compares image spectra to the known spectra present in the ground truth image. It takes the arc cosine of the dot product between the test spectrum t and the reference spectrum r:

\alpha = \cos^{-1} \left( \frac{\sum_{i=1}^{nb} t_i r_i}{\sqrt{\sum_{i=1}^{nb} t_i^2} \sqrt{\sum_{i=1}^{nb} r_i^2}} \right)    (8.1)

where
t_i = test spectrum
r_i = reference spectrum
nb = number of bands
The smaller the calculated spectral angle, the higher the correlation between the image spectrum and the reference spectrum. Pixels further away than the specified maximum angle threshold are not classified. The reference spectra for the Spectral Angle Mapper in our case were generated as the average spectral profiles of each training class dataset. They can also come from ASCII files, spectral libraries, or statistics files.
In an n-dimensional multispectral or hyperspectral space, a pixel vector x has both a magnitude (i.e. length) and an angle measured with respect to the axes that define the co-ordinate system of the space[23]. In SAM, only the angular information is used. It is based on the idea that an observed reflectance spectrum can be considered as a vector in n-dimensional space, where the number of dimensions equals the number of spectral bands. If the overall illumination
increases or decreases, due to scattering of sunlight or shadows, the length of this vector will increase or decrease respectively, but its angular orientation will remain constant.
One major drawback of SAM is that it fails if the vector magnitude carries important discriminating information, which may happen in certain instances. However, if the pixel spectra from different classes are well distributed in feature space, there is a high likelihood that the angular information alone will provide good separation.
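Eq. (8.1) and the thresholding rule can be sketched as follows (illustrative Python/NumPy; the reference spectra and threshold are toy values). Note how scaling the test spectrum, as a uniform illumination change would, leaves the angle unchanged:

```python
import numpy as np

def spectral_angle(t, r):
    """Spectral angle (radians) between test spectrum t and reference r,
    Eq. (8.1); invariant to scaling of either spectrum."""
    t, r = np.asarray(t, float), np.asarray(r, float)
    cos_a = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    return np.arccos(np.clip(cos_a, -1.0, 1.0))

def sam_classify(pixel, references, max_angle=0.1):
    """Assign the pixel to the reference with the smallest angle, or
    leave it unclassified (None) if every angle exceeds the threshold."""
    angles = {name: spectral_angle(pixel, ref)
              for name, ref in references.items()}
    best = min(angles, key=angles.get)
    return best if angles[best] <= max_angle else None

refs = {"water": [0.9, 0.1, 0.05], "vegetation": [0.1, 0.8, 0.4]}
# A brighter (scaled) water spectrum still maps to 'water'.
print(sam_classify([1.8, 0.2, 0.1], refs))
```

A pixel spectrum unlike any reference (all angles above the threshold) is returned as unclassified, matching the behaviour described above.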
Fig 8.3 shows reference spectra generated from the training dataset Regions of Interest(ROIs)
as described earlier in the chapter.
Figure 8.3: Endmember collection spectra generated from ground truth image.
Confusion matrices were then generated for all SAM-classified images to measure the classifier's efficiency. The ground truth image was taken as the reference for generating the confusion matrix with respect to the classified output images for the 196-band image and the 5-, 10- and 15-band PCA images. The results are tabulated below.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          85.02        100         100         99.95
Vegetation      49747         83.75        81.04       81.03       80.99
Barren Land     116639        77.41        73.09       72.79       72.88
Built Up Area   25033         92.67        92.88       92.77       92.06
Overall         193569        81.0987      77.9839     77.7876     77.7355

Table 8.1: SAM classification efficiency for FLAASH corrected image.

Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          84.20        99.58       99.58       99.58
Vegetation      49747         83.74        81.88       81.88       81.85
Barren Land     116639        74.77        74.12       74.11       73.99
Built Up Area   25033         84.05        96.49       96.48       96.35
Overall         193569        78.3785      79.2934     79.2836     79.1880

Table 8.2: SAM classification efficiency for QUAC corrected image.
8.4
Mahalanobis Distance Classifier
8.4.1
Mathematical Background
D_M(x) = \sqrt{(x - \mu)^T S^{-1} (x - \mu)}    (8.2)

where \mu is the mean vector and S is the covariance matrix of the class.

The Mahalanobis distance (or generalized squared interpoint distance, for its squared value) can also be defined as a dissimilarity measure between two random vectors \vec{x}, \vec{y} of the same distribution with the same covariance matrix S:

d(\vec{x}, \vec{y}) = \sqrt{(\vec{x} - \vec{y})^T S^{-1} (\vec{x} - \vec{y})}    (8.3)

If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, the resulting distance measure is called the normalized Euclidean distance:

d(\vec{x}, \vec{y}) = \sqrt{\sum_{i=1}^{N} \frac{(x_i - y_i)^2}{s_i^2}}    (8.4)

where s_i is the standard deviation of x_i and y_i over the sample set.
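Eqs. (8.2)-(8.4) can be sketched as follows (illustrative Python/NumPy with toy vectors); the first call shows the reduction to Euclidean distance for an identity covariance:

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of Eq. (8.2)."""
    d = np.asarray(x, float) - np.asarray(mean, float)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

x, mu = np.array([3.0, 4.0]), np.array([0.0, 0.0])
# Identity covariance: reduces to the Euclidean distance, here 5.0.
print(mahalanobis(x, mu, np.eye(2)))
# A diagonal covariance gives the normalized Euclidean distance of
# Eq. (8.4): the first band's deviation is down-weighted.
cov = np.array([[4.0, 0.0], [0.0, 1.0]])
print(mahalanobis(x, mu, cov))
```

Weighting by the inverse covariance makes the classifier account for the different variances (and correlations) of the spectral bands within each training class.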
Fig 8.5 shows the classification output for the Mahalanobis distance classification on the FLAASH and QUAC corrected images.
Confusion matrices were then generated for all Mahalanobis-distance-classified images to estimate the classifier's efficiency. The ground truth image was taken as the reference for generating the confusion matrix with respect to the classified output images for the 196-band image and the 5-, 10- and 15-band PCA images. The results are tabulated below.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          91.45        84.01       82.39       71.19
Vegetation      49747         86.51        85.40       85.57       84.82
Barren Land     116639        87.17        85.64       80.53       79.95
Built Up Area   25033         88.56        88.60       91.92       86.55
Overall         193569        87.2305      85.9416     83.3208     81.9570

Table 8.3: Mahalanobis Distance Classification efficiency for FLAASH corrected image.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          92.61        97.63       97.82       99.21
Vegetation      49747         86.20        85.13       84.86       84.90
Barren Land     116639        86.98        83.12       82.75       81.21
Built Up Area   25033         88.31        92.08       92.49       95.83
Overall         193569        87.0161      84.9569     84.7178     84.2497

Table 8.4: Mahalanobis Distance Classification efficiency for QUAC corrected image.
8.5
Minimum Distance Classifier
The Minimum Distance classification is used in many remote sensing applications such as crop species identification and land pattern identification. Minimum Distance classifiers belong to a family of classifiers referred to as sample classifiers. In such classifiers the items classified are groups
of measurement vectors (e.g. all measurement vectors from an agricultural field), rather than individual vectors as in more conventional vector classifiers[27].
Minimum distance classification resembles what is probably the oldest and simplest approach to pattern recognition, namely template matching. In template matching, a template is stored for each class or pattern to be recognized (e.g. the letters of an alphabet), and an unknown pattern (e.g. an unknown letter) is then classified into the pattern class whose template best fits the pattern, on the basis of some previously defined similarity measure. In minimum distance classification, the templates and unknown patterns are distribution functions, and the measure of similarity used is a distance measure between distribution functions.
Thus, an unknown distribution is classified into the class whose distribution function is nearest to the unknown distribution in terms of some predetermined distance measure. In practice
the distribution functions involved are usually not known, nor can they be observed directly.
Rather a set of random measurement vectors from each distribution of interest is observed and
classification is based on estimated rather than actual distributions.
8.5.1
Mathematical Background
As described earlier, Minimum Distance Classifier is used to classify unknown image data to
classes which minimize distance between the image data and the class in multi-feature space.
The distance is defined as an index of similarity so that the minimum distance is identical to
the maximum similarity. Fig 8.6 shows the concept of a minimum distance classifier on the
next page.
d_k^2 = (X - \mu_k)^T (X - \mu_k)    (8.5)

where,
d_k = Euclidean distance of the pixel vector from the mean of class k
\mu_k = mean vector of class k
X = vector of image data (N bands), X = [x_1, x_2, x_3, \ldots, x_n]^T
Similarly, the minimum distance is calculated for each pixel, and based on this distance the pixel is assigned to a specific class. The Minimum Euclidean Distance classification is applied to the study image containing all 196 bands, and also to the three dimensionally reduced PCA images with 5, 10 and 15 dimensions. The training dataset with the required Regions of Interest for the four different classes is taken as input for training. The remaining pixels are classified into one of the four classes based on the minimum Euclidean distance from the means of the training classes.
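The per-pixel rule can be sketched as follows (illustrative Python/NumPy; the class means are toy 3-band values, not those estimated from the study image):

```python
import numpy as np

def min_distance_classify(pixel, class_means):
    """Assign the pixel to the class whose mean vector is nearest in
    Euclidean distance, per Eq. (8.5)."""
    pixel = np.asarray(pixel, float)
    dists = {k: np.linalg.norm(pixel - np.asarray(mu, float))
             for k, mu in class_means.items()}
    return min(dists, key=dists.get)

# Toy class means, one per training class of the report.
means = {"Water": [0.05, 0.03, 0.02],
         "Vegetation": [0.04, 0.35, 0.20],
         "Barren Land": [0.25, 0.30, 0.35],
         "Built Up Area": [0.30, 0.32, 0.40]}
print(min_distance_classify([0.06, 0.30, 0.22], means))
```

Unlike the Mahalanobis classifier, every band is weighted equally here, which is why the two methods can rank classes differently for the same pixel.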
Fig 8.7 shows the classification output for the Minimum Distance classification on the FLAASH and QUAC corrected images.
(a) Minimum Distance classification for QUAC correction. (b) Minimum Distance classification for FLAASH correction.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          99.86        99.86       99.86       99.86
Vegetation      49747         81.03        81.03       81.03       81
Barren Land     116639        74.01        74.00       73.98       73.99
Built Up Area   25033         100          100         100         100
Overall         193569        79.4611      79.4580     79.4400     79.4435

Table 8.5: Minimum Distance Classification efficiency for FLAASH corrected image.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          99.77        99.77       99.77       99.77
Vegetation      49747         81.95        81.95       81.95       81.93
Barren Land     116639        73.05        73.03       73.03       72.99
Built Up Area   25033         99.99        99.99       99.99       99.99
Overall         193569        79.1177      79.1090     79.1090     79.0811

Table 8.6: Minimum Distance Classification efficiency for QUAC corrected image.
8.6
Maximum Likelihood Classifier
It is also a very popular method of classification in the field of remote sensing, wherein each pixel is classified into the class with the maximum likelihood. The likelihood L_k is defined as the posterior probability of the pixel belonging to class k[28]:
L_k = P(k|X) = \frac{P(k) \cdot P(X|k)}{\sum_i P(i) \cdot P(X|i)}    (8.6)

where,
P(k) = prior probability of class k
P(X|k) = conditional probability to observe X from class k, or probability density function

Normally the P(k) are assumed to be equal to each other, and \sum_i P(i) \cdot P(X|i) is also common to all classes. Therefore L_k depends only on P(X|k), the probability density function.
L_k(X) = \frac{1}{(2\pi)^{n/2} \, |\Sigma_k|^{1/2}} \exp\left\{ -\frac{1}{2} (X - \mu_k)^T \Sigma_k^{-1} (X - \mu_k) \right\}    (8.7)

where,
n = number of bands,
X = image data of n bands,
\mu_k = mean vector of class k,
\Sigma_k = variance-covariance matrix of class k
When applying the Maximum Likelihood classifier, certain precautions need to be taken to obtain the best possible results:

1. Sufficient ground truth data should be sampled to allow estimation of the mean vector and the variance-covariance matrix of the population.

2. The inverse of the variance-covariance matrix becomes unstable when there is very high correlation between two bands or the ground truth is very homogeneous. In such cases, the number of bands should be reduced by principal component analysis.

3. When the distribution of the population does not follow the normal distribution, the Maximum Likelihood method cannot be applied.
Fig 8.9 shows the classification output for the Maximum Likelihood classification on the FLAASH and QUAC corrected images.
Confusion matrices were then generated for all Maximum Likelihood classified images to measure the classifier's efficiency. The ground truth image was taken as the reference for generating the confusion matrix with respect to the classified output images for the 196-band image and the 5-, 10- and 15-band PCA images. The results are tabulated below.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          98.88        98.19       98.42       98.56
Vegetation      49747         75.33        87.21       88.77       89.37
Barren Land     116639        90.84        83.22       81.72       81.78
Built Up Area   25033         88.44        95.81       97.64       98.75
Overall         193569        86.6344      86.0418     85.7799     86.1105

Table 8.7: Maximum Likelihood Classification efficiency for FLAASH corrected image.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          97.91        99.07       99.16       99.02
Vegetation      49747         83.41        87.29       87.75       88.89
Barren Land     116639        84.48        83.08       82.87       81.79
Built Up Area   25033         92.89        95.80       96.54       98.57
Overall         193569        85.4460      85.9814     86.0707     85.9798

Table 8.8: Maximum Likelihood Classification efficiency for QUAC corrected image.
8.7
K-means Clustering
K-means is one of the simplest unsupervised learning algorithms. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centers, one for each cluster. These centers should be placed in an intelligent way, because different locations cause different results[30]. The better choice is therefore to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it to the nearest center. When no point is pending, the first step is completed and an initial grouping is done.
The k new centroids are then recalculated as the barycenters of the clusters resulting from the previous step. After obtaining these k new centroids, a new binding has to be done between the same data set points and the nearest new center, generating a loop. As a result of this loop, we may notice that the k centers change their location step by step until no more changes are made, or in other words the centers do not move any more. Finally, this algorithm aims at minimizing an objective function known as the squared error function, given by:
J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \| x_i - v_j \| \right)^2    (8.8)

where,
\| x_i - v_j \| is the Euclidean distance between x_i and v_j,
c_i is the number of data points in the i-th cluster,
c is the number of cluster centers.
8.7.1
Algorithm
Let x = (x_1, x_2, x_3, \ldots, x_n) be the set of data points and V = (v_1, v_2, v_3, \ldots, v_c) be the set of centers [31].
1. Randomly select c cluster centers.
2. Calculate the distance between each data point and each cluster center.
3. Assign each data point to the cluster center whose distance from it is the minimum over all cluster centers.
4. Recalculate the new cluster centers using:

v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_i    (8.9)
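Steps 1-4 can be sketched as follows (illustrative Python/NumPy on toy 2-D points; ENVI's implementation differs in initialization and stopping criteria):

```python
import numpy as np

def kmeans(points, c, iters=100, seed=0):
    """Steps 1-4 above: pick c random centers, assign each point to its
    nearest center, recompute centers per Eq. (8.9), repeat until stable."""
    points = np.asarray(points, float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), c, replace=False)]
    for _ in range(iters):
        # Steps 2-3: nearest center for every point.
        labels = np.argmin(
            np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2),
            axis=1)
        # Step 4: new centers as cluster means (keep a center if its
        # cluster happens to be empty).
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(c)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = kmeans(pts, 2)
print(labels)
```

On these well-separated toy points the loop converges in a few iterations, with the two nearby pairs ending up in distinct clusters.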
Fig 8.10 shows the classification output for the k-means classification performed on the FLAASH corrected image.
Class name      Sample Size   Subset-196   PCA-15      PCA-10      PCA-5
                (Pixels)      bands (%)    bands (%)   bands (%)   bands (%)
Water           2150          95.67        97.12       97.16       97.16
Vegetation      49747         46.58        50.64       50.60       50.57
Barren Land     116639        53.55        55.09       55.06       55.01
Built Up Area   25033         100          100         100         100
Overall         193569        58.2340      60.2219     60.1951     60.1584

Table 8.9: K-means classification efficiency for FLAASH corrected image.
Chapter 9
Results and Conclusions
In this chapter we discuss the results obtained in the previous chapters and present them in a concise manner.
A Level 1 classification of a hyperspectral image was carried out using supervised as well as unsupervised methods. The image was atmospherically corrected using two different correction modules: QUAC and FLAASH. Dimensionality reduction was carried out prior to classification using PCA. For each classifier, the accuracy was noted. We conclude our work in the next section.
9.1
Results
Algorithm                         FLAASH (%)   QUAC (%)
Maximum Likelihood                86.6344      85.4460
Spectral Angle Mapper             81.0987      78.3785
Minimum Distance                  79.4611      79.1177
Mahalanobis Distance Classifier   87.2305      87.0161

Table 9.1: Classification accuracy for FLAASH and QUAC correction modules.
Figure 9.1: Graph illustrating the variation of Mahalanobis classification accuracy with number
of principal components.
Figure 9.2: Graph illustrating the variation of Minimum distance classification accuracy with
number of principal components.
Figure 9.3: Graph illustrating the variation of SAM accuracy with number of principal components.
Figure 9.4: Graph illustrating the variation of Maximum Likelihood classification accuracy
with number of principal components.
Figure 9.5: Graph illustrating the variation of Mahalanobis classification accuracy with number
of principal components for QUAC corrected image.
Figure 9.6: Graph illustrating the variation of Minimum Distance classification accuracy with
number of principal components for QUAC corrected image.
Figure 9.7: Graph illustrating the variation of Maximum Likelihood classifier accuracy with
number of principal components for QUAC corrected image.
Figure 9.8: Graph illustrating the variation of SAM classification accuracy with number of
principal components for QUAC corrected image.
9.2
Conclusions
1. It can be observed that for both atmospheric correction modules, the Mahalanobis Distance Classifier gave the best results, with accuracies of 87.2305% and 87.0161% for FLAASH and QUAC respectively. The Minimum Distance Classifier gave the lowest accuracies: 79.4611% and 79.1177% for FLAASH and QUAC respectively.
2. The classification accuracy for FLAASH correction is higher than for QUAC correction.
3. Due to the higher classification accuracy for FLAASH, we state that FLAASH is a better form of atmospheric compensation than QUAC.
4. Unsupervised techniques have a very low classification accuracy of only about 60% for FLAASH corrected images. This is because they do not have a priori knowledge of the dataset.
5. It is also observed that varying the number of principal components does not affect the accuracy significantly.
6. For all the classifiers, it is observed that the accuracy decreases if only two principal components are used, and thereafter the accuracy increases. The low accuracy for two PCs is expected, because the first two principal components alone are not enough to store all the information of the original image.
NOTE: There is no unsupervised classification data for QUAC corrected images.
Chapter 10
Future Work
The image classification carried out in this project is a Level 1 classification. If detailed information about the target classes is desired, Level 2 classification can be performed. Level 2 classification is used to subdivide target classes into constituent classes. For example, vegetation can be subdivided into the following classes: Healthy Vegetation and Unhealthy Vegetation, using the concept of the Normalized Difference Vegetation Index (NDVI). Note that the example is illustrative and does not place any constraints on the constituents of the base class.
Dimensionality reduction was carried out using PCA. As described in Chapter 7, other techniques exist, viz. ICA and VCA. Using these techniques of feature selection, classifier accuracy can be increased to almost 100%[18].
As can be noted from the previous chapter, unsupervised techniques gave a very low classification accuracy. To improve the accuracy of unsupervised techniques, hierarchical clustering methods using the technique of splitting and merging can be used, which again increases classifier accuracy.
Hyperspectral images, with their large sizes and multiple bands, are intrinsically amenable to parallel processing, which drastically reduces the computational time required. The following steps in this project can be parallelized: dimensionality reduction and image classification. Using GPUs has an added advantage over other forms of parallelization, such as cluster computing, since a GPU is situated on a single computer.
References
[9] J. Freeman et al., "Multispectral and hyperspectral imaging: applications for medical and surgical diagnostics," in Engineering in Medicine and Biology Society, 1997, Proc. of the
[21] K. C. Tan et al., "Comparison of Neural Network and Maximum Likelihood Classifiers for Land Cover Classification Using Landsat Multispectral Data," Proceedings of the 2011 IEEE Conference on Open Systems, Langkawi, Malaysia, September 25-28, 2011.
[22] G. Girouard et al., "Validated Spectral Angle Mapper Algorithm for Geological Mapping: Comparative Study between QuickBird and Landsat TM," Remote Sensing and GIS applied to Geosciences and Environment, Ottawa.
[23] J. A. Richards and X. Jia, Remote Sensing Digital Image Analysis - An Introduction, Third Edition, Springer, May 2006.
[24] F. Kruse, "The Spectral Image Processing System (SIPS) - Interactive visualization and analysis of imaging spectrometer data," Remote Sensing of Environment, 1993, Vol. 44, pp. 145-163.
[25] P. C. Mahalanobis, "On the generalised distance in statistics," Proceedings of the National Institute of Sciences of India, 1936, Vol. 2(1), pp. 49-55.
[26] S. Xiang, "Learning a Mahalanobis Distance Metric for Data Clustering and Classification," Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing.
[27] A. G. Wacker and D. A. Landgrebe, "Minimum Distance Classification in Remote Sensing," Laboratory for Applications of Remote Sensing, Purdue University, Indiana.
[28] J. Myung, "Tutorial on Maximum Likelihood Estimation," Journal of Mathematical Psychology, Academic Press, Columbus, October 2002.
[29] G. L. Goodman and D. McMichael, "Objective Functions for Maximum Likelihood Classifier Design," Centre for Sensor Signal and Information Processing, Salisbury, South Australia.
[30] T. Kanungo et al., "An Efficient k-Means Clustering Algorithm: Analysis and Implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, 2002, pp. 881-892.
[31] J. P. Ortega et al., "Research Issues on k-means Algorithm: An Experimental Trial Using Matlab," http://ceur-ws.org/Vol-534/Paper10.pdf, as of 5th May, 2013.
APPENDIX
% K-means segmentation of the corrected image into k classes
% (kmeans here is an image k-means routine returning the cluster means
% and a per-pixel label mask)
clc
clear all

%% read in image
img = rgb2gray(imread('flaash.jpg'));
% display image

% declare no of segments
k = 3;

img = im2double(img);
[mu, mask] = kmeans(img, k);

% first class: binary mask, connected components and region statistics
b = uint8(mask==1);
b1 = b*255;
l1 = bwlabel(b1);
stats1 = regionprops(l1, 'Centroid', 'Area', 'BoundingBox');
area_values1 = [stats1.Area];

% second class
c = uint8(mask==2);
c1 = c*255;
figure, imshow(c1), title('classification for object two');
l2 = bwlabel(c1);
stats2 = regionprops(l2, 'Centroid', 'Area', 'BoundingBox');
area_values2 = [stats2.Area];

% third class
d = uint8(mask==3);
d1 = d*255;
figure, imshow(d1), title('classification for object three');
l3 = bwlabel(d1);
stats3 = regionprops(l3, 'Centroid', 'Area', 'BoundingBox');
area_values3 = [stats3.Area];
function [M_pct, V, lambda] = hyperPct(M, q)
% Usage
%   [M_pct, V] = hyperPct(M, q)
% Inputs
%   M - 2D data matrix (p x N)
%   q - number of principal components to keep
% Outputs
%   M_pct - principal components of the data
%   V - Transformation matrix
%   lambda - eigenvalues
% References
%   http://en.wikipedia.org/wiki/Principal_component_analysis

[p, N] = size(M);

% Remove the mean from each band
u = mean(M.').';
%M = M - repmat(u, 1, N);
M = M - (u*ones(1,N));

% Covariance matrix of the mean-centred data
C = (M*M.')/N;

% Leading q eigenvectors/eigenvalues of the covariance matrix
[V, D] = eigs(C, q);

% Transform data
M_pct = V'*M;

lambda = diag(D);

return;
function [M] = hyperConvert2d(M)
% Converts an HSI cube (m x n x p) to a 2D matrix (p x N), where N = mn
% Usage
%   [M] = hyperConvert2d(M)
% Inputs
%   M - 3D HSI cube (m x n x p)
% Outputs
%   M - 2D data matrix (p x N)

if (ndims(M)>3 || ndims(M)<2)
    error('Input image must be m x n x p or m x n');
end

if (ndims(M) == 2)
    numBands = 1;
    [h, w] = size(M);
else
    [h, w, numBands] = size(M);
end

% Unfold the cube: each column is one pixel's spectrum
M = reshape(M, w*h, numBands).';

return;
function [img] = hyperConvert3d(img, h, w, numBands)
% Converts a 2D data matrix (p x N) to a 3D data cube (m x n x p),
% where N = m * n
% Usage
%   [M] = hyperConvert3d(M)
% Inputs
%   M - 2D data matrix (p x N)
% Outputs
%   M - 3D data cube (m x n x p)

if (ndims(img) ~= 2)
    error('Input must be a 2D (p x N) matrix.');
end

[numBands, N] = size(img);

if (1 == N)
    img = reshape(img, h, w);
else
    img = reshape(img.', h, w, numBands);
end

return;