Image Processing: image in, image out
Image Analysis: image in, measurements out, e.g. the mean value
Image Understanding: image in, high-level description out, e.g. "this image contains two cars"
ENGN8530: CVIU
Useful Textbooks
J.C. Russ, The Image Processing Handbook, CRC Press, 5th Edition, 2006. (latest)
J.D. Foley, A. van Dam, S.K. Feiner, and J.F. Hughes, Computer Graphics: Principles and Practice, Addison-Wesley, 2nd Edition, 1997.
Camera Geometry
[Figure: pinhole camera geometry, showing the aperture (diameter d), the optical axis z, the focal length f, and the image plane with coordinates x', y']
The aperture allows light to enter the camera. The image plane is where the image is formed. The focal length is the distance between the aperture and the image plane. The optical axis passes through the centre of the aperture and is perpendicular to it.
What is an Image?
An image is an array/matrix of values (picture elements = pixels) on a plane which describe the world from the point of view of the observer. Pixels are rectangular parts of the image plane (the imaging sensor, to be precise). Square pixels are most desirable, but cameras often do not output square pixels!
Image Definition
An image is a function of two real variables f(x,y), where x and y are the image coordinates of the pixels and f is the brightness / intensity there.
For volumetric image data: f(x,y,z)
For colour images: f(x,y) = (r(x,y), g(x,y), b(x,y))
For videos: f(x,y,t), where t = time
I(x,y) = f[x,y]
We use the square brackets here to highlight that it is a discrete space.
Nyquist Rate
Defined as f_N = 2B, where B is the bandwidth. Why is this important?
If the sampling rate < f_N, aliasing effects occur. To avoid aliasing, the sampling rate must be at least twice the highest frequency present.
In images, the samples are the pixels. Since the number of pixels is fixed by the camera, spatial aliasing will occur!
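The folding behaviour can be checked numerically. The sketch below (assuming numpy; the function name `nyquist_rate` is illustrative) samples a 7 Hz sine at 10 Hz, below its Nyquist rate of 14 Hz, and shows the samples are indistinguishable from a lower-frequency alias.

```python
import numpy as np

def nyquist_rate(bandwidth_hz):
    """Minimum sampling rate that avoids aliasing: f_N = 2B."""
    return 2.0 * bandwidth_hz

# Sampling a 7 Hz sine at fs = 10 Hz (below its 14 Hz Nyquist rate):
# 7 Hz folds to 7 - 10 = -3 Hz, so the samples match a -3 Hz sine exactly.
fs = 10.0
t = np.arange(0, 1, 1 / fs)
aliased = np.sin(2 * np.pi * 7 * t)
alias = np.sin(2 * np.pi * -3 * t)
assert np.allclose(aliased, alias)
```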
[Figure: Moiré pattern caused by spatial aliasing]
SNR
Saturation is not noise; it is a physical limit of the sensor. In signal processing systems, the amount of noise is compared to that of the signal via the signal-to-noise ratio: SNR = signal power / noise power.
SNR (2)
This ratio can vary enormously, from 1 to 10^5 or more, so it is convenient to take the logarithm:
SNR (in bels) = log10(signal power / noise power)
Decibels (dB) are simply 10x bels, i.e. SNR(dB) = 10 log10(signal power / noise power).
10 dB is a tenfold increase in power; 3 dB is approximately a doubling of power.
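The dB formula above is a one-liner; the sketch below (function name `snr_db` is illustrative) confirms the tenfold-increase and doubling rules.

```python
import math

def snr_db(signal_power, noise_power):
    """SNR in decibels: 10 * log10(signal power / noise power)."""
    return 10.0 * math.log10(signal_power / noise_power)

# A tenfold power increase adds 10 dB; a doubling adds ~3 dB.
assert abs(snr_db(1000, 1) - 30.0) < 1e-9
assert abs(snr_db(2, 1) - 3.0103) < 1e-3
```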
CCD Properties
[Figure: ideal and typical CCD response curves (response vs. light intensity), showing saturation, photon noise, and dark current]
Ideal response: response = gain x light intensity
Typical CCD response: response = gain x light intensity + bias + noise
Noise arises from thermal noise, quantisation of pixels, and the amplifier.
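The typical-response model above can be simulated directly. A minimal sketch, assuming numpy and illustrative parameter values (gain, bias, and saturation level are hypothetical, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
gain, bias, saturation = 2.0, 5.0, 255.0  # illustrative values

def ccd_response(light, noise_sigma=0.0):
    """Typical CCD model: gain * light + bias + noise, clipped at saturation."""
    noise = rng.normal(0.0, noise_sigma, size=np.shape(light))
    return np.clip(gain * np.asarray(light, float) + bias + noise, 0.0, saturation)

# Linear region at low intensity; saturation (a physical limit, not noise) at high.
assert ccd_response(10.0) == 25.0
assert ccd_response(1000.0) == 255.0
```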
Dark Current
The constant response exhibited by a receptor of radiation during periods when it is not actively being exposed to light. (Example image shown with enhanced contrast.)
SNR(dB) = 3B + 2.4 for B-bit digitisation
SNR ≈ 26 dB for 8-bit digitisation, much better than the photon noise. Photon noise is the limiting factor for CCDs.
Poisson noise
Photon arrivals follow a Poisson process. Over an exposure time t with arrival rate λ: mean intensity = λt, standard deviation = √(λt).
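The mean and standard deviation relationship above can be checked numerically by drawing simulated photon counts (assuming numpy; λ and t values are illustrative):

```python
import numpy as np

# Photon counting: over exposure time t at rate lam, counts are Poisson
# distributed with mean lam*t and standard deviation sqrt(lam*t).
rng = np.random.default_rng(42)
lam, t = 50.0, 2.0
counts = rng.poisson(lam * t, size=200_000)

assert abs(counts.mean() - lam * t) < 0.5          # mean ~= 100
assert abs(counts.std() - np.sqrt(lam * t)) < 0.1  # std ~= 10
```

Note the relative noise std/mean = 1/√(λt) shrinks as the exposure grows, which is why longer exposures look cleaner.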
Naïve method: take three separate pictures, each with a different colour filter placed over the CCD.
If the subject is moving between frames, the channels will not line up correctly!
Bayer filter
Source: Wikipedia
[Figure: a point at (3,4) with basis vectors i and j]
The position vector of the point can also be written in terms of two other vectors as 3i + 4j. Here i and j are basis vectors: vectors that can be used to construct other vectors.
Basis Sets
Basis vectors do not have to be aligned along the x and y axes
[Figure: a point expressed as 2a + 3b using non-axis-aligned basis vectors a and b]
We can extend the space to n dimensions, where we need at least n basis vectors. But if a and b were in the same direction, we could not describe every point, because we could only move along one direction.
Complete Basis
What about being able to describe every point? We can write the vector p as
p = (p1, p2, …, pn)^T = p1 e1 + p2 e2 + … + pn en
where the e_k are the standard basis vectors. As long as we can construct all the standard basis vectors, we can construct p. Construct e_k from an orthonormal basis set {x_j}:
e_k = a1 x1 + a2 x2 + … + an xn, with a_j = e_k · x_j = x_jk
Completeness condition: Σ_j x_jk x_ji = δ(i − k)
Orthogonal Basis
Now consider a special set of basis vectors such that
x_a · x_b = δ(a − b)
All of these vectors are at right angles to each other (orthogonal) and have unit length (normal), hence orthonormal.
In general a_i = p · x_i. When the basis set is orthonormal, the computation of the coefficients is much easier.
Linear Transformations
Given a vector p and a complete basis set {x1, x2, …, xn} we can compute the coefficients {a1, a2, …, an}. Put the coefficients into a vector: p = (p1, p2, …, pn)^T maps to a = (a1, a2, …, an)^T.
Given a vector a of coefficients and the basis set, we can compute p. This bidirectional mapping p ↔ a is a linear transform:
- Both directions, no information lost
- p is constructed by linear combination of the basis elements
- a is the transformation of p
Matrix Notation
Write down the transform mathematically as an n x n matrix whose rows are the basis vectors:
T = [x1^T; x2^T; …; xn^T]
Then p = T^T a, and the transform is
a = (T^T)^(−1) p
Orthonormal Transforms
If the basis set is orthonormal then the following apply:
T T^T = T^T T = I (orthonormality condition), so T^(−1) = T^T
a = Tp
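A minimal numerical sketch of an orthonormal transform, assuming numpy (the 2x2 basis chosen here is just an example): the coefficients come from a = Tp, and the inverse needs only the transpose, no matrix inversion.

```python
import numpy as np

# Rows of T form an orthonormal basis; a = T p, and p = T^T a.
T = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # orthonormal rows
p = np.array([3.0, 4.0])

a = T @ p          # forward transform: coefficients in the new basis
p_back = T.T @ a   # inverse via the transpose (orthonormality condition)

assert np.allclose(T @ T.T, np.eye(2))  # T T^T = I
assert np.allclose(p_back, p)           # no information lost
```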
Image Transforms
Images are 2D, whereas vectors are 1D. We can still use the idea of basis sets and transforms. (We can also use the vector notation by vectorising the images, discussed later.)
Image basis set: {B_{1,1}(x,y), B_{1,2}(x,y), …, B_{n,m}(x,y)}
There need to be as many basis images as there are pixels in the image.
Orthonormality condition: Σ_x Σ_y B_{i,j}(x,y) B_{k,l}(x,y) = δ(i − k, j − l)
Coefficients: a_{i,j} = Σ_x Σ_y B_{i,j}(x,y) I(x,y)
Fourier Transform
Made from cos and sin put together using complex numbers.
1D: x_{kj} = (1/√n) e^{2πikj/n} = (1/√n)[cos(2πkj/n) + i sin(2πkj/n)]
2D: B_{k,l}(x,y) = (1/√(NM)) e^{2πikx/N} e^{2πily/M}
FT (2)
There is a continuous version of this transform which is useful when working with functions rather than discrete images:
F(u) = ∫ f(x) e^{−iux} dx
This makes a Fourier integral pair with
f(x) = (1/2π) ∫ F(u) e^{iux} du
In 2D:
F(u,v) = ∫∫ f(x,y) e^{−iux} e^{−ivy} dx dy
f(x,y) = (1/(2π)²) ∫∫ F(u,v) e^{iux} e^{ivy} du dv
FT (3)
Some common Fourier transform pairs:
f(x) = 1  →  F(u) = 2π δ(u)
f(x) = sin x  →  F(u) = (π/i)[δ(u − 1) − δ(u + 1)]
f(x) = (1/(σ√(2π))) e^{−x²/(2σ²)}  →  F(u) = e^{−σ²u²/2}
DFT
The discrete basis images are
B_{u,v}(x,y) = e^{2πiux/N} e^{2πivy/M}
They contain complex numbers, so the resultant output may be complex. As before, the transformed image is calculated by
F_{u,v} = Σ_{x=0}^{N−1} Σ_{y=0}^{M−1} f_{x,y} e^{−2πiux/N} e^{−2πivy/M}
DFT (2)
The complex exponentials are in fact sine and cosine functions:
e^{iθ} = cos θ + i sin θ, so e^{2πiux/N} = cos(2πux/N) + i sin(2πux/N)
Really just sine and cosine transforms stuck together. Low frequency detail appears in the first components, high frequency detail in the higher order components.
DFT Example
[Figure: an example image and its DFT coefficients]
This, and other transforms, normally takes NM operations per pixel, or N²M² operations in total. The DFT can be computed by an algorithm called the Fast Fourier Transform (FFT), which takes just log N + log M operations per pixel, or NM(log N + log M) in total. The FFT requires that N and M are powers of 2.
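In practice the 2D DFT is computed with a library FFT. A minimal sketch using numpy's `fft2`/`ifft2` (note numpy puts the 1/(NM) normalisation in the inverse transform, unlike the symmetric 1/√(NM) convention above):

```python
import numpy as np

# 2D DFT of an 8x8 image (a power-of-2 size, as the classic FFT requires).
rng = np.random.default_rng(1)
f = rng.standard_normal((8, 8))

F = np.fft.fft2(f)          # forward DFT, O(NM log NM)
f_back = np.fft.ifft2(F)    # inverse DFT recovers the image

assert np.allclose(f_back.real, f)
# Parseval: total power matches, up to numpy's 1/(NM) normalisation.
assert np.isclose((np.abs(F) ** 2).sum() / f.size, (f ** 2).sum())
```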
Shift theorem: f(x − x₀, y − y₀) ↔ e^{−ix₀u} e^{−iy₀v} F(u,v), so the power spectrum is unchanged by shifts:
|e^{−ix₀u} e^{−iy₀v} F(u,v)|² = |F(u,v)|²
Total power is preserved (Parseval's theorem):
Σ_{x,y} |f(x,y)|² = Σ_{u,v} |F(u,v)|²
DCT
1D: x_{ki} = √(2/n) cos(πk(i + ½)/n)
2D: products of the 1D basis functions in x and y.
An orthonormal transform, often used for compressing images. There is also a similar sine version.
DCT (2)
We go from the image to the frequency domain and back: f(x,y) ↔ F(u,v).
Shift the values of the pixels into signed integers [−2⁷, 2⁷ − 1], and split the image into 8 x 8 pixel blocks.
In the encoder we apply the forward Discrete Cosine Transform (DCT):
F(u,v) = ¼ C(u) C(v) Σ_{x=0}^{7} Σ_{y=0}^{7} f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
with C(u), C(v) = 1/√2 for u, v = 0 (the DC coefficient) and 1 for u, v = 1, …, 7 (the AC coefficients).
In the decoder we apply the inverse DCT, the dual of the forward transform, and reconstruct the image.
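The forward 8x8 DCT formula can be transcribed directly. A slow but literal sketch assuming numpy (the function name `dct2_8x8` is illustrative; production code would use a fast library DCT):

```python
import numpy as np

def dct2_8x8(f):
    """Forward 8x8 DCT, transcribed from the JPEG formula above."""
    F = np.zeros((8, 8))
    x = np.arange(8)
    for u in range(8):
        for v in range(8):
            cu = 1 / np.sqrt(2) if u == 0 else 1.0
            cv = 1 / np.sqrt(2) if v == 0 else 1.0
            # basis[x, y] = cos((2x+1)u pi/16) * cos((2y+1)v pi/16)
            basis = np.outer(np.cos((2 * x + 1) * u * np.pi / 16),
                             np.cos((2 * x + 1) * v * np.pi / 16))
            F[u, v] = 0.25 * cu * cv * (f * basis).sum()
    return F

rng = np.random.default_rng(2)
block = rng.integers(0, 256, (8, 8)).astype(float) - 128  # shift to [-2^7, 2^7 - 1]
F = dct2_8x8(block)
assert abs(F[0, 0] - block.sum() / 8) < 1e-9  # DC coefficient is a scaled block mean
```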
DCT Example
[Figure: an example image block and its DCT coefficients]
[Figure: compression ratio vs. JPEG quality factor (10-100) for a greyscale image]
[Figure: compression ratio vs. JPEG quality factor (10-100) for a colour image]
Due to the sub-sampling of the chrominance components, the compression achieved in colour images is higher.
Linear Filtering
Filtering is the process of modifying the image by removing some unwanted information while retaining the important information. Linear filters are popular because they are easy to implement via the convolution theorem. It is also possible to analyse the statistical properties of the filtered image.
Convolution
Consider the class of linear operators on the image array:
I′(x,y) = Σ_{a,b} g(x, y, a, b) I(a,b)
If we are interested in performing the same operation at each pixel, we can often define g in coordinates relative to the pixel (x,y) (a stationary, or shift invariant, filter):
g(x, y, a, b) = g(x − a, y − b)
So we have a convolution. A continuous version:
I′(x,y) = ∫∫ g(x − a, y − b) I(a,b) da db
Convolution (2)
The convolution operation is often written as
(I ∗ g)(x,y) = Σ_{a,b} g(x − a, y − b) I(a,b)
[Figure: example of convolving a 3 x 3 input image (values 1-9) with a convolution filter to produce the result]
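The sum above translates directly into code. A minimal direct ("full" output) 2D convolution sketch, assuming numpy (the function name `convolve2d` is illustrative, not a library call):

```python
import numpy as np

def convolve2d(I, g):
    """Direct 2D convolution: out[x, y] = sum_{a,b} g(x - a, y - b) I(a, b)."""
    N, M = I.shape
    K, L = g.shape
    out = np.zeros((N + K - 1, M + L - 1))
    for a in range(N):
        for b in range(M):
            # Each input pixel scatters a copy of the (flipped-coordinate) filter.
            out[a:a + K, b:b + L] += I[a, b] * g
    return out

I = np.arange(1, 10, dtype=float).reshape(3, 3)  # a 3x3 input like the slide's
g = np.array([[0.0, 1.0], [0.0, 0.0]])
res = convolve2d(I, g)
assert res.shape == (4, 4)
assert np.isclose(res.sum(), I.sum() * g.sum())  # convolution preserves total mass
```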
Convolution (3)
Can combine two filters to form a double linear filter:
I′(x,y) = Σ_{c,d} h(x − c, y − d) Σ_{a,b} g(c − a, d − b) I(a,b)
= Σ_{a,b} Σ_{c,d} h(x − c, y − d) g(c − a, d − b) I(a,b)
= Σ_{a,b} Σ_{r,s} h(r, s) g(x − a − r, y − b − s) I(a,b)
= Σ_{a,b} f(x − a, y − b) I(a,b)
Any number of these filters can be applied in one operation. For an image of resolution N x M this takes N²M² operations.
Convolution Example
[Figure: an example convolution filter and the resulting filtered image]
Convolution Theorem
The Fourier transform of the convolution of two functions is the product of the Fourier transforms of the functions:
F(f ∗ g) = F(f) F(g)
In other words, the convolution can be calculated by
f ∗ g = F⁻¹[F(f) F(g)]
This can be calculated in O(NM log[NM]) steps, not O(N²M²), which can be a significant computational reduction for large N, M.
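The theorem can be verified numerically: for the DFT it holds exactly for circular convolution. A sketch assuming numpy, comparing a direct circular convolution against `ifft2(fft2(f) * fft2(g))`:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal((16, 16))
g = rng.standard_normal((16, 16))

# Convolution theorem route: pointwise product in the Fourier domain.
via_fft = np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(g)).real

# Direct circular convolution: sum_{a,b} f[a,b] g[(x-a) mod N, (y-b) mod N].
direct = np.zeros_like(f)
for a in range(16):
    for b in range(16):
        direct += f[a, b] * np.roll(np.roll(g, a, axis=0), b, axis=1)

assert np.allclose(via_fft, direct)
```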
Correlation
The correlation between two images is given by
(I ⊗ G)(x,y) = Σ_{a,b} G(x + a, y + b) I(a,b)
In the Fourier domain this becomes F(f ⊗ g) = F(f)* F(g), with one transform conjugated.
Correlation Example
[Figure: correlating an image with a correlation filter G and the result]
Spatial Averaging
After this excursion into convolution, back to linear filtering. Each pixel is replaced by a weighted average of its neighbourhood pixels. Useful for sub-sampling and noise reduction.
I′(x,y) = Σ_{(k,l)∈W} a(k,l) I(x − k, y − l)
[Masks shown on the slide: a 2 x 2 mean filter (all entries 1/4), a 3 x 3 mean filter (all entries 1/9), and a 3 x 3 weighted mask with entries 0 and 1/8]
Median Filtering
Often confused with the mean filter, but the median filter is non-linear. Good for noise / outlier removal. Example:
Window size of 3, repeating edge values: x = [2 80 6 3]
y[1] = median[2 2 80] = 2
y[2] = median[2 80 6] = median[2 6 80] = 6
y[3] = median[80 6 3] = median[3 6 80] = 6
y[4] = median[6 3 3] = median[3 3 6] = 3
so y = [2 6 6 3], where y is the median filtered output of x.
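The worked example above can be reproduced in a few lines, assuming numpy (the function name `median_filter_1d` is illustrative):

```python
import numpy as np

def median_filter_1d(x, size=3):
    """Median filter with repeated edge values, matching the example above."""
    pad = size // 2
    xp = np.pad(np.asarray(x), pad, mode='edge')  # repeat edge values
    return np.array([np.median(xp[i:i + size]) for i in range(len(x))])

# The outlier 80 is removed entirely, which a mean filter could not do.
assert median_filter_1d([2, 80, 6, 3]).tolist() == [2, 6, 6, 3]
```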
[Figure: original image]
Gaussian Filter
G(x,y) = (1/(2πσ²)) e^{−(x² + y²)/(2σ²)}
High-Pass Filter
We have an image with a slowly varying background which we want to remove. Use a high-pass filter, which removes low frequency components from the spectrum.
Low-Pass Filter
We have an image with too much detail. Use a low-pass filter, which removes high frequency components from the spectrum.

First derivative: dI/dx
Second derivative: d²I/dx²
Common edge masks (a noise filter combined with a derivative filter):
Sobel:
x edge filter: [−1 0 1; −2 0 2; −1 0 1]
y edge filter: [−1 −2 −1; 0 0 0; 1 2 1]
Roberts:
[1 0; 0 −1] and [0 1; −1 0]
Smoothed (Prewitt):
x edge filter: [−1 0 1; −1 0 1; −1 0 1]
y edge filter: [−1 −1 −1; 0 0 0; 1 1 1]
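A minimal sketch of applying the Sobel masks at one pixel, assuming numpy; the helper name `response` and the row/column convention are illustrative choices, not from the slides:

```python
import numpy as np

# Sobel masks: a smoothing filter combined with a derivative filter.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
sobel_y = sobel_x.T

I = np.zeros((5, 5))
I[:, 3:] = 1.0  # vertical step edge: dark on the left, bright on the right

def response(I, mask, r, c):
    """Apply a 3x3 mask to the neighbourhood of pixel (row r, column c)."""
    return (I[r - 1:r + 2, c - 1:c + 2] * mask).sum()

gx = response(I, sobel_x, 2, 2)
gy = response(I, sobel_y, 2, 2)
magnitude = np.hypot(gx, gy)
theta = np.arctan2(gy, gx)  # gradient direction: tan(theta) = gy / gx

assert gx == 4.0 and gy == 0.0      # strong horizontal change, none vertically
assert magnitude == 4.0 and theta == 0.0
```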
Derivative-of-Gaussian edge filters:
E_x = ∂N/∂x ∝ −x e^{−(x² + y²)/(2σ²)},  E_y = ∂N/∂y ∝ −y e^{−(x² + y²)/(2σ²)}
x = r cos θ, y = r sin θ
tan θ_max = (∂I/∂y) / (∂I/∂x)
By taking gradients in the x and y directions we can find the maximum gradient of the edge and its direction.
[Figure: weak edges]
[Figure: result]
∇²I = ∂²I/∂x² + ∂²I/∂y²
The LoG (Laplacian of Gaussian) operator measures the strength of the 2nd derivative in both directions after the image has been smoothed by a Gaussian G:
∇²(G ∗ I) = (∇²G) ∗ I
∇²G = ∂²G/∂x² + ∂²G/∂y²
Difference of Gaussian
This is the DoG (Difference of Gaussians) operator, also called the Mexican Hat operator.
Disadvantages:
- The 2nd derivative is very sensitive to noise
- Not all edges form closed contours
[Figures: original image; Marr-Hildreth edges; Canny edges]
Hough Transform
It started as a feature detector for detecting lines in edge images. The circular Hough transform extends this idea to detect circles. The generalised Hough transform can be used to find any shape in an image, but more complex shapes lead to rapid increases in computational complexity, so practical applications are limited.
Reference: R.O. Duda and P.E. Hart, "Use of the Hough Transformation to Detect Lines and Curves in Pictures", Comm. ACM, Vol. 15, January 1972, pp. 11-15.
[Figures: Hough transform construction. Each edge point votes for all (θ, r) line parameters consistent with it; the curves for the first, second and third points intersect in Hough space at the parameters of the common line]
LHT Algorithm
Parameterise the line in terms of θ and r: r = x cos θ + y sin θ.
Zero the accumulator array.
Determine an edge map I_edge of the image.
Examine each point (x,y) in I_edge. If I_edge(x,y) = 1, increment all cells in the accumulator array that correspond to a line passing through this point, i.e. the points satisfying r = x cos θ + y sin θ in parameter space (θ, r).
Maxima in the accumulator array indicate the dominant parameter values, which in turn specify the lines in the image.
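The algorithm above can be sketched directly, assuming numpy; the function name `hough_lines`, the bin counts, and the (x = column, y = row) convention are illustrative choices:

```python
import numpy as np

def hough_lines(edge, n_theta=180):
    """Accumulate votes for r = x cos(theta) + y sin(theta) over an edge map."""
    h, w = edge.shape
    r_max = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * r_max + 1, n_theta), dtype=int)  # rows: r + r_max
    ys, xs = np.nonzero(edge)
    for x, y in zip(xs, ys):
        # Every theta bin gets one vote from this point, at its own r.
        r = x * np.cos(thetas) + y * np.sin(thetas)
        acc[np.round(r).astype(int) + r_max, np.arange(n_theta)] += 1
    return acc, thetas, r_max

# A horizontal row of edge points is the line theta = 90 degrees, r = 5.
edge = np.zeros((20, 20), dtype=bool)
edge[5, :] = True
acc, thetas, r_max = hough_lines(edge)
assert acc.max() == 20                 # all 20 points vote for one common bin
assert acc[5 + r_max, 90] == 20        # ... at r = 5, theta = pi/2
```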
[Figures: original image, Hough accumulator, output image]
For each data point in the edge image consider every possible shape S(q_i) that coexists with that data point; a vote is registered for each associated set of parameters q_i.
Parameter space is divided into a number of discrete accumulator cells, and an accumulator array is used to count the votes for each cell.
Shapes are located in the image as those that coexist with the maximum number of data points, i.e. maxima in the accumulator array.
Thresholding
Thresholding is used to convert grey-scale images into binary images as well as for object segmentation.
I_bin(x,y) = 1 if I(x,y) ≥ T, and 0 otherwise, for some threshold T
Thresholding (2)
Advantages:
- Simplifies images, making them easier to analyse
- Can help to segment objects
- Can construct a mask from the result
Disadvantages:
- Loss of information
- Choosing a good threshold is a bit of a black art: it is generally chosen manually and is image-specific
Intensity Histogram
Intensity histogram can be useful
[Figure: intensity histogram, number of pixels vs. intensity from black (0) to white (1)]
Histogram Equalisation
In practice not all the intensity values will be packed into a single region. To fully utilise the sensor's capacity, spread the information uniformly over all intensities, i.e. flatten the histogram: this is histogram equalisation.
[Figure: intensity histograms before and after equalisation]
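A standard way to flatten the histogram is to map each intensity through the normalised cumulative histogram. A minimal sketch assuming numpy (the function name `equalise` is illustrative):

```python
import numpy as np

def equalise(img, levels=256):
    """Map intensities through the normalised cumulative histogram (a monotonic f)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum() / img.size
    mapping = np.round(cdf * (levels - 1)).astype(img.dtype)
    return mapping[img]

# A dark image packed into a few low intensities spreads over the full range.
img = np.array([[10, 10, 20], [20, 30, 30]], dtype=np.uint8)
out = equalise(img)
assert out.max() == 255
# Monotonic mapping: the ordering of intensity values is preserved.
assert np.all(np.diff(out.ravel()[np.argsort(img, axis=None)]) >= 0)
```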
[Figure: transformed histogram]
f : I(p) → I_new(p)
These operators alter the intensity histogram. A monotonic f ensures the ordering of intensity values is preserved.
[Figure: intensity mapping function f(a)]
[Figures: original histogram and image; equalised histogram and image]
[Figure: histograms before and after equalisation, over grey levels 0-255]
Colour Images
Humans can distinguish hundreds of thousands of different colour shades and intensities, but only ~100 levels of grey.
Digital images: 8 bit = 2⁸ = 256 values; 3 x 8 bit = 256 x 256 x 256 = 16,777,216 values.
The spectrum of a light source is S(λ): the intensity of light at wavelength λ.
[Figure: the visible spectrum, with increasing wavelength]
Pure light has only one wavelength in it. Normally, light is a mixture of wavelengths, described by the spectrum S(λ). The relationship between colour perception and the composition of the light is much more complicated in this case.
Colour Theory
Every colour on a computer screen is a combination of three primary colours. The primary colours form a basis spanning (almost) the entire space of possible colours. There are 2 sets of primary colours:
- Additive (e.g. computer screens)
- Subtractive (e.g. painting, printing)
Artists mixing paint use the subtractive primaries, because paints absorb all colours of light except those they reflect, and thus subtract light.
Colour Perception
Humans cannot perceive the entire spectrum; instead we sample it using colour-sensitive cone cells in the eye. The responses of these cells determine the raw colour information and brightness:
r = ∫ R(λ) S(λ) F_r(λ) dλ
g = ∫ R(λ) S(λ) F_g(λ) dλ
b = ∫ R(λ) S(λ) F_b(λ) dλ
Colour Constancy
The coloured blocks in each image are the same colour but look different. Post processing in the brain attempts to remove the colour of the light source and find the colour of the object. It uses background to find light source colour.
Colour Spaces
Because different devices have different gamuts, there is a need to select the colour representation matched to the device. There are many different colour specifications:
RGB, normalised RGB; CMY; CMYK; HSV; HSL / HSI; YUV; YIQ; CIE-XYZ; CIE-LAB
Intensity v. Luminance
Notice the difference between intensity and luminance. Intensity is a physical quantity determined by the energy of the signal. Luminance is a product of colour perception and is a purely psychological phenomenon.
Colour Gamuts
A range of colours can be created by selecting 3 (or more) primary colours from this diagram. Such a range is called a colour gamut. 3 colours give a triangle such as the red, green and blue phosphors in a screen. Colours outside the gamut are not displayable.
CIE XYZ
In 1931 the CIE (Commission Internationale de l'Éclairage) defined 3 standard primary colours X, Y, Z that can be added together to make all visible colours. They are based on the response curves of the eye's 3 receptors. X, Y, Z do not physically exist.
Source: http://cit.dixie.edu/vt/reading/gamuts.asp
CIE Chromaticity Diagram: a perceptual map based on responses from people about how colours relate to each other.
- x, y are colour coordinates
- Linear additive colour space
- Mixed colours lie on the line between the colours
- White lies in the centre
- The axis out of the page is brightness
CIE XYZ
Disadvantages:
- Hard to include brightness
- Perceived colour difference is not well correlated with spacing in the colour space
RGB
[Figure: the RGB colour cube, with Red (1,0,0), Green (0,1,0) and Blue (0,0,1) on the axes, Cyan, Magenta and Yellow at the mixed corners, and the greys along the black-white diagonal]
RGB (2)
RGB cannot represent all visible colours! Different colour spaces allow us to separate different components, such as:
- Intensity
- Luminance
- Hue
- Saturation
CIE L*a*b*
While the retina directly registers 3 colour stimuli (red, green and blue), these are later processed and 3 sensations are generated:
a red-green sensation a yellow-blue sensation a brightness sensation
These sensations form the basis of the CIE L*a*b* (or CIE Lab) colour space, developed in 1976.
[Figures: the HSV (hue, saturation, value) and HSL/HSI (hue, saturation, lightness/intensity) colour models]
Illumination Effects
Problem:
Many colour spaces are not robust with respect to changes in lighting.
Normalised (chromaticity) coordinates discard intensity:
r = R / (R + G + B), g = G / (R + G + B)
Alternatively (r, b) or (b, g) could be used.
HS: the first 2 channels of HSV or HSL. AB: the 2nd and 3rd channels of CIELab.
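The point of the normalisation is that scaling R, G, B by the same illumination factor cancels. A minimal sketch assuming numpy (the function name `rg_chromaticity` is illustrative):

```python
import numpy as np

def rg_chromaticity(rgb):
    """Normalised (r, g): invariant to uniform intensity changes."""
    rgb = np.asarray(rgb, float)
    s = rgb.sum(axis=-1, keepdims=True)
    return (rgb / s)[..., :2]  # keep r and g; b = 1 - r - g is redundant

bright = np.array([200.0, 100.0, 100.0])
dim = bright * 0.25  # the same surface under weaker illumination
assert np.allclose(rg_chromaticity(bright), rg_chromaticity(dim))
assert np.allclose(rg_chromaticity(bright), [0.5, 0.25])
```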
129
Sample images of skin are passed to the system (in RGB format). The colour value of every pixel is mapped to a 2D chrominance space. A model is selected describing the distribution of skin colour pixels in chrominance space.
[Figure: 2D chrominance histogram over (a, b), with n discrete bins along each axis]
[Figure: a Gaussian model fitted to the skin-colour distribution in (a, b) chrominance space]
[Figure: classifying pixels of a test image I_test by looking up their chrominance values (a_i, b_i) in the skin model built from I_skin]
Image Analysis
Image / Region-of-Interest statistics:
- Arithmetic mean
- Standard deviation / variance
- Histogram
[Figure: intensity histogram]
Distance metrics between pixels P(i,j) and X(k,l):
Euclidean Distance
D_E(P(i,j), X(k,l)) = √((i − k)² + (j − l)²)
City Block Distance
D_CB(P(i,j), X(k,l)) = |i − k| + |j − l|
Chessboard Distance
D_C(P(i,j), X(k,l)) = max{|i − k|, |j − l|}
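The three metrics above in code (function names are illustrative):

```python
import math

def euclidean(p, x):
    return math.hypot(p[0] - x[0], p[1] - x[1])

def city_block(p, x):
    return abs(p[0] - x[0]) + abs(p[1] - x[1])

def chessboard(p, x):
    return max(abs(p[0] - x[0]), abs(p[1] - x[1]))

# For the same pair of pixels: chessboard <= euclidean <= city block.
P, X = (1, 2), (4, 6)
assert euclidean(P, X) == 5.0   # sqrt(3^2 + 4^2)
assert city_block(P, X) == 7    # 3 + 4
assert chessboard(P, X) == 4    # max(3, 4)
```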
Moments of Objects
Often work with binary images / masks Binary images are particularly useful for
- Identifying objects with distinctive silhouettes, e.g. components on a conveyor in a manufacturing plant
- Recognising text and symbols, e.g. document processing, or interpreting road signs
- Determining the orientation of objects
Area is given by the 0th moment; the centre of mass (centroid) is given by the 1st moments:
A = Σ_{x,y} I(x,y)
x̄ = Σ_{x,y} x I(x,y) / Σ_{x,y} I(x,y)
ȳ = Σ_{x,y} y I(x,y) / Σ_{x,y} I(x,y)
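For a binary mask these sums are one-liners, assuming numpy (the (y = row, x = column) convention is an illustrative choice):

```python
import numpy as np

I = np.zeros((10, 10))
I[2:5, 3:7] = 1.0  # a 3x4 rectangular object in a binary image

ys, xs = np.mgrid[0:10, 0:10]  # row (y) and column (x) coordinate grids

A = I.sum()                 # 0th moment: area
x_bar = (xs * I).sum() / A  # 1st moments: centroid
y_bar = (ys * I).sum() / A

assert A == 12.0
assert x_bar == 4.5 and y_bar == 3.0  # centre of the rectangle
```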
Second central moments:
U_xy = Σ_{x,y} (x − x̄)(y − ȳ) I(x,y)
U_yy = Σ_{x,y} (y − ȳ)² I(x,y)
(and similarly U_xx)
Calculating Moments
Let (x_i, y_i) be all the points on the object in I; then the summations can be represented more compactly:
Σ_i x_i^n y_i^m = Σ_x Σ_y x^n y^m I(x,y)
Moments can be quickly calculated from the following summations:
S = Σ_i 1,  S_x = Σ_i x_i,  S_y = Σ_i y_i,  S_xx = Σ_i x_i²,  S_yy = Σ_i y_i²,  S_xy = Σ_i x_i y_i
It is a simple matter to prove that these are equivalent to the definitions on the previous slide.
Central Moments
Spatial moments are taken from the image origin:
M_mn = Σ_{x,y} x^m y^n I(x,y)
Central moments are taken about the centroid:
U_mn = Σ_{x,y} (x − x̄)^m (y − ȳ)^n I(x,y)
Normalised Moments
Centralised moments are invariant to translation. Normalised moments are invariant to translation and scale:
N_mn = Σ_{x,y} (x − x̄)^m (y − ȳ)^n I(x,y) / (Σ_{x,y} I(x,y))^{(m+n+2)/2}
Image Understanding
Extract features (edges, boundaries, etc.), then get an understanding of the scene:
Image → Image Features → Scene Interpretation → Scene Description → Scene Understanding
Representation
- Thresholding
- Boundary modelling
- Clustering
- Quad-trees
- Texture modelling
- Skeletons
Interpretation
- Statistical
- Syntactical
- Decision trees
- Similarity measures
- Matching
Primitives are extracted from the image as tokens: edge segments, blobs, bars, corners, groups of points. These tokens are then passed to interpretation.
Boundary Representation
Problems with existing edge detectors:
- Edges are fragmented
- The threshold is derived experimentally
Boundaries are linked edges that characterise the shape of an object. Proper representation of object boundaries is important for analysis and synthesis of shapes. Boundaries can be found by tracing the connected edges.
Connect the edge segments in regions where the gradient exceeds a second threshold t₂.
[Figure: example of structural representation using the curvature of segments resulting after boundary splitting]
y = ax + b
A line is characterised by only two parameters, a and b. An edge can be represented by a line through its two end points, found by solving the two-equation system for a and b.
[Figure: a line model fitted to an original edge]
This method does not ensure the best fit for the intermediate points.
Minimise the vertical errors:
E_y = Σ_{i=1}^{N} (y_i − ŷ_i)² = Σ_{i=1}^{N} [y_i − (a x_i + b)]²
Solution:
∂E/∂a = 0, ∂E/∂b = 0
giving
b = (1/N)(Σ_{i=1}^{N} y_i − a Σ_{i=1}^{N} x_i)
a = (N Σ_i x_i y_i − Σ_i x_i Σ_i y_i) / (N Σ_i x_i² − (Σ_i x_i)²)
Similarly, minimising the horizontal errors:
E_x = Σ_{i=1}^{N} (x_i − x̂_i)² = Σ_{i=1}^{N} [x_i − (y_i − b)/a]²
E = Σ_i d_i²
Here the error is evaluated in terms of the perpendicular distance d_i between the point (x_i, y_i) and the optimal line.
x̄ = (1/N) Σ_i x_i,  ȳ = (1/N) Σ_i y_i
Form the scatter (covariance) matrix of the points:
C = [ Σ_i (x_i − x̄)²          Σ_i (x_i − x̄)(y_i − ȳ) ;
      Σ_i (x_i − x̄)(y_i − ȳ)  Σ_i (y_i − ȳ)² ]
Calculate the eigenvalues (2) and eigenvectors (2). The eigenvector corresponding to the largest eigenvalue passes through the centre of the data set and shows its orientation.
[Figure: initial edges]
p_x = (p_x(1), p_x(2), p_x(3), p_x(4))^T = T C, where the rows of T are built from the parameter values t₁, t₂, t₃, t₄
C = T⁻¹ p_x
Matrix T should be invertible.