
Image Representation

Sunday, March 20, 2016

10:17 AM

Image Model

Digital Image

- Discrete representation of image

- Discretized element:
Spatial resolution (x, y-coordinate)

- f(x, y) = i(x, y) r(x, y)


f(x, y): Intensity

Intensity (quantization)

i(x, y): Illumination


Amount of source illumination
incident on the viewing scene
0 ≤ i(x, y) < +∞

- Storage requirement: M * N * k
M = Image width
N = Image height
k = No. of bits required to represent grey lvl

r(x, y): Reflectance


Amount of source illumination
reflected by objects in scene
0 ≤ r(x, y) ≤ 1
0 = Total absorption
1 = Total reflectance
- i, r are continuous functions → f(x, y) can be
continuous
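The storage figure M * N * k can be sanity-checked with a two-line computation (a sketch; the example dimensions are mine):

```python
# Storage requirement for a digital image: M * N * k bits
# M = width, N = height, k = bits per gray level (k = 8 gives L = 256 levels)
def storage_bits(M, N, k):
    return M * N * k

# e.g. a hypothetical 1024x768 image at 8 bits/pixel
bits = storage_bits(1024, 768, 8)
print(bits // 8, "bytes")  # 786432 bytes
```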

COMP 4421 Page 1

Image Enhancement in Spatial Domain


Sunday, March 20, 2016

10:58 AM

Spatial Domain Enhancement

g(x, y) = T[f(x, y)]


- f(x, y) = Input image
- g(x, y) = Output image
- T = Operation on f (Intensity Transformation function)
Defined over neighborhood
Pixel, 3x3 square, rectangular sub-area
Operated on single/set of images
g(x, y) = T[f1(x, y), f2(x, y), …, fn(x, y)]


Gray-Level Transformation
Sunday, March 20, 2016

11:22 AM

Typical transform
General Properties

- Neighborhood: 1x1 (pixel)


- Notation: s = T(r)

Type | Def.

- Inverse: s = (L - 1) - r
- Log: s = c log(1 + r)
  c = Scale factor → Rescale intensity to range [0, L - 1]
- Power (Gamma correction): s = c r^γ
  c = Scale factor
- Window/Gray-level Slicing: Operate on a gray window [ri, ri+x]
- Contrast Stretching: For (rmin, rmax):
  r < rmin → s = 0
  r > rmax → s = L - 1
  rmin ≤ r ≤ rmax: Linearly map to full range [0, L - 1]
  Linear Mapping: [a1, b1] → [a2, b2]

Notation:
- r = Gray lvl of f(x, y), ∀ x, y
- s = Gray lvl of g(x, y), ∀ x, y
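The typical transforms above can be sketched in a few lines of numpy (a minimal sketch assuming 8-bit images, L = 256; the function names are mine):

```python
import numpy as np

L = 256  # number of gray levels

def negative(r):                      # Inverse: s = (L - 1) - r
    return (L - 1) - r

def log_transform(r):                 # Log: s = c log(1 + r), c rescales to [0, L - 1]
    c = (L - 1) / np.log(L)
    return c * np.log1p(r)

def gamma_transform(r, gamma):        # Power: s = c r^gamma, c rescales to [0, L - 1]
    c = (L - 1) / (L - 1) ** gamma
    return c * np.asarray(r, dtype=float) ** gamma

def contrast_stretch(r, rmin, rmax):  # Linearly map [rmin, rmax] -> [0, L - 1]
    s = (np.asarray(r, dtype=float) - rmin) / (rmax - rmin) * (L - 1)
    return np.clip(s, 0, L - 1)

r = np.array([0, 64, 128, 255])
print(negative(r))  # [255 191 127   0]
```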


Image Histogram
Sunday, March 20, 2016

11:38 AM

Usage

Construction

- Gray lvl distribution (can be modelled by stat.
distribution) → Helps identify intensity threshold

- Input: Digital image with gray lvl [0, L - 1]


- Output: p(rk) = nk / N (Prob of occur of gray lvl rk)

rk = kth gray lvl


nk = No of pixel with kth gray lvl
N = Total no of pixels
k = 0, 1, …, (L - 1)

- Brightness

- Constrast

Histogram Equalization

- Purpose:
Approx flat histogram (uniform distribution)
Equally many pixel at every gray lvl

Global vs Local Histogram Equalization


- Advantage:
Significantly improve image appearance (perform contrast stretching
+ compressing simultaneously)
Automatic, good pre-processing step

- Methods:
sk = Tk(L - 1) = (p(r1) + p(r2) + … + p(rk))(L - 1)
Tk = p(r1) + p(r2) + … + p(rk)
sk: New gray lvl for kth gray lvl
(L - 1): Max gray lvl

Pros:
- Global: Runs fast
- Local:
  Reveals more detailed structure (adaptive to local statistics)
  Suitable for images with varying illumination
Example (N = 4096, Max gray lvl = 7):

nk      300    700    800    900    500    400    196    300
p(rk)   0.0732 0.1709 0.1953 0.2197 0.1221 0.0977 0.0479 0.0732
Tk      0.0732 0.2441 0.4394 0.6591 0.7812 0.8789 0.9268 1
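The worked example above can be reproduced numerically; a minimal numpy sketch of histogram equalization (the array names are mine):

```python
import numpy as np

# nk = pixel counts per gray level from the example; N = 4096, max gray level = 7
nk = np.array([300, 700, 800, 900, 500, 400, 196, 300])
N = nk.sum()                     # 4096
p = nk / N                       # p(rk)
T = np.cumsum(p)                 # Tk = p(r1) + ... + p(rk)
s = np.round(T * 7).astype(int)  # sk = Tk * (L - 1), rounded to nearest level

print(np.round(p, 4))  # matches the p(rk) row of the table
print(s)               # [1 2 3 5 5 6 6 7]: new gray level per original level
```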


Cons:
- Global:
  Might greatly enhance noise
  Might fail to recover detailed structures due to low contrast
- Local:
  Slow
  Does not produce histogram as flat as global
  Might cause discontinuity (2 pixels initially of the same
  intensity might be transformed differently)

Local Enhancement
Sunday, March 20, 2016

Linear filtering using mask

12:16 PM

- Usage: Apply mask on pixel z5 (Convolution)


Image Averaging

- gi(x, y) = ith observed noisy image, i = 1, 2, …, K
- f(x, y) = True image
- ηi(x, y) = Zero-mean, random Gaussian noise
- gi(x, y) = f(x, y) + ηi(x, y)
- ḡ(x, y) = (1/K) Σ gi(x, y)
- Properties: K large → ḡ close to f, noise variance small

Linear filtering:
- Put mask on image such that z5 is in the center of the mask
- New intensity for z5: R = w1z1 + w2z2 + … + w9z9
- Common filters:

Smoothing/Low-pass filter:
- Usage:
  Blurring: Remove small image details
  Noise reduction
  Freq filtering: Stop high, allow low freq.
- Design principle:
  All wi > 0
  Scale factor = (Σ wi)^-1

Sharpening/High-pass filter:
- Usage:
  Highlight fine details
  Enhance blurred details
  Freq filtering: Stop low, allow high freq.
- Design principle:
  Near center: Positive wi
  Near outer: Negative wi
  Σ wi = 0
  NOTE: Result might be < 0 → Scale/Clip gray lvl back to [0, L - 1]

Non-linear Filter: Median Filter
- Operation: On pixel z:
  Put window such that its center is at z
  Set z = Median of windowed neighboring pixels
  Example: Neighbor sort = {2, 2, 3, 3, 4, 4, 8, 9, 10} → zNew = Median = 4
- Property: Convergence (Performed many times → Image eventually will not change)
- Usage:
  Remove impulse noise
  Avoid excessive smoothing
  Preserve edges
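The median operation above can be sketched directly in numpy (a minimal sketch; border handling by edge replication is my choice, and the toy image is built so the center window matches the sorted-neighbor example):

```python
import numpy as np

def median_filter(img, size=3):
    """Set each pixel to the median of its size x size neighborhood
    (edges handled by replicating the border)."""
    pad = size // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.empty_like(img)
    H, W = img.shape
    for y in range(H):
        for x in range(W):
            out[y, x] = np.median(padded[y:y + size, x:x + size])
    return out

# Center pixel is impulse-like; its 3x3 neighborhood sorts to
# {2, 2, 3, 3, 4, 4, 8, 9, 10} as in the example above
img = np.array([[2, 3, 8],
                [2, 9, 4],
                [3, 10, 4]])
print(median_filter(img)[1, 1])  # 4
```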



First Order Derivatives


Sunday, March 20, 2016

6:49 PM

Magnitude calculation using 3x3 mask

Gradient
- Function of x, y
- Represents intensity change over x-, y-axis
- Magnitude:
  Precise: mag(∇f) = sqrt(Gx² + Gy²)
  Fast approx: mag(∇f) ≈ |Gx| + |Gy|

3x3 neighborhood of z5:

z1 z2 z3
z4 z5 z6
z7 z8 z9

- Basic method: Mag at z5
- Cross-difference method
- Linear mask:
  Prewitt operator:
  mag(∇f) ≈ |GradientX| + |GradientY|
  = |(z3 + z6 + z9) - (z1 + z4 + z7)| + |(z7 + z8 + z9) - (z1 + z2 + z3)|
  Sobel operator: Similar, but center pixels weighted by 2
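The fast |Gx| + |Gy| approximation on a 3x3 patch can be sketched as follows (the mask constants match the Prewitt sums above; function names are mine):

```python
import numpy as np

# Prewitt masks; Sobel is the same with the center row/column weighted by 2
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
PREWITT_Y = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]])

def prewitt_mag(z):
    """Fast gradient magnitude |Gx| + |Gy| at the center z5 of a 3x3 patch."""
    gx = np.sum(PREWITT_X * z)  # (z3+z6+z9) - (z1+z4+z7)
    gy = np.sum(PREWITT_Y * z)  # (z7+z8+z9) - (z1+z2+z3)
    return abs(gx) + abs(gy)

# Vertical step edge: left columns 0, right column 10 -> strong Gx, zero Gy
patch = np.array([[0, 0, 10],
                  [0, 0, 10],
                  [0, 0, 10]])
print(prewitt_mag(patch))  # 30
```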


Image Enhancement in Frequency Domain


Sunday, March 20, 2016

6:50 PM

1-D Discrete Fourier Transform (DFT)

Fourier Transform

- Any continuous function = Sum of sine functions of different freq:
  f(t) = g(t1) + g(t2) + g(t3) + …
  g(tk): Sine function with (Freq, Ampl, Phase diff) = (fk, Ak, φk)
- Let H(f) = Function such that H(fk) can represent (Ak, φk)
  Fourier Transform: Given f(t), find such H(f)
  H(fk) = bk (bk is complex)
  |bk| = Ak
  arg(bk) = φk

- DFT: F(u) = Σx f(x) e^(-j2πux/N), u = 0, 1, 2, …, (N - 1)
- Inverse: f(x) = (1/N) Σu F(u) e^(j2πux/N), x = 0, 1, 2, …, (N - 1)

2-D Discrete Fourier Transform

- DFT: F(u, v) = Σx Σy f(x, y) e^(-j2π(ux/M + vy/N))
  u = 0, 1, …, M - 1
  v = 0, 1, …, N - 1
- Inverse: f(x, y) = (1/MN) Σu Σv F(u, v) e^(j2π(ux/M + vy/N))
- Separability of 2D DFT: 1-D DFT on rows → 1-D DFT on cols

Filtering in Frequency Domain

1. Pre-processing input image f(x, y)
2. Fourier transform:
   f(x, y) → F(u, v)
3. Apply filter function:
   G(u, v) = F(u, v)H(u, v)
   (element-by-element multiplication)
   H(u, v) = Filter in freq. domain
4. Inverse Fourier transform:
   G(u, v) → g(x, y)
5. Post-processing → Output image g(x, y)
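Steps 2-4 can be sketched with numpy's FFT (a minimal sketch; H is assumed to be centered, i.e. low frequencies in the middle):

```python
import numpy as np

def filter_frequency_domain(f, H):
    """F = DFT(f), G = F .* H (element-by-element), g = IDFT(G).
    H must be the filter with its low frequencies at the center."""
    F = np.fft.fftshift(np.fft.fft2(f))    # center the spectrum
    G = F * H                              # apply the filter
    g = np.fft.ifft2(np.fft.ifftshift(G))  # back to spatial domain
    return np.real(g)

# All-pass filter (H = 1 everywhere) must reproduce the input
f = np.random.rand(8, 8)
g = filter_frequency_domain(f, np.ones((8, 8)))
print(np.allclose(f, g))  # True
```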


Smoothing Frequency-Domain Filters


Sunday, March 20, 2016

11:39 PM

Filter: Ideal Low-Pass Filter (ILPF) | Butterworth Low-Pass Filter (BLPF) | Gaussian Low-Pass Filter (GLPF)

Main formulas:
- ILPF: H(u, v) = 1 if D(u, v) ≤ D0; 0 if D(u, v) > D0
- BLPF: H(u, v) = 1 / [1 + (D(u, v)/D0)^(2n)]
- GLPF: H(u, v) = e^(-D²(u, v) / (2D0²))

D(u, v): Distance from (u, v) to origin of freq plane
D0 ≥ 0 (cutoff freq)
Relating concepts:
- Image power: P(u, v) = |F(u, v)|² = R²(u, v) + I²(u, v)
  R(u, v): Real part of DFT
  I(u, v): Imaginary part of DFT
- ILPF with radius D0 enclosing α% of power:
  α = 100 · [Σ P(u, v) / Σtotal P(u, v)]
  (u, v) summed over points inside circle centered (0, 0) with radius D0
- BLPF slight variation: Limit the min. value of H(u, v)
- GLPF: Commonly σ = D0 → At cutoff freq D = D0: H = e^(-1/2) ≈ 0.607
Pros & Cons:
- ILPF limitation: Sharp cut-off of freq → Ringing artifact
- Usual BLPF: No sharp discontinuity of freq:
  n = 1: No ringing artifact
  1 < n ≤ 2: Ringing artifact hardly visible
- Slightly-varied BLPF: Smoother image (although allowing more high freq)

Illustration of ILPF


- GLPF:
  Most popular low-pass filter
  No ringing artifact

Left: Input image; Right: Its Fourier spectrum
Outside → Inside: Circles of radius 5, 15, 30, 80, 230 enclosing 92, 94.6, 96.4, 98, 99.5% of image power respectively
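The low-pass formulas above can be built as centered frequency-domain masks (a minimal sketch; the even-size center convention and function names are mine):

```python
import numpy as np

def distance_grid(M, N):
    """D(u, v): distance from each (u, v) to the center of the freq plane."""
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    V, U = np.meshgrid(v, u)
    return np.sqrt(U**2 + V**2)

def ilpf(M, N, D0):
    """Ideal low-pass: 1 inside radius D0, 0 outside (sharp cutoff -> ringing)."""
    return (distance_grid(M, N) <= D0).astype(float)

def glpf(M, N, D0):
    """Gaussian low-pass: H = exp(-D^2 / (2 D0^2)), no ringing."""
    D = distance_grid(M, N)
    return np.exp(-D**2 / (2 * D0**2))

H = glpf(64, 64, D0=16)
print(H[32, 32])            # 1.0 at zero frequency
print(round(H[32, 48], 3))  # 0.607 = e^(-1/2) at D = D0
```

The corresponding high-pass filters of the next section follow as Hhp = 1 - Hlp.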


Sharpening Frequency-Domain Filters


Sunday, March 20, 2016

11:39 PM

Low-Pass vs. High-Pass Filter


- Low-pass: Suppress high freq → Blur image
- High-pass: Suppress low freq → Highlight details
  Hhp(u, v) = 1 - Hlp(u, v)

Common Sharpening/High-Pass Filters

Filter

Ideal High-Pass Filter

Butterworth High-Pass Filter

Gaussian High-Pass Filter

(IHPF)

(BHPF)

(GHPF)

Main formulas:
- IHPF: H(u, v) = 0 if D(u, v) ≤ D0; 1 if D(u, v) > D0
- BHPF: H(u, v) = 1 / [1 + (D0/D(u, v))^(2n)]
- GHPF: H(u, v) = 1 - e^(-D²(u, v) / (2D0²))

D(u, v): Distance from (u, v) to origin of freq plane
D0 ≥ 0; Cutoff freq (limits the max. of H(u, v)): At D = D0

- High-freq emphasis: Preserve low freq but also amplify high freq:
  H_hfe(u, v) = a + b·Hhp(u, v), 0 ≤ a ≤ 1, b > a

Pros & Cons:
- IHPF: Ringing artifact
- BHPF: Smaller n → Less ringing artifact

Illustration of High-Pass Filter

- GHPF: No ringing artifact

- BHPF with High-freq Emphasis:


Linear Image Restoration & Filtering


Sunday, March 20, 2016

6:50 PM

Noise

- Source:

CCD-Camera: Light lvl, sensor temp,


Magnetic resonance: Magnetic field strength
Corruption in transmission channel
- Noise modelling assumptions:
  Independent of spatial coordinate
  Uncorrelated with image
  Use statistical model:
  Described by Prob. Density Function (PDF)
  PDF identical for all pixels
  Noise signal is random

Statistical Noise Model

Noise Model Formula

PDF Graph

Gaussian
= Mean
= sd

Rayleigh

Gamma


Image Illustration

Exponential

Uniform

Impulse
(Salt &
Pepper)
For b > a:
- Gray lvl b: Light dots (Salt noise)
- Gray lvl a: Dark dots (Pepper noise)
- If Pa = 0 OR Pb = 0: Unipolar noise
- If Pa ≠ 0 AND Pb ≠ 0, especially Pa ≈ Pb:
  Salt & pepper granules (small pieces) randomly distributed all over image

Noise Model Parameter Estimation

- Choose a region of interest (ROI) S from the image
- Construct histogram of ROI → Obtain zi's (gray lvls) and p(zi)
- Param calculation over S:
  Mean μ = Σ zi p(zi)
  Variance σ² = Σ (zi - μ)² p(zi)


Mean Filters
Monday, March 21, 2016

10:38 AM

Main formulas (mask SXY of size m×n, centered at (x, y)):
- Arithmetic Mean: f̂(x, y) = (1/mn) Σ g(s, t)
  Similar to using a mask of n rows, m cols with equal weights
- Geometric Mean: f̂(x, y) = [Π g(s, t)]^(1/mn)
- Harmonic Mean: f̂(x, y) = mn / Σ [1/g(s, t)]
- Contraharmonic Mean: f̂(x, y) = Σ g(s, t)^(Q+1) / Σ g(s, t)^Q
  Q = Order of filter

Notation:
SXY = Mask area (centered at the operating pixel)
g(x, y) = Corrupted image
f̂(x, y) = Restored image

Properties:
- Arithmetic Mean: Works well for random noise
- Geometric Mean: Compared to Arithmetic Mean Filter:
  Similar smoothing effect
  Tends to lose less image detail
  Works well for random noise
- Harmonic Mean:
  Salt noise: Works well
  Pepper noise: Fails
  Other noises: Works well
- Contraharmonic Mean:
  Q > 0 → Eliminates pepper noise
  Q < 0 → Eliminates salt noise
  Q = 0 → Arithmetic Mean Filter
  Q = -1 → Harmonic Mean Filter

Illustration
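The Q-behavior of the contraharmonic mean can be sketched on a single mask area (a minimal sketch; the window values are mine):

```python
import numpy as np

def contraharmonic(window, Q):
    """Contraharmonic mean of a mask area: sum(g^(Q+1)) / sum(g^Q).
    Q = 0 -> arithmetic mean; Q = -1 -> harmonic mean."""
    g = window.astype(float)
    return np.sum(g ** (Q + 1)) / np.sum(g ** Q)

window = np.array([100.0, 100.0, 100.0, 1.0])   # one pepper-like pixel
print(round(contraharmonic(window, Q=1.5), 1))  # ~100.0: pepper suppressed
print(contraharmonic(window, Q=0))              # 75.25: plain arithmetic mean
```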


Order-Statistics Filters
Monday, March 21, 2016

10:39 AM

Main formulas (mask SXY of size m×n):
- Median: f̂(x, y) = median{g(s, t)}
- Max: f̂(x, y) = max{g(s, t)} (Find brightest point in SXY)
- Min: f̂(x, y) = min{g(s, t)} (Find darkest point in SXY)
- Midpoint: f̂(x, y) = ½[max{g(s, t)} + min{g(s, t)}]
- Alpha-trimmed Mean: Delete d/2 lowest & d/2 highest grey lvl values in SXY:
  f̂(x, y) = (1/(mn - d)) Σ gr(s, t)
  gr(s, t) = Undeleted grey lvls

Properties:
- Median:
  Excellent noise reduction
  Considerably less blurring than linear filters of same size
  Preserves edges
  Convergence
- Midpoint: Works best for randomly distributed noise (Gaussian, uniform noise)
- Alpha-trimmed Mean:
  Works best when multiple types of noise are combined in a single image
  (Gaussian + Salt & Pepper, …)
  d = mn - 1 → Median filter

Illustration


Adaptive Filters
Monday, March 21, 2016

Filter

10:40 AM

Adaptive, Local Noise Reduction
- Calculate mean mL & variance σL² within SXY
- Filter: f̂(x, y) = g(x, y) - (ση²/σL²)[g(x, y) - mL]
  SXY = Neighbouring pixels in IMAGE
  ση² = Noise variance, estimated from pixels S'XY in a pure NOISE REGION

Adaptive Median
- Within neighboring area SXY:
  ZXY = Grey lvl at (x, y)
  Zmin = Min grey lvl, Zmax = Max grey lvl
  Zmed = Median grey lvl
  Smax = Max. allowed size of SXY
- Start with min. window size
- Lvl A:
  A1 = Zmed - Zmin, A2 = Zmed - Zmax
  If A1 > 0 & A2 < 0: Go to Lvl B
  Else: Window size ↑
    If window size ≤ Smax: Repeat Lvl A
    Else: Output ZXY
- Lvl B:
  B1 = ZXY - Zmin, B2 = ZXY - Zmax
  If B1 > 0 & B2 < 0: Output ZXY
  Else: Output Zmed

Properties (Adaptive, Local Noise Reduction):
- ση² = 0 → No filtering: f̂ = g
- σL² ≈ ση² → f̂ ≈ mL (similar to Arithmetic Mean Filter)
- σL² >> ση² → f̂ close to g(x, y) → Preserves edges, boundaries

Properties (Adaptive Median):
- Lvl A: Determine whether Zmed is an impulse (noise):
  Zmin < Zmed < Zmax (A1 > 0, A2 < 0) → NOT impulse
  Otherwise: Window size ↑
- Lvl B: Determine whether ZXY is an impulse (noise):
  Zmin < ZXY < Zmax (B1 > 0, B2 < 0) → Not impulse → Output ZXY
  Otherwise: Output Zmed


Illustration
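The Level A / Level B decision procedure can be sketched for a single pixel (a minimal sketch; the padding scheme and toy image are mine):

```python
import numpy as np

def adaptive_median(padded, y, x, s_max=7):
    """Level A / Level B logic for the pixel at (y, x) of a padded image,
    growing the window from 3x3 up to s_max x s_max."""
    size = 3
    z_xy = padded[y, x]
    while True:
        k = size // 2
        win = padded[y - k:y + k + 1, x - k:x + k + 1]
        z_min, z_max, z_med = win.min(), win.max(), np.median(win)
        if z_min < z_med < z_max:        # Level A: Zmed is not an impulse
            if z_min < z_xy < z_max:     # Level B: Zxy is not an impulse
                return z_xy              # keep the original value
            return z_med                 # Zxy is an impulse -> replace
        size += 2                        # grow the window, repeat Level A
        if size > s_max:
            return z_xy

# Center pixel (255) is salt noise inside a smooth region
img = np.pad(np.array([[10, 11, 12],
                       [11, 255, 12],
                       [12, 13, 11]]), 3, mode='edge')
print(adaptive_median(img, 4, 4))  # 12.0 (impulse replaced by median)
```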


Periodic Noise Reduction by Frequency Domain


Filtering
Sunday, March 20, 2016

6:50 PM

Noise-Reduction Filter

Main type

Band-Reject

Notch Reject

Operation

Remove a band of freq about origin of Fourier spectrum

Remove freq in predefined neighborhood about center freq (u0, v0)

Filter
visualization

Ideal formula:
H(u, v) = 0 if D0 - W/2 ≤ D(u, v) ≤ D0 + W/2; 1 otherwise
W = Width of band
D0 = Radial center

Butterworth formula:
H(u, v) = 1 / [1 + (D(u, v)W / (D²(u, v) - D0²))^(2n)]

Gaussian formula:
H(u, v) = 1 - e^(-½[(D²(u, v) - D0²) / (D(u, v)W)]²)


Illustration


Image Restoration Process


Monday, March 21, 2016

10:41 AM

Minimum mean square error (Wiener) Filtering

- Aim: Minimize mean square error e² = E{(f - f̂)²}
- Assumptions:
  η(x, y) & f(x, y) statistically uncorrelated
  η has zero mean
  H: linear & known
  Power spectrum of noise Sη(u, v) = |N(u, v)|² known
  Power spectrum of image Sf(u, v) = |F(u, v)|² known
- Estimation, N & F known (→ Sη & Sf known):
  F̂(u, v) = [H*(u, v)Sf(u, v) / (Sf(u, v)|H(u, v)|² + Sη(u, v))] G(u, v)
  H*(u, v) = Conjugate of H(u, v)
  H(u, v) = a + bi → H*(u, v) = a - bi
- N & F unknown:
  F̂(u, v) = [(1/H(u, v)) · |H(u, v)|² / (|H(u, v)|² + K)] G(u, v), K = const
- Properties: Incorporates both H & noise into restoration process
  Does not suffer from zero-values of H(u, v)

Model of Image Restoration Process

Term | Spatial Domain | Frequency Domain
- Input image: f(x, y) | F(u, v)
- Degradation function: h(x, y) | H(u, v)
- Additive noise: η(x, y) | N(u, v)
- Degraded image: g(x, y) = h(x, y) * f(x, y) + η(x, y) | G(u, v) = H(u, v)F(u, v) + N(u, v)
- Restored image: f̂(x, y) | F̂(u, v)

* NOTE:
h(x, y) * f(x, y): Matrix convolution (using mask)
H(u, v)F(u, v): Element-by-element multiplication

Modelling steps:
1. Input image
2. Apply Degradation function + Noise
3. Apply Restoration filter
4. Output image

Inverse Filter

- Assumption: H known
- Estimation: F̂(u, v) = G(u, v)/H(u, v) = F(u, v) + N(u, v)/H(u, v)
- Properties:
  If N not known exactly → F not recovered exactly
  If H(u, v) very small → N/H very large, dominates calculation → bad estimation
  Solutions:
  Limit freq near origin → Try avoiding very small H(u, v)
  Use const in denominator

Geometric Mean Filter

F̂(u, v) = [H*/|H|²]^α [H*/(|H|² + β(Sη/Sf))]^(1-α) G(u, v)
α = 1 → Inverse filter
α = 0, β = 1 → Standard Wiener filter

Geometric Transformations
Monday, March 21, 2016

10:41 AM

Gray-Level Assignment

Spatial Transformation

1. Distortion Correction:
- f(x, y) unknown ← g(x', y') known
- Integral f(x, y) → Non-integral g(x', y')
- Grey lvl of f(x, y) = ?
- Solution 1: Nearest neighbor approach:
  Map non-integral g(x', y') → Nearest integral g(x'', y'')
  Set f(x, y) = g(x'', y'')
- Solution 2: Bilinear Interpolation:
  In g: Get 4 integral points nearest to g(x', y')

Spatial transformation calculation:
x' = r(x, y) = c1x + c2y + c3xy + c4
y' = s(x, y) = c5x + c6y + c7xy + c8
Need 4 pairs (x, y) ↔ (x', y') to estimate all coeff

2. Distortion:
- f(x, y) known → g(x', y') unknown
- Integral f(x, y) → Non-integral g(x', y')
- Grey lvl of g(x', y') = ?
- Solution: Nearest neighbor approach:
  Map non-integral g(x', y') → Nearest integral g(x'', y'')
  Set g(x'', y'') = f(x, y)
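Bilinear interpolation from the 4 nearest integral neighbors can be sketched as a weighted average (a minimal sketch; the weight form is the standard closed-form solution of the 4-coefficient fit above):

```python
import numpy as np

def bilinear(g, x, y):
    """Gray level at non-integral (x, y): fit v = a*x + b*y + c*x*y + d
    through the 4 nearest integral neighbors of g (closed-form weights)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * g[y0, x0] +
            dx * (1 - dy) * g[y0, x0 + 1] +
            (1 - dx) * dy * g[y0 + 1, x0] +
            dx * dy * g[y0 + 1, x0 + 1])

g = np.array([[0.0, 10.0],
              [20.0, 30.0]])
print(bilinear(g, 0.5, 0.5))  # 15.0: average of the 4 corners
```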


Non-Linear Filtering
Tuesday, March 22, 2016

1:12 AM

Linear Diffusion Filtering

Concept of Diffusion (Fick's Laws)

- Assumption:
Intensity diffuses isotropicaly (Diffusion Intensity grad)

D=I

- 1st Law:
Max flux Concentration gradient
(Diffusion) (Change in concentration)

- Equations:

Initially:

Low-pass Gaussian filter


- 1D Implementation:

- 2nd Law:
  Rate of concentration accumulation within volume
  ∝ Change of local concentration gradient

- Limitation:
  Blurs important features (edges, …)
  Noise reduced as t ↑ → Signal smoother, but edges dislocated

- Diffusion equation (Heat equation, 1st + 2nd laws combined):
  1D: ∂u/∂t = ∂/∂x (D ∂u/∂x)
  2D: ∂u/∂t = div(D ∇u)
  D = Diffusivity (const/function)
  If D is const: ∂u/∂t = D Δu → Equivalent to low-pass Gaussian filtering

Non-Linear Isotropic Diffusion Filtering

- Aim: Only smooth WITHIN regions, NOT ACROSS boundaries
- How: Diffusivity function:
  Zero on boundary
  Non-zero inside region

Non-Linear Anisotropic Diffusion Filtering

- Concept:
  Smoothing/Reduce noise ALONG edges, not only within each region
  Diffusion ∝ Intensity grad
- How: Diffuse along grad & tangent vectors of the Gaussian-smoothed image uσ:
  uσ = Gσ * u
- 2D Implementation:

Grad vector

Tangent vec

w2 very large, w1 very small → Smoothing along edge achieved
* NOTE: σ → 0: anisotropic → isotropic diffusion
- Equations:
  D: Matrix with eigenvectors v1, v2 & eigenvalues λ1, λ2
  At boundary, λ1 ≈ 0


Morphological Image Processing


Sunday, March 20, 2016

6:50 PM

Basic Binary Morphological Operation


Set Theory Basis (A, B ⊆ Z²)

- Empty set: A = ∅
- Union: A ∪ B = {w | w ∈ A or w ∈ B}
- Intersection: A ∩ B = {w | w ∈ A and w ∈ B}
- Subset: A ⊆ B ⇔ ∀w ∈ A: w ∈ B
- Complement: Aᶜ = {w | w ∉ A}
- Difference: A - B = {w | w ∈ A, w ∉ B} = A ∩ Bᶜ
- Reflection: B̂ = {w | w = -b, for b ∈ B}
- Translation by z: (A)z = {c | c = a + z, for a ∈ A}

Dilation:
- Def: A ⊕ B = {z | (B̂)z ∩ A ≠ ∅}
  Set of all displacements z such that B̂ (translated by z) and A have overlapping elements
- How to use: Put B̂ on top of A; if they overlap → Mark CENTER of B̂
- Application: Bridging gaps in broken characters

Erosion:
- Def: A ⊖ B = {z | (B)z ⊆ A}
  Set of all displacements z such that B (translated by z) is contained in A
- How to use: Put B on top of A; if B is totally inside A → Mark CENTER of B
- Application: Eliminate irrelevant details in image
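The two definitions can be sketched directly as window tests over a padded 0/1 array (a minimal sketch; treating the center of B as its origin is my convention):

```python
import numpy as np

def dilate(A, B):
    """Binary dilation: mark z wherever reflected B translated to z overlaps A."""
    H, W = A.shape
    kh, kw = B.shape[0] // 2, B.shape[1] // 2
    P = np.pad(A, ((kh, kh), (kw, kw)))
    Bref = B[::-1, ::-1]                      # reflection B-hat
    out = np.zeros_like(A)
    for y in range(H):
        for x in range(W):
            out[y, x] = np.any(P[y:y + B.shape[0], x:x + B.shape[1]] & Bref)
    return out

def erode(A, B):
    """Binary erosion: mark z wherever B translated to z fits inside A."""
    H, W = A.shape
    kh, kw = B.shape[0] // 2, B.shape[1] // 2
    P = np.pad(A, ((kh, kh), (kw, kw)))
    out = np.zeros_like(A)
    for y in range(H):
        for x in range(W):
            win = P[y:y + B.shape[0], x:x + B.shape[1]]
            out[y, x] = np.all(win[B == 1] == 1)
    return out

A = np.zeros((5, 5), dtype=int); A[2, 2] = 1
B = np.ones((3, 3), dtype=int)
print(dilate(A, B).sum())             # 9: single pixel grows to a 3x3 block
print(erode(dilate(A, B), B).sum())   # 1: erosion shrinks it back
```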


Advance Binary Morphological Operations


Tuesday, March 22, 2016

10:18 PM

Operation

Def

Effect

Properties

Opening:
- Def: A ∘ B = (A ⊖ B) ⊕ B (Erosion → Dilation)
- Alternative: A ∘ B = ∪{(B)z | (B)z ⊆ A}
  (Union of all translates of B fitting into A)
- Effect: Smooth object contour from INSIDE
  (Analogy: Imagine rolling a ball inside the object)
- Properties:
  Convergence: (A ∘ B) ∘ B = A ∘ B
  Subimage: A ∘ B ⊆ A

Closing:
- Def: A • B = (A ⊕ B) ⊖ B (Dilation → Erosion)
- Effect: Smooth object contour from OUTSIDE
  (Analogy: Imagine rolling a ball outside the object)
- Properties:
  Convergence: (A • B) • B = A • B
  Subimage: A ⊆ A • B
  Dual: (A • B)ᶜ = Aᶜ ∘ B̂

Boundary Extraction:
- β(A) = A - (A ⊖ B)
  (Difference between object & eroded object)

Region Filling:
- Effect: Fill inside bounded region with 1
- Begin with point p inside boundary
- Procedure:
  X0 = p
  Xk = (Xk-1 ⊕ B) ∩ Aᶜ
  Stop when Xk = Xk-1
- Why Aᶜ: Avoid floodfill OUTSIDE boundary

Connected components:
- Begin with point p inside object → Fill out entire object with 1
- Procedure:
  X0 = p
  Xk = (Xk-1 ⊕ B) ∩ A
  Stop when Xk = Xk-1
- Why A: Avoid going outside object
- Black region = 1, NOT found by algorithm
- Striped region = 1, FOUND by algorithm

Skeleton

1. Definition: Skeleton S(A) of set A:
   a. If:
      i. z ∈ S(A)
      ii. (D)z ⊆ A: Largest disk, centered at z
   b. Then:
      i. ∄ D': D' ⊆ A, D' disk (not necessarily centered at z) larger than (D)z
      ii. (D)z = Max. disk
      iii. (D)z touches A's boundary at ≥ 2 places

2. Skeleton construction: B = Structuring element
   Sk(A) = (A ⊖ kB) - ((A ⊖ kB) ∘ B)
   A ⊖ kB = ((…(A ⊖ B) ⊖ B) ⊖ …) ⊖ B
   (k successive erosions of A by B)
   K = max{k | A ⊖ kB ≠ ∅}
   (Last iterative step before A erodes to the empty set)

3. Image reconstruction: Starting with Sk(A), K, B:
   (Sk(A) ⊕ kB) = ((…(Sk(A) ⊕ B) ⊕ B) ⊕ …) ⊕ B
   (k successive dilations of Sk(A) by B)


Gray-scale Morphological Image Processing


Tuesday, March 22, 2016

11:54 PM

Operations

f(x, y) = Input image, domain Df


b(x, y) = Structuring element, domain Db
Dilation:
- 2-D: (f ⊕ b)(s, t) = max{f(s - x, t - y) + b(x, y) | (s - x, t - y) ∈ Df; (x, y) ∈ Db}
- 1-D: (f ⊕ b)(s) = max{f(s - x) + b(x) | (s - x) ∈ Df; x ∈ Db}
- Effect (if b(x, y) > 0 ∀ x, y):
  Brighter output image
  Dark, small details reduced

Erosion:
- 2-D: (f ⊖ b)(s, t) = min{f(s + x, t + y) - b(x, y) | (s + x, t + y) ∈ Df; (x, y) ∈ Db}
- 1-D: (f ⊖ b)(s) = min{f(s + x) - b(x) | (s + x) ∈ Df; x ∈ Db}
- Effect (if b(x, y) > 0 ∀ x, y):
  Darker output image
  Bright, small details reduced

- Dilation-Erosion relation:
  (f ⊖ b)ᶜ(s, t) = (fᶜ ⊕ b̂)(s, t)
  fᶜ(x, y) = -f(x, y)
  b̂(x, y) = b(-x, -y)

Opening:
- f ∘ b = (f ⊖ b) ⊕ b
- Removes small light details
- Analogy: Roll ball UNDER rough surface
- Leaves overall gray lvl, large bright features undisturbed

Closing:
- f • b = (f ⊕ b) ⊖ b
- Removes small dark details
- Analogy: Roll ball ABOVE rough surface
- Leaves overall gray lvl, large bright features undisturbed
- Duality: (f • b)ᶜ(s, t) = (fᶜ ∘ b̂)(s, t)

1-D Grayscale Morphological operation Illustration: Structuring element: b(x) = 1 ∀ x ∈ [-1, 1]


Applications

Morphological smoothing:
- Def: g = (f ∘ b) • b (Opening → Closing)
- Effect: Removes/Attenuates bright + dark artifacts, noise

Morphological gradient:
- Def: g = (f ⊕ b) - (f ⊖ b)
- Effect: Highlights sharp gray lvl transitions

Granulometry:
- Def:
  Do opening with structuring elements of gradually increasing size
  Compute difference between image in previous & new step
  Ending: Differences normalized → Used to construct histogram of particle-size distribution
- Effect: Determine size distribution of particles in image

Image Segmentation
Sunday, March 20, 2016

6:50 PM

Point Detection: Laplacian mask

Gradient (Discontinuity) Based Segmentation

- Mask design: Sum of all weights = Zero → No response at flat regions
- Principle: Second derivative ∇²f
  f = Input image
- Intensity discontinuity detected based on mask response within pre-defined window
  3x3 window:

w1 w2 w3
w4 w5 w6
w7 w8 w9

Mask response (at pixel under w5):
R = w1z1 + w2z2 + … + w9z9
- Principle: |R| ≥ T → Intensity discontinuity detected
  T = Pre-defined threshold

Line Detection

- Principle: Aligned, connected points detected → Line detected
- Masks: (horizontal, vertical, +45°, -45°)
- Mask usage: For line in:
  Specific direction: Use single mask
  All directions: Combine all masks:
  R = max(|Rhoriz|, |Rvert|, |R+45|, |R-45|)

3x3 Mask: |R| = |∇²f(x, y)| ≥ T → Point detected

Edge Detection
Thursday, March 24, 2016

5:54 PM

Edge models

Type: Step | Ramp | Roof

Properties:
- Step/Ramp edges:
  Edge = Collection of connected pixels on region boundary
  Edge point: Any point contained in ramp
  Width = Length of ramp (Step: Width = 1 pixel)
  Intensity profile (left → right):
  1st derivative: (+) transition in, (-) transition out
  2nd derivative: (+) dark side, (-) bright side
- Roof edge:
  Edge = Line cutting through region
  Width ∝ Thickness & Sharpness of line
  Represents thin features: roads, lines, drawings, …

* NOTE: Derivative approximation: ∂f/∂x ≈ f(x + 1) - f(x)

Edge Detection Strategy

- Strategies:
  ∇f(x, y): Detect presence of edge
  Sign of ∇²f(x, y): Detect whether edge pixel is on dark/bright side
  Zero-crossing property of ∇²f(x, y): Locate center of thick edge
- Limitation: Noise
  Solution: Smooth image first

Gradient operator

- Usage: 1st derivative → Detect edge presence
- Computation: Use mask

Computation masks (Gx, Gy):

Prewitt:
Gx = [-1 -1 -1; 0 0 0; 1 1 1]
Gy = [-1 0 1; -1 0 1; -1 0 1]

Sobel:
Gx = [-1 -2 -1; 0 0 0; 1 2 1]
Gy = [-1 0 1; -2 0 2; -1 0 1]

Laplacian of Gaussian (LoG) operator

- Usage: Detect zero-crossing values → Center of thick edge
- Properties:
  Laplacian: Pure 2nd derivative → Unacceptably sensitive to noise
  LoG: Laplacian smoothed by Gaussian → Noise ↓ → Reliably detects zero-crossings
- Def: LoG(f) = ∇²(f * h) = f * ∇²(h), h = Gaussian
- Computation: 5x5 mask:

 0  0 -1  0  0
 0 -1 -2 -1  0
-1 -2 16 -2 -1
 0 -1 -2 -1  0
 0  0 -1  0  0

±45° (diagonal) masks: Rotations of the horizontal/vertical masks, e.g. Sobel +45°: [0 1 2; -1 0 1; -2 -1 0], Sobel -45°: [-2 -1 0; -1 0 1; 0 1 2]

Edge Point Linking & Boundary Detection


Thursday, March 24, 2016

5:52 PM

Local processing

- Idea: Similar edge points connected
- Criteria for similarity:
  Grad Magnitude: |∇f(x, y) - ∇f(x0, y0)| ≤ E
  Grad Direction: |α(x, y) - α(x0, y0)| ≤ A
  Distance ≤ D
  (x0, y0) = Current pixel
  (x, y) = Neighbouring pixel
  E, A, D = Thresholds

Global processing: Hough transform

- Concept: Each line can be represented by 2 params (ρ, θ):
  ρ: Distance from origin
  θ: Slope angle
  Given edge point (x0, y0) and slope θ → Find ρ:
  ρ = x0cosθ + y0sinθ
- Param range:
  |ρ| bounded by the image size (D = Max. image dimension)
  -90° ≤ θ ≤ 90°
- Why use (ρ, θ) but NOT (a, b) (as in y = ax + b):
  ρ, θ have a LIMITED range
  a, b have an INFINITE range


- Edge-linking procedure:
  Initialize accumulator array:
  A[ρ, θ] = 0, θ ∈ [-90°, 90°]
  Iterate through every edge point:
  For each edge point (xi, yi):
    Iterate θ from -90° to 90°
    For each θj, calculate ρj:
    ρj = xicosθj + yisinθj
    (Curve in (ρ, θ)-param space)
    Set A[ρj, θj] = A[ρj, θj] + 1
  Pick highest A[ρj, θj]'s (based on pre-defined criteria) → Lines detected
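The accumulator procedure above can be sketched as follows (a minimal sketch; the 1° quantization and array layout are my choices):

```python
import numpy as np

def hough_lines(points, shape, n_theta=181):
    """Accumulate A[rho, theta] over edge points; each point votes along
    rho = x cos(theta) + y sin(theta), theta in [-90, 90] degrees."""
    D = int(np.ceil(np.hypot(*shape)))       # max possible |rho|
    thetas = np.deg2rad(np.linspace(-90, 90, n_theta))
    acc = np.zeros((2 * D + 1, n_theta), dtype=int)
    for x, y in points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + D, np.arange(n_theta)] += 1  # one vote per theta bin
    return acc, thetas, D

# Points on the vertical line x = 3: all 10 votes pile up at rho = 3
pts = [(3, y) for y in range(10)]
acc, thetas, D = hough_lines(pts, (10, 10))
r, t = np.unravel_index(acc.argmax(), acc.shape)
print(r - D, acc.max())  # 3 10
```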


Global Intensity-based Segmentation


Thursday, March 24, 2016

5:53 PM

Automatic Thresholding Method


Global Thresholding Problem
- Assumption:
  Image f(x, y): Light object + Dark background
  Histogram: 2 groups of intensity
- Objective: Select threshold T → Clearly separate these groups

1. Iterative approach:
- Initial estimate T
- Segment image using T → Regions: G1 (≥ T) + G2 (< T)
- Compute avg grey lvls μ1, μ2 for G1, G2
- Compute new T: T = (μ1 + μ2) / 2
- Repeat until T converges (changes in T small enough)
- Extension: Multi-lvl Thresholding
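The iterative approach can be sketched in a short loop (a minimal sketch; the bimodal toy data and convergence tolerance are mine):

```python
import numpy as np

def iterative_threshold(img, eps=0.5):
    """Basic global thresholding: T = (mu1 + mu2) / 2, repeated until
    T changes by less than eps."""
    T = img.mean()                      # initial estimate
    while True:
        g1 = img[img >= T]              # group G1 (>= T)
        g2 = img[img < T]               # group G2 (< T)
        T_new = 0.5 * (g1.mean() + g2.mean())
        if abs(T_new - T) < eps:
            return T_new
        T = T_new

# Bimodal toy histogram: dark background around 20, light object around 200
img = np.array([20, 22, 18, 21, 200, 198, 202, 201], dtype=float)
print(20 < iterative_threshold(img) < 200)  # True: the groups are separated
```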

2. Statistical, Minimum error approach

- Assumption:
Histogram is mixture of 2 distribution

p(z | k) = Prob of occur of gray lvl z in region k


p(z) = w1p(z | 1) + w2p(z | 2)
wk = Prob of occur of region k in image
w1 + w2 = 1; 0 w1, w2 1
Error: E(T)
E(T) min → w1p(T | 1) = w2p(T | 2)
If p(z | k) ∀ k are Gaussian distributed with the same sd σ:
Then: T = (μ1 + μ2)/2 + [σ²/(μ1 - μ2)] ln(w2/w1)

Expectation-Maximisation (EM) Method

1. Problem:
Given a histogram, how to model it using multiple Gaussian distributions

2. Relations:
- n different gray lvls: z1, z2, …, zn
  p(zk) = Prob of occur of gray lvl zk = fk / N
  fk = No of pixels of gray lvl zk
  N = f1 + f2 + … + fn: Total no of pixels
- M Gaussian models (μ1, σ1), (μ2, σ2), …, (μM, σM)
  p(z | j) = Prob of occur of gray lvl z in region j
  p(j) = Prob of occur of region j in image
  0 ≤ p(j) ≤ 1
  p(j | z) = Prob of occur of region j, given gray lvl z
  (Posterior prob)

3. Objective: Minimize expected negative log-likelihood function
   E = -ln L

4. Method:
- Initialize estimates μj, σj, p(j)
  Based on observation
  Easy way: p(j) = 1 / M
- Estimate pold(j | z) ∀ j, z
- Compute new μj, σj, p(j)
- Iterate until convergence


Watershed Segmentation
Wednesday, March 30, 2016

Watershed Segmentation Algorithm

10:14 AM

Concept: Region-based Segmentation

1. Definition:
- g(x, y): Grad. Image
- min = min{g(x, y)}, max = max{g(x,y)}
- M1, M2, , MR: Regional minima of g(x, y)
- Cn(Mi) = Set of POINTS in catchment basin of Mi that will be
flooded at step n
-

- Tn = {(s, t) | g(s, t) < n}: Set of POINTS with grad mag < n
- Qn = Set of CONNECTED COMPONENTS in Tn
- B: 3x3 structuring element

- Gradually pour water into catchment basins
  ⇔ Gradually ↑ grey lvl, starting from the min
- When 2 basins start merging → Build a dam/watershed/ridge
  in between to prevent merging
- When no more merging detected → Water reaches highest
  possible lvl → Stop

2. Algorithm:
- Initialization: Cmin(M1), Cmin(M2), …, Cmin(MR)
- Run n = (min + 1) → (max + 1). At step n:
  For each q ∈ Qn:
  Case 1: q ∩ Cn-1 = ∅:
    New regional minimum detected
    q → MR+1, R → R + 1
    Cn(MR+1) = q
  Case 2: q ∩ Cn-1 = {Contains 1 Cn-1(Mi)}:
    Cn-1(Mi) → Cn(Mi):
    Continuously dilate q ∩ Cn-1 with B, but limit dilation INSIDE q
  Case 3: q ∩ Cn-1 = {Contains ≥ 2 Cn-1(Mi)}:
    Watershed/Ridge detected → Build dam here:
    Continuously dilate q ∩ Cn-1 with B, but limit dilation INSIDE q
    After each dilation, mark newly obtained points reached by ≥ 2 Cn-1(Mi) as DAM POINTS
- Dam lines = Object boundaries

* NOTE: Watershed segmentation is often applied on the gradient
image:
Regional minima ↔ Small values of grad (↔ Small grey
lvl changes)
→ Inside objects of interest


* NOTE: Dilations used here are BINARY morphological
dilations


Use of Motion in Segmentation


Wednesday, March 30, 2016

10:14 AM

Accumulative Difference Approach


Basic Approach: Direct differences

- Idea:
  Compare 2 images acquired at different times pixel by pixel
  Subsequent image subtracted from reference image:
  Stationary elements canceled
  Non-stationary elements = Non-zero entries
- Implementation:
  dij: Difference image
  f(x, y, ti): Reference frame, taken at time ti
  f(x, y, tj): Subsequent frame, taken at time tj
- Assumptions:
  f(x, y, ti), f(x, y, tj) same scene
  Rigid-body, in-plane motion
- Issues of dij:
  Noise sensitive:
    Incorrect T → Noise not cancelled
    Solution - Island Removal: Remove all 4-/8-connected regions of 1's having < predefined no of 1's
  Gives limited info about motion: Motion direction + speed not easily estimated

Accumulative Difference Approach

- Ideas: Accumulative Difference Image (ADI)
  Differences between reference & subsequent frames accumulated
  Noise sensitivity ↓
  Motion direction + Speed estimated
  Difference occurring at a pixel → Counter of that pixel ↑
- Implementation:
  f(x, y, t1): Reference frame
  f(x, y, t2), …, f(x, y, tn): Subsequent frames
  Absolute ADI, Positive ADI, Negative ADI

- Observations:
  Positive ADI: Non-zero area:
    Size = Size of moving object
    Location = Location of moving object in reference frame
  When moving object is displaced completely with respect to its
  position in reference frame → No of non-zero entries stops increasing
  Absolute ADI = Positive ADI ∪ Negative ADI
  Direction + Speed: Determined from entries in Absolute & Negative ADI


Snake: Active Contour Model


Wednesday, March 30, 2016

10:15 AM

Control the Shape of Contour (Snake) with Energy Function

Parameterize the Position of 2D Contour (Snake)


- Internal energy:

First term: Control "tension" of contour


Tension of the "spring" that connects 2 consecutive
contour points
Second term: Control "rigidity" of contour
How easily the contour can bend itself
- External energy:

0 ≤ s ≤ 1: Param that ↑ as traversing from 1 point of the
contour to another
x(s), y(s): Position of contour points, functions of s
Contour can be open/closed

Scalar potential function, defined on image plane
Describes image features that the snake will be attracted to
To make the snake seek image edges: P(x, y) = -c|∇(Gσ * I)|
c > 0: const
Gσ * I: Image convolved with Gaussian smoothing filter of sd σ

Implementation

- Theorem: Final shape of contour must satisfy the Euler-Lagrange equation

Dynamic Deformable Model of Snake


- Discrete formulation using finite difference scheme:
: Time-varying contour
- Evolution equation:

- Discrete formulation of dynamic snake in matrix form (finite


difference scheme):
h = Step size
Δt = Time step size
Solve iteratively:

- Matrix form:
  A: Matrix in terms of α, β


At equilibrium: Stationary contour with min internal


+ external energy obtained


With appropriate α(s), β(s), …, the snake will adjust itself to
seek the boundary of the shape correctly despite the noisy
background


Image Registration
Tuesday, May 3, 2016

Affine Transformation Functions

11:20 AM

Transformation that preserves collinearity (all points lying on
a line initially still lie on a line after transformation)

Image Registration

1. Rigid-body transformation (Object size unchanged)

- Definition: Process of:


Combining multiple images into a single one
Transforming image to make it match with referenced
image/coordinate system
- Applications:
Panorama photo

a. Translation:

b. Rotation by θ about an axis:
   Rotation is measured clockwise while looking along the
   positive axis direction
- z-axis:

Medical imaging
Accommodate tissue deformation in image-guided
surgery
Satellite image repairing
- General method:

- x-axis:

- y-axis:

c. Translation + Rotation:
   6 deg of freedom (DOF): θx, θy, θz, dx, dy, dz


- Combining into single matrix:

2. Non-rigid-body transformation (Object size changed)


a. Scaling:

b. Shearing:

Objective Functions

kxy ≠ 0 → Stretch along x-axis on xy-plane

1. External markers

* NOTE: Combining all transformation forms into a single matrix:
12 unknowns → 4 pairs of corresponding points (before +
after transformation) needed


stereotactic frames, fiducial markers, …
- Seen & located in images as tiny 3D volumes, represented by points
- Used to guide/evaluate transformation → Pinpoints to find
  transformation matrix parameters
- Registration error: Euclidean distance between corresponding
  points in the two images

Optimization method

1. Problem:
- Transformation parameters a1, a2, …, an
- Find params such that objective function F → min
- For rigid-body transformation:
  Find T = (θx, θy, θz, tx, ty, tz) → F min

2. Internal markers

a. Surface-based
- Segmentation → Corresponding surfaces delineated & guide transformation
- Minimize distance between 2 surfaces

b. Edge-based
- Edges extracted based on gradient function → Guide transformation
- Gradient magnitude functions (of the Gaussian-smoothed image and the input image)

2. Powell's method:
- Initialization:
  n basis vectors
  Starting point
- Repeat the following cycle until the change in F is insignificantly small:
  For i = 1, 2, …, n: 1D Line Minimization:
    Find λi such that F → min along the ith basis vector
    Update the current point accordingly
  Set the end point as the starting point for the NEXT cycle

- Search strategies to achieve global optimization:
  Multi-resolution (Coarse-to-fine) approach
  Multi-start approach

c. Voxel/Pixel intensity

        - Sum of Squared intensity Differences (SSD): F = Σ(x, y)∈D [u(x, y) − v(x, y)]²
            D: Interested domain (overlap region
            between u & v, entire domain, …)
            Widely used
            Very sensitive to outliers (pixels with large
            intensity difference)
        - Sum of Absolute Differences (SAD): F = Σ(x, y)∈D |u(x, y) − v(x, y)|
            ↓ Outlier effect
        - Correlation coefficient (CC):
            CC = Σ(u − ū)(v − v̄) / √[Σ(u − ū)² · Σ(v − v̄)²]
            ū, v̄: Means of u, v over D
            −1 ≤ CC ≤ 1
            Assumption: Linear relationship between intensity
            values in images
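The three similarity measures can be written directly in NumPy (illustrative sketch; here the domain D is taken to be the whole overlap array):

```python
import numpy as np

def ssd(u, v):
    """Sum of squared intensity differences."""
    return float(np.sum((u - v) ** 2))

def sad(u, v):
    """Sum of absolute differences (less sensitive to outliers)."""
    return float(np.sum(np.abs(u - v)))

def cc(u, v):
    """Correlation coefficient; +/-1 for exact (inverse) linear relationships."""
    du, dv = u - u.mean(), v - v.mean()
    return float(np.sum(du * dv) / np.sqrt(np.sum(du ** 2) * np.sum(dv ** 2)))
```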


Image Compression
Tuesday, May 3, 2016
11:20 AM

Basic Concepts

- Lossless: Image compressed & decompressed
without losing information
- Lossy: Change/reduce some image data during
compression → Higher data reduction, but
imperfect reproduction of original image
- Compression ratio:
    Original dataset X1, size = n1
    Compressed dataset X2, size = n2
    CR = n1 / n2
- Relative data redundancy of X2 to X1: RD = 1 − 1/CR
    n2 = n1 (CR = 1, RD = 0): No redundant data
    n2 << n1 (CR → ∞, RD → 1): Highly
    redundant data, significant compression
    n2 >> n1 (CR → 0, RD → −∞): X2 has much
    more data than original X1

Data redundancy

1. Coding redundancy:
- Different symbols can be encoded using different no. of bits
- Related measurements:
    Avg no. of bits: Lavg = Σk l(rk) · pr(rk)
        r = Discrete rand var in [0, 1]
        rk = kth gray lvl (k = 0, 1, 2, …, (L − 1))
        Occurrence = nk, pr(rk) = nk / n, Symbol bit length = l(rk)
    Compression ratio: CR = Lavg1 / Lavg2
        Original image: Lavg1
        Compressed image: Lavg2

| rk       | pr(rk) | Code scheme 1 (Fix-length) l1(rk) | Code scheme 2 (Variable length) l2(rk) |
| r0 = 0/7 | 0.19   | 000 (3) | 11 (2)     |
| r1 = 1/7 | 0.25   | 001 (3) | 01 (2)     |
| r2 = 2/7 | 0.21   | 010 (3) | 10 (2)     |
| r3 = 3/7 | 0.16   | 011 (3) | 001 (3)    |
| r4 = 4/7 | 0.08   | 100 (3) | 0001 (4)   |
| r5 = 5/7 | 0.06   | 101 (3) | 00001 (5)  |
| r6 = 6/7 | 0.03   | 110 (3) | 000001 (6) |
| r7 = 7/7 | 0.02   | 111 (3) | 000000 (6) |
| Lavg     |        | 3       | 2.7        |
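The table's Lavg values and the resulting compression ratio can be reproduced directly (a check of the arithmetic, not code from the notes):

```python
probs = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
code2 = ['11', '01', '10', '001', '0001', '00001', '000001', '000000']

l_avg1 = sum(3 * p for p in probs)                      # fixed 3-bit scheme
l_avg2 = sum(len(c) * p for c, p in zip(code2, probs))  # variable-length scheme
cr = l_avg1 / l_avg2        # compression ratio Lavg1 / Lavg2
rd = 1 - 1 / cr             # relative data redundancy
```

This gives Lavg2 = 2.7 bits/symbol, CR ≈ 1.11, RD ≈ 0.1.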

2. Interpixel redundancy:
- Adjacent pixel values can be reasonably predicted from current pixels
- Related measurement: Autocorrelation coefficients γ(Δn)
    Δn = Pixel distance
    N = No. of pixels in a line of the image
    ↑γ(Δn) → ↑Pixel correlation
    2 images can have the same histogram but different
    autocorrelation coefficient profiles
- Mapping: Transform 2D image to a more efficient format → ↓Pixel
redundancy
    Example: Binary image, size 1024×343
        Natural scheme: 1 pixel = 1 bit (black/white)
            n1 = 1024 × 343 × 1 = 351232
        Run-length coding scheme:
            Each scan line = Alternating chain of black/white runs
            → Sequence of pairs (g1, w1), (g2, w2), …
                gi = ith run's gray lvl (0/1)
                wi = Run length
            gi needs 1 bit, wi needs 11 bits (max(wi) = 1024)
            Total 12166 pairs (gi, wi)
            n2 = 12166 × 11 = 133826 → CR = 351232 / 133826 ≈ 2.6

Fidelity Criteria

- f(x, y) = Original image
- f̂(x, y) = Reconstructed image from compression
- Objective fidelity criterion (M×N image):
    Root-mean-square error: e_rms = √[(1/MN) · Σx Σy (f̂(x, y) − f(x, y))²]
    Mean-square signal-to-noise ratio: SNRms = Σx Σy f̂(x, y)² / Σx Σy (f̂(x, y) − f(x, y))²
- Subjective fidelity criterion: Based on humans' subjective
rating
    Rating scale of the Television Allocations Study Organization:

    | Value | Rating    | Description |
    | 1     | Excellent | An image of extremely high quality, as good as you could desire. |
    | 2     | Fine      | An image of high quality, providing enjoyable viewing. Interference is not objectionable. |
    | 3     | Passable  | An image of acceptable quality. Interference is not objectionable. |
    | 4     | Marginal  | An image of poor quality. You wish you could improve it. Interference is somewhat objectionable. |
    | 5     | Inferior  | A very poor image, but you could watch it. Objectionable interference is definitely present. |
    | 6     | Unusable  | An image so bad that you could not watch it. |
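The run-length scheme of the example can be sketched as an encode/decode pair (illustrative; function names are assumptions):

```python
def run_length_encode(line):
    """Encode one scan line of a binary image as (gray level, run length) pairs."""
    runs = []
    for px in line:
        if runs and runs[-1][0] == px:
            runs[-1] = (px, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((px, 1))               # start a new run
    return runs

def run_length_decode(runs):
    out = []
    for g, w in runs:
        out.extend([g] * w)
    return out
```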
3. Psychovisual redundancy

- Certain info relatively less important than other info to human vision → Can be
removed, but irreversible
    Human eye sensitive to distinguishing features (edges, patterns, …) but
    not detailed quantitative info of image
- Quantization:
    Map intensity values broad → narrow range
    ↑CR, but ↓Image quality (false contour, …)
- Improved grayscale (IGS) quantization:
    Compensate for degraded visual impact due to quantization
    Mechanism:
        Add some amount of locally-dependent randomness to pixel's gray lvl
        before quantization
    How to create locally-dependent randomness:
        Observation: Low-order bits of pixel intensity are noisy &
        nearly random
            (kth Bitplane: Image of kth bit of pixel intensities in image)
        Use lower bits of neighboring pixel → Generate pseudo-random value
    Procedures:
        Initialization: Sum = 0000 0000
        For each pixel:
            New Sum = Pixel gray lvl + Lower-half bits of previous Sum
                (If higher-half bits of the gray lvl = 1111, add 0000 instead)
            IGS Code = Higher-half bits of New Sum

Image Compression Model

- Source encoder: Remove redundancies
    Mapper: Transform input data into interpixel-redundancy-reduced format
    Quantizer:
        Reduce accuracy of mapper's output based on
        pre-defined fidelity criterion → ↓Psychovisual
        redundancy
        Irreversible info loss
    Symbol Encoder: Create fix/variable-length code to
    represent quantizer output
- Channel encoder: ↑Noise immunity of source encoder's
output

(7,4)-Hamming code (Channel encoding)

- 3 redundant bits added to 4-bit word → All single-bit
errors (in 7-bit combined word) detected & corrected

- Implementation:
    Input: b3b2b1b0
    Output: h1h2h3h4h5h6h7
        h1 = b3 ⊕ b2 ⊕ b0
        h2 = b3 ⊕ b1 ⊕ b0
        h3 = b3
        h4 = b2 ⊕ b1 ⊕ b0
        h5 = b2
        h6 = b1
        h7 = b0

* IGS quantization example (8-bit pixels, 4-bit IGS codes):

| Pixel | Gray lvl  | Sum                          | IGS Code |
| i − 1 | n/a       | 0000 0000                    | n/a      |
| i     | 0110 1100 | 0110 1100 = 0110 1100 + 0000 | 0110     |
| i + 1 | 1000 1011 | 1001 0111 = 1000 1011 + 1100 | 1001     |
| i + 2 | 1000 0111 | 1000 1110 = 1000 0111 + 0111 | 1000     |
| i + 3 | 1111 0100 | 1111 0100 = 1111 0100 + 0000 | 1111     |

(Row i + 3: the previous lower-half bits are 1110, but since the gray level's
higher-half bits are 1111, 0000 is added instead to avoid overflow.)
    Error detection & correction:
        c1 = h1 ⊕ h3 ⊕ h5 ⊕ h7
        c2 = h2 ⊕ h3 ⊕ h6 ⊕ h7
        c4 = h4 ⊕ h5 ⊕ h6 ⊕ h7
        x = c4c2c1(2)
        x ≠ 0 → hx corrupted, needs flipping
    Example:
        Input b3b2b1b0 = 1010
        Output h1h2h3h4h5h6h7 = 1011010
            h1 = b3 ⊕ b2 ⊕ b0 = 1 ⊕ 0 ⊕ 0 = 1
            h2 = b3 ⊕ b1 ⊕ b0 = 1 ⊕ 1 ⊕ 0 = 0
            h4 = b2 ⊕ b1 ⊕ b0 = 0 ⊕ 1 ⊕ 0 = 1
        Receiver input h1h2h3h4h5h6h7 = 1011110
            c1 = h1 ⊕ h3 ⊕ h5 ⊕ h7 = 1 ⊕ 1 ⊕ 1 ⊕ 0 = 1
            c2 = h2 ⊕ h3 ⊕ h6 ⊕ h7 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
            c4 = h4 ⊕ h5 ⊕ h6 ⊕ h7 = 1 ⊕ 1 ⊕ 1 ⊕ 0 = 1
            x = c4c2c1(2) = 101(2) = 5
            → h5 = 1 corrupted, SHOULD BE h5 = 0
    * NOTE: X = t1 ⊕ t2 ⊕ … ⊕ tn: XOR operation
        X = 1 if ODD no. of 1's in t1, t2, …, tn
        X = 0 if EVEN no. of 1's in t1, t2, …, tn
        X called even-parity bit: Set {t1, t2, …, tn} ∪ {X} has
        EVEN no. of 1's
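The (7,4)-Hamming encode and correct procedures can be sketched directly from the parity equations (illustrative helper names):

```python
def hamming_encode(b3, b2, b1, b0):
    """Encode 4 data bits into the 7-bit word [h1..h7]."""
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]

def hamming_correct(h):
    """h: received 7-bit word [h1..h7]; fixes any single-bit error and
    returns the data bits (b3, b2, b1, b0)."""
    c1 = h[0] ^ h[2] ^ h[4] ^ h[6]
    c2 = h[1] ^ h[2] ^ h[5] ^ h[6]
    c4 = h[3] ^ h[4] ^ h[5] ^ h[6]
    x = 4 * c4 + 2 * c2 + c1   # syndrome: 1-based index of corrupted bit, 0 = none
    if x:
        h = h.copy()
        h[x - 1] ^= 1
    return (h[2], h[4], h[5], h[6])
```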

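The IGS procedure can likewise be sketched for 8-bit pixels and 4-bit codes (illustrative; it reproduces the worked example's table rows):

```python
def igs_quantize(gray_levels):
    """Improved grayscale quantization: 8-bit pixels -> 4-bit IGS codes."""
    s = 0                     # running Sum, initialized to 0000 0000
    codes = []
    for g in gray_levels:
        # Add the previous Sum's lower 4 bits, unless the pixel's upper
        # 4 bits are all 1 (which could overflow 8 bits).
        s = g + ((s & 0x0F) if (g & 0xF0) != 0xF0 else 0)
        codes.append(s >> 4)  # IGS code = higher-half bits of the new Sum
    return codes
```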

Lossless Compression
Tuesday, May 17, 2016
4:42 PM

Huffman Encoding

- Variable-length symbol encoding: Assign longer codes to symbols
with smaller probability of occurrence
- Implementation:
    Order probs of symbols in descending order
    Combine the 2 lowest probs into a single one for the next
    source reduction; repeat until 2 symbols remain
    Assign codes by walking the source reductions backwards
    (append 0/1 at each split)
- Decoding example: 010100111100 → 01010 | 011 | 1 | 1 | 00 → a3 a1 a2 a2 a6
- Produces optimal codes (minimum Lavg among symbol-by-symbol codes)
- Need to store symbol table with compressed data

Lossless Predictive Coding

- Features:
    ↓Interpixel redundancies of closely spaced pixels by coding only
    the new info of each pixel
    New info of pixel = Actual value − Predicted value
- Predictor:
    fn = Input intensity value
    f'n = Predicted intensity value (e.g. from the previous m pixels)
    n = Pixel index (1, …, N)
- Encoder:
    Input: Image f1, f2, …, fN
    Output: f1, f2, …, fm, em+1, em+2, …, eN
    en = fn − f'n (Prediction error)
    No quantization of en → Lossless
- Decoder:
    Input: f1, f2, …, fm, em+1, em+2, …, eN
    Output: f1, f2, …, fm, fm+1, fm+2, …, fN
        fm+1 = f'm+1 + em+1 (f1, …, fm known)
        fm+2 = f'm+2 + em+2 (f2, …, fm+1 known)
        …
- Effect: Variance of en < Variance of fn → Require fewer bits to encode → ↑CR
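The Huffman construction described above can be sketched with a priority queue (illustrative, not the notes' code; the counter is only a tie-breaker so equal probabilities never compare the code tables):

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> code string."""
    tick = count()
    heap = [(p, next(tick), {s: ''}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two smallest probabilities
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in c1.items()}       # prepend branch bits
        merged.update({s: '1' + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tick), merged))
    return heap[0][2]

def huffman_decode(bits, codes):
    """Decode a bit string using the prefix-free code table."""
    inv = {c: s for s, c in codes.items()}
    out, cur = [], ''
    for b in bits:
        cur += b
        if cur in inv:
            out.append(inv[cur])
            cur = ''
    return out
```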

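A lossless predictive codec with the simplest predictor (previous pixel, i.e. m = 1) can be sketched as follows (illustrative; names are assumptions):

```python
def predictive_encode(pixels):
    """First pixel sent as-is; the rest as prediction errors e_n = f_n - f_{n-1}."""
    errors = [pixels[0]]
    for prev, cur in zip(pixels, pixels[1:]):
        errors.append(cur - prev)
    return errors

def predictive_decode(errors):
    """Exact inverse: f_n = f_{n-1} + e_n (no quantization -> lossless)."""
    pixels = [errors[0]]
    for e in errors[1:]:
        pixels.append(pixels[-1] + e)
    return pixels
```

On smooth image rows the errors cluster near 0, illustrating the reduced variance the notes mention.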

Lossy Compression
Tuesday, May 17, 2016
11:02 PM

Lossy Predictive Coding

- Based on the Lossless Predictive Coding scheme, except
prediction error quantized to a limited range
- Encoder:
    Input: Image f1, f2, …, fN
    Output: f1, f2, …, fm, e'm+1, e'm+2, …, e'N
    e'n = Quantize(en)
    Predictions made from reconstructed values f''n = f'n + e'n:
        Step m + 1: predictor inputs X1 = f1, …, Xm = fm
        Step m + 2: predictor inputs X1 = f2, X2 = f3, …, Xm−1 = fm, Xm = f''m+1
        Step m + 3: predictor inputs X1 = f3, X2 = f4, …, Xm−2 = fm, Xm−1 = f''m+1, Xm = f''m+2
        …
- Decoder:
    Input: f1, f2, …, fm, e'm+1, e'm+2, …, e'N
    Output: f1, f2, …, fm, f''m+1, f''m+2, …, f''N
    f''n = f'n + e'n
    * NOTE: f''n ≈ fn (not exactly equal → lossy)
- Delta Modulation (simple lossy coding)
    Encoder:
        f''1 = f1
        f'n = f''n−1, en = fn − f'n, f''n = f'n + e'n
        Quantization: e'n = +ζ if en > 0, −ζ otherwise (ζ: fixed step size)
        Send out f''1, e'2, …, e'N
    Decoder: f''1 available, f'n = f''n−1, f''n = f'n + e'n

Transform Coding

- Features:
    Transform pixels to a new space → Compact representation of info
    Compression = Quantization of transformed coeffs
- Discrete Cosine Transform (DCT):
    Features: Like DFT, but no imaginary comp, better power packing
    Forward DCT (n×n block):
        T(u, v) = α(u)α(v) Σx Σy f(x, y) cos[(2x + 1)uπ / 2n] cos[(2y + 1)vπ / 2n]
    Inverse DCT:
        f(x, y) = Σu Σv α(u)α(v) T(u, v) cos[(2x + 1)uπ / 2n] cos[(2y + 1)vπ / 2n]
        α(0) = √(1/n), α(u) = √(2/n) for u ≠ 0
- Encoding procedures:
    Divide image into sub-images of size n×n:
        Should be n = 2^k → Apply fast transform algorithm O(n·log n)
        ↑n → ↑Compression, ↑Computational complexity
        Ideal (empirical testing): n = 8, 16
    Transform each sub-image → Quantize coefficients → Encode

JPEG Coding Scheme

- Encoding (baseline):
    Divide image into 8×8 blocks → Forward DCT of each block →
    Quantize with quantization table Z(u, v): T'(u, v) = round(T(u, v) / Z(u, v))
    → Zig-zag reorder coefficients → Entropy-code (run-length + Huffman)
- Decoding:
    Entropy-decode → De-quantize: T(u, v) ≈ T'(u, v) · Z(u, v) →
    Inverse DCT → Reassemble blocks
- JPEG Quality: Quantifies compression
    ↓Quality → ↑Values in Quantization table → ↑Blurring + ↑CR
    100% quality → Z(u, v) = 1 ∀u, v (no quantization loss)
- JPEG Color Image Compression: Transform RGB → luminance + chrominance
channels, subsample the chrominance channels, compress each channel
with the scheme above
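The delta-modulation encoder/decoder above can be sketched directly (illustrative; names are assumptions):

```python
def delta_mod_encode(pixels, zeta=1.0):
    """Delta modulation: predictor f'_n = f''_{n-1}; error quantized to +/-zeta."""
    recon = [pixels[0]]            # f''_1 = f_1
    quantized = []
    for f in pixels[1:]:
        e = f - recon[-1]
        eq = zeta if e > 0 else -zeta
        quantized.append(eq)
        recon.append(recon[-1] + eq)   # encoder tracks the decoder's output
    return pixels[0], quantized

def delta_mod_decode(first, quantized):
    recon = [first]
    for eq in quantized:
        recon.append(recon[-1] + eq)
    return recon
```

Note the reconstruction only approximates the input (f''n ≈ fn): flat regions oscillate by ±ζ.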

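The DCT pair used by transform coding can be written naively (illustrative only; O(n⁴) per block, whereas real codecs use fast O(n·log n) algorithms):

```python
import math

def dct2(f):
    """Naive 2D DCT-II of an n x n block (real-valued, unlike the DFT)."""
    n = len(f)
    a = lambda u: math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
    T = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (f[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            T[u][v] = a(u) * a(v) * s
    return T

def idct2(T):
    """Inverse of dct2 (the transform is orthonormal)."""
    n = len(T)
    a = lambda u: math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
    f = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += (a(u) * a(v) * T[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            f[x][y] = s
    return f
```

A constant block packs all its power into T(0, 0), which is the "power packing" property JPEG exploits.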

Face Detection & Recognition
Tuesday, May 3, 2016
11:20 AM

Face Detection using Integral image & AdaBoost learning
(Viola, Jones, 2004)

- Concepts:
    Integral image ii:
        ii(x, y) = Sum of pixel intensity values above & left of (x, y), inclusive
    Any rectangular sum can be computed efficiently using the integral
    image (4 array references):
        Sum over D = (A + B + C + D) − (A + C) − (A + B) + A
        = ii(x1, y1) − ii(x1, y2 − 1) − ii(x2 − 1, y1) + ii(x2 − 1, y2 − 1)
        ((x2, y2), (x1, y1): opposite corners of rectangle D)
    Rectangular feature (Haar-like function):
        Value = Σ Pixels in black region − Σ Pixels in white region
        3 types: 2-rectangle, 3-rectangle, 4-rectangle
- Machine learning strategy analysis:
    Training data: Set of 24×24 face/non-face images
    Need to select rectangular features to represent each image:
        Problem: ~160000 potential features (different type + size) → Too
        many
        How to select a SINGLE feature + threshold → Best separate
        positive (face) & negative (non-face) training examples?
    Solution: Combine several weak classifiers (each with high classification
    error) → Stronger classifier
- AdaBoost:
    Algorithm to construct a strong classifier as a linear combination of selected
    weak classifiers
    Weak classifier: h(x, f, p, θ) = 1 if p·f(x) < p·θ, else 0
        x: Training image
        f: Classification function utilizing a rectangular feature
        θ: Threshold, p: Polarity (±1)
    Strong classifier: Weighted vote sgn(Σt αt·ht(x) − threshold)
        sgn(x): Sign function
    Input:
        (xi, yi): Training data (i = 1, 2, …, N), yi ∈ {0, 1}
        xi: Training image
        hj: Weak classifiers (j = 1, 2, …, P)
    Training procedures:
        Initialization: Weights w1,i = 1/(2M) for face samples, 1/(2L) for
        non-face samples
            M = No. of face images in training data
            L = No. of non-face images in training data
        For t = 1, 2, …, T:
            Normalize all weights: wt,i ← wt,i / Σj wt,j
            For each classifier hk (k = 1, 2, …, P):
                Train → Minimize classification error εk = Σi wt,i·|hk(xi) − yi|
            Select the weak classifier ht with minimum error εt
            Set weight for ht: βt = εt / (1 − εt), αt = log(1/βt)
            Reweight each example: wt+1,i = wt,i·βt^(1 − ei)
                (ei = 0 if xi classified correctly, 1 otherwise)
        Final strong classifier:
            C(x) = 1 if Σt αt·ht(x) ≥ ½·Σt αt, else 0
    e.g. N = 108, P = 180000: T can be 200

Face Detection using Image pyramid & Neural networks
(Rowley, Baluja, Kanade, 1998)

- Classifier trained directly using preprocessed & normalized
face & non-face training sub-windows
- Neural network
    Input:
        Preprocessed 20×20 sub-windows extracted from an
        image pyramid (multi-scale representation of the same
        image)
        Why use an image pyramid:
            Varying the sub-window size would detect faces of
            different sizes, but
            Fixed sub-window size & Varying image size
            → Keep design of neural network simple
    Output: Face/Non-face
- False-detection error minimization:
    Partially-trained system applied to random scenery images
    not containing any faces
    Regions detected as faces added into negative training set
- Issues:
    Undetectable situations: blurry, highly-rotated, partially-
    appearing faces
    System trained only on real faces, yet hand-drawn faces
    sometimes detected
- Detecting rotated faces:
    - Train a router network → Preprocess each input window →
    Estimate orientation
    - Adjust window back to upright, frontal position →
    Preprocess → Feed to main neural network

Face Recognition

- 1D vector representation of an m×n-size face image: can ↓ dimension
from mn to much lower
    Perform Principal Component Analysis (PCA) → ↓ dimension
- Eigenfaces:
    Given a set of N m×n-size images, each represented by a 1D
    vector xi of dimension mn
    Mean image: μ = (1/N)·Σi xi
    Scatter matrix: S = Σi (xi − μ)(xi − μ)^T
    Take the corresponding t eigenvectors u1, …, ut with non-zero
    eigenvalues λ1 ≥ λ2 ≥ … ≥ λt ("Eigenfaces")
    Images PCA-transformed to the new space of lower
    dimension (as t << mn)
- Face recognition using Eigenfaces:
    Process image database with PCA + Eigenfaces
    Given a new image Γ to be recognized:
        Calculate coefs of Γ: ωk = uk^T·(Γ − μ)
        Face detection: Γ is a face if its distance to the face space is small
        If Γ is a face, find the closest labeled face based on nearest
        neighbor in the t-dimen space

Near Infrared-image for Face Recognition

- Infrared imaging system can produce face images in good
condition regardless of visible light in the environment → Better
than normal use of color camera
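The integral-image trick from the Viola–Jones section can be sketched as follows (illustrative; here a rectangle is given by inclusive row/column ranges, a slightly different but equivalent corner convention to the notes'):

```python
def integral_image(img):
    """ii[x][y] = sum of img over rows <= x and cols <= y (inclusive)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for x in range(h):
        row_sum = 0
        for y in range(w):
            row_sum += img[x][y]
            ii[x][y] = row_sum + (ii[x - 1][y] if x > 0 else 0)
    return ii

def rect_sum(ii, x1, y1, x2, y2):
    """Sum of the image over rows x1..x2, cols y1..y2, using 4 lookups."""
    total = ii[x2][y2]
    if x1 > 0:
        total -= ii[x1 - 1][y2]
    if y1 > 0:
        total -= ii[x2][y1 - 1]
    if x1 > 0 and y1 > 0:
        total += ii[x1 - 1][y1 - 1]
    return total
```

A Haar-like feature value is then just a difference of two (or more) `rect_sum` calls, each costing 4 lookups regardless of rectangle size.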

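The eigenface computation can be sketched with the usual small-Gram-matrix trick (illustrative; helper names are assumptions, and eigenvectors of the N×N Gram matrix are mapped back to image space instead of decomposing the huge mn×mn scatter matrix):

```python
import numpy as np

def eigenfaces(X, t):
    """X: N x d matrix (each row a flattened face). Returns the mean face and
    the top-t eigenvectors of the scatter matrix, via the N x N Gram matrix."""
    mean = X.mean(axis=0)
    A = X - mean                      # N x d, centered
    gram = A @ A.T                    # N x N instead of d x d
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:t]
    U = A.T @ vecs[:, order]          # map back to d-dim eigenvectors
    U /= np.linalg.norm(U, axis=0)    # normalize the eigenfaces
    return mean, U

def project(x, mean, U):
    """Coefficients of a face in the t-dimensional eigenface space."""
    return U.T @ (x - mean)
```

Recognition then reduces to nearest-neighbour search among the projected database faces.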

Iris Recognition
Thursday, May 19, 2016
5:30 PM

Step 1: Iris Image Acquisition
- Objective: Capture a high-quality image of the iris:
    Sufficient resolution + sharpness
    Good contrast in interior iris pattern
    Iris must be centered
- Method (Daugman system):
    - LED-based point light source, carefully positioned to avoid reflection
    off eyeglasses
    - Live video feedback via a tiny LCD
    - Non-invasive to the human operator:
        Lvl of illumination reflected to iris should not be
        annoying
        Capturing device should not be constraining
    - Challenge: Iris is small (diameter ≈ 1 cm)

Step 2: Iris Localization
- Objective: Localize the portion of the acquired image corresponding to the iris
- Method:
    - Step 1: Gradient-based edge detection (Obtain gradient magnitude +
    Thresholding) → Convert image into binary edge map
        Gradient magnitude: |∇G(x, y) * I(x, y)|
        ∇: Del operator
        G: 2D Gaussian
    - Step 2: Circle Hough transform → Localize boundaries of limbic &
    pupillary circles
        (xj, yj): Edge points (j = 1, …, n)
        Vote for circle parameters (xc, yc, r) satisfying
        g(xj, yj, xc, yc, r) = (xj − xc)² + (yj − yc)² − r² = 0

Step 3: Pattern matching: Alignment
- Objective: Bring the newly acquired iris pattern into spatial alignment
with a candidate database entry
- Method: Map Cartesian image coordinates (x, y) → Dimensionless polar image
coordinates (r, θ)
    (≈ Cut the iris ring at θ = 0 and spread it to rectangular form)
    x(r, θ) = (1 − r)·xp(0, θ) + r·xl(1, θ)
    y(r, θ) = (1 − r)·yp(0, θ) + r·yl(1, θ)
    r ∈ [0, 1]
    θ ∈ [0, 2π)
    (xp(0, θ), yp(0, θ)): Coordinates of pupillary boundary in direction θ
    (xl(1, θ), yl(1, θ)): Coordinates of limbic boundary in direction θ
- Why use this alignment: Invariant to shift, scaling & rotation
    Shift (Offsets of eye parallel to camera's sensor): Hough transform
    independent of iris position in the whole image
    Scaling (Offsets along camera's optical axis): Solved by transformation
    to common representation (polar coordinate (r, θ))
    Rotation (Angular deviation about optical axis): Just need to shift the
    polar representation left/right
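The rubber-sheet mapping of Step 3 can be sketched as follows (illustrative assumptions: both boundaries modelled as concentric circles, nearest-neighbour sampling, hypothetical function name):

```python
import math

def unwrap_iris(image, cx, cy, r_pupil, r_limbus, n_r=8, n_theta=16):
    """Sample (r, theta) in [0, 1] x [0, 2*pi) between the pupillary and
    limbic boundaries, producing an n_r x n_theta rectangular pattern."""
    h, w = len(image), len(image[0])
    out = [[0] * n_theta for _ in range(n_r)]
    for i in range(n_r):
        r = i / max(n_r - 1, 1)
        for j in range(n_theta):
            th = 2 * math.pi * j / n_theta
            # boundary points in direction theta
            xp = cx + r_pupil * math.cos(th)
            yp = cy + r_pupil * math.sin(th)
            xl = cx + r_limbus * math.cos(th)
            yl = cy + r_limbus * math.sin(th)
            x = (1 - r) * xp + r * xl      # x(r, theta)
            y = (1 - r) * yp + r * yl      # y(r, theta)
            xi = min(max(int(round(x)), 0), w - 1)   # nearest-neighbour sample
            yi = min(max(int(round(y)), 0), h - 1)
            out[i][j] = image[yi][xi]
    return out
```

A rotation of the eye about the optical axis becomes a circular shift of the columns of `out`.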
Step 4: Pattern matching: Representation
- Objective: Choose a representation of aligned iris patterns that makes
their distinctive patterns apparent
- Method: Laplacian pyramid + Laplacian of Gaussian (LoG) filter
    Laplacian of Gaussian (LoG) filter:
        −(1/(πσ⁴)) · (1 − ρ²/(2σ²)) · e^(−ρ²/(2σ²))
        ρ: Radial dist of a point from filter's center
        σ: sd
    Laplacian pyramid: Multi-scale band-pass decomposition of the
    aligned pattern

Step 5: Pattern matching: Goodness of Match
- Objective: Evaluate goodness of match between newly acquired &
database representations
- Method: Normalized correlation
    p1(i, j), p2(i, j): 2 images of size n×m
    For each image pk: mean μk & standard deviation σk
    Normalized correlation between p1 & p2:
        NC = Σi Σj (p1(i, j) − μ1)(p2(i, j) − μ2) / (n·m·σ1·σ2)
    With the pyramid representations of the acquired iris image & an iris in
    the database → 4 NCs
    → Used for identification
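The normalized correlation of Step 5 can be sketched directly from the definitions (illustrative helper name):

```python
import math

def normalized_correlation(p1, p2):
    """NC between two equal-size images; 1 for identical patterns,
    -1 for exactly inverted ones."""
    n, m = len(p1), len(p1[0])
    mu1 = sum(map(sum, p1)) / (n * m)
    mu2 = sum(map(sum, p2)) / (n * m)
    var1 = sum((p1[i][j] - mu1) ** 2 for i in range(n) for j in range(m)) / (n * m)
    var2 = sum((p2[i][j] - mu2) ** 2 for i in range(n) for j in range(m)) / (n * m)
    cov = sum((p1[i][j] - mu1) * (p2[i][j] - mu2)
              for i in range(n) for j in range(m)) / (n * m)
    return cov / math.sqrt(var1 * var2)
```

Applied to each of the 4 pyramid levels, this yields the 4 NC values fed to the decision step.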
Step 6: Pattern matching: Decision
- Objective: Decide if newly acquired data & database entry derived
from the same iris, based on goodness of match
- Method: Fisher's linear discriminant
    - Strategy formulation:
        Training data: Extracted from database
            Authentic class (A): Database entry derived from the same iris as
            the newly acquired
            Imposter class (I): Database entry derived from a different iris than
            the newly acquired
        Imagine you can make a binary decision along a line:
            Apply to arbitrary sets of points: Draw a line ("weight vector" w) and
            project all points onto that line
    - Method: Find weight vector w such that for the 2 projected class samples (I &
    A):
        Variance within each class minimized
        Variance between different classes maximized
        Within-class variance: Sk = Σx∈k (x − μk)(x − μk)^T
        Total within-class variance: SW = SA + SI
        Between-class variance: SB = (μA − μI)(μA − μI)^T
        * NOTE: SW, SB are 4×4 matrices (features = the 4 NC values)
        Optimization: w = argmax (w^T·SB·w) / (w^T·SW·w) → w ∝ SW⁻¹(μA − μI)
    - Decision: Threshold on the projected value w^T·x:
        Above this point → Class A
        Below this point → Class I

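Step 6's two-class Fisher discriminant can be sketched as follows (illustrative; unnormalized scatter matrices, and the threshold is taken as the midpoint of the projected class means, which is an assumption):

```python
import numpy as np

def fisher_lda(XA, XI):
    """2-class Fisher discriminant: w = Sw^-1 (muA - muI).
    XA, XI: sample matrices (rows = feature vectors, e.g. the 4 NC values)."""
    muA, muI = XA.mean(axis=0), XI.mean(axis=0)
    Sw = sum((x - muA)[:, None] @ (x - muA)[None, :] for x in XA) \
       + sum((x - muI)[:, None] @ (x - muI)[None, :] for x in XI)
    w = np.linalg.solve(Sw, muA - muI)
    threshold = 0.5 * (w @ muA + w @ muI)   # midpoint of projected class means
    return w, threshold
```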
