Source: Wikipedia
Image Understanding
Computer vision goes hand in hand with image understanding. What information do we need to know to understand the scene? How can we make decisions about what objects are present, their shape, their positioning?
ENGN8530: CVIU
Many different fields are involved, e.g. computer science, AI, neuroscience, psychology, engineering, philosophy, art.
Sub-areas of CVIU
Scene reconstruction
Event detection
Object tracking
Object recognition
Object structure recovery
Ego-motion
Multi-view geometry
Indexing of image / video databases
Scene Reconstruction
From stereo
Event Detection
Source: MERL
Object Tracking
Object Recognition
Query
Result
Source: David Nister
Database
Reference: A.D. Worrall, J.M. Ferryman, G.D. Sullivan and K.D. Baker, Pose and structure recovery using active models, Proc. 6th British Machine Vision Conference, Vol. 1, Birmingham, UK, pp. 137-146.
Ego-motion
Estimated camera path
Optical flow
Multi-View Geometry
Epipolar geometry
Query
Reference: J. Sivic and A. Zisserman, Video Google: A Text Retrieval Approach to Object Matching in Videos, Proc. International Conference on Computer Vision, Nice, France, 2003, pp. 1470-1477.
Feature grouping
Object recognition
Characterization of parts
What is in an Image?
An image is an array/matrix of values (picture elements = pixels) on a plane which describe the world from the point of view of the observer. Because of the line of sight effect, this is a 2D representation of the 3D world. The meaning of the pixels depends on the sensors used for their acquisition.
Imaging Sensors
The information seen by the imaging device is digitised and stored as pixel values. Two important quantities of imaging sensors are:
Spatial resolution: How many pixels are there? (Image size)
Signal resolution: How many values per pixel?
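As a rough illustration of these two quantities, the sketch below treats an image as a list of rows and computes the number of values per pixel and the raw storage size. The function names, the tiny example image and the 640 x 480, 8-bit sensor are hypothetical and chosen only for illustration.

```python
def grey_levels(bits_per_pixel):
    """Signal resolution: number of distinct values one pixel can take."""
    return 2 ** bits_per_pixel


def image_storage_bits(width, height, bits_per_pixel, channels=1):
    """Raw (uncompressed) storage needed for an image, in bits."""
    return width * height * channels * bits_per_pixel


# A tiny 2 x 3 greyscale image stored as a list of rows (made-up values).
image = [[0, 128, 255],
         [64, 192, 32]]
height, width = len(image), len(image[0])

print("pixels:", width * height)                 # spatial resolution
print("values per pixel:", grey_levels(8))       # signal resolution, 8-bit
print("bytes for 640x480, 8-bit:", image_storage_bits(640, 480, 8) // 8)
```

Doubling the bits per pixel doubles the storage but squares the number of representable values, which is why the two resolutions are usually quoted separately.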
Electro-Magnetic Spectrum
Figure: bands of the electromagnetic spectrum (UV, Visible, NIR, SWIR, MWIR, LWIR), with wavelength ticks at 0.4, 1.0, 1.7, 2.5, 3.0, 5.0, 8.0 and 14.0 µm.
The human eye can see light between 400 and 700 nm.
Source: Wikipedia
CCD (2)
Generally, the light-sensitive units are arranged in an array whose topology is a lattice.
This is not always true, e.g. log-polar CCDs.
Colour CCDs:
Bayer filter: 1x Red, 1x Blue, 2x Green, because the human eye is more sensitive to green.
RGBE filter: 1x Red, 1x Blue, 1x Green, 1x Emerald (Cyan).
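The Bayer proportions can be checked with a small sketch that tiles a 2 x 2 pattern over a sensor grid and counts the filters of each colour. The RGGB ordering of the tile and the function names are assumptions made for illustration; real sensors use several orderings of the same tile.

```python
def bayer_mosaic(height, width):
    """Return a height x width grid of 'R', 'G', 'B' labels.

    Assumes an RGGB tile; other orderings of the same 2 x 2 tile exist.
    """
    tile = [["R", "G"],
            ["G", "B"]]
    return [[tile[r % 2][c % 2] for c in range(width)] for r in range(height)]


def colour_counts(mosaic):
    """Count how many sensor sites carry each colour filter."""
    counts = {"R": 0, "G": 0, "B": 0}
    for row in mosaic:
        for label in row:
            counts[label] += 1
    return counts


mosaic = bayer_mosaic(4, 4)
for row in mosaic:
    print(" ".join(row))
print(colour_counts(mosaic))
```

Each pixel site records only one colour, so a full colour image has to be interpolated from neighbouring sites.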
Bayer filter
RGBE filter
Source: Wikipedia
Bolometers
Invented by the astronomer Samuel Pierpont Langley in 1878. It is a device consisting of an "absorber" in contact with a heat sink through an insulator. The sink can be viewed as a reference for the absorber temperature, which is raised by the power of the incident electromagnetic wave.
Microbolometer
The microbolometer, a particular kind of bolometer, is the basis for thermal cameras. It is a grid of vanadium oxide or amorphous silicon heat sensors atop a corresponding grid of silicon. IR radiation from a specific range of wavelengths strikes the vanadium oxide and changes its electrical resistance. This resistance change is measured and processed into temperatures which can be represented graphically.
For a conventional radar, the footprint is governed by the size of the antenna (aperture). SAR (synthetic aperture radar) creates a synthetic aperture and delivers a 2D image. One dimension is the range (cross track), whereas the other is the azimuth (along track). Sonar and ultrasound work on the same principles but at different wavelengths.
SAR (2)
Nadir Track
Range
RADAR = Radio Detection and Ranging
NADIR = Opposite of zenith
SAR image of Venus
Source: Wikipedia
Source: Wikipedia
MRI
Functional MRI
Functional MRI (fMRI) measures signal changes in the brain that are due to changing neural activity. Increases in neural activity cause changes in the MR signal due to a change in the ratio of oxygenated to deoxygenated haemoglobin. Deoxygenated haemoglobin attenuates the MR signal.
fMRI of head: highlighted areas show primary visual cortex
Source: Wikipedia
CAT/CT
Good for showing bones
Not good for showing soft tissue
Camera Geometry
Figure: camera geometry, showing the aperture (diameter d), the optical axis (z), the focal length f, and the image plane with coordinates x', y'.
The aperture allows light to enter the camera.
The image plane is where the image is formed.
The focal length is the distance between the aperture and the image plane.
The optical axis passes through the center of the aperture and is perpendicular to it.
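Figure: an object with end points $x_t$ and $x_b$ and their images $x'_t$ and $x'_b$ on the image plane.
And, using the formula in the previous slide,
$x'_t = \frac{x_t f}{z}$, $x'_b = \frac{x_b f}{z}$ and $\Delta x' = x'_t - x'_b = \frac{(x_t - x_b) f}{z} = \frac{\Delta x\, f}{z}$
Hence, size transforms as $\Delta x' = 2 f \tan(\theta/2) \approx f\theta$, where $\theta$ is the angle subtended by the object at the aperture.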
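The size relation can be checked numerically. The sketch below assumes the projection formula from the previous slide has the unsigned form x' = f x / z, and that the object is centred on the optical axis so that the angle it subtends is 2 atan(h / 2z); the function names and numerical values are hypothetical.

```python
import math


def project(x, z, f):
    """Image-plane coordinate of a point at lateral offset x and depth z.

    Assumes the unsigned pinhole form x' = f * x / z.
    """
    return f * x / z


def image_size(x_top, x_bottom, z, f):
    """Size on the image plane of an object spanning x_bottom..x_top."""
    return project(x_top, z, f) - project(x_bottom, z, f)


f = 0.05   # focal length in metres (made-up value)
z = 10.0   # object distance in metres
h = 2.0    # object height in metres, centred on the optical axis

size = image_size(h / 2, -h / 2, z, f)
theta = 2 * math.atan(h / (2 * z))   # angle subtended by the object

print("projected size:      ", size)
print("2 f tan(theta / 2):  ", 2 * f * math.tan(theta / 2))
print("small-angle f*theta: ", f * theta)
```

For this geometry the tangent form agrees with the projected size, and the small-angle form differs by well under one percent because the object subtends only a few degrees.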
Distant object
Close object
Rays that pass through the camera aperture spread out and do not make a sharp point on the image. These rays need to be focussed to make a sharp point in the image.
The rays from close objects diverge more than those from distant objects.
For very distant objects, the rays are effectively parallel.
$\theta_{min} = 1.22\,\lambda / d$
$R_{min} = 1.22\,\lambda f / d$
Resolution
The resolution of a camera is the minimum separation between two points such that they appear separately on the image plane.
Since distant objects appear smaller and closer together, the resolution varies with respect to the distance.
The angle between separable objects does not vary with distance: this is the angular resolution.
The distance on the image plane does not vary: this is the image-plane resolution.
Camera Models
Pinhole camera
Camera with lenses
Pinhole Camera
Advantages
No distortion of image
Depth of field from a few cm to infinity
Wide angular field
Works on ultra-violet and X-rays
Disadvantages
Very limited light gathering
Poor resolution
Simplest camera
The pinhole (aperture $d$) must be small to get a sharp image.
But we need a large pinhole to get enough light!
Geometric: $R = d$
Diffraction: $R = 1.22\,\lambda f / d$
The best resolution occurs when these two are equal:
$d = 1.22\,\lambda f^{*} / d$
or
$f^{*} = d^{2} / (1.22\,\lambda)$
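To get a feel for the numbers, the sketch below evaluates the two blur terms and the balance point between them. It assumes the geometric blur is R = d and the diffraction blur is R = 1.22 λ f / d, with λ the wavelength; the function names, the 550 nm wavelength and the 10 cm focal length are hypothetical choices for illustration.

```python
import math


def geometric_blur(d):
    """Blur spot size from geometry alone: the pinhole diameter."""
    return d


def diffraction_blur(d, f, wavelength):
    """Blur spot size from diffraction: 1.22 * wavelength * f / d."""
    return 1.22 * wavelength * f / d


def optimal_focal_length(d, wavelength):
    """Focal length at which the two blur terms are equal."""
    return d ** 2 / (1.22 * wavelength)


def optimal_pinhole_diameter(f, wavelength):
    """Same balance point, solved for the diameter instead."""
    return math.sqrt(1.22 * wavelength * f)


wavelength = 550e-9   # green light, in metres (made-up choice)
f = 0.1               # focal length in metres (made-up choice)

d = optimal_pinhole_diameter(f, wavelength)
print("optimal diameter (m):", d)   # roughly a quarter of a millimetre
print("geometric blur (m):  ", geometric_blur(d))
print("diffraction blur (m):", diffraction_blur(d, f, wavelength))
```

A smaller hole would make the diffraction term dominate, and a larger one the geometric term, so the best sharpness sits at the crossover.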
Longer wavelength
Smaller aperture
Pinhole path
$\theta = 1.22\,\lambda / d$
The larger the aperture, the better the resolution.
$R = 1.22\,\lambda f / d$
The image-plane resolution is still $f\theta$.
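The effect of aperture size on resolution can be tabulated with a short sketch. It assumes the diffraction limit θ = 1.22 λ / d and an image-plane resolution of f θ; the function names, wavelength, focal length and aperture diameters are hypothetical values chosen for illustration.

```python
def angular_resolution(d, wavelength):
    """Diffraction-limited angular resolution in radians."""
    return 1.22 * wavelength / d


def image_plane_resolution(d, f, wavelength):
    """Smallest resolvable separation on the image plane: f * theta."""
    return f * angular_resolution(d, wavelength)


wavelength = 550e-9   # metres (made-up choice)
f = 0.05              # focal length in metres (made-up choice)

for d in (0.005, 0.010, 0.025):   # aperture diameters in metres
    theta = angular_resolution(d, wavelength)
    r = image_plane_resolution(d, f, wavelength)
    print(f"d = {d:.3f} m  theta = {theta:.3e} rad  R = {r:.3e} m")
```

Both quantities shrink in proportion to 1/d, which is the quantitative form of the statement that a larger aperture resolves finer detail.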
Illumination
The amount of light entering the camera is proportional to the area of the lens ($\pi d^2/4$).
The area covered by the image is proportional to $f^2$.
So the brightness of the image is proportional to $d^2/f^2$, i.e. it depends on the focal ratio $f/d$.
Brightness is controlled by a moveable aperture which changes $d$.
Aperture settings are referred to by a sequence of f-stops: f:1 is fully open, and each successive f-stop halves the brightness (so the aperture diameter is reduced by a factor of $\sqrt{2}$): f:1.4, f:2, f:2.8, f:4, f:5.6.
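The f-stop sequence follows directly from the brightness relation. The sketch below generates focal ratios N = f/d that grow by a factor of √2 per stop and evaluates the brightness relative to f:1 as 1/N²; the function names are hypothetical.

```python
import math


def f_stops(count):
    """Focal ratios N = f/d for successive stops, starting at f:1."""
    return [math.sqrt(2) ** k for k in range(count)]


def relative_brightness(n):
    """Image brightness relative to f:1, using brightness ~ d^2/f^2 = 1/N^2."""
    return 1.0 / n ** 2


for n in f_stops(6):
    print(f"f:{n:.1f}  relative brightness {relative_brightness(n):.4f}")
```

The computed ratios are powers of √2, so the markings found on lenses (such as f:5.6) are conventionally rounded versions of these values.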
Reflection
Absorption
Transmission
The BSDF
Bidirectional Scattering Distribution Function
Describes the way in which light is scattered by a surface.
BSDF = BRDF + BSSRDF + BTDF
BRDF - Bidirectional reflectance distribution function
BSSRDF - Bidirectional surface scattering reflectance distribution function (incl. subsurface scattering)
BTDF - Bidirectional transmittance distribution function
Source: Wikipedia
The BRDF
It describes the reflectance of an object as a function of the illumination, viewing geometry and wavelength. It is given by the ratio of reflected radiance (reflected flux per unit projected area per unit solid angle) to incident irradiance (incident flux per unit area).
Reference: F. Nicodemus, "Reflectance nomenclature and directional reflectance and emissivity," Appl. Opt., Vol. 9, 1970, pp. 1474-1475.
Radiance
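Power per unit projected area perpendicular to the ray, per unit solid angle in the direction of the ray.
Flux is given by $d\Phi = L(x,\omega)\cos\theta\, d\omega\, dA$.
Solid angle is proportional to the surface area $S$ of a projection of the object onto a sphere, divided by the square of its radius $R$ (in steradians, $\Omega = S/R^2$).
Figure: a surface element $dA$, a ray with solid angle $d\omega$, and the radiance $L(x,\omega)$.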
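A small numerical sketch of these two definitions follows. It assumes the solid angle is measured in steradians as S / R² and that the flux through a small patch is L cos θ dω dA, with θ the angle between the ray and the surface normal; the function names and input values are hypothetical.

```python
import math


def solid_angle(area_on_sphere, radius):
    """Solid angle in steradians of a projected area S on a sphere of radius R."""
    return area_on_sphere / radius ** 2


def differential_flux(radiance, theta, d_omega, d_area):
    """Flux through a small patch: L * cos(theta) * d_omega * d_area."""
    return radiance * math.cos(theta) * d_omega * d_area


# The whole sphere subtends 4*pi steradians, whatever its radius.
full_sphere = solid_angle(4 * math.pi * 2.0 ** 2, 2.0)
print("full sphere (sr):", full_sphere)

# Made-up values: radiance 100 W/(m^2 sr), ray tilted 60 degrees from the
# surface normal, a 0.01 sr cone and a 1 cm^2 patch.
flux = differential_flux(100.0, math.radians(60), 0.01, 1e-4)
print("flux (W):", flux)
```

The cosine factor accounts for the projected area: a patch tilted away from the ray intercepts less of the beam than one facing it directly.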
Example BRDFs
Oren and Nayar