Face-Based Auto-Focus
on TI Digital Camera Processor
TIDC
Feb 28, 2008
Agenda
Overview of passive auto-focus (AF)
and previous related projects
Face-based AF feature
Existing approaches
Real-time face detection
DM350 implementation
Demo/Q&A
Digital Camera Image Pipeline
A digital camera image pipeline consists of
many image processing components.
Some key components are shown below:
Passive Auto-Focus (AF)
Extract a measure of sharpness from image
Establish a feedback loop to reach peak or
in-focus position
[Figure: out-of-focus vs. in-focus example]
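The two steps above (a sharpness measure plus a search for its peak) can be sketched as follows. This is only an illustration under stated assumptions: the gradient-energy measure and the exhaustive sweep are generic choices, not necessarily the ones used in this project, and `fake_capture` is a hypothetical stand-in for moving the lens and grabbing a frame.

```python
def sharpness(image):
    """Gradient energy: sum of squared horizontal and vertical differences."""
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += (image[r][c + 1] - image[r][c]) ** 2
            if r + 1 < rows:
                total += (image[r + 1][c] - image[r][c]) ** 2
    return total


def global_search(capture_at, positions):
    """Visit every lens position and return the one with peak sharpness."""
    best_pos, best_score = None, -1
    for pos in positions:
        score = sharpness(capture_at(pos))
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos


def fake_capture(pos, focus=6):
    """Hypothetical camera: contrast drops as the lens moves away from focus."""
    contrast = max(0, 100 - 15 * abs(pos - focus))
    return [[contrast if (r + c) % 2 else 0 for c in range(8)]
            for r in range(8)]


best = global_search(fake_capture, range(12))
print("in-focus lens position:", best)
```

An exhaustive sweep is the simplest feedback loop; faster searches reduce the number of lens positions that have to be visited.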
Rule-based AF Search (1)
In previous work (Kehtarnavaz ’03), a fast
AF search named Rule-based Search (RS)
was developed. It achieves a shorter focusing
time than the standard Global Search (GS)
with comparable accuracy.
Rule-based AF Search (2)
Commercial vs. Developed Solution
(DSC class: 10Mpix, 3x zoom)
[http://www.dpreview.com]
Vendor    Model                 AF time, wide (s)   AF time, tele (s)
Canon     PowerShot SD900       0.60                0.82
Casio     Exilim EX-Z1000       0.46                0.70
Kodak     EasyShare V1003       0.69                2.06
Samsung   NV10                  0.60                0.60
Sony      Cyber-Shot DSC-N2     0.29                0.69
UTD       TI DM350 based DSC    0.21                0.33
Multi-Window AF
In follow-up work (Peddigari ’05), RS was
extended to multiple windows, which allows
focusing on objects appearing in different
parts of the image and thus supports
different photography situations.
[Figure: single-window vs. multi-window AF]
Low-Light AF
Recently (Gamadia ’07), a systematic
preprocessing approach has been introduced to
enable focusing in low light conditions.
[Figure: low-light example (~30 lux) with preprocessing]
AF Videoclips
Note: Videoclips shown correspond to the preview (low
resolution), not the capture (high resolution) mode
Example 1: GS vs. RS
Global search AF
Multi-win, rule-based AF
Example 2: Single vs. Multi-Window
Single-win, rule-based AF
Multi-win, rule-based AF
Example 3: Continuous AF
Off
On, Sharpness
Example 4: Low Light AF (16 lux)
Preprocessing Off
Preprocessing On
This Project: Face-based AF
Objective: Perform AF on faces (the object of
interest in the great majority of photographs)
Existing Approaches: Facial Features
Facial features consisting of eyes, nose or
mouth have been used for face detection
(e.g., Yow ’97)
Pros
High accuracy
Cons
Require access to the entire frontal face; fail for
profile or partial faces
Fail if some portion of face is obstructed, in
particular if eyes are covered (glasses, etc.)
Existing Approaches: Rule-based
Multi-resolution rule-based face detection
utilizing simple rules including positions and
relative distances between facial features
(e.g., Yang ’02)
Pros
Several different rules can be used together
Cons
Highly dependent on the strictness of the rules:
strict rules fail to detect faces, while overly
general rules generate many false positives
Existing Approaches: Skin Color
Human skin color is shown to be an effective
feature for performing real-time face
detection (e.g., Paschalakis ’04)
Pros
Able to detect profile or partial faces
Able to detect even if some portion of face is
obstructed (for example, a person wearing
sunglasses)
Cons
Additional post processing is needed as other
skin areas are also detected
Our Solution: Also Skin Color-based
Color Space
Different chrominance spaces can be used for
representing skin color, e.g.:
Normalized RGB (r,g)
HSI
YCbCr
Normalized YCbCr
[Figure panels: Y, Cb, Cr]
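As an illustration of two of the chrominance spaces listed above, the sketch below converts 8-bit RGB to YCbCr and to normalized (r, g). It assumes the full-range BT.601 coefficients used by JPEG/JFIF; the slides do not say which YCbCr variant was used, and the "Normalized YCbCr" space is not defined there, so it is not attempted here.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion for 8-bit RGB.

    Assumption: this variant is chosen for illustration only.
    Results are floats and are neither rounded nor clamped to 0..255.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def normalized_rg(r, g, b):
    """Normalized RGB chrominance (r, g); brightness is divided out."""
    total = r + g + b
    if total == 0:
        return 0.0, 0.0  # black pixel: chrominance is undefined, use zeros
    return r / total, g / total


print(rgb_to_ycbcr(128, 128, 128))
print(normalized_rg(10, 20, 10))
```

Both spaces separate brightness from color, which is why only the (Cb, Cr) or (r, g) pair is needed to model skin color.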
Face Database (1)
AR Database
Name: Aleix Martinez and Robert Benavente in the Computer Vision Center (CVC)
Color Images: Yes
Image Format: RAW
Image Size: 768x576
Number of background pictures per person: 0
Face Database (2)
PIE Database
Name: PIE Database
Number of unique people: 68
Number of pictures per person: 603 (approximately)
Number of background pictures per person: 13
Different Conditions: 13 different poses, 43 different illumination
conditions, and 4 different expressions
Face Database (3)
UOPB Database
Name: The University of Oulu Physics-Based Face Database
Color Images: Yes
Image Format: BMP
Image Size: 428 x 569
Number of unique people: 125
Number of pictures per person: 16
Number of background pictures per person: 0
Different Conditions: All frontal images: 16 different camera
calibrations and illuminations
Face Database (4)
DM350 Database
Name: The University of Texas at Dallas DM350 Face Database
Color Images: Yes
Image Format: RAW
Image Size: 640 x 480
Number of unique people: 30
Number of pictures per person: 3
Number of background pictures per person: 0
Different Conditions: All frontal images: different lighting conditions
Skin Color Distribution (1)
[Figure: two plots of skin color samples, Cr (vertical axis) against Cb (horizontal axis)]
SGM (1)
The skin color distribution can be represented by the
following single Gaussian model N(µ, Σ):
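\[
x_i^{\text{skin}} = \left[ Cb_i^{\text{skin}},\; Cr_i^{\text{skin}} \right]
\]
\[
\mu = [\mu_{Cb}, \mu_{Cr}] = n^{-1} \left[ \sum_i Cb_i^{\text{skin}},\; \sum_i Cr_i^{\text{skin}} \right]
\]
\[
\Sigma = n^{-1} \sum_i \left( x_i^{\text{skin}} - \mu \right) \left( x_i^{\text{skin}} - \mu \right)^{T}
\]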
The SGM is then used to construct a binary skin image:
an input pixel is marked as skin if the Mahalanobis
distance between its chrominance values and the SGM
falls within the 98% confidence area.
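A minimal sketch of the SGM fit and the binary skin image, in pure Python. One assumption is made loudly here: the "98% confidence area" is interpreted as the region where the squared Mahalanobis distance stays below the chi-square bound for two dimensions, which for probability p equals -2 ln(1 - p). The slides do not spell out how the bound was computed, and the sample values are made up.

```python
import math


def fit_sgm(samples):
    """Mean and 2x2 covariance (1/n normalisation) of (Cb, Cr) skin samples."""
    n = len(samples)
    mu_cb = sum(cb for cb, _ in samples) / n
    mu_cr = sum(cr for _, cr in samples) / n
    s_bb = sum((cb - mu_cb) ** 2 for cb, _ in samples) / n
    s_rr = sum((cr - mu_cr) ** 2 for _, cr in samples) / n
    s_br = sum((cb - mu_cb) * (cr - mu_cr) for cb, cr in samples) / n
    return (mu_cb, mu_cr), (s_bb, s_br, s_rr)


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    s_bb, s_br, s_rr = cov
    det = s_bb * s_rr - s_br * s_br
    d_cb, d_cr = x[0] - mu[0], x[1] - mu[1]
    return (s_rr * d_cb * d_cb - 2 * s_br * d_cb * d_cr
            + s_bb * d_cr * d_cr) / det


def confidence_threshold(p):
    """Assumed bound: squared distance enclosing probability p in 2-D."""
    return -2.0 * math.log(1.0 - p)


def skin_mask(cbcr_image, mu, cov, p=0.98):
    """Binary image: 1 where the pixel lies inside the confidence area."""
    limit = confidence_threshold(p)
    return [[1 if mahalanobis_sq(px, mu, cov) <= limit else 0 for px in row]
            for row in cbcr_image]


# Made-up training samples, for illustration only.
mu, cov = fit_sgm([(100, 150), (110, 160), (120, 150), (110, 140)])
print(skin_mask([[(120, 150), (10, 10)]], mu, cov))
```

Raising p enlarges the accepted ellipse, which is the confidence-level trade-off between missed skin pixels and false alarms.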
SGM (2)
Pros
Fast detection with very low computational
complexity
Cons
Sensitive to the confidence level: a higher
confidence level captures more skin pixels but
also increases false alarms
GMM (1)
The skin color distribution can be more accurately
modeled by a weighted combination of M normal
density functions given by:
\[
p(x) = \sum_{j=1}^{M} p(x \mid j)\, P(j)
\]
GMM (2)
The training process is done by using the
Expectation Maximization (EM) method
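\[
P(j)^{new} = \frac{1}{n} \sum_{i=1}^{n} P(j \mid x_i)
\]
\[
\mu_j^{new} = \frac{\sum_{i=1}^{n} x_i \, P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}
\]
\[
\Sigma_j^{new} = \frac{\sum_{i=1}^{n} \left[ x_i - \mu_j^{new} \right] \left[ x_i - \mu_j^{new} \right]^{T} P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}
\]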
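One EM iteration for a two-dimensional mixture can be sketched as below. This is an illustrative pure-Python version, not the project's training code: the posterior P(j | x_i) is obtained with Bayes' rule from the current parameters, covariances are stored as (s00, s01, s11) triples, and the data and starting values are made up. It has no safeguards against degenerate covariances or numerical underflow.

```python
import math


def gauss2(x, mu, cov):
    """2-D normal density; cov = (s00, s01, s11) of a symmetric matrix."""
    s00, s01, s11 = cov
    det = s00 * s11 - s01 * s01
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    q = (s11 * d0 * d0 - 2 * s01 * d0 * d1 + s00 * d1 * d1) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def em_step(xs, priors, mus, covs):
    """One EM iteration; returns the updated (priors, mus, covs)."""
    n, m = len(xs), len(priors)
    # E-step: posterior P(j | x_i) from the current parameters.
    post = []
    for x in xs:
        joint = [priors[j] * gauss2(x, mus[j], covs[j]) for j in range(m)]
        total = sum(joint)
        post.append([v / total for v in joint])
    # M-step: weighted prior, mean, and covariance of each component.
    new_priors, new_mus, new_covs = [], [], []
    for j in range(m):
        w = [post[i][j] for i in range(n)]
        wsum = sum(w)
        mu = (sum(wi * x[0] for wi, x in zip(w, xs)) / wsum,
              sum(wi * x[1] for wi, x in zip(w, xs)) / wsum)
        s00 = sum(wi * (x[0] - mu[0]) ** 2 for wi, x in zip(w, xs)) / wsum
        s01 = sum(wi * (x[0] - mu[0]) * (x[1] - mu[1])
                  for wi, x in zip(w, xs)) / wsum
        s11 = sum(wi * (x[1] - mu[1]) ** 2 for wi, x in zip(w, xs)) / wsum
        new_priors.append(wsum / n)
        new_mus.append(mu)
        new_covs.append((s00, s01, s11))
    return new_priors, new_mus, new_covs


# Made-up (Cb, Cr) samples forming two clusters, for illustration only.
data = [(100, 150), (102, 153), (98, 149), (101, 147),
        (140, 120), (142, 123), (138, 119), (141, 117)]
params = em_step(data, [0.5, 0.5], [(105, 145), (135, 125)],
                 [(25, 0, 25), (25, 0, 25)])
print(params[1])
```

In practice the step is repeated until the parameters stop changing.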
GMM (3)
Pros
Higher detection accuracy
Cons
Higher computational complexity
Binary skin images generated by
SGM and GMM
Lookup Table (2)
Using the lookup table approach, GMM-based face
detection took 10 to 25 ms on the DM350. This is
considered real-time throughput, since it adds an
acceptable increase to passive AF, which runs at
210 to 330 ms.
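The lookup table idea can be sketched as follows, assuming the table is indexed by 8-bit (Cb, Cr) values and filled once offline, so that per-pixel classification at run time is a single table read. The table layout is an assumption, and `toy_is_skin` is a hypothetical placeholder rule standing in for the trained skin color model.

```python
def build_skin_lut(is_skin):
    """Evaluate the classifier once for every 8-bit (Cb, Cr) pair."""
    return [bytearray(1 if is_skin(cb, cr) else 0 for cr in range(256))
            for cb in range(256)]


def classify(cbcr_image, lut):
    """Run-time path: one table read per pixel, no model evaluation."""
    return [[lut[cb][cr] for cb, cr in row] for row in cbcr_image]


def toy_is_skin(cb, cr):
    # Hypothetical box rule, used only to make the example runnable.
    return 90 <= cb <= 130 and 130 <= cr <= 170


lut = build_skin_lut(toy_is_skin)
print(classify([[(100, 150), (10, 10)]], lut))
```

The table holds 256 x 256 entries of one byte each, so the cost of the model is paid once when the table is built rather than for every pixel of every frame.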
Post Processing
Skin color information alone is not enough to
detect faces, because other exposed skin areas
and similar colors in the background are also
detected.
Paxel Image (1)
The binary skin image is divided into blocks
or paxels.
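A minimal sketch of the block division, assuming square paxels and a paxel that is set when at least half of its pixels are skin. Both the paxel size and the 50% rule are illustrative assumptions; the slides do not state them.

```python
def paxelize(mask, size, min_fraction=0.5):
    """Reduce a binary skin mask to one value per size x size block.

    Assumption: a paxel is set when at least min_fraction of its pixels
    are skin. Partial blocks at the right and bottom edges are dropped.
    """
    rows, cols = len(mask) // size, len(mask[0]) // size
    need = min_fraction * size * size
    paxels = []
    for pr in range(rows):
        row = []
        for pc in range(cols):
            count = sum(mask[pr * size + r][pc * size + c]
                        for r in range(size) for c in range(size))
            row.append(1 if count >= need else 0)
        paxels.append(row)
    return paxels


mask = [[1, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 1]]
print(paxelize(mask, 2))
```

Working on paxels instead of pixels shrinks the image that later shape processing has to examine and suppresses isolated skin-colored pixels.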
Simple Face Shape Processing
Connectivity
Candidate regions are first selected by examining the
8-neighborhood connectivity among the paxels
Face size
A minimum face size of 40x40 pixels in a 640x480 image is
used to obtain candidate face regions (determined based on
ROC curve analysis)
Aspect ratio of face
Standard golden ratio 1.618 (Yang ’02) works well only for
frontal faces
To accommodate rotated faces, a more flexible aspect ratio
(between 0.8 and 1.8) is used here to detect face regions
Face score
Best face region is then obtained using face scores, or the
amount of skin presence in the face regions
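The four checks above can be sketched on a paxel grid as follows. Several details are assumptions made for illustration: the aspect ratio is taken as bounding-box height over width, the face score is the fraction of the bounding box covered by skin paxels, and the paxel size of 20 pixels is arbitrary.

```python
from collections import deque


def find_regions(paxels):
    """Group set paxels into regions using 8-neighborhood connectivity."""
    rows, cols = len(paxels), len(paxels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not paxels[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, cells = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                cells.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and paxels[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            regions.append(cells)
    return regions


def best_face(paxels, paxel_size, min_size=40, ratio=(0.8, 1.8)):
    """Return (score, top, left, height, width) in pixels, or None."""
    best = None
    for cells in find_regions(paxels):
        rs = [r for r, _ in cells]
        cs = [c for _, c in cells]
        h = (max(rs) - min(rs) + 1) * paxel_size
        w = (max(cs) - min(cs) + 1) * paxel_size
        if h < min_size or w < min_size:        # face size check
            continue
        if not ratio[0] <= h / w <= ratio[1]:   # assumed height/width ratio
            continue
        score = len(cells) * paxel_size ** 2 / (h * w)  # assumed skin fill
        if best is None or score > best[0]:
            best = (score, min(rs) * paxel_size, min(cs) * paxel_size, h, w)
    return best


grid = [[1, 1, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1]]
print(best_face(grid, 20))
```

In this toy grid the wide bar and the single paxel are rejected by the size check, leaving the 60 x 40 pixel block as the only candidate.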
Shape Processing Example
Real-Time DM350 Implementation (1)
Real-Time DM350 Implementation (2)
Summary
Over the last four years, an R&D program
between the SIP Lab at UTD and TI has
looked into various improvements of digital
and cell-phone camera image pipelines.
This effort is the latest project completed
under this program.
Demo/Q&A