Face-AF TI

Real-Time Implementation of
Face-Based Auto-Focus
on TI Digital Camera Processor
N. Kehtarnavaz, M. Gamadia, and M. Rahman

Signal and Image Processing Lab
University of Texas at Dallas
TIDC
Feb 28, 2008
1
Agenda
Overview of passive auto-focus (AF)
and previous related projects
Face-based AF feature
Existing approaches
Real-time face detection
DM350 implementation
Demo/Q&A
2
Digital Camera Image Pipeline
Digital camera image pipeline consists of
many image processing components.
Some key components are shown below:
3
Passive Auto-Focus (AF)
Extract a measure of sharpness from image
Establish a feedback loop to reach peak or
in-focus position
[Figure: example images, out of focus vs. in focus]
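The two steps above can be sketched as follows. This is a minimal illustration, not the DM350 implementation: the gradient-energy sharpness measure, the exhaustive global search, and the `simulated_capture` toy camera are all assumptions made for the example.

```python
def sharpness(image):
    """Gradient-energy focus measure: sum of squared neighbor differences."""
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += (image[r][c + 1] - image[r][c]) ** 2
            if r + 1 < rows:
                total += (image[r + 1][c] - image[r][c]) ** 2
    return total


def global_search(capture, lens_positions):
    """Visit every lens position and return the one with the highest sharpness."""
    best_pos, best_val = None, float("-inf")
    for pos in lens_positions:
        val = sharpness(capture(pos))
        if val > best_val:
            best_pos, best_val = pos, val
    return best_pos


def simulated_capture(pos, in_focus=6):
    """Hypothetical camera: checkerboard whose contrast falls off away from in_focus."""
    contrast = max(0, 100 - 15 * abs(pos - in_focus))
    return [[contrast * ((r + c) % 2) for c in range(8)] for r in range(8)]


# The feedback loop settles on the lens position with peak sharpness.
print(global_search(simulated_capture, range(12)))
```

A real search would move the lens motor and read sharpness statistics from the camera hardware; faster searches visit fewer positions than this exhaustive loop.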
4
Rule-based AF Search (1)
In previous work (Kehtarnavaz ’03), a fast
AF search named Rule-based Search (RS)
was developed. It achieves a shorter focusing
time than the standard Global Search (GS)
with comparable accuracy.
5
Rule-based AF Search (2)
Commercial vs. Developed Solution
(DSC class: 10Mpix, 3x zoom)
[http://www.dpreview.com]

Vendor    Model                 AF time, wide (s)   AF time, tele (s)
Canon     PowerShot SD900       0.60                0.82
Casio     Exilim EX-Z1000       0.46                0.70
Kodak     EasyShare V1003       0.69                2.06
Samsung   NV10                  0.60                0.60
Sony      Cyber-Shot DSC-N2     0.29                0.69
UTD       TI DM350 based DSC    0.21                0.33
6
Multi-Window AF
In a follow-up work (Peddigari ’05), RS was
extended to multi-windows which allowed
focusing on objects appearing in different
parts of the image, thus supporting
different photography situations.
[Figure: single-window vs. multi-window AF, out of focus vs. in focus]
7
Continuous AF
In another extension (Gamadia ’06), an
improved continuous AF feature was
developed in order to maintain focus
without user half-press.
8
Low-Light AF
Recently (Gamadia ’07), a systematic
preprocessing approach has been introduced to
enable focusing in low light conditions.
[Figure: low-light (~30 lux) example with preprocessing, out of focus vs. in focus]
9
AF Videoclips
Note: Videoclips shown correspond to the preview (low
resolution), not the capture (high resolution) mode
Example 1: GS vs. RS
Global search AF
Multi-win, rule-based AF
Example 2: Single vs. Multi-Window
Single-win, rule-based AF
Multi-win, rule-based AF
Example 3: Continuous AF
Off
On, Sharpness
Example 4: Low Light AF (16 lux)
Preprocessing Off
Preprocessing On
10
This Project: Face-based AF
Objective: Perform AF on faces (object of
interest in great majority of photographs taken)
Solution is software-based, no dedicated processor
Development of a computationally efficient algorithm suitable for
real-time deployment on DM350 or similar camera processors
Solution to be relatively robust to face rotation, occlusion,
different lighting conditions
11
Existing Approaches (1)
Nearly all major digital camera manufacturers,
including Canon, Fuji, Nikon, Panasonic,
Pentax, Samsung and Sony, offer camera
models with face detection features.
The difference here is a software
approach for achieving robust face-based
auto-focusing, i.e., without utilizing any
dedicated face detection hardware.
12
Existing Approaches (2)
The image processing literature includes plenty of
face detection algorithms based on facial features,
skin color, face shape, etc.
Although some of these algorithms have been
reported to be capable of achieving high detection
rates, very few of them are suitable for real-time
software deployment on digital or cell-phone
camera processors due to their high computational
and memory demands.
13
Existing Approaches: Facial Features
Facial features consisting of eyes, nose or
mouth have been used for face detection
(e.g., Yow ’97)
Pros
High accuracy
Cons
Require access to the entire frontal face, fail for
profile or partial faces
Fail if some portion of face is obstructed, in
particular if eyes are covered (glasses, etc.)
14
Existing Approaches: Rule-based
Multi-resolution rule-based face detection
utilizing simple rules including positions and
relative distances between facial features
(e.g., Yang ’02)
Pros
Several different rules can be used together
Cons
Very much dependent on the strictness of rules:
strict rules fail to detect faces, while overly general
rules generate many false positives
15
Existing Approaches: Skin Color
Human skin color is shown to be an effective
feature for performing real-time face
detection (e.g., Paschalakis ’04)
Pros
Able to detect profile or partial faces
Able to detect even if some portion of face is
obstructed (for example, a person wearing
sunglasses)
Cons
Additional post processing is needed as other
skin areas are also detected
16
Our Solution: Also Skin Color-based
Although different people have different skin
colors, several studies have shown that skin
chrominances form a tight color cluster.
Two models were considered to describe this
cluster:
Single Gaussian Model (SGM)
Gaussian Mixture Model (GMM)
17
Color Space
Different chrominance spaces can be used for
representing skin color, e.g.:
Normalized RGB (r,g)
HSI
YCbCr
Normalized YCbCr
Since raw Bayer pattern images captured by
an image sensor are transformed into the
YCbCr color space for compression purposes
within a digital camera, YCbCr was chosen
here to provide the chrominance information
(Normalized YCbCr provided a similar outcome).
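For readers working from RGB data, the chrominance pair (Cb, Cr) can be obtained as sketched below. The full-range 8-bit conversion with ITU-R BT.601 weights (as used in JFIF/JPEG) is an assumption for illustration; the exact conversion performed inside the DM350 pipeline is not stated here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range 8-bit YCbCr (BT.601 weights, JFIF offsets)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        # Round to the nearest integer and keep the result in the 8-bit range.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


# Only the chrominance pair is used for skin modeling; luminance Y is dropped.
_, cb, cr = rgb_to_ycbcr(128, 128, 128)
print(cb, cr)
```

Dropping Y leaves a two-dimensional feature, which is what keeps the skin models in the following slides small.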
18
Model Training
Various skin areas from a face image were
selected and the corresponding pixels were
mapped into the CbCr space.
[Figure: Y, Cb, and Cr components of the selected skin areas]
19
Face Database (1)
AR Database
Name: Aleix Martinez and Robert Benavente in the Computer Vision Center (CVC)
Color Images: Yes
Image Format: RAW
Image Size: 768x576
Number of unique people: 126
Number of pictures per person: 26
Number of background pictures per person: 0
Different Conditions: different facial expressions, illumination conditions, and occlusions (sun glasses and scarf)
20
Face Database (2)
PIE Database
Name: PIE Database
Color Images: Yes
Image Format: JPEG
Image Size: 640 x 486
Number of unique people: 68
Number of pictures per person: 603 (approximately)
Number of background pictures per person: 13
Different Conditions: 13 different poses, 43 different illumination conditions, and 4 different expressions
21
Face Database (3)
UOPB Database
Name: The University of Oulu Physics-Based Face Database
Color Images: Yes
Image Format: BMP
Image Size: 428 x 569
Number of unique people: 125
Number of pictures per person: 16
Number of background pictures per person: 0
Different Conditions: All frontal images: 16 different camera calibrations and illuminations
22
Face Database (4)
DM350 Database
Name: The University of Texas at Dallas DM350 Face Database
Color Images: Yes
Image Format: RAW
Image Size: 640 x 480
Number of unique people: 30
Number of pictures per person: 3
Number of background pictures per person: 0
Different Conditions: All frontal images: different lighting conditions
23
Skin Color Distribution (1)
[Figure: scatter plot in the CbCr plane; Cb on the horizontal axis, Cr on the vertical axis, both from 0 to 250]
Distribution of human skin color within the
chrominance CbCr space for 250 faces
corresponding to the AR, PIE, and UOPB
databases
24
[Figure: scatter plot in the CbCr plane; Cb on the horizontal axis, Cr on the vertical axis, both from 0 to 250]
Distribution of human skin color within the
chrominance CbCr space for 30 sample
faces collected by the DM350 platform
25
Example: Skin color distribution corresponding
to the four manually selected skin areas
26
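SGM (1)
The skin color distribution can be represented by the
following single Gaussian model N(µ, Σ):
$$x_i^{skin} = \left[ Cb_i^{skin},\ Cr_i^{skin} \right]$$
$$\mu = \left[ \mu_{Cb},\ \mu_{Cr} \right] = n^{-1} \left[ \sum_i Cb_i^{skin},\ \sum_i Cr_i^{skin} \right]$$
$$\Sigma = n^{-1} \sum_i \left( x_i^{skin} - \mu \right) \left( x_i^{skin} - \mu \right)^T$$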
SGM is then used to construct a binary image
representing the skin color pixels of an input image
within the 98% confidence area in terms of the
Mahalanobis distance between the input image
chrominance pixels and the SGM model.
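A minimal sketch of the SGM fit and the Mahalanobis test follows. Two things are assumptions rather than details from the slides: the 98% confidence area is read as the chi-square region with 2 degrees of freedom, which gives the squared-distance threshold -2 ln(1 - 0.98), about 7.82, and the five (Cb, Cr) training samples are made-up numbers.

```python
import math


def fit_sgm(samples):
    """Return the mean and 2x2 covariance of (Cb, Cr) skin samples."""
    n = len(samples)
    mu_cb = sum(cb for cb, _ in samples) / n
    mu_cr = sum(cr for _, cr in samples) / n
    s_bb = sum((cb - mu_cb) ** 2 for cb, _ in samples) / n
    s_rr = sum((cr - mu_cr) ** 2 for _, cr in samples) / n
    s_br = sum((cb - mu_cb) * (cr - mu_cr) for cb, cr in samples) / n
    return (mu_cb, mu_cr), ((s_bb, s_br), (s_br, s_rr))


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det


def skin_threshold(confidence=0.98):
    """Squared-distance bound enclosing `confidence` of a 2-D Gaussian."""
    return -2.0 * math.log(1.0 - confidence)


def is_skin(x, mu, cov, confidence=0.98):
    """Binary skin decision for one (Cb, Cr) pixel."""
    return mahalanobis_sq(x, mu, cov) <= skin_threshold(confidence)


# Hypothetical training pixels, for illustration only.
SAMPLES = [(110, 150), (112, 152), (108, 149), (111, 153), (109, 146)]
MU, COV = fit_sgm(SAMPLES)
print(is_skin((110, 150), MU, COV), is_skin((60, 200), MU, COV))
```

Applying `is_skin` to every pixel yields the binary skin image; raising `confidence` enlarges the accepted ellipse.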
27
SGM (2)
Pros
Fast detection with very low computational
complexity
Cons
Sensitive to the confidence level: a higher
confidence level captures more of the skin pixels
but also increases false alarms
28
GMM (1)
The skin color distribution can be more accurately
modeled by a weighted combination of M normal
density functions given by:
$$p(x) = \sum_{j=1}^{M} p(x \mid j)\, P(j)$$
A study has shown that the use of two Gaussians
(M=2) is adequate (Caetano ’03)
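For reference, the terms of this mixture can be written out for the two-dimensional chrominance vector x = [Cb, Cr]. The expressions below are the standard bivariate normal density and Bayes' rule; the subscripted symbols for the mean and covariance of component j are notation introduced here, chosen to match the training equations.

```latex
p(x \mid j) = \frac{1}{2\pi \, |\Sigma_j|^{1/2}}
              \exp\!\left( -\tfrac{1}{2} (x - \mu_j)^T \Sigma_j^{-1} (x - \mu_j) \right)

P(j \mid x) = \frac{p(x \mid j)\, P(j)}{\sum_{k=1}^{M} p(x \mid k)\, P(k)}
```

The posterior P(j | x) is the quantity that weights each sample during training.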
29
GMM (2)
The training process is done by using the
Expectation Maximization (EM) method
$$P(j)^{new} = \frac{1}{n} \sum_{i=1}^{n} P(j \mid x_i)$$
$$\mu_j^{new} = \frac{\sum_{i=1}^{n} x_i \, P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}$$
$$\Sigma_j^{new} = \frac{\sum_{i=1}^{n} \left[ x_i - \mu_j^{new} \right] \left[ x_i - \mu_j^{new} \right]^T P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}$$
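One EM iteration for a two-dimensional mixture can be sketched as below. The code mirrors the three update equations; the sample points, starting parameters, and function names are hypothetical, and practical details such as covariance regularization and a convergence test are left out.

```python
import math


def gauss2(x, mu, cov):
    """Bivariate normal density with a symmetric 2x2 covariance."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def em_step(xs, priors, mus, covs):
    """Run one E-step and one M-step; return updated priors, means, covariances."""
    n, m = len(xs), len(priors)
    # E-step: posterior P(j | x_i) for every sample and component.
    post = []
    for x in xs:
        w = [priors[j] * gauss2(x, mus[j], covs[j]) for j in range(m)]
        total = sum(w)
        post.append([wj / total for wj in w])
    # M-step: re-estimate each component from the weighted samples.
    new_priors, new_mus, new_covs = [], [], []
    for j in range(m):
        nj = sum(post[i][j] for i in range(n))
        mu = (sum(post[i][j] * xs[i][0] for i in range(n)) / nj,
              sum(post[i][j] * xs[i][1] for i in range(n)) / nj)
        s_bb = sum(post[i][j] * (xs[i][0] - mu[0]) ** 2 for i in range(n)) / nj
        s_rr = sum(post[i][j] * (xs[i][1] - mu[1]) ** 2 for i in range(n)) / nj
        s_br = sum(post[i][j] * (xs[i][0] - mu[0]) * (xs[i][1] - mu[1])
                   for i in range(n)) / nj
        new_priors.append(nj / n)
        new_mus.append(mu)
        new_covs.append(((s_bb, s_br), (s_br, s_rr)))
    return new_priors, new_mus, new_covs


# Hypothetical (Cb, Cr) samples forming two loose groups.
XS = [(108, 148), (110, 150), (112, 153), (111, 149),
      (120, 160), (122, 163), (121, 161), (119, 162)]
PRIORS, MUS, COVS = em_step(
    XS, [0.5, 0.5], [(110, 150), (120, 160)],
    [((4.0, 0.0), (0.0, 4.0)), ((4.0, 0.0), (0.0, 4.0))])
print(PRIORS, MUS)
```

In practice the step is repeated until the parameters stop changing; this training runs offline, so its cost does not affect the camera.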
30
GMM (3)
Pros
Higher detection accuracy
Cons
Higher computational complexity
31
Binary skin images generated by
SGM and GMM
GMM captures the skin area more effectively than
SGM, as it is a more accurate model. However…
32
GMM Time Issue
GMM increases the skin detection
performance at the expense of much higher
computational time.
Floating-point calculation for the Gaussian
density functions adds to the computational
time.
Face detection time with a two-Gaussian
mixture model: 5 to 6 seconds, not real-time!
33
Lookup Table (1)
To overcome the time issue, the probability was
calculated for all possible Cb-Cr combinations
and stored in a lookup table.
As the variation of skin color in the
chrominance space is small compared to
the entire color space, it was found that a
50x50 lookup table was adequate for covering
the skin color cluster.
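The lookup-table idea can be sketched as follows. The table is filled once offline and each pixel then costs one index operation instead of a floating-point density evaluation. The window origin (`CB0`, `CR0`) and the `toy_density` function are placeholders for illustration; they are not the trained model or the window used on the DM350.

```python
import math


def build_lut(density, cb0, cr0, size=50):
    """Tabulate density(cb, cr) for every integer pair in a size x size window."""
    return [[density(cb0 + i, cr0 + j) for j in range(size)] for i in range(size)]


def lut_lookup(lut, cb, cr, cb0, cr0):
    """Return the tabulated value, or 0.0 for chrominance outside the window."""
    i, j = cb - cb0, cr - cr0
    if 0 <= i < len(lut) and 0 <= j < len(lut[0]):
        return lut[i][j]
    return 0.0


def toy_density(cb, cr):
    """Stand-in for the trained skin density (hypothetical, single bump)."""
    return math.exp(-((cb - 110) ** 2 + (cr - 150) ** 2) / 50.0)


CB0, CR0 = 85, 125  # hypothetical corner of the 50x50 window
LUT = build_lut(toy_density, CB0, CR0)
print(lut_lookup(LUT, 110, 150, CB0, CR0), lut_lookup(LUT, 10, 10, CB0, CR0))
```

Because the table has a fixed size, swapping `toy_density` for a mixture with more components changes only the offline fill, not the per-pixel lookup.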
34
Lookup Table (2)
Using the lookup table approach, face
detection using GMM took 10 to 25 ms on
DM350. This is considered real-time
throughput, as it adds an acceptable time
increase to passive AF running at 210 to 330
ms.
Note that since the size of the lookup table is fixed,
increasing the number of Gaussians does not
alter this time.
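The quoted timings can be put in proportion with a few lines of arithmetic. Pairing the extremes of the ranges (10 ms against 330 ms, 25 ms against 210 ms) is an assumption made here to bound the overhead; the slides do not say which detection time goes with which AF time.

```python
def overhead_percent(extra_ms, base_ms):
    """Extra time expressed as a percentage of the base time."""
    return 100.0 * extra_ms / base_ms


# Figures quoted on the slides, in milliseconds.
AF_MS = (210, 330)        # passive AF time
LUT_MS = (10, 25)         # GMM face detection using the lookup table
DIRECT_MS = (5000, 6000)  # GMM face detection with direct evaluation

smallest = overhead_percent(LUT_MS[0], AF_MS[1])  # 10 ms added to 330 ms
largest = overhead_percent(LUT_MS[1], AF_MS[0])   # 25 ms added to 210 ms
print(f"overhead: {smallest:.1f}% to {largest:.1f}%")
print(f"speed-up: {DIRECT_MS[0] / LUT_MS[1]:.0f}x to {DIRECT_MS[1] / LUT_MS[0]:.0f}x")
```

Under this pairing the added detection time is roughly 3% to 12% of the AF time, while the lookup table is about 200 to 600 times faster than direct evaluation.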
35
Lookup Table for GMM
36
Post Processing
It is not possible to detect faces using only
skin color information because of other
exposed skin areas, and also due to the
presence of similar colors in the background.
A fast post processing step is needed. It uses:
Blocks, or so-called paxels, to reduce the data to be
processed
Simple shape processing
37
Paxel Image (1)
The binary skin image is divided into blocks
or paxels.
The total skin area within a paxel is computed;
if it is greater than or equal to 50% of the
paxel area, the paxel is labeled as
skin.
This step significantly reduces the amount of
data to be processed, speeding up post
processing.
38
Paxel Image (2)
Example: A 640X480 size skin image is
converted to 64X48 size paxel image when
using a paxel size of 10X10.
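The paxel reduction can be sketched as below, using the 10x10 paxel size and the 50% rule from the slides. The function name and the small test pattern are illustrative; image dimensions are assumed to be multiples of the paxel size.

```python
def paxel_image(skin, paxel=10, min_fraction=0.5):
    """Reduce a binary skin image (rows of 0/1) to one label per paxel."""
    rows, cols = len(skin) // paxel, len(skin[0]) // paxel
    out = []
    for pr in range(rows):
        row = []
        for pc in range(cols):
            # Count skin pixels inside this paxel.
            count = sum(skin[pr * paxel + r][pc * paxel + c]
                        for r in range(paxel) for c in range(paxel))
            row.append(1 if count >= min_fraction * paxel * paxel else 0)
        out.append(row)
    return out


# 640x480 binary image: 480 rows of 640 pixels, all non-skin to start.
SKIN = [[0] * 640 for _ in range(480)]
for r in range(10):
    for c in range(5):        # 50 skin pixels in paxel (0, 0): exactly 50%
        SKIN[r][c] = 1
    for c in range(10, 14):   # 40 skin pixels in paxel (0, 1): below 50%
        SKIN[r][c] = 1

PAXELS = paxel_image(SKIN)
print(len(PAXELS), len(PAXELS[0]), PAXELS[0][:2])
```

The result has 48 rows of 64 labels, so later stages handle 3,072 values instead of 307,200 pixels.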
39
Simple Face Shape Processing
Connectivity
Candidate regions are first selected by examining the
8-neighborhood connectivity among the paxels
Face size
A minimum face size of 40X40 pixels in a 640X480 image is
used to obtain candidate face regions (determined based on
ROC curve analysis)
Aspect ratio of face
Standard golden ratio 1.618 (Yang ’02) works well only for
frontal faces
To accommodate rotated faces, a more flexible aspect ratio
range (0.8 to 1.8) is used here to detect face regions
Face score
Best face region is then obtained using face scores, or the
amount of skin presence in the face regions
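The four shape-processing steps can be sketched on a paxel image as follows. Several details are assumptions for illustration: the aspect ratio is taken as bounding-box height over width, the 40x40 pixel minimum is expressed as 4x4 paxels (10x10 paxel size), and the face score is the count of skin paxels in the region.

```python
from collections import deque


def face_regions(paxels, min_side=4, min_ratio=0.8, max_ratio=1.8):
    """Return regions that pass the connectivity, size, and aspect-ratio checks."""
    rows, cols = len(paxels), len(paxels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not paxels[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search over 8-connected skin paxels.
            queue, members = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                members.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and paxels[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            top = min(r for r, _ in members)
            bottom = max(r for r, _ in members)
            left = min(c for _, c in members)
            right = max(c for _, c in members)
            height, width = bottom - top + 1, right - left + 1
            if height < min_side or width < min_side:
                continue  # smaller than the minimum face size
            if not (min_ratio <= height / width <= max_ratio):
                continue  # outside the accepted aspect-ratio range
            regions.append({"box": (top, left, bottom, right),
                            "score": len(members)})
    return regions


def best_face(paxels):
    """Pick the region with the highest face score, or None if none qualifies."""
    regions = face_regions(paxels)
    return max(regions, key=lambda reg: reg["score"]) if regions else None


# Toy 12x16 paxel image: a 6x4 blob, a 4x4 blob, and a 2x10 strip.
GRID = [[0] * 16 for _ in range(12)]
for r in range(1, 7):
    for c in range(1, 5):
        GRID[r][c] = 1
for r in range(0, 4):
    for c in range(10, 14):
        GRID[r][c] = 1
for r in range(9, 11):
    for c in range(6, 16):
        GRID[r][c] = 1
print(best_face(GRID))
```

The strip is rejected by the size check, both blobs pass the ratio check, and the larger blob wins on score.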
40
Shape Processing Example
Candidate regions after connectivity
Face regions after face size + aspect ratio
Best face region after scoring
41
Sample outcome of skin color
face detection algorithm
42
Real-Time DM350 Implementation (1)
43
Real-Time DM350 Implementation (2)
Extensive experimentation (200 scenes) was done:
In 88% of cases, the best (closest) face area was
detected and focusing was done on that area
Remaining 12% of cases:
In 8% of cases, hands or other exposed parts of the body
were picked. Note: the objective here was auto-focusing
rather than face recognition, so these cases were acceptable since
these skin areas were on the same focal plane as the faces.
In 4% of cases, a well-defined sharpness peak associated
with a detected face area could not be found. Auto-focusing
switched to the original AF using the paxel area containing
no face but having the best sharpness function.
44
Face-based AF Videoclips
Single and multiple faces
Frontal face
Profile face
Multiple faces
Different lighting conditions
Fluorescent light
Incandescent light
Mixed light (fluorescent+ incandescent)
Outdoor
45
Summary
Over the last four years, an R&D program
between the SIP Lab at UTD and TI has
looked into various improvements of
digital and cell-phone camera image
pipelines. This effort is the latest
project completed under this program.
Introduced a real-time software-based
solution for achieving face-based passive
auto-focusing.
46
References
T. Caetano, S. Olabarriaga, and D. Barone, “Do mixture models in chromaticity space improve skin detection?” Pattern Recognition, Vol. 36, No. 12, pp. 3019-3021, December 2003.
M. Gamadia and N. Kehtarnavaz, “A real-time continuous automatic focus algorithm for digital cameras,” in Proceedings of IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 163-167, March 2006.
M. Gamadia, N. Kehtarnavaz, and K. Roberts-Hoffman, “Low-light auto-focus enhancement for digital and cell-phone camera image pipelines,” IEEE Transactions on Consumer Electronics, Vol. 53, No. 2, pp. 249-257, May 2007.
N. Kehtarnavaz and H.-J. Oh, “Development and real-time implementation of a rule-based auto-focus algorithm,” Journal of Real-Time Imaging, Vol. 9, No. 3, pp. 197-203, June 2003.
C. Kotropoulos and I. Pitas, “Rule-based face detection in frontal views,” in Proc. IEEE ICASSP, Vol. 4, pp. 2537-2540, April 1997.
A. Martinez and R. Benavente, “The AR face database,” CVC Technical Report, No. 24, June 1998.
S. McKenna, S. Gong, and Y. Raja, “Modeling facial color and identity with Gaussian mixtures,” Pattern Recognition, Vol. 31, No. 12, pp. 1883-1892, December 1998.
V. Peddigari, M. Gamadia, and N. Kehtarnavaz, “Real-time implementation issues in passive automatic focusing for digital still cameras,” Journal of Imaging Science and Technology, Vol. 49, No. 2, pp. 114-123, Mar/Apr 2005.
M. Rahman, M. Gamadia, and N. Kehtarnavaz, “Real-time face-based auto-focus for digital still and cell-phone cameras,” submitted to 2008 IEEE Southwest Symposium on Image Analysis and Interpretation, Mar 2008.
T. Sim, S. Baker, and M. Bsat, “The CMU pose, illumination, and expression database,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 12, pp. 1615-1618, December 2003.
G. Yang and T. Huang, “Human face detection in complex background,” Pattern Recognition, Vol. 27, No. 1, pp. 53-63, January 1994.
J. Yang and A. Waibel, “A real-time face tracker,” in Proceedings of IEEE Workshop on Applications of Computer Vision, pp. 142-147, December 1996.
M. Yang, D. Kriegman, and N. Ahuja, “Detecting faces in images: a survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002.
K. Yow and R. Cipolla, “Feature-based human face detection,” Image and Vision Computing, Vol. 15, No. 9, pp. 713-735, September 1997.
S. Paschalakis and M. Bober, “Real-time face detection and tracking for mobile videoconferencing,” Real-Time Imaging, Vol. 10, pp. 81-94, April 2004.
Texas Instruments, Inc., TMS320DM350 CPU and Peripherals Technical Reference Manual, 2007.
47
Demo/Q&A
48
Copyright © 2008, Texas Instruments Incorporated
