
Real-Time Implementation of

Face-Based Auto-Focus
on TI Digital Camera Processor

N. Kehtarnavaz, M. Gamadia, and M. Rahman


Signal and Image Processing Lab
University of Texas at Dallas

TIDC
Feb 28, 2008
1
Agenda
• Overview of passive auto-focus (AF) and previous related projects
• Face-based AF feature
  – Existing approaches
  – Real-time face detection
  – DM350 implementation
• Demo/Q&A

2
Digital Camera Image Pipeline
• Digital camera image pipeline consists of many image processing components. Some key components are shown below:

3
Passive Auto-Focus (AF)
• Extract a measure of sharpness from the image
• Establish a feedback loop to reach the peak, or in-focus, position

Out-of-focus In-Focus
4
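As an illustrative sketch of these two steps (the gradient-based sharpness measure and the helper names are our own assumptions, not the DM350 implementation), sharpness can be taken as the sum of squared horizontal differences, and a simple climb-to-peak loop stops the lens once sharpness begins to fall:

```python
import numpy as np

def sharpness(img):
    # Sum of squared horizontal differences: large when edges are crisp,
    # small when the image is blurred or flat.
    d = np.diff(img.astype(float), axis=1)
    return float((d * d).sum())

def climb_to_peak(sharpness_at, positions):
    # Move the lens through candidate positions and stop once sharpness
    # starts to fall, i.e., once the in-focus peak has been passed.
    best_pos, best_val = positions[0], sharpness_at(positions[0])
    for p in positions[1:]:
        v = sharpness_at(p)
        if v < best_val:
            break
        best_pos, best_val = p, v
    return best_pos

# Toy unimodal sharpness curve peaking at lens position 5.
curve = {p: 100 - (p - 5) ** 2 for p in range(10)}
peak = climb_to_peak(lambda p: curve[p], list(range(10)))

# A high-contrast row scores higher than a flat one.
sharp_val = sharpness(np.array([[0, 255, 0, 255]]))
flat_val = sharpness(np.array([[128, 128, 128, 128]]))
```

The search strategies discussed next (Global Search, Rule-based Search) differ mainly in how many lens positions such a loop must visit.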
Rule-based AF Search (1)
• In previous work (Kehtarnavaz ’03), a fast AF search named Rule-based Search (RS) was developed, achieving faster focusing times with accuracy comparable to the standard Global Search (GS)

5
Rule-based AF Search (2)
• Commercial vs. developed solution (DSC class: 10 Mpix, 3x zoom) [http://www.dpreview.com]

  Vendor     Model                  AF Time Wide   AF Time Tele
  Canon      PowerShot SD900        0.60           0.82
  Casio      Exilim EX-Z1000        0.46           0.70
  Kodak      EasyShare V1003        0.69           2.06
  Samsung    NV10                   0.60           0.60
  Sony       Cyber-Shot DSC-N2      0.29           0.69
  UTD        TI DM350-based DSC     0.21           0.33

6
Multi-Window AF
• In follow-up work (Peddigari ’05), RS was extended to multiple windows, which allowed focusing on objects appearing in different parts of the image, thus supporting different photography situations.
Single Window Multi-Window

Out of focus In focus


7
Continuous AF
• In another extension (Gamadia ’06), an improved continuous AF feature was developed in order to maintain focus without a user half-press.

8
Low-Light AF
• Recently (Gamadia ’07), a systematic preprocessing approach was introduced to enable focusing in low-light conditions.
Low-Light (~30 lux)

Out of focus In focus

Preprocessing
9
AF Videoclips
• Note: Videoclips shown correspond to the preview (low resolution), not the capture (high resolution) mode
• Example 1: GS vs. RS
  – Global search AF
  – Multi-win, rule-based AF
• Example 2: Single vs. Multi-Window
  – Single-win, rule-based AF
  – Multi-win, rule-based AF
• Example 3: Continuous AF
  – Off
  – On, Sharpness
• Example 4: Low Light AF (16 lux)
  – Preprocessing Off
  – Preprocessing On

10
This Project: Face-based AF
• Objective: Perform AF on faces (the object of interest in the great majority of photographs taken)
  – Solution is software-based, with no dedicated processor
  – Development of a computationally efficient algorithm suitable for real-time deployment on the DM350 or similar camera processors
  – Solution to be relatively robust to face rotation, occlusion, and different lighting conditions
11
Existing Approaches (1)

• Nearly all major digital camera manufacturers, including Canon, Fuji, Nikon, Panasonic, Pentax, Samsung, and Sony, offer camera models with face recognition features.
• The difference here is to have a software approach for achieving robust face-based auto-focusing, i.e., without utilizing any dedicated face detection hardware.
12
Existing Approaches (2)
• The image processing literature includes plenty of face detection algorithms based on facial features, skin color, face shape, etc.
• Although some of these algorithms have been reported to be capable of achieving high detection rates, very few of them are suitable for real-time software deployment on digital or cell-phone camera processors due to their high computational and memory demands.

13
Existing Approaches: Facial Features
• Facial features consisting of the eyes, nose, or mouth have been used for face detection (e.g., Yow ’97)
• Pros
  – High accuracy
• Cons
  – Require access to the entire frontal face; fail for profile or partial faces
  – Fail if some portion of the face is obstructed, in particular if the eyes are covered (glasses, etc.)
14
Existing Approaches: Rule-based
• Multi-resolution rule-based face detection utilizing simple rules, including positions of and relative distances between facial features (e.g., Yang ’02)
• Pros
  – Several different rules can be used together
• Cons
  – Very dependent on the strictness of the rules: strict rules fail to detect faces, while overly general rules generate many false positives
15
Existing Approaches: Skin Color
• Human skin color has been shown to be an effective feature for performing real-time face detection (e.g., Paschalakis ’04)
• Pros
  – Able to detect profile or partial faces
  – Able to detect a face even if some portion of it is obstructed (for example, a person wearing sunglasses)
• Cons
  – Additional post-processing is needed, as other skin areas are also detected
16
Our Solution: Also Skin Color-based

• Although different people have different skin colors, several studies have shown that skin chrominances form a tight color cluster.
• Two models were considered to describe this cluster:
  – Single Gaussian Model (SGM)
  – Gaussian Mixture Model (GMM)

17
Color Space
• Different chrominance spaces can be used for representing skin color, e.g.:
  – Normalized RGB (r, g)
  – HSI
  – YCbCr
  – Normalized YCbCr
• Since raw Bayer-pattern images captured by an image sensor are transformed into the YCbCr color space for compression purposes within a digital camera, YCbCr was chosen here to provide the chrominance information (normalized YCbCr provided similar outcomes).
18
Model Training
• Various skin areas from a face image were selected and the corresponding pixels were mapped into the CbCr space.

[Figure: Y, Cb, and Cr channels of a sample face image]
19
Face Database (1)
• AR Database

  Name:                                      Aleix Martinez and Robert Benavente in the Computer Vision Center (CVC)
  Color Images:                              Yes
  Image Format:                              RAW
  Image Size:                                768 x 576
  Number of unique people:                   126
  Number of pictures per person:             26
  Number of background pictures per person:  0
  Different Conditions:                      Different facial expressions, illumination conditions, and occlusions (sunglasses and scarf)

20
Face Database (2)
• PIE Database

  Name:                                      PIE Database
  Color Images:                              Yes
  Image Format:                              JPEG
  Image Size:                                640 x 486
  Number of unique people:                   68
  Number of pictures per person:             603 (approximately)
  Number of background pictures per person:  13
  Different Conditions:                      13 different poses, 43 different illumination conditions, and 4 different expressions
21
Face Database (3)
• UOPB Database

  Name:                                      The University of Oulu Physics-Based Face Database
  Color Images:                              Yes
  Image Format:                              BMP
  Image Size:                                428 x 569
  Number of unique people:                   125
  Number of pictures per person:             16
  Number of background pictures per person:  0
  Different Conditions:                      All frontal images; 16 different camera calibrations and illuminations

22
Face Database (4)
• DM350 Database

  Name:                                      The University of Texas at Dallas DM350 Face Database
  Color Images:                              Yes
  Image Format:                              RAW
  Image Size:                                640 x 480
  Number of unique people:                   30
  Number of pictures per person:             3
  Number of background pictures per person:  0
  Different Conditions:                      All frontal images; different lighting conditions

23
Skin Color Distribution (1)
[Scatter plot: Cr vs. Cb, both axes 0-250]

Distribution of human skin color within the chrominance CbCr space for 250 faces corresponding to the AR, PIE, and UOPB databases
24
Skin Color Distribution (2)
[Scatter plot: Cr vs. Cb, both axes 0-250]

Distribution of human skin color within the chrominance CbCr space for 30 sample faces collected by the DM350 platform
25
Skin Color Distribution (3)

Example: Skin color distribution corresponding to the four manually selected skin areas

26
SGM(1)
• The skin color distribution can be represented by the following single Gaussian model $N(\mu, \Sigma)$:

  $x_i^{\mathrm{skin}} = \left[ Cb_i^{\mathrm{skin}},\; Cr_i^{\mathrm{skin}} \right]$

  $\mu = [\mu_{Cb}, \mu_{Cr}] = n^{-1} \left[ \sum_i Cb_i^{\mathrm{skin}},\; \sum_i Cr_i^{\mathrm{skin}} \right]$

  $\Sigma = n^{-1} \sum_i \left( x_i^{\mathrm{skin}} - \mu \right) \left( x_i^{\mathrm{skin}} - \mu \right)^T$

• The SGM is then used to construct a binary image representing the skin color pixels of an input image within the 98% confidence area, in terms of the Mahalanobis distance between the input image chrominance pixels and the SGM model.
27
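A minimal sketch of SGM training and the Mahalanobis-distance skin test, using toy chrominance samples (the sample values are invented for illustration; 7.82 is the chi-square value for a 98% confidence region with 2 degrees of freedom):

```python
import numpy as np

# Train the SGM from sample skin (Cb, Cr) pairs (toy data here; in the
# actual system these come from manually selected skin areas).
skin = np.array([[110, 150], [115, 148], [112, 152], [118, 149], [114, 151]],
                dtype=float)
mu = skin.mean(axis=0)               # [mu_Cb, mu_Cr]
cov = np.cov(skin, rowvar=False)     # 2x2 covariance matrix Sigma
cov_inv = np.linalg.inv(cov)

def is_skin(cbcr, thresh=7.82):
    # Squared Mahalanobis distance to the skin cluster; accept the pixel
    # if it falls inside the 98% confidence ellipse.
    d = np.asarray(cbcr, dtype=float) - mu
    return float(d @ cov_inv @ d) <= thresh

inside = is_skin([113, 150])   # near the cluster mean -> skin
outside = is_skin([30, 240])   # far from the cluster -> not skin
```

Applying `is_skin` to every pixel's (Cb, Cr) pair yields the binary skin image described above.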
SGM (2)
• Pros
  – Fast detection with very low computational complexity
• Cons
  – Sensitive to the confidence level; a higher confidence level captures all skin pixels but also increases false alarms

28
GMM (1)
• The skin color distribution can be more accurately modeled by a weighted combination of M normal density functions, given by:

  $p(x) = \sum_{j=1}^{M} p(x \mid j)\, P(j)$

• A study has shown that the use of two Gaussians (M = 2) is adequate (Caetano ’03)

29
GMM (2)
• The training process is done by using the Expectation-Maximization (EM) method:

  $P(j)^{\mathrm{new}} = \frac{1}{n} \sum_{i=1}^{n} P(j \mid x_i)$

  $\mu_j^{\mathrm{new}} = \frac{\sum_{i=1}^{n} x_i\, P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}$

  $\Sigma_j^{\mathrm{new}} = \frac{\sum_{i=1}^{n} \left[ x_i - \mu_j^{\mathrm{new}} \right] \left[ x_i - \mu_j^{\mathrm{new}} \right]^T P(j \mid x_i)}{\sum_{i=1}^{n} P(j \mid x_i)}$
30
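These EM updates can be sketched as a compact two-component mixture fit on 1-D toy data (a real implementation would run on the 2-D CbCr vectors with full covariance matrices):

```python
import numpy as np

def gauss(x, mu, var):
    # 1-D normal density p(x | j).
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def em_gmm(x, iters=50):
    # Initialize the two components at the data extremes.
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()]) + 1e-6
    pj = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities P(j | x_i).
        r = np.stack([pj[j] * gauss(x, mu[j], var[j]) for j in range(2)], axis=1)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: the three update equations above.
        nj = r.sum(axis=0)
        pj = nj / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nj
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nj + 1e-6
    return pj, mu, var

# Two well-separated clusters around 0 and 10.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(10, 1, 200)])
pj, mu, var = em_gmm(x)
```

After convergence, the recovered means sit near the two cluster centers and the mixing weights near 0.5 each.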
GMM (3)
• Pros
  – Higher detection accuracy
• Cons
  – Higher computational complexity

31
Binary skin images generated by
SGM and GMM

• GMM captures the skin area more effectively than SGM, i.e., it is the more accurate model. However…
32
GMM Time Issue
• GMM increases skin detection performance at the expense of much higher computational time.
• Floating-point calculation of the Gaussian density functions adds to the computational time.
• Face detection time with the two-Gaussian mixture model: 5 to 6 seconds, which is not real-time!
33
Lookup Table (1)
• To overcome the time issue, the probability was calculated for all possible Cb-Cr combinations and stored in a lookup table.
• As the variation of skin color in the chrominance space is small compared to the entire color space, a 50x50 lookup table was found to be adequate for covering the skin color cluster.

34
Lookup Table (2)
• Using the lookup table approach, face detection using GMM took 10 to 25 ms on the DM350, considered real-time throughput as it added an acceptable time increase to passive AF running at 210 to 330 ms.
• Note that as the size of the lookup table is fixed, increasing the number of Gaussians does not alter this time.
35
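A minimal sketch of the lookup-table idea, assuming a toy two-component CbCr skin GMM; the mixture parameters, the sub-range origin, and the decision threshold are all illustrative values, not the trained model from the slides:

```python
import numpy as np

def gauss2(x, mu, cov_inv, cov_det):
    # 2-D normal density.
    d = x - mu
    return np.exp(-0.5 * d @ cov_inv @ d) / (2 * np.pi * np.sqrt(cov_det))

# Toy two-component skin GMM in CbCr space (illustrative parameters).
mus = [np.array([110.0, 150.0]), np.array([120.0, 145.0])]
covs = [np.eye(2) * 40.0, np.eye(2) * 60.0]
weights = [0.6, 0.4]

def skin_prob(cbcr):
    # Full mixture evaluation: floating-point and slow per pixel.
    x = np.asarray(cbcr, dtype=float)
    return sum(w * gauss2(x, m, np.linalg.inv(c), np.linalg.det(c))
               for w, m, c in zip(weights, mus, covs))

# Precompute a 50x50 table over the Cb-Cr sub-range that holds the skin
# cluster; at run time a skin test becomes a single table lookup.
CB0, CR0 = 90, 130          # assumed origin of the covered sub-range
table = np.array([[skin_prob([CB0 + cb, CR0 + cr]) for cr in range(50)]
                  for cb in range(50)])
THRESH = 1e-4               # illustrative decision threshold

def is_skin(cb, cr):
    if not (CB0 <= cb < CB0 + 50 and CR0 <= cr < CR0 + 50):
        return False        # outside the covered cluster: not skin
    return bool(table[cb - CB0, cr - CR0] >= THRESH)

center = is_skin(110, 150)  # at a component mean -> skin
far = is_skin(30, 240)      # outside the covered sub-range -> not skin
```

Because the table is evaluated once offline, adding more mixture components changes only the precomputation, not the per-pixel lookup cost, which matches the note above.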
Lookup Table for GMM

36
Post Processing
• It is not possible to detect faces using skin color information alone, because of other exposed skin areas and also the presence of similar colors in the background.
• A fast post-processing step is needed; here it utilized:
  – Blocks, or so-called paxels, to reduce the data to be processed
  – Simple shape processing

37
Paxel Image (1)
• The binary skin image is divided into blocks, or paxels.
• The total skin area within a paxel is computed; if it is greater than or equal to 50% of the paxel area, the paxel is labeled as skin.
• This step significantly reduces the amount of data to be processed, speeding up post-processing.
38
Paxel Image (2)
• Example: A 640x480 skin image is converted to a 64x48 paxel image when using a paxel size of 10x10.

39
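The 50% paxel rule can be sketched as follows (the block-reshape trick is our own illustration of the rule stated above):

```python
import numpy as np

def to_paxels(skin, paxel=10, frac=0.5):
    # Label a paxel as skin when at least `frac` of its pixels are skin.
    h, w = skin.shape
    blocks = skin.reshape(h // paxel, paxel, w // paxel, paxel)
    return blocks.mean(axis=(1, 3)) >= frac

# 640x480 binary skin image with one solid 100x100 skin patch.
img = np.zeros((480, 640), dtype=bool)
img[100:200, 300:400] = True

# Result is 48x64 (rows x cols), i.e., the 64x48 paxel image of the slide.
pax = to_paxels(img)
```

The 100x100 patch maps to a 10x10 block of skin paxels, a 100x reduction in the data the shape-processing stage must touch.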
Simple Face Shape Processing
• Connectivity
  – Candidate regions are first selected by examining 8-neighborhood connectivity among the paxels
• Face size
  – A minimum face size of 40x40 pixels in a 640x480 image is used to obtain candidate face regions (determined based on ROC curve analysis)
• Aspect ratio of face
  – The standard golden ratio of 1.618 (Yang ’02) works well only for frontal faces
  – To accommodate rotated faces, a more flexible aspect ratio (between 0.8 and 1.8) is used here to detect face regions
• Face score
  – The best face region is then obtained using face scores, i.e., the amount of skin presence in the face regions
40
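A sketch of these four tests on the paxel grid; the thresholds (40x40 minimum, 0.8-1.8 aspect ratio, skin-fill score) come from the slide, while the flood-fill labeling helper and all names are our own illustration:

```python
def label_regions(pax):
    # pax: 2-D list of 0/1 paxel labels; returns a list of regions, each a
    # set of (row, col) paxels grown with 8-neighborhood connectivity.
    h, w = len(pax), len(pax[0])
    seen, regions = set(), []
    for r in range(h):
        for c in range(w):
            if pax[r][c] and (r, c) not in seen:
                stack, region = [(r, c)], set()
                while stack:
                    y, x = stack.pop()
                    if (y, x) in seen or not (0 <= y < h and 0 <= x < w) \
                            or not pax[y][x]:
                        continue
                    seen.add((y, x))
                    region.add((y, x))
                    stack += [(y + dy, x + dx)
                              for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
                regions.append(region)
    return regions

def best_face(pax, paxel=10, min_face=40, ratio=(0.8, 1.8)):
    best, best_score = None, 0.0
    for reg in label_regions(pax):
        rows = [r for r, _ in reg]
        cols = [c for _, c in reg]
        hgt = (max(rows) - min(rows) + 1) * paxel   # pixel height
        wid = (max(cols) - min(cols) + 1) * paxel   # pixel width
        aspect = hgt / wid
        # Size and aspect-ratio tests, then score by skin fill.
        if hgt >= min_face and wid >= min_face and ratio[0] <= aspect <= ratio[1]:
            score = len(reg) / ((hgt // paxel) * (wid // paxel))
            if score > best_score:
                best, best_score = reg, score
    return best

# Toy paxel grid: one 6x5-paxel face-like blob plus one isolated speck.
grid = [[0] * 20 for _ in range(20)]
for r in range(3, 9):
    for c in range(4, 9):
        grid[r][c] = 1
grid[15][15] = 1
face = best_face(grid)   # blob passes all tests; the speck is too small
```

The winning region is then handed back to the AF loop as the window over which sharpness is measured.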
Shape Processing Example

• Candidate regions after connectivity
• Face regions after face size + aspect ratio
• Best face region after scoring
41
Sample outcome of skin color
face detection algorithm

42
Real-Time DM350 Implementation (1)

43
Real-Time DM350 Implementation (2)

• Extensive experimentation (200 scenes) was done: in 88% of cases, the best (closest) face area was detected and focusing was done on that area.
• Remaining 12% of cases:
  – In 8% of cases, hands or other exposed parts of the body were picked. Note: the objective here was not face recognition but auto-focusing, so these cases were acceptable since those skin areas were on the same focal plane as the faces.
  – In 4% of cases, a well-defined sharpness peak associated with a detected face area could not be found; auto-focusing switched to the original AF using the paxel area containing no face but having the best sharpness function.
44
Face-based AF Videoclips
• Single and multiple faces
  – Frontal face
  – Profile face
  – Multiple faces
• Different lighting conditions
  – Fluorescent light
  – Incandescent light
  – Mixed light (fluorescent + incandescent)
  – Outdoor

45
Summary
• For the last four years, an R&D program has been established between the SIP Lab at UTD and TI to look into various improvements of digital and cell-phone camera image pipelines. This effort has been the latest project accomplished under this program.
• Introduced a real-time, software-based solution to achieving face-based passive auto-focusing.
46
References
• T. Caetano, S. Olabarriaga, and D. Barone, “Do mixture models in chromaticity space improve skin detection?”, Pattern Recognition, Vol. 36, No. 12, pp. 3019-3021, December 2003.
• M. Gamadia and N. Kehtarnavaz, “A real-time continuous automatic focus algorithm for digital cameras,” in Proceedings of IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 163-167, March 2006.
• M. Gamadia, N. Kehtarnavaz, and K. Roberts-Hoffman, “Low-light auto-focus enhancement for digital and cell-phone camera image pipelines,” IEEE Transactions on Consumer Electronics, Vol. 53, No. 2, pp. 249-257, May 2007.
• N. Kehtarnavaz and H.-J. Oh, “Development and real-time implementation of a rule-based auto-focus algorithm,” Journal of Real-Time Imaging, Vol. 9, No. 3, pp. 197-203, June 2003.
• C. Kotropoulos and I. Pitas, “Rule-based face detection in frontal views,” in Proceedings of IEEE ICASSP, Vol. 4, pp. 2537-2540, April 1997.
• A. Martinez and R. Benavente, “The AR face database,” CVC Technical Report, No. 24, June 1998.
• S. McKenna, S. Gong, and Y. Raja, “Modeling facial color and identity with Gaussian mixtures,” Pattern Recognition, Vol. 31, No. 12, pp. 1883-1892, December 1998.
• V. Peddigari, M. Gamadia, and N. Kehtarnavaz, “Real-time implementation issues in passive automatic focusing for digital still cameras,” Journal of Imaging Science and Technology, Vol. 49, No. 2, pp. 114-123, Mar/Apr 2005.
• M. Rahman, M. Gamadia, and N. Kehtarnavaz, “Real-time face-based auto-focus for digital still and cell-phone cameras,” submitted to 2008 IEEE Southwest Symposium on Image Analysis and Interpretation, March 2008.
• T. Sim, S. Baker, and M. Bsat, “The CMU pose, illumination, and expression database,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 12, pp. 1615-1618, December 2003.
• G. Yang and T. Huang, “Human face detection in complex background,” Pattern Recognition, Vol. 27, No. 1, pp. 53-63, January 1994.
• J. Yang and A. Waibel, “A real-time face tracker,” in Proceedings of IEEE Workshop on Applications of Computer Vision, pp. 142-147, December 1996.
• M. Yang, D. Kriegman, and N. Ahuja, “Detecting faces in images: a survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002.
• K. Yow and R. Cipolla, “Feature-based human face detection,” Image and Vision Computing, Vol. 15, No. 9, pp. 713-735, September 1997.
• S. Paschalakis and M. Bober, “Real-time face detection and tracking for mobile videoconferencing,” Real-Time Imaging, Vol. 10, pp. 81-94, April 2004.
• Texas Instruments, Inc., TMS320DM350 CPU and Peripherals Technical Reference Manual, 2007.
47
Real-Time Implementation of
Face-Based Auto-Focus
on TI Digital Camera Processor

Demo/Q&A

N. Kehtarnavaz, M. Gamadia, and M. Rahman


Signal and Image Processing Lab
University of Texas at Dallas
48
