By: SUMEET SAURAV
INTRODUCTION
Face detection has been one of the most active research topics in computer vision over the past decade. It is at the core of all facial analysis, e.g., face localization, facial feature detection, face recognition, face authentication, face tracking, and facial expression recognition.
It is also a fundamental technique for other applications such as content-based image retrieval, video conferencing, and intelligent human-computer interaction (HCI). The goal of face detection is to determine whether or not there are any faces in an image and, if present, return the location and extent of each face. It is a challenge for computer vision due to variations in scale, location, orientation, pose, facial expression, lighting conditions, and appearance features (e.g., presence of glasses, facial hair, makeup).
PERFORMANCE METRICS
Common performance metrics for a face detector are:
1) Learning time
2) Execution time
3) The number of samples required in training
4) The ratio between the detection rate and the false-alarm rate
Some common terms related to face detection:
1) False positives (need to be minimized)
2) True positives (need to be maximized)
3) False negatives (need to be minimized)
INTEGRAL IMAGE
The integral image is a new image representation that allows very fast feature evaluation. A set of features reminiscent of Haar basis functions is used for face detection, and the integral image is used to compute these features at many scales. It is similar to the summed-area table used in computer graphics and can be computed using only a few operations per pixel. Once computed, all the Haar features can be calculated at any location and any scale in constant time.
CLASSIFIER
The second contribution of the Viola-Jones face detection framework is the introduction of a simple and efficient classifier based on the AdaBoost algorithm. The classifier selects a small number of important features from the pool of Haar features (nearly 160,000!) within any sub-window. Feature selection is achieved using the AdaBoost learning algorithm by constraining each weak classifier to depend on only a single feature. Each stage of the boosting process can thus be viewed as a feature-selection step (it selects a new weak classifier).
CASCADING OF CLASSIFIERS
The cascade structure dramatically increases the speed of the detector by focusing attention on promising regions of the image; more complex processing is reserved for these promising regions alone. The key measure of such an approach is the false-negative rate of the attentional process. Sub-windows that are not rejected by the initial classifier are processed by a sequence of classifiers, each slightly more complex than the last. If any classifier rejects the sub-window, no further processing is performed.
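As a toy sketch of this early-rejection idea (the stage tests below are made-up placeholders standing in for trained stages, not the actual Viola-Jones classifiers):

```python
# Sketch of a classifier cascade: each stage may reject a sub-window;
# only sub-windows that survive every stage are reported as faces.

def cascade_detect(subwindow, stages):
    """Return True only if every stage accepts the sub-window."""
    for stage in stages:
        if not stage(subwindow):
            return False  # rejected early: no further processing
    return True

# Hypothetical stages of increasing complexity: a cheap attentional
# filter first, a (pretend) more expensive test later.
stages = [
    lambda w: sum(w) / len(w) > 50,   # cheap mean-intensity test
    lambda w: max(w) - min(w) > 30,   # slightly more complex contrast test
]

print(cascade_detect([60, 90, 100, 40], stages))  # True: passes both stages
print(cascade_detect([10, 12, 11, 9], stages))    # False: rejected by stage 1
```

Because most sub-windows in an image contain no face, the vast majority are discarded by the first, cheapest stage, which is where the speed-up comes from.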
WHAT IS A FEATURE?
Why features? Why not pixels?
1) Features can act to encode ad-hoc domain knowledge that is difficult to learn using a finite quantity of training data.
2) A feature-based system operates much faster than a pixel-based system.
The simple features used are reminiscent of Haar basis functions. Three kinds of features are used:
1) Two-rectangle features
2) Three-rectangle features
3) Four-rectangle features
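A minimal sketch of a two-rectangle feature (the patch and coordinates are invented for illustration; a real detector would compute the sums via the integral image rather than direct summation):

```python
# A two-rectangle Haar feature is the difference between the pixel sums
# of two adjacent, equal-sized rectangles.

def rect_sum(img, x, y, w, h):
    """Sum of pixels in the w x h rectangle with top-left corner (x, y)."""
    return sum(img[r][c] for r in range(y, y + h) for c in range(x, x + w))

def two_rect_feature_horizontal(img, x, y, w, h):
    """Left half minus right half; responds to vertical edges."""
    half = w // 2
    return rect_sum(img, x, y, half, h) - rect_sum(img, x + half, y, half, h)

# A 4x4 patch with a bright left half and a dark right half.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
print(two_rect_feature_horizontal(patch, 0, 0, 4, 4))  # 72 - 8 = 64
```

A large response means a strong intensity difference between the two halves; three- and four-rectangle features combine sums of three and four adjacent rectangles in the same way.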
FEATURES USED
Given that the base resolution of the detector is 24×24, the exhaustive set of rectangle features is quite large: 160,000.
Rectangle features can be computed very rapidly using an intermediate representation for the image which we call the integral image. The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y), inclusive:

ii(x, y) = Σ_{x′ ≤ x, y′ ≤ y} i(x′, y′)

where ii(x, y) is the integral image and i(x, y) is the original image.
Using the pair of recurrences

s(x, y) = s(x, y − 1) + i(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y)

(where s(x, y) is the cumulative row sum, s(x, −1) = 0, and ii(−1, y) = 0), the integral image can be computed in one pass over the original image. Using the integral image, any rectangular sum can be computed in four array references.
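The one-pass computation and the four-reference rectangle sum can be sketched as follows (a padding row and column of zeros stand in for the boundary conditions s(x, −1) = 0 and ii(−1, y) = 0; the sample image is invented):

```python
# One-pass integral image, then any rectangle sum in four array references.

def integral_image(img):
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # zero padding on top/left
    for y in range(h):
        s = 0  # cumulative row sum s(x, y)
        for x in range(w):
            s += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + s  # ii above + row sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w x h rectangle at (x, y): four references."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

This constant-time rectangle sum is what lets every Haar feature be evaluated at any location and scale without rescanning pixels.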
ADABOOST ALGORITHM
The AdaBoost algorithm was developed by Freund and Schapire, and their paper is among the most cited in the field. It falls under the category of ensemble learning, in which weak learners are boosted into a strong learner that can make very accurate predictions. A weak learner (or base learner) is one that performs only slightly better than random guessing. The algorithm was formulated to answer a question posed by Kearns and Valiant: whether the two complexity classes of weakly learnable and strongly learnable problems are equal.
RATIONALE
Imagine we want to build an email filter that can distinguish spam from non-spam. The general way to approach this problem is:
1) Gather as many examples as possible of both spam and non-spam emails.
2) Train a classifier using these examples and their labels.
3) Take the learned classifier, or prediction rule, and use it to filter your mail.
The goal is to train a classifier that makes the most accurate predictions possible on new test examples. But building a highly accurate classifier is a difficult task (we still get spam).
We could probably come up with many quick rules of thumb, each only moderately accurate. An example: if the subject line contains "buy now", classify the email as spam. This certainly doesn't cover all spam, but it will be significantly better than random guessing.
Designing such a boosting procedure raises several questions:
1) How should the distribution be chosen each round?
2) How should the weak rules be combined into a single rule?
3) How should the weak learner be defined?
4) How many weak classifiers should we learn?
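One way these questions are answered in practice can be sketched with a minimal AdaBoost on 1-D toy data (the data is invented, and simple decision stumps stand in for the Haar-feature weak classifiers of the detector):

```python
# Minimal AdaBoost: re-weight examples each round so misclassified ones
# matter more, and combine weak rules by a weighted majority vote.
import math

def train_adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n          # round 1: uniform distribution
    ensemble = []              # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        # Weak learner: the stump (threshold, polarity) with least
        # weighted error on the current distribution.
        best = None
        for thr in xs:
            for pol in (1, -1):
                preds = [1 if pol * x < pol * thr else -1 for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol, preds)
        err, thr, pol, preds = best
        err = max(err, 1e-10)                      # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # this rule's vote weight
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    vote = sum(a * (1 if pol * x < pol * thr else -1)
               for a, thr, pol in ensemble)
    return 1 if vote >= 0 else -1

xs = [1, 2, 3, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1]     # +1 = positive class, -1 = negative
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in [2, 9]])  # [1, -1]
```

The weight update answers question 1 (focus the next round on the hard examples), the alpha-weighted vote answers question 2, the stump search answers question 3, and the `rounds` parameter is question 4.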
GETTING STARTED
The weak learning algorithm is designed to select the single rectangle feature which best separates the positive and negative examples.
For each feature, the weak learner determines the optimal threshold classification function such that the minimum number of examples are misclassified. A weak classifier h(x, f, p, θ) thus consists of a feature f, a threshold θ, and a polarity p indicating the direction of the inequality:

h(x, f, p, θ) = 1 if p·f(x) < p·θ, and 0 otherwise.
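A sketch of this weak learner for a single feature (the feature responses and labels are invented; a real implementation would scan thresholds over sorted, weighted examples rather than brute force):

```python
# Weak classifier h(x, f, p, theta) = 1 iff p*f(x) < p*theta, and the
# exhaustive search for the (theta, p) pair with the fewest mistakes.

def h(fx, p, theta):
    """Weak classifier applied to a precomputed feature value fx = f(x)."""
    return 1 if p * fx < p * theta else 0

def best_threshold(feature_values, labels):
    """Return (misclassifications, theta, polarity) minimising errors."""
    best = None
    for theta in feature_values:          # candidate thresholds
        for p in (1, -1):                 # both inequality directions
            errors = sum(h(fx, p, theta) != y
                         for fx, y in zip(feature_values, labels))
            if best is None or errors < best[0]:
                best = (errors, theta, p)
    return best

# Hypothetical feature responses: faces (label 1) give small values here.
fvals = [2, 3, 4, 10, 11, 12]
labels = [1, 1, 1, 0, 0, 0]
print(best_threshold(fvals, labels))  # (0, 10, 1): theta=10, p=1, no errors
```

The polarity p simply flips the inequality, so the same search handles features where faces produce large responses instead of small ones.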
Starting with a two-feature strong classifier, an effective face filter can be obtained by adjusting the strong-classifier threshold to minimize false negatives.
DESIGNED ARCHITECTURE
CLASSIFIER DESIGN
CLASSIFIER
HARDWARE ARCHITECTURE
DETECTION SYSTEM
IMAGE WINDOW
CALCULATION
The frame store system comprises four functions: storing the incoming frame line by line.
One advantage is that the overall system accuracy increases due to a reduction in scaling error. The second is that it only requires selecting data values from every alternate line of the image.
INTEGRAL SUB-SYSTEM
It comprises an N×N window, so an N²-to-1 multiplexer must be designed in order to provide parallel access to all features (the difference from Architecture-1). We require a 32-bit N²-to-1 multiplexer, as we generate the integral of a whole line at a time. Instead of 12 such multiplexers for feature extraction, we only require 9 multiplexers.
CLASSIFICATION SYSTEM
The system basically consists of 3 classification hardware blocks. The whole classification is handled by a controller, from first-classifier selection to the decision of each stage. The controller holds all the values for the system, including the stage threshold for every stage and the number of each type of classifier in every stage.
CLASSIFIER TYPE 5
IMPLEMENTATION
The whole system is implemented on a Xilinx Virtex-5 LX110T FPGA board using VHDL. The classifier set directly available from the OpenCV face-classifier system has been used. The sub-window size taken in this system is 20×20, and it consists of 22 stages and 2135 feature classifiers. The frame store module is implemented on the SRAM memory chip available on the kit. The integral image generator is built using the BRAM available within the FPGA. The BRAM is configured for 32-bit memory words and can store up to 1024 such words.
It requires 20 active BRAMs to store 20 lines. The sub-system, as well as the multiplexer system, is implemented entirely in LUT resources.
REFERENCES
P. Viola and M. J. Jones, "Robust real-time face detection," Int. J. Comput. Vision, vol. 57, no. 2, pp. 137-154, 2004.
C. Huang and F. Vahid, "Scalable Object Detection Accelerators on FPGAs Using Custom Design Space Exploration," in Proc. IEEE 9th Symposium on Application Specific Processors, 2011, pp. 115-121.