
Eigenface vs Fisherface

Comparison between Eigenface and Fisherface

by Marian Moise
Overview

● Eigenface is still used today; its main idea is that each face image can be reconstructed from a weighted sum of the principal components of the original training set of face images. Recognition is performed by projecting a new image into the subspace spanned by the eigenfaces ("face space") and then classifying the face by comparing its position in face space with the positions of known individuals.

● Fisherface, on the other hand, is based on the LDA technique, which searches for the vectors in the underlying space that best discriminate among classes (rather than those that best describe the data). More formally, given a number of independent features by which the data are described, LDA creates a linear combination of them that yields the largest mean differences between the desired classes.
PCA vs FLD

PCA actually smears the classes together, so that they are no longer linearly separable in the projected space. Although PCA achieves larger total scatter, FLD achieves greater between-class scatter, and classification is consequently simplified.
Face Recognition Using Eigenfaces

The main idea behind Eigenfaces is to represent a face as a linear combination of a set of basis images (the PCA technique):

$\Phi_i = \sum_{j=1}^{K} w_j u_j$
Recognition overview

● Initialization: acquire the training set of face images and calculate the eigenfaces, which define the face space.

● When a new face image is encountered, calculate a set of weights based on the input image and the M eigenfaces by projecting the input image onto each of the eigenfaces.

● Determine whether the image is a face at all (known or unknown) by checking whether it is sufficiently close to "face space."
Recognition overview

● If it is a face, classify the weight pattern as either a known person or unknown.

● (Optional) If the same unknown face is seen several times, calculate its characteristic weight pattern and incorporate it into the known faces (i.e., learn to recognize it).
Eigenfaces algorithm

➔ Obtain M training images I1, I2, ..., IM.

➔ Represent each N × N image Ii as an N²-dimensional vector Γi by concatenating its rows:

$I_i = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix} \;\Rightarrow\; \Gamma_i = \begin{bmatrix} a_{11} \\ \vdots \\ a_{1N} \\ a_{21} \\ \vdots \\ a_{2N} \\ \vdots \\ a_{NN} \end{bmatrix}$
Eigenfaces algorithm

➔ Compute the average face vector Ψ:

$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$

➔ Subtract the mean face from each face vector Γi to get a set of vectors Φi. Subtracting the mean image from each image vector keeps only the distinguishing features of each face and removes the information they all have in common:

$\Phi_i = \Gamma_i - \Psi$
Eigenfaces algorithm

➔ Find the covariance matrix C:

$C = A A^T$, where $A = [\Phi_1\ \Phi_2\ \cdots\ \Phi_M]$

Note that C is an N² × N² matrix, while A is an N² × M matrix.
Eigenfaces algorithm

➔ We now need to compute the eigenvectors ui of C. However, C is an N² × N² matrix, so its decomposition would yield N² eigenvectors, each N²-dimensional. For an image this is HUGE, and the required computations would easily make your system run out of memory.
Eigenfaces algorithm

Instead of the matrix A A^T, consider the matrix A^T A. Remember that A is an N² × M matrix, so A^T A is an M × M matrix. Finding the eigenvectors of this matrix yields M eigenvectors, each of dimension M × 1; let's call them vi.

From the properties of matrices it follows that ui = A vi, where the ui are the eigenvectors of the covariance matrix C corresponding to its M largest eigenvalues, with M ≪ N².

When calculating the eigenvectors ui, we should also make sure that ||ui|| = 1.
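Continuing the sketch above (the names A, U, V carry over and remain assumptions), the trick is a few lines of numpy; eigh is used because A^T A is symmetric:

# Eigendecomposition of the small M x M matrix A^T A instead of the
# huge N^2 x N^2 covariance matrix C = A A^T.
eigvals, V = np.linalg.eigh(A.T @ A)

# Sort by decreasing eigenvalue and drop near-zero ones (mean subtraction
# leaves A with rank at most M - 1).
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
keep = eigvals > 1e-10
eigvals, V = eigvals[keep], V[:, keep]

# Map back to eigenvectors of C and normalize so that ||u_i|| = 1.
U = A @ V
U /= np.linalg.norm(U, axis=0)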
Eigenfaces algorithm

➔ Select the best K eigenvectors (principal components). The selection of these eigenvectors is usually done heuristically.
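One common heuristic, shown here as an assumption rather than anything prescribed by the slides, is to keep enough eigenvectors to explain a fixed fraction of the total variance:

# Keep the smallest K whose eigenvalues explain, say, 95% of the variance.
ratio = np.cumsum(eigvals) / eigvals.sum()
K = int(np.searchsorted(ratio, 0.95)) + 1
U_K = U[:, :K]                                # the K chosen eigenfaces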
Finding weights

Each normalized face Φi in the training set can now be represented as a linear combination of these eigenvectors:

$\Phi_i = \sum_{j=1}^{K} w_j u_j$

These weights can be calculated as:

$w_j = u_j^T \Phi_i$
Finding weights

So each training image Φi (i = 1, 2, ..., M) will be represented in the new model by its weight vector:

$\Omega_i = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_K \end{bmatrix}$
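In the running sketch, all M weight vectors come from a single matrix product (W_train is an assumed name; column i holds Ωi):

# w_j = u_j^T Phi_i for every eigenface j and training face i at once.
W_train = U_K.T @ A                           # shape (K, M); column i is Omega_i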
Recognition process

Let's say we have a face image Γ that is to be recognized; the following steps are then performed:

1) Face normalization: $\Phi = \Gamma - \Psi$

2) Project this normalized probe onto the eigenspace and find the weights:

$w_j = u_j^T \Phi$

3) The normalized probe Φ can now be represented as:

$\Omega = [w_1\ w_2\ \cdots\ w_K]^T$
Recognition process

Classification of the feature vector is done using distance measures:

$e_r = \min_i \| \Omega - \Omega_i \|$

If er < Θ, where Θ is a heuristically chosen threshold, the probe image is recognized as the training image against which it gives the lowest score. However, if er > Θ, the probe does not belong to the database.
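A sketch of the whole recognition step under the same assumptions as the earlier snippets (recognize and theta are illustrative names):

def recognize(gamma, psi, U_K, W_train, theta):
    """Return the index of the closest training face, or None when the
    smallest distance exceeds the heuristic threshold theta."""
    phi = gamma.reshape(-1, 1) - psi                 # 1) normalize the probe
    omega = U_K.T @ phi                              # 2) project, get weights
    dists = np.linalg.norm(W_train - omega, axis=0)  # 3) distances to each Omega_i
    i = int(np.argmin(dists))
    return i if dists[i] < theta else None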
Distance measures

Euclidean distance:

$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$

Mahalanobis distance:

$d(p, q) = \sqrt{(p - q)^T C^{-1} (p - q)}$

Because the Mahalanobis distance also takes into account the covariance of the vectors p and q, removing the problems related to scale and correlation, it is a better choice for pattern recognition problems.
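Both measures in numpy; estimating C from the training weight vectors is an assumption about where the covariance comes from, not something the slides specify:

def euclidean(p, q):
    return np.linalg.norm(p - q)

def mahalanobis(p, q, C_inv):
    d = (p - q).reshape(-1, 1)
    return float(np.sqrt(d.T @ C_inv @ d))

# Covariance estimated from the training weights; np.cov treats each
# row of W_train as one variable and each column as one observation.
C_inv = np.linalg.inv(np.cov(W_train))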
Eigenfaces issues

➔ All of the above has been derived under the assumption that faces are mostly upright and frontal.

➔ Because the probe image might not be a face and yet still resemble a particular face class stored in the database, face detection is recommended to be part of such a system.

➔ The results indicate that changing lighting conditions causes relatively few errors, while performance drops dramatically with a change in size.
Fisherfaces

Fisher's Linear Discriminant (FLD) is an example of a class-specific method, in the sense that it tries to "shape" the scatter in order to make it more reliable for classification. This method selects Wopt (the optimal eigenvectors) in such a way that the ratio of the between-class scatter (SB) to the within-class scatter (SW) is maximized (in case SW is non-singular):

$W_{opt} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|} = [w_1\ w_2\ \cdots\ w_m]$
Fisherfaces

The between-class scatter matrix is defined as:

$S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$

The within-class scatter matrix is defined as:

$S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - \mu_i)(x_k - \mu_i)^T$

where μ is the overall mean image, μi is the mean image of class Xi, and Ni is the number of samples in class Xi.
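A direct numpy translation of the two definitions, assuming X is an (n_samples, n_features) data matrix and y an array of class labels (both names are illustrative):

def scatter_matrices(X, y):
    """Between-class S_B and within-class S_W scatter matrices."""
    mu = X.mean(axis=0)                       # overall mean image
    n = X.shape[1]
    S_B, S_W = np.zeros((n, n)), np.zeros((n, n))
    for c in np.unique(y):
        X_c = X[y == c]                       # the samples x_k in class X_i
        mu_c = X_c.mean(axis=0)               # class mean mu_i
        d = (mu_c - mu).reshape(-1, 1)
        S_B += len(X_c) * (d @ d.T)           # N_i (mu_i - mu)(mu_i - mu)^T
        S_W += (X_c - mu_c).T @ (X_c - mu_c)  # sum of (x_k - mu_i)(x_k - mu_i)^T
    return S_B, S_W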
Singularity issue

In the face recognition problem, one is confronted with the difficulty that the within-class scatter matrix SW is always singular. This stems from the fact that the rank of SW is at most N − c and, in general, the number of images in the learning set N is much smaller than the number of pixels in each image n.

In order to overcome the complication of a singular SW, an alternative method, called Fisherface, projects the image set to a lower-dimensional space so that the resulting within-class scatter matrix SW is non-singular. This is achieved by using PCA to reduce the dimension of the feature space to N − c, and then applying the standard FLD to reduce the dimension to c − 1. More formally, Wopt is given by:

$W_{opt}^T = W_{fld}^T W_{pca}^T$
Singularity issue

$W_{pca} = \arg\max_W |W^T S_T W|$

$W_{fld} = \arg\max_W \frac{|W^T W_{pca}^T S_B W_{pca} W|}{|W^T W_{pca}^T S_W W_{pca} W|}$

where ST is the total scatter matrix.
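A sketch of the two-stage projection under the same illustrative assumptions (X, y, and the scatter_matrices helper above); scipy's generalized symmetric eigensolver handles the ratio maximization, provided the reduced SW is positive definite:

from scipy.linalg import eigh

def fisherfaces(X, y):
    """PCA down to N - c dimensions, then FLD down to c - 1; returns W_opt."""
    N, c = X.shape[0], len(np.unique(y))
    # Stage 1: PCA via SVD of the mean-centered data, keeping N - c axes
    # so that the reduced within-class scatter becomes non-singular.
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    W_pca = Vt[: N - c].T
    # Stage 2: standard FLD in the reduced space; eigh solves the
    # generalized eigenproblem S_B v = lambda * S_W v.
    S_B, S_W = scatter_matrices(X @ W_pca, y)
    vals, vecs = eigh(S_B, S_W)
    W_fld = vecs[:, np.argsort(vals)[::-1][: c - 1]]
    return W_pca @ W_fld                      # W_opt = W_pca W_fld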
Results
Related work

These two papers confirm my obtained results:

● Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman

● Eigenfaces and Fisherfaces, Naotoshi Seo, University of Maryland
Limitations and possible extensions

Faces are assumed to be upright and frontal:

● it would be a good idea to have an additional subsystem that transforms any given face image to a frontal, centered view

● another possibility would be to create a small number of face classes for each known person, corresponding to characteristic views (various poses)

PCA is sensitive to lighting conditions:

● specularity removal should be considered as a preprocessing step

● unfortunately, there is no such method available for grayscale images, so highlight removal should be used instead
Limitations and possible extensions

The probe image might not be a face and yet still resemble a particular face class stored in the database:

● face detection is recommended to be part of such a system

Getting more significant eigenvectors and improving the recognition rate:

● face cropping should be done, for example by using an ellipsoidal kernel (filtering matrix)
Conclusions

➔ The Fisherface method appears to be the best at simultaneously handling lighting variation, facial expression variation, and the presence of glasses.

➔ As expected, the PCA method suffers when confronted with variation in facial expression and the presence of glasses.

➔ Of the two, Fisherface is the optimal method, as it has several properties that intuitively suggest it should fare well in a variety of circumstances, most notably the fact that it eliminates intra-class differences from its feature set. This suggests that it is close to optimal in deciding exactly which features are relevant to a particular class, given enough examples of that class.

➔ PCA can outperform LDA when the training dataset is small or when only a few eigenvectors are chosen.

➔ Both methods perform well when presented with a test image similar to an image in the training set.
Questions?

THANK YOU!
