Manuscript received December 10, 2007. This work was supported by the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Sikkim, India.

Prateem Chakraborty is a final yr. B. Tech. student of the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, Sikkim, India (e-mail: prateem_chakraborty@yahoo.co.in).

Prashant Sarawgi is a final yr. B. Tech. student of the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, Sikkim, India-737132 (e-mail: prashant_sarawgi@yahoo.com).

Ankit Mehrotra is a final yr. B. Tech. student of the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, Sikkim, India-737132 (e-mail: ankit_mehrotra2000@yahoo.co.in).

Gaurav Agarwal is a final yr. B. Tech. student of the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, Sikkim, India-737132 (e-mail: garv_ag@yahoo.com).

Ratika Pradhan is a Reader in the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, Sikkim, India-737132 (e-mail: ratika_pradhan@yahoo.co.in).

II. RELATED WORKS

Many methods for hand gesture recognition using visual analysis have been proposed. Sebastian Marcel, Oliver Bernier, Jean Emmanuel Viallet and Daniel Collobert have proposed recognition using Input-Output Hidden Markov Models [1]. Xia Liu and Kikuo Fujimura have proposed hand gesture recognition using depth data [2]. For hand detection, many approaches use color or motion information [3, 4]. Attila Licsar and Tamas Sziranyi have developed a hand gesture recognition system based on the shape analysis of the static gesture [5]. Another method, proposed by E. Stergiopoulou and N. Papamarkos [6], detects the hand region through color segmentation. Byung-Woo Min, Ho-Sub Yoon, Jung Soh, Yun-Mo Yang and Toshiaki Ejima have suggested Hand Gesture Recognition using Hidden Markov Models [7]. Another important method is suggested by Meide Zhao, Francis K.H. Quek and Xindong Wu [8]; they have used the AQ Family and R-MINI algorithms for the detection of hand gestures. There is another efficient technique which uses Fast Multi-Scale Analysis for the recognition of hand gestures, as suggested by Yikai Fang, Jian Cheng, Kongqiao Wang and Hanqing Lu [9], but this method is computationally expensive. Chris Joslin et al. have suggested a method enabling dynamic gesture recognition for hand gestures [10]. The Rotation Invariant method is widely used for texture classification and recognition; Timo Ojala et al. have suggested texture classification using Local Binary Patterns [11].

III. SUBTRACTION METHOD

This is a very simple method to implement, but not a very efficient one, since the result generated may be highly inaccurate. This method involves first converting all the images, including the test image, into black and white. This is done by selecting a threshold value for the pixels, i.e., if T is the selected threshold value and p is the pixel intensity value, then replace p with 0 if p < T and with 255 if p >= T. We then perform direct subtraction of each of the pixels in the test image from the corresponding pixels in each of the images in the database and calculate the Euclidean distance between them. The smallest value of d implies the closest match. Here, d can be defined as

d = \sum_{x=0}^{width} \sum_{y=0}^{height} (f_1(x, y) - f_2(x, y))^2

where f_1 and f_2 are the two images being compared.

IV. GRADIENT METHOD

The steps involved in this process are as follows. First of all, we had to implement the gradient magnitude calculation. The aim is to define where in the picture the biggest gradient magnitudes are. Then, it is easy to apply a threshold to the gradients in order to keep the really interesting ones and to cut all the background noise. To realize this part, the theory is to calculate the magnitude with the formula:

magnitude = \sqrt{dx^2 + dy^2}

Therefore, we have to calculate the derivatives of the image in the x and y directions to obtain the magnitude. We have used the Sobel filter to approximate the gradients. The Sobel operator used for gradient-x is as shown below:

-1  0  1
-2  0  2
-1  0  1

The Sobel operator used for gradient-y is as shown below:

-1 -2 -1
 0  0  0
 1  2  1

Then, we had to apply a Gaussian filter to blur the image and obtain a homogeneous picture, which permits better results in the gradient magnitude. The goal of this filter is to erase the background defects; it is really important to have a uniform background to avoid noise. We created a gradient magnitude threshold to erase the lower-level gradients and keep the really interesting ones. This cuts the noise and regularizes the background, and is complementary to the Gaussian filter: the Gaussian filter blurs the big defects and the threshold cuts the lowest magnitudes, so the noise is quite well removed. The next step was to calculate the Euclidean distance between the vectors of the different images analyzed. This final step compares the different pictures by comparing their histograms; with this, we are able to recognize the different gestures. To conclude, we can say that this method does not require special mathematical backgrounds.

V. PRINCIPAL COMPONENT ANALYSIS (PCA) METHOD

In this section, we will study hand gesture recognition through Principal Component Analysis, but we will need some mathematical background to understand the method. This method is called PCA or Eigenfaces [12-14]. It is a useful statistical technique that has found application in different fields (such as face recognition and image compression), and is also a common technique for finding patterns in data of high dimension. Before describing this method, we will first introduce the mathematical concepts that will be used in PCA.

A. Mathematical Backgrounds

1) Standard Deviation: In statistics, we generally use samples of a population to realize the measurements. For the notation, we will use the symbol X to refer to the entire sample and the symbol X_i to indicate a specific datum of the sample. The sample mean is

\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}

2) The standard deviation s is then

s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}

3) Variance: Variance is another measure of the spread of data in a set. In fact, it is quite the same as the standard deviation:

s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}

4) Covariance: Covariance is measured between two dimensions X and Y and can be expressed as:

cov(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}

Note that cov(X, X) = var(X).
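The statistical quantities above can be sketched in a few lines of NumPy. This is only an illustrative sketch (the paper's implementations were in MATLAB), and the sample data below is hypothetical; note the (n - 1) denominator, which NumPy selects with ddof=1.

```python
import numpy as np

# Hypothetical sample of n observations, for illustration only.
X = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
Y = np.array([1.0, 3.0, 5.0, 4.0, 6.0, 5.0, 8.0, 10.0])
n = len(X)

mean_X = X.sum() / n                                       # sample mean, X-bar
var_X = ((X - mean_X) ** 2).sum() / (n - 1)                # variance s^2
s = np.sqrt(var_X)                                         # standard deviation s
cov_XY = ((X - mean_X) * (Y - Y.mean())).sum() / (n - 1)   # covariance of X and Y

# NumPy's built-ins agree once ddof=1 picks the (n - 1) denominator:
assert np.isclose(s, np.std(X, ddof=1))
assert np.isclose(var_X, np.var(X, ddof=1))
assert np.isclose(cov_XY, np.cov(X, Y, ddof=1)[0, 1])
```

In the PCA method, these covariances are collected into a covariance matrix of the image data, whose eigenvectors give the principal components.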
VI. ROTATION INVARIANT METHOD

A. Achieving Gray-Scale Invariance

As the first step toward gray-scale invariance, we subtract, without losing information, the gray value of the center pixel g_c from the gray values of the circularly symmetric neighborhood g_p (p = 0, ..., P-1). Next, we assume that the differences g_p - g_c are independent of g_c, which allows us to factorize. The factorized distribution is only an approximation of the joint distribution; we neglect the small loss in information, as it allows us to achieve invariance with respect to shifts in gray scale. The distribution t(g_c) describes the overall luminance of the image, which is unrelated to local image texture and, consequently, does not provide useful information for texture analysis. Hence, much of the information in the original joint gray-level distribution about the textural characteristics is conveyed by the joint difference distribution. This is a highly discriminative texture operator. It records the occurrences of various patterns in the neighborhood of each pixel in a P-dimensional histogram. For constant regions, the differences are zero in all directions; on a slowly sloped edge, the operator records the highest difference in the gradient direction and zero values along the edge; and, for a spot, the differences are high in all directions. Signed differences g_p - g_c are not affected by changes in mean luminance; hence, the joint difference distribution is invariant against gray-scale shifts. We achieve invariance with respect to the scaling of the gray scale by considering just the signs of the differences instead of their exact values:

T \approx t(s(g_0 - g_c), s(g_1 - g_c), ......, s(g_{P-1} - g_c))

[...] rotation angles. To remove the effect of rotation, i.e., to assign a unique identifier to each rotation invariant local binary pattern, we define:

LBP_{P,R}^{ri} = \min\{ROR(LBP_{P,R}, i) \mid i = 0, 1, ..., P - 1\}

where ROR(x, i) performs a circular bit-wise right shift on the P-bit number x, i times. In terms of image pixels, this simply corresponds to rotating the neighbor set clockwise so many times that a maximal number of the most significant bits, starting from g_{P-1}, is 0. LBP_{P,R}^{ri} quantifies the occurrence statistics of individual rotation invariant patterns corresponding to certain micro-features in the image; hence, the patterns can be considered as feature detectors.

VII. RESULT AND DISCUSSIONS

We compared the various algorithms based on the results obtained by testing the MATLAB implementations of the methods with a test database for 4 gestures containing 4 images for each gesture. Three test images for each type of gesture were considered for the comparison. Each test image was processed using all four methods, one after the other. Then the method that works best for a particular type of gesture was inferred based on the results obtained. We tested the following four gestures: one, two, palm and fist. The gestures are as shown below.
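The rotation invariant mapping of Section VI can be sketched in plain Python for the common case P = 8. This is our own illustrative sketch, not the implementation of [11]; the function names are ours.

```python
def lbp_code(neighbors, gc):
    """Basic LBP_{P,R}: threshold each neighbor g_p at the center value g_c
    (the sign s(g_p - g_c)) and weight bit p by 2^p."""
    return sum(1 << p for p, gp in enumerate(neighbors) if gp - gc >= 0)

def ror(x, i, P=8):
    """ROR(x, i): circular bit-wise right shift of the P-bit number x, i times."""
    i %= P
    return ((x >> i) | (x << (P - i))) & ((1 << P) - 1)

def lbp_ri(code, P=8):
    """Rotation invariant LBP: the minimum over all P circular rotations."""
    return min(ror(code, i, P) for i in range(P))

# A neighborhood with two bright neighbors at p = 2, 3 gives code 0b00001100;
# every rotation of that two-bit edge pattern collapses to the same identifier.
code = lbp_code([90, 90, 120, 130, 90, 90, 90, 90], gc=100)
assert code == 0b00001100
assert lbp_ri(code) == lbp_ri(0b00000110) == lbp_ri(0b10000001) == 0b00000011
```

Building the histogram of lbp_ri values over all pixels then gives the rotation invariant texture descriptor that the gestures are compared with.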
Figure 2: Euclidean distance computed between the test image and the images in the database for a) the subtraction method, b) the gradient method and c) the PCA method, and the number of pixels p computed between the test image and the images in the database for d) the rotation invariant method.

We also calculated which method gave how many matches for a particular type of gesture. Based on those matches, we made a table which summarizes the number of database images and the number of matches for all methods. These are shown in Table 1 below.

Test image     Subtraction   Gradient    PCA      Rotation Invariant
gesture type     N    C        N    C     N   C      N    C
One              3    2        3    1     3   3      3    0
Two              3    1        3    2     3   1      3    2
Palm             3    2        3    2     3   1      3    2
Fist             3    1        3    1     3   1      3    2

Table 1: Comparison between the various methods showing the number of matches retrieved by the Hand Gesture Recognition system for different gesture types, where N denotes the number of test images taken and C denotes the number of correct matches.

IX. ACKNOWLEDGMENTS

We would like to express our sincere thanks to all the staff of the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Sikkim, for their support and guidance throughout.

REFERENCES

[1] Sebastian Marcel, Oliver Bernier, Jean Emmanuel Viallet and Daniel Collobert, "Hand Gesture Recognition using Input-Output Hidden Markov Models", Proc. of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 456-461, 2000.
[2] Xia Liu and Kikuo Fujimura, "Hand Gesture Recognition using Depth Data", Proc. of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 529-534, 2004.
[3] L. Bretzner, I. Laptev, and T. Lindeberg, "Hand Gesture Recognition using Multi-Scale Colour Features, Hierarchical Models and Particle Filtering", Proc. of the Fifth International Conference on Automatic Face and Gesture Recognition, pp. 423-428, 2003.
[4] V. Pavlovic, et al., "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review", IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(7), pp. 677-695, 1997.
[5] Attila Licsar and Tamas Sziranyi, "Supervised Training Based Hand Gesture Recognition System", Proc. of the 16th International Conference on Pattern Recognition, Vol. 3, pp. 30999-31003, 2002.