of Diabetes Disease
Mostafa Fathi Ganji
Faculty of Electrical and Computer Engineering
University of Tarbiat Modares
Tehran, Iran
m.ganji@modares.ac.ir

Mohammad Saniee Abadeh
Faculty of Electrical and Computer Engineering
University of Tarbiat Modares
Tehran, Iran
saniee@modares.ac.ir
Abstract: Ant colony optimization (ACO) has been used successfully in the data mining field to extract rule-based classification systems. The objective of this paper is to use ACO to extract a set of rules for the diagnosis of diabetes. Because the presented algorithm uses ACO to extract fuzzy if-then rules for the diagnosis of diabetes disease, we call it FADD. We have evaluated the new classification system on the Pima Indian Diabetes data set. The results show that FADD detects diabetes with acceptable accuracy that is competitive with, or even better than, the results achieved by previous works. In addition, the discovered rules have good comprehensibility.

Keywords- Ant Colony Optimization, diabetes diagnosis, medical data mining, classification, fuzzy logic.

I. INTRODUCTION
Diabetes is one of the most dangerous diseases and is known as the "silent killer". It is a major health problem in both industrial and developing countries, and its incidence is rising. It is a disease in which either the body does not produce enough insulin or the cells ignore the insulin. Insulin is necessary for the body to be able to use glucose for energy [11]. Diabetes increases the risk of blindness, high blood pressure, heart disease, kidney disease and nerve damage. The disease has two main types [4]: type 1 and type 2. The most common form is type 2 diabetes (diabetes mellitus type 2). Millions of people have been diagnosed with type 2 diabetes, and unfortunately many more are unaware that they are at high risk [11]. In type 2 diabetes, the body is resistant to the effects of insulin (a hormone that regulates the movement of sugar into cells) or does not produce enough insulin to maintain a normal glucose level [3]. The Pima Indians of Arizona have the highest prevalence and incidence of type 2 diabetes of any population in the world [4]. Although medical progress has improved early diagnosis, about half of the patients with type 2 diabetes are unaware of their disease, and the delay from disease onset to diagnosis may exceed ten years [11], even though early diagnosis and treatment of hyperglycemia and related metabolic abnormalities are of vital importance.

The diagnosis of diabetes is not easy because there are many factors that the physician must consider. The two most important stages in the diagnosis of diabetes are evaluating the data taken from the patient and referring to the previous decisions that the expert made on other patients with the same conditions [4]. The former depends on the expert's knowledge, while the latter depends strongly on the expert's experience with earlier patients. The job is not easy given the number of factors that the expert has to evaluate. To reduce possible errors and help the expert, a classification system can be used. The use of classifier systems in medical diagnosis is increasing gradually [4]. Expert systems and different artificial intelligence techniques for classification also help experts a great deal, and for this reason many algorithms have been proposed to classify diabetes patients [3, 4, 13, 14].

Ant colony optimization (ACO) has been used successfully for the classification task. Parpinelli et al. [2] were the first to employ ACO for data mining and named their method AntMiner1. They showed that ACO is a successful method for data mining. Then, Liu et al. [15] improved AntMiner1 and called the result AntMiner2. They believed that AntMiner2 does not give the ants a chance to search, so they introduced a new version and called it AntMiner3 [12]. Finally, Martens et al. [1] proposed a new method that has all the advantages of the previous versions of AntMiner and named it AntMiner+. Saniee et al. [5, 16] combined ACO and fuzzy logic for network intrusion detection and obtained significant results. To the best of our knowledge, ACO has never been used for the diagnosis of diabetes. In this paper we use ACO and fuzzy logic for the diagnosis of diabetes disease. We also propose a new framework for fuzzy rule learning, in which the learning process for each class is done independently. To evaluate the final rule-based classifier, two evaluation criteria are considered: classification rate and comprehensibility. The former denotes the capability of the classifier to detect the diabetes pattern in the input samples, while the latter refers to the interpretability of the classification system, which depends on the number of rules in the classifier and the mean rule length.

The proposed method has been tested using the public Pima Indian Diabetes data set available at the University of California, Irvine web site [17]. The results show that this algorithm can classify the Pima Indian diabetes data set with acceptable accuracy that is competitive with, or even better than, the results achieved by earlier works. The algorithm also has good comprehensibility, because it produces a small number of short rules.
\[ CF_j = \frac{\beta_{\mathrm{Class}\,\hat{h}_j}(R_j) - \bar{\beta}}{\sum_{h=1}^{c} \beta_{\mathrm{Class}\,h}(R_j)} \tag{3} \]

where

\[ \bar{\beta} = \frac{\sum_{h \neq \hat{h}_j} \beta_{\mathrm{Class}\,h}(R_j)}{c - 1} \tag{4} \]

Now we can specify the certainty grade for any combination of antecedent fuzzy sets. How such combinations are generated by the proposed hybrid system is explained in the next sections.

The task of our fuzzy classifier system is to generate combinations of antecedent fuzzy sets that form a rule set S with high classification ability. When a rule set S is given, an input pattern $x_p = (x_{p1}, x_{p2}, \ldots, x_{pn})$ is classified by a single winner rule $R_{j^*}$ in S, which is determined as follows:

\[ \mu_{j^*}(x_p) \cdot CF_{j^*} = \max \{ \mu_j(x_p) \cdot CF_j \mid R_j \in S \} \tag{5} \]

That is, the winner rule has the maximum product of the compatibility $\mu_j(x_p)$ and the certainty grade $CF_j$.

Each fuzzy if-then rule is coded as a string. The following symbols denote the five linguistic values (Fig. 1), plus "don't care":
0: don't care (DC), 1: small (S), 2: medium small (MS), 3: medium (M), 4: medium large (ML), 5: large (L).

IV. THE PROPOSED METHOD
As mentioned earlier, ACO has recently been used in various kinds of data mining problems such as clustering and classification [1, 8]. In this section, we discuss the details of our proposed algorithm for the discovery of classification rules, which we call FADD. This section is divided into six subsections: a general description of the proposed algorithm, Pheromone Initialization, Rule Construction, Quality Computation Function, Pheromone Update Rule, and Stopping Conditions.

A. A general description
FADD uses artificial ants to explore the training search space and gradually build candidate rules. The major difference between this algorithm and the previous algorithms is that it learns the rules for each class separately. In other words, for each class k the main function calls the function FADD, which learns the rules of that class. If a constructed rule is proper (it improves the classification rate by more than a threshold), it is added to DiscoveredRules; otherwise the constructed rule is ignored. The steps of the proposed algorithm are as follows:

Step 1: Set DiscoveredRules to empty and TrainingSet to all of the training samples.
Step 2: For each class:
   Step 2-1: Call FADD (Fig. 2) to learn the rules of the class.
   Step 2-2: Add the rules just learned (in Step 2-1) to DiscoveredRules.
   Step 2-3: Remove the covered samples from TrainingSet.
Step 3: Compute the certainty grade CF for each rule in DiscoveredRules.
Step 4: Classify each input pattern Xp = (x1, x2, x3, ..., xn) with the single rule Rj that has the maximum product of the compatibility and the certainty grade CF among all rules.

B. Pheromone Initialization
Whenever the function FADD is called to learn the rules of a class, all cells in the pheromone table are initialized equally to the following value:

\[ \tau_{ij}(t = 0) = \frac{1}{\sum_{i=1}^{a} b_i} \tag{6} \]

where:
a: the total number of attributes;
b_i: the number of values in the domain of attribute i.

C. Rule Construction
Each time the function FADD is called, a rule in which all terms have the value DC is created in the first iteration (T = 0). In the following iterations (T ≥ 1), an ant can only modify the terms of the rule constructed in the previous iterations. The maximum number of terms that each ant can modify in each iteration (T ≥ 1) is set by a parameter named Max_Change. The largest value of Max_Change is the number of features (in our experiments, Max_Change = 2). The number of ants that modify the rule in the inner loop of FADD is set by the user (No_Of_Ants). The probability that an ant chooses term_ij to modify is given by equation (7). In that equation, $\eta_{ij}$ is a problem-dependent heuristic value for term_ij; in this algorithm we use 0.5 for DC and 0.1 for the other values.
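The term-selection step of Rule Construction can be illustrated with a minimal sketch. It assumes an Ant-Miner-style probability, that is, the product of heuristic and pheromone normalized over the attributes the ant has not used yet; this form is an assumption, since the exact expression is not reproduced here. The heuristic values 0.5 (DC) and 0.1 (other values) and the default Max_Change = 2 come from the text. The function names, the uniform initial pheromone value, and the use of six codes per attribute (DC plus five linguistic values) are hypothetical choices made for illustration only.

```python
import random

DC = 0  # code 0 = "don't care"; codes 1..5 = S, MS, M, ML, L


def heuristic(value):
    """Heuristic value stated in the text: 0.5 for DC, 0.1 otherwise."""
    return 0.5 if value == DC else 0.1


def init_pheromone(num_attributes, num_values=6):
    """Uniform pheromone table (assumed: 1 / total number of cells)."""
    total_cells = num_attributes * num_values
    return [[1.0 / total_cells] * num_values for _ in range(num_attributes)]


def term_probabilities(pheromone, unused_attributes):
    """Assumed Ant-Miner-style rule: P(i, j) proportional to eta * tau,
    normalized over the attributes the ant has not used yet."""
    weights = {
        (i, j): heuristic(j) * pheromone[i][j]
        for i in sorted(unused_attributes)
        for j in range(len(pheromone[i]))
    }
    norm = sum(weights.values())
    return {term: w / norm for term, w in weights.items()}


def modify_rule(rule, pheromone, max_change=2, rng=random):
    """One ant rewrites at most max_change terms of the current rule.

    A drawn value may equal the current one, so fewer than max_change
    terms may actually change.
    """
    new_rule = list(rule)
    unused = set(range(len(rule)))
    for _ in range(min(max_change, len(rule))):
        probs = term_probabilities(pheromone, unused)
        terms = list(probs)
        i, j = rng.choices(terms, weights=[probs[t] for t in terms], k=1)[0]
        new_rule[i] = j
        unused.discard(i)  # each attribute is used at most once per ant
    return new_rule
```

With uniform pheromone, DC is five times as likely as any single linguistic value for the same attribute, which biases early rules toward short antecedents; as the pheromone table changes, the bias shifts toward terms that improved rule quality.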
The remaining symbols in equation (7) are:
$\tau_{ij}$: the amount of pheromone currently available (at time t) on the path between attribute i and value j.
a: the total number of attributes.
b_i: the total number of values in the domain of attribute i.
I: the set of attributes that are not yet used by the ant.

D. Quality Computation Function
Whenever a rule is modified by an ant, the quality function calculates the quality of the modified rule. The quality of a rule Rj is computed according to equation (8).

(8)

where
TP: true positives, the number of cases in our training set covered by the rule that have the class predicted by the rule.
FP: false positives, the number of cases covered by the rule that have a class different from the class predicted by the rule.
FN: false negatives, the number of cases that are not covered by the rule but that have the class predicted by the rule.
TN: true negatives, the number of cases that are not covered by the rule and that do not have the class predicted by the rule.

Algorithm I:
1. j=1, LearnedRules=[];
2. While (Not satisfy stopping conditions)
   2.1 T=0;
   2.2 Pheromone initialization; /* all cells in the pheromone table are initialized equally according to equation (6) */
   2.3 Create rule Rj; /* all terms in this rule have the value DC */
   2.4 Repeat
      2.4.1 T=T+1;
      2.4.2 Modify Rj according to Max_Change; /* each ant can modify the terms of rule Rj according to the Max_Change parameter */
      2.4.3 Compute the quality of Rj; /* according to equation (8) */
      2.4.4 Update pheromone; /* according to equation (10) */
   2.5 Until (T > No_Of_Ants)
   2.6 If isProper(Rj) add Rj to LearnedRules; /* Rj must improve the classification rate */
   2.7 j=j+1;
3. End While;
4. Return LearnedRules;
End Function FADD;

Figure 2. A high-level description of FADD

E. Pheromone Update Rule
After each ant modifies the terms of a rule according to the Max_Change parameter, pheromone updating is carried out. We have defined a new pheromone update function: whenever an ant has modified the terms of rule Rj, the quality of Rj is calculated, and if the quality has increased, the pheromone of this rule is increased according to the amount by which the quality improved. Our experiments indicate that with this new update strategy the pheromone helps improve the quality of the rule in each iteration. Pheromone updating is carried out according to equation (10), where
ΔQ: the difference between the quality of the rule after and before modification.
c: a parameter that regulates the influence of the quality.

It is necessary to decrease the pheromone of terms that have not participated in the construction of rules. For this purpose, pheromone evaporation is simulated. To simulate the evaporation that occurs in real ant colonies, the amount of pheromone associated with each term_ij that does not occur in the constructed rule must be decreased. The pheromone of unused terms is decreased by dividing the value of each $\tau_{ij}$ by the summation of all $\tau_{ij}$.

F. Stopping Conditions
The stopping condition in the outer loop of the FADD function is any condition that the user has defined to terminate the loop. For example, the user can use a fixed number of iterations or a minimum number of uncovered instances to terminate the FADD function. In our experiments, we used the combination of these two conditions.

V. EXPERIMENTAL RESULTS
Our experiments used the Pima Indian Diabetes data set from the UCI data set repository [17], which contains 768 instances, 8 integer-valued attributes and 2 classes. We normalized the data set so that each numerical value lies between 0.0 and 1.0. For this purpose, the function below is applied.

(11)

We evaluate the comparative performance of FADD using ten-fold cross-validation. The data set is divided into ten partitions, and FADD is run ten times, using a different partition as the test set each time, with the other nine as the training set. The classification rate is calculated according to equation (12).
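The evaluation protocol described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scaling is assumed to be ordinary min-max normalization, the classification rate is assumed to be the usual accuracy computed from TP, TN, FP and FN and expressed as a percentage, and all function names, including the `train_fn` callback that stands in for the rule learner, are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Scale every column to [0.0, 1.0] (assumed min-max scaling)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint partitions."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


def classification_rate(y_true, y_pred, positive=1):
    """Accuracy in percent from TP, TN, FP, FN (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def cross_validate(X, y, train_fn, k=10, seed=0):
    """Run k train/test rounds and return the mean classification rate.

    train_fn(X_train, y_train) must return a callable that maps one
    sample to a predicted class label.
    """
    rates = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [model(X[i]) for i in test_idx]
        rates.append(classification_rate([y[i] for i in test_idx], predictions))
    return sum(rates) / len(rates)
```

Each sample appears in exactly one test partition, so the mean over the ten rounds uses every instance once for testing and never for both training and testing in the same round.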
Equation (12) defines the classification rate, where TP, TN, FN and FP have the same meanings as in equation (8):

\[ \text{Classification rate} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12} \]

Table I shows the classification rate for the rule sets produced by different algorithms, and Table II shows the results of FADD. It can be seen that the proposed algorithm discovers fewer rules while still achieving a good classification rate in comparison with the other methods. Also, because the number of rules produced by FADD and the mean length of those rules are low, FADD has good comprehensibility.

TABLE I: CLASSIFICATION RATE OBTAINED WITH DIFFERENT CLASSIFIERS

Method                      Classification Rate
Decision Table*             71.224
RBF*                        75.8
NNGE*                       73.5677
C4.5 Dta*                   73.0
Bayesa*                     72.2
Regression Coefficients*    72.3958
Naive Bayes*                76.3021
CART*                       72.8
C4.5 rules*                 67.0
Deng et al. [13]            78.4
Kayaer et al. [14]          77.08
Polat et al. [3]            78.21
Temurtas et al. [4]         79.16

* The methods marked with an asterisk were tested with the Weka software [10].

TABLE II. RESULT OF FADD

Number of Rules    Mean Classification Rate    Mean length of rules
8                  79.481.1                    2.571

VI. CONCLUSION
This paper presents a combination of Ant Colony Optimization and Fuzzy Logic for mining the Pima Indian diabetes data set. Ant Colony Optimization has already been used for classification in data mining [1,2,11,15]. The main new features of the presented algorithm are as follows:
1. A new framework for learning the rules, in which the rules are learned for each class independently.
2. A different strategy for controlling the influence of pheromone values. We proposed a new pheromone update rule that improves the quality of each rule, because for each rule the amount of pheromone added in each iteration depends on the quality of the modifications made by the ants. With this new pheromone update function, the ants make better decisions in the next iterations in order to improve the quality of the rule.
3. There are two important concepts in ACO: competition and cooperation. The previous versions of AntMiner paid more attention to competition, which made some of the rules very strong while the other rules were rather weak. In this paper we have paid attention to cooperation in order to produce a set of fairly strong rules. For this purpose, we have encouraged the ants to cooperate more in the body of the FADD function.

REFERENCES
[1] David Martens, Manu De Backer, Raf Haesen, Jan Vanthienen, Monique Snoeck, and Bart Baesens, Classification With Ant Colony Optimization, IEEE Trans. on Evolutionary Computation, Vol. 11, pp. 651-656, 2007.
[2] R. S. Parpinelli, H. S. Lopes, and A. A. Freitas, Data mining with an ant colony optimization algorithm, IEEE Trans. on Evolutionary Computation, Vol. 6, pp. 321-332, 2002.
[3] Kemal Polat, Salih Gunes, Ahmet Arslan, A cascade learning system for classification of diabetes disease: Generalized Discriminant Analysis and Least Square Support Vector Machine, Expert Systems with Applications, Vol. 34, pp. 482-487, 2008.
[4] Hasan Temurtas, Nejat Yumusak, Feyzullah Temurtas, A comparative study on diabetes disease diagnosis using neural networks, Expert Systems with Applications, Vol. 36, pp. 8610-8615, 2009.
[5] Mohammad Saniee Abadeh, Jafar Habibi, and Emad Soroush, Induction of fuzzy classification systems via evolutionary ACO-based Algorithms, International Journal of Simulation, Systems, Science, Technology, Vol. 9, No. 3, 2008.
[6] Marco Dorigo, Christian Blum, Ant colony optimization theory: A survey, Theoretical Computer Science, Vol. 344, pp. 243-278, 2005.
[7] M. Dorigo, V. Maniezzo, A. Colorni, The ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man and Cybernetics, Vol. 26, pp. 1-13, 1996.
[8] Urszula Boryczka, Finding groups in data: Cluster analysis with ants, Applied Soft Computing, Vol. 9, pp. 61-70, 2009.
[9] Christian Blum, Review: Ant colony optimization: Introduction and recent trends, Physics of Life Reviews, Vol. 2, pp. 353-373, 2005.
[10] Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, vol. 1, Morgan Kaufmann publications, pp. 363-483, 2005.
[11] American Diabetes Association. http://www.diabetes.org/diabetes-basics (last accessed: November 2009)
[12] B. Liu, H. A. Abbass, and B. McKay, Classification rule discovery with ant colony optimization, In Proc. IEEE/WIC Int. Conf. Intell. Agent Technol., 2003.
[13] D. Deng and N. Kasabov, On-line pattern analysis by evolving self-organizing maps, In Proceedings of the fifth biannual conference on artificial neural networks and expert systems, 2001.
[14] K. Kayaer and T. Yıldırım, Medical diagnosis on Pima Indian diabetes using general regression neural networks, In Proceedings of the international conference on artificial neural networks and neural information processing, 2003.
[15] B. Liu, H. A. Abbass, B. McKay, Density-based heuristic for rule discovery with ant-miner, The 6th Australia-Japan joint workshop on intelligent, 2002.
[16] Mohammad Saniee Abadeh, Jafar Habibi, Emad Soroush, Induction of Fuzzy Classification Systems Using Evolutionary ACO-Based Algorithms, Proceedings of the First Asia International Conference on Modelling & Simulation (AMS'07), IEEE, 2007.
[17] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, 1996. Available from http://www.ics.uci.edu./~mlearn/MLReporsitory.html.