
JOURNAL OF COMPUTING, VOLUME 3, ISSUE 1, JANUARY 2011, ISSN 2151-9617


Using Feature Selection to Speed Up Hybrid PSO/ACO
H. Dallaki, A. Sami, A. Hamzeh and S. Hashemi

Abstract— Recently, a hybrid Particle Swarm Optimisation/Ant Colony Optimisation (PSO/ACO) algorithm was proposed for the discovery of classification rules. By combining the advantages of both algorithms, namely the experience of each particle and its neighbours (from PSO) and the use and updating of pheromone (from ACO), it achieves good results on nominal datasets, but it is not fast enough. In addition, the PSO/ACO2 algorithm can cope directly with nominal attributes, without converting nominal values into numbers in a pre-processing phase. In this paper, we use two feature selection algorithms to speed it up while preserving accuracy.

Index Terms— Ant Colony Optimization (ACO), Feature Selection (FS), Hybrid PSO/ACO2, Particle Swarm Optimization
(PSO).

——————————  ——————————

1 INTRODUCTION

Data mining is the process of extracting patterns or knowledge from data, and two concepts are central to it: accuracy and comprehensibility. Accuracy is the degree of closeness of a measured quantity to its actual value. Comprehensibility means how easily a user can understand the result. The most common type of knowledge representation is in the form of IF-THEN rules, such as the rule shown in Fig. 1.

Fig. 1 IF-THEN rule knowledge representation

Here the antecedent contains some terms (attribute-value pairs) combined using AND or OR logical operators, and the consequent is usually the predicted class label. In this form of representation, it is obvious that the shorter the antecedent is, the more understandable the rule becomes.

So, much recent research in the realm of data mining has tried to achieve this goal through various approaches. Biology-inspired algorithms, such as the genetic algorithm [6] and swarm-based approaches such as Ant Colony Optimization (ACO) [9] and Particle Swarm Optimization (PSO) [12], have been shown to be promising [8]. Biology-inspired algorithms are a category of algorithms that simulate processes found in nature. The genetic algorithm is a search heuristic that mimics the process of natural evolution. ACO and PSO are inspired by ant colonies and flocks of birds (or schools of fish), respectively, as explained in the next section.

In these algorithms, the structure of each individual can be very simple, yet their collective behavior is complex. Indeed, the collective behavior depends not only on individual behavior but also on how the individuals interact with each other. Interaction between individuals increases their experience of the environment and drives the progress of the population.

In [9], ACO was shown to be a powerful paradigm for nominal dataset classification, but it is not fast enough for large datasets. On the other hand, although PSO was adapted for nominal dataset classification by Sousa et al., it was originally used for numeric data classification, and nominal attributes need to be converted to numbers. Holden, in his PhD thesis [21], combined the experience and topology of particles in PSO with ACO, and showed that PSO/ACO is at least competitive with J48 (WEKA's implementation of C4.5) in terms of accuracy, and that PSO/ACO often generates much simpler rule sets.

In this paper, we try to speed up hybrid PSO/ACO2 [20] by using Feature Selection (FS) algorithms while keeping accuracy. The rest of the paper is organized as follows. Section 2 provides some background information. Section 3 describes the PSO/ACO2 algorithm. The remaining sections present the experimental results, the conclusion, and possible future research.

————————————————
• H. Dallaki is with the CSE & IT Department, Shiraz University, Shiraz, Iran.
• A. Sami is with the CSE & IT Department, Shiraz University, Shiraz, Iran.
• A. Hamzeh is with the CSE & IT Department, Shiraz University, Shiraz, Iran.
• S. Hashemi is with the CSE & IT Department, Shiraz University, Shiraz, Iran.

2 BACKGROUND

In this section, we review the three main concepts used in this work: ACO, PSO, and FS.

2.1 The Ant Colony Optimization
In 1983, Deneubourg and his colleagues studied the collective behavior of ants. They found that ants deposit pheromone on the ground in order to mark favorable paths that should be followed by other members of the colony. If there is more than one path between the food and the nest, the ants, reacting to the pheromone, are attracted to the strongest scent. Thus, between the shorter and longer paths from the food to the nest, even if the number of ants on both paths is initially equal due to the random distribution of ants in the environment, after a while the amount of pheromone on the shorter path increases, and so more and more ants are attracted to it. As a result, ants tend to converge to the shortest path [1].
ACO, initially proposed by Marco Dorigo in 1992 [13], is a member of the ant colony algorithms family of swarm intelligence methods and exploits a similar mechanism for solving optimization problems. The ACO meta-heuristic is shown in Algorithm 1.

Algorithm 1 The Ant Colony Optimization Meta-heuristic
Set parameters, initialize pheromone trails
WHILE termination condition not met do
  Construct Ant Solutions
  Apply Local Search (optional)
  Update Pheromones
ENDWHILE
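As a rough illustration only, and not the implementation used in this paper, the following Python sketch shows one way the loop of Algorithm 1 could be organized. The solution construction, the quality function, and the evaporation rate rho are all hypothetical stand-ins for the problem-specific parts.

import random

def quality(solution):
    # placeholder fitness for the sketch: prefer diverse components
    return len(set(solution))

def aco_metaheuristic(num_ants, num_components, max_iters, rho=0.1):
    """Minimal skeleton of the ACO meta-heuristic of Algorithm 1.
    Solutions are modeled as short lists of component indices."""
    pheromone = [1.0] * num_components              # initialize pheromone trails
    best_solution, best_quality = None, float("-inf")
    for _ in range(max_iters):                      # termination condition
        solutions = []
        for _ in range(num_ants):                   # construct ant solutions
            # roulette-wheel choice biased by pheromone (stub construction)
            sol = random.choices(range(num_components), weights=pheromone, k=3)
            solutions.append(sol)
        for sol in solutions:                       # update pheromones
            q = quality(sol)
            if q > best_quality:
                best_solution, best_quality = sol, q
            for c in sol:
                pheromone[c] += q                   # reinforce used components
        pheromone = [(1 - rho) * p for p in pheromone]   # evaporation
    return best_solution

The quality-proportional reinforcement and the evaporation step shown here are common ACO conventions, assumed for the sketch rather than taken from [13].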
Ant colony optimization algorithms have been applied to many combinatorial optimization problems, such as network routing, and many derived methods have been adapted to dynamic problems in real variables, stochastic problems, multi-target problems, and parallel implementations. ACO has also been used to produce near-optimal solutions to the Travelling Salesman Problem (TSP).

The Ant-Miner classification algorithm proposed in [9] is a data miner based on the ACO paradigm that generates rules from a given dataset. Its overall architecture is as follows: at first, an ant has an empty antecedent (an ant is a candidate solution for the given problem). At each step, one attribute-value pair is selected and added to the rule's antecedent, according to the amount of pheromone and the value of a heuristic function that measures the information gain of each attribute-value pair.
2.2 Particle Swarm Optimization
PSO is a stochastic, population-based branch of swarm intelligence algorithms introduced by Kennedy and Eberhart in 1995 [2]. PSO simulates the social behavior of mid-size organisms such as birds in a flock or fish in a school. In this algorithm, the whole population is called a swarm and each individual is called a particle. Particles in a swarm learn from each other and, based on the obtained knowledge, proceed towards their best neighbors.

As can be seen in Fig. 2, different topologies have been proposed to arrange the participating particles; the most well-known are [10]:
1. The ring (lbest) topology
2. The global (gbest) topology
3. The Von Neumann topology

Fig. 2 Three well-known topologies for PSO
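For concreteness, the neighbor sets induced by these three topologies can be computed as below. This is a sketch under the assumption that particles are indexed 0..swarm_size−1 (and, for Von Neumann, arranged on a rows × cols torus); it is not code from the paper.

def ring_neighbors(i, swarm_size):
    # lbest: the immediate left and right neighbors on a circle
    return [(i - 1) % swarm_size, (i + 1) % swarm_size]

def gbest_neighbors(i, swarm_size):
    # gbest: every other particle is a neighbor
    return [j for j in range(swarm_size) if j != i]

def von_neumann_neighbors(i, rows, cols):
    # four neighbors (up, down, left, right) on a rows x cols torus
    r, c = divmod(i, cols)
    return [((r - 1) % rows) * cols + c,
            ((r + 1) % rows) * cols + c,
            r * cols + (c - 1) % cols,
            r * cols + (c + 1) % cols]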
After forming a population based on the chosen topology, each particle is assigned a position and a velocity. The position of each particle is interpreted as a candidate solution for the underlying problem. Additionally, each particle memorizes its best previous position with respect to the given fitness function. The movement of each particle, which determines its next position, is controlled by its velocity: each particle moves in the search space with respect to its best previous position and the best position among all its neighbors, according to the chosen topology. Hence, in each iteration, every particle updates its velocity and position as shown in (1) and (2):

vi = χ (vi + c1 φ1 (pi − xi) + c2 φ2 (pg − xi))   (1)
xi = xi + vi   (2)

where xi is particle i's position, vi is particle i's velocity, χ is the constriction factor that controls the velocity, c1 and c2 are acceleration coefficients, φ1 and φ2 are random numbers in [0,1], pi is the particle's best previous position, and pg is the position of the best neighbor.

PSO was originally developed to solve real-valued optimization problems [12]. In an attempt to extend the real-valued version of PSO to the binary space, Kennedy and Eberhart proposed the binary PSO (BPSO) [3], where, after calculating vi, it is normalized using (3), in which the constant k determines how "deterministic" the search is:

s(vi) = 1 / (1 + exp(−k vi))   (3)

Next, xi is computed using (4), where rand is a random number in [0,1]:

IF rand < s(vi)
  xi(t) = 1   (4)
ELSE
  xi(t) = 0

As another extension, Sousa et al. proposed Discrete PSO, an extended version of BPSO that copes with multi-valued categorical attributes [12]. Fig. 3 shows the behavior of this algorithm.
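A minimal Python sketch of updates (1) through (4) follows. The constriction and acceleration values (χ = 0.7298, c1 = c2 = 2.05) are common choices from the PSO literature, assumed here rather than reported in this paper.

import math
import random

def pso_update(x, v, p_best, g_best, chi=0.7298, c1=2.05, c2=2.05):
    """One particle's continuous PSO update, Eqs. (1)-(2).
    x, v, p_best, g_best are equal-length lists of floats."""
    new_x, new_v = [], []
    for xi, vi, pi, pg in zip(x, v, p_best, g_best):
        phi1, phi2 = random.random(), random.random()        # phi1, phi2 in [0,1]
        vi = chi * (vi + c1 * phi1 * (pi - xi) + c2 * phi2 * (pg - xi))  # Eq. (1)
        new_v.append(vi)
        new_x.append(xi + vi)                                # Eq. (2)
    return new_x, new_v

def bpso_position(v, k=1.0):
    """Binary PSO position sampling, Eqs. (3)-(4)."""
    positions = []
    for vi in v:
        s = 1.0 / (1.0 + math.exp(-k * vi))                  # Eq. (3): sigmoid
        positions.append(1 if random.random() < s else 0)    # Eq. (4)
    return positions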


Fig. 3 Discrete PSO

2.3 Feature Selection
In this section, we briefly introduce Feature Selection (FS) as one of the key concepts needed to understand this work. FS is the process of selecting, hopefully, the smallest subset of features from an original feature set that preserves or increases the efficiency of learning models such as classifiers. Thus, it can be viewed as a principal pre-processing tool prior to solving, for example, classification problems [5]. Note that FS does not map the dataset to another space; it only reduces its dimension. Although FS is typically applied to datasets with many features, it can also be used on datasets with a small number of features, in order to keep relevant features and remove redundant, unnecessary or even misleading ones, thereby improving the accuracy, robustness and efficiency of classification on small datasets. In this paper, two FS algorithms are used.
2.3.1 Correlation-based Feature Selection (CFS)
CFS [24] gives high scores to subsets that contain features that are highly correlated with the class attribute but have low correlation with each other. CFS uses a correlation-based heuristic to evaluate the worth of features. Let S be an attribute subset with k attributes, r_cf the mean correlation of the attributes with the class attribute, and r_ff the mean inter-correlation between the attributes:

Merit_S = k r_cf / sqrt(k + k (k − 1) r_ff)   (5)

where Merit_S is the heuristic "merit" of a feature subset S containing k features, and r_cf is the mean feature-class correlation, defined as (6):

r_cf = (1/k) Σ_{fi ∈ S} Φ(fi, c)   (6)

with Φ(fi, c) being the correlation of feature fi with the class c, and r_ff the average feature inter-correlation. CFS calculates feature-class and feature-feature correlations using symmetrical uncertainty and then selects a subset of features using Best First search, with a stopping criterion of five consecutive fully expanded non-improving subsets. The merit measure of CFS selects maximally relevant features and avoids the re-introduction of redundancy. Its drawback is that CFS cannot handle problems where the class is numeric.
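Equations (5) and (6) reduce to a few lines of code. The sketch below assumes the symmetric-uncertainty correlations have already been computed and simply combines them into the merit score; it is illustrative, not the CFS implementation of [24].

import math

def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset, Eq. (5), from precomputed correlations.
    feature_class_corr: list of Phi(f_i, c) values, one per feature in S
    feature_feature_corr: list of pairwise correlations, one per pair i < j
    In CFS both correlations would be symmetric uncertainty scores."""
    k = len(feature_class_corr)
    r_cf = sum(feature_class_corr) / k                      # Eq. (6)
    r_ff = (sum(feature_feature_corr) / len(feature_feature_corr)
            if feature_feature_corr else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)     # Eq. (5)

# e.g. three features: cfs_merit([0.6, 0.5, 0.4], [0.2, 0.1, 0.3]) ~ 0.73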
2.3.2 Fast Correlation-Based Filter (FCBF)
FCBF [11] consists of two stages. In the first stage, features are sorted by a relevance score, computed as the symmetric uncertainty [14] with respect to the target class; irrelevant variables, those whose ranking score is below a predefined threshold, are discarded. In the second stage, predominant features are selected from the relevant set obtained in the first stage, where a predominant feature is one that contains the information of one or more other features.
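Since FCBF's relevance score is the symmetric uncertainty, SU(X, Y) = 2 IG(X; Y) / (H(X) + H(Y)), its first stage can be sketched as follows. The threshold value is an assumption, and stage two (the predominance test) is omitted.

import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a sequence of nominal values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), FCBF's relevance score."""
    h_x, h_y = entropy(x), entropy(y)
    h_xy = entropy(list(zip(x, y)))          # joint entropy H(X, Y)
    gain = h_x + h_y - h_xy                  # information gain (mutual information)
    return 2.0 * gain / (h_x + h_y) if h_x + h_y > 0 else 0.0

def fcbf_stage_one(features, target, threshold=0.1):
    """Stage 1 only: rank feature columns by SU with the class and drop
    those below the (assumed) threshold."""
    scored = [(symmetric_uncertainty(col, target), idx)
              for idx, col in enumerate(features)]
    return sorted([(s, i) for s, i in scored if s >= threshold], reverse=True)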
3 HYBRID PSO/ACO

ACO, as a tool for classifying nominal datasets, is very time consuming, especially on high-dimensional datasets. On the other hand, PSO is usually used to classify numeric datasets, but it was extended to cope with nominal attributes by Sousa [12]. In PSO, particles use both their own experience and that of their neighbors, so it is relatively fast. Moreover, it has been shown in [15, 16] that the combined algorithm achieves higher accuracy in a faster manner. Holden, in his PhD thesis [21], combines ACO and PSO so that the disadvantages of each are covered by the advantages of the other.
3.1 Sequential Covering Approach
Hybrid PSO/ACO2, the new version of PSO/ACO, uses a sequential covering approach to generate one rule at a time, as the sketch after this paragraph illustrates. In this approach, the RuleSet (RS) is initially empty. Then, for each class, rules are generated one by one by hybrid PSO/ACO2. Each generated rule is pruned, added to RS, and all instances covered by it are deleted. This process continues until the number of uncovered instances falls below a threshold. Finally, RS is ordered by descending quality (fitness). Pseudo code of this approach is given in [20].

In addition, [20] proposes using the Ant-Miner pruning procedure for rule pruning. This procedure tries to find the worst term in each iteration, i.e., the term whose removal increases the rule quality or at least leaves it unchanged; in both cases the rule becomes more general than before. So, in each iteration, every term is removed in turn, the rule's quality after removal is calculated, and the term is then put back where it was; afterwards, the worst term is removed. If the quality would decrease by removing the worst term, the term is not removed and the process terminates.
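The covering loop itself is compact. The following is a loose Python rendering of the approach (the authoritative pseudo code is in [20]); instances are assumed to be dicts with a "class" key, and generate_rule stands in for the PSO/ACO2 rule search plus pruning.

def covers(rule, inst):
    """A rule covers an instance when every antecedent term matches."""
    return all(inst.get(a) == v for a, v in rule["terms"].items())

def sequential_covering(dataset, classes, generate_rule, max_uncovered=10):
    """Sequential covering sketch. generate_rule(instances, cls) is the
    caller-supplied rule inducer and must return a dict with "terms"
    (attribute -> value) and "quality" keys."""
    rule_set = []
    for cls in classes:
        remaining = [inst for inst in dataset if inst["class"] == cls]
        while len(remaining) > max_uncovered:
            rule = generate_rule(remaining, cls)
            covered = [inst for inst in remaining if covers(rule, inst)]
            if not covered:          # guard against a rule that covers nothing
                break
            rule_set.append(rule)
            remaining = [inst for inst in remaining if inst not in covered]
    # order the rule set by descending quality (fitness)
    return sorted(rule_set, key=lambda r: r["quality"], reverse=True)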

3.2 Detail of Hybrid PSO/ACO2
In this algorithm, each particle represents the antecedent of one rule and carries pheromone matrices. If the dataset has n nominal attributes, the pheromone matrix is 2×n: for each attribute, two pheromone probabilities are required, one for the presence of the attribute in the rule (on state) and one for its absence (off state). The sum of these two probabilities is one. Furthermore, each particle has a quality used to evaluate it and keeps track of its neighbors. Fig. 4 illustrates the structure of a particle for one instance from the tic-tac-toe dataset.

Fig. 4 Structure of particle

To update the pheromone of the current particle for the next iteration, the position of its best neighbor and its own best previous position are used. If the quality of the current particle is better than that of its best previous position, the best previous position is replaced. In addition, [20] reports that results are better when the neighborhood topology is Von Neumann. Algorithm 2 shows the PSO/ACO2 algorithm.
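Before the full algorithm, the particle structure just described might be laid out as follows. This is a sketch only: the uniform 0.5/0.5 initialization and the dict layout are assumptions for illustration, not the representation of [20].

def init_particle(seed_instance, attributes):
    """A PSO/ACO2-style particle built from one seed instance: each
    nominal attribute carries a seeding attribute-value pair plus an
    [off, on] pheromone pair that sums to one (the 2 x n matrix above)."""
    return {
        "terms": {a: seed_instance[a] for a in attributes},  # seeding pairs
        "pheromone": {a: [0.5, 0.5] for a in attributes},    # [p_off, p_on]
        "quality": 0.0,
    }

# e.g. one tic-tac-toe instance:
# init_particle({"top-left": "x", "middle": "o"}, ["top-left", "middle"])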
Algorithm 2 The Hybrid PSO/ACO2 Algorithm
Initialize population
REPEAT for MaxIterations
  FOR every particle x
    /* Rule Creation */
    Set Rule Rx = "IF ∅ THEN C"
    FOR every dimension d in x
      Use roulette selection to choose whether the state should be set to off or on. If it is on, the corresponding attribute-value pair set in the initialization will be added to Rx; otherwise (i.e., if off is selected) nothing will be added.
    ENDFOR
    Calculate Quality Qx of Rx
    /* Set the past best position */
    P = x's best previous position
    QP = P's quality
    IF Qx > QP
      QP = Qx
      P = x
    ENDIF
  ENDFOR
  FOR every particle x
    P = x's best previous position
    N = the best position ever held by a neighbor of x according to N's quality QN
    FOR every dimension d in x
      /* Pheromone updating procedure */
      IF Pd = Nd THEN
        pheromone_entry for Pd is increased by 0.25
      ELSEIF Pd = off AND seeding term for xd ≠ Nd THEN
        pheromone_entry for the off state is increased by 0.25
      ELSE
        pheromone_entry for Pd is decreased by 0.25
      ENDIF
      Normalize pheromone_entries
    ENDFOR
  ENDFOR
ENDREPEAT
RETURN best rule discovered
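The pheromone-updating branch of Algorithm 2 can be sketched as below. The 0/1 state encoding, the seed_state representation, and the lower clamp are assumptions made for illustration; the 0.25 step and the normalization to on + off = 1 follow the algorithm above.

def update_pheromone(pheromone, P, N, seed_state, delta=0.25, floor=0.01):
    """Pheromone update for one particle. pheromone: per-attribute
    [p_off, p_on] pairs (the 2 x n matrix); P, N: best previous position
    and best neighbor position, one 0 (off) or 1 (on) state per dimension;
    seed_state: the state implied by each dimension's seeding term."""
    ON, OFF = 1, 0
    for d, (p_d, n_d) in enumerate(zip(P, N)):
        if p_d == n_d:
            pheromone[d][p_d] += delta                      # states agree: reward
        elif p_d == OFF and seed_state[d] != n_d:
            pheromone[d][OFF] += delta                      # reward the off state
        else:
            pheromone[d][p_d] = max(floor, pheromone[d][p_d] - delta)
        total = pheromone[d][ON] + pheromone[d][OFF]        # keep on + off = 1
        pheromone[d][ON] /= total
        pheromone[d][OFF] /= total
    return pheromone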
3.3 Calculating the Quality of Rules
Holden and Freitas, while introducing hybrid PSO/ACO2, suggested several formulas for rule quality; ultimately, the best quality measure in [20] is defined in Eq. (7):

Laplace-corrected Precision = (1 + TP) / (1 + TP + FP)   (7)
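Eq. (7) is trivial to compute from a rule's confusion counts; the snippet below is a direct transcription, with a worked value.

def laplace_precision(tp, fp):
    """Laplace-corrected precision of a rule, Eq. (7), where tp and fp
    are the rule's true- and false-positive counts."""
    return (1 + tp) / (1 + tp + fp)

# e.g. a rule with TP = 30 and FP = 5 scores (1 + 30) / (1 + 30 + 5) = 31/36 ~ 0.861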
4 RESULTS

This section presents the results of applying the implementation to several classification problems from the UCI repository [4]. We used MATLAB 7.6.0 for the implementation, and for the CFS and FCBF algorithms we used the ready-made package from [25]. A summary of all datasets is shown in Table 1.

TABLE 1: SUMMARY OF DATASETS
Dataset  Instances  Attributes  Classes
Lymph  148  18  4
Hayes-roth  160  5  3
Breast-cancer  286  9  2
Vote  435  16  2
Balance  625  4  3
Kr-vs-kp  3196  36  2
Splice  3190  61  3
Mushroom  8124  22  2
Tables 2 and 3 show, respectively, the results of using the CFS and FCBF algorithms as a pre-processing phase. In many cases, the average rule size (ARS) and average rule length (ARL) are reduced while accuracy is preserved; in some cases, accuracy even increases.

Tables 4 and 5 compare the speed of the algorithm in both cases, with and without FS. Although the speed up is impressive in some cases, when the number of attributes is high there is no choice but to reduce the number of features.

5 CONCLUSION

In this paper, we sped up hybrid PSO/ACO2 with feature selection. The results show that, after selecting the relevant features, we obtain simpler rules in addition to speeding up the algorithm. The use of feature selection is suggested particularly for datasets with a large number of features or with considerable noise.

For future work, we will consider other methods and algorithms to speed up the PSO/ACO2 algorithm.

ACKNOWLEDGMENT

The authors would like to thank N. Holden for his useful comments and suggestions.

REFERENCES

[1] R. Beckers, S. Goss, J.-L. Deneubourg, and J.M. Pasteels, "Colony size, communication and ant foraging strategy", Psyche (Cambridge), 1989.
[2] J. Kennedy and R. Eberhart, "Particle Swarm Optimization", in Proc. IEEE International Conference on Neural Networks, Perth, Australia, 1995.
[3] J. Kennedy and R. Eberhart, "A discrete binary version of the particle swarm algorithm", in Proc. 1997 Conference on Systems, 1997.
[4] C.L. Blake and C.J. Merz, UCI Repository of Machine Learning Datasets, technical report, University of California, Irvine, CA, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html
[5] M.A. Hall and L.A. Smith, "Practical feature subset selection for machine learning", Australian Computer Science Conference, Springer, 1998.
[6] E. Noda and A.A. Freitas, "Discovering Interesting Prediction Rules with a Genetic Algorithm", Conference on Evolutionary Computation, Washington DC, 1999.
[7] J. Kennedy and R. Eberhart, "Swarm Intelligence", Morgan Kaufmann Academic Press, 2001.
[8] A.A. Freitas, "A survey of evolutionary algorithms for data mining and knowledge discovery", in A. Ghosh and S. Tsutsui (Eds.), Advances in Evolutionary Computation, Springer-Verlag, 2001.
[9] A.A. Freitas, R.S. Parpinelli, and H.S. Lopes, "Data Mining with an Ant Colony Optimization Algorithm", IEEE Trans. on Evolutionary Computation, special issue on Ant Colony algorithms, 2002.
[10] J. Kennedy and R. Mendes, "Population structure and particle swarm performance", in Proc. IEEE Conference on Evolutionary Computation, Honolulu, Hawaii, USA, 2002.
[11] L. Yu and H. Liu, "Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution", in Proc. Twentieth International Conference on Machine Learning (ICML-03), 2003.
[12] T. Sousa, A. Silva, and A. Neves, "Particle Swarm Based Data Mining Algorithms for Classification Tasks", Parallel Computing 30, Elsevier, 2004.
[13] M. Dorigo and T. Stützle, "Ant Colony Optimization", The MIT Press, Cambridge, Mass., USA, 2004.
[14] I.H. Witten and E. Frank, "Data Mining: Practical Machine Learning Tools and Techniques", Morgan Kaufmann, Amsterdam, 2005.
[15] N. Holden and A.A. Freitas, "A Hybrid Particle Swarm/Ant Colony Algorithm for the Classification of Hierarchical Biological Data", in Proc. 2005 IEEE Swarm Intelligence Symposium (SIS-05), IEEE, 2005.
[16] N. Holden and A.A. Freitas, "Hierarchical Classification of G-Protein-Coupled Receptors with a PSO/ACO Algorithm", in Proc. IEEE Swarm Intelligence Symposium (SIS-06), IEEE, 2006.
[17] C. Grosan, A. Abraham, and M. Chis, "Swarm intelligence in data mining", Studies in Computational Intelligence (SCI), Springer-Verlag, Berlin Heidelberg, 2006.
[18] M. Clerc, "Particle Swarm Optimization", ISTE Ltd, London, 2006.
[19] N. Holden and A.A. Freitas, "A Hybrid PSO/ACO Algorithm for Classification", in Proc. Genetic and Evolutionary Computation Conference (GECCO-2007) Workshop on Particle Swarms: The Second Decade, ACM, 2007.
[20] N. Holden and A.A. Freitas, "A Hybrid PSO/ACO Algorithm for Discovering Classification Rules in Data Mining", Journal of Artificial Evolution and Applications (JAEA), Springer, 2008.
[21] N. Holden, "Improving the hierarchical classification of protein functions with swarm intelligence", PhD dissertation, University of Kent at Canterbury, August 2008.
[22] M. Sordo, G. Ochoa, and S.N. Murphy, "A PSO/ACO Approach to Knowledge Discovery in a Pharmacovigilance Context", in Proc. Genetic and Evolutionary Computation Conference (GECCO-2009), ACM, 2009.
[23] M.B. Osama, A.S. Abd-Elhay, and I.H. Mohamed, "Quantitative Association Rule Mining Using a Hybrid PSO/ACO Algorithm (PSO/ACO-AR)", 2009.
[24] Z. Zhao, F. Morstatter, S. Sharma, S. Alelyani, A. Anand, and H. Liu, "Advancing Feature Selection Research - ASU Feature Selection Repository", School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, 2010.
[25] Feature selection package for MATLAB (free): http://featureselection.asu.edu/software.php

TABLE 2: ACCURACY, AVERAGE RULE SET SIZE, AND AVERAGE RULE LENGTH WITH AND WITHOUT CFS PRE-PROCESSING

Dataset  Acc (no FS)  Acc (CFS)  ARS (no FS)  ARS (CFS)  ARL (no FS)  ARL (CFS)

Lymph 77.90±9.30 79.05±9.73 1.92±0.23 1.72±0.13 5.49±0.90 5.80±1.32

Hayes-roth 59.73±9.37 59.07±14.22 1.61±0.14 1.49±0.15 7.26±1.24 6.40±1.43

Breast-cancer 70.26±9.56 75.22±9.69 2.88±0.16 1.46±0.18 14.29±1.09 7.80±1.03

Vote 90.47±4.23 95.40±2.42 2.32±0.35 1.84±0.20 6.43±1.12 5.00±1.49

Balance 73.12±4.51 63.49±7.49 1.92±0.02 0.80±0.00 29.97±1.22 5.00±0.00

Kr-vs-kp 98.84±0.49 94.24±1.82 5.11±0.28 1.37±0.04 23.53±1.86 9.50±0.53

Splice 78.12±4.00 92.29±1.79 3.69±0.26 3.13±0.18 89.50±17.02 70.20±4.34

Mushroom 100.0±0.00 99.02±0.35 2.69±0.26 1.13±0.00 11.69±1.89 16.00±0.00

TABLE 3: ACCURACY, AVERAGE RULE SET SIZE, AND AVERAGE RULE LENGTH WITH AND WITHOUT FCBF PRE-PROCESSING

Dataset  Acc (no FS)  Acc (FCBF)  ARS (no FS)  ARS (FCBF)  ARL (no FS)  ARL (FCBF)


Lymph 77.90±9.30 78.29±12.30 1.92±0.23 1.63±0.13 5.49±0.90 5.20±1.14

Hayes-roth 59.73±9.37 59.29±10.35 1.61±0.14 1.47±0.13 7.26±1.24 7.10±1.37

Breast-cancer 70.26±9.56 74.69±9.70 2.88±0.16 1.30±0.15 14.29±1.09 6.50±2.17

Vote 90.47±4.23 94.95±2.34 2.32±0.35 1.86±0.25 6.43±1.12 4.90±1.52

Balance 73.12±4.51 73.12±4.51 1.92±0.02 1.66±0.11 29.97±1.22 25.3±3.11

Kr-vs-kp 98.84±0.49 94.24±1.55 5.11±0.28 1.31±0.05 23.53±1.86 8.70±0.67

Splice 78.12±4.00 92.29±1.79 3.69±0.26 3.13±0.18 89.50±17.02 70.20±4.34

Mushroom 100.0±0.00 99.02±0.42 2.69±0.26 1.13±0.00 11.69±1.89 16.00±0.00



TABLE 4: NUMBER OF FEATURES AND SPEED-UP OF THE HYBRID ALGORITHM WITH CFS PRE-PROCESSING

Dataset  No. of attributes (before CFS)  No. of attributes (after CFS)  Speed-up (runtime before / runtime after)
Lymph 18 10 1

Hayes-roth 5 3 1.2

Breast-cancer 9 4 3.2

Vote 16 4 2.2

Balance 4 1 8.8

Kr-vs-kp 36 7 10.1

Splice 61 31 11

Mushroom 22 4 2.4

TABLE 5: NUMBER OF FEATURES AND SPEED-UP OF THE HYBRID ALGORITHM WITH FCBF PRE-PROCESSING

Dataset  No. of attributes (before FCBF)  No. of attributes (after FCBF)  Speed-up (runtime before / runtime after)
Lymph 18 7 1

Hayes-roth 5 3 1.2

Breast-cancer 9 3 3.8

Vote 16 4 2.2

Balance 4 4 0.9

Kr-vs-kp 36 7 10.1

Splice 61 31 11

Mushroom 22 4 2.4
