Arabian Journal for Science and Engineering
https://doi.org/10.1007/s13369-020-04872-1
RESEARCH ARTICLE-COMPUTER ENGINEERING AND COMPUTER SCIENCE
Training Feed-Forward Multi-Layer Perceptron Artificial Neural
Networks with a Tree-Seed Algorithm
Ahmet Cevahir Cinar1
Received: 28 March 2020 / Accepted: 13 August 2020
© King Fahd University of Petroleum & Minerals 2020
Abstract
The artificial neural network (ANN) is the most popular research area in neural computing. A multi-layer perceptron (MLP) is an ANN that has hidden layers. Feed-forward (FF) ANNs are commonly used for classification and regression. Training of FF MLP ANNs is generally performed by the backpropagation (BP) algorithm. The main disadvantage of BP is becoming trapped in local minima. Nature-inspired optimizers have mechanisms for escaping from local minima. The tree-seed algorithm (TSA) is an effective population-based swarm intelligence algorithm. TSA mimics the relationship between trees and their seeds. Exploration and exploitation are controlled by the search tendency, which is a parameter peculiar to TSA. In this work, we train FF MLP ANNs with TSA for the first time. TSA is compared with particle swarm optimization, gray wolf optimizer, genetic algorithm, ant colony optimization, evolution strategy, population-based incremental learning, artificial bee colony, and biogeography-based optimization. The experimental results show that TSA is the best in terms of mean classification rates and outperforms its competitors on 18 problems.
Keywords Tree-seed algorithm · Multi-layer perceptron · Training neural network · Artificial neural network · Neural networks · Nature inspired algorithms
1 Introduction
Neural computing mimics the human brain, which is the most complex organ of the human body [1]. Neural networks simulate the connections in the human brain [2]. This simulation is named the artificial neural network (ANN). Basically, an ANN takes inputs, processes them, and produces outputs. This process is a learning process. Learning has two main types: supervised and unsupervised. In supervised learning, the training data have output labels, whereas in unsupervised learning, the data have no output labels. An ANN is a balancer between the inputs and outputs. In the literature, there are various types of networks such as feed-forward (FF) [3], Kohonen [4], radial-basis function (RBF) [5], recurrent neural [6], and spiking neural [7] networks. FF is a network in which information flows one way (in one direction). In FF, the association of inputs and outputs is provided by weights and biases. If an FF ANN has hidden layers, it is named a multi-layer perceptron (MLP) [8,9]. An MLP has three layers: input, hidden, and output. Training data are used for learning the hidden weights between the attributes and class labels. Deterministic and stochastic learning approaches are used for training an ANN. The gradient-based methods and the backpropagation (BP) algorithm are deterministic methods [10]. If the training data do not change, deterministic methods produce the same results. The deterministic methods are simple and fast. Stochastic methods try to improve the learning rate during the iterations. Thus, their time usage is higher than that of the deterministic methods, but they give better results. The main drawback of the deterministic methods is their dependency on the initial solutions. Stochastic optimization techniques start with random solutions. These random solutions are evolved in every iteration, and the main advantage is avoiding local optima. Nature-inspired optimizers belong to the group of stochastic optimization techniques. Most of these methods are multi-solution-based algorithms, but some of them, like hill climbing [11] and simulated annealing [12,13], are single-solution-based algorithms. The tree-seed algorithm (TSA), particle swarm optimization (PSO), gray wolf optimizer (GWO), genetic algorithm (GA), ant colony optimization (ACO), evolution strategy (ES), population-based incremental learning (PBIL), artificial bee colony (ABC), and biogeography-based optimization (BBO) are some of the multi-solution nature-inspired optimizers. These algorithms are used not only in the training of FF ANN MLPs but also in various applications like feed formulation [14], the traveling salesman problem [15], the layout problem [16], and combinatorial problems [17]. Also, ANNs are used in many applications like forecasting [18], classification [19,20], estimation [21,22], and prediction [23]. The ANN is used for classifying not only stationary signals but also non-stationary signals [24]. Various techniques are used in the classification of non-stationary signals; for example, Koh and Woo [24] combined an ensemble technique and multi-view learning for classifying non-stationary signals; Boashash and Ouelha [25] focused on extracting information from non-stationary signals; Delsy et al. [26] extracted features from non-stationary signals and classified them with a backpropagation network. According to the No Free Lunch (NFL) theorem [27], a nature-inspired optimizer cannot solve all optimization problems successfully. Therefore, since the GA was proposed in 1975, more than 300 nature-inspired algorithms have been proposed. Every nature-inspired optimizer has its own peculiar properties, so in this work we aim to demonstrate the success of TSA in training MLPs. TSA is an effective solver on low-dimensional problems. In this work, we modify the basic TSA for solving large-scale MLP training.

✉ Ahmet Cevahir Cinar
accinar@selcuk.edu.tr; ahmetcevahircinar@gmail.com
1 Department of Computer Engineering, Faculty of Technology, Selcuk University, 42075 Konya, Turkey
The remainder of the paper is organized as follows:
Sect. 1.1 gives the main contribution of the study. In Sect. 2,
the related works are given. FF MLP ANN and TSA are
examined in Sects. 3.1 and 3.2, respectively. The experimen-
tal setup and information about datasets are given in Sect. 4.
The results and discussion are located in Sect. 5. Finally, in
Sect. 6 we conclude the work.
1.1 The Main Contribution of the Study
• TSA is used for training the FF MLP ANN for the first time.
• TSA is compared with 8 metaheuristic algorithms on 18 different datasets (6 to 6786 dimensions, 4 to 7400 samples) and outperforms them.
• TSA finds suitable weights and biases for the FF MLP ANN.
• The parameter adjustment for the basic TSA increases the mean classification accuracy.
• TSA is the best solver on 18 different types of datasets in terms of mean classification rates.
2 Literature Review
Neural computing and nature-inspired optimization form a huge
research domain. Training feed-forward multi-layer perceptron
artificial neural networks is not a new idea, but
it is a much discussed, active, and growing research problem
in the literature. Therefore, in this section, we only focus
on the recent applications related to the training of FF MLP
ANN with nature-inspired optimizers and the literature of
TSA.
Wienholt [28] uses ES for minimizing the system error of
an MLP. Seiffer [29] proposes a GA approach for avoiding
local minima when training MLP in 2001. Mendes et al. [30]
use PSO for training MLPs on classification and regression
tasks in 2002. In 2005, Blum and Socha [31] extend ACO
for pattern classification on medical data. Karaboga et al.
[32] use ABC for training FF ANN. Five function approxi-
mation problems are used in experiments. ABC outperforms
BP and GA in this work. Mirjalili et al. [33] hybridize PSO
and gravitational search algorithm, and it is named PSOGSA.
In this work, MLP is trained with PSOGSA. The obtained
results are compared with PSO and GSA. PSOGSA is bet-
ter than PSO and GSA in terms of convergence, training
error, and classification rate. Mirjalili et al. [34] train MLP
with BBO in 2014. Five classification and six approximation
datasets are used for experiments. BBO outperforms
PSO, GA, ACO, ES, and PBIL. Also, the obtained results
are compared with the BP algorithm and extreme learning
machine. Mirjalili [8] investigates the effectiveness of the
GWO on training MLP in 2015. Five classification and three
function approximation datasets are used for determining
the performance of GWO. GWO creates better results than
PSO, GA, ACO, ES, and PBIL. Amirsadri et al. [35] com-
bine BP and GWO for training MLP. Levy flight technique
is used to improve the exploration capability of GWO like
[36]. BP increases the exploitation capability of GWO. The
success of the proposed model is shown on 12 classification
and function-approximation datasets. Xu et al. [37] modify
ABC with the global best-guided approach for continuous
optimization problems, and this method is named ABC-ISB.
ABC-ISB is compared with ABC variants in the literature. In
this work, basic ABC and ABC-ISB are compared on training
MLP. ABC-ISB creates promising results. Zhang et al. [38]
optimize the weights and biases of a MLP with improved
GWO, and their approach is named RSMGWO in 2019.
RSMGWO used a random opposition learning strategy for
avoiding the local optima. Nineteen different cancer-related
datasets are used for experiments. RSMGWO produces competitive
results. Heidari et al. [39] use the ant–lion optimizer
for training MLP in 2020, and their approach is named
ALOMLP. ALOMLP outperforms GA, PBIL, DE, and
PSO in this work. Dalwinder et al. [40] weight the features
of the datasets to increase the classification rate in
2020. In this work, an ant–lion optimizer is used for training
the MLPs. Three breast cancer datasets are used in the
experimental setup. The obtained results showed that this
paradigm increases the classification rate. Faris et al. [41]
train MLP with multi-verse optimizer (MVO). Nine different bio-medical datasets selected from the UCI machine learning repository are used in experiments. MVO is compared with GA, PSO, DE, firefly, and cuckoo search algorithms. The experimental results show that MVO produces competitive results. Metaheuristic algorithms are used not only for time-series prediction but also in image processing [42], electrical machine design [43], and optimization of energy consumption in wireless sensor networks [44].

Fig. 1 The structure of the 2-3-1 MLP

TSA is another iterative continuous search algorithm, proposed
by Kiran [45] in 2015. In the literature, TSA is used
in a wide range of research areas such as constrained versions
of TSA [46,47], engineering optimization problems
solved with TSA [48–58], RBF network training and appli-
cations with TSA [48,59], parallel versions of TSA [60,61],
image processing with TSA [62–65], binary optimization
with TSA [66–68], improved versions of TSA [50,69–76],
feature selection with TSA [77], discrete versions of TSA
[15].
Until now, there is no work in the literature for training
MLP with TSA; our main motivation is to present the effec-
tiveness of TSA for training MLPs.
3 Materials and Methods
3.1 Feed-Forward Neural Network and Multi-Layer
Perceptron
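The FF neural network is a neural network in which connections between neurons run one way (in one direction). If a NN has hidden layers, it is named an MLP. In this study, a vector representation is used for individuals. The individual for the 2-3-1 MLP presented in Fig. 1 is

$$X = [W_{13}, W_{23}, W_{14}, W_{24}, W_{15}, W_{25}, W_{36}, W_{46}, W_{56}, \theta_1, \theta_2, \theta_3, \theta_4].$$

The dimension is calculated as

$$((\mathrm{InputNumber} + \mathrm{OutputNumber} + 1) \times \mathrm{HiddenNodesNumber}) + 1.$$

For the 2-3-1 MLP this gives ((2 + 1 + 1) × 3) + 1 = 13, i.e., the 9 weights and 4 biases listed in the vector.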
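As a minimal sketch of this vector encoding, the helpers below count the parameters of an MLP and split a flat individual into weights and biases. The function names (`mlp_dimension`, `unpack`) and the exact ordering of the entries are assumptions made for illustration, not the author's implementation. The count is written in the general form I·H + H·O + H + O, which equals the expression given in the text when there is a single output.

```python
def mlp_dimension(n_in, n_hidden, n_out):
    """Number of weights and biases in an n_in-n_hidden-n_out MLP."""
    weights = n_in * n_hidden + n_hidden * n_out
    biases = n_hidden + n_out
    return weights + biases


def unpack(x, n_in, n_hidden, n_out):
    """Split a flat individual into (w_ih, w_ho, b_h, b_o).

    Assumed layout: input-to-hidden weights grouped by hidden node,
    then hidden-to-output weights grouped by output node, then the
    hidden biases, then the output biases.
    """
    assert len(x) == mlp_dimension(n_in, n_hidden, n_out)
    pos = 0
    w_ih = []
    for _ in range(n_hidden):
        w_ih.append(list(x[pos:pos + n_in]))
        pos += n_in
    w_ho = []
    for _ in range(n_out):
        w_ho.append(list(x[pos:pos + n_hidden]))
        pos += n_hidden
    b_h = list(x[pos:pos + n_hidden])
    pos += n_hidden
    b_o = list(x[pos:pos + n_out])
    return w_ih, w_ho, b_h, b_o


print(mlp_dimension(2, 3, 1))  # 2-3-1 MLP: prints 13
```

Keeping the individual flat lets any continuous optimizer treat the network as an ordinary real-valued vector; the network structure is only needed when the objective function is evaluated.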
Mean square error (MSE) for all training samples is used as the objective function. Equation 1 shows this calculation:

$$\mathrm{MSE} = \frac{\sum_{t=1}^{T} \sum_{i=1}^{m} \left(R_i^t - C_i^t\right)^2}{T} \tag{1}$$

where $T$ is the number of training samples, $m$ is the number of outputs, $C_i^t$ is the created output value of the $i$th output for the $t$th training sample, and $R_i^t$ is the real output value of the $i$th output for the $t$th training sample.
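The objective function can be sketched as follows. The function name `mse` and the list-of-lists data layout are assumptions made for illustration; the example values are invented and are not results from this study.

```python
def mse(real, created):
    """Squared error summed over outputs, averaged over training samples.

    real[t][i] is the real value and created[t][i] the value produced
    by the network for output i of training sample t.
    """
    total = 0.0
    for r_t, c_t in zip(real, created):
        total += sum((r - c) ** 2 for r, c in zip(r_t, c_t))
    return total / len(real)


# Hypothetical XOR targets and network outputs (single output, m = 1)
targets = [[0.0], [1.0], [1.0], [0.0]]
outputs = [[0.1], [0.9], [0.8], [0.2]]
print(mse(targets, outputs))
```

Note that the sum runs over all outputs before dividing by the number of samples only, so for a network with several outputs the value is not additionally averaged over the outputs.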
3.2 Tree-Seed Algorithm
TSA was proposed by Kiran [45] for solving unconstrained continuous optimization problems in 2015. TSA simulates the relationship between trees and their seeds. TSA is a population-based swarm intelligence technique. It has two peculiar parameters: the Search Tendency (ST) and the Number of Seeds (NS). ST controls the seed creation direction. The population is named the stand in TSA. Kiran [45] recommends that NS be between 10% and 25% of the stand size, but this number can be changed if necessary. In TSA, the trees and seeds correspond to possible solutions of an optimization problem. At the initialization phase, the population is created randomly in a predetermined search space. The trees and seeds are D-dimensional vectors, where D is the dimensionality of the optimization problem. The search process is a trade-off between exploration and exploitation. If this trade-off is balanced, the algorithm creates better solutions. In TSA, this trade-off is controlled by the ST parameter with two different seed creation formulas given in Eqs. 2 and 3, respectively.

$$\mathrm{Seed}(k,j) = \mathrm{Tree}(i,j) + \left(\mathrm{Best}(j) - \mathrm{Tree}(r,j)\right) \times \mathrm{Rand}(-1,1) \tag{2}$$

$$\mathrm{Seed}(k,j) = \mathrm{Tree}(i,j) + \left(\mathrm{Tree}(i,j) - \mathrm{Tree}(r,j)\right) \times \mathrm{Rand}(-1,1) \tag{3}$$

where k is the index of the seed, i is the index of the current tree, j is the index of the dimension, r is the index of the random neighbor tree, Best is the best tree obtained so far, and Rand(−1,1) is a random number between −1 and 1. Equation 2 provides the exploitation and Equation 3 provides the exploration. The detailed pseudocode of the basic TSA is given in Fig. 2.
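The two seed creation rules can be sketched in a few lines of Python. The function name `make_seed` and its argument layout are assumptions made for illustration; boundary handling is left out here.

```python
import random


def make_seed(trees, i, r, best, st, rng=random):
    """Create one seed for tree i using the neighbor tree r (r != i)."""
    seed = []
    for j in range(len(trees[i])):
        alpha = rng.uniform(-1.0, 1.0)  # Rand(-1, 1)
        if rng.random() < st:
            # Step guided by the best tree found so far (exploitation)
            value = trees[i][j] + alpha * (best[j] - trees[r][j])
        else:
            # Step relative to the current tree itself (exploration)
            value = trees[i][j] + alpha * (trees[i][j] - trees[r][j])
        seed.append(value)
    return seed


trees = [[1.0, 2.0], [3.0, 5.0]]
print(make_seed(trees, i=0, r=1, best=[0.0, 0.0], st=0.1))
```

Because the rule is chosen independently for every dimension, a single seed usually mixes both kinds of step; a small ST value such as 0.1 makes the exploration rule dominate.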
Determine the number of trees (N)
Determine the search tendency (ST) parameter
Determine the maximum function evaluation number (Maxfes)
D is the dimensionality of the problem
Initialize the trees
Evaluate the trees
Fes = N
WHILE Fes < Maxfes
    FOR i = 1 to N
        Determine the number of seeds (NS) between 10% and 25% of the population size
        Select a random neighbor tree (r) that is not equal to the current tree
        FOR k = 1 to NS
            FOR j = 1 to D
                IF rand < ST
                    Seed(k,j) = Tree(i,j) + rand(-1,1) * (Best(j) - Tree(r,j))
                    Relocate the seed if it crosses the search space boundaries
                ELSE
                    Seed(k,j) = Tree(i,j) + rand(-1,1) * (Tree(i,j) - Tree(r,j))
                    Relocate the seed if it crosses the search space boundaries
                END
            END
        END
        Evaluate the seeds and set Fes = Fes + NS
        Determine the best seed with a greedy selection mechanism
        If the best seed is better than its tree, then the tree is removed from the search space and the best seed becomes a tree
    END
    Determine the best tree with a greedy selection mechanism
END
Fig. 2 The detailed pseudocode of the basic TSA

3.3 Training MLP with TSA
This section describes in detail how to train an FF MLP with TSA. TSA is a continuous optimization algorithm, and Sect. 3.2 gives detailed information about TSA. The main aim is to determine the optimum parameters of the MLP. These parameters are explained in Sect. 3.1. At the initialization phase, these values are initialized as a random vector. After that, this vector is optimized by TSA, and finally the optimized parameters of the MLP are produced by TSA. The flowchart of the proposed method is given in Fig. 3.
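The whole procedure can be illustrated with a small, self-contained sketch that trains a 2-2-1 MLP on the XOR problem (9 parameters, search range [−10, 10]). This is not the author's code: the sigmoid activation, the clamping of out-of-range seeds, the parameter layout, and all function names are assumptions made for illustration.

```python
import math
import random

XOR_X = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR_Y = [0, 1, 1, 0]
LOW, HIGH, DIM = -10.0, 10.0, 9  # search range and size of a 2-2-1 MLP


def sigmoid(z):
    # Numerically safe logistic function (assumed activation)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def xor_mse(x):
    """MSE of a 2-2-1 MLP; x = [4 in-hid w, 2 hid-out w, 3 biases]."""
    err = 0.0
    for (a, b), target in zip(XOR_X, XOR_Y):
        h1 = sigmoid(x[0] * a + x[1] * b + x[6])
        h2 = sigmoid(x[2] * a + x[3] * b + x[7])
        out = sigmoid(x[4] * h1 + x[5] * h2 + x[8])
        err += (target - out) ** 2
    return err / len(XOR_X)


def tsa(objective, dim, n=10, st=0.1, maxfes=12500, rng=None):
    rng = rng or random.Random()
    trees = [[rng.uniform(LOW, HIGH) for _ in range(dim)] for _ in range(n)]
    fit = [objective(t) for t in trees]
    fes = n
    b = min(range(n), key=fit.__getitem__)
    best, best_fit = trees[b][:], fit[b]
    lo = max(1, int(0.10 * n))   # NS between 10% ...
    hi = max(lo, int(0.25 * n))  # ... and 25% of the stand size
    while fes < maxfes:
        for i in range(n):
            ns = rng.randint(lo, hi)
            r = rng.choice([k for k in range(n) if k != i])
            best_seed, best_seed_fit = None, float("inf")
            for _ in range(ns):
                seed = []
                for j in range(dim):
                    alpha = rng.uniform(-1.0, 1.0)
                    ref = best[j] if rng.random() < st else trees[i][j]
                    v = trees[i][j] + alpha * (ref - trees[r][j])
                    seed.append(min(HIGH, max(LOW, v)))  # clamp (assumption)
                f = objective(seed)
                fes += 1
                if f < best_seed_fit:
                    best_seed, best_seed_fit = seed, f
            if best_seed_fit < fit[i]:  # greedy replacement of the tree
                trees[i], fit[i] = best_seed, best_seed_fit
        b = min(range(n), key=fit.__getitem__)
        if fit[b] < best_fit:
            best, best_fit = trees[b][:], fit[b]
    return best, best_fit


best, err = tsa(xor_mse, DIM, rng=random.Random(0))
print(err)
```

The default stand size, search tendency, and evaluation budget follow the values listed for the XOR problems in this paper; everything else is a design choice of the sketch. Because of the greedy replacement, the best error never increases from one iteration to the next.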
Fig. 3 The flowchart of the proposed method

3.4 The Computational Complexity of the Proposed Method
The computational complexity of the proposed method is related to the structure of the MLP, the number of instances in the training data, the stand size, the maximum number of function evaluations, and the number of seeds. The Big-O notation of the computational complexity of the proposed method is given in Eq. 4.

$$O(\mathrm{TSA}, \mathrm{MLP}) = O(\mathrm{Maxfes}\,(O(\mathrm{MLP}) + O(\mathrm{TSA}))) \tag{4}$$

where Maxfes is the maximum number of function evaluations, O(MLP) is the Big-O notation of the MLP, calculated as in Eq. 5, and O(TSA) is the Big-O notation of TSA, calculated as in Eq. 6.

$$O(\mathrm{MLP}) = t\,(h + o) \tag{5}$$

where t is the number of instances in the training data, h is the number of hidden nodes in the MLP, and o is the number of output values. In this work, h and o are smaller than t, so in the worst case O(MLP) = t.

$$O(\mathrm{TSA}) = N \times NS \times D \tag{6}$$

where N is the stand size, NS is the number of seeds, and D is the dimensionality of the training dataset. In the best case NS = N/10 and in the worst case NS = N/4. NS must be smaller than N. Generally D is smaller than t; to ease the calculation we suppose D = t. So, in the worst case O(TSA) = N × N/4 × t.
The overall computational complexity of the proposed method is given in Eq. 7.

$$O(\mathrm{TSA}, \mathrm{MLP}) = O(\mathrm{Maxfes}\,(t + N \times N/4 \times t)) \tag{7}$$

Table 1 The details of datasets

| No | Dataset name | Number of attributes | MLP structure | Dimensions | Weight numbers | Bias numbers | Range |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XOR6 | 2 | 2-2-1 | 6 | 6 | 0 | [−100, 100] |
| 2 | XOR9 | 2 | 2-2-1 | 9 | 6 | 3 | [−10, 10] |
| 3 | XOR13 | 2 | 2-3-1 | 13 | 9 | 4 | [−10, 10] |
| 4 | 3-bit Parity | 3 | 3-3-1 | 16 | 12 | 4 | [−10, 10] |
| 5 | 4-bit Enc. Dec | 4 | 4-2-4 | 22 | 16 | 6 | [−10, 10] |
| 6 | 3-bits XOR | 3 | 3-7-1 | 36 | 28 | 8 | [−10, 10] |
| 7 | Sigmoid | 1 | 1-15-1 | 46 | 30 | 16 | [−10, 10] |
| 8 | Cosine | 1 | 1-15-1 | 46 | 30 | 16 | [−10, 10] |
| 9 | Sine | 1 | 1-15-1 | 46 | 30 | 16 | [−10, 10] |
| 10 | Balloon | 4 | 4-9-1 | 55 | 45 | 10 | [−10, 10] |
| 11 | Iris | 4 | 4-9-3 | 75 | 63 | 12 | [−10, 10] |
| 12 | Breast Cancer | 9 | 9-19-1 | 210 | 190 | 20 | [−10, 10] |
| 13 | Heart | 22 | 22-45-1 | 1082 | 1035 | 46 | [−10, 10] |
| 14 | Banknote | 4 | 4-9-1 | 55 | 45 | 10 | [−10, 10] |
| 15 | Diabetic | 19 | 19-39-1 | 820 | 780 | 40 | [−10, 10] |
| 16 | Twonorm | 20 | 20-41-1 | 903 | 861 | 42 | [−10, 10] |
| 17 | Ringnorm | 20 | 20-41-1 | 903 | 861 | 42 | [−10, 10] |
| 18 | Spambase | 57 | 57-115-1 | 6786 | 6670 | 116 | [−10, 10] |

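As a worked example of the worst-case expression, the snippet below plugs in the Spambase settings reported in this paper (12,500 function evaluations, a stand size of 10, and 4601 training samples). The function name `worst_case_cost` is hypothetical, and the result is only an operation-count estimate in the sense of the Big-O expression, not a measured run time.

```python
def worst_case_cost(maxfes, t, n):
    """Maxfes * (t + N * N/4 * t): worst case with NS = N/4 and D taken as t."""
    return maxfes * (t + n * (n / 4) * t)


# Spambase: Maxfes = 12,500, t = 4601 training samples, stand size N = 10
print(worst_case_cost(12500, 4601, 10))  # 1495325000.0
```

The term N × N/4 × t dominates the single t contributed by the MLP evaluation, so under these assumptions the cost grows quadratically with the stand size and linearly with the number of training samples.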
4 Experimental Setup
In this section, we give the details of our experimental setup. The details of the datasets (the name of the dataset, the number of attributes of the dataset, the MLP structure for training the dataset, the dimension of the dataset, the total number of weights, the total number of biases, and the search space range for the dataset) are given in Table 1. To determine the performance of the algorithm, 18 different datasets (XOR6, XOR9, XOR13, 3-bit Parity, 4-bit Encoder Decoder, 3-bits XOR, Sigmoid, Cosine, Sine, Balloon, Iris, Breast Cancer, Heart, Banknote, Diabetic, Twonorm, Ringnorm, and Spambase) are used in experiments. The large datasets, which have more than 1000 training/test samples, are discussed in Sect. 5.3. There is no strict rule for selecting the number of hidden nodes. Equation 8 is used for determining the number of hidden nodes.

$$H = 2 \times I + 1 \tag{8}$$

where H is the number of hidden nodes of the MLP and I is the number of input nodes. For function approximation datasets, the number of hidden nodes is set as 15.
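The hidden-node rule is simple enough to check by hand; the short sketch below applies it to three of the classification datasets. The function name `hidden_nodes` is hypothetical, and the input counts are the attribute numbers given for these datasets in Table 1.

```python
def hidden_nodes(n_inputs):
    """Number of hidden nodes H = 2 * I + 1 for I input nodes."""
    return 2 * n_inputs + 1


for name, n_inputs in [("Balloon", 4), ("Breast Cancer", 9), ("Spambase", 57)]:
    h = hidden_nodes(n_inputs)
    print(f"{name}: {n_inputs}-{h}-1")
```

The printed structures match the MLP structures listed for these datasets in Table 1.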
In this work, we use nine algorithms. All specific param-
eters of these algorithms are listed in Table 2.
The maximum iteration numbers, the maximum numbers of
function evaluations, the population sizes (for PSO,
GWO, GA, ACO, ES, PBIL, and BBO), the stand size of
TSA, and the colony size of ABC for every dataset are given
in Table 3.
The information about training/test samples is given in
Table 4.
Table 2 Specific parameters of the algorithms used in this work

| Algorithm | Parameter | Value |
| --- | --- | --- |
| TSA | Search Tendency | 0.1 |
| TSA | Number of Seeds (N: Stand Size) | N*0.1 to N*0.25 |
| GWO | a (linearly decreased) | 2 to 0 |
| ABC | Colony Size (CS) (N: Population Size) | N/2 |
| ABC | Limit (D: Dimension of the problem) | CS*D |
| BBO | Habitat modification probability | 1 |
| BBO | Immigration probability bounds per gene | [0,1] |
| BBO | Step size for numerical integration of probabilities | 1 |
| BBO | Max immigration (I) and Max emigration (E) | 1 |
| BBO | Mutation probability | 0.005 |
| PSO | Cognitive constant (C1) | 1 |
| PSO | Social constant (C2) | 1 |
| PSO | Inertia constant (w) | 0.3 |
| GA (real coded, selection Roulette wheel) | Crossover single point (probability) | 1 |
| GA (real coded, selection Roulette wheel) | Mutation uniform (probability) | 0.01 |
| ACO | Initial pheromone (s0) | 1.00E−06 |
| ACO | Pheromone update constant (Q) | 20 |
| ACO | Pheromone constant (q0) | 1 |
| ACO | Global pheromone decay rate (pg) | 0.9 |
| ACO | Local pheromone decay rate (pt) | 0.5 |
| ACO | Pheromone sensitivity (a) | 1 |
| ACO | Visibility sensitivity (b) | 5 |
| ES | Lambda | 10 |
| ES | Sigma | 1 |
| PBIL | Learning rate | 0.05 |
| PBIL | Good population member | 1 |
| PBIL | Bad population member | 0 |
| PBIL | Elitism parameter | 1 |
| PBIL | Mutational probability | 0.1 |

The datasets are mapped to the [−1, +1] space with the min–max normalization method formulated in Eq. 9.

$$X' = \frac{(X - x_{\min}) \times (1 - (-1))}{x_{\max} - x_{\min}} + (-1) \tag{9}$$

where $X'$ is the mapped value, $X$ is the real value, $x_{\max}$ is the maximum value of the dataset, and $x_{\min}$ is the minimum value of the dataset.
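A minimal sketch of this min–max mapping is given below. The function name `min_max_scale`, the choice to scale one list of values at a time, and the handling of a constant column are assumptions made for illustration; the source does not state how constant attributes are treated.

```python
def min_max_scale(values, low=-1.0, high=1.0):
    """Map values linearly so that min -> low and max -> high."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant column: return the midpoint of the range (assumption)
        return [(low + high) / 2.0 for _ in values]
    scale = (high - low) / (x_max - x_min)
    return [(x - x_min) * scale + low for x in values]


print(min_max_scale([2, 4, 6]))  # [-1.0, 0.0, 1.0]
```

With low = −1 and high = 1 the factor (high − low) is the term (1 − (−1)) of the formula, and adding low corresponds to the final + (−1).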
4.1 Balloon Dataset
The Balloon dataset, about blowing up a balloon, has 4 attributes (color, size, act, and age) and 20 training/test samples (4 repeated). If the balloon is inflated, the output is 1; otherwise, the output is 0. The string input variables are converted to binary format. The color values are yellow and purple, the size values are small and large, the act values are stretch and dip, and the age values are adult and child. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/Balloons.
4.2 Iris Dataset
The Iris dataset, about the class of iris plant, has 4 attributes (sepal length, sepal width, petal length, and petal width) and 150 training/test samples. If the class is Iris Setosa, the output is −1; if the class is Iris Versicolour, the output is 0; and if the class is Iris Virginica, the output is 1. The input variables are mapped between −1 and 1 with the min–max normalization method described above. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/Iris.
4.3 Breast Cancer Dataset
The Breast Cancer dataset, about whether patients have cancer or not, has 10 attributes (id, clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses) and 599 training/100 test samples. If the cancer is benign, the output is 0; if the cancer is malignant, the output is 1. The input variables are converted to continuous variables between −1 and 1 with the min–max normalization method described above. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original).
4.4 Heart Dataset
The Heart dataset, about whether patients have heart disease or not, has 22 attributes (binary features extracted from images) and 267 training/test samples. In this work, we only use the first 80 training/test samples. If the patient is normal, the output is 0; if the patient is abnormal, the output is 1. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/spect+heart.
Table 3 The population/colony/stand sizes and iteration/function evaluation numbers

| No | Dataset name | Population size | Maximum iteration number | Maximum function evaluations | Colony size | Stand size |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | XOR6 | 50 | 250 | 12,500 | 25 | 10 |
| 2 | XOR9 | 50 | 250 | 12,500 | 25 | 10 |
| 3 | XOR13 | 50 | 250 | 12,500 | 25 | 10 |
| 4 | 3-bit Parity | 50 | 250 | 12,500 | 25 | 10 |
| 5 | 4-bit Enc. Dec | 50 | 250 | 12,500 | 25 | 10 |
| 6 | 3-bits XOR | 50 | 250 | 12,500 | 25 | 10 |
| 7 | Sigmoid | 200 | 250 | 50,000 | 100 | 50 |
| 8 | Cosine | 200 | 250 | 50,000 | 100 | 50 |
| 9 | Sine | 200 | 250 | 50,000 | 100 | 50 |
| 10 | Balloon | 50 | 250 | 12,500 | 25 | 10 |
| 11 | Iris | 200 | 250 | 50,000 | 100 | 50 |
| 12 | Breast cancer | 200 | 250 | 50,000 | 100 | 50 |
| 13 | Heart | 200 | 250 | 50,000 | 100 | 50 |
| 14 | Banknote | 50 | 250 | 12,500 | 25 | 10 |
| 15 | Diabetic | 50 | 250 | 12,500 | 25 | 10 |
| 16 | Twonorm | 50 | 250 | 12,500 | 25 | 10 |
| 17 | Ringnorm | 50 | 250 | 12,500 | 25 | 10 |
| 18 | Spambase | 50 | 250 | 12,500 | 25 | 10 |

Table 4 The training/test samples information

| No | Dataset name | Training samples | Test samples | NOTrS | NOTeS |
| --- | --- | --- | --- | --- | --- |
| 1 | XOR6 | (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) | Same as training samples | 4 | 4 |
| 2 | XOR9 | (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) | Same as training samples | 4 | 4 |
| 3 | XOR13 | (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) | Same as training samples | 4 | 4 |
| 4 | 3-bit Parity | (0 0 0; 0 0 1; 0 1 0; 0 1 1; 1 0 0; 1 0 1; 1 1 0; 1 1 1) → (0; 1; 1; 0; 1; 0; 0; 1) | Same as training samples | 8 | 8 |
| 5 | 4-bit Enc. Dec | (0 0 0 1; 0 0 1 0; 0 1 0 0; 1 0 0 0) → (0 0 0 1; 0 0 1 0; 0 1 0 0; 1 0 0 0) | Same as training samples | 4 | 4 |
| 6 | 3-bits XOR | (0 0 0; 0 0 1; 0 1 0; 0 1 1; 1 0 0; 1 0 1; 1 1 0; 1 1 1) → (0; 1; 1; 0; 1; 0; 0; 1) | Same as training samples | 8 | 8 |
| 7 | Sigmoid | x in [−3:0.1:3] | x in [−3:0.05:3] | 61 | 121 |
| 8 | Cosine | x in [1.25:0.05:2.75] | x in [1.25:0.04:2.75] | 31 | 38 |
| 9 | Sine | x in [−2π:0.1:2π] | x in [−2π:0.05:2π] | 126 | 252 |
| 10 | Balloon | The details are given in Sect. 4.1 | Same as training samples | 20 | 20 |
| 11 | Iris | The details are given in Sect. 4.2 | Same as training samples | 150 | 150 |
| 12 | Breast cancer | The details are given in Sect. 4.3 | Not the same as training samples | 599 | 100 |
| 13 | Heart | The details are given in Sect. 4.4 | Same as training samples | 80 | 80 |
| 14 | Banknote | The details are given in Sect. 4.5 | Same as training samples | 1372 | 1372 |
| 15 | Diabetic | The details are given in Sect. 4.6 | Same as training samples | 1151 | 1151 |
| 16 | Twonorm | The details are given in Sect. 4.7 | Same as training samples | 7400 | 7400 |
| 17 | Ringnorm | The details are given in Sect. 4.8 | Same as training samples | 7400 | 7400 |
| 18 | Spambase | The details are given in Sect. 4.9 | Same as training samples | 4601 | 4601 |

NOTrS: Number of training samples; NOTeS: Number of test samples

Table 5 Experimental results of the XOR6 dataset

| XOR6 | ABC | ACO | BBO | ES | GA | GWO | PBIL | PSO | TSA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 1.98E−23 | 2.40E−23 | 2.97E−29 | 6.73E−29 | 2.34E−29 | 1.05E−29 | 1.24E−29 | 1.05E−29 | 1.05E−29 |
| Best | 1.06E−29 | 1.24E−29 | 2.97E−29 | 1.06E−29 | 1.24E−29 | 1.03E−29 | 1.24E−29 | 1.03E−29 | 1.03E−29 |
| Worst | 3.99E−22 | 7.16E−22 | 2.97E−29 | 5.48E−28 | 8.41E−29 | 1.09E−29 | 1.24E−29 | 1.09E−29 | 1.09E−29 |
| SD | 7.39E−23 | 1.31E−22 | 2.28E−44 | 1.16E−28 | 2.40E−29 | 1.66E−31 | 2.85E−45 | 1.91E−31 | 1.88E−31 |
| Median | 5.10E−26 | 8.29E−29 | 2.97E−29 | 2.51E−29 | 1.24E−29 | 1.05E−29 | 1.24E−29 | 1.05E−29 | 1.05E−29 |
| Mean time | 1.3564 | 1.6653 | 1.3946 | 1.4546 | 1.8817 | 1.4412 | 0.7653 | 2.2254 | 2.5986 |
| Friedman rank | 8.4 | 7.8 | 6.6 | 5.8 | 5.5 | 1.9 | 4.7 | 2.3 | 1.9 |
| Manual rank | 2 | 3 | 4 | 2 | 3 | 1 | 3 | 1 | 1 |
| Wilcoxon | 1.92E−06 | 1.73E−06 | 1.73E−06 | 2.13E−06 | 1.73E−06 | 4.91E−01 | 1.73E−06 | 2.06E−01 | 0.00E+00 |
| Classification rate (%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

4.5 Banknote Authentication Dataset
The Banknote authentication dataset is related to whether a banknote is valid or invalid. It has four continuous attributes (variance of wavelet transformed image, skewness of wavelet transformed image, curtosis of wavelet transformed image, and entropy of image) and 1372 training/test samples. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/banknote+authentication.
4.6 Diabetic Retinopathy Debrecen Dataset
The Diabetic Retinopathy Debrecen dataset contains information about whether people have diabetic retinopathy or not. It has 19 continuous and integer attributes and 1151 training/test samples. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/Diabetic+Retinopathy+Debrecen+Data+Set.
4.7 Twonorm Dataset
The Twonorm dataset is an artificial dataset that has 20 continuous attributes and 7400 training/test samples. The files related to the dataset can be found at https://www.cs.toronto.edu/~delve/data/twonorm/desc.html.
4.8 Ringnorm Dataset
The Ringnorm dataset is an artificial dataset that has 20 continuous attributes and 7400 training/test samples. The files related to the dataset can be found at https://www.cs.toronto.edu/~delve/data/ringnorm/desc.html.
4.9 Spambase Dataset
The Spambase dataset is about classifying emails as spam or not. It has 57 continuous or integer attributes and 4601 training/test samples. The files related to the dataset can be found at https://archive.ics.uci.edu/ml/datasets/Spambase/.
5 Results and Discussion
All obtained results and the discussion of these results are presented in this section. The best training results and the best classification rates are highlighted with bold text and italic background in Tables 5–17 and 21–24. Statistical tests are important for determining whether the differences between the obtained results are significant. In this work, two different statistical tests are conducted: the Wilcoxon signed rank test and Friedman's test. The results obtained from 30 runs are used in these tests. The significance level is taken as 5% (0.05), and the p values of the Wilcoxon signed rank test and the mean rank values of Friedman's test are given in Tables 5–17 and Tables 21–24. The large datasets, which have more than 1000 training/test samples, are discussed in Sect. 5.3.
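To make the Friedman mean rank concrete, the sketch below ranks the algorithms within each run (rank 1 for the lowest error, ties sharing the average rank) and then averages the ranks over the runs. The function names and the tiny error matrix are hypothetical and are not results from this study; the sketch computes only the mean ranks, not the test statistic. The Wilcoxon p values would normally come from a statistics library, for example SciPy's `scipy.stats.wilcoxon`.

```python
def average_ranks(row):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(row)), key=row.__getitem__)
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def friedman_mean_ranks(results):
    """results[run][algorithm] -> mean rank of each algorithm over the runs."""
    n_runs = len(results)
    totals = [0.0] * len(results[0])
    for row in results:
        for a, rank in enumerate(average_ranks(row)):
            totals[a] += rank
    return [total / n_runs for total in totals]


# Hypothetical training errors: 3 runs (rows) x 3 algorithms (columns)
errors = [
    [0.10, 0.30, 0.20],
    [0.20, 0.20, 0.40],
    [0.05, 0.50, 0.10],
]
print(friedman_mean_ranks(errors))
```

A lower mean rank indicates an algorithm that is more often among the best within individual runs, which is why it can disagree with a ranking based on the mean error alone.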
The experimental results for the XOR6 dataset are given in Table 5. TSA, PSO, and GWO share the first position in terms of mean training error results. The classification rate is 100% for all methods. Therefore, XOR6 is not a discriminating problem, i.e., it does not separate the algorithms.
The experimental results for the XOR9 dataset are given in Table 6. ABC, BBO, GA, PBIL, and GWO share the first position in terms of best training error results (the manual rank in Table 6). The classification rate is 100% for all methods. Therefore, XOR9 is not a discriminating problem.
The experimental results for the XOR13 dataset are given in Table 7. ABC, BBO, GA, and GWO share the first position in terms of best training error results (the manual rank in Table 7). The classification rate is 100% for all methods. Therefore, XOR13 is not a discriminating problem.
Table 6 Experimental results of the XOR9 dataset

| XOR9 | ABC | ACO | BBO | ES | GA | GWO | PBIL | PSO | TSA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 2.70E−09 | 2.30E−06 | 2.69E−09 | 1.86E−05 | 2.69E−09 | 2.89E−06 | 3.23E−09 | 1.75E−08 | 2.94E−09 |
| Best | 2.69E−09 | 1.74E−08 | 2.69E−09 | 3.11E−09 | 2.69E−09 | 2.69E−09 | 2.69E−09 | 2.70E−09 | 2.70E−09 |
| Worst | 2.71E−09 | 2.58E−05 | 2.69E−09 | 0.00028 | 2.69E−09 | 5.12E−05 | 5.75E−09 | 1.15E−07 | 3.67E−09 |
| SD | 3.03E−12 | 4.84E−06 | 4.21E−25 | 5.34E−05 | 1.08E−24 | 1.12E−05 | 8.34E−10 | 2.41E−08 | 2.60E−10 |
| Median | 2.69E−09 | 8.89E−07 | 2.69E−09 | 7.10E−07 | 2.69E−09 | 2.70E−09 | 2.69E−09 | 9.30E−09 | 2.83E−09 |
| Mean time | 1.4393 | 1.4955 | 1.4015 | 1.4390 | 1.8169 | 1.4570 | 0.7771 | 2.2102 | 2.7048 |
| Friedman rank | 3.9 | 8.4 | 1.7 | 8.2 | 1.7 | 4.6 | 4.0 | 6.8 | 5.6 |
| Manual rank | 1 | 4 | 1 | 3 | 1 | 1 | 1 | 2 | 2 |
| Wilcoxon | 1.73E−06 | 1.73E−06 | 1.73E−06 | 1.73E−06 | 1.73E−06 | 6.16E−04 | 6.58E−01 | 4.73E−06 | 0.00E+00 |
| Classification rate (%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 7 Experimental results of the XOR13 dataset

| XOR13 | ABC | ACO | BBO | ES | GA | GWO | PBIL | PSO | TSA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 1.43E−13 | 2.20E−07 | 1.40E−13 | 1.12E−06 | 1.73E−09 | 4.51E−08 | 5.95E−10 | 1.32E−09 | 6.72E−10 |
| Best | 1.40E−13 | 4.49E−11 | 1.40E−13 | 6.51E−10 | 1.40E−13 | 1.40E−13 | 1.52E−13 | 2.54E−13 | 1.55E−13 |
| Worst | 1.59E−13 | 1.72E−06 | 1.40E−13 | 2.24E−05 | 2.16E−08 | 6.81E−07 | 3.96E−09 | 2.32E−08 | 5.21E−09 |
| SD | 4.41E−15 | 3.82E−07 | 2.57E−29 | 4.15E−06 | 4.17E−09 | 1.61E−07 | 1.03E−09 | 4.57E−09 | 1.50E−09 |
| Median | 1.41E−13 | 5.03E−08 | 1.40E−13 | 2.28E−08 | 1.40E−13 | 1.42E−13 | 6.08E−13 | 1.34E−11 | 3.92E−13 |
| Mean time | 2.0267 | 2.1107 | 2.0286 | 1.8804 | 2.4000 | 2.0502 | 0.9473 | 2.7743 | 3.7753 |
| Friedman rank | 2.8 | 8.3 | 1.4 | 8.1 | 3.5 | 4.6 | 5.2 | 5.8 | 5.3 |
| Manual rank | 1 | 5 | 1 | 6 | 1 | 1 | 2 | 4 | 3 |
| Wilcoxon | 1.73E−06 | 2.60E−06 | 1.73E−06 | 3.88E−06 | 5.30E−01 | 9.26E−01 | 8.13E−01 | 1.92E−01 | 0.00E+00 |
| Classification rate (%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 8 Experimental results of the 3-bit Parity dataset

| 3-bit Parity | ABC | ACO | BBO | ES | GA | GWO | PBIL | PSO | TSA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 1.49E−13 | 2.74E−05 | 1.44E−13 | 2.00E−05 | 6.77E−10 | 1.39E−09 | 1.13E−09 | 3.34E−07 | 3.87E−09 |
| Best | 1.32E−13 | 5.41E−08 | 1.44E−13 | 2.43E−09 | 1.40E−13 | 1.34E−13 | 6.04E−13 | 7.00E−12 | 4.01E−13 |
| Worst | 2.24E−13 | 0.00017 | 1.44E−13 | 2.99E−05 | 1.41E−09 | 2.67E−09 | 1.73E−08 | 9.13E−06 | 2.44E−08 |
| SD | 2.54E−14 | 4.03E−05 | 2.57E−29 | 1.26E−05 | 6.99E−10 | 1.12E−09 | 3.51E−09 | 1.66E−06 | 5.78E−09 |
| Median | 1.37E−13 | 7.79E−06 | 1.44E−13 | 2.99E−05 | 3.58E−10 | 1.40E−09 | 3.42E−12 | 1.43E−08 | 2.07E−09 |
| Mean time | 1.9296 | 2.3495 | 2.0153 | 2.0680 | 2.5710 | 2.2769 | 1.0279 | 2.7993 | 4.0024 |
| Friedman rank | 1.6 | 8.5 | 2.2 | 8.4 | 3.6 | 4.5 | 4.2 | 6.6 | 5.4 |
| Manual rank | 1 | 9 | 4 | 8 | 3 | 2 | 6 | 7 | 5 |
| Wilcoxon | 1.73E−06 | 1.73E−06 | 1.73E−06 | 1.73E−06 | 4.11E−03 | 6.27E−02 | 6.64E−04 | 3.32E−04 | 0.00E+00 |
| Classification rate (%) | 50.00 | 62.50 | 50.00 | 62.50 | 50.00 | 50.00 | 62.50 | 62.50 | 50.00 |

The experimental results for the 3-bit Parity dataset are given in Table 8. ABC is the best in terms of best training error results (the manual rank in Table 8). But the classification rate of ABC is 50%. ACO, ES, PBIL, and PSO have the same classification rate (62.5%). The best trained model does not necessarily produce the best classification accuracy.
The experimental results for the 4-bit Encoder Decoder dataset are given in Table 9. All algorithms are trapped in the same local minimum, and all of them produce the same classification rate. Thus, the 4-bit Encoder Decoder is not a discriminating problem. This problem has four output values; therefore, our model may not be appropriate for solving the 4-bit Encoder Decoder problem.
The experimental results for the 3-bits XOR dataset are
given in Table 10. GA is the best in terms of mean training
Table 9 Experimental results of the 4-bit Encoder Decoder dataset
4-bit Encoder Decoder ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.23E−02 6.23E−02 6.23E−02 6.25E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02
Best 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02
Worst 6.23E−02 6.24E−02 6.23E−02 6.37E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02
SD 1.09E−07 2.31E−05 2.82E−17 3.76E−04 1.40E−05 1.60E−07 1.18E−05 3.85E−06 2.32E−06
Median 6.23E−02 6.23E−02 6.23E−02 6.24E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02 6.23E−02
Mean time 2.2708 2.9314 2.4063 2.4090 2.7637 2.7607 1.1166 3.1452 4.7394
Friedman rank 2.8 8.0 1.5 8.5 5.2 2.3 7.3 5.3 4.1
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 3.41E−05 1.73E−06 1.73E−06 1.73E−06 2.41E−03 9.71E−05 1.73E−06 1.04E−03 0.00E+00
Classification rate (%) 25 25 25 25 25 25 25 25 25
Table 10 Experimental results of the 3-bits XOR dataset
3-bits XOR ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.97E−17 2.63E−06 3.14E−30 9.28E−10 7.43E−20 7.11E−15 3.84E−12 7.37E−10 1.15E−10
Best 5.93E−20 2.41E−09 3.14E−30 3.40E−15 1.36E−30 3.68E−26 4.83E−19 4.81E−13 1.94E−18
Worst 9.91E−17 2.09E−05 3.14E−30 1.38E−09 2.23E−18 1.35E−13 1.13E−10 1.37E−08 2.75E−09
SD 2.76E−17 4.62E−06 2.14E−45 3.99E−10 4.07E−19 2.78E−14 2.06E−11 2.53E−09 5.06E−10
Median 1.70E−17 5.28E−07 3.14E−30 6.91E−10 1.88E−25 5.64E−22 1.44E−15 4.22E−11 8.68E−14
Mean time 3.1274 4.3180 3.4139 3.4170 3.5980 4.1621 1.3527 3.8864 7.0275
Friedman rank 3.9 9.0 1.1 7.8 2.0 3.4 5.1 7.0 5.7
Manual rank 4 9 2 7 1 3 5 8 6
Wilcoxon 2.35E−06 1.73E−06 1.73E−06 3.41E−05 1.73E−06 5.31E−05 1.48E−02 1.96E−03 0.00E+00
Classification rate (%) 100 100 100 100 100 100 100 100 100
Table 11 Experimental results of the Sigmoid dataset
Sigmoid ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.46E−01 2.48E−01 2.46E−01 2.49E−01 2.47E−01 2.47E−01 2.47E−01 2.47E−01 2.47E−01
Best 2.46E−01 2.47E−01 2.46E−01 2.47E−01 2.46E−01 2.46E−01 2.47E−01 2.47E−01 2.47E−01
Worst 2.46E−01 2.51E−01 2.46E−01 2.53E−01 2.49E−01 2.47E−01 2.48E−01 2.48E−01 2.48E−01
SD 2.15E−05 1.13E−03 8.47E−17 1.20E−03 5.79E−04 1.67E−04 3.04E−04 3.26E−04 3.47E−04
Median 2.46E−01 2.48E−01 2.46E−01 2.49E−01 2.47E−01 2.46E−01 2.47E−01 2.47E−01 2.47E−01
Mean time 95.9492 34.0330 35.7778 31.2610 31.3496 35.0565 23.0235 47.3332 39.8589
Friedman rank 1.1 7.9 2.6 8.6 4.7 2.8 5.5 5.8 6.2
Manual rank 1 2 1 2 1 1 2 2 2
Wilcoxon 1.73E−06 2.60E−05 1.73E−06 2.60E−06 5.67E−03 3.18E−06 1.75E−02 2.99E−01 0.00E+00
Classification rate (%) 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
error results. The classification rate is 100% for all methods.
Therefore, 3-bits XOR is not an identifier problem.
The experimental results for the Sigmoid dataset are given
in Table 11. ABC, BBO, GA, and GWO share the first posi-
tion in terms of mean training error results. The classification
rate is 100% for all methods except ACO. Therefore, Sigmoid
is not an identifier problem.
The experimental results for the Cosine dataset are given
in Table 12. GWO and ABC share the best position in terms
of mean training error results. PSO is the best in terms of the
classification rate (100%). Cosine is an identifier problem
because the algorithms produce different results.
The experimental results for the Sine dataset are given in
Table 13. ABC is in the best position in terms of mean training
error results. GA is the best in terms of the classification. Sine
Table 12 Experimental results of the Cosine dataset
Cosine ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.77E−01 1.86E−01 1.78E−01 1.95E−01 1.79E−01 1.76E−01 1.81E−01 1.82E−01 1.81E−01
Best 1.76E−01 1.79E−01 1.78E−01 1.82E−01 1.77E−01 1.76E−01 1.78E−01 1.78E−01 1.78E−01
Worst 1.77E−01 1.97E−01 1.78E−01 2.13E−01 1.83E−01 1.77E−01 1.85E−01 1.85E−01 1.83E−01
SD 3.48E−04 4.46E−03 5.65E−17 8.32E−03 1.51E−03 4.70E−04 1.82E−03 1.53E−03 1.34E−03
Median 1.76E−01 1.86E−01 1.78E−01 1.95E−01 1.78E−01 1.76E−01 1.81E−01 1.81E−01 1.81E−01
Mean time 88.9853 29.4610 31.5415 26.7391 26.9503 30.9972 18.8338 43.6489 33.0181
Friedman rank 1.7 7.7 3.6 8.7 3.7 1.3 5.7 6.6 5.9
Manual rank 1 4 3 5 2 1 3 3 3
Wilcoxon 1.73E−06 1.24E−05 2.13E−06 1.73E−06 1.92E−06 1.73E−06 9.26E−01 1.48E−02 0.00E+00
Classification rate (%) 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 84.21
Table 13 Experimental results of the Sine dataset
Sine ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 4.13E−01 4.52E−01 4.41E−01 4.49E−01 4.54E−01 4.53E−01 4.35E−01 4.40E−01 4.46E−01
Best 4.00E−01 4.43E−01 4.41E−01 4.30E−01 4.45E−01 4.44E−01 4.20E−01 4.22E−01 4.34E−01
Worst 4.27E−01 4.56E−01 4.41E−01 4.56E−01 4.60E−01 4.56E−01 4.47E−01 4.52E−01 4.53E−01
SD 6.20E−03 4.12E−03 1.13E−16 7.51E−03 3.99E−03 1.86E−03 6.38E−03 7.42E−03 5.31E−03
Median 4.14E−01 4.54E−01 4.41E−01 4.52E−01 4.55E−01 4.53E−01 4.35E−01 4.42E−01 4.48E−01
Mean time 88.4847 36.3617 38.3013 33.9895 34.4315 36.8322 26.4933 48.3780 42.7450
Friedman rank 1.0 7.0 3.6 6.4 8.1 7.2 2.7 3.8 5.1
Manual rank 1 7 6 4 9 8 2 3 5
Wilcoxon 1.73E−06 2.61E−04 1.15E−04 7.52E−02 2.16E−05 2.60E−06 1.64E−05 3.38E−03 0.00E+00
Classification rate (%) 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 57.94
Table 14 Experimental results of the Balloon dataset
Balloon ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.22E−17 4.13E−07 2.22E−31 8.22E−10 3.15E−23 7.44E−19 1.65E−13 1.33E−10 8.46E−12
Best 1.39E−20 3.99E−10 2.22E−31 4.68E−14 3.40E−40 1.10E−31 4.10E−21 1.43E−15 2.08E−17
Worst 5.68E−16 4.51E−06 2.22E−31 1.55E−09 9.45E−22 8.07E−18 1.87E−12 1.19E−09 1.73E−10
SD 1.09E−16 8.96E−07 4.45E−47 3.86E−10 1.72E−22 2.13E−18 4.45E−13 2.44E−10 3.25E−11
Median 3.62E−17 8.62E−08 2.22E−31 1.03E−09 9.86E−34 1.05E−22 2.55E−15 3.11E−11 6.73E−14
Mean time 10.3630 6.2161 4.8502 4.9285 4.5307 6.0645 1.9197 4.9773 12.5299
Friedman rank 4.3 8.9 1.8 8.0 1.3 2.9 5.1 6.9 5.8
Manual rank 5 9 3 8 1 2 4 7 6
Wilcoxon 2.35E−06 1.73E−06 1.73E−06 1.92E−06 1.73E−06 1.73E−06 6.04E−03 1.97E−05 0.00E+00
Classification rate (%) 50 0 50 50 50 50 50 50 0
is an identifier problem because every algorithm produces
different results.
The experimental results for the Balloon dataset are given
in Table 14. GA is in the best position in terms of mean
training error results. The best trained models of ACO and
TSA cannot classify the test data (0%). ABC, BBO, ES, GA,
GWO, PBIL, and PSO have a 50% classification rate. Balloon
is an identifier problem because the algorithms produce
different results.
The experimental results for the Iris dataset are given in
Table 15. All algorithms except ACO and GA are trapped in the
same local minimum. ABC is the best in terms of the
classification rate. This problem has three output values;
therefore, our model is not appropriate for solving the Iris
problem.
Table 15 Experimental results of the Iris dataset
Iris ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.50E−01 2.84E−01 2.50E−01 2.51E−01 3.52E−01 2.50E−01 2.50E−01 2.51E−01 2.52E−01
Best 2.50E−01 2.54E−01 2.50E−01 2.50E−01 2.54E−01 2.50E−01 2.50E−01 2.50E−01 2.50E−01
Worst 2.51E−01 3.32E−01 2.50E−01 2.54E−01 4.91E−01 2.50E−01 2.51E−01 2.55E−01 2.56E−01
SD 1.64E−04 2.02E−02 1.69E−16 8.61E−04 7.18E−02 2.90E−05 1.43E−04 1.03E−03 1.58E−03
Median 2.50E−01 2.84E−01 2.50E−01 2.50E−01 3.19E−01 2.50E−01 2.50E−01 2.51E−01 2.51E−01
Mean time 255.9866 37.5246 42.3152 33.8377 33.6497 41.2012 20.9516 49.5602 56.7811
Friedman rank 3.5 8.2 1.5 5.0 8.8 1.8 3.6 6.2 6.5
Manual rank 1 2 1 1 2 1 1 1 1
Wilcoxon 5.22E−06 1.92E−06 1.73E−06 4.86E−05 1.73E−06 1.73E−06 2.35E−06 8.97E−02 0.00E+00
Classification rate (%) 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 20.00
Table 16 Experimental results of the Breast Cancer dataset
Breast Cancer ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 9.36E−03 1.10E−02 2.35E−03 3.76E−02 1.82E−03 1.34E−03 2.55E−02 2.76E−02 2.92E−02
Best 3.95E−03 9.28E−03 2.35E−03 3.49E−02 1.18E−03 1.14E−03 1.24E−02 2.11E−02 1.67E−02
Worst 1.56E−02 1.41E−02 2.35E−03 4.07E−02 8.17E−03 1.55E−03 3.28E−02 3.49E−02 3.39E−02
SD 2.89E−03 1.96E−03 8.82E−19 1.59E−03 1.32E−03 1.26E−04 4.92E−03 3.18E−03 3.45E−03
Median 9.23E−03 9.64E−03 2.35E−03 3.74E−02 1.46E−03 1.35E−03 2.62E−02 2.76E−02 2.96E−02
Mean time 592.9885 178.6869 200.6069 181.4763 170.7403 208.1356 145.6965 189.3432 232.0039
Friedman rank 4.3 4.7 2.9 9.0 1.8 1.4 6.8 6.8 7.4
Manual rank 4 5 3 9 2 1 6 8 7
Wilcoxon 1.73E−06 1.73E−06 1.73E−06 1.73E−06 1.73E−06 1.73E−06 7.27E−03 2.18E−02 0.00E+00
Classification rate (%) 0 100 100 95 0 0 52 1 77
Table 17 Experimental results of the Heart dataset
Heart ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.49E−25 1.57E−17 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Best 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Worst 1.94E−23 4.70E−16 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
SD 3.55E−24 8.59E−17 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Median 1.12E−33 4.48E−33 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Mean time 562.8594 462.9802 636.0618 466.6571 457.8536 632.8333 225.8350 494.0318 356.5683
Friedman rank 6.7 6.8 4.5 4.5 4.5 4.5 4.5 4.5 4.5
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 6.10E−05 6.10E−05 1.00E+00 1.00E+00 1.00E+00 1.00E+00 1.00E+00 1.00E+00 0.00E+00
Classification rate (%) 100 0 100 100 59.09 100 0 0 100
The experimental results for the Breast Cancer dataset
are given in Table 16. GWO is the best in terms of mean
training error results, but its trained model cannot classify
the test data. ACO and BBO share the best position in terms
of the classification rate. According to these results, Breast
Cancer is a challenging problem.
The experimental results for the Heart dataset are given in
Table 17. All algorithms achieve zero training error in their
best run, but the best trained models of ACO, PBIL, and PSO
cannot classify the test data. The classification rate of GA
is 59.09%. ABC, BBO, ES, GWO, and TSA classify the test data
successfully.
Table 18 The manual ranks overview
Manual ranks ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 2 3 4 2 3 1 3 1 1
XOR9 1 4 1 3 1 1 1 2 2
XOR13 1 5 1 6 1 1 2 4 3
3-bit Parity 1 9 4 8 3 2 6 7 5
4-bit Enc. Dec 1 1 1 1 1 1 1 1 1
3-bits XOR 4 9 2 7 1 3 5 8 6
Sigmoid 1 2 1 2 1 1 2 2 2
Cosine 1 4 3 5 2 1 3 3 3
Sine 1 7 6 4 9 8 2 3 5
Balloon 5 9 3 8 1 2 4 7 6
Iris 1 2 1 1 1 1 1 1 1
Breast cancer 4 5 3 9 2 1 6 8 7
Heart 1 1 1 1 1 1 1 1 1
Total ranks 24 61 31 57 27 24 37 48 43
Table 19 The Friedman ranks overview
Friedman ranks ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 8.4 7.8 6.6 5.8 5.5 1.9 4.7 2.3 1.9
XOR9 3.9 8.4 1.7 8.2 1.7 4.6 4.0 6.8 5.6
XOR13 2.8 8.3 1.4 8.1 3.5 4.6 5.2 5.8 5.3
3-bit Parity 1.6 8.5 2.2 8.4 3.6 4.5 4.2 6.6 5.4
4-bit Enc. Dec 2.8 8.0 1.5 8.5 5.2 2.3 7.3 5.3 4.1
3-bits XOR 3.9 9.0 1.1 7.8 2.0 3.4 5.1 7.0 5.7
Sigmoid 1.1 7.9 2.6 8.6 4.7 2.8 5.5 5.8 6.2
Cosine 1.7 7.7 3.6 8.7 3.7 1.3 5.7 6.6 5.9
Sine 1.0 7.0 3.6 6.4 8.1 7.2 2.7 3.8 5.1
Balloon 4.3 8.9 1.8 8.0 1.3 2.9 5.1 6.9 5.8
Iris 3.3 8.6 1.0 8.1 4.0 2.0 7.1 5.9 5.0
Breast cancer 4.3 4.7 2.9 9.0 1.8 1.4 6.8 6.8 7.4
Heart 6.7 6.8 4.5 4.5 4.5 4.5 4.5 4.5 4.5
Total ranks 45.9 101.6 34.4 100.1 49.6 43.5 68.0 74.0 68.1
General rank 3 9 1 8 4 2 5 7 6
Table 20 The classification rates overview
ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 100 100 100 100 100 100 100 100 100
XOR9 100 100 100 100 100 100 100 100 100
XOR13 100 100 100 100 100 100 100 100 100
3-bit Parity 50.00 62.50 50.00 62.50 50.00 50.00 62.50 62.50 50.00
4-bit Enc. Dec 25 25 25 25 25 25 25 25 25
3-bits XOR 100 100 100 100 100 100 100 100 100
Sigmoid 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Cosine 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 84.21
Sine 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 57.94
Balloon 50 0 50 50 50 50 50 50 0
Iris 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 20.00
Breast cancer 0 100 100 95 0 0 52 1 77
Heart 100 0 100 100 59.0909 100 0 0 100
Mean CR 69.4 63.3 75.8 75.2 65.3 68.5 66.0 61.8 70.3
Table 21 Experimental results of the Balloon dataset for TSA
Balloon N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 4.08E−10 1.26E−10 7.19E−11 5.27E−09 5.17E−09 5.97E−09 2.67E−08 1.71E−08 2.62E−08
Best 9.43E−19 9.81E−21 4.95E−20 1.51E−16 7.38E−14 2.79E−15 3.75E−14 1.01E−14 5.59E−14
Worst 6.51E−09 3.23E−09 1.95E−09 7.56E−08 3.41E−08 1.57E−07 3.29E−07 2.01E−07 1.99E−07
SD 1.47E−09 5.91E−10 3.56E−10 1.44E−08 1.01E−08 2.86E−08 6.56E−08 4.31E−08 5.94E−08
Median 8.42E−14 4.81E−15 8.00E−14 4.06E−10 6.68E−10 5.13E−11 2.41E−09 4.89E−10 9.13E−10
Mean time 8.0720 8.0343 8.7865 5.0810 4.9332 4.7059 4.6307 4.5278 4.2975
Friedman rank 2.8 2.4 2.5 5.9 6.2 5.1 7.3 6.5 6.3
Manual rank 3 1 2 4 9 5 7 6 8
Wilcoxon 0.00E+00 2.54E−01 4.05E−01 8.19E−05 1.80E−05 6.84E−03 9.32E−06 1.24E−05 1.74E−04
Classification rate % 50.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00 50.00
Table 22 Experimental results of the Iris dataset for TSA
Iris N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 2.51E−01 2.51E−01 2.51E−01 2.55E−01 2.54E−01 2.53E−01 2.54E−01 2.53E−01 2.53E−01
Best 2.50E−01 2.50E−01 2.50E−01 2.51E−01 2.50E−01 2.50E−01 2.50E−01 2.50E−01 2.50E−01
Worst 2.54E−01 2.56E−01 2.55E−01 2.69E−01 2.69E−01 2.66E−01 2.76E−01 2.64E−01 2.58E−01
SD 8.92E−04 1.15E−03 1.23E−03 5.24E−03 3.94E−03 3.42E−03 4.71E−03 3.30E−03 2.25E−03
Median 2.50E−01 2.50E−01 2.50E−01 2.52E−01 2.52E−01 2.52E−01 2.53E−01 2.52E−01 2.52E−01
Mean time 122.8270 122.7315 107.2986 42.7832 41.5848 41.1891 31.9500 31.3707 30.3779
Friedman rank 2.6 2.8 3.0 6.5 5.9 6.1 6.3 5.8 5.9
Manual rank 1 1 1 2 1 1 1 1 1
Wilcoxon 0.00E+00 6.58E−01 7.19E−01 1.97E−05 2.16E−05 3.72E−05 4.29E−06 4.86E−05 1.64E−05
Classification rate % 16.67 27.33 9.33 30.00 20.67 7.33 14.00 30.67 4.00
Table 23 Experimental results of the Cancer dataset for TSA
Cancer N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 2.55E−02 2.49E−02 2.54E−02 3.05E−02 2.89E−02 2.93E−02 3.03E−02 3.08E−02 3.00E−02
Best 1.62E−02 1.50E−02 1.25E−02 2.44E−02 1.51E−02 2.43E−02 2.24E−02 2.17E−02 2.19E−02
Worst 3.39E−02 3.44E−02 3.31E−02 3.47E−02 3.41E−02 3.44E−02 3.59E−02 3.62E−02 3.66E−02
SD 4.25E−03 4.32E−03 5.16E−03 2.63E−03 4.34E−03 3.02E−03 3.64E−03 3.61E−03 3.69E−03
Median 2.58E−02 2.54E−02 2.60E−02 3.04E−02 2.98E−02 2.91E−02 3.08E−02 3.21E−02 3.06E−02
Mean time 442.0449 337.4356 362.2962 231.0281 253.7927 311.0903 217.8359 213.5434 227.1923
Friedman rank 3.3 2.9 3.7 6.4 5.5 5.3 5.9 6.1 6.0
Manual rank 4 2 1 9 3 8 7 5 6
Wilcoxon 0.00E+00 5.58E−01 8.61E−01 3.41E−05 1.71E−03 1.38E−03 1.15E−04 1.15E−04 3.06E−04
Classification rate % 77.00 74.00 0.00 89.00 77.00 81.00 76.00 77.00 87.00
Table 24 Experimental results of the 3-bit parity dataset for TSA
3-bit Parity N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 1.70E−09 1.99E−06 2.06E−08 4.60E−06 1.66E−06 8.88E−07 1.39E−05 8.09E−06 6.14E−06
Best 6.41E−13 8.06E−13 8.19E−13 1.17E−11 1.54E−11 7.66E−11 5.49E−11 2.68E−09 4.83E−12
Worst 5.34E−09 5.95E−05 5.28E−07 5.17E−05 1.44E−05 1.89E−05 7.07E−05 8.19E−05 5.04E−05
SD 1.85E−09 1.09E−05 9.60E−08 1.15E−05 3.29E−06 3.54E−06 2.18E−05 1.85E−05 1.29E−05
Median 1.26E−09 2.52E−09 2.56E−09 2.28E−08 8.53E−08 1.66E−08 1.76E−06 4.26E−07 4.15E−07
Mean time 1.8920 1.8433 1.8087 1.2872 1.2497 1.1923 1.2163 1.2349 1.1735
Friedman rank 2.2 2.8 2.8 5.5 5.5 5.2 7.2 7.0 6.9
Manual rank 1 2 3 5 6 8 7 9 4
Wilcoxon 0.00E+00 1.59E−01 5.71E−02 4.29E−06 6.34E−06 6.98E−06 2.13E−06 1.73E−06 2.13E−06
Classification rate % 50.00 75.00 50.00 75.00 50.00 50.00 50.00 62.50 75.00
Table 25 Experimental results of the 4-bit Encoder Decoder dataset for TSA
4-bit Encoder Decoder N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 2.06E−02 2.05E−02 2.06E−02 2.05E−02 2.06E−02 2.05E−02 2.07E−02 2.06E−02 2.06E−02
Best 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02
Worst 2.23E−02 2.11E−02 2.17E−02 2.09E−02 2.18E−02 2.06E−02 2.20E−02 2.15E−02 2.10E−02
SD 3.51E−04 1.11E−04 2.65E−04 8.37E−05 2.64E−04 1.42E−05 3.22E−04 1.85E−04 1.16E−04
Median 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02 2.05E−02
Mean time 2.5172 2.3700 2.3480 1.8992 1.8609 1.7707 1.8014 1.7484 1.7475
Friedman rank 3.6 3.1 3.4 5.4 5.2 5.1 6.9 6.3 6.0
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 0.00E+00 1.85E−01 6.88E−01 3.00E−02 3.68E−02 3.16E−02 8.31E−04 3.61E−03 9.84E−03
Classification rate % 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
For overall analysis, the manual ranks overview is given in
Table 18, the Friedman ranks overview is given in Table 19,
and the classification rates overview is given in Table 20.
ABC and GWO have the same total rank value in terms
of mean training error results. BBO is the best in terms of
Friedman rank and classification rate values. Every dataset
has a different type of search space. An algorithm that solves
one problem well may not solve another problem equally well.
This situation is formalized by Wolpert and Macready [27] in
the no free lunch theorems for optimization. The mean
classification rate of the TSA is 70.3%. This value is a
competitive result. In these experiments, we use fixed stand
sizes (10 and 50) and ST (0.1) for TSA. These two peculiar
parameters affect the results of the algorithm. In the next
experiment, we analyze the different stand sizes and ST values
for identifier datasets (Balloon, Iris, Cancer, Parity, EncDec,
Cosine, Sine).
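The Friedman rank values reported in the tables can be read as mean ranks over the independent runs. The following is a minimal sketch of that calculation, assuming that the algorithms are ranked by training error within each run (1 is the smallest error) and that tied values share the average rank. The function names and the error values are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def friedman_mean_ranks(errors_by_run):
    """Average the per-run ranks of each algorithm over all runs."""
    n_algorithms = len(errors_by_run[0])
    totals = [0.0] * n_algorithms
    for run in errors_by_run:
        for a, rank in enumerate(average_ranks(run)):
            totals[a] += rank
    return [t / len(errors_by_run) for t in totals]


# Hypothetical training errors: 3 runs (rows) of 3 algorithms (columns).
demo_runs = [[0.1, 0.3, 0.2],
             [0.2, 0.2, 0.5],
             [0.1, 0.4, 0.3]]
print(friedman_mean_ranks(demo_runs))
```

A lower mean rank indicates an algorithm that reaches lower training errors more consistently across runs.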
5.1 The Parameter Adjustment for TSA
In this section, we adjust the peculiar parameters of TSA
for the 7 identifier datasets (Balloon, Iris, Cancer, Parity,
EncDec, Cosine, and Sine). In the experiments, we use 10, 50,
and 100 as stand sizes (N) and 0.1, 0.5, and 0.9 as ST values.
The base method for the Wilcoxon signed rank test is N = 10
and ST = 0.1.
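The parameter grid described above can be enumerated as follows. This is only an illustrative sketch: the constant names are hypothetical, and the pairing of every variant with the base method reflects the statement that N = 10 with ST = 0.1 is the reference for the Wilcoxon signed rank test.

```python
from itertools import product

STAND_SIZES = (10, 50, 100)          # stand sizes (N) listed in the text
SEARCH_TENDENCIES = (0.1, 0.5, 0.9)  # ST values listed in the text
BASELINE = (10, 0.1)                 # reference variant for the Wilcoxon test

# All nine (N, ST) combinations, in the column order used by the tables.
variants = list(product(STAND_SIZES, SEARCH_TENDENCIES))

# Every variant other than the baseline is compared against the baseline.
comparisons = [(BASELINE, v) for v in variants if v != BASELINE]

for n, st in variants:
    print(f"N = {n}, ST = {st}")
```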
The experimental results of the Balloon dataset for the
TSA are given in Table 21. The N = 10 and ST = 0.5 variant is
the best in terms of mean training error results. All variants
except N = 100 and ST = 0.5 have the same classification
rate (50%).
Table 26 Experimental results of the Cosine dataset for TSA
Cosine N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 1.80E−01 1.80E−01 1.81E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01
Best 1.77E−01 1.77E−01 1.78E−01 1.78E−01 1.78E−01 1.78E−01 1.78E−01 1.77E−01 1.79E−01
Worst 1.83E−01 1.83E−01 1.86E−01 1.89E−01 1.85E−01 1.86E−01 1.87E−01 1.87E−01 1.85E−01
SD 1.56E−03 1.64E−03 1.88E−03 2.13E−03 1.73E−03 1.98E−03 2.46E−03 2.12E−03 1.69E−03
Median 1.80E−01 1.80E−01 1.81E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01 1.82E−01
Mean time 73.7480 66.3652 60.0203 36.9796 55.8863 50.7422 27.1082 27.2170 25.9935
Friedman rank 3.7 4.0 4.2 5.0 5.7 5.6 5.9 5.2 5.5
Manual rank 1 1 2 2 2 2 2 1 3
Wilcoxon 0.00E+00 7.81E−01 3.39E−01 1.04E−02 1.20E−03 2.96E−03 3.32E−04 2.30E−02 5.32E−03
Classification rate % 36.84 36.84 100.00 36.84 36.84 100.00 44.74 100.00 36.84
Table 27 Experimental results of the Sine dataset for TSA
Sine N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Mean 4.44E−01 4.43E−01 4.40E−01 4.44E−01 4.44E−01 4.44E−01 4.46E−01 4.45E−01 4.43E−01
Best 4.29E−01 4.23E−01 4.23E−01 4.17E−01 4.27E−01 4.29E−01 4.33E−01 4.29E−01 4.24E−01
Worst 4.54E−01 4.53E−01 4.54E−01 4.54E−01 4.53E−01 4.53E−01 4.53E−01 4.54E−01 4.54E−01
SD 6.67E−03 7.45E−03 8.59E−03 8.24E−03 6.32E−03 6.96E−03 6.23E−03 6.60E−03 8.10E−03
Median 4.46E−01 4.45E−01 4.42E−01 4.47E−01 4.44E−01 4.46E−01 4.48E−01 4.47E−01 4.43E−01
Mean time 56.8857 56.5048 55.8578 37.4354 54.5648 52.6769 52.1033 48.5141 38.4682
Friedman rank 5.3 4.6 3.9 5.4 4.9 5.1 5.5 5.4 4.9
Manual rank 5 2 2 1 4 5 6 5 3
Wilcoxon 0.00E+00 4.17E−01 7.52E−02 9.26E−01 8.13E−01 9.43E−01 2.21E−01 4.41E−01 6.00E−01
Classification rate % 100.00 52.78 51.98 55.95 100.00 51.98 100.00 89.68 51.98
Table 28 The Friedman ranks overview for the TSA variants
N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Balloon 2.8 2.4 2.5 5.9 6.2 5.1 7.3 6.5 6.3
Iris 2.6 2.8 3.0 6.5 5.9 6.1 6.3 5.8 5.9
Cancer 3.3 2.9 3.7 6.4 5.5 5.3 5.9 6.1 6.0
Parity 2.2 2.8 2.8 5.5 5.5 5.2 7.2 7.0 6.9
EncDec 3.6 3.1 3.4 5.4 5.2 5.1 6.9 6.3 6.0
Cosine 3.7 4.0 4.2 5.0 5.7 5.6 5.9 5.2 5.5
Sine 5.3 4.6 3.9 5.4 4.9 5.1 5.5 5.4 4.9
Total FR 23.5 22.6 23.5 40.2 38.9 37.5 45.0 42.4 41.4
Table 29 The classification rates overview for the TSA
N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
Balloon 50.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00 50.00
Iris 16.67 27.33 9.33 30.00 20.67 7.33 14.00 30.67 4.00
Cancer 77.00 74.00 0.00 89.00 77.00 81.00 76.00 77.00 87.00
Parity 50.00 75.00 50.00 75.00 50.00 50.00 50.00 62.50 75.00
EncDec 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
Cosine 36.84 36.84 100.00 36.84 36.84 100.00 44.74 100.00 36.84
Sine 100.00 52.78 51.98 55.95 100.00 51.98 100.00 89.68 51.98
Mean CR 50.8 48.7 40.9 51.7 51.4 52.2 51.4 55.0 47.1
Table 30 The classification rates overview
ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
XOR9 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
XOR13 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
3-bit Parity 50.00 62.50 50.00 62.50 50.00 50.00 62.50 62.50 62.50
4-bit Enc. Dec 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
3-bits XOR 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Sigmoid 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Cosine 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 100.00
Sine 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 89.68
Balloon 50.00 0.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00
Iris 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 30.67
Breast cancer 0.00 100.00 100.00 95.00 0.00 0.00 52.00 1.00 77.00
Heart 100.00 0.00 100.00 100.00 59.09 100.00 0.00 0.00 100.00
Banknote 66.69 62.76 55.54 60.28 55.54 55.54 55.54 61.95 55.54
Diabetic 51.26 58.12 47.87 51.69 51.26 43.96 54.04 54.74 52.13
Twonorm 84.58 73.61 86.16 85.95 95.95 92.95 89.85 83.64 88.59
Ringnorm 63.24 55.04 60.69 64.18 78.19 69.15 67.64 66.59 63.08
Spambase 42.19 43.06 40.62 42.84 42.84 39.40 41.88 42.21 40.69
Mean CR 67.25 62.01 70.93 71.26 65.17 66.19 64.85 61.83 71.38
The experimental results of the Iris dataset for the TSA
are given in Table 22. All variants except the N = 50 and
ST = 0.1 variant are trapped in the same local minimum in
terms of mean training error results. The N = 100 and ST = 0.5
variant is the best in terms of classification rate.
The experimental results of the Cancer dataset for the TSA
are given in Table 23. The N = 10 and ST = 0.9 variant is the
best in terms of mean training error results. The N = 50 and
ST = 0.1 variant is the best in terms of classification rate.
The experimental results of the 3-bit Parity dataset for the
TSA are given in Table 24. The N = 10 and ST = 0.1 variant is
the best in terms of mean training error results. The N = 10
and ST = 0.5, N = 50 and ST = 0.1, and N = 100 and ST = 0.9
variants share the best position in terms of classification
rate.
The experimental results of the 4-bit Encoder Decoder
dataset for the TSA are given in Table 25. All variants are
trapped in the same local minimum, and all of them produce
the same classification rate.
The experimental results of the Cosine dataset for the TSA
are given in Table 26. The N = 10 and ST = 0.1, N = 10 and
ST = 0.5, and N = 100 and ST = 0.5 variants share the best
position in terms of mean training error results. The N = 10
and ST = 0.9, N = 50 and ST = 0.9, and N = 100 and ST = 0.5
variants share the best position in terms of classification
rate.
The experimental results of the Sine dataset for the TSA
are given in Table 27. The N = 50 and ST = 0.1 variant is the
best in terms of mean training error results. The N = 10 and
ST = 0.1, N = 50 and ST = 0.5, and N = 100 and ST = 0.1
variants share the best position in terms of classification
rate.
Table 31 The classification rates of the Cancer dataset for the TSA variants
Run No N=10,ST=0.1 N=10,ST=0.5 N=10,ST=0.9 N=50,ST=0.1 N=50,ST=0.5 N=50,ST=0.9 N=100,ST=0.1 N=100,ST=0.5 N=100,ST=0.9
1 79 91 90 87 90 88 81 88 53
2 74 74 91 85 92 82 83 87 57
3 57 80 74 88 61 86 89 83 25
4 84 81 66 33 86 94 87 77 83
5 79 89 83 88 79 90 84 87 82
6 82 65 83 74 84 95 90 84 84
7 88 36 69 77 83 83 79 48 77
8 84 78 83 92 85 83 84 79 82
9 84 84 90 82 84 81 90 90 83
10 84 88 91 79 76 81 79 96 88
11 94 77 85 89 82 85 80 88 89
12 82 79 59 84 83 82 87 87 87
13 95 89 41 84 85 86 88 84 42
14 87 77 86 92 90 81 85 92 83
15 86 76 73 92 86 78 84 88 83
16 86 78 52 83 88 84 86 82 64
17 56 88 47 82 77 86 85 78 81
18 84 78 87 89 83 88 25 75 82
19 78 81 82 91 42 82 89 77 39
20 78 80 34 85 83 84 84 70 90
21 93 85 80 76 87 92 93 81 81
22 87 86 85 45 85 85 56 81 89
23 83 80 37 86 76 82 13 81 82
24 77 80 69 89 90 91 76 87 83
25 78 82 81 83 79 73 88 79 45
26 67 26 78 87 74 91 86 84 84
27 86 81 0 74 90 81 92 53 80
28 79 83 87 83 80 85 89 85 85
29 67 87 83 81 61 70 87 86 85
30 88 61 84 88 84 90 83 63 63
Mean CR 80.87 77.33 71.67 81.60 80.83 84.63 80.07 80.67 74.37
Rank 3 7 9 2 4 1 6 5 8
Max CR 95 91 91 92 92 95 93 96 90
Min CR 56 26 0 33 42 70 13 48 25
For the overall analysis of the TSA variants, the Friedman
ranks overview for the TSA variants is given in Table 28 and
the classification rates overview for the TSA variants is given
in Table 29.
The N = 10 and ST = 0.5 variant is the best in terms of
Friedman rank results. The N = 100 and ST = 0.5 variant is the
best in terms of classification rate values. The final combined
classification rates are given in Table 30; for the seven
identifier datasets, its TSA column matches the N = 100 and
ST = 0.5 column of Table 29.
According to the classification rates in Table 30, TSA is the
best classifier in these experiments: it has the highest mean
classification rate over the 18 datasets. In this work, the
second is ES, the third is BBO, the fourth is ABC, the fifth
is GWO, the sixth is GA, the seventh is PBIL, the eighth is
ACO, and the last is PSO.
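The ranking by mean classification rate can be re-derived from Table 30. The sketch below copies three columns of that table (TSA, ES, and BBO) in row order and averages them; the variable names are illustrative only.

```python
# Classification rates (%) copied from Table 30, in the table's row order
# (XOR6, XOR9, XOR13, ..., Ringnorm, Spambase).
rates = {
    "TSA": [100, 100, 100, 62.50, 25, 100, 100, 100, 89.68, 0,
            30.67, 77, 100, 55.54, 52.13, 88.59, 63.08, 40.69],
    "ES": [100, 100, 100, 62.50, 25, 100, 100, 76.32, 60.32, 50,
           8.67, 95, 100, 60.28, 51.69, 85.95, 64.18, 42.84],
    "BBO": [100, 100, 100, 50, 25, 100, 100, 92.11, 56.75, 50,
            12, 100, 100, 55.54, 47.87, 86.16, 60.69, 40.62],
}

# Mean classification rate per algorithm, then sort from best to worst.
mean_cr = {name: sum(col) / len(col) for name, col in rates.items()}
ranking = sorted(mean_cr, key=mean_cr.get, reverse=True)

for name in ranking:
    print(f"{name}: {mean_cr[name]:.2f}")
```

The averages reproduce the Mean CR row of Table 30 for these three algorithms. The margin between TSA and ES is about 0.12 percentage points.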
5.2 The Deep Run Analyses for TSA
In the aforementioned experiments, we use the best trained
model to classify the test data. On closer inspection, the
best trained model is not always the best in the test phase.
The classification rates of the Cancer dataset for the
TSA variants are given in Table 31.
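The per-run view can be summarized with a few lines of code. The sketch below uses the 30 per-run classification rates from the first data column of Table 31 (stand size 10, ST 0.1) and the classification rate of the best trained model for the same variant from Table 23; the variable names are illustrative.

```python
# Per-run classification rates (%) for one TSA variant, read from the
# first data column of Table 31 (30 independent runs).
runs = [79, 74, 57, 84, 79, 82, 88, 84, 84, 84,
        94, 82, 95, 87, 86, 86, 56, 84, 78, 78,
        93, 87, 83, 77, 78, 67, 86, 79, 67, 88]

# Classification rate (%) of the best trained model for the same variant,
# taken from Table 23.
best_trained_cr = 77.00

mean_cr = sum(runs) / len(runs)
better_runs = sum(r > best_trained_cr for r in runs)

print(f"mean = {mean_cr:.2f}, max = {max(runs)}, min = {min(runs)}")
print(f"runs above the best trained model: {better_runs} of {len(runs)}")
```

The mean, maximum, and minimum agree with the Mean CR, Max CR, and Min CR rows of Table 31 for this variant, and most runs classify the test data better than the model with the lowest training error.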
The maximum classification rate is 96% when N = 100
and ST = 0.5. According to these results, for better
classification we should not use only the best trained model;
we must also look at the models from the different runs. The
convergence graph for the Cancer dataset is given in Fig. 4.
Fig. 4 The convergence graph for the Cancer dataset
According to Fig. 4, TSA achieves the best result in terms of
the mean square error in the training phase for the Cancer
dataset. ES, PSO, PBIL, and ACO are trapped in local minima
early, but ABC, BBO, GA, GWO, and TSA continue the search
process until the termination criterion is met. This graph
also supports the success of the TSA.
In this type of study, there are two main kinds of limitations:
general limitations and specific limitations. The general
limitations are related to all methods. These are determining
the population size, trapping into local optima, and detecting
the effective exploration and exploitation ratio. The specific
limitations of TSA are related to the peculiar parameters of
the method. These are determining the search tendency
parameter and determining the number of seeds parameter. The
search tendency controls the scheme that creates new candidate
solutions, and the number of seeds controls the exploitation in
the search space. In this work, we analyzed the search
tendency parameter. Experimental results showed that 0.5 is a
good value for the search tendency parameter. This means that
the two new candidate solution creation equations are used
with equal probability.
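The role of the search tendency can be sketched in code. The example below assumes the two seed-creation rules of the tree-seed algorithm as it is commonly stated: one rule moves a dimension relative to the best tree, the other relative to the current tree, and ST is the probability of choosing the first rule. The function name, argument names, and sample vectors are hypothetical, and the exact equations should be checked against the method description of the paper.

```python
import random


def make_seed(tree, best, neighbor, st, rng):
    """Build one seed from `tree`, choosing an update rule per dimension."""
    seed = []
    for t_j, b_j, r_j in zip(tree, best, neighbor):
        alpha = rng.uniform(-1.0, 1.0)  # random scaling factor in [-1, 1]
        if rng.random() < st:
            # Rule 1 (assumed): move relative to the best tree found so far.
            seed.append(t_j + alpha * (b_j - r_j))
        else:
            # Rule 2 (assumed): move relative to the current tree itself.
            seed.append(t_j + alpha * (t_j - r_j))
    return seed


rng = random.Random(42)
tree = [0.2, -0.4, 0.7]      # hypothetical current tree (a weight vector)
best = [0.1, -0.1, 0.5]      # hypothetical best tree in the stand
neighbor = [0.9, 0.3, -0.2]  # hypothetical randomly chosen other tree
print(make_seed(tree, best, neighbor, st=0.5, rng=rng))
```

With `st=0.5` each dimension picks either rule with equal probability, which corresponds to the half-and-half use of the two equations; values near 0 or 1 make one rule dominate.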
5.3 The Experiments on Large Datasets
In this section, the experiments on five large datasets are
given. The experimental results of the Banknote, Diabetic,
Twonorm, Ringnorm, and Spambase datasets are given in
Tables 32, 33, 34, 35, and 36, respectively.
The experimental results for the Banknote dataset are
given in Table 32. BBO is in the best position in terms of
mean training error results. ABC produced the maximum
classification rate.
The experimental results for the Diabetic dataset are given
in Table 33. GA is in the best position in terms of mean train-
ing error results. ACO produced the maximum classification
rate.
The experimental results for the Twonorm dataset are
given in Table 34. GA is in the best position in terms of mean
training error results. GA produced the maximum classifica-
tion rate.
The experimental results for the Ringnorm dataset are
given in Table 35. GA is in the best position in terms of mean
training error results. GA produced the maximum classifica-
tion rate.
The experimental results for the Spambase dataset are
given in Table 36. All algorithms produced the optimum
results in terms of mean training error results. ACO produced
the maximum classification rate.
6 Conclusion
In this paper, FF MLP ANN is trained by TSA for the first
time. TSA is one of the population-based swarm intelligence
Table 32 Experimental results of the Banknote dataset
Banknote ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 4.39E−20 7.84E−24 1.38E−87 1.02E−53 1.88E−79 1.31E−71 7.72E−69 1.95E−41 6.79E−55
Best 4.81E−34 5.00E−36 1.38E−87 2.37E−70 1.38E−87 1.51E−87 2.95E−87 1.80E−51 1.61E−74
Worst 6.43E−19 1.71E−22 1.38E−87 4.12E−53 4.96E−78 2.04E−70 2.30E−67 5.28E−40 2.04E−53
SD 1.55E−19 3.22E−23 9.08E−103 1.49E−53 9.10E−79 4.79E−71 4.19E−68 9.65E−41 3.72E−54
Median 1.35E−22 2.04E−28 1.38E−87 8.53E−61 1.65E−87 2.43E−79 9.67E−78 1.46E−45 6.19E−66
Mean time 91.9752 5.7195 4.7103 4.7717 4.5926 5.5933 2.8472 4.8154 42.1367
Friedman rank 8.9 8.1 1.0 5.9 2.1 3.1 3.8 7.0 5.1
Manual rank 8 7 1 5 1 2 3 6 4
Wilcoxon 1.73E−06 1.73E−06 1.73E−06 0.001287 1.73E−06 2.13E−06 1.73E−06 1.73E−06 0
Classification rate (%) 66.69 62.76 55.54 60.28 55.54 55.54 55.54 61.95 55.54
Table 33 Experimental results of the Diabetic dataset
Diabetic ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.81E−02 2.08E−01 2.14E−05 1.52E−01 3.44E−02 7.82E−02 1.39E−01 1.42E−01 1.43E−01
Best 4.07E−03 1.58E−01 2.14E−05 1.21E−01 6.77E−06 3.67E−02 1.06E−01 9.14E−02 6.54E−02
Worst 1.50E−01 2.51E−01 2.14E−05 1.84E−01 1.00E−01 1.31E−01 1.87E−01 1.72E−01 1.76E−01
SD 3.01E−02 2.48E−02 6.89E−21 1.61E−02 2.95E−02 2.20E−02 1.87E−02 2.38E−02 2.29E−02
Median 6.24E−02 2.11E−01 2.14E−05 1.58E−01 3.26E−02 7.77E−02 1.41E−01 1.46E−01 1.44E−01
Mean time 298.6501 57.9424 47.1950 47.5073 37.6225 61.5118 17.7858 38.4946 195.2061
Friedman rank 3.3 8.8 1.1 7.1 2.3 3.5 6.0 6.5 6.4
Manual rank 3 9 2 8 1 4 7 6 5
Wilcoxon 2.35E−06 1.92E−06 1.73E−06 0.135908 1.73E−06 1.92E−06 0.158855 0.975387 0
Classification rate (%) 51.26 58.12 47.87 51.69 51.26 43.96 54.04 54.74 52.13
Table 34 Experimental results of the Twonorm dataset
Twonorm ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.43E−03 1.27E−01 9.37E−107 3.78E−04 6.10E−59 5.06E−54 3.19E−14 1.50E−06 4.01E−08
Best 1.99E−17 4.18E−02 9.37E−107 1.29E−20 1.10E−139 1.63E−94 4.83E−31 9.20E−18 1.28E−23
Worst 4.30E−02 2.01E−01 9.37E−107 7.26E−03 1.83E−57 1.49E−52 9.34E−13 1.43E−05 1.13E−06
SD 7.85E−03 3.93E−02 2.46E−122 1.48E−03 3.34E−58 2.72E−53 1.70E−13 4.00E−06 2.06E−07
Median 9.37E−17 1.22E−01 9.37E−107 2.87E−08 1.48E−107 4.42E−69 1.18E−21 4.17E−09 3.32E−14
Mean time 2112.1370 97.5102 87.5008 87.7857 77.1834 102.8828 55.6546 77.8253 927.3152
Friedman rank 5.5 9.0 1.5 7.3 1.6 2.9 4.2 7.2 5.7
Manual rank 8 9 2 6 1 3 4 7 5
Wilcoxon 0.007731 1.73E−06 1.73E−06 0.000529 1.73E−06 1.73E−06 6.32E−05 9.32E−06 0
Classification rate (%) 84.58 73.61 86.16 85.95 95.95 92.95 89.85 83.64 88.59
Table 35 Experimental results of the Ringnorm dataset
Ringnorm ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.13E−02 1.69E−01 6.51E−28 1.03E−01 8.88E−10 5.60E−12 3.18E−02 7.47E−02 5.43E−02
Best 6.21E−15 1.00E−01 6.51E−28 5.52E−02 1.44E−32 6.81E−24 3.46E−07 3.05E−02 2.61E−03
Worst 5.02E−02 2.26E−01 6.51E−28 1.41E−01 2.66E−08 1.48E−10 1.06E−01 1.08E−01 1.12E−01
SD 2.43E−02 3.20E−02 9.12E−44 2.33E−02 4.86E−09 2.72E−11 2.47E−02 2.23E−02 2.72E−02
Median 1.32E−03 1.68E−01 6.51E−28 1.06E−01 3.39E−18 6.39E−17 3.45E−02 7.82E−02 5.05E−02
Mean time 2115.7338 97.4103 87.4961 87.9626 76.6757 102.8440 55.4910 77.9352 928.1096
Friedman rank 4.4 9.0 1.1 7.7 2.3 2.7 5.0 6.9 5.9
Manual rank 5 3 6 1 7 9 8 4 2
Wilcoxon 4.07E−05 1.73E−06 1.73E−06 3.18E−06 1.73E−06 1.73E−06 0.003854 0.002255 0
Classification rate (%) 63.24 55.04 60.69 64.18 78.19 69.15 67.64 66.59 63.08
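The Friedman rank rows in these tables sum to about 45 (1 + 2 + ... + 9), which is consistent with mean ranks taken over repeated runs. The following is a minimal sketch of that computation, assuming each run yields one final error per algorithm and that tied errors share the average rank. The function names and the sample errors are hypothetical and are not taken from the paper.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def friedman_mean_ranks(errors_by_run):
    """errors_by_run[r][a] is the final error of algorithm a in run r."""
    n_runs = len(errors_by_run)
    totals = [0.0] * len(errors_by_run[0])
    for run in errors_by_run:
        for a, rank in enumerate(average_ranks(run)):
            totals[a] += rank
    return [total / n_runs for total in totals]


# Hypothetical errors of three algorithms over four runs (illustration only).
runs = [
    [0.30, 0.10, 0.20],
    [0.25, 0.05, 0.15],
    [0.10, 0.10, 0.40],  # the first two algorithms tie in this run
    [0.50, 0.20, 0.30],
]
mean_ranks = friedman_mean_ranks(runs)
print(mean_ranks)
```

A lower mean rank indicates an algorithm that reached smaller errors more consistently. With k algorithms the mean ranks always sum to k(k + 1)/2, which gives a quick sanity check on a reported row.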
Table 36 Experimental results of the Spambase dataset
Spambase ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Best 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Worst 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
SD 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Median 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Mean time 2751.0268 468.9487 418.0443 428.8673 344.7158 538.8551 154.8249 341.6761 1754.3480
Friedman rank 1 1 1 1 1 1 1 1 1
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 1 1 1 1 1 1 1 1 0
Classification rate (%) 42.19 43.06 40.62 42.84 42.84 39.40 41.88 42.21 40.69
algorithms. TSA has two peculiar parameters, ST and NS. ST (search
tendency) controls the balance between exploration and exploitation.
NS provides intensification around the current solutions. The FF MLP
ANN is encoded as a vector, and TSA optimizes this vector. Eighteen
datasets (XOR6, XOR9, XOR13, 3-bit Parity, 4-bit Encoder Decoder,
3-bits XOR, Sigmoid, Cosine, Sine, Balloon, Iris, Breast Cancer,
Heart, Banknote, Diabetic, Twonorm, Ringnorm, and Spambase) are used
in the experiments. TSA is compared with PSO, GWO, GA, ACO, ES, PBIL,
ABC, and BBO. The experimental results show that TSA is the best in
terms of mean classification rates and outperformed the opponents on
18 problems. The obtained results are supported by two statistical
tests (the Wilcoxon signed-rank test and Friedman's test). Generally
speaking, swarm-based methods suffer from weak exploration, but TSA
has an efficient exploration mechanism. In future studies, improved
versions of TSA could be used for training FF MLP ANNs.
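The idea of encoding the network as a vector and letting TSA optimize it can be sketched as follows. This is a minimal illustration, not the author's implementation. It assumes a single hidden layer with sigmoid units, a mean squared error fitness, the seed update rule as commonly given for TSA (Kiran, 2015), NS read as the number of seeds per tree (drawn between 10% and 25% of the population), and hypothetical values for the weight bounds, ST, population size, and evaluation budget.

```python
import math
import random


def mlp_mse(vector, inputs, targets, n_in, n_hidden):
    """Decode a flat vector into an n_in-n_hidden-1 MLP and return its MSE.

    Assumed layout: hidden weights, hidden biases, output weights, output bias.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    split = n_in * n_hidden
    w_h = vector[:split]
    b_h = vector[split:split + n_hidden]
    w_o = vector[split + n_hidden:split + 2 * n_hidden]
    b_o = vector[-1]
    total = 0.0
    for x, t in zip(inputs, targets):
        hidden = [sigmoid(sum(w_h[j * n_in + i] * x[i] for i in range(n_in)) + b_h[j])
                  for j in range(n_hidden)]
        out = sigmoid(sum(w_o[j] * hidden[j] for j in range(n_hidden)) + b_o)
        total += (out - t) ** 2
    return total / len(inputs)


def tsa_train(fitness, dim, n_trees=10, st=0.1, max_evals=2000,
              low=-10.0, high=10.0, seed=0):
    """Minimize fitness over [low, high]^dim with a basic tree-seed loop."""
    rng = random.Random(seed)
    trees = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_trees)]
    costs = [fitness(tree) for tree in trees]
    evals = n_trees
    best_i = min(range(n_trees), key=costs.__getitem__)
    best, best_cost = trees[best_i][:], costs[best_i]
    ns_min = max(1, int(0.10 * n_trees))
    ns_max = max(ns_min, int(0.25 * n_trees))
    while evals < max_evals:
        for i in range(n_trees):
            top_seed, top_cost = None, float("inf")
            for _ in range(rng.randint(ns_min, ns_max)):  # NS seeds for this tree
                r = rng.choice([k for k in range(n_trees) if k != i])
                candidate = []
                for j in range(dim):
                    alpha = rng.uniform(-1.0, 1.0)
                    if rng.random() < st:  # ST: move relative to the best tree
                        value = trees[i][j] + alpha * (best[j] - trees[r][j])
                    else:                  # otherwise search around the tree itself
                        value = trees[i][j] + alpha * (trees[i][j] - trees[r][j])
                    candidate.append(min(high, max(low, value)))
                cost = fitness(candidate)
                evals += 1
                if cost < top_cost:
                    top_seed, top_cost = candidate, cost
            if top_cost < costs[i]:  # the best seed replaces its parent tree
                trees[i], costs[i] = top_seed, top_cost
                if top_cost < best_cost:
                    best, best_cost = top_seed[:], top_cost
    return best, best_cost


# Usage on the 2-input XOR problem with three hidden units.
xor_x = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_y = [0, 1, 1, 0]
n_in, n_hidden = 2, 3
dim = n_in * n_hidden + 2 * n_hidden + 1  # 13 weights and biases in total


def fit(v):
    return mlp_mse(v, xor_x, xor_y, n_in, n_hidden)


best, best_cost = tsa_train(fit, dim)
print(dim, best_cost)
```

Because the optimizer only needs a function from a vector to an error value, the same loop can train a network of any size once the decoding layout is fixed. No gradient is required, which is the main practical difference from backpropagation.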
Funding The authors wish to thank the Scientific Research Projects
Coordinatorship at Selcuk University and The Scientific and Technological
Research Council of Turkey for their institutional support.
Compliance with ethical standards
Conflicts of interest The authors declare that they have no conflict of
interest.
References
1. Bassett, D.S.; Gazzaniga, M.S.: Understanding complexity in the
human brain. Trends Cognit. Sci. 15(5), 200–209 (2011)
2. Haykin, S.: Neural Networks: A Comprehensive Foundation. Pren-
tice Hall PTR, New York (1994)
3. Yao, L.; Li, T.; Li, Y.; Long, W.; Yi, J.: An improved feed-forward
neural network based on UKF and strong tracking filtering to estab-
lish energy consumption model for aluminum electrolysis process.
Neural Comput. Appl. 31(8), 4271–4285 (2019)
4. Zhang, Y.; Gendeel, M.A.A.; Peng, H.; Qian, X.: Xu H Super-
vised Kohonen network with heterogeneous valuedifference metric
for both numeric and categorical inputs. Soft Comput.ss 24(3),
1763–1774 (2020)
5. Mirjalili, S.: Evolutionary radial basis function networks. In:
Evolutionary Algorithms and Neural Networks: Theory and Appli-
cations. Springer International Publishing, Cham, pp 105–139
(2019). https://doi.org/10.1007/978-3-319-93025-1-8
6. Shojaeifard, A.; Amroudi, A.N.; Mansoori, A.; Erfanian, M.: Pro-
jection recurrent neural network model: a new strategy to solve
weapon-target assignment problem. Neural Process. Lett. 30(8),
2538–2547 (2019)
7. Tavanaei, A.; Ghodrati, M.; Kheradpisheh, S.R.; Masquelier, T.;
Maida, A.: Deep learning in spiking neural networks. Neural Netw.
111, 47–63 (2019)
8. Mirjalili, S.: How effective is the Grey Wolf optimizer in training
multi-layer perceptrons. Appl. Intell. 43(1), 150–161 (2015)
9. Lee, S.-J.; Tseng, C.-H.; Lin, G.R.; Yang, Y.; Yang, P.; Muham-
mad, K.; Pandey, H.M.: A dimension-reduction based multilayer
perception method for supporting the medical decision making.
Pattern Recogn. Lett. 131, 15–22 (2020)
10. Hertz, J.A.: Introduction to the Theory of Neural Computation.
CRC Press, Amsterdam (2018)
11. Mitchell, M.; Holland, J.H.; Forrest, S.: When will a genetic algo-
rithm outperform hill climbing. In: Advances in Neural Information
Processing Systems, pp. 51–58 (1994)
12. Sonuc, E.; Sen, B.; Bayir, S.: A cooperative GPU-based parallel
multistart simulated annealing algorithm for quadratic assignment
problem. Eng. Sci. Technol. Int. J. 21(5), 843–849 (2018). https://
doi.org/10.1016/j.jestch.2018.08.002
13. Pandey, H.M.; Rajput, M.; Mishra, V.: Performance compari-
son of pattern search, simulated annealing, genetic algorithm and
jaya algorithm. In: Data Engineering and Intelligent Computing.
Springer, Berlin, pp 377–384 (2018)
14. ¸Sahman, M.A.; Altun, A.A.; Dündar, A.O.: A new MILP model
proposal in feed formulation and using a hybrid-linear binary PSO
(H-LBP) approach for alternative solutions. Neural Comput. Appl.
29(2), 537–552 (2018)
15. Cinar, A.C.; Korkmaz, S.; Kiran, M.S.: A discrete tree-seed algo-
rithm for solving symmetric traveling salesman problem. Eng. Sci.
Technol. Int. J. (2019)
16. Tongur, V.; Hacibeyoglu, M.; Ulker, E.: Solving a big-scaled hos-
pital facility layout problem with meta-heuristics algorithms. Eng.
Sci. Technol. Int. J. (2019)
17. Xu, X.; Rong, H.; Trovati, M.; Liptrott, M.; Bessis, N.: CS-PSO:
chaotic particle swarm optimization algorithm for solving com-
binatorial optimization problems. Soft. Comput. 22(3), 783–795
(2018)
18. Egrioglu, E.; Yolcu, U.; Bas, E.; Dalar, A.Z.: Median-Pi artifi-
cial neural network for forecasting. Neural Comput. Appl. 31(1),
307–316 (2019)
19. Yasar, A.; Saritas, I.; Sahman, M.A.; Dundar, A.O.: Classification
of leaf type using artificial neural networks. Int. J. Intell. Syst. Appl.
Eng. 3(4), 136–139 (2015)
20. Yasar, A.; Saritas, I.; Sahman, M.; Cinar, A.: Classification of
parkinson disease data with artificial neural networks. In: IOP
Conference Series: Materials Science and Engineering, vol 1. IOP
Publishing, p. 012031 (2019
21. Sulistyo, S.B.; Woo, W.L.; Dlay, S.S.: Regularized neural networks
fusion and genetic algorithm based on-field nitrogen status esti-
mation of wheat plants. IEEE Trans. Industr. Inf. 13(1), 103–114
(2016)
22. Sulistyo, S.B.; Woo, W.L.; Dlay, S.S.; Gao, B.: Building a globally
optimized computational intelligent image processing algorithm
for on-site inference of nitrogen in plants. IEEE Intell. Syst. 33(3),
15–26 (2018)
123
Arabian Journal for Science and Engineering
23. Gu, K.; Zhou, Y.; Sun, H.; Zhao, L.; Liu, S.: Prediction of air quality
in Shenzhen based on neural network algorithm. Neural Comput.
Appl. 1–14 (2019)
24. Koh, B.H.D.; Woo, W.L.: Multi-view temporal ensemble for clas-
sification of non-stationary signals. IEEE Access 7, 32482–32491
(2019)
25. Boashash, B.; Ouelha, S.: Designing high-resolution time–fre-
quency and time–scale distributions for the analysis and classifica-
tion of non-stationary signals: a tutorial review with a comparison
of features performance. Digital Signal Process. 77, 120–152
(2018)
26. Delsy, T.T.M.; Nandhitha, N.; Rani, B.S.: Feasibility of spectral
domain techniques for the classification of non-stationary signals.
J. Ambient Intell. Hum. Comput, 1–8 (2020)
27. Wolpert, D.H.; Macready, W.G.: No free lunch theorems for opti-
mization. IEEE Trans. Evol. Comput. 1(1), 67–82 (1997)
28. Wienholt, W.: Minimizing the system error in feedforward neural
networks with evolution strategy. In: International Conference on
Artificial Neural Networks, Springer, Berlin, pp 490–493 (1993)
29. Seiffert, U.: Multiple layer perceptron training using genetic algo-
rithms. In: ESANN, Citeseer, pp 159–164 (2001)
30. Mendes, R.; Cortez, P.; Rocha, M.; Neves, J.: Particle swarms for
feedforward neural network training. In: Proceedings of the 2002
International Joint Conference on Neural Networks. IJCNN’02
(Cat. No. 02CH37290), IEEE, pp 1895–1899 (2002)
31. Blum, C.; Socha, K.: Training feed-forward neural networks with
ant colony optimization: an application to pattern classification.
In: Fifth International Conference on Hybrid Intelligent Systems
(HIS’05), IEEE (2005)
32. Karaboga, D.; Akay, B.; Ozturk, C.: Artificial bee colony (ABC)
optimization algorithm for training feed-forward neural networks.
In: International conference on modeling decisions for artificial
intelligence, Springer, Berlin, pp. 318–329 (2007)
33. Mirjalili, S.; Hashim, S.Z.M.; Sardroudi, H.M.: Training feedfor-
ward neural networks using hybrid particle swarm optimization
and gravitational search algorithm. Appl. Math. Comput. 218(22),
11125–11137 (2012)
34. Mirjalili, S.; Mirjalili, S.M.; Lewis, A.: Let a biogeography-based
optimizer train your multi-layer perceptron. Inf. Sci. 269, 188–209
(2014)
35. Amirsadri, S.; Mousavirad, S.J.; Ebrahimpour-Komleh, H.: A Levy
flight-based grey wolf optimizer combined with back-propagation
algorithm for neural network training. Neural Comput. Appl.
30(12), 3707–3720 (2018)
36. Haklı, H.; U˘guz, H.: A novel particle swarm optimization algorithm
with Levy flight. Appl. Soft Comput. 23, 333–345 (2014)
37. Xu, F.; Pun, C.-M.; Li, H.; Zhang, Y.; Song, Y.; Gao, H.: Training
feed-forward artificial neural networks with a modified artificial
bee colony algorithm. Neurocomputing (2019)
38. Zhang, X.; Wang, X.; Chen, H.; Wang, D.; Fu, Z.: Improved GWO
for large-scale function optimization and MLP optimization in can-
cer identification. Neural Comput. Appl., 1–21 (2019)
39. Heidari, A.A.; Faris, H.; Mirjalili, S.; Aljarah, I.; Mafarja, M.: Ant
lion optimizer: theory, literature review, and application in multi-
layer perceptron neural networks. In: Nature-inspired optimizers.
Springer, Berlin, pp 23–46 (2020)
40. Dalwinder, S.; Birmohan, S.; Manpreet, K.: Simultaneous feature
weighting and parameter determination of neural networks using
ant lion optimization for the classification of breast cancer. Biocy-
bernet. Biomed. Eng. (2019)
41. Faris, H.; Aljarah, I.; Mirjalili, S.: Training feedforward neural
networks using multi-verse optimizer for binary classification prob-
lems. Appl. Intell. 45(2), 322–332 (2016)
42. Gao, B.; Li, X.; WooyunTian, W.L.G.: Physics-based image
segmentation using first order statistical properties and genetic
algorithm for inductive thermography imaging. IEEE Trans. Image
Process. 27(5), 2160–2175 (2017)
43. Mutluer, M.; ¸Sahman, M.A.; Çunka¸s, M.: Heuristic optimization
based on penalty approach for surface permanent magnet syn-
chronous machines. Arab. J. Sci. Eng. 1–17 (2020)
44. Karasekreter, N.; ¸Sahman, M.A.; Ba¸sçiftçi, F.; Fidan, U.: PSO
based clustering for the optimization of energy consumption in
wireless sensor network. Emerg. Mater. Res, 1–7 (2020)
45. Kiran, M.S.: TSA: tree-seed algorithm for continuous optimization.
Expert Syst. Appl. 42(19), 6686–6698 (2015)
46. Kıran, M.S.: An implementation of tree-seed algorithm (TSA) for
constrained optimization. In: Intelligent and evolutionary systems.
Springer, Berlin, pp 189–197 (2016)
47. Babalik, A.; Cinar, A.C.; Kiran, M.S.: A modification of tree-seed
algorithm using Deb’s rules for constrained optimization. Appl.
Soft Comput. 63, 289–305 (2018)
48. El-Fergany, A.A.; Hasanien, H.M.: Tree-seed algorithm for solv-
ing optimal power flow problem in large-scale power systems
incorporating validations and comparisons. Appl. Soft Comput.
64, 307–316 (2018)
49. Zhou, J.; Zheng, Y.; Xu, Y.; Liu, H.; Chen, D.: A heuristic
TS fuzzy model for the pumped-storage generator-motor using
variable-length tree-seed algorithm-based competitive agglomer-
ation. Energies 11(4), 944 (2018)
50. Horng, S.-C.; Lin, S.-S.: Embedding ordinal optimization into
tree–seed algorithm for solving the probabilistic constrained sim-
ulation optimization problems. Appl. Sci. 8(11), 2153 (2018)
51. Zheng, Y.; Zhou, J.; Zhu, W.; Zhang, C.; Li, C.; Fu, W.: Design of a
multi-mode intelligent model predictive control strategy for hydro-
electric generating unit. Neurocomputing 207, 287–299 (2016)
52. Chen, W.; Tan, X.; Cai, M.: Parameter identification of equivalent
circuit models for Li-ion batteries based on tree seeds algorithm.
In: IOP Conference Series: Earth and Environmental Science, vol
1. IOP Publishing, p 012024 (2017)
53. Chen, W.; Cai, M.; Tan, X.; Wei, B.: Parameter identification and
state-of-charge estimation for Li-Ion batteries using an improved
tree seed algorithm. IEICE Trans. Inf. Syst. 102(8), 1489–1497
(2019)
54. Ding, Z.; Zhao, Y.; Lu, Z.: Simultaneous identification of structural
stiffness and mass parameters based on Bare-bones Gaussian Tree
Seeds Algorithm using time-domain data. Appl. Soft Comput. 83,
105602 (2019)
55. Zhao, S.; Wang, N.; Liu, X.: Artificial bee colony algorithm
with tree-seed searching for modeling multivariable systems using
GRNN. In: 2019 Chinese Control And Decision Conference
(CCDC), IEEE, pp. 4702–4707 (2019)
56. Sahman, M.; Cinar, A.; Saritas, I.; Yasar, A.: Tree-seed algorithm in
solving real-life optimization problems. In: IOP conference series:
materials science and engineering, vol 1. IOP Publishing (2019)
57. Ding, Z.; Li, J.; Hao, H.; Lu, Z.-R.: Nonlinear hysteretic param-
eter identification using an improved tree-seed algorithm. Swarm
Evolut. Comput. 46, 69–83 (2019)
58. Ding, Z.; Li, J.; Hao, H.: Structural damage detection with uncer-
tainties using a modified tree seeds algorithm. In: International
Conference on Computational & Experimental Engineering and
Sciences, Springer, Berlin, pp. 751–760 (2019)
59. Muneeswaran, V.; Rajasekaran, M.P.: Gallbladder shape estimation
using tree-seed optimization tuned radial basis function network
for assessment of acute cholecystitis. In: Intelligent engineering
informatics. Springer, pp 229–239 (2018)
60. Cinar, A.; Kiran, M.: A parallel version of tree-seed algorithm
(TSA) within CUDA platform. In: Selçuk International Scientific
Conference on Applied Sciences (2016)
61. Cinar, A.C.; Kiran, M.S.: A parallel implementation of tree-seed
algorithm on CUDA-supported graphical processing unit. J Fac
Eng Archit Gazi Univ 33(4), 1397–1409 (2018)
123
Arabian Journal for Science and Engineering
62. Muneeswaran, V.; Rajasekaran, M.P. Beltrami-regularized denois-
ing filter based on tree seed optimization algorithm: an ultrasound
image application. In: International conference on information
and communication technology for intelligent systems, Springer,
pp. 449–457 (2017)
63. Muneeswaran, V.; Rajasekaran, M.P.: Local contrast regularized
contrast limited adaptive histogram equalization using tree seed
algorithm—an aid for mammogram images enhancement. In:
Smart Intelligent Computing and Applications. Springer, Berlin,
pp 693–701 (2019)
64. Ding, Z.; Li, J.; Hao, H.; Lu, Z.-R.: Structural damage identifi-
cation with uncertain modelling error and measurement noise by
clustering based tree seeds algorithm. Eng. Struct. 185, 301–314
(2019)
65. Oliva, D.; Elaziz, M.A.; Hinojosa, S.: Otsu’s between class variance
and the tree seed algorithm. In: Metaheuristic Algorithms for Image
Segmentation: Theory and Applications. Springer, pp 71–83 (2019)
66. Cinar, A.C.; Kiran, M.S.: Similarity and logic gate-based tree-
seed algorithms for binary optimization. Comput. Ind. Eng. 115,
631–646 (2018)
67. Cinar, A.C.; Iscan, H.; Kiran, M.S.: Tree-seed algorithm for large-
scale binary optimization. In: KnE Social Sciences, pp. 48–64
(2018)
68. Sahman, M.A.; Cinar, A.C.: Binary tree-seed algorithms with S-
shaped and V-shaped transfer functions. Int. J. Intell. Syst. Appl.
Eng. 7(2), 111–117 (2019)
69. Kiran, M.S.: Withering process for tree-seed algorithm. Proced.
Comput. Sci. 111, 46–51 (2017)
70. Aslan, M.; Beskirli, M.; Kodaz, H.; Kıran, M.S.: An improved
tree seed algorithm for optimization problems. Int. J. Mach. Learn.
Comput. 8(1), 20–25 (2018)
71. Çınar, A.C.; Kıran, M.S. Boundary conditions in Tree-Seed Algo-
rithm: analysis of the success of search space limitation techniques
in Tree-Seed Algorithm. In: 2017 International Conference on
Computer Science and Engineering (UBMK), IEEE, pp. 571–576
(2017)
72. Be¸skirli, A.; Özdemir, D.; Temurta¸s, H.: A comparison of modi-
fied tree–seed algorithm for high-dimensional numerical functions.
Neural Comput. Appl., 1–35 (2019)
73. Gungor, I.; Emiroglu, B.G.; Cinar, A.C.; Kiran, M.S.: Integration
search strategies in tree seed algorithm for high dimensional func-
tion optimization. Int. J. Mach. Learn. Cybernet., 1–19 (2019)
74. Jiang, J.; Jiang, S.; Meng, X.; Qiu, C.: EST-TSA: An effective
search tendency based to tree seed algorithm. Physica A 534,
122323 (2019)
75. Jiang, J.; Xu, M.; Meng, X.; Li, K.: STSA: A sine Tree-Seed Algo-
rithm for complex continuous optimization problems. Physica A
537, 122802 (2020)
76. Be¸skirli, M.; Yüksek, B.: Test Fonksiyonlarında A˘gaç Tohum Algo-
ritmasının Performans Analizi. Avrupa Bilim ve Teknoloji Dergisi,
pp. 93–101
77. Chen, F.; Ye, Z.; Wang, C.; Yan, L.; Wang, R.: A feature selec-
tion approach for network intrusion detection based on tree-seed
algorithm and k-nearest neighbor. In: 2018 IEEE 4th International
Symposium on Wireless Systems within the International Confer-
ences on Intelligent Data Acquisition and Advanced Computing
Systems (IDAACS-SWS), IEEE, pp 68–72 (2018)
123