Arabian Journal for Science and Engineering
https://doi.org/10.1007/s13369-020-04872-1
RESEARCH ARTICLE-COMPUTER ENGINEERING AND COMPUTER SCIENCE
Training Feed-Forward Multi-Layer Perceptron Artificial Neural
Networks with a Tree-Seed Algorithm
Ahmet Cevahir Cinar¹ (accinar@selcuk.edu.tr; ahmetcevahircinar@gmail.com)
¹ Department of Computer Engineering, Faculty of Technology, Selcuk University, 42075 Konya, Turkey
Received: 28 March 2020 / Accepted: 13 August 2020
© King Fahd University of Petroleum & Minerals 2020
Abstract
The artificial neural network (ANN) is the most popular research area in neural computing. A multi-layer perceptron (MLP) is an ANN that has hidden layers. Feed-forward (FF) ANNs are commonly used for classification and regression. Training of FF MLP ANNs is generally performed with the backpropagation (BP) algorithm, whose main disadvantage is that it can become trapped in local minima. Nature-inspired optimizers have mechanisms for escaping local minima. The tree-seed algorithm (TSA) is an effective population-based swarm intelligence algorithm that mimics the relationship between trees and their seeds. Exploration and exploitation are controlled by the search tendency, a parameter peculiar to TSA. In this work, we train FF MLP ANNs with TSA for the first time. TSA is compared with particle swarm optimization, gray wolf optimizer, genetic algorithm, ant colony optimization, evolution strategy, population-based incremental learning, artificial bee colony, and biogeography-based optimization. The experimental results show that TSA is the best in terms of mean classification rates and outperformed its opponents on 18 problems.
Keywords Tree-seed algorithm · Multi-layer perceptron · Training neural network · Artificial neural network · Neural networks · Nature-inspired algorithms
1 Introduction
Neural computing mimics the human brain, the most complex organ of the human body [1]. Neural networks simulate the connections in the human brain [2], and this simulation is named the artificial neural network (ANN). Basically, an ANN takes inputs, computes on them, and produces outputs; this process is a learning process. Learning has two main types: supervised and unsupervised. In supervised learning, the training data have output labels, but in unsupervised learning the data have no output labels. An ANN learns a balanced mapping between inputs and outputs. In the literature, there are various types of networks, such as feed-forward (FF) [3], Kohonen [4], radial-basis function (RBF) [5], recurrent neural [6], and spiking neural [7] networks. An FF network propagates signals in only one direction. In an FF network, the association of inputs and outputs is provided by weights and biases. If an FF ANN has hidden layers, it is named a multi-layer perceptron (MLP) [8, 9]. An MLP has three layers: input, hidden, and output. Training data are used for learning the hidden weights between the attributes and the class labels. Deterministic and stochastic learning approaches are used for training an ANN.
The gradient-based methods and the backpropagation (BP) algorithm are deterministic methods [10]. If the training data do not change, deterministic methods produce the same results. The deterministic methods are simple and fast. Stochastic methods try to improve the learning during the iterations; thus, their time usage is higher than that of the deterministic methods, but they give better results. The main drawback of the deterministic methods is their dependency on the initial solutions. Stochastic optimization techniques start with random solutions. These random solutions are evolved in every iteration, and their main advantage is avoiding local optima. Nature-inspired optimizers are in the stochastic optimization techniques group. Most of these methods are multi-solution-based algorithms, but some of them, like hill climbing [11] and simulated annealing [12, 13], are single-solution-based algorithms. TSA, particle swarm optimization (PSO), gray wolf optimizer (GWO), genetic algorithm (GA), ant colony optimization (ACO), evolution strategy (ES), population-based incremental learning (PBIL), artificial bee colony (ABC), and biogeography-based optimization (BBO) are some of the multi-solution nature-inspired optimizers. These algorithms are used not only in training FF ANN MLPs but also in various applications like feed formulation [14], the traveling salesman problem [15], layout problems [16], and combinatorial problems [17]. Also, ANN is used in many applications like forecasting [18], classification [19, 20], estimation [21, 22], and prediction [23].
The ANN is used not only for classifying stationary signals but also non-stationary signals [24]. Various techniques are used in the classification of non-stationary signals; for example, Koh and Woo [24] combined an ensemble technique and multi-view learning for classifying non-stationary signals; Boashash and Ouelha [25] focused on extracting information from non-stationary signals; Delsy et al. [26] extracted features from non-stationary signals and classified them with a backpropagation network. According to the No Free Lunch (NFL) theorem [27], a single nature-inspired optimizer cannot solve all optimization problems successfully. Therefore, since GA was proposed in 1975, more than 300 nature-inspired algorithms have been proposed. Every nature-inspired optimizer has its own peculiar properties, so in this work we want to prove the success of TSA in training MLPs. TSA is an effective solver on low-dimensional problems; in this work, we modify the basic TSA for solving large-scale MLP training.
The remainder of the paper is organized as follows: Sect. 1.1 gives the main contribution of the study. In Sect. 2, the related works are given. FF MLP ANN and TSA are examined in Sects. 3.1 and 3.2, respectively. The experimental setup and information about the datasets are given in Sect. 4. The results and discussion are located in Sect. 5. Finally, in Sect. 6 we conclude the work.
1.1 The Main Contribution of the Study

• TSA is used for training the FF MLP ANN for the first time.
• TSA is compared with 8 metaheuristic algorithms on 18 different datasets (6 to 6786 dimensions, 4 to 7400 samples) and outperforms them.
• TSA finds eligible weights and biases of the FF MLP ANN.
• The parameter adjustment for the basic TSA increased the mean classification accuracy.
• TSA is the best solver on 18 different types of datasets in terms of mean classification rates.
2 Literature Review
Neural computing and nature-inspired optimizers constitute a huge research domain. Training feed-forward multi-layer perceptron artificial neural networks is not a fresh idea, but it is a much-discussed, alive, and growing research problem in the literature. Therefore, in this section, we focus only on recent applications related to the training of FF MLP ANNs with nature-inspired optimizers and on the literature of TSA.
Wienholt [28] uses ES for minimizing the system error of an MLP. Seiffert [29] proposes a GA approach for avoiding local minima when training MLPs in 2001. Mendes et al. [30] use PSO for training MLPs on classification and regression tasks in 2002. In 2005, Blum and Socha [31] extend ACO to pattern classification on medical data. Karaboga et al. [32] use ABC for training FF ANNs; five function approximation problems are used in the experiments, and ABC outperforms BP and GA in this work. Mirjalili et al. [33] hybridize PSO and the gravitational search algorithm, naming the result PSOGSA. In that work, MLP is trained with PSOGSA, and the obtained results are compared with PSO and GSA; PSOGSA is better than PSO and GSA in terms of convergence, training error, and classification rate. Mirjalili et al. [34] train MLP with BBO in 2014. Five classification and six approximation datasets are used in the experiments. BBO outperforms PSO, GA, ACO, ES, and PBIL; also, the obtained results are compared with the BP algorithm and the extreme learning
machine. Mirjalili [8] investigates the effectiveness of the
GWO on training MLP in 2015. Five classification and three
function approximation datasets are used for determining
the performance of GWO. GWO creates better results than
PSO, GA, ACO, ES, and PBIL. Amirsadri et al. [35] com-
bine BP and GWO for training MLP. The Lévy flight technique is used to improve the exploration capability of GWO, as in [36]. BP increases the exploitation capability of GWO. The
success of the proposed model is shown on 12 classification
and function-approximation datasets. Xu et al. [37] modify
ABC with the global best-guided approach for continuous
optimization problems, and this method is named ABC-ISB.
ABC-ISB is compared with ABC variants in the literature. In
that work, basic ABC and ABC-ISB are compared on training MLP, and ABC-ISB creates promising results. Zhang et al. [38] optimize the weights and biases of an MLP with an improved GWO in 2019, and their approach is named RSMGWO. RSMGWO uses a random opposition learning strategy for avoiding local optima. Nineteen different cancer-related datasets are used in the experiments, and RSMGWO produces competitive results. Heidari et al. [39] use the ant lion optimizer for training MLP in 2020, and their approach is named ALOMLP. ALOMLP outperforms GA, PBIL, DE, and PSO in this work. Dalwinder et al. [40] weight the features of the datasets to increase the classification rate in 2020. In this work, an ant lion optimizer is used for training the MLPs, and three breast cancer datasets are used in the experimental setup. The obtained results show that this paradigm increases the classification rate. Faris et al. [41] train MLP with the multi-verse optimizer (MVO). Nine different
bio-medical datasets selected from the UCI machine learning
repository are used in experiments. MVO is compared with
GA, PSO, DE, firefly, and cuckoo search algorithms. The
experimental results show that MVO produces comparable results. Metaheuristic algorithms are used not only for time-series prediction but also in image processing [42], electrical machine design [43], and the optimization of energy consumption in wireless sensor networks [44].
TSA is another iterative continuous search algorithm, proposed by Kiran [45] in 2015. In the literature, TSA is used in a wide range of research areas, such as constrained versions of TSA [46, 47], engineering optimization problems solved with TSA [48–58], RBF network training and applications with TSA [48, 59], parallel versions of TSA [60, 61], image processing with TSA [62–65], binary optimization with TSA [66–68], improved versions of TSA [50, 69–76], feature selection with TSA [77], and discrete versions of TSA [15].
Until now, there has been no work in the literature on training MLP with TSA; our main motivation is to present the effectiveness of TSA for training MLPs.
3 Materials and Methods
3.1 Feed-Forward Neural Network and Multi-Layer
Perceptron
The FF neural network is a neural network that has only one direction of signal flow between its neurons. If an NN has hidden layers, it is named an MLP. In this study, vector representation is used for individuals. The individual for the 2–3–1 MLP presented in Fig. 1 is
X = [W13, W23, W14, W24, W15, W25, W36, W46, W56, θ1, θ2, θ3, θ4].

Fig. 1 The structure of the 2–3–1 MLP

The dimension is calculated as ((InputNumber + OutputNumber + 1) × HiddenNodesNumber) + 1. For the 2–3–1 MLP this gives (2 + 1 + 1) × 3 + 1 = 13, i.e., 9 weights and 4 biases, matching the XOR13 row of Table 1.
Mean square error (MSE) over all training samples is used as the objective function. Equation 1 shows this calculation:

MSE = (Σ_{t=1..T} Σ_{i=1..m} (R_i^t − C_i^t)²) / T   (1)

where T is the number of training samples, m is the number of outputs, C_i^t is the created output value of the ith output for the tth training sample, and R_i^t is the real output value of the ith output for the tth training sample.
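To make the encoding and the objective concrete, the following is a minimal sketch (Python with NumPy; helper names such as `decode_2_3_1` and `mse_objective` are our own illustrations, and the sigmoid activations in the hidden and output layers are an assumption, not stated above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_2_3_1(ind):
    """Split a 13-dimensional individual into the weights and biases of Fig. 1."""
    ind = np.asarray(ind, dtype=float)
    W_ih = ind[0:6].reshape(3, 2)   # input->hidden weights W13..W25 (one row per hidden node)
    W_ho = ind[6:9].reshape(1, 3)   # hidden->output weights W36, W46, W56
    b_h = ind[9:12]                 # hidden biases theta1..theta3
    b_o = ind[12:13]                # output bias theta4
    return W_ih, W_ho, b_h, b_o

def mse_objective(ind, X, R):
    """Eq. 1: mean over the T training samples of the summed squared output error."""
    W_ih, W_ho, b_h, b_o = decode_2_3_1(ind)
    H = sigmoid(X @ W_ih.T + b_h)   # hidden activations, shape (T, 3)
    C = sigmoid(H @ W_ho.T + b_o)   # created outputs,    shape (T, 1)
    return np.mean(np.sum((R - C) ** 2, axis=1))

# XOR truth table with a 2-3-1 MLP (the XOR13 setting of Table 1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
R = np.array([[0], [1], [1], [0]], dtype=float)
print(mse_objective(np.random.uniform(-10, 10, 13), X, R))
```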
3.2 Tree-Seed Algorithm
TSA was proposed by Kiran [45] in 2015 for solving unconstrained continuous optimization problems. TSA simulates the relationship between trees and their seeds. TSA is a population-based swarm intelligence technique. It has two peculiar parameters: the search tendency (ST) and the number of seeds (NS). ST controls the seed creation direction. The population is named the stand in TSA. Kiran [45] recommends that NS be between 10% and 25% of the stand size, but if necessary one can change this number. In TSA, the trees and seeds correspond to possible solutions of an optimization problem. At the initialization phase, the population is created randomly in a predetermined search space. The trees and seeds are D-dimensional vectors, where D is the dimensionality of the optimization problem. The search process is a trade-off between exploration and exploitation; if this trade-off is balanced, the algorithm creates more qualified solutions. In TSA, this balance is controlled by the ST parameter through two different seed creation formulas, given in Eqs. 2 and 3, respectively.
Seed(k, j) = Tree(i, j) + (Best(j) − Tree(r, j)) × Rand(−1, 1)   (2)

Seed(k, j) = Tree(i, j) + (Tree(i, j) − Tree(r, j)) × Rand(−1, 1)   (3)

where k is the index of the seed, j is the index of the dimension, r is the index of the random neighbor tree, Best is the best tree obtained so far, and Rand(−1, 1) is a random number between −1 and 1. Equation 2 provides the exploitation, and Eq. 3 provides the exploration. The detailed
pseudocode of the basic TSA is given in Fig. 2.
Determine the number of trees (N)
Determine the search tendency (ST) parameter
Determine the maximum function evaluation number (Maxfes)
D is the dimensionality of the problem
Initialize the trees
Evaluate the trees
Fes=N
WHILE Fes<Maxfes
FOR i=1 to N
Determine the number of seeds (NS), between 10% and 25% of the population size
Select a random neighbor tree (r) that does not equal the current tree
FOR k=1 to NS
FOR j=1 to D
IF rand<ST
Seed(k,j)=Tree(i,j)+rand(-1,1)*(Best(j)-Tree(r,j))
Relocate the seeds if cross the search space boundaries
ELSE
Seed(k,j)=Tree(i,j)+rand(-1,1)*(Tree(i,j)-Tree(r,j))
Relocate the seeds if cross the search space boundaries
END
END
END
Determine the best seed with a greedy selection mechanism
If the best seed is better than its tree, then the tree is removed from the search space and the
best seed becomes a tree
END
Determine the best tree with a greedy selection mechanism
END
Fig. 2 The detailed pseudocode of the basic TSA
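As a concrete companion to the pseudocode in Fig. 2, the following is a minimal, self-contained sketch of the basic TSA loop (Python with NumPy; the function name `tsa_minimize` is our own, and clipping to the bounds stands in for the "relocate the seeds" step):

```python
import numpy as np

def tsa_minimize(f, dim, low, high, n_trees=10, st=0.1, max_fes=12_500, rng=None):
    rng = rng or np.random.default_rng()
    trees = rng.uniform(low, high, (n_trees, dim))   # initialize the stand
    fits = np.array([f(t) for t in trees])
    fes = n_trees
    best = trees[fits.argmin()].copy()               # best tree obtained so far
    while fes < max_fes:
        for i in range(n_trees):
            # NS between 10% and 25% of the stand size
            ns = int(rng.integers(max(1, n_trees // 10), max(2, n_trees // 4) + 1))
            r = rng.choice([x for x in range(n_trees) if x != i])  # random neighbor tree
            seeds = np.empty((ns, dim))
            for k in range(ns):
                for j in range(dim):
                    if rng.random() < st:   # Eq. 2: move toward the best tree (exploitation)
                        seeds[k, j] = trees[i, j] + rng.uniform(-1, 1) * (best[j] - trees[r, j])
                    else:                   # Eq. 3: move around the current tree (exploration)
                        seeds[k, j] = trees[i, j] + rng.uniform(-1, 1) * (trees[i, j] - trees[r, j])
            seeds = np.clip(seeds, low, high)        # relocate seeds that cross the boundaries
            seed_fits = np.array([f(s) for s in seeds])
            fes += ns
            k_best = int(seed_fits.argmin())
            if seed_fits[k_best] < fits[i]:          # greedy selection: best seed replaces its tree
                trees[i], fits[i] = seeds[k_best], seed_fits[k_best]
        best = trees[fits.argmin()].copy()           # determine the best tree
    return best, float(fits.min())
```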
3.3 Training MLP with TSA

This section describes in depth how to train an FF MLP with TSA. TSA is a continuous optimization algorithm, and Sect. 3.2 gives detailed information about it. The main aim is to determine the optimum parameters of the MLP; these parameters are clearly explained in Sect. 3.1. At the initialization phase, these values are started as a random vector. After that, this vector is optimized by TSA, and finally the optimized parameters of an MLP are produced by TSA. The flowchart of the proposed method is given in Fig. 3.
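Under the assumptions of the sketches above (the hypothetical `mse_objective` and `tsa_minimize` helpers), the whole training procedure reduces to one call; for instance, for the XOR13 setting of Table 1:

```python
# Train the 2-3-1 XOR MLP: a 13-dimensional search in [-10, 10] (Table 1, XOR13 row)
weights, err = tsa_minimize(lambda w: mse_objective(w, X, R),
                            dim=13, low=-10, high=10, n_trees=10, st=0.1)
print(f"final training MSE: {err:.3e}")
```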
3.4 The Computational Complexity of the Proposed
Method
The computational complexity of the proposed method is related to the structure of the MLP, the number of instances in the training data, the stand size, the maximum number of function evaluations, and the number of seeds. The Big-O notation of the computational complexity of the proposed method is given in Eq. 4:

O(TSA, MLP) = O(Maxfes × (O(MLP) + O(TSA)))   (4)

where Maxfes is the maximum number of function evaluations, O(MLP) is the Big-O notation of the MLP, calculated as in Eq. 5, and O(TSA) is the Big-O notation of TSA, calculated as in Eq. 6:

O(MLP) = O(t × (h + o))   (5)
Fig. 3 The flowchart of the proposed method
Table 1 The details of datasets
No Dataset name Number of attributes MLP structure Dimensions Weight numbers Bias numbers Range
1 XOR6 2 2-2-1 6 6 weights 0 biases [-100, 100]
2 XOR9 2 2-2-1 9 6 weights 3 biases [-10, 10]
3 XOR13 2 2-3-1 13 9 weights 4 biases [-10, 10]
4 3-bit Parity 3 3-3-1 16 12 weights 4 biases [-10, 10]
5 4-bit Enc. Dec 4 4-2-4 22 16 weights 6 biases [-10, 10]
6 3-bits XOR 3 3-7-1 36 28 weights 8 biases [-10, 10]
7 Sigmoid 1 1-15-1 46 30 weights 16 biases [-10, 10]
8 Cosine 1 1-15-1 46 31 weights 17 biases [-10, 10]
9 Sine 1 1-15-1 46 32 weights 18 biases [-10, 10]
10 Balloon 4 4-9-1 55 45 weights 10 biases [-10, 10]
11 Iris 4 4-9-3 75 63 weights 12 biases [-10, 10]
12 Breast Cancer 9 9-19-1 210 190 weights 20 biases [-10, 10]
13 Heart 22 22-45-1 1082 1035 weights 46 biases [-10, 10]
14 Banknote 4 4-9-1 55 45 weights 10 biases [-10, 10]
15 Diabetic 19 19-39-1 820 780 weights 40 biases [-10, 10]
16 Twonorm 20 20-41-1 903 861 weights 42 biases [-10, 10]
17 Ringnorm 20 20-41-1 903 861 weights 42 biases [-10, 10]
18 Spambase 57 57-115-1 6786 6670 weights 116 biases [-10, 10]
where t is the number of instances in the training data, h is the number of hidden nodes in the MLP, and o is the number of output values. In this work, h and o are smaller than t, so in the worst case O(MLP) ≈ t.

O(TSA) = O(N × NS × D)   (6)

where N is the stand size, NS is the number of seeds, and D is the dimensionality of the training dataset. In the best case NS = N/10, and in the worst case NS = N/4; NS must be smaller than N. Generally, D is smaller than t; to ease the calculation we suppose D ≈ t. So, in the worst case, O(TSA) ≈ N × N/4 × t.

The overall computational complexity of the proposed method is given in Eq. 7:

O(TSA, MLP) = O(Maxfes × (t + N × N/4 × t))   (7)
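As an illustrative instance of Eq. 7 (our own arithmetic, not a figure from the paper), take the XOR13 setting of Tables 1 and 3, where N = 10, t = 4, and Maxfes = 12,500; the worst-case bound evaluates to

O(TSA, MLP) ≈ 12,500 × (4 + 10 × (10/4) × 4) = 12,500 × 104 = 1.3 × 10⁶.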
4 Experimental Setup

In this section, we give the details of our experimental setup. The details of the datasets (the name of the dataset, the number of attributes, the MLP structure for training, the dimension, the total number of weights, the total number of biases, and the search space range) are given in Table 1. To determine the performance of the algorithms, 18 different datasets (XOR6, XOR9, XOR13, 3-bit Parity, 4-bit Encoder Decoder, 3-bits XOR, Sigmoid, Cosine, Sine, Balloon, Iris, Breast Cancer, Heart, Banknote, Diabetic, Twonorm, Ringnorm, and Spambase) are used in the experiments. The large datasets (more than 1000 training/test samples) are discussed in Sect. 5.3. There is no strict rule for selecting the number of hidden nodes; Eq. 8 is used for determining it:

H = 2 × I + 1   (8)

where H is the number of hidden nodes of the MLP and I is the number of input nodes. For example, the Banknote dataset has I = 4 inputs, so H = 2 × 4 + 1 = 9, which yields the 4-9-1 structure in Table 1. For the function approximation datasets, the number of hidden nodes is set to 15.
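A tiny sketch (Python; `mlp_structure` is our own illustrative helper, assuming the single-output structures that dominate Table 1) combining Eq. 8 with the dimension formula of Sect. 3.1:

```python
def mlp_structure(n_inputs, n_outputs=1, approx=False):
    # Eq. 8 for classification datasets; 15 hidden nodes for function approximation
    hidden = 15 if approx else 2 * n_inputs + 1
    # Dimension formula from Sect. 3.1
    dim = (n_inputs + n_outputs + 1) * hidden + 1
    return hidden, dim

print(mlp_structure(4))   # 4 inputs  -> (9, 55): the 4-9-1 Balloon/Banknote rows of Table 1
print(mlp_structure(57))  # 57 inputs -> (115, 6786): the Spambase row of Table 1
```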
In this work, we use nine algorithms. All specific param-
eters of these algorithms are listed in Table 2.
The maximum iteration numbers, the maximum numbers of function evaluations, the population sizes (for PSO, GWO, GA, ACO, ES, PBIL, and BBO), the stand size of TSA, and the colony size of ABC for every dataset are given in Table 3.
The information about training/test samples is given in
Table 4.
The datasets are mapped to the [-1, +1] space with the min–max normalization method, formulated as in Eq. 9:

X' = ((X − xmin) × (1 − (−1))) / (xmax − xmin) + (−1)   (9)
Table 2 Specific parameters of the algorithms used in this work

TSA: Search tendency (ST) = 0.1; Number of seeds = N×0.1 to N×0.25 (N: stand size)
GWO: a (linearly decreased) = 2 to 0
ABC: Colony size (CS) = N/2 (N: population size); Limit = CS×D (D: dimension of the problem)
BBO: Habitat modification probability = 1; Immigration probability bounds per gene = [0, 1]; Step size for numerical integration of probabilities = 1; Max immigration (I) and max emigration (E) = 1; Mutation probability = 0.005
PSO: Cognitive constant (C1) = 1; Social constant (C2) = 1; Inertia constant (w) = 0.3
GA (real coded, roulette wheel selection): Single-point crossover probability = 1; Uniform mutation probability = 0.01
ACO: Initial pheromone (τ0) = 1.00E-06; Pheromone update constant (Q) = 20; Pheromone constant (q0) = 1; Global pheromone decay rate (pg) = 0.9; Local pheromone decay rate (pt) = 0.5; Pheromone sensitivity (α) = 1; Visibility sensitivity (β) = 5
ES: Lambda = 10; Sigma = 1
PBIL: Learning rate = 0.05; Good population member = 1; Bad population member = 0; Elitism parameter = 1; Mutation probability = 0.1
where X' is the mapped value, X is the real value, xmax is the maximum value of the dataset, and xmin is the minimum value of the dataset.
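A one-line check of Eq. 9 (Python with NumPy; a sketch under the assumption that the normalization is applied per attribute):

```python
import numpy as np

def minmax_to_pm1(col):
    """Eq. 9: map a column of raw attribute values into [-1, +1]."""
    col = np.asarray(col, dtype=float)
    return (col - col.min()) * (1 - (-1)) / (col.max() - col.min()) + (-1)

print(minmax_to_pm1([2.0, 3.0, 4.0]))  # [-1.  0.  1.]
```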
4.1 Balloon Dataset
The balloon dataset, about blowing up a balloon, has 4
attributes (color, size, act, and age) and 20 training/test sam-
ples (4 repeated). If the balloon is inflated, the output is 1;
otherwise, the output is zero. The string input variables are
converted to binary format. The color values are yellow and
purple, the size values are small and large, the act values are
stretch and dip, and the age values are adult and child. The
files related to the dataset can be found in https://archive.ics.
uci.edu/ml/datasets/Balloons.
4.2 Iris Dataset
The Iris dataset, about class of iris plant, has 4 attributes
(sepal length, sepal width, petal length, and petal width) and
150 training/test samples. If the class is Iris Setosa the output is -1, if the class is Iris Versicolour the output is 0, and if the class is Iris Virginica the output is 1. The input variables are mapped between -1 and 1 with the min–max normalization method mentioned before. The files related
to the dataset can be found in https://archive.ics.uci.edu/ml/
datasets/Iris.
4.3 Breast Cancer Dataset
The Breast cancer dataset, about patients who have cancer or not, has 10 attributes (id, clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses) and 599 training/100 test samples. If the cancer is benign, the output is 0; if the cancer is malignant, the output is 1. The input variables are converted to continuous variables between -1 and 1 with the min–max normalization method mentioned before. The files related
to the dataset can be found in https://archive.ics.uci.edu/ml/
datasets/breast+cancer+wisconsin+(original).
4.4 Heart Dataset
The Heart dataset, about patients who have heart disease or not, has 22 attributes (binary features extracted from images) and 267 training/test samples. In this work, we use only the first 80 training/test samples. If the patient is normal, the output is 0; if the patient is abnormal, the output is 1. The
files related to the dataset can be found in https://archive.ics.
uci.edu/ml/datasets/spect+heart.
Table 3 The population/colony/stand sizes and iteration/function evaluation numbers
No Dataset name Population size Maximum iteration number Maximum function evaluations Colony size Stand size
1 XOR6 50 250 12,500 25 10
2 XOR9 50 250 12,500 25 10
3 XOR13 50 250 12,500 25 10
4 3-bit Parity 50 250 12,500 25 10
5 4-bit Enc. Dec 50 250 12,500 25 10
6 3-bits XOR 50 250 12,500 25 10
7 Sigmoid 200 250 50,000 100 50
8 Cosine 200 250 50,000 100 50
9 Sine 200 250 50,000 100 50
10 Balloon 50 250 12,500 25 10
11 Iris 200 250 50,000 100 50
12 Breast cancer 200 250 50,000 100 50
13 Heart 200 250 50,000 100 50
14 Banknote 50 250 12,500 25 10
15 Diabetic 50 250 12,500 25 10
16 Twonorm 50 250 12,500 25 10
17 Ringnorm 50 250 12,500 25 10
18 Spambase 50 250 12,500 25 10
Table 4 The training/test samples information
No Dataset name Training samples Test samples NOTrS NOTeS
1 XOR6 (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) Same as training samples 4 4
2 XOR9 (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) Same as training samples 4 4
3 XOR13 (0 0; 0 1; 1 0; 1 1) → (0; 1; 1; 0) Same as training samples 4 4
4 3-bit Parity (000; 001; 010; 011; 100; 101; 110; 111) → (0; 1; 1; 0; 1; 0; 0; 1) Same as training samples 8 8
5 4-bit Enc. Dec (0001; 0010; 0100; 1000) → (0001; 0010; 0100; 1000) Same as training samples 4 4
6 3-bits XOR (000; 001; 010; 011; 100; 101; 110; 111) → (0; 1; 1; 0; 1; 0; 0; 1) Same as training samples 8 8
7 Sigmoid x in [-3:0.1:3] x in [-3:0.05:3] 61 121
8 Cosine x in [1.25:0.05:2.75] x in [1.25:0.04:2.75] 31 38
9 Sine x in [-2π:0.1:2π] x in [-2π:0.05:2π] 126 252
10 Balloon The details are given in Sect. 4.1 Same as training samples 20 20
11 Iris The details are given in Sect. 4.2 Same as training samples 150 150
12 Breast cancer The details are given in Sect. 4.3 Not the same as training samples 599 100
13 Heart The details are given in Sect. 4.4 Same as training samples 80 80
14 Banknote The details are given in Sect. 4.5 Same as training samples 1372 1372
15 Diabetic The details are given in Sect. 4.6 Same as training samples 1151 1151
16 Twonorm The details are given in Sect. 4.7 Same as training samples 7400 7400
17 Ringnorm The details are given in Sect. 4.8 Same as training samples 7400 7400
18 Spambase The details are given in Sect. 4.9 Same as training samples 4601 4601
NOTrS: number of training samples; NOTeS: number of test samples
Table 5 Experimental results of the XOR6 dataset
XOR6 ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.98E-23 2.40E-23 2.97E-29 6.73E-29 2.34E-29 1.05E-29 1.24E-29 1.05E-29 1.05E-29
Best 1.06E-29 1.24E-29 2.97E-29 1.06E-29 1.24E-29 1.03E-29 1.24E-29 1.03E-29 1.03E-29
Worst 3.99E-22 7.16E-22 2.97E-29 5.48E-28 8.41E-29 1.09E-29 1.24E-29 1.09E-29 1.09E-29
SD 7.39E-23 1.31E-22 2.28E-44 1.16E-28 2.40E-29 1.66E-31 2.85E-45 1.91E-31 1.88E-31
Median 5.10E-26 8.29E-29 2.97E-29 2.51E-29 1.24E-29 1.05E-29 1.24E-29 1.05E-29 1.05E-29
Mean time 1.3564 1.6653 1.3946 1.4546 1.8817 1.4412 0.7653 2.2254 2.5986
Friedman rank 8.4 7.8 6.6 5.8 5.5 1.9 4.7 2.3 1.9
Manual rank 2 3 4 2 3 1 3 1 1
Wilcoxon 1.92E-06 1.73E-06 1.73E-06 2.13E-06 1.73E-06 4.91E-01 1.73E-06 2.06E-01 0.00E+00
Classification rate (%) 100 100 100 100 100 100 100 100 100
4.5 Banknote Authentication Dataset
The Banknote authentication dataset is related to whether the
banknote is valid or invalid. It has four continuous attributes
(variance of wavelet transformed image, skewness of wavelet
transformed image, curtosis of wavelet transformed image,
the entropy of image), and 1372 training/test samples. The
files related to the dataset can be found in https://archive.ics.
uci.edu/ml/datasets/banknote+authentication.
4.6 Diabetic Retinopathy Debrecen Dataset
The Diabetic Retinopathy Debrecen dataset contains infor-
mation about people who have diabetic retinopathy or
not. It has 19 continuous and integer attributes and 1151
training/test samples. The files related to the dataset can
be found in https://archive.ics.uci.edu/ml/datasets/Diabetic+
Retinopathy+Debrecen+Data+Set.
4.7 Twonorm Dataset
The Twonorm dataset is an artificial dataset that has 20 con-
tinuous attributes and 7400 training/test samples. The files
related to the dataset can be found in https://www.cs.toronto.
edu/~delve/data/twonorm/desc.html
4.8 Ringnorm Dataset
The Ringnorm dataset is an artificial dataset that has 20 con-
tinuous attributes and 7400 training/test samples. The files
related to the dataset can be found in https://www.cs.toronto.
edu/~delve/data/ringnorm/desc.html.
4.9 Spambase Dataset
The Spambase dataset is about classifying emails as spam
or not. It has 57 continuous or integer attributes and 4601
training/test samples. The files related to the dataset can be
found in https://archive.ics.uci.edu/ml/datasets/Spambase/.
5 Results and Discussion
All obtained results and the discussion about them are located in this section. The best training results and the best classification rates are highlighted with bold text and an italic background in Tables 5–17 and 21–24. Statistical tests are important for determining the significant differences between the obtained results. In this work, two different statistical tests are conducted: the Wilcoxon signed rank test and Friedman's test. The results obtained from 30 runs are used in these tests. The significance level is taken as 5% (0.05), and the p values of the Wilcoxon signed rank test and the mean rank values of Friedman's test are located in Tables 5–17 and 21–24. The large datasets (more than 1000 training/test samples) are discussed in Sect. 5.3.
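For reproducibility, the two tests can be computed along these lines (a sketch using SciPy; the per-run error arrays are placeholders, not the paper's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder data: final training MSEs of 30 independent runs per algorithm
tsa = rng.lognormal(-20, 1, 30)
pso = rng.lognormal(-19, 1, 30)
gwo = rng.lognormal(-20, 1, 30)

# Wilcoxon signed rank test: paired comparison of an opponent against the base method
print(stats.wilcoxon(tsa, pso).pvalue)

# Friedman's test over all compared algorithms (the source of the mean rank values)
print(stats.friedmanchisquare(tsa, pso, gwo))
```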
The experimental results for the XOR6 dataset are given
in Table 5. TSA, PSO, and GWO share the first position in
terms of mean training error results. The classification rate is
100% for all methods. Therefore, XOR6 is not an identifier
problem.
The experimental results for the XOR9 dataset are given
in Table 6. ABC, BBO, GA, PBIL, and GWO share the first
position in terms of mean training error results. The classifi-
cation rate is 100% for all methods. Therefore, XOR9 is not
an identifier problem.
The experimental results for the XOR13 dataset are given
in Table 7. ABC, BBO, GA, and GWO share the first position
in terms of mean training error results. The classification
rate is 100% for all methods. Therefore, XOR13 is not an
identifier problem.
Table 6 Experimental results of the XOR9 dataset
XOR9 ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.70E-09 2.30E-06 2.69E-09 1.86E-05 2.69E-09 2.89E-06 3.23E-09 1.75E-08 2.94E-09
Best 2.69E-09 1.74E-08 2.69E-09 3.11E-09 2.69E-09 2.69E-09 2.69E-09 2.70E-09 2.70E-09
Worst 2.71E-09 2.58E-05 2.69E-09 0.00028 2.69E-09 5.12E-05 5.75E-09 1.15E-07 3.67E-09
SD 3.03E-12 4.84E-06 4.21E-25 5.34E-05 1.08E-24 1.12E-05 8.34E-10 2.41E-08 2.60E-10
Median 2.69E-09 8.89E-07 2.69E-09 7.10E-07 2.69E-09 2.70E-09 2.69E-09 9.30E-09 2.83E-09
Mean time 1.4393 1.4955 1.4015 1.4390 1.8169 1.4570 0.7771 2.2102 2.7048
Friedman rank 3.9 8.4 1.7 8.2 1.7 4.6 4.0 6.8 5.6
Manual rank 1 4 1 3 1 1 1 2 2
Wilcoxon 1.73E-06 1.73E-06 1.73E-06 1.73E-06 1.73E-06 6.16E-04 6.58E-01 4.73E-06 0.00E+00
Classification rate (%) 100 100 100 100 100 100 100 100 100
Table 7 Experimental results of the XOR13 dataset
XOR13 ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.43E-13 2.20E-07 1.40E-13 1.12E-06 1.73E-09 4.51E-08 5.95E-10 1.32E-09 6.72E-10
Best 1.40E-13 4.49E-11 1.40E-13 6.51E-10 1.40E-13 1.40E-13 1.52E-13 2.54E-13 1.55E-13
Worst 1.59E-13 1.72E-06 1.40E-13 2.24E-05 2.16E-08 6.81E-07 3.96E-09 2.32E-08 5.21E-09
SD 4.41E-15 3.82E-07 2.57E-29 4.15E-06 4.17E-09 1.61E-07 1.03E-09 4.57E-09 1.50E-09
Median 1.41E-13 5.03E-08 1.40E-13 2.28E-08 1.40E-13 1.42E-13 6.08E-13 1.34E-11 3.92E-13
Mean time 2.0267 2.1107 2.0286 1.8804 2.4000 2.0502 0.9473 2.7743 3.7753
Friedman rank 2.8 8.3 1.4 8.1 3.5 4.6 5.2 5.8 5.3
Manual rank 1 5 1 6 1 1 2 4 3
Wilcoxon 1.73E-06 2.60E-06 1.73E-06 3.88E-06 5.30E-01 9.26E-01 8.13E-01 1.92E-01 0.00E+00
Classification rate (%) 100 100 100 100 100 100 100 100 100
Table 8 Experimental results of the 3-bit Parity dataset
3-bit Parity ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.49E-13 2.74E-05 1.44E-13 2.00E-05 6.77E-10 1.39E-09 1.13E-09 3.34E-07 3.87E-09
Best 1.32E-13 5.41E-08 1.44E-13 2.43E-09 1.40E-13 1.34E-13 6.04E-13 7.00E-12 4.01E-13
Worst 2.24E-13 0.00017 1.44E-13 2.99E-05 1.41E-09 2.67E-09 1.73E-08 9.13E-06 2.44E-08
SD 2.54E-14 4.03E-05 2.57E-29 1.26E-05 6.99E-10 1.12E-09 3.51E-09 1.66E-06 5.78E-09
Median 1.37E-13 7.79E-06 1.44E-13 2.99E-05 3.58E-10 1.40E-09 3.42E-12 1.43E-08 2.07E-09
Mean time 1.9296 2.3495 2.0153 2.0680 2.5710 2.2769 1.0279 2.7993 4.0024
Friedman rank 1.6 8.5 2.2 8.4 3.6 4.5 4.2 6.6 5.4
Manual rank 1 9 4 8 3 2 6 7 5
Wilcoxon 1.73E-06 1.73E-06 1.73E-06 1.73E-06 4.11E-03 6.27E-02 6.64E-04 3.32E-04 0.00E+00
Classification rate (%) 50.00 62.50 50.00 62.50 50.00 50.00 62.50 62.50 50.00
The experimental results for the 3-bit Parity dataset are
given in Table 8. ABC is the best in terms of mean training
error results. But the classification rate of ABC is 50%. ACO,
ES, PBIL, and PSO have the same classification rate (62.5%).
The best trained model cannot produce the best classification
accuracy.
The experimental results for the 4-bit Encoder Decoder
dataset are given in Table 9. All algorithms are trapped in
the same local minima, and all of them produce the same
classification rates. Thus, the 4-bit Encoder Decoder is not
an identifier problem. This problem has four output values;
therefore, our model cannot be appropriate for solving the
4-bit Encoder Decoder problem.
The experimental results for the 3-bits XOR dataset are
given in Table 10. GA is the best in terms of mean training
Table 9 Experimental results of the 4-bit Encoder Decoder dataset
4-bit Encoder Decoder ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.23E-02 6.23E-02 6.23E-02 6.25E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02
Best 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02
Worst 6.23E-02 6.24E-02 6.23E-02 6.37E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02
SD 1.09E-07 2.31E-05 2.82E-17 3.76E-04 1.40E-05 1.60E-07 1.18E-05 3.85E-06 2.32E-06
Median 6.23E-02 6.23E-02 6.23E-02 6.24E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02 6.23E-02
Mean time 2.2708 2.9314 2.4063 2.4090 2.7637 2.7607 1.1166 3.1452 4.7394
Friedman rank 2.8 8.0 1.5 8.5 5.2 2.3 7.3 5.3 4.1
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 3.41E-05 1.73E-06 1.73E-06 1.73E-06 2.41E-03 9.71E-05 1.73E-06 1.04E-03 0.00E+00
Classification rate (%) 25 25 25 25 25 25 25 25 25
Table 10 Experimental results of the 3-bits XOR dataset
3-bits XOR ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.97E-17 2.63E-06 3.14E-30 9.28E-10 7.43E-20 7.11E-15 3.84E-12 7.37E-10 1.15E-10
Best 5.93E-20 2.41E-09 3.14E-30 3.40E-15 1.36E-30 3.68E-26 4.83E-19 4.81E-13 1.94E-18
Worst 9.91E-17 2.09E-05 3.14E-30 1.38E-09 2.23E-18 1.35E-13 1.13E-10 1.37E-08 2.75E-09
SD 2.76E-17 4.62E-06 2.14E-45 3.99E-10 4.07E-19 2.78E-14 2.06E-11 2.53E-09 5.06E-10
Median 1.70E-17 5.28E-07 3.14E-30 6.91E-10 1.88E-25 5.64E-22 1.44E-15 4.22E-11 8.68E-14
Mean time 3.1274 4.3180 3.4139 3.4170 3.5980 4.1621 1.3527 3.8864 7.0275
Friedman rank 3.9 9.0 1.1 7.8 2.0 3.4 5.1 7.0 5.7
Manual rank 4 9 2 7 1 3 5 8 6
Wilcoxon 2.35E-06 1.73E-06 1.73E-06 3.41E-05 1.73E-06 5.31E-05 1.48E-02 1.96E-03 0.00E+00
Classification rate (%) 100 100 100 100 100 100 100 100 100
Table 11 Experimental results of the Sigmoid dataset
Sigmoid ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.46E-01 2.48E-01 2.46E-01 2.49E-01 2.47E-01 2.47E-01 2.47E-01 2.47E-01 2.47E-01
Best 2.46E-01 2.47E-01 2.46E-01 2.47E-01 2.46E-01 2.46E-01 2.47E-01 2.47E-01 2.47E-01
Worst 2.46E-01 2.51E-01 2.46E-01 2.53E-01 2.49E-01 2.47E-01 2.48E-01 2.48E-01 2.48E-01
SD 2.15E-05 1.13E-03 8.47E-17 1.20E-03 5.79E-04 1.67E-04 3.04E-04 3.26E-04 3.47E-04
Median 2.46E-01 2.48E-01 2.46E-01 2.49E-01 2.47E-01 2.46E-01 2.47E-01 2.47E-01 2.47E-01
Mean time 95.9492 34.0330 35.7778 31.2610 31.3496 35.0565 23.0235 47.3332 39.8589
Friedman rank 1.1 7.9 2.6 8.6 4.7 2.8 5.5 5.8 6.2
Manual rank 1 2 1 2 1 1 2 2 2
Wilcoxon 1.73E-06 2.60E-05 1.73E-06 2.60E-06 5.67E-03 3.18E-06 1.75E-02 2.99E-01 0.00E+00
Classification rate (%) 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
error results. The classification rate is 100% for all methods.
Therefore, 3-bits XOR is not an identifier problem.
The experimental results for the Sigmoid dataset are given
in Table 11. ABC, BBO, GA, and GWO share the first posi-
tion in terms of mean training error results. The classification
rate is 100% for all methods except ACO. Therefore, Sigmoid
is not an identifier problem.
The experimental results for the Cosine dataset are given
in Table 12. GWO and ABC share the best position in terms
of mean training error results. PSO is the best in terms of the
classification. Cosine is an identifier problem because every
algorithm produces different results.
The experimental results for the Sine dataset are given in
Table 13. ABC is in the best position in terms of mean training
error results. GA is the best in terms of the classification. Sine
Table 12 Experimental results of the Cosine dataset
Cosine ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.77E-01 1.86E-01 1.78E-01 1.95E-01 1.79E-01 1.76E-01 1.81E-01 1.82E-01 1.81E-01
Best 1.76E-01 1.79E-01 1.78E-01 1.82E-01 1.77E-01 1.76E-01 1.78E-01 1.78E-01 1.78E-01
Worst 1.77E-01 1.97E-01 1.78E-01 2.13E-01 1.83E-01 1.77E-01 1.85E-01 1.85E-01 1.83E-01
SD 3.48E-04 4.46E-03 5.65E-17 8.32E-03 1.51E-03 4.70E-04 1.82E-03 1.53E-03 1.34E-03
Median 1.76E-01 1.86E-01 1.78E-01 1.95E-01 1.78E-01 1.76E-01 1.81E-01 1.81E-01 1.81E-01
Mean time 88.9853 29.4610 31.5415 26.7391 26.9503 30.9972 18.8338 43.6489 33.0181
Friedman rank 1.7 7.7 3.6 8.7 3.7 1.3 5.7 6.6 5.9
Manual rank 1 4 3 5 2 1 3 3 3
Wilcoxon 1.73E-06 1.24E-05 2.13E-06 1.73E-06 1.92E-06 1.73E-06 9.26E-01 1.48E-02 0.00E+00
Classification rate (%) 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 84.21
Table 13 Experimental results of the Sine dataset
Sine ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 4.13E-01 4.52E-01 4.41E-01 4.49E-01 4.54E-01 4.53E-01 4.35E-01 4.40E-01 4.46E-01
Best 4.00E-01 4.43E-01 4.41E-01 4.30E-01 4.45E-01 4.44E-01 4.20E-01 4.22E-01 4.34E-01
Worst 4.27E-01 4.56E-01 4.41E-01 4.56E-01 4.60E-01 4.56E-01 4.47E-01 4.52E-01 4.53E-01
SD 6.20E-03 4.12E-03 1.13E-16 7.51E-03 3.99E-03 1.86E-03 6.38E-03 7.42E-03 5.31E-03
Median 4.14E-01 4.54E-01 4.41E-01 4.52E-01 4.55E-01 4.53E-01 4.35E-01 4.42E-01 4.48E-01
Mean time 88.4847 36.3617 38.3013 33.9895 34.4315 36.8322 26.4933 48.3780 42.7450
Friedman rank 1.0 7.0 3.6 6.4 8.1 7.2 2.7 3.8 5.1
Manual rank 1 7 6 4 9 8 2 3 5
Wilcoxon 1.73E-06 2.61E-04 1.15E-04 7.52E-02 2.16E-05 2.60E-06 1.64E-05 3.38E-03 0.00E+00
Classification rate (%) 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 57.94
Table 14 Experimental results of the Balloon dataset
Balloon ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.22E-17 4.13E-07 2.22E-31 8.22E-10 3.15E-23 7.44E-19 1.65E-13 1.33E-10 8.46E-12
Best 1.39E-20 3.99E-10 2.22E-31 4.68E-14 3.40E-40 1.10E-31 4.10E-21 1.43E-15 2.08E-17
Worst 5.68E-16 4.51E-06 2.22E-31 1.55E-09 9.45E-22 8.07E-18 1.87E-12 1.19E-09 1.73E-10
SD 1.09E-16 8.96E-07 4.45E-47 3.86E-10 1.72E-22 2.13E-18 4.45E-13 2.44E-10 3.25E-11
Median 3.62E-17 8.62E-08 2.22E-31 1.03E-09 9.86E-34 1.05E-22 2.55E-15 3.11E-11 6.73E-14
Mean time 10.3630 6.2161 4.8502 4.9285 4.5307 6.0645 1.9197 4.9773 12.5299
Friedman rank 4.3 8.9 1.8 8.0 1.3 2.9 5.1 6.9 5.8
Manual rank 5 9 3 8 1 2 4 7 6
Wilcoxon 2.35E-06 1.73E-06 1.73E-06 1.92E-06 1.73E-06 1.73E-06 6.04E-03 1.97E-05 0.00E+00
Classification rate (%) 50 0 50 50 50 50 50 50 0
is an identifier problem because every algorithm produces
different results.
The experimental results for the Balloon dataset are given
in Table 14. GA is in the best position in terms of mean train-
ing error results. The best trained model of ACO and TSA
cannot classify the test data. ABC, BBO, ES, GWO, PBIL,
and PSO have a 50% classification rate. Balloon is an iden-
tifier problem because every algorithm produces different
results.
The experimental results for the Iris dataset are given in Table 15. All algorithms except ACO and GA are trapped in the same local minima. ABC is the best in terms of the classification. This problem has three output values; therefore, our model cannot be appropriate for solving the Iris problem.
Table 15 Experimental results of the Iris dataset
Iris ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.50E-01 2.84E-01 2.50E-01 2.51E-01 3.52E-01 2.50E-01 2.50E-01 2.51E-01 2.52E-01
Best 2.50E-01 2.54E-01 2.50E-01 2.50E-01 2.54E-01 2.50E-01 2.50E-01 2.50E-01 2.50E-01
Worst 2.51E-01 3.32E-01 2.50E-01 2.54E-01 4.91E-01 2.50E-01 2.51E-01 2.55E-01 2.56E-01
SD 1.64E-04 2.02E-02 1.69E-16 8.61E-04 7.18E-02 2.90E-05 1.43E-04 1.03E-03 1.58E-03
Median 2.50E-01 2.84E-01 2.50E-01 2.50E-01 3.19E-01 2.50E-01 2.50E-01 2.51E-01 2.51E-01
Mean time 255.9866 37.5246 42.3152 33.8377 33.6497 41.2012 20.9516 49.5602 56.7811
Friedman rank 3.5 8.2 1.5 5.0 8.8 1.8 3.6 6.2 6.5
Manual rank 1 2 1 1 2 1 1 1 1
Wilcoxon 5.22E-06 1.92E-06 1.73E-06 4.86E-05 1.73E-06 1.73E-06 2.35E-06 8.97E-02 0.00E+00
Classification rate (%) 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 20.00
Table 16 Experimental results of the Breast Cancer dataset
Breast Cancer ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 9.36E-03 1.10E-02 2.35E-03 3.76E-02 1.82E-03 1.34E-03 2.55E-02 2.76E-02 2.92E-02
Best 3.95E-03 9.28E-03 2.35E-03 3.49E-02 1.18E-03 1.14E-03 1.24E-02 2.11E-02 1.67E-02
Worst 1.56E-02 1.41E-02 2.35E-03 4.07E-02 8.17E-03 1.55E-03 3.28E-02 3.49E-02 3.39E-02
SD 2.89E-03 1.96E-03 8.82E-19 1.59E-03 1.32E-03 1.26E-04 4.92E-03 3.18E-03 3.45E-03
Median 9.23E-03 9.64E-03 2.35E-03 3.74E-02 1.46E-03 1.35E-03 2.62E-02 2.76E-02 2.96E-02
Mean time 592.9885 178.6869 200.6069 181.4763 170.7403 208.1356 145.6965 189.3432 232.0039
Friedman rank 4.3 4.7 2.9 9.0 1.8 1.4 6.8 6.8 7.4
Manual rank 4 5 3 9 2 1 6 8 7
Wilcoxon 1.73E-06 1.73E-06 1.73E-06 1.73E-06 1.73E-06 1.73E-06 7.27E-03 2.18E-02 0.00E+00
Classification rate (%) 0 100 100 95 0 0 52 1 77
Table 17 Experimental results of the Heart dataset
Heart ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.49E-25 1.57E-17 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Best 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Worst 1.94E-23 4.70E-16 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
SD 3.55E-24 8.59E-17 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Median 1.12E-33 4.48E-33 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Mean time 562.8594 462.9802 636.0618 466.6571 457.8536 632.8333 225.8350 494.0318 356.5683
Friedman rank 6.7 6.8 4.5 4.5 4.5 4.5 4.5 4.5 4.5
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 6.10E-05 6.10E-05 1.00E+00 1.00E+00 1.00E+00 1.00E+00 1.00E+00 1.00E+00 0.00E+00
Classification rate (%) 100 0 100 100 59.09 100 0 0 100
The experimental results for the Breast Cancer dataset
are given in Table 16. GWO is the best in terms of mean
training error results, but its trained model cannot classify
the test data. ACO and BBO share the best position in terms
of the classification. According to these results, it is seen that breast cancer is a challenging problem.
The experimental results for the Heart dataset are given in
Table 17. All algorithms achieved zero error, but ACO,
PBIL, and PSO cannot classify the test data. The classifica-
tion rate of GA is 59.09%. ABC, BBO, ES, GWO, and TSA
classify the test data successfully.
Table 18 The manual ranks overview
Manual ranks ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 2 3 4 2 3 1 3 1 1
XOR9 1 4 1 3 1 1 1 2 2
XOR13 1 5 1 6 1 1 2 4 3
3-bit Parity 1 9 4 8 3 2 6 7 5
4-bit Enc. Dec 1 1 1 1 1 1 1 1 1
3-bits XOR 4 9 2 7 1 3 5 8 6
Sigmoid 1 2 1 2 1 1 2 2 2
Cosine 1 4 3 5 2 1 3 3 3
Sine 1 7 6 4 9 8 2 3 5
Balloon 5 9 3 8 1 2 4 7 6
Iris 1 2 1 1 1 1 1 1 1
Breast cancer 4 5 3 9 2 1 6 8 7
Heart 1 1 1 1 1 1 1 1 1
Total ranks 24 61 31 57 27 24 37 48 43
Table 19 The Friedman ranks overview
Friedman ranks ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 8.4 7.8 6.6 5.8 5.5 1.9 4.7 2.3 1.9
XOR9 3.9 8.4 1.7 8.2 1.7 4.6 4.0 6.8 5.6
XOR13 2.8 8.3 1.4 8.1 3.5 4.6 5.2 5.8 5.3
3-bit Parity 1.6 8.5 2.2 8.4 3.6 4.5 4.2 6.6 5.4
4-bit Enc. Dec 2.8 8.0 1.5 8.5 5.2 2.3 7.3 5.3 4.1
3-bits XOR 3.9 9.0 1.1 7.8 2.0 3.4 5.1 7.0 5.7
Sigmoid 1.1 7.9 2.6 8.6 4.7 2.8 5.5 5.8 6.2
Cosine 1.7 7.7 3.6 8.7 3.7 1.3 5.7 6.6 5.9
Sine 1.0 7.0 3.6 6.4 8.1 7.2 2.7 3.8 5.1
Balloon 4.3 8.9 1.8 8.0 1.3 2.9 5.1 6.9 5.8
Iris 3.3 8.6 1.0 8.1 4.0 2.0 7.1 5.9 5.0
Breast cancer 4.3 4.7 2.9 9.0 1.8 1.4 6.8 6.8 7.4
Heart 6.7 6.8 4.5 4.5 4.5 4.5 4.5 4.5 4.5
Total ranks 45.9 101.6 34.4 100.1 49.6 43.5 68.0 74.0 68.1
General rank 3 9 1 8 4 2 5 7 6
Table 20 The classification rates overview
Dataset ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 100 100 100 100 100 100 100 100 100
XOR9 100 100 100 100 100 100 100 100 100
XOR13 100 100 100 100 100 100 100 100 100
3-bit Parity 50.00 62.50 50.00 62.50 50.00 50.00 62.50 62.50 50.00
4-bit Enc. Dec 25 25 25 25 25 25 25 25 25
3-bits XOR 100 100 100 100 100 100 100 100 100
Sigmoid 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Cosine 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 84.21
Sine 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 57.94
Balloon 50 0 50 50 50 50 50 50 0
Iris 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 20.00
Breast cancer 0 100 100 95 0 0 52 1 77
Heart 100 0 100 100 59.0909 100 0 0 100
Mean CR 69.4 63.3 75.8 75.2 65.3 68.5 66.0 61.8 70.3
Table 21 Experimental results of the Balloon dataset for TSA
Balloon N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 4.08E-10 1.26E-10 7.19E-11 5.27E-09 5.17E-09 5.97E-09 2.67E-08 1.71E-08 2.62E-08
Best 9.43E-19 9.81E-21 4.95E-20 1.51E-16 7.38E-14 2.79E-15 3.75E-14 1.01E-14 5.59E-14
Worst 6.51E-09 3.23E-09 1.95E-09 7.56E-08 3.41E-08 1.57E-07 3.29E-07 2.01E-07 1.99E-07
SD 1.47E-09 5.91E-10 3.56E-10 1.44E-08 1.01E-08 2.86E-08 6.56E-08 4.31E-08 5.94E-08
Median 8.42E-14 4.81E-15 8.00E-14 4.06E-10 6.68E-10 5.13E-11 2.41E-09 4.89E-10 9.13E-10
Mean time 8.0720 8.0343 8.7865 5.0810 4.9332 4.7059 4.6307 4.5278 4.2975
Friedman rank 2.8 2.4 2.5 5.9 6.2 5.1 7.3 6.5 6.3
Manual rank 3 1 2 4 9 5 7 6 8
Wilcoxon 0.00E+00 2.54E-01 4.05E-01 8.19E-05 1.80E-05 6.84E-03 9.32E-06 1.24E-05 1.74E-04
Classification rate (%) 50.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00 50.00
Table 22 Experimental results of the Iris dataset for TSA
Iris N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 2.51E-01 2.51E-01 2.51E-01 2.55E-01 2.54E-01 2.53E-01 2.54E-01 2.53E-01 2.53E-01
Best 2.50E-01 2.50E-01 2.50E-01 2.51E-01 2.50E-01 2.50E-01 2.50E-01 2.50E-01 2.50E-01
Worst 2.54E-01 2.56E-01 2.55E-01 2.69E-01 2.69E-01 2.66E-01 2.76E-01 2.64E-01 2.58E-01
SD 8.92E-04 1.15E-03 1.23E-03 5.24E-03 3.94E-03 3.42E-03 4.71E-03 3.30E-03 2.25E-03
Median 2.50E-01 2.50E-01 2.50E-01 2.52E-01 2.52E-01 2.52E-01 2.53E-01 2.52E-01 2.52E-01
Mean time 122.8270 122.7315 107.2986 42.7832 41.5848 41.1891 31.9500 31.3707 30.3779
Friedman rank 2.6 2.8 3.0 6.5 5.9 6.1 6.3 5.8 5.9
Manual rank 1 1 1 2 1 1 1 1 1
Wilcoxon 0.00E+00 6.58E-01 7.19E-01 1.97E-05 2.16E-05 3.72E-05 4.29E-06 4.86E-05 1.64E-05
Classification rate (%) 16.67 27.33 9.33 30.00 20.67 7.33 14.00 30.67 4.00
Table 23 Experimental results of the Cancer dataset for TSA
Cancer N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 2.55E-02 2.49E-02 2.54E-02 3.05E-02 2.89E-02 2.93E-02 3.03E-02 3.08E-02 3.00E-02
Best 1.62E-02 1.50E-02 1.25E-02 2.44E-02 1.51E-02 2.43E-02 2.24E-02 2.17E-02 2.19E-02
Worst 3.39E-02 3.44E-02 3.31E-02 3.47E-02 3.41E-02 3.44E-02 3.59E-02 3.62E-02 3.66E-02
SD 4.25E-03 4.32E-03 5.16E-03 2.63E-03 4.34E-03 3.02E-03 3.64E-03 3.61E-03 3.69E-03
Median 2.58E-02 2.54E-02 2.60E-02 3.04E-02 2.98E-02 2.91E-02 3.08E-02 3.21E-02 3.06E-02
Mean time 442.0449 337.4356 362.2962 231.0281 253.7927 311.0903 217.8359 213.5434 227.1923
Friedman rank 3.3 2.9 3.7 6.4 5.5 5.3 5.9 6.1 6.0
Manual rank 4 2 1 9 3 8 7 5 6
Wilcoxon 0.00E+00 5.58E-01 8.61E-01 3.41E-05 1.71E-03 1.38E-03 1.15E-04 1.15E-04 3.06E-04
Classification rate (%) 77.00 74.00 0.00 89.00 77.00 81.00 76.00 77.00 87.00
Table 24 Experimental results of the 3-bit parity dataset for TSA
3-bit Parity N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 1.70E-09 1.99E-06 2.06E-08 4.60E-06 1.66E-06 8.88E-07 1.39E-05 8.09E-06 6.14E-06
Best 6.41E-13 8.06E-13 8.19E-13 1.17E-11 1.54E-11 7.66E-11 5.49E-11 2.68E-09 4.83E-12
Worst 5.34E-09 5.95E-05 5.28E-07 5.17E-05 1.44E-05 1.89E-05 7.07E-05 8.19E-05 5.04E-05
SD 1.85E-09 1.09E-05 9.60E-08 1.15E-05 3.29E-06 3.54E-06 2.18E-05 1.85E-05 1.29E-05
Median 1.26E-09 2.52E-09 2.56E-09 2.28E-08 8.53E-08 1.66E-08 1.76E-06 4.26E-07 4.15E-07
Mean time 1.8920 1.8433 1.8087 1.2872 1.2497 1.1923 1.2163 1.2349 1.1735
Friedman rank 2.2 2.8 2.8 5.5 5.5 5.2 7.2 7.0 6.9
Manual rank 1 2 3 5 6 8 7 9 4
Wilcoxon 0.00E+00 1.59E-01 5.71E-02 4.29E-06 6.34E-06 6.98E-06 2.13E-06 1.73E-06 2.13E-06
Classification rate (%) 50.00 75.00 50.00 75.00 50.00 50.00 50.00 62.50 75.00
Table 25 Experimental results of the 4-bit Encoder Decoder dataset for TSA
4-bit Encoder Decoder N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 2.06E-02 2.05E-02 2.06E-02 2.05E-02 2.06E-02 2.05E-02 2.07E-02 2.06E-02 2.06E-02
Best 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02
Worst 2.23E-02 2.11E-02 2.17E-02 2.09E-02 2.18E-02 2.06E-02 2.20E-02 2.15E-02 2.10E-02
SD 3.51E-04 1.11E-04 2.65E-04 8.37E-05 2.64E-04 1.42E-05 3.22E-04 1.85E-04 1.16E-04
Median 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02 2.05E-02
Mean time 2.5172 2.3700 2.3480 1.8992 1.8609 1.7707 1.8014 1.7484 1.7475
Friedman rank 3.6 3.1 3.4 5.4 5.2 5.1 6.9 6.3 6.0
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 0.00E+00 1.85E-01 6.88E-01 3.00E-02 3.68E-02 3.16E-02 8.31E-04 3.61E-03 9.84E-03
Classification rate (%) 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
For the overall analysis, the manual ranks overview is given in Table 18, the Friedman ranks overview is given in Table 19, and the classification rates overview is given in Table 20. ABC and GWO have the same total rank value in terms of mean training error results. BBO is the best in terms of Friedman rank and classification rank values. Every dataset has a different type of search space, and an algorithm that solves one problem well cannot solve every other problem. This situation is proved by Wolpert and Macready [27], and it is known as the no free lunch theorems for optimization. The mean classification rate of TSA is 70.3%, which is a comparable result. In these experiments, we use fixed stand sizes (10 and 50) and a fixed ST (0.1) for TSA. These two peculiar parameters affect the results of the algorithm. In the next experiment, we analyze different stand sizes and ST values for the identifier datasets (Balloon, Iris, Cancer, Parity, EncDec, Cosine, and Sine).
5.1 The Parameter Adjustment for TSA
In this section, we adjust the peculiar parameters of TSA for the 7 identifier datasets (Balloon, Iris, Cancer, Parity, EncDec, Cosine, and Sine). In the experiments, we use 10, 50, and 100 as stand sizes and 0.1, 0.5, and 0.9 as ST values. The base method for the Wilcoxon signed rank test is N = 10 and ST = 0.1.
The experimental results of the Balloon dataset for the TSA are given in Table 21. The N = 10 and ST = 0.5 variant is the best in terms of mean training error results. All variants except N = 100 and ST = 0.5 have the same classification rate (50%).
Table 26 Experimental results of the Cosine dataset for TSA
Cosine N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 1.80E-01 1.80E-01 1.81E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01
Best 1.77E-01 1.77E-01 1.78E-01 1.78E-01 1.78E-01 1.78E-01 1.78E-01 1.77E-01 1.79E-01
Worst 1.83E-01 1.83E-01 1.86E-01 1.89E-01 1.85E-01 1.86E-01 1.87E-01 1.87E-01 1.85E-01
SD 1.56E-03 1.64E-03 1.88E-03 2.13E-03 1.73E-03 1.98E-03 2.46E-03 2.12E-03 1.69E-03
Median 1.80E-01 1.80E-01 1.81E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01 1.82E-01
Mean time 73.7480 66.3652 60.0203 36.9796 55.8863 50.7422 27.1082 27.2170 25.9935
Friedman rank 3.7 4.0 4.2 5.0 5.7 5.6 5.9 5.2 5.5
Manual rank 1 1 2 2 2 2 2 1 3
Wilcoxon 0.00E+00 7.81E-01 3.39E-01 1.04E-02 1.20E-03 2.96E-03 3.32E-04 2.30E-02 5.32E-03
Classification rate (%) 36.84 36.84 100.00 36.84 36.84 100.00 44.74 100.00 36.84
Table 27 Experimental results of the Sine dataset for TSA
Sine N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Mean 4.44E-01 4.43E-01 4.40E-01 4.44E-01 4.44E-01 4.44E-01 4.46E-01 4.45E-01 4.43E-01
Best 4.29E-01 4.23E-01 4.23E-01 4.17E-01 4.27E-01 4.29E-01 4.33E-01 4.29E-01 4.24E-01
Worst 4.54E-01 4.53E-01 4.54E-01 4.54E-01 4.53E-01 4.53E-01 4.53E-01 4.54E-01 4.54E-01
SD 6.67E-03 7.45E-03 8.59E-03 8.24E-03 6.32E-03 6.96E-03 6.23E-03 6.60E-03 8.10E-03
Median 4.46E-01 4.45E-01 4.42E-01 4.47E-01 4.44E-01 4.46E-01 4.48E-01 4.47E-01 4.43E-01
Mean time 56.8857 56.5048 55.8578 37.4354 54.5648 52.6769 52.1033 48.5141 38.4682
Friedman rank 5.3 4.6 3.9 5.4 4.9 5.1 5.5 5.4 4.9
Manual rank 5 2 2 1 4 5 6 5 3
Wilcoxon 0.00E+00 4.17E-01 7.52E-02 9.26E-01 8.13E-01 9.43E-01 2.21E-01 4.41E-01 6.00E-01
Classification rate (%) 100.00 52.78 51.98 55.95 100.00 51.98 100.00 89.68 51.98
Table 28 The Friedman ranks overview for the TSA variants
Dataset N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Balloon 2.8 2.4 2.5 5.9 6.2 5.1 7.3 6.5 6.3
Iris 2.6 2.8 3.0 6.5 5.9 6.1 6.3 5.8 5.9
Cancer 3.3 2.9 3.7 6.4 5.5 5.3 5.9 6.1 6.0
Parity 2.2 2.8 2.8 5.5 5.5 5.2 7.2 7.0 6.9
EncDec 3.6 3.1 3.4 5.4 5.2 5.1 6.9 6.3 6.0
Cosine 3.7 4.0 4.2 5.0 5.7 5.6 5.9 5.2 5.5
Sine 5.3 4.6 3.9 5.4 4.9 5.1 5.5 5.4 4.9
Total FR 23.5 22.6 23.5 40.2 38.9 37.5 45.0 42.4 41.4
Table 29 The classification rates overview for the TSA variants
Dataset N=10 ST=0.1 N=10 ST=0.5 N=10 ST=0.9 N=50 ST=0.1 N=50 ST=0.5 N=50 ST=0.9 N=100 ST=0.1 N=100 ST=0.5 N=100 ST=0.9
Balloon 50.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00 50.00
Iris 16.67 27.33 9.33 30.00 20.67 7.33 14.00 30.67 4.00
Cancer 77.00 74.00 0.00 89.00 77.00 81.00 76.00 77.00 87.00
Parity 50.00 75.00 50.00 75.00 50.00 50.00 50.00 62.50 75.00
EncDec 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
Cosine 36.84 36.84 100.00 36.84 36.84 100.00 44.74 100.00 36.84
Sine 100.00 52.78 51.98 55.95 100.00 51.98 100.00 89.68 51.98
Mean CR 50.8 48.7 40.9 51.7 51.4 52.2 51.4 55.0 47.1
Table 30 The classification rates overview
Dataset ABC ACO BBO ES GA GWO PBIL PSO TSA
XOR6 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
XOR9 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
XOR13 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
3-bit Parity 50.00 62.50 50.00 62.50 50.00 50.00 62.50 62.50 62.50
4-bit Enc. Dec 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00 25.00
3-bits XOR 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Sigmoid 100.00 94.21 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Cosine 97.37 78.95 92.11 76.32 94.74 97.37 92.11 100.00 100.00
Sine 59.52 57.54 56.75 60.32 67.06 56.75 66.67 60.71 89.68
Balloon 50.00 0.00 50.00 50.00 50.00 50.00 50.00 50.00 0.00
Iris 20.67 5.33 12.00 8.67 3.33 11.33 10.00 4.67 30.67
Breast cancer 0.00 100.00 100.00 95.00 0.00 0.00 52.00 1.00 77.00
Heart 100.00 0.00 100.00 100.00 59.09 100.00 0.00 0.00 100.00
Banknote 66.69 62.76 55.54 60.28 55.54 55.54 55.54 61.95 55.54
Diabetic 51.26 58.12 47.87 51.69 51.26 43.96 54.04 54.74 52.13
Twonorm 84.58 73.61 86.16 85.95 95.95 92.95 89.85 83.64 88.59
Ringnorm 63.24 55.04 60.69 64.18 78.19 69.15 67.64 66.59 63.08
Spambase 42.19 43.06 40.62 42.84 42.84 39.40 41.88 42.21 40.69
Mean CR 67.25 62.01 70.93 71.26 65.17 66.19 64.85 61.83 71.38
The experimental results of the Iris dataset for the TSA are given in Table 22. All variants except the N = 50 and ST = 0.1 variant are trapped in the same local minima in terms of mean training error results. The N = 100 and ST = 0.5 variant is the best in terms of classification rates.
The experimental results of the Cancer dataset for the TSA are given in Table 23. The N = 10 and ST = 0.9 variant is the best in terms of mean training error results. The N = 50 and ST = 0.1 variant is the best in terms of classification rates.
The experimental results of the 3-bit Parity dataset for the TSA are given in Table 24. The N = 10 and ST = 0.1 variant is the best in terms of mean training error results. The N = 10 and ST = 0.5, N = 50 and ST = 0.1, and N = 100 and ST = 0.9 variants share the best position in terms of classification rates.
The experimental results of the 4-bit Encoder Decoder dataset for the TSA are given in Table 25. All variants are trapped in the same local minima, and all of them produce the same classification rates.
The experimental results of the Cosine dataset for the TSA are given in Table 26. The N = 10 and ST = 0.1, N = 10 and ST = 0.5, and N = 100 and ST = 0.5 variants share the best position in terms of mean training error results. The N = 10 and ST = 0.9, N = 50 and ST = 0.9, and N = 100 and ST = 0.5 variants share the best position in terms of classification rates.
The experimental results of the Sine dataset for the TSA are given in Table 27. The N = 50 and ST = 0.1 variant is the best in terms of mean training error results. The N = 10 and ST = 0.1, N = 50 and ST = 0.5, and N = 100 and ST = 0.1 variants share the best position in terms of classification rates.
Table 31 The classification rates of the Cancer dataset for the TSA variants
Run No N10 ST 0.1 N10 ST 0.5 N10 ST 0.9 N50 ST 0.1 N50 ST 0.5 N50 ST 0.9 N100 ST 0.1 N100 ST 0.5 N100 ST 0.9
1 79 91 90 87 90 88 81 88 53
2 74 74 91 85 92 82 83 87 57
3 57 80 74 88 61 86 89 83 25
4 84 81 66 33 86 94 87 77 83
5 79 89 83 88 79 90 84 87 82
6 82 65 83 74 84 95 90 84 84
7 88 36 69 77 83 83 79 48 77
8 84 78 83 92 85 83 84 79 82
9 84 84 90 82 84 81 90 90 83
10 84 88 91 79 76 81 79 96 88
11 94 77 85 89 82 85 80 88 89
12 82 79 59 84 83 82 87 87 87
13 95 89 41 84 85 86 88 84 42
14 87 77 86 92 90 81 85 92 83
15 86 76 73 92 86 78 84 88 83
16 86 78 52 83 88 84 86 82 64
17 56 88 47 82 77 86 85 78 81
18 84 78 87 89 83 88 25 75 82
19 78 81 82 91 42 82 89 77 39
20 78 80 34 85 83 84 84 70 90
21 93 85 80 76 87 92 93 81 81
22 87 86 85 45 85 85 56 81 89
23 83 80 37 86 76 82 13 81 82
24 77 80 69 89 90 91 76 87 83
25 78 82 81 83 79 73 88 79 45
26 67 26 78 87 74 91 86 84 84
27 86 81 0 74 90 81 92 53 80
28 79 83 87 83 80 85 89 85 85
29 67 87 83 81 61 70 87 86 85
30 88 61 84 88 84 90 83 63 63
Mean CR 80.87 77.33 71.67 81.60 80.83 84.63 80.07 80.67 74.37
Rank 3 7 9 2 4 1 6 5 8
Max CR 95 91 91 92 92 95 93 96 90
Min CR 56 26 0 33 42 70 13 48 25
For the overall analysis of the TSA variants, the Friedman ranks overview is given in Table 28 and the classification rates overview is given in Table 29. The N10, ST 0.5 variant is the best in terms of mean Friedman rank, and the N100, ST 0.5 variant is the best in terms of classification rate. The combined classification rates of all algorithms are given in Table 30.
According to the classification rates in Table 30, TSA is the best classifier in this experimental setting: it is the best solver on the 18 datasets of different types in terms of mean classification rate. The second is ES, the third is BBO, the fourth is ABC, the fifth is GWO, the sixth is GA, the seventh is PBIL, the eighth is ACO, and the last is PSO.
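As an aside, mean Friedman-style ranks such as those in Table 28 and in the rank rows of Tables 32–36 can be reproduced from the raw per-run errors. The following is a minimal sketch under the assumption that a runs-by-algorithms error matrix is available; it uses SciPy's rankdata, and lower rank means better:

```python
import numpy as np
from scipy.stats import rankdata

def mean_friedman_ranks(errors):
    """errors: array of shape (n_runs, n_algorithms) holding the
    training error of each algorithm in each run. Every run (row)
    is ranked separately, ties receive the average rank, and the
    per-run ranks are then averaged per algorithm."""
    ranks = np.apply_along_axis(rankdata, 1, errors)
    return ranks.mean(axis=0)

# Toy example with 3 runs of 3 algorithms:
errors = np.array([[0.10, 0.20, 0.15],
                   [0.05, 0.30, 0.05],
                   [0.20, 0.10, 0.30]])
print(mean_friedman_ranks(errors))  # [1.5, 2.33..., 2.17...]
```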
5.2 The Deep Run Analyses for TSA
In the aforementioned experiments, we used the best-trained model to classify the test data. A closer look shows that the best-trained model is not always the best in the test phase. The classification rates of the Cancer dataset for the TSA variants are given in Table 31.
The maximum classification rate is 96%, obtained with N100 and ST 0.5. According to these results, for better classification we should not use only the best-trained model; the models obtained from the other runs should also be examined. The convergence graph for the Cancer dataset is given in Fig. 4.

Fig. 4 The convergence graph for the Cancer dataset
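A minimal sketch of this deep-run inspection is given below; train and accuracy are hypothetical helpers standing in for one training run of the optimizer and for the test-set evaluation, respectively:

```python
def deep_run_analysis(train, accuracy, n_runs=30):
    """train(seed) -> (model, training_error); accuracy(model) -> test
    classification rate. Collects all runs and contrasts the model
    chosen by training error with the model chosen by test accuracy."""
    results = []
    for seed in range(n_runs):
        model, train_err = train(seed)
        results.append((train_err, accuracy(model), model))
    best_by_train = min(results, key=lambda r: r[0])
    best_by_test = max(results, key=lambda r: r[1])
    # The two selections often differ: the lowest training error
    # does not guarantee the highest classification rate.
    return best_by_train, best_by_test
```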
According to Fig. 4, TSA achieves the best mean squared error in the training phase for the Cancer dataset. ES, PSO, PBIL, and ACO are trapped in local minima early, whereas ABC, BBO, GA, GWO, and TSA continue the search process until the termination criterion is met. This graph proves the success of TSA once again.
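Convergence curves such as Fig. 4 can be reproduced by logging the best training MSE at every iteration of each optimizer. A minimal matplotlib sketch, assuming a history dictionary of such logs, is:

```python
import matplotlib.pyplot as plt

def plot_convergence(history, title="Cancer dataset"):
    """history: dict mapping an algorithm name to the list of best
    training MSE values recorded at each iteration of one run."""
    for name, curve in sorted(history.items()):
        plt.plot(curve, label=name)
    plt.yscale("log")            # errors differ by orders of magnitude
    plt.xlabel("Iteration")
    plt.ylabel("Best training MSE")
    plt.title(title)
    plt.legend()
    plt.show()
```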
In this type of study, there are two main kinds of limitation: general and method-specific. The general limitations apply to all of the compared methods: determining the population size, trapping into local optima, and setting an effective exploration-exploitation ratio. The specific limitations of TSA concern its peculiar parameters: the search tendency (ST) and the number of seeds (NS). The search tendency controls the scheme used to create new candidate solutions, and the number of seeds controls the exploitation around the current solutions. In this work, we analyzed the search tendency parameter, and the experimental results showed that 0.5 is a good value for it. This means that the two candidate-solution creation equations are used half-and-half.
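For concreteness, a sketch of the seed-creation step that ST controls is given below. It follows the dimension-wise update rules reported for the original TSA (Kiran, 2015), so it is an illustration under that assumption rather than the exact implementation used here:

```python
import numpy as np

def generate_seed(trees, i, best, st, low, high):
    """Create one seed for tree i. trees: (n_trees, dim) stand,
    best: (dim,) best tree so far, st: search tendency in (0, 1)."""
    n_trees, dim = trees.shape
    r = np.random.randint(n_trees - 1)
    r = r + 1 if r >= i else r                  # random tree other than i
    alpha = np.random.uniform(-1.0, 1.0, dim)   # scaling factors in [-1, 1]
    use_best = np.random.rand(dim) < st         # per-dimension ST decision
    seed = np.where(use_best,
                    trees[i] + alpha * (best - trees[r]),      # exploitation
                    trees[i] + alpha * (trees[i] - trees[r]))  # exploration
    return np.clip(seed, low, high)             # respect the search bounds
```

With ST = 0.5 the two branches are taken about equally often, which is exactly the half-and-half usage noted above.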
5.3 The Experiments on Large Datasets
In this section, the experiments on five large datasets are given. The experimental results of the Banknote, Diabetic, Twonorm, Ringnorm, and Spambase datasets are presented in Tables 32, 33, 34, 35, and 36, respectively.
The experimental results for the Banknote dataset are given in Table 32. BBO achieves the best mean training error, and ABC produces the maximum classification rate.
The experimental results for the Diabetic dataset are given in Table 33. GA attains the lowest single-run training error, and ACO produces the maximum classification rate.
The experimental results for the Twonorm dataset are given in Table 34. GA attains the lowest single-run training error and also produces the maximum classification rate.
The experimental results for the Ringnorm dataset are given in Table 35. GA again attains the lowest single-run training error and produces the maximum classification rate.
The experimental results for the Spambase dataset are given in Table 36. All algorithms reach the optimum mean training error, and ACO produces the maximum classification rate.
Table 32 Experimental results of the Banknote dataset
Banknote ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 4.39E−20 7.84E−24 1.38E−87 1.02E−53 1.88E−79 1.31E−71 7.72E−69 1.95E−41 6.79E−55
Best 4.81E−34 5.00E−36 1.38E−87 2.37E−70 1.38E−87 1.51E−87 2.95E−87 1.80E−51 1.61E−74
Worst 6.43E−19 1.71E−22 1.38E−87 4.12E−53 4.96E−78 2.04E−70 2.30E−67 5.28E−40 2.04E−53
SD 1.55E−19 3.22E−23 9.08E−103 1.49E−53 9.10E−79 4.79E−71 4.19E−68 9.65E−41 3.72E−54
Median 1.35E−22 2.04E−28 1.38E−87 8.53E−61 1.65E−87 2.43E−79 9.67E−78 1.46E−45 6.19E−66
Mean time 91.9752 5.7195 4.7103 4.7717 4.5926 5.5933 2.8472 4.8154 42.1367
Friedman rank 8.9 8.1 1.0 5.9 2.1 3.1 3.8 7.0 5.1
Manual rank 8 7 1 5 1 2 3 6 4
Wilcoxon 1.73E−06 1.73E−06 1.73E−06 0.001287 1.73E−06 2.13E−06 1.73E−06 1.73E−06 0
Classification rate (%) 66.69 62.76 55.54 60.28 55.54 55.54 55.54 61.95 55.54
Table 33 Experimental results of the Diabetic dataset
Diabetic ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 6.81E−02 2.08E−01 2.14E−05 1.52E−01 3.44E−02 7.82E−02 1.39E−01 1.42E−01 1.43E−01
Best 4.07E−03 1.58E−01 2.14E−05 1.21E−01 6.77E−06 3.67E−02 1.06E−01 9.14E−02 6.54E−02
Worst 1.50E−01 2.51E−01 2.14E−05 1.84E−01 1.00E−01 1.31E−01 1.87E−01 1.72E−01 1.76E−01
SD 3.01E−02 2.48E−02 6.89E−21 1.61E−02 2.95E−02 2.20E−02 1.87E−02 2.38E−02 2.29E−02
Median 6.24E−02 2.11E−01 2.14E−05 1.58E−01 3.26E−02 7.77E−02 1.41E−01 1.46E−01 1.44E−01
Mean time 298.6501 57.9424 47.1950 47.5073 37.6225 61.5118 17.7858 38.4946 195.2061
Friedman rank 3.3 8.8 1.1 7.1 2.3 3.5 6.0 6.5 6.4
Manual rank 3 9 2 8 1 4 7 6 5
Wilcoxon 2.35E−06 1.92E−06 1.73E−06 0.135908 1.73E−06 1.92E−06 0.158855 0.975387 0
Classification rate (%) 51.26 58.12 47.87 51.69 51.26 43.96 54.04 54.74 52.13
Table 34 Experimental results of the Twonorm dataset
Twonorm ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 1.43E−03 1.27E−01 9.37E−107 3.78E−04 6.10E−59 5.06E−54 3.19E−14 1.50E−06 4.01E−08
Best 1.99E−17 4.18E−02 9.37E−107 1.29E−20 1.10E−139 1.63E−94 4.83E−31 9.20E−18 1.28E−23
Worst 4.30E−02 2.01E−01 9.37E−107 7.26E−03 1.83E−57 1.49E−52 9.34E−13 1.43E−05 1.13E−06
SD 7.85E−03 3.93E−02 2.46E−122 1.48E−03 3.34E−58 2.72E−53 1.70E−13 4.00E−06 2.06E−07
Median 9.37E−17 1.22E−01 9.37E−107 2.87E−08 1.48E−107 4.42E−69 1.18E−21 4.17E−09 3.32E−14
Mean time 2112.1370 97.5102 87.5008 87.7857 77.1834 102.8828 55.6546 77.8253 927.3152
Friedman rank 5.5 9.0 1.5 7.3 1.6 2.9 4.2 7.2 5.7
Manual rank 8 9 2 6 1 3 4 7 5
Wilcoxon 0.007731 1.73E−06 1.73E−06 0.000529 1.73E−06 1.73E−06 6.32E−05 9.32E−06 0
Classification rate (%) 84.58 73.61 86.16 85.95 95.95 92.95 89.85 83.64 88.59
Table 35 Experimental results of the Ringnorm dataset
Ringnorm ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 2.13E−02 1.69E−01 6.51E−28 1.03E−01 8.88E−10 5.60E−12 3.18E−02 7.47E−02 5.43E−02
Best 6.21E−15 1.00E−01 6.51E−28 5.52E−02 1.44E−32 6.81E−24 3.46E−07 3.05E−02 2.61E−03
Worst 5.02E−02 2.26E−01 6.51E−28 1.41E−01 2.66E−08 1.48E−10 1.06E−01 1.08E−01 1.12E−01
SD 2.43E−02 3.20E−02 9.12E−44 2.33E−02 4.86E−09 2.72E−11 2.47E−02 2.23E−02 2.72E−02
Median 1.32E−03 1.68E−01 6.51E−28 1.06E−01 3.39E−18 6.39E−17 3.45E−02 7.82E−02 5.05E−02
Mean time 2115.7338 97.4103 87.4961 87.9626 76.6757 102.8440 55.4910 77.9352 928.1096
Friedman rank 4.4 9.0 1.1 7.7 2.3 2.7 5.0 6.9 5.9
Manual rank 5 3 6 1 7 9 8 4 2
Wilcoxon 4.07E−05 1.73E−06 1.73E−06 3.18E−06 1.73E−06 1.73E−06 0.003854 0.002255 0
Classification rate (%) 63.24 55.04 60.69 64.18 78.19 69.15 67.64 66.59 63.08
Table 36 Experimental results of the Spambase dataset
Spambase ABC ACO BBO ES GA GWO PBIL PSO TSA
Mean 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Best 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Worst 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
SD 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Median 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00
Mean time 2751.0268 468.9487 418.0443 428.8673 344.7158 538.8551 154.8249 341.6761 1754.3480
Friedman rank 1 1 1 1 1 1 1 1 1
Manual rank 1 1 1 1 1 1 1 1 1
Wilcoxon 1 1 1 1 1 1 1 1 0
Classification rate (%) 42.19 43.06 40.62 42.84 42.84 39.40 41.88 42.21 40.69
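The Wilcoxon rows of Tables 32–36 report pairwise signed-rank p-values of each opponent against TSA over the 30 runs (the 0 in the TSA column marks the reference algorithm). A minimal SciPy sketch, assuming the per-run training errors are available, is:

```python
from scipy.stats import wilcoxon

def wilcoxon_vs_reference(errors_by_alg, ref="TSA"):
    """errors_by_alg: dict mapping algorithm name to its list of
    per-run training errors. Returns a two-sided p-value for every
    opponent compared with the reference algorithm."""
    ref_errors = errors_by_alg[ref]
    pvalues = {}
    for name, errs in errors_by_alg.items():
        if name == ref:
            continue
        try:
            _, p = wilcoxon(errs, ref_errors)
        except ValueError:
            p = 1.0  # all paired differences are zero, as for Spambase
        pvalues[name] = p
    return pvalues
```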
6 Conclusion

In this paper, an FF MLP ANN is trained by TSA for the first time. TSA is a population-based swarm intelligence algorithm with two peculiar parameters, ST and NS. ST controls the exploration and exploitation progress of the algorithm, and NS provides better intensification around the current solutions. The FF MLP ANN is converted to a vector, and TSA optimizes this vector. Eighteen different datasets (XOR6, XOR9, XOR13, 3-bit Parity, 4-bit Encoder Decoder, 3-bits XOR, Sigmoid, Cosine, Sine, Balloon, Iris, Breast Cancer, Heart, Banknote, Diabetic, Twonorm, Ringnorm, and Spambase) are used in the experiments. TSA is compared with PSO, GWO, GA, ACO, ES, PBIL, ABC, and BBO. The experimental results show that TSA is the best in terms of mean classification rates and outperforms its opponents on the 18 problems. The obtained results are confirmed by two statistical tests, the Wilcoxon signed-rank test and the Friedman test. Generally speaking, swarm-based methods suffer from low exploration, but TSA has an efficient exploration mechanism. In future studies, improved versions of TSA could be used for training FF MLP ANNs.
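The vector encoding mentioned above (the FF MLP ANN converted to a single vector that TSA optimizes) can be sketched as follows; a single hidden layer with sigmoid activations and an MSE fitness are assumptions made here for illustration, not necessarily the exact network configuration used in every experiment:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(vector, n_in, n_hid, n_out):
    """Split a flat parameter vector into MLP weights and biases."""
    a = n_in * n_hid
    b = a + n_hid
    c = b + n_hid * n_out
    w1 = vector[:a].reshape(n_in, n_hid)    # input -> hidden weights
    b1 = vector[a:b]                        # hidden biases
    w2 = vector[b:c].reshape(n_hid, n_out)  # hidden -> output weights
    b2 = vector[c:]                         # output biases
    return w1, b1, w2, b2

def mse_fitness(vector, X, y, n_hid):
    """Fitness of one tree/seed position: the mean squared error of
    the decoded network on the training set (to be minimized)."""
    n_in, n_out = X.shape[1], y.shape[1]
    w1, b1, w2, b2 = decode(vector, n_in, n_hid, n_out)
    hidden = sigmoid(X @ w1 + b1)
    output = sigmoid(hidden @ w2 + b2)
    return float(np.mean((output - y) ** 2))
```

The dimension of the search space seen by the optimizer is then n_in*n_hid + n_hid + n_hid*n_out + n_out.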
Funding The authors wish to thank the Scientific Research Projects Coordinatorship at Selcuk University and the Scientific and Technological Research Council of Turkey for their institutional support.
Compliance with ethical standards
Conflicts of interest The authors declare that they have no conflict of
interest.