Evolutionary Intelligence
https://doi.org/10.1007/s12065-021-00590-1
RESEARCH PAPER
Chaotic vortex search algorithm: metaheuristic algorithm for feature selection
Farhad Soleimanian Gharehchopogh¹ · Isa Maleki¹ · Zahra Asheghi Dizaji¹
¹ Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran
Corresponding author: Farhad Soleimanian Gharehchopogh, bonab.farhad@gmail.com
Received: 2 July 2020 / Revised: 2 November 2020 / Accepted: 2 March 2021
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021
Abstract
The Vortex Search Algorithm (VSA) is a meta-heuristic algorithm proposed by Dogan and Olmez in 2015, inspired by the vortex phenomenon. Like other meta-heuristic algorithms, the VSA has a major problem: it can easily get stuck in local optimum solutions and produce solutions with a slow convergence rate and low accuracy. Thus, chaos theory has been added to the search process of the VSA in order to speed up global convergence and gain better performance. In the proposed method, various chaotic maps are considered for improving the VSA operators and helping to control both exploitation and exploration. The performance of this method was evaluated on 24 UCI standard datasets. In addition, it was evaluated as a Feature Selection (FS) method. The simulation results showed that chaotic maps (particularly the Tent map) are able to enhance the performance of the VSA. Furthermore, the results clearly showed the ability of the proposed method to attain the optimal feature subset with the utmost accuracy and the fewest features. With 36 features, the accuracy of the VSA and the proposed model is 77.49% and 92.07%, respectively; with 80 features, 36.37% and 71.76%; and with 3343 features, 95.48% and 99.70%. Finally, the results on a real application showed that the proposed method achieves a higher accuracy than the other algorithms.
Keywords Vortex Search Algorithm · Feature Selection · Chaotic Maps · Exploration · Exploitation · Accuracy
1 Introduction
Metaheuristic optimization algorithms have been proposed over the past decades and applied extensively to complicated problems [1, 2]. The essential goal in optimization is to find the candidate problem variables that minimize or maximize the objective function, based on global and local search [3, 4]. To meet the state-of-the-art goals in any problem, most such algorithms were applied as an attempt to establish an approximate technique for attaining the optimum solution [5, 6]. A number of well-
known new nature-inspired algorithms include the Invasive
Weed Optimization (IWO) [7], the butterfly optimization
algorithm (BOA) [8], the Artificial Bee Colony (ABC) [9],
the Fruit Fly Optimization Algorithm (FOA) [10], the Firefly
Algorithm (FA) [11], the Krill Herd (KH) algorithm [12],
the Differential Evolution (DE) algorithm [13], the Flower
Pollination Algorithm (FPA) [14], etc. The difference in the natural phenomena they imitate is an essential factor in why these algorithms deliver different levels of performance [15, 16]. Moreover, this factor might be the reason why some algorithms can produce the best answer for specific problems while others cannot. Thus, it is due to this limitation that no single algorithm is good enough for solving every kind of problem.
During the past decade, an arithmetic framework and scientific branch, namely chaos, has been proposed and has become deeply connected with different scientific fields. Chaos involves three major dynamic properties: the quasi-stochastic property, sensitivity to initial conditions, and ergodicity. The application of chaos theory in optimization research has attracted a lot of attention over recent years. The Chaotic Optimization Algorithm (COA) [17] is among the applications of chaos, and uses the nature of chaotic sequences. It has been shown that if random variables are replaced with chaotic variables, the performance of the COA can be enhanced. Therefore, in the literature, there are a
number of studies on the hybridization of chaos with other
algorithms for the purpose of improving the performance of
COA. Some instances include the chaotic ACO [18], chaotic DE algorithm [19, 20], chaotic KH algorithm [21, 22], chaotic FPA [23], chaotic genetic algorithm [24, 25], chaotic PSO [26–28], chaotic gravitational search [29–31], chaotic bat algorithm [32], etc.
FS is the procedure of selecting a subset of features from an original feature set; it may be considered the most important pre-processing instrument for solving classification problems [33]. Finding a superior subset of features is a quite complicated challenge, and it is decisive for the final classification error rates. The finalized feature subset should retain high rates of classification accuracy. The purpose is to choose an applicable subset of d features from a set of D features (d < D) in a given dataset [34]. D consists of all features present in a given dataset; it can encompass redundant, noisy, and misleading features. An exhaustive search over the whole solution space usually takes a lot of time and often cannot be implemented in practice. To remedy this, the FS strategy of maintaining the best subset of d relevant features is taken into consideration. Inappropriate features are not only useless but can actually worsen the classification performance. If irrelevant features are deleted, the computational efficiency can be improved and the classification accuracy increased.
According to the search techniques for feature subsets, current FS strategies can be classified into two classes: the filter-based approach and the wrapper-based approach. The filter method depends fundamentally on general characteristics of the dataset to assess and choose feature subsets, without taking any particular learning approach into account. Thus, the effectiveness of this approach depends predominantly on the dataset itself instead of on the classifier [35, 36]. The wrapper method utilizes a classification algorithm to assess feature subsets and adopts a search strategy to look for ideal subsets. It often leads to better results, since the wrapper approach incorporates a classifier into the evaluation or search process [37]. A minimal sketch contrasting the two approaches is given below.
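As an illustration only (this paper itself uses the VSA as the wrapper search engine), the following sketch contrasts a filter criterion computed from the data alone with a wrapper evaluation that consults the classifier itself; scikit-learn, the breast-cancer dataset, and the 0.1 variance threshold are assumptions made for the example.

```python
# A minimal sketch of filter- vs. wrapper-style feature-subset evaluation.
# Dataset, threshold, and classifier settings are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Filter approach: score features from data statistics alone, no classifier.
filt = VarianceThreshold(threshold=0.1).fit(X)
filter_mask = filt.get_support()          # boolean mask of kept features

# Wrapper approach: evaluate a candidate subset with the classifier itself.
def wrapper_score(mask):
    knn = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(knn, X[:, mask], y, cv=5).mean()

print("filter keeps", filter_mask.sum(), "of", X.shape[1], "features")
print("wrapper CV accuracy on that subset:", round(wrapper_score(filter_mask), 4))
```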
Each meta-heuristic algorithm has a unique search strategy. Meta-heuristic algorithms find optimal solutions based on their own strategies, such as the balance between exploration and exploitation. Furthermore, the VSA has advantages such as a small number of parameters and easy implementation. Here, the VSA was embedded with chaotic maps to obtain a better compromise between exploitation and exploration. This paper uses hybrid methods based on CMs with the VSA for FS. The major contribution of the current paper is a CMs-based model of the VSA, proposed to enhance the performance of the VSA. In the proposed methods, a chaotic search is followed to choose the ideal feature subset, one that maximizes the classification accuracy and minimizes the feature subset length. Ten one-dimensional CMs are adopted and substituted for the random movement parameters of the VSA. The performance of the proposed methods is tested on 24 benchmark datasets. Similarly, the performance of the VSA is compared with that of seven other metaheuristic algorithms. Based on the mean criterion, the proposed method obtains better solutions using the Tent map in comparison with the other metaheuristic algorithms.
The main contributions of this paper are as follows:

• The VSA and chaotic maps are combined for FS.
• The proposed method has faster convergence than the other algorithms, with better convergence results on different datasets.
• The proposed method has been evaluated on 24 UCI standard datasets.
• The best VSA variant is State2 with the VSAC101 mode, obtained by using the Tent map.
• The proposed method has been tested on author identification datasets.
• The obtained results confirmed the validity and superiority of the proposed method in comparison to other algorithms.
The organization of this paper is as follows: Sect. 2 gives related works about chaotic maps and FS. Section 3 provides an introduction to the VSA. The detailed description of the proposed method is provided in Sect. 4, while the experimental results and discussion of the proposed VSA are provided in Sect. 5. In Sect. 6, the proposed method is applied to a real application (i.e., author identification). Finally, the conclusion and future work are discussed in Sect. 7.
2 Related works
The Moth Swarm Algorithm (MSA) is among the most recently developed nature-inspired heuristics for optimization problems. However, its shortcoming is a slow convergence rate, and chaos theory has been incorporated into it to eliminate this drawback. In [38], ten CMs were embedded within the MSA to find the ideal number of prospectors and increase the exploitation of the most promising solutions. The proposed method was applied to seven famous benchmark test functions. The simulation results showed that CMs can enhance the performance of the original MSA with regard to convergence speed. In addition, the sinusoidal map was found to be the best map for enhancing the performance of the MSA.
The Cuckoo Search Algorithm (CSA) is a nature-inspired metaheuristic algorithm that imitates the obligate brood parasitic behavior of the cuckoo species. The method has been proven to have promising overall performance in solving optimization problems. Chaotic mechanisms were incorporated into the CSA to make use of the dynamic features of chaos theory and further improve its search performance. However, in the chaotic CSA (CCSA) [39], only the best CM was applied in a single search per iteration, which restrained the exploitation capability of the search. The researchers considered utilizing multiple CMs at the same time to perform the local search within the neighborhood of the global best solution found by the CSA. To attain this goal, three kinds of multiple chaotic CSAs (MCCSA) were proposed by incorporating several CMs into the chaotic local search (CLS) in parallel, in a random or selective manner. The overall performance of MCCSA was validated on 48 widely used benchmark optimization functions. The experimental results indicated that the MCCSAs are generally better than the CCSAs, and that MCCSA-P, which makes use of the CMs in parallel, has the best quality among all sixteen variants of the CSAs.
In [40], a chaos-based Crow Search Algorithm (CCSA) has been proposed to solve fractional optimization problems (FOPs). The proposed CCSA integrated chaos theory (CT) into the CSA for the purpose of refining the global convergence speed and enhancing the exploration/exploitation tendencies. CT was utilized to tune the standard CSA parameters, which yielded four versions, and the highest-quality chaotic variant was investigated. The incorporation of CT was able to improve the overall performance of the proposed CCSA and allow the search process to achieve better speeds. The overall performance of the CCSA method was proven on twenty fractional benchmark problems. Furthermore, it was also tested on a fractional economic environmental power dispatch problem, attempting to minimize the ratio of the total emissions to the total fuel cost. Ultimately, the proposed CCSA was compared with PSO, the standard CSA, FA, the Dragonfly Algorithm (DA), and GWO. In addition, the efficiency of the proposed CCSA was justified by the non-parametric Wilcoxon signed-rank test. The experimental results proved that the proposed CCSA performs better than similar algorithms with regard to efficiency and reliability.
In [41], a new hybrid algorithm for solving optimization problems based on a chaotic ABC and chaotic simulated annealing has been proposed. The chaotic ABC explores new locations chaotically, and chaos may additionally improve the exploration of the search space. In effect, the proposed hybrid method combines the local search accuracy of simulated annealing with the global search capability of ABC. Moreover, a distinct method was used for producing the initial population. The initial population is of great significance for population-based techniques, because it directly influences the convergence rate and the quality of the results. The method was evaluated using 12 benchmark functions. The results were compared with those of the artificial bee colony algorithm, the hybrid algorithm of ABC and simulated annealing, and PSO. The simulation results demonstrate the effectiveness of the proposed method.
In [42], an adaptive chaotic Bacterial Foraging Optimization (BFO) is presented. The improved BFO consists of two new features: an adaptive chemotaxis step setting, and a chaotic perturbation operation in all chemotactic events. The former feature results in a fast convergence rate and acceptable convergence accuracy, while the latter further allows the search to avoid local optima and attain better convergence accuracy. First, an adaptive exponentially decreasing chemotaxis step is presented, in which the natural exponential function variable is a function of the iterations and of the nutritive ratio between the current bacterium position and the best bacterium position in each iteration. Second, when each bacterium reaches a new position through swim behavior, a chaotic perturbation is applied to avoid it being trapped in local optima. On five benchmark functions, chaotic BFO is shown to have better performance than the original BFO and BFO with a linearly decreasing chemotaxis step (BFO-LDC).
Jia etal. [43] proposed an effective memetic DE algo-
rithm (DECLS), which makes use of a CLS with a ‘shrink-
ing’ strategy. The shrinking strategy for the CLS search
space was introduced in that paper. In addition, the local
search length was determined according to the feedback of
the fitness of the objective functions in a dynamic manner
in order to save the function evaluations. Furthermore, the
parameter settings of the DECLS were adapted in the pro-
cess of evolution so as to further enhance the optimization
efficiency. The hybrid form of the DE and a CLS as well as
a parameter adaptation mechanism seemed very reasonable.
The CLS is helpful in enhancing the local search capability
of DE, whereas the parameter adaptation can improve the
global optimization quality. The CLS is helpful in improving
the optimization performance of the canonical DE through
exploring a very large search space in the early phases so
as to avoid the occurrence of premature convergence, and
exploiting a tiny region in later phases to refine the final-
ized solutions. In addition, the settings of parameters in the
DECLS were controlled adaptively to further improve the
search capability. To assess the efficiency and effectiveness
of the proposed DECLS algorithm, it was compared with
four state-of-the-art DE variants and the IPOP-CMA-ES
algorithm on a set of 20 selected benchmark functions. The
findings showed that the DECLS is significantly superior, or
at least comparable, to other optimizers with regard to the
convergence performance and solution accuracy. Further-
more, the DECLS was shown to have certain advantages in
terms of solving problems with high dimensions.
Evolutionary Intelligence
1 3
In [44], a modified DE algorithm based on the Opposi-
tion-based Learning (OBL) and a chaotic sequence named
the OBL Chaotic DE (OBL-CDE) was proposed. The pro-
posed OBL-CDE algorithm is different from the basic DE
in two ways. The first one is related to the generation of
the initial population that follows the OBL rules, while the
second one is the dynamic adaption of the scaling factor F
through using the chaotic sequence. The numerical results
obtained by the OBL-CDE compared to the results of DE
and the opposition-based DE algorithms on 18 benchmark
functions indicated that the OBL-CDE is capable of finding
superior solutions and maintaining reasonable convergence rates at the same time.
The standard Glowworm Swarm Optimization (GSO)
shows poor ability in global search and easily gets trapped
into local optima. A Quantum GSO algorithm based on
CMs was proposed [45] in order to solve such problems.
First of all, a chaotic sequence was generated to initialize
the population. This process results in higher probability to
cover more local optimal areas, and provides the ground for
further optimization and tuning. Next, the quantum behavior
was applied to the elite population, which made it possible
for individuals to locate any position of the solution space
randomly with a certain probability. This greatly enhanced
the capability of the algorithm in global search and avoid-
ing local optima. Finally, it adopted the single dimension
loop swimming instead of the original fixed-step movement
mode. This not only improved the solution precision and
convergence speed, but also resolved the GSO's excessive sensitivity to the step size, and indirectly enhanced the
robustness of the algorithm. The simulation results indicated
that the proposed method was feasible and effective.
The Fruit Fly Optimization Algorithm (FOA) has recently been proposed as a metaheuristic technique inspired by the behavior of fruit flies. Mitic et al. [46] improved the standard FOA by introducing a novel parameter in combination with chaos. The performance of this chaotic FOA (CFOA) was studied on ten famous benchmark problems using 10 different CMs. In addition, comparison studies with the basic FOA, FOA with Levy flight distribution, and other recently published chaotic algorithms were made. Statistical findings on each optimization task showed that the CFOA achieves a very high convergence rate. In addition, the CFOA was compared with recently developed chaos-enhanced algorithms such as the chaotic bat algorithm, chaotic accelerated PSO, chaotic FA, chaotic ABC, and chaotic CSA. The research findings generally indicate that FOA with the Chebyshev map shows superiority over similar methods in terms of the reliability of global optimality and the algorithm success rate.
In addition, Gandomi et al. [47] proposed a chaos-enhanced version of the accelerated PSO. Some other instances of chaos-enhanced metaheuristic algorithms include the chaotic Genetic Algorithm [48], Chaotic PSO [49, 50], Chaotic Salp Swarm Algorithm [51], Chaotic Elephant Herding Optimization (EHO) algorithm [52], Chaotic Bat Algorithm [53], Chaotic FOA [46], Chaotic GSO Algorithm [45, 54], Chaotic Black Hole algorithm [55], Chaotic Simulated Annealing PSO Algorithm (CSAPSO) [56], Chaotic Social Spider Optimization Algorithm [57], Chaotic Bean Optimization Algorithm [58], Chaotic Quantum CSA [59], Chaotic Antlion Algorithm [60], Chaotic Hybrid Cognitive Optimization Algorithm [61], Chaotic Simulated Annealing [62], Chaotic-Based Quantum Genetic Algorithm [63], Chaotic Teaching Learning Algorithm [64], Chaotic DE algorithm [65], Chaotic Grey Wolf Optimization Algorithm [66], Chaotic Fractal Search [67], Chaotic Brain Storm Optimization Algorithm [68], Multi-Objective CCSA [69], Chaotic Grasshopper Optimization Algorithm [70], Chaotic Krill Herd [21, 71, 72], Chaotic DE [73], Chaotic Firefly Algorithm [74, 75], Chaotic Starling PSO Algorithm [76], Chaotic CCSA [77], Chaotic Grey Wolf Optimization Algorithm [78], etc. Table 1 shows a comparison of different models of meta-heuristic algorithms based on chaotic maps.
3 Vortex search algorithm

The VSA is a recent metaheuristic optimization algorithm inspired by the vortical flow of stirred fluids. It consists of simple generation phases, similar to other single-solution algorithms: in every iteration, a new population is generated entirely from the current single solution. Furthermore, how each update and search step moves through the search space is an essential aspect of a single-solution method. In the VSA, this balance is achieved by using a vortex-like search pattern, which is modeled by a number of nested circles. The VSA can be briefly described in the four steps that follow [79].
3.1 Generating the initial solution

The initialization procedure sets the initial 'center' μ0 and 'radius' r0. In this phase, the initial center (μ0) is calculated using Eq. (1):

$$\mu_0 = \frac{upperlimit + lowerlimit}{2} \quad (1)$$

where upperlimit and lowerlimit are the bound constraints of the problem, defined as vectors in the d × 1 dimensional space. In addition, σ0 is the initial radius r0, generated with Eq. (2):

$$\sigma_0 = \frac{\max(upperlimit) - \min(lowerlimit)}{2} \quad (2)$$
3.2 Generating the candidate solutions

The procedure of producing candidate solutions generates the populations Ct(s) at each iteration, where t is the iteration index. The VSA generates solutions randomly around the initial center μ0 by using a Gaussian distribution, where C0(s) = {s1, s2, …, sm}, m = 1, 2, 3, …, n represents the solutions and n is the total number of candidate solutions. The multivariate Gaussian distribution is given in Eq. (3):

$$p(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{d} |\Sigma|}} \exp\left\{ -\frac{1}{2}(x-\mu)^{T} \Sigma^{-1} (x-\mu) \right\} \quad (3)$$
Table 1 A comparison of different models of meta-heuristic algorithms based on chaotic maps

Refs | Model | Application | Chaotic map | Improvement
[38] | Chaotic MSA | Optimization problems | Sinusoidal map | Convergence speed
[39] | CCSA | Solving optimization problems | Gaussian map | Exploration and exploitation
[40] | CCSA | Optimization problems | Circle map | Obtaining the global optimum; accelerating the convergence performance
[41] | Chaotic artificial bee colony and chaotic simulated annealing | Solving optimization problems | Sinusoidal map | Faster convergence; better exploration
[42] | Adaptive chaotic BFO (ACBFO) | Solving optimization problems | Logistic map | Convergence speed
[43] | DE algorithm based on chaotic local search | 20 benchmark functions | Logistic chaotic function | Convergence performance; solution accuracy
[44] | Opposition-based Chaotic DE (OCDE) | Benchmark functions | Logistic chaotic function | Obtaining the global optimum; accelerating the convergence performance
[45] | Quantum GSO algorithm based on Chaotic Sequence (QCSGSO) | Solving optimization problems | Logistic chaotic function | Avoiding prematurity
[46] | Chaotic FOA | Multi-mode functions | Logistic chaos mapping | Convergence speed
[47] | Chaos-enhanced accelerated PSO | Global optimization; three engineering problems | Sinusoidal map; Singer map | Global optimality; convergence speed
[48] | Chaos-Genetic Algorithm | Grid scheduling | Logistic map | Convergence performance; solution accuracy
[49] | Improved chaotic PSO based on adaptive inertia weight (AIWCPSO) | Benchmark functions | Logistic map | Convergence accuracy
[50] | Chaotic PSO algorithm (CS-PSO) | Recommendation system | Chaos methods | Global search capability; avoiding premature convergence
[51] | Chaotic Salp Swarm Algorithm (CSSA) | Global optimization; feature selection | Logistic map | Minimizing the number of selected features; maximizing the classification accuracy
[52] | Chaotic EHO algorithm | 15 benchmark functions from CEC 2013 | Tent map | Convergence speed
[53] | Chaotic bat algorithm | Robust global optimization | Thirteen different chaotic maps | Global search capability; avoiding premature convergence
[54] | Chaotic GSO algorithm | Eight standard test functions | Logistic map | Convergence performance; solution accuracy
[55] | Chaotic Inertia Weight Black Hole Algorithm (CIWBH) | Twenty-three benchmark functions | Logistic map | Enhanced global search; trade-off between exploitation and exploration; better convergence
[56] | Chaotic Simulated Annealing PSO Algorithm | Complex optimization problems | Logistic map | Global search ability; convergence precision
[57] | Chaotic Social Spider Optimization Algorithm | Robust clustering | Logistic map | Convergence speed
[58] | Chaotic bean optimization algorithm (CBOA) | CEC2014 benchmark functions | Logistic map | Global optimization
In Eq.(3) d indicates the dimension, while x is the d × 1
vector of a random variable, μ indicates the d × 1 vector of
the sample mean (i.e., center), and Σ indicates the covariance
matrix. Equation(4) indicates that when the diagonal elements
(i.e., variances) of the Σ values are equal and the off-diago-
nal elements (i.e., covariance) equal zero (uncorrelated), the
resulting shape of the distribution will be spherical. Thus, the
value of Σ is computed through utilizing equal variances with
zero covariance.
where the representation in Eq.(4), σ2 is the variance of the
distribution, I represent the
d×d
identity matrix and σ0 is
the initial radius (r0) as can see in Eq.(2).
3.3 Replacement ofthecurrent solution
The replacement of the current solution is conducted for the
selection process. A solution (which is the best one)
sC0(s)
is selected and memorized from
for the purpose of
replacing the current circle center (μ0). Before the selection
process, it must be made sure that the candidate solutions are
inside the search spaces (Eq.(5)).
where
k=1, 2, ,n
and
i=1, 2, ,d
, and rand indicates
a random number that is distributed uniformly. VSA uses
s
as a new center, and reduces the vortex size using Eq.(3) to
(3)
p
(x
𝜇,Σ)=1
(
2
𝜋)d
Σ
exp
1
2(x𝜇)T
1
(x𝜇)
(4)
=𝜎2.[I]
d
×
d
(5)
s
i
k=
rand.
upperlimitilowerlimiti
+lowerlimiti
,si
k<lowerlimit
i
lowerlimitisi
kupperlimiti
rand.upperlimitilowerlimiti+lowerlimiti
,si
k
>upperlimit
i
select the next solutions. Thus, the new set of solutions
can be generated. If the chosen solution is better than the
best solution, it can be determined as the new best solution
and was memorized.
3.4 The radius decrement process

In the VSA, the inverse incomplete gamma function is applied for the purpose of decreasing the radius value during each iteration pass. The incomplete gamma function, given in Eq. (6), often arises in probability theory, especially in applications involving the chi-square distribution:

$$\gamma(x, a) = \int_{0}^{x} e^{-t} t^{a-1} \, dt, \quad a > 0 \quad (6)$$

where a > 0 is the shape parameter and x ≥ 0 is a random variable. Alongside the incomplete gamma function, its complementary Γ(x, a) is usually also introduced, as in Eq. (7), where γ(x, a) + Γ(x, a) = Γ(a):

$$\Gamma(x, a) = \int_{x}^{\infty} e^{-t} t^{a-1} \, dt, \quad a > 0 \quad (7)$$

Table 2 describes the pseudocode of the VSA algorithm.
Table 2 A description of the VSA algorithm

Initializing step
  Algorithm parameters: input the population size, the lower and upper bounds
  Fitness of the best solution
  Center of the circle (μ0), Eq. (1)
  Radius of the circle (σ0), Eq. (2)
Repeat
  Create candidate solutions within the circle by Eq. (3)
  If exceeded, then shift values into the boundaries by Eq. (5)
  Select the best solution to replace the current center
  Decrease the standard deviation (radius) for the next iteration by Eq. (7)
End
Output
  Best solution found so far
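Read together, Eqs. (1)–(7) and Table 2 amount to a short loop. The following is a minimal runnable sketch of that loop on a toy sphere function; the radius-schedule constants (x = 0.1, a decaying from 1) follow common descriptions of the original VSA [79] and are assumptions here, as is the use of SciPy's gammaincinv for the inverse incomplete gamma function.

```python
# A minimal sketch of the plain VSA (Eqs. (1)-(7)) under assumed schedule
# constants; a toy sphere function stands in for a real objective.
import numpy as np
from scipy.special import gammaincinv

def vsa(fobj, lower, upper, n_candidates=50, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    d = lower.size
    mu = (upper + lower) / 2.0                       # Eq. (1): initial center
    sigma0 = (upper.max() - lower.min()) / 2.0       # Eq. (2): initial radius
    best, best_fit = mu.copy(), fobj(mu)
    x = 0.1                                          # assumed schedule constant
    for t in range(max_iter):
        a = max(1.0 - t / max_iter, 1e-3)            # shape parameter decays
        radius = sigma0 * (1.0 / x) * gammaincinv(a, x)   # inverse incomplete gamma
        cand = rng.normal(mu, radius, size=(n_candidates, d))  # Eqs. (3)/(4)
        # Eq. (5): shift out-of-bound components back inside the box
        low = np.broadcast_to(lower, cand.shape)
        up = np.broadcast_to(upper, cand.shape)
        out = (cand < low) | (cand > up)
        cand[out] = rng.random(out.sum()) * (up[out] - low[out]) + low[out]
        fits = np.apply_along_axis(fobj, 1, cand)
        k = int(fits.argmin())
        if fits[k] < best_fit:                       # memorize the best solution
            best, best_fit = cand[k].copy(), fits[k]
        mu = best                                    # best solution becomes the center
    return best, best_fit

sphere = lambda v: float(np.sum(v ** 2))
xbest, fbest = vsa(sphere, lower=np.full(5, -10.0), upper=np.full(5, 10.0))
print("best sphere fitness:", fbest)
```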
4 Proposed methods

In this section, the hybrid form of the VSA and CMs is explained. The basic VSA is built around two key quantities: the center and the radius. First, the center is the current position in the problem search space from which the VSA is evaluated as the iterations pass; with respect to the best solution found so far, the VSA uses it to identify the 'center' that replaces the new position of the populations. Second, the 'radius' is a mechanism used to simplify the problem, shrinking a large-radius problem into a small-radius one. In addition, the VSA uses a Gaussian distribution to balance exploration and exploitation at every iteration pass. However, the VSA uses only a single center, that is, a single strategy for generating candidate solutions around the current best solution. Consequently, the drawbacks of the VSA cannot be ignored when it faces problems that have numerous local minima. At the same time, the radius used to update the best solution shrinks with each iteration pass under the Gaussian distribution, which makes it easier for the VSA to become trapped in local optima. These are the drawbacks of the VSA. The present study focuses on hybridizing the VSA with the CMs. This hybridization is referred to as the chaotic VSA, in which 10 CMs are used. These 10 maps are applied in three different locations of the VSA [74]. Figure 1 shows the flowchart of the proposed method. In the first step, the parameters are initialized. In the second step, the VSA equations (Eqs. (9), (10), and (11)) are modified based on the chaotic maps for FS. In the third step, the samples are classified, and at the end, the accuracy percentage is displayed.
Fig. 1 Flowchart of Proposed Method
In the proposed model, we combine the CM formulas of Table 3 with Eqs. (3), (5), and (6). The goal is to find the best CMs to optimize the VSA. These places can be expressed as follows:

State 1 The production of candidate solutions inside the search circle [Eq. (9)].
State 2 If a solution is out of range, the mappings are used to move it into the desired range [Eq. (10)].
State 3 The search radius is reduced using the inverse incomplete gamma function and CMs [Eq. (11)].

Table 3 shows the CM formulas and the proposed methods. The optimization of the VSA has been carried out based on the three methods (State1, State2, and State3), and in each method 10 CMs have been used. So, in each run, there are 30 different modes for a given dataset.

Chaos describes a phenomenon in which any change in the initial conditions may cause non-linear changes in future behavior. Chaos optimization is one of the optimization models for search algorithms. The primary idea behind it is to map parameters/variables from the chaos space to the solution space. Its search for the global optimum relies on properties of chaotic motion, including ergodicity, regularity, and stochasticity. The major advantages of chaos are speedy convergence and the capability of avoiding local minima. CMs have a deterministic form in which no random factors are applied. In this paper, 10 distinguished non-invertible one-dimensional maps were adopted to obtain chaotic sets. The adopted CMs are defined in Table 3, where q denotes the index of the chaotic sequence p, and p_q is the q-th number in the chaotic sequence. The remaining parameters, d, c, and μ, are the control parameters determining the chaotic behavior of the dynamic system. The initial point p0 was set to 0.7 for all CMs, as the initial values of CMs may have a great influence on their fluctuation patterns. In this paper, ten different CMs were applied for the optimization process: the Chebyshev, circle, gauss/mouse, iterative, logistic, piecewise, sine, singer, sinusoidal, and tent maps [74].
Descriptions of State 1, State 2, and State 3 are as follows:

State 1 The VSA generates candidate solutions using just a single 'center' (μ). The 'center' is then transformed into a new center as the iterations pass, within the upper and lower bounds of the problem. This mechanism has some problems; one of them is that the VSA tends to be trapped in local minima when it faces problems with many local minimum points. To overcome this, the CM-based generation of the candidate solutions of the VSA is proposed.

In this method, chaos maps are used to generate the candidate solutions. Several neighbor solutions Ct(s) (t indicates the iteration index, with t = 0 at the initial stage) are generated randomly around the initial center μ0 in the d-dimensional space by using a Gaussian distribution and CMs (Table 3).
Table 3 CMs and proposed methods

ID | Map | Definition | Range | State1 | State2 | State3
1 | Chebyshev map | $p_{q+1} = \cos\left(q \cos^{-1}(p_q)\right)$ | (−1, 1) | C11 | C12 | C13
2 | Circle map | $p_{q+1} = \mathrm{mod}\left(p_q + d - \frac{c}{2\pi}\sin(2\pi p_q),\, 1\right)$, c = 0.5, d = 0.2 | (0, 1) | C21 | C22 | C23
3 | Gauss/mouse map | $p_{q+1} = \begin{cases} 1, & p_q = 0 \\ \frac{1}{\mathrm{mod}(p_q,\, 1)}, & \text{otherwise} \end{cases}$ | (0, 1) | C31 | C32 | C33
4 | Iterative map | $p_{q+1} = \sin\left(\frac{c\pi}{p_q}\right)$, c = 0.7 | (−1, 1) | C41 | C42 | C43
5 | Logistic map | $p_{q+1} = c\, p_q (1 - p_q)$, c = 4 | (0, 1) | C51 | C52 | C53
6 | Piecewise map | $p_{q+1} = \begin{cases} \frac{p_q}{l}, & 0 \le p_q < l \\ \frac{p_q - l}{0.5 - l}, & l \le p_q < 0.5 \\ \frac{1 - l - p_q}{0.5 - l}, & 0.5 \le p_q < 1 - l \\ \frac{1 - p_q}{l}, & 1 - l \le p_q < 1 \end{cases}$, l = 0.4 | (0, 1) | C61 | C62 | C63
7 | Sine map | $p_{q+1} = \frac{c}{4}\sin(\pi p_q)$, c = 4 | (0, 1) | C71 | C72 | C73
8 | Singer map | $p_{q+1} = \mu\,(7.86 p_q - 23.31 p_q^2 + 28.75 p_q^3 - 13.302875 p_q^4)$, μ = 1.07 | (0, 1) | C81 | C82 | C83
9 | Sinusoidal map | $p_{q+1} = c\, p_q^2 \sin(\pi p_q)$, c = 2.3 | (0, 1) | C91 | C92 | C93
10 | Tent map | $p_{q+1} = \begin{cases} \frac{p_q}{0.7}, & p_q < 0.7 \\ \frac{10}{3}(1 - p_q), & p_q \ge 0.7 \end{cases}$ | (0, 1) | C101 | C102 | C103
Here, C0(s) = {s1, s2, …, sm}, m = 1, 2, 3, …, n represents the solutions, and n represents the total number of candidate solutions. The formula of the proposed method is given in Eq. (9):

$$p(x \mid \mu) = \frac{1}{\sqrt{(2\pi)^{d}}} \exp\left\{ -\frac{1}{2}(cm - \mu)^{T} \Sigma^{-1} (cm - \mu) \right\} \quad (9)$$

where d represents the dimension, cm is the d × 1 vector of a CM variable, μ is the d × 1 vector of the sample mean (center), and Σ is the covariance matrix.

State 2 If a solution is out of range, the mappings are used to move it into the desired range. During the selection phase, the best solution s′ ∈ C0(s) is selected and memorized from C0(s) for the purpose of replacing the current circle center μ0. Before the selection phase, it must be ensured that the candidate solutions are inside the search boundaries. To attain this goal, the solutions that exceed the boundaries are shifted into the boundaries, as in Eq. (10), which describes the VSA combined with chaotic sequences:

$$s_{k}^{i} = \begin{cases} Cm^{i} \cdot (upperlimit^{i} - lowerlimit^{i}) + lowerlimit^{i}, & s_{k}^{i} < lowerlimit^{i} \\ s_{k}^{i}, & lowerlimit^{i} \le s_{k}^{i} \le upperlimit^{i} \\ Cm^{i} \cdot (upperlimit^{i} - lowerlimit^{i}) + lowerlimit^{i}, & s_{k}^{i} > upperlimit^{i} \end{cases} \quad (10)$$

In Eq. (10), Cm(i) is the obtained value of the chaotic map at the j-th iteration.

State 3 The search radius is reduced using the inverse incomplete gamma function and CMs. In the VSA, the inverse incomplete gamma function decreases the value of the radius during each iteration pass. The incomplete gamma function is given in Eq. (11):

$$\gamma(x, a) = \int_{0}^{cm} e^{-t} t^{a-1} \, dt, \quad a > 0 \quad (11)$$

where a > 0 is the shape parameter and cm ≥ 0 is a CM variable.

In the current study, the chaotic VSA has been implemented as an FS algorithm based on the wrapper method. In the VSA, a chaotic sequence is embedded in the search iterations, and the optimal feature subset that describes the dataset is selected using the VSA. The FS strategy is aimed at improving the classification efficiency, reducing the length of the feature subset, and reducing the computational costs.
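A minimal sketch of the State 2 rule in Eq. (10) follows, assuming the Tent map of Table 3 as the chaotic source: each out-of-bounds component is re-placed inside the box using the next chaotic value Cm instead of a uniform random number. The function names and the carrying of the map state across calls are illustrative choices, not details given in the paper.

```python
import numpy as np

def tent(p):                            # Tent map from Table 3
    return p / 0.7 if p < 0.7 else (10.0 / 3.0) * (1.0 - p)

def shift_into_bounds(s, lower, upper, p):
    """Eq. (10): replace each out-of-bounds component s[i] with
    Cm_i * (upperlimit_i - lowerlimit_i) + lowerlimit_i, where Cm_i is
    drawn from the chaotic sequence rather than a uniform random number."""
    s = s.copy()
    for i in range(s.size):
        if s[i] < lower[i] or s[i] > upper[i]:
            p = tent(p)                 # next chaotic value Cm_i in (0, 1)
            s[i] = p * (upper[i] - lower[i]) + lower[i]
    return s, p                         # return the map state for reuse

lower, upper = np.full(4, -5.0), np.full(4, 5.0)
s = np.array([-7.2, 0.3, 9.9, 4.1])
shifted, p = shift_into_bounds(s, lower, upper, p=0.69)
print(shifted)                          # all components now lie in [-5, 5]
```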
4.1 Fitness function

At each iteration, every point position is evaluated using a special fitness function, Fit. The data are randomly divided into separate parts, namely training and testing datasets, by using the m-fold technique. Two criteria are used for the assessment: the classification accuracy and the number of selected features. The adopted fitness function combines the two criteria into one by means of a weight factor, as in Eq. (12):

$$Fit = \mathrm{maximize}\left( a + \beta \times \left( 1 - \frac{L_s}{L_n} \right) \right) \quad (12)$$

Here, a is the classification accuracy, calculated by dividing the number of correctly classified instances by the total number of instances. K-nearest neighbor (KNN) [80] is the classifier used, with k equal to three, using the absolute distance. KNN is a supervised learning algorithm that classifies a new instance based on the distance from the new sample to the training samples. The KNN classifier predicts the class of the testing sample by calculating and sorting the distances between the testing sample and each of the training samples. This process is repeated until each datum in the dataset has been selected once as the testing sample. The classification accuracy of a feature subset is the ratio of the number of correctly predicted samples to the number of all samples. In this paper, KNN has been used for determining the fitness of the selected features; the choice of k and the distance method was decided by trial and error. L_s is the length of the selected feature subset, L_n is the total number of features, and β is the weight factor, with a value in [0, 1], used to control the relative importance of the classification accuracy and the number of selected features. Since improving accuracy is the primary goal for any classifier, the weight factor is usually set to values near 1 [81]; in this paper, β was set to 0.8. The best solution maximizes the classification accuracy and minimizes the number of selected features [81].
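A sketch of the fitness in Eq. (12) is given below, assuming scikit-learn's 3-NN classifier and 5-fold cross-validation for the accuracy term a (the paper fixes k = 3 and β = 0.8; the dataset and fold count here are illustrative assumptions).

```python
# A sketch of Eq. (12): a is the m-fold accuracy of a 3-NN classifier on the
# selected columns, beta = 0.8 as set in this paper.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

def fitness(mask, beta=0.8):
    if not mask.any():                        # empty subset: worst fitness
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=3)
    a = cross_val_score(knn, X[:, mask], y, cv=5).mean()   # accuracy term
    Ls, Ln = mask.sum(), mask.size            # subset length / total features
    return a + beta * (1.0 - Ls / Ln)         # Eq. (12)

rng = np.random.default_rng(0)
mask = rng.random(X.shape[1]) < 0.5           # a random candidate subset
print("fitness:", round(fitness(mask), 4))
```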
5 Results and discussion
In this section, first a summary of the main characteristics
of the implemented datasets will be discussed. Second,
the proposed methods (State1, State2 and State3) using
different CMs will be investigated. Third, comparisons
will be made between VSA and the proposed method
based on FS. Finally, to emphasize the advantages of the
proposed method compared to other algorithms, different
experiments will be described and the obtained results
will thoroughly be discussed.
5.1 Datasets description
Twenty-four benchmark datasets from different types
including medical/biology and business were used in the
experiments. Four datasets (21, 22, 23, and 24) were related
to the identification and classification of the text author.
The datasets were collected from the UCI machine learning repository [82]. A short description of each of the adopted datasets is presented in Table 4. As can be observed, the used datasets involve missing values in some records. In the current study, all such missing values were replaced by the median value of all known values of the given feature within its class. The mathematical definition of the median
method is given in Eq. (13):

$$s_{i,j} = \underset{s_{i,j} \in W_r}{\mathrm{median}} \; S_{i,j} \quad (13)$$

where S_{i,j} is the missing value of the j-th feature of a given i-th class W. For missing categorical values, the most frequent value of the feature in the given class replaces the missing value.
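A small sketch of this imputation rule, under the assumption that pandas is used: numeric gaps get the class-wise median (Eq. (13)), categorical gaps the class-wise mode. The toy frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "class": ["w1", "w1", "w1", "w2", "w2"],
    "f_num": [1.0, np.nan, 3.0, 10.0, np.nan],
    "f_cat": ["a", "b", None, "c", "c"],
})

# Numeric feature: fill with the median of known values in the same class.
df["f_num"] = df.groupby("class")["f_num"].transform(lambda s: s.fillna(s.median()))
# Categorical feature: fill with the most frequent value in the same class.
df["f_cat"] = df.groupby("class")["f_cat"].transform(lambda s: s.fillna(s.mode().iloc[0]))
print(df)
```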
Four different statistical measurements, including the worst, the best, the mean fitness value, and the standard deviation (SD), were adopted. In the current study, these measures were used to evaluate the performance of each CM and determine the best one. The worst, the best, the mean fitness value, and the SD are defined mathematically as follows, where BS_i is the best score gained at iteration i and tMax is the maximum number of iterations:
$$Best = \max_{i=1}^{tMax} BS_i \quad (14)$$

$$Worst = \min_{i=1}^{tMax} BS_i \quad (15)$$

$$Mean = \frac{1}{tMax} \sum_{i=1}^{tMax} BS_i \quad (16)$$

$$SD = \sqrt{\frac{\sum_{i=1}^{tMax} (BS_i - \mu)^2}{tMax}} \quad (17)$$
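For concreteness, the four statistics of Eqs. (14)–(17) can be computed from one run's per-iteration best scores as follows; the BS values below are made up for illustration.

```python
# Computing Eqs. (14)-(17) from the per-iteration best scores BS_i of one run.
import numpy as np

BS = np.array([0.61, 0.68, 0.74, 0.74, 0.79, 0.81])   # BS_i, i = 1..tMax
best  = BS.max()                                      # Eq. (14)
worst = BS.min()                                      # Eq. (15)
mean  = BS.mean()                                     # Eq. (16)
sd    = np.sqrt(((BS - mean) ** 2).sum() / BS.size)   # Eq. (17)
print(best, worst, round(mean, 4), round(sd, 4))
```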
Table 4 Dataset description

ID | Dataset | Missing values | No. of features | No. of classes | No. of instances | Type
Dataset1 | Chess | No | 36 | 2 | 3196 | Game
Dataset2 | Poker hand | No | 10 | 10 | 25,010 | Game
Dataset3 | German credit | No | 24 | 2 | 1000 | Business
Dataset4 | Credit approval | Yes | 15 | 2 | 690 | Business
Dataset5 | Cylinder bands | Yes | 40 | 2 | 512 | Physical
Dataset6 | Abalone | No | 8 | 29 | 4177 | Life
Dataset7 | Glass identification | No | 10 | 6 | 214 | Physical
Dataset8 | Letter recognition | No | 17 | 26 | 20,000 | Computer
Dataset9 | Waveform | No | 21 | 3 | 5000 | Physical
Dataset10 | Zoo | No | 18 | 2 | 101 | Life
Dataset11 | Wisconsin Diagnosis Breast Cancer (WBCD) | No | 32 | 2 | 596 | Clinical
Dataset12 | Mice Protein Expression Dataset (MPED) | Yes | 82 | 8 | 1080 | Clinical
Dataset13 | Parkinson's Disease Detection Dataset (PDD) | No | 23 | 2 | 197 | Clinical
Dataset14 | Cardiotocography | No | 23 | 3 | 2126 | Clinical
Dataset15 | Hepatitis | Yes | 19 | 2 | 155 | Clinical
Dataset16 | Lung cancer | Yes | 56 | 3 | 32 | Clinical
Dataset17 | Single proton emission computed tomography (SPECT) | No | 44 | 2 | 267 | Clinical
Dataset18 | Thoracic surgery | No | 17 | 2 | 470 | Clinical
Dataset19 | Statlog (heart) | No | 13 | 2 | 270 | Clinical
Dataset20 | Indian Liver Patient Dataset | No | 10 | 2 | 583 | Clinical
Dataset21 | WebKB | No | 2350 | 4 | 4199 | Computer
Dataset22 | Cade12 | No | 5340 | 12 | 40,983 | Computer
Dataset23 | Reuters 21,578 – R8 | No | 3343 | 8 | 7674 | Computer
Dataset24 | Reuter_50_50 | No | 2340 | 50 | 5000 | Computer
5.2 Analysis and discussion

For the evaluation of the methods on the different datasets, four criteria (worst, best, mean, and SD) have been used. In Table 5, the 30 modes and the VSA are examined with the mentioned criteria. Proposed Method (State1) corresponds to the modes VSAC11 to VSAC101, Proposed Method (State2) to the modes VSAC12 to VSAC102, and Proposed Method (State3) to the modes VSAC13 to VSAC103. With regard to the results, it can be stated that Proposed Method (State2) has the better results; the VSAC101 mode, which uses the Tent map, offers better results than the other modes. The main target of this test is to evaluate the efficiency of the VSA with different chaotic maps and to determine the optimal chaotic map (Tables 6, 7, 8).
5.3 Comparisons between VSA and the proposed method based on FS

In Table 9 and Fig. 2, the results of the VSA and the Proposed Method are shown based on FS. We chose the Proposed Method for FS because it had a high percentage of accuracy. Based on the results, it can be said that the Proposed Method is better than the VSA on 19 datasets.
5.4 Comparison and evaluation

A comparison of the Proposed Method with the GA, PSO, ABC, BOA, IWO, FPA, and FA algorithms has been performed to evaluate its efficiency. In Table 10, the control parameters of the algorithms are given.

The comparison of the Proposed Method with PSO, ABC, BOA, IWO, GA, FA, FPA, and the VSA was performed according to the worst criterion. According to Table 11 and Fig. 3, it is clear that the results of the other algorithms are worse than those of the Proposed Method.

In Table 12, the comparison of the Proposed Method with PSO, ABC, BOA, IWO, GA, FA, FPA, and the VSA was performed based on the best criterion. According to Table 12 and Fig. 3, it is clear that the results of the Proposed Method are better than those of the other algorithms.

In Table 13, the comparison of the Proposed Method with PSO, ABC, BOA, IWO, GA, FA, FPA, and the VSA was performed based on the mean criterion. According to Table 13 and Fig. 3, it is clear that the results of the Proposed Method are better than those of the other algorithms.
To sum up, the results and discussion of this paper demonstrate that integrating CMs into the VSA is definitely beneficial. The reason the Proposed Method outperforms all the other algorithms is that the Tent chaotic map assists the algorithm in strongly emphasizing exploration in the initial steps of the optimization and in reducing the search radius.
6 Real application: author identification

Author identification is a stylometric problem that tries to identify whether a copied text belongs to an original author [85, 86]. With the ever-increasing volume of documents uploaded to the internet, new methods for analyzing and extracting data and knowledge are needed. In order to prevent plagiarism and the copying of copyrighted materials, the best solution is to use authorship identification. Every writer has his/her own writing style in the manuscripts that he/she writes, and the writer's style can be identified in other papers [87]. Authorship identification is one of the up-to-date problems in the field of natural language processing. Author identification is an effort to reveal the writer's personal characteristics based on a piece of linguistic information [88], such that various manuscripts written by various authors can be distinguished. Humans possess certain writing patterns for using a language in their writings, which act like fingerprints of the writer (the writer print); these patterns are specific to the writers [89].
The authors in [90] have proposed an approach known as the stylometric approach to deal with the problem of author identification. There are four steps in this approach:

• Calculation of word frequencies to find the most frequent words in the entire corpus.
• Calculation of the normalized frequency, done by dividing the frequency of the most frequent word in a document by the total number of words in the entire corpus.
• Applying the Z-score method.
• Calculation of the distance table by finding the distance between two matrices.

Therefore, since the text is converted into a numeric representation (feature extraction), classification and clustering techniques of machine learning can be applied to it.
The Reuter_50_50 dataset is used for the experiments. There are 50 authors, with 50 documents per author, in this dataset. Thus, both the training corpus and the test corpus contain 2500 texts. These corpora do not overlap with each other. By applying the stylometric approach and n-gram features to the author identification problem, an accuracy of about 85% is achieved with the SVM classifier, which is higher than that of the Delta and KNN classifiers.
The Dissimilarity Counter Method (DCM), DCM-Voting, and DCM-Classifier have been applied in [91] to the problem of author identification. Once the representation spaces are selected, similarity measures such as the Euclidean distance, the correlation coefficient, and the cosine can be used to compare the documents; then, the document author can be identified using one of the above-mentioned approaches (DCM, DCM-Voting, or DCM-Classifier). DCM only uses
Table 5 Comparison of results of methods with VSA
Methods Dataset1 Dataset2 Dataset3
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.0976 1.1776 1.3368 0.0394 VSA 0.8055 1.1195 1.0782 0.0793 VSA 0.9830 1.4678 1.3695 0.0993
Proposed method (state1) VSAC11 1.0393 1.3076 1.3048 0.0658 VSAC11 0.7894 1.0633 0.9814 0.0548 VSAC11 0.9766 1.4506 1.3541 0.0945
VSAC21 1.0889 1.4099 1.2423 0.0711 VSAC21 0.7765 1.0660 1.0233 0.0676 VSAC21 1.0311 1.4668 1.3387 0.0728
VSAC31 1.0593 1.2748 1.3172 0.0529 VSAC31 0.7732 1.0566 1.0201 0.0659 VSAC31 0.9761 1.4632 1.3333 0.0960
VSAC41 1.0116 1.3581 1.2463 0.0844 VSAC41 0.7901 1.1190 1.0133 0.0771 VSAC41 0.9957 1.4124 1.3662 0.0765
VSAC51 1.0452 1.3552 1.2530 0.0690 VSAC51 0.7989 1.0896 0.9964 0.0612 VSAC51 1.0332 1.4214 1.3989 0.0679
VSAC61 1.0631 1.3285 1.2346 0.0499 VSAC61 0.7839 1.0976 0.9859 0.0698 VSAC61 1.0205 1.4478 1.3981 0.0808
VSAC71 1.0792 1.3860 1.2341 0.0653 VSAC71 0.7733 1.1161 1.0379 0.0867 VSAC71 1.0017 1.4035 1.3894 0.0762
VSAC81 1.0559 1.2231 1.3089 0.0451 VSAC81 0.7896 1.0840 1.0707 0.0757 VSAC81 0.9730 1.4676 1.3427 0.1000
VSAC91 1.0044 1.2229 1.3067 0.0674 VSAC91 0.7710 1.0815 0.9993 0.0713 VSAC91 0.9725 1.4393 1.3897 0.0994
VSAC101 1.0876 1.2742 1.2768 0.0286 VSAC101 0.8021 1.1128 1.0435 0.0732 VSAC101 1.0290 1.4433 1.3449 0.0667
Proposed method (state2) VSAC12 1.0529 1.3255 1.3045 0.0638 VSAC12 0.8056 1.0884 0.9876 0.0571 VSAC12 0.9721 1.4164 1.3905 0.0936
VSAC22 1.0143 1.2019 1.2335 0.0368 VSAC22 0.7981 1.0509 1.0391 0.0565 VSAC22 0.9730 1.4143 1.3357 0.0822
VSAC32 1.0957 1.2719 1.2366 0.0161 VSAC32 0.7941 1.0908 0.9889 0.0631 VSAC32 1.0060 1.4569 1.3820 0.0873
VSAC42 1.0308 1.1706 1.2720 0.0389 VSAC42 0.7932 1.0744 1.0606 0.0694 VSAC42 0.9840 1.4357 1.3943 0.0939
VSAC52 1.0021 1.1868 1.3132 0.0677 VSAC52 0.7975 1.0971 1.0658 0.0744 VSAC52 1.0301 1.4388 1.3899 0.0723
VSAC62 1.0717 1.3209 1.2547 0.0454 VSAC62 0.7822 1.1272 0.9832 0.0815 VSAC62 1.0105 1.4248 1.3607 0.0721
VSAC72 1.0866 1.2734 1.3173 0.0400 VSAC72 0.8063 1.1066 1.0752 0.0747 VSAC72 0.9759 1.4203 1.3721 0.0891
VSAC82 1.0315 1.3233 1.2861 0.0697 VSAC82 0.7939 1.0659 1.0045 0.0565 VSAC82 0.9845 1.4172 1.3819 0.0862
VSAC92 1.0091 1.2543 1.2962 0.0666 VSAC92 0.7982 1.1023 0.9993 0.0663 VSAC92 0.9728 1.4507 1.3435 0.0947
VSAC102 1.0389 1.2108 1.2746 0.0395 VSAC102 0.7977 1.1233 1.0319 0.0771 VSAC102 0.9941 1.4224 1.3632 0.0795
Proposed method (state3) VSAC13 1.0100 1.2235 1.2112 0.0379 VSAC13 0.8058 1.0891 1.0382 0.0633 VSAC13 1.0375 1.4241 1.3660 0.0602
VSAC23 1.0807 1.3786 1.2443 0.0618 VSAC23 0.7862 1.1007 0.9722 0.0691 VSAC23 0.9806 1.4586 1.3366 0.0928
VSAC33 1.0246 1.2044 1.2127 0.0268 VSAC33 0.7987 1.0708 1.0361 0.0609 VSAC33 0.9746 1.4617 1.3871 0.1042
VSAC43 1.0654 1.4037 1.3171 0.0835 VSAC43 0.8038 1.0880 1.0316 0.0629 VSAC43 0.9925 1.4351 1.3494 0.0817
VSAC53 1.0581 1.3064 1.2760 0.0506 VSAC53 0.8100 1.0656 1.0628 0.0598 VSAC53 0.9870 1.4028 1.3497 0.0748
VSAC63 1.0844 1.2271 1.3428 0.0457 VSAC63 0.7987 1.0646 1.0441 0.0608 VSAC63 0.9880 1.4462 1.3796 0.0921
VSAC73 1.0181 1.2147 1.2440 0.0403 VSAC73 0.7880 1.1148 1.0455 0.0806 VSAC73 1.0125 1.4290 1.3646 0.0730
VSAC83 1.0375 1.3182 1.2657 0.0618 VSAC83 0.7752 1.1201 0.9816 0.0817 VSAC83 0.9941 1.4543 1.4039 0.0961
VSAC93 1.0261 1.2238 1.2300 0.0347 VSAC93 0.8083 1.0643 1.0528 0.0581 VSAC93 1.0280 1.4174 1.4057 0.0709
VSAC103 1.0195 1.1869 1.2116 0.0254 VSAC103 0.7738 1.0790 1.0458 0.0767 VSAC103 0.9906 1.4521 1.3675 0.0906
Methods Dataset4 Dataset5 Dataset6
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.0587 1.3144 1.2876 0.0548 VSA 1.0114 1.3799 1.3384 0.0748 VSA 0.5416 0.6599 0.8384 0.0420
Table 5 (continued)
Methods Dataset4 Dataset5 Dataset6
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
Proposed Method (State1)
VSAC11 1.0880 1.3770 1.2946 0.0616 VSAC11 1.0138 1.3721 1.3292 0.0698 VSAC11 0.4663 0.6720 0.8419 0.0736
VSAC21 1.0712 1.3161 1.3080 0.0536 VSAC21 1.0112 1.3921 1.3328 0.0773 VSAC21 0.5123 0.8086 0.8081 0.0595
VSAC31 1.1361 1.3954 1.2624 0.0459 VSAC31 1.0169 1.4002 1.3189 0.0749 VSAC31 0.5026 0.7927 0.8480 0.0715
VSAC41 1.0834 1.2983 1.2900 0.0394 VSAC41 1.0137 1.3826 1.3106 0.0697 VSAC41 0.4878 0.5965 0.8528 0.0730
VSAC51 1.0942 1.3998 1.3078 0.0680 VSAC51 1.0137 1.4198 1.3033 0.0807 VSAC51 0.5451 0.6770 0.8684 0.0527
VSAC61 1.0741 1.3500 1.2901 0.0585 VSAC61 1.0170 1.3781 1.3388 0.0718 VSAC61 0.5194 0.7853 0.8184 0.0538
VSAC71 1.1019 1.3126 1.3021 0.0370 VSAC71 1.0142 1.3869 1.3070 0.0702 VSAC71 0.4786 0.7875 0.8698 0.0884
VSAC81 1.0968 1.3949 1.3050 0.0648 VSAC81 1.0189 1.3773 1.3061 0.0649 VSAC81 0.5494 0.8144 0.8144 0.0449
VSAC91 1.1381 1.3435 1.2693 0.0249 VSAC91 1.0120 1.4163 1.3212 0.0826 VSAC91 0.5451 0.7079 0.8144 0.0307
VSAC101 1.1095 1.3773 1.3092 0.0537 VSAC101 1.0112 1.4444 1.3252 0.0927 VSAC101 0.5253 0.6853 0.8626 0.0578
Proposed Method (State2)
VSAC12 1.1262 1.3242 1.2510 0.0218 VSAC12 1.0135 1.3774 1.3370 0.0728 VSAC12 0.5178 0.8738 0.8303 0.0786
VSAC22 1.0578 1.3752 1.3054 0.0762 VSAC22 1.0194 1.4010 1.3332 0.0762 VSAC22 0.4797 0.8377 0.7928 0.0792
VSAC32 1.0881 1.3501 1.2761 0.0503 VSAC32 1.0106 1.3900 1.3131 0.0738 VSAC32 0.5330 0.7060 0.8127 0.0353
VSAC42 1.0537 1.2853 1.2986 0.0524 VSAC42 1.0194 1.4036 1.3164 0.0744 VSAC42 0.5134 0.6942 0.8297 0.0496
VSAC52 1.1042 1.3998 1.2805 0.0614 VSAC52 1.0175 1.3704 1.3133 0.0647 VSAC52 0.4875 0.7062 0.8177 0.0571
VSAC62 1.1158 1.3116 1.2496 0.0217 VSAC62 1.0162 1.4367 1.3220 0.0875 VSAC62 0.4753 0.8537 0.8308 0.0932
VSAC72 1.1114 1.3633 1.2641 0.0436 VSAC72 1.0159 1.3933 1.3083 0.0717 VSAC72 0.5008 0.8408 0.8633 0.0858
VSAC82 1.0555 1.2932 1.2426 0.0422 VSAC82 1.0183 1.4367 1.3218 0.0865 VSAC82 0.4679 0.7445 0.7921 0.0630
VSAC92 1.1109 1.3618 1.2823 0.0447 VSAC92 1.0106 1.4437 1.3042 0.0905 VSAC92 0.4674 0.6364 0.7936 0.0532
VSAC102 1.0776 1.3493 1.2812 0.0554 VSAC102 1.0187 1.3993 1.3371 0.0767 VSAC102 0.4943 0.6033 0.8131 0.0523
Proposed Method (State3)
VSAC13 1.0813 1.2764 1.2555 0.0274 VSAC13 1.0124 1.4059 1.3146 0.0781 VSAC13 0.4681 0.6582 0.8086 0.0593
VSAC23 1.1154 1.2914 1.2469 0.0147 VSAC23 1.0136 1.3882 1.3077 0.0710 VSAC23 0.4696 0.6789 0.8111 0.0606
VSAC33 1.0596 1.2843 1.2631 0.0413 VSAC33 1.0169 1.4456 1.3272 0.0907 VSAC33 0.5189 0.5296 0.8083 0.0540
VSAC43 1.0727 1.2788 1.2980 0.0420 VSAC43 1.0188 1.4463 1.3186 0.0891 VSAC43 0.4603 0.6987 0.8460 0.0789
VSAC53 1.0895 1.3832 1.2876 0.0623 VSAC53 1.0104 1.4034 1.3054 0.0770 VSAC53 0.4699 0.7904 0.7909 0.0712
VSAC63 1.1028 1.3547 1.2910 0.0469 VSAC63 1.0118 1.3891 1.3186 0.0738 VSAC63 0.5173 0.8254 0.7994 0.0596
VSAC73 1.0776 1.3072 1.2938 0.0452 VSAC73 1.0185 1.4352 1.3083 0.0844 VSAC73 0.4795 0.8182 0.7962 0.0747
VSAC83 1.0892 1.3187 1.2608 0.0375 VSAC83 1.0161 1.4071 1.3166 0.0771 VSAC83 0.5205 0.7337 0.8562 0.0587
VSAC93 1.0779 1.2916 1.2883 0.0400 VSAC93 1.0154 1.4041 1.3047 0.1049 VSAC93 0.5359 0.6846 0.8695 0.0564
VSAC103 1.1324 1.3713 1.2418 0.0376 VSAC103 1.0126 1.4016 1.3223 0.1078 VSAC103 0.4834 0.5521 0.7961 0.0542
Bold values show the best-obtained value in the comparisons
Table 6 Comparison of results of methods with VSA (continued)
Methods Dataset7 Dataset8 Dataset9
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.0327 1.7105 1.5810 0.0938 VSA 1.0813 1.3577 1.2658 0.0549 VSA 1.0725 1.3744 1.2617 0.0446
Proposed Method (State1) VSAC11 1.0622 1.7039 1.5924 0.0799 VSAC11 1.0842 1.3458 1.2703 0.0499 VSAC11 1.0763 1.3764 1.2678 0.0441
VSAC21 1.1233 1.7201 1.5845 0.0554 VSAC21 1.0974 1.3345 1.2662 0.0397 VSAC21 1.0600 1.3357 1.2679 0.0373
VSAC31 1.0912 1.7133 1.5996 0.0704 VSAC31 1.1245 1.3479 1.2598 0.0319 VSAC31 1.0716 1.3087 1.3014 0.0301
VSAC41 1.0180 1.7291 1.6070 0.1105 VSAC41 1.1043 1.3529 1.2628 0.0428 VSAC41 1.0726 1.3331 1.3121 0.0382
VSAC51 1.0277 1.7234 1.6277 0.1079 VSAC51 1.1059 1.3438 1.2547 0.0381 VSAC51 1.0629 1.3375 1.3166 0.0448
VSAC61 1.0476 1.7064 1.6000 0.0888 VSAC61 1.1266 1.3489 1.2777 0.0327 VSAC61 1.0785 1.3285 1.3007 0.0319
VSAC71 1.1099 1.7213 1.5909 0.0629 VSAC71 1.1106 1.3372 1.2722 0.0353 VSAC71 1.0553 1.3474 1.2740 0.0441
VSAC81 1.0484 1.7283 1.5891 0.0932 VSAC81 1.0897 1.3369 1.2594 0.0432 VSAC81 1.0789 1.3591 1.2761 0.0375
VSAC91 1.1205 1.7090 1.6188 0.0588 VSAC91 1.0907 1.3496 1.2670 0.0480 VSAC91 1.0720 1.3313 1.2798 0.0321
VSAC101 1.0627 1.7109 1.5934 0.0820 VSAC101 1.1240 1.3310 1.2703 0.0269 VSAC101 1.0709 1.3294 1.3060 0.0367
Proposed Method (State2) VSAC12 1.0749 1.7106 1.5961 0.0766 VSAC12 1.1094 1.3493 1.2786 0.0407 VSAC12 1.0601 1.3193 1.2913 0.0361
VSAC22 1.0502 1.7035 1.5814 0.0836 VSAC22 1.0901 1.3542 1.2707 0.0502 VSAC22 1.0612 1.3613 1.3001 0.0495
VSAC32 1.0779 1.7251 1.6094 0.0818 VSAC32 1.1174 1.3573 1.2672 0.0389 VSAC32 1.0586 1.3569 1.2616 0.0444
VSAC42 1.1263 1.7100 1.6274 0.0579 VSAC42 1.1006 1.3506 1.2615 0.0434 VSAC42 1.0717 1.3782 1.2670 0.0467
VSAC52 1.0769 1.7185 1.6142 0.0811 VSAC52 1.1282 1.3312 1.2727 0.0254 VSAC52 1.0501 1.3567 1.2594 0.0479
VSAC62 1.0740 1.7092 1.6143 0.0798 VSAC62 1.0959 1.3408 1.2547 0.0414 VSAC62 1.0710 1.3517 1.2908 0.0406
VSAC72 1.1288 1.7209 1.6054 0.0563 VSAC72 1.0887 1.3538 1.2535 0.0493 VSAC72 1.0650 1.3280 1.2680 0.0326
VSAC82 1.0588 1.7105 1.6272 0.0896 VSAC82 1.0950 1.3500 1.2522 0.0450 VSAC82 1.0760 1.3314 1.3151 0.0367
VSAC92 1.0938 1.7192 1.6253 0.0754 VSAC92 1.1124 1.3497 1.2631 0.0380 VSAC92 1.0531 1.3195 1.2529 0.0332
VSAC102 1.0309 1.7026 1.6180 0.0987 VSAC102 1.1055 1.3332 1.2709 0.0360 VSAC102 1.0729 1.3093 1.2719 0.0238
Proposed Method (State3) VSAC13 1.0201 1.7183 1.5909 0.1036 VSAC13 1.1158 1.3449 1.2569 0.0344 VSAC13 1.0780 1.3084 1.3018 0.0271
VSAC23 1.0979 1.7097 1.5866 0.0642 VSAC23 1.1167 1.3431 1.2648 0.0339 VSAC23 1.0690 1.3738 1.2685 0.0464
VSAC33 1.0945 1.7074 1.5988 0.0670 VSAC33 1.0918 1.3363 1.2616 0.0423 VSAC33 1.0503 1.3379 1.2865 0.0452
VSAC43 1.1189 1.7240 1.6134 0.0631 VSAC43 1.1008 1.3495 1.2594 0.0428 VSAC43 1.0736 1.3764 1.2590 0.0446
VSAC53 1.0429 1.7097 1.6164 0.0948 VSAC53 1.1031 1.3534 1.2775 0.0448 VSAC53 1.0714 1.3163 1.2974 0.0313
VSAC63 1.0439 1.7063 1.6284 0.0956 VSAC63 1.0805 1.3353 1.2586 0.0467 VSAC63 1.0649 1.3548 1.2807 0.0430
VSAC73 1.0555 1.7082 1.5865 0.0834 VSAC73 1.0915 1.3577 1.2710 0.0508 VSAC73 1.0611 1.3612 1.3028 0.0499
VSAC83 1.0849 1.7247 1.6176 0.0798 VSAC83 1.1007 1.3575 1.2730 0.0469 VSAC83 1.0682 1.3586 1.3106 0.0471
VSAC93 1.0543 1.7269 1.5807 0.0888 VSAC93 1.0905 1.3507 1.2705 0.0488 VSAC93 1.0596 1.3686 1.3034 0.0530
VSAC103 1.0679 1.7055 1.6236 0.0832 VSAC103 1.1026 1.3343 1.2715 0.0379 VSAC103 1.0761 1.3645 1.2945 0.0428
Methods Dataset10 Dataset11 Dataset12
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.1862 1.5481 1.3965 0.0884 VSA 1.2308 1.6919 1.6400 0.0762 VSA 1.2996 1.7581 1.6363 0.0839
Table 6 (continued)
Methods Dataset10 Dataset11 Dataset12
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
Proposed Method (State1)
VSAC11 1.1992 1.5405 1.4010 0.0801 VSAC11 1.2294 1.6963 1.5527 0.0653 VSAC11 1.2860 1.7431 1.6426 0.0861
VSAC21 1.1862 1.5494 1.3920 0.0887 VSAC21 1.1812 1.6990 1.5806 0.0915 VSAC21 1.2751 1.7532 1.5976 0.0891
VSAC31 1.1948 1.5347 1.4500 0.0845 VSAC31 1.2143 1.6811 1.5910 0.0722 VSAC31 1.2730 1.7468 1.6656 0.0969
VSAC41 1.1829 1.5137 1.4000 0.0773 VSAC41 1.1760 1.6916 1.5470 0.0872 VSAC41 1.2848 1.7570 1.6617 0.0939
VSAC51 1.1891 1.5379 1.4067 0.0838 VSAC51 1.2027 1.6881 1.6216 0.0848 VSAC51 1.2748 1.7562 1.6665 0.0990
VSAC61 1.1908 1.5153 1.4068 0.0749 VSAC61 1.2408 1.6897 1.6102 0.0656 VSAC61 1.2830 1.7360 1.5938 0.0792
VSAC71 1.1969 1.5441 1.4278 0.0843 VSAC71 1.2456 1.6821 1.5549 0.0533 VSAC71 1.2768 1.7524 1.6126 0.0896
VSAC81 1.1949 1.5376 1.4331 0.0834 VSAC81 1.1655 1.6963 1.5902 0.0993 VSAC81 1.2906 1.7450 1.6614 0.0875
VSAC91 1.1922 1.5333 1.4525 0.0855 VSAC91 1.1947 1.6981 1.5973 0.0875 VSAC91 1.2968 1.7443 1.6201 0.0786
VSAC101 1.1807 1.5416 1.4281 0.0907 VSAC101 1.1607 1.6989 1.6173 0.1068 VSAC101 1.2792 1.7594 1.6602 0.0970
Proposed Method (State2)
VSAC12 1.1896 1.5193 1.4497 0.0819 VSAC12 1.2071 1.6990 1.5745 0.0788 VSAC12 1.2777 1.7337 1.6396 0.0865
VSAC22 1.1976 1.5403 1.4394 0.0838 VSAC22 1.1616 1.6863 1.5449 0.0917 VSAC22 1.2881 1.7462 1.5954 0.0806
VSAC32 1.1817 1.5146 1.4334 0.0817 VSAC32 1.1985 1.6905 1.6434 0.0917 VSAC32 1.2896 1.7306 1.6276 0.0784
VSAC42 1.1909 1.5381 1.3953 0.0825 VSAC42 1.2291 1.6815 1.6307 0.0723 VSAC42 1.2984 1.7429 1.6667 0.0841
VSAC52 1.1819 1.5424 1.4465 0.0924 VSAC52 1.2395 1.6918 1.5861 0.0632 VSAC52 1.2888 1.7338 1.6233 0.0792
VSAC62 1.1942 1.5210 1.4141 0.0761 VSAC62 1.2383 1.6826 1.5341 0.0547 VSAC62 1.2890 1.7528 1.6634 0.0909
VSAC72 1.1844 1.5117 1.4324 0.0794 VSAC72 1.1832 1.6858 1.6093 0.0911 VSAC72 1.2915 1.7565 1.6725 0.0923
VSAC82 1.1989 1.5139 1.4465 0.0754 VSAC82 1.2376 1.6812 1.6132 0.0651 VSAC82 1.2726 1.7555 1.6607 0.0989
VSAC92 1.1957 1.5220 1.4578 0.0811 VSAC92 1.2332 1.6904 1.6416 0.0750 VSAC92 1.2802 1.7582 1.6326 0.0923
VSAC102 1.1845 1.5175 1.4039 0.0782 VSAC102 1.2143 1.6913 1.6421 0.0842 VSAC102 1.2872 1.7540 1.6200 0.0862
Proposed Method (State3)
VSAC13 1.1861 1.5153 1.4381 0.0806 VSAC13 1.2198 1.6898 1.5747 0.0700 VSAC13 1.2812 1.7340 1.6387 0.0849
VSAC23 1.1818 1.5145 1.4077 0.0787 VSAC23 1.2087 1.6895 1.5412 0.0710 VSAC23 1.2820 1.7369 1.6606 0.0889
VSAC33 1.1878 1.5408 1.4187 0.0864 VSAC33 1.2407 1.6906 1.6048 0.0650 VSAC33 1.2706 1.7320 1.6602 0.0927
VSAC43 1.1891 1.5249 1.4392 0.0825 VSAC43 1.2347 1.6808 1.6047 0.0649 VSAC43 1.2771 1.7576 1.5999 0.0900
VSAC53 1.1849 1.5402 1.4029 0.0863 VSAC53 1.1923 1.6829 1.6446 0.0928 VSAC53 1.2784 1.7579 1.6529 0.0958
VSAC63 1.1825 1.5323 1.3975 0.0841 VSAC63 1.1791 1.6939 1.5991 0.0937 VSAC63 1.2736 1.7307 1.6575 0.0905
VSAC73 1.1821 1.5187 1.4187 0.0811 VSAC73 1.2165 1.6856 1.5374 0.0658 VSAC73 1.2997 1.7443 1.5993 0.0752
VSAC83 1.1829 1.5214 1.4029 0.0802 VSAC83 1.2224 1.6829 1.6467 0.0791 VSAC83 1.2829 1.7364 1.6692 0.0899
VSAC93 1.1863 1.5402 1.4211 0.0870 VSAC93 1.1692 1.6938 1.6277 0.1033 VSAC93 1.2709 1.7507 1.6201 0.0926
VSAC103 1.1918 1.5205 1.4030 0.0760 VSAC103 1.1848 1.6969 1.5420 0.0844 VSAC103 1.2840 1.7564 1.6505 0.0924
Bold values indicate the best-obtained values in the comparisons
Table 7 Comparison of results of methods with VSA (continued)
Methods Dataset13 Dataset14 Dataset15
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.1441 1.6251 1.4886 0.0924 VSA 1.1812 1.5273 1.5207 0.0716 VSA 1.1460 1.7857 1.6785 0.0997
Proposed method (state1) VSAC11 1.0892 1.5685 1.4332 0.0918 VSAC11 1.1704 1.5441 1.5232 0.0814 VSAC11 1.1630 1.7513 1.6713 0.0805
VSAC21 1.1195 1.5472 1.4824 0.0782 VSAC21 1.1874 1.5341 1.4535 0.0582 VSAC21 1.1570 1.7418 1.6000 0.0690
VSAC31 1.1091 1.5705 1.5483 0.1025 VSAC31 1.1790 1.5162 1.5298 0.0722 VSAC31 1.1493 1.7299 1.6540 0.0777
VSAC41 1.0870 1.5579 1.5286 0.1054 VSAC41 1.1727 1.5294 1.5127 0.0743 VSAC41 1.1584 1.7621 1.6856 0.0884
VSAC51 1.1146 1.5887 1.5776 0.1110 VSAC51 1.1763 1.5168 1.4998 0.0666 VSAC51 1.1406 1.7398 1.5210 0.0676
VSAC61 1.1436 1.6068 1.5190 0.0909 VSAC61 1.1893 1.5752 1.4953 0.0763 VSAC61 1.1758 1.7855 1.5634 0.0720
VSAC71 1.1567 1.5226 1.5409 0.0669 VSAC71 1.1895 1.5488 1.4515 0.0617 VSAC71 1.1680 1.7983 1.5230 0.0780
VSAC81 1.1112 1.5242 1.5405 0.0887 VSAC81 1.1812 1.5464 1.4671 0.0668 VSAC81 1.1648 1.7897 1.5283 0.0762
VSAC91 1.1511 1.5526 1.5002 0.0682 VSAC91 1.1747 1.5269 1.4733 0.0650 VSAC91 1.1611 1.7312 1.6098 0.0652
VSAC101 1.0727 1.5530 1.5720 0.1210 VSAC101 1.1700 1.5292 1.4665 0.0667 VSAC101 1.1711 1.7363 1.5221 0.0530
Proposed method (state2) VSAC12 1.1623 1.5626 1.5780 0.0825 VSAC12 1.1849 1.5092 1.5295 0.0679 VSAC12 1.1672 1.7850 1.5450 0.0743
VSAC22 1.0926 1.6297 1.4349 0.1120 VSAC22 1.1723 1.5311 1.4913 0.0706 VSAC22 1.1709 1.7863 1.6375 0.0821
VSAC32 1.0783 1.5767 1.5770 0.1250 VSAC32 1.1890 1.5308 1.4701 0.0589 VSAC32 1.1791 1.7460 1.5282 0.0535
VSAC42 1.1287 1.5378 1.5334 0.0818 VSAC42 1.1834 1.5098 1.5199 0.0663 VSAC42 1.1794 1.7680 1.6707 0.0776
VSAC52 1.1125 1.6148 1.4404 0.0982 VSAC52 1.1754 1.5187 1.5000 0.0676 VSAC52 1.1486 1.7252 1.6701 0.0798
VSAC62 1.1348 1.5408 1.5033 0.0732 VSAC62 1.1800 1.5200 1.4650 0.0590 VSAC62 1.1793 1.7619 1.6899 0.0794
VSAC72 1.0734 1.6152 1.4171 0.1139 VSAC72 1.1791 1.5708 1.5295 0.0857 VSAC72 1.1439 1.7678 1.6385 0.0889
VSAC82 1.1081 1.5829 1.4854 0.0947 VSAC82 1.1855 1.5637 1.5266 0.0802 VSAC82 1.1721 1.7848 1.6272 0.0798
VSAC92 1.0951 1.6040 1.5084 0.1108 VSAC92 1.1875 1.5707 1.4886 0.0747 VSAC92 1.1556 1.7730 1.6695 0.0900
VSAC102 1.1343 1.5986 1.5142 0.0919 VSAC102 1.1803 1.5127 1.5118 0.0665 VSAC102 1.1591 1.7647 1.5690 0.0723
Proposed method (state3) VSAC13 1.1636 1.5286 1.5780 0.0748 VSAC13 1.1725 1.5495 1.4963 0.0766 VSAC13 1.1517 1.7890 1.5272 0.0815
VSAC23 1.1256 1.6364 1.4539 0.1013 VSAC23 1.1772 1.5437 1.5003 0.0735 VSAC23 1.1659 1.7216 1.6352 0.0642
VSAC33 1.0866 1.5536 1.4532 0.0907 VSAC33 1.1810 1.5741 1.4962 0.0799 VSAC33 1.1496 1.7648 1.5559 0.0754
VSAC43 1.1054 1.5949 1.4492 0.0952 VSAC43 1.1873 1.5780 1.4658 0.0743 VSAC43 1.1598 1.7357 1.5758 0.0627
VSAC53 1.1309 1.5694 1.4590 0.0762 VSAC53 1.1872 1.5031 1.4708 0.0519 VSAC53 1.1494 1.7346 1.6714 0.0822
VSAC63 1.1253 1.6004 1.5405 0.1013 VSAC63 1.1703 1.5673 1.4430 0.0758 VSAC63 1.1655 1.7815 1.5669 0.0753
VSAC73 1.1628 1.5362 1.5881 0.0794 VSAC73 1.1706 1.5053 1.4890 0.0641 VSAC73 1.1430 1.7487 1.5400 0.0712
VSAC83 1.1032 1.5374 1.5235 0.0915 VSAC83 1.1814 1.5611 1.4473 0.0691 VSAC83 1.1440 1.7794 1.6899 0.1008
VSAC93 1.1575 1.5742 1.4197 0.0620 VSAC93 1.1749 1.5740 1.4582 0.0776 VSAC93 1.1579 1.7901 1.6727 0.0945
VSAC103 1.1055 1.6348 1.4488 0.1092 VSAC103 1.1788 1.5262 1.4693 0.0621 VSAC103 1.1592 1.7674 1.6270 0.0800
Methods Dataset16 Dataset17 Dataset18
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 0.7088 1.8566 1.6167 0.1044 VSA 1.0689 1.7354 1.6695 0.0899 VSA 1.1120 1.6863 1.6872 0.1109
Table 7 (continued)
Methods Dataset16 Dataset17 Dataset18
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
Proposed method (state1)
VSAC11 0.6902 1.8539 1.5967 0.1091 VSAC11 1.0864 1.7759 1.6687 0.0930 VSAC11 1.1723 1.7033 1.6068 0.0710
VSAC21 0.8725 1.8518 1.5915 0.0242 VSAC21 1.0813 1.7860 1.6392 0.0936 VSAC21 1.1095 1.7584 1.5863 0.1145
VSAC31 0.8720 1.8505 1.6129 0.0267 VSAC31 1.1189 1.7389 1.6394 0.0619 VSAC31 1.1338 1.7174 1.6306 0.0971
VSAC41 0.8873 1.8525 1.5597 0.0141 VSAC41 1.0986 1.8086 1.5121 0.0812 VSAC41 1.1430 1.7571 1.5692 0.0969
VSAC51 0.7766 1.8590 1.5251 0.0626 VSAC51 1.0884 1.7982 1.6546 0.0964 VSAC51 1.1540 1.6816 1.5372 0.0626
VSAC61 0.8194 1.8598 1.5882 0.0506 VSAC61 1.0562 1.7333 1.5936 0.0819 VSAC61 1.1528 1.6619 1.5289 0.0556
VSAC71 0.8537 1.8548 1.5435 0.0283 VSAC71 1.0740 1.7528 1.6620 0.0909 VSAC71 1.1517 1.6910 1.5192 0.0649
VSAC81 0.8656 1.8514 1.5442 0.0219 VSAC81 1.0919 1.7855 1.5151 0.0754 VSAC81 1.1644 1.6820 1.5290 0.0571
VSAC91 0.8687 1.8551 1.5206 0.0196 VSAC91 1.0547 1.7438 1.5134 0.0764 VSAC91 1.0928 1.7002 1.6988 0.1260
VSAC101 0.7156 1.8599 1.6146 0.1019 VSAC101 1.0918 1.7454 1.6640 0.0808 VSAC101 1.1239 1.7315 1.6920 0.1176
Proposed method (state2)
VSAC12 0.7711 1.8576 1.6005 0.0736 VSAC12 1.1128 1.7373 1.5244 0.0492 VSAC12 1.1395 1.6854 1.5345 0.0702
VSAC22 0.7731 1.8597 1.5485 0.0669 VSAC22 1.0669 1.7586 1.6877 0.1007 VSAC22 1.1702 1.6906 1.6156 0.0697
VSAC32 0.7652 1.8593 1.5492 0.0704 VSAC32 1.0431 1.7597 1.5528 0.0912 VSAC32 1.1796 1.7157 1.6159 0.0728
VSAC42 0.8710 1.8524 1.5246 0.0179 VSAC42 1.1024 1.8005 1.5231 0.0770 VSAC42 1.1012 1.7181 1.6372 0.1138
VSAC52 0.7723 1.8510 1.5716 0.0671 VSAC52 1.1161 1.7365 1.6164 0.0586 VSAC52 1.1537 1.7449 1.5215 0.0837
VSAC62 0.7172 1.8541 1.5670 0.0927 VSAC62 1.1157 1.7601 1.6432 0.0703 VSAC62 1.1446 1.7060 1.5646 0.0784
VSAC72 0.8708 1.8513 1.5962 0.0254 VSAC72 1.1381 1.7894 1.6688 0.0729 VSAC72 1.1835 1.7404 1.5216 0.0691
VSAC82 0.7680 1.8575 1.5988 0.0748 VSAC82 1.0417 1.7339 1.5291 0.0803 VSAC82 1.1049 1.6851 1.5991 0.0956
VSAC92 0.7155 1.8535 1.5680 0.0934 VSAC92 1.0597 1.8063 1.5647 0.1010 VSAC92 1.1621 1.6884 1.6959 0.0899
VSAC102 0.8549 1.8545 1.5769 0.0313 VSAC102 1.0837 1.8083 1.5419 0.0893 VSAC102 1.0988 1.6896 1.5441 0.0913
Proposed method (state3)
VSAC13 0.7810 1.8520 1.5818 0.0648 VSAC13 1.1011 1.8030 1.6219 0.0875 VSAC13 1.1840 1.6864 1.6454 0.0678
VSAC23 0.8029 1.8574 1.5613 0.0541 VSAC23 1.0747 1.7686 1.6550 0.0939 VSAC23 1.1284 1.6943 1.5352 0.0783
VSAC33 0.7254 1.8520 1.6354 0.0981 VSAC33 1.0406 1.7397 1.6881 0.1081 VSAC33 1.1336 1.6709 1.5366 0.0683
VSAC43 0.8986 1.8577 1.6172 0.0174 VSAC43 1.0686 1.7778 1.6737 0.1027 VSAC43 1.1232 1.7109 1.5954 0.0942
VSAC53 0.7473 1.8524 1.5286 0.0739 VSAC53 1.1298 1.7651 1.6274 0.0629 VSAC53 1.1565 1.7258 1.6131 0.0862
VSAC63 0.7880 1.8561 1.5700 0.0615 VSAC63 1.0953 1.7621 1.6328 0.0788 VSAC63 1.1862 1.6637 1.5385 0.0422
VSAC73 0.8155 1.8535 1.6382 0.0573 VSAC73 1.0707 1.8040 1.5938 0.0983 VSAC73 1.1614 1.7475 1.6339 0.0938
VSAC83 0.7385 1.8572 1.6048 0.0891 VSAC83 1.1067 1.7787 1.6165 0.0763 VSAC83 1.0969 1.7391 1.6547 0.1250
VSAC93 0.8227 1.8512 1.6346 0.0527 VSAC93 1.0478 1.7816 1.5119 0.0930 VSAC93 1.1100 1.7445 1.6903 0.1272
VSAC103 0.8314 1.8518 1.5446 0.0374 VSAC103 1.0768 1.7904 1.6597 0.1002 VSAC103 1.1569 1.6792 1.6744 0.0851
Bold values indicate the best-obtained values in the comparisons
Table 8 Comparison of results of methods with VSA (continued)
Methods Dataset19 Dataset20 Dataset21
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 0.9269 1.4273 1.3141 0.0743 VSA 0.9335 1.3484 1.3985 0.0784 VSA 1.0266 1.3506 1.2821 0.0594
Proposed method (state1) VSAC11 0.9138 1.3929 1.3797 0.0828 VSAC11 0.9271 1.3953 1.3572 0.0823 VSAC11 1.0539 1.3512 1.2407 0.0427
VSAC21 0.9192 1.4345 1.3057 0.0790 VSAC21 0.9327 1.3936 1.3533 0.0785 VSAC21 1.0210 1.3840 1.2459 0.0696
VSAC31 0.9218 1.4036 1.3280 0.0716 VSAC31 0.9293 1.3813 1.3138 0.0691 VSAC31 1.0013 1.3876 1.2364 0.0789
VSAC41 0.9331 1.4302 1.3624 0.0801 VSAC41 0.9343 1.3927 1.3533 0.0774 VSAC41 1.0586 1.3805 1.2804 0.0545
VSAC51 0.9117 1.3751 1.3125 0.0653 VSAC51 0.9185 1.4397 1.3324 0.0947 VSAC51 1.0258 1.3886 1.2647 0.0706
VSAC61 0.9198 1.4048 1.3046 0.0691 VSAC61 0.9368 1.3669 1.3234 0.0633 VSAC61 1.0251 1.3697 1.2523 0.0631
VSAC71 0.9354 1.3796 1.3286 0.0585 VSAC71 0.8805 1.4118 1.3516 0.1075 VSAC71 1.0254 1.3722 1.2542 0.0639
VSAC81 0.9280 1.3706 1.3393 0.0617 VSAC81 0.8644 1.3749 1.2917 0.0936 VSAC81 1.0009 1.3761 1.2723 0.0782
VSAC91 0.9411 1.3743 1.3765 0.0647 VSAC91 0.9223 1.4006 1.3328 0.0813 VSAC91 1.0429 1.3889 1.2431 0.0618
VSAC101 0.9398 1.4117 1.3012 0.0616 VSAC101 0.8920 1.3797 1.2962 0.0830 VSAC101 1.0517 1.3861 1.2671 0.0584
Proposed method (state2) VSAC12 0.9206 1.4400 1.3166 0.0816 VSAC12 0.9051 1.3541 1.3658 0.0845 VSAC12 1.0380 1.3510 1.2309 0.0490
VSAC22 0.9374 1.4449 1.3079 0.0744 VSAC22 0.9261 1.3785 1.2897 0.0657 VSAC22 1.0620 1.3730 1.2777 0.0501
VSAC32 0.9331 1.4191 1.3284 0.0710 VSAC32 0.9244 1.4068 1.3796 0.0913 VSAC32 1.0449 1.3792 1.2432 0.0572
VSAC42 0.9201 1.4053 1.3627 0.0794 VSAC42 0.9249 1.3893 1.3939 0.0900 VSAC42 1.0150 1.3561 1.2393 0.0615
VSAC52 0.9359 1.3762 1.3241 0.0565 VSAC52 0.8656 1.3387 1.3980 0.1082 VSAC52 1.0486 1.3520 1.2623 0.0472
VSAC62 0.9321 1.3753 1.3651 0.0665 VSAC62 0.9008 1.3425 1.3884 0.0898 VSAC62 1.0189 1.3714 1.2493 0.0662
VSAC72 0.9486 1.3840 1.3346 0.0546 VSAC72 0.8763 1.3433 1.2977 0.0802 VSAC72 1.0248 1.3516 1.2422 0.0558
VSAC82 0.9209 1.4353 1.3635 0.0875 VSAC82 0.9252 1.3333 1.2750 0.0502 VSAC82 1.0112 1.3700 1.2759 0.0719
VSAC92 0.9251 1.4186 1.3434 0.0771 VSAC92 0.9218 1.3572 1.3824 0.0815 VSAC92 1.0336 1.3584 1.2449 0.0546
VSAC102 0.9450 1.3770 1.3235 0.0523 VSAC102 0.8720 1.4383 1.3011 0.1112 VSAC102 1.0433 1.3690 1.2348 0.0537
Proposed method (state3) VSAC13 0.9414 1.4073 1.3230 0.0627 VSAC13 0.9005 1.3308 1.4063 0.0928 VSAC13 1.0267 1.3762 1.2303 0.0633
VSAC23 0.9377 1.4139 1.3537 0.0717 VSAC23 0.8611 1.3899 1.3941 0.1202 VSAC23 1.0277 1.3520 1.2370 0.0542
VSAC33 0.9488 1.4272 1.3755 0.0744 VSAC33 0.9262 1.3827 1.3839 0.0855 VSAC33 1.0325 1.3843 1.2598 0.0657
VSAC43 0.9419 1.4002 1.3880 0.0732 VSAC43 0.8640 1.4302 1.2804 0.1096 VSAC43 1.0189 1.3516 1.2896 0.0645
VSAC53 0.9473 1.4355 1.3805 0.0783 VSAC53 0.9094 1.3536 1.3330 0.0747 VSAC53 1.0437 1.3756 1.2891 0.0606
VSAC63 0.9500 1.3758 1.3061 0.0465 VSAC63 0.9042 1.3611 1.4056 0.0966 VSAC63 1.0615 1.3703 1.2842 0.0501
VSAC73 0.9245 1.3929 1.3346 0.0684 VSAC73 0.9112 1.3800 1.2873 0.0727 VSAC73 1.0443 1.3731 1.2428 0.0552
VSAC83 0.9179 1.3834 1.3330 0.0686 VSAC83 0.9284 1.3581 1.2770 0.0564 VSAC83 1.0132 1.3852 1.2568 0.0742
VSAC93 0.9342 1.4103 1.3894 0.0797 VSAC93 0.8747 1.3644 1.3424 0.0959 VSAC93 1.0072 1.3574 1.2468 0.0662
VSAC103 0.9395 1.3729 1.3707 0.0638 VSAC103 0.9255 1.3728 1.3494 0.0756 VSAC103 1.0457 1.3581 1.2763 0.0522
Methods Dataset22 Dataset23 Dataset24
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
VSA VSA 1.1897 1.5564 1.5208 0.0851 VSA 1.0158 1.3750 1.3256 0.0690 VSA 0.9870 1.4522 1.3572 0.0407
Table 8 (continued)
Methods Dataset22 Dataset23 Dataset24
Modes Worst Best Mean SD Modes Worst Best Mean SD Modes Worst Best Mean SD
Proposed method (state1)
VSAC11 1.1876 1.5374 1.5285 0.0829 VSAC11 1.0145 1.3895 1.3149 0.0721 VSAC11 1.0012 1.4425 1.3945 0.0377
VSAC21 1.1773 1.5394 1.4289 0.0715 VSAC21 1.0175 1.4069 1.3059 0.0750 VSAC21 1.0067 1.4529 1.3599 0.0322
VSAC31 1.1897 1.5401 1.4896 0.0746 VSAC31 1.0174 1.3841 1.3352 0.0725 VSAC31 1.0213 1.4292 1.3603 0.0183
VSAC41 1.1712 1.5071 1.4836 0.0731 VSAC41 1.0179 1.4491 1.3330 0.0922 VSAC41 1.0305 1.4663 1.3747 0.0276
VSAC51 1.1700 1.5216 1.4667 0.0744 VSAC51 1.0142 1.3903 1.3020 0.0706 VSAC51 1.0346 1.4278 1.3573 0.0112
VSAC61 1.1824 1.5080 1.5095 0.0738 VSAC61 1.0116 1.4418 1.3009 0.0890 VSAC61 0.9807 1.4672 1.3742 0.0509
VSAC71 1.1721 1.5156 1.5276 0.0848 VSAC71 1.0115 1.4461 1.3299 0.0937 VSAC71 1.0164 1.4096 1.3419 0.0116
VSAC81 1.1782 1.5442 1.4576 0.0762 VSAC81 1.0136 1.4351 1.3286 0.0890 VSAC81 0.9928 1.4460 1.3534 0.0355
VSAC91 1.1733 1.5612 1.4795 0.0870 VSAC91 1.0101 1.4122 1.3065 0.0802 VSAC91 1.0327 1.4621 1.3757 0.0255
VSAC101 1.1804 1.5746 1.5080 0.0923 VSAC101 1.0102 1.4492 1.3079 0.0930 VSAC101 0.9989 1.4285 1.3604 0.0285
Proposed method (state2)
VSAC12 1.1888 1.5573 1.5084 0.0834 VSAC12 1.0132 1.3813 1.3252 0.0720 VSAC12 1.0320 1.4259 1.3587 0.0121
VSAC22 1.1808 1.5455 1.4537 0.0749 VSAC22 1.0121 1.4121 1.3255 0.0818 VSAC22 1.0225 1.4153 1.3390 0.0100
VSAC32 1.1836 1.5436 1.4478 0.0722 VSAC32 1.0165 1.4380 1.3291 0.0886 VSAC32 1.0110 1.4184 1.3305 0.0151
VSAC42 1.1718 1.5017 1.4702 0.0686 VSAC42 1.0122 1.4304 1.3296 0.0882 VSAC42 0.9899 1.4146 1.3517 0.0271
VSAC52 1.1720 1.5058 1.4722 0.0700 VSAC52 1.0195 1.4455 1.3002 0.0868 VSAC52 0.9791 1.4246 1.3695 0.0383
VSAC62 1.1770 1.5322 1.4853 0.0776 VSAC62 1.0165 1.4336 1.3381 0.0884 VSAC62 0.9829 1.4062 1.4061 0.0395
VSAC72 1.1851 1.5333 1.4687 0.0713 VSAC72 1.0164 1.4128 1.3129 0.0783 VSAC72 1.0237 1.4276 1.4013 0.0245
VSAC82 1.1853 1.5710 1.4496 0.0810 VSAC82 1.0187 1.4131 1.3199 0.0783 VSAC82 1.0158 1.4184 1.3413 0.0145
VSAC92 1.1790 1.5481 1.4977 0.0834 VSAC92 1.0102 1.4219 1.3355 0.0873 VSAC92 0.9728 1.4201 1.3367 0.0342
VSAC102 1.1821 1.5046 1.4437 0.0599 VSAC102 1.0158 1.4147 1.3215 0.0804 VSAC102 1.0189 1.4217 1.3563 0.0165
Proposed method (state3)
VSAC13 1.1749 1.5629 1.5217 0.0940 VSAC13 1.0159 1.4236 1.3113 0.0819 VSAC13 1.0292 1.4056 1.3730 0.0103
VSAC23 1.1821 1.5664 1.4781 0.0843 VSAC23 1.0122 1.4042 1.3016 0.0760 VSAC23 0.9879 1.4021 1.3331 0.0212
VSAC33 1.1847 1.5522 1.4861 0.0800 VSAC33 1.0157 1.3706 1.3188 0.0666 VSAC33 1.0202 1.4470 1.4036 0.0318
VSAC43 1.1831 1.5304 1.4298 0.0659 VSAC43 1.0142 1.3926 1.3026 0.0714 VSAC43 1.0170 1.4634 1.3791 0.0336
VSAC53 1.1713 1.5451 1.4829 0.0835 VSAC53 1.0135 1.4211 1.3004 0.0810 VSAC53 1.0399 1.4135 1.4087 0.0150
VSAC63 1.1825 1.5467 1.4260 0.0715 VSAC63 1.0197 1.4042 1.3214 0.0752 VSAC63 1.0005 1.4222 1.3626 0.0263
VSAC73 1.1803 1.5572 1.4855 0.0834 VSAC73 1.0160 1.4196 1.3254 0.0824 VSAC73 1.0089 1.4579 1.3645 0.0334
VSAC83 1.1852 1.5076 1.5033 0.0710 VSAC83 1.0104 1.4202 1.3364 0.0868 VSAC83 1.0287 1.4363 1.4090 0.0261
VSAC93 1.1794 1.5254 1.4340 0.0664 VSAC93 1.0140 1.4161 1.3062 0.0797 VSAC93 0.9908 1.4042 1.3628 0.0259
VSAC103 1.1858 1.5717 1.4757 0.0840 VSAC103 1.0107 1.3905 1.3235 0.0755 VSAC103 1.0019 1.4603 1.3812 0.0401
Bold values indicate the best-obtained values in the comparisons
the similarities between vector representations of documents in one space to solve a problem p of P. In the other two DCM-based approaches, it is possible to combine different representation spaces. In the DCM-voting approach, this is done using a voting technique, while in the DCM-classifier it is performed through a supervised learning method which requires the definition of predictive features. In the evaluation on the PAN-CLEF 2013 challenge, the DCM-classifier had the best performance only on the Greek corpus, with 85%, while the two other approaches, i.e. DCM and DCM-voting, obtained results that were the best or equivalent to the winner of the competition for all evaluation measures (F1, precision and recall) on all the corpora.
The General Impostors Method (GENIM), which took part in the PAN'13 authorship identification competition, was evaluated in [92]. The basis of this model is a comparison between the given documents and a number of external (impostor) documents; since the method has two stages, performance had to be measured and parameters optimized at each step. Between 25 and 33 percent of the training documents of each language were used for measuring and optimizing the IM stage, whereas the rest were used for the evaluation of GENIM. For the IM evaluation set, 3 or 4 documents were used as seed documents to retrieve the web impostors. The test accuracy is 75.3%.
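The following is a minimal sketch of the impostors idea underlying such methods, assuming documents are given as NumPy feature vectors and cosine similarity is used: on each trial a random feature subset is drawn, and the questioned pair only scores a point if it beats every impostor. The trial count, subset fraction, and all names are illustrative assumptions, not the exact GENIM procedure.

```python
import random
import numpy as np

def impostor_score(doc_a, doc_b, impostors, trials=100, frac=0.5, seed=0):
    """Fraction of trials in which doc_a is more similar to doc_b than to
    every impostor, judged on a random feature subset each trial.
    doc_a, doc_b: 1-D NumPy vectors; impostors: 2-D array (n_imp, n_feat)."""
    rng = random.Random(seed)

    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

    wins = 0
    for _ in range(trials):
        idx = rng.sample(range(len(doc_a)), max(1, int(frac * len(doc_a))))
        a = doc_a[idx]
        if cos(a, doc_b[idx]) > max(cos(a, imp[idx]) for imp in impostors):
            wins += 1
    return wins / trials  # accept "same author" above some threshold
```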
Blocks containing 140, 280, and 500 characters were investigated. The feature set contains conventional features such as syntactic, lexical, and application-specific features, along with some new features extracted from n-gram analysis. Moreover, the proposed approach has a mechanism for handling issues related to unbalanced datasets. It uses a Support Vector Machine (SVM) for data classification and Information Gain and Mutual Information as the FS strategy. The approach was evaluated experimentally using the Enron email and Twitter corpora. The results of this evaluation were very promising, including an Equal Error Rate (EER) ranging between 9.98% and 21.45% for different block sizes [93].
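A pipeline of the kind described in [93] can be sketched with scikit-learn as below; the character n-grams approximate the lexical features, mutual information plays the FS role, and the toy blocks, labels, and k value are illustrative assumptions rather than the paper's configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for fixed-size text blocks labelled by author.
blocks = ["dear all, the meeting is moved to friday morning",
          "please see the attached report before the meeting",
          "lol see u tmrw, gonna be late again",
          "running late again, start without me lol"]
authors = ["A", "A", "B", "B"]

# Character n-grams approximate lexical/syntactic block features;
# mutual information keeps the k most informative ones before the SVM.
pipeline = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 3)),
    SelectKBest(mutual_info_classif, k=20),
    LinearSVC(),
)
pipeline.fit(blocks, authors)
print(pipeline.predict(["meeting moved, see u friday"]))
```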
In [94], a model for email authorship identification (EAI) is presented using a cluster-based classification approach. The contributions of this paper are as follows: (a) developing a new model for email authorship identification; (b) evaluation
Table 9 Results of the VSA and the proposed method based on the FS
Bold values indicate the best-obtained values in the comparisons
Datasets VSA (Feature count-total, Accuracy (%), Time (sec)) Proposed method (Feature count-select, Accuracy (%), Time (sec))
Dataset1 36 44.2472 2.3244 25 56.5962 2.105
Dataset2 10 48.4139 1.5936 8 67.549 0.7746
Dataset3 24 91.6334 1.0279 11 87.8578 2.2247
Dataset4 15 63.7899 1.6667 8 64.9167 0.175
Dataset5 36 77.4904 1.7881 19 92.073 1.1341
Dataset6 8 60.3178 1.6666 6 62.4584 1.191
Dataset7 10 72.4666 1.941 6 81.1022 2.4077
Dataset8 16 71.3516 3.5829 8 96.3686 2.6418
Dataset9 21 87.4691 1.1463 9 96.3448 2.1726
Dataset10 16 82.3352 0.0883 7 90.1066 0.456
Dataset11 30 98.4868 2.5427 16 98.7271 0.3067
Dataset12 80 36.3746 0.1836 53 71.7682 0.9082
Dataset13 22 54.9418 2.5736 13 50.01 0.8355
Dataset14 22 35.7599 0.5076 17 59.9471 2.2448
Dataset15 20 88.6538 2.1595 8 80.2569 1.2598
Dataset16 57 36.5519 2.3174 13 43.957 0.5924
Dataset17 45 45.2476 3.066 23 93.6619 1.7498
Dataset18 17 60.1896 3.4182 6 58.0906 0.0865
Dataset19 13 69.6363 2.7172 7 92.9925 2.6428
Dataset20 10 82.1821 1.1213 6 83.4601 1.9857
Dataset21 2350 48.185 0.9042 1200 90.7128 1.4438
Dataset22 5340 84.1727 1.245 2341 83.7111 2.7264
Dataset23 3343 95.4847 3.8613 1845 99.7029 0.8341
Dataset24 2340 47.1271 0.6734 800 50.2619 1.3153
of the use of additional features together with basic stylometric features for email authorship identification, as well as content features based on Info Gain FS. On the Enron dataset, the proposed model achieved accuracies of 94, 89, and 81 percent for 10, 25, and 50 authors, respectively, whereas on a real email dataset constructed by the authors it attained an accuracy of 89.5%.
Many studies focus only on enhancing predictive accuracy and do not pay much attention to the intrinsic value of the collected evidence. In this paper, a customized associative classification approach, which is a well-known data mining technique, is applied to the authorship attribution problem. This method models the features of writing style that are unique to a person; it then measures the associativity level of these features and generates an intuitive classifier. In this research, it is also concluded that a more accurate write-print can be provided by modifying the rule pruning and ranking system described in the popular Classification by Multiple Association Rule (CMAR) algorithm. More convincing evidence can be provided for a court of law by eliminating patterns common among different authors, since this leads to fairly unique and easy-to-understand write-prints. Since this customized associative classification method is helpful in solving the problem of e-mail authorship attribution, it can be used as a powerful tool against cybercrimes. The effectiveness of the presented approach is verified by the results obtained through experiments [95].
Fig. 2 (a) Feature count-total; (b) Accuracy (%); (c) Time (sec)
An effort is made by the authors in [96] to identify the authors of articles written in Arabic. They introduced a new dataset composed of 12 features and 456 samples from 7 authors. Furthermore, to distinguish different authors from each other, powerful classification techniques were hybridized with the proposed dataset in their approach. The obtained results revealed that the proposed dataset was very successful, achieving a classification accuracy of 82% in the hold-out tests. They also conducted experiments with two well-known classifiers, namely the SVM and functional trees (FT), in order to show the efficiency of the proposed feature set. They reported an accuracy of 82% with the FT approach and holdout testing, which confirmed the robustness of the proposed feature set. Moreover, an accuracy of 100% was achieved in one of the classes. They also ran tests on FT using tenfold cross-validation, and the proposed approach largely retained its accuracy.
One of the classifiers that has been extensively used for language processing is the Naive Bayes (NB) classifier. Nevertheless, the event model used, which can remarkably affect classifier performance, is often not mentioned. So far, NB classifiers had never been used for authorship attribution in Arabic. Thus, the authors proposed to apply these classifiers to this problem, taking into consideration various event models, namely simple NB, multinomial NB (MNB), multi-variant Bernoulli NB (MBNB), and multi-variant Poisson NB (MPNB). The MBNB probability estimation depends on whether a feature exists or not, whereas in MNB and MPNB the probability estimation depends on the frequency of the feature. The mean and standard deviation of the features form the basis of probability estimation in the simple NB model. The performances of these models were evaluated using a large Arabic dataset taken from books written by 10 different authors and compared with other methods. The obtained results reveal that MBNB outperforms the other techniques and is able to identify the author of a text with an accuracy of 97.43%. In addition, these results show that MNB and MBNB can be considered a good choice for authorship attribution [97].
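The contrast between the event models can be illustrated with scikit-learn's Naive Bayes implementations, where binarizing the counts switches from a frequency-based (multinomial) to a presence-based (Bernoulli) model; the toy texts below are placeholders, not the Arabic corpus used in [97].

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

texts = ["placeholder sample text by author one",
         "another placeholder text by author one",
         "a different placeholder by author two",
         "more placeholder words by author two"]
labels = ["author1", "author1", "author2", "author2"]

counts = CountVectorizer().fit_transform(texts)  # word frequency counts

# MNB: P(word | class) is estimated from how often the word occurs.
mnb = MultinomialNB().fit(counts, labels)

# MBNB: binarizing the counts means only presence/absence matters.
mbnb = BernoulliNB(binarize=0.5).fit(counts, labels)

print(mnb.predict(counts[:1]), mbnb.predict(counts[:1]))
```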
In [98], authorship identification methods were applied to messages from an Arabic web forum. In this study, syntactic, lexical, structural, and content-specific writing-style features were used to identify the authors. Some of the problematic characteristics of the Arabic language were addressed in order to present a model with an acceptable degree of classification accuracy for authorship identification. SVM performed better than C4.5, and, compared to English, the overall accuracy for Arabic was lower. These results were consistent with previous research. Finally, as future work, the authors proposed to analyze the differences between these two languages by evaluating the key features as determined by decision trees. Highlighting the linguistic differences between the English and Arabic languages provides further insight into possible techniques for enhancing the performance of authorship discrimination methodologies in an online, multilingual setting. The results showed accuracies of 85.43% and 81.03% for SVM and C4.5, respectively.
In [99], the authors developed an authorship visualization known as Write prints, which can be used to identify individuals based on their writing style. Unique writing-style patterns are created through this visualization; these patterns can be distinguished in a way similar to how fingerprint biometric systems work. Write prints provides an approach based on component analysis and utilizes a dynamic feature-based sliding-window algorithm, which makes it very suitable for visualizing authorship across larger groups of messages. The performance of the visualization across messages taken from three different Arabic and English forums was evaluated and compared with the performance of SVM. This comparison indicated that Write prints shows
Table 10 Control parameters of the algorithms
Algorithms Description Parameters Value
GA [83] Crossover rate Pc 0.9
Generation Ng 200
Mutation rate Pm 0.01
PSO [84] Acceleration coefficient C1 1.5
Acceleration coefficient C2 1.5
Random number R1 0.5
Random number R2 0.5
Inertia weight (linearly decreases) w 0.6 to 0.3
ABC [9] Employed bees Ne 50
Onlooker bees No 50
Scout bees Ns 50
Random number ϕ 0.5
BOA [8] Flight distance Stepe 0.05
Random number Rand 0.02
Linearly decreasing parameter a 2 to 0
IWO [7] Minimum number of seeds smin 0
Maximum number of seeds smax 5
Final value of standard deviation σfinal 0.001
Initial value of standard deviation σinitial 3
FPA [14] Step size scaling factor γ 0.01
Switch probability between local and global pollination P 0.4
FA [11] Randomization α 0.2
Attractiveness of a firefly β 1
Absorption coefficient γ 1
an excellent classification performance and provides better results than SVM in many instances. The authors also concluded that visualization can be used to identify cyber criminals and can help users authenticate fellow online members to prevent cyber fraud. Accuracies of 68.92% and 87.00% were obtained for Write prints and SVM, respectively.
In [100], approaches were introduced to deal with imbalanced multi-class textual datasets. The main idea behind the approach is to divide the training texts into text samples based on the class size, so that a fairer classification model can be generated. It therefore becomes possible to divide majority classes into fewer and longer samples and minority classes into many shorter samples. Text sampling techniques were used to form a training set with a desirable distribution over the classes. By text sampling, new synthetic data were developed that artificially increased the training size of a class. A series of authorship identification experiments were conducted on different multi-class imbalanced cases belonging to two text corpora in two languages: newspaper reportage in Arabic and newswire stories in English. The properties of the presented techniques were revealed by the results obtained through these experiments. Four methods were tested to deal with the problem of class imbalance [100]:
The first method: to under-sample majority classes based on training texts. The same amount of text, equal to the base, was used, and no modification was applied to the length of each text.
The second method: to under-sample majority classes based on training text lines. All the training texts of a particular author were merged to form one big text. Assuming that xmin represents the size (in text lines) of the shortest big file, the first xmin text lines of each big file were segmented into text samples of length a (in text lines). It is worth noting that there was at least one complete sentence in each text line in both corpora. It was concluded that smaller values of a (such as 2 or 3) lead to better results.
The third method: re-balancing the dataset by text samples of varying length. As was mentioned earlier in this paper, one big file is generated for each author by concatenating the training texts. The length of each text sample is xi/k (where xi is the length of the i-th big file and k is a predefined parameter). Short text samples belong to minority authors and long text samples belong to majority authors. Therefore, a balanced dataset is generated which consists of k text samples per class. Experiments were conducted for k = 10, 20, and 50. It is noteworthy that each text line
Table 11 Comparison of the proposed method with other algorithms based on the worst criterion
Bold values indicate the best-obtained values in the comparisons
Datasets PSO ABC BOA IWO GA FA FPA VSA Proposed Method
Dataset1 1.1652 1.3169 1.2795 1.2116 1.3078 1.0623 1.3221 1.151 1.3021
Dataset2 1.0134 0.8357 1.0904 0.7935 0.859 0.9388 0.9091 1.0718 1.0804
Dataset3 1.1403 0.9748 1.0289 1.0323 1.0699 1.0364 1.0189 1.0692 1.0699
Dataset4 1.2313 1.2398 1.0536 1.3387 1.0844 1.076 1.1111 1.1463 1.3387
Dataset5 1.1624 1.1992 1.0705 1.19 1.0613 1.1965 1.0762 1.1983 1.1083
Dataset6 0.5866 0.7323 0.7112 0.472 0.7894 0.6187 0.7379 0.502 0.7874
Dataset7 1.1565 1.2648 1.2797 1.4996 1.1722 1.4984 1.4836 1.0209 1.4984
Dataset8 1.2758 1.0955 1.0956 1.1194 1.2625 1.1351 1.1704 1.1696 1.2758
Dataset9 1.1248 1.0839 1.1775 1.1833 1.0772 1.2775 1.264 1.0373 1.2375
Dataset10 1.2866 1.1589 1.167 1.3825 1.2842 1.2472 1.2791 1.3497 1.2825
Dataset11 1.243 1.327 1.2921 1.3961 1.6147 1.3582 1.3734 1.2358 1.3961
Dataset12 1.4566 1.3633 1.4114 1.2155 1.2255 1.5031 1.2839 1.4008 1.2255
Dataset13 1.2436 1.1973 1.0512 1.2454 1.0363 1.2885 1.4906 1.3004 1.3204
Dataset14 1.2653 1.1884 1.2394 1.3543 1.3115 1.2894 1.3822 1.3922 1.1644
Dataset15 1.1821 1.2728 1.3882 1.4679 1.4981 1.3427 1.4981 1.4827 1.3181
Dataset16 1.4262 1.0759 1.6764 1.2535 1.3716 1.5304 1.5123 0.8414 1.3264
Dataset17 1.2562 1.1611 1.1055 1.2166 1.031 1.0767 0.9641 1.369 1.0169
Dataset18 1.3983 1.4124 1.2653 1.303 1.0773 1.2703 1.0404 1.467 1.1167
Dataset19 1.1328 0.9997 0.9467 1.0967 0.9658 0.9974 1.198 1.1236 1.1208
Dataset20 0.944 1.0662 1.2361 1.1814 0.9685 1.0653 1.1904 1.1242 1.0361
Dataset21 1.6487 1.4398 1.0724 1.2875 1.6143 1.631 1.1727 1.7802 1.1802
Dataset22 1.3458 1.6606 1.7973 1.1826 1.3905 1.0926 1.8201 1.1224 1.0201
Dataset23 1.154 1.0374 0.9335 1.9864 1.0774 0.8664 1.4506 1.2632 1.1206
Dataset24 1.0724 1.1301 1.1549 0.8827 0.7526 1.1974 1.8704 0.9771 1.1974
of the training corpus is used exactly once in the text
samples.
The fourth method: re-balancing the dataset through text re-sampling. A big file is generated for each author once again. Assuming that xi represents the text length (in text lines) of the i-th author and xmax is the length of the longest file, then for each author k·xmax/xi text samples are generated, each consisting of xi/k text lines. Therefore, based on the length of the big file, a variable number of text samples is generated for each author. Nonetheless, the relationship is now inverted: longer text samples are generated for the majority classes, but a larger number of short text samples is generated for the minority classes.
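As an illustration of the third method above, the following is a minimal sketch that concatenates each author's training texts and cuts the result into exactly k samples of xi/k lines; the function name and dict-based interface are assumptions for illustration, not the authors' code.

```python
def rebalance_fixed_k(texts_per_author, k=10):
    """Sketch of the third method: merge each author's training texts into
    one big file, then cut it into exactly k samples of len/k lines, so
    every class contributes k samples (longer ones for majority authors)."""
    samples = {}
    for author, texts in texts_per_author.items():
        lines = [line for text in texts for line in text.splitlines()]
        step = max(1, len(lines) // k)           # x_i / k lines per sample
        samples[author] = ["\n".join(lines[i * step:(i + 1) * step])
                           for i in range(k)]
    return samples

# Example: the majority author ends up with longer (not more) samples.
corpus = {"majority": ["line\n" * 100], "minority": ["line\n" * 20]}
balanced = rebalance_fixed_k(corpus, k=10)
print({a: len(s) for a, s in balanced.items()})  # both authors: 10 samples
```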
Fig. 3 Result of comparison: (a) the worst criterion; (b) the best criterion; (c) the mean criterion

Using a dataset extracted from Arabic novels, they modified this into two sets of words, AFW54 and AFW65, with 11 words eliminated [101]. These two sets were used to convert several Arabic texts into frequency vectors. They carried out a performance evaluation of these word sets
through experiments which used a hybridization of an EA and LDA to generate a classifier. Then, they fed unseen data to that classifier in order to test it. The obtained performance was apparently consistent with the results of authorship attribution research performed on other languages. It is arguable that AFW54 is a more suitable choice; nevertheless, such a claim cannot be made with any statistical significance. For the cases considered here, only a small number of investigations are reported for evaluating the appropriate 'chunk' size. In real-world applications this will probably depend on several factors, but they have identified that roughly 1,000 words are needed to characterize function-word usage for Arabic authors. Through this work, they have confirmed that the concept of function words translates properly into the Arabic language. In other words, various authors use this set of words in various ways, and this enables us to recognize stylistic features of individual authors and use them to distinguish between different authors [101].
High-dimensional datasets bring about more computational challenges. One of the problems with high-dimensional datasets is that, in most cases, not all features of the data are crucial for the knowledge implicit in the data [85, 102]. Consequently, on most occasions, reducing the dimensionality of the data is a favored subject. Often, many of the candidate features for learning are irrelevant and superfluous and degrade the efficiency of the learning algorithm [103, 104]. Learning accuracy and training speed may worsen with superfluous features. Therefore, choosing the necessary features in the preprocessing phase is essentially important. In this section, for identifying the author, at the first stage the frequency of words is obtained using the TF-IDF method [105]. At the second stage, each feature is weighted [106]. At the third stage, FS is performed using metaheuristic algorithms. At the fourth stage, classification is performed via KNN [106].
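To make the four stages concrete, the sketch below wires them together with scikit-learn on a toy corpus: TF-IDF covers the first two stages, a random binary mask stands in for one candidate solution of the metaheuristic FS stage, and KNN accuracy plays the role of the fitness the search would maximize. The corpus, the 50/50 mask, and the 2-fold evaluation are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Toy corpus standing in for the author-identification data.
texts = ["the quarterly report is attached herewith",
         "please find the attached quarterly figures",
         "kindly review the attached annual report",
         "lol gonna be late again, traffic is crazy",
         "omg traffic is crazy, be late again",
         "gonna skip lunch, traffic again lol"]
authors = np.array(["A", "A", "A", "B", "B", "B"])

X = TfidfVectorizer().fit_transform(texts).toarray()  # stages 1-2: TF-IDF weights

def fitness(mask, X, y):
    """Score one binary feature mask with KNN accuracy (stage 4).
    A chaotic-VSA run (stage 3) would search over such masks; a real
    fitness would normally also penalize the number of selected features."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(n_neighbors=3),
                           X[:, mask], y, cv=2).mean()

rng = np.random.default_rng(0)
mask = rng.random(X.shape[1]) < 0.5   # one candidate feature subset
print(f"{mask.sum()} features selected, fitness = {fitness(mask, X, authors):.3f}")
```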
Furthermore, we used accuracy as the evaluation measure, calculated as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \qquad (22)$$

Here, TP represents the number of authors who are in the positive class, while TN indicates the number of authors who are in the negative class. Furthermore, FP is the number of authors falsely considered positive by the model, and FN is the number of authors falsely
Table 12 Comparison of the proposed method with other algorithms based on the best criterion
Bold values indicate the best-obtained values in the comparisons
Datasets PSO ABC BOA IWO GA FA FPA VSA Proposed Method
Dataset1 1.2178 1.2977 1.1785 1.3271 1.1248 1.3193 1.1703 1.3125 1.3425
Dataset2 1.0049 1.011 0.9677 1.0707 1.0007 1.0669 1.0093 1.0633 1.0833
Dataset3 1.3354 1.3319 1.2486 1.2042 1.271 1.2486 1.3494 1.3764 1.3494
Dataset4 1.4182 1.2193 1.3134 1.3666 1.3154 1.3457 1.3129 1.4819 1.4182
Dataset5 1.3251 1.3573 1.2022 1.2551 1.1989 1.1362 1.1206 1.3422 1.3673
Dataset6 0.8806 0.6019 0.8504 0.6873 0.6848 0.5194 0.7076 0.9001 0.9031
Dataset7 1.6409 1.4838 1.4203 1.6609 1.5315 1.6245 1.4666 1.7136 1.6609
Dataset8 1.2302 1.1222 1.3725 1.1426 1.2599 1.1481 1.3451 1.4027 1.4047
Dataset9 1.3527 1.3663 1.3565 1.3902 1.2539 1.3352 1.2761 1.3717 1.3902
Dataset10 1.5953 1.4125 1.4442 1.5566 1.4284 1.3015 1.2941 1.3567 1.5966
Dataset11 1.6458 1.4675 1.4298 1.5415 1.6417 1.276 1.7019 1.6149 1.7019
Dataset12 1.7133 1.7001 1.4547 1.4802 1.7619 1.2872 1.3965 1.7329 1.7610
Dataset13 1.6127 1.208 1.2482 1.2267 1.2807 1.3122 1.4292 1.3327 1.6227
Dataset14 1.4605 1.5531 1.5892 1.2713 1.4945 1.4966 1.4835 1.5378 1.5892
Dataset15 1.4354 1.8054 1.3415 1.5058 1.3346 1.6282 1.5713 1.5591 1.8154
Dataset16 1.7259 1.3034 1.5922 1.0983 1.7772 1.0201 1.4102 1.6328 1.8328
Dataset17 1.6531 1.7521 1.6948 1.1967 1.7978 1.673 1.8108 1.4214 1.8108
Dataset18 1.7742 1.3852 1.6935 1.7389 1.6448 1.5793 1.7353 1.5856 1.7742
Dataset19 1.3372 1.3511 1.4452 1.3507 1.4515 1.2678 1.1654 1.0718 1.6515
Dataset20 1.4412 1.4441 1.2213 1.1804 1.2992 1.4351 1.2057 1.2681 1.5441
Dataset21 1.0257 1.1663 1.4461 1.2206 1.3912 1.1613 1.4962 1.6151 1.6151
Dataset22 1.3227 1.0865 1.6166 1.44 1.2056 1.731 1.0597 1.7058 1.7031
Dataset23 1.1935 1.426 1.2313 1.2189 0.9236 1.01 0.9433 1.0665 1.4466
Dataset24 1.2898 0.9603 1.2427 1.3963 1.2943 1.1029 1.1937 1.1191 1.3963
considered negative by the model, even though they were positive.
6.1 Reuter_50_50 dataset
In this subsection, the proposed method and the other algorithms are applied to the Reuter_50_50 dataset. The dataset contains 2500 documents from 50 writers (https://archive.ics.uci.edu/ml/datasets/Reuter_50_50). The results from the discussed algorithms and the results from other papers are presented in Table 14 and Fig. 4. The results show that the proposed method has better identification accuracy than the other algorithms. Moreover, BOA and FPA also have better identification accuracy than the remaining algorithms.
6.2 PAN dataset
These datasets consist of scientific documents in Greek, English, and Spanish, and since 2011 a new dataset has been added to the existing ones every year (https://pan.webis.de). The results from the discussed algorithms and the results from other papers on these datasets are evaluated in Table 15 and Fig. 4. The identification accuracies of the proposed method for PAN11, PAN12, PAN13, PAN14, PAN15, and PAN16 are 84%, 80.9%, 81.3%, 82.12%, 83.25%, and 81.79%, respectively.
Table 13 Comparison of the proposed method with other algorithms based on the mean criterion
Bold values indicate the best-obtained values in the comparisons
Datasets PSO ABC BOA IWO GA FA FPA VSA Proposed Method
Dataset1 1.3916 1.1978 1.3929 1.3897 1.2765 1.3521 1.1941 1.3898 1.3929
Dataset2 0.9622 1.0635 0.9613 0.9978 1.0673 1.0438 1.0089 0.9741 1.0673
Dataset3 1.1495 1.3674 1.1479 1.1516 1.2789 1.1939 1.3617 1.1515 1.3717
Dataset4 1.5543 1.2753 1.2469 1.2985 1.4039 1.2656 1.2512 1.3383 1.4019
Dataset5 1.0945 1.1002 1.23701 1.1344 1.3232 1.1146 1.3776 1.1253 1.3776
Dataset6 0.9357 0.9499 0.6947 0.7346 0.8517 0.6554 0.9393 0.8642 0.9393
Dataset7 1.5092 1.3172 1.3024 1.6366 1.4551 1.4772 1.3768 1.4806 1.6366
Dataset8 1.1491 1.2448 1.1641 1.1609 1.2675 1.2762 1.2003 1.1581 1.2762
Dataset9 1.2819 1.2329 1.3252 1.1997 1.2909 1.2488 1.2102 1.1581 1.3152
Dataset10 1.2122 1.3358 1.4584 1.4552 1.4227 1.2478 1.4237 1.2327 1.4584
Dataset11 1.6026 1.3283 1.5601 1.6014 1.5996 1.4536 1.5107 1.3743 1.6214
Dataset12 1.2846 1.4074 1.4251 1.2414 1.5571 1.3214 1.3231 1.5242 1.5271
Dataset13 1.5899 1.5719 1.5801 1.4036 1.3117 1.2544 1.5519 1.4592 1.5811
Dataset14 1.3798 1.5262 1.4221 1.4781 1.2758 1.4304 1.3609 1.3481 1.5662
Dataset15 1.4311 1.5769 1.4087 1.3936 1.3683 1.4963 1.6504 1.6188 1.6504
Dataset16 0.9593 1.6964 1.1033 1.0707 1.0311 1.6315 1.3922 1.6179 1.6964
Dataset17 1.2304 1.4441 1.1709 1.5863 1.5396 1.6017 1.6124 1.5419 1.6224
Dataset18 1.4723 1.4846 1.6506 1.3603 1.4505 1.5661 1.4996 1.3536 1.6706
Dataset19 1.0788 1.2331 1.2337 1.2852 1.0904 1.2745 1.3522 1.0931 1.3622
Dataset20 1.3175 1.1659 1.2635 1.3272 1.1213 1.1434 1.2574 1.2245 1.3272
Dataset21 0.8526 0.8718 1.1197 0.8701 1.3629 1.4002 1.2535 0.9912 1.4032
Dataset22 0.8526 0.8718 1.1197 0.8701 1.3629 1.4002 1.2535 0.9912 1.4082
Dataset23 1.3227 1.2502 0.9458 1.1709 1.2078 1.3314 0.8971 1.3211 1.3414
Dataset24 0.9655 1.0916 1.2229 0.9476 0.9983 1.1743 0.9823 1.2059 1.2429
Table 14 Comparison of proposed method with other algorithms on Reuter_50_50 datasets
# Algorithms Accuracy (%)
1 GA 82
2 PSO 87.91
3 ABC 86
4 BOA 88.13
5 IWO 87.01
6 FPA 88
7 FA 85.6
8 VSA 88.2
9 Delta [90] 67
10 KNN [90] 69
11 SVM [90] 85
12 Proposed method 89.3
Fig. 4 Result of comparison: (a) Reuter_50_50 dataset; (b) PAN dataset; (c) Enron email dataset; (d) Arabic scripts
Table 15 Comparison of proposed method with other algorithms on PAN datasets
# Method PAN11 PAN12 PAN13 PAN14 PAN15 PAN16
1 GA 73 72 75 74 76 69
2 PSO 81.5 78.3 79.5 80.11 79.78 78.01
3 ABC 80.1 78.8 77.6 80.1 79.9 79.32
4 BOA 80.14 79.35 80.82 81.24 80.3 81
5 IWO 76 74.3 79 80.1 81.2 79
6 FPA 81.01 78.51 79.08 79.03 78.4 77
7 FA 80 76.5 77.3 79 77.02 78.6
8 VSA 83 78.3 76.9 78.1 83 80.9
9 DCM [91] – – 74.4 – – –
10 DCM-voting [91] – – 78.1 – – –
11 DCM-classifier [91] – – 76 – – –
12 Best result [92] – – 75.3 – – –
13 Proposed method 84 80.9 81.3 82.12 83.25 81.79
Moreover, the identification accuracies of the DCM models are lower than those of the other algorithms. The algorithms BOA, ABC, and IWO have better identification accuracies than the algorithms GA, PSO, FPA, and FA.
6.3 Enron email dataset
This dataset was collected and prepared by the CALO project (a cognitive assistant that learns and organizes). It includes the emails of 150 users, mostly senior managers of Enron (https://www.cs.cmu.edu/~enron/). The results from the proposed method and the results from other papers on the Enron email dataset are presented in Table 16 and Fig. 4. The results show that the accuracy and error rate of the proposed method are 95.04 and 11.68, respectively. The accuracies of the PSO, BOA, and FPA algorithms are 91.02, 93.01, and 90.78, respectively. The accuracy and error rate of the ABC algorithm are 90.02 and 15.2, respectively. Among the other models, CCM-10 has the best accuracy, and the lowest accuracies are seen in the Naïve Bayes and Bayes Net models.
6.4 Arabic scripts
This dataset consists of 30 documents from 10 authors. The authors were chosen from the website http://www.alwaraq.net, and their names are: Aljahedh, Alghazali, Alfarabi, Almas3ody, Almeqrezi, Altabary, Altow7edy, Ibnaljawzy, Ibnrshd, and Ibnsena. The results from the proposed method and the results from other papers on this dataset are presented in Table 17 and Fig. 4. The identification accuracy of the proposed method is 93.24%, which is better than the other models.
According to the experimental results, it is concluded that the proposed method has a better performance than the other models in terms of identification accuracy. According to Tables 15, 16, and 17, the proposed method comes closest to the minimum of the benchmark functions compared to the algorithms FPA, IWO, BOA, ABC, PSO, GA, and FA. Moreover, the proposed method has better accuracy on the author identification problem. The accuracy rates of ABC, BOA, and the proposed method are indicated in Table 17. The results reveal that the proposed method outperformed the other models, i.e., the ABC and BOA models: the accuracy of the proposed method is 93.24%, whereas it is 91.00% for ABC and 92.51% for BOA.
Table 16 Comparison of proposed method with other algorithms on Enron email dataset
# Algorithms Accuracy (%) Error rate
1 GA 85.06 22
2 PSO 91.02 12
3 ABC 90.02 15.2
4 BOA 93.01 12.3
5 IWO 89.03 17.32
6 FPA 90.78 13
7 FA 88.05 20.3
8 VSA 93.89 10.98
9 Linear (SVM) [93] – 18.86
10 Linear (SVM-LR) [93] – 15.34
11 Polynomial3 (SVM-LR) [93] – 19.20
12 Polynomial5 (SVM-LR) [93] – 28.47
13 Gaussian (SVM-LR) [93] – 46.01
14 CEAI: CCM-10 [94] 94 –
15 CEAI: CCM-25 [94] 89 –
16 CEAI: CCM-50 [94] 81 –
17 Author miner-AM [95] 68.19 –
18 Naïve Bayes [95] 79.08 –
19 Bayes Net [95] 79.56 –
20 Classification by multiple association rule-CMAR [95] 88.47 –
21 Classification by association-CBA [95] 84.18 –
22 J48 [95] 89.45 –
23 Classification by multiple association rule for authorship attribution-CMARAA [95] 90.08 –
24 Proposed method 95.04 11.68
7 Conclusion and future works
In this paper, we proposed three states based on hybridizing chaotic maps with the VSA for FS. State2 was compared with State1, State3, and the VSA, and it obtained better values. We also used State2 for FS and text author identification. This paper uses 10 CMs to enhance the overall performance and precision of the VSA, and the VSA is applied to one of the challenging problems, namely FS. The proposed methods have been evaluated on 24 benchmark datasets. Four precise evaluation criteria are followed in this paper: worst, best, mean, and SD. Similarly, the performance of the proposed method is compared with popular and most recent algorithms: PSO, ABC, BOA, IWO, GA, FA, FPA, and VSA. The experimental results show that State2 outperforms the other algorithms in terms of best and mean fitness.
Moreover, the outcomes showed that the proposed method (State2) with the Tent map can drastically enhance the VSA in terms of overall classification performance, stability, number of selected features, and convergence speed. The outcomes also showed that the Tent map turned out to be the most satisfactory map. Therefore, the following conclusions can be drawn:
The CMs improve the exploration phase because they change the radius of the search, helping trapped solutions to release themselves from local minima.
The CMs allow exploration and exploitation to be adjusted adaptively by the proposed method. In other words, the proposed method (State1) encourages the VSA to transit gradually from the exploration stage to the exploitation stage.
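As a concrete illustration of how a chaotic sequence can replace the uniform random draws in such an algorithm, the following is a minimal sketch of the classic Tent map; the paper's exact map variant and the way the chaotic value enters the vortex radius update are not reproduced in this section, so the formulation below is an assumption.

```python
def tent(x, mu=2.0):
    """Classic Tent map on (0, 1): x_{k+1} = mu * min(x_k, 1 - x_k).
    Assumed form; the paper's exact variant may differ."""
    return mu * min(x, 1.0 - x)

# A chaotic sequence standing in for the uniform random numbers that
# scale the vortex radius, so successive search steps are correlated
# in a non-repeating, ergodic way instead of being purely random.
x = 0.37
for step in range(5):
    x = tent(x)
    print(f"step {step}: chaotic scale factor {x:.4f}")
```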
Future work will consider the integration of CMs into other metaheuristic algorithms. The performance of the VSA on more challenging scientific and real-world engineering problems will also be investigated.
Declarations
Conflict of interest The authors declare that they have no conflict of
interest.
References
1. Gharehchopogh FS, Gholizadeh H (2019) A comprehensive sur-
vey: whale optimization algorithm and its applications. Swarm
Evol Comput 48:1–24
2. Shayanfar H, Gharehchopogh FS (2018) Farmland fertility: a new
metaheuristic algorithm for solving continuous optimization prob-
lems. Appl Soft Comput 71:728–746
3. Razmjooy N, Khalilpour M, Ramezani M (2016) A new meta-heuristic
optimization algorithm inspired by FIFA world cup competitions:
theory and its application in PID designing for AVR system. J Con-
trol Autom Electr Syst 27(4):419–440
4. Razmjooy N, Ramezani M (2014) An improved quantum evolutionary
algorithm based on invasive weed optimization. Indian J Sci Res
4(2):413–422
5. Gharehchopogh FS, Shayanfar H, Gholizadeh H (2019) A compre-
hensive survey on symbiotic organisms search algorithms. Artificial
Intelligence Review
6. Harrison KR, Engelbrecht AP, Ombuki-Berman BM (2016) Inertia
weight control strategies for particle swarm optimization. Swarm
Intell 10(4):267–305
7. Xing B, Gao W-J (2014) Invasive weed optimization algorithm. In: Xing B, Gao W-J (eds) Innovative Computational Intelligence: A Rough Guide to 134 Clever Algorithms. Springer International Publishing, Cham, pp 177–181
8. Qi X, Zhu Y, Zhang H (2017) A new meta-heuristic butterfly-inspired
algorithm. J Comput Sci 23:226–239
9. Karaboga D (2005) An idea based on honeybee swarm for numerical
optimization. Technical Report TR06, Erciyes University, Engineer-
ing Faculty, Computer Engineering Department
10. Pan W-T (2012) A new fruit fly optimization algorithm: taking the
financial distress model as an example. Knowl-Based Syst 26:69–74
11. Yang XS (2008) Nature-Inspired Metaheuristic Algorithms. Luniver
Press, United Kingdom
12. Gandomi AH, Alavi AH (2012) Krill herd: A new bio-inspired
optimization algorithm. Commun Nonlinear Sci Numer Simul
17(12):4831–4845
13. Storn R, Price K (1996) Minimizing the real functions of the ICEC’96
contest by differential evolution. In: Proceedings of IEEE Interna-
tional Conference on Evolutionary Computation
Table 17 Comparison of proposed method with other algorithms on Arabic scripts
# Algorithms Accuracy (%)
1 GA 85
2 PSO 90.9
3 ABC 91
4 BOA 92.51
5 IWO 90.8
6 FPA 89
7 FA 86.3
8 VSA 92.8
9 Functional trees [96] 79.3
10 SVM [96] 82.0
11 NB [97] 82.30
12 MNB [97] 92.03
13 MBNB [97] 97.43
14 MPNB [97] 87.40
15 Decision trees (C4.5) [98] 81.03
16 SVM [98] 85.43
17 SVM [99] 87.00
18 Write print [99] 68.92
19 SVM [100] 93.60
20 LDA [101] 87.63
21 Proposed method 93.24
14. Yang X-S (2012) Flower pollination algorithm for global optimization.
In: Unconventional computation and natural computation. Berlin,
Heidelberg
15. Navid R etal (2019) A comprehensive survey of new meta-heu-
ristic algorithms. In: Recent advances in hybrid metaheuristics
for data clustering, p 1–25
16. Ali N, Mehdi R, Navid R (2016) A New Meta-Heuristic Algo-
rithm for Optimization Based on Variance Reduction of Gaussian
distribution. Majlesi J Electr Eng 10(4):49–56
17. Li B, Jiang W (1998) Optimizing complex functions by chaos
search. J Cybern Syst 29:409–419
18. Li Y-Y, Wen Q-Y, Li L-X (2009) Modified chaotic ant swarm
to function optimization. J China Univ Posts Telecommun
16(1):58–63
19. Yi J, Jian D, Zhenhong S (2017) Pattern synthesis of MIMO
radar based on chaotic differential evolution algorithm. Optik
140:794–801
20. He Y etal (2014) A novel chaotic differential evolution algorithm
for short-term cascaded hydroelectric system scheduling. Int J
Electr Power Energy Syst 61:455–462
21. Wang G-G etal (2014) Chaotic krill herd algorithm. Inf Sci
274:17–34
22. Prasad D, Mukherjee A, Mukherjee V (2017) Application
of chaotic krill herd algorithm for optimal power flow with
direct current link placement problem. Chaos Solitons Fractals
103:90–100
23. Yousri D etal (2019) Chaotic flower pollination and grey wolf
algorithms for parameter extraction of bio-impedance models.
Appl Soft Comput 75:750–774
24. Yousefi M etal (2018) Chaotic genetic algorithm and Adaboost
ensemble metamodeling approach for optimum resource plan-
ning in emergency departments. Artif Intell Med 84:23–33
25. Hong W-C etal (2013) Cyclic electric load forecasting by sea-
sonal SVR with chaotic genetic algorithm. Int J Electr Power
Energy Syst 44(1):604–614
26. Chen K, Zhou F, Liu A (2018) Chaotic dynamic weight particle
swarm optimization for numerical function optimization. Knowl-
Based Syst 139:23–40
27. Chuang L-Y, Hsiao C-J, Yang C-H (2011) Chaotic parti-
cle swarm optimization for data clustering. Expert Syst Appl
38(12):14555–14563
28. Liu L etal (2018) Research on ships collision avoidance based
on chaotic particle swarm optimization. In: Advances in smart
vehicular technology, transportation, communication and appli-
cations. Springer International Publishing, Cham
29. Ji J etal (2017) Self-adaptive gravitational search algorithm with
a modified chaotic local search. IEEE Access 5:17881–17895
30. García-Ródenas R, Linares LJ, López-Gómez JA (2019) A
memetic chaotic gravitational search algorithm for unconstrained
global optimization problems. Appl Soft Comput 79:14–29
31. Wang Y etal (2019) A hierarchical gravitational search algorithm
with an effective gravitational constant. Swarm Evol Comput
46:118–139
32. Hong W-C etal (2019) Novel chaotic bat algorithm for forecast-
ing complex motion of floating platforms. Appl Math Model
72:425–443
33. Wang H, Tan L, Niu B (2019) Feature selection for classification
of microarray gene expression cancers using Bacterial Colony
Optimization with multi-dimensional population. Swarm Evol
Comput 48:172–181
34. Arora S, Anand P (2019) Binary butterfly optimization
approaches for feature selection. Expert Syst Appl 116:147–160
35. Zakeri A, Hokmabadi A (2019) Efficient feature selection method
using real-valued grasshopper optimization algorithm. Expert
Syst Appl 119:61–72
36. Bolón-Canedo V, Alonso-Betanzos A (2019) Ensembles for fea-
ture selection: A review and future trends. Inf Fusion 52:1–12
37. Papa JP etal (2018) Feature selection through binary brain storm
optimization. Comput Electr Eng 72:468–481
38. Guvenc U, Duman S, Hinislioglu Y (2017) Chaotic Moth Swarm
Algorithm. In: 2017 IEEE International Conference on INnova-
tions in Intelligent SysTems and Applications (INISTA)
39. Wang S etal (2017) Multiple chaotic cuckoo search algorithm.
In: Advances in Swarm Intelligence. Springer International Pub-
lishing, Cham
40. Rizk-Allah RM, Hassanien AE, Bhattacharyya S (2018) Chaotic
crow search algorithm for fractional optimization problems. Appl
Soft Comput 71:1161–1175
41. Chahkandi V, Yaghoobi M, Veisi G (2013) CABC–CSA: a new
chaotic hybrid algorithm for solving optimization problems.
Nonlinear Dyn 73:475–484
42. Zhang Y, Zhou W, Yi J (2016) A novel adaptive chaotic bacterial
foraging optimization algorithm. In: 2016 International confer-
ence on computational modeling, simulation and applied math-
ematics (CMSAM 2016), p 1–8
43. Jia D, Zheng G, Khan MK (2011) An effective memetic differ-
ential evolution algorithm based on chaotic local search. Inf Sci
181(15):3175–3187
44. Thangaraj R etal (2012) Opposition based Chaotic Differential
Evolution algorithm for solving global optimization problems. In
2012 fourth world congress on nature and biologically inspired
computing (NaBIC)
45. Du Pengzhen TZ, Yan S (2014) A quantum glowworm swarm
optimization algorithm based on chaotic sequence. Optimization
7(9)
46. Mitić M etal (2015) Chaotic fruit fly optimization algorithm.
Knowl-Based Syst 89:446–458
47. Gandomi AH etal (2013) Chaos-enhanced accelerated parti-
cle swarm optimization. Commun Nonlinear Sci Numer Simul
18(2):327–340
48. Yao J-F etal (2001) A new optimization approach-chaos genetic
algorithm. Syst Eng 1:015
49. Li J-W, Cheng Y-M, Chen K-Z (2014) Chaotic particle swarm
optimization algorithm based on adaptive inertia weight. In: Con-
trol and Decision Conference (2014 CCDC), The 26th Chinese.
IEEE
50. Xu X etal (2018) CS-PSO: chaotic particle swarm optimization
algorithm for solving combinatorial optimization problems. Soft
Comput 22(3):783–795
51. Sayed GI, Khoriba G, Haggag MH (2018) A novel chaotic salp
swarm algorithm for global optimization and feature selection.
Appl Intell p 1–20
52. Tuba E etal (2018) Chaotic elephant herding optimization algo-
rithm. In: Applied Machine Intelligence and Informatics (SAMI),
2018 IEEE 16th World Symposium on. IEEE
53. Gandomi AH, Yang X-S (2014) Chaotic bat algorithm. J Comput
Sci 5(2):224–232
54. Pan G, Xu Y (2016) Chaotic glowworm swarm optimization
algorithm based on Gauss mutation. In: Natural computation,
fuzzy systems and knowledge discovery (ICNC-FSKD), 2016
12th International Conference on. IEEE
55. Aslani H, Yaghoobi M, Akbarzadeh-T M-R (2015) Chaotic iner-
tia weight in black hole algorithm for function optimization. In:
Technology, Communication and Knowledge (ICTCK), 2015
International Congress on. IEEE
56. Yang X, Niu J, Cai Z (2018) Chaotic Simulated Annealing Parti-
cle Swarm Optimization Algorithm. In: 2018 2nd IEEE advanced
information management, communicates, electronic and automa-
tion control conference (IMCEC). IEEE
57. Aggarwal S et al (2018) A social spider optimization algorithm with chaotic initialization for robust clustering. Proc Comput Sci 143(1):450–457
58. Zhang X, Feng T (2018) Chaotic bean optimization algorithm. Soft Comput 22(1):67–77
59. Boushaki SI, Kamel N, Bendjeghaba O (2018) A new quantum chaotic cuckoo search algorithm for data clustering. Expert Syst Appl 96:358–372
60. Tharwat A, Hassanien AE (2018) Chaotic antlion algorithm for parameter optimization of support vector machine. Appl Intell 48(3):670–686
61. Zhou Y, Su K, Shao L (2018) A new chaotic hybrid cognitive optimization algorithm. Cogn Syst Res 52:537–542
62. Mingjun J, Huanwen T (2004) Application of chaos in simulated annealing. Chaos Solitons Fractals 21(4):933–941
63. Teng H, Cao A (2011) A novel quantum genetic algorithm with piecewise logistic chaotic map. In: 2011 Seventh International Conference on Natural Computation (ICNC). IEEE
64. Kumar Y, Singh PK (2018) A chaotic teaching learning based optimization algorithm for clustering problems. Appl Intell, pp 1–27
65. Yüzgeç U, Eser M (2018) Chaotic based differential evolution algorithm for optimization of baker’s yeast drying process. Egypt Inf J
66. Ibrahim RA, Elaziz MA, Lu S (2018) Chaotic opposition-based grey-wolf optimization algorithm based on differential evolution and disruption operator for global optimization. Expert Syst Appl 108:1–27
67. Rahman TA et al (2017) Chaotic fractal search algorithm for global optimization with application to control design. In: 2017 IEEE Symposium on Computer Applications and Industrial Electronics (ISCAIE). IEEE
68. Tuba E, Dolicanin E, Tuba M (2017) Chaotic brain storm optimization algorithm. In: International Conference on Intelligent Data Engineering and Automated Learning. Springer, Berlin
69. Hinojosa S et al (2018) Improving multi-criterion optimization with chaos: a novel multi-objective chaotic crow search algorithm. Neural Comput Appl 29(8):319–335
70. Arora S, Anand P (2018) Chaotic grasshopper optimization algorithm for global optimization. Neural Comput Appl, pp 1–21
71. Saremi S, Mirjalili SM, Mirjalili S (2014) Chaotic krill herd optimization algorithm. Proc Technol 12:180–185
72. Wang G-G, Gandomi AH, Alavi AH (2013) A chaotic particle-swarm krill herd algorithm for global numerical optimization. Kybernetes 42(6):962–978
73. Zhenyu G et al (2006) Self-adaptive chaos differential evolution. In: International Conference on Natural Computation. Springer, Berlin
74. Gandomi AH et al (2013) Firefly algorithm with chaos. Commun Nonlinear Sci Numer Simul 18(1):89–98
75. dos Santos Coelho L, Mariani VC (2012) Firefly algorithm approach based on chaotic Tinkerbell map applied to multivariable PID controller tuning. Comput Math Appl 64(8):2371–2382
76. Wang L et al (2018) A new chaotic starling particle swarm optimization algorithm for clustering problems. Math Probl Eng 2018
77. Sayed GI, Hassanien AE, Azar AT (2017) Feature selection via a novel chaotic crow search algorithm. Neural Comput Appl, pp 1–18
78. Kohli M, Arora S (2018) Chaotic grey wolf optimization algorithm for constrained optimization problems. J Comput Des Eng 5(4):458–472
79. Doğan B, Ölmez T (2015) A new metaheuristic for numerical function optimization: Vortex Search algorithm. Inf Sci 293:125–145
80. Martin B (1995) Instance-based learning: nearest neighbour with generalisation. Doctoral dissertation, University of Waikato
81. Mafarja M et al (2019) Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Syst Appl 117:267–286
82. https://archive.ics.uci.edu/ml/index.php. Accessed 2019
83. Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
84. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS’95)
85. Villar-Rodriguez E et al (2016) A feature selection method for author identification in interactive communications based on supervised learning and language typicality. Eng Appl Artif Intell 56:175–184
86. Digamberrao KS, Prasad RS (2018) Author identification using sequential minimal optimization with rule-based decision tree on Indian literature in Marathi. Proc Comput Sci 132:1086–1101
87. Bay Y, Çelebi E (2016) Feature selection for enhanced author identification of Turkish text. In: Information sciences and systems. Springer, Cham
88. Zhang C et al (2014) Authorship identification from unstructured texts. Knowl-Based Syst 66:99–111
89. Zamani H et al (2014) Authorship identification using dynamic selection of features from probabilistic feature set. In: Information access evaluation. Multilinguality, multimodality, and interaction. Springer International Publishing, Cham
90. Nirkhi S, Dharaskar RV, Thakre VM (2014) Stylometric approach for author identification of online messages. Int J Comput Sci Inf Technol 5(5):6158–6159
91. Frery J, Largeron C, Juganaru-Mathieu M (2015) Author identification by automatic learning. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR)
92. Seidman S (2013) Authorship verification using the impostors method. In: Notebook for PAN at CLEF, pp 13–16
93. Brocardo ML, Traore I, Woungang I (2015) Authorship verification of e-mail and tweet messages applied for continuous authentication. J Comput Syst Sci 81(8):1429–1440
94. Nizamani S, Memon N (2013) CEAI: CCM-based email authorship identification model. Egypt Inf J 14(3):239–249
95. Schmid MR, Iqbal F, Fung BCM (2015) E-mail authorship attribution using customized associative classification. Digit Investig 14:S116–S126
96. Otoom AF et al (2014) Towards author identification of Arabic text articles. In: 2014 5th International Conference on Information and Communication Systems (ICICS)
97. Altheneyan AS, Menai MEB (2014) Naïve Bayes classifiers for authorship attribution of Arabic texts. J King Saud Univ Comput Inf Sci 26(4):473–484
98. Abbasi A, Chen H (2005) Applying authorship analysis to Arabic web content. In: Intelligence and security informatics. Springer, Berlin
99. Abbasi A, Chen H (2006) Visualizing authorship for identification. In: Intelligence and security informatics. Springer, Berlin
100. Stamatatos E (2008) Author identification: using text sampling to handle the class imbalance problem. Inf Process Manage 44(2):790–799
101. Shaker K, Corne D (2010) Authorship attribution in Arabic using a hybrid of evolutionary search and linear discriminant analysis. In: 2010 UK Workshop on Computational Intelligence (UKCI)
102. Wang Y, Feng L (2018) Hybrid feature selection using component co-occurrence based feature relevance measurement. Expert Syst Appl 102:83–99
103. Kushwaha N, Pant M (2018) Link based BPSO for feature selection in big data text clustering. Futur Gener Comput Syst 82:190–199
104. Marie-Sainte SL, Alalyani N (2018) Firefly algorithm based feature selection for Arabic text classification. J King Saud Univ Comput Inf Sci
105. Uğuz H (2011) A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowl-Based Syst 24(7):1024–1032
106. Trstenjak B, Mikac S, Donko D (2014) KNN with TF-IDF based framework for text categorization. Proc Eng 69:1356–1364