Evolutionary-Based Deep Stacked Autoencoder for Intrusion Detection in a Cloud-Based Cyber-Physical System

Citation: Duhayyim, M.A.; Alissa, K.A.; Alrayes, F.S.; Alotaibi, S.S.; Tag El Din, E.M.; Abdelmageed, A.A.; Yaseen, I.; Motwakel, A. Evolutionary-Based Deep Stacked Autoencoder for Intrusion Detection in a Cloud-Based Cyber-Physical System. Appl. Sci. 2022, 12, 6875. https://doi.org/10.3390/app12146875
Academic Editor: João M.F. Rodrigues
Received: 5 June 2022
Accepted: 1 July 2022
Published: 7 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mesfer Al Duhayyim 1,*, Khalid A. Alissa 2, Fatma S. Alrayes 3, Saud S. Alotaibi 4, ElSayed M. Tag El Din 5, Amgad Atta Abdelmageed 6, Ishfaq Yaseen 6 and Abdelwahed Motwakel 6
1 Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
2 SAUDI ARAMCO Cybersecurity Chair, Networks and Communications Department, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia; KaAlissa@iau.edu.sa
3 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; Fsalrayes@pnu.edu.sa
4 Department of Information Systems, College of Computing and Information System, Umm Al-Qura University, Mecca 24382, Saudi Arabia; sotaibi@uqu.edu.sa
5 Department of Electrical Engineering, Faculty of Engineering and Technology, Future University in Egypt, New Cairo 11845, Egypt; ElSayed.TagElDin@fue.edu.eg
6 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia; abzwesabi@gmail.com (A.A.A.); tsr2wesabi@gmail.com (I.Y.); mrbwesabi@gmail.com (A.M.)
* Correspondence: m.alduhayyim@psau.edu.sa
Abstract: As cyberattacks grow in volume and complexity, machine learning (ML) has been widely applied to counter cybersecurity attacks and malicious behavior. Cyber-physical systems (CPSs) combine computation with physical processes: embedded computers and networks monitor and control the physical processes, commonly through feedback loops in which physical processes affect computations and vice versa. At the same time, ML approaches are vulnerable to data pollution attacks. Improving network security and making ML-based network schemes robust are therefore critical problems for the growth of CPS. This study develops a new Stochastic Fractal Search Algorithm with Deep Learning Driven Intrusion Detection System (SFSA-DLIDS) for a cloud-based CPS environment. The SFSA-DLIDS technique focuses on recognizing and classifying intrusions in order to secure the CPS environment. It first applies min-max normalization to convert the input data to a compatible format. To reduce the curse of dimensionality, the SFSA technique is applied to select a subset of features. Furthermore, chicken swarm optimization (CSO) with a deep stacked autoencoder (DSAE) is used to identify and classify intrusions; the CSO algorithm optimizes the parameters of the DSAE model and thereby improves the classification results. The SFSA-DLIDS model is validated through a series of experiments, and the results indicate promising performance compared with recent models.
Keywords: Internet of Things; deep learning; cyber physical systems; cloud computing; intrusion detection; security
1. Introduction
With the emergence of disruptive technologies, Industry 4.0 is undergoing major transitions in cost efficiency and performance [1]. In particular, this applies to large-scale smart computing, namely cloud computing, the Internet of Things (IoT), and cyber-physical systems (CPS). A CPS is a multi-dimensional, complex system that integrates computing, networking, and the physical environment [2]. Through the deep collaboration of the 3C techniques (control, computation, and communication), dynamic control, information services, and real-time perception of large engineering systems are realized [3]. CPS achieves an organic integration of physical, computation, and transmission systems, making the system capable, reliable, and effective in simultaneous collaboration, which gives it extensive and important application prospects. CPS is used in many industries and fields [4,5]. In recent times, the information technology sector has expanded rapidly. Innovations and breakthroughs in several technologies have brought profound changes to people’s lives [6].
In particular, during this rapid development, embedded technologies have been widely applied in daily life [7,8]. CPS has become one of the most prominent research and development directions for scholars in different countries because of its extensible, scalable, and interactive features, and it has also become a priority investment area for large enterprises. In contrast to embedded technologies, a CPS, as a combination of computer technology and physical equipment, transforms the computing object from distributed to unified, from discrete to continuous, and from digital to analog [9]. In contrast to IoT systems, a CPS, once its physical entities are connected, pays more attention to the dynamic, ongoing control of data, information services, and the major components of the device. In comparison with software systems, CPS focuses on the feedback and control of the physical process, emphasizing the dynamic response and interaction of data processing [10].
With regard to CPS security, the conventional approach treats the cyber and physical systems separately and cannot address vulnerabilities related to the embedded controllers and networks that control and monitor physical processes [11]. Hence, an organic security system is needed to protect CPS from cyberattacks. There is strong evidence of the need for security in such systems and of the havoc that can result if security is disregarded. To identify unexpected errors and attacks in CPS, anomaly detection methods have been suggested to mitigate the threat. For instance, methods based on state estimation (e.g., the Kalman filter), rules, and statistical models (histogram-based and Gaussian models) are applied to learn the normal state of a CPS [12]. However, these methods generally need expert knowledge (for example, an operator manually extracts some rules) or require the underlying distribution of the data to be known. Machine learning (ML) approaches do not depend on domain-specific knowledge [13]. However, they generally need an abundance of labeled data (for example, classification-based methods). They could also capture the unique attributes of CPS (for example, spatial-temporal relationships). Intrusion detection (ID) methods are dedicated to ensuring network security.
This study develops a new Stochastic Fractal Search Algorithm with Deep Learning Driven Intrusion Detection System (SFSA-DLIDS) for a cloud-based CPS environment. The SFSA-DLIDS technique focuses on recognizing and classifying intrusions in order to secure the CPS environment. It first applies min-max normalization to convert the input data to a compatible format. To reduce the curse of dimensionality, the SFSA technique is applied to select a subset of features. The SFSA uses the idea of fractals to satisfy the intensification (exploitation) property needed by optimization algorithms, and stochasticity to guarantee diversification (exploration) of the search space. Additionally, chicken swarm optimization (CSO) with a deep stacked autoencoder (DSAE) is used to identify and classify intrusions. The SFSA-DLIDS model is validated through a series of experiments.
2. Literature Review
Li et al. [14] present a novel federated DL approach, termed DeepFed, for identifying cyber threats against industrial CPSs. First, the authors design a novel DL-based ID method for industrial CPSs that makes use of CNN and GRU. Second, they build a federated learning framework that allows several industrial CPSs to jointly build a comprehensive ID model in a privacy-preserving manner. de Araujo-Filho et al. [15] present FID-GAN, a novel fog-based, unsupervised IDS for CPSs that employs a generative adversarial network (GAN). The IDS is designed for a fog architecture, which brings computation resources as close as possible to the end nodes and thus helps meet low-latency requirements. To achieve higher detection rates, the presented architecture computes a reconstruction loss based on the reconstruction of data samples mapped to the latent space. Alohali et al. [16] propose a novel AI-enabled multimodal fusion-based IDS (AIMMF-IDS) for CCPS in Industry 4.0 environments. The method first performs data pre-processing in two ways, namely data conversion and data normalization. Moreover, an improved fish swarm optimization-based feature selection (IFSO-FS) system is used to select suitable features. The IFSO approach was developed by incorporating a Levy Flight (LF) model into the search process of the standard FSO technique in order to avoid local optima.
Althobaiti et al. [17] examine a novel cognitive computing-based IDS approach for securing industrial CPS. The method includes pre-processing to discard noise present in the data. Afterward, it uses a binary bacterial foraging optimization (BBFO) based FS approach to select the best subset of features. Additionally, a GRU model is used to identify intrusions in industrial CPS environments. The authors in [18] present a new self-learning spatial distribution technique called Euclidean distance-based between-class learning (EBC learning), which enhances between-class learning by computing the Euclidean distance (ED) among the KNN of different classes. Moreover, a cognitive computing-based ID model for industrial CPSs, termed borderline SMOTE and EBC learning based on RF (BSBC-RF), is also presented. Ibor et al. [19] present a new hybrid technique for intrusion prediction on a CPS’s communication network. The authors use a bio-inspired hyperparameter search approach to generate an improved DNN architecture based on the core hyperparameters of NNs. In addition, they develop a prediction method based on the improved NN architecture. Some other methods in the literature are available in [20–23].
3. The Proposed Model
In this article, a new SFSA-DLIDS technique is proposed for classifying and identifying intrusions in the CPS environment. The SFSA-DLIDS model first performs min-max data normalization to convert the input data to a compatible format, followed by the SFSA technique, which selects a subset of features. Finally, the CSO-DSAE approach is used to identify and classify intrusions. Figure 1 depicts the block diagram of the SFSA-DLIDS approach.
Figure 1. Block diagram of SFSA-DLIDS approach.
3.1. Data Pre-Processing
At the initial stage, the presented approach performs min-max data normalization to convert the input data to a compatible format. Each feature is scaled to the range from zero to one using Equation (1):

$$\nu' = \frac{\nu - \min_A}{\max_A - \min_A} \tag{1}$$

Here, $\min_A$ and $\max_A$ signify the minimal and maximal values of feature $A$. The original and normalized values of an element of $A$ are denoted by $\nu$ and $\nu'$, respectively. It is apparent from the formula that the maximal and minimal feature values are mapped to one and zero, respectively.
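As an illustration of Equation (1), the following minimal Python sketch scales one feature column to the zero-to-one range. The function name `min_max_normalize`, the sample values, and the handling of a constant feature (mapped to 0.0) are assumptions made for this example and are not taken from the paper.

```python
def min_max_normalize(column):
    """Scale a list of numeric feature values to the [0, 1] range (Equation (1))."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: Equation (1) would divide by zero.
        # Mapping every value to 0.0 is an assumption of this sketch.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


durations = [2.0, 4.0, 6.0, 10.0]  # hypothetical raw values of one feature
scaled = min_max_normalize(durations)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

The minimum of the column maps to zero and the maximum maps to one, as stated for Equation (1).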
3.2. Feature Selection Using SFSA Technique
In this work, the SFSA technique is applied to select a subset of features. SFSA is based on the growth phenomenon of random fractals and, after population initialization, uses two processes, (1) diffusion and (2) update, to enhance the search [24]. In the mathematical model of the SFSA, only the best solution produced by the diffusion process is retained to generate new solutions, while the remaining solutions are discarded. New solutions are generated by Gaussian walks ($G_w$), defined as follows:

$$G_{w1} = \mathrm{Gaussian}(\mu_G, \sigma) + \left(\mathrm{rand}(0,1) \times P_B - \mathrm{rand}(0,1) \times P_i\right) \tag{2}$$

$$G_{w2} = \mathrm{Gaussian}(\mu_P, \delta) \tag{3}$$

Here, $\mathrm{rand}(0,1)$ is a random value between zero and one; $P_i$ denotes the $i$-th solution, around whose position the particle diffuses, and $P_B$ denotes the best solution; $\mu_G$ and $\mu_P$ are the means of the Gaussian walks, equal to $|P_i|$ and $|P_B|$, respectively; and $\delta$ denotes the standard deviation, calculated by:

$$\delta = \left| \frac{\log(g)}{g} \times (P_i - P_B) \right| \tag{4}$$
In Equation (4), $g$ represents the iteration count. As the optimization proceeds, $g$ increases up to the stopping criterion $G_{max}$, and $\delta$ is adjusted dynamically. All particles diffuse around their current positions by means of the Gaussian walk until a predetermined maximum diffusion number $Y_D$ is reached. The generated solutions are obtained from the following equation:

$$P_{ij} = LB_{ij} + \mathrm{rand}(0,1) \times \left(UB_{ij} - LB_{ij}\right), \quad j = 1, 2, 3, \ldots, Y_D \tag{5}$$

In Equation (5), $UB_{ij}$ and $LB_{ij}$ refer to the upper and lower limits of the $j$-th value of solution $i$, and $Y_D$ denotes the maximal diffusion count of solutions generated by the SFSA.
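The diffusion step can be sketched in a few lines of Python. This is a minimal illustration of Equations (3) to (5) only, under stated assumptions: solutions are plain lists of floats, the function names `step_size`, `gaussian_walk`, and `random_point` are hypothetical, and the seed and bounds are arbitrary. It is not the authors' implementation and omits the mapping from continuous positions to feature subsets.

```python
import math
import random


def step_size(p_i, p_b, g):
    """Equation (4): delta = |log(g) / g * (P_i - P_B)| for one coordinate."""
    return abs(math.log(g) / g * (p_i - p_b))


def gaussian_walk(point, best, g, rng):
    """Equation (3): sample each coordinate from Gaussian(mu_P, delta)."""
    # mu_P is taken as |P_B|, following the definition given in the text.
    return [rng.gauss(abs(p_b), step_size(p_i, p_b, g)) for p_i, p_b in zip(point, best)]


def random_point(lower, upper, rng):
    """Equation (5): a uniformly random point inside the search bounds."""
    return [lb + rng.random() * (ub - lb) for lb, ub in zip(lower, upper)]


rng = random.Random(0)  # fixed seed, chosen only to make the sketch repeatable
start = random_point([0.0, 0.0], [1.0, 1.0], rng)
moved = gaussian_walk(start, [0.5, 0.5], g=10, rng=rng)
```

Because the step size in Equation (4) depends on log(g)/g, the walk becomes narrower as the iteration count grows, which shifts the search from exploration toward exploitation.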
Then, the quality of each solution is calculated and the optimal solution $P_B$ is determined. In this step, two update procedures are used. The first updates a solution according to its probability $P_a$, depending on whether $P_a < \mathrm{rand}(0,1)$, as follows:

$$P_a = \frac{\mathrm{rank}(P_i)}{N_D} \tag{6}$$

$$P'_{ij} = P_{aj} - \mathrm{rand}(0,1) \times \left(P_{a1} - P_{a2}\right) \tag{7}$$

In Equation (6), $\mathrm{rank}(P_i)$ refers to the rank of the $i$-th solution among the solutions in the population; in Equation (7), $P'_i$ indicates the new position of the $i$-th solution, and $P_{a1}$ and $P_{a2}$ signify randomly selected solutions from the population. The second procedure improves exploration by applying variations to a solution based on other solutions from the population.
This process begins by ranking all solutions. When $P_a < \mathrm{rand}(0,1)$ for $P'_i$, the present solution $P_i$ is not updated; otherwise, the solution is updated by the following expression:

$$P''_i = \begin{cases} P'_i - \mathrm{rand}(0,1) \times \left(P'_t - P_B\right), & \mathrm{rand}(0,1) \le 0.5 \\ P'_i + \mathrm{rand}(0,1) \times \left(P'_t - P'_r\right), & \mathrm{rand}(0,1) > 0.5 \end{cases} \tag{8}$$

Let $P'_t$ and $P'_r$ be solutions randomly selected from the Gaussian distribution. The next phase compares the quality of $P''_i$ with that of $P'_i$: if $P''_i$ is superior to $P'_i$, then $P''_i$ replaces it; otherwise, $P'_i$ is left unchanged.
The fitness function (FF) of the SFSA used in the presented system is designed to balance the number of selected features in each solution (to be minimized) against the classification accuracy obtained with those features (to be maximized). Equation (9) defines the FF for evaluating solutions:

$$\mathrm{Fitness} = \alpha \gamma_R(D) + \beta \frac{|R|}{|C|} \tag{9}$$

where $\gamma_R(D)$ denotes the error rate of the given classifier (the KNN technique is used here), $|R|$ is the cardinality of the selected subset, $|C|$ is the total number of features in the dataset, and $\alpha$ and $\beta$ are two parameters that weight the importance of classification quality and subset length, with $\alpha \in [0, 1]$ and $\beta = 1 - \alpha$.
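To make the trade-off in Equation (9) concrete, the sketch below evaluates the fitness for a given classifier error rate and subset size. The function name `feature_subset_fitness`, the value of alpha, and the example numbers are hypothetical; in the paper the error rate comes from a KNN classifier, which this sketch takes as an input rather than computing.

```python
def feature_subset_fitness(error_rate, n_selected, n_total, alpha):
    """Equation (9): alpha * error_rate + beta * |R| / |C|, with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)


# Hypothetical comparison: same error rate, 5 of 20 features versus all 20 features.
small_subset = feature_subset_fitness(0.10, 5, 20, alpha=0.9)
all_features = feature_subset_fitness(0.10, 20, 20, alpha=0.9)
print(small_subset < all_features)  # True
```

With equal error rates, the smaller subset receives the lower (better) fitness, which is the behavior the weighting term is meant to produce.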
3.3. DSAE-Based Data Classification
To recognize and classify intrusions, the DSAE model is employed in this study. The SAE used here is built from several AE layers and an LR layer [25]. The AE is the fundamental component of the SAE classifier. Figure 2 shows the structure of the SAE. An AE comprises an encoding step (Layer 1 to Layer 2) and a reconstruction, or decoding, step (Layer 2 to Layer 3). This procedure is expressed in Equations (10) and (11), where $W$ and $W^T$ (the transpose of $W$) are the weight matrices of the model, $b$ and $b'$ are two distinct bias vectors, $s$ is a nonlinearity function, namely the sigmoid function in this work, $\gamma$ denotes the latent representation of the input $x$, and $z$ is regarded as a prediction of $x$ given $\gamma$ and must have the same shape as $x$.

$$\gamma = s(Wx + b) \tag{10}$$

$$z = s\left(W^T \gamma + b'\right) \tag{11}$$
Several AE layers are stacked together in an unsupervised pre-training phase (Layer 1 to Layer 4). The latent representation $\gamma$ calculated by one AE is used as the input to the next AE layer. The layers are trained one at a time, each as an AE, by minimizing its reconstruction error. The reconstruction error (loss function $L(x, z)$) can be evaluated in different ways. In our work, we apply cross-entropy to measure the reconstruction error, as shown in Equation (12), where $x_k$ and $z_k$ denote the $k$-th components of $x$ and $z$, respectively.

$$L(x, z) = -\sum_{k=1}^{d} \left[ x_k \ln z_k + (1 - x_k) \ln(1 - z_k) \right] \tag{12}$$
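The encoding, decoding, and loss computations of Equations (10) to (12) can be traced with a small pure-Python sketch of a single AE layer with tied weights. The weights, biases, and input below are hypothetical values chosen for illustration, and the helper names are assumptions; a practical implementation would use a numerical library and learned parameters.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def encode(W, b, x):
    """Equation (10): gamma = s(W x + b)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi) for row, bi in zip(W, b)]


def decode(W, b_prime, gamma):
    """Equation (11): z = s(W^T gamma + b'), reusing the encoder weights."""
    n_inputs = len(W[0])
    return [
        sigmoid(sum(W[h][k] * gamma[h] for h in range(len(W))) + b_prime[k])
        for k in range(n_inputs)
    ]


def reconstruction_loss(x, z):
    """Equation (12): cross-entropy between the input x and its reconstruction z."""
    return -sum(xk * math.log(zk) + (1 - xk) * math.log(1 - zk) for xk, zk in zip(x, z))


W = [[0.5, -0.3, 0.8], [0.1, 0.4, -0.6]]  # hypothetical: 2 hidden units, 3 inputs
b = [0.0, 0.1]
b_prime = [0.0, 0.0, 0.0]
x = [1.0, 0.0, 1.0]

gamma = encode(W, b, x)
z = decode(W, b_prime, gamma)
loss = reconstruction_loss(x, z)
```

The reconstruction `z` has the same length as the input `x`, and the loss is positive and shrinks as `z` approaches `x`.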
Figure 2. Structure of SAE.
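The reconstruction error is minimized by gradient descent. The weights and biases in Equations (10) and (11) are updated according to the following equations, where $\alpha$ indicates the learning rate:

$$W = W - \alpha \frac{\partial L(x, z)}{\partial W} \tag{13}$$

$$b = b - \alpha \frac{\partial L(x, z)}{\partial b} \tag{14}$$

$$b' = b' - \alpha \frac{\partial L(x, z)}{\partial b'} \tag{15}$$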
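The update rule shared by Equations (13) to (15) is the same for every parameter: subtract the learning rate times the gradient. The sketch below applies that rule to a toy quadratic loss, which stands in for the reconstruction loss purely for illustration; the function names, the toy loss, and the learning rate of 0.1 are assumptions of this example.

```python
def gradient_step(params, grads, learning_rate):
    """One update of the form p = p - alpha * dL/dp, as in Equations (13)-(15)."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


def toy_gradient(params):
    # Gradient of the stand-in loss sum((p - 3)^2); not the AE reconstruction loss.
    return [2.0 * (p - 3.0) for p in params]


params = [0.0, 10.0]
for _ in range(100):
    params = gradient_step(params, toy_gradient(params), learning_rate=0.1)
```

After repeated steps the parameters approach the minimizer of the toy loss (3.0 for each coordinate), which mirrors how repeated updates drive the reconstruction error down.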
Once the layers are pre-trained, the network enters the supervised fine-tuning phase, in which an LR layer is added on top of the output layer. In this work, the probability that the input vector $x$ (Layer 4) belongs to the $i$-th class is determined by Equation (16), where $Y$ represents the predicted class of the input vector $x$, $W$ and $b$ denote the weight matrix and the bias vector, respectively, $W_i$ and $W_j$ denote the $i$-th and $j$-th rows of the matrix $W$, $b_i$ and $b_j$ denote the $i$-th and $j$-th elements of the vector $b$, and softmax is the nonlinearity function. The class with the maximum probability is taken as the predicted label $y_{pred}$ of the input vector $x$, as given in Equation (17). The prediction error on the sample dataset $D$, $Loss(D)$, is evaluated against the true labels, as shown in Equation (18), where $y_i$ indicates the true label of $x_i$. $Loss(D)$ is minimized by gradient descent, in the same way as the reconstruction error above:

$$P(Y = i \mid x, W, b) = \mathrm{softmax}_i(Wx + b) = \frac{e^{W_i x + b_i}}{\sum_j e^{W_j x + b_j}} \tag{16}$$

$$y_{pred} = \arg\max_i P(Y = i \mid x, W, b) \tag{17}$$

$$Loss(D) = -\sum_{i=0}^{|D|} \ln P(Y = y_i \mid x_i, W, b) \tag{18}$$
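Equations (16) to (18) can be illustrated with a short pure-Python sketch of the classification layer. The weights, bias, and samples are hypothetical, and the helper names are assumptions; subtracting the maximum score inside `softmax` is a common numerical safeguard that leaves the probabilities unchanged.

```python
import math


def softmax(scores):
    m = max(scores)  # shift for numerical stability; the result is unchanged
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(W, b, x):
    """Equation (16): P(Y = i | x, W, b) for every class i."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(scores)


def predict(W, b, x):
    """Equation (17): index of the most probable class."""
    probs = class_probabilities(W, b, x)
    return max(range(len(probs)), key=probs.__getitem__)


def dataset_loss(W, b, samples):
    """Equation (18): negative log-likelihood of the true labels."""
    return -sum(math.log(class_probabilities(W, b, x)[y]) for x, y in samples)


W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights: 2 classes, 2 inputs
b = [0.0, 0.0]
samples = [([2.0, 0.0], 0), ([0.0, 2.0], 1)]  # (input vector, true label)

probs = class_probabilities(W, b, [2.0, 0.0])
label = predict(W, b, [2.0, 0.0])
loss = dataset_loss(W, b, samples)
```

Here the first input scores higher for class 0, so `predict` returns 0, and the loss is small because both samples are classified with high probability.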
3.4. Hyperparameter Tuning Using CSO Algorithm
Here, the CSO algorithm is used to optimize the parameters of the DSAE approach and thereby improve the classification results. The CSO technique was proposed by Meng et al. [26] as a novel swarm intelligence (SI) optimization technique that simulates the hierarchy and foraging behavior of chickens. The population is divided into several subgroups, each containing a cock, hens, and chicks. The CSO technique follows these principles:
(1) The whole population consists of several sub-populations, each of which comprises one cock, a number of hens, and several chicks.
(2) The fitness value (FV) of every particle in the population is computed, and the particles are ranked by FV. The particles with the best FVs are chosen as cocks, the particles with the worst FVs are chosen as chicks, and the remaining particles are chosen as hens.
(3) Within a given hierarchy, the dominance relationship and the mother–child relationship remain unchanged. However, as the chicks grow, the population relationships are modified: the hierarchy and mother–child relationships of the chicken swarm are updated every $G$ time steps.
(4) The cock leads the flock, each hen follows the cock of its own sub-population, and the chicks forage near the hens. Each hen joins a sub-population at random, and the mother–child relationship within the flock is also established at random. The cock, which has the widest search range and the best search capability, leads the flock. The chick particles have the worst foraging capability and the smallest foraging range. The foraging capability and search range of the hen particles lie between those of the cock and chick particles.
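Principle (2) above amounts to sorting the particles by fitness and splitting the ranking into three groups. The following sketch shows one way to do this, assuming that a lower fitness value is better (as with an error rate); the function name `assign_roles`, the group sizes, and the fitness values are hypothetical.

```python
def assign_roles(fitness, n_roosters, n_chicks):
    """Rank particles by fitness (lower is better) and split them into roles."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    roosters = order[:n_roosters]                      # best-ranked particles
    hens = order[n_roosters:len(order) - n_chicks]     # particles in the middle
    chicks = order[len(order) - n_chicks:]             # worst-ranked particles
    return roosters, hens, chicks


fitness = [0.30, 0.05, 0.20, 0.10, 0.40, 0.15]  # hypothetical error rates
roosters, hens, chicks = assign_roles(fitness, n_roosters=1, n_chicks=2)
print(roosters, hens, chicks)  # [1] [3, 5, 2] [0, 4]
```

In a full CSO run this assignment would be repeated every G time steps, as described in principle (3).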
In CSO, there are $N$ particles in the entire chicken flock. The numbers of roosters, hens, and chicks are denoted by $N_r$, $N_h$, and $N_c$, respectively. The different types of chickens use different position update formulas when searching for food [20]. The roosters are the most adaptable individuals in the flock and are the best placed to find food across the entire population.
The position update formula for the cock particles is given in Equation (19):

$$P_i^j(t+1) = P_i^j(t) \times \left(1 + \mathrm{Randn}(0, \sigma^2)\right), \qquad \sigma^2 = \begin{cases} 1, & W_i < W_k \\ \exp\left(\dfrac{W_k - W_i}{|W_i| + \varepsilon}\right), & \text{otherwise} \end{cases} \tag{19}$$
Here, $k \in [1, cn]$ and $k \ne i$. $\mathrm{Randn}(0, \sigma^2)$ denotes a Gaussian distribution with a mean of 0 and a standard deviation of $\sigma^2$. $P_i^j(t)$ is the value of the $j$-th dimension of the $i$-th individual at the $t$-th iteration; $\varepsilon$ is a small constant; $k$ is the index of a cock chosen at random from all cocks except the $i$-th; $W_i$ is the FV of the $i$-th cock; and $W_k$ is the FV of the $k$-th cock. The hens make up the largest proportion of individuals in the entire chicken population. Their position update formula is given in Equation (20):

$$P_i^j(t+1) = P_i^j(t) + K_1 \times \mathrm{Random} \times \left(P_{r1}^j(t) - P_i^j(t)\right) + K_2 \times \mathrm{Random} \times \left(P_{r2}^j(t) - P_i^j(t)\right)$$

$$K_1 = \exp\left(\frac{W_i - W_{r1}}{|W_i| + \varepsilon}\right), \qquad K_2 = \exp\left(W_{r2} - W_i\right) \tag{20}$$
where $\mathrm{Random}$ is a uniformly distributed random number between zero and one, $r1$ is the index of the cock in the group to which the $i$-th hen belongs, and $r2$ is the index of some cock other than the one in the $i$-th hen's group, so that $r1 \ne r2$. The chicks follow their hen while searching, and the chick position update is given in Equation (21):

$$P_i^j(t+1) = P_i^j(t) + FL \times \left(P_m^j(t) - P_i^j(t)\right) \tag{21}$$

Here, $FL$ is a random value uniformly distributed between zero and two, and $P_m^j(t)$ is the position of the hen that the $i$-th chick follows. The CSO system defines an FF to achieve better classifier performance. The FF assigns a positive value that represents the quality of a candidate solution. In this work, the classification error rate, which is to be minimized, is taken as the FF, as given in Equation (22). A better solution has a lower error rate, and a worse solution has a higher error rate:

$$\mathrm{fitness}(x_i) = \mathrm{ClassifierErrorRate}(x_i) = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100 \tag{22}$$
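The three position updates and the error-rate fitness can be sketched together in Python. This is an illustrative reading of Equations (19) to (22) and not the authors' code: positions are lists of floats, all function names and the constant `EPS` are hypothetical, and, following the wording of the text, the value of sigma squared is passed to the Gaussian sampler as its spread.

```python
import math
import random

EPS = 1e-12  # small constant epsilon that avoids division by zero


def rooster_update(pos, w_i, w_k, rng):
    """Equation (19): w_i is this rooster's FV, w_k that of another random rooster."""
    sigma2 = 1.0 if w_i < w_k else math.exp((w_k - w_i) / (abs(w_i) + EPS))
    return [p * (1.0 + rng.gauss(0.0, sigma2)) for p in pos]


def hen_update(pos, pos_r1, pos_r2, w_i, w_r1, w_r2, rng):
    """Equation (20): move toward the group rooster r1 and another individual r2."""
    k1 = math.exp((w_i - w_r1) / (abs(w_i) + EPS))
    k2 = math.exp(w_r2 - w_i)
    return [
        p + k1 * rng.random() * (a - p) + k2 * rng.random() * (c - p)
        for p, a, c in zip(pos, pos_r1, pos_r2)
    ]


def chick_update(pos, pos_mother, fl):
    """Equation (21): follow the mother hen; fl is drawn from [0, 2]."""
    return [p + fl * (m - p) for p, m in zip(pos, pos_mother)]


def error_rate_fitness(y_true, y_pred):
    """Equation (22): percentage of misclassified samples."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true) * 100


rng = random.Random(42)  # fixed seed, chosen only for repeatability
new_rooster = rooster_update([1.0, 2.0], w_i=0.2, w_k=0.1, rng=rng)
new_chick = chick_update([0.0, 1.0], [1.0, 1.0], fl=0.5)
print(new_chick)  # [0.5, 1.0]
print(error_rate_fitness([0, 1, 1, 0], [0, 1, 0, 0]))  # 25.0
```

In the hyperparameter tuning setting described here, each position would encode a candidate set of DSAE parameters, and `error_rate_fitness` would be evaluated on the classifier trained with those parameters.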
4. Results and Discussion
The SFSA-DLIDS model is validated experimentally on two benchmark datasets, namely the NSLKDD 2015 [27] and CICIDS 2017 [28] datasets. Table 1 gives the details of the two datasets. The NSLKDD 2015 dataset holds samples under two classes: 67,343 samples in the normal class and 58,630 samples in the anomaly class. The CICIDS 2017 dataset holds 50,000 samples in the normal class and 50,000 samples in the anomaly class.

Table 1. Dataset details (number of samples per class).

| Class | NSLKDD 2015 | CICIDS 2017 |
|---|---|---|
| Normal | 67,343 | 50,000 |
| Anomaly | 58,630 | 50,000 |
| Total | 125,973 | 100,000 |
Figure 3 shows the confusion matrices produced by the SFSA-DLIDS approach on the NSLKDD 2015 dataset. With 70% of the data used for training (TR), the SFSA-DLIDS model classified 46,762 samples as normal and 40,054 samples as anomaly. With 30% of the data used for testing (TS), it classified 20,147 samples as normal and 17,062 samples as anomaly. With 20% of the data used for TS, it classified 13,390 samples as normal and 11,665 samples as anomaly.
Table 2 and Figure 4 present the overall classification results of the SFSA-DLIDS model on the NSLKDD 2015 dataset. The SFSA-DLIDS model achieved high values on all measures. For instance, with 70% of TR data, it obtained an average accuracy of 97.74%, precision of 97.76%, recall of 97.74%, and F-score of 97.74%. With 30% of TS data, it obtained an average accuracy of 97.86%, precision of 97.88%, recall of 97.87%, and F-score of 97.86%. With 20% of TS data, it obtained an average accuracy of 99.32%, precision of 99.32%, recall of 99.33%, and F-score of 99.32%.
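For reference, the four measures reported here are conventionally derived from the entries of a binary confusion matrix. The sketch below shows those standard definitions for one class treated as positive; the counts are hypothetical and are not taken from the paper's confusion matrices, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F-score for one class treated as positive."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score


# Hypothetical counts, for illustration only.
accuracy, precision, recall, f_score = binary_metrics(tp=90, fp=10, fn=30, tn=70)
```

Computing the measures once with each class as the positive class and averaging the results gives per-class and average rows of the kind reported for the model.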
Figure 3. Confusion matrices of SFSA-DLIDS approach under NSLKDD 2015 dataset: (a) 70% of TR
data, (b) 30% of TS data, (c) 80% of TR data, and (d) 20% of TS data.
The training accuracy (TA) and validation accuracy (VA) achieved by the SFSA-DLIDS
approach on the NSLKDD 2015 dataset are shown in Figure 5. The SFSA-DLIDS algorithm
attained high values of both TA and VA; specifically, the VA is higher than the TA.
Table 2. Result analysis of SFSA-DLIDS approach with various measures on NSLKDD 2015 dataset (values in %).
Class Labels   Accuracy   Precision   Recall   F-Score
Training Phase (70%)
Normal         97.74      98.76       96.67    97.71
Anomaly        97.74      96.76       98.80    97.77
Average        97.74      97.76       97.74    97.74
Testing Phase (30%)
Normal         97.86      98.94       96.79    97.85
Anomaly        97.86      96.82       98.95    97.87
Average        97.86      97.88       97.87    97.86
Training Phase (80%)
Normal         99.35      99.34       99.36    99.35
Anomaly        99.35      99.37       99.34    99.35
Average        99.35      99.35       99.35    99.35
Testing Phase (20%)
Normal         99.32      99.39       99.28    99.33
Anomaly        99.32      99.26       99.37    99.32
Average        99.32      99.32       99.33    99.32
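The per-class and average rows of a table like this can be derived from a 2x2 confusion matrix. The following is a minimal sketch, not the authors' implementation: the function name `binary_report`, the macro-averaging of the two classes, and the example counts are all illustrative assumptions and are not taken from Figure 3. It also shows why the accuracy column is identical for both classes in a binary problem.

```python
from typing import Dict, List, Sequence


def binary_report(cm: List[List[int]],
                  labels: Sequence[str] = ("Normal", "Anomaly")) -> Dict[str, Dict[str, float]]:
    """Per-class and macro-averaged metrics from a 2x2 confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Assumes every class has at least one true and
    one predicted sample (no zero-division guard in this sketch).
    """
    total = sum(sum(row) for row in cm)
    # Accuracy is a property of the whole matrix, so it is the same for each class.
    accuracy = (cm[0][0] + cm[1][1]) / total

    report: Dict[str, Dict[str, float]] = {}
    for k, name in enumerate(labels):
        true_positive = cm[k][k]
        predicted_as_k = cm[0][k] + cm[1][k]   # column sum
        actually_k = cm[k][0] + cm[k][1]       # row sum
        precision = true_positive / predicted_as_k
        recall = true_positive / actually_k
        f_score = 2 * precision * recall / (precision + recall)
        report[name] = {"accuracy": accuracy, "precision": precision,
                        "recall": recall, "f_score": f_score}

    # Macro average: unweighted mean of the per-class values.
    report["Average"] = {
        metric: sum(report[name][metric] for name in labels) / len(labels)
        for metric in ("accuracy", "precision", "recall", "f_score")
    }
    return report


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = [[970, 30],
               [10, 990]]
    for row_name, metrics in binary_report(example).items():
        print(row_name, {m: round(100 * v, 2) for m, v in metrics.items()})
```

Under this reading, the "Average" row is the unweighted mean of the Normal and Anomaly rows, which matches the arithmetic of the precision and recall columns in the table.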
Figure 4. Average analysis of SFSA-DLIDS approach under NSLKDD 2015 dataset.
Figure 5. TA and VA analysis of SFSA-DLIDS approach under NSLKDD 2015 dataset.
The training loss (TL) and validation loss (VL) obtained by the SFSA-DLIDS methodology
on the NSLKDD 2015 dataset are shown in Figure 6. The SFSA-DLIDS technique exhibited
low values of both TL and VL; in particular, the VL is lower than the TL.
Figure 6. TL and VL analysis of SFSA-DLIDS approach under NSLKDD 2015 dataset.
Figure 7 shows the confusion matrices generated by the SFSA-DLIDS algorithm on the
CICIDS 2017 dataset. With 70% of the data used for training, the SFSA-DLIDS methodology
recognized 33,748 samples as normal and 34,669 samples as anomaly. With the 30% TS split,
it recognized 14,607 samples as normal and 14,752 samples as anomaly. With the 20% TS
split, it recognized 10,026 samples as normal and 9839 samples as anomaly.
Figure 7. Confusion matrices of SFSA-DLIDS approach under CICIDS 2017 dataset: (a) 70% of TR
data, (b) 30% of TS data, (c) 80% of TR data, and (d) 20% of TS data.
Table 3 and Figure 8 show the overall classification results of the SFSA-DLIDS technique
on the CICIDS 2017 dataset. The SFSA-DLIDS approach achieved improved results on every
measure. For example, with 70% of TR data, the SFSA-DLIDS method attained an average
accuracy of 98.45%, precision of 98.51%, recall of 98.39%, and F-score of 98.44%. With 30%
of TS data, it attained an average accuracy of 98.46%, precision of 98.52%, recall of
98.39%, and F-score of 98.45%. With 20% of TS data, it attained an average accuracy of
99.44%, precision of 99.44%, recall of 99.44%, and F-score of 99.44%.
Table 3. Result analysis of SFSA-DLIDS approach with various measures on CICIDS 2017 dataset (values in %).
Class Labels   Accuracy   Precision   Recall   F-Score
Training Phase (70%)
Normal         98.45      97.79       99.35    98.56
Anomaly        98.45      99.24       97.42    98.32
Average        98.45      98.51       98.39    98.44
Testing Phase (30%)
Normal         98.46      97.79       99.37    98.57
Anomaly        98.46      99.26       97.40    98.32
Average        98.46      98.52       98.39    98.45
Training Phase (80%)
Normal         99.48      99.49       99.54    99.52
Anomaly        99.48      99.47       99.42    99.44
Average        99.48      99.48       99.48    99.48
Testing Phase (20%)
Normal         99.44      99.47       99.49    99.48
Anomaly        99.44      99.41       99.40    99.40
Average        99.44      99.44       99.44    99.44
Figure 8. Average analysis of SFSA-DLIDS approach under CICIDS 2017 dataset.
The TA and VA attained by the SFSA-DLIDS algorithm on the CICIDS 2017 dataset are
shown in Figure 9. The SFSA-DLIDS algorithm attained high values of both TA and VA;
specifically, the VA is higher than the TA.
Figure 9. TA and VA analysis of SFSA-DLIDS approach under CICIDS 2017 dataset.
The TL and VL obtained by the SFSA-DLIDS technique on the CICIDS 2017 dataset are
shown in Figure 10. The SFSA-DLIDS approach achieved low values of both TL and VL;
in particular, the VL is lower than the TL.
Figure 10. TL and VL analysis of SFSA-DLIDS approach under CICIDS 2017 dataset.
To confirm the enhanced performance of the SFSA-DLIDS model, a comparative
examination is presented in Table 4 [3,13]. The WISARD, Forest-PA, and LIB-SVM models
obtained the lowest accuracy values of 96.22%, 96.53%, and 96.56%, respectively. The GSAE
and AE-RF models attained slightly higher accuracy values of 97.44% and 97.55%,
respectively. Although the FURIA model reached a reasonable accuracy of 98.82%, the
SFSA-DLIDS model achieved the maximum accuracy of 99.44%. These results indicate that
the SFSA-DLIDS model offers improved intrusion detection performance in the CPS
environment.
Table 4. Comparative analysis of SFSA-DLIDS approach with existing algorithms (values in %).
Methods      Accuracy   Precision   Recall   F1-Score
SFSA-DLIDS   99.44      99.44       99.44    99.44
GSAE         97.44      96.44       98.79    97.74
AE-RF        97.55      97.08       98.15    97.66
WISARD       96.22      97.27       96.85    98.75
Forest-PA    96.53      96.99       96.85    97.88
LIB-SVM      96.56      97.38       97.32    97.75
FURIA        98.82      97.83       96.94    98.55
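The size of the accuracy gap between SFSA-DLIDS and each compared method follows directly from the accuracy column of Table 4. The short sketch below makes that arithmetic explicit; the helper name `margins_over_baselines` is an illustrative assumption, and the figures are copied from the table rather than recomputed from any experiment.

```python
from typing import Dict

# Accuracy column of Table 4, in percent.
ACCURACY: Dict[str, float] = {
    "SFSA-DLIDS": 99.44,
    "GSAE": 97.44,
    "AE-RF": 97.55,
    "WISARD": 96.22,
    "Forest-PA": 96.53,
    "LIB-SVM": 96.56,
    "FURIA": 98.82,
}


def margins_over_baselines(scores: Dict[str, float], proposed: str) -> Dict[str, float]:
    """Accuracy difference (percentage points) between `proposed` and every other method."""
    return {
        name: round(scores[proposed] - value, 2)
        for name, value in scores.items()
        if name != proposed
    }


if __name__ == "__main__":
    margins = margins_over_baselines(ACCURACY, "SFSA-DLIDS")
    # List the baselines from the closest competitor to the furthest.
    for name, gap in sorted(margins.items(), key=lambda item: item[1]):
        print(f"{name}: +{gap:.2f} percentage points")
```

The smallest gap is to FURIA (0.62 percentage points) and the largest is to WISARD (3.22 percentage points).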
5. Conclusions
In this article, the SFSA-DLIDS method was devised for the identification and
classification of intrusions in the CPS environment. The presented SFSA-DLIDS model first
applied min-max data normalization to convert the input data to a compatible format,
followed by the SFSA technique to select a subset of features. Finally, the CSO-DSAE
approach was utilized for the identification and classification of intrusions. The CSO
algorithm was used to optimize the parameters of the DSAE model and thereby improve the
classifier results. The SFSA-DLIDS model was validated through a series of experiments.
The experimental results established the enhanced performance of the SFSA-DLIDS method
over the existing ones, with maximum accuracies of 99.35% and 99.48% on the NSLKDD 2015
and CICIDS 2017 datasets, respectively (80% training phase, Tables 2 and 3; the
corresponding 20% testing-phase accuracies are 99.32% and 99.44%). Therefore, the
presented SFSA-DLIDS model can be employed as an effective tool to recognize intrusions
in the CPS environment. In the future, outlier detection approaches should be integrated
to improve the overall detection efficiency of the SFSA-DLIDS technique. In addition, the
proposed model can be realized in a big data environment in our future work.
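As an illustration of the first stage summarized above, min-max normalization rescales every feature to the range [0, 1] using (x - min) / (max - min). The sketch below is a plain-Python illustration under stated assumptions, not the authors' code: the function name `min_max_normalize` and the choice to map a constant feature to 0.0 are hypothetical.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Rescale each feature (column) of `rows` to [0, 1].

    Each value becomes (x - min) / (max - min), where min and max are taken
    over that feature's column. A constant column has max == min, so it is
    mapped to 0.0 here (an assumption made to avoid dividing by zero).
    """
    columns = list(zip(*rows))
    minimums = [min(column) for column in columns]
    maximums = [max(column) for column in columns]

    normalized = []
    for row in rows:
        normalized.append([
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, minimums, maximums)
        ])
    return normalized


if __name__ == "__main__":
    # Three hypothetical records with three features each.
    samples = [[0.0, 10.0, 5.0],
               [5.0, 20.0, 5.0],
               [10.0, 30.0, 5.0]]
    for record in min_max_normalize(samples):
        print(record)
```

In practice the minimum and maximum are usually computed on the training split only and then reused for the testing split, so that no information from the test data leaks into preprocessing.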
Author Contributions: Conceptualization, M.A.D. and K.A.A.; methodology, F.S.A.; software, I.Y.;
validation, S.S.A., K.A.A.; formal analysis, E.M.T.E.D.; investigation, K.A.A.; resources, A.A.A.; data
curation, A.A.A.; writing—original draft preparation, S.S.A.; writing—review and editing, K.A.A.;
visualization, A.M.; supervision, F.S.A.; project administration, A.M.; funding acquisition, F.S.A. All
authors have read and agreed to the published version of the manuscript.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number
PNURSP2022R319, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The
authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for
supporting this work, grant number 22UQU4210118DSR34.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing is not applicable to this article as no datasets were
generated during the current study.
Conflicts of Interest: The authors declare that they have no conflict of interest. The manuscript was
written through contributions of all authors. All authors have given approval to the final version of
the manuscript.
References
1. Schneble, W.; Thamilarasu, G. Optimal feature selection for intrusion detection in medical cyber-physical systems. In Proceedings of the 2019 11th International Conference on Advanced Computing (ICoAC), Chennai, India, 18–20 December 2019; pp. 238–243.
2. Wickramasinghe, C.S.; Marino, D.L.; Amarasinghe, K.; Manic, M. Generalization of deep learning for cyber-physical system security: A survey. In Proceedings of the IECON 2018—44th Annual Conference of the IEEE Industrial Electronics Society, Washington, DC, USA, 21–23 October 2018; pp. 745–751.
3. Thakur, S.; Chakraborty, A.; De, R.; Kumar, N.; Sarkar, R. Intrusion detection in cyber-physical systems using a generic and domain specific deep autoencoder model. Comput. Electr. Eng. 2021, 91, 107044.
4. Teyou, D.; Kamdem, G.; Ziazet, J. Convolutional neural network for intrusion detection system in cyber physical systems. arXiv 2019, arXiv:1905.03168.
5. Al-Qarafi, A.; Alrowais, F.; Alotaibi, S.S.; Nemri, N.; Al-Wesabi, F.N.; Al Duhayyim, M.; Marzouk, R.; Othman, M.; Al-Shabi, M. Optimal Machine Learning Based Privacy Preserving Blockchain Assisted Internet of Things with Smart Cities Environment. Appl. Sci. 2022, 12, 5893.
6. Panigrahi, R.; Borah, S.; Pramanik, M.; Bhoi, A.K.; Barsocchi, P.; Nayak, S.R.; Alnumay, W. Intrusion detection in cyber–physical environment using hybrid Naïve Bayes—Decision table and multi-objective evolutionary feature selection. Comput. Commun. 2022, 188, 133–144.
7. Albraikan, A.A.; Hassine, S.B.H.; Fati, S.M.; Al-Wesabi, F.N.; Hilal, A.M.; Motwakel, A.; Hamza, M.A.; Al Duhayyim, M. Optimal Deep Learning-based Cyberattack Detection and Classification Technique on Social Networks. Comput. Mater. Contin. 2022, 72, 907–923.
8. Yadav, S.; Kalpana, R. A Survey on Network Intrusion Detection Using Deep Generative Networks for Cyber-Physical Systems. In Artificial Intelligence Paradigms for Smart Cyber-Physical Systems; Springer: Berlin/Heidelberg, Germany, 2021; pp. 137–159.
9. Alohali, M.A.; Al-Wesabi, F.N.; Hilal, A.M.; Goel, S.; Gupta, D.; Khanna, A. Artificial intelligence enabled intrusion detection systems for cognitive cyber-physical systems in industry 4.0 environment. Cogn. Neurodyn. 2022.
10. Maleh, Y. Machine learning techniques for IoT intrusions detection in aerospace cyber-physical systems. In Machine Learning and Data Mining in Aerospace Technology; Springer: Cham, Switzerland, 2020; pp. 205–232.
11. Jamal, A.A.; Majid, A.-A.M.; Konev, A.; Kosachenko, T.; Shelupanov, A. A review on security analysis of cyber physical systems using Machine learning. Mater. Today Proc. 2021.
12. Sharma, M.; Elmiligi, H.; Gebali, F. A Novel Intrusion Detection System for RPL-Based Cyber–Physical Systems. IEEE Can. J. Electr. Comput. Eng. 2021, 44, 246–252.
13. Alkayem, N.F.; Shen, L.; Asteris, P.G.; Sokol, M.; Xin, Z.; Cao, M. A new self-adaptive quasi-oppositional stochastic fractal search for the inverse problem of structural damage assessment. Alex. Eng. J. 2021, 61, 1922–1936.
14. Li, B.; Wu, Y.; Song, J.; Lu, R.; Li, T.; Zhao, L. DeepFed: Federated Deep Learning for Intrusion Detection in Industrial Cyber–Physical Systems. IEEE Trans. Ind. Inform. 2020, 17, 5615–5624.
15. de Araujo-Filho, P.F.; Kaddoum, G.; Campelo, D.R.; Santos, A.G.; Macedo, D.; Zanchettin, C. Intrusion Detection for Cyber–Physical Systems Using Generative Adversarial Networks in Fog Environment. IEEE Internet Things J. 2020, 8, 6247–6256.
16. Althobaiti, M.M.; Kumar, K.P.M.; Gupta, D.; Kumar, S.; Mansour, R.F. An intelligent cognitive computing based intrusion detection for industrial cyber-physical systems. Measurement 2021, 186, 110145.
17. Gao, Y.; Chen, J.; Miao, H.; Song, B.; Lu, Y.; Pan, W. Self-Learning Spatial Distribution-Based Intrusion Detection for Industrial Cyber-Physical Systems. IEEE Trans. Comput. Soc. Syst. 2022, 1–10.
18. Ibor, A.E.; Okunoye, O.B.; Oladeji, F.A.; Abdulsalam, K.A. Novel Hybrid Model for Intrusion Prediction on Cyber Physical Systems’ Communication Networks based on Bio-inspired Deep Neural Network Structure. J. Inf. Secur. Appl. 2022, 65, 103107.
19. Kaddoura, S.; Arid, A.E.; Moukhtar, M. Evaluation of Supervised Machine Learning Algorithms for Multi-class Intrusion Detection Systems. In Proceedings of the Future Technologies Conference, Vancouver, BC, Canada, 28–29 October 2021; Springer: Cham, Switzerland, 2021; pp. 1–16.
20. Quincozes, S.E.; Passos, D.; Albuquerque, C.; Mossé, D.; Ochi, L.S. An extended assessment of metaheuristics-based feature selection for intrusion detection in CPS perception layer. Ann. Telecommun. 2022, 1–15.
21. Nagarajan, S.M.; Deverajan, G.G.; Bashir, A.K.; Mahapatra, R.P.; Al-Numay, M.S. IADF-CPS: Intelligent Anomaly Detection Framework towards Cyber Physical Systems. Comput. Commun. 2022, 188, 81–89.
22. Wang, Z.; Li, Z.; He, D.; Chan, S. A lightweight approach for network intrusion detection in industrial cyber-physical systems based on knowledge distillation and deep metric learning. Expert Syst. Appl. 2022, 206, 117671.
23. Çelik, E. Improved stochastic fractal search algorithm and modified cost function for automatic generation control of interconnected electric power systems. Eng. Appl. Artif. Intell. 2020, 88, 103407.
24. Adem, K. Diagnosis of breast cancer with Stacked autoencoder and Subspace kNN. Phys. A Stat. Mech. Appl. 2020, 551, 124591.
25. Meng, X.; Liu, Y.; Gao, X.; Zhang, H. A new bio-inspired algorithm: Chicken swarm optimization. Adv. Swarm Intell. 2014, 5, 86–94.
26. Fu, C.; Li, G.-Q.; Lin, K.-P.; Zhang, H.-J. Short-Term Wind Power Prediction Based on Improved Chicken Algorithm Optimization Support Vector Machine. Sustainability 2019, 11, 512.
27. NSL-KDD Dataset. Available online: https://www.unb.ca/cic/datasets/nsl.html (accessed on 4 June 2022).
28. CICIDS 2017 Dataset. Available online: https://www.unb.ca/cic/datasets/ids-2017.html (accessed on 4 June 2022).
... Detailed sub-categorizations are presented to summarize advancements within each domain. The findings reveal SVM, CNN, decision trees (DTs), and GAs as leading techniques for attaining high classification performance [6][7][8][9][10]. Comparative analysis provides insights into the relative effectiveness and limitations of different algorithms and datasets. ...
... Overall, the observed publication trends demonstrate a substantial growth in research output in the field of IDS from 2018 to 2023, reflecting an increasing focus on enhancing the reliability and effectiveness of intrusion detection techniques. Figure 3 shows authors' production over time, where Motwakel [4][5][6][7][8][9][10] published seven papers in 2022 and 2023 and he focused in his research on optimization and feature selection based on intrusion detection. In his research, the highest accuracy he reached was 99.87 using sand paper optimization. ...
Article
Full-text available
Machine learning (ML) and deep learning (DL) techniques have demonstrated significant potential in the development of effective intrusion detection systems. This study presents a systematic review of the utilization of ML, DL, optimization algorithms, and datasets in intrusion detection research from 2018 to 2023. We devised a comprehensive search strategy to identify relevant studies from scientific databases. After screening 393 papers meeting the inclusion criteria, we extracted and analyzed key information using bibliometric analysis techniques. The findings reveal increasing publication trends in this research domain and identify frequently used algorithms, with convolutional neural networks, support vector machines, decision trees, and genetic algorithms emerging as the top methods. The review also discusses the challenges and limitations of current techniques, providing a structured synthesis of the state-of-the-art to guide future intrusion detection research.
... Duhayyim et al. [17] designed an original Stochastic Fractal Search Algorithm with DL Driven IDS (SFSA-DLIDS) for cloud-based CPS platform. This introduced method mainly implements a minmax data normalization algorithm for converting an input dataset into well-suited formats. ...
Article
Full-text available
Cyber-physical systems (CPSs) are characterized by their integration of physical processes with computational and communication components. These systems are utilized in various critical infrastructure sectors, including energy, healthcare, transportation, and manufacturing, making them attractive targets for cyberattacks. Intrusion detection system (IDS) has played a pivotal role in identifying and mitigating cyber threats in CPS environments. Intrusion detection in secure CPSs is a critical component of ensuring the integrity, availability, and safety of these systems. The deep learning (DL) algorithm is extremely applicable for detecting cyberattacks on IDS in CPS systems. As a core element of network security defense, cyberattacks can change and breach the security of network systems, and then an objective of IDS is to identify anomalous behaviors and act properly to defend the network from outside attacks. Deep learning (DL) and Machine learning (ML) algorithms are crucial for the present IDS. We introduced an Equilibrium Optimizer with a Deep Recurrent Neural Networks Enabled Intrusion Detection (EODRNN-ID) technique in the Secure CPS platform. The main objective of the EODRNN-ID method concentrates mostly on the detection and classification of intrusive actions from the platform of CPS. During the proposed EODRNN-ID method, a min-max normalization algorithm takes place to scale the input dataset. Besides, the EODRNN-ID method involves EO-based feature selection approach to choose the feature and lessen high dimensionality problem. For intrusion detection, the EODRNN-ID technique exploits the DRNN model. Finally, the hyperparameter related to the DRNN model can be tuned by the chimp optimization algorithm (COA). The simulation study of the EODRNN-ID methodology is verified on a benchmark data. Extensive results display the significant performance of the EODRNN-ID algorithm when compared to existing techniques.
... The QDMO-EDLID approach uses the QDMO method for feature subset selection purposes. Duhayyim et al. [23] present a novel Stochastic Fractal Search Algorithm with DL Driven IDS (SFSA-DLIDS). The SFSA was used for selecting feature subsets. ...
Article
Full-text available
Threat detection in a Cyber-Physical System (CPS) platform is a key feature of ensuring the reliability and security of these connected methods, but digital elements interface with the physical world. CPS platforms are popular in sectors like healthcare, industrial automation, smart cities, and transportation making them vulnerable to different cyber-attacks. Threat detection in CPS contains the detection and mitigation of cybersecurity risks, which disrupt physical processes, compromise data integrity, and potentially cause safety concerns. Machine learning (ML) and deep learning (DL) systems are exploited for detecting anomalies by learning the normal behaviour forms of the CPS and recognizing deviations. This study presents an Automated Threat Detection using the Flamingo Search Algorithm with Optimal Deep Learning (ATD-FSAODL) technique in a CPS environment. Initially, the ATD-FSAODL technique applies FSA-based feature subset selection to elect the better group of features. In addition, the ATD-FSAODL technique makes use of a modified Elman Spike Neural Network (MESNN) model for threat recognition and classification. Finally, the slime mold algorithm (SMA) is used for the optimal selection of the parameters related to the MESNN approach to ensure that the threat detection rate is improved. To estimate the solution of the ATD-FSAODL technique, a sequence of simulations can be carried out on benchmark databases. The performance values portray the capable solution of the ATD-FSAODL methodology with other methods with a maximum accuracy of 99.58%, precision of 99.58%, recall of 99.58%, F-score of 99.58%, and MCC of 99.16%.
Preprint
Full-text available
In the practical world, Cyber-Physical Systems have integrated physical systems and software management in the cyber-world, with networks responsible for information interchange. CPSs are key technologies for various industrial domains, including intelligent medical systems, transport systems, and smart grids. The advancements in cybersecurity have surpassed the rapid growth of CPS, with new security challenges and threat models that lack an integrated and cohesive framework. The review methodology includes the search strategy along with the inclusion and exclusion criteria of fifteen studies conducted in the past ten years. The studies specific to the relevant topic have been added, while the others have been excluded. According to the results, Machine Learning (ML) algorithms and systems can synthesize data. It is employed in cyber-physical security to alleviate concerns regarding the safety and reliability of the findings. ML offers a solution to complex problems, enhancing computer-human interaction and enabling problem-solving in areas where custom-built algorithms are impractical. A comprehensive overview of the application of ML across various domains, such as smart grids, smart vehicles, healthcare systems, and environmental monitoring, has been included. However, a few challenges are associated with implementing ML techniques in CPS networks, including feature selection complexity, model performance, deployment challenges, algorithm biases, model mismatches, and the need to foster a robust safety culture. Overall, integrating ML techniques with CPS networks holds promise for enhancing system safety, reliability, and security but requires ongoing refinement and adaptation to address existing limitations and emerging threats.
Conference Paper
Today, the Internet is used for various applications such as sending or receiving emails, access to multimedia services, social networks, etc. Moreover, the population of people who use the Internet is increasing. Therefore, the issue of Internet of Things (IoT) security has become a major problem today. Recently, Deep Learning has become one of the most influential and valuable methods for detecting abnormalities in IoT Networks. Further, Deep Learning models produced more effective performance than traditional techniques in abnormalities detection. Moreover, the structure and parameters of ML and DL models can be optimized using evolutionary algorithms (EA). The survey aims to overview recent research on anomaly detection in the IoT-based Deep Learning model that EA has enhanced.
Article
The rapid growth of IoT (Internet of Things) and smart services facilitate many CPS (Cyber-Physical Systems) such as smart health, smart grid and so on. Nevertheless, the communication security issues in CPS are becoming more and more important with the growing complexity of the CPS network and the increasing dependency of critical network infrastructure on cyber-based technologies. In recent years, deep learning technology has shown its superiority in detecting communication security attacks, but its high computational complexity and the massive amount of data generated by IoT devices have brought challenges to traditional cloud computing technology in terms of bandwidth and computing resources. In this paper, we have analyzed the characteristics of heterogeneity and hierarchy in attacks on CPS. We have also analyzed the role of edge intelligence in handling the security of large-scale data communication in CPS. Furthermore, we proposed a CPS communication attack detection framework based on edge cloud collaboration, aiming to improve the parallel efficiency of hardware resources when executing detection tasks. We aim to enhance the intelligence of physical devices and the degree of cloud collaboration, satisfying the real-time processing requirements of large-scale, hierarchical CPS attack detection. Furthermore, through simple simulation experiments, we verified the effectiveness of the proposed edge cloud collaboration framework in CPS attack detection.
Chapter
Cyber-physical systems are widely used. Nevertheless, security issues are quite acute for them. First of all, because the system must work constantly without downtime and failures. The Cyber-Physical System (CPS) must quickly transfer the parameters to the monitoring system, but if the system is not flexible enough, fast and optimal, then collisions and additional loads on the CPS may occur. This study proposes a system for monitoring and detecting anomalies for CPS based on the principles of trust, which allows you to verify the correctness of the system and detect possible anomalies. In our study, we focus on traffic analysis and analysis of the CPU operation, since these parameters are the most critical in the operation of the CPS itself. The technique is based on computationally simple algorithms and allows to analyze the basic parameters that are typical for most CPS. These factors make it highly scalable and applicable to various types of CPS, despite the fragmentation and a large number of architectures. A distributed application architecture was developed for monitoring and analyzing trust in the CPS. The calculation results show the possibility of detecting the consequences of the influences of denial-of-service attacks or CPS. In this case, three basic parameters are sufficient for detection. Thus, one of the features of the system is reflexivity in detecting anomalies, that is, we force devices to independently analyze their behavior and make a decision about the presence of anomalies.KeywordsTrustReflectionAnomaly DetectionAttacksDenial of ServiceMonitoring
Article
Currently, the number of Internet of Things (IoT) applications for processing, analyzing, and managing the big data created by smart cities is increasing. Other smart city applications include location-based services, transportation management, and urban design, amongst others. These applications face several challenges, including privacy, data security, mining, and visualization. The blockchain-assisted IoT application (BIoT) offers a new urban computing approach to secure smart cities. The blockchain is a secure and transparent decentralized data-sharing platform, so BIoT is suggested as the optimum solution to the aforementioned challenges. In this view, this study develops an Optimal Machine Learning-based Intrusion Detection System for Privacy Preserving BIoT with Smart Cities Environment, called the OMLIDS-PBIoT technique. The presented OMLIDS-PBIoT technique exploits blockchain (BC) and ML techniques to accomplish security in the smart city environment. To attain this, the presented OMLIDS-PBIoT technique employs data pre-processing in the initial stage to transform the data into a compatible format. Moreover, a golden eagle optimization (GEO)-based feature selection (FS) model is designed to derive useful feature subsets. In addition, a heap-based optimizer (HBO) with a random vector functional link network (RVFL) model is utilized for intrusion classification. Additionally, blockchain technology is exploited for secure data transmission in the IoT-enabled smart city environment. The performance validation of the OMLIDS-PBIoT technique is carried out using benchmark datasets, and the outcomes are inspected under numerous factors. The experimental results demonstrate the superiority of the OMLIDS-PBIoT technique over recent approaches.
Article
In recent days, the Cognitive Cyber-Physical System (CCPS), which integrates machine learning (ML) and artificial intelligence (AI) techniques, has gained significant interest among interdisciplinary researchers. This era is witnessing a rapid transformation in digital technology and AI where brain-inspired computing-based solutions will play a vital role in industrial informatics. The application of CCPS with brain-inspired computing in Industry 4.0 will create a significant impact on industrial evolution. Though CCPSs in the industrial environment offer several merits, security remains a challenging design issue. The rise of AI techniques helps to address cybersecurity issues related to CCPS in the Industry 4.0 environment. With this motivation, this paper presents a new AI-enabled multimodal fusion-based intrusion detection system (AIMMF-IDS) for CCPS in the Industry 4.0 environment. The proposed model initially performs data pre-processing in two ways, namely data conversion and data normalization. In addition, an improved fish swarm optimization based feature selection (IFSO-FS) technique is used for the appropriate selection of features. The IFSO technique is derived by introducing the Levy Flight (LF) concept into the searching mechanism of the conventional FSO algorithm to avoid the local optima problem. Since a single modality is not adequate to accomplish enhanced detection performance, a weighted voting based ensemble model is employed for the multimodal fusion process using recurrent neural network (RNN), bi-directional long short term memory (Bi-LSTM), and deep belief network (DBN) models, which constitutes the novelty of the work. The simulation analysis of the presented model highlighted the improved performance over the recent state-of-the-art techniques in terms of different measures.
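The weighted voting fusion mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the weights are obtained or how ties are handled, so the weights, the labels, and the function name `weighted_vote` below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Fuse per-model class labels by summing each model's weight per label.

    predictions: dict mapping model name -> predicted label
    weights:     dict mapping model name -> non-negative weight (assumed given)
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        scores[label] += weights[model]
    # Sort labels first so that ties are broken deterministically (assumption).
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical outputs of the three base models for one traffic record.
preds = {"rnn": "attack", "bilstm": "normal", "dbn": "attack"}
# Hypothetical weights, e.g. derived from validation accuracy.
w = {"rnn": 0.3, "bilstm": 0.5, "dbn": 0.3}

print(weighted_vote(preds, w))
```

Here the two lower-weight models together outvote the single higher-weight model, which is the behaviour that distinguishes weighted voting from simply trusting the best individual classifier.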
Article
With the rapid development of technology and science, machine learning approaches and deep learning methods have been widely applied in industrial Cyber-Physical Systems. However, there are still some challenging issues for anomaly detection to classify various attacks in industrial CPS to ensure cyber security, especially when dealing with resource-constrained IoT devices. In this paper, we propose a Knowledge Distillation model based on a Triplet Convolution Neural Network (KD-TCNN) to improve the model performance and greatly enhance the speed of anomaly detection for industrial CPS as well as reduce the complexity of the model. Specifically, during the training process, we design a robust model loss function to improve the training stability of the model. A new neural network training method called K-fold cross training is also proposed to enhance the accuracy of anomaly detection. Extensive experimental results demonstrate that the performance metrics of KD-TCNN on the benchmark datasets NSL-KDD and CIC IDS2017 have significant advantages over traditional deep learning approaches and the recent state-of-the-art models. Furthermore, when compared to the original model, our model's computational cost and size are both reduced by roughly 86% with just 0.4% accuracy loss.
Article
Cyber-physical systems (CPS) are multi-layer complex systems that form the basis for the world’s critical infrastructure and, thus, have a significant impact on human lives. In recent years, the increasing demand for connectivity in CPS has brought attention to the issue of cyber security. Aside from traditional information systems threats, CPS faces new challenges due to the heterogeneity of devices and protocols. In this paper, we assess how feature selection may improve different machine learning training approaches for intrusion detection and identify the best features for each intrusion detection system (IDS) setup. In particular, we propose using the F1-Score as a criterion for the adapted greedy randomized adaptive search procedure (GRASP) metaheuristic to improve the intrusion detection performance through binary, multi-class, and expert classifiers. Our numerical results reveal that there are different feature subsets that are more suitable for each combination of IDS approach, classifier algorithm, and attack class. The GRASP metaheuristic found features that accurately detect four DoS (denial of service) attack classes and several variations of injection attacks in cyber-physical systems.
Article
Researchers are motivated to build effective Intrusion Detection Systems (IDSs) because of the implications of malicious actions in computing, communication, and cyber–physical systems. In order to develop signature-based intrusion detection techniques that are suitable for use in cyber–physical environments, state-of-the-art supervised learning algorithms are devised. The main contribution of this research is the introduction of a signature-based intrusion detection model that is based on a hybrid Decision Table and Naive Bayes technique. In addition, the contribution of the suggested method is evaluated by comparing it to the existing literature in the field. In the preprocessing stage, Multi-Objective Evolutionary Feature Selection (MOEFS) has been used to select only five attack features from the recent CICIDS2017 dataset. Keeping in view the class-imbalanced nature of the CICIDS2017 dataset, adequate attack samples have been selected, with more weightage given to the attack classes having a smaller number of instances in the dataset. Decision Table and Naive Bayes models were combined into a hybrid to train and detect intrusions. Botnets, port scans, Denial of Service (DoS)/Distributed Denial of Service (DDoS) attacks such as Golden-Eye, Hulk, Slow httptest, slowloris, and Heartbleed, Brute Force attacks such as Patator (FTP) and Patator (SSH), and Web attacks such as Infiltration, Web Brute Force, SQL Injection, and XSS are all successfully detected by the proposed hybrid detection model. The proposed approach shows an accuracy of 96.8% using five features of CICIDS2017, which is higher than the accuracy of methods discussed in the literature.
Article
Cyber-Physical Systems (CPSs) are becoming one of the most complex, intelligent, and sophisticated types of system. Ensuring security is an important aspect of CPSs. However, with the increase in sophisticated and complex attacks on CPSs, conventional anomaly detection methods are facing problems, and the growth in the volume of data is also becoming challenging, requiring domain-specific knowledge that can be applied directly to analyze these challenges. In order to overcome this problem, various deep learning based anomaly detection systems have been developed. In this research, we propose an anomaly detection approach that integrates an intelligent deep learning technique, the Convolutional Neural Network (CNN), with a Kalman Filter (KF) based Gaussian-Mixture Model (GMM). The proposed model is used for identifying and detecting anomalous behavior in CPSs. The proposed framework consists of two important processes. The first pre-processes the data by transforming and filtering the original data into a new format, achieving privacy preservation of the data. In the second, we propose a GMM-KF integrated deep CNN model for anomaly detection that accurately estimates the posterior probabilities of anomalous and legitimate events in CPSs.
Article
Thanks to the great advancement of cognitive computing, artificial intelligence, big data, and the Internet of Things (IoT) technologies, the fusion of the physical and virtual worlds is changing people’s lifestyles. Although the research and deployment of cyber-physical systems (CPSs) are notably promoted by cognitive computing, the reliability and large-scale application of CPSs are still significantly challenged by some security issues. Therefore, it is meaningful to clarify and address the weaknesses of current intrusion detection methods for CPSs and enhance the ability to identify, analyze, and predict to improve the performance of intrusion detection. In this article, we first propose a novel self-learning spatial distribution algorithm, named Euclidean distance-based between-class learning (EBC learning), which improves between-class learning by calculating the Euclidean distance (ED) among $k$-nearest neighbors of different classes. In addition, a cognitive computing-based intrusion detection method named border-line SMOTE and EBC learning based on random forest (BSBC-RF) is also proposed based on the EBC learning for industrial CPSs. The experimental results over a real industrial traffic dataset show that the proposed EBC learning has strong spatial constraint capability and can improve the prediction and recognition performance. Compared with eight state-of-the-art methods, the proposed method has an accuracy (ACC) exceeding 99.5%, a false alarm rate (FAR) less than 0.06%, and an $F1$ close to 0.99, which is superior to the other methods.
Article
There are growing concerns about the security of communication networks of Cyber Physical Systems (CPSs). In a typical Cyber Physical System (CPS), the plant, actuators, sensors and controller interface through a communication network, which enables computing and data transmission in the CPS. Consequently, the communication network is vulnerable to sophisticated attacks. Attacks on CPS communication networks can cause damage to critical resources and infrastructure. In this sense, it is crucial to accurately predict these attacks in order to minimise their impact on the target CPS networks. In this paper, we propose a novel hybrid approach for intrusion prediction on CPS communication networks. We use a bio-inspired hyperparameter search technique to generate an improved deep neural network structure based on the core hyperparameters of a neural network. Furthermore, we derive a prediction model based on the improved neural network structure and evaluate its performance using two well-known datasets, namely, the CICIDS2017 and UNSW-NB15 datasets. Results obtained from rigorous experimentation show that our model can predict diverse attack types with high accuracy, low error and false positive rates, and outperforms state-of-the-art comparative models.
Chapter
The demand for network security is increasing, and meeting it is becoming a crucial strategy. Accordingly, an intrusion detection system is essential to track cybersecurity attacks, and protection strategies are necessary for this purpose. However, current intrusion detection systems are still developing and seeking higher accuracy. In this paper, supervised learning algorithms (Random Forest, XGBoost, K-Nearest Neighbors (k = 5), Artificial Neural Network, Logistic Regression, Support Vector Machine, and LASSO-LARS) are trained and tested on a preprocessed dataset. It contains benign traffic and up-to-date common attacks (DoS, Probe, R2L, and U2R). In order to measure the performance of each supervised learning algorithm, the F1-score is calculated. As a result, the random forest, XGBoost, and K-nearest neighbors (k = 5) algorithms have better accuracy than the others, with prediction success rates of 99.7%, 99.1%, and 97.6%, respectively. The lowest performance is obtained by the logistic regression algorithm, with a prediction accuracy rate of 71.4%.
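Several of the abstracts above rely on the F1-score as the evaluation criterion. As a reminder of what that measure computes, the following is a minimal sketch for a single class treated as positive; the sample labels are made up for illustration and are not taken from any of the cited datasets, and the function name `f1_for_class` is hypothetical.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1-score of one class: harmonic mean of its precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only (hypothetical data).
y_true = ["DoS", "benign", "DoS", "Probe", "DoS"]
y_pred = ["DoS", "DoS", "benign", "Probe", "DoS"]

# For "DoS": 2 true positives, 1 false positive, 1 false negative.
print(round(f1_for_class(y_true, y_pred, "DoS"), 3))
```

Because F1 combines precision and recall, it can differ noticeably from plain accuracy on class-imbalanced intrusion datasets, which is why the two should not be read interchangeably.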