Deep Ensemble-based Efficient Framework for Network Attack Detection

June 2023

DOI:10.1109/MedComNet58619.2023.10168864

Conference: 2023 21st Mediterranean Communication and Computer Networking Conference (MedComNet)

Authors:

Furqan Rustam

University College Dublin

Ali Raza

University of Lahore

Imran Ashraf

Yeungnam University

Anca D Jurcut

University College Dublin

Furqan Rustam¹, Ali Raza², Imran Ashraf³, Anca Delia Jurcut¹,*

¹ School of Computer Science, University College Dublin, Ireland; furqan.rustam1@gmail.com; anca.jurcut@ucd.ie

² Department of Computer Science, KFUEIT, Rahim Yar Khan, Pakistan; ali.raza.scholarly@gmail.com

³ Dept. of Information & CE, Yeungnam University, Gyeongsan, Republic of Korea; imranashraf@ynu.ac.kr

Abstract: Nowadays, networks play a critical role in business, education, and daily life, allowing people to communicate via different platforms across long distances. However, such communication involves many potential dangers and security vulnerabilities that can compromise the confidentiality, integrity, and privacy of data. Network attacks, malware, hacking, and phishing are increasing daily, resulting in colossal losses. Automated systems based on artificial intelligence can help to detect such attacks efficiently and protect sensitive information. This work proposes an ensemble deep voting classifier (EDVC) to detect network attacks with high accuracy. Long short-term memory (LSTM), recurrent neural network (RNN), and gated recurrent unit (GRU) models are combined in the proposed approach using a majority voting criterion. Experimental results on the NSL-KDD dataset indicate that EDVC achieves an accuracy score of 0.996, outperforming existing state-of-the-art methods.

Index Terms: Network Attack Detection, Machine Learning, Ensemble Learning, Deep Learning, Network Intrusion Detection

I. INTRODUCTION

Networking refers to the interconnection of multiple computing devices, allowing them to exchange data. Data can be shared through various technologies and communication protocols, such as Ethernet, Wi-Fi, or even simple wired connections [1]. The main goal of networking is to enable devices to work together and share resources, such as printers, file servers, and internet connections. In today’s interconnected world, networks play a critical role in business, education, and daily life, enabling people to communicate and share information across long distances. Networking provides several advantages including resource sharing, communication, centralized data management, improved collaboration, increased productivity, scalability, and remote access [2].

With networking applications so widespread, many potential dangers and security vulnerabilities can arise, compromising the confidentiality, integrity, and availability of networked systems and data [3]. Typical network threats include malware, hacking, phishing, denial-of-service (DoS) attacks, man-in-the-middle (MitM) attacks, and spoofing. Specific dangers of network attacks include data theft, system disruption, reputation damage, financial losses, espionage, and infrastructure damage [4]. As network threats increase, so does the need for automated attack detection systems. Artificial intelligence (AI)-based solutions can potentially detect such attacks, enabling timely countermeasures that mitigate the risk of data theft [5]. Such techniques analyze large amounts of network data and identify potential threats in real time, allowing organizations to respond quickly and effectively. Machine learning methods learn patterns from data and use them to identify potential attacks. Integrating such methods into network security can significantly improve an organization’s ability to detect and respond to attacks [6], reducing the risk of successful attacks and protecting valuable information and assets. In this regard, this research makes the following contributions:

• To achieve high accuracy in network attack detection, an ensemble model is proposed in which three deep learning models, namely long short-term memory (LSTM), recurrent neural network (RNN), and gated recurrent unit (GRU), are combined using a majority voting criterion.

• To thoroughly compare the performance of various machine learning and deep learning models for network attack detection, an in-depth analysis is conducted on the NSL-KDD dataset. The models applied in this analysis include decision tree (DT), logistic regression (LR), support vector machine (SVM), Gaussian Naive Bayes (GNB), random forest (RF), RNN, GRU, LSTM, and convolutional neural network (CNN).

• All models are optimized with regard to their hyperparameters to obtain the best possible results. Furthermore, the proposed approach is evaluated against existing state-of-the-art methods to analyze its performance.

The rest of this study is organized as follows: Section II contains the analysis of related literature on network attack detection, while the proposed methodology is described in Section III. The evaluation and discussion of experimental results are given in Section IV. The study is concluded in Section V.

II. RELATED WORK

Network security is a crucial element for organizations to safeguard the privacy and confidentiality of their data. Consequently, a substantial number of research works can be found on network security. A few of the more relevant works are discussed here.

Network intrusion detection based on classical machine learning techniques is proposed in [7]. Eleven machine learning methods are applied to the NSL-KDD dataset. Results show that tree-based techniques achieve the best performance for network intrusion detection. The proposed XGBoost model achieves 97% accuracy for attack detection. Similarly, network intrusion detection using neural networks is performed in [8] on the NSL-KDD dataset. Experimental results indicate that the bidirectional LSTM (BiLSTM) approach with an attention mechanism achieves high performance. Using key and local features from the data, BiLSTM achieves an accuracy score of 84%.

Network attack detection using deep learning techniques is presented in [9]. The study utilizes three network intrusion datasets and proposes a deep neural network for network intrusion detection. The proposed method achieves an accuracy score of 97% on the NSL-KDD dataset. However, the deep neural network performs poorly on the UNSW-NB15 and CSE-CIC-IDS2018 datasets. A DoS attack detection method is presented in [10]. The NSL-KDD dataset is used with various machine learning and deep learning methods, and BiLSTM is used to detect network attacks as a binary classification task. The results show that the BiLSTM technique achieves an accuracy of 82% for network attack detection.

Besides stand-alone models, hybrid models have also been utilized for the same purpose. For example, [11] presents a hybrid intrusion detection model for detecting network attacks. Particle swarm optimization (PSO) and an enhanced genetic technique are combined with a random forest model. Experimental results on the NSL-KDD dataset indicate an 88% accuracy score. Along the same direction, [12] uses hybrid sampling with a deep hierarchical network for detecting network attacks. The NSL-KDD dataset is balanced using the synthetic minority over-sampling technique (SMOTE). Spatial features are extracted using a convolutional neural network, while temporal features are extracted using BiLSTM. The proposed approach achieves an accuracy score of 83%. The detection of network intrusion using a hybrid of machine learning and deep learning models is proposed in [13]. The proposed model is based on a hybrid of k-means, random forest, and neural networks. The adaptive synthetic sampling technique is applied to the dataset for class balancing. The NSL-KDD and CIC-IDS2017 datasets are utilized for the experiments. The proposed hybrid model achieves an accuracy of 85% on the NSL-KDD dataset.

The study [14] uses a simulated annealing-based multi-layer perceptron (MLP) model to detect network intrusion. Binary classification is performed using the neural network-based deep learning model. Three datasets, NSL-KDD, UNSW-NB15, and CICIDS2017, are utilized for experiments. The proposed MLP technique achieves 97% accuracy on the NSL-KDD dataset. Similarly, [15] designed a machine learning-based network intrusion system for software-defined networks. The NSL-KDD dataset is utilized, and 41 features of the dataset are used for multi-class classification. The DDoS, PROBE, R2L, and U2R attacks are detected using the applied techniques. The proposed XGBoost model achieves a 95% accuracy score.

The detection of network attacks using a double-layer hybrid model is proposed in [16]. The proposed model combines the Naive Bayes and SVM models to construct a hybrid model. Principal component analysis (PCA) is used during feature preprocessing. The NSL-KDD dataset is utilized to perform experiments. The proposed machine learning-based hybrid model achieves 89% accuracy for detecting network intrusion. The study [17] proposes network intrusion detection using deep learning-based methods for the fog-assisted Internet of Things (IoT). A network intrusion detection system (NIDS) is developed using a deep learning approach for IoT. The proposed NIDS is a device for attack detection and is implemented in the fog node. Two public datasets, UNSW-NB15 and NSL-KDD, are used to evaluate the performance of the proposed NIDS. The proposed system achieves 95% accuracy for network attack detection in IoT.

Besides using optimized models, some studies focus on feature selection and engineering. For example, the study [18] proposed a neural network-based technique for network intrusion detection. The influential features of the NSL-KDD dataset are selected using the neural network and a condensed nearest neighbors mechanism. The performance of the applied techniques is validated using the WEKA software. The results show that the proposed deep learning-based condensed nearest neighbors technique achieves an accuracy of 94% for network attack detection.

Existing research on network attack detection has several limitations and gaps, which can be summarized as follows:

• Previous studies on network attack detection have relied primarily on classical machine learning and deep learning techniques. However, utilizing advanced ensemble learning techniques may improve the accuracy of network attack detection.

• The accuracy scores reported in previous studies range from 80% to 97%, indicating the need for further improvement in detection accuracy.

III. STUDY METHODOLOGY

Figure 1 shows the architecture of the proposed approach. The NSL-KDD dataset of network attack features is used for experiments. Preprocessing includes encoding the categorical features and mapping the target attack labels. Exploratory data analysis is applied to examine patterns in the network attack features. For experiments, the data is split into training and testing sets with a ratio of 0.8 to 0.2. The proposed approach is used to classify network traffic as Normal, DoS, Remote to Local (R2L), Probe, or User to Root (U2R).

Figure 1. The methodological analysis for network attack detection.
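The 0.8 to 0.2 split mentioned above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' code: the function name, the shuffling, and the seed are assumptions, and the test-set size is rounded up, which happens to agree with the 29704 test predictions reported in the results section.

```python
import math
import random

def train_test_split_indices(n_samples, test_ratio=0.2, seed=42):
    """Shuffle record indices and split them into train/test parts.

    The seed and the round-up rule are illustrative assumptions;
    the paper does not state how its split was drawn.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_ratio)  # round the test size up
    return indices[n_test:], indices[:n_test]

# 148517 is the record count reported for the dataset.
train_idx, test_idx = train_test_split_indices(148517)
print(len(train_idx), len(test_idx))  # 118813 29704
```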

A. Network features data

The publicly available NSL-KDD benchmark dataset of network attack features is utilized [19]. The dataset covers the DoS, R2L, Probe, and U2R network attacks. It contains a total of 148517 records and 43 features related to network attacks.

B. Data Preprocessing

As machine learning models are designed to operate on numerical data, our dataset, which includes categorical features, requires preprocessing to clean the data and convert it into a numerical representation before it is fed to the models. In this regard, we preprocessed the dataset by removing duplicate records and encoding the categorical features. Specifically, the 'protocol_type', 'service', and 'flag' features were encoded using the LabelEncoder module in scikit-learn [20].
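The two preprocessing steps can be illustrated with a small dependency-free sketch. It mimics the behaviour of scikit-learn's LabelEncoder, which assigns integer codes to the sorted unique values of a column, but it is not the authors' pipeline: the helper `fit_label_encoder` and the four sample records are hypothetical and only use value names that occur in NSL-KDD.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code in sorted order."""
    classes = sorted(set(values))
    return {category: code for code, category in enumerate(classes)}

# Hypothetical records; only the three categorical columns are shown.
records = [
    {"protocol_type": "tcp", "service": "http", "flag": "SF"},
    {"protocol_type": "udp", "service": "domain_u", "flag": "SF"},
    {"protocol_type": "tcp", "service": "http", "flag": "SF"},  # duplicate
    {"protocol_type": "icmp", "service": "ecr_i", "flag": "REJ"},
]

# Step 1: remove duplicate records while keeping the original order.
seen = set()
unique_records = []
for record in records:
    key = tuple(sorted(record.items()))
    if key not in seen:
        seen.add(key)
        unique_records.append(record)

# Step 2: fit one encoder per categorical column and encode every record.
columns = ("protocol_type", "service", "flag")
encoders = {col: fit_label_encoder([r[col] for r in unique_records]) for col in columns}
encoded = [{col: encoders[col][r[col]] for col in columns} for r in unique_records]

print(encoders["protocol_type"])  # {'icmp': 0, 'tcp': 1, 'udp': 2}
print(encoded[0])                 # {'protocol_type': 1, 'service': 2, 'flag': 1}
```

Because label encoding imposes an arbitrary integer order on the categories, it suits tree-based models well; other encodings may be preferable for distance- or gradient-based models.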

C. Exploratory Data Analysis

Exploratory data analysis is a crucial step in the data analysis process that helps to identify patterns, trends, and relationships. Graphs and charts are essential tools for visualizing the data and uncovering insights. The bar chart of the target class distribution is shown in Figure 2. The analysis shows that the dataset is highly imbalanced: the 'Normal' class has 77054 samples, the 'DoS' class has 53387 samples, the 'Probe' class has 14077 samples, the 'R2L' class has 3880 samples, and the 'U2R' class has 119 samples. Among the attack classes, DoS and Probe attacks occur most frequently, while R2L and U2R attacks are far less frequent.

Figure 2. The target label distribution analysis is related to network attacks.
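To make the degree of imbalance concrete, the class counts quoted above can be turned into shares of the dataset. The short sketch below only uses the counts given in the text; the imbalance ratio (largest class divided by smallest class) is an illustrative summary that the paper itself does not report.

```python
# Class counts as reported in the exploratory analysis.
class_counts = {"Normal": 77054, "DoS": 53387, "Probe": 14077, "R2L": 3880, "U2R": 119}

total = sum(class_counts.values())
print(total)  # 148517, matching the reported number of records

# Share of each class in the whole dataset.
shares = {label: count / total for label, count in class_counts.items()}
for label, share in shares.items():
    print(f"{label:>6}: {share:.2%}")

# Ratio between the largest and the smallest class.
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())
print(f"imbalance ratio: {imbalance_ratio:.1f}")
```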

The bar chart of service occurrence during network attacks is shown in Figure 3. A network service is an application running at the network application layer. The analysis shows that the HTTP service occurs most frequently, followed by the private service and then the domain_u, smtp, ftp_data, other, eco_i, telnet, and ecr_i services. The dataset analysis concludes that the HTTP service is the most used during a network attack.

Figure 3. The occurrence analysis of service during a network attack.

The occurrence of flags during network attacks is visualized in Figure 4. A flag indicates the state of a particular connection. The analysis shows that the SF, S0, and REJ flags are indicated most frequently during a network attack.

Figure 4. The occurrence analysis of ﬂags during a network attack.

D. Applied Machine and Deep Learning Techniques

The applied machine learning and deep learning methods are described in this section. This study compares five machine learning techniques and four deep learning techniques.

• DT is a popular machine learning technique used for classification and regression tasks [21]. It is a supervised learning algorithm that builds a tree-like structure to model the decision-making process. The tree consists of nodes representing features or attributes and branches representing decision rules. At each node, the model evaluates the value of the corresponding feature and determines which branch to follow. The final leaf nodes represent the predicted class or value. DT models are easy to interpret and visualize, making them popular in fields such as finance, healthcare, and marketing. One of the critical benefits of DT is its ability to handle missing and noisy data, which are often encountered in real-world datasets.

• LR is a widely used machine learning model for classification tasks that involve predicting binary or multi-class outcomes [22]. LR models the relationship between a dependent variable and one or more independent variables. The dependent variable is usually a binary variable representing the presence or absence of a particular characteristic, while the independent variables are continuous or categorical in nature. LR uses a logistic function to model the probability of the dependent variable taking a particular value, given the values of the independent variables. LR has applications in various fields, such as healthcare, finance, marketing, and social sciences. It is most suitable when the relationship between the independent variables and the log-odds of the outcome is approximately linear.

• SVM is a popular machine learning algorithm that is widely used in classification, regression, and outlier detection [23]. SVM is a supervised learning model that works by finding the optimal decision boundary, or hyperplane, that separates data points into different classes. The key is to maximize the margin between the decision boundary and the closest data points, known as support vectors. SVM has several advantages, such as efficiently handling high-dimensional data, non-linear decision boundaries, and large datasets. However, SVM is sensitive to the choice of kernel function and parameters and is prone to overfitting when the number of features is large.

• GNB is a popular machine learning model used for classification problems, especially in the field of natural language processing [24]. It is a probabilistic algorithm that relies on Bayes’ theorem. GNB is a variant of the Naive Bayes algorithm that models each continuous feature with a Gaussian distribution and, like other Naive Bayes variants, assumes that the features of the input data are conditionally independent of each other given the class. This simplifies the computation of the probability that the input data belongs to a particular class, making GNB computationally efficient and effective for large datasets. Despite its simplicity, GNB often performs well compared to more complex algorithms, making it a popular choice in the machine learning community.

• RF is a robust machine learning model that is widely used for network intrusion detection [25], [26]. RF is an ensemble method that combines multiple decision trees to make predictions. Each tree is trained on a random subset of the data and a random subset of the features. This randomization helps reduce overfitting and improves the model’s generalization performance. RF has been successfully applied in a variety of fields, including finance, medicine, and ecology. However, RF can be computationally expensive and may require careful tuning of hyperparameters to achieve optimal performance.

• RNN is widely used for sequential data such as text, speech, and time-series data [27]. Unlike traditional feedforward neural networks, RNN has feedback loops that allow information to persist across time steps. RNN can capture temporal dependencies and learn from past inputs to make better predictions. However, RNN suffers from the problem of vanishing gradients, which limits its ability to learn long-term dependencies. RNN also faces challenges in modeling complex sequences with variable lengths and in handling noisy or incomplete data.

• GRU has gained popularity in recent years for its ability to model sequential data effectively [28]. GRU is a type of RNN with gating mechanisms that allow the model to selectively update and forget information at each time step, improving its ability to capture long-term dependencies. GRU has been shown to outperform other RNNs on tasks such as language modeling, machine translation, and speech recognition. Additionally, GRU has a simpler architecture than other RNNs, making it easier to train and faster to converge.

• LSTM is an RNN variant that is widely used in deep learning [29]. Unlike traditional RNN, LSTM is designed to capture long-term dependencies in sequential data by using a memory cell that can store information over a longer period of time. LSTM is particularly effective in speech recognition, machine translation, and sentiment analysis tasks. In an LSTM model, the input data is first processed through a series of gates, which control the flow of information into and out of the memory cell. The output of an LSTM is then computed based on the contents of the memory cell and the input data. Overall, LSTM is a powerful tool for modeling sequential data and has demonstrated state-of-the-art performance in a wide range of applications.

• CNN, a type of feed-forward neural network, is primarily used to solve classification problems [30]. The CNN model consists of multiple layers of interconnected nodes, with each layer transforming the input from the previous layer to produce the final output. The architecture of a CNN model is based on three types of layers: convolution, pooling, and fully connected dense layers. A disadvantage of the CNN model is that it can suffer from overfitting, where the model becomes too complex and fits the training data too closely, leading to poor performance on new unseen data. To address this, techniques such as regularization, dropout, and early stopping can be employed to improve generalization.
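The gating behaviour described for the GRU in the list above can be written out explicitly. The equations below follow one common textbook formulation and are not taken from the paper; the symbols ($W$, $U$, $b$ for weights and biases, $x_t$ for the input, $h_t$ for the hidden state) are assumed notation, and some references swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{align}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{align}
```

When $z_t$ is close to 0, the previous state is carried forward almost unchanged, which is how the gating helps information persist over many time steps.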

Hyperparameter optimization is performed for each applied model. The best-fit hyperparameters are determined using a recursive training and testing mechanism. Hyperparameter optimization helps the models achieve high performance. The final selected hyperparameters of each model are given in Table I.

Table I
THE HYPERPARAMETERS OF MACHINE LEARNING AND DEEP LEARNING MODELS.

Technique   Hyperparameter values
DT          splitter='best', min_samples_split=2, criterion='entropy', max_depth=5, random_state=500
LR          random_state=10, solver='lbfgs', max_iter=50, multi_class='auto', C=1.0
SVM         random_state=50, max_iter=50, penalty='l2', loss='squared_hinge', tol=1e-4, multi_class='ovr', fit_intercept=True
GNB         var_smoothing=1e-9
RF          n_estimators=20, max_depth=5, random_state=100, criterion='entropy', min_samples_split=2, max_features='sqrt', bootstrap=True

E. Proposed Ensemble Deep Voting Classiﬁer

The EDVC method is proposed to detect network attacks, and its architecture is given in Figure 5. The proposed EDVC technique combines the deep learning-based RNN, GRU, and LSTM methods. We chose these three models based on their individual performance, as they have demonstrated significant accuracy. Additionally, these models share a common trait in that they are all recurrent architectures [31]. EDVC employs a majority voting mechanism to determine the final prediction. Each model generates a prediction for a given sample, and the final prediction is made by aggregating the individual model predictions through voting. The class with the highest number of votes is selected as the final prediction.

Figure 5. The architectural analysis of the proposed EDVC model.

Table II
ARCHITECTURE OF EACH DEEP LEARNING MODEL AND EACH INDIVIDUAL MODEL USED IN EDVC.

GRU                              LSTM                             RNN                              CNN
Sequential()                     Sequential()                     Sequential()                     Sequential()
GRU(64, input_shape=42)          LSTM(64, input_shape=42)         SimpleRNN(64, input_shape=42)    Conv1D(64, 3, input_shape=42, activation='relu')
Dense(16, activation='relu')     Dense(16, activation='relu')     Dense(16, activation='relu')     MaxPool1D(pool_size=(2))
Dropout(0.5)                     Dropout(0.5)                     Dropout(0.5)                     Flatten()
Dense(5, activation='softmax')   Dense(5, activation='softmax')   Dense(5, activation='softmax')   Dense(16, activation='relu')
                                                                                                   Dense(5, activation='softmax')
All models: compile & fit (loss='categorical_crossentropy', optimizer='adam', epochs=10)

The models listed in Table II are designed with a lightweight architecture, which enables them to achieve high accuracy while maintaining simplicity. Each recurrent model begins with its own recurrent layer, followed by a dense layer of 16 neurons with a ReLU activation function, a dropout layer with a 0.5 dropout rate, and another dense layer with 5 neurons and a softmax activation function. Finally, each model is compiled using the categorical cross-entropy loss function, optimized with the Adam optimizer, and trained for 10 epochs. Mathematically, EDVC can be defined as:

\[ lstm_p = \mathrm{LSTM}(ts), \quad ts \in \text{Dataset} \tag{1} \]

\[ rnn_p = \mathrm{RNN}(ts) \tag{2} \]

and,

\[ gru_p = \mathrm{GRU}(ts) \tag{3} \]

where $lstm_p$, $rnn_p$, and $gru_p$ are the predictions by LSTM, RNN, and GRU, respectively, for the target class on a test sample $ts$. The final prediction step can be defined as

\[ fp = \mathrm{mode}\{lstm_p, rnn_p, gru_p\} \tag{4} \]

Here, $fp$ is the final prediction, which results from voting among the individual model predictions. Algorithm 1 shows the step-by-step flow of the proposed model.

Algorithm 1 EDVC Algorithm

Input: NSL-KDD dataset (D)
Output: Attack prediction: Normal, DoS, Probe, R2L, or U2R.
T_LSTM ← LSTM_training(TrS)        // TrS: training set taken from D
T_GRU ← GRU_training(TrS)
T_RNN ← RNN_training(TrS)
for i in TeS do                    // TeS: testing set taken from D
    LSTM_p ← T_LSTM(i)             // LSTM_p is the LSTM prediction for sample i
    GRU_p ← T_GRU(i)               // GRU_p is the GRU prediction
    RNN_p ← T_RNN(i)               // RNN_p is the RNN prediction
    F_Pred ← mode{LSTM_p, GRU_p, RNN_p}    // F_Pred, one of Normal, DoS, Probe, R2L, and U2R, is the final prediction
end for

IV. RESULTS AND DISCUSSIONS

This section contains the results and discussions of experiments for network attack detection.

A. Experimental Setup

Experiments are carried out in the Google Colab environment. The platform provides 90 GB of disk space, a GPU with 13 GB of RAM, and an Intel(R) Xeon(R) system. All models are implemented in the Python 3.0 programming language. Evaluation is carried out using accuracy, precision, recall, and F1 score.
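For reference, the four metrics can be computed from true and predicted labels as in the sketch below. It uses support-weighted averaging across classes, which is an assumption suggested by the "Weighted Avg." row of the class-wise table; the function name and the toy labels are hypothetical.

```python
def classification_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = (tp + fn) / n  # class support as a share of all samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1

# Toy labels for six samples (not taken from the dataset).
y_true = [0, 0, 0, 1, 1, 4]
y_pred = [0, 0, 1, 1, 1, 4]
acc, prec, rec, f1 = classification_scores(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

With support weighting, the weighted recall always equals the accuracy, which is a useful sanity check when reading result tables.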

B. Results of Machine Learning Models

The comparative performance results of the applied machine learning techniques are given in Table III. The analysis demonstrates that the DT and RF techniques achieve excellent performance with an accuracy score of 0.96 each, while GNB performs poorly with an accuracy score of 0.42. The performance of LR and SVM is moderate, with accuracy scores of 0.80 and 0.84, respectively.

Table III
THE COMPARATIVE RESULTS ANALYSIS OF APPLIED MACHINE LEARNING METHODS ON UNSEEN TEST DATA.

Model   Accuracy   Precision   Recall   F1
DT      0.96       0.97        0.97     0.97
LR      0.80       0.74        0.80     0.77
SVM     0.84       0.88        0.84     0.86
GNB     0.42       0.62        0.42     0.33
RF      0.96       0.94        0.96     0.95

The class-wise performance comparisons of the machine learning models are given in Table IV. The analysis demonstrates that the GNB technique achieves poor performance scores for each class. It also indicates that DT, LR, and RF score 0 on target label 3 (U2R), which has very few training samples. DT achieves good scores on the performance metrics, followed by RF. The analysis concludes that only the DT and RF models achieve good scores in the class-wise comparison. In class-wise tables and confusion matrix figures, we represent targets as follows: DoS (0), Probe (1), R2L (2), U2R (3), and Normal (4).

For further verification of results, cross-validation is performed and the results are given in Table V. The analysis demonstrates that GNB and LR achieve poor cross-validation performance. On the other hand, both DT and RF achieve a significantly better mean accuracy score of 0.96 each, with the lowest standard deviations.
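The ten-fold procedure behind these scores can be sketched as follows. This is an illustrative, dependency-free version of contiguous k-fold index generation; whether the authors shuffled or stratified the folds is not stated, so neither is done here, and the function name is hypothetical.

```python
def kfold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first n_samples % k folds receive one extra sample so that every
    record is used for testing exactly once.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

# Fold sizes for the 148517 records of the dataset.
folds = list(kfold_indices(148517, k=10))
print([len(test) for _, test in folds])
```

Each model would be trained on the `train` indices and scored on the `test` indices of every fold, and the mean and standard deviation of the ten scores are what a table such as Table V reports.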

Besides accuracy, precision, recall, and F1 score, the number of correct and wrong predictions is an important evaluation metric for models. The confusion matrices of the applied models are illustrated in Figure 6. The analysis shows that LR and GNB have the highest number of wrong predictions for network attack classification. Only the DT and RF techniques show better performance with a lower number of wrong predictions. According to the confusion matrix values, DT gives 28705 correct predictions and 999 wrong predictions out of 29704 test predictions, while RF gives 28577 correct predictions and 1127 wrong predictions. GNB performs worst with 12556 correct predictions and 17148 wrong predictions.
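The prediction counts above translate directly into accuracy as correct predictions divided by total predictions. The short check below uses only the numbers quoted in the text; it is a worked calculation, not part of the authors' code.

```python
TOTAL_TEST_PREDICTIONS = 29704

# (correct, wrong) prediction counts quoted from the confusion matrices.
prediction_counts = {
    "DT": (28705, 999),
    "RF": (28577, 1127),
    "GNB": (12556, 17148),
}

accuracy = {}
for model, (correct, wrong) in prediction_counts.items():
    # Correct and wrong predictions must add up to the test-set size.
    assert correct + wrong == TOTAL_TEST_PREDICTIONS
    accuracy[model] = correct / TOTAL_TEST_PREDICTIONS
    print(f"{model}: {accuracy[model]:.4f}")
```

The resulting ratios are roughly 0.966 for DT, 0.962 for RF, and 0.423 for GNB. The DT value rounds to 0.97 and truncates to 0.96, the figure listed for accuracy in Table III.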

C. Results of Deep Learning Models

Table VI shows the results of the employed deep learning models. GRU and LSTM show the best performance for network attack detection and achieve a 0.98 accuracy score each. GRU and LSTM tend to show better performance due to their recurrent architecture. Similarly, the simple recurrent architecture RNN achieves a 0.96 accuracy score. The analysis shows that the CNN method performs less well, achieving a 0.93 accuracy score. CNN requires a large feature set to achieve significant results. It is found that deep learning models tend to show better performance than machine learning models for network attack detection.

Figure 6. The confusion matrix analysis of applied machine learning methods.

A comparison of the training and validation loss and accuracy of the employed deep learning models is presented in Figure 7. It shows that the RNN model has a high training loss and a lower training accuracy score during the first epoch of training, followed by GRU and LSTM. After the first training epoch, the neural network optimizer updates the models’ weights, decreasing the loss gradually. The CNN analysis shows that the training loss is very high at the first epoch and the accuracy score is low. The CNN model achieves poor performance scores during the training phase.

The class-wise performance analysis of the employed deep learning models is given in Table VII. Precision, recall, and F1 score are evaluated for each class. The analysis demonstrates that the CNN model achieves poor performance scores for each class. The GRU and LSTM models achieve the highest average precision, recall, and F1 scores of 0.99. For class label 3, the deep learning models still score 0, as the machine learning models do. The analysis concludes that deep learning models perform better for each target class than machine learning techniques.

The k-fold cross-validation analysis of the deep learning models is presented in Table VIII. The network attacks dataset is split into ten folds to test the generalization of each model. The analysis demonstrates that the LSTM model achieves the maximum k-fold accuracy score of 0.98 with the minimum standard deviation of 0.0138. It is followed by GRU and CNN with mean accuracy scores of 0.97 and 0.87,


Table IV

THE CLASS-WISE PERFORMANCE ANALYSIS OF MACHINE LEARNING MODELS.

Model DT LR SVM GNB RF

Target P R F1 P R F1 P R F1 P R F1 P R F1

0 0.98 0.98 0.98 0.83 0.88 0.85 0.92 0.94 0.93 0.39 0.96 0.55 1.00 0.99 0.99

1 0.87 0.96 0.91 0.14 0.07 0.09 0.85 0.65 0.73 0.37 0.06 0.10 0.96 0.97 0.96

2 0.72 0.77 0.74 0.00 0.00 0.00 0.12 0.35 0.17 0.02 0.00 0.00 0.00 0.00 0.00

3 0.00 0.00 0.00 0.00 0.00 0.00 0.26 0.42 0.32 0.05 0.17 0.08 0.00 0.00 0.00

4 0.99 0.97 0.98 0.83 0.93 0.88 0.90 0.83 0.87 0.85 0.14 0.24 0.94 0.99 0.97

Weighted Avg. 0.97 0.97 0.97 0.74 0.80 0.77 0.88 0.84 0.86 0.62 0.42 0.33 0.94 0.96 0.95

(a) The performance analysis of the RNN technique

(b) The performance analysis of the GRU technique

(d) The performance analysis of the CNN technique

Figure 7. The time series-based performance comparison analysis of applied deep learning techniques during training.

Table V

K-FOLD CROSS-VALIDATION RESULTS.

Model K-fold Acc. Standard deviation

DT 10 0.96 +/- 0.0015

LR 10 0.80 +/- 0.0036

SVM 10 0.85 +/- 0.0833

GNB 10 0.42 +/- 0.0041

RF 10 0.96 +/- 0.0019

Table VI

THE COMPARATIVE RESULTS ANALYSIS OF APPLIED DEEP LEARNING

METHODS ON UNSEEN TEST DATA.

Model Acc. P R F1

RNN 0.96 0.96 0.96 0.96

GRU 0.98 0.99 0.99 0.99

LSTM 0.98 0.99 0.99 0.99

CNN 0.93 0.91 0.93 0.92

Table VII

CLASS-WISE PERFORMANCE ANALYSIS OF DEEP LEARNING MODELS.

Model RNN GRU

Target P R F1 P R F1

0 0.96 0.98 0.97 0.99 1.00 0.99

1 0.93 0.92 0.93 0.97 0.99 0.98

2 0.61 0.67 0.64 0.93 0.82 0.87

3 0.00 0.00 0.00 0.00 0.00 0.00

4 0.99 0.96 0.97 0.99 0.99 0.99

Weighted Avg. 0.96 0.96 0.96 0.99 0.99 0.99

Model LSTM CNN

Target P R F1 P R F1

0 0.99 1.00 1.00 0.99 0.94 0.96

1 0.98 0.96 0.97 0.93 0.87 0.90

2 0.91 0.91 0.91 0.00 0.00 0.00

3 0.00 0.00 0.00 0.00 0.00 0.00

4 0.99 0.99 0.99 0.90 0.99 0.94

Weighted Avg. 0.99 0.99 0.99 0.91 0.93 0.92

respectively. The RNN model reaches a mean accuracy of 0.95, and CNN shows the largest standard deviation (0.0849).
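The ten-fold procedure behind Table VIII can be sketched as follows. This is a minimal illustration under stated assumptions: the threshold model is a stand-in, not one of the paper's networks; the data are synthetic; and the population standard deviation is used because the paper does not state which estimator it reports. All function names are hypothetical.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return (mean accuracy, standard deviation) over k held-out folds."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return statistics.mean(scores), statistics.pstdev(scores)


# Stand-in model (an assumption of this sketch): a one-feature threshold
# placed midway between the two class means.
def fit_threshold(X, y):
    mean0 = statistics.mean(x for x, label in zip(X, y) if label == 0)
    mean1 = statistics.mean(x for x, label in zip(X, y) if label == 1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# Synthetic, well-separated toy data: class 0 near 0, class 1 near 10.
rng = random.Random(42)
X = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(10, 1) for _ in range(100)]
y = [0] * 100 + [1] * 100

mean_acc, std_acc = cross_validate(X, y, fit_threshold, predict_threshold, k=10)
print(f"{mean_acc:.2f} +/- {std_acc:.4f}")
```

Each sample is held out exactly once, so the mean summarizes performance on unseen data and the standard deviation indicates how much the score depends on the particular split.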

The confusion matrices of the deep learning models are given in Figure 8. The analysis shows that RNN and CNN have the highest number of wrong predictions for network attack classification. LSTM shows the best performance with the highest

Table VIII

THE K-FOLD CROSS-VALIDATION-BASED RESULTS ANALYSIS OF APPLIED

DEEP LEARNING METHODS.

Model K-fold Acc. Standard deviation

RNN 10 0.95 +/- 0.0438

GRU 10 0.97 +/- 0.0453

LSTM 10 0.98 +/- 0.0138

CNN 10 0.87 +/- 0.0849

number of correct predictions. According to the confusion matrix, GRU gives 29275 correct predictions and 432 wrong predictions out of 29704 test predictions. Similarly, LSTM gives 29327 correct predictions and 377 wrong predictions. Among the deep learning models, CNN performs worst with 1828 wrong predictions and 27876 correct predictions.

Figure 8. The confusion matrix analysis of applied deep learning methods.


Figure 9 shows the performance comparison between machine learning and deep learning models. It indicates that deep learning models perform better for network attack detection than machine learning models.

Figure 9. Performance comparison between machine learning and deep

learning models.

D. Results of Proposed Ensemble Model

The performance of the proposed ensemble approach for detecting network attacks is examined here. Table IX shows the results of the proposed approach in terms of accuracy, precision, recall, and F1 score. Results demonstrate that the proposed model achieves the highest accuracy score of 99.6%, with weighted average precision, recall, and F1 scores of 1.00. Class-wise, the proposed model achieves a precision of 1.00 for classes 0 and 4, 0.99 for class 1, 0.93 for class 2, and 0.89 for class 3. The individual deep learning models and the DT, LR, and RF machine learning models showed 0 precision, recall, and F1 scores for target class 3. In contrast, the proposed model obtains precision, recall, and F1 scores of 0.89, 0.33, and 0.48 for this class. The recall and F1 scores for classes 0 and 4 are 1.00, while class 1 has 0.99 precision, recall, and F1 score. For comparison with the ensemble of deep learning models, this study also combines the best-performing machine learning models under majority voting criteria. For this purpose, DT, RF, and SVM are combined because of their better individual performance. However, the results of this ensemble are inferior to the proposed approach: the ensemble of DT, RF, and SVM achieves only a 0.967 accuracy score, compared to a 0.996 accuracy score for the proposed model.
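The majority voting criterion can be sketched as follows. This is a minimal illustration that assumes hard voting over predicted class labels; the function name `majority_vote`, the tie-breaking rule, and the example predictions are hypothetical and are not specified in the paper.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by hard (majority) voting.

    Ties are broken in favour of the earliest-listed model, which is an
    assumption of this sketch, not something stated in the paper.
    """
    combined = []
    for votes in zip(*model_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Among the most frequent labels, take the first model's vote.
        winner = next(v for v in votes if counts[v] == top)
        combined.append(winner)
    return combined


# Hypothetical predicted labels (classes 0-4) from three base models.
lstm_pred = [0, 1, 2, 3, 4, 0]
rnn_pred = [0, 1, 1, 0, 4, 2]
gru_pred = [0, 4, 2, 3, 4, 1]

ensemble_pred = majority_vote(lstm_pred, rnn_pred, gru_pred)
print(ensemble_pred)
```

With three voters, a label wins whenever at least two models agree, so an error made by a single model is outvoted; only the last sample, where all three models disagree, falls back to the tie-breaking rule.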

The classification performance of the proposed technique based on the confusion matrix is illustrated in Figure 10. The confusion matrix shows that the proposed ensemble technique makes the fewest wrong predictions for network attack detection: 29598 of 29704 predictions are correct and only 106 are wrong.
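The reported accuracy follows directly from these counts. The short check below recomputes it from the prediction counts quoted in the text for three of the models; the helper name `accuracy` is illustrative.

```python
def accuracy(correct, total):
    """Fraction of predictions that are correct."""
    return correct / total


TOTAL = 29704  # number of test predictions reported in the text

# (correct, wrong) counts quoted in the text for each model.
reported = {
    "LSTM": (29327, 377),
    "CNN": (27876, 1828),
    "EDVC": (29598, 106),
}

for name, (correct, wrong) in reported.items():
    assert correct + wrong == TOTAL  # counts must add up to the test set size
    print(f"{name}: {accuracy(correct, TOTAL):.3f}")
```

For EDVC this gives 29598 / 29704, which rounds to the 0.996 accuracy score listed in Table IX.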

E. Comparisons Analysis With State-of-the-art Studies

A comparative analysis of the proposed approach with state-of-the-art studies is given in Table X. Previously published research articles on network attack detection from 2020 to 2023 are taken for comparison. The analysis

Table IX

RESULTS OF THE PROPOSED EDVC MODEL FOR EACH TARGET CLASS.

Model Acc. Target P R F1

EDVC 0.996 0 1.00 1.00 1.00

1 0.99 0.99 0.99

2 0.93 0.99 0.96

3 0.89 0.33 0.48

4 1.00 1.00 1.00

Weighted Avg. 1.00 1.00 1.00

DT+RF+SVM 0.977 0 0.99 0.99 0.99

1 0.93 0.96 0.95

2 0.73 0.28 0.40

3 0.00 0.00 0.00

4 0.96 0.99 0.97

Weighted Avg. 0.96 0.97 0.96

Figure 10. The confusion matrix analysis of our proposed EDVC method

for network attack detection.

shows that most researchers used only classical machine learning and deep learning methods for detecting network attacks. Some studies used ensemble models, but they consider only machine learning models. The comparative analysis demonstrates that the proposed ensemble model outperforms existing models with high accuracy.

Table X

PERFORMANCE COMPARISON WITH STATE-OF-THE-ART STUDIES.

Ref. Year Approach Technique Acc.

[7] 2020 Machine Learning XGBoost 0.97

[8] 2020 Deep learning BLSTM 0.84

[9] 2020 Deep learning DNN 0.97

[12] 2020 Deep learning DHN 0.83

[15] 2021 Machine Learning XGBoost 0.95

[16] 2021 Machine Learning Hybrid NB+SVM 0.89

[17] 2021 Deep learning NIDS 0.95

[18] 2021 Deep learning Condensed Nearest Neighbors 0.94

[13] 2021 Machine Learning Hybrid K-mean+RF 0.85

[10] 2022 Deep learning BiLSTM 0.82

[11] 2022 Machine Learning Improved RF 0.88

[14] 2023 Deep learning MLP 0.97

Our 2023 Deep Learning EDVC 0.996

F. Discussion

The performance of the proposed EDVC model is significantly better than existing state-of-the-art approaches because


of its ensemble architecture. Combining three models under voting criteria helps to improve accuracy. Another factor that helps the models perform well is the NSL-KDD class distribution, illustrated in Figure 11. The t-distributed stochastic neighbor embedding (t-SNE) technique is used to show the feature space in low dimensionality [32]. The t-SNE output illustrates that the dataset is highly correlated with the target classes and is linearly separable, which supports accurate learning because target samples do not overlap. The class distribution in terms of the number of samples is highly imbalanced, which biases the models towards the majority class, but the proposed model shows better performance because of majority voting and achieves a 99.6% accuracy score.

Figure 11. Class distribution for the NSL-KDD dataset.

Table XI displays the computational cost of the learning models used in this study for detecting network attacks. The machine learning models have a lower computational cost than the deep learning models. The proposed approach, EDVC, combines three deep learning models, resulting in a higher computational cost than the simple state-of-the-art models. In the context of network security, accuracy is a crucial factor, since even a 1% vulnerability in the security system can result in significant costs. Therefore, in this study, we prioritize accuracy over computational cost, although this approach has its limitations, which can be addressed in future studies.
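The size of this trade-off can be read off Table XI with simple arithmetic. The sketch below copies the reported times and compares EDVC with its three base models and with the machine learning ensemble; the variable names are illustrative, and how the additional EDVC time arises is not broken down in the paper.

```python
# Times in seconds, copied from Table XI.
times = {
    "DT": 0.35, "LR": 11.11, "SVM": 4.13, "DT+RF+SVM": 6.91,
    "GNB": 0.09, "RF": 1.83, "RNN": 862.90, "GRU": 161.52,
    "LSTM": 263.90, "CNN": 106.87, "EDVC": 1462.79,
}

# Combined time of the three base models used in the ensemble.
base_sum = times["RNN"] + times["GRU"] + times["LSTM"]

# Time reported for EDVC beyond the sum of its base models.
overhead = times["EDVC"] - base_sum

# How many times slower EDVC is than the machine learning ensemble.
ratio_vs_ml_ensemble = times["EDVC"] / times["DT+RF+SVM"]

print(f"base models: {base_sum:.2f} s")
print(f"overhead:    {overhead:.2f} s")
print(f"EDVC / (DT+RF+SVM): {ratio_vs_ml_ensemble:.1f}x")
```

The three base models account for most of the EDVC time, and the ensemble is roughly two orders of magnitude slower than the DT+RF+SVM ensemble.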

V. CONCLUSIONS, LIMITATIONS AND FUTURE WORK

Network attack detection using a deep learning-based ensemble model is proposed. The proposed EDVC technique combines the RNN, GRU, and LSTM deep learning models under majority voting criteria. Experiments are performed on the NSL-KDD dataset using both machine learning and deep learning models for performance comparison. Experimental results indicate that the proposed model achieves superior results and obtains an accuracy of 99.6% for network

Table XI

COMPUTATIONAL COST OF LEARNING MODELS IN TERMS OF TIME

(SECONDS).

Model Time Model Time

DT 0.35 LR 11.11

SVM 4.13 DT+RF+SVM 6.91

GNB 0.09 RF 1.83

RNN 862.90 GRU 161.52

LSTM 263.90 CNN 106.87

EDVC 1462.79

attack detection. The performance of the proposed model is further verified through a comparison with existing state-of-the-art approaches, which shows that it outperforms existing models.

Limitations: Despite the promising results, the proposed model has several limitations. First, the proposed EDVC model is an ensemble of deep learning models, and its computational cost is high compared to the ensemble of machine learning models; it is significantly better in terms of accuracy, but at a high computational cost. Second, the dataset used is highly imbalanced, as the target U2R class contains only 119 samples. This may lead to improper training of the models and thus to poor detection of this attack. Third, the proposed approach has only been tested on the NSL-KDD dataset, and its performance may vary when applied to other datasets.

Future Work: In future work, we intend to reduce the computational cost by focusing on the architecture of the individual deep learning models. The proposed model will be trained on multiple types of attacks to ensure that it can effectively address new threats and attacks that continually emerge. Furthermore, we want to apply advanced data-balancing techniques to enhance the performance of attack-detection models.

ACRONYMS

Description Acronym Description Acronym

Ensemble Deep Voting Classifier EDVC Man-in-the-Middle MitM

Long Short-Term Memory LSTM Artificial Intelligence AI

Recurrent Neural Network RNN Particle Swarm Optimization PSO

Gated Recurrent Unit GRU Synthetic Minority Over-sampling TEchnique SMOTE

Decision Tree DT Multilayered Perceptron MLP

Logistic Regression LR Remote to Local R2L

Support Vector Machine SVM User to Root U2R

Gaussian Naive Bayes GNB Internet of Things IoT

Random Forest RF Network Intrusion Detection System NIDS

Convolutional Neural Network CNN Hypertext Transfer Protocol HTTP

Accuracy Acc. Precision P

Recall R F1 Score F1

REFERENCES

[1] F. Tang, X. Chen, M. Zhao, and N. Kato, “The roadmap of communication and networking in 6G for the metaverse,” IEEE Wireless Communications, 2022.

[2] H. Guo, X. Zhou, J. Liu, and Y. Zhang, “Vehicular intelligence in 6G: Networking, communications, and computing,” Vehicular Communications, vol. 33, p. 100399, 2022.

[3] P. L. Indrasiri, E. Lee, V. Rupapara, F. Rustam, and I. Ashraf, “Malicious traffic detection in IoT and local networks using stacked ensemble classifier,” Computers, Materials and Continua, vol. 71, no. 1, pp. 489–515, 2022.

[4] Y. Maleh, Y. Qasmaoui, K. El Gholami, Y. Sadqi, and S. Mounir, “A comprehensive survey on SDN security: threats, mitigations, and future directions,” Journal of Reliable Intelligent Environments, pp. 1–39, 2022.

[5] J. Wang, J. Liu, J. Li, and N. Kato, “Artificial intelligence-assisted network slicing: Network assurance and service provisioning in 6G,” IEEE Vehicular Technology Magazine, 2023.


[6] M. A. Talukder, K. F. Hasan, M. M. Islam, M. A. Uddin, A. Akhter, M. A. Yousuf, F. Alharbi, and M. A. Moni, “A dependable hybrid machine learning model for network intrusion detection,” Journal of Information Security and Applications, vol. 72, p. 103405, 2023.

[7] J. Liu, B. Kantarci, and C. Adams, “Machine learning-driven intrusion detection for Contiki-NG-based IoT networks exposed to NSL-KDD dataset,” in Proceedings of the 2nd ACM Workshop on Wireless Security and Machine Learning, 2020, pp. 25–30.

[8] T. Su, H. Sun, J. Zhu, S. Wang, and Y. Li, “BAT: Deep learning methods on network intrusion detection using NSL-KDD dataset,” IEEE Access, vol. 8, pp. 29575–29585, 2020.

[9] G. C. Amaizu, C. I. Nwakanma, J.-M. Lee, and D.-S. Kim, “Investigating network intrusion detection datasets using machine learning,” in 2020 International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2020, pp. 1325–1328.

[10] M. Esmaeili, S. H. Goki, B. H. K. Masjidi, M. Sameh, H. Gharagozlou, and A. S. Mohammed, “ML-DDoSnet: IoT intrusion detection based on denial-of-service attacks using machine learning methods and NSL-KDD,” Wireless Communications and Mobile Computing, vol. 2022, 2022.

[11] A. K. Balyan, S. Ahuja, U. K. Lilhore, S. K. Sharma, P. Manoharan, A. D. Algarni, H. Elmannai, and K. Raahemifar, “A hybrid intrusion detection model using EGA-PSO and improved random forest method,” Sensors, vol. 22, no. 16, p. 5986, 2022.

[12] K. Jiang, W. Wang, A. Wang, and H. Wu, “Network intrusion detection combined hybrid sampling with deep hierarchical network,” IEEE Access, vol. 8, pp. 32464–32476, 2020.

[13] C. Liu, Z. Gu, and J. Wang, “A hybrid intrusion detection system based on scalable k-means+ random forest and deep learning,” IEEE Access, vol. 9, pp. 75729–75740, 2021.

[14] S. Cherfi, A. Boulaiche, and A. Lemouari, “Multi-layer perceptron for intrusion detection using simulated annealing,” in Modelling and Implementation of Complex Systems: Proceedings of the 7th International Symposium, MISC 2022, Mostaganem, Algeria, October 30-31, 2022. Springer, 2022, pp. 31–45.

[15] A. O. Alzahrani and M. J. Alenazi, “Designing a network intrusion detection system based on machine learning for software defined networks,” Future Internet, vol. 13, no. 5, p. 111, 2021.

[16] T. Wisanwanichthan and M. Thammawichai, “A double-layered hybrid approach for network intrusion detection system using combined naive Bayes and SVM,” IEEE Access, vol. 9, pp. 138432–138450, 2021.

[17] N. Sahar, R. Mishra, and S. Kalam, “Deep learning approach-based network intrusion detection system for fog-assisted IoT,” in Proceedings of International Conference on Big Data, Machine Learning and their Applications: ICBMA 2019. Springer, 2021, pp. 39–50.

[18] F. Z. Belgrana, N. Benamrane, M. A. Hamaida, A. M. Chaabani, and A. Taleb-Ahmed, “Network intrusion detection system using neural network and condensed nearest neighbors with selection of NSL-KDD influencing features,” in 2020 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS). IEEE, 2021, pp. 23–29.

[19] M HASSAN ZAIB, “NSL-KDD — Kaggle.” [Online]. Available: https://www.kaggle.com/datasets/hassan06/nslkdd

[20] E. Bisong, “Introduction to scikit-learn,” Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners, pp. 215–229, 2019.

[21] A. Pashamokhtari, G. Batista, and H. H. Gharakheili, “AdIoTack: Quantifying and refining resilience of decision tree ensemble inference models against adversarial volumetric attacks on IoT networks,” Computers & Security, vol. 120, p. 102801, 2022.

[22] S. Tufail, S. Batool, and A. I. Sarwat, “A comparative study of binary class logistic regression and shallow neural network for DDoS attack prediction,” in SoutheastCon 2022. IEEE, 2022, pp. 310–315.

[23] A. Raza, H. U. R. Siddiqui, K. Munir, M. Almutairi, F. Rustam, and I. Ashraf, “Ensemble learning-based feature engineering to analyze maternal health during pregnancy and health risk prediction,” PLoS ONE, vol. 17, no. 11, p. e0276525, 2022.

[24] S. Ismail and H. Reza, “Evaluation of naïve Bayesian algorithms for cyber-attacks detection in wireless sensor networks,” in 2022 IEEE World AI IoT Congress (AIIoT). IEEE, 2022, pp. 283–289.

[25] T. Wu, H. Fan, H. Zhu, C. You, H. Zhou, and X. Huang, “Intrusion detection system combined enhanced random forest with SMOTE algorithm,” EURASIP Journal on Advances in Signal Processing, vol. 2022, no. 1, pp. 1–20, 2022.

[26] F. Rustam, M. F. Mushtaq, A. Hamza, M. S. Farooq, A. D. Jurcut, and I. Ashraf, “Denial of service attack classification using machine learning with multi-features,” Electronics, vol. 11, no. 22, p. 3817, 2022.

[27] S. Kaur and M. Singh, “Hybrid intrusion detection and signature generation using deep recurrent neural networks,” Neural Computing and Applications, vol. 32, pp. 7859–7877, 2020.

[28] S. M. Kasongo and Y. Sun, “A deep gated recurrent unit based model for wireless intrusion detection system,” ICT Express, vol. 7, no. 1, pp. 81–87, 2021.

[29] F. Laghrissi, S. Douzi, K. Douzi, and B. Hssina, “Intrusion detection systems using long short-term memory (LSTM),” Journal of Big Data, vol. 8, no. 1, p. 65, 2021.

[30] Y. Lin, H. Zhao, X. Ma, Y. Tu, and M. Wang, “Adversarial attacks in modulation recognition with convolutional neural networks,” IEEE Transactions on Reliability, vol. 70, no. 1, pp. 389–401, 2020.

[31] S. Zargar, “Introduction to sequence learning models: RNN, LSTM, GRU,” no. April, 2021.

[32] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605, 2008. [Online]. Available: http://jmlr.org/papers/v9/vandermaaten08a.html


An Efficient Artificial Intelligence Approach for Early Detection of Cross-Site Scripting Attacks

Article

Full-text available

Apr 2024

Cross-Site Scripting (XSS) attacks continue to pose a significant threat to web applications, compromising the security and integrity of user data. XSS is a web application vulnerability where malicious scripts are injected into websites, allowing attackers to execute arbitrary code in the victim’s browser. The consequences of XSS attacks can be severe, ranging from financial losses to compromising sensitive user information. XSS attacks enable attackers to deface websites, distribute malware, or launch phishing campaigns, compromising the trust and reputation of affected organizations. This study proposes an efficient artificial intelligence approach for the early detection of XSS attacks, utilizing machine learning and deep learning approaches, including Long Short-Term Memory (LSTM). Additionally, advanced feature engineering techniques, such as the Term Frequency-Inverse Document Frequency (TFIDF), are applied and compared to evaluate results. We introduce a novel approach named LSTM-TFIDF (LSTF) for feature extraction, which combines temporal and TFIDF features from the cross-site scripting dataset, resulting in a new feature set. Extensive research experiments demonstrate that the random forest method achieved a high performance of 0.99, outperforming state-of-the-art approaches using the proposed features. A k-fold cross-validation mechanism is utilized to validate the performance of applied methods, and hyperparameter tuning further enhances the performance of XSS attack detection. We have applied Explainable Artificial Intelligence (XAI) to understand the interpretability and transparency of the proposed model in detecting XSS attacks. This study makes a valuable contribution to the growing body of knowledge on XSS attacks and provides an efficient model for developers and security practitioners to enhance the security of web applications.

AE-Net: Novel Autoencoder Based Deep Features for SQL Injection Attack Detection

Article

Full-text available

Nov 2023

SQL injection attacks represent a critical threat to database-driven applications and systems, exploiting vulnerabilities in input fields to inject malicious SQL code into database queries. This unauthorized access enables attackers to manipulate, retrieve, or even delete sensitive data. The unauthorized access through SQL injection attacks underscores the critical importance of robust Artificial Intelligence (AI) based security measures to safeguard against SQL injection attacks. This study’s primary objective is the automated and timely detection of SQL injection attacks through AI without human intervention. Utilizing a preprocessed database of 46,392 SQL queries, we introduce a novel approach, the Autoencoder network (AE-Net), for automatic feature engineering. The proposed AE-Net extracts high-level deep features from SQL textual data, subsequently input into machine learning models for performance evaluations. Extensive experimental evaluation reveals that the extreme gradient boosting classifier outperforms existing studies with an impressive k-fold accuracy score of 0.99 for SQL injection detection. Each applied learning approach’s performance is further enhanced through hyperparameter tuning and validated via k-fold crossvalidation. Additionally, statistical t-test analysis is applied to assess performance variations. Our innovative research has the potential to revolutionize the timely detection of SQL injection attacks, benefiting security specialists and organizati

IVNet: Transfer Learning Based Diagnosis of Breast Cancer Grading Using Histopathological Images of Infected Cells

Article

Full-text available

Jan 2023

Breast cancer constitutes a significant global health concern that impacts millions of women across the world. The diagnosis of breast cancer involves categorizing grades based on the histopathological characteristics of tumor cells. While histopathological assessment remains the established benchmark for breast cancer diagnosis, it is hampered by time-consuming procedures, subjectivity, and susceptibility to human errors. This study introduces a novel approach called ImageNet-VGG16 (IVNet) for the real-time diagnosis of breast cancer within a hospital environment. The research experiments are conducted using a benchmark dataset known as Jimma University Medical Center (JUMC) breast cancer grading. Advanced image processing techniques are applied to preprocess the data, enhancing performance. This preprocessing involves the utilization of Holistically Nested Edge Detection (HED) and Contrast Limited Adaptive Histogram Equalization (CLAHE) for transformation and stain normalization. We employ advanced neural network-based transfer learning techniques to analyze the preprocessed histopathological images and identify affected cells. Various pre-trained models are utilized, including convolutional neural networks (CNN) such as VGG16, ResNet50, InceptionNetv3, ImageNet, MobileNetv3, and EfficientNetV3, in a comparative framework. The principal objective of this research is the accurate classification of breast cancer images into Grade-1, Grade-2 and Grade-3. Through extensive experimental research, we achieved a commendable 97% correct classification rate by utilizing a hybrid of VGG16 and ImageNet as the proposed feature engineering method, IVNet. We also validate our proposed approach performance using other state-of-the-art study data and statistical t-test analysis. Furthermore, we develop a user-friendly Graphical User Interface (GUI) that facilitates real-time cell tracking in histopathological images. 
Our real-time diagnosing application offers valuable insights for treatment planning and assists medical professionals in making prognoses. Moreover, our approach can serve as a reliable decision support system for pathologists and clinicians, particularly in settings constrained by limited resources and restricted access to expertise and equipment.

A Network Traffic Intrusion Detection Method for Industrial Control Systems Based on Deep Learning

Article

Full-text available

Oct 2023

The current mainstream intrusion detection models often have a high false negative rate, significantly affecting intrusion detection systems’ (IDSs) practicability. To address this issue, we propose an intrusion detection model based on a multi-scale one-dimensional convolutional neural network module (MS1DCNN), an efficient channel attention module (ECA), and two bidirectional long short-term memory modules (BiLSTMs). The proposed hybrid MS1DCNN-ECA-BiLSTM model uses the MS1DCNN module to extract features with a different granularity from the input data and uses the ECA module to enhance the weight of important features. Finally, the model carries out sequence learning through two BiLSTM layers. We use the dung beetle optimizer (DBO) to optimize the hyperparameters in the model to obtain better classification results. Additionally, we use the synthetic minority oversampling technique (SMOTE) to fill several samples to reduce the local false negative rate. In this paper, we train and test the model using accurate network data from a water storage industrial control system. In the multi-classification experiment, the model’s accuracy was 97.04%, the precision was 97.17%, and the false negative rate was 2.95%; in the binary classification experiment, the accuracy and false negative rate were 99.30% and 0.7%. Compared with other mainstream methods, our model has a higher score. This study provides a new algorithm for the intrusion detection of industrial control systems.

Novel Class Probability Features for Optimizing Network Attack Detection With Machine Learning

Article

Full-text available

Jan 2023

Network attacks refer to malicious activities exploiting computer network vulnerabilities to compromise security, disrupt operations, or gain unauthorized access to sensitive information. Common network attacks include phishing, malware distribution, and brute-force attacks on network devices and user credentials. Such attacks can lead to financial losses due to downtime, recovery costs, and potential legal liabilities. To counter such threats, organizations use Intrusion Detection Systems (IDS) that leverage sophisticated algorithms and machine learning techniques to detect network attacks with enhanced accuracy and efficiency. Our proposed research aims to detect network attacks effectively and timely to prevent harmful losses. We used a benchmark dataset named CICIDS2017 to build advanced artificial intelligence-based machine learning methods. We propose a novel approach called Class Probability Random Forest (CPRF) for network attack detection performance enhancement. We created a novel feature set using the proposed CPRF approach. The CPRF approach predicts the class probabilities from the network attack dataset, which are then used as features for building applied machine learning methods. The comprehensive research results demonstrated that the random forest approach outperformed the state-of-the-art approach with a high-performance accuracy of 99.9%. The performance of each applied technique is validated using a k-fold approach and optimized with hyperparameter tuning. Our novel proposed research has revolutionized network attack detection, effectively preventing unauthorized access, service disruptions, sensitive information theft, and data integrity compromise.

Enabling a Secure IoT Environment Using a Blockchain-Based Local-Global Consensus Manager

Article

Full-text available

Sep 2023

The Internet of Things (IoT) refers to the network of interconnected devices that can communicate and share data over the Internet. The widespread adoption of smart devices within Internet of Things (IoT) networks poses considerable security challenges for their communication. To address these issues, blockchain technology, known for its decentralized and distributed nature, offers potential solutions within consensus-based authentication in IoT networks. This paper presents a novel approach called the local and global layer blockchain model, which aims to enhance security while simplifying implementation. The model leverages the concept of clustering to establish a local-global architecture, with cluster heads assuming responsibility for local authentication and authorization. Implementing a local private blockchain facilitates seamless communication between cluster heads and relevant base stations. This blockchain implementation enhances credibility assurance, strengthens security, and provides an effective network authentication mechanism. Simulation results indicate that the proposed algorithm outperforms previously reported methods. The proposed model achieved an average coverage per node of 0.9, which is superior to baseline models. Additionally, the lightweight blockchain model proposed in this paper demonstrates superior capabilities in achieving balanced network latency and throughput compared to traditional global blockchain approaches.

A Performance Overview of Machine Learning-Based Defense Strategies for Advanced Persistent Threats in Industrial Control Systems

Article

Aug 2023
COMPUT SECUR

Cybersecurity incident response is a very crucial part of the cybersecurity management system. Adversaries emerge and evolve with new cybersecurity tactics, techniques, and procedures (TTPs). It is essential to detect the TTPs in a timely manner to respond effectively and mitigate the vulnerabilities to secure business operations. This research focuses on TTP identification and detection based on a machine learning approach. Early identification and detection are paramount in protecting, responding to, and recovering from such adversarial attacks. Analyzing use cases is a critical tool to ensure proper and in-depth evaluation of sector-specific cybersecurity challenges. In this regard, this study investigates existing known methodologies for cyber-attacks such as Mitre attacks, and developed a method for identifying threat cases. In addition, Windows-based threat cases are implemented, comprehensive datasets are generated, and supervised machine learning models are applied to detect threats effectively and efficiently. Random forest outperforms other models with the highest accuracy of 99%. Future work can be done for generating threat cases based on multiple log sources, including network security and endpoint protection device, and achieve high accuracy by removing false positives using machine learning. Similarly, real-time threat detection is also envisioned for future work.

A Novel Approach For Real-Time Server-Based Attack Detection using Meta-Learning

Article

Mar 2024

Modern networks are crucial for seamless connectivity but face various threats, including disruptive network attacks, which can result in significant financial and reputational risks. To counter these challenges, AI-based techniques are being explored for network protection, requiring high-quality datasets for training. In this study, we present a novel methodology utilizing an Ubuntu Base Server to simulate a virtual network environment for real-time collection of network attack datasets. By employing Kali Linux as the attacker machine and Wireshark for data capture, we compile the Server-based Network Attack (SNA) dataset, showcasing UDP, SYN, and HTTP flood network attacks. Our primary goal is to provide a publicly accessible, server-focused dataset tailored for network attack research. Additionally, we leverage advanced AI methods for real-time detection of network attacks. Our proposed meta-RF-GNB (MRG) model combines Gaussian Naive Bayes and Random Forest techniques for predictions, achieving an impressive accuracy score of 99.99%. We validate the efficiency of MRG using cross-validation, obtaining a notable mean accuracy of 99.94% with a minimal standard deviation of 0.00002. Furthermore, we conducted a statistical t-test to evaluate the significance of MRG compared to other top-performing models.

Analyzing the Effectiveness of Ensemble Based Analysis in Wireless Sensor Networks

Article

Jan 2024

Seng Phil Hong

This paper examines the usefulness of ensemble-based time series analysis in wireless sensor networks (WSNs). An ensemble approach combines multiple strategies to enhance overall predictive performance. The research assesses various strategies using metrics such as robustness and accuracy, and contrasts the effectiveness of traditional time series methods with ensemble-based models. An experimental approach focusing on different wireless sensor network scenarios is employed to evaluate the overall effectiveness of the suggested methods. Additionally, the study looks into how changes to network features such as power supply, communication range, and node density affect the effectiveness of the suggested methods. The findings support the potential to build effective wireless sensor networks with improved predictive performance. The study primarily addresses feature extraction and seasonality reduction for time series data in WSNs. Seasonality is reduced using an ensemble median filter, and feature extraction is performed with principal component analysis. To assess the performance of the suggested ensemble technique on both simulated and real-world WSN data, multiple experiments are carried out. The findings suggest that the ensemble approach can improve the quality of time series data within WSNs and reduce seasonality. Furthermore, when compared to single-sensor strategies, the ensemble technique further improves the accuracy of feature extraction. This work demonstrates the applicability of the ensemble approach to the analysis of time series data in WSNs.

Securing Multi-Environment Networks using Versatile Synthetic Data Augmentation Technique and Machine Learning Algorithms

Conference Paper

Aug 2023

A dependable hybrid machine learning model for network intrusion detection

Article

Feb 2023

Denial of Service Attack Classification Using Machine Learning with Multi-Features

Article

Nov 2022

The exploitation of internet networks through denial-of-service (DoS) attacks has surged continuously over the past few years. Despite the development of advanced intrusion detection and protection systems, network security remains a challenging problem and necessitates the development of efficient and effective defense mechanisms to detect these threats. This research proposes a machine learning-based framework to detect distributed DoS (DDoS) and DoS attacks. For this purpose, a large dataset containing application-layer network traffic is utilized. A novel multi-feature approach is proposed in which principal component analysis (PCA) features and singular value decomposition (SVD) features are combined to obtain higher performance. The multi-feature approach is validated through extensive experiments using several machine learning models. The performance of the machine learning models is evaluated for each class of attack, and results are discussed in terms of accuracy, recall, F1 score, and other metrics in the context of recent state-of-the-art approaches. Experimental results confirm that using multi-feature increases the performance and that RF obtains 100% accuracy.

Ensemble learning-based feature engineering to analyze maternal health during pregnancy and health risk prediction

Article

Nov 2022
PLOS ONE

Maternal health is an important aspect of women’s health during pregnancy, childbirth, and the postpartum period. Specifically, during pregnancy, different health factors such as age, blood disorders, and heart rate can lead to pregnancy complications. Detecting such health factors can reduce the risk of pregnancy-related complications. This study aims to develop an artificial neural network-based system for predicting maternal health risks using health data records. A novel deep neural network architecture, DT-BiLTCN, is proposed that uses decision trees, a bidirectional long short-term memory network, and a temporal convolutional network. Experiments use a dataset of 1218 samples collected from maternal health care, hospitals, and community clinics using an IoT-based risk monitoring system. Class imbalance is resolved using the synthetic minority oversampling technique. DT-BiLTCN provides a feature set for obtaining high-accuracy results, which in this case are achieved by the support vector machine with 98% accuracy. Maternal health exploratory data analysis reveals that the strongest indicators of health risk during pregnancy are diastolic and systolic blood pressure, heart rate, and the age of the pregnant woman. Using the proposed model, health risks associated with pregnancy can be predicted in a timely manner, mitigating the risk of health complications and helping to save lives.

ML-DDoSnet: IoT Intrusion Detection Based on Denial-of-Service Attacks Using Machine Learning Methods and NSL-KDD

Article

Aug 2022
WIREL COMMUN MOB COM

The Internet of Things (IoT) is a complicated security feature in which datagrams are protected by integrity, confidentiality, and authentication services. The network is protected from external interruptions and intrusions. Because IoT devices run with a range of heterogeneous technologies and process data over time, standard solutions may not be practical. It is necessary to develop intelligent procedures that can be used for multiple levels of data flow in the system. This study examines metainnovations using deep learning-based IDS. Per the findings of the earlier tests, BiLSTMs are better for binary (regular/attacker) classification; however, sequential models (LSTM or BiLSTM) are better for detecting some brutal attacks in multiclass classifiers. According to experts, deep learning-based intrusion detection systems can now recognize and select the best structure for each category. However, specific difficulties will need to be solved in the future. Two topics should be studied further in future attempts. One of the researchers’ concerns is the impact of various data processing techniques, such as artificial intelligence or metamethods, on IDS. The BiLSTM approach has chosen the safest instances with the highest accuracy among the models. According to the findings, the most reliable and suitable solution for evaluating DDoS attacks in IoT is the BiLSTM design.

A Hybrid Intrusion Detection Model Using EGA-PSO and Improved Random Forest Method

Article

Aug 2022
SENSORS-BASEL

Due to the rapid growth of information technology, digital data have become increasingly available, creating novel security threats that need immediate attention. An intrusion detection system (IDS) is the most promising solution for preventing malicious intrusions and tracing suspicious network behavioral patterns. Machine learning (ML) methods are widely used in IDS. Due to a limited training dataset, an ML-based IDS generates a higher false detection ratio and encounters data imbalance issues. To deal with the data-imbalance issue, this research develops an efficient hybrid network-based IDS model (HNIDS), which uses the enhanced genetic algorithm and particle swarm optimization (EGA-PSO) and improved random forest (IRF) methods. In the initial phase, the proposed HNIDS utilizes hybrid EGA-PSO methods to enhance the minor data samples and thus produce a balanced data set, so that the attributes of small samples are learned more accurately. In the proposed HNIDS, a PSO method improves the vector. GA is enhanced by adding a multi-objective function, which selects the best features and achieves improved fitness outcomes; this helps explore the essential features, minimize dimensions, enhance the true positive rate (TPR), and lower the false positive rate (FPR). In the next phase, an IRF eliminates the less significant attributes, incorporates a list of decision trees across each iterative process, supervises the classifier’s performance, and prevents overfitting issues. The performance of the proposed method and existing ML methods is tested using the benchmark dataset NSL-KDD. The experimental findings demonstrate that the proposed HNIDS method achieves an accuracy of 98.979% on BCC and 88.149% on MCC for the NSL-KDD dataset, which is far better than the other ML methods, i.e., SVM, RF, LR, NB, LDA, and CART.

Performance Evaluation of Naïve Bayesian Algorithms for Cyber-Attacks Detection in Wireless Sensor Networks

Conference Paper

May 2022

One of the Internet of Things (IoT) operating platforms is the Wireless Sensor Network (WSN), which has proliferated into a broad spectrum of applications. These networks comprise many resource-restricted sensors in terms of sensing, communication, storage, and power. Security becomes a critical concern to protect the network of scarce resources from any malicious activities that target the network. Several solutions have been presented in the literature; however, machine learning has proven its appropriateness in designing energy-efficient detection measures for cyber-attacks targeting WSNs. This paper presents a WSN security performance evaluation of three Naive Bayesian machine learning classification technique variants: Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes, compared to three well-known base algorithms, K-Nearest Neighbors, Support Vector Machine, and Multilayer Perceptron. We applied Spearman correlation as a univariate feature selection. The specialized dataset, WSN-DS, was used for training and testing purposes. The performance of the six classifiers was evaluated in terms of accuracy, probability of detection, positive prediction value, probability of false alarm, probability of misdetection, memory usage, processing time, prediction time, and complexity.
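The evaluation criteria named in the abstract above can be illustrated with a minimal Python sketch. It assumes the conventional confusion-matrix definitions (probability of detection as recall, positive prediction value as precision, probability of false alarm as false positive rate, probability of misdetection as false negative rate); the cited paper may define them differently. The names `ConfusionCounts` and `detection_metrics` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal traffic wrongly flagged as attacks
    tn: int  # normal traffic correctly passed
    fn: int  # attacks that were missed


def detection_metrics(c: ConfusionCounts) -> dict:
    """Compute detection metrics, assuming the standard definitions."""
    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of dividing by zero when a class is empty.
        return num / den if den else 0.0

    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": ratio(c.tp + c.tn, total),
        "probability_of_detection": ratio(c.tp, c.tp + c.fn),     # recall
        "positive_prediction_value": ratio(c.tp, c.tp + c.fp),    # precision
        "probability_of_false_alarm": ratio(c.fp, c.fp + c.tn),   # FPR
        "probability_of_misdetection": ratio(c.fn, c.tp + c.fn),  # FNR
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from any cited paper.
    for name, value in detection_metrics(ConfusionCounts(90, 5, 95, 10)).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, probability of detection and probability of misdetection always sum to one whenever at least one attack sample is present.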

Artificial Intelligence-Assisted Network Slicing: Network Assurance and Service Provisioning in 6G

Article

Mar 2023

6G networks are expected to provide instant global connectivity and enable the transition from “connected things” to “connected intelligence,” where promising network slicing can play an important role in network assurance and service provisioning for various demanding vertical application scenarios. On the basis of diversified massive data, artificial intelligence (AI)-assisted techniques are widely considered more suitable than traditional models and algorithms to deal with challenges faced by complex and dynamic slicing problems in 6G. In view of this, we provide a tutorial on AI-assisted 6G network slicing for network assurance and service provisioning, aiming to show the prospect of 6G slicing and the advantages of applying AI technology. Specifically, we propose six typical characteristics of 6G network slicing, analyze the feasibility of AI from different network domains and technical aspects, propose a case study on AI-assisted bandwidth scaling, and, finally, put forward the main challenges and open issues for its future development.

Multi-layer Perceptron for Intrusion Detection Using Simulated Annealing

Chapter

Oct 2022

Today, due to the evolution of technology and the large-scale use of the Internet, securing everything is becoming an unavoidable necessity and a challenge for most companies. Since traditional means of security have become insufficient due to the increase in the number and types of computer attacks that appear almost daily, researchers in the field of computer security are developing security tools based on artificial intelligence concepts to detect new attacks. In this work, we propose a binary classification method for intrusion detection that achieves high accuracy, precision, and recall rates. This approach is based on a multi-layer perceptron and uses both the Pearson correlation coefficient and simulated annealing to select attributes from the three datasets used for generating and evaluating the model: NSL-KDD, UNSW-NB15, and CICIDS2017. We obtained 97.02% accuracy for NSL-KDD, 92.32% accuracy for UNSW-NB15, and 97.70% for CICIDS2017.
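As a rough illustration of correlation-based attribute selection, the following Python sketch ranks features by the absolute Pearson correlation with a binary label and keeps those above a cutoff. The abstract does not describe the chapter's actual procedure (including how simulated annealing is combined with it), so the function names, the threshold value, and the toy data here are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear information; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, labels, threshold=0.5):
    """Return feature names whose |r| with the label meets the threshold,
    strongest first. The threshold is an illustrative assumption."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


if __name__ == "__main__":
    # Toy data for illustration only; not drawn from NSL-KDD or any cited dataset.
    labels = [0, 0, 1, 1]
    columns = {
        "duration": [1, 2, 3, 4],
        "noise": [5, 1, 1, 5],
        "constant": [7, 7, 7, 7],
    }
    print(select_features(columns, labels))
```

A filter like this only captures linear, per-feature relevance, which is one reason a search method such as simulated annealing might be layered on top to evaluate feature subsets jointly.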

AdIoTack: Quantifying and Refining Resilience of Decision Tree Ensemble Inference Models against Adversarial Volumetric Attacks on IoT Networks

Article

Jun 2022
COMPUT SECUR

Machine learning-based techniques have shown success in cyber intelligence. However, they are increasingly becoming targets of sophisticated data-driven adversarial attacks resulting in misprediction, eroding their ability to detect threats on network devices. In this paper, we present AdIoTack, a system that highlights vulnerabilities of decision trees against adversarial attacks, helping cybersecurity teams quantify and refine the resilience of their trained models for monitoring and protecting Internet-of-Things (IoT) networks. In order to assess the model for the worst-case scenario, AdIoTack performs white-box adversarial learning to launch successful volumetric attacks that decision tree ensemble network behavioral models cannot flag. Our first contribution is to develop a white-box algorithm that takes a trained decision tree ensemble model and the profile of an intended network-based attack (e.g., TCP/UDP reflection) on a victim class as inputs. It then automatically generates recipes that specify certain packets on top of the intended attack packets (less than 15% overhead) that together can bypass the inference model unnoticed. We ensure that the generated attack instances are feasible for launching on Internet Protocol (IP) networks and effective in their volumetric impact. Our second contribution develops a method to monitor the network behavior of connected devices actively, inject adversarial traffic (when feasible) on behalf of a victim IoT device, and successfully launch the intended attack. Our third contribution prototypes AdIoTack and validates its efficacy on a testbed consisting of a handful of real IoT devices monitored by a trained inference model. We demonstrate how the model detects all non-adversarial volumetric attacks on IoT devices while missing many adversarial ones. The fourth contribution develops systematic methods for applying patches to trained decision tree ensemble models, improving their resilience against adversarial volumetric attacks. We demonstrate how our refined model detects 92% of adversarial volumetric attacks.

The Roadmap of Communication and Networking in 6G for the Metaverse

Article

Jan 2022

The Metaverse can be regarded as a hypothesized iteration of the Internet that enables people to work, play, and interact socially in a persistent online 3-D virtual environment with an immersive experience, by generating an imaginary environment similar to the real world, including realistic sounds, images, and other sensations. The Metaverse has strict requirements for a fully immersive experience, large-scale concurrent users, and seamless connectivity, which pose many unprecedented challenges to the sixth generation (6G) wireless system, such as ubiquitous connectivity, ultra-low latency, ultra-high capacity and reliability, and strict security. In addition, to achieve an immersive and hassle-free experience for mass users, full-coverage sensing, seamless computation, reliable caching, and persistent consensus and security should be carefully integrated into the future 6G system. To this end, this paper aims to depict the roadmap to the Metaverse in terms of communication and networking in 6G, including illustrating the framework of the Metaverse, revealing the strict requirements and challenges for 6G to realize the Metaverse, and discussing the fundamental technologies to be integrated in 6G to drive the implementation of the Metaverse, including intelligent sensing, digital twin (DT), space-air-ground-sea integrated network (SAGSIN), multi-access edge computing (MEC), blockchain, and the involved security issues.

Recommended publications

A New Method of Image Detection for Small Datasets under the Framework of YOLO Network

I2-Diagnosability Framework for Detection of Advanced Stealth Man in The Middle Attack in Wi-Fi Netw...

Model Order Selection and Eigen Similarity based Framework for Detection and Identification of Netwo...

Detection of Melanoma Skin Cancer Using Capsule Network and Multi-Task Learning Framework