Deep Ensemble-based Efficient Framework for Network Attack Detection

Furqan Rustam 1, Ali Raza 2, Imran Ashraf 3, Anca Delia Jurcut 1
1 School of Computer Science, University College Dublin, Ireland; furqan.rustam1@gmail.com; anca.jurcut@ucd.ie
2 Department of Computer Science, KFUEIT, Rahim Yar Khan, Pakistan; ali.raza.scholarly@gmail.com
3 Dept. of Information & CE, Yeungnam University, Gyeongsan, Republic of Korea; imranashraf@ynu.ac.kr
Abstract—Nowadays, networks play a critical role in business,
education, and daily life, allowing people to communicate
via different platforms across long distances. However, such
communication contains many potential dangers and security
vulnerabilities that can compromise the confidentiality, integrity,
and privacy of data. Network attacks, malware, hacking, and
phishing are increasing daily, resulting in colossal losses. Auto-
mated systems based on artificial intelligence can help to detect
such attacks efficiently and protect sensitive information. This
work proposes an ensemble deep voting classifier (EDVC) to
detect network attacks with high accuracy. Long short-term
memory (LSTM), recurrent neural network (RNN), and gated
recurrent unit (GRU) are employed in the proposed approach
using the majority voting criterion. Experimental results using the
NSL-KDD dataset show that EDVC achieves a 0.996 accuracy
score, which is superior to the existing state-of-the-art methods.
Index Terms—Network Attack Detection, Machine Learning,
Ensemble Learning, Deep Learning, Network Intrusion Detec-
tion
I. INTRODUCTION
Networking refers to the interconnection of multiple com-
puting devices, allowing them to exchange data. The data
sharing can be done through various technologies and com-
munication protocols, such as Ethernet, Wi-Fi, or even simple
wired connections [1]. The main goal of networking is to
enable devices to work together and share resources, such as
printers, file servers, and internet connections. In today’s in-
terconnected world, networks play a critical role in business,
education, and daily life, enabling people to communicate and
share information across long distances. Networking provides
several advantages including resource sharing, communica-
tion, centralized data management, improved collaboration,
increased productivity, scalability, and remote access [2].
With substantial networking applications, many potential
dangers and security vulnerabilities can arise thus com-
promising the confidentiality, integrity, and availability of
networked systems and data [3]. The typical network threats
include malware, hacking, phishing, denial-of-service (DoS)
attacks, man-in-the-middle (MitM) attacks, and spoofing.
Some specific dangers of network attacks include data theft,
system disruption, reputation damage, financial losses, espionage,
and infrastructure damage [4]. As network threats increase, so
does the need for automated attack detection systems. Artificial
intelligence (AI)-based solutions can potentially detect such
attacks, thereby enabling timely countermeasures to mitigate the
risk of data theft [5]. Such techniques are utilized to analyze large
amounts of network data and identify potential threats in
real time, allowing organizations to respond quickly and
effectively. Machine learning methods learn the patterns from
data and are used to identify potential attacks. Integrating
such methods into network security can significantly improve
an organization’s ability to detect and respond to attacks
[6], reducing the risk of successful attacks and protecting
valuable information and assets. In this regard, this research
contributes as follows:
• To achieve high accuracy in network attack detection, an
ensemble model is proposed in which three deep learning
models, namely long short-term memory (LSTM),
recurrent neural network (RNN), and gated recurrent unit
(GRU), are combined using a majority voting criterion.
• To thoroughly compare the performance of various
machine learning and deep learning models for network
attack detection, an in-depth analysis is conducted
on the NSL-KDD dataset. The models applied in this
analysis include decision tree (DT), logistic regression
(LR), support vector machine (SVM), Gaussian Naive
Bayes (GNB), random forest (RF), RNN, GRU, LSTM,
and convolutional neural network (CNN).
• All models are optimized with regard to their hyperparameters
to obtain the best possible results. Furthermore,
the proposed approach is evaluated against existing
state-of-the-art methods to analyze its performance.
The organization of the rest of this study is as follows: Sec-
tion II contains the analysis of related literature on network
attack detection, while the proposed methodology is described
in Section III. The evaluation and discussion of experimental
results are given in Section IV. The study is concluded in
Section V.
II. RELATED WORK
Network security is a crucial element for organizations
to safeguard the privacy and confidentiality of their data.
Consequently, a substantial number of research works can
be found on network security. A few more relevant works
are discussed here.
Network intrusion detection based on classical machine
learning techniques is proposed in [7]. Eleven machine
learning methods are utilized on the NSL-KDD dataset.
Results show that tree-based techniques achieve the best
performance for network detection. The proposed XGBoost
model achieves a 97% accuracy for attack detection. Simi-
larly, network intrusion detection using the neural network is
performed in [8] using the NSL-KDD dataset. Experimental
results indicate that the bidirectional LSTM approach with
an attention mechanism achieves high performance. Using
2023 21st Mediterranean Communication and Computer Networking Conference (MedComNet)
979-8-3503-3884-3/23/$31.00 ©2023 IEEE 1
2023 21st Mediterranean Communication and Computer Networking Conference (MedComNet) | 979-8-3503-3884-3/23/$31.00 ©2023 IEEE | DOI: 10.1109/MedComNet58619.2023.10168864
Authorized licensed use limited to: Hongik Univ. Downloaded on July 09,2023 at 04:49:58 UTC from IEEE Xplore. Restrictions apply.
key and local features from the data, BiLSTM achieves an
accuracy score of 84%.
Network attack detection using deep learning techniques is
presented in [9]. The study utilizes three network intrusion
datasets. A deep neural network is proposed for network in-
trusion detection. The proposed method achieves an accuracy
score of 97% using the NSL-KDD dataset. However, the deep
neural network achieved poor performance scores for UNSW-
NB15 and CSE-CIC-IDS2018 datasets. The authors present a
DoS attack detection method in [10]. The NSL-KDD dataset
is utilized with various machine learning and deep learning
methods. BiLSTM is used to detect network attacks as binary
classification. The results show that the BiLSTM technique
achieves an accuracy of 82% for network attack detection.
Besides stand-alone models, hybrid models have also been
utilized for the same purpose. For example, [11] presents
a hybrid intrusion detection model for detecting network
attacks. The particle swarm optimization (PSO) and enhanced
genetic technique with a random forest model are utilized.
Experimental results using the NSL-KDD dataset indicate
an 88% accuracy score. Along the same direction, [12]
uses a hybrid sampling with the deep hierarchical network
for detecting network attacks. The NSL-KDD dataset is
balanced using the synthetic minority over-sampling tech-
nique (SMOTE). The spatial features are extracted using the
convolution neural network while the temporal features are
extracted using the BiLSTM. The proposed approach achieves
an accuracy score of 83%. The detection of network intrusion
using a hybrid of machine learning and deep learning model
is proposed in [13]. The proposed model is based on a hybrid
of k-means, random forest, and neural networks. The adaptive
synthetic sampling technique is applied to the dataset for class
balancing. The NSL-KDD and CIC-IDS2017 datasets are
utilized for conducting the study experiments. The proposed
hybrid model achieves an accuracy of 85% using the NSL-
KDD dataset.
The study [14] uses the simulated annealing-based multi-
layer perceptron (MLP) model to detect network intrusion.
The binary classification is performed using the neural
network-based deep learning model. Three datasets NSL-
KDD, UNSW-NB15, and CICIDS2017 are utilized for exper-
iments. The proposed MLP technique achieves a 97% accu-
racy on the NSL-KDD dataset. Similarly, [15] designed a ma-
chine learning-based network intrusion system for software-
defined networks. The NSL-KDD dataset is utilized and 41
features of the dataset are used for performing the multi-class
classification. The DDoS, PROBE, R2L, and U2R attacks are
detected using the applied techniques. The proposed XGBoost
achieves a 95% accuracy score.
The detection of network attacks using a double-layer
hybrid model is proposed in [16]. The proposed model
combines the Naive Bayes and SVM models to construct
a hybrid model. The principal component analysis (PCA)
is used during the dataset features preprocessing. The NSL-
KDD dataset is utilized to perform experiments. The proposed
machine learning-based hybrid model achieves 89% accuracy
for detecting network intrusion. The study [17] proposes net-
work intrusion detection using deep learning-based methods
for fog-assisted Internet of things (IoT). A network intrusion
detection system (NIDS) is developed using a deep learning
approach for IoT. The proposed NIDS is a device for attack
detection and is implemented in the fog node. The two public
datasets UNSW-NB15 and NSL-KDD are used to evaluate
the performance of the proposed NIDS system. The proposed
system achieves a 95% accuracy for network attack detection
in IoT.
Besides using the optimized models, some studies focus
on feature selection and engineering. For example, the study
[18] proposed a neural network-based technique for
network intrusion detection. The influencing features from
the NSL-KDD are selected using the neural network and
condensed neighbors mechanism. The performance of applied
techniques is validated using the WEKA software. The study
results show that the proposed deep learning-based condensed
nearest neighbors technique achieves an accuracy of 94% for
network attack detection.
Existing research on network attack detection has several
limitations and gaps, which can be summarized as follows:
• Previous studies on network attack detection have relied
primarily on classical machine learning and deep learning
techniques. However, utilizing advanced ensemble
learning techniques may improve the accuracy of network
attack detection.
• The accuracy scores reported for network attack detection
in previous studies range from 80% to 97%,
indicating the need for further improvement to achieve
optimal accuracy.
III. STUDY METHODOLOGY
Figure 1 shows the architecture of the proposed approach.
A dataset of network attack features is utilized for
experiments. Preprocessing consists of encoding the categorical
features and mapping the target attack labels.
Exploratory data analysis is applied to examine the network
attack feature patterns. For experiments, the data is split into
training and testing sets with a ratio of 0.8 to 0.2. The proposed
approach is used to classify each record as normal, DoS,
Remote to Local (R2L), Probe, or User to Root (U2R).
Figure 1. The methodological analysis for network attack detection.
A. Network features data
The publicly available NSL-KDD benchmark dataset of network
attack features is utilized [19]. The dataset covers
the DoS, R2L, Probe, and U2R network attacks.
It contains a total of 148517 records and 43 features
related to network attacks.
B. Data Preprocessing
As machine learning models are designed to operate on nu-
merical data, our dataset, which includes categorical features,
requires preprocessing to clean and convert the data into a nu-
merical representation before being fed to the models. In this
regard, we preprocessed the dataset by removing duplicate
records and encoding the categorical features. Specifically, the
'protocol_type', 'service', and 'flag' features were encoded
using the LabelEncoder class in scikit-learn [20].
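The preprocessing steps described above, together with the 0.8 to 0.2 split mentioned in the methodology overview, can be sketched without any dependencies. This is only an illustration under stated assumptions: the toy records, the helper names `drop_duplicates` and `label_encode`, and the random seed are hypothetical, and the encoder is a stand-in that assigns integers in sorted order of the category names, which is how scikit-learn's LabelEncoder numbers its classes.

```python
import random

def drop_duplicates(rows):
    """Remove exact duplicate records while keeping first-seen order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def label_encode(values):
    """Map each category to an integer according to sorted category order."""
    classes = sorted(set(values))
    index = {category: i for i, category in enumerate(classes)}
    return [index[v] for v in values], classes

# Toy records (duration, protocol_type, service, flag); illustrative only.
records = [
    (0, "tcp", "http", "SF"),
    (0, "udp", "domain_u", "SF"),
    (2, "tcp", "private", "S0"),
    (0, "tcp", "http", "SF"),   # exact duplicate of the first record
    (5, "icmp", "eco_i", "REJ"),
]

records = drop_duplicates(records)

# Encode the three categorical columns in place, column by column.
columns = [list(col) for col in zip(*records)]
for position in (1, 2, 3):
    columns[position], _ = label_encode(columns[position])
encoded = [list(row) for row in zip(*columns)]

# Shuffle with a fixed (assumed) seed, then split 0.8 / 0.2.
rng = random.Random(42)
rng.shuffle(encoded)
cut = int(0.8 * len(encoded))
train, test = encoded[:cut], encoded[cut:]
```

Integer codes produced this way carry no ordinal meaning, so they are only a compact numerical representation of the categories.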
C. Exploratory Data Analysis
Exploratory data analysis is a crucial step in the data
analysis process that helps to identify patterns, trends, and
relationships. Graphs and charts are essential tools used in
exploratory data analysis to visualize the data and uncover
insights. Figure 2 shows a bar chart of the target class
distribution. The analysis
shows that the dataset is highly imbalanced: the 'normal'
class has 77054 samples, the 'DoS' class has 53387 samples,
the 'Probe' class has 14077 samples, the 'R2L' class has
3880 samples, and the 'U2R' class has 119 samples. The
analysis concludes that DoS and Probe attacks occur more
frequently, while R2L and U2R attacks are less frequent.
Figure 2. The target label distribution of the network attack dataset.
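To make the degree of imbalance concrete, the class counts quoted above can be turned into shares of the whole dataset. The short sketch below uses only the five reported counts; the variable names are illustrative.

```python
# Class counts as reported in the exploratory data analysis.
counts = {"normal": 77054, "DoS": 53387, "Probe": 14077, "R2L": 3880, "U2R": 119}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, share in shares.items():
    print(f"{label:>6}: {counts[label]:>6} samples ({share:.2%})")
print(f"total records: {total}")
print(f"largest / smallest class ratio: {imbalance_ratio:.1f}")
```

The counts sum to the 148517 records stated for the dataset, and the U2R class accounts for less than 0.1% of them.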
Figure 3 shows a bar chart of the services involved in
network attacks. A network service
is an application running at the network application layer. The
analysis demonstrates a high frequency of network attacks
using the HTTP service. The private service is the next most
frequently reported, followed by the domain_u,
smtp, ftp_data, others, eco_i, telnet, and ecr_i services. The
dataset analysis concludes that the HTTP service is the most used
during a network attack.
Figure 3. The occurrence analysis of service during a network attack.
The analysis of flags observed during a network attack
is visualized in Figure 4. The flags in a network indicate
a particular connection state. The analysis demonstrates that
SF, S0, and REJ are the flags most frequently observed during
a network attack.
Figure 4. The occurrence analysis of flags during a network attack.
D. Applied Machine and Deep Learning Techniques
The applied machine and deep learning methods are described
in this section. This study compares five machine
learning and four deep learning techniques.
DT machine learning models are popular data science
techniques used for classification and regression tasks
[21]. DT is a supervised learning algorithm that builds a
tree-like structure to model the decision-making process.
The tree consists of nodes representing features or
attributes and branches representing the decision rules.
At each node, the model evaluates the value of the
corresponding feature and determines which branch to
follow. The final leaf nodes represent the predicted
class or value. DT models are easy to interpret and
visualize, making them popular in fields such as finance,
healthcare, and marketing. One of the critical benefits of
DT is its ability to handle missing data and noisy data,
which are often encountered in real-world datasets.
LR is a widely used machine learning model for clas-
sification tasks that involve predicting binary or multi-
class outcomes [22]. LR models the relationship between
a dependent variable and one or more independent
variables. The dependent variable is usually a binary
variable representing the presence or absence of a partic-
ular characteristic, while the independent variables are
continuous or categorical in nature. LR uses a logistic
function to model the probability of the dependent
variable taking a particular value, given the values of
the independent variables. LR has applications in various
fields, such as healthcare, finance, marketing, and social
sciences. It is most suitable when the log-odds of the
outcome are approximately linear in the independent
variables.
SVM is a popular machine learning algorithm that is
widely used in classification, regression, and outlier
detection [23]. SVM is a supervised learning model
that works by finding the optimal decision boundary
or hyperplane that separates data points into different
classes. The key is to maximize the margin between the
decision boundary and the closest data points, known
as support vectors. SVM has several advantages, such
as efficiently handling high-dimensional data, non-linear
decision boundaries, and large datasets. However, SVM
is sensitive to the choice of kernel function and parame-
ters and prone to overfitting when the number of features
is large.
GNB is a popular machine learning model used for
classification problems, especially in the field of natural
language processing [24]. It is a probabilistic algorithm
that relies on the Bayes theorem. GNB is a variant
of the Naive Bayes algorithm that assumes the features
are conditionally independent given the class and that
each continuous feature follows a Gaussian distribution
within a class.
This simplifies the computation of the probability of
the input data belonging to a particular class, making
GNB computationally efficient and effective for large
datasets. Despite its simplicity, GNB often performs
well compared to other complex algorithms, making it
a popular choice in the machine learning community.
RF is a robust machine learning model that is widely
used for network intrusion detection [25], [26]. RF is an
ensemble method that combines multiple decision trees
to make predictions. Each tree is trained on a random
subset of the data and a random subset of the features.
This randomization process helps reduce overfitting and
improves the model’s generalization performance. RF
has been successfully applied to a variety of fields,
including finance, medicine, and ecology. However, RF
can be computationally expensive and may require
careful tuning of hyperparameters to achieve optimal
performance.
RNN is widely used in sequential data such as text,
speech, and time-series data [27]. Unlike traditional
feedforward neural networks, RNN has feedback loops
that allow information to persist across time steps. RNN
can capture temporal dependencies and learn from past
experiences to make better predictions. However, RNN
suffers from the problem of vanishing gradients, which
limits its ability to learn long-term dependencies. RNN
still faces challenges in modeling complex sequences
with variable lengths and in handling noisy or incom-
plete data.
GRU has gained popularity in recent years for its ability
to model sequential data effectively [28]. GRU is a
type of RNN that has gating mechanisms which allow
the model to selectively update and forget information
at each time step, improving the model’s ability to
capture long-term dependencies. GRU has been shown
to outperform other RNNs on tasks such as language
modeling, machine translation, and speech recognition.
Additionally, GRU has a simpler architecture than other
RNNs, making it easier to train and faster to converge.
LSTM is an RNN variant that is widely used in deep
learning [29]. Unlike traditional RNN, LSTM is de-
signed to capture long-term dependencies in sequential
data by using a memory cell that can store information
over a longer period of time. LSTM is particularly
effective in speech recognition, machine translation, and
sentiment analysis tasks. In an LSTM model, the input
data is first processed through a series of gates, which
control the flow of information into and out of the
memory cell. The output of an LSTM is then computed
based on the contents of the memory cell and the input
data. Overall, LSTM is a powerful tool for modeling
sequential data and has demonstrated state-of-the-art
performance in a wide range of applications.
CNN, a type of feed-forward neural network, is
primarily used to solve classification problems [30]. The
CNN model consists of multiple layers of interconnected
nodes, with each layer transforming the input from
the previous layer to produce the final output. The
architecture of a CNN model is based on three types of
layers: convolution, pooling, and fully connected dense
layers. The CNN model has the disadvantage that it can
suffer from overfitting, where the model becomes too
complex and fits the training data too closely, leading to
poor performance on new unseen data. To address this,
techniques such as regularization, dropout, and early
stopping can be employed to prevent overfitting and
improve generalization.
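The vanishing-gradient limitation noted for the RNN above can be illustrated with a one-unit recurrence. This is a minimal sketch, assuming a scalar state, the update h_t = tanh(w x_t + u h_{t-1}), and arbitrarily chosen weights; it is not the network used in this study. Because each step multiplies the gradient by u(1 - h_t^2), whose magnitude is at most |u|, the influence of the initial state shrinks geometrically whenever |u| < 1.

```python
import math

def rnn_gradient_through_time(inputs, w=1.0, u=0.5, h0=0.0):
    """Run h_t = tanh(w*x_t + u*h_{t-1}) and track |d h_t / d h_0| per step."""
    h, grad, history = h0, 1.0, []
    for x in inputs:
        h = math.tanh(w * x + u * h)
        # Chain rule: d h_t / d h_{t-1} = u * (1 - h_t^2)
        grad *= u * (1.0 - h * h)
        history.append(abs(grad))
    return history

# Twenty identical time steps with an arbitrary input value.
grads = rnn_gradient_through_time([0.1] * 20)
print(f"step 1: {grads[0]:.3e}, step 20: {grads[-1]:.3e}")
```

Gated architectures such as GRU and LSTM add paths along which this product does not have to shrink at every step, which is why they cope better with long-term dependencies.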
Hyperparameter optimization is performed for each applied
model. The best-fit hyperparameters are determined using a
recursive training and testing mechanism, which helps the
models achieve high performance. The
final selected hyperparameters of each model are given in
Table I.
Table I
THE HYPERPARAMETERS OF MACHINE LEARNING AND DEEP LEARNING MODELS.
Technique | Hyperparameter values
DT | splitter='best', min_samples_split=2, criterion='entropy', max_depth=5, random_state=500
LR | random_state=10, solver='lbfgs', max_iter=50, multi_class='auto', C=1.0
SVM | random_state=50, max_iter=50, penalty='l2', loss='squared_hinge', tol=1e-4, multi_class='ovr', fit_intercept=True
GNB | var_smoothing=1e-9
RF | n_estimators=20, max_depth=5, random_state=100, criterion='entropy', min_samples_split=2, max_features='sqrt', bootstrap=True
E. Proposed Ensemble Deep Voting Classifier
The EDVC method is proposed to detect network attacks,
and its architecture is given in Figure 5. The proposed EDVC
Table II
ARCHITECTURE OF EACH DEEP LEARNING MODEL AND EACH INDIVIDUAL MODEL USED IN EDVC.
GRU | LSTM | RNN | CNN
Sequential() | Sequential() | Sequential() | Sequential()
GRU(64, input_shape=42) | LSTM(64, input_shape=42) | SimpleRNN(64, input_shape=42) | Conv1D(64, 3, input_shape=42, activation='relu')
Dense(16, activation='relu') | Dense(16, activation='relu') | Dense(16, activation='relu') | MaxPool1D(pool_size=2)
Dropout(0.5) | Dropout(0.5) | Dropout(0.5) | Flatten()
Dense(5, activation='softmax') | Dense(5, activation='softmax') | Dense(5, activation='softmax') | Dense(16, activation='relu')
 |  |  | Dense(5, activation='softmax')
All models: compile & fit (loss='categorical_crossentropy', optimizer='adam', epochs=10)
Figure 5. The architectural analysis of the proposed EDVC model.
technique combines the deep learning-based RNN, GRU, and
LSTM methods. We chose to use these three models based
on their individual performance, as they have demonstrated
significant accuracy. Additionally, these models share a com-
mon trait in that they are all recurrent architectures [31].
EDVC employs a majority voting mechanism to determine
the final prediction. Each model will generate a prediction
for a given sample, and the final prediction is made by
aggregating the individual model predictions through voting.
The class with the highest number of votes is selected as
the final prediction. The models discussed in Table II are
designed with a lightweight architecture, which enables them
to achieve high accuracy while maintaining simplicity. Each
model begins with its own unique layer and is subsequently
followed by a dense layer comprising 16 neurons with a
ReLU activation function, a dropout layer with a 0.5 dropout
rate, and another dense layer with 5 neurons and a softmax
activation function. Finally, each model is compiled using
the categorical crossentropy loss function, optimized with the
Adam optimizer, and trained for 10 epochs. Mathematically,
EDVC can be defined as:
\[ lstm_p = \mathrm{LSTM}(ts), \quad ts \in \text{Dataset} \tag{1} \]
\[ rnn_p = \mathrm{RNN}(ts) \tag{2} \]
and,
\[ gru_p = \mathrm{GRU}(ts) \tag{3} \]
where $lstm_p$, $rnn_p$, and $gru_p$ are the predictions by LSTM,
RNN, and GRU, respectively, for the target class on a test
sample $ts$. The final prediction step can be defined as
\[ fp = \mathrm{mode}\{lstm_p, rnn_p, gru_p\} \tag{4} \]
Here, $fp$ is the final prediction, which is the result of voting
among the model predictions. Algorithm 1 shows the step-
by-step flow of the proposed model.
Algorithm 1 EDVC Algorithm
1: Input: NSL-KDD dataset (D)
2: Output: Attack prediction ∈ {Normal, DoS, Probe, R2L, U2R}
3: T_LSTM ← LSTM_training(TrS)  // Training set (TrS) ∈ D
4: T_GRU ← GRU_training(TrS)
5: T_RNN ← RNN_training(TrS)
6: for i in TeS do  // Testing set (TeS) ∈ D
7:   LSTM_p ← T_LSTM(i)  // LSTM_p is the LSTM prediction against a sample
8:   GRU_p ← T_GRU(i)  // GRU_p is the GRU prediction
9:   RNN_p ← T_RNN(i)  // RNN_p is the RNN prediction
10:   F_Pred ← mode{LSTM_p, GRU_p, RNN_p}  // F_Pred ∈ {Normal, DoS, Probe, R2L, U2R} is the final prediction
IV. RESULTS AND DISCUSSIONS
This section contains the results and discussions of exper-
iments for network attack detection.
A. Experimental Setup
Experiments are carried out using the Google Colab environment.
The platform provides 90 GB of disk space, a GPU with
13 GB of RAM, and an Intel(R) Xeon(R) system. All the models
are implemented using the Python 3.0 programming language.
Evaluation is carried out using accuracy, precision, recall, and
F1 score.
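The four evaluation measures can be computed directly from true and predicted labels. The sketch below assumes support-weighted averaging across classes, as suggested by the "Weighted Avg." rows of the class-wise tables; the function name and the toy labels are illustrative. A class that is never predicted gets a precision of 0 here, which is one common convention.

```python
def weighted_report(y_true, y_pred, labels):
    """Return accuracy, per-class (P, R, F1), and support-weighted averages."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n
    per_class = {}
    weighted = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        predicted = sum(p == c for p in y_pred)   # TP + FP
        support = sum(t == c for t in y_true)     # TP + FN
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = (precision, recall, f1)
        weight = support / n
        weighted["precision"] += weight * precision
        weighted["recall"] += weight * recall
        weighted["f1"] += weight * f1
    return accuracy, per_class, weighted

# Toy labels: the single class-3 sample is misclassified.
y_true = [0, 0, 0, 1, 1, 4, 4, 4, 4, 3]
y_pred = [0, 0, 1, 1, 1, 4, 4, 4, 0, 4]
accuracy, per_class, weighted = weighted_report(y_true, y_pred, [0, 1, 2, 3, 4])
```

With support weighting, the averaged recall is always equal to the accuracy, so a rare class that is never detected barely moves the aggregate scores. This is why class-wise results are worth reporting alongside them.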
B. Results of Machine Learning Models
The comparative performance results of the applied machine
learning techniques are given in Table III. The analysis
demonstrates that DT and RF achieve the best
performance with an accuracy score of 0.96 each, while GNB
performs poorly with an accuracy score of 0.42.
LR and SVM perform moderately, with
accuracy scores of 0.80 and 0.84, respectively.
Table III
THE COMPARATIVE RESULTS ANALYSIS OF APPLIED MACHINE LEARNING METHODS ON UNSEEN TEST DATA.
Model | Accuracy | Precision | Recall | F1
DT | 0.96 | 0.97 | 0.97 | 0.97
LR | 0.80 | 0.74 | 0.80 | 0.77
SVM | 0.84 | 0.88 | 0.84 | 0.86
GNB | 0.42 | 0.62 | 0.42 | 0.33
RF | 0.96 | 0.94 | 0.96 | 0.95
The class-wise performance comparisons of machine learning
models are given in Table IV. The analysis demonstrates
that the GNB technique achieves poor performance scores
for each class. It also indicates that DT, LR, and RF score 0 on
target label 3 (U2R), which has very few training samples. DT
achieves good performance metric scores, followed by
RF. The analysis concludes that only the DT and RF models
achieve good scores in the class-wise comparison. In
class-wise tables and confusion matrix figures, we represent
targets as follows: DoS (0), Probe (1), R2L (2), U2R (3), and
Normal (4).
For further verification of results, cross-validation is performed
and the results are given in Table V. The analysis
demonstrates that GNB and LR achieve poor
cross-validation performance. On the other hand, both DT
and RF achieve a significantly better mean accuracy score of
0.96 each, with the lowest standard deviations.
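A k-fold procedure of this kind can be sketched as follows. The helper names are hypothetical, the folds are contiguous and unshuffled for simplicity (the study does not say whether shuffling or stratification was used), and the fold scores are placeholder values that only show how a mean and a standard deviation are obtained. They are not results from this study.

```python
import statistics

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def summarize(fold_scores):
    """Return the mean and the population standard deviation of fold scores."""
    return statistics.mean(fold_scores), statistics.pstdev(fold_scores)

folds = list(k_fold_indices(25, k=10))

# Placeholder per-fold accuracies, NOT taken from the paper.
placeholder_scores = [0.90, 0.95, 1.00, 0.95, 0.90, 0.95, 1.00, 0.95, 0.90, 1.00]
mean_acc, std_acc = summarize(placeholder_scores)
print(f"{mean_acc:.2f} +/- {std_acc:.4f}")
```

Each sample appears in exactly one test fold, so every record contributes once to the reported mean.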
Besides accuracy, precision, recall, and F1 score, the
number of correct and wrong predictions is an important
evaluation metric for models. The confusion matrix of applied
models is illustrated in Figure 6. Analysis shows that LR
and GNB have the highest number of wrong predictions for
network attack classification. Only the DT and RF techniques
show better performance with a lower number of wrong
predictions. According to confusion matrix values, DT gives
28705 correct predictions and 999 wrong predictions out
of 29704 test predictions while RF gives 28577 correct
predictions and 1127 wrong predictions. GNB performs worst
with 12556 correct predictions and 17148 wrong predictions.
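As a quick consistency check, the accuracy implied by these prediction counts can be recomputed. The sketch below uses only the counts quoted above; the variable names are illustrative.

```python
# (correct, wrong) prediction counts as reported from the confusion matrices.
reported = {"DT": (28705, 999), "RF": (28577, 1127), "GNB": (12556, 17148)}
TEST_SIZE = 29704

accuracy = {}
for model, (correct, wrong) in reported.items():
    # Each pair should account for every test prediction.
    assert correct + wrong == TEST_SIZE
    accuracy[model] = correct / TEST_SIZE

for model, value in accuracy.items():
    print(f"{model}: {value:.4f}")
```

The resulting values are within about 0.01 of the accuracy scores listed in Table III for the same models.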
C. Results of Deep Learning Models
Table VI shows the results of employed deep learning
models. GRU and LSTM show the best performance for
network attack detection and achieve a 0.98 accuracy score
each. GRU and LSTM tend to show better performance due
to their recurrent architecture. Similarly, the simple recurrent
architecture, RNN, also achieves a 0.96 accuracy score. The
analysis shows that the CNN method could not perform well
and achieved a 0.93 accuracy score. CNN requires a large
feature set to achieve significant results. It is found that
deep learning models tend to show better performance than
machine learning models for network attack detection.
Figure 6. The confusion matrix analysis of applied machine learning
methods.
A comparison of training and validation loss and accuracy
of the employed deep learning models is presented in Figure
7. It shows that the RNN model has a high training loss
and a low training accuracy during the first epoch of
training, followed by the GRU and LSTM. After
the first training epoch, the optimizer updates
the models' weights and the loss decreases gradually. For the CNN,
the training loss is very high at the first
epoch, and the accuracy score is low. The CNN model
achieves poor performance scores during the training phase.
The class-wise performance analysis of the employed deep
learning models is given in Table VII. Precision, recall,
and F1 score are evaluated for
each class. The analysis demonstrates that the CNN model
achieves the poorest class-wise performance. GRU and
LSTM models achieve the highest weighted-average scores of 0.99.
As with the machine learning models, all deep learning
models score 0 on class label 3.
The analysis concludes that deep
learning models generally perform better across target classes than
machine learning techniques.
The cross-validation analysis of deep learning models
based on the k-fold is presented in Table VIII. The network
attacks dataset is split into ten folds to test the generalization
of each model. The analysis demonstrates that LSTM models
achieve a maximum k-fold accuracy score of 0.98 with a
minimum standard deviation of 0.0138. It is followed by
GRU and CNN with mean accuracy scores of 0.97 and 0.87,
Table IV
THE CLASS-WISE PERFORMANCE ANALYSIS OF MACHINE LEARNING MODELS.
Target | DT P | DT R | DT F1 | LR P | LR R | LR F1 | SVM P | SVM R | SVM F1 | GNB P | GNB R | GNB F1 | RF P | RF R | RF F1
0 | 0.98 | 0.98 | 0.98 | 0.83 | 0.88 | 0.85 | 0.92 | 0.94 | 0.93 | 0.39 | 0.96 | 0.55 | 1.00 | 0.99 | 0.99
1 | 0.87 | 0.96 | 0.91 | 0.14 | 0.07 | 0.09 | 0.85 | 0.65 | 0.73 | 0.37 | 0.06 | 0.10 | 0.96 | 0.97 | 0.96
2 | 0.72 | 0.77 | 0.74 | 0.00 | 0.00 | 0.00 | 0.12 | 0.35 | 0.17 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.26 | 0.42 | 0.32 | 0.05 | 0.17 | 0.08 | 0.00 | 0.00 | 0.00
4 | 0.99 | 0.97 | 0.98 | 0.83 | 0.93 | 0.88 | 0.90 | 0.83 | 0.87 | 0.85 | 0.14 | 0.24 | 0.94 | 0.99 | 0.97
Weighted Avg. | 0.97 | 0.97 | 0.97 | 0.74 | 0.80 | 0.77 | 0.88 | 0.84 | 0.86 | 0.62 | 0.42 | 0.33 | 0.94 | 0.96 | 0.95
Figure 7. The time series-based performance comparison analysis of applied deep learning techniques during training. Panels: (a) RNN, (b) GRU, (c) LSTM, (d) CNN.
Table V
K-FOLD CROSS-VALIDATION RESULTS.
Model K-fold Acc. Standard deviation
DT 10 0.96 +/- 0.0015
LR 10 0.80 +/- 0.0036
SVM 10 0.85 +/- 0.0833
GNB 10 0.42 +/- 0.0041
RF 10 0.96 +/- 0.0019
Table VI
THE COMPARATIVE RESULTS ANALYSIS OF APPLIED DEEP LEARNING
METHODS ON UNSEEN TEST DATA.
Model Acc. P R F1
RNN 0.96 0.96 0.96 0.96
GRU 0.98 0.99 0.99 0.99
LSTM 0.98 0.99 0.99 0.99
CNN 0.93 0.91 0.93 0.92
Table VII
CLASS-WISE PERFORMANCE ANALYSIS OF DEEP LEARNING MODELS.
Model          RNN               GRU               LSTM              CNN
Target         P     R     F1    P     R     F1    P     R     F1    P     R     F1
0              0.96  0.98  0.97  0.99  1.00  0.99  0.99  1.00  1.00  0.99  0.94  0.96
1              0.93  0.92  0.93  0.97  0.99  0.98  0.98  0.96  0.97  0.93  0.87  0.90
2              0.61  0.67  0.64  0.93  0.82  0.87  0.91  0.91  0.91  0.00  0.00  0.00
3              0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
4              0.99  0.96  0.97  0.99  0.99  0.99  0.99  0.99  0.99  0.90  0.99  0.94
Weighted Avg.  0.96  0.96  0.96  0.99  0.99  0.99  0.99  0.99  0.99  0.91  0.93  0.92
respectively. The RNN model reaches 0.95, leaving CNN with the
weakest cross-validation performance.
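The ten-fold procedure behind Table VIII can be sketched as follows. This is an illustrative plain-Python version under stated assumptions: the threshold classifier and the synthetic one-feature data are hypothetical stand-ins for the deep learning models and the NSL-KDD features, and the population standard deviation is used because the paper does not say which variant it reports.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(X, y, fit, predict, k=10):
    """Train on k-1 folds, test on the held-out fold, return fold accuracies."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        test_idx = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in test_idx]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Hypothetical stand-in classifier: a midpoint threshold on one feature.
def fit_threshold(X_train, y_train):
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2

def predict_threshold(threshold, x):
    return 1 if x > threshold else 0

# Synthetic data, for illustration only.
X = [i / 100 for i in range(200)]
y = [1 if x >= 1.0 else 0 for x in X]

scores = cross_validate(X, y, fit_threshold, predict_threshold, k=10)
mean_acc = statistics.mean(scores)
std_acc = statistics.pstdev(scores)
```

The mean of the fold accuracies corresponds to the "Acc." column and the spread across folds to the "Standard deviation" column; a small spread indicates that the model's accuracy does not depend strongly on which samples were held out.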
The confusion matrices of the deep learning models are given in
Figure 8. The analysis shows that RNN and CNN have the
highest number of wrong predictions for network attack
classification. LSTM shows the best performance with the highest
Table VIII
THE K-FOLD CROSS-VALIDATION-BASED RESULTS ANALYSIS OF APPLIED
DEEP LEARNING METHODS.
Model K-fold Acc. Standard deviation
RNN 10 0.95 +/- 0.0438
GRU 10 0.97 +/- 0.0453
LSTM 10 0.98 +/- 0.0138
CNN 10 0.87 +/- 0.0849
number of correct predictions. According to the confusion
matrix, GRU gives 29275 correct predictions and 432 wrong
predictions out of 29704 test predictions. Similarly, LSTM
gives 29327 correct predictions and 377 wrong predictions.
Among the deep learning models, CNN performs worst with 1828
wrong predictions and 27876 correct predictions.
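Counts of correct and wrong predictions such as those above are read off a confusion matrix: the diagonal holds the correct predictions and every off-diagonal cell is an error. A minimal sketch with hypothetical toy labels (three classes rather than the five used in the paper):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def correct_and_wrong(matrix):
    """Correct predictions sit on the diagonal; the rest are wrong."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct, total - correct

# Hypothetical toy labels, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, 3)
correct, wrong = correct_and_wrong(cm)
accuracy = correct / (correct + wrong)
```

Correct and wrong counts must sum to the size of the test set, which gives a quick consistency check on any reported pair of numbers.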
Figure 8. The confusion matrix analysis of applied deep learning methods.
Figure 9 shows the performance comparison between machine
learning and deep learning models. It indicates that
deep learning models perform better for network
attack detection than machine learning models.
Figure 9. Performance comparison between machine learning and deep
learning models.
D. Results of Proposed Ensemble Model
The performance of the proposed ensemble approach for
detecting network attacks is examined here. Table IX shows
the results of the proposed approach in terms of accuracy,
precision, recall, and F1 score. Results demonstrate that
the proposed model achieves the highest accuracy score of
99.6%, with weighted-average precision, recall, and F1 scores
of 1.00. Per-class metrics show a precision of 1.00 for
classes 0 and 4, 0.99 for class 1, and 0.93 for class 2. The
individual deep learning models and most machine learning
models showed 0 precision, recall, and F1 scores for target
class 3. In contrast, the proposed model obtains 0.89, 0.33,
and 0.48 for precision, recall, and F1 on this class. The
recall and F1 scores for classes 0 and 4 are 1.00, while
class 1 has 0.99 precision, recall, and F1 score. For
comparison with the ensemble of deep learning models, this
study also combines the best-performing machine learning
models under majority voting criteria. For this purpose, DT,
RF, and SVM are combined because of their better individual
performance. However, the results of this ensemble are
inferior to the proposed approach. The ensemble of DT, RF,
and SVM achieves only a 0.967 accuracy score in comparison
to a 0.996 accuracy score from the proposed model.
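Majority voting over three classifiers can be sketched in a few lines. This is an illustration that assumes hard voting on predicted class labels; the prediction lists are hypothetical, and the tie-breaking rule (fall back to the first-listed model among the tied labels) is an assumption, since the paper does not state how a three-way disagreement is resolved.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model class labels by hard majority voting."""
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        counts = Counter(votes)
        best = max(counts.values())
        # Assumed tie-break: the earliest-listed model's label among the
        # labels that share the top vote count.
        winner = next(v for v in votes if counts[v] == best)
        voted.append(winner)
    return voted

# Hypothetical predictions from three models on five samples.
rnn_pred = [0, 1, 2, 4, 3]
gru_pred = [0, 1, 1, 4, 0]
lstm_pred = [0, 4, 2, 4, 3]

ensemble = majority_vote([rnn_pred, gru_pred, lstm_pred])
```

With three voters, a label wins as soon as two models agree, so a single model's mistake on a sample is outvoted whenever the other two are correct.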
The classification performance of the proposed technique
based on the confusion matrix is illustrated in Figure 10.
The confusion matrix shows that the minimum number of
wrong predictions is achieved by the proposed ensemble
technique for detecting the network attack. It makes 29598
correct predictions out of 29704 total predictions and only
106 predictions are wrong.
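The accuracy reported for the proposed model follows directly from these confusion matrix counts:

```latex
\[
\text{Accuracy}
  = \frac{\text{correct predictions}}{\text{total predictions}}
  = \frac{29598}{29704}
  \approx 0.9964,
\qquad
\text{Error rate} = \frac{106}{29704} \approx 0.0036 .
\]
```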
E. Comparative Analysis With State-of-the-art Studies
The comparison of the proposed approach with
state-of-the-art studies is presented in Table X. Previously
published research articles on network attack detection from
2020 to 2023 are taken for comparison. The analysis
Table IX
RESULTS OF THE PROPOSED EDVC MODEL FOR EACH TARGET CLASS.
Model       Acc.   Target         P     R     F1
EDVC        0.996  0              1.00  1.00  1.00
                   1              0.99  0.99  0.99
                   2              0.93  0.99  0.96
                   3              0.89  0.33  0.48
                   4              1.00  1.00  1.00
                   Weighted Avg.  1.00  1.00  1.00
DT+RF+SVM   0.977  0              0.99  0.99  0.99
                   1              0.93  0.96  0.95
                   2              0.73  0.28  0.40
                   3              0.00  0.00  0.00
                   4              0.96  0.99  0.97
                   Weighted Avg.  0.96  0.97  0.96
Figure 10. The confusion matrix analysis of our proposed EDVC method
for network attack detection.
shows that most researchers used only classical machine
learning and deep learning methods for detecting network
attacks. Some studies used ensemble models, but they
consider only machine learning models. The comparative
analysis demonstrates that the proposed ensemble model
outperforms existing models with high accuracy.
Table X
PERFORMANCE COMPARISON WITH STATE-OF-THE-ART STUDIES.
Ref.  Year  Approach          Technique                    Acc.
[7]   2020  Machine Learning  XGBoost                      0.97
[8]   2020  Deep Learning     BLSTM                        0.84
[9]   2020  Deep Learning     DNN                          0.97
[12]  2020  Deep Learning     DHN                          0.83
[15]  2021  Machine Learning  XGBoost                      0.95
[16]  2021  Machine Learning  Hybrid NB+SVM                0.89
[17]  2021  Deep Learning     NIDS                         0.95
[18]  2021  Deep Learning     Condensed Nearest Neighbors  0.94
[13]  2021  Machine Learning  Hybrid K-mean+RF             0.85
[10]  2022  Deep Learning     BiLSTM                       0.82
[11]  2022  Machine Learning  Improved RF                  0.88
[14]  2023  Deep Learning     MLP                          0.97
Our   2023  Deep Learning     EDVC                         0.996
F. Discussion
The performance of the proposed EDVC model is significantly
better than existing state-of-the-art approaches because
of its ensemble architecture. The combination of three models
using voting criteria helps to improve accuracy. Another
reason the models perform well is the structure of the NSL-KDD
data, as illustrated in Figure 11. The t-distributed
stochastic neighbor embedding (t-SNE) technique is used to
show the feature space in low dimensionality [32]. The
t-SNE output suggests that the features are strongly associated
with the target classes and that the classes are largely
separable, which supports accurate learning because target
samples show little overlap. The class distribution in terms of
the number of samples is highly imbalanced, which biases the
models towards the majority class, but the proposed model
shows better performance because of majority voting and
achieves a 99.6% accuracy score.
Figure 11. Class distribution for the NSL-KDD dataset.
Table XI displays the computational cost associated with
the various learning models used in this study for detecting
network attacks. The machine learning models utilized in
this study exhibit a lower computational cost compared to
the deep learning models. However, our proposed approach,
EDVC, employs a combination of three deep learning models,
resulting in a higher computational cost than the simple state-
of-the-art models. In the context of network security, accuracy
is a crucial factor since even a 1% vulnerability in the security
system can result in significant costs. Therefore, in this study,
we prioritize accuracy over computational cost, although this
approach has its limitations, which can be addressed in future
studies.
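Training times like those in Table XI can be collected with a small timing wrapper. The sketch below is illustrative only: the paper does not describe how its times were measured, and `train_stub` and the member names are hypothetical placeholders for real training routines.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical stand-in for a model's training routine.
def train_stub(n_steps):
    total = 0
    for step in range(n_steps):
        total += step * step
    return total

# Time each hypothetical ensemble member separately, then sum the costs:
# an ensemble trained sequentially pays at least for every member.
member_times = {}
for name, steps in [("model_a", 10_000), ("model_b", 20_000), ("model_c", 30_000)]:
    _, member_times[name] = timed(train_stub, steps)

ensemble_time = sum(member_times.values())
```

Summing the member times gives a lower bound on the cost of a sequentially trained ensemble, which is the trade-off against accuracy discussed above.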
V. CONCLUSIONS, LIMITATIONS AND FUTURE WORK
Network attack detection using a deep learning-based
ensemble model is proposed. The proposed EDVC technique
combines the deep learning RNN, GRU, and LSTM models
under majority voting criteria. Experiments are performed
using the NSL-KDD dataset employing both machine learning
and deep learning models for performance comparison.
Experimental results indicate that the proposed model achieves
superior results and obtains an accuracy of 99.6% for network
Table XI
COMPUTATIONAL COST OF LEARNING MODELS IN TERMS OF TIME
(SECONDS).
Model       Time
DT          0.35
LR          11.11
SVM         4.13
DT+RF+SVM   6.91
GNB         0.09
RF          1.83
RNN         862.90
GRU         161.52
LSTM        263.90
CNN         106.87
EDVC        1462.79
attack detection. The performance of the proposed model is
further verified using performance comparison with existing
state-of-the-art approaches which shows that it outperforms
existing models.
Limitations: Despite the promising results of the proposed
model, it has several limitations. First, the proposed EDVC
model is an ensemble of deep learning models, so its
computational cost is high compared to the ensemble of
machine learning models, even though it is significantly
better in terms of accuracy. Second, the dataset used is
highly imbalanced, as the target U2R class contains only 119
samples. This may lead to improper training of the models and
thus poor performance in detecting this attack. Third, the
proposed approach has only been tested on the NSL-KDD
dataset, and its performance may vary when applied to other
datasets.
Future Work: In future work, we intend to reduce the
computational cost by focusing on the architecture of the
individual deep learning models. The proposed model
will be trained on multiple types of attacks so
that it can effectively address new threats and attacks that
continually emerge. Furthermore, we want to apply advanced
data-balancing techniques to enhance the performance of
attack detection models.
ACRONYMS
Description                                  Acronym
Ensemble Deep Voting Classifier              EDVC
Long Short-Term Memory                       LSTM
Recurrent Neural Network                     RNN
Gated Recurrent Unit                         GRU
Decision Tree                                DT
Logistic Regression                          LR
Support Vector Machine                       SVM
Gaussian Naive Bayes                         GNB
Random Forest                                RF
Convolutional Neural Network                 CNN
Accuracy                                     Acc.
Recall                                       R
Man-in-the-Middle                            MitM
Artificial Intelligence                      AI
Particle Swarm Optimization                  PSO
Synthetic Minority Over-sampling Technique   SMOTE
Multilayered Perceptron                      MLP
Remote to Local                              R2L
User to Root                                 U2R
Internet of Things                           IoT
Network Intrusion Detection System           NIDS
Hypertext Transfer Protocol                  HTTP
Precision                                    P
F1 Score                                     F1
REFERENCES
[1] F. Tang, X. Chen, M. Zhao, and N. Kato, “The roadmap of
communication and networking in 6g for the metaverse,” IEEE Wireless
Communications, 2022.
[2] H. Guo, X. Zhou, J. Liu, and Y. Zhang, “Vehicular intelligence in 6g:
Networking, communications, and computing,” Vehicular Communica-
tions, vol. 33, p. 100399, 2022.
[3] P. L. Indrasiri, E. Lee, V. Rupapara, F. Rustam, and I. Ashraf,
“Malicious traffic detection in iot and local networks using stacked ensemble
classifier,” Computers, Materials and Continua, vol. 71, no. 1,
pp. 489–515, 2022.
[4] Y. Maleh, Y. Qasmaoui, K. El Gholami, Y. Sadqi, and S. Mounir,
“A comprehensive survey on sdn security: threats, mitigations, and
future directions,” Journal of Reliable Intelligent Environments, pp.
1–39, 2022.
[5] J. Wang, J. Liu, J. Li, and N. Kato, “Artificial intelligence-assisted
network slicing: Network assurance and service provisioning in 6g,”
IEEE Vehicular Technology Magazine, 2023.
[6] M. A. Talukder, K. F. Hasan, M. M. Islam, M. A. Uddin, A. Akhter,
M. A. Yousuf, F. Alharbi, and M. A. Moni, “A dependable hybrid
machine learning model for network intrusion detection,” Journal of
Information Security and Applications, vol. 72, p. 103405, 2023.
[7] J. Liu, B. Kantarci, and C. Adams, “Machine learning-driven intrusion
detection for contiki-ng-based iot networks exposed to nsl-kdd dataset,”
in Proceedings of the 2nd ACM workshop on wireless security and
machine learning, 2020, pp. 25–30.
[8] T. Su, H. Sun, J. Zhu, S. Wang, and Y. Li, “Bat: Deep learning methods
on network intrusion detection using nsl-kdd dataset,” IEEE Access,
vol. 8, pp. 29 575–29 585, 2020.
[9] G. C. Amaizu, C. I. Nwakanma, J.-M. Lee, and D.-S. Kim, “Inves-
tigating network intrusion detection datasets using machine learning,”
in 2020 International Conference on Information and Communication
Technology Convergence (ICTC). IEEE, 2020, pp. 1325–1328.
[10] M. Esmaeili, S. H. Goki, B. H. K. Masjidi, M. Sameh, H. Gharagozlou,
and A. S. Mohammed, “Ml-ddosnet: Iot intrusion detection based on
denial-of-service attacks using machine learning methods and nsl-kdd,”
Wireless Communications and Mobile Computing, vol. 2022, 2022.
[11] A. K. Balyan, S. Ahuja, U. K. Lilhore, S. K. Sharma, P. Manoharan,
A. D. Algarni, H. Elmannai, and K. Raahemifar, “A hybrid intrusion
detection model using ega-pso and improved random forest method,”
Sensors, vol. 22, no. 16, p. 5986, 2022.
[12] K. Jiang, W. Wang, A. Wang, and H. Wu, “Network intrusion detection
combined hybrid sampling with deep hierarchical network,” IEEE
Access, vol. 8, pp. 32 464–32 476, 2020.
[13] C. Liu, Z. Gu, and J. Wang, “A hybrid intrusion detection system based
on scalable k-means+ random forest and deep learning,” IEEE Access,
vol. 9, pp. 75 729–75 740, 2021.
[14] S. Cherfi, A. Boulaiche, and A. Lemouari, “Multi-layer perceptron for
intrusion detection using simulated annealing,” in Modelling and Im-
plementation of Complex Systems: Proceedings of the 7th International
Symposium, MISC 2022, Mostaganem, Algeria, October 30-31, 2022.
Springer, 2022, pp. 31–45.
[15] A. O. Alzahrani and M. J. Alenazi, “Designing a network intrusion
detection system based on machine learning for software defined
networks,” Future Internet, vol. 13, no. 5, p. 111, 2021.
[16] T. Wisanwanichthan and M. Thammawichai, “A double-layered hybrid
approach for network intrusion detection system using combined naive
bayes and svm,” IEEE Access, vol. 9, pp. 138 432–138 450, 2021.
[17] N. Sahar, R. Mishra, and S. Kalam, “Deep learning approach-based
network intrusion detection system for fog-assisted iot,” in Proceedings
of international conference on big data, machine learning and their
applications: ICBMA 2019. Springer, 2021, pp. 39–50.
[18] F. Z. Belgrana, N. Benamrane, M. A. Hamaida, A. M. Chaabani, and
A. Taleb-Ahmed, “Network intrusion detection system using neural
network and condensed nearest neighbors with selection of nsl-kdd in-
fluencing features,” in 2020 IEEE International Conference on Internet
of Things and Intelligence System (IoTaIS). IEEE, 2021, pp. 23–29.
[19] M HASSAN ZAIB, “NSL-KDD Kaggle.” [Online]. Available:
https://www.kaggle.com/datasets/hassan06/nslkdd
[20] E. Bisong and E. Bisong, “Introduction to scikit-learn,” Building Ma-
chine Learning and Deep Learning Models on Google Cloud Platform:
A Comprehensive Guide for Beginners, pp. 215–229, 2019.
[21] A. Pashamokhtari, G. Batista, and H. H. Gharakheili, “Adiotack:
Quantifying and refining resilience of decision tree ensemble
inference models against adversarial volumetric attacks on iot networks,”
Computers & Security, vol. 120, p. 102801, 2022.
[22] S. Tufail, S. Batool, and A. I. Sarwat, “A comparative study of binary
class logistic regression and shallow neural network for ddos attack
prediction,” in SoutheastCon 2022. IEEE, 2022, pp. 310–315.
[23] A. Raza, H. U. R. Siddiqui, K. Munir, M. Almutairi, F. Rustam, and
I. Ashraf, “Ensemble learning-based feature engineering to analyze
maternal health during pregnancy and health risk prediction,” Plos one,
vol. 17, no. 11, p. e0276525, 2022.
[24] S. Ismail and H. Reza, “Evaluation of naïve bayesian algorithms for
cyber-attacks detection in wireless sensor networks,” in 2022 IEEE
World AI IoT Congress (AIIoT). IEEE, 2022, pp. 283–289.
[25] T. Wu, H. Fan, H. Zhu, C. You, H. Zhou, and X. Huang, “Intrusion
detection system combined enhanced random forest with smote algo-
rithm,” EURASIP Journal on Advances in Signal Processing, vol. 2022,
no. 1, pp. 1–20, 2022.
[26] F. Rustam, M. F. Mushtaq, A. Hamza, M. S. Farooq, A. D. Jurcut,
and I. Ashraf, “Denial of service attack classification using machine
learning with multi-features,” Electronics, vol. 11, no. 22, p. 3817,
2022.
[27] S. Kaur and M. Singh, “Hybrid intrusion detection and signature
generation using deep recurrent neural networks,” Neural Computing
and Applications, vol. 32, pp. 7859–7877, 2020.
[28] S. M. Kasongo and Y. Sun, “A deep gated recurrent unit based model
for wireless intrusion detection system,” ICT Express, vol. 7, no. 1, pp.
81–87, 2021.
[29] F. Laghrissi, S. Douzi, K. Douzi, and B. Hssina, “Intrusion detection
systems using long short-term memory (lstm),” Journal of Big Data,
vol. 8, no. 1, p. 65, 2021.
[30] Y. Lin, H. Zhao, X. Ma, Y. Tu, and M. Wang, “Adversarial attacks
in modulation recognition with convolutional neural networks,” IEEE
Transactions on Reliability, vol. 70, no. 1, pp. 389–401, 2020.
[31] S. Zargar, “Introduction to sequence learning models: Rnn, lstm, gru,”
no. April, 2021.
[32] L. van der Maaten and G. Hinton, “Visualizing data using t-sne,”
Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605, 2008.
[Online]. Available: http://jmlr.org/papers/v9/vandermaaten08a.html
2023 21st Mediterranean Communication and Computer Networking Conference (MedComNet)
10
Authorized licensed use limited to: Hongik Univ. Downloaded on July 09,2023 at 04:49:58 UTC from IEEE Xplore. Restrictions apply.
... XSS attacks pose a significant threat to web application security, affecting both users and website owners [1]. These attacks exploit vulnerabilities in website code, allowing malicious scripts to be injected and executed within web pages without the user's knowledge [2,3]. The injected scripts contain harmful code, typically JavaScript, which can compromise sensitive user information and manipulate website malicious code onto legitimate websites, attackers deceive browsers into executing their malware when users access the site as shown in Fig. 1. ...
Article
Full-text available
Cross-Site Scripting (XSS) attacks continue to pose a significant threat to web applications, compromising the security and integrity of user data. XSS is a web application vulnerability where malicious scripts are injected into websites, allowing attackers to execute arbitrary code in the victim’s browser. The consequences of XSS attacks can be severe, ranging from financial losses to compromising sensitive user information. XSS attacks enable attackers to deface websites, distribute malware, or launch phishing campaigns, compromising the trust and reputation of affected organizations. This study proposes an efficient artificial intelligence approach for the early detection of XSS attacks, utilizing machine learning and deep learning approaches, including Long Short-Term Memory (LSTM). Additionally, advanced feature engineering techniques, such as the Term Frequency-Inverse Document Frequency (TFIDF), are applied and compared to evaluate results. We introduce a novel approach named LSTM-TFIDF (LSTF) for feature extraction, which combines temporal and TFIDF features from the cross-site scripting dataset, resulting in a new feature set. Extensive research experiments demonstrate that the random forest method achieved a high performance of 0.99, outperforming state-of-the-art approaches using the proposed features. A k-fold cross-validation mechanism is utilized to validate the performance of applied methods, and hyperparameter tuning further enhances the performance of XSS attack detection. We have applied Explainable Artificial Intelligence (XAI) to understand the interpretability and transparency of the proposed model in detecting XSS attacks. This study makes a valuable contribution to the growing body of knowledge on XSS attacks and provides an efficient model for developers and security practitioners to enhance the security of web applications.
... This is achieved by injecting malicious SQL code into user-input fields, which is then executed by the application's database [2]. The consequences of a successful SQL injection attack can be severe, ranging from unauthorized access to sensitive information, alteration or deletion of critical data to even complete system compromise [3], [4]. These attacks can have far-reaching implications for individuals and organizations, potentially leading to financial losses, reputational damage, and legal consequences [5]. ...
Article
Full-text available
SQL injection attacks represent a critical threat to database-driven applications and systems, exploiting vulnerabilities in input fields to inject malicious SQL code into database queries. This unauthorized access enables attackers to manipulate, retrieve, or even delete sensitive data. The unauthorized access through SQL injection attacks underscores the critical importance of robust Artificial Intelligence (AI) based security measures to safeguard against SQL injection attacks. This study’s primary objective is the automated and timely detection of SQL injection attacks through AI without human intervention. Utilizing a preprocessed database of 46,392 SQL queries, we introduce a novel approach, the Autoencoder network (AE-Net), for automatic feature engineering. The proposed AE-Net extracts high-level deep features from SQL textual data, subsequently input into machine learning models for performance evaluations. Extensive experimental evaluation reveals that the extreme gradient boosting classifier outperforms existing studies with an impressive k-fold accuracy score of 0.99 for SQL injection detection. Each applied learning approach’s performance is further enhanced through hyperparameter tuning and validated via k-fold crossvalidation. Additionally, statistical t-test analysis is applied to assess performance variations. Our innovative research has the potential to revolutionize the timely detection of SQL injection attacks, benefiting security specialists and organizati
... Qualified pathologists assessed all whole-slide images (WSIs) for slide-level inclusion and quality assurance. TTH, MLAB, and TCGA datasets were used to construct and test deep learning models [29], [30]. Slides with a pathologist majority examined the component models to achieve performance assessment reference dependability. ...
Article
Full-text available
Breast cancer constitutes a significant global health concern that impacts millions of women across the world. The diagnosis of breast cancer involves categorizing grades based on the histopathological characteristics of tumor cells. While histopathological assessment remains the established benchmark for breast cancer diagnosis, it is hampered by time-consuming procedures, subjectivity, and susceptibility to human errors. This study introduces a novel approach called ImageNet-VGG16 (IVNet) for the real-time diagnosis of breast cancer within a hospital environment. The research experiments are conducted using a benchmark dataset known as Jimma University Medical Center (JUMC) breast cancer grading. Advanced image processing techniques are applied to preprocess the data, enhancing performance. This preprocessing involves the utilization of Holistically Nested Edge Detection (HED) and Contrast Limited Adaptive Histogram Equalization (CLAHE) for transformation and stain normalization. We employ advanced neural network-based transfer learning techniques to analyze the preprocessed histopathological images and identify affected cells. Various pre-trained models are utilized, including convolutional neural networks (CNN) such as VGG16, ResNet50, InceptionNetv3, ImageNet, MobileNetv3, and EfficientNetV3, in a comparative framework. The principal objective of this research is the accurate classification of breast cancer images into Grade-1, Grade-2 and Grade-3. Through extensive experimental research, we achieved a commendable 97% correct classification rate by utilizing a hybrid of VGG16 and ImageNet as the proposed feature engineering method, IVNet. We also validate our proposed approach performance using other state-of-the-art study data and statistical t-test analysis. Furthermore, we develop a user-friendly Graphical User Interface (GUI) that facilitates real-time cell tracking in histopathological images. 
Our real-time diagnosing application offers valuable insights for treatment planning and assists medical professionals in making prognoses. Moreover, our approach can serve as a reliable decision support system for pathologists and clinicians, particularly in settings constrained by limited resources and restricted access to expertise and equipment.
... Compared with RNN, LSTM can handle long sequences better and avoid the problem of gradient explosion. Furqan Rustam et al. designed an integrated deep voting classifier, including LSTM, RNN, and GRU [13]. The final prediction of the model is made by voting to aggregate individual model predictions. ...
Article
Full-text available
The current mainstream intrusion detection models often have a high false negative rate, significantly affecting intrusion detection systems’ (IDSs) practicability. To address this issue, we propose an intrusion detection model based on a multi-scale one-dimensional convolutional neural network module (MS1DCNN), an efficient channel attention module (ECA), and two bidirectional long short-term memory modules (BiLSTMs). The proposed hybrid MS1DCNN-ECA-BiLSTM model uses the MS1DCNN module to extract features with a different granularity from the input data and uses the ECA module to enhance the weight of important features. Finally, the model carries out sequence learning through two BiLSTM layers. We use the dung beetle optimizer (DBO) to optimize the hyperparameters in the model to obtain better classification results. Additionally, we use the synthetic minority oversampling technique (SMOTE) to fill several samples to reduce the local false negative rate. In this paper, we train and test the model using accurate network data from a water storage industrial control system. In the multi-classification experiment, the model’s accuracy was 97.04%, the precision was 97.17%, and the false negative rate was 2.95%; in the binary classification experiment, the accuracy and false negative rate were 99.30% and 0.7%. Compared with other mainstream methods, our model has a higher score. This study provides a new algorithm for the intrusion detection of industrial control systems.
... N ETWORK traffic analysis and IDS play a crucial role in detecting network attacks and enhancing overall cybersecurity [1]. Network traffic analysis involves monitoring and inspecting network traffic patterns, data packets, and communication protocols to gain useful insights into the behavior of network users and identify any malicious activities [2]. ...
Article
Full-text available
Network attacks refer to malicious activities exploiting computer network vulnerabilities to compromise security, disrupt operations, or gain unauthorized access to sensitive information. Common network attacks include phishing, malware distribution, and brute-force attacks on network devices and user credentials. Such attacks can lead to financial losses due to downtime, recovery costs, and potential legal liabilities. To counter such threats, organizations use Intrusion Detection Systems (IDS) that leverage sophisticated algorithms and machine learning techniques to detect network attacks with enhanced accuracy and efficiency. Our proposed research aims to detect network attacks effectively and timely to prevent harmful losses. We used a benchmark dataset named CICIDS2017 to build advanced artificial intelligence-based machine learning methods. We propose a novel approach called Class Probability Random Forest (CPRF) for network attack detection performance enhancement. We created a novel feature set using the proposed CPRF approach. The CPRF approach predicts the class probabilities from the network attack dataset, which are then used as features for building applied machine learning methods. The comprehensive research results demonstrated that the random forest approach outperformed the state-of-the-art approach with a high-performance accuracy of 99.9%. The performance of each applied technique is validated using a k-fold approach and optimized with hyperparameter tuning. Our novel proposed research has revolutionized network attack detection, effectively preventing unauthorized access, service disruptions, sensitive information theft, and data integrity compromise.
... The storage problem for industrial IoT was solved by offering a hierarchical blockchain storage structure (ChainSplitter), in which most of the blockchain is hosted on cloud services. Lastly, lightweight algorithms are required for IoT blockchain to overcome power and processing time constraints [29,30]. ...
Article
Full-text available
The Internet of Things (IoT) refers to the network of interconnected devices that can communicate and share data over the Internet. The widespread adoption of smart devices within Internet of Things (IoT) networks poses considerable security challenges for their communication. To address these issues, blockchain technology, known for its decentralized and distributed nature, offers potential solutions within consensus-based authentication in IoT networks. This paper presents a novel approach called the local and global layer blockchain model, which aims to enhance security while simplifying implementation. The model leverages the concept of clustering to establish a local-global architecture, with cluster heads assuming responsibility for local authentication and authorization. Implementing a local private blockchain facilitates seamless communication between cluster heads and relevant base stations. This blockchain implementation enhances credibility assurance, strengthens security, and provides an effective network authentication mechanism. Simulation results indicate that the proposed algorithm outperforms previously reported methods. The proposed model achieved an average coverage per node of 0.9, which is superior to baseline models. Additionally, the lightweight blockchain model proposed in this paper demonstrates superior capabilities in achieving balanced network latency and throughput compared to traditional global blockchain approaches.
... Without a proactive monitoring approach, an intruder can steal confidential data or perform various types of attacks that can be very harmful. Even human safety can be compromised by ICS network attacks (Al-Abassi et al., 2020;Alladi et al., 2020;Rustam et al., 2023). Different cyber security mechanisms have been developed for the ICS network. ...
Article
Cybersecurity incident response is a very crucial part of the cybersecurity management system. Adversaries emerge and evolve with new cybersecurity tactics, techniques, and procedures (TTPs). It is essential to detect the TTPs in a timely manner to respond effectively and mitigate the vulnerabilities to secure business operations. This research focuses on TTP identification and detection based on a machine learning approach. Early identification and detection are paramount in protecting, responding to, and recovering from such adversarial attacks. Analyzing use cases is a critical tool to ensure proper and in-depth evaluation of sector-specific cybersecurity challenges. In this regard, this study investigates existing known methodologies for cyber-attacks such as Mitre attacks, and developed a method for identifying threat cases. In addition, Windows-based threat cases are implemented, comprehensive datasets are generated, and supervised machine learning models are applied to detect threats effectively and efficiently. Random forest outperforms other models with the highest accuracy of 99%. Future work can be done for generating threat cases based on multiple log sources, including network security and endpoint protection device, and achieve high accuracy by removing false positives using machine learning. Similarly, real-time threat detection is also envisioned for future work.
Article
Modern networks are crucial for seamless connectivity but face various threats, including disruptive network attacks, which can result in significant financial and reputational risks. To counter these challenges, AI-based techniques are being explored for network protection, requiring high-quality datasets for training. In this study, we present a novel methodology utilizing an Ubuntu Base Server to simulate a virtual network environment for real-time collection of network attack datasets. By employing Kali Linux as the attacker machine and Wireshark for data capture, we compile the Server-based Network Attack (SNA) dataset, showcasing UDP, SYN, and HTTP flood network attacks. Our primary goal is to provide a publicly accessible, server-focused dataset tailored for network attack research. Additionally, we leverage advanced AI methods for real-time detection of network attacks. Our proposed meta-RF-GNB (MRG) model combines Gaussian Naive Bayes and Random Forest techniques for predictions, achieving an impressive accuracy score of 99.99%. We validate the efficiency of MRG using cross-validation, obtaining a notable mean accuracy of 99.94% with a minimal standard deviation of 0.00002. Furthermore, we conducted a statistical t-test to evaluate the significance of MRG compared to other top-performing models.
Article
The usefulness of ensemble-based time series analysis in wireless sensor networks (WSNs) is examined in this paper. An ensemble approach combines multiple strategies to enhance overall predictive performance. This research assesses various strategies using distinct metrics, such as robustness and accuracy. It contrasts the effectiveness of traditional time series methods with ensemble-based models. An experimental approach focusing mostly on different wireless sensor network scenarios is employed to evaluate the overall effectiveness of the suggested methods. Additionally, this study looks into how changes to network features like energy supply, communication range, and node density affect how effective the suggested methods are. The study's findings support the potential to create effective wireless sensor networks with improved overall predictive performance. This study primarily looks at feature extraction and seasonality reduction for time series data in WSNs. In this analysis, seasonality is reduced using an ensemble median filter, and feature extraction is accomplished by principal component analysis. To assess the performance of the suggested ensemble technique on both simulated and real-world WSN data, multiple experiments are carried out. The findings suggest that the ensemble approach can improve the quality of time series data within WSNs and reduce seasonality. Furthermore, when compared to single-sensor strategies, the ensemble technique further improves the accuracy of the feature extraction system. This work demonstrates the applicability of the ensemble approach for the investigation of time series data in WSNs.
Article
The exploitation of internet networks through denial-of-service (DoS) attacks has experienced a continuous surge over the past few years. Despite the development of advanced intrusion detection and protection systems, network security remains a challenging problem and necessitates the development of efficient and effective defense mechanisms to detect these threats. This research proposes a machine learning-based framework to detect distributed DoS (DDoS)/DoS attacks. For this purpose, a large dataset containing the network traffic of the application layer is utilized. A novel multi-feature approach is proposed where the principal component analysis (PCA) features and singular value decomposition (SVD) features are combined to obtain higher performance. The multi-feature approach is validated by extensive experiments using several machine learning models. The performance of the machine learning models is evaluated for each class of attack, and results are discussed in terms of accuracy, recall, and F1 score, among other metrics, in the context of recent state-of-the-art approaches. Experimental results confirm that using multi-feature increases the performance and that RF obtains a 100% accuracy.
Article
Maternal health is an important aspect of women’s health during pregnancy, childbirth, and the postpartum period. Specifically, during pregnancy, different health factors like age, blood disorders, heart rate, etc. can lead to pregnancy complications. Detecting such health factors can alleviate the risk of pregnancy-related complications. This study aims to develop an artificial neural network-based system for predicting maternal health risks using health data records. A novel deep neural network architecture, DT-BiLTCN is proposed that uses decision trees, a bidirectional long short-term memory network, and a temporal convolutional network. Experiments involve using a dataset of 1218 samples collected from maternal health care, hospitals, and community clinics using the IoT-based risk monitoring system. Class imbalance is resolved using the synthetic minority oversampling technique. DT-BiLTCN provides a feature set to obtain high accuracy results which in this case are provided by the support vector machine with a 98% accuracy. Maternal health exploratory data analysis reveals that the health conditions which are the strongest indications of health risk during pregnancy are diastolic and systolic blood pressure, heart rate, and age of pregnant women. Using the proposed model, timely prediction of health risks associated with pregnant women can be made thus mitigating the risk of health complications which helps to save lives.
Article
The Internet of Things (IoT) is a complicated security feature in which datagrams are protected by integrity, confidentiality, and authentication services. The network is protected from external interruptions and intrusions. Because IoT devices run with a range of heterogeneous technologies and process data over time, standard solutions may not be practical. It is necessary to develop intelligent procedures that can be used for multiple levels of data flow in the system. This study examines metainnovations using deep learning-based IDS. Per the findings of the earlier tests, BiLSTMs are better for binary (regular/attacker) classification; however, sequential models (LSTM or BiLSTM) are better for detecting some brutal attacks in multiclass classifiers. According to experts, deep learning-based intrusion detection systems can now recognize and select the best structure for each category. However, specific difficulties will need to be solved in the future. Two topics should be studied further in future attempts. One of the researchers’ concerns is the impact of various data processing techniques, such as artificial intelligence or metamethods, on IDS. The BiLSTM approach has chosen the safest instances with the highest accuracy among the models. According to the findings, the most reliable and suitable solution for evaluating DDoS attacks in IoT is the BiLSTM design.
Article
Due to the rapid growth in IT technology, digital data have increased availability, creating novel security threats that need immediate attention. An intrusion detection system (IDS) is the most promising solution for preventing malicious intrusions and tracing suspicious network behavioral patterns. Machine learning (ML) methods are widely used in IDS. Due to a limited training dataset, an ML-based IDS generates a higher false detection ratio and encounters data imbalance issues. To deal with the data-imbalance issue, this research develops an efficient hybrid network-based IDS model (HNIDS), which uses the enhanced genetic algorithm and particle swarm optimization (EGA-PSO) and improved random forest (IRF) methods. In the initial phase, the proposed HNIDS utilizes hybrid EGA-PSO methods to enhance the minority data samples and thus produce a balanced data set to learn the sample attributes of small samples more accurately. In the proposed HNIDS, a PSO method improves the vector. GA is enhanced by adding a multi-objective function, which selects the best features and achieves improved fitness outcomes to explore the essential features and helps minimize dimensions, enhance the true positive rate (TPR), and lower the false positive rate (FPR). In the next phase, an IRF eliminates the less significant attributes, incorporates a list of decision trees across each iterative process, supervises the classifier’s performance, and prevents overfitting issues. The performance of the proposed method and existing ML methods is tested using the benchmark dataset NSL-KDD. The experimental findings demonstrate that the proposed HNIDS method achieves an accuracy of 98.979% on BCC and 88.149% on MCC for the NSL-KDD dataset, which is far better than the other ML methods, i.e., SVM, RF, LR, NB, LDA, and CART.
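The true positive rate (TPR) and false positive rate (FPR) named in the abstract above follow directly from the confusion counts of a binary classifier. The following is a minimal sketch, not the cited authors' code; the function names and the toy labels are assumptions made only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def detection_rates(y_true, y_pred, positive=1):
    """Return TPR, FPR and accuracy; a rate is 0.0 when its denominator is empty."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,   # attacks correctly flagged
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,   # normal traffic wrongly flagged
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical labels: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
rates = detection_rates(y_true, y_pred)
```

On imbalanced intrusion data, accuracy alone can look high while TPR on the minority attack class stays low, which is why the two rates are usually reported together.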
Conference Paper
One of the Internet of Things (IoT) operating platforms is the Wireless Sensor Network (WSN), which has proliferated into a broad spectrum of applications. These networks comprise many resource-restricted sensors in terms of sensing, communication, storage, and power. Security becomes a critical concern to protect the network of scarce resources from any malicious activities that target the network. Several solutions have been presented in the literature; however, machine learning has proven its appropriateness in designing energy-efficient detection measures for cyber-attacks targeting WSNs. This paper presents a WSN security performance evaluation of three Naive Bayesian machine learning classification technique variants: Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes, compared to three well-known base algorithms, K-Nearest Neighbors, Support Vector Machine, and Multilayer Perceptron. We applied Spearman correlation as a univariate feature selection. The specialized dataset, WSN-DS, was used for training and testing purposes. The performance of the six classifiers was evaluated in terms of accuracy, probability of detection, positive prediction value, probability of false alarm, probability of misdetection, memory usage, processing time, prediction time, and complexity.
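Spearman correlation as a univariate feature selector, as mentioned in the abstract above, can be sketched by ranking each feature and the label, computing the Pearson correlation of the ranks, and keeping the features with the largest absolute score. This is an illustrative sketch under stated assumptions: the helper names, the toy feature columns, and the choice of `k` are hypothetical and do not come from the cited paper or the WSN-DS dataset.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(columns, labels, k):
    """Keep the k feature names with the largest |Spearman| against the labels."""
    scores = {name: abs(spearman(col, labels)) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three feature columns and a binary attack label.
columns = {"a": [1, 2, 3, 4], "b": [4, 1, 3, 2], "c": [5, 5, 5, 5]}
labels = [0, 0, 1, 1]
top = select_features(columns, labels, k=1)
```

Because the score depends only on ranks, it is insensitive to monotone rescaling of a feature, which is one reason a rank-based filter can suit heterogeneous sensor measurements.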
Article
6G networks are expected to provide instant global connectivity and enable the transition from “connected things” to “connected intelligence,” where promising network slicing can play an important role in network assurance and service provisioning for various demanding vertical application scenarios. On the basis of diversified massive data, artificial intelligence (AI)-assisted techniques are widely considered more suitable than traditional models and algorithms to deal with challenges faced by complex and dynamic slicing problems in 6G. In view of this, we provide a tutorial on AI-assisted 6G network slicing for network assurance and service provisioning, aiming to show the prospect of 6G slicing and the advantages of applying AI technology. Specifically, we propose six typical characteristics of 6G network slicing, analyze the feasibility of AI from different network domains and technical aspects, propose a case study on AI-assisted bandwidth scaling, and, finally, put forward the main challenges and open issues for its future development.
Chapter
Today, due to the evolution of technology and the large-scale use of the Internet, securing everything is becoming an unavoidable necessity and a challenge for most companies. Since traditional means of security have become insufficient due to the increase in the number and types of computer attacks that appear almost daily, researchers in the field of computer security are developing security tools based on artificial intelligence concepts to detect new attacks. In this work, we propose a binary classification method for intrusion detection that has high accuracy, precision, and recall rates. This approach is based on a multi-layer perceptron and uses both the Pearson correlation coefficient and simulated annealing to select attributes from the three datasets used for generating and evaluating this model: NSL-KDD, UNSW-NB15, and CICIDS2017. We obtained 97.02% accuracy for NSL-KDD, 92.32% accuracy for UNSW-NB15, and 97.70% for CICIDS2017.
Article
Machine Learning-based techniques have shown success in cyber intelligence. However, they are increasingly becoming targets of sophisticated data-driven adversarial attacks resulting in misprediction, eroding their ability to detect threats on network devices. In this paper, we present AdIoTack, a system that highlights vulnerabilities of decision trees against adversarial attacks, helping cybersecurity teams quantify and refine the resilience of their trained models for monitoring and protecting Internet-of-Things (IoT) networks. In order to assess the model for the worst-case scenario, AdIoTack performs white-box adversarial learning to launch successful volumetric attacks that decision tree ensemble network behavioral models cannot flag. Our first contribution is to develop a white-box algorithm that takes a trained decision tree ensemble model and the profile of an intended network-based attack (e.g., TCP/UDP reflection) on a victim class as inputs. It then automatically generates recipes that specify certain packets on top of the intended attack packets (less than 15% overhead) that together can bypass the inference model unnoticed. We ensure that the generated attack instances are feasible for launching on Internet Protocol (IP) networks and effective in their volumetric impact. Our second contribution develops a method to monitor the network behavior of connected devices actively, inject adversarial traffic (when feasible) on behalf of a victim IoT device, and successfully launch the intended attack. Our third contribution prototypes AdIoTack and validates its efficacy on a testbed consisting of a handful of real IoT devices monitored by a trained inference model. We demonstrate how the model detects all non-adversarial volumetric attacks on IoT devices while missing many adversarial ones. The fourth contribution develops systematic methods for applying patches to trained decision tree ensemble models, improving their resilience against adversarial volumetric attacks. We demonstrate how our refined model detects 92% of adversarial volumetric attacks.
Article
The Metaverse can be regarded as a hypothesized iteration of the Internet, which enables people to work, play, and interact socially in a persistent online 3-D virtual environment with an immersive experience, by generating an imaginary environment similar to the real world, including realistic sounds, images, and other sensations. The Metaverse has strict requirements for a fully immersive experience, large-scale concurrent users, and seamless connectivity, which pose many unprecedented challenges to the sixth generation (6G) wireless system, such as ubiquitous connectivity, ultra-low latency, ultra-high capacity and reliability, and strict security. In addition, to achieve an immersive and hassle-free experience for mass users, full-coverage sensing, seamless computation, reliable caching, and persistent consensus and security should be carefully integrated into the future 6G system. To this end, this paper aims to depict the roadmap to the Metaverse in terms of communication and networking in 6G, including illustrating the framework of the Metaverse, revealing the strict requirements and challenges for 6G to realize the Metaverse, and discussing the fundamental technologies to be integrated in 6G to drive the implementation of the Metaverse, including intelligent sensing, digital twin (DT), space-air-ground-sea integrated network (SAGSIN), multi-access edge computing (MEC), blockchain, and the involved security issues.