GEMLIDS-MIOT: A Green Effective Machine Learning Intrusion Detection System
Based on Federated Learning for Medical IoT Network Security Hardening
Iacovos Ioannou (d,e), Prabagarane Nagaradjane (a), Pelin Angin (b), Palaniappan Balasubramanian (c), Karthick Jeyagopal Kavitha (f), Palani Murugan (g), Vasos Vassiliou (d,e)
(a) Dept. of ECE, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
(b) Dept. of Computer Engineering, Middle East Technical University, Ankara, Turkey
(c) School of Electronics and Computer Science, Bethnal Green, London E1 NS
(d) Dept. of Computer Science, University of Cyprus, Nicosia, Cyprus
(e) CYENS Center of Excellence, Nicosia, Cyprus
(f) Groww Ltd, Bengaluru, India
(g) Global Analytics India Pvt. Ltd, Chennai, India
Abstract
The growing daily use of Internet of Things (IoT) devices has heightened security concerns, particularly
within the healthcare sector. To prevent the unauthorized disclosure of sensitive data, IoT systems must
respond promptly and effectively to malicious activity. However, transferring data to distant cloud servers
for analysis introduces both latency and privacy concerns. To secure medical Internet of Things (MIoT)
networks, we employ a power-efficient Intrusion Detection System (IDS) with three primary objectives,
realized as three stages of execution: i) Classification: different types of attacks, such as
Man-in-the-Middle (MitM) and Distributed Denial of Service (DDoS), are categorized using well-established
machine learning (ML) techniques. This stage supports the IDS and its reporting system. ii) Anomaly
detection: previously unknown attacks are identified as anomalies. When the anomaly detector recognizes an
unknown attack, the first-stage classification model is retrained so that the attack can be recognized and
classified in the future. iii) Model update: Federated Learning (FL) keeps a remote cloud server current
with the latest changes to the classification model. FL allows collaborative model training while
preserving data privacy and security. The experimental findings indicate that the Enhanced Random Forest
(ERF, also called ensemble random forest) algorithm achieves an accuracy of 99.98% in classifying attacks,
so it serves as our first-stage classifier. The One-Class Support Vector Machine (SVM) algorithm reaches
99.7% accuracy in detecting anomalies, so it serves as our second-stage identifier. The third stage, which
updates the overall system model, is our proposed Federated Learning approach, which works with Enhanced
Random Forests and identifies the ERF differences from the old model in an optimal way. The efficacy of
our technique is confirmed through experiments involving an IoT system with a Raspberry Pi MIoT gateway
and through simulations of the FL updating process. These experiments successfully identify known and
unknown attacks with high reliability while limiting resource utilization and energy consumption. Future
work will focus on enhancing the scalability and efficiency of our IDS in MIoT networks.
Keywords: Brute force authentication, DDoS, Green Computing, Intrusion Detection, MIoT, IoMT, Machine
Learning, MQTT, MitM, Raspberry Pi, Federated Learning
Email address: iioann06@cs.ucy.ac.cy (Iacovos Ioannou)
Preprint submitted to Elsevier February 24, 2024
1. Introduction
The healthcare sector has witnessed a significant increase in security concerns about handling
sensitive data, primarily due to the widespread adoption of Internet of Things (IoT) devices. To
protect sensitive data within the network, these devices must respond promptly and effectively to
any hostile activity. The challenges in IoT applications stem from their heterogeneous character,
the absence of comprehensive security solutions at the outset, and manufacturers' limited focus on
security. In addition, the constrained computational capabilities of many IoT devices hinder the
effective deployment of security software. Despite IoT-enabling technology and intrusion prevention
systems, attackers continue to compromise devices successfully. These vulnerabilities have serious
ramifications for medical IoT (MIoT), or Internet of Medical Things (IoMT), networks that transmit
critical patient data. To secure such networks, we propose an Intrusion Detection System (IDS) that
actively monitors MIoT gateways, also known as sinkholes, using low-power Machine Learning (ML)
techniques (see footnote 1). This study presents a methodology centered on environmentally friendly
technology that employs several machine learning approaches to classify network attacks on MIoT
gateways/sinkholes and to detect unfamiliar attacks as anomalies. Furthermore, we present a
methodology that uses Federated Learning (FL) to update other MIoT gateways through a cloud server.
Regarding environmentally friendly technology, our research on MIoT network security incorporates
a power-efficient Intrusion Detection System (IDS). The system's design focuses on resource
optimization, which reduces energy consumption and aligns with environmental sustainability goals.
Our methodology unfolds in three stages: (1) the first stage uses Enhanced Random Forest algorithms
for accurate attack classification; (2) the second stage uses the One-Class Support Vector Machine
(SVM) to detect anomalies efficiently; and (3) the third stage employs Federated Learning (FL) to
collaboratively update the model while ensuring data privacy and reducing the need for extensive
data transmission. Our focus on energy efficiency, especially in the application of FL, aims to
minimize the energy requirements of MIoT devices. These devices often have strict power constraints,
and our approach to optimizing memory, CPU, and disk space usage significantly reduces their energy
consumption. Our experiments, which include real Medical IoT systems and simulations, have
demonstrated our system's capability to detect known and unknown attacks reliably while using
resources conservatively. This approach not only enhances the security of MIoT networks but also
aligns with our commitment to developing environmentally sustainable solutions in healthcare
cybersecurity.
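The three stages can be pictured as a simple control flow at the gateway. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Gateway` class, the `classify` and `is_anomaly` callables, the `retrain_after` buffer size, and the order in which the two checks run are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Features = Sequence[float]


@dataclass
class Gateway:
    """Hypothetical MIoT gateway wiring the three stages together."""
    classify: Callable[[Features], str]     # stage 1 stand-in (e.g. a random forest)
    is_anomaly: Callable[[Features], bool]  # stage 2 stand-in (e.g. a one-class SVM)
    retrain_after: int = 3                  # assumed buffer size before retraining
    unknown_buffer: List[Features] = field(default_factory=list)
    pending_updates: int = 0                # model versions waiting for the FL sync

    def handle(self, sample: Features) -> str:
        # Assumed ordering: traffic outside the learned profile is "unknown".
        if self.is_anomaly(sample):
            self.unknown_buffer.append(sample)
            if len(self.unknown_buffer) >= self.retrain_after:
                self._retrain()
            return "unknown"
        # Otherwise label the traffic with the stage 1 classifier.
        return self.classify(sample)

    def _retrain(self) -> None:
        # Placeholder: a real system would refit the stage 1 model here and
        # hand the changed model to the stage 3 FL updater.
        self.unknown_buffer.clear()
        self.pending_updates += 1


def demo() -> List[str]:
    gw = Gateway(
        classify=lambda x: "ddos" if x[0] > 100 else "normal",
        is_anomaly=lambda x: x[1] < 0,
    )
    traffic = [(5, 1), (500, 1), (5, -1), (5, -2), (5, -3)]
    return [gw.handle(s) for s in traffic]
```

In this sketch the retraining step only counts pending model versions; it marks the point where a real gateway would refit its classifier and notify the update stage.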
Prioritizing green, energy-efficient operation is of utmost importance, mainly because sinkholes
rely on batteries as integral components of their IoT functionality. Hence, the power consumption
of machine learning techniques becomes a critical factor when implementing an Intrusion Detection
System (IDS) on a system with limited resources. Energy can be conserved for MIoT devices by
deploying the IDS on the sinkhole instead of installing it on each device inside the network. The
sinkhole/gateway can also function as an access point, sharing its bandwidth through Wi-Fi Direct.
Furthermore, its secondary interface operates in monitor mode, thereby enhancing data security and
minimizing latency. Moreover, all sinkholes inside the MIoT network receive training updates
through Transfer/Federated Learning techniques to improve the identification of attacks.
Footnote 1: We have optimized the system to minimize resource use, including memory, CPU, and disk
space, thereby reducing the energy requirements of MIoT devices.
Our study offers a comprehensive approach to improving security in the medical Internet of Things
(MIoT) by combining modern machine learning methods within a unified framework. Our approach
combines a One-Class Support Vector Machine (SVM), which identifies anomalies in MIoT networks,
with Enhanced Random Forests, which detect a wide range of threats. This combination improves the
precision and effectiveness of detecting and categorizing network threats. Our novel Federated
Learning approach uses Enhanced Random Forests to enable dynamic and continuous upgrades of the
Intrusion Detection System (IDS). This method allows new attack patterns to be incorporated into
the MIoT network quickly and effectively, and it uses specialized tree-traversal techniques to
ensure that models are distributed correctly. Our overall approach consists of three key steps:
first, using classification techniques at MIoT gateways to identify network threats, both known
and unknown; second, incorporating the newly identified unknown attacks into the IDS by training
the classification model with these new attack records; and third, continuously updating both the
remote cloud server and the MIoT gateways using our Federated Learning approach, which calculates
the weights of the Enhanced Random Forest optimally. The Federated Learning approach uses an
Enhanced Random Forest machine learning model in the cloud to update the MIoT devices in the
Enterprise Resource Planning (ERP) system. This three-step strategy departs from conventional IDS
methodologies by offering a security solution that adapts in real time and delivers accurate
protection for MIoT environments. Cloud-based transfer learning further allows the IDS to be
optimized in real time, improving model accuracy for one or all MIoT sinkholes as the network
threat environment changes. We evaluate the complete solution in a purpose-built emulation
environment and verify that it identifies a broad spectrum of threats accurately. Moreover, the
solution with its three techniques (i.e., Enhanced Random Forests, One-Class SVM, and FL with
Enhanced Random Forests) addresses the crucial resource allocation problem in MIoT contexts,
balancing power, CPU utilization, and memory. The bilateral model update process between the cloud
and MIoT components ensures that all network elements stay synchronized with the latest threat
detection models. In summary, our paper presents a solution for enhancing security in MIoT
networks that integrates machine learning techniques with a novel system update strategy,
resulting in an adaptive IDS that addresses the specific challenges and requirements of MIoT
networks.
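The update step described above distributes only what changed in the forest. One possible way to find the changed trees is to fingerprint each tree and compare digests. The sketch below is purely illustrative: the nested-dict tree format and the functions `fingerprint`, `changed_trees`, and `apply_delta` are assumptions, and the paper's own tree-traversal technique is not reproduced here.

```python
import hashlib
import json
from typing import Dict, List

# Assumed tree format (hypothetical):
#   internal node: {"feature": int, "threshold": float, "left": ..., "right": ...}
#   leaf:          {"label": str}
Tree = Dict[str, object]


def fingerprint(tree: Tree) -> str:
    """Hash a tree's canonical JSON form so equal trees get equal digests."""
    canonical = json.dumps(tree, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_trees(old_forest: List[Tree], new_forest: List[Tree]) -> Dict[int, Tree]:
    """Return {index: tree} for trees that are new or differ from the old forest."""
    old_digests = [fingerprint(t) for t in old_forest]
    delta: Dict[int, Tree] = {}
    for i, tree in enumerate(new_forest):
        if i >= len(old_digests) or fingerprint(tree) != old_digests[i]:
            delta[i] = tree
    return delta


def apply_delta(forest: List[Tree], delta: Dict[int, Tree]) -> List[Tree]:
    """Patch a gateway's forest with the trees received from the server."""
    updated = list(forest)
    for i in sorted(delta):
        if i < len(updated):
            updated[i] = delta[i]   # replace a modified tree
        else:
            updated.append(delta[i])  # append a newly grown tree
    return updated
```

Sending only the delta keeps the transmitted update proportional to the number of changed trees rather than the size of the whole forest, which is the motivation this sketch is meant to convey.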
The primary contributions of this research work can be
briefly summarized as follows:
• We propose an energy-efficient IDS approach for MIoT networks, which runs lightweight ML models
  at the sinkhole for real-time detection of attacks.
• We classify MitM, DDoS, Brute Force Authentication, and NMAP attacks using Enhanced Random
  Forests to secure our MIoT network.
• We identify unknown attacks as anomalies using One-Class SVM to secure our MIoT network.
• We propose a cyber threat intelligence architecture for MIoT, where newly discovered attack data
  are shared through the cloud to update the ML models at the connected gateways. The proposed
  attack data-sharing model preserves the privacy of sensitive data.
• We demonstrate through experiments in an MIoT environment using Raspberry Pi and connected
  sensors that the proposed approach with the new dataset achieves high accuracy, low power
  consumption, and low detection time, compliant with the requirements of MIoT applications.
• We introduce an FL approach for the MIoT network in which the cloud server updates the network's
  sinkhole(s) with the recent changes to the nodes and trees of their Enhanced Random Forest model,
  enabling the identification of additional attacks. To this end, we modified FL to work with
  Enhanced Random Forests and MIoT devices.
• We show that the proposed approach, which combines Enhanced Random Forest and One-Class SVM,
  achieves the highest accuracy and the lowest power consumption, CPU utilization, and memory
  usage among the compared approaches.
The rest of the paper is structured as follows. An
overview of related work in the field of MIoT intrusion
detection is provided in Section 2.1. Section 2.2 includes
background information on IoT and intrusion detection.
The assumptions, terms, system model, and problem de-
scription are provided in Section 3. Section 4 describes the
proposed end-to-end intrusion detection approach and fed-
erated learning. The used ML models are briefly analyzed
and described in Section 4.4. The investigated approaches’
performances are examined, evaluated, and compared in
Section 5. Finally, Section 6 includes concluding remarks
and our future work directions.
2. Related Work and Background Information
This section reviews the work related to our investigation and provides the necessary background
information. To the best of our knowledge, no existing intrusion detection work for MIoT employs
Federated Learning, and no existing method jointly identifies unknown attacks and then classifies
them in a power- and memory-efficient manner.
2.1. Related Work
This section reviews the work related to MIoT and our investigation. Due to the detrimental
effects of cyber-attacks on IoT systems, particularly those used in healthcare, intrusion
detection in IoT has become a significant field of research in recent years. IDSs for IoT, like
IDSs for legacy systems, can be signature-based or anomaly-based. An anomaly-based IDS compares a
system's behavior to a predefined profile of regular activity and generates an alert whenever an
action deviates from that profile by more than a predefined threshold. As a result, it is
effective at detecting and preventing new attacks (e.g., in the IoT context, abuse of resources).
However, because any deviation from the established boundaries is interpreted as an attack, all
acceptable behavior must be known in advance. This is impractical because normal behavior changes
over time, so the technique is prone to generating false positives. In addition, the
normal-behavior profiles produced by statistical methods or machine learning algorithms may be too
large for IoT network nodes with limited storage and resources. Various approaches for intrusion
detection in MIoT have been proposed in recent years.
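The profile-and-threshold idea behind anomaly-based detection can be illustrated with a very small statistical detector. This is a generic sketch, not taken from any of the cited works: the functions `build_profile` and `is_anomalous`, the mean/standard-deviation profile, and the default threshold of 3 deviations are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def build_profile(normal_values: Sequence[float]) -> Tuple[float, float]:
    """Summarize regular activity as (mean, population standard deviation)."""
    return mean(normal_values), pstdev(normal_values)


def is_anomalous(value: float, profile: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Alert when the value lies more than `threshold` deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        # A constant profile: anything different from it is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold
```

The sketch also shows why such detectors drift into false positives: if normal traffic changes after the profile is built, legitimate values start exceeding the threshold until the profile is rebuilt.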
2.1.1. IDS approaches for MIoT
Authors in [1] presented an intrusion detection approach
for Medical Internet of Things (MIoT) networks. They em-
ployed machine learning techniques such as Decision Trees
(DT), Support Vector Machines (SVM), and K-Means on
MIoT gateways. The research focused on utilizing sim-
ulator data for evaluation and did not specify particu-
lar attacks. Moreover, their work did not address power
consumption or incorporate Federated Learning (FL) for
model updates, which are vital aspects for resource-constrained
MIoT devices.
Authors in [2] proposed an intrusion detection system
(IDS) for connected healthcare systems (CHS) using Au-
toencoders and XG-Boost. Unfortunately, details about
the dataset they used were not publicly available. They
primarily identified attacks like Interception, Forgery, and
Tampering. However, like Gao et al. [1], this work did not
consider power consumption analysis or FL for real-time
model updates, which are essential in MIoT environments.
The authors in [3] introduced an approach
for cyber-attack detection in patient monitoring devices
(PMD) using techniques like N-grams, K-Nearest Neigh-
bors (KNN), Support Vector Machines (SVM), Random
Forest (RF), and Decision Trees (DT). They utilized ac-
tual device data to identify attacks such as Eavesdropping,
Denial of Service (DoS), Man-in-the-Middle (MitM), Re-
play, and False Data Injection. However, like prior studies,
this research did not account for power consumption anal-
ysis or implement Federated Learning.
Authors in [4] designed a Mobile Agent-based intru-
sion detection system (IDS) for securing medical device
networks. They utilized Machine Learning and regression
algorithms to detect attacks like DoS, Data Falsification,
and Passive Listening. Unfortunately, the dataset details
were not publicly available, and the study did not incor-
porate power consumption analysis or Federated Learning
into the solution.
Authors in [5] proposed an intrusion detection sys-
tem for MIoT using Principal Component Analysis (PCA),
Grey Wolf Optimization (GWO), and Deep Neural Net-
works (DNN). They focused on attacks like Denial of Ser-
vice (DoS), User-to-Root, Probe, and Remote to Local
attacks using the Kaggle Intrusion dataset. However, the
study did not consider power consumption analysis or Fed-
erated Learning.
Authors in [6] developed an intrusion detection sys-
tem for MIoT networks based on Naive Bayes (NB), De-
cision Trees (DT), Random Forest (RF), and XGBoost
algorithms. They used the Ton-IoT dataset but did not
specify particular attacks. Similar to previous works, this
study did not address power consumption analysis or in-
corporate Federated Learning.
In their research, the authors in [7] focused on a healthcare
Internet of Things (IoT) Intrusion Detection System (IDS)
using Convolutional Neural Networks (CNN). They clas-
sified firewall risk levels as Normal, Critical, Major, and
Minor. However, specific dataset details were not publicly
available, and the work did not consider power consump-
tion analysis or Federated Learning.
Authors in [8] designed an IDS for healthcare systems
using Support Vector Machines (SVM), Random Forest
(RF), K-Nearest Neighbors (KNN), and Artificial Neural
Networks (ANN). Using publicly available datasets, they
identified attacks like MitM (Spoofing and Data Alteration).
Nevertheless, the research did not include power
consumption analysis or Federated Learning.
Authors in [9] introduced a framework for detecting
attacks in Fog nodes using EOS-ELM. They utilized the
NSL-KDD dataset to identify attacks like MitM and DDoS.
However, the study did not involve power consumption
analysis or Federated Learning.
In their work on MIoT malware detection, the authors in
[10] employed Convolutional Neural Networks (CNN) and
Long Short-Term Memory (LSTM). Unfortunately, spe-
cific dataset details were not mentioned, and the research
did not address power consumption analysis or Federated
Learning.
In [11], the authors introduce AnoFed, a novel framework for anomaly detection in digital
healthcare. It combines transformer-based Autoencoders (AE) and Variational Autoencoders (VAE)
with Support Vector Data Description (SVDD) in a federated learning setting. The approach aims to
enhance privacy protection, improve the explainability of results, and support adaptive anomaly
detection. Experiments on ECG anomaly detection show that AnoFed is efficient and effective
compared to state-of-the-art methods.
2.1.2. IDS approaches for IoT
Many researchers have also proposed approaches for
intrusion detection in general IoT systems. Authors in
[12] adapted Suricata, a signature-based intrusion detec-
tion system (IDS), to detect denial-of-service attacks in
6LoWPAN networks. The system was developed to be de-
ployed on a centralized, dedicated host. More specifically,
the system checks IDS alerts for channel interference and
the rate at which packets are dropped to confirm an attack
and reduce false alarms.
In [13], an anomaly detection system for the IIoT/IICSs based on a deep learning model trained in
two phases is proposed. In the training phase, a Deep Auto Encoder (DAE) is first trained using
only normal network behavior to generate parameters (e.g., weights and biases). These parameters
then initialize a standard supervised Deep Feed Forward Neural Network (DFFNN), which identifies
existing and new attack instances during the testing phase. Two well-known network datasets were
used to evaluate the model. With the NSL-KDD dataset, the accuracy was 98.6 % and the
false-positive rate was 1.8 %. With the UNSW-NB15 dataset, the accuracy was 92.4 % and the
false-positive rate was 8.2 %.
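For readers comparing figures such as these, the two metrics are computed from the confusion matrix as accuracy = (TP + TN) / (TP + TN + FP + FN) and false-positive rate = FP / (FP + TN). A short sketch follows; the function names and the counts used to exercise them are hypothetical and are not taken from [13].

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign samples that were wrongly flagged as attacks."""
    return fp / (fp + tn)


# Hypothetical counts for illustration only (500 attack and 500 benign samples).
example_accuracy = accuracy(tp=480, tn=491, fp=9, fn=20)
example_fpr = false_positive_rate(fp=9, tn=491)
```

Because the false-positive rate is normalized by the benign samples only, a detector can report high accuracy on an attack-heavy dataset while still raising many false alarms on normal traffic.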
Also, in [14], the authors proposed a scheme based on a three-pattern detection algorithm that
utilizes the Snort and ClamAV intrusion pattern sets. The algorithm was evaluated on a Raspberry
Pi equipped with an Omnivision 5647 sensor capable of capturing images for transmission to a
central server. By using auxiliary shift values, the technique eliminates many unnecessary
comparison operations between packet payloads and attack signatures, thereby lowering the
computational cost on low-capacity IoT nodes. When resources were limited, the method was two
times faster than the traditional pattern-matching algorithm.
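To illustrate how shift values can skip comparisons during signature matching, the sketch below uses the classic Boyer-Moore-Horspool algorithm. This is a swapped-in, well-known technique chosen for illustration; it is not claimed to be the algorithm of [14], and the function names and sample payload are hypothetical.

```python
from typing import Dict, List


def build_shift_table(pattern: bytes) -> Dict[int, int]:
    """Map each byte (except the last) to how far the search window may jump."""
    m = len(pattern)
    # Later occurrences overwrite earlier ones, leaving the smallest safe shift.
    return {byte: m - 1 - i for i, byte in enumerate(pattern[:-1])}


def find_signature(payload: bytes, pattern: bytes) -> List[int]:
    """Return every offset where `pattern` occurs in `payload`."""
    m, n = len(pattern), len(payload)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    hits: List[int] = []
    pos = 0
    while pos <= n - m:
        if payload[pos:pos + m] == pattern:
            hits.append(pos)
        # Jump by the shift of the byte aligned with the pattern's last position;
        # bytes that never occur in the pattern allow a full-length jump.
        pos += shift.get(payload[pos + m - 1], m)
    return hits
```

For example, `find_signature(b"GET /etc/passwd HTTP/1.1 /etc/passwd", b"/etc/passwd")` returns `[4, 25]`, while most window positions in between are skipped without any byte-by-byte comparison.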
The research in [15] proposed an architecture for de-
tecting SYN flooding attacks in IoT networks that utilizes
Random Neural Networks and LSTM. The authors gener-
ated their dataset by establishing a virtual network and
capturing traffic in PCAP files.
A sequential attack detection architecture for IoT networks that uses three machine learning
models is proposed in [16].
A fog-based semi-supervised learning approach for distributed attack detection in IoT networks
was proposed in [17]. The authors demonstrated that their distributed approach outperformed
centralized solutions regarding detection time and accuracy using the NSL-KDD dataset.
A botnet detection scheme based on anomalies in 6LoW-
PAN sensor networks is proposed in [18]. The solution
monitors network traffic and notifies users when any node’s
computed averages undergo unexpected changes. The pro-
file of acceptable behavior was constructed by averaging
the TCP control field, packet length, and the number of
connections for each sensor.
A distributed internal anomaly detection system for
the Internet of Things is developed in [19]. The author’s
algorithm was not intended to run on low-capacity IoT
nodes. The approach continuously monitors the packet
size and data rate of one-hop neighbor nodes, looking for
anomalies in network traffic. The model learns and infers
normal behaviors based on the monitored data.
A resource-constrained IoT device-aware deep packet
inspection method for detecting anomalies is investigated
in [20]. The payload data is processed as a sequence of
bytes, with the features selected as n-grams using a bit-
pattern matching algorithm. The method is based on the
similarity of payload data in IoT protocols. The authors
ran tests on two Internet-connected devices and found that
the false positive rates for worm propagation, tunneling,
SQL code injection, and directory traversal attacks were
very low.
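The n-gram view of a payload can be sketched in a few lines. The example below counts overlapping byte n-grams and compares two payloads by cosine similarity. It is a generic illustration only: the function names are hypothetical, and cosine similarity is an assumed stand-in for the bit-pattern matching used in [20].

```python
from collections import Counter
from math import sqrt


def byte_ngrams(payload: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-byte window in the payload."""
    return Counter(payload[i:i + n] for i in range(len(payload) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Similarity in [0, 1] between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

A payload whose n-gram profile has low similarity to the profile of regular protocol traffic would be a candidate anomaly under this kind of scheme.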
In [21], the authors detected anomalous behavior in low-capacity 6LoWPAN networks by using a
node's regular energy consumption as a parameter. They defined lightweight power consumption
models for the mesh-under and route-over routing schemes. When a node's consumption deviates from
the model, the system raises an alert and removes the node from the routing table. False alarm
rates, on the other hand, are reported.
The authors in [22] developed three algorithms for detecting wormhole attacks in Internet of
Things (IoT) networks. More specifically, their algorithms flag an anomaly when they detect either
i) a high volume of control packets exchanged between the tunnel's two ends or ii) a large number
of neighbors forming, resulting in a true positive rate of 94 % for detecting wormhole attacks and
87 % for detecting the attacker node. Even though the system uses little power and memory, making
it suitable for low-resource IoT devices, the authors did not report the false-positive rates.
In [23], the authors created an anomaly-based intrusion detection system (IDS) capable of
real-time analysis in Internet of Things (IoT) environments, using a PCA model to reduce the
number of features and classifiers such as Softmax Regression and KNN. Experimental results on the
KDD CUP 99 dataset show that, although Softmax Regression yields a simpler system that is more
efficient in time and computation, the accuracy of the KNN model is 1% higher.
In [24], a two-tier classification and dimension reduction mechanism for detecting malicious
activity against an IoT backbone is proposed. The KDD CUP 99 dataset's 41 features were reduced to
four dimensions, and the samples were then classified using Naive Bayes and kNN models. The
authors reduced the number of features using PCA and Linear Discriminant Analysis (LDA). Their
anomaly-based intrusion detection technique applies to probe, DoS, U2R, and R2L attacks in the
IoT. The method was evaluated on the NSL-KDD dataset and achieved an overall detection rate of
84.86 % and a false alarm rate of 4.86 %.
The paper in [25] presents Fed-ANIDS, a system that combines Federated Learning (FL) with
autoencoder-based anomaly detection to enhance network intrusion detection. It addresses the
privacy concerns of centralized models and uses various autoencoder models, including simple,
variational, and adversarial types, to compute intrusion scores based on the reconstruction errors
of normal traffic. Evaluated on popular datasets such as USTC-TFC2016, CIC-IDS2017, and
CSE-CIC-IDS2018, Fed-ANIDS shows high detection accuracy with fewer false alarms. The study also
highlights the superiority of autoencoder-based models over GAN-based models in this context,
underlining their effectiveness in threat detection while preserving data privacy in distributed
networks.
The authors in [26] present an approach for anomaly detection in IoT networks using federated
learning and deep neural networks (DNN). The method retains data privacy by keeping information
localized on IoT devices and sharing only updated model weights with a centralized federated
learning server. The paper demonstrates the efficiency of the DNN-based network intrusion
detection system (NIDS) and compares its performance to traditional deep learning models.
Evaluated on the IoT-Botnet 2020 dataset, the approach shows improved model accuracy and a reduced
false alarm rate compared to conventional methods, highlighting the benefits of combining
federated learning with deep learning in IoT environments.
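The weight-sharing step in such schemes is commonly a sample-weighted average of the clients' model weights (often called federated averaging). The sketch below shows that generic aggregation only; it is an assumed textbook form, not the specific procedure of [26] or of the Enhanced Random Forest approach in this paper, and `federated_average` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Weighted mean of client weight vectors.

    Each update is (weights, n_samples): only weights leave the device,
    never the raw training data.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training samples reported")
    size = len(updates[0][0])
    averaged = [0.0] * size
    for weights, n in updates:
        if len(weights) != size:
            raise ValueError("clients must share one model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged
```

Weighting by the number of local samples lets devices that saw more traffic influence the shared model more, without any device revealing its records.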
The paper in [27] explores the challenges of implementing
Federated Learning (FL) in the Internet of Things (IoT)
for anomaly detection. It highlights the limitations of FL
caused by data access constraints on devices, class
imbalance, and device heterogeneity. The study investigates
data augmentation strategies to improve anomaly detection
performance in IoT using three publicly accessible datasets,
covering random oversampling, stratified sampling, SMOTE,
ADASYN, and Generative Adversarial Networks (GANs), and
examines both their effectiveness and their computational
cost in a federated learning context. Key findings include
up to 22.9% performance improvement over the baseline
without data augmentation, particularly with stratified
random sampling and uniform random sampling, which achieve
this gain with only a modest increase in computation time.
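As an illustration of the simplest of these augmentation strategies, the following sketch shows random oversampling, which duplicates minority-class samples until every class matches the largest one. It is not the implementation used in [27]; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def random_oversample(samples: List[Sample], rng: random.Random) -> List[Sample]:
    """Duplicate randomly chosen minority samples until classes are balanced."""
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append((features, label))
    target = max(len(rows) for rows in by_label.values())
    balanced: List[Sample] = []
    for rows in by_label.values():
        balanced.extend(rows)
        # Draw the missing samples with replacement from the same class.
        balanced.extend(rng.choice(rows) for _ in range(target - len(rows)))
    return balanced


# Hypothetical imbalanced local dataset on one device: 8 normal, 2 attack.
local_data = [((float(i), 0.0), "normal") for i in range(8)]
local_data += [((100.0, 1.0), "attack"), ((101.0, 1.0), "attack")]
balanced_data = random_oversample(local_data, random.Random(0))
class_counts = Counter(label for _, label in balanced_data)
```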
The primary shortcoming of the abovementioned ap-
proaches is that they were rarely evaluated using IoT-
specific datasets and communication protocols, except in
[28]. The author of [28] used three methods, Extreme Gradient
Boosting (XGBoost), GRU Recurrent Neural Networks,
and LSTM Recurrent Neural Networks, to detect three
types of attacks, namely denial of service (DoS),
man-in-the-middle, and an MQTT-specific intrusion. Another
major shortcoming is that they did not consider the
power requirements of the algorithms, which are significant
in the case of power-constrained IoT devices. To the best
of our knowledge, this work is the first to propose an IDS
for MIoT that jointly considers the identification of un-
known attacks (as anomalies) and classification in terms
of performance and resource consumption while provid-
ing data privacy and security in intrusion detection along
with the use of a custom FL approach for updating the
whole MIoT network in the hospital (ERP system). We
conducted a thorough evaluation of our technique's
resource efficiency, which shows low power consumption,
a pivotal aspect in IoT environments. This
evaluation demonstrates that our approach is not only ef-
fective in terms of intrusion detection but also optimized
for minimal energy usage, making it highly suitable for
power-sensitive IoT applications. Additionally, regarding
the dataset utilized in our study, we acknowledge its cur-
rent unavailability to the public. However, to foster col-
laborative research and transparency in the field, we are in
the process of making the dataset available upon request.
This step ensures that interested researchers can access the
data while we navigate the necessary protocols for wider
public release.
Table 1 and Table 2, which share the same column names and
meanings, present a comparative analysis of intrusion
detection methodologies for MIoT and IoT, respectively, as
explored in the literature, together with the proposed
approach. In both tables, we compare Intrusion Detection
Systems (IDS); the columns represent distinct elements
essential to understanding and evaluating each IDS
approach. The 'Reference' column lists the academic studies
or papers, providing a context for each intrusion detection
method. 'Detection Technique(s)' describes the specific
algorithms or methods used to detect cyber threats,
varying from machine learning approaches to other
sophisticated techniques. The 'Detection Location' specifies
where the intrusion detection system is implemented, such
as in a specific device, network, or cloud-based setting.
The 'Dataset' column mentions the data used for training
and evaluating the IDS models, indicating the source and
type of data. 'Attacks Identified' provides details on the
range of cyberattacks that each system can detect, from
specific threats like DDoS to a broader spectrum. The use of
Federated Learning in an IDS approach is marked in the 'FL'
column. 'Power Consumption Analysis?' is particularly
relevant in MIoT/IoT environments, indicating whether the
study assesses the energy efficiency of the IDS. The column
'(U)nknown or (K)nown or (B)oth Attacks' classifies the
studies based on whether they detect unknown attacks,
classify known attacks, or do both. The 'ERP' column refers
to 'Enterprise Resource Planning', showing whether each
study implements a complete solution. Finally, 'Custom FL'
indicates whether a tailored Federated Learning approach is
implemented, catering to the specific needs of the IDS. Note
that both tables are structured to allow for a
straightforward comparison of different IDS methods,
highlighting their capabilities, the attack types they focus
on, and their treatment of power consumption and Federated
Learning. The first table emphasizes MIoT, while the second
provides insights into broader IoT security. Because our
methodology primarily targets MIoT, it is presented in the
first table; since both tables share an identical layout, it
can still be compared directly with the approaches in the
second table.
Tables 1 and 2 comprehensively compare various intrusion
detection approaches in the context of MIoT (Medical
Internet of Things) and IoT security. Each row represents
a different approach, including the reference paper, the
detection technique employed, the detection location, the
dataset used, and the types of attacks identified. Most
of the reviewed approaches employ machine learning or
deep learning techniques for intrusion detection. However,
it is noteworthy that "GEMLIDS-MIOT" presents
a novel approach. This approach examines Support Vector
Machines (SVM), Naive Bayes (NB), K-Nearest Neighbors
(KNN), Decision Trees (DT), and Enhanced Random
Forest (ERF) for detection, making it a robust
method. Moreover, it not only identifies various attacks,
such as DDoS (Distributed Denial of Service), MitM
(Man-in-the-Middle), Brute Force, and Scanning but also
efficiently handles power and identifies "Unknown Attacks".
This holistic approach is a promising solution for
enhancing MIoT/IoT security.

Table 1: Comparison of works in MIoT Intrusion Detection

| Reference | Detection technique(s) | Detection location | Dataset | Attacks Identified | FL | Power consumption analysis? | (U)nknown or (K)nown or (B)oth Attacks | ERP | Custom FL |
|---|---|---|---|---|---|---|---|---|---|
| [1] | Different ML (DT, SVM, K-MEANS) | The intrusion detection system on the connected medical device | The dataset is generated using a simulator. Not available | Not specified any attacks | X | X | K | X | X |
| [2] | Utilize stacked autoencoders for feature extraction and XG-Boost for classification and regression | Propose an intrusion detection system (IDS) based on a stacked autoencoder for anomaly detection in a connected healthcare system (CHS) | The dataset is not publicly available | Interception, Forgery, and Tampering | X | X | K | X | X |
| [3] | N-gram for feature extraction using KNN, SVM, RF, and DT | Detection of cyber-attacks against PMD | The data are generated from different real devices. The dataset is not publicly available | Eavesdropping, DoS, MitM, Replay attack, and False data injection | X | X | K | X | X |
| [4] | Mobile agent-based intrusion detection system using ML and regression algorithms | Securing the network of connected medical devices | The dataset is not publicly available | DoS, Data falsification, and Passive listening | X | X | K | X | X |
| [5] | PCA and GWO to reduce the dimensionality of the data and DNN to classify it | Intrusion detection system in MIoT system | Kaggle intrusion dataset | DoS, User-to-Root attack, probe attack, and Remote to local attack | X | X | K | X | X |
| [6] | Use of NB, DT, RF, and XGBoost | For cyber-attack detection in MIoT networks, an IDS based on ensemble learning and a fog-cloud architecture is used | Ton-IoT dataset | Not specified any attacks | X | X | K | X | X |
| [7] | CNN | IDS based on machine learning and multi-class classification designed for healthcare IoT in the smart city | The dataset is not publicly available | Firewall risk label: Normal, critical, major and minor | X | X | K | X | X |
| [8] | Authors compare the results of SVM, RF, KNN and ANN | IDS for healthcare using medical and network data | Dataset publicly available | MitM (spoofing and data alteration) | X | X | K | X | X |
| [9] | EOS-ELM | Framework for attack detection in the Fog node | NSL-KDD dataset | MitM and DDoS | X | X | K | X | X |
| [10] | CNN to extract features and LSTM to classify data | Hybrid deep learning-based model for malware detection in the MIoT deployed at the SDN plane application level | Not mentioned | Not specified any attacks | X | X | K | X | X |
| [11] | Transformer-based AE and VAE with SVDD | Digital Healthcare IoT | ECG Data | Anomalies Detection | ✓ | ✓ | U | X | X |
| GEMLIDS-MIOT | Examines the classification using a Support Vector Machine, Naive Bayes, K-nearest neighbors, Decision Tree, and Random Forests. Finally, it uses the one-class SVM for the identification and the Random Forests for the classification, along with FL for updating the Enhanced Random Forest at sinkholes | At the Sinkhole | Data generated using emulation and CICEV2023 | Distributed Denial of Service (DDoS) attack, Man-in-the-Middle (MitM) attack, Brute Force Authentication attack, Network scanning attack | ✓ | ✓ | B | ✓ | ✓ |

Table 2: Comparison of IoT Intrusion Detection Works along With The Paper From Literature Review

| Reference | Detection technique(s) | Detection location | Dataset | Attacks Identified | FL | Power consumption analysis? | (U)nknown or (K)nown or (B)oth Attacks | ERP | Custom FL |
|---|---|---|---|---|---|---|---|---|---|
| [12] | Suricata IDS | Central Host | Not specified | DoS | X | X | K | X | X |
| [13] | Deep Auto Encoder, DFFNN | IIoT/IICSs | NSL-KDD, UNSW-NB15 | Various | X | X | K | X | X |
| [14] | Three-pattern detection (AS-EBS) | Raspberry Pi | Not specified | Not specified | X | X | K | X | X |
| [15] | Random Neural Networks, LSTM | IoT Networks | Virtual network data | SYN flooding | X | X | K | X | X |
| [16] | ANN, J48 DT and Naïve Bayes (Hybrid) | IoT Networks | Not specified | Various | X | X | K | X | X |
| [17] | Semi-supervised learning ESFCM with ELM | Fog-based IDS | NSL-KDD dataset | Various | X | X | K | X | X |
| [18] | Anomaly detection (Bot Analysis) | 6LoWPAN Networks | Not specified | Wormhole | X | X | K | X | X |
| [19] | Anomaly detection (MGSS, RSS and ISS) | IoT | Not specified | Various | X | X | K | X | X |
| [20] | Ultra-lightweight deep-packet anomaly (Deep packet inspection) | IoT | Internet-connected devices | Various | X | X | K | X | X |
| [21] | Intrusion detection | 6LoWPAN Networks | Not specified | Various | X | X | K | X | X |
| [22] | Heuristic Algorithm | IoT Networks | Not specified | Wormhole | X | X | K | X | X |
| [23] | KNN (Feature reduction) | IoT | Not specified | Various | X | X | K | X | X |
| [24] | TDTC (Dimension reduction in IDS) | IoT Backbone | NSL-KDD dataset | Probe, DoS, U2R, R2L | X | X | K | X | X |
| [25] | Federated Anomaly-based NIDS | IoT Networks | USTC-TFC2016, CIC-IDS2017, CSE-CIC-IDS2018 | Various | ✓ | X | U | X | X |
| [26] | Federated DNN-based NIDS | IoT Networks | - | Network Intrusions | ✓ | X | U | X | |
| [27] | Data Augmentation in Federated Learning | IoT Devices | USTC-TFC2016, CIC-IDS2017, CSE-CIC-IDS2018 | Various, with focus on Anomaly Detection | ✓ | X | B | X | X |
Table 3 shows that GEMLIDS-MIOT achieved an accuracy of
99.8% in the MIoT IDS category. It outperforms the other
MIoT IDS approaches, including Hybrid CNN-LSTM (98.83%),
HEKA (98.4%), XGBoost (97.83%), and Random Forest Agent
(97.21%), with the sole exception of DNN (99.9%), which is
0.1% higher, making it one of the most accurate systems in
this category. However, it is important to note that the IoT
IDS category contains some approaches with higher reported
accuracies, such as AS-EBS (100%) and Intrusion Detection
(100%). Therefore, in terms of accuracy, GEMLIDS-MIOT is
competitive in the MIoT IDS category but not the highest
among all IoT IDS approaches. However, our approach also
provides a power consumption analysis, reduced execution
time and memory consumption, and unknown attack
identification through anomaly detection (as shown in
Section 5.2).
Table 3: Accuracy of Investigated IDS Approaches

| Category | Authors | Proposed Approach / Accuracy |
|---|---|---|
| MIoT IDS | Gao et al. [1] | DT pruned 90.37% |
| MIoT IDS | He et al. [2] | XGBoost 97.83% |
| MIoT IDS | Newaz et al. [3] | HEKA 98.4% |
| MIoT IDS | Odesile et al. [4] | Random Forest Agent 97.21% |
| MIoT IDS | Swarna et al. [5] | DNN 99.9% |
| MIoT IDS | Kumar et al. [6] | E-ADS 96.352% |
| MIoT IDS | Lee et al. [7] | M-IDM 96.50% |
| MIoT IDS | Hady et al. [8] | RF (EHMS) 92.27% |
| MIoT IDS | Alrashi et al. [9] | OS-ELM (FBAD) 98.19% |
| MIoT IDS | Khan et al. [10] | Hybrid CNN-LSTM 98.83% |
| MIoT IDS | Raza et al. [11] | AnoFed with Transformer-based AE and VAE, SVDD, ECG Anomaly accuracy 98.125% |
| MIoT IDS | GEMLIDS-MIOT | one-class SVM 99.7%, ERF 99.8% |
| IoT IDS | Kasinathan et al. [12] | DoS decoder in Suricata, Unknown |
| IoT IDS | Muna et al. [13] | Deep Feed Forward Neural Network (DFFNN) 98.6% |
| IoT IDS | Oh et al. [14] | AS-EBS 100% |
| IoT IDS | Evmorfos et al. [15] | Random Neural Network (Gelenbe) 80.7% |
| IoT IDS | Soe et al. [16] | ANN, J48 DT and Naïve Bayes (Hybrid) 99.10% |
| IoT IDS | Rathore et al. [17] | ESFCM with ELM 86.53% |
| IoT IDS | Cho et al. [18] | Bot Analysis, Unknown |
| IoT IDS | Thanigaivelan et al. [19] | MGSS, RSS and ISS, Unknown |
| IoT IDS | Summerville et al. [20] | Ultra-lightweight deep-packet anomaly, Unknown |
| IoT IDS | Lee et al. [21] | Intrusion detection 100% |
| IoT IDS | Pongle et al. [22] | Heuristic Algorithm 94% |
| IoT IDS | Zhao et al. [23] | KNN 84.406% |
| IoT IDS | Pajouh et al. [24] | TDTC 84.86% |
| IoT IDS | Wang et al. [] | FDL MI-DNN, Avg. Accuracy 98.51%, F1-score 97.53%, TPR 98.44% and TNR 97.31% |
| IoT IDS | Idrissi et al. [25] | Fed-ANIDS & AE, Avg. F1-score: 82.42%, FDR: 0.55% and Accuracy: 92.60% |
| IoT IDS | Weinger et al. [27] | DA in FL, improve by 22.9% |
2.2. Background Information
In this section, we provide the background information
related to the following protocols and techniques exam-
ined in this research: IoT, MQTT, REST, Examined ML
approaches in our investigation, Federated Learning, and
Intrusion Detection.
2.2.1. IoT
The Internet of Things is a complex network with sev-
eral interconnected components such as intelligent devices,
gateways, communication protocols, Internet infrastruc-
ture, applications, cloud computing, and end-users. Smart
devices generate and send data using dedicated communi-
cation protocols via an Internet-connected gateway. Sub-
sequently, interactive apps process this data and deliver
it to users through cloud-based platforms. These gadgets
span a wide spectrum, ranging from basic temperature
sensors in smart homes to advanced drones designed for
deployment in the defense industry. While IoT devices
may have distinct agendas, they exhibit certain shared
characteristics. The principal objective of an Internet of
Things (IoT) device is to perceive and gather data about
the physical world. Most IoT devices have limited memory
and energy capacity. Hence, these devices must minimize
energy consumption, leading to the adoption of IoT
communication protocols that function at restricted data
rates across short distances. The IoT gateway facilitates
the transmission of data gathered from IoT devices to
applications using Ethernet or WiFi, employing TCP/IP
protocols. The application processes the collected data
about the physical environment, aiming to extract valuable
information that can be utilized for decision-making or the
remote operation of physical objects [30]. MQTT and REST
are two of the most
commonly used application layer protocols on the Internet
of Things.
2.2.2. MQTT
Message Queuing Telemetry Transport (MQTT) facilitates
the dissemination of telemetry information from
resource-constrained network clients over networks
characterized by significant latency. The protocol follows
a publish-subscribe pattern and is employed for
machine-to-machine (M2M) communication. It is designed to
cater to the needs of small sensors and mobile devices: it
is lightweight and optimized explicitly for networks with
high latency or unreliability. The protocol operates on top
of the TCP/IP stack. MQTT clients can function as
publishers or subscribers, depending on whether they
transmit messages or subscribe to receive them, and both
functions can be incorporated into a single MQTT client.
When a client transmits data to a broker, it is referred to
as a publisher, and the corresponding procedure is denoted
as a publish operation. Registering with the broker to
receive data on a topic is referred to as a subscription.
EMQX is an open-source, cloud-native, distributed MQTT
broker. Its primary function is facilitating communication
between publishers, who initiate message transmission, and
subscribers, who receive these messages. In this paper,
EMQX serves as the broker within our Internet of Things
(IoT) system. MQTT X is a client that subscribes to an MQTT
topic on the broker, enabling it to automatically receive
all incoming information associated with the specified
MQTT topic. Figure 1 shows an example of the
publish-subscribe architecture of MQTT that sends
health-related data across the topic "Health". Note that
the publisher devices transmit data to a broker, and the
subscribers receive data for the topics they have subscribed
to from the broker [31].
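The publish-subscribe flow described above can be sketched with a toy in-memory broker. This is only an illustration of the pattern, not MQTT itself: there is no networking, no QoS, and no wildcard topics, and the class name `ToyBroker` and the payload are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Callback = Callable[[str, str], None]


class ToyBroker:
    """In-memory stand-in for an MQTT broker (no networking, no QoS)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a subscriber callback for one exact topic name."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Forward the payload to every subscriber of the topic."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = ToyBroker()
received: List[Tuple[str, str]] = []

# A subscriber registers interest in the "Health" topic.
broker.subscribe("Health", lambda topic, payload: received.append((topic, payload)))

# A publisher sends a reading; it never addresses the subscriber directly.
delivered = broker.publish("Health", '{"heart_rate": 72}')

# A message on a topic nobody subscribed to reaches no one.
ignored = broker.publish("Billing", "invoice")
```

The publisher and subscriber only know the broker and the topic name, which is what decouples the sensing devices from the consumers of their data.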
2.2.3. REST
REST (representational state transfer) was developed
to guide the design and development of the architecture of
the World Wide Web. The Representational State Trans-
fer (REST) framework offers a set of principles and stan-
dards for the design of distributed hypermedia systems,
specifically those operating on an Internet scale, such as
the World Wide Web. The REST architectural style em-
phasizes several key principles, including the scalability
of interactions between components, the use of uniform
interfaces, the possibility for components to be deployed
independently, and the construction of a layered archi-
tecture. These principles enable the implementation of
caching components, reducing the latency perceived by
users. Additionally, REST supports the enforcement of
security measures and the encapsulation of legacy sys-
tems. A REST API, sometimes called a RESTful API, is
an application programming interface (API) or web API
that conforms to the limitations of the REST architec-
tural style. It facilitates the interaction with RESTful
web services. The architectural style in question can em-
ploy SOAP and protocols like HyperText Transfer Proto-
col (HTTP). This technology necessitates a reduced amount
of bandwidth and exhibits compatibility with distinct data
types such as plaintext, HTML, XML, JSON, and others
[32]. Moreover, HTTP clients typically use the Transmission
Control Protocol (TCP) to establish a connection through
a three-way handshake mechanism, ensuring reliable server
communication. The data generated by the Internet of
Things (IoT) system is transmitted over a Representa-
tional State Transfer (REST) web service to a server in
order to protect its secrecy. The HTTP service is the
intermediary software layer employed by the Internet of
Things (IoT) system for transmitting data to the server.
Figure 1: MQTT Publish / Subscribe Model
2.2.4. MIoT Attacks
Overall, the vulnerability of the MIoT system to dif-
ferent attacks arises from its incorporation of wireless con-
nection and external equipment for sensor control and en-
hancement. The open literature has put forth two sep-
arate architectural approaches for the MIoT: single-hop
and multi-hop. The single-hop architecture is predicated
on the utilization of sensors for gathering and transmitting
data. Nevertheless, this architectural design is susceptible
to a singular point of failure, wherein the failure of a single
component inside the personal server layer can jeopardize
the integrity and functionality of the entire MIoT system.
Utilizing a multi-hop architecture enables the collection
and transmission of data by sensors while simultaneously
facilitating data routing. This architectural approach of-
fers advantages such as node mobility and reduced energy
consumption during the process of data transfer. Con-
sequently, the architectural framework in question may
be susceptible to routing vulnerabilities inherent in wire-
less sensor networks. The following are the most common
types of network and system attacks:
Attack at data collection level: Due to the MIoT sys-
tem’s integration of wireless communication and ex-
ternal equipment for controlling and upgrading sen-
sors, the MIoT is a single-hop and multi-hop archi-
tecture vulnerable to various attacks.
Attack at the transmission level: At the transmis-
sion level, there is a significant threat risk, as wire-
less communication enables an attacker to intercept,
modify, or block messages sent and exchange valu-
able information about the patient’s condition.
Attack at storage level: All information about a pa-
tient’s health condition, treatment, and identity is
stored at this level, making it an attractive target
for adversaries seeking access to these data.
The following are the attacks that our investigation
identifies:
Man-in-the-Middle (MitM) Attack [33]: The
Man-in-the-Middle (MitM) attack refers to a form
of cyberattack in which an attacker secretly intercepts
and potentially modifies the communication between
two entities without their knowledge. The potential
consequences include the unauthorized acquisition of
data, interception of communications, or the unau-
thorized takeover of a session. Man-in-the-middle
(MitM) attacks capitalize on weaknesses present in
communication protocols or network setups.
Distributed Denial-of-Service (DDoS) Attack
[34]: DDoS attacks encompass the deliberate act of
flooding a specific system or network with an ex-
cessive volume of traffic originating from numerous
sources, thereby incapacitating its functionality and
leaving it unreachable. Malicious actors frequently
employ botnets to execute distributed denial of ser-
vice (DDoS) attacks, interrupting services and caus-
ing financial detriment for targeted entities.
Brute Force Authentication Attack [35]: A
brute force attack is an iterative approach employed
to ascertain a password or encryption key by exhaustively
and methodically examining all conceivable combinations.
This method is a frequently employed
means of illicitly obtaining entry into user accounts,
systems, or networks.
NMAP Identification Attack [36]: NMAP is a
network scanning program that malicious actors employ
to discern accessible ports, services, and potential
weaknesses on designated computers. It is frequently
employed as a reconnaissance technique before the
initiation of more advanced attacks. The identification
of NMAP scans holds significant importance in
the realm of network security.
2.2.5. Examined ML Models
This section briefly overviews the algorithms used in
our traffic identification and attack classification approach.
Support Vector Machines (SVM) [37]:
The Support Vector Machines (SVM) technique is widely
utilized in the field of supervised machine learning to solve
classification problems. The algorithm uses a kernel trick to
map the data into a space where the classes become separable.
It aims to find an optimal boundary that divides the data
according to the given labels and then uses this boundary
to decide on which side, and hence in which class, new data
falls. Figure 2 depicts an instance
of classification using Support Vector Machines (SVM).
Figure 2: Classification by SVM
In the example in Figure 2, the black line represents one
classifier, while the blue line represents another. Upon
evaluating the margins, which refer to the extent of
separation from the training data, it becomes evident that
the model represented by "one" has superior data
classification capabilities. The SVM classifier seeks to
maximize the margins that separate the different classes.
It should be noted that in our study, the One-Class SVM was
employed as the detector for anomalies and unknown attacks,
as suggested by other studies in the field, for both small
and large data sets [38, 39, 40, 41, 26]. More specifically,
the One-Class Support Vector Machine (SVM) is a variant of
the regular Support Vector Machine that is designed
primarily to identify anomalies or novelties. This
specialization enables it to identify patterns in data that
deviate from recognized norms, making it effective at
detecting unexpected cyber threats.
The One-Class SVM differs from typical SVMs in that it
is exclusively trained on data that exhibits ’normal’ be-
havior rather than being used for classification tasks in-
volving several classes. This quality is crucial for its effec-
tiveness in identifying unexpected attacks, as it does not
require prior knowledge of all possible attack types.
During training, the One-Class SVM algorithm aims to create
a boundary in the feature space that encompasses most
of the 'normal' data points. The boundary is constructed to
enclose the region where the probability of encountering a
typical data point is substantial while still providing some
flexibility by leaving a small number of 'normal' data
points outside the boundary. This ensures that outlier
values within the regular dataset do not unduly affect
the boundary. When dealing with data that cannot be
separated linearly, the One-Class SVM employs the kernel
method to map the input data into a higher-dimensional
space, which facilitates the definition of a clear boundary.
Following the completion of training, the model can
identify anomalies or distinctive patterns: it flags new
data points that fall outside the decision
boundary as likely anomalies or unrecognized attacks. The
One-Class SVM excels at identifying unforeseen attacks by
using its proficiency in anomaly detection, the adaptabil-
ity of the kernel approach, its resilience against overfitting,
and its capability to handle high-dimensional data. How-
ever, the quality and comprehensiveness of the ’normal’
data used during training significantly impact the model’s
efficacy [38, 39, 40, 41, 26].
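For reference, the boundary described above can be written as a decision function. The form below is the standard kernel formulation of the One-Class SVM; the source does not state which formulation or kernel was used, so the symbols $\alpha_i$, $\rho$, $k$, $m$, and $\nu$ are assumptions introduced here for illustration.

\[
f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} \alpha_i \, k(x_i, x) - \rho \right)
\]

Here $x_1, \ldots, x_m$ are the 'normal' training points, $\alpha_i \ge 0$ are learned coefficients that are nonzero only for the support vectors, $k$ is the kernel function, and $\rho$ is the learned offset. A new point is treated as normal when $f(x) = +1$ and as an anomaly when $f(x) = -1$. The training parameter $\nu \in (0, 1]$ is an upper bound on the fraction of training points left outside the boundary, which corresponds to the flexibility mentioned above.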
In selecting the One-Class Support Vector Machine
(SVM) for our study, our decision was significantly in-
formed by a thorough literature review and the insights
gleaned from the papers [38, 39, 40, 41, 26]. Collectively,
these papers underscore the effectiveness of One-Class SVM
in anomaly detection, especially in contexts where anoma-
lies are sparse and not well-represented in the data. This
method stands out in its ability to model the ’normal’ be-
havior of data, thereby efficiently identifying deviations as
anomalies. Unlike KNN or Centroid Classifiers, which may
require substantial and balanced datasets to achieve high
accuracy, One-Class SVM is particularly adept at working
with datasets where anomalies are rare. This characteris-
tic aligns closely with our research objectives, where de-
tecting infrequent and potentially novel anomalies is cru-
cial. The literature review highlighted the strengths of
One-Class SVM in handling such data scenarios, making
it a more suitable choice for our study’s focus on anomaly
detection.
K-Nearest Neighbors (KNN) [42]:
The K-nearest neighbors (KNN) algorithm operates under
the assumption that data points belonging to the same
class tend to be located near each other. The fundamental
concepts that characterize the KNeighbors classifier are
proximity, distance, and closeness. The metric parameter
specifies how this proximity is calculated. Our approach
uses the Minkowski distance, which includes the Euclidean
distance as the special case p = 2, defined by the formula
shown below (Formula 1):
\[
\text{Minkowski Distance} = \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p} \tag{1}
\]
It assigns a class to an object by majority vote among the
classes of its K nearest objects (K is a small positive integer).
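A minimal sketch of this procedure, assuming a brute-force neighbour search over a small in-memory training set, is given below. The function names `minkowski` and `knn_predict` and the two-feature toy flows are hypothetical and serve only to illustrate Formula 1 and the voting step.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def minkowski(x: Sequence[float], y: Sequence[float], p: float = 2.0) -> float:
    """Formula 1: p = 1 gives Manhattan distance, p = 2 Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def knn_predict(train: List[Sample], query: Sequence[float],
                k: int = 3, p: float = 2.0) -> str:
    """Return the majority class among the k nearest training samples."""
    neighbours = sorted(train, key=lambda item: minkowski(item[0], query, p))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature traffic samples.
training = [
    ((1.0, 1.0), "normal"),
    ((1.2, 0.8), "normal"),
    ((0.9, 1.1), "normal"),
    ((5.0, 5.0), "attack"),
    ((5.2, 4.9), "attack"),
]
prediction = knn_predict(training, (1.1, 1.0), k=3)
manhattan = minkowski((0.0, 0.0), (3.0, 4.0), p=1)
euclidean = minkowski((0.0, 0.0), (3.0, 4.0), p=2)
```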
Decision Tree (DT) [43]:
The decision tree classifier is a classification system that
is built on a tree structure comprising two distinct types
of nodes: decision nodes and leaf nodes. Decision nodes
serve as checkpoints that correspond to specific feature val-
ues and can branch into many branches. The leaf nodes
in a decision tree represent the outcomes resulting from
the preceding decisions, and they do not have any more
branches. The process involves evaluating several options,
thoroughly analyzing each option, and eliminating extra-
neous pathways that the classifier may pursue. At each
process level, an attribute test condition is selected and in-
crementally constructed starting from the top node. Leaf
nodes, which are capable of receiving values from a dis-
crete set, can be employed for classification. Decision tree
classifiers utilize many metrics such as EP, Gini Impurity,
Information Gain, Variance Reduction, and Measure of
Goodness to assess the accuracy of the tree.
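Two of the metrics named above, Gini impurity and information gain, can be computed as in the following sketch. The function names and the toy label lists are hypothetical, and the information gain shown is the entropy-based variant for a binary split.

```python
import math
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[str]) -> float:
    """1 - sum of squared class proportions; 0 means a pure node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def information_gain(parent: Sequence[str], left: Sequence[str],
                     right: Sequence[str]) -> float:
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical node with four attack and four normal samples,
# split perfectly by some attribute test.
parent_labels = ["attack"] * 4 + ["normal"] * 4
left_labels = ["attack"] * 4
right_labels = ["normal"] * 4

parent_gini = gini_impurity(parent_labels)
split_gain = information_gain(parent_labels, left_labels, right_labels)
```

A decision node would evaluate candidate attribute tests in this way and keep the one with the highest gain (or lowest impurity).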
Naïve Bayes (NB) [44]:
Naïve Bayes is a probabilistic classifier derived from
applications of Bayes' Theorem as stated below:
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{2}
\]
where A and B are events,
P(A|B) is the probability of A given that B is true,
P(B|A) is the probability of B given that A is true, and
P(A), P(B) are the probabilities of A and B on their own.
An advantage of using the Naïve Bayes classifier is that
maximum likelihood training can be done in linear time,
which many of its counterparts do not offer. Naive
Bayes is a conditional probability model that tries to
assign a class C_k by calculating the probability of the
class C_k given a feature vector x of n features. Naive
Bayes Classifiers are classified into the following categories:
i) Multinomial Naive Bayes: The classifier uses the fre-
quency of words in the document as features/predictors;
ii) Bernoulli Naive Bayes: Similar to multinomial naive
Bayes, but the predictors are boolean variables; iii) Gaus-
sian Naive Bayes: The predictors take on a continuous
value and are not discrete; we assume that these values
are sampled from a Gaussian distribution. In this work,
we use the Gaussian Naive Bayes classifier.
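A minimal sketch of Gaussian Naive Bayes is shown below, assuming per-class means and variances are estimated by maximum likelihood in a single pass over the data. The function names, the small variance floor, and the toy traffic features are hypothetical and are not taken from our implementation.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]
Model = Dict[str, Tuple[float, List[Tuple[float, float]]]]


def fit_gaussian_nb(samples: List[Sample]) -> Model:
    """Estimate the class prior and per-feature (mean, variance) per class."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    model: Model = {}
    for label, rows in grouped.items():
        prior = len(rows) / len(samples)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small floor avoids division by zero for constant features.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict_gaussian_nb(model: Model, features: Sequence[float]) -> str:
    """Pick the class with the highest log posterior (up to a constant)."""
    def log_posterior(label: str) -> float:
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(features, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical features: (packets per second, fraction of SYN packets).
flows = [
    ((10.0, 0.10), "normal"), ((12.0, 0.20), "normal"), ((11.0, 0.15), "normal"),
    ((90.0, 0.90), "attack"), ((110.0, 0.80), "attack"), ((100.0, 0.85), "attack"),
]
nb_model = fit_gaussian_nb(flows)
high_rate = predict_gaussian_nb(nb_model, (95.0, 0.82))
low_rate = predict_gaussian_nb(nb_model, (11.5, 0.12))
```

Working with log probabilities avoids numerical underflow when many features are multiplied together.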
Random Forest (RF) [45]:
The Random Forests classifier is a descendant of the De-
cision Tree classifier. It consists of a large number of dis-
tinct decision trees. When multiple decision trees provide
distinct class values for the same object, the class that re-
ceives the most votes is chosen as the final class assigned
to the object. The critical point is that these individual
decision trees are only weakly correlated with one another.
As a result, the Random Forest is extremely powerful:
because the correlation between any two trees is low, an
error induced in one tree may not affect the other.
Randomness is required to keep these models uncorrelated,
and bagging and feature randomness are the techniques that
ensure a high level of randomness. More precisely, in RF,
each decision tree is built using a random subset of the
training data, often selected with replacement (a method
known as bootstrapping). Additionally, when splitting nodes
during the tree-building process, a random subset of
features is considered at each split. This method introduces
diversity among the trees, as each tree sees different parts
of the training data and different sets of features.
Despite this diversity, some correlation among the trees
remains because they are drawn from a shared pool of
training data and features; the randomization keeps this
correlation low. When the individual decision trees in the
RF make their predictions, these are then aggregated
(usually through a majority vote for classification tasks
or averaging for regression tasks). Aggregating predictions
from multiple, weakly correlated trees makes the RF more
powerful and accurate than individual decision trees. This
is because the errors of individual trees are likely to be
different and, when averaged, can cancel each other out,
leading to a more accurate final prediction. In summary,
the decision trees in RF share their origins in the same
data and features yet differ through randomization. This
controlled diversity, combined with the aggregation of their
predictions, contributes to the overall strength and
robustness of the RF algorithm, particularly in handling
diverse and complex datasets [46].
A Random Forest (RF) is a machine learning model that
constructs an ensemble of decision trees, named a forest,
such that each decision tree is built using an independently
and identically distributed random vector [47]. Each tree is
trained on a different subset of the data and features. For
classifying a particular data instance, a Random Forest uses
the outputs of all trees in the forest to pick the majority
decision. Combining the outputs of multiple trees makes the
classifier more robust than a single decision tree, which
frequently suffers from the over-fitting problem, and results
in a high level of accuracy.
At a high level, the RF algorithm works as follows:
1. The complete training set $S$, consisting of $n$ data
instances with class labels $\{c_i, i = 1, \ldots, n\}$ from a
set of classes $C$, is split into $K$ random subsets using
bootstrap sampling:
$$S = \{S_1, S_2, \ldots, S_K\}$$
2. A random feature vector $\theta_k$ is created and used to
build a decision tree from each $S_k$. All $\{\theta_k, k =
1, 2, 3, \ldots, K\}$ are independent and identically
distributed.
3. Each tree $r(S_k, \theta_k)$ is grown without pruning to
form the forest $R$.
4. The classification of a test data instance $x$ is
calculated as follows:
$$H(x) = \arg\max_{C_j} \sum_{i=1}^{K} I(h_i(x) = C_j) \tag{3}$$
where $I$ is the indicator function, and $h_i(x)$ is the
result of classification by $r(S_i, \theta_i)$.
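The voting rule in Equation (3) can be sketched in a few lines of Python. This is only an illustration: the three "trees" below are hypothetical threshold functions standing in for trained decision trees, and the tie-breaking rule is an arbitrary choice that the text does not specify.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# Hypothetical stand-in for a trained tree h_i: maps a sample to a class label.
Tree = Callable[[Sequence[float]], Hashable]


def forest_predict(trees: Sequence[Tree], x: Sequence[float]) -> Hashable:
    """Return the class with the most votes: argmax_j sum_i I(h_i(x) = C_j)."""
    votes = Counter(tree(x) for tree in trees)
    # most_common(1) yields the label with the highest count; on a tie the
    # label seen first wins, which is an assumption made for this sketch.
    return votes.most_common(1)[0][0]


# Three toy "trees" that threshold on different features (illustrative only).
toy_trees = [
    lambda x: "attack" if x[0] > 0.5 else "normal",
    lambda x: "attack" if x[1] > 0.5 else "normal",
    lambda x: "attack" if x[0] + x[1] > 1.2 else "normal",
]

prediction = forest_predict(toy_trees, [0.9, 0.2])
```

For the sample `[0.9, 0.2]`, one toy tree votes "attack" and two vote "normal", so the forest returns "normal".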
Figure 3 shows a simplified view of classification by
Random Forests. Information gain is a commonly used
metric for deciding the splitting criteria for the various
nodes in the decision trees. The information gained from
the split of a node $S$ based on a random variable $a$ is
calculated as follows:
$$IG(S, a) = E(S) - E(S \mid a) \tag{4}$$
Here, $E(S)$ is the entropy of the parent node before the
split, and $E(S \mid a)$ is the weighted average of the
entropies of the child nodes after the split. $E(S)$ is
calculated as:
$$E(S) = -\sum_{i=1}^{C} p(c_i) \log p(c_i) \tag{5}$$
where $p(c_i)$ is the probability of a data instance in node
$S$ having class label $c_i$.
Figure 3: Classification by Random Forest.
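As a small worked illustration of Equations (4) and (5), the sketch below computes the entropy of a node and the information gain of a candidate split. The logarithm base is not stated in the text; base 2 is assumed here, and the function names are illustrative.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """E(S) = -sum_i p(c_i) * log2 p(c_i) over the classes present in S."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent: Sequence[Hashable],
                     children: Sequence[Sequence[Hashable]]) -> float:
    """IG = E(parent) - weighted average of the child entropies."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# A perfectly mixed parent node split into two pure children.
parent_labels = ["normal", "normal", "attack", "attack"]
gain = information_gain(parent_labels,
                        [["normal", "normal"], ["attack", "attack"]])
```

With two equally likely classes the parent entropy is 1 bit, both children are pure, and the gain of this split is therefore 1 bit.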
The computational complexity of the algorithm in Big-O
notation [48] is commonly denoted as $O(K \times N \times
\log(N) \times F)$, where $K$ represents the number of trees
in the forest, $N$ denotes the number of training examples,
$\log(N)$ approximates the depth of the trees, and $F$
signifies the number of features. The Random Forest algorithm
constructs numerous decision trees during the training
process. Every tree is constructed using a bootstrapped
sample of the data, and at each split in the tree, a random
selection of features is taken into account. This approach
incorporates stochasticity into the model, reducing variance
and mitigating overfitting, a prevalent concern associated
with decision trees. The final prediction of the Random
Forest is obtained by combining the predictions of the
individual decision trees, usually through majority voting
for classification problems or averaging for regression tasks
[49]. The computational time is affected by the number of
classes in the dataset, particularly during the tree
construction and voting phases. However, the order of the
overall computational complexity is determined by the number
of trees, the size of the training dataset, and the number of
features, regardless of the number of classes [49].
Enhanced Random Forest (ERF) [50]:
Researchers have proposed several enhancements to the
Random Forest algorithm to improve its performance and
versatility. The enhancements we include in our proposed
solution are the following:
Feature selection based on importance scores
[51]: Feature selection is a critical preprocessing step
in building effective Random Forest models. Each
feature’s importance is quantified in Random Forest
based on its contribution to the model’s predictive
performance. This process assigns an importance
score to each feature, reflecting its ability to discrim-
inate between classes or predict the target variable.
Features with higher importance scores are consid-
ered more influential in making predictions and are
often retained, while less important features may be
excluded from the model. Feature selection based
on importance scores reduces dimensionality and en-
hances model interpretability and training efficiency.
Random Forest models can improve accuracy and
generalization on various classification and regres-
sion tasks by focusing on the most informative fea-
tures. This technique helps identify and prioritize
relevant input variables, ensuring the model’s pre-
dictive power is harnessed effectively.
Handling imbalanced datasets [52]: Handling imbalanced
datasets is a pivotal aspect of improving the performance
of Random Forest models, especially when dealing with
real-world datasets where one class significantly
outnumbers the others. Imbalanced datasets can lead to
biased models that favor the majority class while
neglecting the minority class. Chen et al. [52] proposed
techniques to address this issue within the Random
Forest framework. One common approach is to assign
different class weights to each class, penalizing
misclassifications of the minority class more than
those of the majority class. Additionally, techniques
such as oversampling the minority class or undersampling
the majority class can be employed to balance class
distributions within each decision tree of the forest.
By doing so, Random Forest models become more adept
at recognizing patterns and making accurate predictions
for all classes, making them a robust choice for
imbalanced datasets.
Balancing class weights [53]: In the context of
Random Forest, handling imbalanced datasets is a
critical challenge. When dealing with classification
tasks where one class significantly outnumbers the
others, the model can become biased towards the
majority class, leading to poor performance in identi-
fying minority class instances. To address this issue,
the technique of balancing class weights is employed.
By assigning higher weights to minority class sam-
ples during the tree construction and voting process,
Random Forest can provide more equitable consid-
eration to all classes. This adjustment ensures that
the model is not overly influenced by the majority
class, making it more capable of detecting rare or
critical events. The approach effectively enhances
the model’s sensitivity to minority class instances,
improving its overall predictive accuracy and relia-
bility.
Hyperparameter tuning and optimization [54]:
Another crucial enhancement in the realm of Random
Forest is the systematic tuning of hyperparameters.
Fine-tuning hyperparameters such as the number of
trees, maximum depth of trees, minimum samples per
leaf, and the number of features considered at each
split can significantly impact model performance.
Optimization techniques
like grid search, random search, or Bayesian opti-
mization are employed to find the best combination
of hyperparameters. This process helps tailor the
Random Forest model to the specific characteristics
of the dataset, leading to improved accuracy and ro-
bustness.
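Two of the enhancements listed above can be illustrated with a short, dependency-free sketch: inverse-frequency class weights and threshold-based feature selection. The weighting formula used here, n_samples / (n_classes * class_count), is the common "balanced" heuristic (the one scikit-learn applies for `class_weight="balanced"`); whether the ERF in this work uses exactly this formula is an assumption. The feature names, importance scores, and threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """weight(c) = n_samples / (n_classes * count(c)); rare classes weigh more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def select_by_importance(importances: Dict[str, float],
                         threshold: float) -> List[str]:
    """Keep the features whose importance score is at least `threshold`."""
    return [name for name, score in importances.items() if score >= threshold]


# Imbalanced toy labels: 8 normal flows, 2 DDoS flows.
weights = balanced_class_weights(["normal"] * 8 + ["ddos"] * 2)

# Hypothetical importance scores, e.g. as reported by a trained forest.
kept = select_by_importance(
    {"pkt_rate": 0.50, "ttl": 0.05, "tcp_flags": 0.45}, threshold=0.10
)
```

In this toy case the minority class receives a weight of 2.5 against 0.625 for the majority class, and the low-scoring `ttl` feature is dropped.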
The optimizations described above in the context of the
proposed enhanced random forest are implemented in the
subsequent portions of this study: i) Section 5.1.1 is
dedicated to the optimization of "Feature selection based on
importance scores"; ii) Section 5.1.2 encompasses the
optimizations of "Handling imbalanced datasets" and
"Balancing class weights"; and iii) Section 5.2.3 focuses on
the optimization of "Hyperparameter tuning and optimization".
In our study, we employ the Enhanced Random For-
est (ERF) as a distinct model, specifically adapted for the
challenges in the Medical Internet of Things (MIoT). ERF,
an advanced version of the traditional Random Forest
algorithm, is optimized for MIoT applications. It
incorporates specific enhancements for handling
high-dimensional data and imbalanced datasets, which are
common in MIoT environments. These modifications include
techniques for
efficient data processing and improved learning capabili-
ties, such as feature selection, dimensionality reduction,
synthetic data generation, and weighted sampling [50, 55].
Moreover, ERF is designed to be computationally efficient
and power-conserving, addressing the resource limitations
and power constraints of MIoT devices. This is achieved
through optimizations like pruning strategies and efficient
tree construction algorithms [56, 57]. Studies in various
fields, including industrial fault classification and medi-
cal imaging, have demonstrated ERF’s effective applica-
tion, underscoring its suitability for MIoT scenarios [58].
This paper also distinctly identifies ERF and its special-
ized adaptations, highlighting its significance and appli-
cability in the MIoT context. These adaptations are not
merely general improvements but tailored to address spe-
cific MIoT challenges, enhancing the model’s effectiveness
in real-world MIoT security solutions.
In the development of our Enhanced Random Forest
(ERF) algorithm, we adhere to the standard Big-O notation
[48] for computational complexity, typically denoted as
$O(K \cdot N \cdot \log(N) \cdot F)$ for Random Forests. In
this expression, $K$ is the number of trees, $N$ the number
of training instances, $\log(N)$ the approximate depth of the
trees, and $F$ the number of features. Because $K$ and $F$
(which is 25 in our case) are constants, the resulting Big-O
notation for the ERF is $O(N \cdot \log(N))$. As an example,
according to Table 7, our ERF model, after balancing the
dataset through oversampling, trains on 19,208 instances
across 25 features, leading to a complexity of
$O(K \cdot 19208 \cdot \log(19208) \cdot 25)$, which
simplifies to $O(19208 \cdot \log(19208))$. While the model
handles five different classes, this factor primarily
influences computational time rather than theoretical
complexity, which is governed by the forest size, training
set size, and feature count.
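The simplification above can be checked numerically. The sketch below evaluates the K * N * log(N) * F estimate for the quoted values N = 19,208 and F = 25. The forest size K = 100 and the base-2 logarithm are assumptions made only for this illustration; neither is taken from the paper.

```python
import math


def rf_training_cost(n_trees: int, n_samples: int, n_features: int) -> float:
    """Operation-count estimate K * N * log2(N) * F (constant factors ignored)."""
    return n_trees * n_samples * math.log2(n_samples) * n_features


N, F = 19208, 25   # values quoted in the text
K = 100            # hypothetical forest size, not taken from the paper

full_estimate = rf_training_cost(K, N, F)
dominant_term = N * math.log2(N)          # the O(N log N) part
constant_factor = full_estimate / dominant_term
```

The ratio `constant_factor` equals K * F, which is why fixing the forest size and the feature count leaves N * log(N) as the only term that grows with the dataset.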
2.2.6. Federated Learning
Federated Learning (FL), commonly called collaborative
learning, is a machine learning methodology that enables
multiple users to collaboratively train a model on several
decentralized edge devices or servers without disclosing or
transmitting their local data samples. This strategy
contrasts with traditional centralized machine learning
techniques, in which all local datasets are uploaded to a
single server, and with conventional decentralized
approaches, which normally presume that local data samples
are identically distributed. FL thus allows multiple entities
to construct a consistent and resilient machine learning
model without exchanging data. This approach effectively
tackles significant concerns, including privacy, security,
access rights, and access to heterogeneous data [59, 60]. The
following types of FL exist:
Federated centralized Learning: In the context of
federated learning, a centralized approach involves
utilizing a central server to effectively manage and
synchronize all nodes involved in the learning pro-
cess. During the initial stage of the training process,
the server assumes the responsibility of selecting the
nodes and consolidating the received model modifi-
cations. The potential for the server to become the
system’s bottleneck arises from requiring all selected
nodes to update a single entity [61].
Federated decentralized learning: In a decentralized,
federated learning (FL) environment, the nodes can
coordinate themselves to obtain the global model au-
tonomously. This configuration mitigates the risk
of single-point failures as model changes are exclu-
sively transmitted to networked nodes, hence remov-
ing the central server. Nevertheless, the efficiency of
the learning process may be hindered by the network
structure [61].
Federated heterogeneous learning: Various application
industries make use of heterogeneous clients, including
mobile phones and Internet of Things (IoT) devices.
Most contemporary FL methodologies operate under the
assumption that the local and global model architectures
are identical; heterogeneous FL relaxes this assumption
[62].
Note that in our approach, we use Federated Hetero-
geneous Learning.
Federated learning Parameters
In this section, we list the parameters that control and
optimize the FL process:
Number of rounds of federated learning: R
Total number of nodes utilized: N
Fraction of nodes utilized in each iteration: FN
Local batch size utilized during each iteration of
learning: BS
Number of local training iterations before pooling: I
Local learning rate: η
These settings must be optimized based on the machine
learning application's restrictions (e.g., available
computing power, memory, bandwidth).
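The parameters listed above can be grouped into a single configuration object. The sketch below is illustrative only: the class name, field names, and example values are hypothetical, and rounding the number of participating nodes down (with a minimum of one) is an assumption rather than a rule stated in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FLConfig:
    rounds: int            # R: number of federated learning rounds
    num_nodes: int         # N: total number of nodes
    node_fraction: float   # FN: fraction of nodes used in each iteration
    batch_size: int        # BS: local batch size
    local_iterations: int  # I: local training iterations before pooling
    learning_rate: float   # eta: local learning rate

    def nodes_per_round(self) -> int:
        """Number of nodes that take part in one round (at least one)."""
        return max(1, math.floor(self.node_fraction * self.num_nodes))


# Example values chosen only for illustration.
config = FLConfig(rounds=10, num_nodes=8, node_fraction=0.5,
                  batch_size=32, local_iterations=5, learning_rate=0.01)
participants = config.nodes_per_round()
```

Keeping the settings in one immutable object makes it easier to tune them together against the device constraints mentioned above.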
Federated learning Aggregation Method
In federated learning, the aggregation method is crucial in
combining the model updates from different devices while
preserving privacy and ensuring model convergence. Several
aggregation methods are used in federated learning, and
the best choice depends on the specific application and
requirements. According to the literature, the following
aggregation methods exist:
Federated Averaging (FedAvg) [63]: Federated
averaging is a highly prevalent aggregation technique
in federated learning. In this methodology, each device
updates its local model using its own data and then
transmits the update to the central server. Subsequently,
the server computes the weighted average of the model
updates, thereby generating a global model. The
aggregation can be described mathematically as follows:
$$\theta_{\text{global}} = \frac{1}{N} \sum_{i=1}^{N} w_i \cdot \theta^{i}_{\text{local}} \tag{6}$$
where $\theta_{\text{global}}$ is the global model, $N$ is
the number of devices, $w_i$ is the weight assigned to
device $i$ (usually based on the amount of data or device
reliability), and $\theta^{i}_{\text{local}}$ are the local
model updates.
Federated Learning with Secure Aggregation
(FedSecAgg) [64]: Secure aggregation methods are
utilized in situations where maintaining privacy and
security are of utmost importance. The FedSecAgg
framework employs cryptographic methods, such as
secure multi-party computation (MPC), to combine
model updates while maintaining anonymity. Every
individual device employs encryption to secure its
model update, ensuring that the server cannot view
the raw updates while performing the aggregation
process. This measure guarantees the confidentiality of
the individual updates.
Federated Quantization [65]: In federated quan-
tization, the process involves initially quantizing or
compressing the model updates at the local devices
to minimize the communication cost. The quantized
changes are subsequently transmitted to the central
server, where they are consolidated and used to recreate
the global model. This methodology effectively mit-
igates communication expenses while upholding a
satisfactory level of model precision.
Personalization and Differential Privacy [66]:
Some federated learning applications require person-
alized models for each device while ensuring differ-
ential privacy. In such cases, aggregation methods
need to balance personalization and privacy. Aggre-
gation techniques that allow for customization of the
global model while incorporating privacy-preserving
mechanisms are employed.
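The aggregation rule of Equation (6) in the Federated Averaging item above can be sketched as follows. This is a minimal illustration under a simplifying assumption: each local update is treated as a flat list of numbers, which is not how the tree-based model in this work is exchanged. The function name is hypothetical.

```python
from typing import List, Sequence


def aggregate(local_updates: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """theta_global = (1/N) * sum_i w_i * theta_local_i, element by element."""
    n = len(local_updates)
    if n == 0 or n != len(weights):
        raise ValueError("need one weight per local update")
    dim = len(local_updates[0])
    return [
        sum(w * update[j] for w, update in zip(weights, local_updates)) / n
        for j in range(dim)
    ]


# Two devices with equal weights: the result is the plain mean.
theta_global = aggregate([[1.0, 2.0], [3.0, 4.0]], weights=[1.0, 1.0])
```

With all weights equal to 1 the rule reduces to the arithmetic mean of the updates; larger weights let devices with more data or higher reliability contribute more.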
In realizing our approach, we used the "Federated Learning
with Secure Aggregation (FedSecAgg)" aggregation method
due to the security it provides through encryption, private
key cryptography, and digital signatures.
Iterate through the Decision Trees under Enhanced Random Forests using Depth First Search for Federated Learning Delta Calculation
Depth-First Search (DFS) is an algorithm for traversing
trees and graphs, which are commonly used data structures.
It can be implemented easily using recursion and basic data
structures such as dictionaries and sets. The DFS algorithm
proceeds as follows: i) Select a node from the given set of
nodes. ii) If the selected node has not been visited, mark
it as visited. iii) Recursively apply the same process to
all neighboring nodes of the selected node. iv) Repeat steps
ii) and iii) until all nodes have been visited or the
desired node has been found.
The time complexity in Big-O notation [48] of Depth-First
Search (DFS) on a graph is commonly expressed as O(V+E),
where V represents the number of vertices and E represents
the number of edges. On a tree, the time complexity of DFS
is O(N), with N representing the total number of nodes
within the tree. This bound relies on the fact that the
average time complexity of a set insertion or membership
operation is O(1). If a list were used instead to track the
visited nodes, each membership check would take linear time
and the overall cost would be higher. Therefore, the overall
computational complexity can be expressed as O(n).
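The recursive procedure and the role of the visited set can be sketched as follows. The adjacency-dictionary representation and the example graph are assumptions made for illustration; they are not the data structures used in the paper's implementation.

```python
from typing import Dict, Hashable, List, Optional, Set

# Adjacency representation: node -> list of neighbouring nodes.
Graph = Dict[Hashable, List[Hashable]]


def dfs(graph: Graph, node: Hashable,
        visited: Optional[Set[Hashable]] = None,
        order: Optional[List[Hashable]] = None) -> List[Hashable]:
    """Visit `node`, then recurse into every neighbour not yet visited."""
    if visited is None:
        visited = set()
    if order is None:
        order = []
    visited.add(node)          # average O(1) insertion into the set
    order.append(node)
    for neighbour in graph.get(node, []):
        if neighbour not in visited:   # average O(1) membership test
            dfs(graph, neighbour, visited, order)
    return order


example_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
visit_order = dfs(example_graph, "A")
```

Starting from "A", the search goes as deep as possible through "B" to "D" before backtracking to "C"; because "D" is already in the visited set, it is not processed a second time.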
The result of a Depth-First Search (DFS) on a tree is a
sequentially arranged collection of vertices, the specific
order of which depends on how the DFS algorithm is
implemented. In other words, the depth-first search
algorithm can be employed to arrange the vertices of a graph
or tree in a linear order. Four distinct orderings can be
obtained in this way:
Preordering: A preordering is a list of the vertices
in the order in which the depth-first search first
visited them. This is a concise and natural way to
describe the progress of the search. A preordering of
an expression tree is the expression in Polish notation.
Post-order: A post-ordering is a list of the vertices in
the order in which the algorithm last visited them. A
post-ordering of an expression tree is the expression
in reverse Polish notation.
Reverse Preordering: A reverse preordering is the
opposite of a preordering: a list of the vertices in the
reverse order of their first visit. Reverse preordering
is not the same as post-ordering.
Reverse Post-order: A reverse post-ordering is
the opposite of a post-ordering: a list of the vertices
in the reverse order of their last visit. Reverse
post-ordering is not the same as preordering.
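The four orderings above can be produced by a single traversal, as the following sketch shows. The child-list representation of the tree and the function name are illustrative assumptions.

```python
from typing import Dict, Hashable, List


def dfs_orderings(tree: Dict[Hashable, List[Hashable]],
                  root: Hashable) -> Dict[str, List[Hashable]]:
    """Collect pre-order and post-order lists in one depth-first traversal."""
    pre: List[Hashable] = []
    post: List[Hashable] = []

    def visit(node: Hashable) -> None:
        pre.append(node)                 # first visit of the node
        for child in tree.get(node, []):
            visit(child)
        post.append(node)                # last visit of the node

    visit(root)
    return {
        "preorder": pre,
        "postorder": post,
        "reverse_preorder": pre[::-1],
        "reverse_postorder": post[::-1],
    }


# Small tree: A has children B and C, and B has child D.
orders = dfs_orderings({"A": ["B", "C"], "B": ["D"]}, "A")
```

For this tree the preordering is A, B, D, C while the reverse post-ordering is A, C, B, D, which illustrates that the two are generally different lists.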
The DFS algorithm is used to calculate the difference
between the old and the new Enhanced Random Forest model by
subtracting the traversal result set of the old model from
that of the new model. FL then uses this difference to
inform the rest of the devices of the delta changes through
the cloud model (more details can be found in Section 4.3).
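The set-difference idea can be sketched as follows. The nested-dictionary tree format, the path encoding ("L"/"R"), and the feature names are hypothetical choices made for this illustration; the actual model representation used for the delta is described in Section 4.3 and may differ.

```python
from typing import Any, Dict, List, Set, Tuple

Node = Dict[str, Any]            # hypothetical nested-dict tree format
NodeKey = Tuple[str, str, Any]   # (path from root, feature or "leaf", value)


def collect_nodes(node: Node, path: str = "") -> Set[NodeKey]:
    """Depth-first traversal of one tree; every node becomes a hashable key."""
    if "label" in node:                                  # leaf node
        return {(path, "leaf", node["label"])}
    here = {(path, node["feature"], node["threshold"])}  # internal node
    return (here
            | collect_nodes(node["left"], path + "L")
            | collect_nodes(node["right"], path + "R"))


def model_delta(old_trees: List[Node],
                new_trees: List[Node]) -> Dict[int, Set[NodeKey]]:
    """Per tree index, the nodes of the new model missing from the old one."""
    delta: Dict[int, Set[NodeKey]] = {}
    for idx, new_tree in enumerate(new_trees):
        old_nodes = collect_nodes(old_trees[idx]) if idx < len(old_trees) else set()
        added = collect_nodes(new_tree) - old_nodes
        if added:
            delta[idx] = added
    return delta


old_model = [{"feature": "pkt_rate", "threshold": 100,
              "left": {"label": "normal"}, "right": {"label": "ddos"}}]
new_model = [{"feature": "pkt_rate", "threshold": 100,
              "left": {"label": "normal"},
              "right": {"feature": "ttl", "threshold": 32,
                        "left": {"label": "ddos"}, "right": {"label": "mitm"}}}]
delta = model_delta(old_model, new_model)
```

Only the three nodes under the right branch appear in the delta, so a gateway would need to receive just those changes rather than the whole forest.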
2.2.7. Intrusion Detection
Intrusion Detection detects rare or abnormal events in
a network by monitoring its traffic and hosts. An intru-
sion detection system (IDS) attempts to detect malicious
activity an attacker generates on an organization’s infor-
mation technology infrastructure. In general, intrusions
attempt to gain unauthorized access to a device or system
or to cause a denial of service attack on it [67]. An IDS
could be network-based (NIDS) or host-based (HIDS). A
network intrusion detection system (NIDS) monitors only
network traffic and analyses packet headers, payloads, and
statistics to detect malicious activity. On the other hand,
a HIDS is installed on a device and monitors host traces
(file system changes, system calls, running processes, and
so forth) and network traffic to detect abnormal behav-
ior. When the analysis engine detects an intrusion at-
tempt, the IDS records pertinent investigation data and
notifies security analysts. Three primary techniques for
detecting intrusions are signature-based, anomaly-based,
and specification-based. A brief description of the detec-
tion methods is shown below:
Signature-based intrusion detection systems (IDS)
detect network attacks by looking for known patterns.
This makes them vulnerable to zero-day attacks, and
they adapt poorly to the characteristics of individual
devices.
Anomaly-based methods use machine learning techniques
to build a model of the system's normal behavior and
then use that model to detect deviations from it.
A hybrid of these two approaches (named specification-
based intrusion detection) is demonstrated in [68], in
which manually specified behavioral program speci-
fications are used to detect attacks. This approach
has been proposed as a promising alternative that
combines the strengths of misuse detection (accu-
rate detection of known attacks) and anomaly de-
tection (ability to detect novel attacks). One of the
main benefits of this hybrid approach is that it bal-
ances the storage costs of signatures with the inten-
sive computational tasks that come with learning-
based methods.
An IDS for IoT networks can be deployed in two ways:
connected to the gateway router or embedded in each IoT
device. The benefit of embedding the IDS in the gateway
router is that it enables centralized detection and manage-
ment [68, 12]. Internet-based attacks on IoT devices could
be detected at a single point and the most basic level. The
downside is that it slows down communication between
IoT devices and the gateway because the IoT IDS has to
check on the network states of the devices constantly.
By embedding the IDS in the IoT nodes, communica-
tion overhead is avoided [69]. However, it consumes the
resources (processing, memory, and power) generally asso-
ciated with low-energy devices [70]. This may be impracti-
cal in many instances. However, distributing IDS sensors
across the IoT network on a few dedicated devices may
make this approach feasible. Still, the network architec-
ture must be altered to allow devices to communicate with
the dedicated nodes [14, 21].
3. System Model
This section provides our approach’s system descrip-
tion, design, and model.
3.1. MIoT System
The MIoT system under consideration in this research
article comprises the components depicted in Figure 4. Note
that Figure 4 represents a part of the system, specifically
the components that are in the patient room. The diagram
depicts a standardized MIoT (Medical Internet of Things)
system, wherein a selection of sensors is connected to a
computer and linked to the MIoT gateway via that computer.
Furthermore, certain sensors are directly linked to the MIoT
gateway. It should be noted that many communication
protocols for the Internet of Things (IoT), including
Bluetooth, Bluetooth Mesh, ZigBee, ZigBee Mesh, LoRa, and
LoRaWAN, can be employed to establish connectivity between
the devices and the gateway. A Raspberry Pi was configured
as an access point for the MIoT devices using the IP
protocol over WiFi Direct. The diagram illustrates how
healthcare sensors gather various physiological data,
including heart rate, blood pressure, and body temperature.
These data are then transmitted via an IoT gateway device to
the cloud, where they undergo processing and analysis. One
potential use for this system is remote patient monitoring,
wherein health metrics for a particular patient are
periodically transmitted to the devices of the respective
healthcare experts. As discussed in Section 2.2, MQTT is a
widely used application-layer protocol for collecting and
transmitting healthcare sensor data. It operates on a
publish-subscribe architecture. Owing to the inherent
sensitivity of healthcare data, the health parameters are
encrypted so that they can be decrypted only by the device
or system employed by the healthcare professional
responsible for the patient's care. Furthermore, the
classification model updates and both the Intrusion
Detection System (IDS) and Federated Learning (FL) signals
are encrypted. Consequently, only the intended recipient can
decipher the encrypted messages and model modifications.
This procedure may be carried out by employing public key
cryptography as the encryption mechanism and using the
Certificate Authority server to distribute the correct
public keys of the sinkholes and the cloud server.
Figure 4: MIoT System Model
3.2. Attack Model
Intruding into an MIoT system is a pivotal issue, as IoT
systems are resource-constrained and have a large attack
surface. An MIoT system must have a protective layer to
avoid malicious attacks. An intruder typically follows the
methodology below to achieve malicious goals against an IoT
system. The following subsections describe an attacker's
steps to attack an MIoT system.
3.2.1. System Reconnaissance:
This is the first stage that involves collecting data pri-
marily about devices in the target IoT system. The at-
tacker may require information about the hardware, IoT
providers, and even crucial telemetry data that may be
obtained from many sources. It is also important to know
which services are available in the system; this can be
determined with techniques such as port scanning.
3.2.2. Cyber-attack design:
In this stage, the tools best suited for the attack on
the IoT system are carefully selected based on the
information obtained in the previous stage. Depending on the
attack's needs, the attacker may use a mix of these tools to
form a base plan for the attack strategy.
3.2.3. Entry into the system:
This is the stage at which the attacker launches the at-
tacks using the tools selected in the previous stage, gaining
access to the system by exploiting vulnerabilities in the
target IoT system.
3.2.4. Implementation:
The attacker takes control of the IoT system using the
earlier strategy and inflicts the planned damage. The attack
must be sustained without losing access to the system, and
it lays the groundwork for future intrusions. Table 4
represents the four-stage attack methodology.
Table 4: Attack Methodology
1. System Reconnaissance: Collect information; inspect target
2. Cyber-Attack Design: Devise plan of action; choose correct tools
3. Entry into System: Intrude IoT system (physical/remote)
4. Implementation: Sustain attack; pave more attacks
In the case of an MIoT system consisting of the
components described in the previous subsection, attacks
could have the purpose of making the system unavailable
through power drainage of devices or denial of service (DoS)
attacks, capturing/modifying sensitive health data, and
accessing the system components in an unauthorized manner.
Our focus in this work is mainly on the first-stage and
third-stage attacks for intrusion detection. Below, we
provide a brief description of each attack type considered.
Note that network scanning (NMAP)/reconnaissance attacks
come under "system reconnaissance" (they are considered
attacks because they can give valuable information to the
attacker), and DDoS, MitM, and brute force authentication
attacks come under "entry into the system".
1. Distributed Denial of Service (DDoS) attack
[71]: DDoS attacks prevent legitimate users from ac-
cessing a system by flooding requests from multiple
systems (attackers) to a single targeted system. The
attacker tries to get the IP of the targeted system by
using various powerful tools. When this attack is at-
tempted successfully, the number of requests exceeds
the system’s capacity. As a result, users experience
a delayed response, or sometimes the system hangs
abruptly.
2. Man-in-the-Middle (MitM) attack [72]: MitM
is an attack in which the attacker is positioned
between the user and the system. The data flow seems
normal, but the attacker can view the data passing to
the cloud. This attack aims to sniff the data passing
between the system and the cloud. The data obtained
during this attack can be used in many ways, e.g., fund
transfers, password exchange, and data interception.
3. Brute Force Authentication attack: This is a
password guessing attack. The attack is an attempt
to find the credentials of a legitimate user by system-
atically trying out all the possibilities. The possibil-
ities depend on the length and the complexity of the
password. The targets are the devices that require
authentication. An attacker can also perform a
dictionary attack using a list of commonly used
passwords that the user/system might use.
4. Network scanning/reconnaissance attack [73]:
A network mapper (NMAP) is a powerful tool at-
tackers use to discover network information about
a system. It helps discover hosts and services by
sending packets and analyzing the responses. The
IP addresses and other essential data are obtained
from the responses to NMAP packets. Different
protocols are used to send the probes, including
Transmission Control Protocol (TCP), User Datagram
Protocol (UDP), Stream Control Transmission Protocol
(SCTP), and Internet Control Message Protocol (ICMP).
For us, such scans are considered attacks because they
can give valuable information to the attacker, and
they must be stopped by the IDS.
4. Methodology of the Proposed Approach
This section describes our proposed IDS overall ap-
proach for the MIoT system discussed in Section 3 along
with the assumptions of the investigation. Additionally,
we provide an algorithm for the MIoT Enhanced Random
Forest (ERF) system that shows the components'
interactions. This work focuses on detecting intrusions
against the gateway device through the network
and updating the model (for classification) using Federated
Learning. Intrusions on the cloud components for reporting
and data storage are outside the scope of this work and will
be addressed in future work. Therefore, the intrusion
detection algorithms are run at the gateway device, which
acts as the sinkhole. Figure 5 shows a high-level view of
the proposed end-to-end intrusion detection approach. A
three-step Intrusion Detection System (IDS) methodology is
proposed. Firstly, this study proposes a machine learning
(ML) approach to identify network threats using
classification techniques on MIoT gateways/sinkholes.
Secondly, the process involves the identification of unknown
attacks and the learning of these attacks by the IDS through
retraining the classification model with the corresponding
attack records. Finally, the remote cloud server and the
other sinkholes are continuously updated using Federated
Learning techniques. The cloud Enhanced Random Forest
machine learning model updates the continuously trained
sinkholes on the local area network (LAN). Conversely, a
sinkhole that identifies a new attack pushes its model
update to the cloud model. More specifically, Federated
Learning (FL) exploits the knowledge that our classification
model is an Enhanced Random Forest (also called ensemble
random forest): as novel attacks are identified, the changes
in the nodes and trees of the updated model are incorporated
into the primary cloud ERF model. The cloud server
subsequently informs the other MIoT gateways about the tree
structure changes (we call them delta changes) so that they
remain consistent with the new model and can identify the
new attack. This study aims to utilize cloud-based transfer
learning to enhance and update our models for specific
sinkholes or all sinkholes. This approach will enable us
18
Federated Learning
Federated Learning
Federated Learning
Federated Learning
MIoT GateWay/ Anomaly Detector / Attack
Classifier and WiFi Direct Router
The Threat
Intelligence
Federated Learning
Server
Certificate Authority
Certificate
Figure 5: Architecture of the End-to-End Intrusion Detection Approach
to incorporate more attack scenarios and accomplish real-
time model optimization. To enhance the precision of our
model assessment, we perform our tests within our emu-
lation environment (as described in Section 4), utilizing a
cloud server that employs the currently enabled model at
the sinkholes of several MIoT networks.
Accordingly, the primary aim of this research is to demonstrate the feasibility of classifying numerous attacks using the Random Forest approach, as indicated in Section 5. Second, classification using ERF and detection of anomalies (attacks) using One-Class SVM can be achieved in a power-, memory-, and CPU-efficient manner, as demonstrated in Section 5. Third, federated learning (FL) is shown to be both practical and efficient, as seen in Section 5. This approach enables MIoT gateways and cloud servers to undergo continuous training in a mutually beneficial manner. For instance, a cloud server can be trained with new datasets in addition to the existing ones, creating a new model immediately. The differences between the new and old models can be calculated, and the updated model can be shared and transferred to the network sinkholes.
In this study, we evaluate various classification techniques, namely Support Vector Machine, Naive Bayes, K-Nearest Neighbours, Decision Tree, and Random Forest, to classify different types of attacks, including MitM, DDoS, Brute Force Authentication, and NMAP identification (we consider an NMAP scan an attack in our investigation). Additionally, we complement these well-established machine learning techniques (Support Vector Machine, Naive Bayes, K-Nearest Neighbours, Decision Tree, and Enhanced Random Forest) with One-Class SVM, which identifies unknown attacks, to achieve high accuracy in the identification process. Moreover, the Enhanced Random Forest machine learning model was the most efficient classifier in terms of power, CPU, and memory usage, with an accuracy rate of 99.8%. Furthermore, to discover anomalies belonging to a single class, we employed the One-Class Support Vector Machine (SVM) algorithm, which has been widely recognized as the most effective technique for anomaly detection, as reported in the literature [38, 39, 40]. Our findings also demonstrate that the One-Class SVM strategy for unknown attack identification is among the most effective methods for conserving power, CPU use, and memory, while achieving an identification accuracy of 99.7%.
This paper presents a novel approach for designing a feasible MIoT IDS that considers various requirements such as power consumption, delay constraints, and high accuracy. The proposed approach utilizes an FL green machine learning-based technique, which leverages One-Class SVM for anomaly/attack detection and employs an Enhanced Random Forest model for attack classification at the sinkhole. Subsequently, the central Enhanced Random Forest model is updated in the cloud by incorporating the differences between Enhanced Random Forests, such as intermediate nodes and leaves. This process also extends to updating the remaining MIoT sinkhole(s) and can be executed in both directions. In certain scenarios, the cloud model is updated first, followed by the updates to the remaining MIoT components.
The proposed intrusion detection approach consists of the main components described below:
• The GateWay IDS/MIoT GateWay (GW) is responsible for running the intrusion detection algorithms on the network flows resulting from the connections established with the cloud and for sending anomalous data (the delta of the classification model) to the cloud threat intelligence server.
• The Attack Classifier is an ML model, running in the gateway IDS, that determines the classification of network flows, including their attack classes.
• The Anomaly/Identification Detector is an ML model, running in the gateway IDS, that identifies/detects anomalies in received network flows.
• The Threat Intelligence Federated Learning Server (the cloud server model) acts as a central point of intelligence collection, providing known/unknown attack information to the system admins (via MQTT). It gathers the classification model deltas that the IDS gateways calculate from anomalous data, and it updates its local classification model for the cloud IDS. It also communicates with the rest of the gateway IDSs connected to it so that they update their attack classification model with newly discovered attack data using the provided delta. Note that the intelligence server digitally signs the deltas (calculated from the updated model) before sending them to the rest of the gateway IDSs.
Upon receipt of any network flow at the gateway, the following steps are taken to detect intrusions (as shown in Figure 6):
1. The received network traffic is analyzed to extract the features required by the machine learning model. Note that anomaly identification and attack classification operate not on the PCAP files themselves but on the CSV feature data converted from them. Therefore, within the gateway Intrusion Detection System (IDS), a service continuously uses the "CICFlowMeter tool" to transform Wireshark PCAP files into feature records.
2. The extracted features are input to the anomaly detector to determine whether the traffic is normal or an instance of a known anomaly.
   (a) The attack classification model is run on the network flow to determine the attack's class.
   (b) If the network flow is classified as an attack, the gateway informs the Threat Intelligence Federated Learning Server.
       1. The system encrypts the data intended for transmission to the intelligence database, which will serve as a future source of IT-related information regarding past, current, and potential attacks. Before encryption, the system removes the identifying fields associated with private data, such as the Internet Protocol (IP) addresses of user devices, to safeguard user privacy. The sanitized records are then forwarded to the Threat Intelligence Federated Learning Server.
   (c) Alternatively, if the network flow is not categorized as an attack but is classified as an anomaly, and hence an unknown attack, the approach updates its classification model. This update is achieved through federated learning, wherein the differences in the Enhanced Random Forests are transmitted to the remaining models, as depicted in Figure 7. More specifically, our proposed approach executes the following:
       1. It identifies the records related to the anomaly and their associated features. Then, the system trains the classification model with the new records according to the executed feature selection.
       2. It compares the resulting attack classification model with the old model (saved on the disk of the MIoT gateway) and calculates the differences between them. It also overwrites the old model on the disk with the current model.
       3. It sends the calculated differences between the old model saved on the disk of the MIoT gateway and the running model to the cloud model.
       4. The cloud server model (cloud threat intelligence server) is updated with the new changes and replaces its old model on the disk with the new model.
       5. The cloud server model (cloud threat intelligence server) sends the model differences to the rest of the MIoT gateways.
       6. Each MIoT gateway that receives the model differences updates its model with the new changes, replacing its old model on the disk with the new model.
       7. The IoT gateway that identified the unknown attack removes private data from the flow to preserve user privacy. That is, it removes private information (e.g., the Internet Protocol (IP) addresses of user devices) from the identifying fields of the feature records. Afterward, it sends the flow records to the Threat Intelligence Federated Learning Server.
   (d) If an anomaly or attack was detected in the previous steps, the IDS raises the alarm and takes appropriate mitigation action.
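The privacy step in the list above (removing identifying fields such as IP addresses before records leave the gateway) can be illustrated with a minimal sketch. The column names `src_ip` and `dst_ip` and the helper names are assumptions made for illustration only; the paper does not specify which columns are dropped beyond naming IP addresses as an example.

```python
import csv
import io

# Hypothetical names of identifying columns; the paper only mentions
# IP addresses of user devices as an example of private data.
PRIVATE_COLUMNS = {"src_ip", "dst_ip"}


def strip_private_fields(record: dict) -> dict:
    """Return a copy of a flow record without the identifying columns."""
    return {k: v for k, v in record.items() if k not in PRIVATE_COLUMNS}


def sanitize_csv(csv_text: str) -> list:
    """Parse CSV feature records and drop private columns from each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [strip_private_fields(row) for row in reader]


# Toy feature record (values are illustrative, not measured data).
sample = "src_ip,dst_ip,flow_byts_s,flow_pkts_s\n10.0.0.5,10.0.0.1,1200.0,8.0\n"
clean = sanitize_csv(sample)
print(clean)
```

In a deployment along the lines described above, the sanitized records would then be encrypted and signed before transmission; that part is outside this sketch.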
[Figure 6: The Flowchart of the proposed algorithm. Boxes include: Start; Read PCAP; CICFlowMeter; Extract CSV Features; Classification Process; "If Classifier identified attack" (YES/NO); Arrange Private Data and Record Attack; Identification Process; "If Anomaly Records Identified" (YES/NO); Calculate New Classification Model; Read Running and old Models; Calculates Differences among old and new Classification Model (Deltas); Federated Learning: Inform Cloud and other GateWays.]
4.1. ERP-Based Intrusion Detection
Algorithm 1, the ERP algorithm, orchestrates our approach: the Enterprise Resource Planning system integrates the various MIoT network components to strengthen the security framework. Utilizing Federated Learning, the ERP system consolidates threat intelligence and coordinates between MIoT gateways, anomaly detection systems, attack classifiers, and a central threat intelligence server.
Consequently, the threat intelligence server enables the ongoing update of the machine learning models at the gateway intrusion detection systems (IDSs) by incorporating data obtained from various gateways. This process is particularly relevant for anomaly detection, as it keeps the linked IDSs informed about emerging attacks so that they can implement suitable mitigation measures.
Algorithm 1: ERP algorithm for the Federated Learning-based IDS for MIoT Network Security Hardening
Input:  Network traffic flows from MIoT Gateways
        Current classification and anomaly detection models
        Cloud threat intelligence database (Intelligence db)
        Federated Learning Server (FL Server)
Output: Updated classification and anomaly detection models
        Alarms for identified threats
        Encrypted and privacy-ensured threat data for Intelligence db
// Feature Extraction:
Convert incoming network traffic (PCAP) to feature records (CSV) using CICFlowMeter
Extract necessary features for the machine learning models
// Anomaly Detection and Attack Classification:
Input extracted features to the anomaly detector
if traffic is normal then
    return
else
    Classify the traffic flow using the attack classification model
// Model Update and Threat Intelligence Sharing:
if flow is classified as an attack then
    Encrypt and transmit attack data to the FL Server
else
    Train the classification model with new records
    Compare the new model to the old model and calculate deltas
    Update the model on the MIoT Gateway
    Transmit model deltas to the FL Server
    FL Server updates its model and transmits deltas to other MIoT Gateways
    Other MIoT Gateways update their models with new deltas
// Response and Mitigation:
if an anomaly or attack is detected then
    IDS raises an alarm
    Perform mitigation actions
else
    // Continue monitoring
It should be noted that the data transmitted between
the MIoT gateways and the threat intelligence server is
subject to encryption and digital signing. This process
involves the utilization of pre-assigned digital signatures,
which were allocated to the devices during their configu-
ration by the Certificate Authority (CA).
The proposed IDS approach is secure against the attacks listed below that could be launched against it.
• Model poisoning attacks, which may be launched by malicious gateways attempting to provide falsified data to the central intelligence server, are thwarted using digital signatures (issued through a Certificate Authority) on the attack data being transmitted. Any malicious gateways can be identified through their signatures and subsequently disconnected by the server.
• Rogue intelligence servers that might send tainted models to the gateways are countered by digital signatures on the models they transmit. Gateways verify these signatures before integrating the models into their Intrusion Detection Systems (IDS).
• Denial of Service (DoS) and power drainage attacks, aimed at reducing the availability of the gateway, are detected using DoS/DDoS detection algorithms running within the IDS's attack classifier.
• Data leakage attacks, which intend to steal privacy-sensitive data from the gateways, are thwarted by obfuscating sensitive feature values like IP addresses before transmitting attack data to the cloud. Health-related data are sent separately in encrypted form to the relevant server. Furthermore, any changes to the classification model, as well as the IDS and Federated Learning messages, are transmitted separately in encrypted form to the corresponding cloud server or sinkholes.
4.2. Assumptions of the Proposed Approach
This section states the assumptions of our investigation:
• This work does not investigate the proposed IDS's reporting methods using the MQTT protocol; this will be a future focus of the investigation.
• This work does not investigate the security among the suggested IDS components, and that security is not guaranteed; this will be a future focus of the investigation. For the current investigation, security is enforced via public-key encryption mechanisms and the Certificate Authority.
• Data storage handling, in terms of the type of storage needed and the speed of storage (Serial Advanced Technology Attachment (SATA), Solid State Drives (SSD)), will be discussed in future work.
4.3. Algorithm of the Proposed Approach
We demonstrate the proposed approach steps in Alg.
2. This algorithm highlights the novelty and contributions
of the proposed investigation.
Algorithm 2: GEMLIDS-MIOT Algorithm
Result: Keeping the MIoT network secure
RUNNING_MODEL: The running model used by the classification algorithm, saved in memory
PCAP: The PCAP files generated per second
CICFlowMeter: A method that converts the PCAP files to feature records (as shown in Section 5.1)
FeaturesRecords: The converted feature records for model training and anomaly detection
RecordsIdentified: The feature records identified with anomalies
ANOMALY_DETECTED: The anomaly detector that identifies an anomaly and returns the RecordsIdentified using the ML approach (as shown in Section 4.4.2)
FederatedLearningExecution: The FL algorithm (as shown in Section 5.3)
AttackClassificationML: The attack classification ML approaches (as shown in Section 4.4)
ClassificationResult: The classification result from the ML approaches; it is a numerical result that relates each category to a number, where zero indicates no classification
while true do
    FeaturesRecords = CICFlowMeter(PCAP);
    ClassificationResult = AttackClassificationML(FeaturesRecords);
    if (count(ClassificationResult) ≥ 1) then
        Inform the Central System;
    else
        RecordsIdentified = ANOMALY_DETECTED(FeaturesRecords);
        if (count(RecordsIdentified) ≥ 1) then
            FederatedLearningExecution(RecordsIdentified, SAVED_MODEL, RUNNING_MODEL);
            Inform the Central System;
        end
    end
end
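As a rough illustration of the control flow in Algorithm 2, the following sketch processes one batch of feature records. It is only a sketch under stated assumptions: the classifier, anomaly detector, federated update, and reporting hooks are passed in as plain callables with hypothetical signatures, and the toy lambdas at the end stand in for the trained models.

```python
from typing import Callable


def process_batch(
    records: list,
    classify: Callable[[dict], int],
    is_anomaly: Callable[[dict], bool],
    run_federated_update: Callable[[list], None],
    inform_central: Callable[[str, list], None],
) -> str:
    """Handle one batch of feature records; returns the outcome label."""
    # Stage 1: known-attack classification (0 means "no classification").
    attacks = [r for r in records if classify(r) != 0]
    if len(attacks) >= 1:
        inform_central("attack", attacks)
        return "attack"

    # Stage 2: unknown-attack (anomaly) identification.
    anomalies = [r for r in records if is_anomaly(r)]
    if len(anomalies) >= 1:
        # Stage 3: retrain, compute deltas, and share them (delegated).
        run_federated_update(anomalies)
        inform_central("anomaly", anomalies)
        return "anomaly"
    return "normal"


# Toy stand-ins for the trained models and the reporting channel.
events = []
result = process_batch(
    [{"flow_pkts_s": 9000.0}],
    classify=lambda r: 0,
    is_anomaly=lambda r: r["flow_pkts_s"] > 5000,
    run_federated_update=lambda recs: events.append(("fl", len(recs))),
    inform_central=lambda kind, recs: events.append((kind, len(recs))),
)
print(result, events)
```

Passing the stages in as callables keeps the loop body independent of any particular ML library, which makes the branching logic easy to test in isolation.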
4.4. Machine Learning Approaches used in the MIoT in-
vestigation based Intrusion Detection
This section provides insights into the proposed ML models and how they are optimized to achieve better accuracy.
The proposed IDS utilizes lightweight, supervised ML models for identifying and detecting intrusions through anomaly detection and classification techniques. This analysis is performed on data acquired from network flows, which are also described in this section. The following paragraphs give an overview of the network flow features and the ML models used.

[Figure 7: Identification of an anomaly and the use of Federated Learning. Labeled elements: MIoTs; MIoT Gateways (Trained Random Forests); Cloud Server with a centralised Random Forest; Internet; Intrusion Detection System communication; Attack; Federated Algorithm (difference among the trained live and saved model, which results in retraining the model); Send Model Differences; Update the trained live and saved model; The Threat Intelligence Federated Learning Server and Clients; steps numbered 1 to 5.]
A supervised learning approach uses a certain amount of labeled training data to train models and is well suited to classification problems, which makes it a reasonable choice for our work. A well-annotated dataset is crucial for training a good model. The dataset's quality depends on proper labeling by humans, which essentially makes the training of the identification and classification ML models a "Human-in-the-loop" process. Fortunately, this step produces well-defined labels thanks to the precise connection information returned by the testing phase. Once labeling is complete, the models can be trained, so that the selected algorithms learn to spot patterns in the dataset.
4.4.1. Network Features
The dataset we utilized in this work has 80 features ob-
tained from network flows using the CICFlowMeter tool.
After feature selection based on importance scores, we
found the best 25 features to train our models, as explained
in Section 5. The best features that were derived from the
set of 80 features with their descriptions are listed in Table
5.
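To make the kind of flow statistics listed in Table 5 concrete, the sketch below computes a handful of them for a single toy flow. It is an illustration only: the packet tuple layout, the underscored feature names, and the use of the population standard deviation are assumptions, and CICFlowMeter's exact definitions may differ.

```python
from statistics import mean, pstdev

# Each packet is (timestamp_seconds, length_bytes, direction), where
# direction is "fwd" or "bwd". This layout is assumed for illustration.
packets = [
    (0.0, 100, "fwd"),
    (0.5, 300, "bwd"),
    (1.0, 200, "fwd"),
    (2.0, 400, "bwd"),
]


def flow_features(pkts):
    """Compute a few Table 5 style features for one bidirectional flow."""
    duration = pkts[-1][0] - pkts[0][0]
    if duration <= 0:
        raise ValueError("flow duration must be positive")
    lengths = [length for _, length, _ in pkts]
    fwd = [length for _, length, d in pkts if d == "fwd"]
    bwd = [length for _, length, d in pkts if d == "bwd"]
    return {
        "flow_byts_s": sum(lengths) / duration,   # bytes per second
        "flow_pkts_s": len(pkts) / duration,      # packets per second
        "fwd_pkt_len_max": max(fwd),
        "bwd_pkt_len_max": max(bwd),
        "pkt_len_mean": mean(lengths),
        "pkt_len_std": pstdev(lengths),           # population std (assumed)
    }


features = flow_features(packets)
print(features)
```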
4.4.2. Examined ML Models For Classification and Iden-
tification (Anomaly Detection)
This section presents the algorithms used by the gateway IDS for traffic classification and identification. For classification, which is used in the first stage, we examine multiple competitive approaches in order to select the most accurate one. For anomaly detection, which is used in the second stage for unknown attacks, we utilize only one ML approach because of its high accuracy and its extensive investigation in the open literature. Our investigation therefore explores different ML classifiers² for the classification and identification of each attack.

Table 5: Network Features
Feature              Description
flow_byts_s          Number of flow bytes per second
flow_pkts_s          Number of flow packets per second
fwd_pkts_s           Number of forward packets per second
bwd_pkts_s           Number of backward packets per second
fwd_pkt_len_max      Maximum size of packets in forward direction
bwd_pkt_len_max      Maximum size of packets in backward direction
fwd_pkt_len_mean     Mean size of packets in forward direction
fwd_pkt_len_std      Standard deviation size of the packet in forward direction
bwd_pkt_len_max      Maximum size of the packet in backward direction
bwd_pkt_len_mean     Mean size of the packet in backward direction
bwd_pkt_len_std      Standard deviation size of the packet in backward direction
pkt_len_max          Maximum length of a packet
pkt_len_mean         Mean length of a packet
pkt_len_std          Standard deviation length of a packet
pkt_len_var          Variance length of a packet
fwd_iat_tot          Total time between two packets sent in the forward direction
bwd_iat_tot          Total time between two packets sent in the backward direction
pkt_size_avg         Average size of packet
init_fwd_win_byts    The total number of bytes sent in the initial window in the forward direction
init_bwd_win_byts    The total number of bytes sent in the initial window in the backward direction
fwd_byts_b_avg       Average number of bytes bulk rate in the forward direction
fwd_pkts_b_avg       Average number of packets bulk rate in the forward direction
bwd_byts_b_avg       Average number of bytes bulk rate in the backward direction
fwd_seg_size_avg     Average size observed in the forward direction
bwd_seg_size_avg     Average size observed in the backward direction

For the anomaly detection problem, we use the One-Class SVM approach (presented in Section 2.2.5) because it is widely used and has been demonstrated in the open literature to be the most effective method for this purpose [38, 39, 40].
For our classification problem, we will compare the following approaches (as shown in Section 2.2.5):
• Support Vector Machines (SVM).
• K-Nearest Neighbors (KNN).
• Decision Tree.
• Random Forest.
• Naïve Bayes.

² The objective of a classification problem is to assign new objects to predetermined classes based on data provided by known objects, their classes, and their attributes, as well as data about their classes. Candidates for a solution have been chosen in consideration of evaluation criteria such as performance (equivalent to the quality of outcomes), complexity, and inference time. Using the following algorithms, we can fit models to training data gathered from earlier steps, and our experiment will continue.
4.4.3. ML Approach Training and Testing
A portion of the available data, commonly called the training data, is used to construct the model. The test dataset is used to assess the performance of the model. Its size is typically expressed as a percentage of the total dataset. It is imperative that the test set be entirely distinct from the data employed for training; otherwise, determining whether the model has learned from the input or merely memorized it is challenging. Such memorization, known as overfitting, is a frequently encountered phenomenon, and an excessively tailored model can produce exceedingly poor outcomes on new data. Numerous libraries offer mechanisms for partitioning datasets into training and test subsets. Occasionally, the data needs to be shuffled from its present order before partitioning. Typically, a designated proportion, such as 30%, is allocated for testing, while the remaining portion is allocated for training. Running the model on the designated test set yields prediction outcomes, enabling algorithmic performance evaluation. In this investigation, 30% of the data is used for testing, and the results are examined using k-fold validation, as shown in Section 5.2.1.
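The partitioning described above can be sketched in a few lines. This is a minimal, library-free illustration, assuming a shuffled hold-out split with 30% reserved for testing and contiguous k-fold partitions of the training portion; the function names, the seed, and the choice of k = 5 are hypothetical and not taken from the paper.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_rows, k):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, val
        start += size


rows = list(range(10))                      # stand-in for feature records
train, test = train_test_split(rows)
folds = list(k_fold_indices(len(train), k=5))
print(len(train), len(test), [len(v) for _, v in folds])
```

Keeping the test rows out of every fold means the k-fold scores are computed only on training data, so the held-out 30% remains untouched until the final evaluation.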
4.5. Proposed Federated Learning Approach
This section describes the implementation of FL. It provides the FL parameters that are set in our emulation system, the event that triggers FL and informs the network, the algorithm for calculating the delta that is sent to the network, and finally, the algorithms that run on the gateways and the cloud so that FL achieves its goal of updating all the models in the MIoT network. As shown below, in our approach, FL is implemented using DFS (as shown in Section 2.2.6) on Enhanced Random Forests³ to calculate the Delta differences between the old and the new classification model.
One of the primary concerns with federated learning is
the exposure to potential risks such as backdoor injections
or malicious manipulation of the training data [74, 75, 76].
Our approach tackles this issue by combining digital signa-
tures and robust encryption techniques. Specifically, dig-
ital signatures are used to authenticate the integrity and
origin of the data shared across the network. This ensures
that unauthorized entities have not altered or tampered
with training data. Additionally, encryption is applied to
the training data while it is being stored locally and during
transmission to the federated learning server. This dual-
layered security mechanism effectively shields the system
against attempts to inject backdoors or manipulate the
training dataset, thereby maintaining the overall integrity
and reliability of the federated learning process.
4.5.1. Federated learning Parameters
Table 6 lists the values of the parameters that control the FL process.
Table 6: Federated Learning Parameters
Parameter                          Value
Number of rounds                   R = 1
Total number of nodes              N = 12
Fraction of nodes per iteration    FN = 100%
Local batch size                   BS = 10 KB
Local training iterations          I = 1
Local learning rate                η = 100%
4.5.2. Event that triggers the Federated Learning Algo-
rithm
Every sinkhole is trained using an Enhanced Random Forest algorithm, which has been identified as the most accurate machine learning model with the lowest computational power and memory requirements, as evidenced in Section 5. During the training process, the sinkhole retains a copy of the model that is stored in the cloud (the centralized machine learning model). This stored model is used as a reference to detect any modifications or updates that need to be made to the present training model. Furthermore, the system requests the most recent version of the model from the cloud-hosted machine learning (ML) model through a Representational State Transfer (REST) interface. This request is encrypted to protect the data, and a digital signature accompanies it to verify the authenticity and integrity of any transmitted content. As outlined in Section 4, once the anomaly detector identifies an anomaly, which often represents an unknown attack, the system generates a new model by training the existing model with the additional records.

³ We have demonstrated that they are one of the most precise machine learning models, with a high level of efficiency in terms of conserving power, CPU, and memory.
4.5.3. Calculation of Model Differences in Random Forest
This section presents the algorithm employed for calculating the Delta of the running model compared to the saved model at any MIoT Gateway. The algorithm adapts the widely recognized depth-first search (DFS) algorithm, which is used here for tree traversal (for further details on DFS, please refer to Section 2.2.6). DFS is applicable because the Enhanced Random Forest classifier is derived from the Decision Tree classifier and comprises many individual decision trees. The adapted DFS therefore visits the decision trees sequentially. The trees are stored in an array, starting from the leftmost tree in the random forest, and each tree is assigned an incremental index value: "0" for the first tree, "1" for the second tree, "2" for the third tree, and so on. The modified depth-first search algorithm also maintains a distinct data structure, specifically a list, that stores the visited edges as a set. To build it, the recursive procedure must receive the parent vertex of the vertex being inspected as an argument. Each element in the array consists of a set that contains a representation of the additional edges. Each edge is composed of the two vertices that it connects. In greater detail, the edge possesses distinct attributes, including the vertex it is connected to, its special feature, and its corresponding weight.
After traversing all Decision Trees within a Random Forest, we evaluate the differences in terms of edges between each individual decision tree and the previously saved traversal list of the old model. This comparison allows us to identify any edges that have been added or removed. The symbol Delta denotes an array indexed by decision tree: each entry is a set of edges that reflects the differences specific to that decision tree. Algorithm 3 presents the modified Depth-First Search (DFS) algorithm, which includes the computation of the differences between tree models. Figure 8 displays the Decision Trees of both the old and the new models together with their differences. The results of executing the technique outlined above are depicted in Figure 9.
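The edge-based delta computation described in this section can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: trees are assumed to be nested dictionaries with hypothetical keys `id`, `label`, and `children`, an edge is represented as a `(parent_id, child_id, child_label)` tuple, and both forests are assumed to hold the same number of trees in the same order.

```python
def collect_edges(node, parent_id=None, visited=None, edges=None):
    """Depth-first traversal that records each (parent, child, label) edge once."""
    if visited is None:
        visited, edges = set(), set()
    if node["id"] in visited:
        return edges
    visited.add(node["id"])
    if parent_id is not None:           # the root has no incoming edge
        edges.add((parent_id, node["id"], node["label"]))
    for child in node.get("children", []):
        collect_edges(child, node["id"], visited, edges)
    return edges


def forest_delta(saved_forest, running_forest):
    """Per-tree sets of added and removed edges between two forests."""
    delta = []
    for index, (saved, running) in enumerate(zip(saved_forest, running_forest)):
        old_edges = collect_edges(saved)
        new_edges = collect_edges(running)
        delta.append({
            "tree": index,
            "added": new_edges - old_edges,
            "removed": old_edges - new_edges,
        })
    return delta


# Toy trees: after retraining, leaf "b" becomes an internal split node.
old_tree = {"id": "r", "label": "flow_pkts_s<=50", "children": [
    {"id": "a", "label": "normal"},
    {"id": "b", "label": "ddos"},
]}
new_tree = {"id": "r", "label": "flow_pkts_s<=50", "children": [
    {"id": "a", "label": "normal"},
    {"id": "b", "label": "pkt_len_mean<=90", "children": [
        {"id": "c", "label": "ddos"},
        {"id": "d", "label": "new_attack"},
    ]},
]}
delta = forest_delta([old_tree], [new_tree])
print(delta)
```

Because only the added and removed edges are kept per tree, the payload shared with the cloud grows with the size of the change rather than with the size of the whole forest.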
4.5.4. Federated Learning Algorithms
This section presents the FL Algorithms 4 and 5, which are executed at the MIoT gateway and the cloud ML model server, respectively, to ensure efficient and dependable execution of the model updates during the FL process. To provide the necessary atomicity of the federated learning (FL) process, the algorithms employ a queue to handle the requested changes to the Enhanced Random Forest model. This approach ensures that updates are applied successfully and sequentially to the model of the cloud server and to all other models within the network. The delta computation is described in Section 4.5.3.
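The role of the queue can be illustrated with a small sketch. It does not reproduce Algorithms 4 and 5, which are not shown in this section; it only demonstrates, under assumptions, how queued deltas could be applied one at a time in arrival order. The model is assumed to be a dictionary mapping a tree index to a set of edges, and each delta is assumed to carry `tree`, `added`, and `removed` fields.

```python
import queue


def apply_delta(model, delta):
    """Apply one delta in place; model maps tree index -> set of edges."""
    edges = model.setdefault(delta["tree"], set())
    edges -= delta["removed"]   # in-place set difference
    edges |= delta["added"]     # in-place set union


def drain_updates(model, pending):
    """Apply queued deltas one at a time, in arrival (FIFO) order."""
    applied = 0
    while True:
        try:
            delta = pending.get_nowait()
        except queue.Empty:
            return applied
        apply_delta(model, delta)
        pending.task_done()
        applied += 1


# Toy cloud model with one tree and two queued updates from gateways.
cloud_model = {0: {("r", "a"), ("r", "b")}}
pending = queue.Queue()
pending.put({"tree": 0, "added": {("b", "c")}, "removed": set()})
pending.put({"tree": 0, "added": {("b", "d")}, "removed": {("r", "a")}})
n_applied = drain_updates(cloud_model, pending)
print(n_applied, cloud_model)
```

`queue.Queue` is thread-safe, so several receiver threads could enqueue deltas while a single consumer applies them, which is one way to obtain the sequential behavior the text asks for.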
5. Experimental Evaluation
This section presents the outcomes of experiments conducted on a real system configuration utilizing heart rate sensors and a Raspberry Pi 3B+ device. This experimental setup was chosen to closely replicate real-world conditions in MIoT environments. The experiments focus not only on identifying emulated threats but also on rigorously evaluating the effectiveness of different machine learning models through comprehensive performance testing, both quantitative and qualitative. Quantitatively, we assess the models' performance using metrics such as accuracy, precision, recall, and F1-score. These metrics objectively measure how well our models detect and classify various types of cyberattacks. Qualitatively, we investigate our models' practical applicability and robustness, offering insights into their real-world effectiveness and limitations. This holistic evaluation ensures a thorough understanding of the models' capabilities. Furthermore, this section covers the specific aspects of our study: MIoT Attack Dataset, Experiments Results, and Discussion. This includes the Attack Classification Results Used to Select the ML Approach for the First Stage of our Investigation, Anomaly Detection Results for Unknown Attacks Used in the Second Stage of the Approach, Resource Consumption and Execution Time, and Evaluation of Federated Learning Realization Performance. Each of these parts is critical to establishing the effectiveness and efficiency of our proposed system in real-world MIoT environments. Finally, we give the time complexity of the complete solution in Big-O notation.
5.1. MIoT Attack Dataset
Figure 10 shows an overview of the system environment
we set up for generating our dataset.
Algorithm 3: Deltas Calculation Algorithm Based on DFS
Result: DFS
SAVED_MODEL: The model currently used by the Classification ML approach, saved on the disk
RUNNING_MODEL: The running model used by the classification algorithm, saved in memory
Delta: The models' differences, an array holding the differences of each model based on each decision tree
DecisionTreeSM, DecisionTreeRM: The decision trees of SM and RM
visitedVertex: An array used for visited vertices
fatherVertex: The father of the investigated vertex, used to get the edge
investigatedVertex: The investigated vertex
investigatedEdge: The investigated edge
setOfEdges: The set of edges
checkVisitedVertex: The function that checks if the vertex is visited
getInvestigatedEdge: Get the associated edge between fatherVertex and investigatedVertex
getInvestigatedVertexChildren: Get the children of investigatedVertex
children: The children of investigatedVertex
Function AdaptedDFS(fatherVertex, investigatedVertex):
    if (checkVisitedVertex(investigatedVertex) == 0) then
        Add investigatedVertex to visitedVertex;
        investigatedEdge = getInvestigatedEdge(fatherVertex, investigatedVertex);
        if (count(investigatedEdge) == 1) then
            Add investigatedEdge to setOfEdges;
        end
        children = getInvestigatedVertexChildren(investigatedVertex);
        while (count(children) > 0) do
            Remove a child from children;
            AdaptedDFS(investigatedVertex, child);
        end
    else
        return;
    end
Function Main(SAVED_MODEL, RUNNING_MODEL):
    Initialization;
    count = 0;
    while (count(RUNNING_MODEL) > 0) do
        Read DecisionTreeSM and DecisionTreeRM;
        Delta[count] = AdaptedDFS(DecisionTreeRM.ROOT, nothing) - AdaptedDFS(DecisionTreeSM.ROOT, nothing);
        count++;
    end
    return
Figure 8: Changes of enhanced Random Forest in Decision Trees after a new attack identification. Panels: (a) Old Model Tree 1 Number, (b) New Model Tree 1 Number, (c) Old Model Tree 2 Number, (d) New Model Tree 2 Number.
IoT Environment: For our experiments, we created a health monitoring
IoT system that sends crucial health vitals, collected with a
MAX30100 heart rate sensor, to the cloud. A Raspberry Pi 3B+ was
used as a central gateway, and an ESP8266 WiFi module sent the data
to the cloud.
CICFlowMeter is a tool that generates and analyzes network traffic
flows. It generates flows bidirectionally, i.e., forward and
backward, and converts the network flows into features such as
duration, number of packets, number of bytes, length of packets,
sub-flow packets, and push flags. Along with the traffic flow
features, the output contains the Flow ID, Source IP, Destination
IP, Source Port, Destination Port, and Protocol. In this paper, the
packets captured with Wireshark were converted into ML features
using CICFlowMeter.
5.1.1. Feature analysis & Feature Selection for Training of
the investigated ML approaches
Univariate feature selection works by selecting the best features
based on univariate statistical tests. The SelectKBest approach
removes all attributes except the k highest-scoring features in the
dataset. A chi-square test is used in statistics to test the
independence of two events and is denoted by χ². From the given data
of two variables, we can obtain the observed count O and the
expected (anticipated) count E. Chi-Square measures how far the
observed counts O deviate from the expected counts E. The Chi-square
formula is (Formula 7):

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \tag{7}$$

where $O_i$ is the observed (actual) value and $E_i$ is the expected
(anticipated) value.
Note that the chi-squared test is most commonly used in scenarios
where both variables are categorical. However, it can also be
applied to continuous predictors when these predictors are
discretized or categorized before the test. This research
discretizes the continuous predictors to fit the categorical
framework required for the chi-squared test. This methodological
choice was made considering the specific nature of our data and the
analytical objectives that this research aimed to achieve
[77, 78, 79].
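As an illustration of Formula 7 combined with the discretization step described above, the following minimal sketch computes the textbook contingency-table chi-square statistic for one feature. The helper names (`discretize`, `chi_square`), the equal-width binning, and the toy data are assumptions made for this example; they are not taken from the paper's pipeline, and library scoring functions may compute the statistic differently.

```python
from collections import Counter


def discretize(values, n_bins):
    """Map continuous values to equal-width bin indices in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def chi_square(feature, labels):
    """Chi-square statistic between a categorical feature and class labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))   # O_i for each (bin, class) cell
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n   # E_i under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: a flow-duration feature and traffic labels.
durations = [0.0, 1.0, 2.0, 3.0]
labels = ["normal", "normal", "ddos", "ddos"]
binned = discretize(durations, 2)
score = chi_square(binned, labels)
print(binned, score)
```

A feature whose bins line up with the class labels, as in this toy case, receives a high score, while a feature distributed identically across classes scores zero.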
In feature selection, we select the features that most strongly
affect the response. When two features are independent, the observed
count is close to the expected count; therefore, we will have a
lower Chi-Square value. A high Chi-Square value indicates that the
hypothesis of independence is incorrect. The selected best features
are shown in Table 5 and Section 4.4.1. It should be noted that a
total of 25 features were selected, with a total of 21,475 elements.
This includes 19,208 training elements and 2,267 testing elements,
as indicated in Table 7.
5.1.2. Balancing the Dataset
Balancing the dataset is an important concept in our research
because IoT/MIoT IDS classification models frequently encounter an
imbalanced dataset problem, where the number of instances in the
majority class is much larger than in the minority class.
Figure 9: The results by running the Deltas calculation algorithm
Figure 10: MIoT Attack Dataset System Environment
Therefore, the model faces a severe problem in learning the minority
classes well. One popular approach to overcoming that weakness is to
generate new instances synthesized from the existing minority class.
To balance the data, the SMOTE-Tomek Links method was used; this
implementation combines the oversampling approach of SMOTE with the
undersampling approach of Tomek Links [80]. SMOTE is one of the most
popular oversampling methods and was developed by Chawla et al. [81].
Unlike random oversampling, which only duplicates some random
instances from the minority class, SMOTE generates samples based on
the distance (generally the Euclidean distance) between each data
point and its nearest neighbors in the minority class, so the
generated instances are different from the original minority-class
instances. The process for generating the synthetic samples is as
follows:
• Choose a random sample from the minority class.
• Calculate the Euclidean distance between the random sample and its
k nearest neighbors.
• Multiply the difference between the sample and one of these
neighbors by a random number between 0 and 1, and add the result to
the sample to obtain a synthetic minority-class sample.
• Repeat the procedure until the desired proportion of the minority
class is met.
This method is effective because the generated synthetic data are
close to the minority class in feature space. We have five classes
in total, and the instances of each class are listed in Table 7.
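The SMOTE steps listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name `smote`, its parameters, and the toy points are hypothetical, only the oversampling half is shown (the Tomek Links undersampling step is omitted), and it is not the implementation used in the experiments.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority-class neighbours of x (Euclidean distance), excluding x
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # random number in [0, 1)
        # new point lies on the segment between x and the chosen neighbour
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic


# Hypothetical minority-class points in a 2-D feature space.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=10, k=2)
print(new_points[:3])
```

Because every synthetic point is an interpolation between two existing minority samples, it stays inside the region already occupied by that class, which is the property the paragraph above relies on.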
5.2. Experiments Results and Discussion
This section presents the outcomes of the ML model
trials and a commentary on them. Also, experimentation
with various ML classifiers is provided, and a performance
comparison of the ML models is shown using different met-
rics.
5.2.1. K-fold cross-validation
The K-fold cross-validation technique is employed to evaluate the
performance of the researched models. It was applied with a value of
k = 5 to guarantee that the model exhibited good generalization and
yielded consistent metrics
Algorithm 4: Federated Learning Algorithm Run in Each MIoT Gateway
Result: Updates Cloud ML Model
SAVED_MODEL: The model currently used by the Classification ML approach, saved on the disk
RUNNING_MODEL: The running model used by the classification algorithm, saved in memory
PCAP: The PCAP generated files per second
CICFlowMeter: The method that converts the PCAP files to features records shown at 5.1
FeaturesRecords: The converted features records for model training and anomaly detection
RecordsIdentified: The features records identified with anomaly
saveModeltoDisk: The calculation of Model Differences Algorithm shown at Section 4.5.3
ANOMALY_DETECTED: The anomaly detector that identifies an anomaly and returns the RecordsIdentified
CalculateModelsDifference: The calculation of Model Differences Algorithm shown at Section 4.5.3
Delta: The model differences
Cloud_ML_Model_Server_IP: The IP address of the cloud ML Model Server

Initialization;
while true do
    FeaturesRecords = CICFlowMeter(PCAP);
    RecordsIdentified = ANOMALY_DETECTED(FeaturesRecords);
    if (count(RecordsIdentified) ≥ 1) then
        Delta = CalculateModelsDifference(RecordsIdentified, SAVED_MODEL, RUNNING_MODEL);
        SAVED_MODEL = RUNNING_MODEL;
        Send the Delta to the Cloud_ML_Model_Server_IP encrypted;
    end
end
across various subsets of data points. The process of K-fold
cross-validation involves partitioning the dataset into K equally
sized folds. In each iteration, K − 1 folds are employed for
training the model and the remaining fold is utilized to evaluate
the performance of the obtained model; the mean accuracy over the K
iterations is then computed. The algorithm followed is shown in the
following list [82]:
1. Split the dataset into K equal-sized subsets, where
K is the desired number of folds for cross-validation.
2. For each fold, do the following:
(a) Use K-1 subsets for training the model.
(b) Use the remaining 1 subset for testing the model.
Algorithm 5: Federated Learning Algorithm Run in Cloud ML Model Server
Result: Federated Learning Algorithm Run in Cloud ML Model Server
SAVED_MODEL: The model currently used by the Classification ML approach, saved on the disk
RUNNING_MODEL: The running model used by the classification algorithm, saved in memory
Delta: The model differences
Queue_MIoT_GW_Delta_Received_IP: The queue with the IP addresses and Deltas (as a comma-separated string) of the MIoT gateways whose model differences have been received
MIoT_GW_IPs: The list of IP addresses of the MIoT gateways

Initialization;
while true do
    if (count(Queue_MIoT_GW_Delta_Received_IP) ≥ 1) then
        while (count(Queue_MIoT_GW_Delta_Received_IP) > 0) do
            MIoT_GW_Received = Remove the first element from Queue_MIoT_GW_Delta_Received_IP;
            MIoT_GW_Received_IP = Separate from MIoT_GW_Received the IP that sent the Delta;
            Delta = Separate from MIoT_GW_Received the Delta;
            RUNNING_MODEL = RUNNING_MODEL modified with the decrypted Delta;
            SAVED_MODEL = RUNNING_MODEL;
            MIoT_GW_IPs_to_Send: The list of IP addresses of the MIoT gateways without MIoT_GW_Received_IP;
            Send the Delta to the MIoT_GW_IPs_to_Send encrypted;
        end
    end
end
(c) Evaluate the model’s performance on the test
subset using a chosen evaluation metric (e.g.,
accuracy, mean squared error).
(d) Record the evaluation result for this fold.
3. Repeat step 2 for all K folds.
4. Calculate the average performance metric across all
K folds to more robustly assess the model’s perfor-
mance.
Table 7: Dataset Statistics
| Classes | Training instances before oversampling | Training instances after oversampling | Testing instances |
|---|---|---|---|
| Normal | 4807 | 3842 | 965 |
| MitM | 190 | 3842 | 41 |
| DDoS | 3998 | 3842 | 791 |
| Brute Force Authentication | 206 | 3841 | 36 |
| NMAP | 2133 | 3841 | 434 |
| Total | 11334 | 19208 | 2267 |
This algorithm outlines the basic steps of K-Fold Cross-
Validation, which is commonly used to assess the perfor-
mance and generalization of machine learning models.
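The K-fold procedure listed above can be sketched as follows. This is a minimal illustration, not the evaluation code behind the reported results: the helper names, the trivial majority-class "model", and the toy labels are assumptions, and the sketch splits the data in order without the shuffling or stratification that a real evaluation would normally add.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(X, y, k, fit, score):
    """Train on K-1 folds, test on the held-out fold, and average the scores."""
    results = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        results.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(results) / len(results)


# Hypothetical stand-in model: always predict the majority training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def accuracy(model, X_test, y_test):
    return sum(1 for label in y_test if label == model) / len(y_test)


X = list(range(10))
y = [0] * 8 + [1] * 2
mean_acc = cross_validate(X, y, k=5, fit=fit_majority, score=accuracy)
print(mean_acc)
```

In this toy run the last fold contains only the rare label, so its accuracy drops to zero; this is the kind of fold-to-fold variation that stratified splitting is meant to reduce.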
5.2.2. Performance Metrics
To assess the performance of our IDS classification models, we used
multiple metrics, as detailed below. The dataset, containing 80
network features and nearly 7700 packets, was segmented into
training and testing data. Segmentation was performed using a
train-test split and was repeated using stratified K-Fold. The
models were then built on the training data using five algorithms,
namely Random Forest, Support Vector Classifier, K-Neighbours
Classifier, Decision Tree Classifier, and Gaussian Naïve Bayes
Classifier, and predictions were made on the test dataset.
We used metrics that are widely adopted for intrusion detection
systems: F1 Score, Precision (P), Recall (R), True Positive Rate
(TPR), False Positive Rate (FPR), and False Negative Rate (FNR). To
understand how these metrics are calculated, the confusion matrix
and its related quantities must first be defined.
Confusion Matrix: A tabulated representation of the
results that can be used to validate the performance of
trained ML models. Considering a binary classification
problem, the possible outcome related to the expected out-
come can be categorized as:
• True Positives (TP): Classified Positive (Yes) and correctly
classified. TP is the number of data samples with attacks detected
correctly.
• True Negatives (TN): Classified Negative (No) and correctly
classified. TN is the number of benign samples detected correctly.
• False Positives (FP): Classified Positive (Yes) and wrongly
classified. FP is the number of data samples with false attack
detection.
• False Negatives (FN): Classified Negative (No) and wrongly
classified. FN is the number of attack samples missed.
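Given the definitions of TP, TN, FP, and FN, we can compute

$$\text{F1 Score} = \frac{2 \cdot P \cdot R}{P + R}, \quad \text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = 1 - \text{FNR},$$

where

$$\text{FNR} = \frac{FN}{TP + FN}, \quad \text{FPR} = \frac{FP}{FP + TN}.$$

Note that Recall can also be given by $\frac{TP}{TP + FN}$.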
For any intrusion detection system, it is ideal not to miss any
attacks, i.e., FNR → 0 and Recall → 1. It should also not trigger
false alarms, i.e., Precision → 1 and FPR → 0. Achieving both cases
yields an F1 Score of 1.
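As a worked example of the definitions above, the following sketch derives the metrics from the four confusion-matrix counts of a binary problem. The function name `ids_metrics` and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from our experiments.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute the IDS metrics defined above from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to the true positive rate
    fnr = fn / (tp + fn)
    fpr = fp / (fp + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "fnr": fnr,
        "fpr": fpr,
        "f1": f1,
    }


# Hypothetical counts: 100 attack samples and 100 benign samples.
m = ids_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

The example also confirms the two equivalent forms of Recall given in the text, since TP/(TP + FN) and 1 − FNR produce the same value.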
AUC ROC Curve: The Receiver Operating Characteristic (ROC) is a
probability curve that plots the true-positive rate against the
false-positive rate. The Area Under the Curve (AUC) measures the
ability to tell one class from another.
5.2.3. Cross-Validation of the Hyperparameters in the ML
models
Hyperparameter tuning has been performed when building the
machine-learning models. It is a procedure that adjusts a model's
hyperparameters to increase its accuracy. Two approaches are widely
used to improve an ML model's performance and reduce its error rate:
GridSearchCV and RandomizedSearchCV. The resulting best scores are
also used to determine which ML model performs best among the
candidate models.
GridSearchCV [83]: GridSearchCV is one of the ap-
proaches that use cross-validation to tune the hyperpa-
rameters of the ML model. It is performed with the help
of predefining the parameters in the form of a dictionary.
It tries out all the combinations in the dictionary and finds
the best parameters that improve the performance of the
ML model.
RandomizedSearchCV [84]: RandomizedSearchCV is an alternative to
GridSearchCV, which is computationally expensive because it
evaluates every combination of the given hyperparameter values.
RandomizedSearchCV instead evaluates a fixed number of sampled
hyperparameter settings. Rather than requiring a list of specific
values for each hyperparameter, it can use a statistical
distribution to pick values for each hyperparameter.
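The difference between the two search strategies can be illustrated with a small library-independent sketch. Here `evaluate` stands in for the cross-validated score of a model trained with the given hyperparameters; the function names, the toy objective, and the value ranges are assumptions for illustration and do not reproduce the scikit-learn implementations or the searches reported in Tables 8 and 9.

```python
import itertools
import random


def grid_search(param_grid, evaluate):
    """Exhaustively evaluate every combination in the grid."""
    names = sorted(param_grid)
    best = None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


def randomized_search(param_samplers, evaluate, n_iter, seed=0):
    """Evaluate a fixed number of settings drawn from per-parameter samplers."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iter):
        params = {name: sample(rng) for name, sample in param_samplers.items()}
        score = evaluate(params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Hypothetical objective that peaks at n_estimators=12, max_depth=5.
def toy_score(params):
    return -((params["n_estimators"] - 12) ** 2) - (params["max_depth"] - 5) ** 2


grid = {"n_estimators": [8, 12, 16], "max_depth": [3, 5, 7]}
samplers = {
    "n_estimators": lambda rng: rng.randint(1, 30),
    "max_depth": lambda rng: rng.randint(1, 10),
}
grid_best = grid_search(grid, toy_score)
random_best = randomized_search(samplers, toy_score, n_iter=8)
print(grid_best, random_best)
```

The grid search costs one evaluation per combination (nine here), whereas the randomized search costs exactly `n_iter` evaluations regardless of how many hyperparameters or values are involved.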
5.2.4. Attack Classification Results Used to Select the ML
Approach for the First Stage of our Investigation
Table 8 displays each model's performance after performing
GridSearchCV; Table 9 shows each model's performance after
completing RandomizedSearchCV; and Table 10 and Fig. 11 display the
accuracy scores of the ML algorithms used to detect intrusions after
feature selection.
Explanation of the Enhanced Random Forests
hyperparameters:
Table 8: Grid search cross-validation results
| ML Model | Best score | Best parameters |
|---|---|---|
| Support Vector Machine | 0.994345 | ['C': 20, 'kernel': 'linear'] |
| K-Nearest Neighbors | 0.993934 | ['n_neighbors': 1] |
| Decision Tree | 0.989356 | ['max_depth': 6] |
| Random Forest | 0.999002 | ['n_estimators': 17] |
| Enhanced Random Forest | 0.999002 | ['n_estimators': 17, 'max_depth': None, 'min_samples_split': 2, 'min_samples_leaf': 1, 'max_features': 'auto'] |
| Naive Bayes | 0.986655 | [] |
Table 9: Randomised search cross-validation results
| ML Model | Best score | Best parameters |
|---|---|---|
| Support Vector Machine | 0.994345 | ['C': 20, 'kernel': 'linear'] |
| K-Nearest Neighbors | 0.997422 | ['n_neighbors': 1] |
| Decision Tree | 0.993934 | ['max_depth': 6] |
| Random Forest | 0.998919 | ['n_estimators': 20] |
| Enhanced Random Forest | 0.998919 | ['n_estimators': 20, 'max_depth': None, 'min_samples_split': 2, 'min_samples_leaf': 1, 'max_features': 'auto'] |
| Naive Bayes | 0.986655 | [] |
• n_estimators: The number of decision trees in the forest. A higher
number can lead to a more robust model but may require more
computational resources.
• max_depth: The maximum depth of each decision tree. Setting it to
None allows the trees to expand until all leaves are pure or contain
fewer than min_samples_split samples.
• min_samples_split: The minimum number of samples required to split
an internal node. Setting it to 2 means that a node must have at
least 2 samples to be divided further.
• min_samples_leaf: The minimum number of samples required to be in
a leaf node. This parameter can prevent the trees from growing too
deep and overfitting.
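• max_features: The number of features to consider when looking for
the best split. For scikit-learn's random forest classifier, 'auto'
is equivalent to 'sqrt', i.e., the square root of the total number
of features is considered at each split.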
These hyperparameters are commonly used as a start-
ing point for an Enhanced Random Forest. The optimized
values are identified using the aforementioned Random-
izedSearchCV and GridSearchCV approaches and a brute
force investigation.
Table 10: Accuracy metrics of ML algorithms with feature selection
| Machine Learning Algorithms | Training sample size 70% | Training sample size 60% | Training sample size 50% |
|---|---|---|---|
| Enhanced Random Forest | 99.96 | 99.98 | 99.98 |
| Support Vector Machine | 83.57 | 83.91 | 84.25 |
| K-Nearest Neighbors | 98.89 | 98.90 | 98.83 |
| Decision Tree | 99.63 | 99.97 | 99.97 |
| Naïve Bayes | 98.06 | 98.02 | 98.10 |
Figure 11: ML Algorithms Accuracy
Figure 12 plots the AUC-ROC curve for the five ML models we have
trained. Based on the curve, it can be deduced that all the other
methods, including the Enhanced Random Forest classifier, outperform
the Naïve Bayes classifier (NBC). An additional observation on the
AUC-ROC curves for our machine learning models is that the curves
converge towards the plot's
upper left corner. This visual clustering of curves directly
results from the high classification accuracy achieved by
all models, as reflected in the precision, recall, and F1-
score metrics reported. Each model demonstrated excep-
tional performance, with scores nearing the ideal value of
1, which leads to AUC values approaching unity except
for the NBC. Given the robustness of the models and the
consequent minimal variation in their performance met-
rics, the ROC curves are understandably indistinguishable.
Such an occurrence is not uncommon in scenarios where
classifiers achieve superior performance levels, rendering
their true positive and false positive rates highly similar.
The intrusion detection model needs to predict the five output
labels: "normal": 0, "MITM": 1, "DDoS": 2, "BFA": 3, and "NMAP": 4.
The classification report function builds a visualizer showing the
main classification metrics for the custom target names and the
labels inferred above. Table 11 shows the detailed classification
performances of the ML models for the different classes.
Figure 12: AUC - ROC Curve
Table 11: Classification performances of ML models
| ML Model | Traffic class | Precision | Recall | f1-score |
|---|---|---|---|---|
| Enhanced Random Forest | Normal | 1.00 | 1.00 | 1.00 |
| | MitM | 1.00 | 1.00 | 1.00 |
| | DDoS | 1.00 | 1.00 | 1.00 |
| | Brute Force Authentication | 1.00 | 0.94 | 0.97 |
| | NMAP | 0.99 | 1.00 | 0.99 |
| Support Vector Machines | Normal | 0.99 | 0.98 | 0.99 |
| | MitM | 0.43 | 1.00 | 0.60 |
| | DDoS | 0.83 | 0.54 | 0.65 |
| | Brute Force Authentication | 0.91 | 0.94 | 0.93 |
| | NMAP | 0.51 | 0.79 | 0.62 |
| K-Nearest Neighbors | Normal | 1.00 | 0.99 | 0.99 |
| | MitM | 0.89 | 0.94 | 0.99 |
| | DDoS | 0.98 | 0.99 | 0.99 |
| | Brute Force Authentication | 1.00 | 0.88 | 0.94 |
| | NMAP | 0.98 | 0.97 | 0.99 |
| Decision Tree | Normal | 1.00 | 1.00 | 1.00 |
| | MitM | 1.00 | 0.97 | 0.98 |
| | DDoS | 0.99 | 1.00 | 0.99 |
| | Brute Force Authentication | 0.97 | 0.94 | 0.95 |
| | NMAP | 0.98 | 1.00 | 0.99 |
| Naive Bayes | Normal | 0.97 | 0.98 | 0.98 |
| | MitM | 0.43 | 0.31 | 0.36 |
| | DDoS | 1.00 | 0.99 | 0.99 |
| | Brute Force Authentication | 1.00 | 0.88 | 0.94 |
| | NMAP | 0.98 | 0.99 | 0.98 |
Figure 13 is a partial visualization of a decision tree from an
Enhanced Random Forest built from our training dataset.
Note that for our experiments, we ran the ML approaches on our own
dataset, which was generated through emulation. We cross-checked the
results with other datasets such as CSE-CIC-IDS2018 [85] and NSL-KDD
[86], and our results are very similar to those obtained on the
other datasets (the difference was on the order of 0.1%).
5.2.5. Anomaly Detection Results for Unknown Attacks
Used in the Second Stage of the Approach
One concern about classification challenges is that while
it is possible to successfully categorize data into certain
classes, encountering instances that do not fit inside the
classes for which our model has been trained is possible.
These anomalous data points may potentially represent
novel instances of attacks and so warrant detection. Dur-
ing the stage of anomaly identification in the proposed
methodology, One-Class Support Vector Machines (SVMs)
are utilized due to their widespread usage and extensive
documentation in addressing this objective[39, 38, 40].
In the context of anomaly detection experiments, we
systematically altered our training dataset by including all
classes but one throughout each iteration. This approach
allowed us to evaluate the model’s capacity to identify the
excluded class as an abnormality. In a practical context, it
is reasonable to classify all attacks, as well as normal data
for our One-Class Support Vector Machine (SVM) and any
other unobserved attacks, as outliers to be detected. The training
and testing accuracy of outlier detection using One-Class Support
Vector Machines (SVM), with different classes introduced as
anomalies, is shown in Table 12.
The unknown attacks that our investigation aimed to identify are the
following:
• Eavesdropping Attacks [87]: Eavesdropping attacks encompass the
act of intercepting, replicating, and monitoring the transmission of
data between Internet of Things (IoT) devices. The act of
eavesdropping by malicious individuals can result in unauthorized
access and compromise of sensitive information, undermining users'
privacy.
• Physical Tampering [88]: Physical tampering attacks refer to the
act of obtaining physical proximity to Internet of Things (IoT)
devices to manipulate or compromise their functionality. Preventing
this attack can pose a challenge without adequate physical security
measures.
• Device Impersonation [89]: Malicious actors can assume the
identity of authentic Internet of Things (IoT) devices, so they
acquire illicit entry into networks or services without proper
authorization. This particular form of assault can potentially
result in data breaches and the illegal manipulation of devices.
Table 12: Anomaly detection results
| Anomalous classes | Training accuracy | Testing accuracy |
|---|---|---|
| MitM | 0.99740 | 0.99742 |
| DDoS | 0.99715 | 0.99773 |
| NMAP | 0.99747 | 0.99786 |
| Brute force authentication | 0.99767 | 0.99770 |
The results suggest that One-Class SVM achieves high
performance, detecting all provided attack classes as anoma-
lies when their instances are not included in the training
set. The results are promising for use in discovering new
attack types in MIoT.
Figure 13: Sample tree from Random Forest
5.2.6. Resource Consumption and Execution Time Perfor-
mance
This section presents the resource consumption and execution times
of the various machine learning models employed for network traffic
classification. While resource consumption may be relatively
insignificant when robust cloud servers are used, it becomes
critical on Internet of Things (IoT) devices such as the Raspberry
Pi. Our analysis primarily concentrated on evaluating performance
and assessing the green-energy impact using the following metrics:
1. Power Consumption: Amount of power consumed by
the model in mW.
2. CPU Usage: There is limited capacity to execute and
run many programs. With better CPU usage, you
can run more tasks simultaneously.
3. Memory Usage: The measure of the percentage of
memory used by various currently used applications.
In the context of power consumption measurements, the
instrument employed was PowerTOP. The tool mentioned
above is a Linux utility that quantifies power usage and
implements power management strategies. Numerous as-
pects can be derived from the instrument, encompass-
ing power estimation, usage patterns, and CPU utiliza-
tion. The ML models were executed, and PowerTOP was
employed to quantify the concurrent power consumption.
The Atop tool was employed to obtain measurements of
CPU and memory utilization. This tool serves as an exper-
iment tracking mechanism, enabling system performance
monitoring in a real-world context. Using this instrument,
our findings are deemed rational and valid. The display
provides a range of data about the system’s load at the in-
dividual process level. The data collected through the uti-
lization of Atop encompasses many process-related statis-
tics, such as CPU usage, memory usage, hard drive usage,
and network statistics at both the transport and network
layer. The Atop tool was utilized to execute individual
runs of each machine-learning model to ascertain the re-
spective CPU and memory consumption. Furthermore,
we conducted experiments on multiple machine learning (ML) models
and a deep learning (DL) model using the Atop tool. The measurements
obtained from these experiments were documented on the Raspberry Pi,
and the results are presented in Table 13 and Fig. 14. Table 14 and
Fig. 15 provide the execution time and size of the ML models that
have been loaded onto the Raspberry Pi 3b+ device.
Figure 14: ML Models Resource Consumption
Table 13: Resource Consumption of the ML models
| Models | Power Consumption (mW) | CPU Usage (%) | Memory Usage (%) |
|---|---|---|---|
| Enhanced Random Forest | 15.5 | 1 | 5 |
| Support Vector Machine | 11.1 | 1 | 5 |
| K-Nearest Neighbours | 15.6 | 1 | 5 |
| Decision Tree | 16.6 | 1 | 5 |
| Naive Bayes | 17.7 | 1 | 5 |
| DL model | 65 | 3 | 6 |
The power consumption of a Deep Learning (DL) model consisting of
five hidden layers was measured in order to facilitate a comparative
analysis with the other Machine Learning (ML) models. The power
consumption comparison between the ML models and the DL model is
shown in Table 13.
Figure 15: ML Models Execution Time and Size
Table 14: Execution time and size of various models
| ML Models | Execution time (s) | Model size (MB) |
|---|---|---|
| Enhanced Random Forest | 1.06 | 1.369 |
| Support Vector Machine | 1.37 | 0.849 |
| K-Nearest Neighbours | 1.25 | 2.44 |
| Decision Tree | 0.54 | 0.01 |
| Naive Bayes | 0.86 | 0.003 |
| DL model | 0.94 | 0.008 |
5.2.7. Rationale Behind Selecting the Enhanced Random
Forest Model
The choice of the Enhanced Random Forest model for
intrusion detection in MIoT networks was driven by a bal-
anced consideration of various factors, as highlighted be-
low:
• Power Consumption and Execution Time: While the Enhanced Random
Forest model has a marginally higher power consumption (15.5 mW) and
execution time (1.06 seconds), its high classification accuracy
justifies these trade-offs, especially in the context of MIoT
networks where accuracy is critical.
• CPU and Memory Usage: The model's CPU and memory usage stand at 1%
and 5%, respectively, indicating its moderate resource requirements
and making it suitable for resource-constrained environments in MIoT
networks.
• Model Size: The Enhanced Random Forest model's size of 1.369 MB,
while not the smallest, is compensated by its superior accuracy in
classifying various traffic classes, as demonstrated in our results.
• Classification Performance: This model consistently achieves high
precision, recall, and f1-scores across different traffic classes,
thereby ensuring reliable detection of both normal and anomalous
traffic in MIoT networks.
• Balanced Approach: Selecting the Enhanced Random Forest model
represents a balanced approach, optimizing both resource utilization
efficiency and effectiveness in accurately detecting a wide range of
threats.
In conclusion, the Enhanced Random Forest model is iden-
tified as the ’golden model’ for our study, offering an ideal
blend of accuracy and practicality for intrusion detection
in MIoT networks.
5.2.8. Examining the results of the proposed approach with
a well-known dataset
To thoroughly examine the performance and robust-
ness of our proposed approach, extensive experiments were
conducted using the "CICEV2023 DDoS Attack Dataset" [90]. This
dataset was chosen mainly for its standardization and relevance in
the domain of network security. Over 100 individual experiments were
performed to ensure a comprehensive evaluation. The results from
these extensive tests have been exceptionally consistent,
demonstrating the effectiveness of our approach, with variations in
performance constrained to only ±0.05%. This consistent
performance across a significant number of trials validates
our methodology’s reliability and underscores its applica-
bility in diverse real-world scenarios.
5.3. Evaluation of Federated Learning Realization
In this section, we will examine the realization of the
proposed approach shown in Section 4.3 and evaluate the
FL approach as shown in Section 4.5. According to the
findings in Section 5.2, the following components of the
proposed approach are implemented in the realization of
the emulation:
• The GateWay IDS / MIoT GateWay (GW) is implemented with a
Raspberry Pi 4 with routing and WiFi access point services enabled.
The gateway was connected to the core network via cable.
• The Anomaly Detector is implemented using One-Class SVM, as shown
in Section 5.2.5; it is the most appropriate method with the highest
accuracy and identification results.
• The Attack Classifier is implemented using Enhanced Random
Forests, as described in Section 5.2.4; it is the method with the
highest accuracy and classification outcomes.
• The Threat Intelligence Federated Learning Server / cloud server
model is implemented as another model that runs on an AWS cloud
server. The FL implementation is shown in Section 4.5.
Additionally, in this section, we evaluate the FL approach with
respect to the delay incurred in calculating the deltas. Figure 16
shows that the investigated FL approach needs very little time to
evaluate the differences in the Enhanced Random Forests for forests
ranging from 1 to 1000 decision trees. Overall, our system, which
uses Federated Learning, is very light and fast. As shown in Section
2.2.6, the overall complexity is O(n).
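The delta calculation evaluated here can be illustrated with a minimal sketch: each decision tree is traversed depth-first to collect its set of edges, and the delta of a tree is the set of edges present in the running model but absent from the saved one. The data layout (a dictionary mapping each vertex to its children, with unique vertex identifiers) and all names are assumptions made for this example and do not reflect the serialized model format used in our implementation.

```python
def collect_edges(tree, root):
    """Depth-first traversal returning the set of (father, child) edges."""
    edges, visited = set(), set()
    stack = [(None, root)]
    while stack:
        father, vertex = stack.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        if father is not None:
            edges.add((father, vertex))
        for child in tree.get(vertex, []):
            stack.append((vertex, child))
    return edges


def forest_delta(saved_forest, running_forest):
    """Per-tree edges that exist in the running model but not in the saved one."""
    return [
        collect_edges(run_tree, run_root) - collect_edges(old_tree, old_root)
        for (old_tree, old_root), (run_tree, run_root) in zip(saved_forest, running_forest)
    ]


# Hypothetical one-tree forests: vertex "c" and its leaf "d" appear after retraining.
saved = [({"r": ["a", "b"]}, "r")]
running = [({"r": ["a", "c"], "c": ["d"]}, "r")]
delta = forest_delta(saved, running)
print(delta)
```

Each vertex and edge is visited once, so the cost per tree grows linearly with the tree size, which is consistent with the O(n) behaviour stated above.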
5.3.1. Explanation of the Federated Learning Approach
with Cosine Similarity Index
In our research, we have implemented the concept of
federated learning in a novel approach by enhancing Ran-
dom Forests, diverging from the conventional use of neu-
ral networks. This innovation is structured around the
Depth-First Search (DFS) algorithm, which calculates the
differences between old and new models within the feder-
ated learning framework. Specifically, the DFS algorithm
Figure 16: Delta Calculation execution time
assesses the Enhanced Random Forest models by subtract-
ing the traversal result set of the old model from the new
model. This process allows for an efficient and effective
way to capture model updates, which are then communi-
cated to other devices through a cloud-based model, fo-
cusing on the delta changes. To explain and interpret the
proposed algorithm, we have utilized the Cosine Similarity
Index 4. This metric has been instrumental in measuring
the degree of change or similarity between model updates
across different nodes in the federated network. Using the
Cosine Similarity Index aids in ensuring the consistency
and relevance of updates shared among devices, providing
a quantitative measure to assess the alignment in learning
patterns across the distributed system. Figure 17 plots
Cosine Similarity against Communication Rounds as part
of an investigation into Federated Learning. Cosine Sim-
ilarity is a measure that quantifies the similarity between
two vectors, which, in the context of Federated Learning,
typically represents the alignment of model parameters up-
dated during training across different nodes. Communica-
tion rounds in Federated Learning refer to the iterative
process where multiple nodes (such as mobile devices or
distributed servers) each compute an update to the model
based on their local data and then send it to a central
server. The server aggregates these updates to improve
the global model, which is then sent back to the nodes for
further training. This process repeats for many rounds to
refine the model progressively. In the graph, the Cosine
Similarity starts at a value of 1, indicating that initially
all nodes have identical or maximally similar model
parameters. As the number of communication rounds increases,
the Cosine Similarity fluctuates slightly. Generally, it
remains high, suggesting that the model parameters across
different nodes remain similar even as local updates are
applied. This can indicate that, despite the diversity of
local data and experiences, the nodes are learning
consistently. The relatively stable Cosine Similarity across
the communication rounds suggests that Federated Learning
successfully integrates diverse local updates without
significant divergence, which is a desirable outcome for
such systems. This stability is key to ensuring that the
global model benefits from the unique contributions of all
participating nodes while maintaining a level of coherence
and generalizability.
4 The Cosine Similarity Index is a measure used to determine the
similarity between two vectors in a multi-dimensional space. It is
calculated by finding the cosine of the angle between these vectors.
In federated learning, this index helps compare model updates from
different nodes. A high value indicates similar learning patterns
across nodes, while a low value points to varied learning or data
distributions. This understanding is essential for consistent and
effective learning in a federated system. The formula for the Cosine
Similarity Index between two vectors A and B is given by:
\[ \text{Cosine Similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} \]
In this formula, A · B represents the dot product of vectors A and B,
while ‖A‖ and ‖B‖ denote the magnitudes (or norms) of vectors A
and B, respectively.
Figure 17: The Cosine Similarity Index
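The Cosine Similarity Index defined in footnote 4 can be
illustrated with a minimal Python sketch. It assumes that each
node's model update has already been flattened into a
fixed-length numeric vector; the paper does not specify this
encoding, and the function name cosine_similarity is purely
illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return (A . B) / (||A|| * ||B||) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle is undefined when either vector has zero magnitude.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Under this definition, two updates that point in the same
direction score close to 1 regardless of their magnitude,
orthogonal updates score 0, and opposite updates score -1.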
5.4. Time Complexity of the Proposed Scheme
According to Sections 2.2.5 and 2.2.6, the computational
complexity of our Federated Learning approach is aligned
with that of Depth-First Search (DFS). The computational
complexity of DFS is O(V+E) for a graph with V vertices and
E edges. It simplifies to O(n) for a tree with n nodes,
owing to the O(1) complexity of set operations [48].
In contrast, our Enhanced Random Forest (ERF) algorithm,
which handles a dataset with 25 features, follows a
complexity of O(K·N·log(N)·F), where, in the usual
formulation, K is the number of trees, N the number of
training instances, and F the number of features. Treating
K and F as constants, this complexity further simplifies to
O(N·log(N)) for our dataset comprising 19,208 training
instances, as detailed in Table 7. Therefore, the overall
system complexity, which combines DFS and ERF in the
Federated Learning framework, is O(n)+O(N·log(N)), covering
both the tree traversal and the ensemble learning steps.
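The linear-time traversal and set subtraction discussed above
can be sketched in Python as follows. This is an illustration
under stated assumptions, not the authors' implementation: the
Node class, the heap-style position index, and the function
names dfs_signatures and tree_delta are all hypothetical, and
a single decision tree stands in for one member of the forest.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

# (position, feature, threshold, label) identifies one node of a tree.
Signature = Tuple[int, Optional[int], Optional[float], Optional[str]]


@dataclass(frozen=True)
class Node:
    """Hypothetical tree node: a split (feature, threshold) or a leaf (label)."""
    feature: Optional[int] = None
    threshold: Optional[float] = None
    label: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def dfs_signatures(root: Optional[Node]) -> Set[Signature]:
    """Iterative DFS that records one signature per visited node."""
    signatures: Set[Signature] = set()
    stack = [(root, 1)]  # heap-style index: root = 1, children = 2i and 2i + 1
    while stack:
        node, position = stack.pop()
        if node is None:
            continue
        signatures.add((position, node.feature, node.threshold, node.label))
        stack.append((node.right, 2 * position + 1))
        stack.append((node.left, 2 * position))
    return signatures


def tree_delta(old_root: Optional[Node], new_root: Optional[Node]) -> Set[Signature]:
    """Signatures in the new tree that are absent from, or changed since, the old tree."""
    return dfs_signatures(new_root) - dfs_signatures(old_root)
```

Each node is visited once and each set insertion or lookup
takes O(1) time on average, so the sketch runs in time linear
in the number of nodes. Only the returned delta would need to
be transmitted, rather than the full model.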
6. Conclusions and Future Work
This research presents a novel three-stage strategy for
detecting intrusions on medical Internet of Things (MIoT)
devices at the sinkhole, utilizing a green machine learning
methodology. In the first stage, the proposed method
employed supervised machine learning classification
techniques. These techniques were effectively trained and
implemented on a Raspberry Pi sinkhole functioning as
a gateway for the Internet of Things (IoT) devices. The
sinkhole in our proposed architectural design has identi-
fied instances of Man-in-the-Middle (MitM), Distributed
Denial of Service (DDoS), Brute Force Authentication,
and NMAP attacks. A network emulation was conducted
using the KALI penetration testing software to generate
a dataset. Initially stored as a PCAP file, the recently
generated dataset was converted into a CSV file format.
This conversion involved utilizing the "CICFlowmeter" tool,
which ensured that the resulting information contained
appropriate columnar features. This dataset, suitable for
employment in our machine learning methodologies, will
be accessible to the public. The classifiers Support Vector
Machines, Naive Bayes, K-Nearest Neighbors, Decision Tree,
and Enhanced Random Forest were employed to classify known
attacks in the evaluation. The Enhanced Random Forest
achieved the best overall accuracy, at 99.98%. Subsequently,
in the second stage, the One-Class Support Vector Machine
(SVM) algorithm was employed to detect unidentified attacks
such as Eavesdropping, Physical Tampering, and Device
Impersonation. This approach, which is supported by existing
scholarly works, proved to be the most suitable methodology
for this task, with an accuracy of 99.7%. The
results obtained from our emulation demonstrate that all
the strategies under investigation exhibit notable perfor-
mance in terms of accuracy, precision, recall metrics, and
the F1 score. Ultimately, utilizing the acquired outcomes,
we have successfully proven that all the examined machine
learning methodologies can promptly and precisely iden-
tify all forms of attacks while exhibiting reduced utilization
of memory, CPU, and power compared to alternative deep
learning models. Furthermore, our study demonstrated
that the Enhanced Random Forest algorithm exhibits su-
perior accuracy and is considered one of the most effi-
cient machine learning techniques regarding energy, CPU,
and memory utilization compared to other approaches.
Therefore, the methodology mentioned above will be im-
plemented in the cloud-based machine learning model for
training sinkholes/gateways in the MIoT network. This
training will incorporate the most recent updates of nodes
and trees in the Enhanced Random Forest algorithm to
enhance the identification of new attacks by utilizing
Federated Learning techniques. As demonstrated, implementing
Federated Learning results in minimal execution time for
model updates and for the transmission of differences.
Finally, we form a complete, secure ERP system with the
solutions provided.
In subsequent research, we intend to augment our dataset
by incorporating additional attack types. We aim to assess
the overall effectiveness of the cyber threat intelligence
system in generating precise models for recently identified
attacks while minimizing the time delay. We also intend to
broaden the intrusion detection perimeter to encompass cloud
servers and data storage units designed for the IoT within
the context of the Internet of Military Things. In addition,
we will extend the existing system model under investigation
to incorporate several designs for managing the Internet of
Things. Two further areas are critical for future work.
Firstly, investigating the scalability of the proposed
Intrusion Detection System (IDS) across diverse and larger
MIoT networks is crucial to gaining deeper insights into its
adaptability and robustness. Secondly, real-world field
tests should be conducted to evaluate the practical
applicability and effectiveness of the IDS in various
healthcare settings. These areas of future work aim to
expand the scope and impact of the research, thereby
contributing to the development of more secure and
efficient MIoT networks.
Appendix A. Glossary of Terms and Abbreviations
This appendix lists the abbreviations used in the paper.
Table A.15 gives each abbreviation alongside its description.
Table A.15: Abbreviations and Descriptions
Abbrev. Description
6LoWPAN IPv6 over Low-Power Wireless Personal Area Networks
ANN Artificial Neural Network
API Application Programming Interface
AUC Area Under Curve
CA Certificate Authority
CHS Connected Healthcare System
CNN Convolutional Neural Networks
DAE Deep Auto Encoder
DDoS Distributed Denial of Service
DFFNN Deep Feed Forward Neural Network
DFS Depth-First Search
DL Deep Learning
DNN Deep Neural Networks
DoS Denial of Service
DT Decision Trees
EMQX EMQ X MQTT Broker
EOS-ELM Ensemble of Online Sequential Extreme Learning Machine
ERF Enhanced Random Forests
ESFCM Extreme Learning Machine (ELM)-based Semi-supervised Fuzzy C-means
FL Federated Learning
FN False Negatives
FNR False Negative Rate
FP False Positives
FPR False Positive Rate
GEMLIDS Green Effective Machine Learning Intrusion Detection System
GW Gateway
GWO Grey Wolf Optimization
HEKA IDS name
HIDS Host-based IDS
HTML Hyper Text Markup Language
HTTP HyperText Transfer Protocol
ICMP Internet Control Message Protocol
ICS Industrial Control Systems
IDS Intrusion Detection Systems
IICS Integrated ICS
IIoT Industrial Internet of Things
IoMT Internet of Medical Things
IoT Internet of Things
IP Internet Protocol
JSON JavaScript Object Notation
KNN K-Nearest Neighbors
LAN Local Area Network
LDA Linear Discriminant Analysis
LSTM Long Short-Term Memory
MIOT Medical IoT
MIoT Medical Internet of Things
MitM Man-in-the-Middle
ML Machine Learning
MPC Multi-Party Computation
MQTT Message Queuing Telemetry Transport
MSGG Message
NB Naive Bayes
N-gram A Sequence of N Words
NIDS Network Intrusion Detection System
NMAP Network Mapper
P Precision
PCA Principal Component Analysis
PCAP Packet Captures
PMD Patient Monitoring Devices
Probe Prospective Randomized Open Blinded End-Point
R Recall
R2L Remote to User
REST Representational State Transfer
RF Random Forest
RNN Random Neural Networks
ROC Receiver Operator Characteristic
SCTP Stream Control Transmission Protocol
SMOTE Synthetic Minority Oversampling Technique
SOAP Simple Object Access Protocol
SVM Support Vector Machine
SVMs Support Vector Machines
SYN Flooding Synchronization Flooding
TCP Transmission Control Protocol
TDTC Two-tier Classification Model
TN True Negatives
TP True Positives
TPR True Positive Rate
U2R User to Root Attacks
UDP User Datagram Protocol
UNSW-NB15 University of New South Wales-NB15 dataset
WPAN Wireless Personal Area Network
XGBoost Extreme Gradient Boosting
YANG Yet Another Next Generation
Acknowledgement
This research is part of a project that has received
funding from the European Union’s Horizon 2020 research
and innovation program under grant agreement Nº739578
and the government of the Republic of Cyprus through
the Directorate General for European Programmes, Coor-
dination, and Development. The research has also been
supported by a research grant from Middle East Technical
University’s Scientific Research Projects office under grant
number GAP-312-2020-10297.
References
[1] S. Gao, G. Thamilarasu, Machine-learning classifiers for secu-
rity in connected medical devices, in: 2017 26th International
Conference on Computer Communication and Networks (IC-
CCN), 2017, pp. 1–5. doi:10.1109/ICCCN.2017.8038507.
[2] D. He, Q. Qiao, Y. Gao, J. Zheng, S. Chan, J. Li, N. Guizani,
Intrusion detection based on stacked autoencoder for con-
nected healthcare systems, IEEE Network 33 (6) (2019) 64–69.
doi:10.1109/MNET.001.1900105.
[3] A. I. Newaz, A. K. Sikder, L. Babun, A. S. Uluagac,
Heka: A novel intrusion detection system for attacks to per-
sonal medical devices, in: 2020 IEEE Conference on Com-
munications and Network Security (CNS), 2020, pp. 1–9.
doi:10.1109/CNS48642.2020.9162311.
[4] A. Odesile, G. Thamilarasu, Distributed intrusion detection us-
ing mobile agents in wireless body area networks, in: 2017 Sev-
enth International Conference on Emerging Security Technolo-
gies (EST), 2017, pp. 144–149. doi:10.1109/EST.2017.8090414.
[5] S. P. R.M., P. K. R. Maddikunta, P. M., S. Koppu, T. R.
Gadekallu, C. L. Chowdhary, M. Alazab, An effective feature en-
gineering for dnn using hybrid pca-gwo for intrusion detection in
iomt architecture, Computer Communications 160 (2020) 139–
149. doi:https://doi.org/10.1016/j.comcom.2020.05.048.
URL https://www.sciencedirect.com/science/article/pii/
S014036642030298X
[6] P. Kumar, G. P. Gupta, R. Tripathi, An ensemble learning and
fog-cloud architecture-driven cyber-attack detection framework
for iomt networks, Computer Communications 166 (2021) 110–
124. doi:https://doi.org/10.1016/j.comcom.2020.12.003.
URL https://www.sciencedirect.com/science/article/pii/
S0140366420320090
[7] S.-R. J.-H.-P. Jae-Dong-Lee Hyo-Soung-Cha, M-idm: A multi-
classification based intrusion detection model in healthcare iot,
Computers, Materials & Continua 67 (2) (2021) 1537–1553.
doi:10.32604/cmc.2021.014774.
URL http://www.techscience.com/cmc/v67n2/41342
[8] A. A. Hady, A. Ghubaish, T. Salman, D. Unal, R. Jain, In-
trusion detection system for healthcare systems using medical
and network data: A comparison study, IEEE Access 8 (2020)
106576–106584. doi:10.1109/ACCESS.2020.3000421.
[9] I. Alrashdi, A. Alqazzaz, R. Alharthi, E. Aloufi, M. A.
Zohdy, H. Ming, Fbad: Fog-based attack detection for
iot healthcare in smart cities, in: 2019 IEEE 10th
Annual Ubiquitous Computing, Electronics Mobile Com-
munication Conference (UEMCON), 2019, pp. 0515–0522.
doi:10.1109/UEMCON47517.2019.8992963.
[10] S. Khan, A. Akhunzada, A hybrid dl-driven intelligent sdn-
enabled malware detection framework for internet of medical
things (iomt), Computer Communications 170 (2021) 209–216.
doi:https://doi.org/10.1016/j.comcom.2021.01.013.
URL https://www.sciencedirect.com/science/article/pii/
S0140366421000347
[11] A. Raza, K. P. Tran, L. Koehl, S. Li, AnoFed: Adaptive
anomaly detection for digital health using transformer-based
federated learning and support vector data description, Engi-
neering Applications of Artificial Intelligence 121 (May 2022)
(2023) 106051. doi:10.1016/j.engappai.2023.106051.
URL https://doi.org/10.1016/j.engappai.2023.106051
[12] P. Kasinathan, C. Pastrone, M. A. Spirito, M. Vinkovits,
Denial-of-service detection in 6lowpan based internet of things,
in: 2013 IEEE 9th international conference on wireless and
mobile computing, networking and communications (WiMob),
IEEE, 2013, pp. 600–607.
[13] A.-H. Muna, N. Moustafa, E. Sitnikova, Identification of ma-
licious activities in industrial internet of things based on deep
learning models, Journal of Information Security and Applica-
tions 41 (2018) 1–11.
[14] D. Oh, D. Kim, W. Ro, A malicious pattern detection engine
for embedded security systems in the internet of things, Sensors
14 (12) (2014) 24188–24211.
[15] S. Evmorfos, G. Vlachodimitropoulos, N. Bakalos, E. Gelenbe,
Neural network architectures for the detection of syn flood at-
tacks in iot systems, Proceedings of the 13th ACM International
Conference on PErvasive Technologies Related to Assistive En-
vironments (PETRA’20), Association for Computing Machin-
ery, New York, NY, USA, 2020. doi:10.1145/3389189.3398000.
URL https://doi.org/10.1145/3389189.3398000
[16] Y. N. Soe, Y. Feng, P. I. Santosa, R. Hartanto, K. Saku-
rai, Machine learning-based iot-botnet attack detection
with sequential architecture, Sensors 20 (16) (2020) 4372.
doi:10.3390/s20164372.
URL http://dx.doi.org/10.3390/s20164372
[17] S. Rathore, J. H. Park, Semi-supervised learning
based distributed attack detection framework for
iot, Applied Soft Computing 72 (2018) 79–89.
doi:https://doi.org/10.1016/j.asoc.2018.05.049.
URL http://www.sciencedirect.com/science/article/pii/
S1568494618303508
[18] E. J. Cho, J. H. Kim, C. S. Hong, Attack model and detec-
tion scheme for botnet on 6lowpan, in: Asia-Pacific Network
Operations and Management Symposium, Springer, 2009, pp.
515–518.
[19] N. K. Thanigaivelan, E. Nigussie, R. K. Kanth, S. Virtanen,
J. Isoaho, Distributed internal anomaly detection system for
internet-of-things, in: 2016 13th IEEE Annual Consumer Com-
munications & Networking Conference (CCNC), IEEE, 2016,
pp. 319–320.
[20] D. H. Summerville, K. M. Zach, Y. Chen, Ultra-lightweight deep
packet anomaly detection for internet of things devices, in: 2015
IEEE 34th International Performance Computing and Commu-
nications Conference (IPCCC), IEEE, 2015, pp. 1–8.
[21] T.-H. Lee, C.-H. Wen, L.-H. Chang, H.-S. Chiang, M.-C.
Hsieh, A lightweight intrusion detection scheme based on en-
ergy consumption analysis in 6lowpan, in: Advanced Technolo-
gies, Embedded and Multimedia for Human-centric Computing,
Springer, 2014, pp. 1205–1213.
[22] P. Pongle, G. Chavan, Real time intrusion and wormhole attack
detection in internet of things, International Journal of Com-
puter Applications 121 (9) (2015).
[23] S. Zhao, W. Li, T. Zia, A. Y. Zomaya, A dimension reduction
model and classifier for anomaly-based intrusion detection in in-
ternet of things, in: 2017 IEEE 15th Intl Conf on Dependable,
Autonomic and Secure Computing, 15th Intl Conf on Perva-
sive Intelligence and Computing, 3rd Intl Conf on Big Data
Intelligence and Computing and Cyber Science and Technol-
ogy Congress (DASC/PiCom/DataCom/CyberSciTech), IEEE,
2017, pp. 836–843.
[24] H. H. Pajouh, R. Javidan, R. Khayami, D. Ali, K.-K. R.
Choo, A two-layer dimension reduction and two-tier classifica-
tion model for anomaly-based intrusion detection in iot back-
bone networks, IEEE Transactions on Emerging Topics in Com-
puting (2016).
[25] M. J. Idrissi, H. Alami, A. El Mahdaouy, A. El Mekki,
S. Oualil, Z. Yartaoui, I. Berrada, Fed-ANIDS: Federated
learning for anomaly-based network intrusion detection sys-
tems, Expert Systems with Applications 234 (June) (2023).
doi:10.1016/j.eswa.2023.121000.
[26] X. Wang, Y. Wang, Z. Javaheri, L. Almutairi, N. Moghadamne-
jad, O. S. Younes, Federated deep learning for anomaly
detection in the internet of things, Computers and
Electrical Engineering 108 (March) (2023) 108651.
doi:10.1016/j.compeleceng.2023.108651.
URL https://doi.org/10.1016/j.compeleceng.2023.108651
[27] B. Weinger, J. Kim, A. Sim, M. Nakashima, N. Moustafa, K. J.
Wu, Enhancing IoT anomaly detection performance for fed-
erated learning, Digital Communications and Networks 8 (3)
(2022) 314–323. doi:10.1016/j.dcan.2022.02.007.
URL https://doi.org/10.1016/j.dcan.2022.02.007
[28] H. Alaiz-Moreton, J. Aveleira-Mata, J. Ondicol-Garcia, A. L.
Muñoz-Castañeda, I. García, C. Benavides, Multiclass classification
procedure for detecting attacks on mqtt-iot protocol,
Complexity 2019 (2019) 1–12.
[29] C. Wang, Y. Sun, S. Lv, C. Wang, H. Liu, B. Wang, Intrusion
detection system based on one-class support vector machine and
gaussian mixture model, Electronics 12 (4) (2023) 930.
[30] E. Borgia, The internet of things vision: Key features, appli-
cations and open issues, Computer Communications 54 (2014)
1–31.
[31] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari,
M. Ayyash, Internet of things: A survey on enabling technolo-
gies, protocols, and applications, IEEE communications surveys
& tutorials 17 (4) (2015) 2347–2376.
[32] R. T. Fielding, Rest: architectural styles and the design of
network-based software architectures, Doctoral dissertation,
University of California (2000).
[33] P. Charles, L. Rabejac, Secure communications and man-in-
the-middle, in: International Workshop on Security Protocols,
Springer, 2002, pp. 31–37.
[34] J. Mirkovic, P. Reiher, F. Shepherd, Modeling and defending
against ddos attacks, Proceedings of the IEEE 92 (2) (2004)
317–331.
[35] M. Nitta, A. Hirano, M. Miura, Efficient brute-force attack
search algorithms, in: Proceedings of the 2009 ACM Workshop
on Cloud Computing Security, ACM, 2009, pp. 13–18.
[36] S. Raghavan, A. Goel, R. T. Rajan, C. V. Hota, Real-time de-
tection of nmap scans, in: Proceedings of the IEEE/RSJ Inter-
national Conference on Intelligent Robots and Systems (IROS),
IEEE, 2016, pp. 2615–2620.
[37] V. Jakkula, Tutorial on support vector machine (svm), School
of EECS, Washington State University 37 (2.5) (2006) 3.
[38] A. D. Shieh, D. F. Kamm, Ensembles of one class support vec-
tor machines, in: International workshop on multiple classifier
systems, Springer, 2009, pp. 181–190.
[39] S. Dreiseitl, M. Osl, C. Scheibböck, M. Binder, Outlier detection
with one-class svms: an application to melanoma prognosis,
in: AMIA annual symposium proceedings, Vol. 2010, American
Medical Informatics Association, 2010, p. 172.
[40] N. Shahid, I. H. Naqvi, S. B. Qaisar, One-class support vec-
tor machines: analysis of outlier detection for wireless sensor
networks in harsh environments, Artificial Intelligence Review
43 (4) (2015) 515–563.
[41] C. Lu, J. Huang, L. Huang, Detecting urban anomalies using
factor analysis and one class support vector machine, The Com-
puter Journal 66 (2) (2023) 373–383.
[42] O. Kramer, K-nearest neighbors, Dimensionality
reduction with unsupervised nearest neighbors (2013) 13–23.
[43] A. J. Myles, R. N. Feudale, Y. Liu, N. A. Woody, S. D. Brown,
An introduction to decision tree modeling, Journal of Chemo-
metrics: A Journal of the Chemometrics Society 18 (6) (2004)
275–285.
[44] G. I. Webb, E. Keogh, R. Miikkulainen, Naïve bayes, Encyclopedia
of machine learning 15 (1) (2010) 713–714.
[45] A. Cutler, D. R. Cutler, J. R. Stevens, Random forests, Ensem-
ble machine learning: Methods and applications (2012) 157–
175.
[46] S. Bernard, L. Heutte, S. Adam, On the selection of decision
trees in random forests, in: 2009 International joint conference
on neural networks, IEEE, 2009, pp. 302–307.
[47] L. Breiman, Random forests, Mach. Learn. 45 (1) (2001) 5–32.
doi:10.1023/A:1010933404324.
URL https://doi.org/10.1023/A:1010933404324
[48] I. Chivers, J. Sleightholme, An introduction to algorithms
and the big o notation, Introduction
to Programming with Fortran: With Coverage of Fortran 90,
95, 2003, 2008 and 77 (2015) 359–364.
[49] L. Breiman, Random forests, Machine Learning 45 (1) (2001)
5–32.
[50] Z. Chai, C. Zhao, Enhanced random forest with concurrent
analysis of static and dynamic nodes for industrial fault clas-
sification, IEEE Transactions on Industrial Informatics 16 (1)
(2019) 54–66.
[51] A. Liaw, M. Wiener, Classification and regression by random-
forest, R news 2 (3) (2002) 18–22.
[52] C. Chen, A. Liaw, L. Breiman, Random forests for imbalanced
data, in: Proceedings of the International Conference on Ma-
chine Learning (ICML), 2010.
[53] C. Chen, A. Liaw, L. Breiman, Using random forest to learn im-
balanced data, in: Proceedings of the International Conference
on Machine Learning (ICML), 2004.
[54] J. Bergstra, Y. Bengio, Random search for hyper-parameter op-
timization, in: Proceedings of the International Conference on
Machine Learning (ICML), 2012.
[55] D. Chicco, L. Oneto, An enhanced random forests approach
to predict heart failure from small imbalanced gene expression
data, IEEE/ACM Transactions on Computational Biology and
Bioinformatics 18 (6) (2020) 2759–2765.
[56] Y. Liu, J. Chen, Z. Su, Z. Luo, N. Luo, L. Liu, K. Zhang, Robust
head pose estimation using dirichlet-tree distribution enhanced
random forests, Neurocomputing 173 (2016) 42–53.
[57] D. Amaratunga, J. Cabrera, Y. S. Lee, Enriched random forests,
Bioinformatics 24 (18) (2008) 2010–2014.
[58] X. Liu, T. Fu, Z. Pan, D. Liu, W. Hu, J. Liu, K. Zhang, Au-
tomated layer segmentation of retinal optical coherence tomog-
raphy images using a deep feature enhanced structured random
forests classifier, IEEE journal of biomedical and health infor-
matics 23 (4) (2018) 1404–1416.
[59] Z. Yang, M. Chen, K.-K. Wong, H. V. Poor, S. Cui,
Federated learning for 6g: Applications, challenges,
and opportunities, Engineering 8 (2022) 33–41.
doi:https://doi.org/10.1016/j.eng.2021.12.002.
URL https://www.sciencedirect.com/science/article/pii/
S2095809921005245
[60] J. Konečný, B. McMahan, D. Ramage, Federated optimization:
Distributed optimization beyond the datacenter, arXiv preprint
arXiv:1511.03575 (2015).
[61] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis,
A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cum-
mings, et al., Advances and open problems in federated learning,
Foundations and Trends® in Machine Learning 14 (1–2) (2021)
1–210.
[62] E. Diao, J. Ding, V. Tarokh, Heterofl: Computation and com-
munication efficient federated learning for heterogeneous clients,
arXiv preprint arXiv:2010.01264 (2020).
[63] B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Ar-
cas, Communication-efficient learning of deep networks from de-
centralized data, in: Artificial intelligence and statistics, PMLR,
2017, pp. 1273–1282.
[64] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B.
McMahan, S. Patel, D. Ramage, A. Segal, K. Seth, Practical
secure aggregation for privacy-preserving machine learning, in:
proceedings of the 2017 ACM SIGSAC Conference on Computer
and Communications Security, 2017, pp. 1175–1191.
[65] H. Li, J. Konečný, P. Richtárik, A. Sahu, C. Zhao, Federated
quantization for communication-efficient collaborative learning,
arXiv preprint arXiv:2007.07404 (2020).
[66] R. Hu, Y. Guo, H. Li, Q. Pei, Y. Gong, Personalized feder-
ated learning with differential privacy, IEEE Internet of Things
Journal 7 (10) (2020) 9530–9539.
[67] J. R. Vacca, Computer and information security handbook,
Newnes, 2012.
[68] S. Raza, L. Wallgren, T. Voigt, Svelte: Real-time intrusion de-
tection in the internet of things, Ad hoc networks 11 (8) (2013)
2661–2674.
[69] C. Cervantes, D. Poplade, M. Nogueira, A. Santos, Detection
of sinkhole attacks for supporting secure routing on 6lowpan
for internet of things, in: 2015 IFIP/IEEE International Sym-
posium on Integrated Network Management (IM), IEEE, 2015,
pp. 606–611.
[70] L. Wallgren, S. Raza, T. Voigt, Routing attacks and counter-
measures in the rpl-based internet of things, International Jour-
nal of Distributed Sensor Networks 9 (8) (2013) 794326.
[71] M. H. Ali, M. M. Jaber, S. K. Abd, A. Rehman, M. J. Awan,
R. Damaševičius, S. A. Bahaj, Threat analysis and distributed
denial of service (ddos) attack recognition in the internet of
things (iot), Electronics 11 (3) (2022) 494.
[72] B. Bhushan, G. Sahoo, A. K. Rai, Man-in-the-middle attack
in wireless and computer networking—a review, in: 2017 3rd
International Conference on Advances in Computing, Commu-
nication & Automation (ICACCA)(Fall), IEEE, 2017, pp. 1–6.
[73] M. M. Alani, Detection of reconnaissance attacks on iot devices
using deep neural networks, in: Advances in Nature-Inspired
Cyber Security and Resilience, Springer, 2021, pp. 9–27.
[74] X. Gong, Y. Chen, H. Huang, Y. Liao, S. Wang, Q. Wang,
Coordinated backdoor attacks against federated learning with
model-dependent triggers, IEEE network 36 (1) (2022) 84–90.
[75] N. Bouacida, P. Mohapatra, Vulnerabilities in federated learn-
ing, IEEE Access 9 (2021) 63229–63249.
[76] V. Tolpegin, S. Truex, M. E. Gursoy, L. Liu, Data poison-
ing attacks against federated learning systems, in: Computer
Security–ESORICS 2020: 25th European Symposium on Re-
search in Computer Security, ESORICS 2020, Guildford, UK,
September 14–18, 2020, Proceedings, Part I 25, Springer, 2020,
pp. 480–501.
[77] D. G. Altman, Categorising continuous variables., British jour-
nal of cancer 64 (5) (1991) 975.
[78] W.-Y. Loh, Improving the precision of classification trees, The
Annals of Applied Statistics (2009) 1710–1737.
[79] D. Huang, R. Li, H. Wang, Feature screening for ultrahigh di-
mensional categorical data with applications, Journal of Busi-
ness & Economic Statistics 32 (2) (2014) 237–244.
[80] G. E. A. P. A. Batista, R. C. Prati, M. C. Monard, A study
of the behavior of several methods for balancing machine learn-
ing training data, SIGKDD Explor. Newsl. 6 (1) (2004) 20–29.
doi:10.1145/1007730.1007735.
URL https://doi.org/10.1145/1007730.1007735
[81] N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer,
Smote: Synthetic minority over-sampling technique, J. Artif.
Int. Res. 16 (1) (2002) 321–357.
[82] J. D. Rodriguez, A. Perez, J. A. Lozano, Sensitivity analysis
of k-fold cross validation in prediction error estimation, IEEE
transactions on pattern analysis and machine intelligence 32 (3)
(2009) 569–575.
[83] G. N. Ahmad, H. Fatima, S. Ullah, A. S. Saidi, et al., Efficient
medical diagnosis of human heart diseases using machine learn-
ing techniques with and without gridsearchcv, IEEE Access 10
(2022) 80151–80173.
[84] M. Vishnu, V. V. Rupak, S. Vedhapriyaa, M. Sangeetha,
R. Manjuladevi, C. Sagana, Recurrent gastric cancer prediction
using randomized search cv optimizer, in: 2023 International
Conference on Computer Communication and Informatics (IC-
CCI), IEEE, 2023, pp. 1–5.
[85] D. Ravikumar, Towards Enhancement of Machine Learning
Techniques Using CSE-CIC-IDS2018 Cybersecurity Dataset,
Rochester Institute of Technology, 2021.
[86] P. Bisen, A. Vishwakarma, et al., Machine learning based intru-
sion detection from wireless sensor network over nsl-kdd dataset,
IJRAR-International Journal of Research and Analytical Re-
views (IJRAR) 7 (1) (2020) 683–688.
[87] J. H. Anajemba, C. Iwendi, I. Razzak, J. A. Ansere, I. M. Ok-
palaoguchi, A counter-eavesdropping technique for optimized
privacy of wireless industrial iot communications, IEEE Trans-
actions on Industrial Informatics 18 (9) (2022) 6445–6454.
[88] P. Varga, S. Plosz, G. Soos, C. Hegedus, Security threats and is-
sues in automation iot, in: 2017 IEEE 13th International Work-
shop on Factory Communication Systems (WFCS), IEEE, 2017,
pp. 1–6.
[89] S. A. Chaudhry, K. Yahya, F. Al-Turjman, M.-H. Yang, A se-
cure and reliable device access control scheme for iot based sen-
sor cloud systems, IEEE Access 8 (2020) 139244–139254.
[90] Y. Kim, S. Hakak, A. Ghorbani, Ddos attack dataset (ci-
cev2023) against ev authentication in charging infrastructure,
in: Proceedings of the 20th International Conference on Pri-
vacy, Security, and Trust (PST2023), Copenhagen, Denmark,
2023.
... Nowadays, privacy protection provided by FL is more robust and efficient than was the case with the traditional ML model, while FL still ensures a similar prediction accuracy. As stated in a survey study [11], recent security defense instances [12][13][14][15][16][17][18] are also being developed based on the FL framework, including federated learning-based intrusion detection systems and anomaly detection. In the Internet of Things (IoT) Industry 4.0, the authors in [14] proposed a collaborative intrusion detection system (IDS) in which there are filters performing a deep neural network (DNN) and a central server collecting the filters' DNN parameters to generate a global model. ...
Article
Full-text available
In this study, we introduce a novel collaborative federated learning (FL) framework, aiming at enhancing robustness in distributed learning environments, particularly pertinent to IoT and industrial automation scenarios. At the core of our contribution is the development of an innovative grouping algorithm for edge clients. This algorithm employs a distinctive ID distribution function, enabling efficient and secure grouping of both normal and potentially malicious clients. Our proposed grouping scheme accurately determines the numerical difference between normal and malicious groups under various network scenarios. Our method addresses the challenge of model poisoning attacks, ensuring the accuracy of outcomes in a collaborative federated learning framework. Our numerical experiments demonstrate that our grouping scheme effectively limits the number of malicious groups. Additionally, our collaborative FL framework has shown resilience against various levels of poisoning attack abilities and maintained high prediction accuracy across a range of scenarios, showcasing its robustness against poisoning attacks.
Article
Full-text available
Intrusion detection systems (IDSs) play a significant role in the field of network security, dealing with the ever-increasing number of network threats. Machine learning-based IDSs have attracted a lot of interest owing to their powerful data-driven learning capabilities. However, it is challenging to train the supervised learning algorithms when there are no attack data at hand. Semi-supervised anomaly detection algorithms, which train the model with only normal data, are more suitable. In this study, we propose a novel semi-supervised anomaly detection-based IDS that leverages the capabilities of representation learning and two anomaly detectors. In detail, the autoencoder (AE) is applied to extract representative features of normal data in the first step, and then two semi-supervised detectors, the one-class support vector machine (OCSVM) and Gaussian mixture model (GMM), are trained on the derived features. The two detectors collaborate to detect anomalous samples. The OCSVM predicts the abnormal samples initially, and after that, the GMM is applied to recheck the misclassified samples further. The experiments demonstrate that the AE improves the detection rate, and two detectors are more promising than a single one.
Article
Predicting cardiac disease is considered one of the most challenging tasks in the medical field. Identifying its causes takes considerable time and effort, even for doctors and other medical experts. In this paper, various machine learning algorithms such as LR, KNN, SVM, and GBC, together with GridSearchCV, are used to predict cardiac disease. The system uses 5-fold cross-validation for verification. A comparative study of these four methods is given. Two datasets, one combining Cleveland, Hungary, Switzerland, and Long Beach V and one from UCI Kaggle, are used to analyze the models' performance. The analysis finds that the Extreme Gradient Boosting Classifier with GridSearchCV gives the highest and nearly comparable testing and training accuracies of 100% and 99.03% for both datasets (Hungary, Switzerland & Long Beach V and UCI Kaggle). Moreover, the XGBoost Classifier without GridSearchCV gives the highest and nearly comparable testing and training accuracies of 98.05% and 100% for both datasets (Hungary, Switzerland & Long Beach V and UCI Kaggle). Furthermore, the analytical results of the proposed technique are compared with previous heart disease prediction studies. Among the proposed approaches, the Extreme Gradient Boosting Classifier with GridSearchCV yields the best hyperparameters for testing accuracy. The primary aim of this paper is to develop a unique model-creation technique for solving real-world problems.
Article
In digital healthcare applications, anomaly detection is an important task to be taken into account. For instance, in ECG (Electrocardiogram) analysis, the aim is often to detect abnormal ECG signals that are considered outliers. For such tasks, it has been shown that deep learning models such as Autoencoders (AEs) and Variational Autoencoders (VAEs) can provide state-of-the-art performance. However, they suffer from certain limitations. For example, the trivial method of threshold selection does not perform well if we do not know the reconstruction loss distribution in advance. In addition, since healthcare applications rely on highly sensitive personal information, data privacy concerns can arise when data are collected and processed in a centralized machine-learning setting. Hence, in order to address these challenges, in this paper, we propose AnoFed, a novel framework for combining the transformer-based AE and VAE with the Support Vector Data Description (SVDD) in a federated setting. It can enhance privacy protection, improve the explainability of results and support adaptive anomaly detection. Using ECG anomaly detection as a typical application of the framework in healthcare, we conducted experiments to show that the proposed framework is not only effective (in terms of the detection performance) but also efficient (in terms of computational costs), compared with a number of state-of-the-art methods in the literature. AnoFed is very lightweight in terms of the number of parameters and computation, hence it can be used in applications with resource-constrained edge devices.
Chapter
As IoT devices become more popular and ubiquitous, they become more obvious targets for malicious actors. With over 30 billion IoT devices online in 2020, security challenges have become an inescapable reality. IoT malware infections saw a 100% year-over-year increase in 2020, accounting for over 32% of all mobile device infections. In this paper, we introduce a detection system based on deep neural networks, aimed at detecting IoT device scanning and reconnaissance attacks. The proposed system was implemented and showed very high accuracy, exceeding 98%. Experiments also showed a false-positive rate of 1.9% and a false-negative rate lower than 0.02%.
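For readers unfamiliar with the metrics quoted above, the short sketch below shows how accuracy, false-positive rate, and false-negative rate follow from confusion-matrix counts. The counts are hypothetical and were chosen only to land near the reported figures; they are not taken from the paper.

```python
def rates(tp, fp, tn, fn):
    """Compute accuracy, false-positive rate, and false-negative rate."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Share of benign samples wrongly flagged as attacks.
        "false_positive_rate": fp / (fp + tn),
        # Share of attack samples that were missed.
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical counts for 5000 attack and 5000 benign samples (an assumption).
metrics = rates(tp=4999, fp=95, tn=4905, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4%}")
```

Note that the two error rates use different denominators (benign samples versus attack samples), which is why a very low false-negative rate can coexist with a noticeably higher false-positive rate.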
Article
Federated Learning (FL) with mobile computing and Internet of Things (IoT) is an effective cooperative learning approach. However, several technical challenges still need to be addressed. For instance, dividing the training process among several devices may impact the performance of Machine Learning (ML) algorithms, often significantly degrading prediction accuracy compared to centralized learning. One of the primary reasons for such performance degradation is that each device can access only the small fraction of data that it generates, which limits the efficacy of the local ML model constructed on that device. The performance degradation can be exacerbated when the participating devices produce different classes of events, which is known as the class balance problem. Moreover, if the participating devices are of different types, each device may never observe the same types of events, which leads to the device heterogeneity problem. In this study, we investigate how data augmentation can be applied to address these challenges and improve detection performance in an anomaly detection task using IoT datasets. Our extensive experimental results with three publicly accessible IoT datasets show a performance improvement of up to 22.9% with data augmentation, compared to the baseline without data augmentation. In particular, stratified random sampling and uniform random sampling show the best improvement in detection performance with only a modest increase in computation time, whereas the data augmentation scheme using Generative Adversarial Networks is the most time-consuming and offers limited performance benefits.
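The difference between the two sampling schemes named above can be illustrated with a small sketch. It assumes augmentation means drawing extra records with replacement from a device's local data, either ignoring labels (uniform) or drawing the same number per class (stratified); the abstract does not specify the exact procedure, and the function names and data here are hypothetical.

```python
import random
from collections import Counter, defaultdict


def uniform_sample(records, k, rng):
    """Draw k records with replacement, ignoring labels."""
    return [rng.choice(records) for _ in range(k)]


def stratified_sample(records, per_class, rng):
    """Draw per_class records with replacement from each label group."""
    groups = defaultdict(list)
    for features, label in records:
        groups[label].append((features, label))
    sample = []
    for label in sorted(groups):
        sample.extend(rng.choice(groups[label]) for _ in range(per_class))
    return sample


rng = random.Random(42)
# Hypothetical device-local data: 95 benign records and 5 attack records.
local = ([((i,), "benign") for i in range(95)]
         + [((i,), "attack") for i in range(5)])

# Stratified augmentation adds the same number of records for every class,
# which raises the share of the rare class in the training set.
augmented = local + stratified_sample(local, per_class=50, rng=rng)
counts = Counter(label for _, label in augmented)

uniform_extra = uniform_sample(local, 10, rng)
```

With these numbers the attack class grows from 5% of the local data to roughly 28% of the augmented set, whereas uniform sampling would on average preserve the original imbalance.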
Article
Federated learning enables distributed training of deep learning models among user equipment (UE) to obtain a high-quality global model. A centralized server aggregates the updates submitted by UEs without knowledge of the local training data or process. Despite its privacy-preserving merit, we reveal a severe security concern. Malicious UEs can manipulate their training data by injecting a backdoor trigger. Thus, the global model that aggregates those malicious updates may make false predictions on samples containing the backdoor trigger. However, the effect of a single backdoor trigger is quickly diluted by subsequent benign updates. In this work, we present an effective coordinated backdoor attack against federated learning using multiple local triggers; the global trigger consists of various separate local triggers. Moreover, in contrast to using random triggers, we propose using model-dependent triggers (i.e., generated based on the attackers' local models) to conduct backdoor attacks. We conduct extensive experiments to assess the effectiveness of our proposed backdoor attacks on the MNIST and CIFAR-10 datasets. Experimental results show that our proposed methodology outperforms both coordinated attacks using random triggers and single-trigger backdoor attacks in terms of attack success rate. We also show that Byzantine-resilient aggregation methodologies are not robust to our proposed attacks.
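The dilution effect mentioned above can be seen in a toy simulation of server-side averaging. This is a sketch under simplifying assumptions, not the paper's experimental setup: each "model" is a two-element weight vector, each client's update simply pulls the global weights toward a target, nine clients are benign, and one attacker submits a poisoned update in the first round only. The function names, learning rate, and targets are all hypothetical.

```python
def fed_avg(updates, weights=None):
    """Weighted element-wise average of client update vectors."""
    if weights is None:
        weights = [1.0] * len(updates)
    total = sum(weights)
    dim = len(updates[0])
    return [sum(w * u[i] for u, w in zip(updates, weights)) / total
            for i in range(dim)]


def local_update(global_w, target, lr=0.5):
    """Toy client step: move a fraction lr of the way toward the client's target."""
    return [lr * (t - w) for w, t in zip(global_w, target)]


benign_target = [0.0, 0.0]      # where honest clients pull the model
poisoned_target = [5.0, 5.0]    # where the attacker pulls it (round 0 only)

global_w = [0.0, 0.0]
history = []
for rnd in range(5):
    updates = [local_update(global_w, benign_target) for _ in range(9)]
    attacker_target = poisoned_target if rnd == 0 else benign_target
    updates.append(local_update(global_w, attacker_target))
    delta = fed_avg(updates)
    global_w = [w + d for w, d in zip(global_w, delta)]
    # Distance of the first weight from the benign optimum after this round.
    history.append(global_w[0])
```

In this toy setting the single poisoned update shifts the global weights in round 0, and every later benign round halves that shift, which is the kind of decay that motivates coordinating several triggers across attackers.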