
Detecting IoT Attacks Using an Ensemble Machine Learning Model

March 2022
Future Internet 14(4):102


DOI:10.3390/fi14040102

License
CC BY 4.0

Authors:

Sachin Sharma

TU Dublin







Citation: Tomer, V.; Sharma, S. Detecting IoT Attacks Using an Ensemble Machine Learning Model. Future Internet 2022, 14, 102. https://doi.org/10.3390/fi14040102

Academic Editor: Paolo Bellavista

Received: 21 February 2022

Accepted: 22 March 2022

Published: 24 March 2022

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

future internet

Article

Detecting IoT Attacks Using an Ensemble Machine Learning Model

Vikas Tomer 1 and Sachin Sharma 2,*

1 Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun 248002, India; vikastomer.cse@geu.ac.in

2 School of Electrical and Electronic Engineering, Technological University Dublin, D07 EWV4 Dublin, Ireland

* Correspondence: Sachin.Sharma@TUDublin.ie

Abstract: Malicious attacks are becoming more prevalent due to the growing use of Internet of Things (IoT) devices in homes, offices, transportation, healthcare, and other locations. By incorporating fog computing into IoT, attacks can be detected in a short amount of time, as the distance between IoT devices and fog devices is smaller than the distance between IoT devices and the cloud. Machine learning is frequently used for the detection of attacks due to the huge amount of data available from IoT devices. However, the problem is that fog devices may not have enough resources, such as processing power and memory, to detect attacks in a timely manner. This paper proposes an approach to offload the machine learning model selection task to the cloud and the real-time prediction task to the fog nodes. Using the proposed method, based on historical data, an ensemble machine learning model is built in the cloud, followed by the real-time detection of attacks on fog nodes. The proposed approach is tested using the NSL-KDD dataset. The results show the effectiveness of the proposed approach in terms of several performance measures, such as execution time, precision, recall, accuracy, and ROC (receiver operating characteristic) curve.

Keywords: Internet of Things (IoT); machine learning; cybersecurity; DDoS

1. Introduction

Historically, only computers, mobile phones, and tablets were connected to the Internet. The Internet of Things (IoT) today enables many kinds of devices and appliances (e.g., televisions, air conditioners, washing machines) to be connected to the Internet. IoT is being used in several fields today, including healthcare, agriculture, traffic monitoring, energy saving, water supply, unmanned air vehicles, and automobiles.

A three-layer IoT architecture is illustrated in Figure 1; from left to right: (1) thing layer, (2) fog layer, and (3) cloud layer. The thing layer includes IoT devices from several domains, including smart-homes, eHealth, smart vehicles, smart drones, and smart-cities. This layer enables data collection while having limited resources such as bandwidth, processing, energy, and memory. Next comes the fog layer, which is closer to the thing layer and may contain some operational resources to manage real-time operations and rapid decision making. Finally, the cloud layer facilitates the collection, processing, and storage of data in various data centers. However, as it is far away from the thing layer, it may take a long time to incorporate decisions in the thing layer.

According to a recent report from the International Data Corporation (IDC) (https://www.idc.com/, accessed on 20 March 2022), the amount of data generated by IoT devices will reach 73 zettabytes by 2025, up from 18 zettabytes in 2019. Such a massive influx of data opens up many potential threats [ ]. The problem is that IoT devices and their networks tend to be insecure, since they typically have too little power, memory, or bandwidth to perform basic security functions such as encryption. IBM X-Force (https://securityintelligence.com/posts/internet-of-threats-iot-botnets-network-attacks/, accessed on 20 March 2022) reported in 2020 that attacks on IoT grew five-fold over the previous year. Currently, IoT-enabled networks are at risk of losing privacy and confidentiality due to malware and botnet attacks [2].

Figure 1. A three-layer Internet of Things (IoT) architecture.

For the IoT, several security solutions have been proposed, such as authentication [ ], detection, and prevention [ ]. Introducing machine learning (ML) algorithms into the IoT may alleviate concerns about security and privacy [ ]. For fast decision making, it is crucial to decide where each algorithm should run: in the cloud, the fog, or the thing layer. When all ML decisions are made in the cloud, IoT decisions may be delayed. In the other layers, such as the thing or fog layer, it may be difficult to apply ML solutions due to their limited resources, such as bandwidth, processing, and energy.

Current research [7–12] indicates that deep learning algorithms are capable of detecting IoT attacks more effectively than traditional machine learning algorithms. However, only the cloud layer may have the resources to run these algorithms. In addition, these algorithms are not always effective in situations such as remote live operations (e.g., remote surgery), where the system must make real-time decisions rapidly. Previous work on IoT attacks [ ] has shown that a machine learning technique such as support vector machine (SVM) can only provide meaningful results if it is combined with a feature extraction/reduction algorithm or an optimization algorithm. This combination of algorithms fails to meet the low resource requirement. ML techniques such as decision trees, naïve Bayes, K-nearest neighbors (KNN), and others are extremely robust for applications such as offline or non-interactive predictions on small datasets. These models, however, are considered weak when applied to real-time predictions. State-of-the-art studies [ – ] report that the detection rate is quite low when using these classifiers to detect IoT attacks.

The paper proposes an ensemble model for an IoT system with limited bandwidth, processing power, energy, and memory (e.g., in the fog layer) to detect IoT attacks. Denial of service (DoS), authentication attacks, and probe attacks are taken into account. Moreover, no additional feature extraction or dimensionality reduction algorithm is used to increase detection rates. This model is best suited to the real-time, quick detection of IoT attacks. The proposed approach has two important steps: (1) selecting the best ensemble model, i.e., one that has a short execution time and high performance (e.g., accuracy), and (2) running the best model to achieve a short delay when applying the decision. The first step is performed in the cloud, as more resources are required for selecting the best ensemble model, and the second step is performed in the fog layer, which has a low delay for real-time applications.

In this paper, extensive data analysis experiments are performed on the NSL-KDD dataset (https://www.unb.ca/cic/datasets/nsl.html, accessed on 20 March 2022). The dataset represents IoT attacks on a network in real time, and it is an upgraded version of the original KDD-99 dataset. The results show a high level of accuracy in a minimum amount of time with the fewest possible resources needed. The paper is organized as follows: Section 2 presents the related work and the background, Section 3 presents our proposed approach, Section 4 presents simulation scenarios, Section 5 provides the results and, finally, Section 6 concludes the paper.

2. Background and Related Work

2.1. IoT-Specific Attacks Overview

Data can be collected from IoT devices and then processed and monitored by an application (e.g., e-healthcare or industrial) located in a cloud or fog layer. Several IoT-related attacks are described in the literature. Denial of service (DoS) attacks, authentication attacks, and probe attacks are presented below:

• A denial of service (DoS) attack poses the greatest threat to IoT devices and servers with open ports [ ]. There are several types of DoS attacks, such as Smurf, Neptune, and Teardrop;

• An authentication attack is an attack against privileged access. A remote-to-local (R2L) attack (such as HTTPtunnel and FTP_write) occurs when an intruder sends malformed packets to a computer or server to which they do not have access. User-to-root (U2R) attacks (such as Rootkit) occur when a malicious intruder gains access to a network resource by posing as a normal user and then escalates to full (root) permissions;

• In a probe attack, an intruder scans a network device to determine potential vulnerabilities in the design of its topology or port settings and then exploits those later to gain illegal access to confidential information. There are several types of probe attacks, such as IPsweep, Nmap, and Portsweep.

2.2. ML-Specific Related Work on Security and Privacy

A comparison of related work on ML-specific attack detection can be seen in Table 1, including the ML (machine learning)/DL (deep learning) used, the pre-processing features, and performance analysis performed. During the pre-processing step, encoding (E), scaling (S), normalization (N), and dimensionality reductions (D) are taken into account. Furthermore, as part of the performance analysis, accuracy, receiver operating characteristic (ROC) curve, F-score, Matthews correlation coefficient (MCC), and detection rate (DR) are considered.
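As a reminder of how the performance measures compared in Table 1 relate to one another, the following minimal sketch computes them from the four counts of a binary confusion matrix. It is illustrative only: the function name and the counts are made up, and the detection rate is assumed here to be the true positive rate (recall), which is a common convention in intrusion detection work rather than something stated in the text.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common detection metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall; assumed here to coincide with the detection rate (DR).
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    # Matthews correlation coefficient (MCC), in the range [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score, "mcc": mcc}

# Made-up counts: 90 attacks caught, 20 missed, 10 false alarms, 80 true negatives.
metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
```

Unlike accuracy, the MCC uses all four counts symmetrically, which is one reason it is often reported for imbalanced attack datasets.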

In [ ], decision trees and rule induction are used to explain under what conditions a specific type of attack (DoS, authentication attacks, and probe attacks) occurs on a network. In this approach, encoding is used as a pre-processing technique, while accuracy is used to evaluate the effectiveness of the method. Although this is a valuable state-of-the-art approach, there is no guarantee that rules derived from decision trees will generalize to large datasets, since overfitting poses the greatest risk. Further, in [ ], principal component analysis (PCA) is utilized with a decision tree to detect anomalies and investigate their causes.

The previous works of [ ] show that attacks can be predicted with high accuracy by using deep learning neural networks, either as a standalone technology [ ] or in combination with optimization [ ] or machine learning algorithms [ ]. More precisely, [ ] combine artificial neural networks (ANNs) with support vector machines (SVMs), which provides significantly higher detection rates than standalone deep learning or machine learning algorithms. In particular, [ ] extends this hybridization by not only combining the SVM with the ANN but also fusing that combination with a genetic algorithm (GA) and particle swarm optimization (PSO). This hybridization achieves a 99.3% accuracy rate.

Table 1. Related work. The letters E, S, N, and D stand for encoding, standardization, normalization, and dimensional reduction, respectively. Further, accuracy, Matthews correlation coefficient, and detection rate are denoted as A, MCC, and DR, respectively.

Reference | ML/DL Algorithm Used | Features Used (✓) or Not (×) | Analysis Performed (✓) or Not Performed (×)
[19,20] | Decision Tree + Rule Induction | E(✓), S(×), N(×), D(×) | A(✓), ROC(×), FScore(×), MCC(×), DR(×)
[7,8] | Deep Neural Network (DNN) | E(×), S(×), N(✓), D(×) | A(×), ROC(×), FScore(✓), MCC(×), DR(×)
[22,23] | Optimization + DNN | E(✓), S(×), N(✓), D(✓) | A(✓), ROC(×), FScore(✓), MCC(×), DR(×)
[9,13] | SVM-ANN + hybrid optimization | E(×), S(×), N(✓), D(×) | A(×), ROC(×), FScore(✓), MCC(✓), DR(✓)
[21] | PCA + Random Decision | E(×), S(×), N(✓), D(✓) | A(✓), ROC(✓), FScore(×), MCC(×), DR(✓)
[10,11] | Dimensionality Reduction + DNN | E(×), S(×), N(✓), D(✓) | A(✓), ROC(×), FScore(✓), MCC(×), DR(✓)
[24] | GA-based Latent Dirichlet Allocation | E(✓), S(×), N(×), D(×) | A(✓), ROC(×), FScore(✓), MCC(×), DR(✓)
[25] | Autoencoder based LSTM classifier | E(✓), S(✓), N(✓), D(✓) | A(×), ROC(×), FScore(✓), MCC(×), DR(×)
[26] | Multinomial Logistic Regression | E(×), S(×), N(×), D(×) | A(×), ROC(✓), FScore(×), MCC(×), DR(×)
[27] | Ensemble Learning with XGboost | E(✓), S(×), N(×), D(×) | A(✓), ROC(×), FScore(×), MCC(×), DR(×)

Dimensionality reduction is also explored in a wide variety of works. The studies of [ ] and refs. [ ] used principal component analysis (PCA) with ANN and reported F1-scores of 91 percent. Researchers from [ ] have also explored dimensionality reduction with a one-hot encoder and combined outlier analysis, which improved performance by 2.96 percent and 4.12 percent over CNN and RNN, respectively. This approach to dimensionality reduction with machine learning yields a mix of higher and average results. In addition, it is still unclear how many dimensionality reduction algorithms can be combined within a single model to provide an optimal outcome. A combination of latent Dirichlet allocation (LDA) and a genetic algorithm is used in [ ], which provides a below-average accuracy rate of 88.5 percent and a false positive rate of 6 percent.

The results are improved even more by techniques such as logistic regression and autoencoder. The study of [ ] uses an autoencoder with LSTM and carries out experiments on a number of autoencoders, reaching an AUC score of 96 percent. Multinomial logistic regression provided a 99 percent ROC for finding anomalies in [ ]. The idea of ensemble learning has also been explored by several authors. One of the appealing results, with 99.6 AUC, is provided by using XGBoost in [27].

The literature review covered almost all families of machine learning, from decision trees to neural networks, and from (logistic) regression techniques to ensemble learning. Following an extensive assessment, it was determined that a deep neural network combined with an optimization algorithm, or ensemble learning, could provide an impressive detection rate and the lowest false alarm rate. Additionally, feature engineering is required to improve such models.


2.3. Voting and Stacking Techniques

The voting process, as its name suggests, combines the results of a number of weak classifiers by choosing as the final output the class label that receives the greatest number of votes. The advantage of this method is that the misclassifications of an individual classifier can be outvoted by the others. As an example, to solve a classification problem through voting, a range of weak classifiers is selected, such as naïve Bayes, K-nearest neighbor (KNN), and decision tree classifiers. Suppose the K-nearest neighbor and decision tree classifiers yield the same class label, which differs from the label given by naïve Bayes. The label with the maximum number of common votes, i.e., the one shared by the K-nearest neighbor classifier and the decision tree, is then taken as the final prediction.
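The voting example can be sketched in a few lines. This is a minimal illustration of hard (majority) voting only; the labels and the dictionary of classifier outputs are invented for the example and are not taken from the experiments.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties are broken in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three weak classifiers for one network connection.
votes = {"naive_bayes": "normal", "knn": "attack", "decision_tree": "attack"}
final_label = majority_vote(list(votes.values()))
```

Here the naïve Bayes vote is outvoted by the two classifiers that agree, so the connection is labeled as an attack.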

Stacking is a method of ensemble learning that takes into account heterogeneous weak classifiers, which means that different machine learning algorithms are combined. In addition, stacking introduces the concept of a meta-layer, in which a meta-layer model combines the classifier results from the base layer. For instance, to solve a classification problem through stacking, a range of weak classifiers, such as K-nearest neighbour classifiers, decision trees, and naïve Bayes classifiers, is selected at the base layer, and their results are combined through a neural network classifier as the meta-layer model. The neural network takes the outputs of these three weak classifiers as its inputs and produces the final prediction.
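A minimal sketch of the two-layer idea follows. To stay self-contained it swaps in much simpler components than those named in the text: the three base classifiers are hypothetical hand-written threshold rules, and the meta-layer model is a lookup table that maps each pattern of base outputs to the most frequent true label, not a neural network. All feature names and values are invented.

```python
from collections import Counter, defaultdict

# Hypothetical base classifiers: each maps a feature dict to a label.
def by_duration(x):
    return "attack" if x["duration"] > 5 else "normal"

def by_failed_logins(x):
    return "attack" if x["failed_logins"] > 2 else "normal"

def by_error_rate(x):
    return "attack" if x["error_rate"] > 0.5 else "normal"

BASE_LAYER = [by_duration, by_failed_logins, by_error_rate]

def base_outputs(x):
    """Base layer: collect the prediction of every base classifier."""
    return tuple(clf(x) for clf in BASE_LAYER)

def fit_meta(samples, labels):
    """Meta-layer: learn the most frequent true label per output pattern."""
    table = defaultdict(Counter)
    for x, y in zip(samples, labels):
        table[base_outputs(x)][y] += 1
    return {pattern: counts.most_common(1)[0][0]
            for pattern, counts in table.items()}

def stacked_predict(meta, x, default="normal"):
    """Final prediction; unseen output patterns fall back to a default label."""
    return meta.get(base_outputs(x), default)

train_x = [
    {"duration": 9, "failed_logins": 0, "error_rate": 0.1},
    {"duration": 1, "failed_logins": 4, "error_rate": 0.9},
    {"duration": 8, "failed_logins": 5, "error_rate": 0.2},
    {"duration": 2, "failed_logins": 0, "error_rate": 0.0},
]
train_y = ["normal", "attack", "attack", "normal"]
meta = fit_meta(train_x, train_y)
prediction = stacked_predict(
    meta, {"duration": 7, "failed_logins": 1, "error_rate": 0.3})
```

In this toy data the meta-layer learns that a long duration alone is not evidence of an attack, something plain voting could not express. In practice the meta-layer model is usually trained on held-out predictions of the base classifiers to limit overfitting.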

2.4. Ensemble Machine Learning-Based Attack Detection

The authors of [ ] demonstrate how ensemble machine learning, neural networks, and kernel methods can be used to detect abnormal behavior in an IoT intrusion detection system. In this study, ensemble methods outperform kernel and neural networks in terms of accuracy and error detection rates.

To detect webshell-based attacks, ensemble machine learning is used in [ ]. In webshell attacks, a malicious script installed on a web server for remote administration executes malicious code written in popular web programming languages. Ensemble techniques, including random forest and extremely randomized trees, are applied in this work, and voting is used in order to improve their performance. The study concluded that random forests and extremely randomized trees are best for IoT scenarios involving moderate resources (CPU, memory, etc.). Nevertheless, voting proved to be most effective in scenarios requiring heavy resources. In [ ], cyberattacks are detected using ensemble methods for IoT-based smart cities. Ensemble methods were found to be more accurate than other machine learning algorithms, including linear regression, support vector machines, decision trees, and random forests.

Further, anomalies are detected using ensemble methods applied to software-defined networking (SDN) in IoT in [ ]. In SDN, IoT networks can be controlled from a central server called a controller [ ]. Further, in [ ], DDoS attacks are detected by using an ensemble method that uses traffic flow metrics to classify attacks. The applied approach yields fewer false alarms and a high degree of accuracy. Moreover, in [ ], cyberattacks are detected using ensemble machine learning in a cloud–fog architecture for the Internet of Medical Things (IoMT). In this work, decision trees, naïve Bayes, and random forest machine learning techniques are used as base classifiers, and XGBoost is used at the next level. This method achieved a high detection rate of 99.98% on the NSL-KDD dataset.

The detection of anomalies in the smart home is carried out by ensemble machine learning rather than binary classification in [ ]. Ensemble machine learning was able to detect anomalies in categorical datasets with minimal false positives. In [ ], adaptive learning is used to boost the intelligence of ensemble machine learning for the Internet of Industrial Networks. This approach proved effective under ROC curve calculations.

2.5. IoT System with Cloud and Fog

Figure 1 illustrates the benefit of using the cloud for data processing: it may have the resources necessary to perform complex computations. The cloud, however, has several inherent weaknesses, including high costs, long latency, and limited bandwidth [ ]. Further, due to its proximity to IoT devices, fog is well suited for solving a variety of issues, including long latency, communication, control, and computation [ ]. With fog computing, time-sensitive data can be stored and analyzed locally [ ]. Furthermore, by reducing the amount of data sent to the cloud and the distance it travels, IoT applications can be made more secure and private [42,43].

Researchers have employed a number of approaches and techniques to overcome data transfer challenges in fog, including encryption-based data transfer, as described in [ ]. Furthermore, several researchers have proposed methods to improve security in fog, including game-based security [ ]. However, these works do not have the advantage of functioning in real time. Currently, researchers are developing methods for predicting real-time scenarios and minimizing the overall time by balancing cloud computing with fog computing and optimizing the trade-off between the two (e.g., [ ]). Likewise, our paper moves resource-intensive, time-consuming tasks to the cloud and real-time tasks to the fog layer.

3. Proposed Approach

Our objective is to use ensemble machine learning techniques for detecting attacks in an IoT system, because deep neural networks require substantial resources, such as memory. The goal is to find the best ensemble method and to apply it for real-time attack detection. Figure 2 outlines the proposed approach with three layers: thing, fog, and cloud. It involves the following three steps (also shown in Figure 2): (1) data collection at the cloud layer, (2) running the ensemble algorithm on the cloud and selecting the best model, and (3) running the best selected model on the fog layer. These steps are described below.

1. Data collection at Cloud Layer

This step involves collecting data from the thing layer and passing it to the cloud layer. To accomplish this, data from the thing layer can first be transported to the fog layer. The fog layer can then transport it to the cloud layer. While transporting the data to the cloud layer, the fog layer can also filter data to decide which data to be transported to the cloud. IoT attacks can be predicted using the following attributes: (1) login details, (2) the fields of network data packets, such as fragment details, protocol type, source and destination address, (3) service type, (4) flags, and (5) duration. We provide detailed information about the data used in our simulation in the next section.

2. Selecting a best model on the cloud

The objective of this step is to combine various basic machine learning classifiers (such as naïve Bayes, KNN, and decision trees) with ensemble techniques (such as stacking, bagging, and voting) to obtain optimal results (accuracy, precision, execution time). As this is a time-consuming step, we recommend running it in the cloud. In addition, we simply apply the basic machine learning classifiers, as they require a short execution time.

Figure 3 illustrates this step with four layers: (1) the data layer, (2) the base layer, (3) the meta-layer, and (4) the method selection layer. In the data layer, the data collected in the previous step is pre-processed and fed into the base layer. The base layer applies different combinations of base classifiers, such as naïve Bayes (B1), decision trees (B2), and KNN (B3). The results of these combinations are then fed into the meta-layer, where ensemble methods, such as stacking (E1), bagging (E2), and voting (E3), aggregate the outcomes. Each ensemble method is evaluated in terms of accuracy, precision, recall, ROC, and execution time. Further, the model with the combination of base classifiers and ensemble method that yields the best results is selected.

Algorithm 1 describes the above-proposed approach in detail. The input parameters of the algorithm are: (1) base classifiers (i.e., B = B1 . . . Bn), (2) ensemble methods (i.e., E = E1 . . . Em), and (3) the training dataset (D). In the first two lines of the algorithm, the output and the result (i.e., variables OUTPUT and Result in Algorithm 1) are initialized to NULL. The third line initializes the execution time to the maximum value.

In the fourth line, we store all the combinations of the base classifiers (i.e., using the function findAllCombinations) in variable C. The proposed approach aims to determine the best combination and the best ensemble method. Therefore, in line 5, we iterate over each of the combinations, and then, in line 7, over each base classifier in the corresponding combination. Each base classifier is applied to the training dataset (D), with the outcome being stored in o (line 8).

Line 10 iterates over the ensemble methods, and line 11 applies each ensemble method to the outcome (o). At line 12, the ensemble result is calculated in terms of accuracy, precision, recall, etc. Further, at line 13, the execution time of the combination of base classifiers and an ensemble method is calculated. The new result (r) and execution time (time) are then compared with the previous best result (Result) and execution time (ExecutionTime) at line 14. If this is the best result so far, the corresponding combination and ensemble method are stored in the output (OUTPUT); see line 15. Further, the result and the execution time are stored in lines 16 and 17. In the end, the best output is returned at line 21.

3. Running the best model on the fog layer

This step involves executing the model selected in the previous step over the fog layer with the real-time data collected from the thing layer. The model consists of a combination of base classifiers and an ensemble method.

Figure 2. Proposed approach.

Figure 3. Selection of an ensemble method.


Algorithm 1: Find a best model

procedure FINDABESTMODEL(B, E, D)
    // B ← B1, B2, B3, . . . Bn. Here, B1, B2, B3, . . . Bn are base classifiers.
    // E ← E1, E2, E3, . . . Em. Here, E1, E2, E3, . . . Em are ensemble methods.
    // D is the training dataset.
 1  OUTPUT ← NULL;
 2  Result ← NULL;
 3  ExecutionTime ← MAX;
 4  C ← findAllCombinations(B);
        // Find all the combinations of the base classifiers (B).
 5  foreach c ∈ C do
        // Iterate each combination.
 6      o = 0;  // Initialize an outcome.
 7      foreach Ba ∈ c do
            // Iterate each base classifier in c.
 8          o ← o, applyBaseClassifier(Ba, D);
            // Apply Ba over dataset D and store the outcome of each Ba in the
            // form of the ROC curve or any performance measure in o.
 9      end
10      foreach Ea ∈ E do
            // Iterate each ensemble method.
11          e ← applyEnsembleMethod(Ea, o);
            // Apply ensemble method Ea on o.
12          r ← findResult(e);
            // Find the result in the form of ROC or any other performance measure.
13          time ← findExecutionTime(c, Ea);
            // The execution time of the combination of base classifiers (c)
            // and an ensemble method (Ea) is calculated.
14          if isResultBetter(r, time, Result, ExecutionTime) then
                // If r and time are better than Result and ExecutionTime.
15              OUTPUT ← c, Ea;
                // Store the base classifiers and ensemble method in OUTPUT.
16              Result ← r;
17              ExecutionTime ← time;
18          end
19      end
20  end
21  Return: OUTPUT;
    // Return the best model with base classifiers and an ensemble method.
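The selection loop of Algorithm 1 can be sketched in Python as follows. This is an illustration under several assumptions that are not in the paper: the base classifiers are stand-ins that return fixed predictions, "unanimous" is a made-up second ensemble rule used only so that there are two methods to compare, accuracy is the only performance measure, and isResultBetter is assumed to prefer higher accuracy and to break ties by shorter execution time.

```python
import time
from itertools import combinations

def majority(preds_per_clf):
    """Voting: label 1 if more than half of the classifiers predict 1."""
    return [int(sum(col) * 2 > len(col)) for col in zip(*preds_per_clf)]

def unanimous(preds_per_clf):
    """Hypothetical second ensemble rule: 1 only if every classifier says 1."""
    return [int(all(col)) for col in zip(*preds_per_clf)]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def is_result_better(score, elapsed, best_score, best_time):
    """Assumed rule: higher accuracy wins; ties go to the faster model."""
    if best_score is None:
        return True
    return score > best_score or (score == best_score and elapsed < best_time)

def find_best_model(base, ensembles, features, labels, size=3):
    output, result, exec_time = None, None, float("inf")        # lines 1-3
    for combo in combinations(sorted(base), size):               # lines 4-5
        start = time.perf_counter()
        outcomes = [base[name](features) for name in combo]      # lines 6-9
        base_time = time.perf_counter() - start
        for ens_name, ensemble in ensembles.items():             # line 10
            start = time.perf_counter()
            combined = ensemble(outcomes)                        # line 11
            score = accuracy(combined, labels)                   # line 12
            elapsed = base_time + time.perf_counter() - start    # line 13
            if is_result_better(score, elapsed, result, exec_time):  # line 14
                output, result, exec_time = (combo, ens_name), score, elapsed
    return output, result                                        # line 21

# Toy data: six connections, 1 = attack, 0 = normal.
labels = [1, 1, 0, 0, 1, 0]
features = list(range(6))  # placeholder inputs, ignored by the stand-ins
base = {
    "A": lambda X: [1, 1, 0, 0, 1, 1],
    "B": lambda X: [1, 1, 0, 0, 0, 0],
    "C": lambda X: [1, 0, 0, 0, 1, 0],
    "D": lambda X: [0, 0, 1, 1, 0, 1],
}
best, best_score = find_best_model(
    base, {"voting": majority, "unanimous": unanimous}, features, labels)
```

In this toy run, each of A, B, and C makes one mistake on a different connection, so their majority vote is correct everywhere and that combination with voting is selected. The sketch returns the score alongside the selected model for convenience, whereas Algorithm 1 returns only OUTPUT.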

The proposed approach to include cloud–fog/edge architecture is derived from the analysis of an NGIAtlantic EU project [ ], in which cross-Atlantic experimental validation is proposed for intelligent SDN-controlled IoT networks. In this project, IoT devices transmit data to an IoT application in the cloud over the Internet via a gateway (located at edge/fog devices) whose security and latency are enhanced by running secure network functions. Our approach is a practical real-time solution for such a scenario since, in production IoT networks, fog/edge nodes do not have enough resources to run heavyweight algorithms. Therefore, if only the trained model is run in the fog layer (step 3, above), the fog node's resource requirements are lowered, which is practical. Furthermore, since the cloud layer has plenty of resources, it makes sense to train the model there, as described in steps 1 and 2.
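The division of labor described above, training in the cloud and prediction only in the fog, can be illustrated with a deliberately tiny model whose parameters are exported as JSON. Everything here is hypothetical: the one-feature threshold rule stands in for the selected ensemble, and the function names, feature name, and values are invented for the sketch.

```python
import json

def train_in_cloud(durations, labels):
    """Cloud side: fit a one-parameter threshold rule and export it as JSON."""
    attack = [d for d, y in zip(durations, labels) if y == "attack"]
    normal = [d for d, y in zip(durations, labels) if y == "normal"]
    # Midpoint between the shortest attack and the longest normal connection.
    threshold = (min(attack) + max(normal)) / 2
    return json.dumps({"feature": "duration", "threshold": threshold})

def predict_in_fog(model_json, record):
    """Fog side: load the exported parameters and classify one record."""
    model = json.loads(model_json)
    return "attack" if record[model["feature"]] > model["threshold"] else "normal"

# Historical (labeled) data lives in the cloud; the fog only sees new records.
exported = train_in_cloud([1, 2, 8, 9], ["normal", "normal", "attack", "attack"])
fog_prediction = predict_in_fog(exported, {"duration": 7})
```

The fog node never touches the training data or the fitting code; it needs only the exported parameters and a cheap comparison per record.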

4. Simulation Environment

This section presents the simulation environment in terms of server configuration, dataset description, cloud and fog data separation, and simulated base classifier and ensemble methods.

4.1. Server Configuration

The proposed framework with fog and cloud nodes is tested on a server with a 2.80 GHz Core E7400 CPU, 3.00 GB of RAM, and a 32-bit operating system. The proposed ensemble algorithm is implemented on the cloud node and the best model is run on the fog node. The Weka platform is used to run the experimentation at the cloud layer and the real-time detection of IoT attacks at the fog layer.

4.2. Dataset Description

The NSL-KDD dataset (https://www.unb.ca/cic/datasets/nsl.html, accessed on 20 March 2022) is used for the simulation of this work. It contains 41 features to describe each specific entity in an IoT network. Details on network intrusions with these 41 features can be segmented into computational information (service, flag, land, etc.), content-based information (login information, root shell information, etc.), duration-based information (such as duration from host to destination transfer, error rates), and host-based information (host and destination ports and counts information).

In Figure 4, the NSL-KDD dataset is represented by two layers: (1) the inner layer represents different types of IoT attacks in the dataset, such as Probe, DoS, U2R, and R2L; (2) the outer layer represents examples of attacks within each category. Attacks such as Saint, Satan, Nmap, and portsweep, which can be found in Figure 4, come under the Probe IoT attack category. In these attacks, the attacker scans a network device to determine potential weaknesses in its design, which are subsequently exploited in order to gain access to confidential information, as described in Section 2.1.

Figure 4. Layerwise NSL-KDD dataset description.


Likewise, attacks such as Neptune, Teardrop, Worm, and Smurf fall into the category of DoS attacks. These attacks cause a denial of service when an attacker consumes resources unnecessarily, making the service unavailable to legitimate users. Moreover, Sendmail, Multihop, and phf belong to R2L (remote-to-local) attacks, while Perl, text, and sqlattack belong to U2R (user-to-root) attacks. In Figure 4, variables are underlined according to their segment. Most variables in this dataset are nominal. Three basic protocol types exist in the dataset: TCP (transmission control protocol), UDP (user datagram protocol), and ICMP (Internet control message protocol).

4.3. Data Separation for the Cloud and Fog Layers

Our proposed scheme uses the cloud layer to keep track of historical data about network connections associated with IoT attacks, while the fog layer analyzes real-time data. The cloud-layer data includes the target variable and its associated labels, whereas in the fog layer this variable must be predicted for new entries. Training and testing data segments are provided in the NSL-KDD dataset source. For experimentation, the training data is used as cloud data and the testing data as fog data. That is, a significant subset of the NSL-KDD dataset is used in the cloud layer for training and validation, while the rest of the data is treated as unlabeled and used for real-time processing (testing) in the fog layer. Moreover, K-fold cross-validation with an 80:20 training-to-validation ratio is used at the cloud layer.
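The cloud-side validation scheme can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes K = 5, since five equal folds give the stated 80:20 ratio, and the helper `k_fold_indices` and the toy sample count are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible over k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Toy stand-in for the labeled cloud data (the NSL-KDD training segment).
n_cloud = 1000
splits = list(k_fold_indices(n_cloud, k=5))
for train_idx, val_idx in splits:
    # With k = 5, each round trains on 80% and validates on 20% of the data.
    print(len(train_idx), len(val_idx))
```

Every labeled record is used for validation exactly once, so the validation scores averaged over the five rounds cover the whole cloud dataset.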

4.4. Simulated Base Classifiers and Ensemble Methods

Simulating the proposed approach included the use of five machine learning classifiers and two ensemble methods. The classifiers used are: (1) decision tree (DT), (2) random forest (RF), (3) K-nearest neighbors (KNN), (4) logistic regression (LR), and (5) naïve Bayes (NB), while the ensemble techniques are voting and stacking. Table 2 lists each combination of base classifiers in the base layer. A total of 10 model combinations are tested, because we selected five base classifiers and formed every combination of three of them (i.e., 5C3 = 10).

Table 2. Base classifier combinations: decision tree (DT), random forest (RF), K-nearest neighbor (KNN), logistic regression (LR), naïve Bayes (NB).

Model   Base Classifier Combination

1       DT, RF, KNN

2       RF, KNN, LR

3       KNN, LR, NB

4       LR, NB, DT

5       NB, DT, RF

6       DT, KNN, LR

7       RF, LR, NB

8       KNN, NB, DT

9       LR, DT, RF

10      NB, RF, KNN
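The model combinations and the two ensemble methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, hard (majority) voting, a logistic-regression meta-learner for stacking, default hyperparameters, and synthetic data in place of NSL-KDD; the names `BASE`, `MODELS`, and `build_ensembles` are hypothetical.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, StackingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Factories, so that every ensemble gets fresh, unfitted base classifiers.
BASE = {
    "DT": lambda: DecisionTreeClassifier(random_state=0),
    "RF": lambda: RandomForestClassifier(n_estimators=20, random_state=0),
    "KNN": lambda: KNeighborsClassifier(),
    "LR": lambda: LogisticRegression(max_iter=1000),
    "NB": lambda: GaussianNB(),
}

# All 3-element subsets of the five classifiers: C(5, 3) = 10 models.
# The enumeration order differs from the numbering used in Table 2.
MODELS = list(combinations(BASE, 3))

def build_ensembles(names):
    """Return (voting, stacking) ensembles over the named base classifiers."""
    voting = VotingClassifier([(n, BASE[n]()) for n in names], voting="hard")
    stacking = StackingClassifier(
        [(n, BASE[n]()) for n in names],
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    )
    return voting, stacking

# Synthetic data standing in for the preprocessed NSL-KDD features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
voting, stacking = build_ensembles(("KNN", "NB", "DT"))  # model 8 in Table 2
voting.fit(X, y)
stacking.fit(X, y)
print(len(MODELS), voting.score(X, y), stacking.score(X, y))
```

In scikit-learn, stacking refits the base classifiers on internal cross-validation folds to train the meta-learner, so it does more work than voting for the same base classifiers.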

5. Results and Analysis

Here, we evaluate the results of the proposed approach for the cloud and fog layers using three factors: (1) execution time, (2) performance measures, and (3) error associated with the final model. On the cloud layer, the larger (training) portion of the data is used to build models and conduct experiments. The testing data is treated as new data and is evaluated on the fog layer. In the cloud layer, the best model is selected, and in the fog layer, it is evaluated using real-time data. We first summarize the cloud-layer results and explain how model 8 (listed in Table 2), with the voting ensemble method, was selected to be applied to the fog layer. Following that, we show the results obtained from the real-time data in the fog layer.
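The hand-off between the two layers can be sketched as follows. The paper does not describe how the trained model is transferred, so serialization with Python's `pickle` is an assumption made purely for illustration, as are the use of scikit-learn, hard voting, and the synthetic data that stands in for NSL-KDD.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the "cloud" part is labeled history, the "fog" part
# plays the role of new, real-time records.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_cloud, X_fog, y_cloud, y_fog = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Cloud side: train the selected ensemble (KNN, NB, DT with voting).
model = VotingClassifier(
    [("knn", KNeighborsClassifier()),
     ("nb", GaussianNB()),
     ("dt", DecisionTreeClassifier(random_state=0))],
    voting="hard",
)
model.fit(X_cloud, y_cloud)
payload = pickle.dumps(model)  # bytes that would be shipped to the fog node

# Fog side: load the trained model and run prediction only.
fog_model = pickle.loads(payload)
fog_predictions = fog_model.predict(X_fog)
print("fog accuracy:", (fog_predictions == y_fog).mean())
```

The fog node never trains anything in this sketch; its only cost is deserializing the model once and calling `predict` on incoming records.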

5.1. Cloud Layer Result Analysis

5.1.1. Execution Time

Figure 5 displays the execution time for the voting and stacking ensemble methods over all the models described in Table 2. The X-axis in Figure 5 shows the time in seconds taken to execute a model, while the Y-axis shows the model number. Stacking takes much longer to execute than voting. According to our results, model 8 with voting, which uses KNN, NB, and DT as base classifiers, has the lowest execution time (9.96 s).

Figure 5. Execution times of all models.

5.1.2. Performance Measures

Figure 6 shows overall performance as measured by kappa, F-measure, and the ROC area. All the models have values greater than 0.99, with model 8 achieving a kappa of 0.991, an F-measure of 0.995, and an ROC area of 0.999. Figure 7 shows the errors with voting as the ensemble method in terms of mean absolute error, root mean square error, relative absolute error, and root-relative squared error. Model 1 with voting, which uses DT, RF, and KNN as base classifiers, exhibits significantly fewer errors than any other model. Nevertheless, we selected model 8 with voting to run in the fog layer, as it performed well in terms of execution time and the other performance parameters shown in Figure 6. Based on Figure 7, the largest error of model 8 with voting is the root-relative squared error, at 27.94 percent, and the smallest is the mean absolute error, at 0.6 percent.

Figure 6. Performance of all models.


Figure 7. Errors associated with all the models.
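The four error measures in Figure 7 can be computed as in the sketch below. It uses the standard textbook definitions on numeric targets; the tool and exact formulation used in the experiments are not stated here, so the function `error_metrics` and the toy inputs are illustrative assumptions. RAE and RRSE compare the model's error with that of a naive predictor that always outputs the mean of the actual values.

```python
from math import sqrt

def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE, and RRSE for two equal-length numeric sequences."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    # Errors of the naive predictor that always outputs the mean actual value.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": sqrt(sum(sq_err) / n),
        "RAE": sum(abs_err) / base_abs,
        "RRSE": sqrt(sum(sq_err) / base_sq),
    }

# Toy example: true class (0 or 1) versus predicted probability of class 1.
actual = [0, 0, 1, 1]
predicted = [0.1, 0.2, 0.9, 0.6]
metrics = error_metrics(actual, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

Multiplying each value by 100 gives percentages of the kind reported in the text. For this toy input the MAE is 0.2 and the RAE is 0.4, meaning the predictions carry 40 percent of the absolute error of the naive predictor.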

To verify further, we calculate the performance of model 8 in terms of precision, F-measure, MCC, and PRC area (Figure 8), in addition to all other metrics. The Y-axis shows values to three decimal places. The most significant performance metric is MCC (Matthews correlation coefficient), which indicates how far the predictions are from random guessing. It ranges from −1 to 1. Model 8's values in the experiment are typically close to 99.99 percent. Overall, model 8 with voting is well suited to run on the fog layer, which requires real-time execution and excellent performance.

Figure 8. Performance of the selected model.
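To make the metrics concrete, the sketch below computes precision, recall, F-measure, Cohen's kappa, and MCC from a confusion matrix. It is a simplified illustration that assumes a binary (attack versus normal) setting; the function `binary_metrics` and the counts are hypothetical and are not results from the experiments.

```python
from math import sqrt

def binary_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from binary confusion counts."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient: -1 (total disagreement), 0 (no better
    # than random), +1 (perfect prediction).
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa, "mcc": mcc}

# Illustrative counts only.
scores = binary_metrics(tp=45, fp=5, fn=5, tn=45)
print({name: round(value, 3) for name, value in scores.items()})
```

With these balanced counts, accuracy and F-measure are 0.9 while kappa and MCC are both 0.8, which shows how the chance-corrected measures are stricter than raw agreement.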


We found that model 8, using K-nearest neighbor, naïve Bayes, and decision trees as the base classifiers, outperforms all other models with respect to execution time and performance metrics (such as kappa, F-measure, ROC, and MCC). Since time is an important factor in the selection of any model, we note that model 8 with the voting ensemble technique takes the least time: 1.15 s. Additionally, kappa, F-measure, ROC, and MCC reach values of 96.39, 98.20, 99.60, and 96.40 percent, respectively. There is also a mean absolute error of 7.78 percent, a root mean square error of 17.64 percent, a relative absolute error of 15.87 percent, and a root-relative squared error of 35.63 percent. Further, in Figure 7, the root-relative squared error of model 8 is 27.94 percent, and its mean absolute error is 0.6 percent. Model 8 is the most time-efficient of the tested models, which is why it is selected.

5.2. Fog Layer Result Analysis

With the new data now included, we measure the performance of model 8, which has KNN, NB, and DT as the base classifiers and voting as the ensemble method.

5.2.1. Performance Measures

Performance measures such as kappa, F-measure, and ROC indicate how well the model performs in the fog layer. Figure 9 illustrates that all performance indicators for the selected model are high and close to one another. The values are 96.39, 98.20, 99.60, and 96.40 percent for kappa, F-measure, ROC, and MCC, respectively.

Figure 9. Performance on the fog node (using a model with KNN, NB, and DT as the base classifiers as well as voting as an ensemble method).

5.2.2. Errors Associated

Figure 10 shows the mean absolute error (MAE), root mean square error (RMSE), relative absolute error (RAE), and root-relative squared error (RRSE) on the fog node. Our experiment yielded MAE, RMSE, RAE, and RRSE values of 7.78, 17.64, 15.87, and 35.63 percent, respectively.


Figure 10. Associated errors on the fog node (using a model with KNN, NB, and DT as the base classifiers as well as voting as an ensemble method). Here, MAE stands for mean absolute error, RMSE stands for root mean square error, RAE stands for relative absolute error, and RRSE stands for root-relative squared error.

5.2.3. Execution Time and CPU Usage

Along with the previously discussed performance metrics, we also calculated the execution time on the fog node of the chosen model, as well as of all other models (those not selected at the cloud layer), using voting as the ensemble method. These execution times are shown in Figure 11. The purpose is to check whether we selected the correct model in terms of execution time. On the fog node, model 8 with voting was the fastest of all models.

Figure 11. Execution time of all the models on the fog node.


Additionally, we calculated the CPU consumption within the fog layer. Less than 10% of the fog node's CPU is consumed. Therefore, our method does not require additional resources from fog nodes. Moreover, our approach has a low execution time. This shows that our approach is cost-effective.
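Execution time and CPU consumption of a prediction step can be measured along the following lines. The text does not state how these quantities were obtained, so this is only one possible approach using Python's standard `time` module; `measure`, `cpu_share`, and the stand-in workload `classify_batch` are hypothetical.

```python
import time

def measure(fn, *args):
    """Run fn(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = fn(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds

def cpu_share(cpu_seconds, window_seconds):
    """Fraction of one CPU core used over an observation window."""
    return cpu_seconds / window_seconds

def classify_batch(records):
    # Hypothetical stand-in for calling model.predict on a fog node.
    return [sum(r) > 1.0 for r in records]

batch = [[0.2 * i, 0.1 * i] for i in range(10_000)]
labels, wall, cpu = measure(classify_batch, batch)
print(f"wall={wall:.4f}s cpu={cpu:.4f}s "
      f"share over a 1 s window={cpu_share(cpu, 1.0):.1%}")
```

Wall-clock time corresponds to the execution time a user observes, while CPU time divided by the length of the observation window approximates how much of a core the detection task occupies.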

6. Conclusions

This study proposes an approach to offload the ensemble machine learning model selection task to the cloud and the real-time prediction task to fog nodes. With this technique, the cloud handles the more resource-intensive tasks and the fog nodes handle real-time computations, which simplifies real-time attack detection and reduces its execution time. The proposed approach has been tested on the NSL-KDD dataset. Using a range of performance indicators, such as kappa, F-measure, ROC, and MCC, our results showed that the model selected in the cloud layer performed well in the fog layer. Moreover, the selected model took the minimum time on the fog node, 1.15 s, in the experiments. The research also shows that the ensemble method with voting takes less time to execute than stacking.

Our study used the NSL-KDD dataset. Our future plans are to collect data from real testbed emulation. Currently, there are several testbeds available in the EU and the US [ ], such as Fed4Fire (https://www.fed4fire.eu/, accessed on 20 March 2022), COSMOS (https://cosmos-lab.org/, accessed on 20 March 2022) (Cloud-Enhanced Open Software-Defined Mobile Wireless Testbed for City-Scale Deployment), and POWDER (https://powderwireless.net/, accessed on 20 March 2022) (Platform for Open Wireless Data-Driven Experimental Research). We will create an edge/fog computing use case on these testbeds and run our proposed approach in an IoT scenario presented in an NGIAtlantic project [48].

Author Contributions: Formal analysis, V.T. and S.S.; Methodology, V.T. and S.S.; Supervision, S.S.; Validation, V.T.; Writing, original draft, V.T. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the EU H2020 NGIAtlantic project under agreement No. OC3-292.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Not applicable.

Acknowledgments: This work was carried out with the support of the EU H2020 NGIAtlantic project under agreement No. OC3-292.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Abdulghani, H.A.; Nijdam, N.A.; Collen, A.; Konstantas, D. A Study on Security and Privacy Guidelines, Countermeasures, Threats: IoT Data at Rest Perspective. Symmetry 2019, 11, 774. [CrossRef]

2. Wang, A.; Liang, R.; Liu, X.; Zhang, Y.; Chen, K.; Li, J. An Inside Look at IoT Malware. In Industrial IoT Technologies and Applications; Chen, F., Luo, Y., Eds.; Industrial IoT 2017; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer: Cham, Switzerland, 2017.

3. Razdan, S.; Sharma, S. Internet of Medical Things (IoMT): Overview, Emerging Technologies, and Case Studies. IETE Tech. Rev. 2021, 1–14. [CrossRef]

4. Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37. [CrossRef]

5. Chaabouni, N.; Mosbah, M.; Zemmari, A.; Sauvignac, C.; Faruki, P. Network Intrusion Detection for IoT Security Based on Learning Techniques. IEEE Commun. Surv. Tutor. 2019, 21, 2671–2701. [CrossRef]

6. Xiao, L.; Wan, X.; Lu, X.; Zhang, Y.; Wu, D. IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security? IEEE Signal Process. Mag. 2018, 35, 41–49. [CrossRef]

7. Giacinto, G.; Roli, F.; Bruzzone, L. Combination of neural and statistical algorithms for supervised classification of remote-sensing images. Pattern Recognit. Lett. 2000, 21, 385–397.


8. Bansal, A.; Mahapatra, S. A Comparative Analysis of Machine Learning Techniques for Botnet Detection. In Proceedings of the 10th International Conference on Security of Information and Networks SIN '17, New York, NY, USA, 13–15 October 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 91–98. [CrossRef]

9. Jaber, A.N.; Rehman, S.U. FCM–SVM based intrusion detection system for cloud computing environment. Clust. Comput. 2020, 23, 3221–3231.

10. Zhang, Y.; Ren, Y.; Wang, J.; Fang, L. Network forensic computing based on ANN-PCA. In Proceedings of the 2007 International Conference on Computational Intelligence and Security Workshops (CISW 2007), Harbin, China, 15–19 December 2007; pp. 942–945.

11. Hemavathi, D.; Srimathi, H. Effective feature selection technique in an integrated environment using enhanced principal component analysis. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 3679–3688.

12. Salo, F.; Nassif, A.B.; Essex, A. Dimensionality reduction with IG-PCA and ensemble classifier for network intrusion detection. Comput. Netw. 2019, 148, 164–175.

13. Hosseini, S.; Zade, B.M.H. New hybrid method for attack detection using combination of evolutionary algorithms, SVM, and ANN. Comput. Netw. 2020, 173, 107168.

14. Amor, N.B.; Benferhat, S.; Elouedi, Z. Naive bayes vs. decision trees in intrusion detection systems. In Proceedings of the 2004 ACM Symposium on Applied Computing, Nicosia, Cyprus, 14–17 March 2004; pp. 420–424.

15. Ingre, B.; Yadav, A. Performance analysis of NSL-KDD dataset using ANN. In Proceedings of the 2015 International Conference on Signal Processing and Communication Engineering Systems, Guntur, India, 2–3 January 2015; pp. 92–96. [CrossRef]

16. Zhang, C.; Ruan, F.; Yin, L.; Chen, X.; Zhai, L.; Liu, F. A Deep Learning Approach for Network Intrusion Detection Based on NSL-KDD Dataset. In Proceedings of the 2019 IEEE 13th International Conference on Anti-counterfeiting, Security, and Identification (ASID), Xiamen, China, 25–27 October 2019; pp. 41–45. [CrossRef]

17. Wang, H.; Sayadi, H.; Sasan, A.; Rafatirad, S.; Mohsenin, T.; Homayoun, H. Comprehensive Evaluation of Machine Learning Countermeasures for Detecting Microarchitectural Side-Channel Attacks; GLSVLSI '20; Association for Computing Machinery: New York, NY, USA, 2020; pp. 181–186. [CrossRef]

18. Ahmad, R.; Alsmadi, I. Machine learning approaches to IoT security: A systematic literature review. Int. Things (IoT) 2021, 14, 100365. [CrossRef]

19. Ambedkar, C.; Babu, V.K. Detection of probe attacks using machine learning techniques. Int. J. Res. Stud. Comput. Sci. Eng. (IJRSCSE) 2015, 2, 25–29.

20. Sabhnani, M.; Serpen, G. Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set. Intell. Data Anal. 2004, 8, 403–415.

21. Abdelkefi, A.; Jiang, Y.; Sharma, S. SENATUS: An Approach to Joint Traffic Anomaly Detection and Root Cause Analysis. In Proceedings of the 2018 2nd Cyber Security in Networking Conference (CSNet), Paris, France, 24–26 October 2018; pp. 1–8. [CrossRef]

22. Khare, N.; Devan, P.; Chowdhary, C.L.; Bhattacharya, S.; Singh, G.; Singh, S.; Yoon, B. Smo-dnn: Spider monkey optimization and deep neural network hybrid classifier model for intrusion detection. Electronics 2020, 9, 692. [CrossRef]

23. Manimurugan, S.; Majdi, A.Q.; Mohmmed, M.; Narmatha, C.; Varatharajan, R. Intrusion detection in networks using crow search optimization algorithm with adaptive neuro-fuzzy inference system. Microprocess. Microsyst. 2020, 79, 103261.

24. Kasliwal, B.; Bhatia, S.; Saini, S.; Thaseen, I.S.; Kumar, C.A. A hybrid anomaly detection model using G-LDA. In Proceedings of the 2014 IEEE International Advance Computing Conference (IACC), Gurgaon, India, 21–22 February 2014; pp. 288–293.

25. Ieracitano, C.; Adeel, A.; Morabito, F.C.; Hussain, A. A novel statistical analysis and autoencoder driven intelligent intrusion detection approach. Neurocomputing 2020, 387, 51–62.

26. Chan, Y.H. Biostatistics 305. Multinomial logistic regression. Singap. Med. J. 2005, 46, 259.

27. Liu, J.; Kantarci, B.; Adams, C. Machine learning-driven intrusion detection for contiki-NG-based IoT networks exposed to NSL-KDD dataset. In Proceedings of the 2nd ACM Workshop on Wireless Security and Machine Learning, Linz, Austria, 13 July 2020; pp. 25–30.

28. Su, T.; Sun, H.; Zhu, J.; Wang, S.; Li, Y. BAT: Deep learning methods on network intrusion detection using NSL-KDD dataset. IEEE Access 2020, 8, 29575–29585.

29. Abu Al-Haija, Q.; Al-Badawi, A. Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning. Sensors 2022, 22, 241. [CrossRef]

30. Yong, B.; Wei, W.; Li, K.C.; Shen, J.; Zhou, Q.; Wozniak, M.; Połap, D.; Damaševičius, R. Ensemble machine learning approaches for webshell detection in Internet of things environments. In Transactions on Emerging Telecommunications Technologies; Wiley: Hoboken, NJ, USA, 2020; p. e4085. [CrossRef]

31. Rashid, M.M.; Kamruzzaman, J.; Hassan, M.M.; Imam, T.; Gordon, S. Cyberattacks Detection in IoT-Based Smart City Applications Using Machine Learning Techniques. Int. J. Environ. Res. Public Health 2020, 17, 9347. [CrossRef]

32. Tsogbaatar, E.; Bhuyan, M.H.; Taenaka, Y.; Fall, D.; Gonchigsumlaa, K.; Elmroth, E.; Kadobayashi, Y. SDN-Enabled IoT Anomaly Detection Using Ensemble Learning. In Artificial Intelligence Applications and Innovations; Maglogiannis, I., Iliadis, L., Pimenidis, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 268–280.


33. Sharma, S. Towards Artificial Intelligence Assisted Software Defined Networking for Internet of Vehicles. In Intelligent Technologies for Internet of Vehicles; Magaia, N., Mastorakis, G., Mavromoustakis, C., Pallis, E., Markakis, E.K., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 191–222. [CrossRef]

34. Latif, S.A.; Wen, F.B.X.; Iwendi, C.; Li, F.; Wang, L.; Mohsin, S.M.; Han, Z.; Band, S.S. AI-empowered, blockchain and SDN integrated security architecture for IoT network of cyber physical systems. Comput. Commun. 2022, 181, 274–283. [CrossRef]

35. Rambabu, K.; Venkatram, N. Ensemble classification using traffic flow metrics to predict distributed denial of service scope in the Internet of Things (IoT) networks. Comput. Electr. Eng. 2021, 96, 107444. [CrossRef]

36. Kumar, P.; Gupta, G.P.; Tripathi, R. An ensemble learning and fog-cloud architecture-driven cyber-attack detection framework for IoMT networks. Comput. Commun. 2021, 166, 110–124. [CrossRef]

37. Khare, S.; Totaro, M. Ensemble Learning for Detecting Attacks and Anomalies in IoT Smart Home. In Proceedings of the 2020 3rd International Conference on Data Intelligence and Security (ICDIS), South Padre Island, TX, USA, 24–26 June 2020; pp. 56–63. [CrossRef]

38. Hung, Y.H. Improved Ensemble-Learning Algorithm for Predictive Maintenance in the Manufacturing Process. Appl. Sci. 2021, 11, 6832. [CrossRef]

39. Wang, J.; Pan, J.; Esposito, F.; Calyam, P.; Yang, Z.; Mohapatra, P. Edge cloud offloading algorithms: Issues, methods, and perspectives. ACM Comput. Surv. (CSUR) 2019, 52, 1–23.

40. Zhang, P.; Zhou, M.; Fortino, G. Security and trust issues in Fog computing: A survey. Future Gener. Comput. Syst. 2018, 88, 16–27.

41. Hu, P.; Dhelim, S.; Ning, H.; Qiu, T. Survey on fog computing: Architecture, key technologies, applications and open issues. J. Netw. Comput. Appl. 2017, 98, 27–42.

42. Tariq, N.; Asim, M.; Al-Obeidat, F.; Zubair Farooqi, M.; Baker, T.; Hammoudeh, M.; Ghafir, I. The Security of Big Data in Fog-Enabled IoT Applications Including Blockchain: A Survey. Sensors 2019, 19, 1788. [CrossRef]

43. Alzoubi, Y.I.; Osmanaj, V.H.; Jaradat, A.; Al-Ahmad, A. Fog computing security and privacy for the Internet of Thing applications: State-of-the-art. Secur. Priv. 2021, 4, e145. [CrossRef]

44. Alrawais, A.; Alhothaily, A.; Hu, C.; Xing, X.; Cheng, X. An attribute-based encryption scheme to secure fog communications. IEEE Access 2017, 5, 9131–9138.

45. Hu, P.; Ning, H.; Qiu, T.; Song, H.; Wang, Y.; Yao, X. Security and privacy preservation scheme of face identification and resolution framework using fog computing in internet of things. IEEE Int. Things J. 2017, 4, 1143–1155.

46. Li, Z.; Zhou, X.; Liu, Y.; Xu, H.; Miao, L. A non-cooperative differential game-based security model in fog computing. China Commun. 2017, 14, 180–189.

47. Osanaiye, O.; Chen, S.; Yan, Z.; Lu, R.; Choo, K.K.R.; Dlodlo, M. From cloud to fog computing: A review and a conceptual live VM migration framework. IEEE Access 2017, 5, 8284–8300.

48. ATLANTIC-eVISION: Cross-Atlantic Experimental Validation of Intelligent SDN-controlled IoT Networks 2021–2022. Available online: https://ngiatlantic.eu/funded-experiments/atlantic-evision-cross-atlantic-experimental-validation-intelligent-sdn (accessed on 20 March 2022).

49. Berman, M.; Demeester, P.; Lee, J.W.; Nagaraja, K.; Zink, M.; Colle, D.; Krishnappa, D.K.; Raychaudhuri, D.; Schulzrinne, H.; Seskar, I.; et al. Future Internets Escape the Simulator. Commun. ACM 2015, 58, 78–89. [CrossRef]

50. Suñé, M.; Bergesio, L.; Woesner, H.; Rothe, T.; Köpsel, A.; Colle, D.; Puype, B.; Simeonidou, D.; Nejabati, R.; Channegowda, M.; et al. Design and implementation of the OFELIA FP7 facility: The European OpenFlow testbed. Comput. Netw. 2014, 61, 132–150. [CrossRef]

Implementation of Lightweight Machine Learning-Based Intrusion Detection System on IoT Devices of Smart Homes

Article

Full-text available

Jun 2024

Smart home devices, also known as IoT devices, provide significant convenience; however, they also present opportunities for attackers to jeopardize homeowners’ security and privacy. Securing these IoT devices is a formidable challenge because of their limited computational resources. Machine learning-based intrusion detection systems (IDSs) have been implemented on the edge and the cloud; however, IDSs have not been embedded in IoT devices. To address this, we propose a novel machine learning-based two-layered IDS for smart home IoT devices, enhancing accuracy and computational efficiency. The first layer of the proposed IDS is deployed on a microcontroller-based smart thermostat, which uploads the data to a website hosted on a cloud server. The second layer of the IDS is deployed on the cloud side for classification of attacks. The proposed IDS can detect the threats with an accuracy of 99.50% at cloud level (multiclassification). For real-time testing, we implemented the Raspberry Pi 4-based adversary to generate a dataset for man-in-the-middle (MITM) and denial of service (DoS) attacks on smart thermostats. The results show that the XGBoost-based IDS detects MITM and DoS attacks in 3.51 ms on a smart thermostat with an accuracy of 97.59%.

Stacked Classification Model with Cryptographic Process in IoT Data to Prevent and Detect Attacks

Article

May 2024

N. Shashikala

Improved Crow Search-Based Feature Selection and Ensemble Learning for IoT Intrusion Detection

Article

Full-text available

Jan 2024

Network intrusion detection in the Internet of Things (IoT) framework has presented considerable challenges in recent decades. A wide variety of machine-learning approaches are introduced in network intrusion detection. The existing methodologies commonly lack consistency in achieving optimal performance across various multi-class categorization tasks. The present study elucidates implementing a unique intrusion system with the primary objective of enriching the efficacy of network intrusion detection. In the initial phase, it is imperative to employ data-denoising methodologies to effectively tackle the issue of data imbalance. In the next step, the enhanced Crow search algorithm is used to determine the most significant features that aid in better classifying intrusion attacks. In the final phase, the ensemble classifier takes the selected features as input to categorize the standard and invader labels. The present work introduces an ensemble mechanism that comprises four distinct classifiers. The assessment of the proposed approach is validated on two denoised datasets, specifically NSL-KDD and UNSW-NB15. The experimental outcomes demonstrate that the formulated approach achieves exceptional accuracy of 99.4% and 99.2% for the NSL-KDD and UNSW-NB15 datasets, respectively.

Evaluating Ensemble Learning Mechanisms for Predicting Advanced Cyber Attacks

Article

Full-text available

Dec 2023

With the increased sophistication of cyber-attacks, there is a greater demand for effective network intrusion detection systems (NIDS) to protect against various threats. Traditional NIDS are incapable of detecting modern and sophisticated attacks due to the fact that they rely on pattern-matching models or simple activity analysis. Moreover, Intelligent NIDS based on Machine Learning (ML) models are still in the early stages and often exhibit low accuracy and high false positives, making them ineffective in detecting emerging cyber-attacks. On the other hand, improved detection and prediction frameworks provided by ensemble algorithms have demonstrated impressive outcomes in specific applications. In this research, we investigate the potential of ensemble models in the enhancement of NIDS functionalities in order to provide a reliable and intelligent security defense. We present a NIDS hybrid model that uses ensemble ML techniques to identify and prevent various intrusions more successfully than stand-alone approaches. A combination of several distinct machine learning methods is integrated into a hybrid framework. The UNSW-NB15 dataset is pre-processed, and its features are engineered prior to being used to train and evaluate the proposed model structure. The performance evaluation of the ensemble of various ML classifiers demonstrates that the proposed system outperforms individual model approaches. Using all the employed experimental combination forms, the designed model significantly enhances the detection accuracy attaining more than 99%, while false positives are reduced to less than 1%.

Energy Consumption Reduction in Wireless Sensor Network-Based Water Pipeline Monitoring Systems via Energy Conservation Techniques

Article

Full-text available

Dec 2023

In wireless sensor network-based water pipeline monitoring (WWPM) systems, a vital requirement emerges: the achievement of low energy consumption. This primary goal arises from the fundamental necessity to ensure the sustained operability of sensor nodes over extended durations, all without the need for frequent battery replacement. Given that sensor nodes in such applications are typically battery-powered and often physically inaccessible, maximizing energy efficiency by minimizing unnecessary energy consumption is of vital importance. This paper presents an experimental study that investigates the impact of a hybrid technique, incorporating distributed computing, hierarchical sensing, and duty cycling, on the energy consumption of a sensor node in prolonging the lifespan of a WWPM system. A custom sensor node is designed using the ESP32 MCU and nRF24L01+ transceiver. Hierarchical sensing is implemented through the use of LSM9DS1 and ADXL344 accelerometers, distributed computing through the implementation of a distributed Kalman filter, and duty cycling through the implementation of interrupt-enabled sleep/wakeup functionality. The experimental results reveal that combining distributed computing, hierarchical sensing and duty cycling reduces energy consumption by a factor of eight compared to the lone implementation of distributed computing.

Enhancing Security of IoT Systems Using Machine Learning-Based Blockchain Technology

Chapter

Jun 2024

With the adoption of internet of things (IoT) devices, the security challenges associated with their implementation are a focus of research and development. This study examines the integration of cutting-edge technologies, machine learning knowledge, and blockchain to strengthen the payment security of IoT systems. The research explores the vulnerabilities present in traditional IoT architectures and presents a new approach that exploits machine learning algorithms embedded in a blockchain box. The core of this methodology involves implementing machine learning models to analyze large data sets generated by IoT devices, identifying anomalous patterns indicating possible security risks. However, locks are stored and validated by a secure lock, which ensures an immutable and transparent record of system activities. Blockchain integration alone improves data integrity, although it also establishes a decentralized consent mechanism, which reduces the risk of single points of failure and unauthorized access. This study evaluates the proposed solution through empirical analysis, considering real-world IoT scenarios. A comparative assessment was conducted against conventional security measures, demonstrating the effectiveness of machine learning-based blockchain approaches in detecting and mitigating diverse security threats. Additionally, this research addresses scalability and performance considerations, providing insight into the practical implementation of the proposed solution in the dynamic and resource-constrained environments typical of IoT ecosystems. In conclusion, this study contributes to the ongoing discourse on strengthening the security of IoT systems by merging machine learning and blockchain technologies. The proposed framework not only addresses challenges in today's security, but also emphasis on the foundation of infrastructure which is strong and adaptive also proficient of addressing the evolving IoT threat landscape in the near future.

Multicore Packet Distribution Method Using Multicore Network Interface Card Based on Tile-gx72 Network Processor

Conference Paper

Feb 2024

IoT Attack Detection Using LSTM Model

Conference Paper

Jan 2024

IoT Attack Detection and Prevention Through Machine Learning System

Conference Paper

Nov 2023

Machine Learning Approaches in Blockchain Technology-Based IoT Security: An Investigation on Current Developments and Open Challenges

Chapter

Feb 2024

In recent times, the implementation of blockchain technology has gained wide popularity in various applications such as healthcare, Internet of Things (IoT), and business, due to its critical security features. However, the rapid advancement of Internet of Things (IoT) platforms has also attracted cyber attackers, leading to numerous vulnerabilities. To address this issue, researchers have explored the combination of blockchain technology with machine learning algorithms. While there have been significant breakthroughs in this area, it is evident that certain limitations exist, making the approach unsuitable for all case studies. This book chapter explores the intersection of machine learning, blockchain technology, and the Internet of Things (IoT) in the context of security. Beginning with an overview of blockchain, IoT, and the role of machine learning techniques in blockchain technology, the chapter then delves into an investigation of relevant articles that employ machine learning-based blockchain technology in IoT security. The chapter also recognizes the current developments and open challenges in this field, highlighting potential areas for improvement and further study. By combining these two powerful technologies, the chapter aims to enhance security in IoT systems built on blockchain technology, while also addressing the limitations and exploring avenues for future research.

Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning

Article

Full-text available

Dec 2021
SENSORS-BASEL

Network Intrusion Detection Systems (NIDSs) are indispensable defensive tools against various cyberattacks. Lightweight, multipurpose, and anomaly-based detection NIDSs employ several methods to build profiles for normal and malicious behaviors. In this paper, we design, implement, and evaluate the performance of machine-learning-based NIDS in IoT networks. Specifically, we study six supervised learning methods that belong to three different classes: (1) ensemble methods, (2) neural network methods, and (3) kernel methods. To evaluate the developed NIDSs, we use the distilled-Kitsune-2018 and NSL-KDD datasets, both consisting of a contemporary real-world IoT network traffic subjected to different network attacks. Standard performance evaluation metrics from the machine-learning literature are used to evaluate the identification accuracy, error rates, and inference speed. Our empirical analysis indicates that ensemble methods provide better accuracy and lower error rates compared with neural network and kernel methods. On the other hand, neural network methods provide the highest inference speed which proves their suitability for high-bandwidth networks. We also provide a comparison with state-of-the-art solutions and show that our best results are better than any prior art by 1~20%.

Improved Ensemble-Learning Algorithm for Predictive Maintenance in the Manufacturing Process

Article

Full-text available

Jul 2021

Yu-Hsin Hung

Industrial Internet of Things (IIoT) technologies comprise sensors, devices, networks, and applications from the edge to the cloud. Recent advances in data communication and application using IIoT have streamlined predictive maintenance (PdM) for equipment maintenance and quality management in manufacturing processes. PdM is useful in fields such as device, facility, and total quality management. PdM based on cloud or edge computing has revolutionized smart manufacturing processes. To address quality management problems, herein, we develop a new calculation method that improves ensemble-learning algorithms with adaptive learning to make a boosted decision tree more intelligent. The algorithm predicts main PdM issues, such as product failure or unqualified manufacturing equipment, in advance, thus improving the machine-learning performance. Herein, semiconductor and blister packing machine data are used separately in manufacturing data analytics. The former data help in predicting yield failure in a semiconductor manufacturing process. The blister packing machine data are used to predict the product packaging quality. Experimental results indicate that the proposed method is accurate, with an area under a receiver operating characteristic curve exceeding 96%. Thus, the proposed method provides a practical approach for PdM in semiconductor manufacturing processes and blister packing machines.

Internet of Medical Things (IoMT): Overview, Emerging Technologies, and Case Studies

Article

Full-text available

May 2021

In the Internet of Medical Things (IoMT), the Internet of Things (IoT) is integrated with medical devices, enabling improved patient comfort, cost-effective medical solutions, quick hospital treatments, and even more personalized healthcare. The paper first introduces IoMT and then presents an IoMT architecture. Later, it describes the current operations of the healthcare system and discusses the mapping of these operations into the architectural diagram. Further, several emerging technologies such as Physically Unclonable Functions (PUF), Blockchain, Artificial Intelligence (AI), and Software-Defined Networking (SDN) are envisioned as important technologies to overcome several challenges in e-healthcare such as security, privacy, accuracy, and performance. Finally, we provide three case studies for IoMT based on: (1) PUF-based Authentication, (2) AI-enabled SDN Assisted e-healthcare, and (3) Blockchain Assisted Patient Centric System. The solutions presented in this paper may have a huge impact on the speed at which IoMT infrastructure can efficiently evolve with market evolution.

Fog computing security and privacy for the Internet of Thing applications: State‐of‐the‐art

Article

Full-text available

Dec 2020

Fog nodes are deployed near end‐users' Internet of Things (IoT) devices, which mitigates the lack of support for low latency, location awareness, and geographic distribution in many IoT applications. Moreover, Fog computing decreases the data offloaded to the Cloud, which decreases the response time. Despite these benefits, Fog computing faces many challenges in meeting security and privacy requirements. These challenges occur due to the limitations of Fog computing resources. In fact, Fog computing may add new security and privacy issues. Although many papers have discussed Fog security and privacy issues recently, most of these papers have discussed these issues at a very high level. This paper provides a comprehensive understanding of Fog privacy and security issues. In this survey, we review the literature on Fog computing to draw the state‐of‐the‐art of the security and privacy issues raised by Fog computing. The findings of this survey reveal that the study of Fog computing is still in its infancy. Many questions are yet to be answered to address the privacy and security challenges of Fog computing.

Cyberattacks Detection in IoT-Based Smart City Applications Using Machine Learning Techniques

Article

Full-text available

Dec 2020
Int J Environ Res Publ Health

In recent years, the widespread deployment of the Internet of Things (IoT) applications has contributed to the development of smart cities. A smart city utilizes IoT-enabled technologies, communications and applications to maximize operational efficiency and enhance both the service providers’ quality of services and people’s wellbeing and quality of life. With the growth of smart city networks, however, comes the increased risk of cybersecurity threats and attacks. IoT devices within a smart city network are connected to sensors linked to large cloud servers and are exposed to malicious attacks and threats. Thus, it is important to devise approaches to prevent such attacks and protect IoT devices from failure. In this paper, we explore an attack and anomaly detection technique based on machine learning algorithms (LR, SVM, DT, RF, ANN and KNN) to defend against and mitigate IoT cybersecurity threats in a smart city. Contrary to existing works that have focused on single classifiers, we also explore ensemble methods such as bagging, boosting and stacking to enhance the performance of the detection system. Additionally, we consider an integration of feature selection, cross-validation and multi-class classification for the discussed domain, which has not been well considered in the existing literature. Experimental results with the recent attack dataset demonstrate that the proposed technique can effectively identify cyberattacks and the stacking ensemble model outperforms comparable models in terms of accuracy, precision, recall and F1-Score, implying the promise of stacking in this domain.

DDoS Detection using Machine Learning Techniques

Article

May 2022

A Distributed Denial of Service (DDoS) attack is a type of cyber-attack that attempts to interrupt regular traffic on a targeted server by overloading the target. The system under DDoS attack remains occupied with the requests from the bots rather than providing service to legitimate users. These kinds of attacks are complicated to detect and are increasing day by day. In this paper, machine learning algorithms are employed to classify normal and DDoS attack traffic. DDoS attacks are detected using four machine learning classification techniques. The machine learning algorithms are trained and tested using the CICDDoS2019 dataset, gathered by the Canadian Institute for Cybersecurity. When compared against KNN, Decision Tree, and Random Forest, the Artificial Neural Network (ANN) generates the best results.

Ensemble classification using traffic flow metrics to predict distributed denial of service scope in the Internet of Things (IoT) networks

Article

Dec 2021
COMPUT ELECTR ENG

IoT networks are highly vulnerable to distributed denial of service (DDoS), which is the most serious and intolerable attack format. Competent solutions for handling DDoS attacks on IoT networks are limited. Contemporary research focuses on incorporating machine learning to devise novel defense measures against DDoS attacks on IoT networks. This manuscript addresses high dimensionality in the training data of machine-learning-based DDoS attack defense in IoT networks. It presents a novel ensemble classification using traffic flow metrics (ECTFM) as features to predict DDoS attacks on IoT networks. Cross-validation results from the experimental study indicate the importance of the proposed approach for DDoS defense accuracy with fewer false alarms. The performance of the proposed approach has been assessed by comparing it with contemporary DDoS defense contributions for IoT networks.

AI-empowered, blockchain and SDN integrated security architecture for IoT network of cyber physical systems

Article

Oct 2021
COMPUT COMMUN

Internet of things (IoT) is one of the most emerging technologies nowadays and it is one of the key enablers of industrial cyber physical systems (CPSs). It has started to participate in almost every aspect of our social life, ranging from financial transactions to the healthcare system, communication to national security, battlefield to smart homes, and so on. However, the wide deployment of IoT suffers from certain issues as well, such as interoperability, compatibility, heterogeneity, large amounts of data, and processing of heterogeneous data. Among others, energy efficiency and security are the most prominent issues. Scarce computing resources of IoT devices hinder information sharing across edge or IoT networks. Indeed, unintentional or malicious interference with IoT data may lead to severe concerns. In this study, the researcher exploits the potential benefits of a blockchain system and integrates it with software-defined networking (SDN) while addressing energy and security issues. In more detail, the researcher proposes a new routing protocol with a cluster structure for IoT networks using a blockchain-based architecture for the SDN controller. The proposed architecture obviates proof-of-work (PoW) with private and public blockchains for Peer-to-Peer (P2P) communication between SDN controllers and IoT devices. In addition, a distributed trust-based authentication mechanism makes blockchain even more adoptable for IoT devices with limited resources. The experimental results show that the proposed cluster structure based routing protocol outperforms the state-of-the-art Ad-hoc On-demand Distance Vector (AODV), Destination-Sequenced Distance Vector (DSDV), Secure Mobile Sensor Network (SMSN), Energy efficient secured cluster based distributed fault diagnosis (EESCFD), and Ad-hoc On-demand Multipath Distance Vector (AOMDV), in terms of energy consumption, network throughput, and packet latency. The proposed protocol helps overcome these issues, especially energy management and security, in next-generation industrial cyber physical systems.

Towards Artificial Intelligence Assisted Software Defined Networking for Internet of Vehicles

Chapter

Jun 2021

Sachin Sharma

In the Internet of Vehicles (IoV), the Internet of Things (IoT) is integrated with Vehicular Ad hoc NETworks (VANET). This enables gathering, processing and sharing of lots of information (regarding vehicles, roads and their surroundings) through the Internet and hence, helps in making intelligent decisions. On the other hand, Software Defined Networking (SDN) has the capability of designing a flexible programmable IoV network that can foster innovation and reduce complexity. Applying SDN in IoV will be useful, as SDN enabled IoV devices can be controlled seamlessly from an external server (called a controller) which can be located in the cloud and may have computational resources to run resource-intensive algorithms, making intelligent decisions. This chapter provides an introduction about SDN, describes the benefits of integrating SDN in IoV and reports the recent advances. It also presents an Artificial Intelligence (AI) based architecture and open challenges. Finally, the chapter presents an automatic configuration method with which SDN can be deployed automatically in IoV without any manual configuration. The experiments are performed on a publicly available European testbed using an emulator for wireless SDN networks. Experiments are conducted for automatic configuration of SDN in IoV network’s topologies and for data collection in SDN enabled IoV. The results show the effectiveness of the proposed automatic configuration method. Furthermore, AI-assisted intelligent decisions supported by SDN enabled IoV are introduced. The challenges and solutions presented in this chapter may have a huge impact on the speed at which IoV infrastructure can efficiently evolve with market evolution.

Machine Learning Approaches to IoT Security: A Systematic Literature Review

Article

Jan 2021

With the continuous expansion and evolution of IoT applications, attacks on those IoT applications continue to grow rapidly. In this systematic literature review (SLR) paper, our goal is to provide a research asset to researchers on recent research trends in IoT security. As the main driver of our SLR paper, we proposed six research questions related to IoT security and machine learning. This extensive literature survey on the most recent publications in IoT security identified a few key research trends that will drive future research in this field. With the rapid growth of large-scale IoT attacks, it is important to develop models that can integrate state-of-the-art techniques and technologies from big data and machine learning. Accuracy and efficiency are key quality factors in finding the best algorithms and models to detect IoT attacks in real or near real-time.

Recommended publications

Fog-Empowered Anomaly Detection in Internet of Things using Hyperellipsoidal Clustering

FBAD: Fog-based Attack Detection for IoT Healthcare in Smart Cities

Influence of Montoring: Fog and Edge Computing

Secure Data Query Framework for Cloud and Fog Computing

Secure Data Sequence Query Framework Based on Multiple Fogs

Secure Data Transmission Techniques for Privacy Preserving Computation Offloading Between Fog Comput...

A Novel Approach for Privacy Preserving Technique in IoT Fog and Cloud Environment