
An effective way of cloud intrusion detection system using decision tree, support vector machine and naïve bayes algorithm

Authors:
  • T. Nathiya and G. Suseendran, Vels Institute of Science, Technology & Advanced Studies (VISTAS), Chennai

International Journal of Recent Technology and Engineering (IJRTE)
ISSN: 2277-3878, Volume-7 Issue-4S2, December 2018
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: ES2036017519/18©BEIESP
Abstract: Cloud computing is a vast area that lets users consume
resources cost-effectively. The service provider shares these
resources anywhere, at any time, but all access to data in the
cloud depends on the network, and malicious users take advantage
of the cloud network. An Intrusion Detection System (IDS)
monitors the network and reports attacks. Within an IDS, the
anomaly technique is the most important. Whenever a virtual
machine is created, the IDS tracks known and unknown data. If
unknown data is found, the IDS classifies it with an anomaly
classification algorithm and sends a report to the administrator.
This paper proposes using support vector machine (SVM), Naive
Bayes, and decision tree (J48) algorithms to predict unwanted
data. These algorithms help to overcome the high false alarm
rate. The proposed work is implemented with the WEKA tool, which
produces a statistical report and gives a better outcome in
little calculation time.
Keywords: SVM, Naive Bayes, Decision Tree (J48), NSL-KDD
dataset, H-IDS.
I. INTRODUCTION
Cloud computing resides at a remote location and provides
services over the network. Users can configure, access, and
manipulate resources such as data storage, infrastructure,
servers, and applications [1]. Anything can be accessed as a
service (infrastructure, platform, or software) from anywhere in
the world through the internet. Cloud computing communicates at
two ends (see Figure 1). At the front end, the user communicates
with the cloud: the user requests resources such as hardware or
software to maintain databases and develop applications, and the
cloud delivers these services via the network. At the other end,
the cloud communicates with a third party. The virtual machine
monitor (i.e., the IDS) runs across multiple virtual machines on
the physical layer. The third party fully maintains the web and
application servers, database servers, and development tools [2].
Government, business, academic, medical, and many other
organizations increasingly use Information Technology (IT) in
cloud computing, but they need strong security because many
network attacks intrude into the cloud. Traditional attacks
include IP spoofing, DDoS, User to Root, and port scanning. An
IDS is an efficient solution, carried over from traditional
networks, for securing packets. The role of the IDS is to observe
the network, predict malicious activity, and report it to the
cloud administrator.
Revised Manuscript Received on December 30, 2018.
T. Nathiya, Ph.D. Research Scholar, Department of Computer
Science, School of Computing Science, Vels Institute of Science,
Technology & Advanced Studies (VISTAS), Chennai, Tamil Nadu, India.
G. Suseendran, Department of Information Technology, School of
Computing Science, Vels Institute of Science, Technology & Advanced
Studies (VISTAS), Chennai, Tamil Nadu, India.
If an intrusion is detected, the IDS issues an alert so that the
event is watched continuously, whether the alert is a true
positive or a false alarm. The cloud network IDS is placed at the
cloud server and administered by the service provider. A cloud
IDS must handle large-scale computing systems and be automated,
scalable, and synchronized [3][4][5].
A network intrusion detection system must select features, and
the reduced feature set must be easy to extract from high-speed
data. A local area network forwards packets at one gigabit per
second, whereas hard disks are much slower. The minimum frame
size is 64 bytes, so between roughly 1.48 million (at 1 Gbit/s)
and 14.8 million (at 10 Gbit/s) frames can be transferred per
second. Monitoring the data during this transfer is a major
challenge in cloud computing, and the most critical challenge is
real-time detection [6].
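The frame-rate figure can be checked with a short calculation. The sketch below assumes standard Ethernet framing, where every 64-byte frame is accompanied on the wire by 20 further bytes (preamble, start delimiter, and inter-frame gap). That overhead and the function name are assumptions for illustration, not values stated in the paper.

```python
def max_frames_per_second(link_bits_per_second, frame_bytes=64, overhead_bytes=20):
    """Upper bound on Ethernet frames per second for a given link speed.

    overhead_bytes covers the 8-byte preamble/start delimiter and the
    12-byte inter-frame gap that accompany every frame on the wire.
    """
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bits_per_second / bits_per_frame


for name, speed in [("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)]:
    print(f"{name}: {max_frames_per_second(speed):,.0f} frames/s")
```

Under these assumptions a 1 Gbit/s link carries about 1.49 million minimum-size frames per second and a 10 Gbit/s link about 14.9 million, which is the rate at which an IDS would have to inspect traffic in the worst case.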
This paper aims to classify data using four anomaly-based
algorithms in order to build an effective system for detecting
intrusions in cloud computing. These anomaly-based techniques are
discussed later in the paper. Feature selection over the dataset
attributes is carried out using various tools.
The rest of this paper is organized as follows. The next section
covers related work and briefly describes anomaly-based
techniques. The following sections present the classification
algorithms and then the dataset and preprocessing. The final
sections report the experimental results, conclude the paper, and
outline future work.
II. ANOMALY BASED TECHNIQUES AND
RELATED WORK
A hybrid network intrusion detection system (H-NIDS) is designed
to be installed in the virtual network at each host layer. The
H-NIDS monitors network traffic, reports to the higher layer, and
uses both signature-based and anomaly-based techniques [7]. The
misuse (signature) technique identifies only well-known attacks
from a signature database, using Snort rules with a fast
multi-pattern matching algorithm to detect them. The
anomaly-based technique helps with unknown attacks and relies on
data mining, statistical modeling, and machine learning. In our
previous paper we created a security framework that includes
classification algorithms such as decision tree, SVM, and Naïve
Bayes. Applied in an anomaly-based IDS, these techniques provide
better accuracy and confidentiality, low communication cost, and
few false alerts [8][9].
Md. Al Mehedi Hasan et al. [10] proposed two classification
models for classifying attacks, one based on Support Vector
Machine (SVM) and the other on Random Forest. They used 90% of
the dataset for training and 10% for testing, which is not
sufficient to verify accuracy. Execution time was low, but
detection accuracy did not reach the expected rate, so more test
results are required. Nabila Farnaaz et al. [11] proposed a
Random Forest classifier for IDS and compared it against
traditional attacks, using the NSL-KDD dataset to evaluate how
well random forest detects attacks such as DoS, Probe, U2R, and
R2L. However, they improve the accuracy of the classifier only
through the feature selection measure. G.V. Nadiammai et al. [12]
integrated data mining with IDS to identify relevant and hidden
data with less execution time. They used various classification
algorithms on the KDD dataset. The proposed algorithm achieved
good accuracy and false alarm rate, but two issues remain: a lack
of user information, and the techniques do not yield an automatic
intrusion detection system. Xueyan Jing et al. [13] proposed
using the Fuzzy C-Means (FCM) algorithm to find the cluster
centers of the training dataset and the K-Nearest Neighbors
algorithm combined with Dempster-Shafer theory. The proposed
algorithms were applied to the KDD'99 dataset and were effective
in detecting unknown attacks, but more training and testing
results are needed.
Rahimeh Rouhi et al. [14] proposed anomaly detection techniques
applied with feature selection on the KDD Cup 99 dataset. A
feed-forward neural network was trained to predict the normal and
attack packets in the dataset, but the work needs more training
and testing datasets. Opeyemi Osanaiye et al. [15] proposed a
decision tree classification algorithm to detect DDoS attacks,
together with feature selection methods. However, the paper does
not report confusion matrix values, accuracy, or detection rate,
and does not explain how the decision tree classification works.
Özge Cepheli et al. [16] proposed a detection system that adapts
to network traffic packets along with sensitivity parameters. The
proposed hybrid IDS detects DDoS attacks, but the results show
decreased performance, and training performance is needed. Only a
limited portion of the DARPA 2000 dataset was used, and the
proposed model is not clear. They used a penetration testing tool
to obtain a detailed dataset from a commercial bank.
III. CLASSIFICATION TECHNIQUES
IDS techniques fall into two classes: signature-based and
anomaly-based. For signature-based detection we use Snort rules
to detect known attacks; for anomaly-based detection we use
classifiers such as naïve Bayes, decision tree, and SVM to detect
unknown attacks [17].
A. Snort
Observing real-time traffic to detect intrusions is very
difficult under heavy load, and Snort offers a solution for
network intrusion detection. Snort rules are extremely flexible
and, unlike those of commercial NIDS, easy to modify. Snort can
run in four modes (sniffer, packet logger, IDS, and IPS)
[16][18]. Users can write their own Snort rules for incoming and
outgoing network packets; each rule combines two parts, the
header and the options. A rule applies only to packets that meet
its threshold condition.
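To make the header/options split concrete, the following sketch separates a single-line Snort rule into its two parts. The rule text, its message, and the function name are hypothetical and chosen only for illustration. The parser assumes the common layout of seven space-separated header fields followed by semicolon-separated options in parentheses, and it does not handle option values that contain ';' or ':' inside quotes.

```python
def split_snort_rule(rule):
    """Split a single-line Snort rule into header fields and options."""
    header_text, _, options_text = rule.partition("(")
    # Header: action, protocol, source address/port, direction, destination address/port
    action, protocol, src_ip, src_port, direction, dst_ip, dst_port = header_text.split()
    header = {
        "action": action, "protocol": protocol,
        "src_ip": src_ip, "src_port": src_port,
        "direction": direction,
        "dst_ip": dst_ip, "dst_port": dst_port,
    }
    # Options: "key:value;" pairs inside the parentheses
    options = {}
    for item in options_text.rstrip().rstrip(")").split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(":")
        options[key.strip()] = value.strip().strip('"')
    return header, options


# Hypothetical rule for illustration only
rule = 'alert tcp any any -> 192.168.1.0/24 80 (msg:"Possible port scan"; sid:1000001; rev:1;)'
header, options = split_snort_rule(rule)
print(header)
print(options)
```

The header decides which packets the rule looks at (action, protocol, addresses, ports, direction), while the options carry the alert message and rule metadata.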
B. Naive Bayesian Classification
Naive Bayes is a supervised learning algorithm as well as a
statistical method of classification. The learning algorithm
produces a function that predicts output values; after training,
the system provides the target for new input values. Let $g$ be
the class variable and $h_1, h_2, \ldots, h_n$ the set of
attributes. By Bayes' theorem,

$$p(g \mid h_1, h_2, \ldots, h_n) = \frac{p(h_1, h_2, \ldots, h_n \mid g)\, p(g)}{p(h_1, h_2, \ldots, h_n)} \tag{1}$$

Under the naive independence assumption, for all $i = 1, 2, \ldots, n$,

$$p(h_i \mid g, h_1, \ldots, h_{i-1}, h_{i+1}, \ldots, h_n) = p(h_i \mid g) \tag{2}$$

so Eq. 1 becomes

$$p(g \mid h_1, h_2, \ldots, h_n) = \frac{p(g) \prod_{i=1}^{n} p(h_i \mid g)}{p(h_1, h_2, \ldots, h_n)} \tag{3}$$

The classification equation is:

$$p(g \mid h_1, h_2, \ldots, h_n) \propto p(g) \prod_{i=1}^{n} p(h_i \mid g), \qquad \hat{g} = \arg\max_{g}\; p(g) \prod_{i=1}^{n} p(h_i \mid g) \tag{4}$$

Here $p(g \mid h)$ is the probability of g given h, $p(g)$ is the
prior probability of hypothesis g, $p(h)$ is the prior
probability of the training data h, and $p(h \mid g)$ is the
probability of h given g. The classification rule in Eq. 4 helps
improve the speed and accuracy of the IDS [19].
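As an illustration of the classification rule, here is a minimal naive Bayes sketch for categorical attributes. It is not the WEKA implementation used in the paper. The toy records, the function names, and the add-one (Laplace) smoothing are assumptions introduced for the example.

```python
from collections import Counter, defaultdict


def train_naive_bayes(records):
    """Count class frequencies and per-attribute value frequencies.

    records: list of (attributes, label) pairs with categorical attribute tuples.
    """
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)   # (label, attribute index) -> value counts
    attribute_values = defaultdict(set)   # attribute index -> distinct values seen
    for attributes, label in records:
        for i, value in enumerate(attributes):
            value_counts[(label, i)][value] += 1
            attribute_values[i].add(value)
    return class_counts, value_counts, attribute_values


def predict(model, attributes):
    """Return the class g that maximizes p(g) * prod_i p(h_i | g)."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior p(g)
        for i, value in enumerate(attributes):
            # Add-one smoothing (an assumption) keeps an unseen value
            # from forcing the whole product to zero.
            numerator = value_counts[(label, i)][value] + 1
            denominator = count + len(attribute_values[i])
            score *= numerator / denominator
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy records (protocol, service) -> label; invented for illustration
records = [
    (("tcp", "http"), "normal"),
    (("tcp", "http"), "normal"),
    (("udp", "dns"), "normal"),
    (("tcp", "private"), "anomaly"),
    (("icmp", "ecr_i"), "anomaly"),
]
model = train_naive_bayes(records)
print(predict(model, ("tcp", "http")))
print(predict(model, ("icmp", "ecr_i")))
```

The denominator $p(h_1, \ldots, h_n)$ is the same for every class, so the sketch compares only the numerators, which is why the proportional form is sufficient for prediction.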
C. Decision Tree Classifier
The decision tree belongs to the family of supervised learning
algorithms. Decision tree rules are easy for users to understand
and are supported by knowledge systems such as the Weka tool,
where C4.5 is implemented as J48. The main motive for using
decision tree rules is to create a training model that predicts
the class value. The information gain ratio is the measure used
to choose the splitting features. A decision tree has a tree
structure containing decision nodes and leaf nodes. Decision
node: the root node and each internal node correspond to an
attribute. Leaf node: corresponds to a class value. The Weka
classifier window offers various classifier groups such as bayes,
functions, meta, and trees. The entropy of a set of instances E
is computed as in Eq. 5:

$$\mathrm{Entropy}(E) = -\sum_{i=1}^{A} p(E, i) \log p(E, i) \tag{5}$$

where A is the total number of intrusion classes in the given
dataset and $p(E, i)$ is the ratio of instances in E that are
assigned to the i-th class.
The information gain of an attribute G is calculated as in Eq. 6:

$$\mathrm{Gain}(E, G) = \mathrm{Entropy}(E) - \sum_{m \in \mathrm{values}(G)} \frac{|E_{G,m}|}{|E|}\, \mathrm{Entropy}(E_{G,m}) \tag{6}$$

where $E_{G,m}$ is the subset of E in which attribute G takes the
value m. The gain ratio is then given by Eq. 7:

$$\mathrm{GainRatio}(E, G) = \frac{\mathrm{Gain}(E, G)}{\mathrm{SplitInfo}(E, G)} \tag{7}$$

where $\mathrm{SplitInfo}(E, G)$ is calculated as

$$\mathrm{SplitInfo}(E, G) = -\sum_{m \in \mathrm{values}(G)} \frac{|E_{G,m}|}{|E|} \log \frac{|E_{G,m}|}{|E|} \tag{8}$$

Using Eqs. 7 and 8, we select as the best split node the feature
with the maximum gain ratio, which reduces the computational
complexity [20].
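The gain ratio computation can be sketched in a few lines. This is an illustrative reading of the entropy, information gain, and split information formulas, not the J48 code from WEKA. Base-2 logarithms, the function names, and the toy data are assumptions.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy of a list of class labels, using base-2 logarithms (assumed base)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one attribute.

    values[k] is the attribute value and labels[k] the class of instance k.
    """
    total = len(labels)
    subsets = defaultdict(list)
    for value, label in zip(values, labels):
        subsets[value].append(label)
    # Weighted entropy of the subsets produced by splitting on the attribute
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    split_info = -sum(len(s) / total * math.log2(len(s) / total) for s in subsets.values())
    return gain / split_info if split_info > 0 else 0.0


# Toy data, invented for illustration
labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "udp", "udp"]   # separates the classes perfectly
flag = ["a", "b", "a", "b"]               # carries no class information

print(gain_ratio(protocol, labels))
print(gain_ratio(flag, labels))
```

In this toy case the tree would split on `protocol`, because it has the larger gain ratio. Dividing by the split information is what penalizes attributes that fragment the data into many small subsets.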
D. Support vector machine Classifier
The SVM learning algorithm handles both classification and
regression, but it is mainly used for classification problems.
SVM separates two classes with a hyperplane, choosing the
hyperplane that gives the maximum margin to the training data
[10][21].
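A short sketch may help to show what the separating hyperplane does once it has been found. The weight vector and bias below are hand-picked, hypothetical values rather than the result of training. The code only evaluates the linear decision rule sign(w·x + b) and the margin width 2/||w|| that a linear SVM maximizes.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the margin hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical, hand-picked hyperplane (not learned from data)
w, b = (3.0, 4.0), -5.0
print(classify(w, b, (2.0, 1.0)))   # score 3*2 + 4*1 - 5 = 5
print(classify(w, b, (0.0, 0.0)))   # score -5
print(margin_width(w))
```

Training an SVM amounts to choosing w and b so that this margin is as wide as possible while the training points stay on the correct side.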
IV. PROPOSED APPROACH
In this section, we discuss the network intrusion detector built
with machine learning algorithms. Figure 1 explains the proposed
work in detail.
Figure 1: Sequence of Proposed approach.
The network intrusion detector detects network intrusions. At
some stages the NIDS cannot tell whether network packets are
normal or abnormal; in that critical situation, we use machine
learning techniques to classify the packets as normal or
abnormal.
Our proposed procedure is described below.
Step 1: Load the intrusion dataset, which contains 41 attributes
(features).
Step 2: Preprocess the data to remove irrelevant and redundant
records.
Step 3: Apply the machine learning classification techniques
(SVM, naïve Bayes, and decision tree).
Step 4: Use the classification techniques to build (train) the
model.
Step 5: Predict whether the data is normal or abnormal.
Step 6: Finally, compare the three classification techniques and
their performance results.
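The six steps can be sketched as a small evaluation harness. The paper carries out these steps in WEKA; the code below only mirrors the workflow in plain Python. `MajorityClassifier` is a placeholder standing in for the real classifiers, and all names and records are hypothetical.

```python
from collections import Counter


def deduplicate(records):
    """Step 2 (simplified): drop exact duplicate records, keeping first occurrences."""
    seen = set()
    unique = []
    for features, label in records:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


class MajorityClassifier:
    """Placeholder model: always predicts the most frequent training label."""

    def fit(self, records):
        self.label = Counter(label for _, label in records).most_common(1)[0][0]
        return self

    def predict(self, features):
        return self.label


def accuracy(model, records):
    """Fraction of records whose predicted label matches the true label."""
    correct = sum(model.predict(features) == label for features, label in records)
    return correct / len(records)


def compare(models, train, test):
    """Steps 3 to 6: train each model on cleaned data and compare accuracies."""
    train = deduplicate(train)
    return {name: accuracy(model.fit(train), test) for name, model in models.items()}


# Step 1 stand-in: tiny hypothetical records instead of the 41-attribute dataset
train = [((0,), "normal"), ((0,), "normal"), ((1,), "anomaly"), ((2,), "normal")]
test = [((0,), "normal"), ((1,), "anomaly")]
print(compare({"majority baseline": MajorityClassifier()}, train, test))
```

Any classifier that offers the same `fit` and `predict` methods could be added to the `models` dictionary, so the three algorithms are compared on identical preprocessed data.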
V. DATASET DESCRIPTION
We use the NSL-KDD dataset, an improved version of the KDD
dataset. Its attributes can be used to detect attacks such as
DoS, Probe, R2L, and U2R, and it is mainly used for anomaly
detection.
The advantages of the NSL-KDD dataset are as follows. First, the
training set does not include redundant records, so the
classifiers will not produce biased results. Second, the test set
contains no duplicate records. NSL-KDD has produced better
reduction rates [22].
Each record has 42 attributes, comprising the feature data and
one of five network classes. One is the normal class and the
other four are attack classes: DoS, Probe, R2L, and U2R. Table 1
shows the major types of attack in both the training and testing
datasets [23].
Table 1: Attack classes and their types
Class  | Code | Type of network intrusion
DoS    | 1    | back, land, neptune, pod, smurf, teardrop
Probe  | 2    | ipsweep, nmap, portsweep, satan
R2L    | 3    | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster
U2R    | 4    | buffer_overflow, loadmodule, perl, rootkit
Normal | 0    | normal
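In preprocessing, the attack names of Table 1 have to be turned into numeric class codes. A minimal sketch of that encoding is shown below. The codes come from Table 1, while the dictionary and function names are hypothetical, and the grouping comments follow the four attack classes named in the text.

```python
# Numeric class codes taken from Table 1
ATTACK_CLASS = {
    "normal": 0,
    # DoS
    "back": 1, "land": 1, "neptune": 1, "pod": 1, "smurf": 1, "teardrop": 1,
    # Probe
    "ipsweep": 2, "nmap": 2, "portsweep": 2, "satan": 2,
    # R2L
    "ftp_write": 3, "guess_passwd": 3, "imap": 3, "multihop": 3,
    "phf": 3, "spy": 3, "warezclient": 3, "warezmaster": 3,
    # U2R
    "buffer_overflow": 4, "loadmodule": 4, "perl": 4, "rootkit": 4,
}


def encode_label(label):
    """Map a record label to its Table 1 class code (case-insensitive).

    Raises KeyError for a label that does not appear in Table 1.
    """
    return ATTACK_CLASS[label.strip().lower()]


print(encode_label("Neptune"))
print(encode_label("rootkit"))
```

Labels outside Table 1 raise a `KeyError` rather than being silently mapped, so unexpected attack names surface during preprocessing.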
VI. EXPERIMENTAL RESULT AND ANALYSIS
A. Experimental Setup
For the experimental setup, we use the NSL-KDD intrusion dataset
and the automated data analysis tool WEKA to perform the
classification tests. WEKA is a data mining tool that contains
clustering, pre-processing, classification, regression, and
feature selection models, and it runs on Windows. Only 20% of the
NSL-KDD dataset is used to perform the classification. Classifier
performance is evaluated using several parameters: true detection
rate, false detection rate, accuracy, and execution time.
B. Processing, Feature selection and Classification
The dataset is first preprocessed so that values lie in the range
0 to 1 (i.e., the researcher chooses this level equal to 0.01,
0.05, or 0.10). Feature selection works on the 41 features
available in the dataset. The SVM, Naïve Bayes, and decision tree
algorithms are used in this classification work.
C. Result Analysis
The experiments were carried out in the WEKA (Waikato Environment
for Knowledge Analysis) Explorer. In the first step, only 1075
sample records are preprocessed and then classified with the
Naïve Bayes, SVM, and J48 (decision tree) algorithms.
We use the WEKA tool for classification. Figures 2, 3, and 4 show
the results of the three algorithms, including the visualized
decision tree.
Figure 2: Classified NaiveBayes Algorithm of IDS
Dataset.
Figure 3: Confusion Matrix of SVM Algorithm.
Figure 4: Decision Tree for IDS dataset.
VII. PERFORMANCE ANALYSIS
The three supervised learning algorithms are applied to predict
the output, and only 20% of the dataset is used as the training
set in the Weka tool. Table 2 displays the TPR, FPR, and accuracy
(in percent) and the ET (execution time, in milliseconds)
produced by the three classification algorithms on the NSL-KDD
dataset.
A. TP Rate
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$
B. FP Rate
$$\mathrm{FPR} = \frac{FP}{FP + TN}$$
C. Accuracy
Accuracy is the proportion of correctly classified instances out
of the total number of instances.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
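The three measures follow directly from the four confusion-matrix counts: true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN), with attacks treated as the positive class. The sketch below uses hypothetical counts that are not taken from the paper's experiments.

```python
def true_positive_rate(tp, fn):
    """Share of actual attacks that were detected."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Share of normal records wrongly flagged as attacks."""
    return fp / (fp + tn)


def accuracy(tp, tn, fp, fn):
    """Share of all records that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts, not taken from the paper
tp, fn, fp, tn = 90, 10, 5, 95
print(f"TPR = {true_positive_rate(tp, fn):.1%}")
print(f"FPR = {false_positive_rate(fp, tn):.1%}")
print(f"Accuracy = {accuracy(tp, tn, fp, fn):.1%}")
```

A detector can reach a high accuracy while still missing attacks when normal traffic dominates, which is why TPR and FPR are reported alongside accuracy.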
Table 2: Comparison of results on the training dataset
Classification algorithm | Dataset       | TPR (%) | FPR (%) | Accuracy (%) | ET (milliseconds)
Naive Bayesian           | 41 attributes | 71.70   | 0.7     | 92.60        | 500
SVM                      | 41 attributes | 98.0    | 1.4     | 97.50        | 125
J48                      | 41 attributes | 99.30   | 0.5     | 99.30        | 180
As packets travel the network from the source IP address to the
destination IP address, many intruders attack the virtual
machine. Classification with the naïve Bayes, SVM, J48, and
Random Forest algorithms gives a better result. The performance
results are shown graphically below: Figure 5 and Figure 6
display the TP rate and FP rate, respectively, of the three
machine learning algorithms on the NSL-KDD dataset.
Figure 5: True positive rate of the NIDS
Figure 5 shows the true positive rate obtained with 41 attributes
on the training data; among the three algorithms, the decision
tree achieves the highest rate (>99%). For the same 41
attributes, the false positive rate is lowest for the decision
tree (0.5%) and highest for SVM (1.4%), while SVM has the
shortest execution time (125 ms).
[Figure 5 chart: True Positive Rate (%) by classification algorithm; Naive Bayesian 71.7, SVM 98, J48 99.3]
Figure 6: False positive rate of the NIDS
Figure 7 and Figure 8 display the accuracy and execution time.
The accuracy of the decision tree is higher than that of SVM and
naïve Bayes, and the execution time of SVM is the lowest of the
three.
Figure 7: Accuracy of the NIDS
Figure 8: Execution time (in ms) of the NIDS
Figures 5, 6, 7, and 8 show the performance results of only the
top three machine learning algorithms. Most of the algorithms
exceed an 80% TPR, and J48 reaches 99% TPR with a very low FPR
and the highest accuracy. SVM and J48 also complete execution in
a very short time compared with naïve Bayes. Overall, the graphs
show a high true positive rate, high accuracy, and short
execution time, with random forest and J48 producing the highest
performance.
VIII. CONCLUSION & FUTURE WORK
In this paper, we applied naïve Bayes, SVM, and decision tree
algorithms to the NSL-KDD dataset, which has 41 attributes; the
proposed technique uses all 41. The preprocessed dataset is used
as the training data in the simulation. Many network attacks
intrude into the cloud; traditional attacks include IP spoofing,
DDoS, User to Root, and port scanning. An IDS is an efficient
solution, carried over from traditional networks, for securing
packets. We used sample data with anomaly techniques to achieve
high accuracy and a low false alarm rate. According to Table 2,
the TP rate of the decision tree is about 1.3 percentage points
higher than that of SVM and 27.6 points higher than that of naïve
Bayes. Its FP rate is 0.9 points lower than that of SVM and 0.2
points lower than that of naïve Bayes. Its accuracy is 1.8 points
higher than that of SVM and 6.7 points higher than that of naïve
Bayes. The execution time of SVM is better than that of the
others. The main conclusion is that the decision tree performs
better than SVM and naïve Bayes, which makes the proposed work an
effective approach to network IDS in cloud computing. In future
work, we will use an optimal feature selection algorithm to
reduce the attributes and build the training model.
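As a cross-check, the pairwise differences between the classifiers can be recomputed directly from the Table 2 values. The sketch below copies the TPR, FPR, and accuracy figures from Table 2; the dictionary layout and function name are assumptions for illustration.

```python
# Values copied from Table 2 (TPR, FPR, and accuracy in percent)
RESULTS = {
    "Naive Bayesian": {"TPR": 71.7, "FPR": 0.7, "Accuracy": 92.6},
    "SVM": {"TPR": 98.0, "FPR": 1.4, "Accuracy": 97.5},
    "J48": {"TPR": 99.3, "FPR": 0.5, "Accuracy": 99.3},
}


def difference(metric, first, second):
    """Percentage-point difference first - second, rounded to one decimal."""
    return round(RESULTS[first][metric] - RESULTS[second][metric], 1)


for other in ("SVM", "Naive Bayesian"):
    for metric in ("TPR", "FPR", "Accuracy"):
        print(f"J48 vs {other}, {metric}: {difference(metric, 'J48', other):+.1f}")
```

Positive values mean J48 scores higher on that metric; for FPR a negative value is the better outcome.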
REFERENCES
1. U. Kumar, “A Survey on Intrusion Detection Systems for Cloud
Computing Environment,” International Journal of Computer
Applications, vol. 109, no. 1, pp. 6-15, 2015.
2. K. Arjunan and C. N. Modi, “An enhanced intrusion detection
framework for securing network layer of cloud computing,” ISEA
Asia Security and Privacy Conference 2017, ISEASP 2017, 2017.
3. B. Mahalakshmi and G. Suseendran, “Effectuation of Secure
Authorized Deduplication in Hybrid Cloud,” Indian Journal of
Science and Technology, vol. 9, no. 25, Jul. 2016.
4. T. Nathiya, “Reducing DDOS Attack Techniques in Cloud
Computing Network Technology,” International Journal of Innovative
Research in Applied Sciences and Engineering (IJIRASE), vol. 1, no.
1, pp. 23-29, 2017.
5. R. K. Bathla, G. Suseendran, and Shallu, “Research analysis of big
data and cloud computing with emerging impact of testing,”
International Journal of Engineering and Technology (UAE), vol. 7,
no. 3.27 Special Issue 27, pp. 239-243, 2018.
6. R. Staudemeyer and C. Omlin, “Extracting salient features for
network intrusion detection using machine learning methods,” South
African Computer Journal, vol. 52, no. July, pp. 82-96, 2014.
7. S. G. Kene and D. P. Theng, “A Review on Intrusion Detection
Techniques for cloud computing and Security Challenges,” IEEE
sponsored 2nd International Conference on Electronics and
Communication Systems (ICECS), pp. 227-232, 2015.
8. N. Modi, “An Efficient Security Framework to Detect Intrusions at
Virtual Network Layer of Cloud Computing,” 19th international ICIN
conference - Innovations in clouds, Internet and Network, pp.
133-140, 2016.
9. T. Nathiya and G. Suseendran, “An Effective Hybrid Intrusion
Detection System for Use in Security Monitoring in the Virtual
Network Layer of Cloud Computing Technology,” Data Management,
Analytics and Innovation, Advances in Intelligent Systems and
Computing, vol. 839, pp. 483-496, 2019. doi:
10.1007/978-981-13-1274-8_3.
10. M. A. M. Hasan, M. Nasser, B. Pal, and S. Ahmad, “Support Vector
Machine and Random Forest Modeling for Intrusion Detection System
(IDS),” Journal of Intelligent Learning Systems and Applications,
vol. 6, no. February, pp. 45-52, 2014.
11. N. Farnaaz and M. A. Jabbar, “Random Forest Modeling for Network
Intrusion Detection System,” Procedia Computer Science, vol. 89, pp.
213-217, 2016.
12. G. V. Nadiammai and M. Hemalatha, “Effective approach toward
Intrusion Detection System using data mining techniques,” Egyptian
Informatics Journal, vol. 15, no. 1, pp. 37-50, 2014.
13. X. Jing, Y. Bi, and H. Deng, “An innovative two-stage fuzzy kNN-DST
classifier for unknown intrusion detection,” International Arab
Journal of Information Technology, vol. 13, no. 4, pp. 359-366, 2016.
14. R. Rouhi, F. Keynia, and M. Amiri, “Improving the Intrusion
Detection Systems’ Performance by Correlation as a Sample Selection
Method,” Journal of Computer Sciences and Applications, vol. 1, no.
3, pp. 33-38, 2013.
15. O. Osanaiye, H. Cai, K. K. R. Choo, A. Dehghantanha, Z. Xu, and M.
Dlodlo, “Ensemble-based multi-filter feature selection method for
DDoS detection in cloud computing,” Eurasip Journal on Wireless
Communications and Networking, vol. 2016, no. 1, 2016.
16. Ö. Cepheli, S. Büyükçorak, and G. Karabulut Kurt, “Hybrid Intrusion
Detection System for DDoS Attacks,” Journal of Electrical and
Computer Engineering, vol. 2016, 2016.
17. N. M. Turab, A. Abu, and T. Shadi, “Cloud Computing Challenges
and Solutions,” International Journal of Computer Networks &
Communications (IJCNC), vol. 5, no. 5, pp. 209-216, 2013.
18. N. Hubballi and V. Suryanarayanan, “False alarm minimization
techniques in signature-based intrusion detection systems: A survey,”
Computer Communications, vol. 49, pp. 1-17, 2014.
19. K. Chai, H. T. Ng, and H. L. Chieu, “Naive-Bayes Classification
Algorithm,” Bayesian Online Classifiers for Text Classification and
Filtering, pp. 97-104, 2002.
20. H. Chauhan and A. Chauhan, “Implementation of decision tree
algorithm C4.5,” International Journal of Scientific and Research
Publications, vol. 3, no. 10, pp. 4-6, 2013.
21. O. Catak and M. E. Balaban, “CloudSVM: Training an SVM
classifier in cloud computing systems,” Lecture Notes in Computer
Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), vol. 7719 LNCS, no. January,
pp. 57-68, 2013.
22. M. S. Revathi, “A Detailed Analysis on NSL-KDD Dataset Using
Various Machine Learning Techniques for Intrusion Detection,”
International Journal of Engineering Research and Technology, vol. 2,
no. 12, pp. 1848-1853, 2013.
23. L. Dhanabal and S. P. Shantharajah, “A Study on NSL-KDD Dataset
for Intrusion Detection System Based on Classification Algorithms,”
International Journal of Advanced Research in Computer and
Communication Engineering, vol. 4, no. 6, pp. 446-452, 2015.
... The primary goal of their DT rule is to generate a training model from which the projected label rate is derived [47]. The structure of a decision tree is characterized as a tree; the tree has decision nodes and leaf nodes [57]. It is the origin node, with each interior node representing a feature. ...
Article
Full-text available
Information and communication technology (ICT) advancements have altered the entire computing paradigm. As a result of these improvements, numerous new channels of communication are being created, one of which is the Internet of Things (IoT). The IoT has recently emerged as cutting-edge technology for creating smart environments. The Internet of Medical Things (IoMT) is a subset of the IoT, in which medical equipment exchange information with each other to exchange sensitive information. These developments enable the healthcare business to maintain a higher level of touch and care for its patients. Security is seen as a significant challenge in whatsoever technology’s reliance based on the IoT. Security difficulties occur owing to the various potential attacks posed by attackers. There are numerous security concerns, such as remote hijacking, impersonation, denial of service attacks, password guessing, and man-in-the-middle. In the event of such attacks, critical data associated with IoT connectivity may be revealed, altered, or even rendered inaccessible to authorized users. As a result, it turns out to be critical to safeguard the IoT/IoMT ecosystem against malware assaults. The main goal of this study is to demonstrate how a deep recurrent neural network (DRNN) and supervised machine learning models (random forest, decision tree, KNN, and ridge classifier) can be utilized to develop an efficient and effective IDS in the IoMT environment for classifying and forecasting unexpected cyber threats. Preprocessing and normalization of network data are performed. Following that, we optimized features using a bio-inspired particle swarm algorithm. On the standard data for intrusion detection, a thorough evaluation of experiments in DRNN and other SML is performed. It was established through rigorous testing that the proposed SML model outperforms existing approaches with an accuracy of 99.76%.
... In this paper data mining tool WEKA is used to perform experiments for Intrusion Detection [23] and applied ML kernel based support vector classification methods using LIBSVM to build classification model. Randomly selected set of 104867 points of the total data set (1048566) is used for testing various kernels with ten-fold cross validation. ...
Article
Full-text available
This paper investigates the performance among the various kernel based SVM classifiers for intrusion detection in cloud environment. Several researchers have presented the different kernel functions of SVM for Intrusion Detection. There is always an ambiguity in choosing which kernel function is to apply for better detection rate to identify classification accuracy factor. This paper explores to achieve this objective to identify the popular kernel functions linear, polynomial, radial basis function and Sigmoid. The CIDDS-001 dataset is adapted because of it is a recently available benchmark dataset and generated with new types of attacks of cloud environment. To evaluate the performance of different kernel functions computational time and accuracy taken as QoS metrics with tenfold cross validation. The numerical results are calculated and conclusions are drawn.
... once embedding the key message and so matters key, a stegoimage file is generated. This file is then shared with the individual receiver [34]. ...
Article
Full-text available
Image steganography based on various methods like LSB method of hiding data image to cover images are most common method for secret communication. There are many steg-analysis algorithms available for to detect hidden data in an image. There are many existing cryptography methods for image encryption like DES, AES, RSA, etc. but these are of less security on attack.In order to overcome these above mentioned drawbacks a novel chaotic map algorithm has been introduced for image encryption and further proceeded with PPM (Pixel Pattern Mapping) algorithm for image hiding. The algorithm computes chaotic encryption method on RGB planes of the image to be hidden then is followed by PPM based image hiding method on cover image. Our method includes RGB image for encryption instead of gray scale image as in state-of-art methods.
Article
Full-text available
Cloud computing is a vast area which uses the resources cost-effectively. The performance aspects and security are the main issues in cloud computing. Besides, the selection of optimal features and high false alarm rate to maintain the highest accuracy of the testing are also the foremost challenges focused. To solve these issues and to increase the accuracy, an effective cloud IDS using Grasshopper optimization Algorithm (GOA) and Deep belief network (DBN) is proposed in this paper. GOA is used to choose the ideal features from the set of features. Finally, DBN is developed for classification according to their selected feasible features. The introduced IDS is simulated on the Python platform and the performance of the suggested model of deep learning is assessed based on statistical measures named as Precision, detection accuracy, f-measure and Recall. The NSL_KDD, and UNSW_NB15 are the two datasets used for the simulation, and the results showed that the proposed scheme achieved maximum classification accuracy and detection rate.
Article
The article explores the open data phenomenon and data-driven analytics, with special attention to the media industry, where open data act as raw material for creating text, image, or video content. The aim of the research is to identify the principles of working with public-domain data in Ukraine. Direct opportunities for analytics and access to public information are considered, classification features of data are revealed, and social indicators are singled out and interpreted for society. The methodological base was a group of methods: a scientific research method to study the theoretical parameters of the problem in the scientific literature, a sociological survey method, the factual method, and a classification method for the formalized description of information; a survey was also conducted. The open data mechanisms, which catalyse social transparency, were disclosed through the possibility of free-of-charge verification of future business partners and companies (the presence of lawsuits, analysis of participation in state tenders, tax arrears, etc.). It is emphasized that the possibility of using services based on open data leaves its mark on the development of civil society in Ukraine. As a result, various projects influence the formation of a socially active consciousness, particularly among young people, who form the system of public control over the activity of state bodies.
Article
Cloud computing has revolutionized information technology by allowing users to use a variety of resources in different applications at lower cost. Resources are provisioned virtually, with scalable, flexible, on-demand access, reduced maintenance, and lower infrastructure cost. Most resources are handled and managed by organizations over the internet using networking protocols with different standards and formats. Research and statistics have shown that existing technologies are prone to threats and to vulnerabilities in legacy protocols, in the form of bugs that open the way for attackers to intrude in different ways. The most common of these attacks is the Distributed Denial of Service (DDoS) attack, which targets the cloud’s performance and causes serious damage to the entire cloud computing environment. In the DDoS attack scenario, the compromised computers are targeted. The attacks are carried out by transmitting a large number of packets injected with known and unknown bugs to a server. This consumes a huge portion of the network bandwidth of the users’ cloud infrastructure and an enormous amount of their servers’ time. In this paper, we propose a DDoS attack detection scheme based on the Random Forest algorithm to mitigate the DDoS threat. The algorithm is used along with signature detection techniques and generates a decision tree, which helps detect signature attacks for DDoS flooding. We have also applied other machine learning algorithms and analyzed the results they yielded.
Chapter
In online services, Distributed Denial of Service (DDoS) remains one of the main threats. Attackers can launch DDoS attacks easily and with high efficiency to slow down users' access to services. Machine learning algorithms are used to detect DDoS attacks. Supervised machine learning algorithms such as Naive Bayes, decision tree, k-nearest neighbors (k-NN), and random forest are used for detection and mitigation of the attack. The classification pipeline for labeling traffic as “Normal” or “DDoS” on the NSL-KDD dataset has three steps: information collection, preprocessing, and feature extraction. Different algorithms behave differently depending on the selected features. The DDoS detection performance of the algorithms is compared, and the best algorithm is suggested.
Article
Zika virus disease (ZVD) is a mosquito-borne flavivirus disease that is spreading rapidly across the world. Nearly 95 countries are affected by Zika, and Aedes aegypti mosquitoes are the source of its spread. Microcephaly, myelitis, Guillain-Barre Syndrome, and neuropathy are complications associated with ZVD. Miscarriage and preterm birth can also occur during infection. To address this, an early prediction system is used to detect the virus from symptoms. The Zika dataset is stored in the cloud, and in our proposed work a Multilayer Perceptron Neural Network classifier is used to predict the Zika virus. The classifier achieves an accuracy of 97%, the highest accuracy level. ZVD is predicted at an early stage based on symptoms; if a person is predicted to be infected, an RNA test is then taken for that person.
Article
Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds current capacity. At present, more than 2 billion people worldwide are connected to the Internet, and more than 5 billion people own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At that point, predicted data production will be 44 times greater than that in 2009. As data is transferred and shared at light speed over optical fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data creates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed as a whole. Cloud computing has opened up new opportunities for testing facilities. New technology and social connectivity trends are creating a perfect storm of opportunity, enabling the cloud to transform internal operations, customer relationships, and industry value chains. To ensure high quality of cloud applications under development, developers must perform testing to examine the quality and accuracy of whatever they design. In this research paper, we address a testing environment architecture with important key benefits, used to execute test cases, and the testing techniques used to improve the quality of cloud applications.
Article
With the growing usage of technology, intrusion detection has become an emerging area of research. An Intrusion Detection System (IDS) attempts to identify user activities and report them as normal or anomalous. Intrusion detection is a nonlinear and complicated problem that deals with network traffic data. Many IDS methods have been proposed, and they produce different levels of accuracy, which is why the development of an effective and robust intrusion detection system is necessary. In this paper, we build a model for an intrusion detection system using a random forest classifier. Random Forest (RF) is an ensemble classifier and performs well compared to other traditional classifiers for effective classification of attacks. To evaluate the performance of our model, we conducted experiments on the NSL-KDD data set. Empirical results show that the proposed model is efficient, with a low false alarm rate and a high detection rate.
Article
Objectives: Different users may access the same data repeatedly and try to store it in the memory of the cloud server, which makes it difficult to manage storage space and bandwidth. The main purpose of this study is to determine how the data is secured, whether the person accessing the data is authorized, and whether the same data is stored repeatedly in memory, so that duplication can be avoided. Methods: In deduplication, to guard the confidentiality of sensitive information, the data is encrypted and decrypted with the proposed convergent encryption technique before outsourcing, for better data security. Findings: The convergent encryption method, the open authorization protocol, and deduplication are combined to check the data for duplicates in a secure way. The possibility of using other algorithms is also considered for further implementation.
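Convergent encryption derives the key from the content itself, so identical plaintexts produce identical ciphertexts and the server can detect duplicates without seeing the plaintext. The sketch below illustrates the idea only: the XOR keystream built from SHA-256 is a toy stand-in for a real cipher and is not secure, the `DedupStore` class and all names are hypothetical, and the open authorization step mentioned in the abstract is not modeled.

```python
import hashlib


def convergent_key(data: bytes) -> bytes:
    """Derive the encryption key from the content itself."""
    return hashlib.sha256(data).digest()


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); illustration only, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(data: bytes):
    """Return (key, ciphertext); equal plaintexts give equal ciphertexts."""
    key = convergent_key(data)
    cipher = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    return key, cipher


def decrypt(key: bytes, cipher: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(cipher, _keystream(key, len(cipher))))


class DedupStore:
    """Hypothetical server-side store that keeps one copy per ciphertext."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes):
        key, cipher = encrypt(data)
        tag = hashlib.sha256(cipher).hexdigest()   # duplicate-check tag
        is_new = tag not in self.blobs
        if is_new:
            self.blobs[tag] = cipher
        return key, tag, is_new

    def get(self, key: bytes, tag: str) -> bytes:
        return decrypt(key, self.blobs[tag])


store = DedupStore()
key1, tag1, new1 = store.put(b"quarterly report")
key2, tag2, new2 = store.put(b"quarterly report")   # second upload of same data
```

The second upload maps to the same tag, so only one encrypted copy is kept while each uploader still holds a key that decrypts it.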
Article
Widespread adoption of cloud computing has increased the attractiveness of such services to cybercriminals. Distributed denial of service (DDoS) attacks, which target the cloud's bandwidth, services, and resources to render the cloud unavailable to both cloud providers and users, are a common form of attack. In recent times, feature selection has been identified as a pre-processing phase in cloud DDoS attack defence that can potentially increase classification accuracy and reduce computational complexity by identifying important features from the original dataset during supervised learning. In this work, we propose an ensemble-based multi-filter feature selection method that combines the output of four filter methods to achieve an optimum selection. We then perform an extensive experimental evaluation of our proposed method using the intrusion detection benchmark dataset NSL-KDD and a decision tree classifier. The findings show that our proposed method can effectively reduce the number of features from 41 to 13 and has a high detection rate and classification accuracy when compared to other classification techniques.
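One plausible way to combine several filter outputs is to let each filter nominate its top-ranked features and keep those nominated by enough filters. The sketch below assumes this voting rule; the abstract does not state how the four outputs are combined, and the scores, `k`, and `min_votes` are made-up values for illustration.

```python
from collections import Counter


def top_k(scores, k):
    """Return the k highest-scoring feature names from one filter."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def ensemble_select(filter_scores, k, min_votes):
    """Keep features that appear in the top-k list of at least min_votes filters."""
    votes = Counter()
    for scores in filter_scores:
        votes.update(top_k(scores, k))
    return sorted(f for f, v in votes.items() if v >= min_votes)


# Hypothetical scores from four filter methods over four features.
filters = [
    {"src_bytes": 0.90, "dst_bytes": 0.80, "count": 0.30, "duration": 0.10},
    {"src_bytes": 0.70, "count": 0.60, "dst_bytes": 0.50, "duration": 0.20},
    {"dst_bytes": 0.90, "src_bytes": 0.85, "duration": 0.40, "count": 0.10},
    {"src_bytes": 0.60, "dst_bytes": 0.55, "count": 0.50, "duration": 0.05},
]

selected = ensemble_select(filters, k=2, min_votes=3)
```

Here `src_bytes` receives four votes and `dst_bytes` three, so both are kept, while `count` (one vote) and `duration` (none) are dropped.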
Article
Distributed denial-of-service (DDoS) attacks are one of the major threats and possibly the hardest security problem for today’s Internet. In this paper we propose a hybrid detection system, referred to as hybrid intrusion detection system (H-IDS), for detection of DDoS attacks. Our proposed detection system makes use of both anomaly-based and signature-based detection methods separately but in an integrated fashion and combines the outcomes of both detectors to enhance the overall detection accuracy. We apply two distinct datasets to our proposed system in order to test the detection performance of H-IDS and conclude that the proposed hybrid system gives better results than the systems based on nonhybrid detection.
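The idea of running an anomaly detector and a signature detector separately and then merging their outcomes can be sketched as follows. The signature labels, the z-score anomaly rule on packet rate, and the OR combination are all assumptions chosen for illustration; the abstract does not specify the detectors or the combination rule used in H-IDS.

```python
import statistics

# Hypothetical signature labels; a real system would match packet patterns.
KNOWN_SIGNATURES = {"syn_flood", "udp_flood"}


def signature_alert(event):
    """Flag events whose pattern matches a known attack signature."""
    return event.get("pattern") in KNOWN_SIGNATURES


class AnomalyDetector:
    """Flag events whose packet rate deviates strongly from a normal baseline."""

    def __init__(self, baseline_rates, z_threshold=3.0):
        self.mean = statistics.mean(baseline_rates)
        self.stdev = statistics.stdev(baseline_rates)
        self.z_threshold = z_threshold

    def alert(self, event):
        if self.stdev == 0:
            return event["rate"] != self.mean
        z = abs(event["rate"] - self.mean) / self.stdev
        return z > self.z_threshold


def hybrid_alert(event, anomaly_detector):
    """Combine both detectors: raise an alert if either one fires."""
    return signature_alert(event) or anomaly_detector.alert(event)


detector = AnomalyDetector([100, 110, 90, 105, 95])
normal = hybrid_alert({"rate": 102, "pattern": None}, detector)
known_attack = hybrid_alert({"rate": 101, "pattern": "syn_flood"}, detector)
unknown_attack = hybrid_alert({"rate": 5000, "pattern": None}, detector)
```

An OR rule favors detection rate over false alarms; a weighted or confidence-based merge would be an alternative design.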
Conference Paper
Nowadays, cloud computing is the first choice of every IT organization because of its scalable and flexible nature. However, security and privacy are major concerns for its success because its open and distributed architecture is exposed to intruders. An Intrusion Detection System (IDS) is the most commonly used mechanism to detect various attacks on the cloud. This paper gives an overview of different intrusions in the cloud. We then analyze some existing cloud-based intrusion detection systems with respect to their types, positioning, detection time, detection techniques, data sources, and attacks. The analysis also identifies the limitations of each technique to determine whether it fulfills the security needs of a cloud computing environment. We highlight the deployment of IDS that uses multiple detection methods to cope with security challenges in the cloud.
Article
Cloud computing has grown to support various IT capabilities such as IoT, mobile computing, and smart IT. However, due to the dynamic and distributed nature of the cloud and the vulnerabilities in current implementations of virtualization, several security threats and attacks have been reported. To address these issues, traditional security solutions such as firewalls and intrusion detection/prevention systems need to be extended so that they can cope with high-speed network traffic and dynamic network configuration in the cloud. In addition, identifying feasible network traffic features is a major challenge for accurate detection of attacks. In this paper, we propose a hypervisor level distributed network security (HLDNS) framework which is deployed on each processing server of the cloud. At each server, it monitors the network traffic of the underlying virtual machines (VMs) to and from the virtual network, internal network, and external network for intrusion detection. We have extended a binary bat algorithm (BBA) with two new fitness functions for deriving the feasible features from cloud network traffic. The derived features are applied to the Random Forest classifier to detect intrusions in cloud network traffic, and intrusion alerts are generated. The intrusion alerts from different servers are correlated to identify distributed attacks and to generate new attack signatures. For the performance and feasibility analysis, the proposed security framework is tested on the cloud network testbed at NIT Goa and on the recent UNSW-NB15 and CICIDS-2017 intrusion datasets. We have performed a comparative analysis of the proposed security framework in terms of fulfilling cloud network security needs.
Article
Intrusion detection is an essential part of network security in combating illegal network access and malicious attacks. Because network attacks evolve constantly, it has been a technical challenge for an Intrusion Detection System (IDS) to recognize unknown attacks or known attacks with inadequate training data. In this work, an innovative fuzzy classifier is proposed for effectively detecting both unknown attacks and known attacks with insufficient or inaccurate training information. A Fuzzy C-Means (FCM) algorithm is first employed to softly compute and optimise the cluster centers of the training datasets, with some degree of fuzziness accounting for inaccuracy and ambiguity in the training data. Subsequently, a distance-weighted k-Nearest Neighbors (k-NN) classifier, combined with the Dempster-Shafer Theory (DST), is introduced to assess the belief functions and pignistic probabilities of the incoming data associated with each of the known classes. Finally, a two-stage intrusion detection scheme is implemented based on the obtained pignistic probabilities and their entropy function to determine whether the input data are normal, one of the known attacks, or an unknown attack. The proposed intrusion detection algorithm is evaluated on the KDD’99 datasets and their variants containing known and unknown attacks. The experimental results show that the new algorithm outperforms other intrusion detection algorithms and is especially effective in detecting unknown attacks.