The overall framework of network intrusion detection model.


Source publication
Article
In imbalanced network traffic, malicious cyber-attacks can often hide in large amounts of normal data. They exhibit a high degree of stealth and obfuscation in cyberspace, making it difficult for a Network Intrusion Detection System (NIDS) to ensure the accuracy and timeliness of detection. This paper researches machine learning and deep learning for i...

Contexts in source publication

Context 1
... proposed the intrusion detection model shown in Figure 1. Data pre-processing is performed first in our intrusion detection structure, including duplicate, outlier, and missing value processing. ...
Context 2
... example, "TCP", "UDP" and "ICMP" are the three values of the protocol type feature. After OneHot encoding, they become the binary vectors (1, 0, 0), (0, 1, 0), (0, 0, 1). The protocol type feature has three categories, the flag feature has 11 categories, and the service feature has 70 categories. ...
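The encoding described in Context 2 can be illustrated with a minimal Python sketch. The function name `one_hot` and the category order are assumptions made for illustration; only the three protocol values and their vectors come from the excerpt.

```python
def one_hot(value, categories):
    """Return a tuple with 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return tuple(1 if c == value else 0 for c in categories)


# Category order is assumed so that the output matches the vectors in the excerpt.
PROTOCOL_TYPES = ["TCP", "UDP", "ICMP"]

encoded = {p: one_hot(p, PROTOCOL_TYPES) for p in PROTOCOL_TYPES}
for protocol, vector in encoded.items():
    print(protocol, vector)

# Encoding all three categorical features this way yields 3 + 11 + 70 columns.
TOTAL_COLUMNS = 3 + 11 + 70
print(TOTAL_COLUMNS)
```

With the category counts given in the excerpt, the three categorical features expand to 84 binary columns in total.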

Citations

... However, current intrusion scenarios are extremely sophisticated and can easily breach the security mechanisms enforced by conventional security measures. This has led to a large increase in the number of researchers working in this area and in the number of research contributions [4]. ...
... According to Lan Liu et al. [4], in imbalanced network traffic, malicious cyber-attacks can often hide in large amounts of normal data. They exhibit a high degree of stealth and obfuscation in cyberspace, making it hard for a Network Intrusion Detection System (NIDS) to guarantee the accuracy and timeliness of detection. ...
... The Canadian Institute for Cybersecurity (CIC) generated the intrusion detection dataset CSE-CIC-IDS2018 in 2018, and it is the data used in the BC-TRANS Network [12]. Furthermore, it is the most recent and complete intrusion dataset that is currently open to the public. A dataset called CSE-CIC-IDS2018 was gathered in preparation for actual attacks. ...
Article
The IoT (Internet of Things) encompasses numerous networks and connected devices. One of the primary concerns surrounding IoT, according to researchers and security experts, is the potential risks to privacy and cybersecurity. Deep learning offers significant capabilities for self-adjustment, self-organization, and generalization. Recognizing this, advanced deep learning algorithms are employed in this research to address the privacy and security issues plaguing the IoT landscape. To address these concerns, a novel model called BC-Trans Network is proposed, leveraging the strengths of both Blockchain technology and a transformer component. The transformer plays a vital role in identifying abnormal data, enabling the system to take proactive measures against potential threats. In addition Hash-2 is introduced for the verification of IoT users, adding an extra layer of security to the authentication process. The Blockchain model is utilized to securely store user passwords and details, ensuring a robust and tamper-proof authentication mechanism. To validate the proposed model, a publicly available dataset CSE-CIC-IDS2018 is employed. Pre-processing techniques, including feature selection using the chi-square method, are applied to refine the dataset. The transformer module then classifies the data as normal or abnormal, allowing for accurate identification of potential security breaches. To further safeguard the data and protect the privacy of users, a Fully Homomorphic Encryption (FHE) method is employed. This advanced encryption enables the encryption of categorized normal data, ensuring its confidentiality even during transmission and storage. The study's findings support IoT-cloud server security and privacy by demonstrating the effectiveness of the suggested paradigm in identifying and thwarting network threats. 
With detection times of 225.3 seconds, an accuracy of 99.25%, a precision of 99.53%, a recall of 99.32%, and an F1 score of 99.59%, the proposed system exhibits impressive performance. Furthermore, as the output numbers increase, the system's metrics improve, suggesting its scalability and flexibility.
... In contrast to PSO-LightGBM [30], AlexNet [31], CWGAN-CSSAE [32] and MSCNN-LSTM [33], the 2dCNN-BiLSTM intrusion detection model extracts and learns a more comprehensive set of features at the same time with better results. There are two reasons for this: (1) SMOTEEN sampling method makes a high-quality expansion for the dataset, thus providing good data support for the current detection model training; (2) the structure and parameter settings used in our model are more suitable for the characteristics of the data in the UNSW_NB15 dataset. ...
... In addition to enhancing reinforcement learning's automatic learning ability, this model also mitigates the effect of the data imbalance phenomenon, outperforming the original AE-RL model in several ways. To address the imbalance problem, Liu et al. [48] proposed the difficult set sampling algorithm DSSTE which reduces the imbalance of the original training set and augments minority class data with targeted augmentation. ...
Preprint
An intrusion detection algorithm based on deep learning is one of the hot topics in this field. In spite of some important research results, some shortcomings remain. In addition, the lack of labeled training data makes it more difficult to detect unknown attacks. High-dimensional and massive data will reduce the detection accuracy of the model and increase its complexity during training. Moreover, the phenomenon of data imbalance can seriously affect classification accuracy and degrade detection performance for rare attacks. In order to deal with the above problems, an intrusion detection algorithm based on convolutional long-short-term-memory and auto-encoding (CLAE) is proposed. Firstly, we design an encoder network module consisting of two CNN layers and one LSTM layer. The encoder network can encode raw data and output compressed one-dimensional vectors. Secondly, a decoder network module is designed. The decoder network has all layers symmetrical to the encoder network. In order to recover the original input data, this module is able to learn high-level features. And then, an attention residual block is designed in this paper to extract the key information from the data. As a result of ARB, more important features can be identified, redundant features can be filtered, feature learning of the model can be made more efficient, and intrusion detection accuracy can be improved significantly. Finally, using the NSL-KDD dataset and UNSW-NB15 dataset, CLAE’s superiority in detecting unknown attacks is demonstrated. The experimental results show that the proposed model CLAE can achieve state-of-the-art results compared to other models.
... Clustering algorithms are influenced by a variety of factors, including their own internal workings. SVM is a supervised learning model for pattern recognition and data analysis [7]. The OCSVM approach is an extension of the SVM method that is particularly well suited to unlabeled data. ...
Article
Intrusion detection technologies using machine learning have grown in popularity in recent years. The variety of new security attacks is increasing, necessitating the development of effective and intelligent countermeasures. Existing intrusion detection systems (IDS) use signature-based or anomaly-based detection with machine learning algorithms to detect malicious activities. Signature-based detection relies only on signatures that have been pre-programmed into the system, so it detects known attacks and cannot detect any new or unusual activity. Anomaly-based detection using a supervised machine learning algorithm detects only known threats. To address this issue, the proposed model employs an unsupervised machine learning approach for detecting attacks. This approach combines the Sub Space Clustering and One Class Support Vector Machine algorithms and uses feature selection methods such as Chi-square, as well as feature discretization methods such as Equal Width Discretization, to identify both known and unknown attacks. In the experiments, the proposed model outperforms several existing systems in terms of detection rate and accuracy, and reduces computational time.
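The Equal Width Discretization step named in this abstract can be sketched as follows. This is a generic illustration of the technique, not the authors' implementation; the function name, the number of bins, and the sample values are all assumptions.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin instead of opening a new one.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical connection durations in seconds.
durations = [0.0, 1.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(durations, 4)
print(bins)
```

Here the range 0 to 10 is cut into four intervals of width 2.5, so each continuous value is replaced by the index of the interval it falls into.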
... In this work, a total of 8 ML algorithms were evaluated: XGB [47], GB [48], DT [49], RF [50], ET [51], LR [52], GNB [53] and LDA [54]. The top 5 ML algorithms with the best performance for intrusion detection are briefly explained below. ...
... This algorithm combines the idea of Boosting, overcoming the speed and accuracy of limited calculations and blocks, and simultaneously orders each function. It allows parallelizing the computation when searching for the best-split point, which significantly accelerates the calculation speed [50]. • GB is used for solving regression and classification problems. ...
... • RF: this algorithm trains many DTs, each using a random subset of samples and features. RF achieves increased tree diversity and gives better outcomes [50]. • ET are DT ensembles. ...
Article
One of the fields where Artificial Intelligence (AI) must continue to innovate is computer security. The integration of Wireless Sensor Networks (WSN) with the Internet of Things (IoT) creates ecosystems with attractive surfaces for security intrusions that are vulnerable to multiple, simultaneous attacks. This research evaluates the performance of supervised ML techniques for detecting intrusions based on network traffic captures. This work presents a new balanced dataset (IDSAI) with intrusions generated in attack environments in a real scenario. This new dataset has been provided in order to contrast model generalization across different datasets. The results show that, for the detection of intruders, the best supervised algorithms are XGBoost, Gradient Boosting, Decision Tree, Random Forest, and Extra Trees. When trained and tested on ten specific intrusions (such as ARP spoofing, ICMP echo request Flood, TCP Null, and others), they reach up to 94% accuracy in binary form (intrusion and non-intrusion) and up to 92% accuracy in multiclass form (ten different intrusions and non-intrusion). In contrast, up to 90% accuracy is achieved for prediction on the Bot-IoT dataset using models trained with the IDSAI dataset.
... They conclude that oversampling and undersampling both increase the measure of recall significantly in the case of severely imbalanced data. A novel Difficult Set Sampling Technique (DSSTE) algorithm [30] has been proposed to tackle the class imbalance problem that utilizes the Edited Nearest Neighbor (ENN) algorithm to divide the imbalanced training set into separate buckets and then uses the k-means algorithm to compress the majority class samples. They combine majority and minority class instances in the difficult set in order to perform data augmentation. ...
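The first stage described in this excerpt, using Edited Nearest Neighbor (ENN) to separate out a difficult set, can be sketched roughly as below. This is a simplified illustration of the ENN rule only: a sample goes to the difficult set when most of its k nearest neighbours carry a different label. It omits the k-means compression and the augmentation step, and the function names, the value of k, and the toy data are assumptions rather than details from the cited paper.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enn_split(samples, labels, k=3):
    """Return (easy_indices, difficult_indices) according to the ENN rule."""
    easy, difficult = [], []
    for i, point in enumerate(samples):
        # Indices of the k nearest neighbours, excluding the sample itself.
        neighbours = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: squared_distance(point, samples[j]),
        )[:k]
        majority_label, _ = Counter(labels[j] for j in neighbours).most_common(1)[0]
        (easy if majority_label == labels[i] else difficult).append(i)
    return easy, difficult


# Toy data: one "attack" sample (index 4) sits inside the "normal" cluster.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
labels = ["normal"] * 4 + ["attack"] * 5

easy, difficult = enn_split(samples, labels, k=3)
print("easy:", easy)
print("difficult:", difficult)
```

Only the attack sample surrounded by normal traffic ends up in the difficult set, which is the kind of region the excerpt says is then compressed and augmented.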
Article
The class imbalance problem negatively impacts learning algorithms’ performance in minority classes which may constitute more severe attacks than the majority ones. This study investigates the benefits of balancing strategies and imbalanced learning approaches on intrusion data from Software Defined Networking (SDN). Although the research community has covered the imbalance problem in machine learning-based intrusion detection, addressing this problem in SDN is novel and powerful. Addressing the class imbalance problem over InSDN (the only publicly available SDN intrusion detection dataset as of recent) is of significant impact on future research in the area of intrusion detection in SDN. We address the class imbalance problem through data-level and classifier-level techniques. Our research objective is to determine suitable methods of addressing the class imbalance problem in machine learning-based intrusion detection in SDN. We propose custom deep learning architectures based on GANs and Siamese Neural Networks for generative modeling and similarity-based intrusion detection. This paper provides benchmarking results from classification with Random Oversampling (ROS), SMOTE, GANs, weighted Random Forest, and Siamese-based one-shot learning. We have found that Random Forest (RF) outperforms deep learning models in the classification of minority class instances. This supports the notion that RF can handle class imbalance well. We also observe that widely-used balancing techniques, ROS and SMOTE, drastically decrease the False Positive Rate (FPR) but increase the False Negative Rate (FNR) in the classification of minority classes. Conclusively, while data-level methods improve classification performance over deep learning models, they, in fact, degrade RF’s performance, i.e. cause higher numbers of false predictions. Therefore, RF does not need additional balancing strategies to get higher performance. 
Although this work addresses the class imbalance problem in SDN intrusion data, it provides a well-designed benchmark that can be exemplary for any network intrusion detection data. Thus, it may have a significant impact on future studies in this respective domain.
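Random Oversampling (ROS), one of the data-level techniques benchmarked in this abstract, can be sketched in a few lines. This is a generic illustration under stated assumptions: the function name, the fixed seed, and the toy labels are hypothetical, and the sketch is not the authors' benchmark code.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))  # sampling with replacement
            out_labels.append(label)
    return out_samples, out_labels


# Toy imbalanced set: 6 normal flows, 2 attack flows.
samples = [(i,) for i in range(8)]
labels = ["normal"] * 6 + ["attack"] * 2

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Because ROS only repeats existing minority samples, it adds no new information, which is consistent with the abstract's observation that such balancing does not necessarily help every classifier.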
... • Lan Liu et al. [1] compared six different classical models, e.g., Random Forest and long short-term memory (LSTM), in terms of classification performance when combined with the Difficult Set Sampling Technique (DSSTE). With this model, the researchers also conclude that deep learning performs better than machine learning. ...
... Indeed, ML models have been used on CSE-CIC-IDS2018 (often with at least one other similar dataset) [6,10,12,13]. More recently, Deep Learning (DL) approaches have been investigated as well [7,8,9,15,16]. All the works adopting ML and DL referenced so far leveraged the pre-processed version of the dataset (CSV format), composed of 79 statistical features of biflows, resulting in a post-mortem analysis: as anticipated, this strongly differs from our early setup. ...
... Therefore, a more rigorous assessment is in order. Among the works with a solid evaluation methodology, the best F1 Score obtained for binary classification is 96.13% using LightGBM [13]. For multiclass classification, the best F1 Scores are 97.73% using TCN+LSTM [9] and 96.98% using RF (with upsampling) [7], for DL and ML models respectively. ...
Conference Paper
Current intrusion detection techniques cannot keep up with the increasing number and complexity of cyber attacks. In fact, most traffic is encrypted, which prevents the use of deep packet inspection approaches. In recent years, Machine Learning techniques have been proposed for post-mortem detection of network attacks, and many datasets have been shared by research groups and organizations for training and validation. Unlike the vast related literature, in this paper we propose an early classification approach conducted on the CSE-CIC-IDS2018 dataset, which contains both benign and malicious traffic, to detect malicious attacks before they can damage an organization. To this aim, we investigated a different set of features, and the sensitivity of the performance of five classification algorithms to the number of observed packets. Results show that ML approaches relying on ten packets provide satisfactory results.
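The idea of classifying a flow from only its first packets can be illustrated with a small feature-extraction sketch. The abstract does not list the features used, so the four statistics below, the function name, and the sample flow are purely hypothetical; only the ten-packet cutoff comes from the text.

```python
from statistics import mean


def early_features(packet_sizes, arrival_times, n_packets=10):
    """Compute simple statistics from the first `n_packets` packets of a flow."""
    sizes = packet_sizes[:n_packets]
    times = arrival_times[:n_packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packets_seen": len(sizes),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes) if sizes else 0.0,
        "mean_gap": mean(gaps) if gaps else 0.0,
    }


# Hypothetical flow of 12 packets; the last two must not influence the features.
packet_sizes = [60, 1500, 1500, 60, 60, 1500, 60, 60, 60, 60, 9000, 9000]
arrival_times = [float(t) for t in range(12)]

features = early_features(packet_sizes, arrival_times)
print(features)
```

A classifier fed with such a vector can make its decision while the flow is still in progress, in contrast to the post-mortem setups that need the complete flow.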