TCP/IP model and IoT protocols.

Source publication
Article
Full-text available
Intrusion detection in computer networks is of great importance because of its effects on the different communication and security domains. Network intrusion detection remains a challenging task, as a massive amount of data is required to train the state-of-the-art machine learning models...

Context in source publication

Context 1
... et al. [20] proposed an ensemble-based intrusion detection technique against application layer protocols (MQTT, HTTP, and DNS), as shown in Figure 1. The authors generated the new statistical flow features from the protocols based on their potential properties. ...

Similar publications

Article
Full-text available
In countries like Japan, Australia, France, Denmark, and South Korea, the numbers of single-person households and older adults living alone have been steadily increasing each year, leading to the social issue of lonely deaths among older adults. Against this backdrop, this study proposes a method to develop a system for preventing lonely deaths bas...
Preprint
Full-text available
The current pandemic situation has drastically increased cyber-attacks worldwide. Attackers make heavy use of malware such as trojans, spyware, rootkits, worms, and ransomware. Ransomware is the most notorious malware, yet we did not have any defensive mechanism to prevent or detect a zero-day attack. Most defensive products in the industry rely on ei...
Article
Full-text available
With the rise in smart devices, the Internet of Things (IoT) has been established as one of the preferred emerging platforms to fulfil their need for simple interconnections. The use of specific protocols such as constrained application protocol (CoAP) has demonstrated improvements in the performance of the networks. However, power-, bandwidth-, an...
Article
Full-text available
With the appearance of large-scale systems, the size of the generated logs increased rapidly. Almost every software produces such files. Log files contain runtime information of the software, as well as indicate noteworthy events or suspicious behaviors like errors. To understand and monitor the operation of the system, log files are a valuable sou...
Article
Full-text available
Detection of minor leaks in oil or gas pipelines is a critical and persistent problem in the oil and gas industry. Many organisations have long relied on fixed hardware or manual assessments to monitor leaks. With rapid industrialisation and technological advancements, innovative engineering technologies that are cost-effective, faster, and easier...

Citations

... The researchers in [44] suggested several techniques to develop IDS employing the UNSW-NB15 dataset. They identified optimal feature subsets from the dataset by analyzing their relationships using the correlation coefficient technique. ...
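A correlation-based feature filter of the kind described in this snippet can be sketched as follows. This is only an illustration under stated assumptions: the snippet does not give the exact procedure used in [44], so the greedy rule, the 0.9 threshold, the function names, and the toy feature columns are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Greedy filter (assumed rule): keep a feature only if its absolute
    correlation with every already-kept feature is at most `threshold`."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy columns; "bytes" is almost a multiple of "dur".
features = {
    "dur": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bytes": [2.0, 4.0, 6.0, 8.0, 10.1],
    "ttl": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(features)
print(selected)
```

In this toy data the near-duplicate column is dropped and the weakly correlated one is kept. A real pipeline would normally also consider each feature's relationship to the class label.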
... Moreover, showing the power of a hybrid feature selection method combining GWO and PSO, the work of [56] obtained the highest result on the CICIDS2017 dataset (see Fig. 5), outperforming the other works, including [36], [38], [46], [50], [51], and [52]. Furthermore, for UNSW-NB15, the two works [41] and [44] report the same highest accuracy of 99.3%, surpassing the works of [35], [45], [46], [47], [49], [57], and [60] (see Fig. 6). Another less frequently used dataset, KDDCup99, is used by six works: [33], [41], [43], [45], [56], and [58]. ...
Article
Full-text available
With the increasing use of the Internet and its coverage of all areas of life and the increasing amount of sensitive and confidential information on the Internet, the number of malicious attacks on that information has increased with the aim of destroying, changing, or misusing it. Consequently, the need to discover and prevent these kinds of attacks has increased in order to maintain privacy, reliability, and even availability. For this purpose, intelligent systems have been developed to detect these attacks, which are called Intrusion Detection System (IDS). These systems were tested and applied to special benchmark datasets that contain a large number of features and a massive number of observations. However, not all the features are important, and some are not relevant. Therefore, applying feature selection techniques becomes crucial, which select the features with the most importance and relevance in order to enhance the performance of the classification model. The aim of this review paper is to conduct a comparative analysis of various state-of-the-art IDS that use algorithm classifications to detect network attacks with the cooperation of feature selection techniques that have been applied to various well-known IDS datasets, such as KDD cup99, NSL-KDD, etc. This comparison is based on several factors, including the utilized classification technique, feature selection used, employed evaluation metrics, datasets used, and finally the highest accuracy rate obtained by each study.
... for the KDDCUP99 dataset, 80% of data is allocated for training purposes, while the remaining 20% is reserved for evaluating the model's performance. To further refine the model and tune hyperparameters, the training data can be subdivided into 10 folds as in [46]. During this process, nine folds are utilized for training, while the remaining fold is employed for validation. ...
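The 80/20 split followed by 10-fold cross-validation on the training portion can be sketched as below. It is a minimal illustration working on row indices only. The function names, the fixed seed, and the 1,000-row size are assumptions for the example, not details from the cited work.

```python
import random

def train_test_split(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_folds(train_indices, k=10):
    """Yield (fit, validation) index lists; each fold validates exactly once."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, folds[i]

# Hypothetical dataset of 1,000 rows, addressed by index.
train, test = train_test_split(range(1000))
splits = list(k_folds(train, k=10))
print(len(train), len(test), len(splits))
```

The held-out 20% is never touched during hyperparameter tuning. Only the nine-versus-one folds inside the training portion are used for that.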
Article
Full-text available
The increasing prevalence of cyber threats has created a critical need for robust defense against such incidents. Many Cyber Intrusion Detection Systems (CIDSs) utilizing machine learning have been developed for this purpose. Although these recent CIDSs can analyze vast amounts of data and identify malicious activities, challenges remain in enhancing their effectiveness. The exponential growth of the search space is one of these challenges, making finding an optimal solution computationally infeasible for large datasets. Furthermore, the weight space while searching for optimal weight is highly nonlinear. Motivated by the observed characteristics, complexities, and challenges in the field, this paper presents an innovative CIDS named ANWO-MLP (Adaptive Nonlinear Whale Optimization Multi-layer Perceptron). A novel feature selection method called ANWO-FS (Adaptive Nonlinear Whale Optimization-Feature Selection) is employed in the proposed CIDS to identify the most predictive features, enabling robust MLP training even in the highly nonlinear weight spaces. The insider threat detection process is improved by investigating vital aspects of CIDS, including data processing, initiation, and output handling. We adopt ANWOA (previously proposed by us) to mitigate local stagnation, enable rapid convergence, optimize control parameters, and handle multiple objectives by initializing the weight vector in the ANWO-MLP training with minimal mean square error. Experiments conducted on three highly imbalanced datasets demonstrate an average efficacy rate of 98.33%. The details of the results below show the robustness, stability, and efficiency of the proposed ANWO-MLP compared to existing approaches.
... Wazzan et al. (2021) discussed various machine-learning techniques that identify botnet attacks. Standard machine-learning models have been used to detect malicious activities and botnet attacks, such as RF, logistic regression (LR), support vector machines (SVM), and DT (Ahmad et al., 2022; Aldhyani and Alkahtani, 2022; Bapat et al., 2018; Hegde et al., 2020; Injadat et al., 2020; Velasco-Mata et al., 2021). In another article, researchers studied the impact of different machine learning models on the classification of IoT botnet attacks (Joshi and Abdelfattah, 2020). ...
... The proposed study is superior because it used a comprehensive database of 257,673 entries, whereas some prior studies (Aldhyani and Alkahtani, 2022; Alshamkhany et al., 2020) used samples of about 82,000 records from the dataset. In addition, the proposed method uses fewer features than other studies (Ahmad et al., 2022; Baig et al., 2017; Jing and Chen, 2019; Ahsan et al., 2021; Gwon et al., 2019; Husain et al., 2019). The study is distinct in that it includes a larger amount of data yet still yields better outcomes than earlier studies. ...
... Equation (2) illustrates that the represents the proportion of uncertainty reduction in the Intrusion Detection System input due to the Intrusion Detection System output. Practically speaking, the value is between 0 and 1. ...
Article
Full-text available
In computer network security, the escalating use of computer networks and the corresponding increase in cyberattacks have propelled Intrusion Detection Systems (IDSs) to the forefront of research in computer science. IDSs are a crucial security technology that diligently monitor network traffic and host activities to identify unauthorized or malicious behavior. This study develops highly accurate models for detecting a diverse range of cyberattacks using the fewest possible features, achieved via a meticulous selection of features. We chose 5, 9, and 10 features, respectively, using the Artificial Bee Colony (ABC), Flower Pollination Algorithm (FPA), and Ant Colony Optimization (ACO) feature-selection techniques. We successfully constructed different models with a remarkable detection accuracy of over 98.8% (approximately 99.0%) with Ant Colony Optimization (ACO), an accuracy of 98.7% with the Flower Pollination Algorithm (FPA), and an accuracy of 98.6% with the Artificial Bee Colony (ABC). Another achievement of this study is the minimum model building time achieved in intrusion detection, which was equal to 1 s using the Flower Pollination Algorithm (FPA), 2 s using the Artificial Bee Colony (ABC), and 3 s using Ant Colony Optimization (ACO). Our research leverages the comprehensive and up-to-date CSE-CIC-IDS2018 dataset and uses the preprocessing Discretize technique to discretize data. Furthermore, our research provides valuable recommendations to network administrators, aiding them in selecting appropriate machine learning algorithms tailored to specific requirements.
... 3) MDPI follows with 6 papers, representing 4 percent of the total number of papers selected. 4) Wiley was fifth with a total of 5 papers, representing 3.5 percent. 5) ACM was the least with only 3 papers, representing 2.1 percent. Ahmad et al. (2022) argued that feature selection involves the reduction of computational cost by removing features that have no effect on the target variable. Many researchers have agreed that feature selection, which forms a major part of any classification problem, is essential to obtaining accurate results. ...
... This means that to achieve better results in terms of accuracy and low false alarm rate, deep learning feature selection techniques should be implemented. Apart from deep learning, another technique that has gained popularity is hybrid feature selection, which uses more than one feature selection technique to select efficient features and improve classification accuracy. This technique was implemented by Ahmad et al. (2022), who suggested p-value and correlation measures as a means to build an efficient network intrusion detection system. Experimental results of their study suggested an improved performance as compared to using a single technique. ...
Article
Full-text available
Network security has become a major concern to governments, businesses, and individuals all over the world as cybercriminals continuously attack networks and cause harm to personal and organizational data. Different forms of Intrusion Detection Systems (IDSs) have been proposed over the years to minimize these cyberattacks. Several researchers have tried to improve detection accuracy, thus reducing the false alarm rates of some IDSs. In this paper, we conducted a chronological systematic review of hybrid intrusion detection systems covering all domains. In all, about 300 recent research articles were selected in the area, but only 146 articles met the given quality assurance test. A critical review of the selected articles revealed that 61% did not carry out proper feature selection as a data preprocessing step and as few as 35% handled an imbalanced dataset. We also provide extensive discussion, spanning eleven years of research, of existing Intrusion Detection Systems.
... -One partition from this dataset was configured as a training and test set. This dataset is a collection of network packets exchanged between hosts (see Table 14) (Ahmad et al, 2022 ...
Article
Full-text available
Introduction: Analyzing the high-dimensional datasets used for intrusion detection is a challenge for researchers. This paper presents the most often used datasets. ADFA contains two datasets with records from Linux/Unix. AWID is based on actual traces of normal and intrusion activity of an IEEE 802.11 Wi-Fi network. CAIDA collects data types in geographically and topologically diverse regions. In CIC-IDS2017, HTTP, HTTPS, FTP, SSH, and email protocols are examined. CSECIC-2018 includes abstract distribution models for applications, protocols, or lower-level network entities. DARPA contains network traffic data. The ISCX 2012 dataset has profiles of various multi-stage attacks and actual network traffic with background noise. KDD Cup '99 is a collection of data transfers from a virtual environment. Kyoto 2006+ contains records of real network traffic and is used only for anomaly detection. NSL-KDD corrects flaws in KDD Cup '99 caused by redundant and duplicate records. UNSW-NB-15 is derived from real normal data and the synthesized contemporary attack activities of the network traffic. Methods: This study uses both quantitative and qualitative techniques. Scientific references and publicly accessible information about the given datasets are used. Results: Datasets are often simulated to meet objectives required by a particular organization. The number of real datasets is very small compared to the number of simulated datasets. Anomaly detection is rarely used today. Conclusion: The main characteristics and a comparative analysis of the datasets in terms of the date they were created, the size, the number of features, the traffic types, and the purpose are presented.
... In contemporary research, the following machine and deep learning techniques have been used extensively for detecting network traffic anomalies: Decision Tree [12], XG-Boost [13], Support Vector Machine [14], Deep Neural Network [15], and Convolution Neural Network [16]. In addition, Ahmad et al. [17] mentioned that many researchers use ensemble techniques to achieve optimized performance in the network intrusion detection domain. We took advantage of the power of the Ensemble learner after exhaustively removing unuseful features. ...
Article
Full-text available
Due to the wide variety of network services, many different types of protocols exist, producing various packet features. Some features contain irrelevant and redundant information. The presence of such features increases computational complexity and decreases accuracy. Therefore, this research is designed to reduce the data dimensionality and improve the classification accuracy in the UNSW-NB15 dataset. It proposes a hybrid dimensionality reduction system that does feature selection (FS) and feature extraction (FE). FS was performed using the Recursive Feature Elimination (RFE) technique, while FE was accomplished by transforming the features into principal components. This combined scheme reduced a total of 41 input features into 15 components. The proposed systems’ classification performance was determined using an ensemble of Support Vector Classifier (SVC), K-nearest Neighbor classifier (KNC), and Deep Neural Network classifier (DNN). The system was evaluated using accuracy, detection rate, false positive rate, f1-score, and area under the curve metrics. Comparing the voting ensemble results of the full feature set against the 15 principal components confirms that reduced and transformed features did not significantly decrease the classifier’s performance. We achieved 94.34% accuracy, a 93.92% detection rate, a 5.23% false positive rate, a 94.32% f1-score, and a 94.34% area under the curve when 15 components were input to the voting ensemble classifier.
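The evaluation metrics named in this abstract (accuracy, detection rate, false positive rate, and F1-score) can all be derived from binary confusion-matrix counts. The sketch below assumes a two-class attack/normal setting. The counts are invented for illustration and do not reproduce the reported results.

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute common IDS metrics from binary confusion-matrix counts.

    tp: attacks flagged as attacks     fp: normal traffic flagged as attacks
    tn: normal traffic passed          fn: attacks missed
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    detection_rate = tp / (tp + fn)        # also called recall
    false_positive_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * detection_rate / (precision + detection_rate)
    return {
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "false_positive_rate": false_positive_rate,
        "f1": f1,
    }

# Hypothetical counts, not taken from the paper.
m = ids_metrics(tp=90, fp=5, tn=95, fn=10)
print(m)
```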
... The supervised methods rely on labeled data, employing them for rigorous training to fine-tune their predictive capabilities, ultimately facilitating the accurate detection of network attacks [27]. In Ref. [28], the authors eliminated highly correlated features and evaluated three algorithms, i.e., SVM, artificial neural network (ANN), and AdaBoost with decision tree, on the preprocessed dataset. In particular, the AdaBoost model uses decision trees as the weak learner and updates weights using the AdaBoost algorithm. ...
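The weight update mentioned at the end of this snippet can be illustrated with a single round of classic discrete AdaBoost for labels in {-1, +1}. This is a sketch of the textbook rule, not necessarily the exact variant used in Ref. [28]. The function name and the toy labels are assumptions.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One boosting round: return (alpha, normalized new sample weights).

    `y_pred` holds the weak learner's predictions (for example, from a
    shallow decision tree); labels are assumed to be -1 or +1.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak learner
    # Misclassified samples (t * p == -1) are scaled up, correct ones down.
    updated = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Toy example: five samples, the second one is misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]
alpha, new_weights = adaboost_round([0.2] * 5, y_true, y_pred)
print(alpha, new_weights)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.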
Article
Full-text available
Cybersecurity faces constant challenges from increasingly sophisticated network attacks. Recent research shows machine learning can improve attack detection by training models on large labeled datasets. However, obtaining sufficient labeled data is difficult for internal networks. We propose a deep transfer learning model to learn common knowledge from domains with different features and distributions. The model has two feature projection networks to transform heterogeneous features into a common space, and a classification network then predicts transformed features into labels. To align probability distributions for two domains, maximum mean discrepancy (MMD) is used to compute distribution distance alongside classification loss. Though the target domain only has a few labeled samples, unlabeled samples are adequate for computing MMD to align unconditional distributions. In addition, we apply a soft classification scheme on unlabeled data to compute MMD over classes to further align conditional distributions. Experiments between NSL-KDD, UNSW-NB15, and CICIDS2017 validate that the method substantially improves cross-domain network attack detection accuracy.
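The maximum mean discrepancy used here as a distribution distance can be estimated directly from two samples. The sketch below computes the biased estimate of squared MMD with a Gaussian kernel. The kernel choice, the bandwidth `sigma`, and the toy points are assumptions, and in the described model the inputs would be the projected features in the common space rather than raw vectors.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two samples."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2 * mean_kernel(source, target))

# Hypothetical 2-D samples: `near` overlaps the source, `far` does not.
src = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.2)]
near = [(0.05, 0.0), (0.0, 0.1), (-0.1, 0.1)]
far = [(3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
print(mmd_squared(src, near), mmd_squared(src, far))
```

Minimizing this quantity alongside the classification loss pulls the two domains' feature distributions together. No labels are needed to compute it, which is why unlabeled target samples suffice for this term.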
... Normally, 80% of the data is allocated for training purposes, while the remaining 20% is reserved for evaluating the model's performance. To further refine the model and tune hyperparameters, the training data can be subdivided into 10 folds, for instance as in [42] for 10-fold cross-validation. During this process, nine folds are utilized for training the model, while the remaining fold is employed for validation. ...
Preprint
Full-text available
The critical need for robust defense against cyber-attacks has driven the development of Cyber Intrusion Detection Systems (CIDSs). Although CIDSs have proven good capabilities of swiftly and accurately identifying malicious activities, there are still some challenges that have to be tackled. The exponential growth of the search space is one of these challenges, making finding an optimal solution computationally infeasible for large datasets. Furthermore, the weight space while searching for optimal weight is highly nonlinear. Motivated by the observed characteristics, complexities, and challenges in the field, this paper presents an innovative CIDS named ANWO-MLP (Adaptive Nonlinear Whale Optimization Multi-layer Perceptron). A novel feature selection method called ANWO-FS (Adaptive Nonlinear Whale Optimization-Feature Selection) is employed in the proposed CIDS to identify the most predictive features, enabling robust MLP training even in the highly nonlinear weight spaces. The insider threat detection process is improved by investigating vital aspects of CIDS, including data processing, initiation, and output handling. We adopt ANWOA (previously proposed by us) to mitigate local stagnation, enable rapid convergence, optimize control parameters, and handle multiple objectives by initializing the weight vector in the ANWO-MLP training with minimal mean square error. Experiments were conducted on various highly imbalanced datasets to verify the robustness, stability, and efficiency of our proposed ANWO-MLP. By employing diverse metrics, superior detection rates are demonstrated compared to existing approaches.
... This model achieves maximum accuracy with a 99.99% true negative rate and a 0% false negative rate by using a decision tree algorithm rather than other ML classifiers. Other researchers [15] proposed deep learning model-based intrusion detection using hybrid sampling to solve the problem of data imbalance. To produce a balanced dataset, they use one-sided selection (OSS) to reduce the majority samples and increase the minority samples using the Synthetic Minority Over-Sampling Technique (SMOTE). ...
... The RF-SFS feature selection technique is utilized to improve the classification results and reduce the execution time. The proposed models demonstrate their efficiency when compared to the existing models in the literature [5][6][7][8][9][10][11][12][13][14][15][16][17][18]. When applied to the UNSW-NB15 dataset, the SFS-RF and RF-SFS-GRU approaches achieved accuracy scores of 78.52% and 79%, respectively, compared with the approach proposed in [18], whose ANN and KNN models achieved accuracy scores of 72.30% and 77.51%, respectively, when using 19 features. ...
Article
Full-text available
Despite the fact that satellite-terrestrial systems have advantages such as high throughput, low latency, and low energy consumption, as well as low exposure to physical threats and natural disasters and cost-effective global coverage, their integration exposes both of them to particular security challenges that can arise due to the migration of security challenges from one to another. Intrusion Detection Systems (IDS) can also be used to provide a high level of protection for modern network environments such as satellite-terrestrial integrated networks (STINs). To optimize the detection performance of malicious activities in network traffic, four hybrid intrusion detection systems for satellite-terrestrial communication systems (SAT-IDSs) are proposed in this paper. All the proposed systems exploit the sequential forward feature selection (SFS) method based on random forest (RF) to select important features from the dataset that increase relevance and reduce complexity and then combine them with a machine learning (ML) or deep learning (DL) model; Random Forest (RF), Long Short-Term memory (LSTM), Artificial Neural Networks (ANN), and Gated Recurrent Unit (GRU). Two datasets—STIN, which simulates satellite networks, and UNSW-NB15, which simulates terrestrial networks—were used to evaluate the performance of the proposed SAT-IDSs. The experimental results indicate that selecting significant and crucial features produced by RF-SFS vastly improves detection accuracy and computational efficiency. In the first dataset (STIN), the proposed hybrid ML system SFS-RF achieved an accuracy of 90.5% after using 10 selected features, compared to 85.41% when using the whole dataset. Furthermore, the RF-SFS-GRU model achieved the highest performance of the three proposed hybrid DL-based SAT-IDS with an accuracy of 87% after using 10 selected features, compared to 79% when using the entire dataset. 
In the second dataset (UNSW-NB15), the proposed hybrid ML system SFS-RF achieved an accuracy of 78.52% after using 10 selected features, compared to 75.4% when using the whole dataset. The model with the highest accuracy of the three proposed hybrid DL-based SAT-IDS was the RF-SFS-GRU model. It achieved an accuracy of 79% after using 10 selected features, compared to 74% when using the whole dataset.
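The sequential forward selection (SFS) step described in this abstract can be sketched generically: starting from an empty set, the feature that most improves a score is added until the target size is reached. In the paper the score comes from a random forest. Here a hypothetical toy scorer stands in for it so the example stays self-contained, and the feature names and gain values are invented.

```python
def sequential_forward_selection(candidates, score, n_select):
    """Greedily add the feature whose inclusion gives the best score."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical per-feature gains; a real scorer would train and validate
# a model (for example, a random forest) on each candidate subset.
GAIN = {"sttl": 0.30, "sbytes": 0.20, "dur": 0.10, "proto": 0.05}

def toy_score(subset):
    total = sum(GAIN[f] for f in subset)
    if "sttl" in subset and "sbytes" in subset:
        total -= 0.15  # assumed redundancy penalty between these two
    return total

chosen = sequential_forward_selection(GAIN, toy_score, n_select=2)
print(chosen)
```

The toy run picks the strongest single feature first and then skips the second-strongest one because of the redundancy penalty. Evaluating subsets rather than ranking features individually makes this kind of choice possible.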