Thesis
PDF available

Implementation and evaluation of secure and scalable anomaly-based network intrusion detection

Abstract

Corporate communication networks are frequently attacked with sophisticated and previously unseen malware or insider threats, which makes advanced defense mechanisms such as anomaly-based intrusion detection systems necessary to detect, alert and respond to security incidents. Both signature-based and anomaly detection strategies rely on features extracted from the network traffic, which requires secure and extensible collection strategies that make use of modern multi-core architectures. Available solutions are written in low-level systems programming languages that require manual memory management, and they suffer from frequent vulnerabilities that allow a remote attacker to disable or compromise the network monitor. Others have not been designed with research in mind and lack flexibility and data availability. To tackle these problems and ease future experiments with anomaly-based detection techniques, a research framework for collecting traffic features, implemented in a memory-safe language, is presented. It provides access to network traffic as type-safe structured data, either for specific protocols or custom abstractions, by generating audit records in a platform-neutral format. To reduce storage space, the output is compressed by default. The approach is entirely implemented in the Go programming language, has a concurrent design, is easily extensible, and can be used for live capture from a network interface or with PCAP and PCAPNG dump files. Furthermore, the framework offers functionality for the creation of labeled datasets, targeting applications in supervised machine learning. To demonstrate the developed tooling, a series of experiments is conducted on classifying malicious behavior in the CIC-IDS-2017 dataset, using TensorFlow and a deep neural network.
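As an editorial illustration of the collection approach described above, the following Go sketch reads a PCAP file with gopacket, turns each IPv4/TCP packet into a small structured record, and writes it gzip-compressed. The record type, its fields and the JSON encoding are assumptions made for brevity; the framework itself generates richer, protocol-specific audit records in a platform-neutral format.

// Hypothetical sketch: PCAP file to compressed, structured records.
package main

import (
    "compress/gzip"
    "encoding/json"
    "log"
    "os"
    "time"

    "github.com/google/gopacket"
    "github.com/google/gopacket/layers"
    "github.com/google/gopacket/pcap"
)

// auditRecord is a stand-in for a type-safe, protocol-specific record.
type auditRecord struct {
    Timestamp time.Time `json:"ts"`
    SrcIP     string    `json:"src"`
    DstIP     string    `json:"dst"`
    SrcPort   uint16    `json:"sport"`
    DstPort   uint16    `json:"dport"`
    Length    int       `json:"len"`
}

func main() {
    // Open a dump file; a live capture would use pcap.OpenLive instead.
    handle, err := pcap.OpenOffline("traffic.pcap")
    if err != nil {
        log.Fatal(err)
    }
    defer handle.Close()

    out, err := os.Create("records.json.gz")
    if err != nil {
        log.Fatal(err)
    }
    defer out.Close()
    gz := gzip.NewWriter(out) // compress the output by default
    defer gz.Close()
    enc := json.NewEncoder(gz)

    src := gopacket.NewPacketSource(handle, handle.LinkType())
    for packet := range src.Packets() {
        ipLayer := packet.Layer(layers.LayerTypeIPv4)
        tcpLayer := packet.Layer(layers.LayerTypeTCP)
        if ipLayer == nil || tcpLayer == nil {
            continue // only IPv4/TCP is handled in this sketch
        }
        ip := ipLayer.(*layers.IPv4)
        tcp := tcpLayer.(*layers.TCP)
        rec := auditRecord{
            Timestamp: packet.Metadata().Timestamp,
            SrcIP:     ip.SrcIP.String(),
            DstIP:     ip.DstIP.String(),
            SrcPort:   uint16(tcp.SrcPort),
            DstPort:   uint16(tcp.DstPort),
            Length:    packet.Metadata().Length,
        }
        if err := enc.Encode(&rec); err != nil {
            log.Fatal(err)
        }
    }
}

Swapping pcap.OpenOffline for pcap.OpenLive is all that changes between dump-file processing and live capture; the rest of the pipeline stays the same.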
... Again, the study used a CSV file containing features obtained from network flows. Mieden [8] presented many experiments on classifying unwanted behavior in the CICIDS2017 dataset using deep learning with TensorFlow. They worked on CSV files containing 85 features. ...
Article
Full-text available
The increasing use of Internet networks has led to increased threats and new attacks day after day. In order to detect anomalies or misuse, the Intrusion Detection System (IDS) has been proposed as an important component of a secure network. Because of their model-free properties, which enable them to learn network patterns and discover whether they are normal or malicious, machine-learning techniques have been useful in the area of intrusion detection. Different types of machine learning models have been leveraged in anomaly-based IDS. There is an increasing demand for reliable and real-world attack datasets among the research community. In this paper, a detailed analysis of the most recent dataset, CICIDS2017, is made. During the analysis, many problems and shortcomings in the dataset were found. Some solutions are proposed to fix these problems and produce an optimized CICIDS2017 dataset. A set of 36 features was extracted during the analysis and compared to the 23 features used in the literature. The 36-feature set gave the best results in terms of loss, accuracy and F1-score for the testing model.
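The accuracy and F1-score metrics referred to above are derived from the classifier's confusion matrix. As a minimal, self-contained illustration (not code from the paper), the following Go helper computes accuracy, precision, recall and F1 from raw true/false positive/negative counts.

package metrics

// Scores holds the metrics derived from confusion-matrix counts.
type Scores struct {
    Accuracy, Precision, Recall, F1 float64
}

// Evaluate derives accuracy, precision, recall and F1-score from the counts
// of true positives (tp), false positives (fp), true negatives (tn) and
// false negatives (fn) of a binary classifier.
func Evaluate(tp, fp, tn, fn float64) Scores {
    var s Scores
    if total := tp + fp + tn + fn; total > 0 {
        s.Accuracy = (tp + tn) / total
    }
    if tp+fp > 0 {
        s.Precision = tp / (tp + fp)
    }
    if tp+fn > 0 {
        s.Recall = tp / (tp + fn)
    }
    if s.Precision+s.Recall > 0 {
        // F1 is the harmonic mean of precision and recall.
        s.F1 = 2 * s.Precision * s.Recall / (s.Precision + s.Recall)
    }
    return s
}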
Book
Full-text available
With the rapid rise in the ubiquity and sophistication of Internet technology and the accompanying growth in the number of network attacks, network intrusion detection has become increasingly important. Anomaly-based network intrusion detection refers to finding exceptional or nonconforming patterns in network traffic data compared to normal behavior. Finding these anomalies has extensive applications in areas such as cyber security, credit card and insurance fraud detection, and military surveillance for enemy activities. Network Anomaly Detection: A Machine Learning Perspective presents machine learning techniques in depth to help you more effectively detect and counter network intrusion.
Article
Full-text available
Network packet tracing has been used for many different purposes during the last few decades, such as network software debugging, networking performance analysis, forensic investigation, and so on. Meanwhile, the size of packet traces becomes larger as network speeds rapidly increase. Thus, to handle huge amounts of traces, we need not only more hardware resources but also efficient software tools. However, traditional tools are inefficient at dealing with such big packet traces. In this paper, we propose pcapWT, an efficient packet extraction tool for large traces. pcapWT provides fast packet lookup by indexing an original trace using a wavelet tree structure. In addition, pcapWT supports multi-threading for avoiding synchronous I/O and blocking system calls used for file processing, and it is particularly efficient on machines with SSDs. pcapWT shows remarkable performance enhancements in comparison with traditional tools such as tcpdump and more recent tools such as pcapIndex, in terms of index data size and packet extraction time. Our benchmark using large and complex traces shows that pcapWT reduces the index data size to below 1% of the volume of the original traces. Moreover, packet extraction performance is 20% better than with pcapIndex. Furthermore, when a small number of packets is retrieved, pcapWT is hundreds of times faster than tcpdump.
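pcapWT's wavelet-tree index is considerably more elaborate, but the underlying idea, paying for one indexing pass so that later extractions avoid repeated linear scans, can be sketched with a much simpler structure. The Go sketch below builds an in-memory map from source IP to packet ordinals; it is an assumed simplification for illustration, not pcapWT's actual data layout.

package main

import (
    "fmt"
    "log"

    "github.com/google/gopacket"
    "github.com/google/gopacket/layers"
    "github.com/google/gopacket/pcap"
)

// buildIndex performs a single pass over the trace and records, for each
// source IP, the ordinal numbers of its packets. Later lookups then become
// map accesses instead of a tcpdump-style scan over the whole trace.
func buildIndex(path string) (map[string][]int, error) {
    handle, err := pcap.OpenOffline(path)
    if err != nil {
        return nil, err
    }
    defer handle.Close()

    index := make(map[string][]int)
    src := gopacket.NewPacketSource(handle, handle.LinkType())
    n := 0
    for packet := range src.Packets() {
        if ipLayer := packet.Layer(layers.LayerTypeIPv4); ipLayer != nil {
            ip := ipLayer.(*layers.IPv4)
            index[ip.SrcIP.String()] = append(index[ip.SrcIP.String()], n)
        }
        n++
    }
    return index, nil
}

func main() {
    index, err := buildIndex("trace.pcap") // hypothetical input file
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("packets from 10.0.0.1:", index["10.0.0.1"])
}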
Article
Full-text available
Anomaly detection in communication networks provides the basis for uncovering novel attacks, misconfigurations and network failures. Resource constraints for data storage, transmission and processing make it beneficial to restrict input data to features that are (a) highly relevant for the detection task and (b) easily derivable from network observations without expensive operations. Removing strongly correlated, redundant and irrelevant features also improves the detection quality of many algorithms that are based on learning techniques. In this paper we address the feature selection problem for network-traffic-based anomaly detection. We propose a multi-stage feature selection method using filters and stepwise regression wrappers. Our analysis is based on 41 widely adopted traffic features that are present in several commonly used traffic datasets. With our combined feature selection method we could reduce the original feature vectors from 41 to only 16 features. We tested our results with five fundamentally different classifiers and observed no significant reduction of the detection performance. In order to quantify the practical benefits of our results, we analyzed the costs of generating individual features from standard IP Flow Information Export records, available at many routers. We show that we can eliminate 13 very costly features, thus reducing the computational effort for on-line feature generation from live traffic observations at network nodes.
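The filter stage of such a method can be illustrated with a simple greedy correlation filter: a feature column is dropped when its absolute Pearson correlation with an already selected column exceeds a threshold. The Go sketch below shows only this filter idea; the paper's full procedure additionally uses stepwise regression wrappers.

package featsel

import "math"

// pearson computes the Pearson correlation coefficient of two equally long
// feature columns.
func pearson(x, y []float64) float64 {
    n := float64(len(x))
    var sx, sy, sxx, syy, sxy float64
    for i := range x {
        sx += x[i]
        sy += y[i]
        sxx += x[i] * x[i]
        syy += y[i] * y[i]
        sxy += x[i] * y[i]
    }
    num := n*sxy - sx*sy
    den := math.Sqrt(n*sxx-sx*sx) * math.Sqrt(n*syy-sy*sy)
    if den == 0 {
        return 0
    }
    return num / den
}

// CorrelationFilter returns the indices of feature columns that survive a
// greedy correlation filter: a column is kept unless its absolute correlation
// with an already kept column exceeds threshold.
func CorrelationFilter(columns [][]float64, threshold float64) []int {
    var kept []int
    for i, col := range columns {
        redundant := false
        for _, j := range kept {
            if math.Abs(pearson(col, columns[j])) > threshold {
                redundant = true
                break
            }
        }
        if !redundant {
            kept = append(kept, i)
        }
    }
    return kept
}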
Conference Paper
Full-text available
In 1987, Dorothy Denning published the seminal paper on anomaly detection as applied to intrusion detection on a single system. Her paper sparked a new paradigm in intrusion detection research with the notion that malicious behavior could be distinguished from normal system use. Since that time, a great deal of anomaly detection research based on Denning's original premise has occurred. However, Denning's assumptions about anomalies that originate on a single host have been applied essentially unaltered to networks. In this paper we question the application of Denning's work to network-based anomaly detection, along with other assumptions commonly made in network-based detection research. We examine the assumptions underlying selected studies of network anomaly detection and discuss these assumptions in the context of the results from studies of network traffic patterns. The purpose of questioning the old paradigm of anomaly detection as a strategy for network intrusion detection is to reconfirm the paradigm as sound, or to begin the process of replacing it with a new paradigm in light of changes in the operating environment.
Article
A computer system intrusion is seen as any set of actions that attempt to compromise the integrity, confidentiality or availability of a resource. The introduction of networks and the Internet caused great concern about the protection of sensitive information and has resulted in many computer security research efforts during the past few years. Although preventative techniques such as access control and authentication attempt to keep intruders out, these can fail, and as a second line of defence, intrusion detection has been introduced. Intrusion detection systems (IDS) are implemented to detect an intrusion as it occurs and to execute countermeasures when one is detected. Usually, a security administrator has difficulty in selecting an IDS approach for his unique set-up. In this report, different approaches to intrusion detection systems are compared, to supply a norm for the best-fit system. The results would assist in selecting a single appropriate intrusion detection system, or a combination of approaches, that best fits any unique computer system.
Conference Paper
With the exponential growth in the size of computer networks and of deployed applications, the potential damage that can be caused by launching attacks is increasing significantly. Meanwhile, Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) are among the most important defense tools against sophisticated and ever-growing network attacks. Due to the lack of adequate datasets, anomaly-based approaches in intrusion detection systems suffer with respect to accurate deployment, analysis and evaluation. A number of such datasets exist, such as DARPA98, KDD99, ISCX2012 and ADFA13, that have been used by researchers to evaluate the performance of their proposed intrusion detection and intrusion prevention approaches. Based on our study of eleven datasets available since 1998, many of them are out of date and unreliable to use. Some of these datasets suffer from a lack of traffic diversity and volume, some do not cover the variety of attacks, while others anonymize packet information and payload and so cannot reflect current trends, or they lack feature sets and metadata. This paper produces a reliable dataset that contains benign traffic and seven common attack network flows, which meets real-world criteria and is publicly available. Consequently, the paper evaluates the performance of a comprehensive set of network traffic features and machine learning algorithms to indicate the best set of features for detecting each attack category.
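Producing a labeled dataset of this kind boils down to joining each extracted flow with the documented attack schedule. The Go sketch below is a hypothetical simplification: its Flow and Attack types and the matching rule (endpoints plus time window, "BENIGN" otherwise) are assumptions for illustration, not the authors' tooling.

package labeling

import "time"

// Flow is a simplified flow record; real CICIDS2017 flows carry many more features.
type Flow struct {
    SrcIP, DstIP string
    DstPort      uint16
    Start        time.Time
    Label        string
}

// Attack describes one entry of a documented attack schedule: which attacker
// and victim were involved, and during which time window.
type Attack struct {
    AttackerIP, VictimIP string
    Begin, End           time.Time
    Name                 string
}

// Label assigns an attack name to every flow whose endpoints and start time
// match an attack entry, and "BENIGN" to everything else.
func Label(flows []Flow, attacks []Attack) {
    for i := range flows {
        flows[i].Label = "BENIGN"
        for _, a := range attacks {
            if flows[i].SrcIP == a.AttackerIP &&
                flows[i].DstIP == a.VictimIP &&
                !flows[i].Start.Before(a.Begin) &&
                !flows[i].Start.After(a.End) {
                flows[i].Label = a.Name
                break
            }
        }
    }
}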
Conference Paper
Deep packet inspection (DPI) systems process wire-format network data from untrusted sources, collecting semantic information from a variety of protocols and file formats as they work their way upwards through the network stack. However, implementing corresponding dissectors for the potpourri of formats that today's networks carry remains time-consuming and cumbersome, and also poses fundamental security challenges. We introduce a novel framework, Spicy, for dissecting wire-format data that consists of (i) a format specification language that tightly integrates syntax and semantics; (ii) a compiler toolchain that generates efficient and robust native dissector code from these specifications just-in-time; and (iii) an extensive API for DPI applications to drive the process and leverage results. Furthermore, Spicy can reverse the process as well, assembling wire format from the high-level specifications. We pursue a number of case studies that showcase dissectors for network protocols and file formats, individually as well as chained into a dynamic stack that processes raw packets up to application-layer content. We also demonstrate a number of example host applications, from a generic driver program to integration into Wireshark and Bro. Overall, this work provides a new capability for developing powerful, robust, and reusable dissectors for DPI applications. We publish Spicy as open source under a BSD license.
Conference Paper
When building applications that process large volumes of network traffic – such as firewalls, routers, or intrusion detection systems – one faces a striking gap between the ease with which the desired analysis can often be described in high-level terms, and the tremendous amount of low-level implementation detail one must still grapple with to arrive at an efficient and robust system. We present a novel environment, HILTI, that bridges these two levels by offering the application designer the abstractions required for effectively describing typical network analysis tasks, while still being designed to provide the performance necessary for monitoring Gbps networks in operational settings. The new HILTI middle-layer consists of two main pieces: an abstract machine model that is specifically tailored to the networking domain and directly supports the field's common abstractions and idioms in its instruction set; and a compilation strategy for turning programs written for the abstract machine into optimized, natively executable, task-parallel code for a given target platform. We have developed a prototype of the HILTI environment that fully supports all of the abstract machine's functionality, and we have ported a number of typical networking applications to the new environment. We also discuss how HILTI's processing can transparently integrate custom hardware elements where available, as well as leverage non-standard many-core platforms for parallelization.
Article
Several information security techniques are available today to protect information and systems against unauthorized use, duplication, alteration, destruction and virus attacks. Intrusion detection, a key component of information security (protect, detect and react) and network defense, provides information on successful and unsuccessful attempts to compromise information assurance (availability, integrity and confidentiality). Intruders can broadly be categorized into two types: external intruders, who are unauthorized users of the information and systems they attack, and internal intruders, who have permission to access information and systems with a few restrictions. In this chapter we first present the state of the art of the evolution of intrusion detection technology and address a few intrusion detection techniques and IDS implementations. An overview of computer attack taxonomy and computer attack demystification, along with a few detection signatures, is presented. Special emphasis is also given to current IDS limitations. Further, we describe a few obfuscation techniques applied to recent viruses that were used to thwart commercial-grade antivirus tools.
Article
A model of a real-time intrusion-detection expert system capable of detecting break-ins, penetrations, and other forms of computer abuse is described. The model is based on the hypothesis that security violations can be detected by monitoring a system's audit records for abnormal patterns of system usage. The model includes profiles for representing the behavior of subjects with respect to objects in terms of metrics and statistical models, and rules for acquiring knowledge about this behavior from audit records and for detecting anomalous behavior. The model is independent of any particular system, application environment, system vulnerability, or type of intrusion, thereby providing a framework for a general-purpose intrusion-detection expert system. Index terms: abnormal behavior, auditing, intrusions, monitoring, profiles, security, statistical measures.
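The statistical part of the model can be illustrated with a single running profile per subject and metric: an observation is flagged when it deviates from the learned mean by more than a chosen number of standard deviations. The Go sketch below is a deliberately simplified reading of that idea, not Denning's full set of metrics and rules.

package profile

import "math"

// Profile keeps a running mean and variance of one audit-record metric for a
// single subject, in the spirit of Denning's statistical profiles.
type Profile struct {
    n         float64
    mean, m2  float64 // state for Welford's online algorithm
    Threshold float64 // number of standard deviations considered anomalous
}

// Update feeds one observation (e.g. bytes transferred, failed logins) into
// the profile and reports whether it deviates anomalously from past behavior.
func (p *Profile) Update(x float64) bool {
    anomalous := false
    if p.n > 1 {
        std := math.Sqrt(p.m2 / (p.n - 1))
        if std > 0 && math.Abs(x-p.mean)/std > p.Threshold {
            anomalous = true
        }
    }
    // Welford's update of the running mean and squared-deviation sum.
    p.n++
    delta := x - p.mean
    p.mean += delta / p.n
    p.m2 += delta * (x - p.mean)
    return anomalous
}

A caller would create one profile per subject and metric, for example p := &Profile{Threshold: 3}, and feed it each new observation via p.Update.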