Examples of transformed events by the log2text component

Source publication
Article
Full-text available
Insider threat detection has drawn increasing attention in recent years. To capture a malicious insider's digital footprints, which are scattered across a wide range of audit data sources over a long period of time, existing approaches often leverage a scoring mechanism to orchestrate alerts generated from multiple sub-detectors, or requ...

Context in source publication

Context 1
... different security logs can be transformed into an identical format. Table 3 shows examples drawn from the experimental dataset for each type of security log. The text for each event can then be generated by concatenating its six words. ...
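The concatenation step described above can be sketched as follows. This is a minimal illustration only: the field names, event layout, and example values are hypothetical, since the excerpt does not specify the exact schema of the six words.

```python
# Hypothetical sketch of a log2text-style transformation: each security-log
# event is mapped to six words, which are concatenated into one text line.
def event_to_text(event):
    """Map a raw event dict to a fixed six-word sentence (field names assumed)."""
    words = [
        event["user"],        # who
        event["action"],      # what
        event["weekday"],     # when (day granularity)
        event["period"],      # when (time-of-day bucket)
        event["host"],        # where
        event["target"],      # object acted upon
    ]
    return " ".join(words)

logon_event = {
    "user": "ACM2278", "action": "logon", "weekday": "monday",
    "period": "offhours", "host": "PC-1234", "target": "workstation",
}
print(event_to_text(logon_event))
# -> "ACM2278 logon monday offhours PC-1234 workstation"
```

Because every event type is reduced to the same six-slot sentence, heterogeneous logs end up in an identical textual format, ready for corpus-based processing.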

Similar publications

Article
Full-text available
The purpose of this paper is to analyze, discuss, and develop a study of world universal digitalization processes as well as challenges and threats, and develop an approach to defining the shadow digital economy. Along with huge innovative achievements, digitalization processes are accompanied by the formation of a digital economy and the growth of...
Article
Full-text available
Named entity recognition (NER) is a word-level sequence tagging task. The key of Chinese cybersecurity NER is to obtain meaningful word representations and to delicately model the inter-word relations. However, Chinese is a language of compound words and lacks morphological inflections. Moreover, the role and meaning of a word depends on the contex...
Article
Full-text available
This chapter contributes to the cybersecurity conversation around being a parent in the digital era. We briefly examine parenting styles, the intuitive nature of digital technologies, and their implications for children's development if not properly harnessed. Different people adopt different approaches. Regardless of the parenting style, parents ne...
Preprint
Full-text available
To avoid costly security patching after software deployment, security-by-design techniques (e.g., STRIDE threat analysis) are adopted in organizations to root out security issues before the system is ever implemented. Despite the global gap in cybersecurity workforce and the high manual effort required for performing threat analysis, organizations...

Citations

... However, more issues were discovered in the KDD99 data set [30,33,34]. Many of these issues were resolved in the updated NSL-KDD data set, for example by removing redundant records, which reduced the size of the training and test sets and made experiments easier and faster [21,30,35]. ...
Article
Full-text available
Insider threats have recently become one of the most urgent cybersecurity challenges facing numerous businesses, such as public infrastructure companies, major federal agencies, and state and local governments. Our purpose is to find the most accurate machine learning (ML) model to detect insider attacks. In the realm of machine learning, the most convenient classifier is usually selected after further evaluation trials of candidate models, which can cause unseen data (the test data set) to leak into models and create bias. Accordingly, overfitting occurs because of frequent training of models and tuning of hyperparameters; the models perform well on the training set while failing to generalize effectively to unseen data. A validation data set and hyperparameter tuning are utilized in this study to prevent the issues mentioned above and to choose the best model from our candidates. Furthermore, our approach guarantees that the selected model does not memorize data of the threats occurring in the local area network (LAN) through the usage of the NSL-KDD data set. The following models are trained and analyzed: support vector machine (SVM), decision tree (DT), logistic regression (LR), adaptive boost (AdaBoost), gradient boosting (GB), random forests (RFs), and extremely randomized trees (ERTs). After analyzing the findings, we conclude that the AdaBoost model is the most accurate, with accuracies of 99% for DoS, 99% for probe, 96% for access, and 97% for privilege, as well as AUCs of 0.992 for DoS, 0.986 for probe, 0.952 for access, and 0.954 for privilege.
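The selection procedure the abstract describes, comparing candidates on a held-out validation split so the test split is never used for model choice, can be sketched as follows. The data here is synthetic and the candidate set is abbreviated; this is an illustration of the protocol, not the paper's actual pipeline.

```python
# Sketch of validation-based model selection: candidates are compared on a
# held-out validation split, and the test split is scored only once, by the
# chosen model. Data is synthetic (make_classification), not NSL-KDD.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

candidates = {
    "dt": DecisionTreeClassifier(random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
    "rf": RandomForestClassifier(random_state=0),
}
# Pick the candidate with the best validation accuracy; test data stays unseen
# during selection, which avoids the leakage/bias problem described above.
val_scores = {name: model.fit(X_train, y_train).score(X_val, y_val)
              for name, model in candidates.items()}
best_name = max(val_scores, key=val_scores.get)
test_score = candidates[best_name].score(X_test, y_test)
print(best_name, round(test_score, 3))
```

Only the winner's single test-set score is reported, so repeated evaluation trials never contaminate the final estimate.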
... By training a model with Word2vec on the corpus, the researchers were able to approximate the posterior probabilities associated with insider behaviors. The proposed approach proves to be effective and scalable for practical applications in insider threat detection [17]. The problem with this method is that malicious incidents are detected only after they have bypassed perimeter security layers. ...
... In our previous work [7], we performed related studies on the feature extraction of user behavior and categorized the features into two types: (i) statistical features based on manual definition; (ii) hidden features based on representation learning. Although previous methods [8][9][10][11][12][13] have their own unique insights, they still face the following limitations: ...
... Inspired by position encoding [28], Yuan et al. [27] retained the absolute time information of user activities by calculating the minute offset, and used a self-attention mechanism to construct the final behavior representation. Secondly, some other historical baseline schemes [13,29,30] prefer to capture temporal characteristics in a roundabout way. In these schemes, the model itself is only a modeling method; the training data and training mode are key. ...
... In other words, they simply take the individual historical behaviors as the model input. For example, Liu et al. [13] used the "4W" template to reorganize audit logs and arranged them by user id in chronological order to form the training corpus. This indirect extraction method has the advantage of simplicity, but also confronts the challenge of limited performance. ...
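The corpus construction sketched in that excerpt, grouping "4W"-style event sentences by user id and sorting them chronologically, can be illustrated as follows. The event tuples and sentences are invented for the example.

```python
# Sketch of per-user chronological corpus construction: 4W-style sentences are
# grouped by user id and sorted by timestamp, yielding one behavior sequence
# per user that a Word2vec-style model could then be trained on.
from collections import defaultdict

# (timestamp, user, six-word sentence) triples; values are illustrative only.
events = [
    (3, "u1", "u1 email tuesday workhours PC-1 external"),
    (1, "u1", "u1 logon monday workhours PC-1 workstation"),
    (2, "u2", "u2 usb monday offhours PC-2 removable"),
]

corpus = defaultdict(list)
for ts, user, sentence in sorted(events):   # chronological order by timestamp
    corpus[user].append(sentence.split())   # one token list per event

print(corpus["u1"][0][1])
# -> "logon"  (u1's earliest event comes first after sorting)
```

The resulting per-user token sequences are exactly the kind of corpus a word-embedding model consumes, which is how the audit logs become trainable text.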
Article
Full-text available
Insider threat detection is important for the smooth operation and security protection of an organizational system. Most existing detection models establish a historical baseline by reconstructing single-day, individual user behaviors, and then treat any outlier from the baseline as a threat. However, such methods ignore the temporal and spatial correlations between different activities, which results in unsatisfactory performance. To address this issue, we propose a novel insider threat detection method, namely Memory-Augmented Insider Threat Detection (MAITD). The idea is motivated by the observation that combining an individual model that focuses on the historical baseline with a group model that represents the peer baseline can effectively identify low-signal yet long-lasting insider threats and reduce false positives. Specifically, our MAITD captures the temporal and spatial correlations of user behaviors by constructing a compound behavioral matrix and a common group model, and integrates the detection results according to the specific application scenario. Moreover, it introduces a memory-augmented network into the autoencoder to enlarge the reconstruction error of abnormal samples, thereby reducing the false negative rate. Experimental results on the CERT dataset show that the instance-based and user-based AUCs of MAITD reach up to 87.54% and 94.56%, respectively, significantly outperforming previous works.
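The reconstruction-error mechanism that autoencoder-based detectors like this build on can be sketched with numpy. This is a deliberately minimal stand-in: a mean vector plays the role of the decoder output, not the memory-augmented network itself, and all data is synthetic.

```python
# Minimal sketch of reconstruction-error anomaly scoring: behaviors whose
# reconstruction error exceeds a baseline-derived threshold are flagged.
# The mean of normal data stands in for decode(encode(x)) of an autoencoder.
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(500, 8))    # baseline user behaviors
anomaly = rng.normal(5.0, 1.0, size=(5, 8))     # clearly deviating behaviors

center = normal.mean(axis=0)                    # "reconstruction" stand-in

def score(x):
    """Per-sample squared reconstruction error against the learned baseline."""
    return ((x - center) ** 2).sum(axis=1)

# Threshold at the 99th percentile of normal scores (~1% false positives).
threshold = np.percentile(score(normal), 99)
flags = score(anomaly) > threshold
print(flags)
```

Enlarging the error gap between normal and abnormal samples, which is what the memory-augmented network is introduced for, directly widens the margin this threshold has to work with.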
... It was shown in [66] that, to capture the distributed digital footprints of malicious insiders among a variety of audit data sources over a prolonged period of time, current methods usually use a scoring mechanism to arrange alerts produced by several sub-detectors. The mentioned approaches lead to great deployment complexity and extra cost [66]. The authors of [67] explained that the US Navy, with the help of Stottler Henke, is extending and enhancing the Intelligent Surface Threat Identification System (ISTIS). ...
Preprint
Full-text available
The threat hunting lifecycle is a complex atmosphere that requires special attention from professionals to maintain security. This paper is a collection of recent work that gives a holistic view of the threat hunting ecosystem, identifies challenges, and discusses the future with the integration of artificial intelligence (AI). We specifically establish a life cycle and ecosystem for privacy-threat hunting in addition to identifying the related challenges. We also discovered how critical the use of AI is in threat hunting. This work paves the way for future work in this area as it provides the foundational knowledge to make meaningful advancements for threat hunting.
... The proposed system shows clear advantages in both detection performance and the ability to generalize when compared to other works in the literature employing unsupervised anomaly detection methods for insider threat detection on the CERT datasets [11], [13]- [18], [57]. ...
... On CERT R6.2 data, our approach achieved AUC of 0.977 and 0.981 on day and week data. In comparison, recent best AUCs achieved on R6.2 day data were 0.814 (Matterer and Lejeune [17]), and 0.956 (Liu et al. [57], on only 3 malicious insiders). This demonstrates the advantage of our approach in embedding temporal information in data representation, as opposed to using a learner with temporal learning capabilities such as Long Short-Term Memory [17] and Markov models [13]. ...
Article
Full-text available
... Existing works helped to understand the exhaustive range of methods applied from various perspectives. To list a few, it includes statistical metric based solutions [9], machine learning [10], image-based analysis [11], blockchain-based analysis [12], natural language processing (NLP) approaches [13] and dealing with the insider attack in several critical application areas like health care [14] and Internet of Things [15]. There is an upward trend towards using machine learning and deep learning based solutions for insider threat analysis. ...
Preprint
Full-text available
Cyberattacks from within an organization's trusted entities are known as insider threats. Anomaly detection using deep learning requires comprehensive data, but insider threat data is not readily available due to confidentiality concerns of organizations. Therefore, there arises demand to generate synthetic data to explore enhanced approaches for threat analysis. We propose a linear manifold learning-based generative adversarial network, SPCAGAN, that takes input from heterogeneous data sources and adds a novel loss function to train the generator to produce high-quality data that closely resembles the original data distribution. Furthermore, we introduce a deep learning-based hybrid model for insider threat analysis. We provide extensive experiments for data synthesis, anomaly detection, adversarial robustness, and synthetic data quality analysis using benchmark datasets. In this context, empirical comparisons show that GAN-based oversampling is competitive with numerous typical oversampling regimes. For synthetic data generation, our SPCAGAN model overcame the problem of mode collapse and converged faster than previous GAN models. Results demonstrate that our proposed approach has a lower error, is more accurate, and generates substantially superior synthetic insider threat data than previous models.
... As a reactive measure, the audit data can be retrieved to avoid further damage once cybersecurity incidents have happened [48]. Furthermore, the risks are monitored based on the established risk profile. ...
Preprint
Full-text available
Cyber assurance, which is the ability to operate under the onslaught of cyber attacks and other unexpected events, is essential for organizations facing inundating security threats on a daily basis. Organizations usually employ multiple strategies to conduct risk management to achieve cyber assurance. Utilizing cybersecurity standards and certifications can provide guidance for vendors to design and manufacture secure Information and Communication Technology (ICT) products as well as provide a level of assurance of the security functionality of the products for consumers. Hence, employing security standards and certifications is an effective strategy for risk management and cyber assurance. In this work, we begin with investigating the adoption of cybersecurity standards and certifications by surveying 258 participants from organizations across various countries and sectors. Specifically, we identify adoption barriers of the Common Criteria through the designed questionnaire. Taking into account the seven identified adoption barriers, we show the recommendations for promoting cybersecurity standards and certifications. Moreover, beyond cybersecurity standards and certifications, we shed light on other risk management strategies devised by our participants, which provides directions on cybersecurity approaches for enhancing cyber assurance in organizations.
... In the literature [10], [14], various Deep Neural Network (DNN) techniques were studied and evaluated, with insights into many improvements that could be made to existing models using RNNs and Reinforcement Neural Networks. Hu et al. [11] correlated events from multiple log sources, such as Active Directory (AD), Virtual Private Network (VPN) products, and data security products, to build user profiles, which was an important technique. In [12], a significant task on a UBA platform used a multi-algorithm technique combining a One-Class Support Vector Machine (OCSVM), an RNN, and isolation forest on an aggregated data source, which showed improved effectiveness over each individual method. ...
Conference Paper
Full-text available
In the field of security analysis of an organization, identifying anomalous user activities from log data for insider threat detection is difficult as well as important. Identification of such anomalous insider behavior is commonly achieved through behavior modeling. This paper presents an approach of one-class learning, also known as unary classification or class modeling, where the model is trained exclusively on majority-class data. The model learns what normal behavior for an employee of an organization looks like. The proposed approach attempts to detect insider threat activities by monitoring for unexpected or suspicious behavior, which produces high reconstruction error within the model and is classified as anomalous. Training of the model uses feature vectors extracted from user log activities in a fixed per-day window. The approach implements a Gated Recurrent Unit (GRU) based autoencoder to model user behavior per day and detect anomalous insider threat points. Since the model fits closely to normal data, the error produced by normal data is very low, while the autoencoder produces high error on the malicious class of abnormal data. The dataset used in this work is Computer Emergency Response Team (CERT) r4.2, and feature vectors are derived from the number of times a user performs certain activities within a day. The performance of the model was measured at different thresholds, and the model demonstrated good distinction with minimal misclassification for both classes, with true positive and true negative rates at 79.81%.
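The per-day count features that abstract describes can be sketched directly: for each user-day, count how often each activity type occurs and emit a fixed-order vector. The activity names here are illustrative, not the paper's exact feature set.

```python
# Sketch of per-day count features: for each (user, day), count how often each
# activity type occurs; the counts form the feature vector fed to the
# autoencoder. Activity names are illustrative stand-ins.
from collections import Counter

ACTIVITIES = ["logon", "email", "http", "device", "file"]

def daily_features(day_events):
    """Map one day's activity list to a fixed-order count vector."""
    counts = Counter(day_events)
    return [counts.get(a, 0) for a in ACTIVITIES]

day = ["logon", "http", "http", "email", "logon", "device"]
print(daily_features(day))
# -> [2, 1, 2, 1, 0]
```

Because the vector has a fixed length and ordering, days with no activity of a given type still produce comparable inputs, which is what a reconstruction-based model needs.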
... Despite the aforementioned challenges, industry and academia have proposed many insider threat detection approaches [4][5][6][7][8][9][10]. Since malicious behavior is widely varying, it is impractical to explicitly characterize insider threat. ...
... To further improve model accuracy, Jiang et al. expand the feature vector by exploiting a graph convolutional network and the structural information between users [8]. Inspired by natural language processing, Liu et al. first use the "4W" template to reorganize the audit logs, and then transform the human-consumable textual data into machine-consumable numerical vectors with the help of the Word2vec model [4]. The main advantage of this approach is that it can capture the potential semantic properties in the original audit logs without relying on any domain knowledge. ...
... Similarly, we experimentally determine the appropriate value for the threshold κ. Although the determination of a malicious user depends on whether the person performs a malicious act or not in the practical situation, there is a ...

Input: user set U, user behavior set Ω, transformation set Ψ, softmax classifier h_s(x)
Output: suspicious behavior set Φ, suspicious user set U_a
(1) Φ ← ∅, U_a ← ∅, Γ ← {week, day, session}
(2) for u ∈ U do
(3)   Ω_u ← Ω_train + Ω_test
(4)   for ω ∈ Ω_train do
(5)     for t ∈ Γ do
(6)       calculate the feature vectors x_t^u based on user id u
(7)     end for
(8)   end for
(9)   S_t^u ← {x_t^u : t ∈ Γ}  // create the grayscale image S according to equation (2)
(10)  S_Ψ ← {(Ψ_j(s), j) : s ∈ S_t^u, Ψ_j ∈ Ψ}  // create the self-labelled dataset
(11)  while not converged do
(12)    train h_s(x) on the self-labelled dataset S_Ψ
(13)  end while
(14)  for i ∈ {0, …, K−1} do  // K = |Ψ|
(15)    calculate α_i according to the numerical method [35]. ...
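The self-labelling step (10), applying each transformation Ψ_j to a behavior image and using the transformation index j as the label, can be sketched in numpy. The specific transformations below (rotations and a flip) are plausible stand-ins for Ψ, not necessarily the set used in the paper.

```python
# Sketch of the self-labelling step: each behavior grayscale image is
# transformed by every Psi_j, and the index j becomes the class label, turning
# anomaly detection into a K-way classification task (K = |Psi|).
import numpy as np

TRANSFORMS = [
    lambda img: img,                # identity
    lambda img: np.rot90(img, 1),   # 90-degree rotation
    lambda img: np.rot90(img, 2),   # 180-degree rotation
    lambda img: np.fliplr(img),     # horizontal flip
]

def self_labelled(images):
    """Return (transformed image, transformation index) pairs."""
    return [(t(img), j) for img in images for j, t in enumerate(TRANSFORMS)]

imgs = [np.arange(9, dtype=float).reshape(3, 3)]   # one toy 3x3 "behavior image"
dataset = self_labelled(imgs)
print(len(dataset))
# -> 4  (one labelled sample per transformation)
```

A classifier trained to predict j on normal images should become confused on images from malicious behavior, which is the detection signal the method exploits.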
Article
Full-text available
Insider threat detection has been a challenging task over decades; existing approaches generally employ the traditional generative unsupervised learning methods to produce a normal user behavior model and detect significant deviations as anomalies. However, such approaches are insufficient in precision and computational complexity. In this paper, we propose a novel insider threat detection method, Image-based Insider Threat Detector via Geometric Transformation (IGT), which converts the unsupervised anomaly detection into a supervised image classification task, and therefore the performance can be boosted via computer vision techniques. To illustrate, our IGT uses a novel image-based feature representation of user behavior by transforming audit logs into grayscale images. By applying multiple geometric transformations on these behavior grayscale images, IGT constructs a self-labelled dataset and then trains a behavior classifier to detect anomalies in a self-supervised manner. The motivation behind our proposed method is that images converted from normal behavior data may contain unique latent features which remain unchanged after geometric transformation, while malicious ones cannot. Experimental results on the CERT dataset show that IGT outperforms the classical autoencoder-based unsupervised insider threat detection approaches, and improves the instance-based and user-based Area under the Receiver Operating Characteristic Curve (AUROC) by 4% and 2%, respectively.
... Despite the above challenges, industry and academia have put forward many insider threat detection approaches [4]-[10]. Since malicious behavior is widely varying, it is impractical to explicitly characterize insider threat. ...
... To further improve the model accuracy, Jiang et al. expand the feature vector by exploiting the graph convolutional network and structural information between users [8]. Inspired by natural language processing, Liu et al. first use the '4W' template to reorganize the audit logs, and then transform the human-consumable textual data into machine-consumable numerical vectors with the help of the Word2vec model [4]. The main advantage of this approach is that it can capture the potential semantic properties in the original audit logs without relying on any domain knowledge. ...
Preprint
Insider threat detection has been a challenging task over decades; existing approaches generally employ traditional generative unsupervised learning methods to produce a normal user behavior model and detect significant deviations as anomalies. However, such approaches are insufficient in precision and computational complexity. In this paper, we propose a novel insider threat detection method, Image-based Insider Threat Detector via Geometric Transformation (IGT), which converts the unsupervised anomaly detection into a supervised image classification task, and therefore the performance can be boosted via computer vision techniques. To illustrate, our IGT uses a novel image-based feature representation of user behavior by transforming audit logs into grayscale images. By applying multiple geometric transformations on these behavior grayscale images, IGT constructs a self-labelled dataset and then trains a behavior classifier to detect anomalies in a self-supervised manner. The motivation behind our proposed method is that images converted from normal behavior data may contain unique latent features which remain unchanged after geometric transformation, while malicious ones cannot. Experimental results on the CERT dataset show that IGT outperforms the classical autoencoder-based unsupervised insider threat detection approaches, and improves the instance-based and user-based Area under the Receiver Operating Characteristic Curve (AUROC) by 4% and 2%, respectively.