Automated data collection system model.

Source publication

The Effects of Traditional Anti-Virus Labels on Malware Detection Using Dynamic Runtime Opcodes

Article

Full-text available

Sep 2017

The arms race between distributors of malware and those seeking to provide defenses has so far favored the former. Signature detection methods have been unable to cope with the onslaught of new binaries aided by rapidly developing obfuscation techniques. Recent research has focused on the analysis of low-level opcodes, both static and dynamic, as a...

Context 1

... examining executables at the opcode level, these can be observed. As shown in fig.3, the system automates processing of the executables and extraction of runtraces. ...

FIGURE 1. Disassembly analysis. This is a screenshot of the ASM file...

FIGURE 2. Our ACNN framework. This model consists of 5 layers: Input,...

FIGURE 6. Attention mechanisms + Residual mechanisms 2. This model...

FIGURE 7. Distribution of data sets. In the picture, the x-axis...

How to Make Attention Mechanisms More Practical in Malware Classification

Article

Full-text available

Oct 2019

Malware and its variants continue to pose a threat to network security. Machine learning has been widely used in the field of malware classification, but some emerging studies, such as attention mechanisms, are rarely applied in this field. In this paper, we analyze the correspondence between bytecode and disassembly of malware, and propose a new f...

IoT Malware Network Traffic Classification using Visual Representation and Deep Learning

Preprint

Full-text available

Oct 2020

With the increase of IoT devices and technologies coming into service, Malware has risen as a challenging threat with increased infection rates and levels of sophistication. Without strong security mechanisms, a huge amount of sensitive data is exposed to vulnerabilities, and therefore, easily abused by cybercriminals to perform several illegal act...

A Survey of Machine Learning Security Issues and Their Defense Techniques (机器学习安全性问题及其防御技术研究综述)

Article

Full-text available

Jan 2018

Machine learning has already become one of the most widely used techniques in the field of computer science, and it has been widely applied in image processing, natural language processing, network security and other fields. However, there have been many security threats that need to be overcome on current machine learning algorithms and training da...

UMUDGA: A dataset for profiling algorithmically generated domain names in botnet detection

Article

Full-text available

Mar 2020

In computer security, botnets still represent a significant cyber threat. Concealing techniques such as the dynamic addressing and the Domain Generation Algorithms (DGAs) require an improved and more effective detection process. To this extent, this data descriptor presents a collection of over 30 million manually-labeled algorithmically generated...

An Evaluation of the Proposed Framework for Access Control in the Cloud and BYOD Environment

Article

Full-text available

Jan 2018

As the bring your own device (BYOD) to work trend grows, so do the network security risks. This fast-growing trend has huge benefits for both employees and employers. With malware, spyware and other malicious downloads tricking their way onto personal devices, organizations need to consider their information security policies. Malicious programs c...

Enhancing Cyber-Resilience for Small and Medium-Sized Organizations with Prescriptive Malware Analysis, Detection and Response

Article

Full-text available

Jul 2023
SENSORS-BASEL

In this study, the methodology of cyber-resilience in small and medium-sized organizations (SMEs) is investigated, and a comprehensive solution utilizing prescriptive malware analysis, detection and response using open-source solutions is proposed for detecting new emerging threats. By leveraging open-source solutions and software, a system specifically designed for SMEs with up to 250 employees is developed, focusing on the detection of new threats. Through extensive testing and validation, as well as efficient algorithms and techniques for anomaly detection, safety, and security, the effectiveness of the approach in enhancing SMEs’ cyber-defense capabilities and bolstering their overall cyber-resilience is demonstrated. The findings highlight the practicality and scalability of utilizing open-source resources to address the unique cybersecurity challenges faced by SMEs. The proposed system combines advanced malware analysis techniques with real-time threat intelligence feeds to identify and analyze malicious activities within SME networks. By employing machine-learning algorithms and behavior-based analysis, the system can effectively detect and classify sophisticated malware strains, including those previously unseen. To evaluate the system’s effectiveness, extensive testing and validation were conducted using real-world datasets and scenarios. The results demonstrate significant improvements in malware detection rates, with the system successfully identifying emerging threats that traditional security measures often miss. The proposed system represents a practical and scalable solution using containerized applications that can be readily deployed by SMEs seeking to enhance their cyber-defense capabilities.

BFEDroid: A Feature Selection Technique to Detect Malware in Android Apps Using Machine Learning

Article

Full-text available

Oct 2022

Malware detection refers to the process of detecting the presence of malware on a host system, or that of determining whether a specific program is malicious or benign. Machine learning-based solutions first gather information from applications and then use machine learning algorithms to develop a classifier that can distinguish between malicious and benign applications. Researchers and practitioners have long paid close attention to the issue. Most previous work has addressed the differences in feature importance or the computation of feature weights, which is unrelated to the classification model used, and therefore, the implementation of a selection approach with limited feature hiccups, and increases the execution time and memory usage. BFEDroid is a machine learning detection strategy that combines backward, forward, and exhaustive subset selection. This proposed malware detection technique can be updated by retraining new applications with true labels. It has higher accuracy (99%), lower memory consumption (1680), and a shorter execution time (1.264SI) than current malware detection methods that use feature selection.

A Survey of Recent Advances in Deep Learning Models for Detecting Malware in Desktop and Mobile Platforms

Preprint

Full-text available

Sep 2022

Malware is one of the most common and severe cyber-attacks today. Malware infects millions of devices and can perform several malicious activities including mining sensitive data, encrypting data, crippling system performance, and many more. Hence, malware detection is crucial to protect our computers and mobile devices from malware attacks. Deep learning (DL) is one of the emerging and promising technologies for detecting malware. The recent high production of malware variants against desktop and mobile platforms makes DL algorithms powerful approaches for building scalable and advanced malware detection models as they can handle big datasets. This work explores current deep learning technologies for detecting malware attacks on the Windows, Linux, and Android platforms. Specifically, we present different categories of DL algorithms, network optimizers, and regularization methods. Different loss functions, activation functions, and frameworks for implementing DL models are presented. We also present feature extraction approaches and a review of recent DL-based models for detecting malware attacks on the above platforms. Furthermore, this work presents major research issues on malware detection including future directions to further advance knowledge and research in this field.

Static Feature Selection for IoT Malware Detection

Article

Jun 2022

Our world has recently witnessed the explosive growth of IoT networks as one of the pillars of the 4th industrial revolution. Malware on IoT devices has grown accordingly in number and sophistication. Therefore, it is necessary to come up with more efficient approaches to IoT malware detection, with machine learning models that can be used in solutions with limited resources. In this paper, we study and evaluate the efficiency of using term frequency–inverse document frequency weights in a feature selection method, combined with an effective machine learning model, for IoT malware detection based on opcode sequence features. We performed experiments on a MIPS ELF dataset that included 4,511 malicious samples in four main classes and 4,393 benign programs. Experimental results show that our proposed method performs very well on this dataset, with detection and classification accuracies of 99.8% and 95.8%, respectively, while the models use only the 20 opcodes that have the highest weight values.
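The abstract above does not say how the TF-IDF weights are aggregated or ranked, so the following is only a minimal sketch of one plausible reading: score each opcode by its summed TF-IDF weight across samples and keep the top k. The function name `top_k_opcodes`, the summation step, and the toy opcode sequences are assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def top_k_opcodes(samples, k):
    """Return the k opcodes with the highest summed TF-IDF weight.

    `samples` is a list of opcode sequences (lists of strings).
    Summing weights across samples is an illustrative assumption.
    """
    n = len(samples)

    # Document frequency: number of samples in which each opcode appears.
    df = Counter()
    for seq in samples:
        df.update(set(seq))

    scores = Counter()
    for seq in samples:
        counts = Counter(seq)
        for op, c in counts.items():
            tf = c / len(seq)            # term frequency within this sample
            idf = math.log(n / df[op])   # zero for opcodes present in every sample
            scores[op] += tf * idf

    return [op for op, _ in scores.most_common(k)]


# Toy, made-up opcode sequences (not taken from any real dataset).
toy_samples = [
    ["mov", "add", "mov"],
    ["mov", "jmp"],
    ["mov", "xor", "xor"],
]
selected = top_k_opcodes(toy_samples, 2)
```

In this toy input, `mov` occurs in every sample and so receives a weight of zero, which is why a TF-IDF ranking favors opcodes that distinguish samples over opcodes that are merely frequent.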

SOMDROID: android malware detection by artificial neural network trained using unsupervised learning

Article

Full-text available

Mar 2022

Android has gained popularity due to its open-source nature and the number of freely available apps in its official play store. Appropriate functioning of Android apps depends upon the permission or set of permissions that an app demands at installation time and at run-time. By taking advantage of these permissions, cybercriminals are developing malware-infected apps daily. In this study, we propose a framework named “SOMDROID” that works on the principle of an unsupervised machine learning algorithm. To develop an effective and efficient Android malware detection model, we collect 500,000 distinct Android apps from promised repositories and extract 1844 unique features. Further, to select significant features or feature sets, we apply six different feature ranking approaches. With the selected features or feature sets, we implement the Self-Organizing Map (SOM) algorithm of Kohonen and measure four distinct performance parameters, i.e., Intra-cluster distance, Inter-cluster distance, Accuracy and F-measure. Empirical results reveal that our proposed framework is able to detect 98.7% of malware that belongs to unknown families and that its detection rate is 2% higher than that of commercial anti-virus scanners and frameworks proposed in the literature.

Toward lightweight intrusion detection systems using the optimal and efficient feature pairs of the Bot-IoT 2018 dataset

Article

Full-text available

Oct 2021
INT J DISTRIB SENS N

Intrusion detection systems (IDS) play a vital role in traffic flow monitoring on Internet of Things networks by providing a secure network traffic environment and blocking unwanted traffic packets. Various IDS approaches have been proposed previously based on data mining, fuzzy techniques, genetic, neuro-genetic, particle swarm intelligence, rough sets, and conventional machine learning (ML). However, these methods are not energy efficient and do not perform accurately due to the inappropriate feature selection or the use of full features of datasets. In general, datasets contain more than 10 features. Any ML-based lightweight IDS trained with full features turns into an inefficient and heavyweight IDS. This case challenges IoT networks that suffer from power efficiency problems. Therefore, lightweight (energy-efficient), accurate, and high-performance IDS are paramount instead of inefficient and heavyweight IDS. To address these challenges, a new approach that can help to determine the most effective and optimal feature pairs of datasets which enable the development of lightweight IDS was proposed. For this purpose, 10 ML algorithms and the recent BoT-IoT(2018) dataset were selected. 12-best-features recommended by the developers of this dataset were used in this study. 66 unique feature pairs were generated from the 12-best-features. Next, 10 full feature-based IDS were developed by training the 10 ML algorithms with the 12-full-features. Similarly, 660 feature pair-based lightweight IDS were developed by training the 10 ML algorithms via each feature pair out of the 66 feature pairs. Moreover, the 10 IDS trained with 12-best-features and the 660 IDS trained via 66 feature pairs were compared to each other based on the ML algorithmic groups. Then, the feature pair-based lightweight IDS that achieved the accuracy level of the ten full-feature-based IDS were selected. This way, the optimal and efficient feature pairs, and the lightweight IDS were determined. 
The most lightweight IDS achieved more than 90% detection accuracy.
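The counts quoted in this abstract follow from simple combinatorics: 12 features yield C(12, 2) = 66 unordered pairs, and training 10 algorithms on each pair gives 660 pair-based models. A short sketch that reproduces the arithmetic is shown here; the feature names are hypothetical placeholders, since the abstract does not list the 12 features.

```python
from itertools import combinations

# Hypothetical placeholder names; the abstract does not name the 12 features.
features = [f"feature_{i}" for i in range(1, 13)]
n_algorithms = 10  # number of ML algorithms stated in the abstract

# Every unordered pair of distinct features.
pairs = list(combinations(features, 2))

# One model per (algorithm, feature pair) combination.
n_pair_models = n_algorithms * len(pairs)

# One model per algorithm trained on all 12 features.
n_full_models = n_algorithms

print(len(pairs), n_pair_models, n_full_models)
```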

Deep learning-aided runtime opcode-based Windows malware detection

Article

Full-text available

Sep 2021
NEURAL COMPUT APPL

Thousands of new malware codes are developed every day. Signature-based methods, which are employed by common malware detectors, are susceptible to code obfuscation and novel malware. In this paper, we present an alternative method for malware detection, which makes use of assembly opcode sequences obtained during runtime. First, for sequential opcode data, we utilize natural language processing and deep learning techniques to facilitate the extraction of deeper behavioral features. Due to these features, this method can be impervious to code obfuscation and effective against novel malware. Finally, these features are fed to various machine learning algorithms for classification. The experiments on a more class-balanced dataset of 26,869 samples demonstrated that an MCC (Matthews correlation coefficient) score as high as 0.95 is achievable with this approach. The MCC scores for the experiments conducted on imbalanced and artificially balanced datasets are 0.81 and 0.83, respectively.
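For readers unfamiliar with the metric, the Matthews correlation coefficient for binary classification is computed from the four confusion-matrix counts and ranges from -1 to 1. The sketch below uses the standard formula; the function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made for this illustration, and the inputs are made-up counts rather than results from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return numerator / denominator


# Made-up counts for illustration only.
perfect = mcc(tp=5, tn=5, fp=0, fn=0)    # every prediction correct
inverted = mcc(tp=0, tn=0, fp=5, fn=5)   # every prediction wrong
```

Because the score uses all four counts, it is less flattering than plain accuracy on imbalanced data, which is consistent with the abstract reporting lower scores on its imbalanced dataset.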

A Review on Machine Learning Approaches for Network Malicious Behavior Detection in Emerging Technologies

Article

Full-text available

Apr 2021
Entropy

Network anomaly detection systems (NADSs) play a significant role in every network defense system as they detect and prevent malicious activities. Therefore, this paper offers an exhaustive overview of different aspects of anomaly-based network intrusion detection systems (NIDSs). Additionally, contemporary malicious activities in network systems and the important properties of intrusion detection systems are discussed as well. The present survey explains important phases of NADSs, such as pre-processing, feature extraction and malicious behavior detection and recognition. In addition, with regard to the detection and recognition phase, recent machine learning approaches including supervised, unsupervised, new deep and ensemble learning techniques have been comprehensively discussed; moreover, some details about currently available benchmark datasets for training and evaluating machine learning techniques are provided by the researchers. In the end, potential challenges together with some future directions for machine learning-based NADSs are specified.

We Need to Talk About AntiViruses: Challenges & Pitfalls of AV Evaluations

Article

Full-text available

Apr 2020
COMPUT SECUR

Security evaluation is an essential task to identify the level of protection accomplished in running systems or to aid in choosing better solutions for each specific scenario. Although antiviruses (AVs) are one of the main defensive solutions for most end-users and corporations, AV evaluations are conducted by few organizations and are often limited to comparing detection rates. Moreover, other important factors of AVs’ operating mode (e.g., response time and detection regression) are usually underestimated. Ignoring such factors creates an “understanding gap” on the effectiveness of AVs in actual scenarios, which we aim to bridge by presenting a broader characterization of current AVs’ modes of operation. In our characterization, we consider distinct file types, operating systems, datasets, and time frames. To do so, we collected samples daily from two distinct, representative malware sources and submitted them to the VirusTotal (VT) service for 30 consecutive days. In total, we considered 28,875 unique malware samples. For each day, we retrieved the submitted samples’ detection rates and assigned labels, resulting in more than 1M distinct VT submissions overall. Our experimental results show that: (i) phishing contexts are a challenge for all AVs, making malicious Web page detectors less effective than malicious file detectors; (ii) generic procedures are insufficient to ensure broad detection coverage, incurring lower detection rates for particular datasets (e.g., country-specific) than for those with world-wide collected samples; (iii) detection rates are unstable, since all AVs presented detection regression effects after scans in different time frames using the same dataset; and (iv) AVs’ long response times in delivering new signatures/heuristics create a significant attack opportunity window within the first 30 days after we first identified a malicious binary.
To address the effects of our findings, we propose six new metrics to evaluate the multiple aspects that impact the effectiveness of AVs. With them, we hope to assist corporate (and domestic) users in better evaluating the solutions that fit their needs.

The rise of machine learning for detection and classification of malware: Research developments, trends and challenges

Article

Full-text available

Mar 2020
J NETW COMPUT APPL

The struggle between security analysts and malware developers is a never-ending battle with the complexity of malware changing as quickly as innovation grows. Current state-of-the-art research focuses on the development and application of machine learning techniques for malware detection due to their ability to keep pace with malware evolution. This survey aims at providing a systematic and detailed overview of machine learning techniques for malware detection and in particular, deep learning techniques. The main contributions of the paper are: (1) it provides a complete description of the methods and features in a traditional machine learning workflow for malware detection and classification, (2) it explores the challenges and limitations of traditional machine learning and (3) it analyzes recent trends and developments in the field with special emphasis on deep learning approaches. Furthermore, (4) it presents the research issues and unsolved challenges of the state-of-the-art techniques and (5) it discusses the new directions of research. The survey helps researchers to have an understanding of the malware detection field and of the new developments and directions of research explored by the scientific community to tackle the problem.
