Figure - available from: Neural Computing and Applications
Illustration of an assembly code sequence. First part of each line denotes an opcode. Second part is called operand
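
The opcode/operand split described in the caption can be sketched in a few lines of Python. This is only an illustration, not the authors' extraction pipeline: the helper name `split_instruction` and the sample x86-style lines are assumptions made for the example, and real disassembler output (addresses, prefixes, labels) would need more careful parsing.

```python
def split_instruction(line: str) -> tuple[str, str]:
    """Split one assembly line into (opcode, operand).

    Hypothetical helper: the first whitespace-separated token is taken
    as the opcode and everything after it as the operand string.
    """
    parts = line.strip().split(None, 1)
    opcode = parts[0].lower() if parts else ""
    operand = parts[1].strip() if len(parts) > 1 else ""
    return opcode, operand


# Made-up sample sequence, used only to show the split.
sample = ["push ebp", "mov ebp, esp", "sub esp, 8", "ret"]
parsed = [split_instruction(line) for line in sample]

# Keeping only the first element of each pair yields the opcode sequence.
opcodes = [opcode for opcode, _ in parsed]
```

Discarding the operands, as in the last line, gives a plain opcode sequence of the kind the source publication describes as its input.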

Source publication
Article
Thousands of new malware codes are developed every day. Signature-based methods, which are employed by common malware detectors, are susceptible to code obfuscation and novel malware. In this paper, we present an alternative method for malware detection, which makes use of assembly opcode sequences obtained during runtime. First, for sequential opc...

Similar publications

Article
Static malware detection approaches are time-consuming and cannot deal with code obfuscation techniques. Dynamic malware detection approaches address these two challenges but suffer from behavioral ambiguity, such as system call obfuscation. In this paper, we introduce Markhor, a dynamic and behavior-based malware det...

Citations

... The former involves analyzing visual representations of malware, while the latter leverages the sequence of operations, known as opcodes, within the program to identify malicious code. Parildi et al. [21] presented an alternative method for malware detection using assembly opcode sequences, utilizing natural language processing and deep learning techniques for deeper behavioral features, and achieving MCC scores of up to 0.95. In another similar study, Santos et al. [22] proposed a method that involved examining the frequency of opcode sequences and building a semi-supervised machine-learning classifier using a set of labeled and unlabeled malware and legitimate software instances. ...
Article
This paper presents a unique hybrid classifier that combines deep neural networks with a type-III fuzzy system for decision-making. The ensemble incorporates ResNet-18, Efficient Capsule neural network, ResNet-50, the Histogram of Oriented Gradients (HOG) for feature extraction, neighborhood component analysis (NCA) for feature selection, and Support Vector Machine (SVM) for classification. The innovative inputs fed into the type-III fuzzy system come from the outputs of the mentioned neural networks. The system’s rule parameters are fine-tuned using the Improved Chaos Game Optimization algorithm (ICGO). The conventional CGO’s simple random mutation is substituted with wavelet mutation to enhance the CGO algorithm while preserving non-parametricity and computational complexity. The ICGO was evaluated using 126 benchmark functions and 5 engineering problems, comparing its performance with well-known algorithms. It achieved the best results across all functions except for 2 benchmark functions. The introduced classifier is applied to seven malware datasets and consistently outperforms notable networks like AlexNet, ResNet-18, GoogleNet, and Efficient Capsule neural network in 35 separate runs, achieving over 96% accuracy. Additionally, the classifier’s performance is tested on the MNIST and Fashion-MNIST datasets in 10 separate runs. The results show that the new classifier excels in accuracy, precision, sensitivity, specificity, and F1-score compared to other recent classifiers. Based on the statistical analysis, it has been concluded that the ICGO and the proposed method exhibit significant superiority compared to the examined algorithms and methods. The source code for ICGO is available publicly at https://nimakhodadadi.com/algorithms-%2B-codes.
... It is essential to have defense techniques that provide effective malware classification. To counter voluminous malware attacks, machine learning techniques [3,6,17,40,62], especially convolutional neural network (CNN) architecture, are increasingly being applied to detect and classify malware families [5,12,18,23,27]. Existing research suggests that malware variants from similar families often contain similarities due to code reuse [8,51,56]. ...
Article
Targeted malware attacks are usually more purposeful and harmful than untargeted attacks, so it is important to perform malware family classification. In classification tasks, convolutional neural networks (CNN) have shown superior performance. However, clean samples with intentional small-scale perturbations (i.e. adversarial examples) may lead to incorrect decisions made by CNN-based classifiers. The most successful approach to improving the robustness of classifiers is adversarial training on practical adversarial examples. Despite many attempts, previous works have not dealt with generating executable adversarial examples in a pure black-box manner to emulate adversarial threats. The aim of this work is to generate realistic adversarial malware examples and improve the robustness of classifiers against these attacks. We first explain the decision of malware classification by the saliency detection technique and argue that there are two similarities in saliency distribution of CNN classifiers. To explore the under-researched Malware to Malware threats that deceive PE malware classifiers into targeted misclassification, we propose the Saliency Append (SA) attack method based on the two saliency similarities, which produces adversarial examples via only one query, achieving a higher attack success rate than other append-based attacks. We use these examples to improve the robustness of classifiers by adversarially training them on the generated adversarial examples. Compared to classifiers trained on other attacks, our approach produces classifiers that are significantly more robust against the proposed SA attack as well as others.
... On the other hand, a post-deployment approach performs either signature-based or behavior-based verification. Signature-based verification (e.g., [15]) relies on the attack signature of the malicious code snippet and checks the binary code of AMF to find a match. On the other hand, behavior-based verification matches the current system call sequence [16] against the normal sequence of system calls to identify that SYS 3 is a mismatch and causing the integrity breach in AMF. ...
... These malware detection techniques are mainly grouped into signature-based, behaviour/dynamic-based, and hybrid detection techniques [16] [19] [20]. Recently, DL models for malware detection have advanced to effective techniques based on binary image classification [16], app's permissions and proprietary Android API package usage [21], and operational code [22]. This work presents a detailed review of current DL technologies which are used to develop malware detection models. ...
Preprint
Malware is one of the most common and severe cyber-attacks today. Malware infects millions of devices and can perform several malicious activities including mining sensitive data, encrypting data, crippling system performance, and many more. Hence, malware detection is crucial to protect our computers and mobile devices from malware attacks. Deep learning (DL) is one of the emerging and promising technologies for detecting malware. The recent high production of malware variants against desktop and mobile platforms makes DL algorithms powerful approaches for building scalable and advanced malware detection models as they can handle big datasets. This work explores current deep learning technologies for detecting malware attacks on the Windows, Linux, and Android platforms. Specifically, we present different categories of DL algorithms, network optimizers, and regularization methods. Different loss functions, activation functions, and frameworks for implementing DL models are presented. We also present feature extraction approaches and a review of recent DL-based models for detecting malware attacks on the above platforms. Furthermore, this work presents major research issues on malware detection including future directions to further advance knowledge and research in this field.
... A detailed analysis of assembly code vector representation is studied for code obfuscation [15]. Detailed experiments of opcode-based malware detection are shown by proposing a word embedding with a deep learning model [16,17]. Though these methods have shown that the models can detect new malware, obfuscated malware detection experiments are not included. ...
Article
Recent security attack reports show that the number of malware attacks is gradually growing over the years due to the rapid adoption of smart healthcare systems. The development of a safe and secure smart healthcare system is considered to be important from a security point of view. Malware detection is an essential subsystem in healthcare ecosystems to secure the system from malware attacks. The literature survey shows that malware detection is done using deep learning with either portable executable (PE)-Header or PE-Imports or PE-Image or application programming interface (API) calls. However, each of these feature sets is important in PE files to boost the malware detection rate. This work proposes a Multi-View attention-based deep learning framework for malware detection by considering features of PE-Header, PE-Imports, PE-Image, and API calls. Detailed evaluation and experimental analysis of the proposed method is shown on the malware detection benchmark datasets. The proposed approach performed better than the machine learning-based and non-attention-based approaches with an accuracy of 95% for malware detection using features from PE-Header, PE-Imports, PE-Image, and API calls. In addition, detailed evaluation results are included for image-based malware detection on datasets from Windows and Android operating systems. In the Windows-based dataset, the proposed approach showed an accuracy of 98% and an accuracy of 97% in the Android-based dataset. Also, the proposed approach performed better than the existing malware detection approaches. Experimental results on three malware datasets indicate that the proposed method is robust and generalizable for both Windows and Android-based malware detection in smart healthcare systems.
... The rest of Table 3 shows the file counts in all three datasets, the malicious and benign file counts, the feature extraction method, and the encoding techniques for features to be fed to machine learning and deep learning algorithms. In order to demonstrate the significance of the proposed approach, some of the recent opcode-based deep learning techniques such as [67][68][69][70][71][72] are trained and tested with the Op2Vec dataset. For a fair comparison, the same experimental setup is used for the experiments with Op2Vec. ...
Article
Android is one of the leading operating systems for smartphones in terms of market share and usage. Unfortunately, it is also an appealing target for attackers to compromise its security through malicious applications. To tackle this issue, domain experts and researchers are trying different techniques to stop such attacks. All the attempts at securing the Android platform are somewhat successful. However, existing detection techniques have severe shortcomings, including the cumbersome process of feature engineering. Designing representative features requires expert domain knowledge. There is a need for minimizing human experts’ intervention by circumventing handcrafted feature engineering. Deep learning could be exploited by extracting deep features automatically. Previous work has shown that operational codes (opcodes) of executables provide key information to be used with deep learning models for the detection process of malicious applications. The only challenge is to feed opcode information to deep learning models. Existing techniques use one-hot encoding to tackle the challenge. However, the one-hot encoding scheme has severe limitations. In this paper, we introduce (1) a novel technique for opcode embedding, which we name Op2Vec, and (2) based on the learned Op2Vec, we have developed a dataset for end-to-end detection of Android malware. Introducing the end-to-end Android malware detection technique avoids expert-intensive handcrafted feature extraction and ensures automation. Some of the recent deep learning-based techniques showed significantly improved results when tested with the proposed approach and achieved an average detection accuracy of 97.47%, precision of 0.976, and F1 score of 0.979.
Article
With the steady increase in the demand for Internet of Things (IoT) devices in diverse industries, such as manufacturing, medical care, and transportation infrastructure, the production of malware tailored for Smart IoT environments is also increasing. Accordingly, various malware detection studies are being conducted to detect not only known malware but also variant malware. However, it is difficult to detect malware transformed in a way that hides malicious behavior by changing and deleting bytes or modifying the assembly code. Therefore, in this study, we propose a malware detection for static security service (Mal3S) scheme that provides a secure Smart IoT environment by accurately detecting various types of malware. Mal3S extracts bytes, opcodes, API calls, strings, and dynamic link libraries (DLLs) through static analysis and then generates five types of images. Images of various sizes are trained on a multi spatial pyramid pooling network (SPP-net) model to detect malware. When evaluating the performance of Mal3S using three malware datasets, the average detection accuracy was 98.02% and the classification accuracy was 98.43%, showing better performance than existing malware detection techniques. In addition, Mal3S has demonstrated effective generalization capabilities for various types of malware.
Article
Malware is one of the most common and severe cyber threats today. Malware infects millions of devices and can perform several malicious activities including compromising sensitive data, encrypting data, crippling system performance, and many more. Hence, malware detection is crucial to protect our computers and mobile devices from malware attacks. Recently, deep learning (DL) has emerged as one of the promising technologies for detecting malware. The recent high production of malware variants against desktop and mobile platforms makes DL algorithms powerful approaches for building scalable and advanced malware detection models as they can handle big datasets. This work explores current deep learning technologies for detecting malware attacks on Windows, Linux, and Android platforms. Specifically, we present different categories of DL algorithms, network optimizers, and regularization methods. Different loss functions, activation functions, and frameworks for implementing DL models are discussed. We also present feature extraction approaches and a review of DL-based models for detecting malware attacks on the above platforms. Furthermore, this work presents major research issues on DL-based malware detection including future research directions to further advance knowledge and research in this field.