Identifying Vulnerabilities in Docker Image Code using ML Techniques

Jayama Pinnamaneni
Department of CSE, IFSCR Centre
PES University
Bengaluru, India
jayamapinnamaneni26@gmail.com
Nagasundari S
Department of CSE, IFSCR Centre
PES University
Bengaluru, India
snagasundari5@gmail.com
Prasad Honnavalli
Department of CSE, IFSCR Centre
PES University
Bengaluru, India
prasad.honnavalli@gmail.com
Abstract - A Docker container image is a lightweight,
standalone, executable package of software that includes
everything needed to run an application: code, runtime, system
tools, system libraries and settings. Because of these features,
container images are often preferred over virtual machines. With
this widespread usage, there is considerable scope for security
issues to arise in container images. Many open-source projects,
such as Anchore and Clair, statically scan a container image’s
Dockerfile to find vulnerabilities using databases like CVE and
Red Hat. Static analysis of the image’s main application code is
equally necessary: focusing only on OS-level vulnerabilities
leaves room for malicious activity when the code itself is never
scanned. The main aim of this project is to create a static
code-analysis machine learning model that identifies vulnerable
Python libraries in container images.
Keywords- Docker, containers, images, keylogging,
vulnerability
I. INTRODUCTION
Containers provide a way of packaging an application’s
code, configurations, binaries and required libraries into a
single object. Containers therefore offer a wide range of
advantages, such as increased portability, lower overhead,
a lightweight footprint and greater efficiency. Because of these
advantages, containers are being deployed by many companies, and
it is reported that more than 80% of cloud-based companies have
shifted to deploying containers for their work. The increasing
popularity of containers over virtual machines is
giving rise to security concerns. One common software
vulnerability is keylogging, in which keystrokes are recorded
secretly without the user’s knowledge. To secure
container images, a few open-source projects have been
created, such as Anchore and Clair. These open-source projects
make use of existing vulnerability databases, which
classify vulnerabilities according to impact. The
underlying platform used by an image is scanned, and the
results indicate whether the image can be used or not. This
static scanning of images is not enough to identify all
vulnerabilities, because it does not scan the application code.
The objectives of the project are to show that it is fairly easy
to induce a vulnerability into container images, that the induced
vulnerability is not caught by the static scanning tools, and that
static analysis of an image’s code is important for identifying
vulnerabilities in that code. To achieve these
objectives, the scope of the project is to induce a keylogging
vulnerability into container images, to highlight the
loopholes in the open-source projects, and to create an ML
model for identifying the vulnerable libraries.
II. LITERATURE SURVEY
Docker uses the isolation features of Linux. One of these
features is namespaces. Namespaces provide an isolated
workspace for each container, thus differentiating
and isolating one container from other running containers.
The namespaces created include PID, MNT, NET, UTS and IPC
[28]. These namespaces provide the unique process IDs and mount
directory paths that are assigned to each container.
Another feature is cgroups, which lets Docker control
the resources that can be accessed by each
container. Chroot is another mechanism, which limits the
exposure of the file system to any container process [17].
Even though Docker uses Linux security features like
namespaces, chroot and cgroups for safer execution of
containers, these features can have loopholes
through which vulnerabilities in Docker arise [19]. The
isolation functionality in Docker is strict, but a common
network bridge is shared by all containers, so the possibility
of Address Resolution Protocol (ARP) poisoning attacks
between containers is high [16]. The host hardening feature
SELinux creates a profile for each container; this
protects the host from containers but does not protect
containers from other containers. All administration tasks
for the containers are done by the host, which requires root
admin access [18]. These instances show that various
security vulnerabilities arise if containers are not
configured properly. The official and community images
are updated in less than 400 days. Vulnerabilities like
overflow and denial of service have been found in both
types of images. A child image inherits almost 80% of the
vulnerabilities of its parent image [13]. The study in [14] found
that nearly half of the vulnerabilities found in
containers have no identified fix, and that some vulnerable
containers have not been updated for as long as two years.
The study suggested that Docker scanning tools should add more
data related to bugs and add a technical lag measure to
prompt updates of container images. The study in [15] identified
two categories of security analysis of container images,
namely static and dynamic. Static analysis
examines the contents of a container image without
executing the commands in it. Dynamic analysis observes
the behavior of the container during its execution. Many
open-source projects have been developed to statically scan
images, such as Anchore, Clair and Microscanner. The study
found that neither static nor dynamic analysis alone can
identify all the vulnerabilities and bugs in the code, but that
combining the two results in better
vulnerability identification. The study in [22] suggested that
a single security scan of container images is not enough:
because the execution of images changes and the vulnerabilities
known to the tools keep being updated, periodic scanning of the
images is very important.
2022 2nd Asian Conference on Innovation in Technology (ASIANCON)
Pune, India. Aug 26-28, 2022
978-1-6654-6851-0/22/$31.00 ©2022 IEEE | DOI: 10.1109/ASIANCON55314.2022.9908676
Authorized licensed use limited to: PES University Bengaluru. Downloaded on March 17,2023 at 09:05:29 UTC from IEEE Xplore. Restrictions apply.
To understand the working of various
container security vulnerability detection tools, [20]
surveyed static scanning tools such as Anchore, AppArmor,
Cilium, Clair, Dagda and Microscanner. The results suggested
that Anchore identified the highest number of
vulnerabilities in Docker images among the tools
considered. [21] suggested that code scanning helped
identify high-level vulnerabilities that were not identified
by any of the static scanning tools, highlighting the
importance of code scanning in Docker image security.
Many research papers have presented solutions for
identifying vulnerabilities in container images. The initial
studies focused on creating a pipeline with
steps such as downloading images from Docker Hub,
identifying the image metadata, using Clair to identify
vulnerabilities in the images, and finally generating a
vulnerability score for each image; this score helps the user
decide whether the image can be used. This
pipeline does not address the issues that might arise
once the container image starts executing, and is hence
limited to the static analysis of images [1]. A later study
created a tool to compare open-source scanners such as
Anchore, Clair and Trivy and to see which one provides better
vulnerability detection for which operating system, such as
Debian, Ubuntu and Alpine [2]. Tools like AppArmor and Seccomp
use Linux security features to restrict the actions of container
images and also provide a profile that evaluates the policy
violations made by container images. These tools provide
access control policies at a lower level of the security
architecture once the containers have executed. Hence, [3]
provided an additional layer of protection that enforces
access control policies, compares the container image against a
blacklist database and monitors the runtime behavior of
container images with respect to the usage of the resources
allocated to them. Sysdig is another tool used to monitor
the working of Docker images [4]. The system calls made by
containers play a very important role in determining the
security of container images, since unnecessary system calls
increase the attack surface. Hence, [5] explores the
system calls made by container images using dynamic
and static analysis and reports that both methods are
equally effective in identifying the system calls that can be
blacklisted. Denial of Service (DoS) is one of the most
common attacks; [10] detects DoS in container images
using tools like Sysdig and Falco, which work on the
system calls made by a container image. To build
a strong framework that minimizes security issues in container
images, [6] proposed a six-step analysis, namely image
hardening, container isolation, container self-security,
vulnerability management, secret management, and audit and
monitoring. Using ordinary anti-virus software for container
images does not yield the required results; hence, [7] proposed
a new anti-virus mechanism for container images that identifies
malware in files in real time and discards it even before
the container image runs. In terms of effectiveness and
efficiency, the mechanism needs improvement. Over
time, many tools have been developed for securing the
containerization environment. [8] consolidated various tools
under different security categories, such as configuration-based,
code-based and rule-based tools. This classification helps
container architects design new security features.
The Seccomp tool is used to create profiles, but the difficulty
is keeping them updated with respect to recent changes in Linux.
To shield containers from various vulnerabilities, [9]
proposed a Docker security scheme that automates AppArmor’s
creation of profiles based on kernel operations. The
results showed that containers are able to defend themselves
from attacks better than with Docker security alone, but the
approach still cannot identify all vulnerabilities.
DDoS is another well-known attack that occurs worldwide. [10]
explores the various features that need to be introduced along
with isolation features to provide security to container images.
With the increasing usage of machine learning techniques in
various fields, a few studies have explored the idea of using ML
techniques in container security to identify
vulnerabilities [27]. [12] analyzes various static and
dynamic schemes. For static analysis, the Clair tool is used,
while the dynamic scheme uses ML techniques such as KNN,
PCA+KNN, K-Means and Self-Organizing Map. The
research concluded that dynamic analysis using the
Self-Organizing Map technique identified the maximum
number of vulnerabilities. Combining the static and dynamic
schemes could yield a better result, identifying
almost 86% of vulnerabilities. The research
also suggested that code scanning resulted in
vulnerability identification. [11] used a neural network
to identify various anomalies in container
images.
A keylogger is a program written to secretly obtain confidential
information from a user by capturing their keystrokes
and using this information for malicious purposes. [23]
explores the various ways a keylogger can be created and
how it can be detected. [24] identifies two types of
keyloggers, namely static and dynamic. To be able to create
new keylogging code, it is important to know how
keylogging is implemented in various
programming languages [25]. Memory-only
malware is increasing, and detecting this type of
malware is becoming difficult; hence, [26] explores
keylogging in memory-only malware and how the
application described there can lead to detection of the malware.
III. IMPLEMENTATION
The project is divided into four modules. The first
module focuses on creating a vulnerable image. The
second module deals with masquerading the vulnerable
image as a non-vulnerable one. The third module
creates a dataset consisting of multiple vulnerable and
non-vulnerable images. The fourth module devises an ML model
to correctly identify the vulnerable images.
A. Creating a vulnerable container image
Initially, non-vulnerable code is written and tested for
its expected output. The first normal Docker image
displays top-rated movies from the IMDB website until the
user stops the application. The second normal Docker image
is a simple login form, which prompts the user to enter
details. The third normal Docker image is an application
in which the user enters a string or a word,
which is searched on Wikipedia, and a summary is
displayed. Vulnerable code is then written using various
libraries and tested by building an image. The first code
uses the keyboard and threading libraries:
each keystroke is reported by a thread and logged in a
folder that is created every minute at the location
specified in the code. The second code uses the
OS and pyxhook libraries. The pyxhook library is written
specifically for Linux distributions; the OS library is used to
identify the input device, and pyxhook creates a hook to
capture the keystrokes and log them in a single file with a
serial number. The third code uses the pynput and
listener libraries: the listener library listens for the
keystrokes made on the input device, which is detected using
the pynput library, which itself is a malicious library. The
keystrokes are logged into a single file with the exact time of
each keystroke. The three vulnerable codes are
scanned on virustotal.com to check whether they are already
identified as vulnerable. All three were cleared with 0
vulnerabilities. Next, each vulnerable image is created by
combining a vulnerable code with a non-vulnerable code.
The three vulnerable images built by combining the
respective vulnerable and non-vulnerable codes are named
movie, login and wiki. The vulnerable images are run to confirm
that the expected result is achieved. Fig. 1 shows the working
of the login image: the right side shows the Docker execution and
the left side shows the content of the logging file.
Fig. 1. Keylogging file after running login image
B. Masquerading the image as non-vulnerable
In this module, the vulnerable images that were
successfully created in module 1 are used. Each Dockerfile is
given as input to existing popular container image
scanning tools like Anchore and Docker Hub. To run the
Docker scan, a Docker account is required and the
account has to be logged in from the Linux terminal. Then the
“docker scan name_of_image” command is used to scan the
image. To run the Anchore scan, the
“curl -s https://ci-tools.anchore.io/inline_scan-latest | bash -s -- -r name_of_image”
command is used. In both cases the
report is generated immediately. These tools scan for
vulnerabilities in Dockerfiles. The expected result is that the
vulnerable images pass without being flagged as
vulnerable; the scan should not identify the vulnerable libraries
in our code. If any vulnerabilities are found, the image
has to be discarded and a new image has to be created as in
module 1. Figs. 2 and 3 show the Docker scan and the Anchore
scan, respectively, of the previously created Docker image login.
The main library used to create the Docker image,
“pyxhook”, is not identified by these scanning tools.
Fig. 2. Docker Scan for login image
Fig. 3. Anchore report for login image
C. Creating a database of all vulnerabilities for ML model
Since the three vulnerable images have been successfully
created, their codes can be combined with other
non-vulnerable codes. A total of 100 codes are created, of
which 75 are labeled as vulnerable, produced by combining the
vulnerable codes with non-vulnerable
codes. The remaining 25 codes are purely non-vulnerable. A dataset
with three columns, namely S. No, Category and Code, is
created.
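The dataset layout described above can be sketched as follows. This is a minimal illustration only: the paper does not publish its code samples, so the snippet strings, the label names "vulnerable" and "non-vulnerable", and the helper names build_rows and to_csv are all assumptions. Only the three column names and the 75/25 split come from the text.

```python
import csv
import io

# Hypothetical placeholders: harmless import lines stand in for the
# vulnerable codes, and one-line prints stand in for the normal codes.
VULNERABLE_IMPORTS = ["import keyboard", "import pyxhook", "import pynput"]
BENIGN_SNIPPETS = [
    "print('top rated movies')",
    "print('login form')",
    "print('wiki summary')",
]


def build_rows(n_vulnerable=75, n_benign=25):
    """Return labeled rows with the columns S. No, Category and Code."""
    rows = []
    for i in range(n_vulnerable):
        # A vulnerable sample combines a vulnerable part with a normal part.
        code = (VULNERABLE_IMPORTS[i % len(VULNERABLE_IMPORTS)]
                + "\n" + BENIGN_SNIPPETS[i % len(BENIGN_SNIPPETS)])
        rows.append({"S. No": len(rows) + 1,
                     "Category": "vulnerable",
                     "Code": code})
    for i in range(n_benign):
        rows.append({"S. No": len(rows) + 1,
                     "Category": "non-vulnerable",
                     "Code": BENIGN_SNIPPETS[i % len(BENIGN_SNIPPETS)]})
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text (multi-line code cells are quoted)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["S. No", "Category", "Code"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling to_csv(build_rows()) yields a 100-row table with 75 vulnerable and 25 non-vulnerable entries, which is the class balance reported for the dataset.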
D. Designing ML model to capture the vulnerable image
Since the dataset is fully labeled and the code has to be
classified as vulnerable or not, supervised classification
machine learning algorithms are explored, namely
Linear Regression, Decision Tree, Naïve Bayes, K-Nearest
Neighbors (KNN), Support Vector Machine (SVM),
Random Forest, Gradient Boost and XGBoost. For all these
algorithms, the accuracy and precision of the predictions are
observed and compared. The results, shown in Table I,
are that the Decision Tree, Random Forest, Gradient Boost
and XGBoost algorithms give 100 percent accuracy
and precision. The top features selected to predict
whether a code is vulnerable include the libraries
used to create the vulnerable Docker images, namely
keyboard, listener, pyxhook, pynput and thread.
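For reference, the two metrics compared here can be computed from true and predicted labels as in the following minimal sketch. The label encoding (1 for vulnerable, 0 for non-vulnerable) and the function names are assumptions made for illustration; the sketch does not reproduce any result from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the samples predicted as positive, the fraction that truly are."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if p == positive and t == positive)
    false_pos = sum(1 for t, p in zip(y_true, y_pred)
                    if p == positive and t != positive)
    predicted_pos = true_pos + false_pos
    # Convention assumed here: precision is 0.0 when nothing is predicted positive.
    return true_pos / predicted_pos if predicted_pos else 0.0
```

A model reaches 100 percent on both metrics only when every prediction matches its label, which is easier to achieve on a small dataset whose classes are separated by a handful of library names.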
TABLE I. ML ALGORITHM SUMMARY
To obtain a model that works on a database of any
size, soft voting between the Decision Tree, Random
Forest and Gradient Boost algorithms is applied, and this
model is used to predict whether any new code is a vulnerable
image or not. Another reason for selecting soft voting is that
predictions are based on the average probability given
to a class, instead of the majority vote used in hard voting.
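The difference between soft and hard voting can be illustrated with a small sketch. It assumes each model outputs a probability pair [p_non_vulnerable, p_vulnerable]; the function names and the probabilities in the usage note are made up for illustration and are not taken from the paper's models.

```python
from collections import Counter


def soft_vote(prob_rows):
    """Average the per-class probabilities over all models, then pick the
    class with the highest average. Returns (label, averaged_probabilities)."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    averaged = [sum(row[c] for row in prob_rows) / n_models
                for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


def hard_vote(prob_rows):
    """Let each model vote for its most probable class and return the
    class that receives the most votes."""
    votes = [max(range(len(row)), key=lambda c: row[c]) for row in prob_rows]
    return Counter(votes).most_common(1)[0][0]
```

With the illustrative inputs [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]], hard voting returns class 0 because two of three models lean that way, while soft voting returns class 1 because the averaged probability of class 1 is about 0.6. scikit-learn's VotingClassifier with voting="soft" performs this kind of averaging; whether the authors used that class is not stated in the paper.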
Finally, a user interface is created using Streamlit, which
uses the pickled file containing the voting model.
The user can paste the entire code into the
webpage, which reports whether the code is safe to use or not.
The user interface also detects the programming language of
the code given by the user. If the language is not Python,
the vulnerable-library identification does not proceed.
When the code is identified as Python and as
vulnerable, the page also specifies the key libraries
present in the code that make it vulnerable. It also
identifies the CVEs corresponding to keylogging activity.
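One way the language check and library report could work is sketched below, using Python's standard ast module. This is an assumption for illustration, not the authors' implementation: the paper does not say how the language is detected, so a successful ast.parse is used here as a crude stand-in, and the function names are hypothetical. The flagged names are the five features listed in the text.

```python
import ast

# The library names reported as top features in the paper.
FLAGGED_LIBRARIES = {"keyboard", "listener", "pyxhook", "pynput", "thread"}


def imported_modules(source):
    """Return the top-level module names imported by Python source text.
    Raises SyntaxError when the text does not parse as Python."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return names


def check_code(source):
    """Report whether the text parses as Python and which flagged
    libraries it imports. The code is only parsed, never executed."""
    try:
        modules = imported_modules(source)
    except SyntaxError:
        # Assumed behavior: skip library identification for non-Python input.
        return {"is_python": False, "flagged": []}
    return {"is_python": True, "flagged": sorted(modules & FLAGGED_LIBRARIES)}
```

Matching on parsed import statements rather than on raw substrings avoids flagging a library name that merely appears in a comment or a string, although it will miss libraries loaded dynamically at runtime.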
IV. RESULT
The final user interface has two sections,
Predict and CVE_Prediction. The Predict page has a field
for entering the entire code and a check button to see
whether the code uses any of the vulnerable keylogging libraries.
In Fig. 4 it can be seen that the keyboard library is identified.
Fig. 4. Website home page
Fig. 5 shows a table listing
the CVEs identified from the main CVE website. The
table also shows the score and level of criticality of each
CVE. A total of 31 CVEs are identified; these
were found on the CVE website by searching for
“keylogger”. The CVEs range from 2018 to 2022.
Fig. 5. Website cve_prediction page
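The relation between a CVE score and its level of criticality can be sketched as follows, assuming the table uses the CVSS v3.x qualitative severity scale. The paper does not state which scoring version it displays, so this mapping and the function name are assumptions.

```python
def severity(score):
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores lie between 0.0 and 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0
```

For example, a score of 7.5 maps to "High" and a score of 9.8 maps to "Critical" under this scale.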
Fig. 6 shows a hyperlink provided in the
user interface so that the user can read about each CVE
in detail.
Fig. 6. Main CVE website hyperlink from user interface
In Figs. 7 and 8 it can be seen that the code is identified as
safe and no CVE is identified.
Fig. 7. Safe code identification
Fig. 8. Safe code result in cve_prediction page
V. CONCLUSION
With the enormous increase in the usage of containers, the
security risks associated with them have also increased greatly
over time. Many solutions have been proposed to counter
those risks. One of the main remaining risks is the lack of
analysis of the main image code, apart from the Dockerfile, in a
container image. Many container image scanning tools scan
the Dockerfile and identify OS-level vulnerabilities. The
current work focuses on creating a user interface for
identifying vulnerable libraries in any Docker image
code. A dataset containing Python code, focused on the
keylogging vulnerability, has been created. The dataset is
used to train a machine learning model that can
scan the Docker code and identify the libraries. The user
interface makes it easy for the user to scan the code and
identify its vulnerabilities.
Future improvements to the work include extending
the database to other programming languages such as C, C++,
Java and Go. Vulnerabilities such as DoS attacks, cross-site
scripting, memory corruption and overflow can
also be added, with the code written in
different languages.
REFERENCES
[1] Kwon, Soonhong, and Jong-Hyouk Lee. "Divds: Docker image
vulnerability diagnostic system." IEEE Access 8 (2020): 42666-
42673.
[2] Berkovich, Shay, Jeffrey Kam, and Glenn Wurster. "{UBCIS}:
Ultimate Benchmark for Container Image Scanning." In 13th
USENIX Workshop on Cyber Security Experimentation and Test
(CSET 20). 2020.
[3] Sarkale, Vivek Vijay, Paul Rad, and Wonjun Lee. "Secure cloud
container: Runtime behavior monitoring using most privileged
container (mpc)." In 2017 IEEE 4th International Conference on
Cyber Security and Cloud Computing (CSCloud), pp. 351-356. IEEE,
2017.
[4] Madhumathi, R. "The relevance of container monitoring towards
container intelligence." In 2018 9th International Conference on
Computing, Communication and Networking Technologies
(ICCCNT), pp. 1-5. IEEE, 2018.
[5] Casalicchio, Emiliano, and Stefano Iannucci. "The state-of-the-art in
container technologies: Application, orchestration and
security." Concurrency and Computation: Practice and
Experience 32, no. 17 (2020): e5668.
[6] Ghavamnia, Seyedhamed, Tapti Palit, Azzedine Benameur, and
Michalis Polychronakis. "Confine: Automated system call policy
generation for container attack surface reduction." In 23rd
International Symposium on Research in Attacks, Intrusions and
Defenses (RAID 2020), pp. 443-458. 2020
[7] Dissanayaka, Akalanka Mailewa, Susan Mengel, Lisa Gittner, and
Hafiz Khan. "Dynamic & portable vulnerability assessment testbed
with Linux containers to ensure the security of MongoDB in
Singularity LXCs." In Companion Conference of the
Supercomputing-2018 (SC18). 2018.
[8] Han, Sung-Hwa, Hoo-Ki Lee, Gwang-Yong Gim, and Sung-Jin Kim.
"Empirical study on anti-virus architecture for container
platforms." IEEE Access 8 (2020): 134940-134949.
[9] Pothula, Dharmanandana Reddy, Krishna M. Kumar, and Sanil
Kumar. "Run Time Container Security Hardening Using A Proposed
Model Of Security Control Map." In 2019 Global Conference for
Advancement in Technology (GCAT), pp. 1-6. IEEE, 2019.
[10] Lee, Wonjun, and Mohammad Nadim. "Kernel-Level Rootkits
Features to Train Learning Models Against Namespace Attacks on
Containers." In 2020 7th IEEE International Conference on Cyber
Security and Cloud Computing (CSCloud)/2020 6th IEEE
International Conference on Edge Computing and Scalable Cloud
(EdgeCom), pp. 50-55. IEEE Computer Society, 2020.
[11] Tien, Chin-Wei, Tse-Yung Huang, Chia-Wei Tien, Ting-Chun
Huang, and Sy-Yen Kuo. "KubAnomaly: Anomaly detection for the
Docker orchestration platform with neural network
approaches." Engineering Reports 1, no. 5 (2019): e12080.
[12] Tunde-Onadele, Olufogorehan, Jingzhu He, Ting Dai, and Xiaohui
Gu. "A study on container vulnerability exploit detection." In 2019
IEEE International Conference on Cloud Engineering (IC2E), pp.
121-127. IEEE, 2019.
[13] Zerouali, Ahmed, Tom Mens, Gregorio Robles, and Jesus M.
Gonzalez-Barahona. "On the relation between outdated docker
containers, severity vulnerabilities, and bugs." In 2019 IEEE 26th
International Conference on Software Analysis, Evolution and
Reengineering (SANER), pp. 491-501. IEEE, 2019.
[14] Shu, Rui, Xiaohui Gu, and William Enck. "A study of security
vulnerabilities on docker hub." In Proceedings of the Seventh ACM
on Conference on Data and Application Security and Privacy, pp.
269-280. 2017.
[15] Brady, Kelly, Seung Moon, Tuan Nguyen, and Joel Coffman.
"Docker container security in cloud computing." In 2020 10th Annual
Computing and Communication Workshop and Conference (CCWC),
pp. 0975-0980. IEEE, 2020.
[16] Combe, Theo, Antony Martin, and Roberto Di Pietro. "To docker or
not to docker: A security perspective." IEEE Cloud Computing 3, no.
5 (2016): 54-62.
[17] Sultan, Sari, Imtiaz Ahmad, and Tassos Dimitriou. "Container
security: Issues, challenges, and the road ahead." IEEE Access 7
(2019): 52976-52996.
[18] Tomar, Aparna, Diksha Jeena, Preeti Mishra, and Rahul Bisht.
"Docker security: A threat model, attack taxonomy and real-time
attack scenario of dos." In 2020 10th International Conference on
Cloud Computing, Data Science & Engineering (Confluence), pp.
150-155. IEEE, 2020.
[19] Jagelid, Michelle. "Container Vulnerability Scanners: An Analysis."
(2020).
[20] Javed, Omar, and Salman Toor. "Understanding the Quality of
Container Security Vulnerability Detection Tools." arXiv preprint
arXiv:2101.03844 (2021).
[21] Watada, Junzo, Arunava Roy, Ruturaj Kadikar, Hoang Pham, and
Bing Xu. "Emerging trends, techniques and open issues of
containerization: a review." IEEE Access 7 (2019): 152443-152472.
[22] Babar, M. Ali, and Ben Ramsey. "Understanding container isolation
mechanisms for building security-sensitive private cloud." The
University of Adelaide, Australia (2017).
[23] Wood, Christopher, and Rajendra Raj. "Keyloggers in Cybersecurity
Education." In Security and Management, pp. 293-299. 2010.
[24] Shah, Manan Kalpesh, Devashree Kataria, S. Bharath Raj, and Piya G.
"Real Time Working of Keylogger Malware Analysis." International
Journal of Engineering Research & Technology (IJERT) (2020),
ISSN 2278-0181.
[25] Tuli, Preeti, and Priyanka Sahu. "System monitoring and security
using keylogger." International Journal of Computer Science and
Mobile Computing 2, no. 3 (2013): 106-111.
[26] Case, Andrew, Ryan D. Maggio, Md Firoz-Ul-Amin, Mohammad M.
Jalalzai, Aisha Ali-Gombe, Mingxuan Sun, and Golden G. Richard
III. "Hooktracer: Automatic detection and analysis of keystroke
loggers using memory forensics." Computers & Security 96 (2020):
101872.
[27] Nassif, Ali, Manar Abu Talib, Qassim Nassir, Halah Albadani, and
Fatima Albab. "Machine Learning for Cloud Security: A
Systematic Review." IEEE Access (2021): 1-1.
doi: 10.1109/ACCESS.2021.3054129.
[28] Rice, Liz. Container Security: Fundamental Technology Concepts
that Protect Containerized Applications. N.p.: O'Reilly Media, 2020.
5
Authorized licensed use limited to: PES University Bengaluru. Downloaded on March 17,2023 at 09:05:29 UTC from IEEE Xplore. Restrictions apply.
Article
Nowadays, cloud computing is gaining tremendous attention to deliver information via the internet. Virtualization plays a major role in cloud computing as it deploys multiple virtual machines on the same physical machine and thus results in improving resource utilization. Hypervisor-based virtualization and containerization are two commonly used approaches in operating system virtualization. In this paper, we provide a systematic literature review on various phases in maintenance of containers that are container image detection, container scheduling, container security measures, and performance evaluation of containers. We have selected 145 primary studies out of which 24% of studies are related to container performance evaluation, 42% of studies are related to container scheduling techniques, 22% of studies are related to container security measures, and 12% of studies are related to container image detection process. A few studies are related to container image detection process and evaluation of container security measures. Resource utilization is the most considered performance objectives in almost all container scheduling techniques. We conclude that there is a need to introduce new tagging approaches, smell detection approaches, and also new approaches to detect and resolve threat issues in containers so that we can maintain the security of containers.
Article
Full-text available
The popularity and usage of Cloud computing is increasing rapidly. Several companies are investing in this field either for their own use or to provide it as a service for others. One of the results of Cloud development is the emergence of various security problems for both industry and consumer. One of the ways to secure Cloud is by using Machine Learning (ML). ML techniques have been used in various ways to prevent or detect attacks and security gaps on the Cloud. In this paper, we provide a Systematic Literature Review (SLR) of ML and Cloud security methodologies and techniques. We analyzed 63 relevant studies and the results of the SLR are categorized into three main research areas: (i) the different types of Cloud security threats, (ii) ML techniques used, and (iii) the performance outcomes. We have defined 11 Cloud security areas. Moreover, distributed denial-of-service (DDoS) and data privacy are the most common Cloud security areas, with a 16% level of use and 14%respectively. On the other hand, we found 30 ML techniques used, some used hybrid and others as standalone. The most popular ML used is SVM in both hybrid and standalone models. Furthermore, 60% of the papers compared their models with other models to prove the efficiency of their proposed model. Moreover, 13 different evaluation metrics were enumerated. The most applied metric is true positive rate and least used is training time. Lastly, from 20 datasets found, KDD and KDD CUP’99 are the most used among relevant studies.
Conference Paper
Full-text available
Containers are regularly used in modern cloud-native deployment practices. They support agile and continuous integration/continuous deployment (CI/CD) paradigms, isolating services. As containers become more ubiquitous, container security becomes crucial as well. Scanning container images for known vulnerabilities caused by vulnerable software is a critical security activity of the CI/CD process. Both commercial and open-source tools exist for container image scanning. Results from these scanners, however, are inconsistent. Inconsistent results make it hard for developers to choose the best solution for their environment. In this paper, we present the Ultimate Benchmark for Container Image Scanning (UBCIS), a benchmark for evaluating image scanners. UBCIS contains a classification of known vulnerabilities in common base container images, as well as a framework for running container vulnerability scanning tools. UBCIS makes it possible to evaluate scanners. We discuss intricacies of classifying vulnerabilities, presenting a process that can be used when determining the relevance of vulnerability. Finally, we provide recommendations for choosing the best scanner for a specific environment.
Article
Full-text available
Container platforms provide many functions for diverse applications and are used to build and operate various information services. They have been extended not only to Linux and Unix-based servers but also to Windows and macOS-based desktops and laptops. Many systems use anti-virus software to minimize damage caused by malware. Most anti-virus software provide real-time malware detection functions and block the execution of malware by enforcing access denial functions for malware that cannot be deleted or for original files that cannot be restored. However, current anti-virus technologies are not designed for container platforms. Therefore, they cannot detect malware in containers in real time; nor can they block malware execution or user access to malware owing to the isolation feature provided by container platforms. To resolve these issues, we propose a functionally-isolated anti-virus architecture for container platforms. The proposed anti-virus architecture separates the functions of a legacy anti-virus engine to ensure compatibility with the isolation features of a container platform. By implementation, it was confirmed that the proposed anti-virus architecture can detect in real-time the entry of malware in a container platform and block the execution of, and user access to unrecoverable malware-infected files. The performance of the proposed functionally-isolated anti-virus architecture is similar to that of legacy anti-virus technology and was verified to be sufficiently effective.
Conference Paper
Reducing the attack surface of the OS kernel is a promising defense-in-depth approach for mitigating the fragile isolation guarantees of container environments. In contrast to hypervisor-based systems, malicious containers can exploit vulnerabilities in the underlying kernel to fully compromise the host and all other containers running on it. Previous container attack surface reduction efforts have relied on dynamic analysis and training using realistic workloads to limit the set of system calls exposed to containers. These approaches, however, do not capture exhaustively all the code that can potentially be needed by future workloads or rare runtime conditions, and are thus not appropriate as a generic solution. Aiming to provide a practical solution for the protection of arbitrary containers, in this paper we present a generic approach for the automated generation of restrictive system call policies for Docker containers. Our system, named Confine, uses static code analysis to inspect the containerized application and all its dependencies, identify the superset of system calls required for the correct operation of the container, and generate a corresponding Seccomp system call policy that can be readily enforced while loading the container. The results of our experimental evaluation with 150 publicly available Docker images show that Confine can successfully reduce their attack surface by disabling 145 or more system calls (out of 326) for more than half of the containers, which neutralizes 51 previously disclosed kernel vulnerabilities.
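To illustrate the final step described in this abstract, the sketch below turns a set of required system calls into a Seccomp profile in the JSON layout that Docker accepts. It is not Confine's implementation: the static analysis that derives the set is omitted, the syscall list is a hypothetical example, and the allow-list form (deny by default, allow the listed names) is an assumption, since the abstract does not say whether the generated policy is an allow list or a deny list.

```python
import json


def build_seccomp_profile(required_syscalls):
    """Build a Docker-style seccomp profile that allows only the given syscalls."""
    return {
        # Any syscall not listed below fails with an error instead of running.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                # Deduplicate and sort so the profile is stable across runs.
                "names": sorted(set(required_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Hypothetical result of analysing an application and its dependencies.
required = ["read", "write", "openat", "close", "exit_group", "read"]

profile = build_seccomp_profile(required)
print(json.dumps(profile, indent=2))
```

If the printed JSON is saved to a file, it can be applied when starting a container with `docker run --security-opt seccomp=<file>`. A list this short is only illustrative and would be too restrictive for a real workload.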
Article
Since the development of Docker in 2013, projects that use containers have emerged in various fields. Docker has the advantage of letting developers quickly share application build environments through container technology, but it provides no security guarantees against known vulnerabilities inside Docker images. Because Docker images are shared without any means of diagnosing security vulnerabilities, polluted images can be distributed, and Docker-based application build environments can easily be brought down. In this paper, we introduce a Docker Image Vulnerability Diagnostic System (DIVDS) for a reliable Docker environment. The proposed DIVDS diagnoses Docker images when they are uploaded to or downloaded from a Docker image repository.
Article
Advances in malware development have led to the widespread use of attacker toolkits that do not leave any trace in the local filesystem. This negatively impacts traditional investigative procedures that rely on filesystem analysis to reconstruct attacker activities. As a solution, memory forensics has replaced filesystem analysis in these scenarios. Unfortunately, existing memory forensics tools leave many capabilities inaccessible to all but the most experienced investigators, who are well versed in operating system internals and reverse engineering. The goal of the research described in this paper is to make the investigation of one of the greatest threats that organizations face, userland keyloggers, less error-prone and less dependent on manual reverse engineering. To accomplish this, we have added significant new capabilities to HookTracer, an engine capable of emulating code discovered in physical memory captures and recording all actions taken by the emulated code. Based on this work, we present new memory forensics capabilities, embodied in a new Volatility plugin, hooktracer_messagehooks, that uses HookTracer to automatically decide whether a hook in memory is associated with a malicious keylogger or benign software. We also include a detailed case study that illustrates our technique’s ability to successfully analyze very sophisticated keyloggers, such as Turla.
Conference Paper
Container-based cloud computing services are increasingly adopted by service providers for their efficiency and flexibility. Containers isolated by namespaces share the OS kernel. When kernel-level rootkits exploit vulnerabilities in the kernel, namespace isolation can be invalidated, leading to critical security incidents. Although many traditional approaches have been proposed to detect kernel-level rootkits, it is hard to detect new attacks conducted in a new environment such as a container-based cloud computing system. In this paper, we show some possible attack scenarios in which kernel-level rootkits exploit kernel namespaces, and we suggest key features that can be used to train machine learning and neural network models.