Figure 2 - uploaded by Michael Pradel
Mined protocol for java.util.Stack. States with gray background are liable states. 


Source publication
Conference Paper
Full-text available
Mining specifications and using them for bug detection is a promising way to reveal bugs in programs. Existing approaches suffer from two problems. First, dynamic specification miners require input that drives a program to generate common usage patterns. Second, existing approaches report false positives, that is, spurious warnings that mislead developers...
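The mined protocol for java.util.Stack in Figure 2 is, in essence, a finite-state machine over the class's methods in which the empty-stack state is liable. As a minimal hand-written sketch of what such a protocol encodes (the class name, the two states, and the transitions below are illustrative assumptions, not the actual mined protocol):

```java
import java.util.EmptyStackException;
import java.util.Stack;

// Illustrative two-state protocol for java.util.Stack:
// the EMPTY state is "liable" -- pop()/peek() there are misuses.
public class StackProtocol {
    enum State { EMPTY, NON_EMPTY }

    // Returns true if the call sequence stays within the protocol,
    // false if a forbidden call happens in a liable state.
    public static boolean conforms(String[] calls) {
        State state = State.EMPTY;
        int size = 0;
        for (String call : calls) {
            switch (call) {
                case "push":
                    size++;
                    state = State.NON_EMPTY;
                    break;
                case "pop":
                    if (state == State.EMPTY) return false; // misuse
                    size--;
                    state = (size == 0) ? State.EMPTY : State.NON_EMPTY;
                    break;
                case "peek":
                    if (state == State.EMPTY) return false; // misuse
                    break;
                default:
                    throw new IllegalArgumentException(call);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(conforms(new String[] {"push", "pop"})); // true
        System.out.println(conforms(new String[] {"pop"}));         // false: pop in liable EMPTY state
        try {
            new Stack<Integer>().pop(); // the concrete misuse
        } catch (EmptyStackException e) {
            System.out.println("EmptyStackException, as the protocol predicts");
        }
    }
}
```

A dynamic specification miner infers such states and transitions from execution traces rather than hard-coding them as done here.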

Citations

... Thus, it is very unlikely to find values for those rules providing a high recall, since we do not expect to have a 'super-rule' detecting a majority of misuses.

Approach                Dataset                   Precision     Recall
[55]                    MUBench                   60.2%         28.45%
FuzzyCatch [44]         self-collected            65-92%        73.4-82.1% (a)
Salento [40]            self-collected            75%           100% (b)
Pradel et al. [53]      DaCapo [9]                50.6%         70%
Pradel and Gross [52]   DaCapo                    100%          N/A
Tikanga [5,65]          self-collected, MUBench   11.4-39.7%    13.2% (c)
SpecCheck [42]          DaCapo                    54.2%         N/A
DMMC [5,38]             self-collected, MUBench   9.9-57.9%     20.8% (c)
OCD [14]                self-collected            60%           N/A
GROUMiner [5,45]        self-collected, MUBench   0-13.9%       0% (c)
CAR-Miner [63]          subset from [67]          64.3%         80% (d)
Acharya and Xie [1]     self-collected            90.4%         N/A
Alattin [62]            self-collected            37.8%         94.9% (e)
Jadet [5,66]            self-collected, MUBench   10.3-48.1%    5.7% (c)
PR-Miner [34]           self-collected            23.5%         N/A
Our Approach ...
... Pradel and Gross [52] developed a misuse detector inferring patterns from execution traces of automatically generated test runs. In their experiment, they achieved a zero false-positive rate. ...
Preprint
Developers build on Application Programming Interfaces (APIs) to reuse existing functionalities of code libraries. Despite the benefits of reusing established libraries (e.g., time savings, high quality), developers may diverge from the API's intended usage, potentially causing bugs or, more specifically, API misuses. Recent research focuses on developing techniques to automatically detect API misuses, but many suffer from a high false-positive rate. In this article, we improve on this situation by proposing ChaRLI (Change RuLe Inference), a technique for automatically inferring change rules from developers' fixes of API misuses based on API Usage Graphs (AUGs). By subsequently applying graph-distance algorithms, we use change rules to discriminate API misuses from correct usages. This allows developers to reuse others' fixes of an API misuse at other code locations in the same or another project. We evaluated the ability of change rules to detect API misuses based on three datasets and found that the best mean relative precision (i.e., for testable usages) ranges from 77.1% to 96.1%, while the mean recall ranges from 0.007% to 17.7% for individual change rules. These results underpin that ChaRLI and our misuse detection are helpful complements to existing API misuse detectors.
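To illustrate the graph-distance idea behind this discrimination step (a deliberate simplification: real AUGs carry typed nodes and edges, and ChaRLI's distance algorithms are richer than plain Jaccard; the class name UsageDistance is hypothetical), one can reduce usage graphs to label sets and compare a usage against the buggy and fixed sides of a change rule:

```java
import java.util.HashSet;
import java.util.Set;

// Hedged sketch: API usage graphs reduced to sets of node/edge labels,
// compared with Jaccard distance. A usage is a misuse candidate when it
// is strictly closer to the buggy side of a change rule than to the
// fixed side.
public class UsageDistance {
    public static double jaccardDistance(Set<String> g1, Set<String> g2) {
        Set<String> union = new HashSet<>(g1);
        union.addAll(g2);
        Set<String> inter = new HashSet<>(g1);
        inter.retainAll(g2);
        if (union.isEmpty()) return 0.0;
        return 1.0 - (double) inter.size() / union.size();
    }

    public static boolean looksLikeMisuse(Set<String> usage,
                                          Set<String> buggy,
                                          Set<String> fixed) {
        return jaccardDistance(usage, buggy) < jaccardDistance(usage, fixed);
    }

    public static void main(String[] args) {
        Set<String> buggy = Set.of("Iterator.next");
        Set<String> fixed = Set.of("Iterator.hasNext", "Iterator.next");
        // A usage that calls next() without hasNext() matches the buggy side.
        System.out.println(looksLikeMisuse(Set.of("Iterator.next"), buggy, fixed)); // true
    }
}
```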
... Palus [30] improved the above tool by combining static and dynamic access methods. Pradel [31] generated a finite state machine (FSM) through dynamic analysis and generated an API call sequence based on the FSM. ...
Article
Full-text available
Fuzzing is widely utilized as a practical test method to determine unknown vulnerabilities in software. Although fuzzing shows excellent results in code coverage and crash count, these results do not transfer easily to library fuzzing. A library cannot run independently; it is only executed by an application called a customer program. In particular, a fuzzing executable and a seed corpus are needed to execute the library code by calling a specific function sequence and passing the fuzzer's input to reproduce the various states of the library. However, preparing the environment for library fuzzing is challenging because it relies on human expertise and requires both an understanding of the library and fuzzing knowledge. This study proposes FuzzBuilderEx, a system that provides an automated fuzzing environment for a library by utilizing the test framework to resolve this problem. FuzzBuilderEx conducts a static/dynamic analysis of the test code to automatically generate a seed corpus and fuzzing executables that enable library fuzzing. Furthermore, the automatically generated seed corpus and fuzzing executables are compatible with existing fuzzers, such as the American Fuzzy Lop (AFL). This study applied FuzzBuilderEx to nine open-source libraries for performance evaluation and confirmed an increase of 31.2% in code coverage and 58.7% in unique crash count compared to previous studies. Notably, we detected three zero-day vulnerabilities and registered one of them in the common vulnerabilities and exposures (CVE) database.
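One ingredient of preparing such an environment, deriving a seed corpus from existing test code, can be sketched as follows. This is a naive regex pass over test sources; FuzzBuilderEx's actual static/dynamic analysis is far more involved, and the class name SeedHarvester is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Harvest string literals from test code as an initial seed corpus:
// inputs the tests already feed to the library are good fuzzing seeds.
public class SeedHarvester {
    // Matches a double-quoted literal without escapes or embedded quotes.
    private static final Pattern STRING_LITERAL = Pattern.compile("\"([^\"\\\\]*)\"");

    public static List<String> harvest(String testSource) {
        List<String> seeds = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(testSource);
        while (m.find()) {
            if (!m.group(1).isEmpty()) seeds.add(m.group(1));
        }
        return seeds;
    }

    public static void main(String[] args) {
        String test = "assertEquals(parse(\"<xml/>\"), EXPECTED);";
        System.out.println(harvest(test)); // [<xml/>]
    }
}
```

Each harvested string would be written to its own file in the corpus directory handed to a fuzzer such as AFL.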
... According to Amann et al. [3], the constraint specifications could be manually crafted by experts or inferred using either dynamic [29,37] or static analysis [32,40,48]. In this paper, we rely on a manually crafted approach to specify rules in Meta-CrySL, mostly because many programs fail to use cryptographic APIs correctly [1,24,33]. ...
Preprint
APIs are the primary mechanism for developers to gain access to externally defined services and tools. However, previous research has revealed API misuses that violate the contract of APIs to be prevalent. Such misuses can have harmful consequences, especially in the context of cryptographic libraries. Various API misuse detectors have been proposed to address this issue, including CogniCrypt, one of the most versatile such detectors, which uses the CrySL language to specify cryptographic API usage contracts. Nonetheless, existing approaches to detect API misuse have not been designed for systematic reuse, ignoring the fact that different versions of a library, different versions of a platform, and different recommendations or guidelines might introduce variability in the correct usage of an API. Yet, little is known about how such variability impacts the specification of the correct API usage. This paper investigates this question by analyzing the impact of various sources of variability on widely used Java cryptographic libraries, including JCA, Bouncy Castle, and Google Tink. The results of our investigation show that sources of variability like new versions of the API and security standards significantly impact the specifications. We then use the insights gained from our investigation to motivate an extension to the CrySL language named MetaCrySL, which builds on meta-programming concepts. We evaluate MetaCrySL by specifying usage rules for a family of Android versions and illustrate that MetaCrySL can model all forms of variability we identified and drastically reduce the size of a family of specifications for the correct usage of cryptographic APIs.
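A concrete example of the kind of misuse such CrySL rules capture: on the default JCA provider, Cipher.getInstance("AES") silently falls back to the provider default transformation, which uses ECB mode. The sketch below (the class name GcmExample and its helpers are illustrative, not from CogniCrypt or MetaCrySL) shows a compliant AES-GCM round trip with an explicit transformation string and a fresh random IV:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Compliant AES-GCM usage: explicit "AES/GCM/NoPadding" transformation
// and a fresh, random 96-bit IV per encryption, instead of relying on
// the provider default that Cipher.getInstance("AES") would pick.
public class GcmExample {
    public static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv)); // 128-bit auth tag
        return cipher.doFinal(plaintext);
    }

    public static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(ciphertext);
    }

    // Convenience round trip used for demonstration below.
    public static boolean roundTripOk(String msg) {
        try {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            byte[] iv = new byte[12];               // GCM-standard 96-bit IV
            new SecureRandom().nextBytes(iv);       // fresh IV per encryption
            byte[] ct = encrypt(key, iv, msg.getBytes());
            return msg.equals(new String(decrypt(key, iv, ct)));
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripOk("attack at dawn")); // true
    }
}
```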
... Execution logs, produced by software systems at runtime, capture the dynamic aspects of the software. Log analysis tools have been proposed to aid developers in such software engineering tasks as program comprehension [38], test generation [36], and change comprehension [5]. However, researchers have provided empirical evidence that some log analysis tools are not necessarily effective and applicable when dealing with real-world problems [33]. ...
... Spectrum-based fault localization techniques (SBFL) [19] can be used to localize problems; however, they are not effective in this scenario, as reported in our evaluation, and they require a test suite with both failing and passing test cases, which is not always available for mobile apps. Anomaly detection techniques can be an alternative option to identify suspicious behaviors, but they also require extensive test suites of passing test cases to infer models and do not offer any localization capability [10,15,16,21]. This paper describes a tool that implements the FILO technique [12], which is a technique specifically designed to facilitate developers in resolving backward compatibility issues introduced by Android upgrades. ...
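For context, SBFL techniques rank program entities by a suspiciousness score computed from the coverage spectra of passing and failing tests. The Ochiai metric is one common choice, shown here purely as an illustration (not necessarily the formula used in [19]; the class name is hypothetical):

```java
// Ochiai suspiciousness, a common SBFL metric: statements covered by
// many failing and few passing tests rank highest.
public class Ochiai {
    // ef: failing tests that cover the statement
    // nf: failing tests that do not cover it
    // ep: passing tests that cover it
    public static double suspiciousness(int ef, int nf, int ep) {
        double denom = Math.sqrt((double) (ef + nf) * (ef + ep));
        return denom == 0 ? 0.0 : ef / denom;
    }

    public static void main(String[] args) {
        // Covered by the only failing test and no passing test: maximal.
        System.out.println(suspiciousness(1, 0, 0)); // 1.0
        // Covered by the failing test and three passing tests: lower.
        System.out.println(suspiciousness(1, 0, 3)); // 0.5
    }
}
```

The formula makes the snippet's point concrete: without at least one failing test (ef > 0), every statement scores zero, so a suite with both failing and passing cases is a hard prerequisite.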
Preprint
Mobile operating systems evolve quickly, frequently updating the APIs that app developers use to build their apps. Unfortunately, API updates do not always guarantee backward compatibility, causing apps to no longer work properly or even crash when running with an updated system. This paper presents FILO, a tool that assists Android developers in resolving backward compatibility issues introduced by API upgrades. FILO both suggests the method that needs to be modified in the app in order to adapt the app to an upgraded API, and reports key symptoms observed in the failed execution to facilitate the fixing activity. Results obtained with the analysis of 12 actual upgrade problems and the feedback produced by early tool adopters show that FILO can practically support Android developers. FILO can be downloaded from https://gitlab.com/learnERC/filo, and its video demonstration is available at https://youtu.be/WDvkKj-wnlQ.
... The research in this line can be reduced to sequence mining [14]. Furthermore, Le et al. [35] combined sequences and invariants for more informative specifications, and researchers [22,57] used test cases to enrich mined specifications. Mei and Zhang [43] advocate applying big data analysis for software automation, and mining sequential rules is one of the key techniques to extract knowledge from software engineering data. ...
... software bugs or hardware failures leading to instant drops or jumps of the system's resource utilization, causing degraded system performance. Jim Gray [93] introduced the term Heisenbug to describe anomalies that occur during productive runtime and cannot be detected with modern approaches to automated software testing [29,94] and bug detection [95]. Grottke and Trivedi [96] described Mandelbugs as a subset of Heisenbugs whose activation is influenced by the execution environment with respect to timing, the ordering of inputs and operations, and the time lag between bug activation and failure occurrence [29]. ...
Thesis
Cloud computing is widely applied by modern software development companies. Providing digital services in a cloud environment offers both the possibility of cost-efficient usage of computation resources and the ability to dynamically scale applications on demand. Based on this flexibility, more and more complex software applications are being developed, leading to increasing maintenance efforts to ensure the reliability of the entire system infrastructure. Furthermore, high-availability requirements for cloud services (99.999% as the industry standard) are difficult to guarantee due to the complexity of modern systems and can therefore only be ensured with great effort. Due to these trends, there is an increasing demand for intelligent applications that automatically detect anomalies and provide suggestions for solving, or at least mitigating, problems before they cascade into a negative impact on service quality. This thesis focuses on the detection of degraded abnormal system states in cloud environments. A holistic analysis pipeline and infrastructure is proposed, and the applicability of different machine learning strategies is discussed to provide an automated solution. Based on the underlying assumptions, a novel unsupervised anomaly detection algorithm called CABIRCH is presented, and its applicability is analyzed and discussed. Since the choice of hyperparameters has a great influence on the accuracy of the algorithm, a hyperparameter selection procedure with a novel fitness function is proposed, leading to further automation of the integrated anomaly detection. The method is generalized and applicable to a variety of unsupervised anomaly detection algorithms, which are evaluated, including a comparison to recent publications. The results show the applicability for the automated detection of degraded abnormal system states, and possible limitations are discussed. Detection of system anomaly scenarios achieves accurate detection rates but comes with a false-alarm rate of more than 1%.
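As a deliberately simple stand-in for the class of unsupervised detectors the thesis discusses (CABIRCH itself is micro-cluster based and not reproduced here; the class name ZScoreDetector is hypothetical), a rolling z-score over a sliding window flags resource-utilization samples that deviate strongly from the recent mean. The window size and threshold are exactly the kind of hyperparameters whose choice, as the thesis argues, drives accuracy and false-alarm rate:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Flag a sample whose z-score against a sliding window of recent
// samples exceeds a threshold; an illustrative unsupervised baseline.
public class ZScoreDetector {
    private final Deque<Double> window = new ArrayDeque<>();
    private final int windowSize;   // hyperparameter
    private final double threshold; // hyperparameter, e.g. 3.0

    public ZScoreDetector(int windowSize, double threshold) {
        this.windowSize = windowSize;
        this.threshold = threshold;
    }

    public boolean isAnomaly(double sample) {
        boolean anomalous = false;
        if (window.size() == windowSize) {
            double mean = window.stream().mapToDouble(Double::doubleValue).average().orElse(0);
            double var = window.stream().mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0);
            double std = Math.sqrt(var);
            anomalous = std > 0 && Math.abs(sample - mean) / std > threshold;
            window.removeFirst(); // slide the window forward
        }
        window.addLast(sample);
        return anomalous;
    }
}
```

A tight threshold catches sudden drops or jumps in utilization but, as the evaluation above suggests, inevitably trades off against false alarms on noisy metrics.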
... Ernst et al. [35] inferred invariants to define rules for variables. More informative specs can be obtained by combining the invariants with sequences [65], and test cases have been used to enrich mined specs [25,98]. Marc and David [19] mined performance models from runtime traces. ...
Article
Full-text available
Ever since software emerged, locating software faults has been intensively researched, culminating in various approaches and tools that have been applied in real development. Despite the success of these developments, improved tools are still demanded by programmers. Meanwhile, some programmers are reluctant to use any tools when locating faults in their development. This state-of-the-art situation can be naturally improved by learning how programmers locate faults. The rapid development of open-source software has accumulated many bug fixes. A bug fix is a specific type of commit containing a set of buggy files and their corresponding fixed files, which reveal how programmers repair bugs. Feasibly, an automatic model can learn fault locations from bug fixes, but prior attempts to achieve this vision have been prevented by various technical challenges. For example, most bug fixes are not compilable after being checked out, which hinders analyzing them with most advanced static/dynamic tools. This paper proposes an approach called ClaFa that trains a graph-based fault classifier from bug fixes. ClaFa is built on a recent partial-code tool called Grapa, which enables the analysis of partial programs by the complete-code tool WALA. Once Grapa has built a program dependency graph from a bug fix, ClaFa compares the graph from the buggy code with the graph from the fixed code, locates the buggy nodes, and extracts various graph features of the buggy and clean nodes. Based on the extraction result, ClaFa trains a classifier that combines AdaBoost and decision tree learning. The trained ClaFa can predict whether a node of a program dependency graph is buggy or clean. We evaluate ClaFa on thousands of buggy files collected from four open-source projects: Aries, Mahout, Derby, and Cassandra. The f-scores ClaFa achieves are approximately 80% on all projects.
... Ernst et al. [32] infer invariants to define rules for variables. Researchers [49] combine sequences and invariants for more informative specs, and other researchers [29], [71] use test cases to enrich mined specs. Marc and David [23] mine performance models from runtime traces. ...
Article
Full-text available
Due to the complexity and variety of programs, it is difficult to manually enumerate all bug patterns, especially for those related to API usages or project-specific rules. With the rapid development of software, many past bug fixes accumulate in software version histories. These bug fixes contain valuable samples of illegal coding practices. The gap between existing bug samples and well-defined bug patterns motivates our research. In the literature, researchers have explored techniques for learning bug signatures from existing bugs, where a bug signature is defined as a set of program elements explaining the cause/effect of the bug. However, due to various limitations, existing approaches cannot analyze past bug fixes at large scale, and to the best of our knowledge, no previously unknown bugs were ever reported by their work. The major challenge in automatically analyzing past bug fixes is that bug-inducing inputs are typically not recorded, and many bug fixes are partial programs that have compilation errors. As a result, for most bugs in the version history, it is infeasible to reproduce them for dynamic analysis or to feed buggy/fixed code directly into static analysis tools, which mostly depend on compilable, complete programs. In this paper, we propose an approach, called DEPA, that extracts bug signatures based on accurate partial-code analysis of bug fixes. With its support, we conduct the first large-scale evaluation on 6,048 past bug fixes collected from four popular Apache projects. In particular, we use DEPA to infer bug signatures from these fixes and to check the latest versions of the four projects with the inferred bug signatures. Our results show that DEPA detected 27 unique previously unknown bugs in total, including at least one bug from each project. These bugs had not been detected by their developers or other researchers. Among them, three of our reported bugs are already confirmed and repaired by their developers. Furthermore, our results show that state-of-the-art tools detected only two of our found bugs, and our filtering techniques improve our precision from 25.5% to 51.5%.