Figure - available from: Empirical Software Engineering
The actual ROC-AUC scores of the model sequence generated by FENSE

Source publication
Article
Context: Just-in-time defect prediction (JITDP) leverages modern machine learning models to predict the defect-proneness of commits. Such models require adequate training data, which is unavailable in projects with short histories. To address this problem, cross-project methods reuse the data or models in other projects to make predictions, grounded...

Similar publications

Preprint
Bias mitigators can improve algorithmic fairness in machine learning models, but their effect on fairness is often not stable across data splits. A popular approach to train more stable models is ensemble learning, but unfortunately, it is unclear how to combine ensembles with mitigators to best navigate trade-offs between fairness and predictive p...

Citations

... Their results show that JIT-Fine outperforms all used state-of-the-art methods across 10 performance measures, showing significant improvements. Besides, researchers also explored JIT-SDP methods in a cross-project context (Kamei et al. 2016; Tabassum et al. 2020; Zhang et al. 2022). In this study, our primary focus is on using manually engineered features to build JIT-SDP models for comparison under a time-wise cross-validation setting. ...
... In the future, we plan to address this bias by conducting experiments on a more diverse range of defect prediction datasets from various domains. Additionally, our empirical study is based on within-project JIT-SDP, and the conclusions may not be directly transferable to cross-project JIT-SDP (Kamei et al. 2016; Zhang et al. 2022) and heterogeneous SDP (Chen et al. 2021) settings. So we encourage researchers to explore different prediction settings to continue the research. ...
Article
Just-in-time software defect prediction (JIT-SDP) is a fine-grained, easy-to-trace, and practical method. Unfortunately, JIT-SDP usually suffers from the class imbalance problem, which affects the performance of the models. Data sampling is one of the commonly used class imbalance techniques to overcome this problem. However, there is a lack of comprehensive empirical studies comparing the effect of different data sampling techniques on the performance of JIT-SDP. In this paper, we consider both defect classification and defect ranking, two typical application scenarios. To this end, we performed an empirical comparison of 10 data sampling algorithms on the performance of JIT-SDP. Extensive experiments on 10 open-source projects with 12 performance measures show that the effectiveness of data sampling techniques can indeed vary depending on the specific evaluation measures in both defect classification and defect ranking scenarios. Specifically, the RUM algorithm has demonstrated superior performance overall in the context of defect classification, particularly in F-measure, AUC, and MCC. On the other hand, for defect ranking, the ENN algorithm has emerged as the most favorable option, exhibiting perfect results in $P_{opt}$, Recall@20%, and F-measure@20%. However, data sampling techniques can lead to an increase in false alarms and require the inspection of a higher number of changes. These findings highlight the importance of carefully selecting the appropriate data sampling technique based on the specific evaluation measures for different scenarios.
... Kamei et al. evaluated their cross-project defect prediction technique and obtained promising results [37]. Other researchers have also considered cross-project defect prediction with ensemble methods [38]. ...
Article
Software engineering workflows use version control systems to track changes and handle merge cases from multiple contributors. This has introduced challenges to testing because it is impractical to test whole codebases to ensure each change is defect-free, and it is not enough to test changed files alone. Just-in-time software defect prediction (JIT-SDP) systems have been proposed to solve this by predicting the likelihood that a code change is defective. Numerous techniques have been studied to build such JIT software defect prediction models, but the power of pre-trained code transformer language models in this task has been underexplored. These models have achieved human-level performance in code understanding and software engineering tasks. Inspired by that, we modeled the problem of change defect prediction as a text classification task utilizing these pre-trained models. We have investigated this idea on a recently published dataset, ApacheJIT, consisting of 44k commits. We concatenated the changed lines in each commit as one string and augmented it with the commit message and static code metrics. Parameter-efficient fine-tuning was performed for 4 chosen pre-trained models, JavaBERT, CodeBERT, CodeT5, and CodeReviewer, with either partially frozen layers or low-rank adaptation (LoRA). Additionally, experiments with the Local, Sparse, and Global (LSG) attention variants were conducted to handle long commits efficiently, which reduces memory consumption. As far as the authors are aware, this is the first investigation into the abilities of pre-trained code models to detect defective changes in the ApacheJIT dataset. Our results show that proper fine-tuning improves the defect prediction performance of the chosen models in the F1 scores. CodeBERT and CodeReviewer achieved a 10% and 12% increase in the F1 score over the best baseline models, JITGNN and JITLine, when commit messages and code metrics are included. 
Our approach sheds more light on the abilities of language models in software engineering tasks, promoting their use in production environments and helping to ensure efficiently that deployed software is defect-free.
... It refers to combining multiple individual models to make more accurate predictions or classifications. It is based on the concept of "wisdom of the crowd," where combining the predictions of multiple models often leads to better overall performance than relying on a single model [20] [50]. ...
Article
Software defect prediction plays a crucial role in enhancing software quality while achieving cost savings in testing. Its primary objective is to identify and send only defective modules to the testing stage. This research introduces an intelligent ensemble-based software defect prediction model that combines diverse classifiers. The proposed model employs a two-stage prediction process to detect defective modules. In the first stage, four supervised machine learning algorithms are employed: Random Forest, Support Vector Machine, Naïve Bayes, and Artificial Neural Network. These algorithms are optimized through iterative parameter optimization to achieve the highest accuracy possible. In the second stage, the predictive accuracy of the individual classifiers is integrated into a voting ensemble to make the final predictions. This ensemble approach further improves the accuracy and reliability of the defect predictions. Seven historical defect datasets from the NASA MDP repository, namely CM1, JM1, MC2, MW1, PC1, PC3, and PC4, were utilized to implement and evaluate the proposed defect prediction system. The results demonstrate that the proposed intelligent system achieved remarkable accuracy on each dataset, outperforming twenty state-of-the-art defect prediction techniques, including base classifiers and ensemble methods.
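The second-stage voting step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote`, the optional accuracy-style weights, the tie-breaking rule, and the toy predictions are all assumptions made for illustration, since the abstract does not specify the exact voting rule.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one per classifier.
    weights: optional per-classifier weights (hypothetical, e.g. validation
             accuracy); equal weights give a plain majority vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        scores = defaultdict(float)
        for preds, w in zip(predictions, weights):
            scores[preds[i]] += w
        # Assumption: ties go to the larger label (1 = defective).
        combined.append(max(scores, key=lambda label: (scores[label], label)))
    return combined


# Hypothetical predictions of four classifiers on five modules (1 = defective).
rf = [1, 0, 1, 0, 1]
svm = [1, 0, 0, 0, 1]
nb = [0, 1, 1, 0, 1]
ann = [1, 0, 1, 1, 0]
print(weighted_vote([rf, svm, nb, ann]))  # [1, 0, 1, 0, 1]
```

Passing weights such as each classifier's validation accuracy turns the plain majority vote into an accuracy-weighted vote, which is one plausible reading of how the classifiers' accuracy could be "integrated" into the ensemble.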
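... To address the continuous transition from open-ended requirements elicitation to engineering milestones, the research proposed a requirements aggregation method based on cross-community linking. It uses information retrieval and semantic analysis techniques to establish associations among related requirements across projects [42] and across communities [43], and it built OSSEAN [44], a platform for aggregating, evaluating, and retrieving open-source resources that integrates software development communities with knowledge-sharing communities, effectively supporting the automatic acquisition, aggregation, and linking of multi-source, diverse requirements feedback in the Internet environment. To address the convergence from personalized merge requests to deterministic functionality, we proposed intelligent techniques for collaborative tasks of different granularities, types, and quality levels, including developer recommendation [45], reviewer recommendation [46], task specification guidance [47], and contribution quality detection [48], achieving efficient connection and matching between heterogeneous entities in crowd-intelligence software such as "project-developer", "task-developer", and "task-task". To address the continuous release of staged versions of crowd-intelligence software, and focusing on key DevOps technologies such as continuous integration and continuous deployment, we carried out a fine-grained empirical analysis of initial pre-release quality and actual runtime quality [49] and proposed optimization strategies for continuous deployment workflows in cloud-native environments [50,51]. These can effectively support the efficient screening of "prototype works" and the rapid launch of "prototype versions" among the elements of the crowd-intelligence paradigm, thereby driving crowd-intelligence software into a new round of requirements elicitation and aggregation. ...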
Article
With the development of smartphones, mobile applications play an irreplaceable role in our daily life. Their code is changed frequently to meet new requirements, and these changes can introduce defects into the software. To provide immediate feedback to developers, researchers have turned to just-in-time (JIT) software defect prediction techniques. JIT defect prediction aims to determine whether a code commit will introduce defects into the software. It covers two scenarios: within-project JIT defect prediction and cross-project JIT defect prediction. Both scenarios need enough labeled data: within-project JIT defect prediction assumes plenty of labeled data from the same project, while cross-project JIT defect prediction assumes sufficient labeled data from source projects. However, in practice, both the source and target projects may have only limited labeled data. We propose the MTL-DNN method, based on multi-task learning, to address this problem. The method consists of a data preprocessing layer, an input layer, shared layers, task-specific layers, and an output layer. The shared layers learn the features common to multiple related tasks, and the task-specific layers learn the features unique to each task. To verify the effectiveness of the MTL-DNN approach, we evaluate our method on 15 Android mobile apps. The experimental results show that our method significantly outperforms the state-of-the-art single-task deep learning and classical machine learning methods. This result shows that the MTL-DNN method can effectively solve the problem of insufficient labeled training data for source and target projects.