
An Ensemble Model for Software Defect Prediction

June 2022


DOI:10.1109/ICoDT255437.2022.9787439

Conference: 2022 2nd International Conference on Digital Futures and Transformative Technologies (ICoDT2)
At: Islamabad, Pakistan

Authors:

Attique Ur Rehman

University of Sialkot

Tahir Muhammad Ali

Gulf University for Science and Technology (Kuwait)




An Ensemble Model for Software Defect Prediction

Amad Rizwan Ali

UserMaven INC

amad.ali@usermaven.com

Tahir Muhammad Ali

Department of Computer Science

College of Science and Arts

Gulf University for Science and Technology

Kuwait

ali.t@gust.edu.kw

Attique Ur Rehman

Department of Software Engineering

University of Sialkot

Sialkot, Pakistan

attique.urrehman@uskt.edu.pk

Muhammad Abbas

College of Electrical and Mechanical Engineering

National University of Sciences and Technology

Rawalpindi, Pakistan

Ali Nawaz

College of Electrical and Mechanical Engineering

National University of Sciences and Technology

Rawalpindi, Pakistan

anawazcse19ceme.ce.ceme.edu.pk

Abstract—Software testing is one of the most important ways to ensure software quality, yet it has been found to account for more than 50% of overall project cost. Effective and efficient testing should consume as few project resources as possible, so it is important to construct a procedure that tests efficiently while minimizing resource use. The goal of software testing is to find as many defects in the software system as possible. Because important decisions are increasingly made with data-driven approaches, in this paper we apply machine learning to publicly available datasets and aim for the highest achievable accuracy. The main focus is to apply different machine learning techniques to these datasets and determine which one produces the best results. In particular, we propose an ensemble learning model and compare it against KNN, decision tree, SVM, and Naïve Bayes on different datasets; the ensemble method outperforms the other methods in terms of accuracy, precision, recall, and F1-score. The classification accuracy of the ensemble model is 98.56% on CM1, 98.18% on KC2, and 99.27% on PC1. This indicates that ensemble learning is a more effective method for defect prediction than the other techniques.

Index Terms—Software Quality Engineering, Software Testing, Machine Learning, Supervised Learning, Software Defects

I. INTRODUCTION

The success of any software depends on a sound development process, and testing is an important phase of the software development life cycle. It is an essential step for ensuring quality, because even a minor defect can severely affect the later stages of development. Software testing consumes more than 50% of the overall development cost [1], and the effort it requires is approximately 40-60% of the overall development process [2]. Testing therefore needs to be managed efficiently so that resources are used effectively. Software testing can be performed either manually or automatically. Manual testing is time consuming and, because it relies on human judgment, more error-prone than automated testing, which is more accurate and time efficient.

The goal of software testing is to find the maximum number of defects in the software. Software defect prediction is the process of identifying defects in the software. Finding defects early not only improves software quality but also helps resources be used effectively. Machine learning is widely used to predict or find defects in software [8], [9]. Over the last few decades, machine learning has increasingly been applied to real-world problems because large amounts of labelled data have become available. Machine learning is the ability of a computer to learn from data [2]. It is classified into three categories: 1) supervised learning, 2) unsupervised learning, and 3) semi-supervised learning. In supervised learning the data contains both features and labels; in unsupervised learning it contains features only; and in semi-supervised learning it contains a small amount of labelled data and a large amount of unlabelled data. Supervised learning is applied more widely than unsupervised and semi-supervised learning, although unsupervised learning is also widely used for software defect detection [3], [4], [5], [6]. Supervised learning is further divided into classification and regression. In regression the labels are continuous variables, whereas in classification they are discrete. This paper deals with a classification task because the datasets are labelled with discrete variables.

This paper proposes an ensemble learning technique for increasing the accuracy of defect prediction. The datasets used in the experiments are the open-source CM1, KC2, and PC1 datasets from the PROMISE repository [7]. The results of the proposed technique are compared with several prominent machine learning algorithms, such as k-nearest neighbors (KNN), support vector machine (SVM), and decision tree (DT), and the proposed ensemble learning method is shown to be effective and sound. The evaluation metrics used for comparison are precision, recall, accuracy, F1-score, and the ROC curve.

Authorized licensed use limited to: NUST School of Electrical Engineering and Computer Science (SEECS). Downloaded on June 09,2022 at 07:23:26 UTC from IEEE Xplore. Restrictions apply.

Fig. 1: Research Paper Overview

The main contributions of the paper are summarized as follows:

• An ensemble learning model is proposed for defect prediction.

• The proposed model is compared in detail with prominent machine learning models.

• A maximum defect prediction accuracy of 99.27% is achieved (on PC1).

The rest of the paper is organized as follows. Section II reviews related literature. The proposed methodology is described in Section III, and the corresponding experimental results are presented in Section IV. Section V discusses the findings, and Section VI concludes the paper. An overview of the paper is shown in Fig. 1.

II. LITERATURE REVIEW

According to [10], software testing is an integral part of the software development life cycle, and the overall success of software relies on the testing phase. A large number of machine learning techniques have been proposed for efficient defect detection in software.

C. Manjula et al. [11] proposed a hybrid machine learning approach in which a genetic algorithm is used to improve the fitness function and to better optimize the features; the optimized features are then processed by a decision tree (DT). The performance was compared with an ID3-based decision tree, and the proposed hybrid approach achieved better results, successfully addressing the performance challenge.

I. Laradji et al. [12] proposed an ensemble learning technique with more emphasis on feature engineering. The method combines ensemble learning with efficient feature selection to address the robustness problem of previous defect prediction techniques. It also presents the average probability ensemble (APE), an ensemble of seven machine learning models, and shows that APE improves on other machine learning models such as weighted SVM and random forests. APE combined with greedy forward selection was found to produce better results for the PC2, PC4, and MC1 datasets.

O. Arar et al. [13] proposed a hybrid technique for defect prediction in which an artificial neural network (ANN) makes the predictions and its weights are optimized with the artificial bee colony (ABC) algorithm. The method was compared with Naïve Bayes, random forest, C4.5, Immunos, and AIRS; it matched random forest on the KC1 dataset and produced high accuracy on the KC2 and CM1 datasets. However, it performed worse than other techniques on the PC1 and JM1 datasets. The authors suggest that more focus on feature engineering may yield better results.

M. Siers et al. [14] proposed an ensemble method for defect classification. The method, called CSForest, is an ensemble of decision trees. To minimize classification cost, a cost-sensitive voting technique called CSVoting is proposed. The technique was evaluated against six prominent classifiers (C4.5, SVM, SysFor+Voting1, SysFor+Voting2, CSC+C4.5, and CSTree) on six publicly available datasets. The results show that combining CSForest and CSVoting achieves a lower prediction cost.

P. Singh et al. [15] analyzed prominent machine learning algorithms, namely ANN, particle swarm optimization (PSO), DT, Naïve Bayes (NB), and linear classifier (LC), and evaluated them on seven PROMISE [16] datasets. The algorithms were analyzed using the KEEL tool and validated with k-fold cross-validation. The analysis reveals that LC has the highest defect prediction accuracy on four of the seven datasets, followed by Naïve Bayes, decision tree, and neural network, each ranking next on one dataset; LC is therefore dominant over the other classifiers.

Manjula et al. [17] proposed a hybrid machine learning approach in which a genetic algorithm is used to improve the fitness function and to better optimize the features; the optimized features are then processed by a deep neural network (DNN). The performance was compared with Naïve Bayes, SVM, decision tree, KNN, and several other ML algorithms, and the proposed hybrid approach achieved better results, successfully addressing the performance challenge. The classification accuracy of the approach is 97.82% on the KC1 dataset, 97.59% on CM1, 97.96% on PC3, and 98.0% on PC4, which is better than previous techniques.


III. METHODOLOGY

In this section, the proposed methodology is described in detail.

A. Feature Engineering

This subsection analyzes the datasets and explains the basic feature processing applied to them.

1) Feature Description: The datasets used in this paper are open-source data created and distributed by the NASA Metrics Data Program [16]. The datasets are available in different versions; in our experiments we use only CM1, KC2, and PC1. CM1 comprises 498 instances and 21 attributes, KC2 comprises 522 instances and 21 attributes, and PC1 comprises 1109 instances and 21 attributes.

2) Feature Selection: The main step of feature engineering is to select the features relevant [18] to the problem domain and to ignore the irrelevant ones. For this task a statistics-based filter ranking method is used, which applies a statistical measure to assign a score to each feature and presents the user with a ranked list of features. Specifically, the chi-squared statistic is applied to evaluate the importance of each feature and to perform the selection.
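The paper does not give the implementation details of the chi-squared ranking, so the following is only a minimal sketch of how such a filter could work. It assumes each numeric feature is split into two bins at its median before the Pearson chi-squared statistic is computed against the class label; the function names, the binning rule, and the toy data are all illustrative assumptions, not the authors' setup.

```python
from collections import Counter

def chi_squared_score(feature_values, labels):
    """Pearson chi-squared statistic between a binarized feature and the class label."""
    ordered = sorted(feature_values)
    median = ordered[len(ordered) // 2]
    # Assumption: two bins, split at the median of the feature.
    bins = [value >= median for value in feature_values]

    n = len(labels)
    observed = Counter(zip(bins, labels))   # counts per (bin, class) cell
    bin_totals = Counter(bins)
    label_totals = Counter(labels)

    score = 0.0
    for b in bin_totals:
        for c in label_totals:
            expected = bin_totals[b] * label_totals[c] / n
            score += (observed[(b, c)] - expected) ** 2 / expected
    return score

def rank_features(rows, labels, names):
    """Return (feature name, score) pairs, highest score first."""
    scores = {
        name: chi_squared_score([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data (made up): "loc" tracks the defect label, "noise" does not.
rows = [(10, 5), (12, 1), (11, 4), (50, 2), (55, 5), (60, 1), (9, 3), (58, 4)]
labels = [0, 0, 0, 1, 1, 1, 0, 1]
ranking = rank_features(rows, labels, ["loc", "noise"])
print(ranking)
```

In this toy example the feature that separates the two classes receives the higher score and is ranked first; in practice the top-ranked features would be kept and the rest discarded.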

3) Handling Class Imbalance: Analysis of the datasets shows that the data is not balanced: some classes have many more instances than others. CM1 has approximately 90% false (non-defect) values and 10% true (defect) values. Similarly, KC2 has 20% true (defect) and 80% false (non-defect) values, while PC1 has 7% true (defect) and 93% false (non-defect) values. The class imbalance problem is addressed by bootstrap sampling [19], and the number of observations used for sampling is 7. Bootstrapping, the resampling step that underlies bootstrap aggregation, is random sampling with replacement used to create random samples of the data.
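As an illustration of sampling with replacement, the sketch below resamples every class up to the size of the largest class. This balancing target is an assumption made for the example; the paper states only that bootstrapping was used, and the sampling parameter it mentions is not modeled here. All names and the toy data are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(items, size, rng):
    """Draw `size` items uniformly at random, with replacement."""
    return [rng.choice(items) for _ in range(size)]

def balance_by_bootstrap(rows, labels, seed=0):
    """Resample every class with replacement up to the size of the largest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())

    balanced_rows, balanced_labels = [], []
    for label, group in by_class.items():
        balanced_rows.extend(bootstrap_sample(group, target, rng))
        balanced_labels.extend([label] * target)
    return balanced_rows, balanced_labels

# Toy data (made up): 9 non-defective modules and 1 defective one,
# roughly the 90%/10% ratio reported for CM1.
rows = [(i, i * 2) for i in range(10)]
labels = [False] * 9 + [True]
balanced_rows, balanced_labels = balance_by_bootstrap(rows, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Because the draws are made with replacement, the single defective module is repeated until both classes have the same number of rows.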

B. Proposed Learning Model

The proposed ensemble learning model trained on CM1 is shown in Fig. 2. After basic feature preprocessing, an ensemble is applied that combines a Classification by Regression model [20] and a KNN model [21]. After the ensemble, KNN is applied again to achieve better classification accuracy. The Classification by Regression model is a nested operator: its subprocess contains a regression learner, and the regression models trained by that learner are then used to build a classification model. KNN, short for k-nearest neighbors, is a prominent classification model. It computes the distance between the test point and every point in the training data, finds the k nearest neighbors of the test point, and assigns the class most common among them.
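The KNN step described above can be sketched in a few lines. The Euclidean distance, the value k = 3, and the toy points are assumptions chosen for illustration; the paper does not state which distance measure or k was used, and this sketch covers only the KNN component, not the full ensemble.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point (Euclidean, by assumption).
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (made up): two well-separated groups of modules.
train_rows = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["not-defect"] * 3 + ["defect"] * 3

prediction_low = knn_predict(train_rows, train_labels, (1.1, 1.0))
prediction_high = knn_predict(train_rows, train_labels, (5.1, 5.0))
print(prediction_low, prediction_high)
```

An odd k avoids tied votes in a two-class problem such as defect versus non-defect.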

IV. RESULTS AND DISCUSSION

In this section, the results of the proposed ensemble learning model are presented.

Fig. 2: Proposed learning model

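A. Evaluation Metrics

The most important evaluation metrics [22] used in the experiments are defined by the equations below, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives.

• Precision is the ratio of TP to the sum of TP and FP, as given in equation (1).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$

• Recall is the ratio of TP to the sum of TP and FN, as given in equation (2).

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

• Accuracy is the ratio of the sum of TP and TN to the total number of positive (P) and negative (N) instances, as given in equation (3).

$$\text{Accuracy} = \frac{TP + TN}{P + N} \tag{3}$$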
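As a worked example of equations (1) to (3), the sketch below computes the metrics from confusion-matrix counts. The counts are made up for illustration and are not taken from the experiments. The F1-score, which the paper reports but does not define, is computed here with its standard definition as the harmonic mean of precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, accuracy, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)                   # equation (1)
    recall = tp / (tp + fn)                      # equation (2)
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # equation (3), with P + N = all instances
    f1 = 2 * precision * recall / (precision + recall)  # standard harmonic mean
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}

# Hypothetical counts for 100 test modules.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, precision is 40/50, recall is 40/45, and accuracy is 85/100.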

B. Accuracy on Test Set

The accuracy of the ensemble learning model is 98.56% on CM1, 98.17% on KC2, and 99.27% on PC1. In the reviewed literature, the highest accuracy previously reported on CM1 was 97.59% [17]. Our experimental results demonstrate that ensemble learning produces better results than simple learning models [23].


TABLE I: Comparative analysis of prominent ML models w.r.t. the CM1 dataset

| Model | Accuracy | Precision | Recall | F-score |
|---|---|---|---|---|
| SVM | 89.87% | 89.95% | 99.89% | Unknown |
| KNN | 94.41% | 99.36% | 99.57% | 95.41% |
| Decision Tree | 94.26% | 94.31% | 99.68% | 57.75% |
| Random Forest | 92.54% | 92.34% | 100% | 40.91% |
| Proposed Ensemble | 98.56% | 98.8% | 99.62% | 99.62% |

TABLE II: Comparative analysis of prominent ML models w.r.t. the KC2 dataset

| Model | Accuracy | Precision | Recall | F-score |
|---|---|---|---|---|
| SVM | 83.85% | 84.41% | 97.7% | 43.81% |
| KNN | 96.41% | 95.51% | 100% | 99.93% |
| Decision Tree | 88.59% | 92.69% | 93% | 71.91% |
| Random Forest | 92.43% | 91.26% | 100% | 77.87% |
| Proposed Ensemble | 98.17% | 97.75% | 100% | 95.22% |

TABLE III: Comparative analysis of prominent ML models w.r.t. the PC1 dataset

| Model | Accuracy | Precision | Recall | F-score |
|---|---|---|---|---|
| SVM | 92.66% | 92.66% | 100% | Unknown |
| KNN | 98.5% | 98.68% | 99.72% | 88.22% |
| Decision Tree | 94.89% | 94.86% | 99.91% | 45.16% |
| Random Forest | 95.36% | 95.22% | 100% | 56.10% |
| Proposed Ensemble | 99.27% | 99.86% | 99.35% | 95.13% |

C. Accuracy of Train and Test Datasets

The accuracy of the ensemble learning model is 98.56% on CM1, 98.17% on KC2, and 99.27% on PC1. In the reviewed literature, the highest accuracy previously reported on CM1 was 97.59% [17]. Our experimental results demonstrate that ensemble learning produces better results than simple learning models [24].

D. Comparison of Precision, Recall, and F1-Score of the Proposed Learning Model

The formulae for precision, recall, and accuracy are given in equations (1), (2), and (3), respectively. The experimental results demonstrate that the proposed ensemble learning method achieves higher accuracy than the previous methods. The precision, recall, and F1-score results are compared in TABLE I, TABLE II, and TABLE III for CM1, KC2, and PC1, respectively.

E. ROC curve

The receiver operating characteristic (ROC) curve is a machine learning evaluation tool for analyzing the behavior of classifiers at different thresholds [25]. It plots the true positive rate (TPR) against the false positive rate (FPR). The ROC curve shows that the ensemble learning model achieves higher accuracy than the previously proposed models.
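To make the construction of the curve concrete, the following sketch computes the (FPR, TPR) points by sweeping the decision threshold over a classifier's scores, and then the area under the curve with the trapezoidal rule. The scores and labels are made up for illustration and do not come from the paper's experiments; the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by thresholding at each distinct score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as points sorted by x."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

# Hypothetical defect scores (higher means more likely defective) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc_value = area_under_curve(points)
print(points)
print(auc_value)
```

A curve that rises quickly toward the top-left corner, and hence an area close to 1, indicates a classifier that separates defective from non-defective modules well across thresholds.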

F. Comparison of the Proposed Learning Model with Prominent Machine Learning Models

In this section, the proposed ensemble learning model is compared with prominent machine learning models, namely SVM, KNN, decision tree, and random forest [26], [27], [28], [29]. The comparison is based on train and test accuracy, precision, recall, and F-score. On CM1, the accuracy is 89.87% for SVM, 95.41% for KNN, 94.26% for DT, and 92.54% for random forest. On KC2, the accuracy is 83.85% for SVM, 89.92% for KNN, 88.59% for DT, and 92.43% for random forest. On PC1, the accuracy is 92.66% for SVM, 98.5% for KNN, 94.89% for DT, and 95.36% for random forest. The comparison demonstrates that the ensemble learning model is the more suitable model for software defect prediction on this type of dataset.

V. DISCUSSION

Software testing consumes more than 50% of the resources of the overall software development process. Defect detection is one of the important activities of software testing, and early detection of defects reduces resource consumption. Many techniques have been proposed for defect prediction. Machine learning is widely used for detecting software defects and achieves good accuracy. In particular, machine learning algorithms such as SVM, KNN, DT, RF, NN, DNN, and GA, along with their variations, are widely used for defect prediction. In this paper, we proposed an ensemble learning model and trained it on three datasets: CM1, KC2, and PC1. The datasets are publicly available in the PROMISE repository, were created by the NASA Metrics Data Program (MDP), and consist of 21 continuous variables and one label with two classes, Defect and Not Defect. Before the machine learning model is applied, the class imbalance problem, which is common in machine learning, is handled with a sampling-with-replacement technique called bootstrapping. The ensemble learning model achieves 98.56% accuracy on CM1, 98.18% on KC2, and 99.27% on PC1. The proposed model is compared with prominent machine learning models, namely SVM, KNN, DT, and RF, and it produces better results than all of them. The comparison is based on the evaluation metrics precision, recall, F1-score, and accuracy. The experimental results demonstrate that ensemble learning models are more effective for software defect prediction than other machine learning techniques.

VI. CONCLUSION

It is concluded that software testing is among the most important ways to ensure quality in a software system. In this paper, we therefore proposed an ensemble learning model for predicting defects in software. The proposed model is trained on the PROMISE datasets. Its performance in terms of accuracy, precision, recall, and F1-score is compared with other prominent machine learning algorithms, namely SVM, KNN, decision tree, and random forest, and the proposed ensemble learning model produces better results. The classification accuracy of the ensemble model is 98.56% on CM1, 98.18% on KC2, and 99.27% on PC1. The proposed ensemble learning model can help software engineers detect and predict software defects efficiently and early.

ACKNOWLEDGMENT

This work was supported by the Gulf University for Science and Technology (GUST) under grant number 223565. A preprint of this paper is also available online [29].

REFERENCES

[1] Felderer, M., Ramler, R. “Risk Orientation in Software Testing Processes of Small and Medium Enterprises: An Exploratory and Comparative Study” Software Quality Journal 24, 519–548 (2016). https://doi.org/10.1007/s11219-015-9289-z.

[2] Kassab, Mohamad, Joanna F. DeFranco, and Phillip A. Laplante. “Software Testing: The State of the Practice” IEEE Software 34.5 (2017): 46-52.

[3] Bishnu, Partha S., and Vandana Bhattacherjee. “Software Fault Prediction Using Quad Tree-Based K-Means Clustering Algorithm” IEEE Transactions on Knowledge and Data Engineering 24.6 (2011): 1146-1150.

[4] Park, Mikyeong, and Euyseok Hong. “Software Fault Prediction Model Using Clustering Algorithms Determining the Number of Clusters Automatically” International Journal of Software Engineering and Its Applications 8.7 (2014): 199-204.

[5] Hong, Euyseok, and Mikyeong Park. “Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms” KIPS Transactions on Software and Data Engineering 3.2 (2014): 57-64.

[6] Hong, Euyseok. “Severity-based Fault Prediction Using Unsupervised Learning” The Journal of The Institute of Internet, Broadcasting and Communication 18.3 (2018): 151-157. doi: 10.1109/TSE.2014.2322358.

[7] Chen, Mingming, and Yutao Ma. “An Empirical Study on Predicting Defect Numbers” In SEKE, pp. 397-402. 2015.

[8] Shepperd, Martin, David Bowes, and Tracy Hall. “Researcher Bias: The Use of Machine Learning in Software Defect Prediction” IEEE Transactions on Software Engineering 40, no. 6 (2014): 603-616.

[9] Z. Tian, J. Xiang, S. Zhenxiao, Z. Yi and Y. Yunqiang, “Software Defect Prediction Based on Machine Learning Algorithms” 2019 IEEE 5th International Conference on Computer and Communications (ICCC), Chengdu, China, 2019, pp. 520-525, doi: 10.1109/ICCC47050.2019.9064412.

[10] V. Garousi, R. Özkan, and A. Betin-Can, “Multi-Objective Regression Test Selection in Practice: An Empirical Study in the Defense Software Industry” Information and Software Technology, vol. 103, pp. 40-54, 2018.

[11] Manjula, C., and Lilly Florence. “Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique” International Journal of Computer and Information Engineering 12.1 (2018): 28-32.

[12] Laradji, Issam H., Mohammad Alshayeb, and Lahouari Ghouti. “Software Defect Prediction Using Ensemble Learning on Selected Features” Information and Software Technology 58 (2015): 388-402.

[13] Arar, Ömer Faruk, and Kürşat Ayan. “Software Defect Prediction Using Cost-Sensitive Neural Network” Applied Soft Computing 33 (2015): 263-277.

[14] Siers, Michael J., and Md Zahidul Islam. “Software Defect Prediction Using a Cost Sensitive Decision Forest and Voting, and a Potential Solution to the Class Imbalance Problem” Information Systems 51 (2015): 62-71.

[15] P. Deep Singh and A. Chug, “Software Defect Prediction Analysis Using Machine Learning Algorithms” 7th International Conference on Cloud Computing, Data Science Engineering - Confluence, Noida, 2017, pp. 775-781, doi: 10.1109/CONFLUENCE.2017.7943255.

[16] Sayyad Shirabad, J. and Menzies, T.J. (2005) “The PROMISE Repository of Software Engineering Databases” School of Information Technology and Engineering, University of Ottawa, Canada. Available: http://promise.site.uottawa.ca/SERepository

[17] Manjula, C., and Lilly Florence. “Deep Neural Network-Based Hybrid Approach for Software Defect Prediction Using Software Metrics” Cluster Computing 22.4 (2019): 9847-9863.

[18] Ni, C., Chen, X., Wu, F., Shen, Y., Gu, Q. “An Empirical Study on Pareto Based Multiobjective Feature Selection for Software Defect Prediction” Journal of Systems and Software, 152, (2019), 215-238.

[19] Huda, S., Liu, K., Abdelrazek, M., Ibrahim, A., Alyahya, S., Al-Dossari, H., Ahmad, S. “An Ensemble Oversampling Model for Class Imbalance Problem in Software Defect Prediction” IEEE Access, 6, 24184-24195. (2018).

[20] “RapidMiner Documentation” Last Accessed (June 2020) https://docs.rapidminer.com/latest/studio/operators/modeling/predictive/ensembles/classification_by_regression.html

[21] J. Friedman, T. Hastie, R. Tibshirani, “The Elements of Statistical Learning” volume 1, Springer Series in Statistics, New York, 2001.

[22] Müller, Andreas C., and Sarah Guido. “Introduction to Machine Learning with Python: A Guide for Data Scientists” O’Reilly Media, Inc., 2016.

[23] Wang, Tiejian, Zhiwu Zhang, Xiaoyuan Jing, and Liqiang Zhang. “Multiple Kernel Ensemble Learning for Software Defect Prediction” Automated Software Engineering 23, no. 4 (2016): 569-590.

[24] Khan, Shahzad Ali, and Zeeshan Ali Rana. “Evaluating Performance of Software Defect Prediction Models Using Area Under Precision-Recall Curve (AUC-PR)” In 2019 2nd International Conference on Advancements in Computational Sciences (ICACS), pp. 1-6. IEEE, 2019.

[25] “RapidMiner Documentation” Last Accessed (June 2020) https://docs.rapidminer.com/latest/studio/operators/validation/performance

[26] Reshi, Junaid Ali, and Satwinder Singh. “Predicting Software Defects through SVM: An Empirical Approach” ArXiv abs/1803.03220 (2018): n. pag.

[27] Gong, Lina, Shujuan Jiang, Rongcun Wang, and Li Jiang. “Empirical Evaluation of the Impact of Class Overlap on Software Defect Prediction” In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 698-709. IEEE, 2019.

[28] Ge, Jianxin, Jiaomin Liu, and Wenyuan Liu. “Comparative Study on Defect Prediction Algorithms of Supervised Learning Software Based on Imbalanced Classification Data Sets” In 2018 19th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), pp. 399-406. IEEE, 2018.

[29] Nawaz, Ali, Attique Ur Rehman, and Muhammad Abbas. “A Novel Multiple Ensemble Learning Models Based on Different Datasets for Software Defect Prediction” arXiv preprint arXiv:2008.13114 (2020).


Context: Executing an entire regression test-suite after every code change is often costly in large software projects. To cope with this challenge, researchers have proposed various regression test-selection techniques. Objective: This paper was motivated by a real industrial need to improve regression-testing practices in the context of a safety-critical industrial software in the defence domain in Turkey. To address our objective, we set up and conducted an “action-research” collaborative project between industry and academia. Method: After a careful literature review, we selected a conceptual multi-objective regression-test selection framework (called MORTO) and adapted it to our industrial context by developing a custom-built genetic algorithm (GA) based on that conceptual framework. The GA is able to provide full coverage of the affected (changed) requirements while considering multiple cost and benefit factors of regression testing, e.g., minimizing the number of test cases and maximizing the cumulative number of faults detected by each test suite. Results: The empirical results of applying the approach on the Software Under Test (SUT) demonstrate that this approach yields a more efficient test suite (in terms of costs and benefits) compared to the old (manual) test-selection approach used in the company and to another applicable approach chosen from the literature. With this new approach, the regression selection process in the project under study is no longer ad hoc. Furthermore, we have been able to eliminate the subjectivity of regression testing and its dependency on expert opinions. Conclusion: Since the proposed approach has been beneficial in saving the costs of regression testing, it is currently in active use in the company. We believe that other practitioners can apply our approach in their regression-testing contexts too, when applicable. Furthermore, this paper contributes to the body of evidence in regression testing by offering a success story of the implementation and application of multi-objective regression testing in practice.

An Ensemble Oversampling Model for Class Imbalance Problem in Software Defect Prediction

Article

Full-text available

Mar 2018

Software systems are now ubiquitous and are used every day for automation purposes in personal and enterprise applications; they are also essential to many safety-critical and mission-critical systems, e.g., air traffic control systems, autonomous cars, and SCADA systems. With the availability of massive storage capabilities, high-speed Internet, and the advent of Internet of Things (IoT) devices, modern software systems are growing in both size and complexity. Maintaining high quality in such complex systems while manually keeping the error rate at a minimum is a challenge. Therefore, automated detection of faulty components in a software system is important during software development and also post-delivery. Fault detection models usually need to be trained on a labeled, balanced dataset with both faulty and non-faulty samples. Earlier work, e.g. Mohsin et al. (2016), showed that most real fault detection training datasets are imbalanced. As a result, the trained model overfits and classifies faulty components as non-faulty. The consequence of a high false negative rate is cumulative: it generates more errors when the model is used on other, previously unseen software systems, which is very expensive. In this paper, we propose a software defect prediction ensemble model which considers the class imbalance problem in real software datasets. We use different oversampling techniques to build an ensemble classifier that can reduce the effect of the scarcity of minority samples in the defective data. The proposed approach is verified using the PROMISE software engineering dataset. The results show that our ensemble oversampling technique reduces the false negative rate more than standard classification techniques and identifies the faulty components more accurately, resulting in a less expensive detection system (lowering the rate of non-faulty predictions of faulty modules).
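To make the class imbalance problem and the false negative rate mentioned in this abstract concrete, the sketch below shows plain random oversampling of the minority (defective) class and a false negative rate computation. Random oversampling is only the simplest member of the family of oversampling techniques; the abstract does not say which techniques the authors combine, so this is an illustrative assumption, and the names `random_oversample` and `false_negative_rate` are hypothetical.

```python
import random


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate random minority rows until both classes have equal size.

    Assumes binary labels; returns new lists and leaves the inputs untouched.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_count = len(labels) - len(minority_idx)
    if not minority_idx:
        return list(rows), list(labels)
    needed = max(0, majority_count - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(needed)]
    new_rows = list(rows) + [rows[i] for i in extra]
    new_labels = list(labels) + [minority] * len(extra)
    return new_rows, new_labels


def false_negative_rate(y_true, y_pred, positive=1):
    """Fraction of truly defective modules that were predicted as clean."""
    fn = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p != positive)
    tp = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p == positive)
    return fn / (fn + tp) if (fn + tp) else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: four clean modules and one defective module.
    rows = [[10], [12], [11], [9], [40]]
    labels = [0, 0, 0, 0, 1]
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(balanced_labels.count(0), balanced_labels.count(1))
    print(false_negative_rate([1, 1, 0, 1], [1, 0, 0, 0]))
```

Oversampling should be applied to the training split only; duplicating minority rows before splitting would leak copies of the same module into the test set.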

Deep neural network based hybrid approach for software defect prediction using software metrics

Article

Full-text available

Jul 2019
CLUSTER COMPUT

In the field of early prediction of software defects, various techniques have been developed, such as data mining and machine learning techniques. Early prediction of defects is still a challenging task that needs to be addressed, and it can be improved by achieving a higher classification rate. To address this issue, we introduce a hybrid approach that combines a genetic algorithm (GA) for feature optimization with a deep neural network (DNN) for classification. An improved version of the GA is incorporated, which includes a new technique for chromosome design and fitness function computation. The DNN technique is also improved using an adaptive auto-encoder, which provides a better representation of the selected software features. The improved efficiency of the proposed hybrid approach due to the optimization technique is demonstrated through case studies. An experimental study is carried out for software defect prediction on the PROMISE dataset using the MATLAB tool. In this study, we use the proposed method for classification and defect prediction. A comparative study shows that the proposed approach performs better than other techniques: 97.82% accuracy is obtained for the KC1 dataset, 97.59% for the CM1 dataset, 97.96% for the PC3 dataset, and 98.00% for the PC4 dataset.

Software Testing: The State of the Practice

Article

Full-text available

Jan 2017

A Web-based survey examined how software professionals used testing. The results offer opportunities for further interpretation and comparison to software testers, project managers, and researchers. The data includes characteristics of practitioners, organizations, projects, and practices.

Predicting Software Defects Through SVM: An Empirical Approach

Article

Full-text available

Aug 2017

Software defect prediction is an important aspect of preventive software maintenance. Many techniques have been employed to improve software quality through defect prediction. This paper introduces an approach to defect prediction through a machine learning algorithm, support vector machines (SVM), using code smells as the factor. A smell prediction model based on support vector machines was used to predict defects in subsequent releases of the Eclipse software. The results signify the role of smells in predicting software defects. The results can also be used as a baseline for further investigation of the role of smells in predicting defects.

Software Defect Prediction based on Machine Learning Algorithms

Conference Paper

Dec 2019

An Empirical Study on Pareto based Multi-objective Feature Selection for Software Defect Prediction

Article

Mar 2019
J SYST SOFTWARE

The performance of software defect prediction (SDP) models depends on the quality of the software features considered. Redundant and irrelevant features may reduce the performance of the constructed models, so feature selection methods are required to identify and remove them. Previous studies mostly treat feature selection as a single-objective optimization problem, and multi-objective feature selection for SDP has not been thoroughly investigated. In this paper, we propose a novel method, MOFES (Multi-Objective FEature Selection), which takes two optimization objectives into account. One objective is to minimize the number of selected features, which relates to the cost analysis of this problem. The other objective is to maximize the performance of the constructed SDP models, which relates to the benefit analysis of this problem. MOFES utilizes Pareto based multi-objective optimization algorithms (PMAs) to solve this problem. In our empirical study, we design and conduct experiments on the RELINK and PROMISE datasets, which are gathered from real open source projects. Firstly, we analyze the influence of different PMAs on MOFES and find that NSGA-II achieves the best performance on both datasets. Then, we compare MOFES with 22 state-of-the-art filter based and wrapper based feature selection methods, and find that MOFES can effectively select fewer but closely related features to construct high-quality models. Moreover, we analyze the features frequently selected by MOFES, and these findings can be used to provide guidelines on gathering high-quality SDP datasets. Finally, we analyze the computational cost of MOFES and find that MOFES needs only 107 seconds on average.
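The two objectives named in this abstract (fewer selected features, higher model performance) can be illustrated with a Pareto dominance check. The sketch below filters a list of candidate feature subsets, each summarized as a pair of feature count and performance score, down to its non-dominated set. It is not MOFES or NSGA-II, only the dominance relation such algorithms build on; the function names and the candidate values are assumptions made for illustration.

```python
def dominates(a, b):
    """Return True if candidate a Pareto-dominates candidate b.

    Each candidate is (n_features, score): fewer features is better,
    a higher score is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


if __name__ == "__main__":
    # Hypothetical (number of selected features, model score) pairs.
    candidates = [(3, 0.70), (5, 0.80), (5, 0.75), (8, 0.80), (10, 0.85)]
    print(pareto_front(candidates))
```

Here `(5, 0.75)` and `(8, 0.80)` are dropped because `(5, 0.80)` is at least as good on both objectives and strictly better on one; the remaining candidates represent different trade-offs between cost and benefit.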

Comparative Study on Defect Prediction Algorithms of Supervised Learning Software Based on Imbalanced Classification Data Sets

Conference Paper

Jun 2018

Recommended publications

Machine learning-based early detection of diabetes risk factors for improved health management

A Data Preprocessing and Stacking Ensemble Learning Model for Improved CHD Prediction

Comparative Analysis of Machine Learning Models for Predicting Diabetes: Unveiling the Superiority o...

A Novel Multiple Ensemble Learning Models Based on Different Datasets for Software Defect Prediction