CNN-Based Priority Prediction of Bug Reports

R.M.D.S Rathnayake
Department of Computing and
Information Systems,
Sabaragamuwa University of Sri Lanka
Belihuloya, Sri Lanka
dilkisr@gmail.com
B.T.G.S Kumara
Department of Computing and
Information Systems,
Sabaragamuwa University of Sri Lanka
Belihuloya, Sri Lanka
btgsk2000@gmail.com
E.M.U.W.J.B Ekanayake
Department of Computer Science and
Informatics,
Uva Wellassa University of Sri Lanka
Sri Lanka
jayalath@uwu.ac.lk
Abstract: Priority prediction is an essential part of software
maintenance. Thousands of bugs are reported daily in bug
tracking systems (BTS) such as Bugzilla, JIRA, and GitHub.
Priority assignment for the reported bugs is conducted manually,
so the task takes considerable time and is prone to mistakes.
It is therefore important to predict a bug report's priority
automatically. Our study proposes a model based on the
Convolutional Neural Network (CNN) to predict the bug report's
priority. First, we preprocess the textual content of bug
reports using natural language processing (NLP) techniques.
Then we extract features from the textual content (short
description) using the Bag-of-Words feature extraction method.
Finally, we train a CNN-based classifier to predict priority
from these features. We compare the results with the Support
Vector Machine (SVM) and Temporal Convolutional Network (TCN)
to find the better model for priority prediction. The results
show that the proposed CNN-based approach performs best,
reaching 71% accuracy compared with 63% for SVM and 48% for
TCN. The proposed model's performance was evaluated using a
Bugzilla dataset that included more than 20,000 bug reports.
Keywords— Bug reports, Priority prediction, CNN, TCN,
SVM
I. INTRODUCTION
Both open-source and closed-source software projects contain
many bugs, and these bugs can reduce the quality and
performance of software systems. In practice, it is impossible
to create defect-free software. Bugs are common in software,
and many projects are delivered with defects [1]. To improve
the next version of a system, developers let users report bugs
in a bug tracking system (BTS). Developers use the BTS to
manage bug reports and triage them [2]. By reporting bugs,
users support developers in resolving defects. This is a common
procedure in software maintenance [3], which is one of the
essential phases of software development. Resolving reported
issues is important, expensive, and critical because complex
systems are growing explosively [4].
A bug report includes details that help to reproduce the
bug. A standard bug report includes several predefined
fields: bugID, creation time, summary, description, bug
status, resolution, priority, and severity. The severity of a
defect describes its effect on and scope within the system,
whereas a bug's priority defines the order in which developers
should resolve it. In Bugzilla, a bug report's priority ranges
from P1 to P5, with P1 being the highest priority and P5 the
lowest. The severity of a bug report can be trivial, minor,
normal, major, critical, or blocker. After a bug is reported,
a triager examines it and manually assigns its priority and
severity. This process is known as bug triaging, and it is a
manual process that requires considerable time [4].
In some cases, priority and severity are left blank due to
a lack of experience and technical understanding. Bug
triaging is time-consuming and needs domain knowledge.
As a result, determining priority and severity should be
automated.
Researchers have developed different methods to automate the
severity prediction of bug reports [5–12]. Most of these
methods use standard machine learning algorithms such as
Naïve Bayes (NB), decision trees, support vector machines
(SVM), and J48. Researchers have also developed deep
learning-based methods for severity prediction [2], [14],
[15]. Most of them used a CNN-based approach and compared its
performance with that of several machine learning algorithms.
Various automated techniques to predict the bug report's
priority have been presented [16-18]. However, their
predictions are not sufficiently accurate, and their
performance requires significant improvement. A variety of
neural network-based techniques to predict the bug report's
priority have also been presented [1], [18-21].
However, the performance of these approaches also needs to be
significantly improved. W. Y. Ramay et al. [2] used the
emotions of reporters to predict severity. According to Umer
et al. [3], reporters express negative emotions more often for
severe bugs than for non-severe bugs. In other words,
reporters are expressive when writing bug reports. Such
sentiments (emotions) could aid in predicting the priority and
severity of bug reports.
The rest of the paper is structured as follows. Part II
reviews the literature. Part III describes the details of our
proposed approach. Part IV describes the evaluation process
and results of the proposed approach. Part V concludes the
paper and outlines future work. Many researchers have proposed
approaches to prioritize bug reports, and most of them report
good results. Recent studies consider emotion analysis in the
priority prediction of bug reports.
II. LITERATURE REVIEW
Researchers have proposed both traditional and novel
approaches to predict the priority and severity of bug
reports. Most of the traditional approaches are based on
machine learning algorithms. In recent years, researchers have
focused on deep neural network-based approaches.
2021 International Conference on Decision Aid Sciences and Application (DASA)
978-1-6654-1634-4/21/$31.00 ©2021 IEEE 299
Pushpalatha and Mrunalini [9] proposed a bagging ensemble
approach to predict the bug report's severity. The proposed
method is also comparable to the C4.5 classifier.
The authors of [10] used several machine learning (ML)
approaches to determine a defect's severity from the textual
description of the bug report. They observed that the
performance of the ML approaches becomes steady once the
number of terms reaches 125 or more.
Zhang et al. [11] proposed a concept profile-based technique
that predicts the severity of a given bug report by analyzing
historical bug reports and building concept profiles (CPs)
from them. With higher recall, precision, and F-measure, the
proposed approach predicts the severity of a given bug more
successfully than Naïve Bayes (NB), NB Multinomial, and
k-nearest neighbor (KNN).
Kaur and Jindal [12] compared multiple ML algorithms at two
different levels for predicting the severity of bugs in
software systems. According to the findings, predicting
severity at the component level yields better outcomes than
predicting severity at the system level.
A CNN combined with Random Forest with Boosting has been
proposed to improve the performance of binary bug severity
classification over state-of-the-art techniques [13]. It is a
new deep learning model for multiclass severity
classification.
In [3], [22], and [23], the authors introduced ML-based
models for the priority prediction of bug reports. In [3],
they proposed an approach that combines NLP techniques with
machine learning algorithms. They proposed the "DRONE"
approach, which focuses on emotional value to predict the
priority of bug reports.
P. A. Choudhary [21] developed priority prediction models
using neural networks and text classification techniques. The
study found that textual, temporal, author-related, severity,
product, and component features affect a bug's priority.
The researchers in [2] used a deep learning-based automatic
technique. First, they apply natural language processing
(NLP) techniques to preprocess the text of bug reports.
Second, they compute and assign an emotion score for each bug
report. Third, they generate a vector for every preprocessed
bug report. Fourth, they pass the vector and the emotion score
of each bug report to a deep learning-based classifier for
severity prediction. The cross-product results show that the
proposed method predicts the bug report's severity better than
state-of-the-art approaches.
In [1], Bani-Salameh et al. used an RNN-LSTM approach to
predict the bug report's priority and compared the results
with SVM and KNN. They concluded that the LSTM accurately
predicts and assigns the priority of bugs.
Neural network-based approaches are used in [1] and [18-20]
to predict the bug report's priority. P. A. Choudhary [18]
proposed a Multilayer Perceptron (MLP) based approach to
priority prediction and showed that the MLP performs better
than the Naïve Bayes classifier. The study in [18] used
several fields of the bug reports for classification, Zhang
and Challis [19] used the description field to extract
features, and the authors of [1] identified four fields that
can affect the priority level of bug reports and used these
fields in their study. RNN-LSTM [1] and CNN-based approaches
[20] were also used in priority prediction.
In conclusion, researchers have developed several ML
algorithms for predicting the priority and severity of bug
reports. Only a few studies predict the priority of bug
reports using deep learning-based classification, and most of
those combine emotion analysis with the deep learning-based
classifier. The approach proposed in this work differs from
the existing approaches. Most researchers consider semantic
analysis with the deep neural classifier, but we do not
consider semantic analysis in our study. We apply a deep
neural network-based approach together with NLP techniques
and feature extraction to predict the bug report's priority.
III. PROPOSED APPROACH
A. Overview
Fig. 1 shows an overview of bug report priority prediction
using deep neural networks. Our approach predicts the bug
report's priority as follows:
First, we collect a dataset for our research. We extract the
bug reports from several open-source projects.
Second, we preprocess the dataset using NLP techniques.
Third, we extract features from the short description of each
bug report using the Bag-of-Words feature extraction method.
Finally, we develop deep learning-based classifiers (i.e.,
CNN-based classifiers) for priority prediction.
Fig. 1. An overview of the proposed approach
B. Data Acquisition
Bugs in software projects are reported to bug tracking
systems, which allow engineers to keep track of reported
bugs in real time. As previously stated, we used a dataset
taken from Bugzilla for this work. The dataset consists of
bug reports extracted from four open-source projects
(Eclipse, Netbeans, Mozilla, and Open Office) and contains
more than 20,000 bug reports. Table I lists the total number
of bug reports downloaded from each of the open-source
software projects.
The dataset has 11 columns: bug id, description,
classification, product, platform, component, operating
system, bug status, resolution, priority, and severity. From
the selected dataset, we used a single feature, the short
textual description of the bug report, because we consider
the description to be the most suitable part of a bug report
for predicting its priority.
Our study filters the dataset according to the total
number of bugs related to the priority level.
TABLE I. TOTAL NUMBER OF BUG REPORTS
Project        Total Number of Bug Reports
Eclipse        8,478
Mozilla        4,165
Netbeans       4,305
Open Office    4,553
Total          21,501
After the filtering process, we retain bug reports with
priority P1, P2, P4, or P5 and eliminate those with priority
P3. P3 is the default value for the priority of a bug report,
so developers do not give it much consideration when they
assign P3 to a bug. The majority of bug reports are labeled
P3, so removing them removes most of the bugs from the
database. Fig. 2 shows the total number of bug reports for
the selected priority levels.
Fig. 2. Total number of bug reports for the selected priority level
Many bug reports had an empty priority column or were
assigned P3. For research purposes, we removed those bug
reports from our dataset.
The extracted dataset consists of features such as bugID,
short description, classification, product, component,
resolution, operating system, severity, bug status, and
priority. We select the short description, which is a brief
text, as the main feature. The priority prediction technique
in this study is therefore based primarily on the priority
and the short description of the bug report.
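The filtering step described above can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout and the field names (`bug_id`, `short_desc`, `priority`) are assumptions, and the sample reports are invented for the example.

```python
from collections import Counter

# Hypothetical records; field names are illustrative, not the authors' schema.
reports = [
    {"bug_id": 1, "short_desc": "Crash when opening editor", "priority": "P1"},
    {"bug_id": 2, "short_desc": "Typo in preferences dialog", "priority": "P3"},
    {"bug_id": 3, "short_desc": "Toolbar icon misaligned", "priority": "P5"},
    {"bug_id": 4, "short_desc": "Build fails on clean checkout", "priority": ""},
    {"bug_id": 5, "short_desc": "Slow search in large workspace", "priority": "P2"},
]

# P3 (the default value) and empty priorities are dropped, as in the text.
KEPT_PRIORITIES = {"P1", "P2", "P4", "P5"}


def filter_reports(rows):
    """Keep only reports whose priority is one of the selected levels."""
    return [r for r in rows if (r.get("priority") or "").strip() in KEPT_PRIORITIES]


filtered = filter_reports(reports)
counts = Counter(r["priority"] for r in filtered)
print(len(filtered), dict(counts))
```

Counting the remaining reports per priority level gives the kind of distribution that Fig. 2 reports for the real dataset.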
C. Preprocessing
This part goes over each step of the NLP technique used to
preprocess the text (short description) of bug reports. The
majority of bug reports include unnecessary and irrelevant
content, so we preprocess them to enhance the performance of
our approach. Most researchers use the same NLP steps to
preprocess bug reports. To clean the textual information in
bug reports, we used the following preprocessing steps in
this research:
Tokenization: This is the process of breaking text into
sentences, clauses, and words. Bug reports are frequently
written as a combination of words and symbols that carry no
meaning on their own, such as punctuation marks and spaces.
Tokenization therefore removes these symbols and breaks the
text down into tokens.
Stop-word removal: Words such as "the," "in," "am," "our,"
"is," "I," "he," and "that" are frequently used stop-words
that carry no meaning individually. In the context of bug
reports, such terms do not contain important information and
may reduce the classification's efficiency. As a result,
these terms are removed from the tokens produced in the
first preprocessing step.
Stemming: The process of reducing words to their stems is
known as stemming. Each word in a bug report's description
can take several different forms. All selected words are
stemmed and converted to their root words. Stemming is
highly important in the fields of text mining and
information retrieval. For instance, the words "give,"
"gives," "gave," and "given" can all be replaced with a
single word, "give." Stemming can be done using a variety of
algorithms. We employ Porter's stemming algorithm [3], which
is widely used by researchers.
Finally, all preprocessed terms are referred to as
"features," and these terms are used to develop the priority
prediction model.
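The three preprocessing steps can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions: the stop-word list is a small sample, and `crude_stem` is a toy suffix stripper standing in for Porter's algorithm, which the paper uses but which is not reimplemented here.

```python
import re

# Small illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {"the", "in", "am", "our", "is", "i", "he", "that",
              "a", "an", "to", "of", "and", "on", "when"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Toy suffix stripper; NOT Porter's algorithm, only a stand-in for it."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop-words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


features = preprocess("The editor crashes when opening reports in the workspace!")
print(features)
```

In practice an established stemmer implementation would replace `crude_stem`; the pipeline shape (tokenize, filter, stem) stays the same.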
The features extracted from the description of the bug
reports can be used to characterize bug reports in a
classification problem. We gather the bug reports from
Bugzilla and construct a model to predict the priority of a
new bug. We review the bug reports and look for feature
words in the description of each bug. Creating a prediction
model requires converting the text of a bug report into a
Bag-of-Words feature vector. Each description in the dataset
is converted into a vector of features that represent the
terms in the dataset.
TABLE II. PERFORMANCE ON PRIORITY LEVEL
Used      P1                           P2                           P4                           P5
Approach  Precision  Recall  F1-score  Precision  Recall  F1-score  Precision  Recall  F1-score  Precision  Recall  F1-score
CNN       58%        58%     58%       74%        70%     72%       72%        81%     76%       79%        74%     76%
TCN       34%        48%     40%       45%        46%     45%       46%        31%     37%       70%        65%     68%
SVM       51%        63%     57%       67%        59%     63%       64%        70%     67%       77%        59%     67%
D. Feature Extraction
After the text is cleaned and normalized, we need to
transform it into features for modeling. Bag-of-Words (BoW)
is a method of extracting features from text for modeling
purposes. It is a text representation that describes the
occurrence of words within the text. In this approach, we
consider each word count as a feature.
To obtain the Bag-of-Words, we first perform the
preprocessing steps and generate the set of all available
words before sending the data for modeling. The text in bug
reports is messy and unstructured, whereas training our
model requires structured, well-defined, fixed-length
inputs. Using the Bag-of-Words method, the unstructured text
can be converted into a fixed-length vector.
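A minimal sketch of the Bag-of-Words conversion is given below. It assumes the descriptions have already been preprocessed into token lists; the function names and the two sample documents are hypothetical and serve only to show how variable-length text becomes a fixed-length count vector.

```python
from collections import Counter


def build_vocabulary(tokenized_docs):
    """Map every distinct token to a fixed column index (sorted for determinism)."""
    vocab = sorted({tok for doc in tokenized_docs for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(tokens, vocabulary):
    """Return a fixed-length count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokens)
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical preprocessed short descriptions.
docs = [
    ["editor", "crash", "open", "report"],
    ["crash", "crash", "save", "report"],
]
vocabulary = build_vocabulary(docs)
vectors = [bag_of_words(d, vocabulary) for d in docs]
print(vocabulary)
print(vectors)
```

Every vector has the same length as the vocabulary, regardless of how many words the original description contained, which is the fixed-length property the classifier needs.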
E. CNN-Based Classifier
In this research, we developed a one-dimensional (1-D) CNN
model to predict the bug report's priority. After
preprocessing the data, we take the features from the short
description of the bug reports and load them into the 1-D
CNN model. We first define the model and provide the
required three-dimensional input (samples, time steps, and
features). For this model, we used 32 parallel feature maps
and a kernel size of 5. The kernel size is the number of
input time steps considered at once when the input sequence
is read into the feature maps, whereas the number of feature
maps is the number of times the input is processed or
interpreted. The model is fit for a fixed number of epochs
(400) with a batch size of 32 samples.
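To make the roles of the kernel size and the feature maps concrete, the sketch below applies a plain 1-D convolution in pure Python. It is not the authors' model: there is no training, the kernel weights are arbitrary fixed values, and the 12-step input sequence is invented. Only the kernel size (5) and the number of feature maps (32) come from the text; no padding and a stride of 1 are assumptions.

```python
KERNEL_SIZE = 5   # from the text
NUM_FILTERS = 32  # from the text


def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide one kernel over a 1-D sequence (no padding, stride 1)."""
    k = len(kernel)
    out = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out


# Hypothetical input: 12 time steps with one feature per step.
sequence = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0, 1, 0]

# Arbitrary fixed kernels; in a trained CNN these weights are learned.
kernels = [[(f + j) % 3 - 1 for j in range(KERNEL_SIZE)] for f in range(NUM_FILTERS)]

# Each kernel reads the whole sequence once and yields one feature map.
feature_maps = [conv1d_valid(sequence, kern) for kern in kernels]
print(len(feature_maps), len(feature_maps[0]))
```

Each kernel looks at 5 consecutive time steps at a time, so a 12-step input yields 12 - 5 + 1 = 8 outputs per feature map, and the 32 kernels give 32 parallel readings of the same input. A real implementation would use a deep learning library and train these weights for the 400 epochs mentioned above.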
IV. EVALUATION
A. Metrics
After the classifier has been trained and the priority
prediction model developed, its performance should be
measured on a test dataset. In this research, we compute the
priority-specific Accuracy, Precision, Recall, and F1-score
to measure the performance of CNN, TCN, and SVM on the given
bug reports.
In this work, we consider four priority classes (P1, P2, P4,
and P5). We therefore conduct macroanalysis and
microanalysis over the set C of priority levels. We
calculate macro precision (Precision_mac), macro recall
(Recall_mac), macro F1-score (F1-score_mac), micro precision
(Precision_mic), micro recall (Recall_mic), and micro
F1-score (F1-score_mic).
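The macro and micro measures can be sketched as below. The paper does not spell out its formulas, so this follows one common convention and should be read as an assumption: macro values average the per-class scores (macro F1 is the mean of per-class F1), and micro values are computed from true positives, false positives, and false negatives pooled over all classes. The labels and predictions are invented.

```python
def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return safe_div(2 * p * r, p + r)


def macro_micro(y_true, y_pred, labels):
    """Return ((macro P, R, F1), (micro P, R, F1)) for single-label predictions."""
    tp, fp, fn = {}, {}, {}
    for c in labels:
        tp[c] = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp[c] = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn[c] = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    prec = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    rec = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    n = len(labels)
    macro = (sum(prec) / n, sum(rec) / n,
             sum(f1_score(p, r) for p, r in zip(prec, rec)) / n)
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_p = safe_div(tp_all, tp_all + fp_all)
    micro_r = safe_div(tp_all, tp_all + fn_all)
    micro = (micro_p, micro_r, f1_score(micro_p, micro_r))
    return macro, micro


# Hypothetical ground truth and predictions for the four priority levels.
labels = ["P1", "P2", "P4", "P5"]
y_true = ["P1", "P1", "P2", "P2", "P4", "P4", "P5", "P5"]
y_pred = ["P1", "P2", "P2", "P2", "P4", "P5", "P5", "P5"]

macro, micro = macro_micro(y_true, y_pred, labels)
print(macro, micro)
```

Macro averaging weights every priority level equally, while micro averaging weights every bug report equally, so the two can differ when the classes are imbalanced.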
B. Results
This research compares the CNN, TCN, and SVM approaches for
the priority prediction of bug reports. We use macroanalysis
and microanalysis to determine how much CNN improves
performance across all classes. We also make a
priority-level comparison to evaluate CNN's performance at
each priority level.
1) Comparison on Priority Level:
Table II shows the evaluation results at every priority
level for the three approaches. The first column in Table II
lists the approaches, whereas columns 2-4, 5-7, 8-10, and
11-13 list the Precision, Recall, and F1-score for priority
levels P1, P2, P4, and P5, respectively.
In terms of F1-score, CNN performs better than TCN and SVM
at every priority level.
The F1-score improvement of CNN over SVM ranges from
1.75% = (58%-57%)/57% to 14.29% = (72%-63%)/63%.
2) Comparison of Macroanalysis and Microanalysis:
Table III shows the evaluation results of macroanalysis and
microanalysis. The first column lists the three approaches
used in this study. Columns 2-4 and 5-7 present the results
of macroanalysis and microanalysis, respectively. The
following observations can be derived from Table III:
CNN outperforms TCN and SVM in both macroanalysis and
microanalysis, and SVM outperforms TCN in both.
For both macroanalysis and microanalysis, the F1-score
improvement of CNN over SVM is 12.70% = (71%-63%)/63%.
Moreover, the F1-score improvement of CNN over TCN is
51.06% = (71%-47%)/47% for macroanalysis and
47.92% = (71%-48%)/48% for microanalysis.
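The relative improvements quoted above can be reproduced with a few lines of arithmetic. The helper name `relative_improvement` is hypothetical; the F1-scores are the values reported in Table III.

```python
def relative_improvement(new, baseline):
    """Percentage improvement of `new` over `baseline`, relative to the baseline."""
    return round(100.0 * (new - baseline) / baseline, 2)


# F1-scores (in percent) taken from Table III.
f1_macro = {"CNN": 71, "TCN": 47, "SVM": 63}
f1_micro = {"CNN": 71, "TCN": 48, "SVM": 63}

cnn_vs_svm = relative_improvement(f1_macro["CNN"], f1_macro["SVM"])
cnn_vs_tcn_macro = relative_improvement(f1_macro["CNN"], f1_macro["TCN"])
cnn_vs_tcn_micro = relative_improvement(f1_micro["CNN"], f1_micro["TCN"])
print(cnn_vs_svm, cnn_vs_tcn_macro, cnn_vs_tcn_micro)  # 12.7 51.06 47.92
```

The improvement is expressed relative to the baseline classifier, so the same absolute gap counts for more when the baseline score is lower.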
TABLE III. EVALUATION RESULTS OF MACRO & MICROANALYSIS
          Macro                        Micro
Approach  Precision  Recall  F1-score  Precision  Recall  F1-score
CNN       71%        71%     71%       71%        71%     71%
TCN       49%        47%     47%       49%        48%     48%
SVM       65%        63%     63%       65%        63%     63%
Table IV shows the accuracy of the three classifiers. The
first column lists the three approaches used in this work,
and the second column presents the accuracy of each
approach. CNN outperforms TCN and SVM by achieving 71%
accuracy. The final results show that our proposed deep
learning (CNN) approach gives the best overall results
across the performance metrics (Accuracy = 71%, Precision =
71%, Recall = 71%, F1-score = 71%).
TABLE IV. ACCURACY OF DIFFERENT CLASSIFICATIONS
Approach  Accuracy
CNN       71%
TCN       48%
SVM       63%
V. CONCLUSION AND FUTURE WORKS
Assigning a bug report's priority is a manual process, so
there is a high probability that the assigned priority is
incorrect. As a result, automating the prioritization of bug
reports is critical. For this study, the bug reports from
the four open-source projects were gathered using a Java
program. The extracted dataset comprises more than 20,000
bug reports, and it shrinks after filtering because bug
reports with the default priority value are removed. First,
the short descriptions of the bug reports are preprocessed
using the NLP steps discussed above. Second, features are
extracted using the Bag-of-Words feature extraction method.
Third, we developed models using CNN, TCN, and SVM. The
performance of the developed models is measured at the class
level using Accuracy, Precision, Recall, and F1-score. The
CNN-based approach outperformed the other approaches by
obtaining the highest accuracy of 71%. Therefore, the
proposed approach performs well in the priority prediction
of bug reports.
We hope to predict the severity and priority of bug
reports in the future using different feature extraction
methods and deep learning-based approaches. We hope to
use a different dataset than the one we used for this work.
REFERENCES
[1] H. Bani-Salameh, M. Sallam, and B. Al shboul, "A deep-learning-
based bug priority prediction using RNN-LSTM neural networks," e-Inform.
Softw. Eng. J., vol. 15, no. 1, 2021.
[2] W. Y. Ramay, Q. Umer, X. C. Yin, C. Zhu, and I. Illahi, "Deep neural
network-based severity prediction of bug reports," IEEE Access, vol. 7,
pp. 46846–46857, 2019.
[3] Q. Umer, H. Liu, and Y. Sultan, "Emotion based automated priority
prediction for bug reports," IEEE Access, vol. 6, pp. 35743–35752,
2018.
[4] Q. Umer, H. Liu, and I. Illahi, "CNN-based automatic prioritization of
bug reports," IEEE trans. reliab., vol. 69, no. 4, pp. 1341–1354, 2020.
[5] M. Sharma, M. Kumari, R. K. Singh, and V. B. Singh, "Multiattribute
based machine learning models for severity prediction in cross project
context," in Computational Science and Its Applications—ICCSA.
Cham, Switzerland: Springer, 2014, pp. 227–241
[6] T. Menzies and A. Marcus, "Automated severity assessment of software
defect reports," in Proc. IEEE Int. Conf. Softw. Maintenance, Sep./Oct.
2008, pp. 346–355.
[7] A. Lamkanfi, S. Demeyer, E. Giger, and B. Goethals, "Predicting the
severity of a reported bug," in Proc. 7th IEEE Working Conf. Mining
Softw. Repositories (MSR), May 2010, pp. 1–10
[8] Y. Tian, D. Lo, and C. Sun, "Information retrieval based nearest
neighbor classification for fine-grained bug severity prediction," in
Proc. 19th Work. Conf. Reverse Eng., Washington, DC, USA, Oct.
2012, pp. 215–224. doi: 10.1109/WCRE.2012.31.
[9] M. N. Pushpalatha and M. Mrunalini, "Predicting the severity of bug
reports using classification algorithms," in the 2016 International
Conference on Circuits, Controls, Communications and Computing
(I4C), 2016, pp. 1–4.
[10] K. K. Chaturvedi and V. B. Singh, "Determining Bug severity using
machine learning techniques," in 2012 CSI Sixth International
Conference on Software Engineering (CONSEG), 2012, pp. 1–6.
[11] T. Zhang, G. Yang, B. Lee, and A. T. S. Chan, "Predicting the severity
of bug report by mining bug repository with concept profile," in
Proceedings of the 30th Annual ACM Symposium on Applied
Computing - SAC '15, 2015.
[12] A. Kaur and S. G. Jindal, "Text analytics-based severity prediction of
software bugs for apache projects," Int. j. Syst. assur. eng. manag.,
vol. 10, no. 4, pp. 765–782, 2019.
[13] A. Kukkar, R. Mohana, A. Nayyar, J. Kim, B.-G. Kang, and N.
Chilamkurti, "A novel deep-learning-based Bug Severity
classification technique using convolutional neural networks and
random forest with boosting," Sensors (Basel), vol. 19, no. 13, p.
2964, 2019.
[14] A. Chauhan and R. Kumar, "Bug severity classification using
semantic feature with convolution neural network," in Advances in
Intelligent Systems and Computing, Singapore: Springer Singapore,
2020, pp. 327–335.
[15] Y. Tian, D. Lo, and C. Sun, "Drone: Predicting priority of reported
bugs by multi-factor analysis," in Proc. IEEE Int. Conf. Softw.
Maintenance (ICSM), Sep. 2013, pp. 200–209, doi: 10.1109/
ICSM.2013.31.
[16] J. Kanwal and O. Maqbool, "Bug prioritization to facilitate bug report
triage," J. Comput. Sci. Technol., vol. 27, no. 2, pp. 397–412, Mar.
2012, doi: 10.1007/s11390-012-1230-3
[17] L. Yu, W.-T. Tsai, W. Zhao, and F. Wu, "Predicting defect priority
based on neural networks," in Advanced Data Mining and
Applications, L. Cao, J. Zhong, and Y. Feng, Eds. Berlin, Germany:
Springer, 2010, pp. 356–367
[18] P. A. Choudhary, "Neural Network based bug priority prediction
model using text classification techniques," Int. j. adv. res. comput.
sci., vol. 8, no. 5, pp. 1315–1319, 2017.
[19] W. Zhang and C. Challis, "Automatic bug priority prediction using
DNN based regression," in Advances in Natural Computation, Fuzzy
Systems and Knowledge Discovery, Cham: Springer International
Publishing, 2020, pp. 333–340
[20] M. Kumari and V. B. Singh, "An improved classifier based on
entropy and deep learning for bug priority prediction," in Advances in
Intelligent Systems and Computing, Cham: Springer International
Publishing, 2020, pp. 571–580.
[21] P. A. Choudhary, "Neural Network-based bug priority prediction
model using text classification techniques," Int. j. adv. res. Comput.
Sci., vol. 8, no. 5, pp. 1315–1319, 2017.
[22] Pushpalatha, Mrunalini, and S. R. Bista, "Predicting the priority of
bug reports using classification algorithms," Indian j. comput. sci.
eng., vol. 11, no. 6, pp. 811–818, 2020.
[23] R. Malhotra, A. Dabas, H. A, and M. Pant, "A study on machine
learning applied to software bug priority prediction," in 2021 11th
International Conference on Cloud Computing, Data Science &
Engineering (Confluence), 2021.
2021 International Conference on Decision Aid Sciences and Application (DASA)
303
... Some researchers conducted their research using a deeplearning approach. A study [7] proposes a CNN-based model to predict bug report priority. The model preprocesses textual content, extracts features, and trains a CNN classifier to make priority predictions. ...
Article
Full-text available
Context: Predicting the priority of bug reports is an important activity in software maintenance. Bug priority refers to the order in which a bug or defect should be resolved. A huge number of bug reports are submitted every day. Manual filtering of bug reports and assigning priority to each report is a heavy process, which requires time, resources, and expertise. In many cases mistakes happen when priority is assigned manually, which prevents the developers from finishing their tasks, fixing bugs, and improve the quality. Objective: Bugs are widespread and there is a noticeable increase in the number of bug reports that are submitted by the users and teams' members with the presence of limited resources, which raises the fact that there is a need for a model that focuses on detecting the priority of bug reports, and allows developers to find the highest priority bug reports. This paper presents a model that focuses on predicting and assigning a priority level (high or low) for each bug report. Method: This model considers a set of factors (indicators) such as component name, summary, assignee, and reporter that possibly affect the priority level of a bug report. The factors are extracted as features from a dataset built using bug reports that are taken from closed-source projects stored in the JIRA bug tracking system, which are used then to train and test the framework. Also, this work presents a tool that helps developers to assign a priority level for the bug report automatically and based on the LSTM's model prediction. Results: Our experiments consisted of applying a 5-layer deep learning RNN-LSTM neural network and comparing the results with Support Vector Machine (SVM) and K-nearest neighbors (KNN) to predict the priority of bug reports. The performance of the proposed RNN-LSTM model has been analyzed over the JIRA dataset with more than 2000 bug reports. The proposed model has been found 90% accurate in comparison with KNN (74%) and SVM (87%). 
On average, RNN-LSTM improves the $F$-measure by 3% compared to SVM and 15.2% compared to KNN. Conclusion: It concluded that LSTM predicts and assigns the priority of the bug more accurately and effectively than the other ML algorithms (KNN and SVM). LSTM significantly improves the average $F$-measure in comparison to the other classifiers. The study showed that LSTM reported the best performance results based on all performance measures (Accuracy = 0.908, AUC = 0.95, $F$-measure = 0.892).
Article
Full-text available
The accurate severity classification of a bug report is an important aspect of bug fixing. The bug reports are submitted into the bug tracking system with high speed, and owing to this, bug repository size has been increasing at an enormous rate. This increased bug repository size introduces biases in the bug triage process. Therefore, it is necessary to classify the severity of a bug report to balance the bug triaging process. Previously, many machine learning models were proposed for automation of bug severity classification. The accuracy of these models is not up to the mark because they do not extract the important feature patterns for learning the classifier. This paper proposes a novel deep learning model for multiclass severity classification called Bug Severity classification to address these challenges by using a Convolutional Neural Network and Random forest with Boosting (BCR). This model directly learns the latent and highly representative features. Initially, the natural language techniques preprocess the bug report text, and then n-gram is used to extract the features. Further, the Convolutional Neural Network extracts the important feature patterns of respective severity classes. Lastly, the random forest with boosting classifies the multiple bug severity classes. The average accuracy of the proposed model is 96.34% on multiclass severity of five open source projects. The average F-measures of the proposed BCR and the existing approach were 96.43% and 84.24%, respectively, on binary class severity classification. The results prove that the proposed BCR approach enhances the performance of bug severity classification over the state-of-the-art techniques.
Article
Severity, i.e., the impact, extent, and effect of a bug on the software, is a decisive attribute that determines how quickly the bug should be fixed. Predicting the severity of software bugs is important for improving the bug triaging and resolution process. To reduce the effort and time required for manual severity assessment of newly reported bugs, many techniques and methods have been used in past research. To help software developers use their resources efficiently, this study evaluates a number of machine learning techniques for predicting the severity of software bugs at the system and component levels. The techniques are evaluated on thirteen Apache projects automatically extracted using the Bug Report Collection System tool. Severity is predicted based on the most frequent terms extracted from bug summaries using text mining. Performance metrics such as precision, recall, and accuracy are used to interpret the results obtained from the various techniques. The results indicate that the Boosting technique (an ensemble learner) outperforms other machine learning techniques applied in previous research, such as Bayesian learners, decision trees, and support vector machines.
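Selecting the most frequent terms from bug summaries, as described above, might look like the sketch below. The stop-word list, the tokenizer, and the sample summaries are all assumptions made for illustration; the study's actual text-mining pipeline is not described in the abstract.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "an", "on", "in", "when", "is", "to", "of"}

def tokenize(summary):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in summary.lower().split() if w not in STOP_WORDS]

def top_terms(summaries, k):
    """Return the k most frequent terms across all summaries."""
    counts = Counter()
    for summary in summaries:
        counts.update(tokenize(summary))
    return [term for term, _ in counts.most_common(k)]

def to_vector(summary, vocabulary):
    """Represent a summary as term counts over a fixed vocabulary."""
    counts = Counter(tokenize(summary))
    return [counts[term] for term in vocabulary]

summaries = [
    "Crash when saving the file",
    "Crash on startup",
    "Typo in the settings dialog",
]
vocabulary = top_terms(summaries, 3)
```

Each summary then becomes a short count vector over `vocabulary`, which any of the classifiers compared in the study could take as input.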
Article
Software maintenance is an essential phase of software development. Developers employ issue tracking systems to collect bugs for software improvement. Users submit bugs through such issue tracking systems and decide the severity of reported bugs. Severity is an important attribute of a bug that determines how quickly it should be resolved. It helps developers resolve important bugs on time. However, manual severity assessment is tedious and error-prone. To this end, in this paper, we propose a deep neural network-based automatic approach for the severity prediction of bug reports. First, we apply natural language processing techniques for text preprocessing of bug reports. Second, we compute and assign an emotion score for each bug report. Third, we create a vector for each preprocessed bug report. Fourth, we pass the constructed vector and the emotion score of each bug report to a deep neural network-based classifier for severity prediction. We also evaluate the proposed approach on the history data of bug reports. The results of the cross-product evaluation suggest that the proposed approach outperforms the state-of-the-art approaches. On average, it improves the F-measure by 7.90%.
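The second and fourth steps above (computing an emotion score and passing it to the classifier together with the text vector) can be illustrated with a small sketch. The lexicon, its polarity values, and the averaging rule are hypothetical; the abstract does not specify how the emotion score is computed.

```python
# Hypothetical word-to-polarity lexicon; a real system would use a curated one.
EMOTION_LEXICON = {
    "crash": -1.0,
    "fail": -1.0,
    "broken": -1.0,
    "urgent": -0.5,
    "thanks": 0.5,
    "works": 1.0,
}

def emotion_score(tokens):
    """Average polarity of the tokens found in the lexicon (0.0 if none match)."""
    hits = [EMOTION_LEXICON[t] for t in tokens if t in EMOTION_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def build_input(text_vector, tokens):
    """Append the emotion score to the text vector to form the classifier input."""
    return list(text_vector) + [emotion_score(tokens)]

# Hypothetical preprocessed report and text vector, for illustration only.
tokens = ["app", "crash", "urgent"]
classifier_input = build_input([0.1, 0.2], tokens)
```

Appending the score as one extra feature is only one way to combine the two signals; other designs feed them through separate network branches.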
Article
The severity of a bug report is assigned by the users, developers, and testers of the software, whereas priority is assigned by the developers. Developers assign priorities P1, P2, P3, P4, and P5, where P1 is the highest priority and P5 the lowest. A bug report marked P1 is given the highest priority for fixing. Deciding which bug to fix first therefore requires priority information. Even though the priority is assigned by a developer, it may sometimes be incorrect because of a busy schedule or the developer's inexperience. In such cases, the developer can use this recommendation system to assign priority more accurately and to save time. This work presents priority prediction for bug reports using different classification algorithms, namely Naïve Bayes, Simple Logistic, and Random Tree. Among the three classifiers, Simple Logistic gives better results than the other two.
Article
Software systems often receive a large number of bug reports. Triagers read through such reports and assign different priorities to different reports so that important and urgent bugs can be fixed on time. However, manual prioritization is tedious and time-consuming. To this end, in this article, we propose a convolutional neural network (CNN) based automatic approach to predict the multiclass priority for bug reports. First, we apply natural language processing (NLP) techniques to preprocess the textual information of bug reports and convert the textual information into vectors based on the syntactic and semantic relationships of words within each bug report. Second, we perform software engineering domain-specific emotion analysis on bug reports and compute the emotion value for each of them using a software engineering domain repository. Finally, we train a CNN-based classifier that generates a suggested priority based on its input, i.e., vectored textual information and emotion values. To the best of our knowledge, it is the first CNN-based approach to bug report prioritization. We evaluate the proposed approach on open-source projects. Results of our cross-project evaluation suggest that the proposed approach significantly outperforms the state-of-the-art approaches and improves the average F1-score by more than 24%.
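To make the CNN step concrete, the sketch below shows how a single convolutional filter slides over a sequence of word vectors, followed by ReLU and max-over-time pooling. It is a plain-Python illustration of the general technique, not the article's architecture; the embeddings and filter weights are hypothetical, and a real model would learn many filters with a deep learning framework.

```python
def conv1d_max(embeddings, kernel, bias=0.0):
    """Apply one filter over a word-vector sequence, then ReLU and max pooling.

    embeddings: list of word vectors (one list of floats per word).
    kernel: list of weight vectors, one per position in the filter window.
    """
    width = len(kernel)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        for word_vec, weights in zip(window, kernel):
            total += sum(x * w for x, w in zip(word_vec, weights))
        activations.append(max(0.0, total))  # ReLU
    # Max-over-time pooling; 0.0 when the text is shorter than the filter.
    return max(activations) if activations else 0.0

# Hypothetical 2-dimensional word vectors and a filter spanning two words.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
feature = conv1d_max(embeddings, kernel)
```

Each filter yields one pooled feature per report, so a bank of filters produces a fixed-length vector regardless of report length.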
Chapter
Bugs are inevitable during software development. It is important to prioritize bugs and fix them based on their priorities. The priority assignment is usually done manually. Besides the cost of human effort, this process may also introduce bias since different people might have different opinions on the same issue. In this paper, we propose an approach to automate the process. It builds features from bug reports using Natural Language Processing, then trains a predictive model based on a deep Neural Network. The proposed approach was tested using a comprehensive data set containing more than 82 thousand bug reports. It runs in near real-time and its performance is significantly better than the previously reported results.
Chapter
Bug reports are submitted at an increasing rate, resulting in uncertainties and irregularities in the bug reporting process. Noise and uncertainty are also generated by the enormous and growing number of bugs in the bug tracking system. To build a better classifier, these uncertainties and irregularities must be taken into account. In this paper, we build classifiers based on the machine learning techniques Naïve Bayes (NB) and Deep Learning (DL), using entropy-based measures for bug priority prediction. We consider the severity, summary weight, and entropy attributes to predict bug priority. The experimental analysis is conducted on eight products of the open-source project OpenOffice. We use accuracy, precision, recall, and F-measure as performance measures to evaluate the proposed approach. We observed that the entropy attribute improved classifier performance for both NB and DL, and that DL with entropy performs better than NB with entropy.
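The abstract above does not define its entropy attribute. As one plausible reading, the sketch below computes the Shannon entropy (in bits) of the term distribution in a bug summary; the choice of Shannon entropy and the sample tokens are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the term frequency distribution of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical tokenized summary, for illustration only.
summary_entropy = shannon_entropy(["crash", "on", "save", "crash"])
```

A summary that repeats a single term has zero entropy, while a summary of all-distinct terms has the maximum entropy for its length.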