Figure 4 - uploaded by Mohammad Azzeh
Boxplot of absolute residuals for COCOMO 

Source publication
Conference Paper
Full-text available
Background: Adaptation is a crucial task for analogy-based estimation. Current adaptation techniques often use linear size or linear similarity adjustment mechanisms, which are often not suitable for datasets that have a complex structure with many categorical attributes. Furthermore, the use of nonlinear adaptation techniques such as neural...

Citations

... The former proposes making use of the experiences of human experts, whereas the latter usually generates estimates based on learning methods. The latter has two distinct advantages over the former: it can model complex sets of relationships between the dependent variable and the independent variables, and it can learn from historical project data [2] [27]. ...
... Since these measures tend to behave differently [32], the final outcome of MOPSO is not a single solution but a set of solutions that make a good trade-off between these objective functions. In this study, each possible solution is composed of three variables: (1) number of nearest analogies (k), (2) ...
... Adaptation is a process that attempts to minimize the difference between the test observation and each nearest observation, and reflects that difference in the derived solution in order to obtain better accuracy. All adapted solutions are then aggregated either by a simple statistical approach such as the mean, median, or Inverse Ranked Weighted Mean (IRWM), as shown in Eqs. (1) and (2), or by more sophisticated approaches such as machine learning algorithms. ...
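The aggregation step in the excerpt above can be sketched in Python. The IRWM formulation below, in which rank i of k receives weight k − i + 1 so that closer analogies count more, is a common one in the ABE literature; the cited paper's exact Eqs. (1) and (2) are not reproduced here, so treat the function as illustrative.

```python
from statistics import mean, median

def irwm(efforts):
    """Inverse Ranked Weighted Mean. `efforts` must be ordered from the
    most similar analogy (rank 1) to the least similar (rank k);
    rank i receives weight k - i + 1."""
    k = len(efforts)
    weights = [k - i for i in range(k)]  # k, k-1, ..., 1
    return sum(w * e for w, e in zip(weights, efforts)) / sum(weights)

# Aggregating three adapted solutions, ordered by similarity:
solutions = [10.0, 20.0, 30.0]
print(mean(solutions))    # simple mean -> 20.0
print(median(solutions))  # median     -> 20.0
print(irwm(solutions))    # (3*10 + 2*20 + 1*30) / 6, approximately 16.67
```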
Article
Full-text available
Analogy Based Effort Estimation (ABE) is one of the prominent methods for software effort estimation. The fundamental concept of ABE is close to the mentality of expert estimation, but with an automated procedure in which the final estimate is generated by reusing similar historical projects. The key issue when using ABE is how to adapt the effort of the retrieved nearest neighbors. The adaptation process is an essential part of ABE for generating more accurate estimates by tuning the selected raw solutions using some adaptation strategy. In this study we show that there are three interrelated decision variables that have a great impact on the success of an adaptation method: (1) the number of nearest analogies (k), (2) the optimum feature set needed for adaptation, and (3) the adaptation weights. To make the right decision regarding these variables, one needs to study all possible combinations and evaluate them individually to select the one that can improve all prediction evaluation measures. Existing evaluation measures usually behave differently, sometimes presenting opposite trends in evaluating prediction methods. This means that changing one decision variable could improve one evaluation measure while decreasing the others. Therefore, the main theme of this research is how to find the best decision variables that improve the adaptation strategy, and thus the overall evaluation measures, without degrading the others. The joint impact of these decisions has not been investigated before; therefore we propose to view the building of the adaptation procedure as a multi-objective optimization problem. The Particle Swarm Optimization (PSO) algorithm is utilized to find the optimum values of these decision variables by optimizing multiple evaluation measures. We evaluated the proposed approaches over 15 datasets using 4 evaluation measures.
After extensive experimentation we found that: (1) the predictive performance of ABE is noticeably improved; (2) optimizing all decision variables together is more efficient than ignoring any one of them; and (3) optimizing the decision variables for each project individually yields better accuracy than optimizing them for the whole dataset.
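As a rough illustration of the optimizer this abstract builds on, here is a minimal single-objective PSO sketch minimizing a toy error function. The paper's method is multi-objective and searches over (k, feature set, weights); the parameter values and names below are illustrative assumptions, not the paper's configuration.

```python
import random

def pso(objective, dim, n_particles=20, iters=100, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=42):
    """Minimal particle swarm optimizer (single-objective sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm's best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + cognitive pull (pbest) + social pull (gbest)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective: sphere function; PSO should drive it close to zero.
best, best_val = pso(lambda x: sum(v * v for v in x), dim=3)
```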
... Developing adjustment methods for EBA was the core of several research studies (Azzeh, 2011;Chiu and Huang, 2007;Kirsopp et al., 2003;Li et al., 2007;Li et al., 2009;Walkerden and Jeffery, 1999). Azzeh (2012) classified the existing adjustment methods into two main categories according to the procedure they follow: linear and nonlinear. ...
... • Similarity based adjustment (AQUA) (Li et al., 2007). • Model tree (Azzeh, 2011). ...
... The fundamental process of CBR is based on the premise that history almost repeats itself, which means that the solution of a new case is generated by reusing the solutions of similar successful historical cases. CBR has been favored over regression methods because software datasets often exhibit complex structure with some non-normal characteristics and discontinuities [1]. It was also remarked that the predictive performance of CBR is dataset dependent and subject to a large space of configuration possibilities induced for each dataset [8]. ...
... Chiu and Huang [4] proposed another adjustment based on a Genetic Algorithm (GA) to optimize the distance weights by minimizing a performance measure, using k = 1…5. Recently, Li et al. [10] used a Neural Network (NN) and Azzeh [1] used a Model Tree to learn the difference between projects and reflect that difference in the final estimate, with k = 1…5. We can notice that these studies use a limited number of features (e.g. ...
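The retrieve-and-adjust cycle described in these excerpts can be sketched as follows. The linear size adjustment and the toy project data are illustrative assumptions, not the cited papers' exact methods (which use GA-optimized weights, neural networks, or model trees for the adjustment step).

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cbr_estimate(target, cases, k=2):
    """Retrieve the k most similar historical cases, adjust each retrieved
    effort by the size ratio (a simple linear size adjustment), and
    average the adapted solutions.
    Each case is (features, size, effort); target is (features, size)."""
    feats_t, size_t = target
    ranked = sorted(cases, key=lambda c: euclidean(feats_t, c[0]))[:k]
    adapted = [effort * (size_t / size) for _, size, effort in ranked]
    return sum(adapted) / len(adapted)

# Hypothetical projects: (features, size in KLOC, effort in person-months)
history = [
    ([1.0, 0.2], 10.0, 30.0),
    ([0.9, 0.3], 12.0, 33.0),
    ([0.1, 0.9], 40.0, 150.0),
]
new_project = ([0.95, 0.25], 11.0)
print(cbr_estimate(new_project, history, k=2))  # approximately 31.63
```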
Conference Paper
Full-text available
Case-Based Reasoning (CBR) is considered one of the efficient methods in the area of software effort estimation because of its outstanding performance and capability of handling noisy datasets. This study examines the performance of a multi-objective Particle Swarm Optimization algorithm in finding the best configuration parameters for the adaptation process. In particular, we propose a new adaptation method whose parameters can be optimized by making a trade-off between multiple accuracy measures. The proposed adaptation is fully automated and able to dynamically adapt each case in the dataset individually. Based on empirical validation over 8 datasets, the performance figures show good improvements over conventional CBR and some adapted versions of CBR. Keywords: Case-Based Reasoning; adaptation method; software effort estimation; multi-objective particle swarm optimization
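Because a multi-objective optimizer returns a set of trade-off solutions rather than a single winner, the defining non-dominance test can be sketched as below. All objectives are assumed to be minimized, and the candidate objective values are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is no worse on every objective (minimized)
    and strictly better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(solutions):
    """Keep only solutions not dominated by any other solution."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

# Hypothetical (MMRE, 1 - PRED(25)) pairs for candidate configurations;
# (0.40, 0.45) is dominated by (0.30, 0.40) and is filtered out.
candidates = [(0.30, 0.40), (0.25, 0.50), (0.35, 0.35), (0.40, 0.45)]
print(pareto_front(candidates))
```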
Article
Software effort estimation is a vital process in the software industry for successfully administering the 5Ds of the software development life cycle (SDLC): demand, development, direction, deployment, and designated cost of the software. Software development effort estimation (SDEE) is an effort prediction mechanism that calculates the effort for developing a software product in order to minimize challenges in the software field. Academics and practitioners are striving to identify which machine learning estimation technique yields more accurate results based on evaluation metrics, datasets, and other pertinent aspects. Feature selection techniques impact accuracy by selecting the main and relevant features in the dataset and eliminating the redundant and irrelevant ones. To achieve accurate estimations, the paper utilizes feature selection algorithms along with various machine learning techniques to predict the desired effort; the performance of the model is measured in terms of prediction accuracy, value, relative error, and mean absolute error. The China and Maxwell datasets are trained with the relevant features by applying feature selection algorithms, and estimation techniques are applied to predict the effort. The performance is compared with the regression models and feature selection techniques utilized by many authors previously. The proposed methodology gives the best performance with the combination of feature selection and estimation models, outperforming all regression models applied alone, on both datasets. From the results, it is perceptible that random forest performs well with the feature selection techniques and obtains the highest prediction accuracy of 99.33% with the China dataset and 89.47% with the Maxwell dataset.
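As a hedged illustration of filter-style feature selection, the sketch below ranks features by the absolute Pearson correlation between each feature column and effort and keeps the top-k. This is one common filter criterion, not necessarily the algorithms used in the cited paper; the data are hypothetical and the code assumes non-constant columns.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(rows, effort, top_k):
    """Rank feature indices by |correlation with effort|; keep the top_k."""
    n_feats = len(rows[0])
    scores = []
    for j in range(n_feats):
        col = [r[j] for r in rows]
        scores.append((abs(pearson(col, effort)), j))
    return [j for _, j in sorted(scores, reverse=True)[:top_k]]

# Hypothetical rows: [size, team_size, noise]; effort tracks size closely,
# so the noise column (index 2) should be dropped.
rows = [[10, 3, 7], [20, 4, 1], [30, 5, 9], [40, 4, 2]]
effort = [100, 210, 290, 400]
print(select_features(rows, effort, top_k=2))  # [0, 1]
```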
Article
Full-text available
Analogy-based estimation (ABE) estimates the effort of the current project based on the information of similar past projects. The solution function of ABE provides the final effort prediction of a new project. Many past studies on ABE have provided various solution functions, but their effectiveness can still be enhanced. The present study is an attempt to improve the effort prediction accuracy of ABE by proposing a solution function SABE: Stacking regularization in analogy-based software effort estimation. The core of SABE is stacking, a machine learning technique. Stacking is beneficial as it works on multiple models, harnessing their capabilities, and provides better estimation accuracy compared to a single model. The proposed method is validated on four software effort estimation datasets and compared with the already existing solution functions: closest analogy, mean, median, and inverse distance weighted mean. The evaluation criteria used are mean magnitude of relative error (MMRE), median magnitude of relative error (MdMRE), prediction (PRED), and standardized accuracy (SA). The results suggest that SABE showed promising performance for almost all the evaluation criteria when compared with the results of the earlier studies.
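The evaluation criteria named above have standard definitions in the effort estimation literature; a minimal sketch follows (SA is omitted, since it requires a random-guessing baseline over the dataset).

```python
from statistics import median

def mre(actual, predicted):
    """Magnitude of Relative Error for one project."""
    return abs(actual - predicted) / actual

def mmre(actuals, preds):
    """Mean Magnitude of Relative Error."""
    return sum(mre(a, p) for a, p in zip(actuals, preds)) / len(actuals)

def mdmre(actuals, preds):
    """Median Magnitude of Relative Error."""
    return median(mre(a, p) for a, p in zip(actuals, preds))

def pred(actuals, preds, level=0.25):
    """PRED(25): fraction of projects whose MRE is at most 0.25."""
    hits = sum(1 for a, p in zip(actuals, preds) if mre(a, p) <= level)
    return hits / len(actuals)

# Hypothetical actual vs. predicted efforts:
actuals = [100.0, 200.0, 50.0, 80.0]
preds = [110.0, 150.0, 60.0, 81.0]
print(mmre(actuals, preds))   # 0.140625
print(mdmre(actuals, preds))  # approximately 0.15
print(pred(actuals, preds))   # 1.0 (all MREs <= 0.25)
```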
Article
Full-text available
The immense increase in software technology has resulted in the growing complexity of software projects. Software effort estimation is fundamental to commencing any software project, and inaccurate estimation may lead to several complications and setbacks for present and future projects. Many techniques have been used for software effort estimation over the years. As software applications have grown extensively in size and complexity, the traditional methods are not adequate to meet the requirements. To achieve accurate estimation of software effort, this paper proposes a gradient boosting regressor model as a robust approach. The performance is compared with regression models such as stochastic gradient descent, K-nearest neighbor, decision tree, bagging regressor, random forest regressor, Ada-boost regressor, and gradient boosting regressor, employing COCOMO’81, containing 63 projects, and CHINA, containing 499 projects. The regression models are evaluated by metrics such as MAE, MSE, RMSE, and R2. From the results, it is evident that the gradient boosting regressor model performs well, obtaining an accuracy of 98% with the COCOMO’81 dataset and 93% with the CHINA dataset. The proposed method significantly outperforms all regression models used in the comparison on both datasets.
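For intuition about the model class this abstract proposes, here is a minimal least-squares gradient boosting sketch built from depth-1 regression stumps: each round fits a stump to the current residuals and adds a damped copy of it to the ensemble. It is a from-scratch illustration on toy data, not the paper's implementation.

```python
class Stump:
    """Depth-1 regression tree on a single feature."""
    def fit(self, xs, ys):
        best = None
        for t in sorted(set(xs)):
            left = [y for x, y in zip(xs, ys) if x <= t]
            right = [y for x, y in zip(xs, ys) if x > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((y - lm) ** 2 for y in left)
                   + sum((y - rm) ** 2 for y in right))
            if best is None or err < best[0]:
                best = (err, t, lm, rm)
        _, self.t, self.lm, self.rm = best
        return self

    def predict(self, x):
        return self.lm if x <= self.t else self.rm

def gradient_boost(xs, ys, n_rounds=200, lr=0.1):
    """Least-squares gradient boosting: each stump fits the residuals
    of the ensemble built so far; lr damps each stump's contribution."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        s = Stump().fit(xs, residuals)
        stumps.append(s)
        preds = [p + lr * s.predict(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s.predict(x) for s in stumps)

# Toy 1-D training data; the boosted ensemble should fit it closely.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 20.0, 30.0, 40.0, 50.0]
model = gradient_boost(xs, ys)
```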
Preprint
Full-text available
It is well recognized that project productivity is a key driver in estimating software project effort from the Use Case Point size metric at early software development stages. Although a few models have been proposed for predicting productivity, there is no consistent conclusion regarding which model is superior. Therefore, instead of building a new productivity prediction model, this paper presents a new ensemble construction mechanism applied to software project productivity prediction. An ensemble is an effective technique when the performance of individual base models is poor. We propose a weighted mean method to aggregate predicted productivities based on the average errors produced during training. The obtained results show that using an ensemble is a good alternative approach when the accuracies of the base models are not consistent across different datasets and when the models behave diversely.
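The weighted mean aggregation described here can be sketched with weights inversely proportional to each base model's mean training error, normalized to sum to one. This inverse-error formulation and the error values are illustrative assumptions, not the paper's exact weighting scheme.

```python
def ensemble_weights(train_errors):
    """Weight each base model by the inverse of its mean absolute
    training error, normalized so the weights sum to 1."""
    inv = [1.0 / e for e in train_errors]
    total = sum(inv)
    return [w / total for w in inv]

def ensemble_predict(predictions, weights):
    """Weighted mean of the base models' productivity predictions."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical mean training errors of three productivity models:
# the most accurate model (error 2.0) gets the largest weight.
errors = [2.0, 4.0, 8.0]
w = ensemble_weights(errors)  # [4/7, 2/7, 1/7]
print(ensemble_predict([10.0, 12.0, 20.0], w))  # (40 + 24 + 20) / 7 = 12.0
```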
Article
Full-text available
The problem of estimating the effort for software packages is one of the most significant challenges facing software designers. The precision of effort or cost estimates can have a huge impact on software development. Various methods have been investigated in order to discover good enough solutions to this problem; lately, evolutionary intelligent techniques have been explored, such as Genetic Algorithms, Genetic Programming, Neural Networks, and Swarm Intelligence. In this work, Gene Expression Programming (GEP) is investigated to show its efficiency in acquiring equations that best estimate software effort. The datasets employed are taken from previous projects. Comparisons of learning and testing results are carried out with COCOMO, Analogy, GP, and four types of Neural Networks; all show that GEP outperforms these methods in discovering effective estimation functions with robustness and efficiency.