Application of Shuffled Frog-Leaping Algorithm for Optimal Software Project Scheduling and Staffing

May 2021

DOI:10.1007/978-3-030-70713-2_28

In book: Innovative Systems for Intelligent Health Informatics, Data Science, Health Informatics, Intelligent Systems, Smart Computing (pp.293-303)

Authors:

Ameen Ahmed Oloduowo

University of Ilorin

Hammed Mojeed

University of Ilorin

Abdullateef Oluwagbemiga Balogun

Universiti Teknologi PETRONAS

Ahmed O. Ameen¹(B), Hammed A. Mojeed¹, Abdulazeez T. Bolariwa¹, Abdullateef O. Balogun¹,², Modinat A. Mabayoje¹, Fatima E. Usman-Hamzah¹, and Muyideen Abdulraheem¹

¹ Department of Computer Science, University of Ilorin, PMB 1515, Ilorin, Nigeria

{aminamed,mojeed.ha,balogun.ao1,mabayoje.ma,usman-hamza.fa,muyideen}@unilorin.edu.ng, abdullateef_16005851@utp.edu.my

² Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, 32610 Bandar Seri Iskandar, Perak, Malaysia

Abstract. The Software Project Scheduling Problem is one of the most crucial issues in software development because it encompasses resource planning, cost estimation, staffing and cost control, which, if not properly planned, affect the timely completion of the software project. Software project scheduling is the problem of scheduling tasks (work packages) and employees in such a way that the overall project completion time is minimized without violating dependency constraints (task dependencies) while remaining consistent with resource constraints. This study adopts a Search Based Software Engineering approach that focuses on multi-objective optimization for software project planning using the Shuffled Frog Leaping Algorithm (SFLA), a memetic meta-heuristic algorithm. The objectives are the optimal ordering of work packages without dependency violation and the allocation of staff to the work packages such that only employee(s) with the required competence(s) are allotted to a given work package. The study was carried out in four stages, namely: frog (solution) representation, definition of the fitness function, implementation of the Shuffled Frog Leaping Algorithm, and evaluation with a randomly generated Software Project Scheduling Problem. The study concludes that implementing the SFLA makes it possible to find an efficient solution to a Software Project Scheduling Problem more readily than traditional computing means, which are tedious, error prone and costly.

Keywords: Shuffled Frog-Leaping Algorithm · Software Project Scheduling Problem · Software project planning · Search Based Software Engineering

F. Saeed et al. (Eds.): IRICT 2020, LNDECT 72, pp. 293–303, 2021.

https://doi.org/10.1007/978-3-030-70713-2_28

1 Introduction

Software development for organizations is a very complex task as it deals with managing people, technologies and business processes [1]. In the software development process, effective planning is important because failure to plan and/or poor planning can result in unnecessary delays and overhead costs [2]. Because of the uncertainty incurred in planning a software project, the given timing and budget constraints often cannot be met, which in turn leads to business-critical failures. Software development companies often struggle to deliver projects on time, within budget and with the required quality. Possible causes of this problem are poor project scheduling and ineffective team staffing [3]. Therefore, software engineering projects require good software project management techniques to ensure that projects are completed on schedule and within budget [4]. To achieve proper planning and management of a software project, tasks need to be optimally scheduled and resources effectively allocated.

Scheduling is setting a sequence of time-dependent functions to execute a set of dependent tasks that constitute a project [5]. Dependency of tasks in terms of priority and precedence is crucial to software project scheduling. There are priority constraints between the tasks of a project, but in addition there may be another kind of constraint between tasks based on resource allocation [5]. Apart from respecting priority and precedence limitations, scheduling should therefore be consistent with resource constraints. Good allocations (team staffing) are crucial for software projects, since humans are their main resource [8,20]. The importance of effective software project scheduling cannot be overemphasized when managing the development of medium to large scale projects, as it is required to carry out projects that meet their deadline and budget [8].

The Software Project Scheduling Problem (SPSP) is an optimization problem that seeks an optimal schedule for a software project such that the precedence and resource constraints are satisfied and the project cost and duration are minimized [3]. This problem has been found to be NP-hard (non-deterministic polynomial-time hard) [9,19]. To solve it, meta-heuristic evolutionary algorithms such as the Genetic Algorithm [10], Ant Colony Optimization [9], the Shuffled Frog Leaping (SFL) algorithm [4] and the Differential Evolution Algorithm [5,6] have been successfully applied. The majority of these studies, however, consider only task scheduling in the formulation of the problem [21–23]. There is a need for studies that combine task scheduling and staffing (allocation of jobs to developers) in the software development project planning problem. In this work, a memetic approach based on the Shuffled Frog-Leaping Algorithm (SFLA) is presented for optimal project scheduling and staffing when the objectives are combined.

1.1 The SFLA Algorithm

The Shuffled Frog Leaping Algorithm (SFLA) is a memetic meta-heuristic first proposed by Eusuff and Lansey [11] for solving combinatorial optimization problems; it was first used to solve a water resource problem in network distribution [12]. The SFLA has been designed as a meta-heuristic to perform an informed search using a heuristic function [13]. It is a fusion of deterministic and random approaches. The deterministic strategy allows the algorithm to use response surface information effectively to guide the heuristic search, as in Particle Swarm Optimization (PSO). The random approach ensures the flexibility and robustness of the search pattern. The SFLA does not specify the individuals belonging to its population; rather, it uses an abstract model called a virtual population [13].

The search begins with a randomly selected population of P frogs (i.e. solutions). The population is partitioned into m memeplexes (parallel communities) that can evolve independently to search the solution space in different directions. The individual frogs contain ideas (memes) that can be influenced by the ideas of other frogs within each memeplex and evolve through an optimization process referred to as memetic evolution [14]. Memetic evolution enhances the quality of the worst frog Xw and guides its performance towards a goal. To ensure that the evolution process is competitive, frogs with better memes (ideas) must contribute more to the development of new ideas than frogs with poor ideas. During this evolution step, the frogs may change their memes using the information from the memeplex best frog Xb or the global best frog Xg of the entire population [13]. Accordingly, the position of the frog with the worst fitness is adjusted using Eqs. 1–2.

Change in frog position:

$$D_i = \mathrm{rand}() \cdot (X_b - X_w) \tag{1}$$

New position:

$$X_{new} = X_w + D_i; \quad -D_{max} \le D_i \le D_{max} \tag{2}$$

where rand() is a random number between 0 and 1, and Dmax is the maximum allowed change in a frog's position. If this process yields a better frog (solution), it replaces the worst frog. Otherwise, the calculations in Eqs. (1) and (2) are repeated with respect to the global best frog (that is, Xg replaces Xb). If no improvement is possible in this latter case either, a new solution (frog) is randomly generated to replace the worst frog, whatever its fitness [14]. The calculations then continue for a specific number of evolutionary iterations within each memeplex.
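The update rule of Eqs. 1–2 and its two fallbacks can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used in this study: the function names `leap` and `improve_worst`, the use of a single random draw for the whole position vector, and the convention that a lower fitness value is better are all assumptions made for the example.

```python
import random


def leap(x_worst, x_target, d_max, rng=random):
    """Move the worst frog toward a target frog following Eqs. 1-2."""
    r = rng.random()  # rand() in Eq. 1; one draw per leap is an assumption
    new_position = []
    for xw, xt in zip(x_worst, x_target):
        step = r * (xt - xw)                  # Eq. 1: D_i = rand() * (X_b - X_w)
        step = max(-d_max, min(d_max, step))  # bound in Eq. 2: -D_max <= D_i <= D_max
        new_position.append(xw + step)        # Eq. 2: X_new = X_w + D_i
    return new_position


def improve_worst(x_worst, x_best, x_global, fitness, d_max, random_frog, rng=random):
    """Try the memeplex best, then the global best, then a random frog.

    Lower fitness is treated as better (an assumption of this sketch).
    """
    for target in (x_best, x_global):
        candidate = leap(x_worst, target, d_max, rng)
        if fitness(candidate) < fitness(x_worst):
            return candidate
    # No improvement from either target: replace with a random frog.
    return random_frog()
```

The hypothetical `random_frog` argument is a zero-argument callable that produces a fresh solution, which keeps the fallback independent of any particular encoding.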

After a number of memetic evolutionary steps, ideas are passed among memeplexes in a shuffling process (global search). The local search and the shuffling processes continue until the convergence criteria are satisfied. The algorithm has been tested on several combinatorial problems and found to be efficient in finding global solutions [14]. The core parameters of SFLA are the population size P, the number of memeplexes m, and the number of evolutionary iterations in each memeplex q [3].
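As a rough illustration of the partitioning and shuffling steps, the Python sketch below deals a sorted population into m memeplexes and later merges them again. The round-robin dealing rule is the one commonly described for SFLA; the text above does not spell it out, so it should be read as an assumption, as should the function names and the lower-is-better fitness convention.

```python
def partition_into_memeplexes(sorted_frogs, m):
    """Deal frogs (already sorted best-first) into m memeplexes, round-robin."""
    memeplexes = [[] for _ in range(m)]
    for rank, frog in enumerate(sorted_frogs):
        # Frog of rank 0 goes to memeplex 0, rank 1 to memeplex 1, ...,
        # rank m back to memeplex 0, and so on.
        memeplexes[rank % m].append(frog)
    return memeplexes


def shuffle_memeplexes(memeplexes, fitness):
    """Merge all memeplexes and re-sort best-first (lower fitness = better)."""
    merged = [frog for memeplex in memeplexes for frog in memeplex]
    return sorted(merged, key=fitness)
```

With P = 6 frogs and m = 3, the frogs ranked 0 to 5 end up as the memeplexes [0, 3], [1, 4] and [2, 5], so every memeplex receives a mix of good and poor frogs.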

2 Related Works

Considering the application of SFLA, Elbeltagi, Hegazy and Grierson [14] compared the searching mechanism of the Genetic Algorithm (GA) with that of the SFLA; the experimental results show that the SFLA performs better than the GA on some continuous-function problems. Their work also proposed an improved SFLA that introduced a new parameter, the search-acceleration factor (C), into the original formulation of the SFLA, analyzed the positive role of the new parameter, and solved discrete and continuous optimization problems. Nejad, Jahani and Sarlak [15] applied SFLA to the Economic Load Dispatch (ELD) problem in power systems. Their objective was to find the optimal combination of power generations that minimizes the total generation cost while satisfying an equality constraint and inequality constraints. Two representative systems (IEEE 30-bus and 57-bus) were used to test their proposed SFLA against a GA-based method for solving the ELD problem. The results showed that the SFLA technique was faster than the GA technique. Also, Liping, Weiwei, Yefeng and Yixian [16] introduced the SFLA to solve an uncapacitated Single Level Lot Sizing (SLLS) problem and reported ideal results.

Gerasimou et al. [17] investigated the application of a Particle Swarm Optimization (PSO) algorithm to software project scheduling and effective team staffing. The study aimed to create optimal project schedules by specifying the best sequence for executing a project's tasks so as to minimize the total project duration, and sought to form skillful and productive working teams with the best utilization of developer skills. A combination of Constriction-PSO and Binary-PSO variations was employed to solve the problem. Results from empirical experiments showed that PSO was able to generate feasible solutions with a feasibility rate of approximately 100% and a hit rate of virtually 100% in the problems considered. However, as the complexity and size of the problems increase, a progressive decrease in these percentages is observed, reaching as low as 30%. This shows that the employed algorithm still encounters difficulties in producing optimal solutions as project complexity increases.

Chen and Zhang [18] developed an approach based on an event-based scheduler (EBS) and an ant colony optimization (ACO) algorithm for optimal project scheduling and staffing. The model employed the event-based scheduler to simplify the restricted flexibility of human resource allocation. The project plan was modeled as a task list and an employee allocation matrix, and the ACO algorithm was then applied to solve the problem. Experimental results showed that the representation scheme with the EBS is effective, and that the proposed algorithm yields better plans with lower costs and more stable workload assignments than other existing approaches such as the Tabu Search (TS) algorithm for the multi-skill scheduling problem, the knowledge-based GA (KGA) and the time-line-based GA. The study, however, did not consider employee experience in its formulation. Stylianou and Andreou [7] proposed a procedure that supports software project managers in their project scheduling and team staffing activities by adopting a genetic algorithm as an optimisation technique to construct a project's optimal schedule and to assign the most experienced employees to tasks. Their experimental results revealed that the genetic algorithm is capable of finding optimal solutions for projects of varying sizes when either one of the objective functions is used alone. However, when the objective functions were combined, the genetic algorithm had difficulty reaching optimal solutions, especially when assigning the most experienced employees was preferred over shortening the project's duration. This study presents SFLA as a memetic meta-heuristic algorithm to tackle this shortcoming.

Recent works have focused on combining task scheduling and team allocation/resource assignment based on multiple skills (as also adopted in this study) using different optimization approaches. Lin, Zhu, and Gao [24] proposed a genetic programming hyper-heuristic algorithm for minimizing makespan in the multi-skill resource constrained project scheduling problem (MS-RCPSP). Comparisons with existing algorithms such as HACO, GRASP and DEGR showed that the proposed algorithm performed considerably better with regard to solution quality and convergence rate. The same multi-skill formulation was also employed by Li et al. [25], with a focus on skill evolution and cooperation effectiveness in project scheduling.

Van Den Eeckhout, Maenhout and Vanhoucke [26] applied a heuristic procedure based on iterated local search to an integrated formulation of the personnel staffing problem and the project scheduling problem, in which the demand for staff and the scheduling of the resources are determined simultaneously, as proposed in this study. However, their objective is to determine the personnel budget that minimizes project cost rather than to combine minimized completion time and cost objectives. Recently, an optimization procedure based on cooperative coevolution for the large-scale resource constrained multi-objective project scheduling problem was proposed by Shen, Guo and Li [27]. Duration and cost are considered together with employees' satisfaction as objectives. Experimental results on 15 randomly generated large-scale instances with up to 2048 decision variables indicated the high scalability of the proposed approach with regard to convergence ability.

3 Methodology

To model the problem, a Design Structure Matrix (DSM), which enforces the dependencies among tasks, was used. It is represented as a two-dimensional jagged array in which the row indices represent work package (WP) ids. Using a hypothetical software project consisting of seven WPs, an example of a modeled DSM is shown in Fig. 1. The DSM indicates that WP1 does not depend on any task before it can start; WP1 must finish before WP2 can start; WP1 and WP2 must finish before WP3 and WP4 can start; WP1, WP2, WP3 and WP4 must finish before WP5 can start; WP4 and WP5 must finish before WP6 can start; and WP5 must finish before WP7 can start. For a software project scheduling problem, the number of WPs is usually less than or equal to $2^n - 1$, with n representing the number of employees required to complete the project.

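| Work package | Depends on |
|---|---|
| WP 1 | (none) |
| WP 2 | 1 |
| WP 3 | 1, 2 |
| WP 4 | 1, 2 |
| WP 5 | 1, 2, 3, 4 |
| WP 6 | 4, 5 |
| WP 7 | 5 |

Fig. 1. DSM model representation of dependency constraints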
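The jagged-array form of the DSM in Fig. 1 might look as follows in Python. This is a minimal sketch rather than the study's own data structure; the names `DSM`, `dependencies_of` and `can_start` are hypothetical and introduced only for the example.

```python
# Jagged array for the seven-WP example of Fig. 1:
# DSM[k] lists the ids of the WPs that WP k+1 depends on.
DSM = [
    [],            # WP 1 depends on nothing
    [1],           # WP 2
    [1, 2],        # WP 3
    [1, 2],        # WP 4
    [1, 2, 3, 4],  # WP 5
    [4, 5],        # WP 6
    [5],           # WP 7
]


def dependencies_of(wp_id):
    """Return the WPs that must finish before wp_id (1-based) can start."""
    return DSM[wp_id - 1]


def can_start(wp_id, finished):
    """True if every dependency of wp_id is in the set of finished WPs."""
    return all(dep in finished for dep in dependencies_of(wp_id))
```

For instance, `can_start(3, {1})` is false because WP2 has not finished, while `can_start(6, {4, 5})` is true. The rows of this example hold twelve dependencies in total.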

The staff allocation is modeled using the binary representation of an integer x with a value in the interval 1 to $2^n - 1$, where n is the number of employees involved in the project. The value of each bit in the binary equivalent denotes an employee's involvement in the current task: a value of 1 means the corresponding employee is allocated to the given WP and 0 means the employee is not allocated. Starting from the left, the first bit denotes employee 1's involvement in the task, the next bit represents employee 2's involvement, and so on. Assuming that four employees are available for the project represented by the DSM in Fig. 1, the employee assignment of any of the WPs is the binary equivalent of a number between 1 and 15. An example of employee assignment under this representation is presented in Table 1. Associated with each employee is a skill set represented as a linear array of skill types. Also, for each WP, the required competence(s) are defined and represented as an n-array of skill types. The total competence of the team assigned to a WP is the combination of all the skill sets possessed by the employees assigned to that WP.

Table 1. Employee assignment representation

| Work package | Employee assignment | Binary equivalence | Remarks |
|---|---|---|---|
| 1 | 2 | 0010 | Task assigned to only employee 3 |
| 2 | 11 | 1011 | Task assigned to employees 1, 3 and 4 |
| 3 | 15 | 1111 | Task assigned to employees 1, 2, 3 and 4 |
| 4 | 5 | 0101 | Task assigned to employees 2 and 4 |
| 5 | 4 | 0100 | Task assigned to only employee 2 |
| 6 | 13 | 1101 | Task assigned to employees 1, 2 and 4 |
| 7 | 7 | 0111 | Task assigned to employees 2, 3 and 4 |
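The decoding of an assignment integer into the employees it denotes can be sketched as below, assuming (as stated in the text) that the leftmost bit stands for employee 1. The function name `assigned_employees` is hypothetical.

```python
def assigned_employees(assignment, n_employees):
    """Decode an assignment integer into 1-based employee numbers.

    The integer is written with n_employees binary digits; the leftmost
    digit corresponds to employee 1.
    """
    if not 1 <= assignment <= 2 ** n_employees - 1:
        raise ValueError("assignment must lie between 1 and 2^n - 1")
    bits = format(assignment, "0{}b".format(n_employees))
    return [i + 1 for i, bit in enumerate(bits) if bit == "1"]
```

Applied to the rows of Table 1 with four employees, 11 (1011) decodes to employees 1, 3 and 4, and 5 (0101) decodes to employees 2 and 4.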

3.1 Frog Representation

A frog represents a feasible solution to the project scheduling and staffing problem. It is encoded as an n × 2 array (n being here the number of WPs) in which each row consists of a WP id and an integer representing the employee assignment. The row index indicates the position of the WP in the WP ordering. For example, row index 0 indicates position (POS) 1, and the associated WP starts first, before any other WP. A typical frog schema is shown in Fig. 2.
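One way to hold such a frog in Python is a list of (WP id, assignment) rows whose list index is the position in the ordering. The sketch below generates a random frog under that assumption; `random_frog` is a hypothetical helper, and nothing here guarantees that the generated ordering respects the DSM, which is left to the penalty-based fitness.

```python
import random


def random_frog(n_wps, n_employees, rng=random):
    """Build a random frog: a list of (wp_id, assignment) rows.

    Row index 0 is position 1 in the WP ordering. Each assignment is an
    integer between 1 and 2^n_employees - 1.
    """
    order = list(range(1, n_wps + 1))
    rng.shuffle(order)  # random permutation of the WP ids
    max_code = 2 ** n_employees - 1
    return [(wp_id, rng.randint(1, max_code)) for wp_id in order]
```

Passing a seeded `random.Random` instance as `rng` makes the generated population reproducible across runs.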

Fig. 2. A frog representation

3.2 SFLA Design

The Shuffled Frog Leaping Algorithm (SFLA) works generally as follows. First, a virtual (random) population of P frogs is created, where P is the population size. Subsequently, the fitness of the individual frogs is evaluated. Afterwards, the frogs are sorted in descending order of their fitness (that is, from the fittest to the worst). Thereafter, the frogs are partitioned into m memeplexes. Then, a local search is performed within each memeplex. During each intra-memeplex local search, the best frog and the worst frog are identified as Xb and Xw respectively, and the global best frog is identified as Xg. A process is then applied to improve only the worst frog, excluding the other frogs. In this approach, the position of the worst frog (Xw) is adjusted using Eqs. 3–7.

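$$\mathit{chunkLength} = 0.5 \times \mathit{frog\_size} \tag{3}$$

$$\mathit{start} = \mathrm{rand}() \times (\mathit{frog\_size} - \mathit{chunkLength}) \tag{4}$$

$$I = \mathit{chunkLength} + \mathit{start}; \quad \mathit{start} \le I < \mathit{frog\_size} \tag{5}$$

$$D_i = \mathrm{Swap}(X_b(\mathit{start}, I),\ X_w(\mathit{start}, I)) \tag{6}$$

$$X_{new} = X_w + D_i \tag{7}$$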

where rand() is a random number between 0 and 1, and I is the index of the last memotype in the chunk (used to make the improvement). This evolutionary step is illustrated in Fig. 3. If this process yields a better frog (solution), it replaces the worst frog. Otherwise, the calculations in Eqs. (6) and (7) are repeated with respect to the global best frog (that is, Xg replaces Xb). If no improvement is possible in this latter case either, a new solution (frog) is randomly generated to replace the worst frog, whatever its fitness. The calculations then continue for a specific number of evolutionary iterations within each memeplex [11]. There is no generally accepted number of evolutionary iterations for the local search during the memetic evolution of SFLA; researchers hold diverse opinions. This study adopts the value used in [4], as it proved efficient in that study. There, the number of evolutionary iterations q depends on the problem size, with q = 2n, where n is the number of frogs in a memeplex.
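Eqs. 3–7 leave some room for interpretation, so the Python sketch below shows only one possible reading of the chunk-based improvement. Several points are assumptions of the example and not statements about the study's implementation: the chunk length is truncated to an integer, the index I is used as an exclusive upper bound, and, so that every WP still appears exactly once, the rows outside the chunk are filled with the worst frog's remaining WPs in their original relative order. The name `chunk_improve` is hypothetical.

```python
import random


def chunk_improve(x_worst, x_source, rng=random):
    """Copy a chunk of rows from a better frog into the worst frog.

    Frogs are lists of (wp_id, assignment) rows. The returned frog keeps
    the chunk at the same positions it occupies in x_source.
    """
    frog_size = len(x_worst)
    chunk_length = frog_size // 2                            # Eq. 3, truncated (assumed)
    start = int(rng.random() * (frog_size - chunk_length))   # Eq. 4
    end = start + chunk_length                               # Eq. 5, exclusive bound (assumed)
    chunk = x_source[start:end]                              # rows taken from the better frog
    chunk_ids = {wp_id for wp_id, _ in chunk}
    # Assumed repair step: the remaining rows come from the worst frog,
    # in their original relative order, skipping WPs already in the chunk.
    rest = [row for row in x_worst if row[0] not in chunk_ids]
    return rest[:start] + chunk + rest[start:]
```

The candidate returned by this step would then be accepted only if its fitness is better than that of the worst frog, as described above.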

3.3 Fitness Function

This study adopts a penalty-based fitness calculation. The fitness of a solution is computed as the sum of dependency violations and skill mismatches for each WP. Dependency violation is measured by checking whether all the WPs that a given WP depends on, as represented in the DSM, precede that WP in the solution. For each WP it depends on that does not precede it in the solution, the dependency violation count (V) is incremented by 1; otherwise there is no increment. Skill mismatch for a WP is computed by checking, for each employee e_i assigned to wp_i, whether none of the WP's required competences, competence(wp_i), belongs to the total competences of the team of employees assigned to wp_i; in other words, whether skill-set(e_i) ∩ competence(wp_i) equals ∅ (the empty set). The number of skill mismatches (M) is incremented by 1 for every mismatch that occurs within a solution. Consequently, the lower the fitness value, the better the solution (frog). The fitness of a frog is measured using Eq. 8.

Fig. 3. Worst frog improvement

Fit(A)=n

k=1(V+M);Fit(A)>=0(8)

Where: V =number of dependency violations, M =number of skill mismatches

and n =number of WPs for frog A.
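A penalty count in the spirit of Eq. 8 can be sketched in Python as follows. The sketch reads a skill mismatch as an assigned employee whose skill set shares nothing with the WP's required competences, which is one interpretation of the description above. The argument layout (`dsm` as a jagged list, `skills` as a dictionary keyed by employee number, `required` as a dictionary keyed by WP id) is hypothetical, and the frog is assumed to contain every WP exactly once.

```python
def fitness(frog, dsm, skills, required):
    """Sum dependency violations (V) and skill mismatches (M) over all WPs.

    frog:     list of (wp_id, assignment) rows, index = position in ordering
    dsm:      dsm[k] lists the WPs that WP k+1 depends on
    skills:   {employee_number: set of skill types}
    required: {wp_id: set of required skill types}
    """
    n_employees = len(skills)
    position = {wp_id: pos for pos, (wp_id, _) in enumerate(frog)}
    total = 0
    for wp_id, assignment in frog:
        # V: dependencies that do not precede this WP in the ordering.
        violations = sum(
            1 for dep in dsm[wp_id - 1] if position[dep] > position[wp_id]
        )
        # Decode the assignment; the leftmost bit is employee 1.
        bits = format(assignment, "0{}b".format(n_employees))
        team = [i + 1 for i, bit in enumerate(bits) if bit == "1"]
        # M: assigned employees with no skill in common with the WP's needs.
        mismatches = sum(1 for e in team if not (skills[e] & required[wp_id]))
        total += violations + mismatches
    return total
```

A frog that lists the WPs of Fig. 1 in the order 1 to 7 and assigns only suitably skilled employees scores 0 under this sketch; swapping WP1 and WP2 adds one dependency violation.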

4 Results and Discussion

The modified SFLA was tested on a randomly generated software project scheduling and staffing problem consisting of seven (7) WPs, twelve (12) dependencies and five (5) employees. The population size was varied as 100, 150, 200 and 300. The number of memeplexes was also varied as 5 and 10 for each case of the population size. This variation of population size and number of memeplexes is necessary because there are no generally accepted criteria for choosing the population size and number of memeplexes for a given problem. These parameters, together with the number of evolutionary iterations, greatly influence the performance of the algorithm. The number of evolutionary iterations per memeplex is set to 2N, where N is the number of frogs in each memeplex, as proposed in [4]. Because SFLA works on a virtual (randomly generated) population and tries to improve the frogs based on the convergence criteria set, its result and the degree to which the frogs are improved vary from run to run. Hence, there is a need to run the algorithm a number of times and average the results to better evaluate the performance of the algorithm on a given problem and how well the feasible solutions to the problem (frogs) are improved before finally selecting the best solution.
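To show how the three parameters interact, the following Python sketch wires a generic SFLA loop around caller-supplied functions. It is not the Java implementation evaluated here. The stopping rule (a fixed number of shuffles, or a best fitness of zero) is an assumption, since the convergence criteria are not detailed in the text, and the callables `random_frog`, `fitness` and `improve` are hypothetical placeholders for the problem-specific parts.

```python
import random


def sfla(random_frog, fitness, improve, population_size, n_memeplexes, shuffles, rng):
    """Generic SFLA loop; lower fitness is treated as better.

    random_frog(rng) -> new frog
    improve(worst, source, rng) -> candidate frog built from a better source
    """
    frogs = sorted((random_frog(rng) for _ in range(population_size)), key=fitness)
    for _ in range(shuffles):
        # Round-robin partition of the sorted population.
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        global_best = frogs[0]
        for memeplex in memeplexes:
            q = 2 * len(memeplex)  # evolutionary iterations, q = 2N
            for _ in range(q):
                memeplex.sort(key=fitness)
                best, worst = memeplex[0], memeplex[-1]
                for source in (best, global_best):
                    candidate = improve(worst, source, rng)
                    if fitness(candidate) < fitness(worst):
                        break
                else:
                    # Neither source helped: fall back to a random frog.
                    candidate = random_frog(rng)
                memeplex[-1] = candidate
        # Shuffle: merge the memeplexes and re-sort the whole population.
        frogs = sorted((f for mp in memeplexes for f in mp), key=fitness)
        if fitness(frogs[0]) == 0:  # assumed stopping rule for penalty fitness
            break
    return frogs[0]
```

With a population of 100 and 5 memeplexes, each memeplex holds N = 20 frogs and therefore runs q = 40 evolutionary iterations per shuffle in this sketch.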

Table 2 presents the results of experiments carried out on the random problem using SFLA with varied population size and number of memeplexes, and using pure random search. A total of twenty (20) independent runs, as proposed in [28], were performed on each case of the variation and the results were averaged. The same experiments were also carried out with pure random search for comparison with the proposed approach. All algorithms were implemented in Java.
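The averaging over independent runs can be sketched in Python as below. The helper `average_over_runs` and its seeding scheme (one seeded generator per run) are assumptions made for illustration; the study's Java code is not reproduced here.

```python
import random
import statistics


def average_over_runs(run_once, runs=20, base_seed=0):
    """Average the result of run_once over independent, seeded runs.

    run_once(rng) must return a single number, e.g. an average fitness.
    Each run receives its own random.Random instance so runs do not share state.
    """
    results = [run_once(random.Random(base_seed + r)) for r in range(runs)]
    return statistics.mean(results)
```

Using the same `base_seed` for SFLA and for random search gives both methods reproducible runs, which makes the averaged figures easier to compare.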

Table 2. Experimental results of SFLA and random search

| Population size | Number of memeplexes | Average fitness (SFLA) | Average fitness (random) |
|---|---|---|---|
| 100 | 5 | 0.27 | 4.12 |
|  | 10 | 0.49 |  |
| 150 | 5 | 0.23 | 5.11 |
|  | 10 | 0.32 |  |
| 200 | 5 | 0.16 | 4.79 |
|  | 10 | 0.28 |  |
| 300 | 5 | 0.15 | 5.14 |
|  | 10 | 0.22 |  |

It can be deduced that the algorithm worked better with a larger population size and that, for the same population size, the lower the number of memeplexes, the better the improvement of the whole population, the average fitness of the individual frogs in the population and the selected best solution. The SFLA approach was also compared with a pure random approach of generating feasible solutions (frogs) based on a set threshold (a maximum of 3 dependency violations can be made), and the proposed SFLA proved better than the random search approach. Figure 4 presents this comparison in a line graph.

Fig. 4. Average fitness comparison of SFLA and random search (line graph; x-axis: population size, with values 100, 150, 200 and 300; y-axis: average fitness; series: random search and SFLA)

From Fig. 4 it is observed that SFLA substantially outperformed random search at all population sizes, with a difference of up to 4.92 in average fitness. This result reveals the effectiveness of SFLA in handling the project scheduling and staffing problem under our formulation.

5 Conclusion

In this work, a data structure that enforces dependency constraints among Work Packages (WPs) was successfully adopted. The study also found a mathematical representation for staff allocation that is easy to implement. This enabled the adoption of a data structure for representing a frog (solution) that caters for both work package ordering and staff allocation. The study used SFLA to find a near-optimal solution for a randomly generated Software Project Scheduling Problem (SPSP), and a comparison was made with a purely random approach. The SFLA approach to project planning provides a new, effective and efficient perspective on software project scheduling. The result analysis of the study shows that it performs reasonably well in project scheduling. In the future, we plan to include more objectives, carry out empirical studies with real-world standard project scheduling problem instances, and compare results with existing studies.

References

1. Kang, K., Hahn, J.: Learning and forgetting curves in software development: does type of knowledge matter? In: ICIS 2009 Proceedings, p. 194 (2009)

2. Mojeed, H.A., Bajeh, A.O., Balogun, A.O., Adeleke, H.O.: Memetic approach for multi-objective overtime planning in software engineering projects. J. Eng. Sci. Technol. 14(6), 3213–3233 (2019)

3. Patil, N., Sawanti, K., Warade, P., Shinde, Y.: Survey paper for software project scheduling and staffing problem. Int. J. Adv. Res. Comput. Commun. Eng. 7, 5675–5677 (2014)

4. Oladele, R.O., Mojeed, H.A.: A shuffled frog-leaping algorithm for optimal software project planning. Afr. J. Comput. ICT 7(1), 147–152 (2014)

5. Amiri, M., Barbin, J.P.: New approach for solving software project scheduling problem using differential evolution algorithm. Int. J. Found. Comput. Sci. Technol. 5(1), 1–5 (2015)

6. Eshraghi, A.: A new approach for solving resource constrained project scheduling problems using differential evolution algorithm. Int. J. Ind. Eng. Comput. 7(2), 205–216 (2016)

7. Stylianou, C.S., Andreou, A.S.: Intelligent software project scheduling and team staffing with genetic algorithm. In: IFIP Advances in Information and Communication Technology (IFIPAICT), vol. 364. Springer, Heidelberg (2011)

8. Shen, X., Minku, L.L., Bahsoon, R., Yao, X.: Dynamic software project scheduling through a proactive-rescheduling method. IEEE Trans. Softw. Eng. 42(7), 658–686 (2016)

9. Vitekar, K.N., Dhanawe, S.A., Hanchate, D.B.: Review of solving software project scheduling problem with ant colony optimization. Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 2(4), 1177–1186 (2013)

10. Karova, M., Petkova, J., Smarkov, V.: A genetic algorithm for project planning problem. In: Proceedings International Scientific Conference Computer Science 2008, pp. 647–651 (2008)

11. Eusuff, M.M., Lansey, K.E.: Optimization of water distribution network design using the shuffled frog leaping algorithm. J. Water Resour. Plan. Manag. 129(3), 210–225 (2003)

12. Mai, G., Li, Y.: An improved shuffled frog leaping algorithm and its application. In: Proceedings of International Conference on Advances in Mechanical Engineering and Industrial Informatics, China (2015)

13. Eusuff, M., Lansey, K., Pasha, F.: Shuffled frog leaping algorithm: a memetic meta-heuristic for discrete optimization. Eng. Optim. 38(2), 129–154 (2006)

14. Elbeltagi, E., Hegazy, T., Grierson, D.: A modified shuffled frog-leaping optimization algorithm: applications to project management. Struct. Infrastruct. Eng. 3(1), 53–60 (2007)

15. Nejad, H.C., Jahani, R., Sarlak, G.: Applying shuffled frog-leaping algorithm for economic load dispatch of power system. Am. J. Sci. Res. 20, 82–89 (2011)

16. Liping, Z., Weiwei, W., Yefeng, X., Yixian, C.: Application of shuffled frog leaping algorithm to uncapacitated SLLS problem. AASRI Procedia 1, 226–231 (2012)

17. Gerasimou, S., Stylianou, C., Andreou, A.S.: An investigation of optimal project scheduling and team staffing in software development using particle swarm optimization. ICEIS 2, 168–171 (2012)

18. Chen, W.N., Zhang, J.: Ant colony optimization for software project scheduling and staffing with an event-based scheduler. IEEE Trans. Softw. Eng. 39(1), 1–17 (2013)

19. Weisstein, E.W.: NP-Hard Problem (2017). https://mathworld.wolfram.com/NP-HardProblem.html

20. Wysocki, R.K.: Effective Project Management: Traditional, Agile, Extreme, 5th edn., pp. 167–171. Wiley Publishing, Indianapolis (2009)

21. Marler, R.T., Arora, J.S.: Survey of multi-objective optimization methods for engineering. Struct. Multidiscip. Optim. 26, 369–395 (2004)

22. Krasnogor, N., Aragon, A., Pacheco, J.: Metaheuristic procedures for training neural networks. Operations Research/Computer Science Interfaces Series, vol. 36, pp. 225–248 (2006)

23. Rezende, A.V., Silva, L., Britto, A., Amaral, R.: Software project scheduling problem in the context of search-based software engineering: a systematic review. J. Syst. Softw. 155, 43–56 (2019)

24. Lin, J., Zhu, L., Gao, K.: A genetic programming hyper-heuristic approach for the multi-skill resource constrained project scheduling problem. Expert Syst. Appl. 140, 112915 (2020)

25. Li, Q., Sun, Q., Tao, S., Gao, X.: Multi-skill project scheduling with skill evolution and cooperation effectiveness. Eng. Constr. Archit. Manag. 27, 2023–2045 (2019)

26. Van Den Eeckhout, M., Maenhout, B., Vanhoucke, M.: A heuristic procedure to solve the project staffing problem with discrete time/resource trade-offs and personnel scheduling constraints. Comput. Oper. Res. 101, 144–161 (2019)

27. Shen, X., Guo, Y., Li, A.: Cooperative coevolution with an improved resource allocation for large-scale multi-objective software project scheduling. Appl. Soft Comput. 88, 106059 (2020)

28. Harman, M., Mansouri, S.A., Zhang, Y.: Search based software engineering: a comprehensive analysis and review of trends, techniques and applications. Technical report TR-09-03. Department of Computer Science, King's College, London (2009)

A genetic programming hyper-heuristic approach for the multi-skill resource constrained project scheduling problem

Article

Sep 2019
EXPERT SYST APPL

The multi-skill resource-constrained project scheduling problem (MS-RCPSP) is one of the most investigated problems in operations research and management science. In this paper, a genetic programming hyper-heuristic (GP-HH) algorithm is proposed to address the MS-RCPSP. Firstly, a single task sequence vector is used to encode a solution, and a repair-based decoding scheme is proposed to generate feasible schedules. Secondly, ten simple heuristic rules are designed to construct a set of low-level heuristics. Thirdly, genetic programming is utilized as a high-level strategy that can flexibly manage the low-level heuristics on the heuristic domain. In addition, the design-of-experiment (DOE) method is employed to investigate the effect of parameter settings. Finally, the performance of GP-HH is evaluated on the intelligent multi-objective project scheduling environment (iMOPSE) benchmark dataset consisting of 36 instances. Computational comparisons between GP-HH and the state-of-the-art algorithms indicate the superiority of the proposed GP-HH in computing feasible solutions to the problem.

Software Project Scheduling Problem in the Context of Search-Based Software Engineering: A Systematic Review

Article

May 2019
J SYST SOFTWARE

This work provides a systematic literature review of the software project scheduling problem, in the context of search-based software engineering, and summarizes the main models, techniques, search algorithms and evaluation criteria applied to solve this problem. We also discuss trends and research opportunities. Our keyword search found 438 papers published in the last 20 years. After applying the inclusion and exclusion criteria and performing the snowballing procedure, we analyzed 37 primary studies. The results show that evolutionary algorithms predominate. The static model, in which the scheduling is performed once during the project, is considered in the majority of the papers. Synthetic instances are commonly used to validate the heuristics, and hypervolume and execution time are the most commonly applied evaluation criteria.

A heuristic procedure to solve the project staffing problem with discrete time/resource trade-offs and personnel scheduling constraints

Article

Sep 2018
COMPUT OPER RES

When scheduling projects under resource constraints, assumptions are typically made with respect to the resource availability and activities are planned each with its own duration and resource requirements. In resource scheduling, important assumptions are made with respect to the staffing requirements. Both problems are typically solved in a sequential manner leading to a suboptimal outcome. We integrate these two interrelated scheduling problems to determine the optimal personnel budget that minimises the overall cost. Integrating these problems increases the scheduling flexibility, which improves the overall performance. In addition, we consider some resource demand flexibility in this research as an activity can be performed in multiple modes. In this paper, we present an iterated local search procedure for the integrated multi-mode project scheduling and personnel staffing problem. Detailed computational experiments are presented to evaluate different decomposition heuristics and comparison is made with alternative optimisation techniques.
