A Machine Learning Study to Enhance Project Cost Forecasting

June 2022
IFAC-PapersOnLine 55(10):3286-3291

DOI:10.1016/j.ifacol.2022.10.127

Conference: 10th IFAC Conference on Manufacturing Modelling, Management and Control

Authors:

Tolga Inan

TOBB University of Economics and Technology

Timur Narbaev

Kazakh-British Technical University

Oncu Hazir

Rennes School of Business

In project management, it is critical to obtain accurate cost forecasts using effective methods. This study presents a Machine Learning model based on Long Short-Term Memory to forecast project cost. The model uses a seven-dimensional feature vector, including schedule and cost performance factors and their moving averages, as a predictor. Based on the cost variation patterns from the training phase, we validate the model using three hundred experiments in the testing phase. Overall, the proposed model produces more accurate cost estimates than the traditional Earned Value Management index-based model.

The training and testing phases of the proposed ML approach.

IFAC PapersOnLine 55-10 (2022) 3286–3291

Peer review under responsibility of International Federation of Automatic Control.

10.1016/j.ifacol.2022.10.127 2405-8963

© 2022 The Authors. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)

A Machine Learning Study to Enhance Project Cost Forecasting

Tolga İnan*, Timur Narbaev**, Öncü Hazir***

* Electrical-Electronics Engineering Department, Çankaya University, Ankara, Turkey (e-mail: tolga.inan@cankaya.edu.tr)

** Business School, Kazakh-British Technical University, Almaty, Kazakhstan (e-mail: t.narbaev@kbtu.kz)

*** Supply Chain Management and Information Systems Department, Rennes School of Business, Rennes, France (e-mail: oncu.hazir@rennes-sb.com)

Keywords: Cost forecasting, Earned Value Management, Estimate at Completion, Machine Learning, Project Management.

1. INTRODUCTION

Almost all projects experience cost overruns irrespective of their size and type, as they face many uncertainties during their life cycles. Project monitoring and control methodologies such as Earned Value Management (EVM) are commonly used to limit these overruns. These methods mainly help organizations monitor project progress and budget use effectively. At any time during project execution, project teams need to know what has happened since the project start and, more importantly, to foresee what might happen in the remainder of the project. Accurate estimates are therefore critical to completing projects within budget and maintaining reliable communication with project stakeholders.

However, project monitoring and forecasting decisions are prone to the increasing uncertainties of today’s data-rich business environments. In this respect, we observe considerable potential for using Artificial Intelligence techniques in production (Cadavid et al., 2019; Rai et al., 2021) and project control (Munir, 2019; Chen et al., 2020; Ong and Uddin, 2020; Natarajan, 2022).

More specifically, Machine Learning (ML) algorithms can help organizations enhance project cost forecasting, which is the focus of this study. Even though the potential benefits are remarkable, the literature is still scant (Willems and Vanhoucke, 2015; Pellerin and Perrier, 2019; Hashemi et al., 2020). The following section briefly introduces some pertinent ML applications for cost forecasting in projects.

We first discuss the fundamentals of EVM, a Project Management (PM) methodology used to measure and forecast duration and cost in projects (Humphreys, 2018; Mahmoudi et al., 2021). We note that the conventional index-based EVM forecasting approaches assume linearity in cost growth (Anbari, 2003; PMI, 2019). However, cost growth in projects is usually nonlinear and often resembles an S-shaped curve (Barraza et al., 2004; Narbaev and De Marco, 2014b). Moreover, the index-based methods may produce inaccurate forecasts in the early stages of a project (Kim and Reinschmidt, 2011; Warburton and Cioffi, 2016), because the few data points available make extrapolation to the remaining part of the project unreliable (Lipke et al., 2009; De Marco et al., 2016). These two limitations of the index-based cost forecasting methods motivate our study.
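To make the linearity assumption concrete, the following is a minimal sketch of a conventional index-based forecast. It uses the standard EVM relation EAC = AC + (BAC − EV) / CPI with CPI = EV / AC; the function name `eac_cpi` and the sample figures are hypothetical and are not taken from this study.

```python
def eac_cpi(bac: float, ev: float, ac: float) -> float:
    """Index-based Estimate at Completion (EAC) using the Cost Performance Index.

    bac: Budget at Completion (planned total cost)
    ev:  Earned Value to date
    ac:  Actual Cost to date
    """
    if ev <= 0 or ac <= 0:
        # CPI is undefined before any value is earned or any cost is incurred
        raise ValueError("EV and AC must be positive to compute CPI")
    cpi = ev / ac  # cost efficiency observed so far
    # The remaining work (BAC - EV) is assumed to proceed at the same CPI,
    # which is the linear extrapolation discussed in the text.
    return ac + (bac - ev) / cpi


# Hypothetical project: budget 1000, with 400 earned for 500 spent so far
forecast = eac_cpi(bac=1000.0, ev=400.0, ac=500.0)
print(round(forecast, 2))
```

Because a single CPI value computed from the data to date is applied to all remaining work, the forecast is only as stable as that index, which is why early-stage estimates based on few data points can be unreliable.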

ML models have not been extensively applied to cost forecasting in ongoing projects. However, ML has great potential to enhance decision making in PM (IPMA, 2020). Organizations have been carrying out more and more projects and gathering a tremendous amount of data from them. Along with the data, know-how is also accumulated. Traditionally, this know-how is carried from project to project by senior managers. Project managers mainly depend on their experience and apply traditional methods to estimate the total cost and completion time, and they take corrective actions based on these predictions and that experience.

Our study aims to demonstrate how managers can benefit more from the data of completed projects by using ML. In particular, data from projects with similar characteristics are used to train ML-based estimators and support project

 Tolgaİnanetal./IFACPapersOnLine55-10(2022)3286–3291 3287

2022 The Authors. This is an open access article under the CC BY-NC-ND license

(

https://creativecommons.org/licenses/by-nc-nd/4.0/

)

A Machine Learning Study to Enhance Project Cost Forecasting

Tolga İnan*, Timur Narbaev**, Öncü Hazir ***

* Electrical-Electronics Engineering Department, Çankaya University, Ankara, Turkey (e-mail:

tolga.inan@cankaya.edu.tr)

** Business School, Kazakh-British Technical University, Almaty, Kazakhstan (e-mail:

t.narbaev@kbtu.kz)

*** Supply Chain Management and Information Systems Department, Rennes School of Business,

Rennes, France (e-mail: oncu.hazir@rennes-sb.com)

Abstract: In project management it is critical to obtain accurate cost forecasts using effective methods.

This study presents a Machine Learning model based on Long-Short Term Memory to forecast the project

cost. The model uses the seven-dimensional feature vector, including schedule and cost performance factors

and their moving averages as a predictor. Based on the cost variation patterns from the training phase, we

validate the model using three hundred experiments in the testing phase. Overall, the proposed model

produces more accurate cost estimates when compared to the traditional Earned Value Management index-

based model.

Keywords: Cost forecasting, Earned Value Management, Estimate at Completion, Machine Learning,

Project Management.

1. INTRODUCTION

Almost all the projects experience cost overruns irrespective

of their size and type, as they face many uncertainties during

their life cycles. Various project monitoring and control

methodologies such as Earned Value Management (EVM) are

commonly used to limit these cost overruns. These methods

mainly support the organizations to monitor the progress of the

projects and budget use effectively. During the project

execution, at any time, project teams need to know about what

has happened since the project start and, more importantly, be

able to foresee what might happen in the remaining life of the

projects. This makes accurate estimates critical to completing

projects under budget and maintaining reliable communication

with project stakeholders.

However, project monitoring and forecasting decisions are

prone to the increasing uncertainties of today’s data-rich

business environments. In this respect, we observe the

considerable potential for using Artificial Intelligence

techniques in production (Cadavid et al., 2019; Rai et al.,

2021) and project control (Munir, 2019; Chen et al., 2020; Ong

and Uddin, 2020; Natarajan, 2022).

More specifically, Machine Learning (ML) algorithms can aid

organizations in enhancing project cost forecasting, which we

focus on in this study. Even though the potential benefits are

remarkable, the literature is still scant (Willems and

Vanhoucke, 2015; Pellerin and Perrier, 2019; Hashemi et al.,

2020). The following section will briefly introduce some

pertinent ML applications for cost forecasting in projects.

We first discuss the fundamentals of EVM, a Project

Management (PM) methodology used to measure and forecast

duration and cost in projects (Humphreys, 2018; Mahmoudi et

al., 2021). We note that the conventional index-based EVM

forecasting approaches assume linearity in cost growth

(Anbari, 2003; PMI, 2019). However, cost growth is usually

nonlinear in projects and often resembles an S-shaped curve

pattern (Barraza et al., 2004; Narbaev and De Marco, 2014b).

Moreover, the index-based methods may produce inaccurate forecasts in the early stages of a project (Kim and Reinschmidt, 2011; Warburton and Cioffi, 2016) because the few data points available make extrapolation to the project’s remaining part unreliable (Lipke et al., 2009; De Marco et al., 2016). These two limitations of the index-based cost forecasting methods motivate our study.

ML models have not been extensively applied for cost

forecasting in ongoing projects. However, ML has a great

potential to enhance decision making in PM (IPMA, 2020).

Organizations have been carrying out more and more projects

and gathering a tremendous amount of data from the

undertaken projects. Along with the data, know-how also accumulates. Traditionally, senior managers carry this know-how from project to project. Project managers rely mainly on their experience and on traditional methods to estimate the total cost and completion time, and they take corrective actions based on these predictions.

Our study aims to demonstrate how managers can benefit more from the data of completed projects by using ML. In particular, data from projects with similar characteristics are used to train ML-based estimators and support project

managers in making estimations. These estimation methods can improve forecasting practice by accounting for the nonlinearity in cost growth and by learning from the available data points. To show the effectiveness of the chosen approach, we compare the cost forecasting accuracy of our ML approach with that of the traditional index-based model.

The remainder of the paper is structured as follows. Next, we

introduce key EVM metrics and review relevant studies on ML

applications in project cost forecasting. Then, we present our

ML approach and the project dataset for calculating the cost

estimates. Next, we report the results of our comparative

analysis and discuss the main results. Finally, we conclude

with the study summary, research limitations, and future

research directions.

2. BACKGROUND

2.1 Key EVM metrics

According to the Project Management Institute (PMI, 2019),

EVM is a methodology used by project managers to monitor

and control the schedule and budget of a project. It is based on

three key metrics: Planned Value (PV) – the budgeted value of the scheduled work; Actual Cost (AC) – the actual value of the performed work; and Earned Value (EV) – the budgeted value of the performed work (Anbari, 2003). Budget at Completion

(BAC) is the project’s total budget and Cost at Completion

(CAC) is its total actual cost at completion. Planned Duration

(PD) is the project’s scheduled duration and Actual Duration

(AD) is its actual duration at completion. To assess the

project’s cost performance (efficient use of BAC), Cost

Performance Index (CPI=EV/AC) is used. To measure the

project’s schedule progress, Schedule Performance Index

(SPI=EV/PV) is applied.

Finally, Cost Estimate at Completion (EAC($)) is the forecast

that represents the final cost of a project. Our study uses ML

to obtain a more accurate EAC($), and the index-based

formula in (1) is used as a benchmark. We compare our

EAC($) results with the ones calculated by (1).

($)=+− (1)

The linear model (1) is selected as the benchmark following

the study of Batselier and Vanhoucke (2015b), who conducted

a comparative analysis of eight index-based EAC($) models.

They used the EVM data of 51 real projects and associated

simulations for comparison. Based on the accuracy results,

measured by Mean Absolute Percentage Error (MAPE)

(defined next in the paper), they found that the model by (1)

showed dominance over the other seven models and produced

the most accurate EAC($) estimates.
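The metrics in this subsection can be illustrated with a minimal sketch. The `TrackingPoint` class, the function names, and the sample numbers are hypothetical and not taken from the study. The benchmark function assumes that model (1) has the form EAC($) = AC + (BAC − EV), which should be checked against the original equation.

```python
from dataclasses import dataclass


@dataclass
class TrackingPoint:
    """EVM snapshot at one tracking point (all values in the same currency)."""
    pv: float  # Planned Value: budgeted value of the scheduled work
    ev: float  # Earned Value: budgeted value of the performed work
    ac: float  # Actual Cost: actual value of the performed work


def cpi(p: TrackingPoint) -> float:
    """Cost Performance Index, CPI = EV / AC."""
    return p.ev / p.ac


def spi(p: TrackingPoint) -> float:
    """Schedule Performance Index, SPI = EV / PV."""
    return p.ev / p.pv


def eac_benchmark(p: TrackingPoint, bac: float) -> float:
    """Assumed form of benchmark (1): EAC($) = AC + (BAC - EV)."""
    return p.ac + (bac - p.ev)


# Hypothetical project with BAC = 1000, observed at one tracking point.
point = TrackingPoint(pv=400.0, ev=360.0, ac=450.0)
print(cpi(point))                    # 0.8 (over budget so far)
print(spi(point))                    # 0.9 (behind schedule so far)
print(eac_benchmark(point, 1000.0))  # 1090.0
```

In this example, CPI and SPI below one signal cost and schedule underperformance, and the assumed benchmark adds the remaining budgeted work to the cost already incurred.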

2.2 Brief review of ML applications for project cost

forecasting

First, we note that ML models have not been extensively

applied in project monitoring and control. Only a few studies

developed ML models, specifically for project cost forecasting

during the project execution phase. Table 1 provides a

summary of these studies with a brief description of their

models and contribution to the EVM body of knowledge.

Among the first implementations, Pewdum et al. (2009)

developed an Artificial Neural Network (ANN) model based

on Backpropagation to improve the accuracy of duration and

cost estimates. They integrated numerous variables. Among all

the variables, the traffic volume, weather conditions, contract

duration, construction budget, percent complete of the planned

work, and percent complete of the actual work performed were

the most influential. Their model produced accurate EAC($)

when applied to highway construction projects in Thailand.

Table 1. A summary of the reviewed studies on ML applications for project forecasting

| Study | Description | Contribution to EVM |
|---|---|---|
| Pewdum et al. (2009) | Several project performance factors are integrated into the ANN-based Backpropagation model | Duration and cost forecasting during project execution |
| Narbaev and De Marco (2014a) | Supervised regression approach that integrates the EVM cost data through the Gompertz Growth modeling | Cost forecasting during project execution |
| Elmousalami (2021) | Numerous ML methods, including the Ensemble-based, are compared using Fuzzy Logic | Cost estimation during project planning |
| Ottaviani and De Marco (2022) | Multiple linear regression model is proposed using the EVM cost data as independent (input) variables | Cost forecasting during project execution |
| Natarajan (2022) | The ANN and Reference class forecasting approaches are integrated to produce probabilistic estimates | Duration and cost forecasting during project planning and execution |
| Wauters and Vanhoucke (2016) | The four ML techniques are compared to produce more accurate duration estimates | Duration forecasting during project execution |
| The current study | The ANN-based Long-Short Term Memory model that uses the EVM-based CPI and SPI metrics and their derivatives | Cost forecasting during project execution |

3288 Tolgaİnanetal./IFACPapersOnLine55-10(2022)3286–3291

Narbaev and De Marco (2014a) adopted a Supervised

nonlinear regression approach based on the Gompertz Growth

model. Applying their model to nine construction projects, the

authors compared the accuracy of their estimates with the ones

produced by implementing the CPI-integrated index-based

model. As their model fits better to the S-shaped curve

observed in project cost growth, they obtained more accurate

cost estimates.

Elmousalami (2021) integrated ML algorithms into the cost

estimation efforts at the project development stage. Using the

Fuzzy Logic, they embedded the uncertainty factors in their

ML models and showed that the Ensemble methods were

superior in predicting performance.

Recently, Ottaviani and De Marco (2022) developed a multiple

linear regression model to assess the impact of input variables

(CPI, original cost forecast, and percent of work performed)

and improve the model fitting to the project’s real CAC value.

Using the data of 29 real-life projects, they showed that their

model with the three variables provided higher accuracy and

lower variance in EAC($) estimates.

Natarajan (2022) proposed a comprehensive model that

integrated Reference class forecasting (the outside view of a

project) and ML (the inside view from the project data) to

improve schedule and budget planning and control. Using the

cost data of 106 and the schedule data of 130 oil and gas

projects, the author showed a higher predictive capability of

the ML approach in predicting the most likely cost and

schedule overruns in projects.

To forecast the project duration, Wauters and Vanhoucke

(2016) applied Decision Tree, Bagging, Random Forest, and

Boosting techniques. They compared their forecasting results

with the ones by the conventional models (based on linear

performance indexes). Using artificial project data, they

showed that ML approaches had more accurate predicting

capabilities than the traditional index-based methods.

Finally, we refer to some review studies. Willems and

Vanhoucke (2015) examined the EVM methods and some ML

applications in project control. Hashemi et al. (2020) discussed

the ML applications for project cost forecasting. Ulusoy and

Hazir (2021) listed many interesting application areas of ML

in PM.

3. METHODOLOGY

3.1 Model

ML has been increasingly used in many fields, from computer

vision to biometric recognition, from advertising to the defense

industry. In the literature, ML approaches are classified into

Supervised learning, Unsupervised learning, Semi-supervised

learning, Reinforcement learning, and Dimensionality

reduction (e.g., Panda et al., 2021).

Unsupervised learning methods use input data, mainly to find

out the regularity in data. On the other hand, Supervised

techniques use both input and output data. Depending on the

problem, output data can be real numbers, integers, or

categories. In our study, the cost figures (real numbers)

constitute the output data.

Therefore, in this study, we focus on Supervised ML as there

is output data. In this approach, the training phase is crucial as

the patterns between the input-output data are found. On the

other hand, the testing phase of the Supervised ML generates

outputs following the input-output patterns determined during

the training phase. In Supervised ML, approaches can be

grouped into classification and regression subcategories. The

classification algorithms generate discrete or categorical

outputs. The regression algorithms generate continuous

outcomes. Time-sequence regression is the type of Supervised

ML that we implement in this study.

To explain how we used the time-sequence regression in our

study, we describe our approach including the processes used

in the ML training and testing phases. Fig. 1 presents how

these two phases are used within our prediction algorithm. We

learn the suitable ML model in the training phase, and the

learned ML model is used as a predictor in the testing phase.

We employ a recurrent ANN, namely a Long Short-Term Memory (LSTM) network. LSTM networks are suitable for sequence-to-sequence regression problems. We refer to Hochreiter and Schmidhuber (1997) and Greff et al. (2016) for more information on LSTM networks.
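To make the sequence-to-sequence idea concrete, the following is a minimal sketch of the forward pass of a standard LSTM cell (input, forget, and output gates) followed by a linear readout. It is not the network used in the study: the hidden size, the random untrained weights, the readout, and all names are assumptions chosen for illustration only.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """One standard LSTM cell with input, forget, and output gates."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix (n_hidden x (n_in + n_hidden)) and bias per gate:
        # "i" input gate, "f" forget gate, "o" output gate, "g" candidate state.
        self.w = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                   for _ in range(n_hidden)]
            for gate in "ifog"
        }
        self.b = {gate: [0.0] * n_hidden for gate in "ifog"}

    def _pre(self, gate: str, z: Vector) -> Vector:
        """Pre-activation W z + b for one gate."""
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.w[gate], self.b[gate])]

    def step(self, x: Vector, h: Vector, c: Vector) -> Tuple[Vector, Vector]:
        z = x + h  # concatenate current input and previous hidden state
        i = [sigmoid(v) for v in self._pre("i", z)]
        f = [sigmoid(v) for v in self._pre("f", z)]
        o = [sigmoid(v) for v in self._pre("o", z)]
        g = [math.tanh(v) for v in self._pre("g", z)]
        # New cell state keeps part of the old state and adds gated new content.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def predict_sequence(cell: LSTMCell, readout: Vector, bias: float,
                     sequence: List[Vector]) -> List[float]:
    """Emit one scalar output per tracking point (sequence-to-sequence)."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    outputs = []
    for x in sequence:
        h, c = cell.step(x, h, c)
        outputs.append(sum(w * v for w, v in zip(readout, h)) + bias)
    return outputs


# Two hypothetical tracking points, each a seven-dimensional feature vector.
cell = LSTMCell(n_in=7, n_hidden=4)
sequence = [[1.0, 1.0, 1.0, 0.9, 0.9, 0.9, 0.25],
            [0.9, 0.95, 0.95, 0.8, 0.85, 0.85, 0.50]]
outputs = predict_sequence(cell, readout=[0.5, -0.5, 0.5, -0.5], bias=1.0,
                           sequence=sequence)
print(len(outputs))  # 2
```

The cell state carries information across tracking points, which is why one output per tracking point can depend on the whole history seen so far. A trained model would learn the weights from data rather than draw them at random.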

ML models require features (inputs) to make the prediction.

Therefore, we must define the features of our ML model. We

design a seven-dimensional feature vector. Six dimensions of

the feature vector consist of CPI and SPI metrics and their

moving average filtered versions (having window sizes of two

and three tracking points for each metric). The seventh and last

dimension of the input vector is the normalized time. The

normalized time is found by dividing AD by PD for a

particular tracking point. The output (predicted value) of the

ML model is the cost at completion.
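The feature construction described above can be sketched as follows. The ordering of the seven features, the handling of the first tracking points (a trailing average over whatever points are available), and all names are assumptions, since the text does not specify them.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early points use the values available so far."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def build_features(cpi: Sequence[float], spi: Sequence[float],
                   elapsed: Sequence[float],
                   planned_duration: float) -> List[List[float]]:
    """Return one seven-dimensional feature vector per tracking point."""
    cpi_ma2, cpi_ma3 = moving_average(cpi, 2), moving_average(cpi, 3)
    spi_ma2, spi_ma3 = moving_average(spi, 2), moving_average(spi, 3)
    rows = []
    for t in range(len(cpi)):
        rows.append([
            cpi[t], cpi_ma2[t], cpi_ma3[t],   # CPI and its filtered versions
            spi[t], spi_ma2[t], spi_ma3[t],   # SPI and its filtered versions
            elapsed[t] / planned_duration,    # normalized time
        ])
    return rows


# Hypothetical project with four tracking points and PD = 50 days.
features = build_features(
    cpi=[1.0, 0.9, 0.8, 0.7],
    spi=[1.0, 1.0, 0.9, 0.8],
    elapsed=[10.0, 20.0, 30.0, 40.0],
    planned_duration=50.0,
)
print(len(features), len(features[0]))  # 4 7
```

The moving averages smooth out point-to-point noise in CPI and SPI, while the normalized time tells the model how far the project has progressed relative to its plan.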

The training-testing protocol we use for the ML is as follows. We use 12 projects in the training phase and three projects in the testing phase of each experiment. The projects assigned to the training and testing phases are selected randomly. We repeat the experiment, covering both the training and testing phases, a hundred times for each of the three test projects, which gives three hundred experiments in total. Over these repetitions, all projects are used in both the training and testing phases. Accordingly, the reported results do not depend on a particular choice of training and testing sets.
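One possible reading of this protocol is a repeated random hold-out, sketched below. The function name, the seed, and the interpretation that each of the hundred repetitions draws a fresh 12/3 split are assumptions.

```python
import random
from typing import Iterable, List, Tuple


def repeated_holdout(project_ids: Iterable[int], n_test: int = 3,
                     n_repeats: int = 100,
                     seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return (train_ids, test_ids) pairs, one random split per repetition."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        shuffled = list(project_ids)
        rng.shuffle(shuffled)
        splits.append((shuffled[n_test:], shuffled[:n_test]))
    return splits


# A pool of 15 projects, identified here by hypothetical integer ids.
splits = repeated_holdout(range(15))
n_experiments = sum(len(test) for _, test in splits)
print(n_experiments)  # 300
```

With a hundred repetitions and three test projects each, this yields three hundred test evaluations, and across repetitions every project can appear in both roles.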

Figure 1. The training and testing phases of the proposed ML approach.

 Tolgaİnanetal./IFACPapersOnLine55-10(2022)3286–3291 3289

Narbaev and De Marco (2014a) adopted a Supervised

nonlinear regression approach based on the Gompertz Growth

model. Applying their model to nine construction projects, the

authors compared the accuracy of their estimates with the ones

produced by implementing the CPI-integrated index-based

model. As their model fits better to the S-shaped curve

observed in project cost growth, they obtained more accurate

cost estimates.

Elmousalami (2021) integrated ML algorithms into the cost

estimation efforts at the project development stage. Using the

Fuzzy Logic, they embedded the uncertainty factors in their

ML models and showed that the Ensemble methods were

superior in predicting performance.

Recently, Ottaviani and De Marco (2022) developed a multiple

linear regression model to assess the impact of input variables

(CPI, original cost forecast, and percent of work performed)

and improve the model fitting to the project’s real CAC value.

Using the data of 29 real-life projects, they showed that their

model with the three variables provided higher accuracy and

lower variance in EAC($) estimates.

Natarajan (2022) proposed a comprehensive model that

integrated Reference class forecasting (the outside view of a

project) and ML (the inside view from the project data) to

improve schedule and budget planning and control. Using the

cost data of 106 and the schedule data of 130 oil and gas

projects, the author showed a higher predictive capability of

the ML approach in predicting the most likely cost and

schedule overruns in projects.

To forecast the project duration, Wauters and Vanhoucke

(2016) applied Decision Tree, Bagging, Random Forest, and

Boosting techniques. They compared their forecasting results

with the ones by the conventional models (based on linear

performance indexes). Using artificial project data, they

showed that ML approaches had more accurate predicting

capabilities than the traditional index-based methods.

Finally, we refer to some review studies. Willems and

Vanhoucke (2015) examined the EVM methods and some ML

applications in project control. Hashemi et al. (2020) discussed

the ML applications for project cost forecasting. Ulusoy and

Hazir (2021) listed many interesting application areas of ML

in PM.

3. METHODOLOGY

3.1 Model

ML has been increasingly used in many fields, from computer

vision to biometric recognition, from advertising to the defense

industry. In the literature, ML approaches are classified into

Supervised learning, Unsupervised learning, Semi-supervised

learning, Reinforcement learning, and Dimensionality

reduction (e.g., Panda et al., 2021).

Unsupervised learning methods use input data, mainly to find

out the regularity in data. On the other hand, Supervised

techniques use both input and output data. Depending on the

problem, output data can be real numbers, integers, or

categories. In our study, the cost figures (real numbers)

constitute the output data.

Therefore, in this study, we focus on Supervised ML as there

is output data. In this approach, the training phase is crucial as

the patterns between the input-output data are found. On the

other hand, the testing phase of the Supervised ML generates

outputs following the input-output patterns determined during

the training phase. In Supervised ML, approaches can be

grouped into classification and regression subcategories. The

classification algorithms generate discrete or categorical

outputs. The regression algorithms generate continuous

outcomes. Time-sequence regression is the type of Supervised

ML that we implement in this study.

To explain how we used the time-sequence regression in our

study, we describe our approach including the processes used

in the ML training and testing phases. Fig. 1 presents how

these two phases are used within our prediction algorithm. We

learn the suitable ML model in the training phase, and the

learned ML model is used as a predictor in the testing phase.

We employ the ML model of a recurrent ANN type, namely

Long-Short Term Memory (LSTM). The LSTM networks are

suitable for sequence-to-sequence regression problems. We

refer to Hochreiter and Schmidhuber (1997) and Greff et al.

(2016) for more information on the LSTM networks.

ML models require features (inputs) to make the prediction.

Therefore, we must define the features of our ML model. We

design a seven-dimensional feature vector. Six dimensions of

the feature vector consist of CPI and SPI metrics and their

moving average filtered versions (having window sizes of two

and three tracking points for each metric). The seventh and last

dimension of the input vector is the normalized time. The

normalized time is found by dividing AD by PD for a

particular tracking point. The output (predicted value) of the

ML model is the cost at completion.

The training-testing protocol we use for the ML is as follows.

We use 12 projects in the training phase and three projects in

the testing phase of our experiments. Projects in the training

and testing phase are randomly selected. We repeat the

experiment a hundred times for each of the three projects,

covering the training and testing phases. Therefore, all projects

are used in both training and testing phases. Accordingly, the

results are reported independently of the training and testing

sets.

Figure 1. The training and testing phases of the proposed ML

approach.

The evaluation criterion for the accuracy of our model’s cost estimate for a given project is the percentage error (the percent difference between the cost estimate and the actual cost of the project). We take the absolute values of these errors, average them for all the projects in our dataset, and report the result as the Mean Absolute Percentage Error (MAPE), as in

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{CAC - \mathrm{EAC}(\$)_t}{CAC} \right| \qquad (2) \]

where t = 1, …, n indexes the tracking periods of a project, n is the number of tracking periods, and EAC($)_t is the cost estimate made at tracking period t.
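As a worked illustration of the MAPE, the sketch below averages the absolute percentage errors of the estimates made at each tracking period of one project. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def mape(eac_by_period: Sequence[float], cac: float) -> float:
    """Mean Absolute Percentage Error of the EAC($) estimates against CAC."""
    n = len(eac_by_period)
    return 100.0 * sum(abs(cac - eac) / cac for eac in eac_by_period) / n


# Hypothetical project with CAC = 1000 and four tracking periods.
# Absolute percentage errors: 10%, 5%, 2%, 1%, so the mean is 4.5%.
estimates = [1100.0, 1050.0, 980.0, 1010.0]
print(round(mape(estimates, 1000.0), 2))  # 4.5
```

Because the errors enter in absolute value, over- and underestimates do not cancel each other out.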

3.2 Dataset

We use the actual project data shared by the Operations Research & Scheduling Research Group of Ghent University (ORSRG, 2022; Batselier and Vanhoucke, 2015a). This database includes EVM data of 133 projects that have been executed and completed in different industries. The database consists mainly of construction projects. Considering this structure, we limit our scope to construction projects, which leaves the EVM data of 41 real-life completed projects.

The total duration and cost data of the 41 construction projects extracted from the database are shown in Fig. 2. The projects span a wide range of durations and budgets. Considering this variance, we chose the projects within a specific range. For the budget, we set an upper limit of 3 million Euros. For the duration, we chose the projects with a maximum duration of 150 days and a minimum of four tracking points. The projects that fall within these ranges share some similarities, whereas the others are very small or very large and differ considerably in project characteristics and resources. Setting these budget and time limits yielded a pool of 15 projects. We randomly selected 12 projects for the training set and reserved the remaining three projects for testing.
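The selection rules above can be expressed as a simple filter. The `Project` record, its field names, the sample projects, and the choice of inclusive bounds are assumptions made for illustration; the actual database fields may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Project:
    name: str              # hypothetical identifier
    budget_eur: float      # budget in Euros
    duration_days: float   # project duration in days
    tracking_points: int   # number of EVM tracking points


def in_scope(p: Project, max_budget_eur: float = 3_000_000.0,
             max_duration_days: float = 150.0,
             min_tracking_points: int = 4) -> bool:
    """Apply the budget, duration, and tracking-point limits (inclusive)."""
    return (p.budget_eur <= max_budget_eur
            and p.duration_days <= max_duration_days
            and p.tracking_points >= min_tracking_points)


projects: List[Project] = [
    Project("A", 450_000.0, 90.0, 6),
    Project("B", 12_000_000.0, 400.0, 20),  # too large and too long
    Project("C", 2_900_000.0, 150.0, 4),
    Project("D", 800_000.0, 60.0, 3),       # too few tracking points
]
pool = [p.name for p in projects if in_scope(p)]
print(pool)  # ['A', 'C']
```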

4. SUMMARY OF THE RESULTS

Our results show that in 75.33% of the projects tested, the MAPE (2) obtained using our ML model was smaller than that obtained with the traditional index-based model (1). We computed the difference between the two MAPEs and present it as a histogram in Fig. 3.

A positive MAPE difference in this histogram indicates that our ML method produced the smaller MAPE. A negative difference indicates the projects where our ML model produced a larger MAPE than the conventional index-based model. In about 50.00% of the 75.33% of projects tested, the MAPE difference is about 1.00%. Even though this is a negligible difference in the accuracy of the EAC($) estimates between the two models, we note that the proposed model can learn from the given EVM data: the EAC($) estimates calculated in the testing phase followed the input-output patterns of the EVM data of the projects analyzed in the training phase. During the training phase, the cost-related EVM data was used to build the proposed ML algorithm with an LSTM network. Our ML model evaluated this input data repeatedly until it learned the cost growth pattern (behavior).
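The comparison reported in this section can be sketched as follows. The function names and the sample MAPE pairs are hypothetical. The last line is only an arithmetic check: if the 75.33% figure refers to the three hundred test experiments, it would correspond to 226 of them.

```python
from typing import List, Tuple


def mape_difference(mape_index: float, mape_ml: float) -> float:
    """Positive when the ML model has the smaller MAPE."""
    return mape_index - mape_ml


def share_ml_better(results: List[Tuple[float, float]]) -> float:
    """Percentage of cases in which the ML model has the smaller MAPE."""
    wins = sum(1 for mape_index, mape_ml in results
               if mape_difference(mape_index, mape_ml) > 0)
    return 100.0 * wins / len(results)


# Hypothetical (index-based MAPE, ML MAPE) pairs for four test cases.
results = [(10.0, 8.0), (5.0, 6.0), (7.0, 4.0), (9.0, 9.0)]
print(share_ml_better(results))     # 50.0
print(round(100.0 * 226 / 300, 2))  # 75.33
```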

5. CONCLUSION

Cost overrun is a common problem in projects undertaken in various industries. To deal with it, project managers opt to monitor the use of project budgets continuously. Many of them try to produce accurate cost estimates using the traditional EVM methods. These methods are mainly based on cost and schedule performance indexes, which are linear. However, projects’ budget acquisition and cost growth patterns are nonlinear and resemble an S-shape. Therefore, such methods have inherent limitations in providing reliable and accurate cost estimates that reflect the real cost growth behavior. Also, whatever the approach followed or the method implemented, having accurate cost estimates is critical to completing projects successfully and maintaining loyal relationships with project stakeholders.

Considering this limitation of the existing index-based models and the importance of having accurate cost estimates, in this study, we developed an ML algorithm for estimating the total project cost more accurately. We employed a Supervised ML model based on the LSTM network to forecast EAC($). The EVM data of 41 real completed projects validated the proposed approach. The training phase of our approach with 12 projects allowed us to learn from the given dataset the patterns that characterized the changes in the project cost. We used the seven-dimensional feature vector that considered EVM metrics like CPI and SPI and their moving averages and the normalized time as a predictor. Based on this, we used the learned patterns to calculate EAC($). In the testing phase, we validated our approach on three projects with an associated hundred experiments for each project. We compared our approach’s EAC($) accuracy results with the ones computed using the index-based model (1), which is widely used in practice. Overall, our model produced more accurate EAC($) results in 75.33% of project cases.

Figure 2. The cost and duration plot for 41 projects.

Figure 3. The MAPE difference between the proposed ML model and the EVM index-based model.

We acknowledge the following limitations that can potentially

be addressed in future research. First, we conducted the

experiments using a small dataset. We intend to extend the

current research using a larger pool of projects and evaluate

the model using additional forecasting criteria such as stability

and timeliness of EAC($), in addition to the accuracy. Second,

we will also work with projects from different industries, not

only construction. However, the initial results obtained with

the proposed ML approach are promising. The proposed

method can be combined with other forecasting techniques to

improve the solutions further.

ACKNOWLEDGMENTS

This research was funded by the Science Committee of the

Ministry of Education and Science of the Republic of

Kazakhstan (Grant No. AP09259049).

REFERENCES

Anbari, F.T. (2003). Earned value project management

method and extensions. Project Management Journal,

34(4), 12–23.

Barraza, G.A., Back, W.E., and Mata, F. (2004). Probabilistic

forecasting of project performance using stochastic S

curves. Journal of Construction Engineering and

Management, 130(1).

Batselier, J. and Vanhoucke, M. (2015a). Construction and

evaluation framework for a real-life project database,

International Journal of Project Management, 33(3),

697–710.

Batselier, J. and Vanhoucke, M. (2015b). Empirical evaluation

of earned value management forecasting accuracy for

time and cost. Journal of Construction Engineering and

Management, 141(11), 05015010.

Cadavid, J.P.U., Lamouri, S., Grabot, B., and Fortin, A.

(2019). Machine Learning in Production Planning and

Control: A Review of Empirical Literature. IFAC

PapersOnLine, 52(13), 385–390.

Chen, Z., Demeulemeester, E., Bai, S., and Guo, S. (2020). A

Bayesian approach to set tolerance limits for a statistical

project management. International Journal of

Production Research, 58(10), 3150-3163.

De Marco, A., Rosso, M., and Narbaev, T. (2016). Nonlinear

cost estimates at completion adjusted with risk

contingency. Journal of Modern Project Management,

4(2), 24–33.

Elmousalami, H.H. (2021). Comparison of Artificial

Intelligence techniques for project conceptual cost

prediction: A case study and comparative analysis. IEEE

Transactions on Engineering Management, 68(1), 183-

196.

Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., and

Schmidhuber, J. (2016). LSTM: A search space odyssey.

IEEE Transactions on Neural Networks and Learning

Systems, 28(10), 2222-2232.

Hashemi, S.T., Ebadati, O.M., and Kaur, H. (2020). Cost

estimation and prediction in construction projects: a

systematic review on machine learning techniques. SN

Applied Sciences, 2, 1703.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term

memory. Neural Computation, 9(8), 1735-1780.

Humphreys, G.C. (2018). Project management using Earned

value. 4th edition. Humphreys & Associates, Inc.

IPMA. (2020). Report on Artificial Intelligence impact in

Project Management. International Project Management

Association. Amsterdam, The Netherlands.

Kim, B.C. and Reinschmidt, K.F. (2011). Combination of

project cost forecasts in earned value management.

Journal of Construction Engineering and Management,

137(11), 958–966.

Lipke, W., Zwikael, O., Henderson, K., and Anbari, F. (2009).

Prediction of project outcome: the application of

statistical methods to earned value management and

earned schedule performance indexes. International

Journal of Project Management, 27(4), 400–407.

Mahmoudi, A., Bagherpour M., and Javed, S.A. (2021). Grey

earned value management: Theory and applications.

IEEE Transactions on Engineering Management, 68(6),

1703-1721.

Munir, M. (2019). How Artificial Intelligence can help project

managers. Global Journal of Management And Business

Research, 19(4), 1-8.

Narbaev, T. and De Marco, A. (2014a). An Earned Schedule-

based regression model to improve cost estimate at

completion, International Journal of Project

Management, 32(6), 1007-1018.

Narbaev, T. and De Marco, A. (2014b). Combination of

growth model and earned schedule to forecast project

cost at completion. Journal of Construction Engineering

and Management, 140(1), 04013038.

Natarajan, A. (2022). Reference class forecasting and Machine

Learning for improved offshore oil and gas megaproject

planning: Methods and application. Project Management

Journal, 53(OnlineFirst), 1-29.

Ong, S. and Uddin, S. (2020). Data science and Artificial

Intelligence in project management: The past, present

 Tolgaİnanetal./IFACPapersOnLine55-10(2022)3286–3291 3291

normalized time as a predictor. Based on this, we used the learned patterns to calculate EAC($). In the testing phase, we validated our approach on three projects, running one hundred experiments for each project. We compared the EAC($) accuracy of our approach with that of the index-based model (1), which is widely used in practice. Overall, our model produced more accurate EAC($) results in 75.33% of the project cases.
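The comparison summarized above can be illustrated with a minimal Python sketch. It assumes that the index-based model (1) takes the common CPI-based form EAC = AC + (BAC − EV)/CPI with CPI = EV/AC, and that accuracy is measured as the absolute percentage error against the actual final cost. The function names and all numbers are hypothetical and are not taken from the study.

```python
def eac_index_based(bac, ev, ac):
    """CPI-based estimate at completion (assumed form of the index-based model).

    bac: budget at completion, ev: earned value, ac: actual cost to date.
    """
    if ev <= 0 or ac <= 0:
        raise ValueError("ev and ac must be positive to compute CPI")
    cpi = ev / ac                      # cost performance index
    return ac + (bac - ev) / cpi       # cost to date plus CPI-adjusted remaining work


def abs_pct_error(estimate, actual):
    """Absolute percentage error of an EAC against the actual final cost."""
    return abs(estimate - actual) / actual * 100.0


def win_rate(model_errors, baseline_errors):
    """Percentage of experiments in which the model error is strictly lower."""
    if len(model_errors) != len(baseline_errors) or not model_errors:
        raise ValueError("error lists must be non-empty and of equal length")
    wins = sum(m < b for m, b in zip(model_errors, baseline_errors))
    return 100.0 * wins / len(model_errors)


# Hypothetical values, for illustration only (not data from the study).
baseline_eac = eac_index_based(bac=1000.0, ev=400.0, ac=500.0)
baseline_err = abs_pct_error(baseline_eac, actual=1200.0)
share = win_rate([2.0, 5.0, 1.0, 4.0], [3.0, 4.0, 2.0, 6.0])
print(baseline_eac, baseline_err, share)
```

Applied to the three hundred experiments of the testing phase, a share of 75.33% would correspond to the ML model having the lower error in 226 of them.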

We acknowledge the following limitations, which can be addressed in future research. First, we conducted the experiments using a small dataset. We intend to extend the current research using a larger pool of projects and to evaluate the model using additional forecasting criteria, such as the stability and timeliness of EAC($), in addition to accuracy. Second, we will also work with projects from different industries, not only construction. Nevertheless, the initial results obtained with the proposed ML approach are promising. The proposed method can be combined with other forecasting techniques to improve the solutions further.

ACKNOWLEDGMENTS

This research was funded by the Science Committee of the Ministry of Education and Science of the Republic of Kazakhstan (Grant No. AP09259049).

REFERENCES

Anbari, F.T. (2003). Earned value project management method and extensions. Project Management Journal, 34(4), 12–23.

Barraza, G.A., Back, W.E., and Mata, F. (2004). Probabilistic forecasting of project performance using stochastic S curves. Journal of Construction Engineering and Management, 130(1).

Batselier, J. and Vanhoucke, M. (2015a). Construction and evaluation framework for a real-life project database. International Journal of Project Management, 33(3), 697–710.

Batselier, J. and Vanhoucke, M. (2015b). Empirical evaluation of earned value management forecasting accuracy for time and cost. Journal of Construction Engineering and Management, 141(11), 05015010.

Cadavid, J.P.U., Lamouri, S., Grabot, B., and Fortin, A. (2019). Machine Learning in Production Planning and Control: A Review of Empirical Literature. IFAC PapersOnLine, 52(13), 385–390.

Chen, Z., Demeulemeester, E., Bai, S., and Guo, S. (2020). A Bayesian approach to set the tolerance limits for a statistical project control method. International Journal of Production Research, 58(10), 3150–3163.

De Marco, A., Rosso, M., and Narbaev, T. (2016). Nonlinear cost estimates at completion adjusted with risk contingency. Journal of Modern Project Management, 4(2), 24–33.

Elmousalami, H.H. (2020). Comparison of Artificial Intelligence techniques for project conceptual cost prediction: A case study and comparative analysis. IEEE Transactions on Engineering Management, 68(1), 183–196.

Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., and Schmidhuber, J. (2016). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2222–2232.

Hashemi, S.T., Ebadati, O.M., and Kaur, H. (2020). Cost estimation and prediction in construction projects: a systematic review on machine learning techniques. SN Applied Sciences, 2, 1703.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

Humphreys, G.C. (2018). Project management using Earned value. 4th edition. Humphreys & Associates, Inc.

IPMA. (2020). Report on Artificial Intelligence impact in Project Management. International Project Management Association. Amsterdam, The Netherlands.

Kim, B.C. and Reinschmidt, K.F. (2011). Combination of project cost forecasts in earned value management. Journal of Construction Engineering and Management, 137(11), 958–966.

Lipke, W., Zwikael, O., Henderson, K., and Anbari, F. (2009). Prediction of project outcome: the application of statistical methods to earned value management and earned schedule performance indexes. International Journal of Project Management, 27(4), 400–407.

Mahmoudi, A., Bagherpour, M., and Javed, S.A. (2021). Grey earned value management: Theory and applications. IEEE Transactions on Engineering Management, 68(6), 1703–1721.

Munir, M. (2019). How Artificial Intelligence can help project managers. Global Journal of Management And Business Research, 19(4), 1–8.

Narbaev, T. and De Marco, A. (2014a). An Earned Schedule-based regression model to improve cost estimate at completion. International Journal of Project Management, 32(6), 1007–1018.

Narbaev, T. and De Marco, A. (2014b). Combination of growth model and earned schedule to forecast project cost at completion. Journal of Construction Engineering and Management, 140(1), 04013038.

Natarajan, A. (2022). Reference class forecasting and Machine Learning for improved offshore oil and gas megaproject planning: Methods and application. Project Management Journal, 53(OnlineFirst), 1–29.

Ong, S. and Uddin, S. (2020). Data science and Artificial Intelligence in project management: The past, present and future. Journal of Modern Project Management, 7(4), 123–456.

ORSRG (2022). Real data. Operations Research & Scheduling Research Group. Ghent University. Available at https://www.projectmanagement.ugent.be/research/data/realdata

Ottaviani, F.M. and De Marco, A. (2022). Multiple linear regression model for improved project cost forecasting. Procedia Computer Science, 196, 808–815.

Panda, S.K., Mishra, V., Balamurali, R., and Elngar, A.A. (2021). Artificial Intelligence and Machine Learning in business management: Concepts, challenges, and case studies. 1st edition. CRC Press, Taylor & Francis Group. Florida, US.

Pellerin, R. and Perrier, N. (2019). A review of methods, techniques and tools for project planning and control. International Journal of Production Research, 57(7), 2160–2178.

Pewdum, W., Rujirayanyong, T., and Sooksatra, V. (2009). Forecasting final budget and duration of highway construction projects. Engineering, Construction, and Architectural Management, 16(6), 544–557.

PMI. (2019). The standard for Earned Value Management. 2nd edition. Project Management Institute (PMI). Newtown Square, PA.

Rai, R., Tiwari, M.K., Ivanov, D., and Dolgui, A. (2021). Machine Learning in manufacturing and industry 4.0 applications. International Journal of Production Research, 59(16), 4773–4778.

Ulusoy, G. and Hazır, Ö. (2021). Recent developments and some promising research areas. In Ulusoy, G. and Hazır, Ö. An introduction to project modeling and planning, 457–469. Springer Texts in Business and Economics. Springer, Cham.

Warburton, R.D.H. and Cioffi, D.F. (2016). Estimating a project’s earned and final duration. International Journal of Project Management, 34(8), 1493–1504.

Wauters, M. and Vanhoucke, M. (2016). A comparative study of Artificial Intelligence methods for project duration forecasting. Expert Systems with Applications, 46, 249–261.

Willems, L.L. and Vanhoucke, M. (2015). Classification of articles and journals on project control and Earned Value Management. International Journal of Project Management, 33(7), 1610–1634.
