A unique support vector regression for improved
modelling and forecasting of short-term gasoline
consumption in railway systems
Ali Azadeh*, Azam Boskabadi and
Shima Pashapour
School of Industrial Engineering,
Center of Excellence for Intelligent Based Experimental Mechanic,
College of Engineering,
University of Tehran,
P.O. Box 11155-4563, Tehran, Iran
Email: aazadeh@ut.ac.ir
Email: Azam.boskabadi@gmail.com
Email: shima.pashapour@ut.ac.ir
*Corresponding author
Abstract: This study presents a support vector regression algorithm and time series framework to estimate and predict weekly gasoline consumption in the railway transportation industry. The recursive finite Newton (RFN) algorithm is used to train the support vector machines. Furthermore, the study considers the effect of the number of holidays per week, the amount of transported freight and the number of transported passengers on the gasoline consumption prediction. Transported passengers per kilometre and transported tons per kilometre are the most important factors in the railway industry; for this reason, this study assesses the effect of these factors on weekly gasoline consumption. Weekly gasoline consumption in the railway transportation industry of Iran from August 2009 to December 2011 is considered. It is shown that SVR achieves better results than other intelligent tools such as the artificial neural network (ANN).
Keywords: support vector regression; SVR; gasoline consumption;
forecasting; railway system; artificial neural network; ANN.
Reference to this paper should be made as follows: Azadeh, A., Boskabadi, A.
and Pashapour, S. (2015) ‘A unique support vector regression for
improved modelling and forecasting of short-term gasoline consumption in
railway systems’, Int. J. Services and Operations Management, Vol. 21, No. 2,
pp.217–237.
Biographical notes: Ali Azadeh is an Eminent University Professor, Founder of the Department of Industrial Engineering and Co-founder of the Research Institute of Energy Management and Planning at the University of Tehran. He
obtained his PhD in Industrial and Systems Engineering from the University of
Southern California. He received the 1992 Phi Beta Kappa Alumni Award
for Excellence in Research and Innovation of Doctoral Dissertation in USA. He
is the recipient of six awards at the University of Tehran. He is also the
recipient of National Eminent Researcher Award in Iran. He has published
more than 650 papers in reputable academic journals and conference
proceedings.
Azam Boskabadi received her MSc from the School of Industrial Engineering
at the University of Tehran. Her current research includes logistics planning,
complex adaptive systems, agent-based modelling and simulation.
Shima Pashapour is a PhD student at the School of Industrial Engineering at the
University of Tehran. Her current research includes operations research,
complex adaptive systems, agent-based modelling and simulation.
1 Motivation and significance
This study presents an intelligent algorithm to model time series data with respect to gasoline consumption in the railway transportation industry in Iran. This is the first study that integrates conventional time series and support vector regression (SVR) for forecasting and modelling gasoline consumption in the railway industry in Iran. The superiority of the proposed algorithm is shown by comparing its results with those of other intelligent tools such as the artificial neural network (ANN).
2 Introduction
Transportation is a key indicator of a country's industrial development. Gasoline, as one of the most important energy resources, has attracted more attention than ever because of its ever-growing role in the world economy and in transportation. As societies develop and transportation activities grow, the optimal use of fuel has become an increasingly important factor in improving corporations and their services. For this reason, a major topic in this area is the estimation of gasoline consumption, which reveals how consumption will grow in the forthcoming
years. Su (2011) considered the effect of population density, freeway road density, and
congestion on household gasoline consumption by using semiparametric and parametric
approaches. Results showed that areas with higher freeway densities, higher levels of
congestion, or lower population densities consume more gasoline. Tasdemir et al. (2011) used ANN and fuzzy expert system (FES) models of a gasoline engine to predict engine power, torque, specific fuel consumption and hydrocarbon emissions. Their results show that the developed ANN and FES models can be used reliably in the automotive industry and in engineering instead of experimental work. Coyle et al. (2012) estimated supply and demand functions for gasoline using information from excise taxes and showed that raising fuel taxes would generate significant amounts of revenue with relatively low efficiency costs. Crôtte et al. (2010) estimated cross elasticities of the demand for gasoline per vehicle using both a time series cointegration model and a panel GMM model for Mexican states; their results show that more fuel-efficient technologies have a negligible effect on gasoline consumption and that vehicle stock size has a higher impact on gasoline consumption. Wadud et al. (2010) modelled US gasoline demand using semiparametric techniques and found that households located in urban areas reduce consumption more than those in rural areas in response to an increase in price.
Pock (2010) used various dynamic panel estimators to estimate gasoline demand. Results
show that standard pooled estimators are more reliable than common IV/GMM
estimators.
This study presents an intelligent integrated algorithm to model time series data with respect to gasoline consumption in the railway transportation industry in Iran. The main contribution of this study is that it is the first to integrate conventional time series and SVR for forecasting and modelling gasoline consumption in the railway industry in Iran. Our purpose is to forecast the amount of gasoline that will be used, in support of CEOs' long-term decision-making strategies in the Iranian railway transportation industry. We aim to answer the following questions:
How can heavy industries such as the railway industry predict their main energy resources, and which factors are most important in this procedure?
If these industries want to predict their future energy usage, which approach best recognises patterns and analyses the data?
How close are the results of the selected approach to the real values?
The main factors considered in this paper for the prediction of gasoline consumption are the number of holidays per week, the amount of transported freight and the number of transported passengers. Transported passengers per kilometre and transported tons per kilometre are the most important factors in the railway industry. This study considers weekly data from August 2009 to December 2011 to show the applicability and superiority of the proposed algorithm.
3 Literature review
Many papers in the literature have addressed the railway industry. Azadeh et al.
(2012) presented an integrated fuzzy modelling and simulation approach for modelling
and scheduling of cargo and passenger trains with time limitations. Rayeni and Saljooghi
(2014) developed a new secondary goal based on symmetric weight selection of
cross-efficiency for ranking and measuring efficiency of railway in Iran. Khare and
Handa (2011) conducted an exploratory research to study customer experience of the
online reservation system of the Indian railways. In this paper, SVR is used for improved
modelling and forecasting of short-term gasoline consumption in railway systems.
Some papers have considered the prediction of gasoline consumption. Togun and Baysec (2010b) presented an ANN model to predict the torque and specific fuel consumption of a gasoline engine. They developed a model based on the back propagation algorithm and showed that their model had high efficiency and accuracy. Togun and Baysec (2010a) presented a model based on genetic programming (GP) for generating formulations for gasoline engine torque and brake specific fuel consumption. The proposed model showed very good agreement with the experimental results; its results were also compared with a neural network model, and strong agreement was observed between the two predictions. Azadeh et al. (2010) presented an adaptive intelligent algorithm for
forecasting gasoline demand based on ANN, regression, and design of experiment
(DOE). The results show that ANN provides far less error than regression. Park and Zhao
(2010) estimated US gasoline demand using a time-varying regression.
A relatively new technique in time series forecasting is the support vector machine (SVM) (Mukherjee et al., 1997; Muller et al., 1997, 1999), and because time series prediction is
like an auto-regression in time, a regression method can be applied for this task (Thissen
et al., 2003). SVMs represent a recent AI model developed by Vapnik (1995) that has already been used in a vast range of applications. SVMs, as a set of related supervised learning methods, are a learning tool for solving classification and regression problems; they analyse data and recognise patterns. An SVM constructs a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression or other tasks. An SVM works by mapping input vectors into a higher-dimensional feature space, where the optimal hyperplane is identified with the help of a kernel function K(x); this inner product in the feature space makes the training data linearly separable. The selected kernel function should satisfy Mercer's condition, which determines whether a prospective kernel is actually an inner product in some space and guarantees that a unique global optimum solution is obtained (Burges, 1998). Different kinds of kernel functions are used today, such as the linear kernel, the polynomial kernel, radial basis functions (RBF) [including the Gaussian RBF, the exponential RBF and the MLP kernel] and the sigmoid kernel. Most users recommend the RBF kernel as the best choice because of its ability to analyse higher-dimensional data, its single hyperparameter to search, and its fewer numerical difficulties (Hsu et al., 2003). Although many researchers, such as Burges (1998), Hao (2003) and Wang et al. (2005), have stated that SVMs perform better than traditional learning tools, SVM capability and prediction accuracy are determined by the optimal penalty and kernel parameters.
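As a brief illustration of the RBF kernel and of the positive semi-definiteness that Mercer's condition guarantees, the following minimal Python sketch (not part of the original study; the data, dimensionality and σ value are arbitrary placeholders) builds an RBF Gram matrix and checks that its eigenvalues are non-negative.

```python
import numpy as np

def rbf_gram_matrix(X, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

X = np.random.default_rng(0).normal(size=(20, 3))  # 20 hypothetical samples, 3 inputs
K = rbf_gram_matrix(X, sigma=1.5)
print("smallest eigenvalue:", np.linalg.eigvalsh(K).min())  # non-negative up to round-off
```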
A novel version of the SVM was proposed by Vapnik (1995), Burges (1998) and Smola and Schölkopf (2004). It uses a regression method for modelling and prediction. This method is called SVR.
The model produced by SVM classification depends only on a subset of the training data, because the cost function for building the model does not consider training points that lie beyond the margins. Similarly, the model produced by SVR depends only on a subset of the training data, because its cost function ignores any training data close to the model prediction (within a threshold ε).
As mentioned, SVMs and kernel methods (KMs) have become one of the most popular approaches for learning from examples in science and engineering. They include methods for generalisation improvement, model selection and hyper-parameter tuning, as well as the incorporation of a priori knowledge. Sanchez (2003) presented the basic methodology of SVMs and KMs and their main uses, namely classification, regression, clustering, density estimation and novelty detection, and finally discussed the areas of application of SVMs and KMs.
Tay and Cao (2002) presented a modified version of SVMs for modelling non-stationary financial time series, called the C-ascending SVM. They obtained this method by a simple modification of the regularised risk function in support vector machines: in the standard SVM, “the empirical risk function has equal weight C to all the ε-insensitive errors between the predicted and actual values”. The regularisation constant C plays the trade-off role between the empirical risk and the regularised term, so they applied a variable value of C. They concluded that the C-ascending support vector machine gives better forecasting results for the actually ordered sample data than the standard SVM.
Cao (2003) proposed using SVM experts for time series forecasting. This kind of support vector machine has a two-stage neural network architecture. In the first stage, the whole input space is divided into several disjoint regions by a
self-organising feature map (SOM) used as a clustering algorithm. In the second stage, an SVM is fitted to each partitioned region by finding the most appropriate kernel function and the optimal free parameters of the SVM.
Thissen et al. (2003) predicted time series in the field of chemometrics using SVMs, Elman recurrent neural networks, and autoregressive moving average (ARMA) models. The authors applied the ε-SVR method for forecasting future data. They considered three datasets: an ARMA time series, the chaotic and non-linear Mackey-Glass data, and a real-world dataset containing relative differential pressure changes of a filter. For the ARMA dataset, the ARMA model performed better than the other methods. For the real-world dataset, the SVM performed slightly worse than the best Elman network, because the training phase of the SVM was not feasible with this relatively large dataset; however, by using a training set with much lower resolution (i.e., 10%), the method was able to forecast the filter series well. They concluded that the largest benefit of SVMs is the fact that a global solution exists and is found, in contrast to neural networks, which have to be trained with randomly chosen initial weight settings.
Kim (2003) predicted the stock price index using SVMs. He also investigated the effect of the upper bound C and the kernel parameter δ² in the SVM. The experimental results make it clear that the prediction performance of SVMs is sensitive to the values of these parameters, so finding their optimal values is important. In addition, he examined the feasibility of applying SVMs to financial forecasting by comparing them with back propagation neural networks and case-based reasoning, and concluded that the SVM provides a promising alternative for stock market prediction.
Bo et al. (2007) have proposed a recursive finite Newton algorithm for training
non-linear support vector regression (SVR-RFN). They have used the insensitive Huber
loss function (IHLF). They concluded that their method outperforms the LIBSVM 2.82.
Lu et al. (2009) presented a two-stage forecasting model for financial time series. First, they used independent component (IC) analysis to generate ICs from the forecasting variables. Then, the ICs containing noise were identified and removed, and the remaining ICs were used as the input variables of the SVR forecasting model. They examined two datasets, the Nikkei 225 opening cash index and the TAIEX closing cash index, and compared the method with traditional SVR and random walk models, using prediction error and prediction accuracy as criteria. They concluded that their method outperforms the traditional SVR and random walk models.
To improve the performance of the standard SVR model, Yang et al. (2009) presented the localised support vector regression (LSVR) model. Their model offers a systematic and automatic scheme to adapt the margin locally and flexibly; hence, it can tolerate noise adaptively. They showed that this model incorporates the standard SVR as a special case, and that, through kernelisation, it can generate non-linear approximating functions and can therefore be applied to general regression problems.
Hong et al. (2010) have forecasted Taiwanese 3G mobile phone demand by SVR with
hybrid evolutionary algorithms. They employed genetic algorithm-simulated annealing
hybrid algorithm (GA-SA) to select the suitable parameter combination for a SVR model.
Finally, they compared the results with two other models, namely the autoregressive
integrated moving average (ARIMA) model and the general regression neural networks
(GRNN) model.
Hong (2010) presented an application of a novel algorithm, namely chaotic ant swarm optimisation (CAS), in an SVR-based electric load forecasting model to improve the forecasting performance by searching for a suitable parameter combination. Finally, he compared the results of the SVR model with CAS (SVRCAS) against other alternative methods, namely SVRCPSO (SVR with chaotic PSO), SVRCGA (SVR with chaotic GA), a regression model, and an ANN model.
For the prediction of chaotic time series with outliers, Fu et al. (2010) presented annealing robust fuzzy neural networks (ARFNNs) with SVR. They used a combination model that merges SVR, RBF networks and a simplified fuzzy inference system. Finally, the superiority of their method was shown with different SVRs for the training and prediction of chaotic time series with outliers.
Lu and Wang (2010) combined IC analysis and growing hierarchical self-organising
maps with SVR to forecast product demand. They used IC analysis method to detect and
remove the noise of data and further improve the performance of predicting model. Then,
they used growing hierarchical self-organising maps to classify data. After that, SVR was
applied to construct the product demand forecasting model.
Kavaklioglu (2011) used the SVR methodology to model and predict Turkey's electricity consumption. A grid search over the model parameters was performed to find the best ε-SVR model for each variable based on the root mean square error.
Hong et al. (2011a) have used SVR to model and forecast the tourism demands with
chaotic genetic algorithm (CGA), namely SVRCGA. The proposed CGA based on the
chaos optimisation algorithm is used to overcome premature local optimum in
determining three parameters of a SVR model.
In another research, Hong et al. (2011b) used the combination of SVR with
continuous ant colony optimisation algorithms (SVRCACO) to model and forecast inter-
urban traffic flow. They compared the results with the seasonal autoregressive integrated
moving average (SARIMA) time series model.
Hong (2011) proposed a traffic flow forecasting model (SSVRCSA) that combines the seasonal SVR model with a chaotic simulated annealing algorithm, and compared the results with the SARIMA time series model for predicting inter-urban traffic flow.
Hong et al. (2011c) also used the hybrid genetic algorithm-simulated annealing
algorithm to determine the suitable parameter combination for a SVR in another research
and compared these results with SARIMA, back propagation neural network (BPNN),
Holt-Winters (HW) and seasonal Holt-Winters (SHW) models. They presented a SVR
traffic flow forecasting model.
Meiying et al. (2011) presented a Bayesian evidence framework to infer the LS-SVR
model parameters. In fact, because the traditional least squares support vector regression
(LSSVR) model (using cross validation to determine the regularisation parameter and
kernel parameter) is time-consuming, they used a Bayesian evidence framework.
Lin et al. (2011) forecasted concentrations of air pollutants by logarithm support
vector regression with immune algorithms (SVRLIA) model which takes advantage of
the structural risk minimisation of SVR models. In this investigation, three pollutants
were collected and examined to determine the feasibility of the developed SVRLIA
model.
Nagi et al. (2011) used a computational intelligence scheme based on the SOM and
SVM for the prediction of daily peak load. They used SOM as a clustering tool to cluster
the training data into two subsets, using the Kohonen rule. Finally, they used the SVR to
fit the testing data based on the clustered subsets for predicting the daily peak load.
Rasouli et al. (2012) forecasted daily stream flow by machine learning methods –
Bayesian neural network (BNN), SVR and Gaussian process (GP) – with weather and
climate inputs. Finally, they compared the results with multiple linear regressions (MLR).
Che et al. (2012) used an adaptive fuzzy combination model based on self-organising
map and SVR for forecasting electric load.
As mentioned, for the estimation of gasoline consumption in the Iranian railway industry, support vector regression is used and compared with other intelligent tools such as ANN, against which it achieves the best results. ANN is a powerful intelligent tool that has been used in the literature for different purposes (Satapathy et al., 2012). The algorithm uses SVR and a time series model to predict Iran's gasoline consumption. The SVR approach is discussed in the next section (Section 4). The proposed model formulation is presented in Section 5. The results of the model are discussed on a case study in Section 6. The conclusions of the paper are presented in Section 7.
4 Support vector regression
Many approaches for obtaining systems with intelligent behaviour are based on
components that learn automatically from previous experience. The development of these
learning techniques is the objective of the area of research known as machine learning.
Among the various existing algorithms, SVM for classification and SVR, after training
from a series of examples, can successfully predict the output at an unseen location
performing an operation known as induction. The machine learning process can be
divided into four categories:
Supervised learning: this creates a function from training data consisting of pairs of input objects (typically vectors) and desired outputs. The output of a regression function is a continuous value, whereas in classification it predicts a class label for the input object. The supervised learner is first taught with a finite number of training examples, and using these examples it predicts the value of the function for any valid input data. Popular and well-known supervised learning algorithms include neural networks, the nearest neighbour algorithm, decision tree learning, and support vector machines.
Unsupervised learning: as the name suggests, in this category there is no supervisor, so there is only input data without output data (answers). Unsupervised learning problems are usually not mathematically well defined, and their goal differs from case to case. For instance, in data clustering the main purpose is to group similar data; the affinity measure between data samples must be subjectively predetermined, and there is no objective criterion for evaluating its validity quantitatively.
Semi-supervised learning: it uses both unlabeled and labelled data for training,
typically a small amount of labelled data with a large amount of unlabeled data.
Reinforcement learning: in this category, an agent explores an environment,
perceives its current state and takes actions. The environment, in return, provides a
reward (which can be positive or negative). These algorithms attempt to find a policy
for maximising cumulative reward for the agent through the problem.
As mentioned before, SVMs belong to supervised learning and create a ‘decision-maker’ system which tries to predict new values (Figure 1).
Figure 1 Different views of SVM
Source: Parrella (2007)
By convention, the group of examples used to build the SVM is called the training set, whereas the group containing the examples used in the prediction is called the test set. SVMs can also be applied to regression problems by introducing an alternative loss function, which must be modified to include a distance measure. Figure 2 illustrates four possible loss functions.
The loss function in Figure 2(a) corresponds to the conventional least squares error criterion. The loss function in Figure 2(b) is a Laplacian loss function that is less sensitive to outliers than the quadratic loss function. Huber proposed the loss function in Figure 2(c) as a robust loss function that has optimal properties when the underlying distribution of the data is unknown. These three loss functions produce no sparseness in the support vectors. To address this issue, Vapnik proposed the loss function in Figure 2(d) as an approximation to Huber's loss function that enables a sparse set of support vectors to be obtained. The ε-insensitive loss function is attractive because the SV solution can be sparse, unlike the quadratic and Huber cost functions, for which all the data points become support vectors.
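As a small numerical illustration of the ε-insensitive loss (an added sketch, not part of the original paper; the residuals and ε value are arbitrary), the Python snippet below returns zero for deviations inside the ε-tube and only the excess beyond ε otherwise.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss from Figure 2(d): max(|y - y_hat| - epsilon, 0)."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(residual - epsilon, 0.0)

print(epsilon_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 2.0], epsilon=0.1))
# -> [0.  0.4 0.9]; the first residual (0.05) lies inside the tube and costs nothing
```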
The main purpose of SVR is to find a function f(x) that deviates by at most ε from all the training data. At the same time, this function should be as flat as possible to prevent over-fitting.
Figure 2 Loss functions, (a) quadratic (b) Laplace (c) Huber (d) ε-insensitive (see online version for colours)
5 Model formulation
As mentioned before, if we assume that y is a single output that is a function of the n input variables x, a training dataset of length N can be presented as follows:

T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, \quad \text{where } x_k \in \mathbb{R}^n \text{ and } y_k \in \mathbb{R},\; k = 1, 2, \ldots, N

Here x_k is the n-dimensional vector that gives the values of each input at time step k, and y_k is the scalar output variable at time step k. The issue now is to find a model that describes this training dataset. In the standard SVR formulation a linear model is considered, as in equation (1):

\hat{y}(x) = \langle w, x \rangle + b \qquad (1)

where the estimated output of the model is denoted by ŷ, w is the weight vector, b is the bias term, and ⟨·,·⟩ denotes the vector inner product. The vector w is an element of the feature space of the problem. Because not all real-world problems can be solved and modelled with a linear formula, non-linear modelling must be allowed. In the SVR
methodology, kernel functions Φ(x) that map the input space to the feature space are used as the non-linear component, as presented in equation (2):

\hat{y}(x) = \langle w, \Phi(x) \rangle + b \qquad (2)

Therefore, selecting the right non-linear map Φ(x) and fitting the best w are the significant issues. As mentioned before, we use the RBF kernel because of its ability to analyse higher-dimensional data, its single hyperparameter to search, and its fewer numerical difficulties. To make the model as flat as possible, we need to minimise the norm ||w|| subject to the constraints for every data point i = 1, 2, …, N, as presented in model (3):

\begin{aligned}
\text{Minimise} \quad & \tfrac{1}{2}\|w\|^2 \\
\text{Subject to} \quad & y_i - \langle w, \Phi(x_i) \rangle - b \le \varepsilon \\
& \langle w, \Phi(x_i) \rangle + b - y_i \le \varepsilon
\end{aligned} \qquad (3)

This model is correct only if we assume that the problem is feasible. If we want to allow some errors, we should introduce slack variables (ξ_i, ξ_i*) that enlarge the tolerance of the machine, as presented in model (4):

\begin{aligned}
\text{Minimise} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \\
\text{Subject to} \quad & y_i - \langle w, \Phi(x_i) \rangle - b \le \varepsilon + \xi_i \\
& \langle w, \Phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^* \\
& \xi_i, \xi_i^* \ge 0
\end{aligned} \qquad (4)
The constant C determines the trade-off between the flatness of the function and the amount by which deviations larger than the tolerance are penalised. ε is the maximum error permitted for an element; ε and C are the parameters that define the limit of maximal tolerance. Finding the right values is very complicated, and there is a vast literature on how to choose the best ones. Generally, the most widely used technique is to find them by trial and error.
In fact, in the case of regression, a margin of tolerance ε is set around the approximation produced by the SVM. The main idea, however, is always the same: to minimise the error by individualising the hyperplane that maximises the margin, keeping in mind that part of the error is tolerated. Usually this optimisation problem is solved in its dual form; the constraints are therefore carried into the cost function by means of Lagrange multipliers, and the Lagrangian L is created as in model (5):
\begin{aligned}
L ={}& \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \\
& - \sum_{i=1}^{N} \beta_i \left( \varepsilon + \xi_i - y_i + \langle w, \Phi(x_i) \rangle + b \right) \\
& - \sum_{i=1}^{N} \beta_i^* \left( \varepsilon + \xi_i^* + y_i - \langle w, \Phi(x_i) \rangle - b \right) \\
& - \sum_{i=1}^{N} \left( \eta_i \xi_i + \eta_i^* \xi_i^* \right)
\end{aligned} \qquad (5)

where β_i, β_i*, η_i and η_i* are the Lagrange multipliers.
Figure 3 Graphical details of ε-insensitive loss function
Figure 3 shows how the error of the SVR is calculated. Up to the threshold ε the error is considered to be zero; beyond it, the error is calculated as the deviation minus ε. In order to find the minimum, one needs to take all the partial derivatives of the Lagrangian with respect to ξ_i, ξ_i*, w and b and set them to zero. These expressions, together with the complementary Karush-Kuhn-Tucker conditions, lead to the final quadratic programming (QP) model (6):
\begin{aligned}
\text{Minimise} \quad & \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\beta_i - \beta_i^*)(\beta_j - \beta_j^*) K_{ij} + \varepsilon \sum_{i=1}^{N} (\beta_i + \beta_i^*) - \sum_{i=1}^{N} y_i (\beta_i - \beta_i^*) \\
\text{Subject to} \quad & \sum_{i=1}^{N} (\beta_i - \beta_i^*) = 0 \\
& 0 \le \beta_i, \beta_i^* \le C
\end{aligned} \qquad (6)

where K_{ij} = K(x_i, x_j) = \Phi(x_i)^{T} \Phi(x_j).
K_ij is the kernel function based on the original non-linear maps, so we do not need to calculate w explicitly. By solving the above QP problem, the optimal β_i, β_i* are obtained for i = 1, 2, …, N. Therefore, we obtain a pair (β_i, β_i*) for every training data point. Some of these pairs may vanish by taking the value zero; the training data points for which the pair (β_i, β_i*) does not vanish are the support vectors. The bias b is computed from the condition ε + ŷ(x_i) − y_i = 0, which is satisfied for the support vectors. Finally, we can use the following formulation for the model in the dual space:

\hat{y}(x) = \sum_{i=1}^{N} (\beta_i - \beta_i^*) K(x_i, x) + b \qquad (7)

The RBF kernel is as follows:

K_{ij} = K(x_i, x_j) = \exp\!\left( -\frac{(x_i - x_j)^{T}(x_i - x_j)}{2\sigma^2} \right) \qquad (8)
Now we have the training data and their number (N), the regularisation parameter (C), the maximum allowable error in the output (ε) and the kernel parameter (σ), which determines the spread of the function.
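To make the role of these quantities concrete, the sketch below (our own illustration, not the authors' implementation) fits scikit-learn's SVR, a standard ε-SVR solver: its gamma argument corresponds to 1/(2σ²) in equation (8), its dual_coef_ attribute holds the differences β_i − β_i* of equation (7), intercept_ is the bias b, and the training data are synthetic placeholders.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(90, 3))          # 90 scaled training samples, 3 input variables
y = X @ np.array([0.5, -0.2, 0.8]) + 0.05 * rng.normal(size=90)   # synthetic target

sigma = 0.5
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=1.0 / (2.0 * sigma ** 2))
model.fit(X, y)

# dual_coef_ contains (beta_i - beta_i*) for the support vectors; intercept_ is the bias b
print("number of support vectors:", len(model.support_))
print("bias b:", model.intercept_[0])
print("first prediction:", model.predict(X[:1])[0])
```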
Using the mean absolute percentage error (MAPE) on the test data, we are able to evaluate the performance of the model; it is computed as follows:
MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{y_i} \qquad (9)
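Equation (9) translates directly into code; the helper below (an illustrative addition, assuming strictly positive targets such as weekly gasoline consumption) mirrors it.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error as in equation (9); y_true must be non-zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred) / y_true)

print(mape([100.0, 200.0], [110.0, 180.0]))   # 0.10, i.e., a 10% mean absolute percentage error
```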
6 Experiment
SVMs, as a set of related supervised learning methods, are a powerful learning tool for solving classification and regression problems; they analyse data and recognise patterns. As mentioned before, an SVR is designed for gasoline consumption in the railway transportation industry in Iran. We then designed an ANN model in order to compare the results and assess the accuracy of the SVR model. The collected gasoline dataset consists of 116 samples with three attributes: transported freight per kilometre, transported passengers per kilometre and the number of holidays per week are the inputs, and weekly gasoline consumption in the Iranian railway is the output. The task is to predict the consumption of gasoline in the railway transportation industry. The gasoline dataset is split into 90 training samples and 26 test samples, and the whole dataset is scaled into the interval [–1, 1]. Using SVR-RFN (Bo et al., 2007), many different values of (σ, C) were tried and their optimal values were selected. The SVR and ANN models were then run using MATLAB R2010a on a Core-i5 at 2.40 GHz. Some computational results of the output of the models are shown in Figure 4, Figure 5 and Table 1.
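The experimental set-up just described can be sketched as follows. This is a hedged illustration, not the authors' code: scikit-learn's SVR stands in for the SVR-RFN solver, random placeholders replace the railway dataset, and the candidate (σ, C) values are arbitrary; only the scaling to [–1, 1], the 90/26 split and the grid-search pattern are taken from the text.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(size=(116, 3))     # holidays/week, freight-km, passenger-km (placeholder values)
y = rng.uniform(size=116)          # weekly gasoline consumption (placeholder values)

# Scale all variables into [-1, 1], as described above
x_scaler = MinMaxScaler(feature_range=(-1, 1))
y_scaler = MinMaxScaler(feature_range=(-1, 1))
X_s = x_scaler.fit_transform(X)
y_s = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

# 90 training samples, 26 test samples
X_train, X_test = X_s[:90], X_s[90:]
y_train, y_test = y_s[:90], y_s[90:]

# Simple grid search over (sigma, C); the error is the mean absolute error on the scaled test set
best = None
for sigma in (0.1, 0.5, 1.0, 2.0):
    for C in (1.0, 10.0, 100.0):
        model = SVR(kernel="rbf", C=C, epsilon=0.01, gamma=1.0 / (2.0 * sigma ** 2))
        model.fit(X_train, y_train)
        err = np.mean(np.abs(y_test - model.predict(X_test)))
        if best is None or err < best[0]:
            best = (err, sigma, C)
print("best (test error, sigma, C):", best)
```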
Table 1 presents the inputs, i.e., the number of holidays per week, the transported freight per kilometre (million tons per kilometre), the transported passengers per kilometre (million passengers per kilometre) and the actual gasoline consumption, together with the outputs of the two models, i.e., the gasoline consumption predicted by SVR and by ANN and their MAPE values.
Figure 4 shows the actual gasoline consumption (blue line), the gasoline consumption predicted by SVR (red line) and the residuals (green line). As mentioned before, to benchmark the results of the SVR, an ANN has been used to estimate and predict weekly gasoline consumption in the railway transportation industry. A feed-forward back propagation network was used for the estimation because of its great ability to map inputs to outputs. Figure 5 shows the results of the ANN for this problem.
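A comparable ANN baseline can be sketched as below. This is an assumed stand-in, not the authors' network: scikit-learn's MLPRegressor replaces their MATLAB feed-forward back propagation model, the hidden-layer size is arbitrary, and the data are placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X_train = rng.uniform(-1, 1, size=(90, 3))      # placeholder scaled inputs (3 attributes)
y_train = rng.uniform(-1, 1, size=90)           # placeholder scaled weekly consumption

# Feed-forward network; gradients are obtained by error back propagation
ann = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                   solver="lbfgs", max_iter=2000, random_state=0)
ann.fit(X_train, y_train)
print("training predictions (first 3):", ann.predict(X_train[:3]))
```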
Figure 4 and Figure 5 show that the prediction of gasoline consumption by SVR is more realistic than the prediction by ANN. Table 1 also shows that the MAPE obtained with SVR is 12.50%, whereas the MAPE obtained with ANN is 28.39%. Therefore, SVR gives better predictions for the gasoline dataset. In fact, by using SVR we can capture the trend of the data and make precise forecasts; it can be a very powerful tool for forecasting problems and may be used to estimate energy consumption in other transportation systems all over the world.
Table 1 Inputs and outputs of SVR and ANN models
Number of holidays per week | Transported freight per kilometre | Transported passengers per kilometre | Gasoline consumption | Predicted gasoline consumption by ANN | Predicted gasoline consumption by SVR | MAPE by ANN | MAPE by SVR
0 376 271 328,400 292,078.96 304,525.32 11.06% 7.27%
0 416 267 194,000 234,604.2 169,226.2 20.93% 12.77%
0 448 279 279,250 234,234.9 321,640.15 16.12% 15.18%
0 419 300 198,900 155,778.48 186,409.08 21.68% 6.28%
1 433 350 202,100 171,966.89 241,165.93 14.91% 19.33%
0 427 331 546,515 562,527.8895 496,782.135 2.93% 9.10%
0 413 307 556,142 581,613.3036 576,886.0966 4.58% 3.73%
1 390 320 578,100 547,980.99 547,345.08 5.21% 5.32%
0 425 307 805,648 821,760.96 822,808.3024 2.00% 2.13%
1 419 308 2,696,993 3,144,349.129 2,506,854.994 16.59% 7.05%
1 401 312 3,054,292 3,289,122.222 2,838,658.985 7.69% 7.06%
0 433 308 3,352,113 3,296,133.043 4,438,264.789 1.67% 32.40%
0 411 306 3,717,584 5,015,568.606 4,293,615.35 34.91% 15.49%
0 387 315 4,685,795 5,056,807.503 2,890,072.882 7.92% 38.32%
1 397 333 4,996,361 4,736,550.228 3,711,055.39 5.20% 25.72%
1 368 330 4,732,666 3,631,024.508 4,914,760.684 23.28% 3.85%
0 379 305 4,982,343 5,173,157.322 5,802,922.053 3.83% 16.47%
0 395 291 5,728,337 3,803,883.842 6,452,249.912 33.60% 12.64%
2 414 275 6,209,073 5,855,155.839 6,962,331.407 5.70% 12.13%
0 400 263 6,254,518 9,623,201.464 7,011,265.358 53.86% 12.10%
0 444 262 6,001,143 3,114,193.841 4,507,631.234 48.11% 24.89%
0 401 270 5,951,384 8,583,522.307 5,762,385.827 44.23% 3.18%
0 396 297 5,834,499 8,242,885.569 6,840,850.804 41.28% 17.25%
0 385 302 6,200,574 8,580,310.508 6,279,933.685 38.38% 1.28%
1 403 318 6,157,591 3,704,434.579 6,436,197.135 39.84% 4.52%
2 401 330 6,097,827 4,862,456.262 6,406,228.495 20.26% 5.06%
0 405 307 6,142,619 6,726,167.805 5,516,397.844 9.50% 10.19%
1 414 319 6,299,457 9,242,062.508 6,837,498.909 46.71% 8.54%
0 381 310 6,142,215 6,480,036.825 5,879,436.016 5.50% 4.28%
0 438 315 6,181,518 8,629,623.818 6,274,417.29 39.60% 1.50%
5 420 325 6,511,955 8,977,617.544 7,094,519.527 37.86% 8.95%
1 419 366 6,346,846 5,940,647.856 5,390,154.325 6.40% 15.07%
0 438 314 6,310,899 3,692,150.926 6,606,195.918 41.50% 4.68%
0 445 288 6,404,782 3,652,910.701 7,882,009.461 42.97% 23.06%
0 468 302 6,298,429 6,720,423.743 6,075,270.692 6.70% 3.54%
0 434 311 6,482,897 3,627,891.96 8,000,495.561 44.04% 23.41%
0 479 306 6,397,869 9,650,584.495 6,706,823.363 50.84% 4.83%
0 413 317 6,317,499 7,414,766.097 8,935,821.86 17.37% 41.45%
1 393 315 5,859,065 8,386,246.754 7,694,092.043 43.13% 31.32%
0 395 305 5,963,647 8,176,093.991 7,409,198.102 37.10% 24.24%
0 437 326 6,148,955 3,331,128.28 5,761,189.615 45.83% 6.31%
1 403 327 5,929,462 4,368,388.375 4,956,241.306 26.33% 16.41%
0 428 343 6,049,008 3,994,303.747 5,888,058.058 33.97% 2.66%
0 413 336 5,990,294 7,394,460.829 5,107,142.256 23.44% 14.74%
1 378 344 5,921,860 3,336,232.038 5,872,180.917 43.66% 0.84%
0 433 361 5,906,986 4,050,681.671 5,717,932.902 31.43% 3.20%
1 398 364 5,945,310 3,625,495.053 6,905,240.864 39.02% 16.15%
0 412 375 5,986,946 5,024,058.716 6,705,066.118 16.08% 11.99%
1 345 380 6,076,471 8,251,843.6 7,755,954.883 35.80% 27.64%
0 356 378 6,142,379 8,640,786.432 6,017,900.869 40.67% 2.03%
0 413 343 6,114,930 4,280,917.903 7,772,767.523 29.99% 27.11%
0 399 252 5,963,808 8,106,527.001 3,128,881.997 35.93% 47.54%
0 431 262 6,093,792 8,344,427.335 3,941,732.107 36.93% 35.32%
1 414 263 6,100,340 8,808,534.256 5,990,454.94 44.39% 1.80%
0 419 324 6,055,007 5,533,810.448 8,104,757.883 8.61% 33.85%
0 400 388 6,202,582 4,505,285.766 6,756,987.63 27.36% 8.94%
0 439 394 6,158,782 6,454,403.536 5,469,179.873 4.80% 11.20%
0 399 349 6,290,364 4,211,613.709 5,230,062.337 33.05% 16.86%
1 423 348 6,219,366 4,810,931.646 5,807,117.736 22.65% 6.63%
0 424 340 6,079,518 8,569,450.698 6,054,193.501 40.96% 0.42%
0 462 350 6,203,760 3,340,951.444 6,351,238.539 46.15% 2.38%
0 441 346 6,382,580 9,364,872.795 6,460,321.728 46.73% 1.22%
0 484 345 6,338,768 2,434,807.296 6,401,433.637 61.59% 0.99%
0 423 338 6,363,702 6,726,433.014 5,897,746.873 5.70% 7.32%
1 433 358 6,610,467 9,917,773.002 8,921,058.064 50.03% 34.95%
1 416 352 6,396,451 9,180,052.326 5,394,087.565 43.52% 15.67%
0 416 341 6,359,159 9,385,459.121 6,928,511.951 47.59% 8.95%
0 434 323 6,311,278 5,942,249.276 5,971,623.128 5.85% 5.38%
2 362 310 6,252,551 7,582,758.069 6,834,287.775 21.27% 9.30%
0 457 336 6,213,050 3,993,737.505 5,270,110.078 35.72% 15.18%
0 458 300 6,242,344 3,485,693.023 5,681,309.944 44.16% 8.99%
0 408 286 6,067,530 5,697,410.67 8,588,452.831 6.10% 41.55%
0 418 289 5,797,463 8,188,561.551 5,351,058.349 41.24% 7.70%
0 399 306 6,240,853 8,656,943.825 5,263,329.229 38.71% 15.66%
1 381 334 6,240,232 3,305,624.805 5,212,054.276 47.03% 16.48%
1 384 349 6,260,662 4,035,611.75 6,059,268.66 35.54% 3.22%
0 421 360 6,233,075 3,268,201.677 5,288,712.347 47.57% 15.15%
0 422 367 6,205,547 9,749,500.26 7,201,625.112 57.11% 16.05%
1 391 355 6,372,346 8,112,620.892 7,012,866.436 27.31% 10.05%
0 439 331 6,334,043 6,796,428.139 5,856,110.623 7.30% 7.55%
0 430 324 6,295,672 5,221,579.94 5,368,095.269 17.06% 14.73%
0 433 328 6,207,819 6,536,833.407 6,303,765.487 5.30% 1.55%
5 430 322 6,411,659 9,680,550.14 5,977,764.248 50.98% 6.77%
0 397 369 6,336,286 9,216,332.892 5,510,648.121 45.45% 13.03%
1 382 338 6,070,806 9,036,281.81 5,602,990.97 48.85% 7.71%
0 372 291 6,074,187 8,075,280.774 6,793,730.599 32.94% 11.85%
0 445 302 6,280,505 7,789,183.865 6,171,803.747 24.02% 1.73%
0 441 331 6,240,607 9,763,136.939 5,528,435.884 56.45% 11.41%
0 453 330 6,295,008 5,931,057.433 7,443,920.29 5.78% 18.25%
1 441 348 6,174,216 5,791,414.608 7,969,506.501 6.20% 29.08%
0 418 323 6,344,598 6,052,746.492 3,871,135.612 4.60% 38.99%
0 398 315 6,078,317 8,033,873.635 4,195,378.975 32.17% 30.98%
0 422 350 5,928,236 8,489,822.168 6,753,083.16 43.21% 13.91%
2 398 352 6,038,828 7,527,986.158 6,191,221.639 24.66% 2.52%
1 381 355 5,959,803 7,960,056.649 5,635,433.606 33.56% 5.44%
0 371 367 6,080,230 8,000,048.618 6,180,752.883 31.57% 1.65%
1 357 377 5,878,211 5,355,050.221 5,763,202.826 8.90% 1.96%
0 384 375 6,027,713 8,466,447.265 6,422,361.063 40.46% 6.55%
0 417 378 6,301,421 9,128,384.897 5,826,282.525 44.86% 7.54%
1 407 380 6,504,907 3,601,960.43 6,412,496.082 44.63% 1.42%
0 373 381 6,367,646 6,042,896.054 7,700,758.317 5.10% 20.94%
0 400 304 6,285,312 4,206,098.923 5,713,995.845 33.08% 9.09%
0 422 229 6,574,721 4,510,466.551 7,111,300.126 31.40% 8.16%
0 448 278 6,531,711 6,707,505.238 8,606,243.424 2.69% 31.76%
1 443 269 6,448,218 6,783,525.336 6,466,093.004 5.20% 0.28%
1 454 364 6,752,270 7,110,140.31 6,534,956.278 5.30% 3.22%
0 404 397 6,638,044 3,220,496.051 5,467,887.859 51.48% 17.63%
0 419 401 6,571,895 6,256,444.04 7,464,785.501 4.80% 13.59%
0 448 481 6,512,153 3,519,062.703 8,237,717.67 45.96% 26.50%
1 391 479 6,312,874 3,540,688.46 6,443,102.132 43.91% 2.06%
0 361 353 6,146,221 7,053,963.042 5,815,206.101 14.77% 5.39%
0 373 372 6,076,737 8,212,523.181 5,832,920.259 35.15% 4.01%
0 366 362 6,090,438 8,663,461.755 5,988,165.453 42.25% 1.68%
0 400 380 5,995,286 8,374,101.596 6,195,704.112 39.68% 3.34%
0 395 385 5,806,401 6,090,914.649 6,342,805.598 4.90% 9.24%
1 332 310 5,739,097 7,662,073.708 5,426,696.142 33.51% 5.44%
Figure 4 The SVR by RBF kernel function for the dataset (see online version for colours)
Figure 5 The predicted line by ANN for the dataset (see online version for colours)
7 Conclusions
This study presented an SVR algorithm to estimate and predict weekly gasoline consumption in the railway transportation industry in Iran. Furthermore, it considered the effect of the number of holidays per week, the amount of transported freight and the number of transported passengers on the gasoline consumption prediction. This is the first study that integrates conventional time series and SVR for forecasting and modelling gasoline consumption in the railway industry in Iran. Weekly gasoline consumption in the railway transportation industry from August 2009 to December 2011 was considered. By comparing the SVR results with those of other intelligent tools such as ANN, it was revealed that SVR gives better prediction results for the dataset. Therefore, managers could use SVR for the accurate prediction of gasoline consumption when making strategic decisions. Using meta-heuristic approaches such as PSO and GA to determine the three parameters of the SVR model could be an extension of this study. In this paper, we used transported passengers per kilometre, transported tons per kilometre, and the number of holidays per week as the inputs
of our model. As future research, other factors could be used and the results compared to determine the most important and relevant factors for the prediction of gasoline consumption.
Acknowledgements
The authors are grateful for the valuable comments and suggestions from the respected
reviewers. Their valuable comments and suggestions have enhanced the strength and
significance of our paper. This study was supported by a grant from University of Tehran
(Grant No. 8106013/1/14). The authors are grateful for the support provided by the
College of Engineering, University of Tehran, Iran.
References
Azadeh, A., Arab, R. and Behfard, S. (2010) ‘An adaptive intelligent algorithm for forecasting long
term gasoline demand estimation: the cases of USA, Canada, Japan, Kuwait and Iran’, Expert
Systems with Applications, Vol. 37, No. 12, pp.7427–7437.
Azadeh, A., Faghihroohi, S. and Izadbakhsh, H.R. (2012) ‘Optimisation of train scheduling in
complex railways with imprecise and ambiguous input data by an improved integrated model’,
International Journal of Services and Operations Management, Vol. 13, No. 3, pp.310–328.
Bo, L., Wang, L. and Jiao, L. (2007) ‘Recursive finite Newton algorithm for support vector
regression in the primal’, Neural Computation, Vol. 19, No. 4, pp.1082–1096.
Burges, C.J.C. (1998) ‘A tutorial on support vector machines for pattern recognition’, Data Mining and Knowledge Discovery, Vol. 2, No. 2, pp.121–167.
Cao, L. (2003) ‘Support vector machines experts for time series forecasting’, Neurocomputing,
Vol. 51, pp.321–339.
Che, J., Wang, J. and Wang, G. (2012) ‘An adaptive fuzzy combination model based on
self-organizing map and support vector regression for electric load forecasting’, Energy,
Vol. 37, No. 1, pp.657–664.
Coyle, D., DeBacker, J. and Prisinzano, R. (2012) ‘Estimating the supply and demand of gasoline
using tax data’, Energy Economics, Vol. 34, No. 1, pp.195–200.
Crôtte, A., Noland, R.B. and Graham, D.J. (2010) ‘An analysis of gasoline demand elasticities at
the national and local levels in Mexico’, Energy Policy, Vol. 38, No. 8, pp.4445–4456.
Fu, Y.Y., Wu, CH.J., Jeng, J.T. and Ko, CH.N. (2010) ‘ARFNNs with SVR for prediction
of chaotic time series with outliers’, Expert Systems with Applications, Vol. 37, No. 6,
pp.4441–4451.
Hao, P.Y. (2003) Fuzzy Decision Model Using Support Vector Learning – A Kernel Function
Based Approach, PhD thesis, Department of Computer Science and Information Engineering,
National Cheng Kung University, Tainan, Taiwan.
Hong, W.CH. (2010) ‘Application of chaotic ant swarm optimization in electric load forecasting’,
Energy Policy, Vol. 38, No. 10, pp.5830–5839.
Hong, W.CH. (2011) ‘Traffic flow forecasting by seasonal SVR with chaotic simulated annealing
algorithm’, Neurocomputing, Vol. 74, Nos. 12–13, pp.2096–2107.
Hong, W.CH., Dong, Y., Chen, L.Y. and Lai, CH.Y. (2010) ‘Taiwanese 3G mobile phone demand
forecasting by SVR with hybrid evolutionary algorithm’, Expert Systems with Applications,
Vol. 37, No. 6, pp.4452–4462.
Hong, W.CH., Dong, Y., Chen, L.Y. and Wei, SH.Y. (2011a) ‘SVR with hybrid chaotic genetic
algorithms for tourism demand forecasting’, Applied Soft Computing, Vol. 11, No. 2,
pp.1881–1890.
Hong, W.CH., Dong, Y., Zheng, F. and Lai, CH.Y. (2011b) ‘Forecasting urban traffic flow by SVR
with continuous ACO’, Applied Mathematical Modelling, Vol. 35, No. 3, pp.1282–1291.
Hong, W.CH., Dong, Y., Zheng, F. and Wei, SH.Y. (2011c) ‘Hybrid evolutionary algorithms in a
SVR traffic flow forecasting model’, Applied Mathematics and Computation, Vol. 217,
No. 15, pp.6733–6747.
Hsu, C.W., Chang, C.C. and Lin, C.J. (2003) A Practical Guide to Support Vector Classification,
Technical Report, Department of Computer Science, National Taiwan University, Taipei,
Taiwan.
Kavaklioglu, K. (2011) ‘Modeling and prediction of Turkey’s electricity consumption using
support vector regression’, Applied Energy, Vol. 88, No. 1, pp.368–375.
Khare, A. and Handa, M. (2011) ‘Customers’ quality perceptions towards online railway
reservation services in India: an exploratory study’, International Journal of Services and
Operations Management, Vol. 9, No. 4, pp.491–505.
Kim, K.J. (2003) ‘Financial time series forecasting using support vector machines’,
Neurocomputing, Vol. 55, No. 1, pp.307–319.
Lin, K.P., Pai, P.F. and Yang, SH.L. (2011) ‘Forecasting concentrations of air pollutants by
logarithm support vector regression with immune algorithms’, Applied Mathematics and
Computation, Vol. 217, No. 12, pp.5318–5327.
Lu, C.J., Lee, T. and Chiu, CH. (2009) ‘Financial time series forecasting using independent
component analysis and support vector regression’, Decision Support Systems, Vol. 47, No. 2,
pp.115–125.
Lu, CH.J. and Wang, Y.W. (2010) ‘Combining independent component analysis and growing
hierarchical self-organizing maps with support vector regression in product demand
forecasting’, International Journal of Production Economics, Vol. 128, No. 2, pp.603–613.
Meiying, Q., Xiaoping, M., Jianyi, L. and Ying, W. (2011) ‘Time-series gas prediction model using
LS-SVR within a Bayesian framework’, Mining Science and Technology (China), Vol. 21,
No. 1, pp.153–157.
Mukherjee, S., Osuna, E. and Girosi, F. (1997) ‘Nonlinear prediction of chaotic time series using
support vector machines’, NNSP’97: Neural Networks for Signal Processing VII: Proceedings
of the IEEE Signal Processing Society Workshop, pp.511–520.
Muller, K.R., Smola, A.J., Ratsch, G., Scholkopf, B. and Kohlmorgen, J. (1999) ‘Using support
vector machines for time series prediction’, in Scholkopf, B., Burges, C.J.C. and Smola, A.J.
(Eds.): Advances in Kernel Methods – Support Vector Learning, pp.243–254.
Muller, K.R., Smola, J.A., Ratsch, G., Scholkopf, B., Kohlmorgen, J. and Vapnik, V.N. (1997)
‘Predicting time series with support vector machines’, ICANN’97: Proceedings of the seventh
International Conference on Artificial Neural Networks, pp.999–1004.
Nagi, J., Yap, K.S., Nagi, F., Tiong, S.K. and Ahmed, S.K. (2011) ‘A computational intelligence
scheme for the prediction of the daily peak load’, Applied Soft Computing, Vol. 11, No. 8,
pp.4773–4788.
Park, S.Y. and Zhao, G. (2010) ‘An estimation of US gasoline demand: a smooth time-varying
cointegration approach’, Energy Economics, Vol. 32, No. 1, pp.110–120.
Parrella, F. (2007) Online Support Vector Regression, pp.1–101, Department of Information
Science, University of Genoa.
Pock, M. (2010) ‘Gasoline demand in Europe: new insights’, Energy Economics, Vol. 32, No. 1,
pp.54–62.
Rasouli, K., Hsieh, W.W. and Cannon, A.J. (2012) ‘Daily stream flow forecasting by machine
learning methods with weather and climate inputs’, Journal of Hydrology, Vol. 414, No. 11,
pp.284–293.
Rayeni, M.M. and Saljooghi, F.H. (2014) ‘Ranking and measuring efficiency using secondary goals
of cross-efficiency evaluation – a study of railway efficiency in Iran’, International Journal of
Services and Operations Management, Vol. 17, No. 1, pp.1–16.
Sanchez, V.D.A. (2003) ‘Advanced support vector machines and kernel methods’,
Neurocomputing, Vol. 55, No. 1, pp.5–20.
Satapathy, S., Patel, S.K. and Mishra, P.D. (2012) ‘Discriminate analysis and neural network
approach in water utility service’, International Journal of Services and Operations
Management, Vol. 12, No. 4, pp.468–489.
Smola, A.J. (1996) Regression Estimation with Support Vector Learning Machines, Master’s thesis,
Technische Universität München.
Smola, A.J. and Schölkopf, B. (2004) ‘A tutorial on support vector regression’, Statistics and
Computing, Vol. 14, No. 3, pp.199–222.
Su, Q. (2011) ‘The effect of population density, road network density, and congestion on household
gasoline consumption in US urban areas’, Energy Economics, Vol. 33, No. 3, pp.445–452.
Tasdemir, S., Saritas, I., Ciniviz, M. and Allahverdi, N. (2011) ‘Artificial neural network and fuzzy
expert system comparison for prediction of performance and emission parameters on a
gasoline engine’, Expert Systems with Applications, Vol. 38, No. 11, pp.13912–13923.
Tay, F.E.H. and Cao, L.J. (2002) ‘Modified support vector machines in financial time series
forecasting’, Neurocomputing, Vol. 48, No. 1, pp.847–861.
Thissen, U., Brakel, R., Weijer, A.P., Melessen, W.J. and Buydens, L.M.C. (2003) ‘Using support
vector machines for time series prediction’, Chemometrics and Intelligent Laboratory Systems,
Vol. 69, No. 1, pp.35–49.
Togun, N. and Baysec, S. (2010a) ‘Genetic programming approach to predict torque and
brake specific fuel consumption of a gasoline engine’, Applied Energy, Vol. 87, No. 11,
pp.3401–3408.
Togun, N. and Baysec, S. (2010b) ‘Prediction of torque and specific fuel consumption of a gasoline
engine by using artificial neural networks’, Applied Energy, Vol. 87, No. 1, pp.349–355.
Vapnik, V.N. (1995) The Nature of Statistical Learning Theory, 2nd ed., Springer-Verlag,
New York.
Wadud, Z., Noland, R.B. and Graham, D.J. (2010) ‘A semiparametric model of household gasoline
demand’, Energy Economics, Vol. 32, No. 1, pp.93–101.
Wang, Y., Wang, S. and Lai, K.K. (2005) ‘A new fuzzy support vector machines to evaluate credit
risk’, IEEE Transactions on Fuzzy System, Vol. 13, No. 6, pp.820–831.
Yang, H., Huang, K., King, I. and Lyu, M.R. (2009) ‘Localized support vector regression for time
series prediction’, Neurocomputing, Vol. 72, No. 10, pp.2659–2669.
... Also, a hub location problem with several aspects under uncertainty is modelled and tested in a set of computational tests, including more than 150 instances on the data set [21]. Recently, a tree-shaped hub location was presented with a new formulation [22]. ...
... Constraint (16) to (19) and constraint (21) have the same meaning of constraints (11) to (15) but for hubs instead of being for plants whereas constraint (20), which is similar to constraint (8), imposes minimum levels of storage. Constraint (22) guarantees that customer demands are fulfilled. ...
Article
This paper proposes a novel fuzzy mathematical model for a distribution network design problem in a multi-product, multi-period, multi-echelon, multi-plant, multi-retailer, multi-mode of transportation green supply chain system. The three purposes of the model are to minimise total network cost, maximise net profit per capita for each human resource, and diminish CO2 emission throughout the network. P-hub median location with multiple allocations is used for locating the distribution centres. One scenario is designed for fuzzy customer demands with a trapezoidal membership function. Furthermore, the model determines the design of the network (selecting the optimum numbers, locations of plants, and distribution centres to open), finding the best strategy for material transportation through the network with the availability of different transportation modes, the capacities level of the facilities (plants or distribution centres (DCs)), and the number of outsourced products. Finally, all uncertain customer demands for all product types can be satisfied based on the methods mentioned above. This multi-objective mixed-integer non-linear mathematical model is solved by NSGA-II, MOPSO and a hybrid meta-heuristic algorithm. The results show that NSGA-II is the exclusive algorithm that obtains the best result according to the evaluation criteria.
... Other related studies can be addressed through various computational and rating methods [15][16][17][18][19][20][21][22][23][24]. ...
Article
Machine learning grows quickly, which has made numerous academic discoveries and is extensively evaluated in several areas. Optimization, as a vital part of machine learning, has fascinated much consideration of practitioners. The primary purpose of this paper is to combine optimization and machine learning to extract hidden rules, remove unrelated data, introduce the most productive Decision-Making Units (DMUs) in the optimization part, and to introduce the algorithm with the highest accuracy in Machine learning part. In the optimization part, we evaluate the productivity of 30 banks from eight developing countries over the period 2015-2019 by utilizing Data Envelopment Analysis (DEA). An additive Data Envelopment Analysis (DEA) model for measuring the efficiency of decision processes is used. The additive models are often named Slack Based Measure (SBM). This group of models measures efficiency via slack variables. After applying the proposed model, the Malmquist Productivity Index (MPI) is computed to evaluate the productivity of companies. In the machine learning part, we use a specific two-layer data mining filtering pre-processes for clustering algorithms to increase the efficiency and to find the superior algorithm. This study tackles data and methodology-related issues in measuring the productivity of the banks in developing countries and highlights the significance of DMUs productivity and algorithms accuracy in the banking industry by comparing suggested models.
... The Project Management Institute has defined the term 'risk' as an uncertain condition or event that, when it occurs, can have positive or adverse effects on the objectives of a project (Kabirifar et al., 2020). In the current era, the proper management of risks is a very important determinant of a project's success, because attention to any variation in time, cost, or quality performance has increased (Azadeh et al., 2015, 2016; Yazdani et al., 2019). It has been obvious that failing to properly deal with the risks encountered in a project is one of the main reasons for exceeding budget limits, falling behind timelines (schedule), and failing to achieve performance targets and milestones (Yuan et al., 2020; Kabirifar and Mojtahedi, 2019). ...
Article
The current COVID-19 pandemic is making a huge impact on society. Like many sectors, ongoing construction projects have also been either abandoned or halted due to this pandemic, especially in developing countries. We conducted this study to evaluate the impact of the COVID-19 pandemic on construction projects by using the concept of rework projects. "Rework project" is a class of projects that are initiated to achieve the intended objectives in a second attempt after failing to achieve the goals in the first attempt. By comparing the risks/challenges faced during this special category of past projects with normal projects of roughly the same size, we identified the uniquely significant challenges in managing these rework projects after the pandemic. All the projects selected for this study were construction projects in Pakistan. People who were involved in the selected projects in different capacities were interviewed and their responses were analysed. The study shows that some risks faced during rework projects are highly significant even though they have low significance in normal projects. Challenges/risks such as time urgency, overburdened resources, mobilization of contractors, documentation gaps, technological changes, contractual claims with cost claims, and changes in working rates were highly significant in rework projects. By clearly recognising and attending to these highly significant risks, organizations and project managers will be better equipped to devise strategies to manage them and to complete rework projects in the post-pandemic world.
... DEA has a secure link to production theory in economics and to benchmarking in operations management, where a set of measures is designated to benchmark the performance of manufacturing, such as cement companies [2], and of service operations in healthcare, such as the evaluation of hospital efficiency [3]. Azadeh et al. introduced a Support Vector Machine (SVM) for modeling weekly gasoline consumption in the Iran Railway Network (IRN) [4]. They used the Recursive Finite Newton algorithm for training the SVM. ...
Article
Full-text available
Nowadays, as the population of urban areas increases, the need for electricity consumption increases as well. Meeting this consumption requires large-scale power generation centers, which guides technology towards bulk power transmission systems. In this context, two types of power transmission systems, HVDC and EHVAC, can be studied. However, since neither of these technologies has been used in developing countries, a decision should be made on which of them to introduce and develop; applying both technologies together would not be cost-effective. Such a decision involves conflicting objectives across the alternatives and the selection of the best option based on the needs of decision-makers. Multi-objective optimization methods may well provide a solution for this selection. Thus, this paper studies the decision on introducing and developing HVDC and EHVAC in a developing country, Iran. To this end, the measures for this selection are described in detail, and then AHP, one of the well-known MCDM methods, is used to make the final decision.
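The abstract uses AHP for the final decision; for reference, the standard consistency check applied to AHP pairwise-comparison matrices is shown below in generic notation (not specific to the cited study). For an $n \times n$ comparison matrix with principal eigenvalue $\lambda_{\max}$,

```latex
\[
CI = \frac{\lambda_{\max} - n}{n - 1}, \qquad CR = \frac{CI}{RI},
\]
```

where $RI$ is the random index for matrices of order $n$; judgements are conventionally accepted when $CR \le 0.1$.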
... Other related studies can be addressed through various computational and rating methods [15][16][17][18][19][20][21][22][23][24]. ...
Article
The primary purpose of this paper is to combine the Data Envelopment Analysis (DEA) optimization approach with a machine learning clustering method in data mining, in order to identify the most efficient DEA decision-making units (DMUs) and the best clustering algorithm, respectively. In the optimization part, the main goal is to evaluate bank efficiency with cross-efficiency over 2014-2019 using DEA for 12 banks from two developing countries. The cross-efficiency evaluation is an extension of DEA that provides a ranking method and eliminates unrealistic DEA weighting schemes on weight restrictions, without requiring prior information. Applying cross-efficiency can be beneficial for managers to expand their comparison and evaluation. The ranking of DMUs is one of the most critical topics in efficiency assessment. To find the superior model, we consider input-oriented BCC-CCR and CCR-BCC models. This study overcomes some data and methodology issues in measuring the efficiency of developing countries' banks and highlights the importance of inspiring increased efficiency throughout the banking industry by comparing the newly suggested models and the new results. After the optimization step, a clustering method is applied in the machine learning part. Clustering is the procedure of grouping similar items together; such a group of items is called a cluster. Different clustering algorithms can be used according to the behaviour of the data. The Farthest First and Expectation Maximization algorithms have been applied. Finally, the BCC-CCR model and the Farthest First algorithm are proposed as the superior optimization model and machine learning algorithm, respectively.
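Since the abstract builds on cross-efficiency evaluation, the standard cross-efficiency score is recalled below in generic DEA notation (not taken from the cited paper). With the optimal weights $(u_{rk}^{*}, v_{ik}^{*})$ of DMU $k$, the cross-efficiency of DMU $j$ rated by DMU $k$, and its average over all raters, are

```latex
\[
E_{kj} = \frac{\sum_{r=1}^{s} u_{rk}^{*}\, y_{rj}}{\sum_{i=1}^{m} v_{ik}^{*}\, x_{ij}},
\qquad
\bar{E}_{j} = \frac{1}{n}\sum_{k=1}^{n} E_{kj},
\]
```

and the DMUs are ranked by $\bar{E}_{j}$.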
... DEA is a nonparametric mathematical approach that is widely used for measuring the efficiency of decision-making units (DMUs). There is an extensive body of literature on the application of DEA in different areas, including the cement industry [1], healthcare [2,3], expert systems [4][5][6], the power industry [7,8] and other well-known optimization methods [9][10][11][12][13]. Similarly, FDH is a nonparametric method to measure the efficiency of DMUs. It relaxes the convexity assumption of the basic DEA models. ...
Article
Full-text available
During the past decade, applying nonparametric operations research methods such as Data Envelopment Analysis (DEA) has received significant consideration among researchers. In this paper, a new DEA-based SBM-FDH model is introduced. Finally, productivity evaluation of banking systems with the Malmquist Productivity Index (MPI) based on the proposed model is compared with the Slack Based Measurement (SBM) and Free Disposal Hull (FDH) models. The obtained results confirm the high performance of the proposed model in comparison with the other models used in this paper.
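As the abstract evaluates productivity with the Malmquist Productivity Index, the usual distance-function form of the index between periods $t$ and $t+1$ is reproduced here in generic notation:

```latex
\[
MPI =
\left[
\frac{D^{t}\!\left(x^{t+1}, y^{t+1}\right)}{D^{t}\!\left(x^{t}, y^{t}\right)}
\times
\frac{D^{t+1}\!\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\!\left(x^{t}, y^{t}\right)}
\right]^{1/2},
\]
```

with $MPI > 1$ indicating productivity growth between the two periods.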
... Support Vector Machines (also called Support Vector Networks or SVMs) in the form used nowadays were first introduced by Vladimir N. Vapnik and Corinna Cortes in 1995 [17]. Although the main goal and motivation of this method was to perform linear classification [18]-[20], it was later also used for non-linear classification, regression analysis [21]-[25], clustering [26]-[28], and prediction [29]-[40]. This method always provides the global optimum solution while being robust. ...
Preprint
Full-text available
The steel industry has great impacts on the economy and the environment of both developed and underdeveloped countries. The importance of this industry and these impacts have led many researchers to investigate the relationship between a country's steel consumption and its economic activity, resulting in the so-called intensity-of-use model. This paper investigates the validity of the intensity-of-use model for the case of Iran's steel consumption and extends this hypothesis by using indexes of economic activity to model steel consumption. We use the proposed model to train support vector machines and predict future values of Iran's steel consumption. The paper provides detailed correlation tests for the factors used in the model to check their relationships with steel consumption. The results indicate that Iran's steel consumption is strongly correlated with its economic activity, following the same pattern as the economy over the last four decades.
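For readers unfamiliar with the intensity-of-use hypothesis mentioned in this abstract, it is usually stated in terms of the ratio of steel consumption to economic output (generic notation, not the paper's own symbols):

```latex
\[
IU_t = \frac{SC_t}{GDP_t},
\]
```

with the hypothesis that $IU_t$ first rises and then falls (an inverted-U) as income per capita grows.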
... The data points on the two marginal hyperplanes that are responsible for their construction are called the support vectors. We refer the reader to MLSVM [44], which leverages hierarchical learning for SVM, and to [45], a novel SVM case study, for more details on support vectors. Here, with the help of the LIBSVM Python package [46], we implement the ε-SVR and ν-SVR methods (a minimal fitting sketch is given after this entry). ...
Preprint
Full-text available
This research introduces a framework for forecasting, reconstruction and feature engineering of multivariate processes. We integrate derivative-free optimization with an ensemble of sequence-to-sequence networks. We design a new resampling technique, called additive resampling, which along with bootstrap aggregating (bagging) is applied to initialize the ensemble structure. We explore the proposed framework's performance on three renewable energy sources: wind, solar and ocean waves. We conduct several short- to long-term forecasts showing the superiority of the proposed method compared to numerous machine learning techniques. The findings indicate that the introduced method performs comparatively better as the forecasting horizon becomes longer. In addition, we modify the framework for automated feature selection. The model provides a clear interpretation of the selected features. We investigate the effects of different environmental and marine factors on wind speed and ocean output power, respectively, and report the selected features. Moreover, we explore the online forecasting setting and illustrate that the model outperforms alternatives across different measures of error.
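The excerpt above mentions fitting ε-SVR and ν-SVR through the LIBSVM Python package; the sketch below shows the same two fits using scikit-learn's LIBSVM-backed SVR and NuSVR as a stand-in, on a synthetic series (the data and hyperparameters are illustrative assumptions, not values from any cited study).

```python
# Hedged sketch: epsilon-SVR and nu-SVR on synthetic data, using scikit-learn's
# LIBSVM-backed estimators as a stand-in for the LIBSVM Python package.
import numpy as np
from sklearn.svm import SVR, NuSVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))                 # synthetic predictor
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)    # noisy target, illustrative only

eps_svr = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X, y)  # epsilon sets the insensitive tube width
nu_svr = NuSVR(kernel="rbf", C=10.0, nu=0.5).fit(X, y)       # nu bounds the fraction of support vectors

print(len(eps_svr.support_), len(nu_svr.support_))           # support vectors used by each model
```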
Article
Due to travel restrictions and the general economic slowdown caused by the Coronavirus Disease 2019 (COVID-19), the gasoline consumption profile has exhibited unusual behavior. Depending on the severity of lockdown policies, the consumption pattern has changed even at different stages of the epidemic. Forecasting gasoline demand has become a more difficult and essential tool for energy planning. Therefore, reliable models are needed to ensure energy security in pandemic conditions. Presenting a case study on Turkey, this paper investigates the impact of the COVID-19 pandemic on gasoline demand. Four common machine learning models, including Gaussian Process Regression, Sequential Minimal Optimization Regression, Multi-Layer Perceptron Regressor, and Random Forest, were used to estimate daily gasoline consumption. In the training of the models, inputs such as historical gasoline demand, national holidays, date attributes, gasoline price, and COVID-19 related factors such as curfews and travel bans were considered. Analysis results showed that the Random Forest model performed best with the highest correlation coefficient (0.959) and the lowest mean absolute percentage error (11.526%), and root mean square percentage error (17.022%) values in the test dataset. This study can help policymakers understand the impact of such an emergency on the energy industry and respond quickly to potential threats.
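Since the abstract singles out Random Forest as the best of the four models, a minimal sketch of such a regressor on made-up daily features (calendar attributes, price, a curfew flag) is given below; the feature set, values and hyperparameters are illustrative assumptions, not the study's data.

```python
# Hedged sketch of a Random Forest regressor for daily gasoline demand.
# Features and the synthetic demand relationship are placeholders, not the cited study's data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([
    rng.integers(0, 7, n),        # day of week
    rng.integers(0, 2, n),        # national holiday flag
    rng.integers(0, 2, n),        # curfew / travel-ban flag
    rng.uniform(1.0, 2.0, n),     # gasoline price (arbitrary units)
])
y = 100 - 30 * X[:, 2] - 10 * X[:, 3] + rng.normal(0, 5, n)   # synthetic daily demand

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(mean_absolute_percentage_error(y_te, model.predict(X_te)))  # hold-out MAPE
```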
Article
We use a variant of machine learning (ML) to forecast Australia's automobile gasoline demand within an autoregressive and structural model. By comparing the outputs of various model specifications, we find that training-set selection plays an important role in forecasting accuracy. More specifically, the performance of training sets starting within identified systematic patterns is relatively worse, and the impact on forecast errors is substantial. We explain these systematic variations in machine learning performance, and explore the intuition behind the ‘black-box’ with the support of economic theory. An important finding is that these time points coincide with structural changes in Australia's economy. By examining the out-of-sample forecasts, the model's external validity can be demonstrated under normal situations; however, its forecasting performance is somewhat unsatisfactory under event-driven uncertainty, which calls for future research to develop alternative models that depict the characteristics of rare and extreme events in an ex-ante manner.
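To make the training-set-selection point concrete, the sketch below fits the same lag-based linear autoregression on two different training windows of a synthetic series and compares hold-out errors; the series, lag order and window boundaries are all illustrative assumptions, not the cited study's setup.

```python
# Sketch: how the choice of training window can change autoregressive forecast accuracy.
# The series and window boundaries are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(2)
t = np.arange(400)
series = 50 + 0.05 * t + 5 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 1, t.size)

def lagged(y, p=4):
    """Build a lag-p design matrix and the aligned targets."""
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

X, y = lagged(series)
test = slice(-52, None)                       # hold out the last "year"
for start in (0, 200):                        # two candidate training windows
    train = slice(start, -52)
    model = LinearRegression().fit(X[train], y[train])
    print(start, mean_absolute_error(y[test], model.predict(X[test])))
```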
Article
Full-text available
With increased globalisation, technology is witnessing growing application and acceptance across service delivery platforms. Service organisations are using technology for improving their service quality and customer service culture by developing profitable, long-term relationships with customers (Webster, 1992; Achrol, 1997; Gounaris, 2005). Delivery of superior service quality is an important determinant of success in the service industry. An exploratory research was conducted to study customer experience of the online reservation system of the Indian railways. The purpose was to understand the customer's perception regarding technology deployment in improving services. Dimensions of the service quality model (Parasuraman et al., 1988) were used for understanding perceptions regarding the online railway reservation system. The study indicates that though the service is gaining acceptability in the country, there still exists considerable scope for improvement in customer experience with regard to the various service quality dimensions for the online reservation system. Based on the findings of the study, recommendations for improvement in various aspects of the service are stated.
Article
Full-text available
Modelling and scheduling of cargo and passenger trains with time limitations, queue priority and limited station lines is a cumbersome task. Furthermore, when the distributions of traverse times are unknown or cannot be estimated, the conventional simulation approach fails and the need for a more intelligent approach becomes inevitable. This paper presents an integrated fuzzy modelling and simulation approach for such ambiguous cases. The case of this study is based on a specific and actual train route (800 kilometres). To show the superiority of the approach, the system under study was modelled through both the conventional and the fuzzy simulation approaches. For the purpose of verification and validation, the fuzzy interval data of the fuzzy simulation model were compared with the interval data of the actual system through fuzzy analysis of variance (ANOVA). Then, random data from the fuzzy simulation model, the conventional simulation model and the actual system were compared with design of experiments (DOE). Hence, ANOVA and the least significant difference (LSD) method were used to show that the fuzzy simulation approach is an ideal and superior modelling approach for uncertain and ambiguous railroad scheduling systems such as the actual case of this study.
Article
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
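For reference, the ε-insensitive support vector regression problem that this tutorial develops training algorithms for can be written in its standard primal form (textbook notation):

```latex
\[
\min_{w,\, b,\, \xi,\, \xi^{*}} \;
\tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{\ell}\bigl(\xi_i + \xi_i^{*}\bigr)
\quad \text{s.t.} \quad
\begin{cases}
y_i - \langle w, x_i\rangle - b \le \varepsilon + \xi_i,\\[2pt]
\langle w, x_i\rangle + b - y_i \le \varepsilon + \xi_i^{*},\\[2pt]
\xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1,\dots,\ell.
\end{cases}
\]
```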
Article
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
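The two kernels for which the tutorial computes the VC dimension take the following standard forms (homogeneous polynomial of degree $d$ and Gaussian RBF with width $\sigma$):

```latex
\[
K_{\mathrm{poly}}(x, x') = \langle x, x'\rangle^{\,d},
\qquad
K_{\mathrm{RBF}}(x, x') = \exp\!\left(-\frac{\lVert x - x'\rVert^{2}}{2\sigma^{2}}\right).
\]
```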
Chapter
In the history of research on the learning problem, one can identify four periods characterized by four notable events: (i) constructing the first learning machines, (ii) constructing the fundamentals of the theory, (iii) constructing neural networks, and (iv) constructing alternatives to neural networks.
Article
This paper makes use of cross-efficiency measures as an extension of data envelopment analysis (DEA). The cross-efficiency evaluation not only provides a ranking among the decision-making units (DMUs) but also eliminates unrealistic DEA weighting schemes without requiring a priori information on weight restrictions. A problem that possibly reduces the usefulness of the cross-efficiency evaluation method is that the cross-efficiency scores may not be unique due to the presence of alternate optima. It is therefore recommended that secondary goals be introduced in cross-efficiency evaluation. In some cases, pursuing the best ranking is more important than maximising the individual score. This paper develops a new secondary goal based on symmetric weights selection. To illustrate the models, railway activities from 1977 to 2010 are considered, and the efficiency of each year is calculated and compared with the other years.
Article
Given the conflicting preferences amongst stakeholders and the incomplete, uncertain and contradictory understanding of water services by Indian consumers, it is recognised that managing water resources sustainably is a wicked problem. In India, customer satisfaction and service care push professionals in the water industry every day to seek to improve their performance, lowering costs and increasing the service level provided, yet the actual water supply available to residents is intermittent and inequitable. Despite concerted efforts, the demand-supply gap is on the rise. This imbalance is further exacerbated by the high level of non-revenue water, including both technical and commercial losses. This paper develops a systematic assessment of the sustainability of water services provided to consumers in rural, urban and municipal areas in India using a neural network method, and consumer-wise perception is calculated using a linear discriminant method.