IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING
IEEJ Trans 2020
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI:10.1002/tee.23088
Paper
Electricity Consumption Prediction Based on LSTM with
Attention Mechanism
Zhifeng Lin*, Non-member
Lianglun Cheng**, Non-member
Guoheng Huang**a, Non-member
Power data analysis in power system, such as electricity consumption prediction, has always been the basis for the power
department to adjust electricity price, substation regulation, total load prediction and peak avoidance management. In this paper,
a short-term time-phased electricity consumption prediction model based on Long Short-Term Memory (LSTM) with an attention
mechanism is proposed. First, the attention mechanism is used to assign weight coefficients to the input sequence data. Then,
the output value of every cell of LSTM is calculated according to the forward propagation method, and the error between the
real value and the predicted value is calculated using the back-propagation method. The gradient of each weight is calculated
according to the corresponding error term, and the weight of the model is updated by the gradient descent direction to make
the error smaller. Using modeling and predicting experiments on different types of electricity consumption, the results show that
the prediction accuracy of the model proposed increased by 6.5% compared to the state-of-the-art model. The model has a good
effect on electricity consumption prediction. Not only can it be close to actual results numerically, but it can also better predict
the development trend of data. ©2020 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
Keywords: electricity consumption prediction; attention mechanism; LSTM; error optimization
Received 12 April 2019; Revised 18 September 2019
1. Introduction
Electricity consumption prediction is one of the core technolo-
gies in the construction of smart grid, also known as grid 2.0. It
also plays important roles in electricity development planning and
business planning. Currently, the electricity industry is developing
rapidly based on the electricity transmission and public electric-
ity resources provided by the State Grid. The electricity generated
by a smart grid can be used not only for electricity supply in
the region but also for transmission to other regions through the
national grid line. In addition, some studies have shown that elec-
tricity prediction can also help improve the efficiency of electricity
distribution in smart grids, especially in electricity stations. As
the electricity transmission cost from the electricity station to the
substation or the user is very high in the electricity grid, unneces-
sary costs can be reduced through electricity consumption analysis
and prediction before the planned construction of the power trans-
mission network infrastructure [1]. Therefore, accurately analyzing and predicting electricity consumption is not only the key to ensuring the smooth operation of national or regional social and economic systems, but also necessary to ensure the development of the electricity industry.
In the field of time series, there are still many shortcomings
in the current study of electricity consumption prediction, such
as the prediction of different types of electricity consumption.
Many uncertain impact factors make the prediction difficult and
complex, such as irregular data fluctuations, measurement error,
and so on [2,3]. Therefore, a new method should be proposed
a Correspondence to: Guoheng Huang. E-mail: kevinwong@gdut.edu.cn
* Laboratory of Cyber-Physical System, Department of Computer Science and Technology, School of Computers, Guangdong University of Technology, Guangzhou, China
** School of Computers, Guangdong University of Technology, Guangzhou, China
to solve such problems. Currently, there are some traditional
statistical-based models, such as Holt-Winters model [4,5] and
the Auto Regressive Integrated Moving Average (ARIMA) model
[6,7]. The Holt-Winters model is used to predict the electricity
consumption of the data centers, which can remarkably increase
the energy efficiency of data centers [8]. The ARIMA model is
used to predict the electricity consumption of medical institutions,
but smoothing of the data is needed at the beginning [9]. These
models are required to enhance the smoothness and quantity of
data. If these requirements are not met, the prediction effects of
these models are relatively poor. Moreover, for different types of
data, it is always necessary to manually adjust the parameters,
which makes it difficult to generalize the model. In order to
solve the problem of generalization, some machine learning-based
prediction methods have been proposed, such as Support Vector
Machine (SVM) [10,11], neural network [12,13] and so on. These
methods have different variants depending on the application
scenario. Among them, Long Short-Term Memory (LSTM) and
Gated Recurrent Unit (GRU) are two different variants of recurrent
neural networks (RNNs), which have better predictive effects on
time series prediction. In Ref. [14], an LSTM network is utilized that takes a sequence of past consumption profiles to perform a month-ahead electricity consumption prediction. The multilayer GRU is used to construct the model
to predict the electricity consumption [15]. However, these two
methods are not accurate enough to extract the features of the
training data to obtain the best prediction results.
Due to the particularity and variability of electricity consump-
tion data, if the model cannot purposefully learn from key data,
the prediction accuracy of the model will be relatively poor. Elec-
tricity consumption data can be divided into different categories
according to the type of electricity consumption, such as residen-
tial electricity, commercial electricity, large industrial electricity,
agricultural electricity, and so on. Electricity consumption of vari-
ous types has different trends and features. For example, regarding
residential electricity, electricity consumption is relatively low and
is affected by the region, the season, and so on [16]. Regarding
business electricity consumption, it varies from region to region
as business practices in different regions have different character-
istics. It is also the same for industrial and agricultural electricity.
Thus, how to accurately extract the specific features of electricity
consumption sequence in different types is the key to improv-
ing the prediction effect. However, the methods above perform
indiscriminate learning on the time series, which results in poor
prediction.
In order to perform features extraction on electricity consump-
tion data more efficiently, we propose a module based on LSTM
network with an attention mechanism for electricity consumption
prediction. The contributions of our study are shown as follows:
1. Apply LSTM as a basic model to the field of electricity
consumption prediction and achieve better results.
2. The attention mechanism is used to assign weight coeffi-
cients to the input sequence data so that the specific features
can be accurately extracted.
3. Effectively improve the accuracy of electricity consumption
prediction based on real-world datasets and have reference
values in the construction of smart grid. The location of our
proposed scheme in the smart grid construction framework
is shown in Fig. 1.

Fig. 1. Smart grid framework
The content of this paper is mainly divided into four sections. The second section introduces the overall structure of the electricity consumption prediction model, the third section introduces the experimental process and experimental results, and the last section presents the conclusion and future work.
2. Electricity Consumption Prediction
with Attention-LSTM
The overall architecture of the model is shown in Fig. 2, includ-
ing input and output modules, sequence attention mechanism,
LSTM network and weight optimization module.
The input data are the electricity consumption sequence data for
a period of time, and the output data are the predicted electricity
consumption sequence data for a period of time after that. The
attention mechanism, which will be introduced in detail in Section
2.1, is used to weight the input training data to make it easier
to learn. Layers in the LSTM network include an input layer, an
output layer and a hidden layer. The values of the input sequence
form the input layer, which is the first layer, and the output layer is
the final layer that contains the predicted result. The hidden layer
exists between the input and output layers. To reduce learning
errors of the weights of the attention mechanism and LSTM, the
errors from the previous iteration are fed back into the network, and
the weights are optimized; more training details will be introduced
in Section 2.2.
2.1. Data weighted with attention mechanism The
attention mechanism stems from the study of human vision.
Fig. 2. LSTM with attention mechanism
Fig. 3. Graphical illustration of attention mechanism
Attention is used in the field of machine translation; when computing the attention probability distribution, it assigns a probability to each word in the input sentence [17]. Then, the soft
attention model and hard attention model are proposed. The soft
attention model is a fully differentiable deterministic mechanism
that spreads to other parts of the network while propagating
through the attention mechanism. The hard attention model is
a stochastic process in which the system randomly samples an
implicit state instead of using all implicit states for decoding [18].
As the gradient can be directly calculated rather than estimated
by a random process and can be effectively integrated with the
prediction algorithm, we choose soft attention as the attention
mechanism.
In order to make rational use of limited visual information
processing resources, humans need to select a specific part of
the visual area and then focus on it. Likewise, in order to allow
the model to focus on sequence segments that can represent key
features of the whole sequence, the soft attention mechanism
is used to improve the system performance of the electricity
consumption sequence learning task. The graphical illustration of
the attention proposed is shown in Fig. 3.
In the attention mechanism, the weight $a_t^k$ of the input sequence is computed according to the previous hidden state $h_{t-1}$ and the previous cell state $c_{t-1}$, and then the computed $\tilde{X}_t$ is fed into the LSTM unit.
In the process of electricity consumption prediction, data need to be preprocessed to meet the input requirements of the attention mechanism. Given an electricity consumption sequence $Seq = \{s_1, s_2, s_3, \ldots, s_N\}$, we divide it into a training sequence $Seq_{train} = \{s_1, s_2, s_3, \ldots, s_M\}$ and a testing sequence $Seq_{test} = \{s_{M+1}, s_{M+2}, s_{M+3}, \ldots, s_N\}$, where $N$ is the length of the sequence and $M$ is the length of the training sequence. Then, the training sequence is divided into $n$ sequence segments. The value of $n$ can be calculated using the following formula:

$$n = \frac{M - T}{k} + 1 \qquad (1)$$
where $T$ is the length of each sequence segment and also represents the number of LSTM cells, and $k$ is the step size that the data move backward each time the data are segmented.

Fig. 4. LSTM network
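As a concrete illustration of (1), the sliding-window segmentation can be sketched in a few lines of Python; the function name and the NumPy handling below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def segment_series(seq_train, T, k):
    """Split a 1-D training sequence into n = (M - T) / k + 1 overlapping
    segments of length T, moving k steps forward for each segment (Eq. (1))."""
    M = len(seq_train)
    n = (M - T) // k + 1
    # Each row of X_o is one sequence segment x_i of length T.
    return np.stack([seq_train[i * k : i * k + T] for i in range(n)])

# Example: 768 hourly records, segment length T = 8, step k = 1 -> 761 segments.
seq_train = np.random.rand(768)
X_o = segment_series(seq_train, T=8, k=1)
print(X_o.shape)  # (761, 8)
```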
The set of sequence segments is $X_o = Seq_{train} = \{x_1, x_2, x_3, \ldots, x_n\}$, where $x_2$ represents the second sequence segment, $x_3$ is the third sequence segment, and so on. $x_t^1$, as an element of $X_t = \{x_t^1, x_t^2, x_t^3, \ldots, x_t^n\}$, $t = \{1, 2, 3, \ldots, T\}$, represents the value of the first sequence segment at time $t$. The attention mechanism can be constructed via an input $X_o$ by referring to the previous hidden state $h_{t-1}$ and the cell state $c_{t-1}$ in the LSTM unit with:

$$e_t^k = V_e \tanh(W_e[h_{t-1}; c_{t-1}] + U_e x_t^k + B_e) \qquad (2)$$

and

$$a_t^k = \frac{\exp(e_t^k)}{\sum_{i=1}^{n} \exp(e_t^i)} \qquad (3)$$

where $V_e$, $W_e$, $U_e$ are parameters to learn; $B_e$ is the bias term; and $a_t^k$, $k = \{1, 2, 3, \ldots, n\}$, is the attention weight measuring the importance of the input electricity consumption sequence at time $t$. The SoftMax function is applied to $e_t^k$, $k = \{1, 2, 3, \ldots, n\}$, to ensure
all the attention weights sum to 1. The training sequence segment
is weighted by (2) and (3), and the segment that has a greater influence
on the prediction effect will be given a greater weight. In electricity
consumption prediction, the model with an attention mechanism
will focus more on periods that include peak electricity and sudden
changes in electricity rather than treating all time periods equally.
The attention mechanism is a feedforward network that can be
jointly trained with other components of the LSTM. With these
attention weights, the sequence can be adaptively extracted with:
$$\tilde{X}_t = (a_t^1 x_t^1,\; a_t^2 x_t^2,\; a_t^3 x_t^3,\; \ldots,\; a_t^n x_t^n) \qquad (4)$$
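To make (2)-(4) concrete, the following NumPy sketch computes the attention scores from the previous hidden and cell states, normalizes them with SoftMax, and re-weights the segment values at one time step. The parameter shapes and random initialization are assumptions made for illustration, not the paper's released code.

```python
import numpy as np

def softmax(e):
    e = e - e.max()               # numerical stability
    return np.exp(e) / np.exp(e).sum()

def attention_step(x_t, h_prev, c_prev, V_e, W_e, U_e, B_e):
    """x_t: values of the n segments at time t, shape (n,).
    Returns the weighted input of Eq. (4) and the weights a_t of Eq. (3)."""
    hc = np.concatenate([h_prev, c_prev])          # [h_{t-1}; c_{t-1}]
    # Eq. (2): one score per segment
    e_t = np.array([V_e @ np.tanh(W_e @ hc + U_e * x_k + B_e) for x_k in x_t])
    a_t = softmax(e_t)                             # Eq. (3)
    return a_t * x_t, a_t                          # Eq. (4)

# Toy example with n = 5 segments and a hidden size of 4.
rng = np.random.default_rng(0)
n, hidden = 5, 4
x_t = rng.random(n)
h_prev, c_prev = np.zeros(hidden), np.zeros(hidden)
V_e, W_e = rng.random(hidden), rng.random((hidden, 2 * hidden))
U_e, B_e = rng.random(hidden), np.zeros(hidden)
x_weighted, a_t = attention_step(x_t, h_prev, c_prev, V_e, W_e, U_e, B_e)
print(a_t.sum())  # 1.0
```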
2.2. Prediction with LSTM LSTM is an improved
RNN network that is good at exploiting nonlinear relationships
between time series data. It replaces the hidden layer of RNN
cells with LSTM cells, which adds a state for long-term memory,
as shown in Fig. 4.
Compared with RNN, the LSTM cell has one more state $c$, which enables LSTM to have long-term memory. In electricity consumption prediction, the LSTM network consists of $T$ LSTM cells arranged in order; each LSTM cell can be constructed via the hidden state $h_{t-1}$ and the cell state $c_{t-1}$ from the upper layer of cells, and the input sequence $\tilde{X}_t$, which is the output of the attention mechanism.
LSTM enhances the control of data weights by introducing the
concept of a gate to control long-term state c. The forget gate
controls the hidden state of the upper layer, the input gate controls
the input data, and the output gate controls the output data of the
layer. Detailed architecture of LSTM cell is shown in Fig. 5.
Fig. 5. LSTM cell structure

Given an electricity consumption sequence weighted by the attention mechanism, $\tilde{X}_t = (a_t^1 x_t^1, a_t^2 x_t^2, a_t^3 x_t^3, \ldots, a_t^n x_t^n)$, maximum-minimum normalization is used to process the data as follows:

$$\tilde{X}_t' = \frac{\tilde{X}_t - \min(\tilde{X}_t)}{\max(\tilde{X}_t) - \min(\tilde{X}_t)} \qquad (5)$$
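A minimal sketch of the maximum-minimum normalization in (5), assuming plain NumPy arrays:

```python
import numpy as np

def max_min_normalize(x):
    """Scale a sequence to the [0, 1] range as in Eq. (5)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

print(max_min_normalize([110, 120, 135, 145]))  # [0. 0.2857 0.7143 1.]
```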
The output of the forget gate is calculated using the following formula:

$$f_t = \sigma(w_{xf} a_t^k x_t^k + w_{hf} h_{t-1} + b_f) \qquad (6)$$

where $x_t^k$ is an element of $\tilde{X}_t$, $k = \{1, 2, 3, \ldots, n\}$; $w_{xf}$ is the weight coefficient matrix of the input $x$ to the forget gate $f$; $w_{hf}$ is the weight coefficient matrix of the hidden state of the upper layer $h_{t-1}$ to the forget gate $f$; $b_f$ is the bias of the forget gate; and $\sigma$ is the sigmoid activation function.
The cell state $c_t$ can be updated with:

$$i_t = \sigma(w_{xi} a_t^k x_t^k + w_{hi} h_{t-1} + b_i)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(w_{xc} a_t^k x_t^k + w_{hc} h_{t-1} + b_c) \qquad (7)$$

where $i_t$ is the output of the input gate; $w_{xi}$ is the weight coefficient matrix of the input $x$ to the input gate $i$; $w_{hi}$ is the weight coefficient matrix of the hidden state of the upper layer $h_{t-1}$ to the input gate $i$; $w_{xc}$ is the weight coefficient matrix of the input $x$ to the candidate cell state $c$; $w_{hc}$ is the weight coefficient matrix of the hidden state of the upper layer $h_{t-1}$ to the candidate cell state $c$; and $\circ$ and $\tanh$ are elementwise multiplication and the hyperbolic tangent activation function, respectively.
The hidden state $h_t$ can be updated with:

$$o_t = \sigma(w_{xo} a_t^k x_t^k + w_{ho} h_{t-1} + b_o)$$
$$h_t = o_t \circ \tanh(c_t) \qquad (8)$$

where $o_t$ is the output of the output gate, $w_{xo}$ is the weight coefficient matrix of the input $x$ to the output gate $o$, and $w_{ho}$ is the weight coefficient matrix of the hidden state of the upper layer $h_{t-1}$ to the output gate $o$.
The predicted value of the model can be calculated with:
$$p_t = \sigma(V h_t + c_t) \qquad (9)$$
In the process of electricity prediction, only the value of the last
LSTM cell is output at a time.
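The gate computations in (6)-(8) can be written directly in NumPy for a single cell step. Treating the attention-weighted input as an n-dimensional vector with full weight matrices is a simplifying assumption for illustration; the output mapping of (9) is omitted here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step following Eqs (6)-(8).
    x_t: attention-weighted input at time t, shape (n,).
    W: dict of weight matrices w_x*, w_h*; b: dict of bias vectors."""
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + b["f"])      # forget gate (6)
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + b["i"])      # input gate  (7)
    c_hat = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])    # candidate state
    c_t = f_t * c_prev + i_t * c_hat                              # cell state  (7)
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + b["o"])      # output gate (8)
    h_t = o_t * np.tanh(c_t)                                      # hidden state (8)
    return h_t, c_t

# Toy dimensions: n = 5 weighted inputs, hidden size 4.
rng = np.random.default_rng(1)
n, hidden = 5, 4
W = {k: rng.normal(size=(hidden, n)) for k in ("xf", "xi", "xc", "xo")}
W.update({k: rng.normal(size=(hidden, hidden)) for k in ("hf", "hi", "hc", "ho")})
b = {k: np.zeros(hidden) for k in ("f", "i", "c", "o")}
h, c = lstm_cell_step(rng.random(n), np.zeros(hidden), np.zeros(hidden), W, b)
```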
The root mean squared error (RMSE) is used as the error calculation formula for the predicted value and the true value. The loss function is defined as:

$$loss = \sqrt{\frac{\sum_{i=1}^{n}(p_i - y_i)^2}{n}} \qquad (10)$$
where $p_i$ is the predicted data, and $y_i$ is the true data.
Given the network initialization random number seed, the learning rate $\eta$, and the number of training iterations $steps$, the Back Propagation Through Time (BPTT) method is used to minimize the error, and the weights of the attention mechanism and the LSTM network are iteratively optimized in the meanwhile. BPTT is a time-based back-propagation algorithm. First, the data are predicted based on the forward calculation method, and then the error between the actual value and the predicted value is propagated backward. Unlike conventional back-propagation, the error is propagated backward through the time steps; the gradient of each weight is calculated according to the corresponding error term, and the weight of the model is updated in the direction of gradient descent.
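For a framework-level view, the training loop described above (forward pass, RMSE loss of (10), back-propagation through time, gradient-based weight update) might be sketched in PyTorch as follows. The stand-in model, tensor shapes, and data are placeholders, not the authors' implementation; only the hyperparameters stated later in Section 3 (Adam, learning rate 0.01, 500 iterations) are taken from the paper.

```python
import torch
import torch.nn as nn

# Placeholder model and data; shapes are illustrative only.
model = nn.LSTM(input_size=1, hidden_size=4, batch_first=True)  # stand-in for the attention-LSTM
head = nn.Linear(4, 1)
x = torch.rand(761, 8, 1)      # 761 segments of length T = 8
y = torch.rand(761, 1)         # next-step targets

params = list(model.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=0.01)   # learning rate eta = 0.01

for step in range(500):                              # steps = 500 iterations
    optimizer.zero_grad()
    out, _ = model(x)                                # forward pass through time
    pred = head(out[:, -1, :])                       # only the last cell's output is used
    loss = torch.sqrt(torch.mean((pred - y) ** 2))   # RMSE loss of Eq. (10)
    loss.backward()                                  # BPTT: gradients for every weight
    optimizer.step()                                 # gradient-based weight update
```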
Algorithm 1
Input:  The input sequence Seq
        The number of training data M
        The length of each sequence segment T
        The LSTM parameters S_state, seed, steps
Output: Predicted sequence P_o'
        Evaluation parameter R^2
 1: N = Len(Seq)
 2: Get Seq_train, Seq_test from Seq by M
 3: Generate X_o from Seq_train by T
 4: For each t ∈ [1, T]
 5:     Get X_t from X_o at time t
 6:     X_t' = Max_Min(X_t)
 7:     X̃_t = Attention_Mechanism(X_t')
 8:     Append X̃ with X̃_t
 9: End
10: Create LSTM_cell by S_state(c, h)
11: Connect LSTM_net by LSTM_cell
12: Initialize LSTM_net by seed
13: For each step ∈ [1, steps]
14:     P = LSTM_net(X̃)
15:     Update weights by using BPTT with Loss and η
16: End
17: Get LSTM*_net
18: Get Te_1 from Seq_train
19: For each i ∈ [1, (N − M)]
20:     p_i = LSTM*_net(Te_i)
21:     Get Te_{i+1} from Te_i and p_i
22:     Append P_o with p_i
23: End
24: Output P_o' = de_Max_Min(P_o), R^2(P_o', Seq_test)
In the testing session, the trained LSTM network with an attention mechanism is represented as $LSTM^*_{net}$. According to the LSTM network, each iteration can predict the data at the next point in time; that is, given the initial input sequence $Te_1 = \{s_{M-T+1}, s_{M-T+2}, s_{M-T+3}, \ldots, s_M\}$, where $s$ are the last $T$ elements of the training sequence, the prediction result of the next moment is $p_1 = LSTM^*_{net}(Te_1)$, and then $p_2 = LSTM^*_{net}(Te_2)$, where $Te_2$ is composed of $\{s_{M-T+2}, s_{M-T+3}, \ldots, s_M, p_1\}$. In this way, we can obtain a set of predicted sequences $P_o = \{P_1, P_2, P_3, \ldots, P_{N-M}\}$.
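The rolling one-step-ahead prediction over the test horizon can be written as a short loop; lstm_net below stands for the trained attention-LSTM and is an assumed callable rather than an actual API.

```python
import numpy as np

def rolling_forecast(lstm_net, seq_train, T, horizon):
    """Predict `horizon` future points one step at a time (testing procedure).
    lstm_net: callable mapping a length-T window to the next value (assumed)."""
    window = list(seq_train[-T:])          # Te_1: last T elements of the training data
    preds = []
    for _ in range(horizon):
        p = lstm_net(np.array(window))     # p_i = LSTM*_net(Te_i)
        preds.append(p)
        window = window[1:] + [p]          # Te_{i+1}: drop oldest value, append prediction
    return np.array(preds)
```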
Then, the prediction sequence is denormalized by the following formula:

$$P_o' = [P_o - \min(P_o)] \times [\max(\tilde{X}_t) - \min(\tilde{X}_t)] \qquad (11)$$
Finally, given $Seq_{test} = \{s_{M+1}, s_{M+2}, s_{M+3}, \ldots, s_N\}$, the electricity consumption prediction model is evaluated by calculating the coefficient of determination $R^2$ of the real sequence and the predicted value of the testing sequence with:

$$SS_{tot} = \sum (p_i - \bar{s})^2$$
$$SS_{res} = \sum (s_i - p_i)^2$$
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \qquad (12)$$

where $SS_{tot}$ is the error between the real data and the average, $p_i$ represents the predicted data, and $\bar{s}$ is the average of the real data; $SS_{res}$ is the sum of squares of the residuals of the real data and the predicted data, and $s_i$ is the real data. If the predicted data are closer to the true data, the value of $R^2$ will be closer to 1.
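A direct Python translation of the evaluation metric, following (12) as printed above:

```python
import numpy as np

def r2_score(real, predicted):
    """Coefficient of determination R^2 as defined in Eq. (12)."""
    real, predicted = np.asarray(real), np.asarray(predicted)
    s_bar = real.mean()                        # average of the real data
    ss_tot = np.sum((predicted - s_bar) ** 2)  # SS_tot
    ss_res = np.sum((real - predicted) ** 2)   # SS_res
    return 1.0 - ss_res / ss_tot
```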
Algorithm 1 is listed above in order to present the model training and evaluation flow more concisely.
3. Experiment
This experiment combines the actual situation of China Southern
Power Grid and applies the method proposed to predict electricity
consumption.
The running environment of the experiment is Python 3.6 on Ubuntu 14.04; the CPU configuration is an Intel Core i5-7300HQ; and the GPU configuration is an Nvidia GeForce GTX 1050.
The experimental datasets consist of numerical records from the first power supply station of China Southern Power Grid, covering May 20 to June 20, 32 days in total. The electricity collected belongs to four categories: 'Residential', 'Large Industrial electricity', 'Business' and 'Agricultural'. Each electricity type contains 768 electricity consumption records. The range of the training set for each electricity type is $1 \sim M$, $M = 768 \times 0.8$, and the range of the testing set is $(M+1) \sim N$, $N = 768$. The Adam optimizer is used to train the model; the input layer $T$ is 8, the output layer is 1, the hidden layer contains four neurons, the random seed is 1, the number of iterations $steps$ is 500, and the learning rate $\eta = 0.01$.
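For illustration only, the stated split and hyperparameters could be set up as follows; the paper does not name its deep-learning framework, and the integer rounding of M is an assumption.

```python
import numpy as np

# Stated experiment settings; integer rounding of M is an assumption.
N = 768                      # records per electricity type (hourly, 32 days)
M = int(N * 0.8)             # training range 1..M -> 614 records
T, seed, steps, eta = 8, 1, 500, 0.01

series = np.random.rand(N)   # placeholder for one electricity-type series
seq_train, seq_test = series[:M], series[M:]
print(len(seq_train), len(seq_test))  # 614 154
```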
3.1. Experiments for four different types of electricity
In order to verify the generalization of the proposed method, we
performed four sets of experiments with different electricity types.
The conditions of each set of experiments are the same except
for the training data. For example, the normalization process and
evaluation model are set the same in each set.
The power consumption prediction results for each type of
electricity are shown in Fig. 6. The x-axis represents time in hours, and the y-axis represents hourly electricity consumption in kWh. There are two curves in each of the figures: one represents real data, and the other represents predicted data. The experimental results of residential, large industrial, business, and agricultural electricity consumption show that the predicted curve using the proposed method has a high degree of fit to the real curve. Not only can it accurately predict the electricity consumption peaks, but it can also predict trends. The model is evaluated by (12), and the
evaluation results are shown in Table I.
The model performs better for large industrial electricity and
business electricity consumption prediction, which benefits from
the use of attention mechanisms to focus on the rules of learning
data. However, the accuracy of residential and agricultural electric-
ity consumption prediction is low. The reason for this phenomenon
might be the high mutation rate of these two types of electricity
consumption. Thus, the electricity consumption trend might not be
well learned from the training data for a whole month.
Fig. 6. The prediction results for different types of electricity: (a) residential, (b) large industrial electricity, (c) business and (d) agricultural
Table I. R² score of electricity consumption for different types of electricity consumption

Electricity type                  R² score
Residential                       0.87
Large industrial electricity      0.99
Business                          0.98
Agricultural                      0.88
3.2. Effect of the attention mechanism To verify the validity of the attention mechanism, we performed two experiments; the LSTM model with attention mechanism is used to predict electricity consumption in the first experiment, and the LSTM method without attention mechanism [14] is used in the other experiment. The result is shown in Fig. 7.
It can be clearly seen in Fig. 7 that, at the sudden changes of electricity consumption, the yellow curve (prediction without attention) cannot accurately predict the arrival of the sudden change point, but the green curve (prediction with attention) can predict it accurately.
The introduction of the attention mechanism helps the model to
learn the salient features of the sequence by giving weight to the
sequence segments, reducing the interference factors and obtaining
better prediction results.
Fig. 7. Predicted result of LSTM with attention and LSTM without attention

3.3. Comparative experiment First, traditional methods based on statistics are used to predict the large industrial electricity consumption data. The methods used are Holt-Winters [8] and ARIMA [9]. The experimental result is shown in Fig. 8. Different seasonal period (SP) values are set in the Holt-Winters method. There is a big gap between the prediction results of Holt-Winters with different SPs: when SP is set to 24, the prediction is accurate, but when other values are set, the result is very poor. The generalization of Holt-Winters is thus poor; it takes a long time to adjust the parameters for different data, and much prior knowledge is needed in the parameter adjustment process. Time series data should be stationary, or stable after differential processing, when ARIMA is used. If this requirement cannot be met, the prediction effect will be poor.
Then, machine learning-based methods are used to predict the large industrial electricity consumption data. The methods used are SVM [11] and neural networks. The kernel function used by the SVM method is the Radial Basis Function; the neural network is configured with 3, 4, and 5 fully connected layers; and ReLU acts as the activation function. The experimental result is shown in Fig. 9.
The prediction effects of machine learning-based methods are
better than the prediction effects of traditional methods. Because
the correlation of data over time is not considered, the prediction
effect of neural networks with different numbers of hidden layers
is not very different.
Fig. 8. Predicted result of traditional methods
Fig. 9. Predicted result of machine learning-based methods
Table II. R² score of electricity consumption for large industrial electricity consumption between different methods

Method                                  R² score
Holt-Winters [8] (SP = 22)              −5.6
Holt-Winters [8] (SP = 23)              0.44
Holt-Winters [8] (SP = 24)              0.98
ARIMA [9]                               0.53
SVM [11]                                0.69
Neural Network (3 hidden layers)        0.70
Neural Network (4 hidden layers)        0.72
Neural Network (5 hidden layers)        0.71
LSTM [14] (without attention)           0.91
LSTM (with attention, proposed)         0.99
The results of the experiments in Sections 3.2 and 3.3 are summarized in Table II, which shows that the proposed LSTM method with attention mechanism has the highest prediction accuracy. Prediction accuracy increased by 6.5% compared to the state-of-the-art model (LSTM without attention mechanism).
4. Conclusion
In this paper, we propose an LSTM network with an attention
mechanism to predict the electricity consumption data. First, the
attention mechanism is used to process the training data, so that
the LSTM training can focus on the correct sequence segment,
and then, the weight coefficient of the attention mechanism and
the LSTM are updated by back-propagation and gradient descent
to minimize the RMSE. Finally, we use four sets of data to
evaluate the predicted effect of the proposed method and compare
it with other methods. From the comparison with some state-of-the-art algorithms, the proposed method achieves the best prediction performance. It not only learns the law of the actual change of electricity consumption more accurately but also improves the accuracy of the prediction model.
In the future, we will focus on long-sequence predictions and
incorporate more power-influencing factors into the model.
Acknowledgments
This work was supported by National Natural Science Foundation of
China Youth Science Fund Project (Research on Service Composition
Optimization Model and Optimization Algorithm for Manufacturing IoT
Collaborative Perception, No. 61502110).
References
(1) Yoldaş Y, Önen A, Muyeen SM, Vasilakos AV, Alan İ. Enhancing smart grid with microgrids: Challenges and opportunities. Renewable and Sustainable Energy Reviews 2017; 72:205–214.
(2) Colak I, Sagiroglu S, Fulli G, Yesilbudak M, Covrig CF. A survey on the critical issues in smart grid technologies. Renewable and Sustainable Energy Reviews 2016; 54:396–405.
(3) Bouzid AM, Guerrero JM, Cheriti A, Bouhamida M, Sicard P, Benghanem M. A survey on control of electric power distributed generation systems for microgrid applications. Renewable and Sustainable Energy Reviews 2015; 44:751–766.
(4) Yang Y-M, Yu H, Sun Z. Aircraft failure rate forecasting method based on Holt-Winters seasonal model. 2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), 2017; 520–524.
(5) Zheng T, Zhang Y, Fan C. Research on hospital operation index prediction method based on PSO-Holt-Winters model. Proceedings of the 2nd International Conference on Computer Science and Application Engineering, 23, 2018.
(6) Kumar SV, Vanajakshi L. Short-term traffic flow prediction using seasonal ARIMA model with limited input data. European Transport Research Review 2015; 7(3):21.
(7) Guarnaccia C, Mastorakis NE, Quartieri J, Tepedino C, Kaminaris SD. Development of seasonal ARIMA models for traffic noise forecasting. MATEC Web of Conferences, 05013, 2017.
(8) Rossi M, Brunelli D. Forecasting data centers power consumption with the Holt-Winters method. 2015 IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems (EESMS) Proceedings, 2015; 210–214.
(9) Kaur H, Ahuja S. Time series analysis and prediction of electricity consumption of health care institution using ARIMA model. Proceedings of Sixth International Conference on Soft Computing for Problem Solving, 2017; 347–358.
(10) Magoulès F, Piliougine M, Elizondo D. Support vector regression for electricity consumption prediction in a building in Japan. 2016 IEEE Intl Conference on Computational Science and Engineering (CSE) and IEEE Intl Conference on Embedded and Ubiquitous Computing (EUC) and 15th Intl Symposium on Distributed Computing and Applications for Business Engineering (DCABES), 2016; 189–196.
(11) Fu Y, Li Z, Zhang H, Xu P. Using support vector machine to predict next day electricity load of public buildings with sub-metering devices. Procedia Engineering 2015; 121:1016–1022.
(12) Zhang Y, Guo L, Li Q, Li J. Electricity consumption forecasting method based on MPSO-BP neural network model. arXiv preprint arXiv:1810.08886, 2018.
(13) Hu W, Tao Z, Guo D, Pan Z. Natural gas prediction model based on wavelet transform and BP neural network. 2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC), 2018; 952–955.
(14) Kim N, Kim M, Choi JK. LSTM based short-term electricity consumption forecast with daily load profile sequences. 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE), 2018; 136–137.
(15) Ke K, Hongbin S, Chengkang Z, Brown C. Short-term electrical load forecasting method based on stacked auto-encoding and GRU neural network. Evolutionary Intelligence 2019; 12:385.
(16) Li P, Sun BY, Li ZM, Jiang JS. Investigation and analysis on electrical load of residence. Building Electricity 2014; 33(7):13–18.
(17) Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
(18) Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhutdinov R, Bengio Y. Show, attend and tell: Neural image caption generation with visual attention. International Conference on Machine Learning, 2015; 2048–2057.
Zhifeng Lin (Non-member) received a B.S. degree in automation
from Guangdong University of Technology
(GDUT), China, in 2017. Currently, he is
pursuing an M.S. degree in computer science
and technology at GDUT. His research inter-
ests include time series pattern mining, deep
neural networks, and computer networks.
Lianglun Cheng (Non-member) is currently a professor at Guangdong University of Technology, dean of the School of Computers of Guangdong University of Technology, a doctoral supervisor, an excellent teacher of Nanyue, and a cross-century talent of Guangdong Province. He is an executive director of the Robotics Professional Committee of the China Automation Association, a member of the China Computer Federation, and a vice chairman of the Guangdong Automation Association. His main research interests include knowledge graphs, knowledge automation, and cyber-physical fusion systems.
Guoheng Huang (Non-member) is currently a talented person in
the ‘Hundred Talents Program’ of Guang-
dong University of Technology, an assistant
professor of computer science, and a mas-
ter’s tutor. He received his B.S. (Mathe-
matics and Applied Mathematics) and M.E.
(Computer Science) degrees from South
China Normal University in 2008 and 2012,
respectively, and his Ph.D. (Software Engi-
neering) from Macau University in 2017. His research interests
include computer vision, pattern recognition and artificial intelli-
gence. He has hosted and undertaken a number of national and
provincial-level scientific research projects, including the National
Natural Science Foundation and National Key Research and Devel-
opment Plan.