
Evolutionary Deep Learning Based Energy Consumption Prediction for Buildings

December 2018
IEEE Access PP(99):1-1


DOI:10.1109/ACCESS.2018.2887023

License
CC BY-NC-ND 4.0

Authors:

Abdulaziz Almalaq

University of Hail



http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI

10.1109/ACCESS.2018.2887023, IEEE Access


Evolutionary Deep Learning Based Energy Consumption Prediction for Buildings

ABDULAZIZ ALMALAQ, (Member, IEEE), JUN JASON ZHANG (Senior Member, IEEE)

The Department of Electrical and Computer Engineering, University of Denver, Denver, CO, 80237 USA

Corresponding author: Abdulaziz Almalaq (e-mail: abdulaziz.almalaq@du.edu).

ABSTRACT Today's energy resources are closer to consumers due to sustainable energy and advanced technology. Precise prediction of energy consumption at the building level is therefore vital for managing consumed energy efficiently with a robust predictive model. Growing concern about reducing the energy consumption of buildings makes it necessary to predict future energy consumption precisely using an optimizable predictive model. Most previously proposed methods for energy consumption prediction are conventional prediction methods whose hyper-parameters are normally set from the developer's knowledge. However, the time lag inputs and the network's hyper-parameters of learning methods need to be tuned to obtain a more accurate prediction. This article proposes a novel hybrid prediction approach based on an evolutionary deep learning method that combines a genetic algorithm with Long Short-Term Memory and optimizes its objective function over the time window lags and the network's hidden neurons. The performance of the proposed model is investigated using public datasets of residential and commercial buildings for very short-term prediction, and the results indicate that evolutionary deep learning models perform better than conventional and regular prediction models.

INDEX TERMS Energy consumption, evolutionary computation, genetic algorithms, machine learning, predictive models, recurrent neural networks

I. INTRODUCTION

THE microgrid is a recent power scenario that brings power generation closer to consumers using renewable resources, e.g., rooftop PV panels on buildings and local energy storage. Using renewable energy at the consumer level, such as in buildings, makes consumption cheaper and cleaner; however, buildings will still draw some energy from the local grid, and this consumption needs to be predicted and adjusted efficiently to reduce its cost and environmental impact. Nowadays, energy consumption in buildings accounts for a large proportion of primary energy worldwide and plays a vital role in carbon emissions. Therefore, precise prediction of energy consumption at the building level has become a crucial topic, and it is necessary to develop a reliable optimization-based predictive model to reduce energy costs and improve the environmental performance of buildings.

Generally, it is challenging to predict a building's energy consumption precisely because many influential factors are correlated with energy usage, such as weather conditions, geographical location, building structure and occupancy. Energy consumption prediction has been investigated widely during the last two decades. Two major groups of techniques have been applied to buildings: physical methods in [1] [2] [3] and statistical methods, e.g., Auto-Regressive Integrated Moving Average (ARIMA) in [4] [5] [6]. Artificial intelligence and machine learning (ML) methods have also been applied to energy prediction in buildings, such as Artificial Neural Network (ANN) in [1] [7] [8] [9] [10] [11], Support Vector Machine (SVM) in [12] [13] [14] [15], Decision Tree in [16] [17] [18] and k-nearest neighbor (kNN) in [19] [20]. The ANN and its variants have been the most widely applied methods for energy consumption prediction in buildings, with different techniques such as input variable selection, network hyper-parameter tuning and training algorithm improvement. The ANN approach based on input variable selection, as in [21] and [22], is used to analyze and select all potentially relevant input variables.

Recently, deep learning (DL) approaches, which extend the standard ML neural network with multiple hidden layers, have received wide attention across a range of disciplines, e.g., image recognition [23], natural language processing [24], and time series prediction [25]. DL methods have improved prediction and classification accuracy in various problems such as stock market forecasting [26] [27], solar irradiance forecasting [28] [29] and wind speed prediction [30] [31]. Moreover, DL approaches have been used for energy consumption prediction using the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM). In [32], the CNN is used for hourly energy load prediction in the smart grid using bagging forecasting models. The authors compared their proposed model with several conventional methods, and the results showed the effectiveness of the CNN. In addition, the CNN is applied to an individual residential building in [33], where it outperformed the other methods compared in that paper. Another method applied to household energy consumption prediction is the RNN in [34]. The authors proposed a pooling-based deep RNN that batches a group of load profiles into a pool of inputs. Their method outperformed the compared methods, including ARIMA and Support Vector Regression. In [35], an overview study covers different types of RNN, including LSTM, applied to time series prediction; the authors compared the various RNN architectures and their performance in short-term prediction. In [36], the authors applied the LSTM to short-term prediction of a residential load, and the results showed that the LSTM outperformed traditional methods. Another LSTM technique used for residential load prediction is the LSTM-based sequence-to-sequence model in [37]. The authors reported that the results of the LSTM and the LSTM-based sequence-to-sequence model are comparable with those of other DL methods used for energy consumption prediction in the literature. An extensive review of DL methods applied to the energy prediction problem can be found in [38].

In the last decade, many intelligent evolutionary computation methods based on optimization have been applied to the problem of energy consumption in buildings, e.g., Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Evolution Strategies (ES). These are nature-inspired meta-heuristic techniques for mathematical optimization. In forecasting chaotic time series, the PSO method improved the results of the ANN predictive model in [39] [40] [41]. For energy consumption prediction, PSO-ANN and GA-ANN hybrid prediction methods were applied with principal component analysis to select relevant input energy variables in [40]. Both hybrid approaches performed better than the regular ANN and reached the same accuracy level as each other. In addition, the GA was employed to improve Adaptive Network-based Fuzzy Inference Systems using two building datasets, the Great Building Energy Predictor Shootout and a library building, in [42]. That population-based optimization study found that hybrid predictive models performed better than regular ones. For time series problems, the ES was used to improve ANN training so that it converges faster to the optimal solution [43].

Commonly, many hyper-parameters of the DL network, such as the number of hidden layers, the number of hidden neurons and the activation function, are influential factors in the energy prediction model. If the hyper-parameters of the predictive DL model are poorly chosen, the model performs poorly and yields locally optimal results. In addition, the predictive window size, i.e., the time lags of the input variables, plays another major role in finding the optimum prediction value. Selecting the right hyper-parameters and a suitable window size is an optimization process that improves the accuracy of the prediction model. In [44], a literature review shows that evolutionary computation concepts are used to improve the prediction of ML algorithms such as ANN and fuzzy logic. Thus, there is a need to apply these concepts to DL algorithms such as the LSTM, since it has shown better prediction performance in the literature.

The modeling technique presented in this paper is based on an evolutionary DL method that uses GA optimization to improve the prediction accuracy of the LSTM for energy consumption in buildings. The proposed approach is compared with conventional predictive models from the literature, e.g., ARIMA, Decision Tree, kNN, multilayer perceptron (MLP), which is a type of ANN that can be extended to a deep neural network, and LSTM with different deep architectures. The optimization searches for a suitable window size and the right number of hidden neurons. The GA-LSTM model is trained and tested on two different datasets, one for a residential building and one for a commercial building, for very short-term prediction.

The motivation of this work is to develop an optimization-based predictive DL model using the GA. The research objective is to find a global or near-global optimum prediction error for building energy consumption prediction by searching over a population of LSTM hyper-parameters and window sizes. This work contributes to precise energy prediction at the building level by using the GA-LSTM model to optimize the objective function.

The paper is organized as follows. The problem formulation is presented in Section II. Section III describes the LSTM network and the GA optimization method. Section IV reformulates the optimization problem for the case study to find the optimal predictive GA-LSTM model. Section V evaluates the prediction results and compares them with regular DL predictive models and conventional models. Finally, conclusions and future work are presented in Section VI.


II. PROBLEM FORMULATION

The energy consumption of a building is a time series, i.e., a sequence of observations $x_i = \{x_1, x_2, \ldots\}$ in which each observation $x_i \in \mathbb{R}$ corresponds to a particular time step $i$. The predicted time series is defined as $y_i \in \mathbb{R}$, which is the energy consumption prediction. The DL model is trained and tested as a supervised learning problem for future time step predictions, where a predictor function $h$ yields the next-step energy consumption value $y_{i+1}$. In general, the sliding window method used for multiple-step prediction ($\tau$) is defined as:

$$y_{i+\tau} = h(x_{i+\tau}, x_{i-1+\tau}, \ldots, x_{i-w+\tau}) \tag{1}$$

where $w$ is the window size. If the window size $w = 1$, the prediction function will be $y_{i+1} = h(x_i)$.
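As a minimal sketch of the sliding window idea, the hypothetical helper below (`make_windows` is not a name from the paper) turns a series into supervised pairs. It assumes the indexing of the $w = 1$ special case, $y_{i+1} = h(x_i)$, so each input holds the last $w$ observations up to step $i$ and the target lies $\tau$ steps ahead.

```python
def make_windows(series, w, tau=1):
    """Build (inputs, target) pairs: w consecutive past values -> value tau steps ahead.

    Assumption: the window ends at step i and the target is series[i + tau],
    which matches the w = 1 case y_{i+1} = h(x_i).
    """
    pairs = []
    # i indexes the last observation inside the input window
    for i in range(w - 1, len(series) - tau):
        window = series[i - w + 1 : i + 1]   # the w most recent observations
        target = series[i + tau]             # value to be predicted
        pairs.append((window, target))
    return pairs


pairs = make_windows([1.0, 2.0, 3.0, 4.0, 5.0], w=2, tau=1)
for window, target in pairs:
    print(window, "->", target)
```

A larger `w` gives the predictor more history per sample but leaves fewer training pairs, which is one reason the window size is treated as a tunable quantity.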

The optimization objective, i.e., the loss function, is expressed as:

$$\arg\min \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_{i+\tau} - y_{i+\tau}\right)^2} \quad \forall y \in y_i \tag{2}$$

$$\text{subject to} \quad \underline{x}_{i-w+\tau} \le x_{i-w+\tau} \le \overline{x}_{i-w+\tau}, \tag{3}$$

where $m$ represents the total number of data points in the time series, $x_{i+\tau}$ and $y_{i+\tau}$ are the real and the predicted energy consumption of future steps, respectively, and $\underline{x}_{i-w+\tau}$ and $\overline{x}_{i-w+\tau}$ are constraints of window size. The objective of the optimizer is to minimize the energy consumption prediction error over the sliding window and the number of hidden neurons in the DL network architecture. The solution space is defined as $\mathbb{R}$ for the minimization fitness function. The task of the optimization problem is to find a solution $x^* \in \mathbb{R}$ such that:

$$h^* = h(x^*) \le h(x) \quad \forall x \in x_i \tag{4}$$

where $h^*$ is the global optimum fitness and $x^*$ is the location of the minimum in the solution space.

III. METHODS

A. LONG SHORT-TERM MEMORY

A recurrent neural network (RNN) is an extension of the MLP with feedback connections [45]. The RNN processes sequential data because it has internal memory that updates the state of each neuron in the network with previous inputs, as in Fig. 1. The RNN is usually trained with the back-propagation algorithm, but training fails on long sequences because the gradients vanish. The LSTM, which is one type of RNN, is designed to provide a longer-term memory, where internal self-loops store information to overcome the vanishing-gradient problem of the RNN [45]. There are five crucial elements in the computational graph of the LSTM: 1) input gate, 2) forget gate, 3) output gate, 4) cell and 5) state output, as shown in Fig. 2. The gate operations, such as reading, writing and erasing, are performed to change the cell memory states.

FIGURE 1. An example of the RNN with one hidden layer.

FIGURE 2. An illustration of the LSTM scheme showing the input gate, forget gate and output gate.

The following equations show the mathematical representation of the LSTM model:

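$$i_t = \sigma(x_i W_{i,n} + h_{(t-1)} W_{i,m} + b_i), \tag{5}$$

$$f_t = \sigma(x_i W_{f,n} + h_{(t-1)} W_{f,m} + b_f), \tag{6}$$

$$o_t = \sigma(x_i W_{o,n} + h_{(t-1)} W_{o,m} + b_o), \tag{7}$$

$$U = \tanh(x_i W_{U,n} + h_{(t-1)} W_{U,m} + b_U), \tag{8}$$

$$C_t = f_t \times C_{t-1} + i_t \times U, \tag{9}$$

$$h_t = o_t \times \tanh(C_t), \tag{10}$$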

where $\sigma$ denotes the sigmoid activation function, $x_i$ is the input vector, $i_t$, $f_t$ and $o_t$ are the signals of the input, forget and output gates, respectively (the subscripts $i$, $f$ and $o$ on the weights and biases refer to the corresponding gate), $U$ is the update signal, $C_t$ is the state value at computation time $t$ and $h_t$ is the output of the LSTM cell. $W_{(\cdot)}$ and $b_{(\cdot)}$ are the weight matrices and bias vectors, respectively. The weights applied to the current input are denoted $W_{(\cdot),n}$ and those applied to the previous state signal $W_{(\cdot),m}$. The input gate decides whether the memory state is modified, using a sigmoid function that acts as an on/off switch. If the value of the input gate is close to zero, the update signal does not change the cell memory state $C_t$.
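To make the gate equations concrete, the sketch below evaluates one LSTM step for scalar inputs in plain Python. It is an illustration only: the function name `lstm_step`, the parameter dictionary `p` and the example weights are assumptions, real networks use vectors and matrices, and the output is computed with the standard form $h_t = o_t \tanh(C_t)$.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to (W_n, W_m, b)."""
    def pre(gate):
        w_n, w_m, b = p[gate]
        return x * w_n + h_prev * w_m + b    # current input and previous output

    i_t = sigmoid(pre("i"))                  # input gate, eq. (5)
    f_t = sigmoid(pre("f"))                  # forget gate, eq. (6)
    o_t = sigmoid(pre("o"))                  # output gate, eq. (7)
    u = math.tanh(pre("U"))                  # update signal, eq. (8)
    c_t = f_t * c_prev + i_t * u             # new cell state, eq. (9)
    h_t = o_t * math.tanh(c_t)               # cell output (standard LSTM form)
    return h_t, c_t


# Hypothetical weights: input gate closed (large negative bias),
# forget gate open (large positive bias), so the cell state is preserved.
params = {"i": (0.0, 0.0, -50.0), "f": (0.0, 0.0, 50.0),
          "o": (0.0, 0.0, 0.0), "U": (1.0, 1.0, 0.0)}
h_t, c_t = lstm_step(x=0.5, h_prev=0.1, c_prev=0.3, p=params)
print(h_t, c_t)
```

With the input gate near zero and the forget gate near one, the new cell state stays at its previous value, which is the behaviour described in the text.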

FIGURE 3. One point crossover operation. Parent 1: 1 1 0 1 0 0 1 0; Parent 2: 0 1 0 0 0 0 0 0; Offspring 1: 1 1 0 1 0 0 0 0; Offspring 2: 0 1 0 0 0 0 1 0.

B. GENETIC ALGORITHM

The GA is a common nonlinear optimization algorithm that solves constrained and unconstrained optimization problems and provides an optimal or near-optimal solution by searching a complex space. Introduced by Holland in 1975, it is an adaptive global optimization search based on an analogy with Darwinian natural selection and genetics [46], and it uses crossover and mutation probabilities to guide the search for an optimum solution (individual) of the fitness function. The GA is a population-based search in which a set of candidate solutions (individuals) of the fitness function is obtained after a series of iterative computations. One advantage of the GA is that it is less sensitive to initialization owing to the random nature of mutation and crossover; however, it is not the best method for online implementation because it converges slowly in a complex space [46].

The individuals, which are the candidate solutions, are encoded as chromosomes and evolve according to the Darwinian principle of survival of the fittest. The fitness function determines each individual's ability to survive and its quality during the evolutionary process of the GA.

There are three major operators in the evolutionary process of the GA: the crossover operator, the mutation operator and the selection operator. These operators directly affect the search over fitness values and the optimum solution that is found. Another strategy that helps the fitness value converge to the optimum is elitism selection, which copies the best individual of a generation to the next generation [46]. In addition, the chromosome length and the crossover method, such as one-point or two-point crossover, are important for finding the optimum value efficiently.

The crossover operation, which is the most important operation in the GA, is a random exchange between two chromosomes that are genotyped as binary genes, using one of the crossover methods as in Fig. 3. The mutation operation is the random alteration of one or more genes from 1 to 0 or vice versa. The selection operation is the process of selecting the individuals with the best fitness values in the population using a selection method, e.g., roulette wheel or tournament selection.
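The crossover and mutation operators can be sketched in a few lines of plain Python. The example reads the bit strings of Fig. 3 as two 8-bit parents, which is an assumption about how the figure is laid out; the function names and the cut position are illustrative and are not taken from the DEAP framework used in the paper.

```python
import random


def one_point_crossover(p1, p2, point):
    """Swap the tails of two equal-length bit lists after position `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def bit_flip_mutation(chrom, p_mut, rng):
    """Flip each gene independently with probability p_mut."""
    return [1 - g if rng.random() < p_mut else g for g in chrom]


parent1 = [1, 1, 0, 1, 0, 0, 1, 0]
parent2 = [0, 1, 0, 0, 0, 0, 0, 0]

# A cut after the sixth gene reproduces the offspring read from Fig. 3.
child1, child2 = one_point_crossover(parent1, parent2, point=6)
print(child1, child2)

# Mutation with a small per-gene probability, as is typical for a GA.
mutated = bit_flip_mutation(child1, p_mut=0.015, rng=random.Random(0))
print(mutated)
```

Crossover recombines existing genes, while mutation is the only operator that can introduce a bit value absent from every parent at a given position.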

Moreover, the population size and the number of generations are important factors that influence computational complexity. If the population size, i.e., the number of solutions in each generation, is too large, the GA incurs a high computational cost but the probability of being trapped in a local optimum is low. If the population size is small, the computational cost is reduced but the likelihood of falling into a local optimum is high.

FIGURE 4. The GA algorithm operation scheme.

The evolutionary process of the GA converges through iterative steps, and the termination criterion is pre-defined as a maximum number of iterations. Fig. 4 illustrates the GA iteration process, and the basic steps of the GA are as follows:

1) Generate the initial population randomly.
2) Evaluate the fitness value of each individual in the population.
3) Perform the crossover operation.
4) Perform the mutation operation.
5) Perform the selection method.
6) Stop the GA if the termination criterion is satisfied; otherwise, return to step 2.
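The steps above can be sketched as a small generational GA in plain Python. This is a simplified illustration, not the paper's DEAP implementation: `toy_fitness` stands in for the LSTM prediction error, the tournament size `k=3` and the elitism step are assumptions, and parents are selected before crossover, which is a common ordering that differs slightly from the listing.

```python
import random


def run_ga(fitness_fn, n_bits=16, pop_size=20, n_gen=20,
           p_cx=0.7, p_mut=0.015, k=3, seed=0):
    """Minimise fitness_fn over bit lists with a basic generational GA."""
    rng = random.Random(seed)
    # 1) random initial population
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=fitness_fn)
    for _ in range(n_gen):                      # 6) stop after n_gen generations
        # 2) evaluate every individual
        scores = [fitness_fn(ind) for ind in pop]

        def pick():
            # tournament selection: lowest error among k random individuals
            idxs = rng.sample(range(pop_size), k)
            return pop[min(idxs, key=lambda j: scores[j])]

        children = [best[:]]                    # elitism: keep the best so far
        while len(children) < pop_size:
            a, b = pick()[:], pick()[:]
            if rng.random() < p_cx:             # 3) one-point crossover
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                # 4) bit-flip mutation
                children.append(
                    [1 - g if rng.random() < p_mut else g for g in child])
        pop = children[:pop_size]               # 5) new population
        best = min(pop + [best], key=fitness_fn)
    return best, fitness_fn(best)


def toy_fitness(bits):
    """Placeholder objective (number of ones); lower is better."""
    return sum(bits)


best, score = run_ga(toy_fitness)
print(best, score)
```

In the setting of this paper each fitness evaluation would require training an LSTM, so the product of population size and number of generations dominates the run time.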

IV. DATASETS AND DESIGN MODELING

A. DATASETS

1) Residential Building

The public dataset of a single residential building is the individual household electric power consumption dataset in [47]. The dataset consists of historical power consumption in kW from December 2006 to November 2010 at one-minute resolution. The model in this paper uses only the active power consumption of the household. The dataset contains more than 2 million time-steps. Fig. 5 (a) shows how the power consumption varies across seasons and days, and Fig. 5 (b) shows a heat map of the averaged daily power consumption for one month. The heat map shows that the consumption of the residential building is highly volatile from day to day within the month.

2) Commercial Building

The energy dataset of a single commercial building, which is a primary or secondary school in Denver, Colorado, USA, is randomly chosen from a list of publicly available commercial building datasets in [48], with the file name 213.csv. The data contain energy consumption values in kW/h for the year 2012 at five-minute resolution, giving 105408 time-steps. Fig. 6 (a) shows the line graph of daily averaged energy consumption and Fig. 6 (b) shows the heat map of averaged daily energy consumption for one month. The heat map shows that the commercial building has consistently high consumption during working hours, while consumption is lowest on weekend days.

FIGURE 5. The daily average power consumption of residential building. (a) Line graph: daily active power consumption in the residential building, global active power (kW) versus time (days). (b) Heat map: hourly averaged power consumption in the residential building for one month (January 2007), time (hours) versus time (days).

B. DESIGN MODELING

The proposed model in this research is used to optimize the prediction error of the LSTM, as in Fig. 7. The hybrid GA-LSTM model is designed with a small number of hidden layers, an optimizable number of hidden neurons and an optimizable window size. The optimization scheme of the GA-LSTM is shown in Fig. 8. The first step of the model is preprocessing the input dataset through min-max normalization:

$$x'_i = \frac{x_i - \min}{\max - \min} \tag{11}$$

where $x_i$ is the original value of the input dataset, $x'_i$ is the normalized value scaled to the range $[0, 1]$, $\max$ is the maximum value of the features, and $\min$ is the minimum value of the features. Normalizing the dataset features prevents features with large numeric ranges from dominating and helps the algorithm to perform accurately.

FIGURE 6. The daily average energy consumption of commercial building. (a) Line graph: daily energy consumption in the commercial building, energy consumption (kW/h) versus time (days). (b) Heat map: hourly averaged energy consumption in the commercial building for one month (January 2012), time (hours) versus time (days).
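The normalization in equation (11) can be sketched in plain Python as follows. The helper names are hypothetical, and the inverse function is included because the error metrics later in the paper are reported in the original scale of the data; the handling of a constant series is an assumption added to avoid division by zero.

```python
def min_max_scale(values):
    """Scale values to [0, 1] following equation (11)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series maps to zeros to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def inverse_scale(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


power_kw = [2.0, 4.0, 6.0]            # illustrative readings, not dataset values
scaled = min_max_scale(power_kw)
restored = inverse_scale(scaled, min(power_kw), max(power_kw))
print(scaled, restored)
```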

The second step is to select the appropriate time lags, i.e., the window size of the dataset observations, and convert the data to a supervised learning form. The third step is to split the data into a training dataset and a testing dataset, using the first 70% and the last 30% of the dataset, respectively. To evaluate the performance of the proposed model properly, the training data are used only for training the LSTM and the testing data are used only for evaluating the predictive model. For instance, we used the first 33 months of the one-minute residential building data for training the proposed model and the remaining 14 months for testing. Similarly, we used 73785 time-steps of the commercial building data for training and the rest for testing.
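A chronological split of this kind can be sketched as below. The helper name `chronological_split` is an assumption; the point of the sketch is that the data are cut at a single time index rather than shuffled, so every test sample lies after every training sample.

```python
def chronological_split(samples, train_fraction=0.7):
    """Return (train, test) where train is the earliest part of the series."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# The commercial dataset has 105408 five-minute time-steps.
steps = list(range(105408))
train, test = chronological_split(steps)
print(len(train), len(test))
```

Truncating 70% of 105408 gives 73785 training time-steps, which agrees with the figure quoted for the commercial building.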

The fourth step is training the model with an initial window size and an initial number of hidden neurons in the first hidden layer. The model is then evaluated on the testing set with the selected window size and number of hidden neurons to calculate the prediction accuracy. The loss function is the mean squared error and the optimizer is stochastic gradient descent (SGD). All learning models are trained for 300 epochs, where one epoch is a complete pass through the training dataset. The LSTM hyper-parameters used in the hybrid with the GA are listed in Table 1. The window size and the number of hidden neurons are used to construct a fitness function as in equation (2). The operation ends when the ending condition is satisfied; otherwise, it proceeds to search for a better solution in the next generation. When the condition is satisfied for the first LSTM model with one hidden layer, the model may still need to be improved by adding a second hidden layer in the next LSTM model. The best window size and number of hidden neurons found for the first LSTM with one hidden layer are kept and carried over to the second LSTM model with two hidden layers. The GA process for the second LSTM model then optimizes only the number of hidden neurons in the second hidden layer.

The evolutionary operation, e.g., the GA as in Fig. 8, is a system that searches for better solutions using evolutionary concepts, including crossover, mutation and selection. It generates new chromosomes of window size and number of hidden neurons, introducing new model behavior that strengthens the search dynamics and improves the prediction accuracy. One important feature of chromosomes in the GA is genotyping, which is the binary coding of the features, while the phenotype refers to the parameters decoded into variable values that are fed back to the model. The parameters chosen in our experiment, e.g., crossover probability $P_{cx}$, mutation probability $P_M$, number of generations $M$, population size in each generation $N$ and chromosome length $l$, are listed in Table 2.
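The genotype-to-phenotype conversion can be illustrated with a small decoder. The encoding below is hypothetical and is not stated in the paper: it assumes a 16-bit chromosome in which the first 6 bits select a window size from 1 to 64 and the remaining 10 bits select a number of hidden neurons from 1 to 1024, which are the search ranges reported in the results section.

```python
WINDOW_BITS = 6     # assumption: 2**6 = 64 possible window sizes
NEURON_BITS = 10    # assumption: 2**10 = 1024 possible neuron counts


def bits_to_int(bits):
    """Interpret a list of 0/1 genes as an unsigned binary number."""
    value = 0
    for b in bits:
        value = value * 2 + b
    return value


def decode(chromosome):
    """Map a 16-bit genotype to the phenotype (window_size, hidden_neurons)."""
    window = bits_to_int(chromosome[:WINDOW_BITS]) + 1       # 1..64
    neurons = bits_to_int(chromosome[WINDOW_BITS:]) + 1      # 1..1024
    return window, neurons


genotype = [0, 0, 0, 1, 0, 1] + [0] * 9 + [1]
window_size, hidden_neurons = decode(genotype)
print(window_size, hidden_neurons)
```

Adding one to each decoded integer keeps both values strictly positive, so every chromosome corresponds to a trainable network.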

C. MODELING TOOLS

The hardware used in our modeling is an Intel Core i5 2.7 GHz CPU with an external NVIDIA GTX 1080 graphics card, running the macOS High Sierra operating system. The development environment is Python 2.7, where the DL models were implemented with the Keras deep learning framework [49], the GA model with the DEAP framework [50], and the ML models with the scikit-learn framework [51].

FIGURE 7. The evolutionary DL algorithm scheme.

V. RESULTS AND DISCUSSIONS

Finding the optimal or near-optimal number of time lags and number of hidden neurons in each layer of the LSTM network is a non-deterministic polynomial (NP) problem, which is not easy to solve. The GA is a promising metaheuristic method that tends to find good solutions to such NP problems, sometimes near the global optimum, as found in studies of time series lags [52] and [53]. The number of time lags and the number of neurons are a potent combination of dependencies that affect the prediction process, for example through model overfitting and computational complexity. The selected range of window size in this experiment is 1 to 64 time lags, and the range of the number of hidden neurons in each layer is 1 to 1024 neurons. The results in this section are solutions to the NP problem for each LSTM model.

TABLE 1. The LSTM model hyper-parameters.

| Hyper-parameter | Selection |
| --- | --- |
| Number of hidden layers ($N_l$) | 1-3 |
| Number of hidden neurons in each layer ($N_{np}$) | Optimizable with GA |
| Window size ($N_t$) | Optimizable with GA |
| Optimizer (opt) | SGD |
| Loss function | Mean squared error |
| Number of epochs ($N_{ep}$) | 300 |

TABLE 2. The GA model parameters.

| Parameter | Selection |
| --- | --- |
| Crossover probability ($P_{cx}$) | 0.7 |
| Mutation probability ($P_M$) | 0.015 |
| Selection | Tournament selection |
| Population size ($N$) | 20 |
| Number of generations ($M$) | 20 |
| Fitness function | Root mean square error |

Several evaluation criteria are used in the literature to assess the performance of prediction models. The first criterion is to use the 30% testing dataset directly to examine the performance of the prediction model. The second criterion is the calculation of error metrics, where the conventional ones are the root-mean-squared error (RMSE), the coefficient of variation (CV) expressed as a percentage and the mean absolute error (MAE), defined as follows:

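$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}(x_i - y_i)^2} \tag{12}$$

$$\mathrm{CV} = \frac{\mathrm{RMSE}}{\bar{y}} \times 100\% \tag{13}$$

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}|x_i - y_i| \tag{14}$$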

where $m$ represents the total number of data points in the time series, $x_i$ is the measured value of the time series in the original scale of the dataset, $y_i$ is the predicted output of the time series, and $\bar{y}$ is the average of the actual values of energy consumption. The model is benchmarked against conventional prediction methods such as ARIMA, Decision Tree regression, and kNN. In addition, the model is compared with a hybrid prediction model, GA-ANN, which uses a GA to tune the neural network parameters. To evaluate the proposed approach against traditional DL models, the model is compared with MLP and LSTM networks designed with 10 hidden neurons in the first hidden layer, 5 in the second, and 2 in the third.

FIGURE 8. The GA-LSTM optimization architecture with three hidden layers.
[Flowchart: the input dataset and data preprocessing feed three GA-LSTM stages (one, two, and three hidden layers). Each stage loops through population, genotype-to-phenotype conversion, window size selection, LSTM training and testing, fitness and accuracy evaluation, and genetic operation until its end condition is met. A decision block asks whether the model can be improved; if so, the selected window size and number of neurons are kept for the current LSTM networks. The output is the optimized fitness function and the optimal prediction.]
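The three metrics in Eqs. (12) to (14) can be computed as in the following minimal sketch. It uses plain Python lists; the function names are illustrative and are not taken from the paper's implementation.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-squared error, Eq. (12)."""
    m = len(actual)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, predicted)) / m)


def cv_percent(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (13): RMSE divided by the mean of the actual values, in percent."""
    mean_actual = sum(actual) / len(actual)
    return rmse(actual, predicted) / mean_actual * 100.0


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, Eq. (14)."""
    m = len(actual)
    return sum(abs(x - y) for x, y in zip(actual, predicted)) / m


# Tiny worked example: both errors are 1, and the mean actual value is 3.
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(actual, predicted), cv_percent(actual, predicted), mae(actual, predicted))
```

Because CV divides RMSE by the mean load, it allows errors to be compared across buildings whose consumption levels differ, which is useful when reading the residential and commercial results side by side.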

The last criterion used to examine the performance of the proposed model is cross-validation, which splits the dataset into k folds to estimate the general performance of the prediction model and gives insight into how well the model generalizes across the dataset. The method repeats the process of splitting the dataset into training and testing portions k times; the size of the testing portion remains fixed while it moves through the original dataset, and the remainder is used as the training dataset in every fold, as in Fig. 9.

Applying this method to the proposed model produces a robust averaged estimate of the prediction error, since the observations in the dataset are used for both training and testing across the folds. We used 10-fold cross-validation in our experiment with the best parameters of the proposed model in each case study (residential and commercial buildings), using the time series cross-validator [51].
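The fold sizes reported in Tables 5 and 8 (a fixed test size and a training size that grows by one test block per fold) are consistent with an expanding-window split. The sketch below is a pure-Python illustration of such a split. The function name and the sizing rule (test size = n // (k + 1)) are assumptions intended to mirror the default behaviour of a time series cross-validator such as the one in scikit-learn; it is not code from the paper.

```python
from typing import Iterator, Tuple


def expanding_window_splits(n_samples: int, n_splits: int = 10
                            ) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) for an expanding-window split.

    Each test block holds n_samples // (n_splits + 1) points, and the
    training set is every point that precedes the test block, so no
    future observation is used to predict the past.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        yield range(0, test_start), range(test_start, test_start + test_size)


# 105406 = 9586 + 10 * 9582, the total implied by the fold sizes in Table 8.
for train, test in expanding_window_splits(105406, n_splits=10):
    print(len(train), len(test))
```

With 105,406 samples this yields training sizes 9586, 19168, and so on up to 95824, each paired with 9582 test points, which matches the Dtr and Dts columns of Table 8.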

FIGURE 9. Cross-validation method with k folds.
[Diagram: the data is divided into train and test portions over the 1st, 2nd, 3rd, ..., 10th iterations.]

A. PREDICTING RESIDENTIAL BUILDING POWER CONSUMPTION

Table 3 shows how the performance of the proposed GA-LSTM model compares with that of the conventional prediction models for the first case study, residential building power consumption. The table includes different architectures of the regular DL models, e.g., MLP-1 with one hidden layer and MLP-2 with two hidden layers. The results show that the proposed model outperformed the other models on the evaluation metrics. The MLP and LSTM models performed similarly to each other, whereas the proposed method surpassed both significantly. It is noted that the prediction accuracies get worse when the networks get deeper because of the dependencies among the network hyper-parameters. In addition, the statistical model ARIMA and kNN produced the largest prediction errors among the compared methods, whereas Decision Tree regression performed better than the other conventional models and obtained a prediction error close to that of the DL models. The conventional hybrid model GA-ANN performed better than all conventional methods and traditional DL methods for predicting residential energy consumption; however, the proposed approach outperformed the conventional hybrid model.

Table 4 shows the optimal parameters of GA-LSTM-1, GA-LSTM-2, and GA-LSTM-3 and the percentage reduction in RMSE relative to the corresponding LSTM models. The window size is the same for all hidden layers because it is used as an input for the next hidden layer. The best reduction, relative to the regular LSTM-1 model, is 17.319% in terms of RMSE. The deeper networks also achieved reductions in RMSE.
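As a quick check of how such a percentage can be obtained, the sketch below applies a relative-reduction formula to the RMSE values listed in Table 3. The formula itself is an assumption (the paper does not state it explicitly), but it reproduces the 17.319% figure for LSTM-1 and comes within rounding of the other two entries in Table 4.

```python
def rmse_reduction_percent(benchmark_rmse: float, proposed_rmse: float) -> float:
    """Relative RMSE reduction of the proposed model versus a benchmark, in percent."""
    return (benchmark_rmse - proposed_rmse) / benchmark_rmse * 100.0


# (benchmark RMSE, GA-LSTM RMSE) pairs copied from Table 3, residential building
table3_pairs = {
    "LSTM-1": (0.235, 0.1943),
    "LSTM-2": (0.233, 0.217),
    "LSTM-3": (0.238, 0.225),
}

for name, (benchmark, proposed) in table3_pairs.items():
    print(f"{name}: {rmse_reduction_percent(benchmark, proposed):.3f} %")
```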

Table 5 shows the 10-fold cross-validation results of the proposed GA-LSTM-1 model, which achieved the best prediction in Table 3. The prediction errors differ from fold to fold because the training dataset (Dtr) and the testing dataset (Dts) change during cross-validation; the final prediction error is averaged over the 10 folds. This validation process increases confidence in the prediction performance because the test data in each fold is different and unseen during training.

TABLE 3. The comparison with conventional methods over one minute resolution for the residential building.

Method | RMSE (kW) | CV (%) | MAE (kW)
ARIMA | 0.264 | 24.170 | 0.095
Decision Tree | 0.233 | 21.321 | 0.085
kNN | 0.258 | 23.672 | 0.111
GA-ANN | 0.223 | 20.158 | 0.072
MLP-1 | 0.232 | 20.934 | 0.083
MLP-2 | 0.231 | 20.844 | 0.081
MLP-3 | 0.231 | 20.844 | 0.079
LSTM-1 | 0.235 | 21.205 | 0.084
LSTM-2 | 0.233 | 21.025 | 0.084
LSTM-3 | 0.238 | 21.476 | 0.086
GA-LSTM-1 | 0.1943 | 17.526 | 0.062
GA-LSTM-2 | 0.217 | 19.581 | 0.071
GA-LSTM-3 | 0.225 | 20.303 | 0.074

Fig. 10 shows a prediction comparison of the residential active power consumption for very short-term prediction. The comparison covers all prediction models given in Table 3. From the graph, the proposed model is superior to the other two DL models benchmarked in this study, i.e., MLP and LSTM. The GA-LSTM-1 prediction follows the original data most closely. It is worth noting that GA-ANN is a capable model that comes next after the proposed approach. Overall, the GA-LSTM outperforms the other models used to predict consumed energy.

TABLE 4. The best parameters of GA-LSTM models for the residential building and the percentage of reduction with benchmark LSTM.

Hidden layers (Nl) | Hidden neurons per layer (Nnp) | Window size (Nt) | Benchmark | RMSE reduction (%)
1 | 139 | 23 | LSTM-1 | 17.319
2 | 139 & 43 | 23 | LSTM-2 | 6.866
3 | 139 & 43 & 64 | 23 | LSTM-3 | 5.462

TABLE 5. The 10-fold cross-validation results of GA-LSTM-1 for the first case study.

Fold No. | Dtr | Dts | RMSE | CV (%) | MAE
1 | 188668 | 188659 | 0.221 | 20.238 | 0.082
2 | 377327 | 188659 | 0.237 | 21.703 | 0.085
3 | 565986 | 188659 | 0.220 | 20.146 | 0.082
4 | 754645 | 188659 | 0.212 | 19.413 | 0.071
5 | 943304 | 188659 | 0.219 | 20.054 | 0.073
6 | 1131963 | 188659 | 0.213 | 19.505 | 0.071
7 | 1320622 | 188659 | 0.203 | 18.589 | 0.069
8 | 1509281 | 188659 | 0.212 | 19.413 | 0.071
9 | 1697940 | 188659 | 0.202 | 18.498 | 0.069
10 | 1886599 | 188659 | 0.197 | 18.936 | 0.066
Mean | - | - | 0.213 | 19.560 | 0.074
SD | - | - | 0.012 | 1.057 | 0.007

FIGURE 10. Prediction comparison between the proposed model and different conventional prediction models for very short-term prediction.
[Plot: "Energy consumption prediction for residential building over one minute resolution". x-axis: Time (one minute); y-axis: Power consumption (kW). Series: Original, ARIMA, Decision Tree, kNN, GA-ANN, MLP-1, MLP-2, MLP-3, LSTM-1, LSTM-2, LSTM-3, GA-LSTM-1, GA-LSTM-2, GA-LSTM-3.]

B. PREDICTING COMMERCIAL BUILDING ENERGY CONSUMPTION

The second case study is the prediction of commercial building energy consumption. Table 6 shows the effectiveness of the proposed GA-LSTM model in comparison with the conventional prediction models. The results show that the proposed method outperformed the other methods in prediction accuracy, while the MLP and LSTM results are close to each other. It is noticeable that the prediction accuracy degraded with the deeper networks in the conventional methods due to dependencies among the network hyper-parameters. As in the first case study, the statistical model ARIMA and kNN produced the largest prediction errors among the compared methods, and Decision Tree regression obtained a prediction error close to that of the DL models. Similarly, the conventional hybrid model GA-ANN obtained better predictions than the conventional models and DL models for predicting commercial energy consumption; however, the proposed approach is superior to all compared methods.

The optimal parameters of GA-LSTM are given in Table 7, where the window size is fixed for all hidden layers because it is used as an input to the next hidden layer in the proposed method. The table also gives the percentage reduction in RMSE; the best is 10.669% relative to LSTM-1. The two deeper networks achieved reductions close to each other.

The 10-fold cross-validation results of the best model, GA-LSTM-1 from Table 6, are shown in Table 8. The cross-validation produced different prediction errors in each fold because the training and testing data differ from fold to fold. Because the test data in each fold is different and unseen during training, the validation technique increases confidence in the prediction performance of the proposed model.
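The summary rows of such a cross-validation table can be recomputed from the per-fold values. The sketch below does this for the RMSE column of Table 8. Using the population standard deviation (dividing by the number of folds) is an assumption; it is chosen here because it reproduces the reported value of 0.18.

```python
import statistics

# Per-fold RMSE values copied from Table 8 (commercial building, GA-LSTM-1)
fold_rmse = [0.199, 0.194, 0.384, 0.460, 0.401,
             0.647, 0.763, 0.617, 0.357, 0.291]

mean_rmse = statistics.mean(fold_rmse)
# Assumption: population SD (divide by n), which matches the table's 0.18
sd_rmse = statistics.pstdev(fold_rmse)

print(f"mean = {mean_rmse:.2f}, SD = {sd_rmse:.2f}")  # mean = 0.43, SD = 0.18
```

The spread across folds is large relative to the mean, so for this dataset the per-fold errors are informative in addition to the averaged figure.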

TABLE 6. The comparison with conventional methods over five minutes resolution for the commercial building.

Method | RMSE (kW/h) | CV (%) | MAE (kW/h)
ARIMA | 0.539 | 10.462 | 0.297
Decision Tree | 0.482 | 9.353 | 0.273
kNN | 0.544 | 10.561 | 0.326
GA-ANN | 0.469 | 9.145 | 0.268
MLP-1 | 0.495 | 9.615 | 0.305
MLP-2 | 0.490 | 9.507 | 0.295
MLP-3 | 0.478 | 9.271 | 0.271
LSTM-1 | 0.478 | 9.283 | 0.276
LSTM-2 | 0.486 | 9.430 | 0.286
LSTM-3 | 0.480 | 9.312 | 0.276
GA-LSTM-1 | 0.427 | 8.303 | 0.238
GA-LSTM-2 | 0.451 | 8.755 | 0.256
GA-LSTM-3 | 0.449 | 8.716 | 0.263

TABLE 7. The best parameters of GA-LSTM models for the commercial building and the percentage of reduction with benchmark LSTM.

Hidden layers (Nl) | Hidden neurons per layer (Nnp) | Window size (Nt) | Benchmark | RMSE reduction (%)
1 | 459 | 42 | LSTM-1 | 10.669
2 | 459 & 187 | 23 | LSTM-2 | 7.201
3 | 459 & 187 & 82 | 23 | LSTM-3 | 6.458

The prediction performance in Fig. 11 shows a comparison between the proposed GA-LSTM and the conventional methods for the commercial building for each prediction model. The graph shows that the proposed model performed better than the other models in this study and followed the original data for very short-term prediction. The proposed GA-LSTM proved its strength over the other compared methods.

TABLE 8. The 10-fold cross-validation results of GA-LSTM-1 for the second case study.

Fold No. | Dtr | Dts | RMSE | CV (%) | MAE
1 | 9586 | 9582 | 0.199 | 3.859 | 0.147
2 | 19168 | 9582 | 0.194 | 3.765 | 0.111
3 | 28750 | 9582 | 0.384 | 7.456 | 0.217
4 | 38332 | 9582 | 0.460 | 8.940 | 0.251
5 | 47914 | 9582 | 0.401 | 7.785 | 0.251
6 | 57496 | 9582 | 0.647 | 12.570 | 0.373
7 | 67078 | 9582 | 0.763 | 14.806 | 0.431
8 | 76660 | 9582 | 0.617 | 11.985 | 0.354
9 | 86242 | 9582 | 0.357 | 6.935 | 0.221
10 | 95824 | 9582 | 0.291 | 5.653 | 0.198
Mean | - | - | 0.43 | 8.38 | 0.26
SD | - | - | 0.18 | 3.53 | 0.10

FIGURE 11. Prediction comparison between the proposed model and different conventional prediction models for very short-term prediction.
[Plot: "Energy consumption prediction for commercial building over five minutes resolution". x-axis: Time (five minutes); y-axis: Energy consumption (kW/h). Series: Original, ARIMA, Decision Tree, kNN, GA-ANN, MLP-1, MLP-2, MLP-3, LSTM-1, LSTM-2, LSTM-3, GA-LSTM-1, GA-LSTM-2, GA-LSTM-3.]

C. OPTIMIZATION RESULTS DISCUSSIONS

Hybridizing LSTM with GA produced more accurate predictions, as seen from the tables and figures above. As an NP problem, it was not easy to find the best window size and number of hidden neurons in each layer, because the number of possible combinations of these parameters across the layers is huge.
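The search described above can be illustrated with a small, self-contained GA loop that uses the settings of Table 2 (crossover probability 0.7, mutation probability 0.015, tournament selection, 20 individuals, 20 generations). Everything else in this sketch is an assumption made for illustration: the 16-bit encoding, one-point crossover, per-bit flip mutation, a tournament size of 3, and above all `toy_fitness`, a hypothetical stand-in for the test RMSE that the paper obtains by training and testing an LSTM for each individual.

```python
import random
from typing import Callable, List, Tuple

Genotype = List[int]

P_CX = 0.7      # crossover probability (Table 2)
P_MUT = 0.015   # mutation probability (Table 2); applied per bit by assumption
POP_SIZE = 20   # population size (Table 2)
N_GEN = 20      # number of generations (Table 2)
N_BITS = 16     # assumed encoding: 6 bits for window size + 10 bits for neurons


def decode(genotype: Genotype) -> Tuple[int, int]:
    """Genotype to phenotype: (window size in 1..64, neurons in 1..1024)."""
    window = int("".join(map(str, genotype[:6])), 2) + 1
    neurons = int("".join(map(str, genotype[6:])), 2) + 1
    return window, neurons


def toy_fitness(genotype: Genotype) -> float:
    """Hypothetical error surface standing in for the RMSE of a trained LSTM."""
    window, neurons = decode(genotype)
    return abs(window - 23) / 64 + abs(neurons - 139) / 1024


def tournament(pop: List[Genotype], fits: List[float],
               rng: random.Random, k: int = 3) -> Genotype:
    """Return a copy of the lowest-error individual among k random picks."""
    winner = min(rng.sample(range(len(pop)), k), key=lambda i: fits[i])
    return pop[winner][:]


def run_ga(fitness: Callable[[Genotype], float],
           seed: int = 0) -> Tuple[Tuple[int, int], float]:
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP_SIZE)]
    fits = [fitness(g) for g in pop]
    best_i = min(range(POP_SIZE), key=lambda i: fits[i])
    best, best_fit = pop[best_i][:], fits[best_i]
    for _ in range(N_GEN):
        children: List[Genotype] = []
        while len(children) < POP_SIZE:
            a = tournament(pop, fits, rng)
            b = tournament(pop, fits, rng)
            if rng.random() < P_CX:                  # one-point crossover
                cut = rng.randint(1, N_BITS - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            for child in (a, b):
                for i in range(N_BITS):              # bit-flip mutation
                    if rng.random() < P_MUT:
                        child[i] ^= 1
                children.append(child)
        pop = children[:POP_SIZE]
        fits = [fitness(g) for g in pop]
        gen_best = min(range(POP_SIZE), key=lambda i: fits[i])
        if fits[gen_best] < best_fit:                # keep the best seen so far
            best, best_fit = pop[gen_best][:], fits[gen_best]
    return decode(best), best_fit


(window, neurons), error = run_ga(toy_fitness, seed=0)
print(window, neurons, error)
```

With these settings at most 20 x 21 = 420 individuals are evaluated, a small fraction of the 65,536 single-layer configurations in the assumed encoding. In the real setting each evaluation is a full LSTM training run, so the evaluation budget dominates the cost.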

Fig. 12 (a) and (b) show scatter plots of the best (surviving) offspring in each generation of the GA optimization for residential energy prediction, plotting the number of hidden neurons and the window size against the CV score in percent. Fig. 12 (a) illustrates the performance of the GA-LSTM model while searching for the best number of hidden neurons, which is 139 with a CV of 17.5%. The figure shows that the model converged with between 100 and 150 neurons, whereas larger numbers failed to produce precise predictions. Similarly, Fig. 12 (b) presents the search for the best window size, which is 23 time lags. Between 20 and 40 time lags the model achieved the best results in comparison with smaller and larger time lags. Therefore, the GA-LSTM model converged to optimum results with 100 to 150 neurons and a window size of 20 to 40 time lags.
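To show what a window size of 23 time lags means for the model input, the sketch below converts a univariate series into lagged input windows and next-step targets. The function name and the one-step-ahead target are assumptions for illustration; the paper describes very short-term prediction but does not give its windowing code.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window_size: int
                 ) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (lagged inputs, next-step targets).

    Sample j uses series[j : j + window_size] as input and
    series[j + window_size] as the value to predict.
    """
    if not 1 <= window_size < len(series):
        raise ValueError("window_size must be between 1 and len(series) - 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for end in range(window_size, len(series)):
        inputs.append(list(series[end - window_size:end]))
        targets.append(series[end])
    return inputs, targets


# 30 readings with a 23-lag window give 30 - 23 = 7 training samples.
readings = [float(i) for i in range(30)]
X, y = make_windows(readings, window_size=23)
print(len(X), len(X[0]), y[0])
```

A larger window gives the network more history per sample but also lengthens every input sequence and reduces the number of samples, which is one reason the window size interacts with the number of neurons during the search.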

FIGURE 12. Scatter plots of the window size and number of hidden neurons of individuals in the GA optimization process for the residential energy prediction model. (a) Number of hidden neurons vs CV (%). (b) Window size or time lags vs CV (%).

The scatter plots for the second case study, the commercial building, are given in Fig. 13 (a) and (b). The scatter plot of the number of neurons versus the CV in Fig. 13 (a) has a wider distribution than the corresponding plot for the residential building. There are a couple of local optima in the figure, and the best offspring had 459 neurons with a CV of 8.3%. Fig. 13 (b) shows convergence between 40 and 50 time lags, with the smaller time lags giving the worst prediction accuracy in the experiment. The best individual is 42 time lags with a CV of 8.3%. Thus, the proposed GA-LSTM model led to optimum values of the number of hidden neurons and the window size for commercial energy prediction.

FIGURE 13. Scatter plots of the GA-LSTM optimization process for the commercial energy prediction model. (a) Number of hidden neurons vs CV (%). (b) Window size or time lags vs CV (%).

VI. CONCLUSION

Energy prediction in buildings has recently become a vital problem for energy conservation and cost-effectiveness due to the global increase in energy consumption. There have been many attempts to predict energy consumption efficiently using physical and statistical models. Among those attempts, DL methods have obtained promising prediction results with deeper neural network architectures. This paper proposed an evolutionary-based development of DL prediction models in order to improve prediction accuracy and network architecture.

The proposed approach combines the GA with the LSTM method by evolving the window size and the number of hidden neurons and by examining one, two, and three hidden layers. The prediction system was applied to two public datasets of residential and commercial buildings. The proposed model showed better performance than the compared prediction methods, namely ARIMA, Decision Tree, kNN, GA-ANN, MLP, and LSTM. The best percentage reduction in RMSE relative to the regular LSTM is 17.319% for the residential building case study and 10.669% for the commercial building case study.

The reasoning behind the evolutionary learning concept is that, for DL algorithms, it is faster and more efficient to find the optimized window size and the optimized number of hidden neurons by evolutionary search than to find them based on the developer's knowledge and experimental trials. Although the evolutionary DL concept is more demanding in terms of computational requirements, it notably outperformed the best conventional prediction models.

Since the proposed approach is an optimization-based technique for energy consumption prediction using the GA and the LSTM, its computational complexity depends on several factors that affect computing time, including the number of input time lags in the LSTM, the number of hidden neurons and layers in the LSTM, the number of generations in the GA, and the population size in the GA. These factors can create an NP computational time problem in the approach. For instance, if the first individual in the first generation has 14 input lags and 200 hidden neurons and the second individual has 14 input lags and 250 hidden neurons, the computation time of the second individual is higher than that of the first. To that end, a parallel computation technique such as MapReduce, run on a computational framework such as Apache Hadoop or Apache Spark, can reduce the time consumption of the proposed model. In addition, applying parallel computation to the proposed model can provide a real-time prediction paradigm, e.g., real-time power forecasting, that trains on the historical input variables offline and updates and tests on the recent input variables online.
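The 200-versus-250-neuron example can be made quantitative with the weight count of a standard LSTM layer, which has four gates, each with an input matrix, a recurrent matrix, and a bias vector. The sketch below assumes a single LSTM layer and a univariate input (input dimension 1). The cost proxy, lags multiplied by weights, is only a rough illustration and not a measured runtime.

```python
def lstm_layer_params(input_dim: int, hidden_units: int) -> int:
    """Trainable weights in one standard LSTM layer (four gates, with biases).

    Each gate has an input matrix (input_dim x hidden_units), a recurrent
    matrix (hidden_units x hidden_units), and a bias of length hidden_units.
    """
    return 4 * (hidden_units * (input_dim + hidden_units) + hidden_units)


def relative_step_cost(input_lags: int, input_dim: int, hidden_units: int) -> int:
    """Rough cost proxy per sample: every weight is used once per time lag."""
    return input_lags * lstm_layer_params(input_dim, hidden_units)


# The two individuals from the example above (14 lags, univariate input assumed)
first = relative_step_cost(14, 1, 200)
second = relative_step_cost(14, 1, 250)
print(lstm_layer_params(1, 200), lstm_layer_params(1, 250), second / first)
```

Under these assumptions the second individual has 252,000 weights against 161,600 for the first, roughly 1.56 times as many, because the recurrent matrices grow with the square of the number of hidden neurons.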

In real-world applications, the energy consumption load in buildings is related to several underlying factors, such as temperature, humidity, work time, holidays, and occupants. These factors can provide more information about the variability and uncertainty of energy consumption. The proposed approach is modeled to handle multiple input parameters and big-data non-linear prediction, so if these factors are considered, the proposed model can yield better prediction accuracy. Future work will study the effectiveness of other DL methods, such as GRU and CNN, which were not implemented in this study due to their high computational complexity.

REFERENCES

[1] K. Amarasinghe, D. Wijayasekara, H. Carey, M. Manic, D. He, and W. P. Chen, “Artificial neural networks based thermal energy storage control for buildings,” in IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society, Nov 2015, pp. 005421–005426.
[2] A. I. Dounis, “Artificial intelligence for energy conservation in buildings,” Advances in Building Energy Research, vol. 4, no. 1, pp. 267–299, 2010.
[3] A. M. Khudhair and M. M. Farid, “A review on energy conservation in building applications with thermal storage by latent heat using phase change materials,” Energy Conversion and Management, vol. 45, no. 2, pp. 263–275, 2004.
[4] N. Amjady, “Short-term hourly load forecasting using time-series modeling with peak load estimation capability,” IEEE Transactions on Power Systems, vol. 16, no. 3, pp. 498–505, Aug 2001.
[5] M. T. Hagan and S. M. Behr, “The time series approach to short term load forecasting,” IEEE Transactions on Power Systems, vol. 2, no. 3, pp. 785–791, Aug 1987.
[6] J. Contreras, R. Espinola, F. J. Nogales, and A. J. Conejo, “Arima models to predict next-day electricity prices,” IEEE Transactions on Power Systems, vol. 18, no. 3, pp. 1014–1020, Aug 2003.
[7] S. L. Wong, K. K. Wan, and T. N. Lam, “Artificial neural networks for energy analysis of office buildings with daylighting,” Applied Energy, vol. 87, no. 2, pp. 551–557, 2010.
[8] S. A. Kalogirou, “Artificial neural networks in energy applications in buildings,” International Journal of Low-Carbon Technologies, vol. 1, no. 3, pp. 201–216, 2006.
[9] C. Roldán-Blay, G. Escrivá-Escrivá, C. Álvarez-Bel, C. Roldán-Porta, and J. Rodríguez-García, “Upgrade of an artificial neural network prediction method for electrical consumption forecasting using an hourly temperature curve model,” Energy and Buildings, vol. 60, pp. 38–46, 2013.
[10] J. G. Jetcheva, M. Majidpour, and W.-P. Chen, “Neural network model ensembles for building-level electricity load forecasts,” Energy and Buildings, vol. 84, pp. 214–223, 2014.
[11] M. De Felice and X. Yao, “Short-term load forecasting with neural network ensembles: A comparative study [application notes],” IEEE Computational Intelligence Magazine, vol. 6, no. 3, pp. 47–56, 2011.
[12] B. Dong, C. Cao, and S. E. Lee, “Applying support vector machines to predict building energy consumption in tropical region,” Energy and Buildings, vol. 37, no. 5, pp. 545–553, 2005.
[13] Q. Li, Q. Meng, J. Cai, H. Yoshino, and A. Mochida, “Applying support vector machine to predict hourly cooling load in the building,” Applied Energy, vol. 86, no. 10, pp. 2249–2256, 2009.
[14] L. Ghelardoni, A. Ghio, and D. Anguita, “Energy load forecasting using empirical mode decomposition and support vector regression,” IEEE Transactions on Smart Grid, vol. 4, no. 1, pp. 549–556, 2013.
[15] B.-J. Chen, M.-W. Chang et al., “Load forecasting using support vector machines: A study on eunite competition 2001,” IEEE Transactions on Power Systems, vol. 19, no. 4, pp. 1821–1830, 2004.
[16] Q. Ding, “Long-term load forecast using decision tree method,” in 2006 IEEE PES Power Systems Conference and Exposition, Oct 2006, pp. 1541–1543.
[17] M. A. Al-Gunaid, M. V. Shcherbakov, D. A. Skorobogatchenko, A. G. Kravets, and V. A. Kamaev, “Forecasting energy consumption with the data reliability estimatimation in the management of hybrid energy system using fuzzy decision trees,” in 2016 7th International Conference on Information, Intelligence, Systems Applications (IISA), July 2016, pp. 1–
[18] Y. yuan Chen, Y. Lv, Z. Li, and F. Wang, “Long short-term memory model for traffic congestion prediction with online open data,” in 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Nov 2016, pp. 132–137.
[19] R. Zhang, Y. Xu, Z. Y. Dong, W. Kong, and K. P. Wong, “A composite k-nearest neighbor model for day-ahead load forecasting with limited temperature forecasts,” in 2016 IEEE Power and Energy Society General Meeting (PESGM), July 2016, pp. 1–5.
[20] W. Kong, Z. Y. Dong, D. J. Hill, F. Luo, and Y. Xu, “Short-term residential load forecasting based on resident behaviour learning,” IEEE Transactions on Power Systems, vol. 33, no. 1, pp. 1087–1088, Jan 2018.
[21] S. Ding, H. Li, C. Su, J. Yu, and F. Jin, “Evolutionary artificial neural networks: a review,” Artificial Intelligence Review, vol. 39, no. 3, pp. 251–260, 2013.
[22] S. Karatasou, M. Santamouris, and V. Geros, “Modeling and predicting building’s energy use with artificial neural networks: Methods and results,” Energy and Buildings, vol. 38, no. 8, pp. 949–958, 2006. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0378778805002161
[23] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. 770–778.
[24] N. Majumder, S. Poria, A. Gelbukh, and E. Cambria, “Deep learning-based document modeling for personality detection from text,” IEEE Intelligent Systems, vol. 32, no. 2, pp. 74–79, Mar 2017.

[25] P. Jiang, C. Chen, and X. Liu, “Time series prediction for evolutions of complex systems: A deep learning approach,” in 2016 IEEE International Conference on Control and Robotics Engineering (ICCRE), April 2016, pp. 1–6.

[26] D. L. Minh, A. Sadeghi-Niaraki, H. D. Huy, K. Min, and H. Moon, “Deep learning approach for short-term stock trends prediction based on two-stream gated recurrent unit network,” IEEE Access, vol. 6, pp. 55392–55404, 2018.
[27] N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj, and A. Iosifidis, “Temporal bag-of-features learning for predicting mid price movements using high frequency limit order book data,” IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1–12, 2018.
[28] H. Lee and B. Lee, “Bayesian deep learning-based confidence-aware solar irradiance forecasting system,” in 2018 International Conference on Information and Communication Technology Convergence (ICTC), Oct 2018, pp. 1233–1238.
[29] A. Alzahrani, P. Shamsi, M. Ferdowsi, and C. Dagli, “Solar irradiance forecasting using deep recurrent neural networks,” in 2017 IEEE 6th International Conference on Renewable Energy Research and Applications (ICRERA), Nov 2017, pp. 988–994.
[30] M. Khodayar, J. Wang, and M. Manthouri, “Interval deep generative neural network for wind speed forecasting,” IEEE Transactions on Smart Grid, pp. 1–1, 2018.
[31] M. Khodayar and J. Wang, “Spatio-temporal graph deep neural network for short-term wind speed forecasting,” IEEE Transactions on Sustainable Energy, pp. 1–1, 2018.
[32] X. Dong, L. Qian, and L. Huang, “A cnn based bagging learning approach to short-term load forecasting in smart grid,” in 2017 IEEE SmartWorld, Ubiquitous Intelligence Computing, Advanced Trusted Computed, Scalable Computing Communications, Cloud Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Aug 2017, pp. 1–6.
[33] K. Amarasinghe, D. L. Marino, and M. Manic, “Deep neural networks for energy load forecasting,” in 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE), June 2017, pp. 1483–1488.
[34] H. Shi, M. Xu, and R. Li, “Deep learning for household load forecasting - a novel pooling deep rnn,” IEEE Transactions on Smart Grid, vol. 9, no. 5, pp. 5271–5280, Sept 2018.
[35] F. M. Bianchi, E. Maiorino, M. C. Kampffmeyer, A. Rizzi, and R. Jenssen, “An overview and comparative analysis of recurrent neural networks for short term load forecasting,” arXiv preprint arXiv:1705.04378, 2017.
[36] D. Gan, Y. Wang, N. Zhang, and W. Zhu, “Enhancing short-term probabilistic residential load forecasting with quantile long-short-term memory,” The Journal of Engineering, vol. 2017, no. 14, pp. 2622–2627, 2017.
[37] D. L. Marino, K. Amarasinghe, and M. Manic, “Building energy load forecasting using deep neural networks,” in Industrial Electronics Society, IECON 2016-42nd Annual Conference of the IEEE. IEEE, 2016, pp. 7046–7051.
[38] A. Almalaq and G. Edwards, “A review of deep learning methods applied on load forecasting,” in 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Dec 2017, pp. 511–516.
[39] L. Song, H. Qing, Y. Ying-ying, and L. Hao-ning, “Prediction for chaotic time series of optimized bp neural network based on modified pso,” in The 26th Chinese Control and Decision Conference (2014 CCDC), May 2014, pp. 697–702.
[40] H. Chenglei, L. Kangji, L. Guohai, and P. Lei, “Forecasting building energy consumption based on hybrid pso-ann prediction model,” in 2015 34th Chinese Control Conference (CCC), July 2015, pp. 8243–8247.
[41] A. Afram, F. Janabi-Sharifi, A. S. Fung, and K. Raahemifar, “Artificial neural network (ann) based model predictive control (mpc) and optimization of hvac systems: A state of the art review and case study of a residential hvac system,” Energy and Buildings, vol. 141, pp. 96–113, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0378778816310799
[42] K. Li, H. Su, and J. Chu, “Forecasting building energy consumption using neural networks and hybrid neuro-fuzzy system: A comparative study,” Energy and Buildings, vol. 43, no. 10, pp. 2893–2899, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0378778811003124
[43] M. D. Sulistiyo, R. N. Dayawati, and Nurlasmaya, “Evolution strategies for weight optimization of artificial neural network in time series prediction,” in 2013 International Conference on Robotics, Biomimetics, Intelligent Computational Systems, Nov 2013, pp. 143–147.
[44] J. Zhang, Z. Zhan, Y. Lin, N. Chen, Y. Gong, J. Zhong, H. S. H. Chung, Y. Li, and Y. Shi, “Evolutionary computation meets machine learning: A survey,” IEEE Computational Intelligence Magazine, vol. 6, no. 4, pp. 68–75, Nov 2011.
[45] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016, http://www.deeplearningbook.org.
[46] O. Kramer, Machine learning for evolution strategies. Springer, 2016, vol. 20.
[47] D. Dheeru and E. Karra Taniskidou, “UCI machine learning repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[48] “Buildings datasets,” 2012. [Online]. Available: https://trynthink.github.io/buildingsdatasets/
[49] F. Chollet et al., “Keras,” https://github.com/fchollet/keras, 2015.
[50] F.-A. Fortin, F.-M. De Rainville, M.-A. Gardner, M. Parizeau, and C. Gagné, “DEAP: Evolutionary algorithms made easy,” Journal of Machine Learning Research, vol. 13, pp. 2171–2175, jul 2012.
[51] L. Buitinck, G. Louppe, M. Blondel, F. Pedregosa, A. Mueller, O. Grisel, V. Niculae, P. Prettenhofer, A. Gramfort, J. Grobler, R. Layton, J. VanderPlas, A. Joly, B. Holt, and G. Varoquaux, “API design for machine learning software: experiences from the scikit-learn project,” in ECML PKDD Workshop: Languages for Data Mining and Machine Learning, 2013, pp. 108–122.
[52] Z.-L. Sun, D.-S. Huang, C.-H. Zheng, and L. Shang, “Optimal selection of time lags for tdsep based on genetic algorithm,” Neurocomputing, vol. 69, no. 7, pp. 884–887, 2006, new Issues in Neurocomputing: 13th European Symposium on Artificial Neural Networks. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0925231205002146
[53] K. Lukoseviciute and M. Ragulskis, “Evolutionary algorithms for the selection of time lags for time series forecasting by fuzzy inference systems,” Neurocomputing, vol. 73, no. 10, pp. 2077–2088, 2010, subspace Learning / Selected papers from the European Symposium on Time Series Prediction. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0925231210001554

ABDULAZIZ ALMALAQ received the B.S. degree in electrical engineering from the College of Engineering, University of Hail, Hail, Saudi Arabia, in 2011, and the M.S. degree in electrical engineering from the Electrical and Computer Engineering Department, University of Denver, Denver, Colorado, USA, in 2015. He is currently a Ph.D. candidate at the Electrical and Computer Engineering Department, University of Denver. His research interests include signal processing, machine learning, deep learning, and intelligent power system applications.

JUN JASON ZHANG received the B.E. and M.E. degrees in electrical engineering from Huazhong University of Science and Technology, Wuhan, China, in 2003 and 2005, respectively, and the Ph.D. degree in electrical engineering from Arizona State University, USA, in 2008. He is currently an associate professor of electrical and computer engineering at the University of Denver. He has authored or coauthored over 70 peer-reviewed publications, and he is the Technical Co-Chair for the 48th North American Power Symposium (NAPS2016). His research interests include sensing theory, signal processing and implementation, time-varying system modeling, and their applications in intelligent power and energy systems.


