RESEARCH ARTICLE
Learning to shift load under uncertain production
in the smart grid
Mohsen Ghaffari¹ | Mohsen Afsharchi¹,²

¹Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences Zanjan, Zanjan, Iran
²Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran

Correspondence
Mohsen Afsharchi, Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran.
Summary
Demand-side management (DSM) enables customers to decide consciously how to seek and obtain power from the grid. The prevailing method in DSM is load shifting: the grid is assisted by reducing load demands during peak hours and moving those demands to off-peak hours, so that they can be met by the available production sources and many online load alterations can be precluded. The present article develops three approaches based on centralized multiagent reinforcement learning (CMARL), wherein the grid is modeled as a cooperative game. In the proposed methods, learning takes place in a center agent, so the agents need not communicate with each other throughout the learning process. The implementation results indicate that the proposed procedures optimize grid performance by diminishing customer costs and enhancing security through minimizing inter-customer interactions.
KEYWORDS
centralized multiagent reinforcement learning, demand-side management, game theory, load
shifting, smart grid
1 | INTRODUCTION
A smart grid, consisting of renewable energy sources, storage units, and power equipment, transfers energy from pro-
ducers to consumers via two-way digital technology so as to decrease the production cost and increase reliability
through controlling energy consumption. Recent studies have mainly focused on infrastructure, management, and smart grid security. In these grids, control methodologies and communication technologies are applied smartly at both the distribution and transmission levels so that electrical energy can be supplied.
In recent years, an objective of smart grid constructions has been to provide the grid with a demand-side management
(DSM) system while considering the optimized performance of the existing production sources and better load demand
management under diverse conditions. In this approach, consumption load is altered in a way that the grid sustainability
can be increased while air pollution can be decreased. In principle, using DSM in smart grids results in increased general
efficiency, security, and sustainability according to the maximum capacity of the current infrastructure.
List of Symbols and Abbreviations: mp, marginal production; 𝒩, set of customers; d_i^h, demand of the ith customer in hour h; d^h, total energy demand of all customers; g^h, suitable amount of load for the grid; PH, set of peak hours; PH′, set of off-peak hours; β, load reduction factor; ED, load demands from the utility company's point of view; r_i^h, amount of load that each customer should shift; λ, customer's satisfaction factor for participating in load shifting; c_i^h, price of each unit of energy in hour h for the ith customer; B, locational marginal pricing factor; S, set of states; A, set of actions; R, reward function; T, transition function; Δ, probability distribution over the set of states; α, learning factor; γ, discount factor; CMARL, centralized multiagent reinforcement learning; DSM, demand-side management; MARL, multiagent reinforcement learning; MDP, Markov decision process; SVR, support vector regression.
Received: 28 March 2020 Revised: 28 July 2020 Accepted: 27 November 2020
DOI: 10.1002/2050-7038.12748
Int Trans Electr Energ Syst. 2021;31:e12748. wileyonlinelibrary.com/journal/etep © 2020 John Wiley & Sons Ltd 1of18
Smart pricing is a unique feature of smart grids. It can persuade customers to control energy consumption by increasing prices during peak hours and by offering incentive schemes across the grid. Furthermore, DSM with smart pricing allows customers to make their choices considering their own economic interests as well as the grid's interests, or, in load-volume-management terms, to plan how their electrical demands are scheduled during all hours of the day. In other words, by being apprised of the hourly cost-change trend and the total demand of the grid, customers can choose the optimum time for their desired load demand. As a result, not only will they reduce their costs, but they will also contribute to proper load shifting in the grid; that is, the grid can flawlessly generate and distribute the demanded load.
Owing to the nonlinear and discrete nature of current smart grid models, a major part of the related investigations has dealt with developing optimization algorithms to obtain the best power distribution mode. Furthermore, other important studies on smart grids have addressed the components and strategies of DSM, which are discussed next.
In their study, Mohsenian-Rad et al¹ developed a method based on game theory. They defined the grid customers as
players and the daily schedules of their household appliances and loads as a strategy. They assumed that the utility
company could adopt adequate pricing tariffs that differentiate energy usage in terms of time and level. They claimed
that the costs of the grid would be minimized if a Nash equilibrium was achieved. The main problem of this work is to
find a Nash equilibrium in the real world. The number of customers of an actual grid can be so high that finding the
Nash equilibrium may appear inefficient.
Atzeni et al² attempted to introduce a different approach to energy management to improve the grid model by taking energy storage into account. Using the day-ahead technique and a general energy pricing model in the DSM mechanism, they tackled grid optimization from two different perspectives: a user-oriented optimization and a holistic-based design. In the day-ahead technique, customers should indicate their schedules the day before the test day so that
it would be possible to manage the consumption load of the grid for the next day. It should be noted that this gives rise
to a number of problems in terms of indicating the amount of customer demand.
The overall value of implementing DSM and demand-response management (DRM) schemes was studied in Refer-
ence 3 via a Stackelberg game formulation. The work in Reference 4 investigated how energy consumption could be
optimized through a two-step centralized model, in which a power supplier provides consumers with an energy price
parameter and consumption summary vector.
Chen et al⁵ convinced consumers to do load shifting by adopting instantaneous load billing. They used an aggregative game in their DSM scenario to model selfish customer behavior. Furthermore, they proposed an algorithm for
the conditions where there was no central control.
Unlike the studies conducted in this area, Chai et al⁶ studied DRM with multiple utility companies. In their paper, they
modeled the interaction between utility companies and residential users as a two-level game. That is, the competition among
the utility companies was formulated as a noncooperative game, while the interaction among the residential users was for-
mulated as an evolutionary game. Finding the Nash equilibrium is a problem that all non-cooperative games struggle with.
All the mentioned works have discussed DSM in terms of economic and optimization aspects. Although these studies made positive contributions, most of them assumed the behavior of agents to be perfectly rational and that customers would behave as the grid prefers, which, of course, is not the case in the real world. Thus, a major proportion of the studies have not considered the irrational behavior of customers. Moreover, as mentioned before, it is often really challenging to attain a Nash equilibrium in large real-world communities.
Wang et al⁷ studied a load shifting method using a non-cooperative game. The main characteristic of the study was prospect theory, which simulated the agents' behavior. The evident disadvantage of this work was the imposition of limitations on customers' load-shifting participation; that is, whenever customers started participating in load shifting, they were obliged to continue their participation until the last hour of the day (h = 24). In other words, participating in load shifting is almost like a one-way door. Thus, the customers cannot offer their load to be bought at certain hours of a day and then demand that load during later hours of the same day; they can only state the hours they prefer. Rather, it is the grid that determines how much of the customer's demand can be met in the given hours. Moreover, once the customers participate in load shifting, they are not allowed to leave the system until the end of the day.
In Reference 8, Wijaya et al suggested a method to solve the load shifting problem based on a multi-unit auction in
which all the customers send information about their demands to a center and then the center regulates their load
shifting arbitrarily and, in the end, sells the electric energy using an auction. Thus, the grid manages the demand, and
the load also is supplied with the best price. Moreover, customers can request their demands according to their pre-
ferred time and price. The probable challenges the method seems to suffer from are related to both grid and economy.
First, there are many interactions in the grid that create severe security problems. Second, while auction has many
advantages, some disadvantages accompany it when used directly.
He et al⁹ proposed a model-predictive-control-framework-based distributed demand-side energy management method for users and utilities in a smart grid. Users are equipped with renewable energy resources, energy storage systems, and different types of smart loads. With the proposed method, each user individually finds an optimal operation routine in response to the varying electricity prices according to his/her preference: for example, the power reduction of flexible loads, the start time of shiftable loads, the operation power of schedulable loads, and the charge/discharge routine of the energy storage systems.
In Reference 10, model predictive control methodologies were developed to address the distribution line congestion
and balancing problem in electric power distribution systems. The authors formulated model demand response strate-
gies as mixed-integer quadratic program optimization problems involving both continuous and binary variables.
Jamil and Mittal¹¹ proposed two optimization algorithms for solving the load shifting problem in a smart grid: the first a particle swarm optimization algorithm and the second a grasshopper optimization algorithm, both applied in three load areas of the smart grid, that is, residential, commercial, and industrial.
Ali et al¹² simulated load shifting in residential buildings, with air-conditioners as the shiftable load. The significant problem of these works was that the method might create new peaks. To overcome this problem, Khalid et al¹³ presented a novel approach to optimizing load demand and storage management in response to dynamic pricing using machine learning and optimization algorithms. Although the authors show effectiveness in terms of minimizing the electricity bill as well as preventing the creation of minimal-price peaks, their use of linear programming optimization renders their work inefficient for real-world grids.
Afzaal et al¹⁴ proposed an auction mechanism to optimize the energy traded between consumers and multiple suppliers within a smart grid. They suggested an agent-based forecasting method capable of predicting the energy consumption of each consumer with a lead time of 1 h. This forecast is exploited to estimate the cost of purchasing the required energy from multiple suppliers. As mentioned before, auctions breed problems of fairness and security. Also, this work ignores grid customer satisfaction.
High penetration of renewable energy sources and electrical energy storage systems in electrical distribution grids has changed distribution system operators' energy balance. For this purpose, Chamandoust et al¹⁵ modeled the energy scheduling problem of a residential smart electrical distribution grid with renewable energy sources and DSM as a tri-objective model. The proposed model was solved using the epsilon-constraint method, which is inefficient in real-world grids. Since the proposed approach has three objective functions, different Pareto solutions are obtained and the best solution is determined by a decision-making method. They also modeled the uncertain behavior of renewable energy sources using a stochastic optimization approach.
Using centralized reinforcement learning, the present study develops a method to solve the problem of load shifting in a multiagent environment. In the proposed method, customers have the right to participate in load shifting, and not necessarily in sequential intervals. Furthermore, since the problem is modeled as a cooperative game, the difficulty of finding a Nash equilibrium is removed. The most important aspect of the proposed method, compared with the others, is its independence from the number of customers in the grid, which eliminates worries about grid extension. Moreover, the necessary number of communications between the customers is reduced significantly in comparison with other works. Thus, we claim that our suggested approach, in addition to cost reduction, solves the balancing problem of the grid without causing customer dissatisfaction, reduces the possibility of security threats by decreasing the number of communications, and also improves the time complexity (running time) of finding optimum load shifting in the grid.
Research on smart grids has shown that, owing to the use of renewable energy, grids sometimes commit errors in predicting the amount of energy produced. The reason lies in various factors, such as unexpected changes in weather conditions. In such cases, the level of produced energy is lower than expected. To deal with this production-level irregularity, we specify it using a uniform distribution, indicated in advance by the utility company, called mp in this paper. Thus, during the learning phase, the level of load produced in an hour may meet the demand in one episode, while in another episode over the same hour only the lowest level of the predicted load is produced, which cannot meet the demand of the grid. In such cases, the customers should be encouraged to shift loads by increasing the energy price during the given hours and reducing it during other hours.
This paper is composed of five sections. Following this introduction, basic definitions in the field of smart grids and load distribution are presented in Section 2. Game theory and reinforcement learning are then reviewed in Section 3. Section 4 is devoted to the proposed method, and Section 5 presents the results of applying the proposed models and an analysis of the research findings.
2 | DEFINITIONS
Considering the cost effect on customer behavior, DSM tries to develop working methods in which load demands in the
grid can be managed. Thus, in this section we will discuss how load can be managed and define some terms needed to
proceed further.
2.1 | Load management
Smart grids consist of buyers and sellers of energy, all of whom are generally referred to as the customers of the grid. In
this article, customers are referred to as buyers. Let 𝒩 be the set of customers, where each customer consumes a certain amount of energy per hour, indicated by d_i^h for the ith customer in the hth hour. Thus, the total energy demand for all customers in hour h can be assessed using the following equation (Wang et al⁷):
$$d^h = \sum_{i} d_i^h \qquad (1)$$
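As a concrete illustration, Equation (1) is just a per-hour sum over customers. A minimal Python sketch (the function and variable names are ours, not the paper's):

```python
# Sketch of Equation (1): the grid's total demand d^h is the sum of the
# per-customer demands d^h_i for each hour h of the day.
def total_demand(customer_demands):
    """customer_demands: list of 24-element hourly demand vectors (kW)."""
    return [sum(d[h] for d in customer_demands) for h in range(24)]
```

For two customers with flat 10 kW and 5 kW profiles, every entry of the result is 15 kW.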
DSM has different load management strategies, including valley filling, peak clipping, load growth, strategic conser-
vation, and load shifting. In this study, we use load shifting to manage the load efficiently. It is expected that the distri-
bution of customers' load demands after load shifting results in companies facing fewer problems regarding energy
production and distribution.
In load shifting, customers of the grid change the time of a part of their demand from peak hours to off-peak hours.
Therefore, both the to-be-reduced load amount during peak hours and the new time of that demand play key roles in
load shifting. In other words, not only is the amount of load demand that the customers reduce at peak hours important, but the time at which they can request the same demand again from the grid is also of great importance, as it may cause new peak hours. Suppose a case in which several customers simultaneously shift their load demand to hour h. The result may be a sudden increase in load demand at h; the grid will then be unable to meet this demand rate, and a peak-hour problem will occur at that hour.
Utility companies assess the rate of the required load for hour h based on the data gathered from previous years. Moreover, they predict the rate of load produced by the grid by considering factors such as weather conditions. Finally, based on the results, they can indicate the suitable amount of load for the grid using the following equation:
$$g^h = \begin{cases} \beta \times d^h + mp, & h \in PH \\ d^h + \dfrac{ED}{|PH|} + mp, & h \text{ is a minimum-load hour} \\ d^h + mp, & \text{otherwise} \end{cases} \qquad (2)$$
where PH is the set of peak hours and mp refers to the marginal production resulting from the uncertain nature of energy production (i.e., unexpected changes in weather conditions or unpredicted failures of power generating equipment). mp is assumed uncertain and is drawn randomly from a uniform distribution. β is the load reduction factor that should be applied to the total demand of the grid at each time; it changes the load level of the grid to one at which energy production and distribution incur the minimum cost. As mentioned before, the utility company can predict the total demand of the grid using the data gathered from previous years. Next, utilizing the predicted total amount of energy that the grid can generate and distribute at minimum cost, the utility company indicates the value of β. In this article, we set β = 0.9 and draw the value of mp from a uniform distribution between 0 and 50 kW. In addition, ED denotes the load demands from the utility company's point of view and can be calculated by the following equation⁷:
$$ED = \sum_{h \in PH} (1 - \beta) \times d^h \qquad (3)$$
Note that ED will be shifted to another time based on the utility companies' desired schedule. However, this rate can be shifted either to the hours adjacent to peak hours or to any hours with which customers are more satisfied; the grid is also able to define it according to its predictions. An important point when deciding on the time of shifting the load is to consider the lowest price for each load unit at a given hour, because the main factor that encourages customers to participate in load shifting is low prices.
Based on Equation (2), utility companies can compute a per-hour load that the grid can bear. Next, based on the targeted load, they can specify by what amount each customer should shift their load demand from a peak hour h to the off-peak hours:
$$r_i^h = \max\!\left(d^h - g^h - mp,\; 0\right) \times \frac{d_i^h}{d^h} \times \lambda_i \qquad (4)$$
where 0 < λ_i ≤ 1 is the ith customer's satisfaction factor for participating in load shifting. This factor allows our models to learn the customer's decision concerning the load before proceeding to load shifting. In this way, we can claim that customer satisfaction with load shifting is taken into account. It also helps us learn how to deal with all types of customers.
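The quantities above can be sketched in a few lines of Python. This is our own illustrative reading of Equations (2) to (4), not the authors' code; only the peak-hour branch of Equation (2) is shown, and mp is passed in explicitly rather than sampled from its uniform distribution.

```python
def g_peak(d_h, mp, beta=0.9):
    # Peak-hour branch of Equation (2): g^h = beta * d^h + mp.
    return beta * d_h + mp

def excess_demand(d, peak_hours, beta=0.9):
    # Equation (3): ED = sum over peak hours of (1 - beta) * d^h.
    return sum((1 - beta) * d[h] for h in peak_hours)

def shift_amount(d_h, g_h, mp, d_hi, lam):
    # Equation (4): r^h_i = max(d^h - g^h - mp, 0) * (d^h_i / d^h) * lambda_i.
    return max(d_h - g_h - mp, 0.0) * (d_hi / d_h) * lam
```

For example, with d^h = 1000 kW, g^h = 850 kW, and mp = 50 kW, a customer demanding 200 kW with λ_i = 0.5 is asked to shift max(100, 0) × 0.2 × 0.5 = 10 kW.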
2.2 | Demand cost
Owing to the increasing need for electricity, it would be impossible to reduce demand significantly. However, controlling and managing hourly consumption can reduce the costs of production and distribution, which makes it possible to reduce the price of each unit of energy per hour. Here, we assume that the price of each unit of energy in hour h for the ith customer depends on the ratio of their demand to the total demand from the grid in the given hour⁷:
$$c_i^h = B \times \frac{d_i^h}{d^h} \qquad (5)$$
where B can be calculated based on locational marginal pricing,¹⁶ because the price function is not necessarily time-dependent. Equation (5) makes the price of each unit of energy dependent on the total demand for energy; that is, the lower a customer's demand compared with the total demand, the lower the price they pay for each unit of energy.
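As a quick sketch of Equation (5) (our own illustration; B is taken here as a plain number rather than derived from locational marginal pricing):

```python
def unit_price(B, d_hi, d_h):
    # Equation (5): c^h_i = B * d^h_i / d^h. A customer holding 10% of
    # the hour's total demand pays B / 10 per unit of energy.
    return B * d_hi / d_h
```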
3 | CENTRALIZED LEARNING FORMULATION FOR DEMAND-SIDE MANAGEMENT
The main idea behind reinforcement learning (RL) is that the rewarded behavior is likely to be repeated, whereas
behavior that is punished is less likely to recur. Thus, an agent learns from the received environmental feedback by two
different signals: state signal indicates the agent state in the environment, and the reward signal shows feedback of the
environment to determine the desirability of the agent state. The agent tries to maximize its long-term utility.
If the environment includes a set of agents whose behaviors affect one another, the learning is called multiagent reinforcement learning (MARL). Learning in multiagent environments takes place in two ways: centralized and decentralized. In decentralized learning, the agents must interact with each other at all times, which, in turn, may decrease efficiency; thus, communication between the agents should be restricted. There are several methods to control the communication, but most of the time the results are not optimal. The centralized method, however, reduces communication in the grid by using a central learner. Moreover, the learning process in this approach is easier than in decentralized methods, because all the needed information is available to the center. Centralized learning is structured so that each agent can be informed about the actions of the others through the learning process itself. In centralized learning, the center first simulates the behavior of the agents for itself and learns how to act in different states; then, when the agents act in the environment, the center can, based on what it has learned, suggest to the agents what action is suitable in the current state (Figure 1).
The other important factor in a multiagent environment is the way agents may interact with each other. Interaction is divided into two categories: cooperative and noncooperative. The agents in this work are cooperative; thus, they not only try to optimize their own profit but also try to optimize the costs of the whole grid. Note that, by the factors we defined here, improving the costs of the whole grid results in improving the customers' costs. Thus, the resulting conditions not only satisfy the customers but also improve the load level in the grid.
To formulate the model in RL, we use the tuple {S, A, R, T}, where S is the set of states; A the set of actions; R: S × A → ℝ, the reward function for each action taken by the agent in the current state; and T: S × A → Δ, the transition function, where Δ is a probability distribution over the set of states S.
Note that RL is applied to problems in which the agent moves through a sequence of states in order to reach an intended state. Thus, we should find a model that maps our problem to RL. In this work, the customers are the agents of the model. We use MARL because we do not deal with a single customer. The learning method in this article is centralized: the center receives information from all customers (agents) in the learning phase and learns how each agent acts in different states using the Q-learning algorithm. Next, in the test phase, all customers send their daily demand vectors to the center agent. Based on what it has learned, the center agent determines the best action for the customers of the grid and sends this information back to the customers (Figure 1). Thus, the number of messages sent is only twice the number of customers in the grid, which is clearly not high enough to disrupt the grid.
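The two-messages-per-customer exchange described above can be sketched as follows (all names are illustrative; the policy argument is a stand-in for the learned Q-table):

```python
# Sketch of the centralized test-phase exchange: each of the n customers
# sends one daily demand vector to the center and receives one action
# back, so the message count is 2n, that is, O(n).
def centralized_round(demand_vectors, choose_actions):
    """demand_vectors: {customer_id: 24-hour demand list}.
    choose_actions: the center's learned policy; maps all demand
    vectors to one recommended action per customer."""
    messages = len(demand_vectors)            # n uplink messages
    actions = choose_actions(demand_vectors)  # center decides centrally
    messages += len(actions)                  # n downlink messages
    return actions, messages
```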
4 | PROPOSED MODELS
Three models of the load shifting problem are presented in this section based on MARL. RL-oriented algorithms include two parts: learning and testing. The learning process is carried out on different data gathered from the previous behaviors of the customers so that the Q-table can be learned and deployed. During the process of learning, first, the data related to the customers' one-day schedules are sent to the center agent. The next step of learning is to move through the state space, receiving feedback from the environment while switching from one state to another. Thus, in this step, the center agent selects an action randomly for each agent and
then, according to the chosen sequence of actions, computes the new state of the grid. Moreover, it updates the environmental reward for the state change. Note that, since we use a cooperative game, the center agent only needs to calculate the total reward. The center agent continues this process until it reaches one of the goal states (where there is no peak hour) or trap states (existing only in the decreased binary model, to be explained shortly). Subsequently, the whole process is repeated for the other available sequences of actions. We will explain the stopping condition of each model as we introduce it in the following sections. When learning is completed, the center agent is able to determine a sequence of actions that leads the agents to the intended state and to send it back to the customer agents. Obviously, if the number of customers is n, then the total number of communications between the customer agents and the center agent is in the order of O(n).
4.1 | Demand-based model
In this model, each state includes a list of the total energy demand of all customers during 24 h:
FIGURE 1 Centralized multiagent reinforcement learning for load shifting for customers of the smart grid
$$S = \left\{ \left[ d^1, d^2, \ldots, d^{24} \right] \right\} \qquad (6)$$
Defining the state in this form makes the state space continuous, so it is not possible to create the whole state space. To solve this problem, we use a rounding method: first we choose some values as base values, for example, 0, 100, 200, and round each demand to the nearest base value. Using this method, we change the state space from continuous to discrete. Note that the method introduces some inaccuracy, but compared to the total demand it can be ignored.
For example, suppose that we have two customers with the following demand vectors:
D1 = [10, 5, 5, 5, 5, 5, 150, 200, 0, 0, 0, 0, 200, 170, 60, 30, 20, 30, 40, 200, 150, 170, 200, 20]

and

D2 = [0, 0, 0, 0, 0, 0, 100, 150, 30, 10, 10, 10, 100, 150, 60, 50, 50, 40, 20, 200, 200, 150, 150, 0]

The total load demand in the grid at each hour is then:

G = [10, 5, 5, 5, 5, 5, 250, 350, 30, 10, 10, 10, 350, 320, 120, 80, 70, 70, 60, 400, 350, 320, 350, 20]

As can be seen, the vector contains arbitrary real numbers, which results in a continuous state space. Thus, given the base values 0, 50, 100, …, the starting state is

s = [0, 0, 0, 0, 0, 0, 250, 350, 50, 0, 0, 0, 350, 300, 100, 100, 50, 50, 50, 400, 350, 300, 350, 0]
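The rounding step in this example can be sketched as follows (our own illustration, with base values spaced 50 apart):

```python
def discretize(demand, base=50):
    # Round each hourly total to the nearest base value (0, 50, 100, ...),
    # turning the continuous state space into a discrete one.
    return [round(x / base) * base for x in demand]
```

Note that Python's round() resolves exact ties (such as 25/50) to the even neighbor, which is harmless for this purpose.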
Having access to the total demand of the grid per hour as well as the peak hours, we can compute the amount of load that should be shifted from these hours to off-peak hours. Thus, we define the set of actions available to the agents as

$$A = \left\{ a_{h \to h'} \;\middle|\; h \in PH \;\&\; h' \in PH' \right\} \qquad (7)$$

where PH′ is the set of off-peak hours. Equation (7) shows that load shifts from hour h to hour h′, where h = h′ means no shift. Otherwise, the amount of demand that is shifted is given by Equation (4).
Once the center agent assigns an action a to each agent in the current state s, the new state s′ in which the agents are located is obtained by the transition

$$T\!\left(s, a_{h \to h'}\right): \quad \text{for each agent } i:\; s'_{h'} = s_{h'} + r_i^h, \quad s'_h = s_h - r_i^h \qquad (8)$$
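A minimal sketch of this transition (our own reading of Equation (8); the state is the 24-entry total-load vector):

```python
def apply_shift(state, h, h_prime, r):
    # Equation (8): move r units of one agent's demand from peak hour h
    # to off-peak hour h'. Returns a new state so the old one survives
    # for the Q-update; h == h' encodes "no shift".
    s = list(state)
    if h != h_prime:
        s[h] -= r
        s[h_prime] += r
    return s
```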
In the next step, the center agent computes the reward of this shift so that it can update the Q-table according to the reward resulting from the action in the given state. Table 1 is used to compute the reward of action a in state s. To show how the reward function works, consider an example. If a shift takes place from a peak hour to an off-peak hour, the agents receive the reward of the load shifting in which they participated. However, this is not the case when they turn off-peak hours into peak hours (since a new peak hour is created): the customers then have to suffer extra costs and, as a result, they learn that they have not chosen the right action.
As mentioned before, with the suggested definition of the states, we face the curse of dimensionality. In other words, although it is theoretically possible to create the whole state space, in practice it is impossible to create it efficiently. Since creating the whole state space is not possible in the present model, we cannot claim that the center agent has learned the desired behavior. Therefore, we use support vector regression (SVR). Suppose that the state space has one dimension and that the Q-value assigned to each state is a real number. The role of SVR is to fit a function over the states and their Q-values in a two-dimensional space (as shown in Figure 2). The Q-value of a state on the test day is then assessed using the value obtained from the SVR. Therefore, by using SVR instead of searching a very big state space, we make use of the most similar state in a small space. Although the demand-based model, using rounding and SVR, is generally capable of dealing with a huge state space, accuracy decreases as well. Therefore, we do not claim that this model is optimal or has the best time complexity.
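As a sketch of this generalization step (assuming scikit-learn's SVR; the one-dimensional state space and sin-shaped Q-values mirror Figure 2 and are purely illustrative):

```python
import numpy as np
from sklearn.svm import SVR

# Fit a regressor over (state, Q-value) pairs so that an unseen state on
# the test day gets an approximate Q-value instead of an exact lookup.
X = np.linspace(0, 6, 60).reshape(-1, 1)  # visited (discretized) states
q = np.sin(X).ravel()                     # Q-values learned for them
model = SVR(kernel="rbf", C=10.0).fit(X, q)
q_estimate = model.predict([[1.5]])[0]    # approximate Q for a new state
```

With an RBF kernel the estimate stays close to the underlying curve, here sin(1.5) ≈ 0.997.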
Algorithm 1 presents the algorithm corresponding to the demand-based model.
Algorithm 1. Centralized MARL load shifting: demand-based model.
Phase 1: Learning phase
1: While not converged do
2:   Choose randomly data belonging to previous years
3:   s′ ← Calculate the state from the d^h vector using Equation (6)
4:   While a peak hour exists do
5:     s ← s′
6:     a* ← Set an action for each agent using Equation (7)
7:     r_t ← Calculate reward(s, a*) using Table 1
TABLE 1  R function

S: d_h ≤ g_h
   No shift:  d_i^h × c_i^h
   Shift, d_{h'} + r_i^h ≤ g_{h'}:  r_i^h × (c'_i^h − c'_i^{h'})
   Shift, d_{h'} + r_i^h > g_{h'}:  −r_i^h × (c'_i^h − c'_i^{h'})

S: d_h > g_h and d_h − r_i^h ≤ g_h and ∃ i' ≠ i: d_h − r_{i'}^h ≤ g_h
   No shift:  d_i^h × c_i^h
   Shift, d_{h'} + r_i^h ≤ g_{h'}:  (d_i^h − r_i^h) × c'_i^h − r_i^h × c'_i^{h'}
   Shift, d_{h'} + r_i^h > g_{h'}:  (d_i^h − r_i^h) × c'_i^h + r_i^h × c'_i^{h'}

S: d_h > g_h and d_h − r_i^h ≤ g_h and ∀ i' ≠ i: d_h − r_{i'}^h > g_h
   No shift:  d_i^h × c_i^h
   Shift, d_{h'} + r_i^h ≤ g_{h'}:  (d_i^h − r_i^h) × c'_i^h + r_i^h × c'_i^{h'}
   Shift, d_{h'} + r_i^h > g_{h'}:  (d_i^h − r_i^h) × c'_i^h − r_i^h × c'_i^{h'}

S: d_h − r_i^h > g_h
   No shift:  r_i^h × (c'_i^{h'} − c'_i^h)
   Shift, d_{h'} + r_i^h ≤ g_{h'}:  r_i^h × (c'_i^{h'} − c'_i^h)
   Shift, d_{h'} + r_i^h > g_{h'}:  −r_i^h × (c'_i^{h'} − c'_i^h)
FIGURE 2  Linear and polynomial SVR techniques applied to sample data generated from the function sin(x)
8:     s' ← calculate the new state using Equation (8)
9:     Q_{t+1}(s, a*) ← (1 − α)·Q_t(s, a*) + α·(r_t + γ·max Q_t(s', a*))
10:  EndWhile
11: EndWhile

Phase 2: Testing phase
1: Each agent determines an hourly energy demand scheduling vector and sends it to the center.
2: The center performs the following steps:
   a: Calculates the initial state
   b: Runs SVR(Q) to fit the initial state to one of the learned states
   c: Simulates the behavior of the agents and finds the sequence of actions that takes them to the goal state
   d: Sends back the sequence of actions to the agents
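The Q-update in step 9 of the learning phase is the standard Q-learning rule. A minimal tabular sketch, with hypothetical state and action labels, might look like this:

```python
# Tabular sketch of the Q-update used in step 9 of the learning phase:
# Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma * max_a' Q(s',a')).
# States and actions are hypothetical hashable placeholders.

from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9

Q = defaultdict(float)            # Q[(state, action)] defaults to 0.0
ACTIONS = ["shift_7_to_3", "shift_20_to_2", "no_shift"]

def q_update(s, a, r, s_next):
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] = (1 - ALPHA) * Q[(s, a)] + ALPHA * (r + GAMMA * best_next)

# one update from a hypothetical "peak" state with reward 10
q_update("peak", "shift_7_to_3", 10.0, "off_peak")
```

Since all Q-values start at zero, the first update stores α·r = 5.0 for the taken pair; repeated sweeps over past data propagate value backward through the shift sequences.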
4.2 |Binary model
The demand-based model's problems arise from its continuous state space and the infinite variation it allows. The first problem was the continuity of the values, which we removed by discretizing the state space. The second problem was the curse of dimensionality when working with SVR, which prevented us from ensuring that the results are optimal: SVR is not only inaccurate but also expensive to run owing to the high dimensionality of the state space. All the mentioned reasons lower the efficiency of the suggested method. Thus, we tried to eliminate the direct dependence of the state space on the load demand.
The idea we use to solve these problems is to consider the situation of the grid at hour h instead of the total demand. In other words, in the binary model, we compare the total load of the grid during hour h with the value that the grid can provide and store the result as a binary value. By doing so, we reduce the state space from real numbers to 24 binary numbers. As a result, the state in this model is a list of 24 numbers indicating whether each hour is peak or off-peak:
S = {[x_1, x_2, …, x_24]}    (9)

where

x_h = 1, if h ∈ PH;  0, if h ∈ PH'    (10)
Therefore, the entries of a state in this model are either 0 or 1, where 1 refers to a peak hour and 0 to an off-peak hour. The goal state is a vector of 24 zeros. For example, assume that there are two customers with demand vectors as follows:
D1 = [10, 5, 5, 5, 5, 5, 150, 200, 0, 0, 0, 0, 200, 170, 60, 30, 20, 30, 40, 200, 150, 170, 200, 20]

and

D2 = [0, 0, 0, 0, 0, 0, 100, 150, 30, 10, 10, 10, 100, 150, 60, 50, 50, 40, 20, 200, 200, 150, 150, 0]

The generation vector of the grid is

G = [10, 5, 5, 5, 5, 105, 150, 150, 130, 110, 10, 160, 150, 170, 170, 180, 170, 220, 210, 200, 250, 220, 200, 170]
The start state is

s = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

and the goal state is

s = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Note that taking actions in the learning phase can increase the number of 1s in the state vector, owing to wrong load shifting.
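The start state above can be reproduced directly from Equation (9) and the example vectors; a short sketch (hour indices treated as 0-based list positions):

```python
# Compute the binary state of Equation (9): x_h = 1 iff total demand
# at hour h exceeds what the grid can provide. The vectors are taken
# from the two-customer example in the text.

D1 = [10, 5, 5, 5, 5, 5, 150, 200, 0, 0, 0, 0, 200, 170, 60, 30, 20, 30,
      40, 200, 150, 170, 200, 20]
D2 = [0, 0, 0, 0, 0, 0, 100, 150, 30, 10, 10, 10, 100, 150, 60, 50, 50,
      40, 20, 200, 200, 150, 150, 0]
G = [10, 5, 5, 5, 5, 105, 150, 150, 130, 110, 10, 160, 150, 170, 170,
     180, 170, 220, 210, 200, 250, 220, 200, 170]

def binary_state(demands, generation):
    totals = [sum(hour) for hour in zip(*demands)]       # total load per hour
    return [1 if total > g else 0 for total, g in zip(totals, generation)]

state = binary_state([D1, D2], G)
# → [0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,0], the start state s
```

Note that an hour where demand exactly equals generation (e.g., hour 0, where 10 = 10) is off-peak under the strict inequality.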
According to the definition of state in this model, neither rounding nor SVR is required: the total state space contains 2^24 states, so it is possible to create all the states and learn them using the conventional Q-learning method.
Note that the problem of the demand-based model arose from its state structure. In the binary model, we change only the state structure, which requires a new transition function; the structures of the action and the reward remain the same as in the demand-based model. The transition function changes as follows:
T(s, a_{h→h'}):    (11)

for each agent do
  d_{h'} ← d_{h'} + r_i^h
  d_h ← d_h − r_i^h

s'_h = 0 and s'_{h'} = 0,  if d_h < g_h and d_{h'} < g_{h'}
s'_h = 0 and s'_{h'} = 1,  if d_h < g_h and d_{h'} > g_{h'}
s'_h = 1 and s'_{h'} = 0,  if d_h > g_h and d_{h'} < g_{h'}
s'_h = 1 and s'_{h'} = 1,  if d_h > g_h and d_{h'} > g_{h'}
It is apparent from the definition that the new transition function acts in two steps: first, it shifts the load according to the agents' actions; then, it updates the peak conditions to generate the new state.
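A minimal sketch of these two steps follows; the three-hour grid, demands, and shift amounts are hypothetical, chosen only to exercise both steps:

```python
# Sketch of the binary-model transition T(s, a_{h->h'}): step 1 moves
# each shifting agent's load r from hour h to hour h', step 2 recomputes
# the peak bits. Hours are 0-based indices; all quantities are invented.

def transition(total_demand, generation, shifts):
    """shifts: list of (h, h_prime, r) tuples, one per shifting agent."""
    d = list(total_demand)
    for h, h_prime, r in shifts:                 # step 1: shift the load
        d[h] -= r
        d[h_prime] += r
    state = [1 if d[h] > generation[h] else 0    # step 2: new peak bits
             for h in range(len(d))]
    return d, state

demand = [120, 60, 80]                           # hypothetical 3-hour grid
generation = [100, 100, 100]
new_demand, new_state = transition(demand, generation, [(0, 1, 30)])
# hour 0 drops to 90 (off-peak), hour 1 rises to 90 (still off-peak)
```

Shifting 60 units instead of 30 in the same example would push hour 1 to 120 and set its bit to 1, which is exactly the wrong-shift case that creates a new peak during learning.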
In the binary model, it is possible to create the whole state space, but great care should be taken in selecting the learning data: the model can provide an optimal answer only if proper data are selected. Algorithm 2 presents the algorithm corresponding to the binary model.
Algorithm 2. Centralized MARL load shifting, binary model.

Phase 1: Learning phase
1: While not converged do
2:   Randomly choose data belonging to previous years
3:   s' ← calculate the state from the d_h vector using Equation (9)
4:   While a peak hour exists do
5:     s ← s'
6:     a* ← assign an action to each agent using Equation (7)
7:     r_t ← calculate reward(s, a*) using Table 1
8:     s' ← calculate the new state using Equation (11)
9:     Q_{t+1}(s, a*) ← (1 − α)·Q_t(s, a*) + α·(r_t + γ·max Q_t(s', a*))
10:  EndWhile
11: EndWhile

Phase 2: Testing phase
1: Each agent determines an hourly energy demand scheduling vector and sends it to the center.
2: The center performs the following steps:
   a: Calculates the initial state
   b: Simulates the behavior of the agents and finds the sequence of actions that takes them to the goal state
   c: Sends back the sequence of actions to the agents
4.3 |Decreased binary model
The binary model solves the problems of the demand-based model. However, its time complexity can still be reduced (learning 2^24 states takes a long time). Therefore, we propose the decreased binary model to reduce the time complexity. To this end, we employ the batch technique, which is based on the principle that peak hours over long periods (i.e., seasonal periods) are always the same.
In this proposed model, we associate the state space with the number of peak hours. To this end, off-peak hours are deleted from the list and the creation of new peak hours is managed by adding a flag. Note that, in the decreased binary model, the state consists of a vector of length |PH| + 1, which shrinks the whole state space from 2^24 to 2^(|PH|+1).
The definition of state in the decreased binary model is as follows:

S = {[x_1, x_2, …, x_{|PH|}, flag]}    (12)

where

x_h = 1, if h ∈ PH;  0, if h ∈ PH'    (13)

and

flag = 0, if no new peak hour is created;  1, otherwise    (14)
According to the definition of S, the start state of the model is always a vector containing a 1 for each member of PH and a 0 for the flag, and the goal state is the zero vector of length |PH| + 1. For example, for

PH = {12, 13, 14, 19, 20, 21}

the start state is

s = [1, 1, 1, 1, 1, 1, 0]

and the goal state is

s = [0, 0, 0, 0, 0, 0, 0]
Note that the value of the flag changes to 1 when a new peak hour is created; we refer to such a state as a trap state. The algorithm resets to the start state when it faces a trap state, and the agents receive a punishment that discourages them from going to this state.
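The decreased state and its trap flag can be sketched as follows. The 24-hour demand and generation vectors are hypothetical, while PH is the peak-hour list from the example above (hour numbers treated as 0-based list positions):

```python
# Sketch of the decreased binary state (Equation (12)): one bit per
# original peak hour plus a trap flag that turns 1 when a shift creates
# a peak at an hour outside the original PH list. Values are invented.

PH = [12, 13, 14, 19, 20, 21]                  # original peak hours

def decreased_state(total_demand, generation, peak_hours):
    bits = [1 if total_demand[h] > generation[h] else 0 for h in peak_hours]
    flag = int(any(total_demand[h] > generation[h]
                   for h in range(len(total_demand)) if h not in peak_hours))
    return bits + [flag]                       # vector of length |PH| + 1

# hypothetical 24-hour vectors: every original peak hour overloaded
demand = [50] * 24
generation = [100] * 24
for h in PH:
    demand[h] = 150

start = decreased_state(demand, generation, PH)   # [1, 1, 1, 1, 1, 1, 0]

# a bad shift overloads hour 3, which is not in PH: a trap state
demand[3] = 180
trap = decreased_state(demand, generation, PH)    # flag becomes 1
```

The second call illustrates why the trap flag is needed: the six PH bits alone cannot express a peak appearing at a deleted (off-peak) hour.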
Considering the changes made to the state of the model relative to the binary model, the transition function must be redefined, while the definitions of the action and the reward are left unchanged. The important point in defining the transition function is checking the flag value:
T(s, a_{h→h'}):    (15)

for each agent do
  d_{h'} ← d_{h'} + r_i^h
  d_h ← d_h − r_i^h

s'_h = 0 and s'_flag = 0,  if d_h < g_h and no new peak hour exists
s'_h = 0 and s'_flag = 1,  if d_h < g_h and a new peak hour exists
s'_h = 1 and s'_flag = 0,  if d_h > g_h and no new peak hour exists
s'_h = 1 and s'_flag = 1,  if d_h > g_h and a new peak hour exists
According to Equation (15), at the hour from which load is shifted, the new total demand is computed and compared with the desirable generation of the grid, and the flag value is updated if the grid fails to supply the new demand. Algorithm 3 presents the algorithm corresponding to the decreased binary model.
In the present section, we have introduced three models based on CMARL to solve the peak problems in smart grids. The first was the demand-based model, which suffers from the curse of dimensionality. Next, we proposed the binary model to solve that problem. Finally, to improve the running time of the learning phase, we suggested the decreased binary model. The proposed models' simulation results are discussed below.
Algorithm 3. Centralized MARL load shifting, decreased binary model.

Phase 1: Learning phase
1: While not converged do
2:   Randomly choose data belonging to previous years
3:   s' ← calculate the state from the d_h vector using Equation (12)
4:   While a peak hour exists do
5:     s ← s'
6:     a* ← assign an action to each agent using Equation (7)
7:     r_t ← calculate reward(s, a*) using Table 1
8:     s' ← calculate the new state using Equation (15)
9:     Q_{t+1}(s, a*) ← (1 − α)·Q_t(s, a*) + α·(r_t + γ·max Q_t(s', a*))
10:    If a trap state occurs or the goal state is achieved then
11:      go to step 2
12:    EndIf
13:  EndWhile
14: EndWhile

Phase 2: Test phase
1: Each agent determines an hourly energy demand scheduling vector and sends it to the center.
2: The center performs the following steps:
   a: Calculates the initial state
   b: Simulates the behavior of the agents and finds the sequence of actions that takes them to the goal state
   c: Sends back the sequence of actions to the agents
5|SIMULATION RESULTS AND ANALYSIS
In this section, we present the results of implementing the proposed models. Real-world data (Reference 17) have been used for the simulations, including the initial demand of the customers. For example, we used data gathered from a full-service restaurant and a hospital located in Anchorage, Alaska, in 2004-2005.
Although the binary and decreased binary models are simulated over the 00:00-24:00 interval, the demand-based model is simulated over a 5-h interval between 16:00 and 20:00. The reason is that applying this model to conventional systems is not viable, owing to the rapid growth of the state space and the high cost of applying SVR. Thus, by reducing the state-space size, we analyze some of our simulations in order to show how well our model works. Furthermore, we use λ = 0.6 for all the customers.
First, the models are compared between the case where the test data exist exactly in the learning data and the case where they do not; in other words, this section discusses the scalability of the models when facing new states.
Figure 3 shows the difference between the case where the test-day data existed in the learning data and the case where they did not. The results show that the demand-based model cannot guarantee an optimal response unless it learns the test-day data in the learning phase, although it certainly improves the cost. The reason lies in the exact matching of states in the binary and decreased binary models, which disregard the value of the customers' demand; in the demand-based model, by contrast, an insignificant change in an agent's demand can create a new state that has not been learned before. A close examination of the figure shows a trivial inexactness in the results of the demand-based model, which results from rounding and SVR (since matching takes place with the nearest state). Thus, the demand-based model, in spite of its simplicity, is not efficient in terms of computational accuracy in a real-world situation.
Uncertainty in real-world problems can always lead to inaccurate outputs. Considering the stochastic nature of energy production, in this section we study how the proposed models act under the best and the worst production rates. Given the high sensitivity of the demand-based model to the test-day data, we consider both conditions: test data present in, and missing from, the learning phase.
First, we consider the ideal state in which the rate of energy production has the best condition. This causes the grid
to provide the demanded load of the customers during the hours with the highest customer satisfaction. Table 2, created
according to the best production rate of the grid, illustrates that all models always provide an acceptable answer to the
given state if the data of the learning phase include the data of the test day. However, even if the data of the test day are
not learned exactly, the binary and decreased binary models mostly give an acceptable answer to the problem. The
results of this section reveal the principle that if the grid has its best production rate, the learning in the binary and
decreased binary models will occur in a way that the agents give an optimal answer most often.
FIGURE 3  The proposed models' behavior with test-day data: (A) demand-based model; (B) binary model; (C) decreased binary model
Real-world scenarios show that smart grids sometimes cannot produce the predicted level of energy because they rely on renewable resources, which are affected by factors such as changing weather conditions. Table 3 shows a situation in which the level of energy production is not as predicted. According to the results, the binary and decreased binary models act independently of the production rate and, in this case, often provide an optimized answer. Comparing the results of the demand-based model in the two settings shows that the problem of the model is not the way it learns; rather, it is the lack of exact state matching that makes it weaker than the other two models. In other words, if the stochastic nature of production is considered, all three models learn correctly; yet it is the structure of the states that prevents the demand-based model from guaranteeing an optimized answer upon observing a new state.
Another factor in need of investigation is λ. Since we use cooperative games, we predict that the customers tend to shift a considerable part of their load demand if necessary. In other words, we expect that the agents, released from their tendency to run some of their jobs at certain hours, participate more effectively in load shifting and create the conditions the grid requires. Our investigations show that the rate of customer participation in load shifting is inversely related to λ (see Figure 4). In this respect, according to our prediction, the agents should prefer the profit of the grid to their own satisfaction when necessary. Of course, in the real world some customers prefer their satisfaction over the overall profit of the grid. Since λ varies from person to person, such behavior is captured during the learning phase, and there is no reason to worry about trouble in the grid caused by this kind of customer.
TABLE 2  Responsiveness average

Method       Model             Test data in learning data (%)   Test data not in learning data (%)
Centralized  Demand-based      100                              46
             Binary            100                              100
             Decreased binary  100                              100

TABLE 3  Responsiveness average in the worst case

Method       Model             Test data in learning data (%)   Test data not in learning data (%)
Centralized  Demand-based      95                               38
             Binary            99                               94
             Decreased binary  99                               98
FIGURE 4  Percentage of agents' participation in load shifting as λ changes
In this article, we attempt to increase the customers' participation in load shifting during peak hours. Thus, we have presented results in Figure 5 showing that the agents are much more inclined to shift load at peak hours than at other hours. This can be explained by the way the reward function is defined. Sometimes load distribution takes place at off-peak hours because of the creation of trap states in the learning phase of the demand-based and binary models, which have no direct control over the creation of these states.
The customers of the grid are to be encouraged to change the time of their load demand, and reducing the cost of energy can be an enticement for them to participate in load shifting. In our next experiment, we study the effects of our models on the cost of energy and show why the customers will be satisfied with participating in load shifting. According to Figure 6, after load shifting, the average cost of energy is significantly reduced and the high cost imposed on the grid at peak hours is removed.
Figure 7A demonstrates the learning phase's running time, which is not critical since this phase runs only once (or once per season). Note that the demand-based model is inefficient here because its states are constructed from real numbers and it uses the SVR method.
In most studies, a rise in the number of customers degrades the running time. We claimed that our models remain efficient even as the number of customers increases; Figure 7B confirms this claim.
FIGURE 5  Agents' participation in load shifting for one day (λ = 0.6)
FIGURE 6 Effect of load shifting on cost change
FIGURE 7  The effect of the number of customers on the running time of the learning and test phases for the three proposed models: (A) learning-phase running time; (B) test-phase running time

FIGURE 8  Total demand of 10 customers in a year

FIGURE 9  Total demand of 10 customers for a week and a day
Figure 8 illustrates the total demand of the grid customers for each season. According to this figure, the customers' demand is routine and regular within a season. Therefore, peak hours are fixed for a season and customer demand is predictable, as we have assumed. Figure 9 displays the total customer demand for a week and a day in detail.
In line with the hypothesis, load shifting reduces the costs of the grid for both utility companies and customers,
resolves balancing problems, and enhances the grid security via reducing the number of interactions among customers.
The results of our simulations indicate that learning helps increase load shifting at peak hours without creating new peaks. It also paves the way to overcoming most of the problems of previous works, such as load spikes, selfish customers, customer dissatisfaction, and lack of scalability. Since the different situations of the grid are learned, an optimal sequence of actions leading to the goal states can be found. Thus, different types of customers will not change the results, as their behavior has already been learned.
Further research is required for the cases in which customers have power generation and storage devices. Also, since the proposed learning is centralized, extending it toward decentralized learning for load-shifting problems is suggested.
6|CONCLUSION
In this article, three models for managing energy in smart grids through load shifting were introduced, all of which operate based on centralized RL. The binary model was developed to eliminate the disadvantages of the demand-based model, and the decreased binary model was introduced to improve the learning rate of the binary model; it is the most complete of the three. According to the simulation results, the suggested methods reduce costs while improving the smart grid's performance in terms of energy production and distribution. Furthermore, the structures used improve the rate of decision making in the grid. The complications of finding a Nash equilibrium (which is required in most works in the field) are absent here, and the interactions between the agents are kept at a minimum level.
PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/2050-7038.12748.
DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
ORCID
Mohsen Ghaffari https://orcid.org/0000-0002-1939-9053
Mohsen Afsharchi https://orcid.org/0000-0001-8329-9463
REFERENCES
1. Mohsenian-Rad A, Wong VW, Jatskevich J, Schober R, Leon-Garcia A. Autonomous demand-side management based on game-theoretic
energy consumption scheduling for the future smart grid. IEEE Trans Smart Grid. 2010;1:320-331.
2. Atzeni I, Ordóñez LG, Scutari G, Palomar DP, Fonollosa JR. Noncooperative and cooperative optimization of distributed energy generation and storage in the demand-side of the smart grid. IEEE Trans Signal Process. 2013;61:2454-2472.
3. Manshaei MH, Zhu Q, Alpcan T, Başar T, Hubaux J-P. Game theory meets network security and privacy. ACM Comput Surv. 2013;45:1-39.
4. Fadlullah ZM, Quan DM, Kato N, Stojmenovic I. GTES: an optimized game-theoretic demand-side management scheme for smart grid.
IEEE Syst J. 2014;8:588-597.
5. Chen H, Li Y, Louie RH, Vucetic B. Autonomous demand side management based on energy consumption scheduling and instanta-
neous load billing: an aggregative game approach. IEEE Trans Smart Grid. 2014;5:1744-1754.
6. Chai B, Chen J, Yang Z, Zhang Y. Demand response management with multiple utility companies: a two-level game approach. IEEE
Trans Smart Grid. 2014;5:722-731.
7. Wang Y, Saad W, Mandayam NB, Vincent Poor H. Load shifting in the smart grid: to participate or not? IEEE Trans Smart Grid. 2015;7
(6):2604-2614.
8. Wijaya TK, Larson K, Aberer K. Matching demand with supply in the smart grid using agent-based multiunit auction. In Fifth Interna-
tional Conference on Communication Systems and Networks (COMSNETS); 2013.
GHAFFARI AND AFSHARCHI 17 of 18
9. He M, Zhang F, Huang Y, Chen J, Wang J, Wang R. A distributed demand side energy management algorithm for smart grid. Energies.
2019;12(3):426.
10. Kalogeropoulos I, Sarimveis H. Predictive control algorithms for congestion management in electric power distribution grids. Appl Math
Model. 2020;77:635-651.
11. Jamil M, Mittal S. Hourly load shifting approach for demand side management in smart grid using grasshopper optimisation algorithm.
IET Gen Trans Dis. 2020;14(5):808-815.
12. Ali SNH, Lenzen M, Huang J. Shifting air-conditioner load in residential buildings: benefits for low-carbon integrated power grids. IET
Renew Power Gen. 2018;12(11):1314-1323.
13. Khalid Z, Abbas G, Awais M, Alquthami T, Rasheed MB. A novel load scheduling mechanism using artificial neural network-based customer profiles in smart grid. Energies. 2020;13(5):1-23.
14. Afzaal A, Kanwal F, Ali AH, Bashir K, Anjum F. Agent-based energy consumption scheduling for smart grids: an auction-theoretic
approach. IEEE Access. 2020;8:73780-73790.
15. Chamandoust H, Derakhshan G, Hakimi SM, Bahramara S. Tri-objective scheduling of residential smart electrical distribution grids
with optimal joint of responsive loads with renewable energy sources. J Energy Storage. 2020;27:101112.
16. Shahidehpour M, Yamin H, Li Z. Market Operations in Electric Power Systems: Forecasting, Scheduling, and Risk Management. Wiley-IEEE Press; 2002.
17. OpenEI. Available from http://en.openei.org/datasets/files/961/pub/.
18. Melo FS. Convergence of Q-learning: a simple proof; 2001.
How to cite this article: Ghaffari M, Afsharchi M. Learning to shift load under uncertain production in the
smart grid. Int Trans Electr Energ Syst. 2021;31:e12748. https://doi.org/10.1002/2050-7038.12748
APPENDIX A
Convergence of algorithm
In this section, the convergence of the proposed models is discussed. Since the method used to prove convergence is similar for all the models, we present the proof only once.
Theorem 1. For a finite Markov decision process (MDP), Q-learning always converges to the optimal value under standard conditions.
Since Theorem 1 was proved in Reference 18, to prove the convergence of the suggested models it suffices to show that the MDP of each model is finite.
Observation 1. Production and distribution costs of the grid are always of finite value.
Theorem 2. The suggested models converge to an optimal value.
Proof
According to Observation 1, the costs of the grid are never infinite. Thus, the costs of the grid for individual customers are finite and, consequently, the reward function, which is defined based on the customers' costs, cannot take an infinite value. Since the upper bounds of the costs of the grid and of the customers are finite, and the lower bound of the costs is also finite, cost reduction ranges over finite values. On the other hand, all defined actions are distinct and always occur at peak hours, and the number of customers is finite. Therefore, the number of load-shifting states cannot be infinite. It can thus be claimed that the suggested models always have a finite MDP and, according to Theorem 1, converge to an optimal value.
... Machine Learning (ML) approaches have been significantly used to design detection and prediction systems (Ghaffari & Afsharchi, 2020). Detection systems extract features of a phenomenon, like a disease, and detect the possibility of occurring by having some observations and witnesses. ...
Article
Full-text available
To manage the propagation of infectious diseases, particularly fast-spreading pandemics, it is necessary to provide information about possible infected places and individuals, however, it needs diagnostic tests and is time-consuming and expensive. To smooth these issues, and motivated by the current Coronavirus disease (COVID-19) pandemic, in this paper, we propose a learning-based system and a hidden Markov model (i) to assess hazardous places of a contagious disease, and (ii) to predict the probability of individuals’ infection. To this end, we track the trajectories of individuals in an environment. For evaluating the models and the approaches, we use the Covid-19 outbreak in an urban environment as a case study. Individuals in a closed population are explicitly represented by their movement trajectories over a period of time. The simulation results demonstrate that by adjusting the communicable disease parameters, the detector system and the predictor system are able to correctly assess the hazardous places and determine the infection possibility of individuals and cluster them accurately with high probability, i.e., on average more than 96%. In general, the proposed approaches to assessing hazardous places and predicting the infection possibility of individuals can be applied to contagious diseases by tailoring them to the influential features of the disease.
... This can yield the result of increasing customer satisfaction (Li et al., 2019). Otherwise, these projects cannot be preferable, the sustainability of these projects would be in jeopardy (Ghaffari and Afsharchi, 2021). Caputo et al. (2018) aimed to identify the critical factors of the performance improvement regarding smart grid projects. ...
Article
Full-text available
Smart grid systems help increase RWJ projects (RWJ) so that environmentally friendly energy production can be generated. However, efficient technologies should be implemented to ensure the sustainability of smart grid systems. This study aims to evaluate renewable-friendly smart grid technologies regarding distributed energy investment projects by using a hybrid picture fuzzy rough decision-making approach. Firstly, selected criteria are weighted using the multi stepwise weight assessment ratio analysis (M-SWARA) method based on picture fuzzy rough sets (PFRSs). Subsequently, different renewable-friendly smart grid technologies are ranked with the complex proportional assessment (COPRAS) technique by using PFRSs. It is determined that research and development play the most critical role with respect to the renewable-friendly smart grid technologies for distributed energy investment projects. On the other side, cost is another essential factor for this issue. It is also identified that direct current links are the most important renewable-friendly smart grid technology alternative. Priorities should be given to the development of research and development studies on renewable energies to increase the efficiency of smart grid systems. In this context, private sector companies have a very important role. Similarly, incentives provided by governments to RWJ research and development studies should be increased. Within the scope of these studies, new technologies for RWJ types should be emphasized. In this context, new technologies for all RWJ alternatives should be followed comprehensively. Increasing research and development for such investments will also make smart grid systems more successful.
Article
Full-text available
The future smart grid would help to benefit both the users and the electricity providing companies from smart pricing techniques. In addition, smart pricing can be used to achieve social objectives and would in turn fluctuate wholesale market into demand side. Collecting abundant information regarding the users electricity consumption pattern is a challenging task for utility providing companies. That is, users may not be willing to expose their indigenous information without any incentive. In this paper an Optimal Energy Consumption Scheduling (OECS) mechanism is proposed to tackle this problem. An agent-based forecasting method is designed, which is capable of predicting energy consumption of each consumer with a lead-time of one hour. This forecasting is exploited to estimate the cost of buying required amount of energy from multiple suppliers. Consequently, based on the estimated required energy and cost, an auction mechanism is proposed to optimize the energy traded between consumers and multiple suppliers within a smart grid. The objectives include increased efficiency and cost reduction of electricity usage by the end users. The results and properties of the proposed OECS mechanism are studied, and it is shown that the auction technique is budget balanced for distribution of electrical energy among consumers from diverse renewable generation resources. Extensive numerical simulations are also conducted to show and prove the beneficial properties of OECS mechanism.
Article
Full-text available
In most demand response (DR) based residential load management systems, shifting a considerable amount of load in low price intervals reduces end user cost, however, it may create rebound peaks and user dissatisfaction. To overcome these problems, this work presents a novel approach to optimizing load demand and storage management in response to dynamic pricing using machine learning and optimization algorithms. Unlike traditional load scheduling mechanisms, the proposed algorithm is based on finding suggested low tariff area using artificial neural network (ANN). Where the historical load demand individualized power consumption profiles of all users and real time pricing (RTP) signal are used as input parameters for a forecasting module for training and validating the network. In a response, the ANN module provides a suggested low tariff area to all users such that the electricity tariff below the low tariff area is market based. While the users are charged high prices on the basis of a proposed load based pricing policy (LBPP) if they violate low tariff area, which is based on RTP and inclining block rate (IBR). However, we first developed the mathematical models of load, pricing and energy storage systems (ESS), which are an integral part of the optimization problem. Then, based on suggested low tariff area, the problem is formulated as a linear programming (LP) optimization problem and is solved by using both deterministic and heuristic algorithms. The proposed mechanism is validated via extensive simulations and results show the effectiveness in terms of minimizing the electricity bill as well as intercepting the creation of minimal-price peaks. Therefore, the proposed energy management scheme is beneficial to both end user and utility company.
Article
Full-text available
In this new era of communication, the advent of the smart grid has revolutionised the power system network. The goal of smart grids is to provide a more reliable, environment‐friendly and economically efficient power system. Demand side management or demand side response is one of the key components of the smart grid which accomplishes the smart grid that would provide intelligence to the traditional grid. Here, a new approach has been proposed for the demand side management, which is based on shifting a load from peak to off‐peak time. The main objective of the work is to reduce the peak hour demand and the utility bill of the consumers. To achieve these objectives, the proposed strategy is modelled as a minimised optimisation problem and it tries to find out the optimal solution. For that, two optimisation algorithms, the first one is particle swarm optimisation algorithm and the second one is grasshopper optimisation algorithm, are proposed and applied in three area loads of the smart grid, i.e. residential, commercial and industrial. The obtained simulation results show a significant reduction in peak hour demand and utility bills.
Article
This paper proposes a model predictive control (MPC) framework-based distributed demand side energy management method (denoted as DMPC) for users and utilities in a smart grid. The users are equipped with renewable energy resources (RESs), energy storage systems (ESSs) and different types of smart loads. With the proposed method, each user individually finds an optimal operation routine in response to the varying electricity prices according to his/her own preference, for example, the power reduction of flexible loads, the start times of shiftable loads, the operating power of schedulable loads, and the charge/discharge routine of the ESSs. Moreover, the method uses a penalty term to avoid large fluctuations of a user's operation routine between two consecutive iteration steps. In addition, unlike traditional energy management methods, which neglect forecast errors, the proposed DMPC method can adapt the operation routine to newly updated data. The DMPC is compared with a frequently used day-ahead programming-based method (denoted as DDA). Simulation results demonstrate the efficiency and flexibility of the DMPC over the DDA method.
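The receding-horizon idea behind MPC — re-plan over the forecast window at every step, but commit only the first decision — can be sketched with a toy battery-charging example (the function and the greedy inner planner are hypothetical simplifications of the paper's optimisation):

```python
def mpc_battery(price_forecasts, horizon, cap, rate):
    """Receding-horizon sketch: at each step, re-plan charging over the
    current (possibly revised) price forecast, then commit only the first
    decision, so the schedule adapts to newly updated data."""
    soc, committed = 0.0, []
    for forecast in price_forecasts:           # one updated forecast per step
        window = forecast[:horizon]
        need = cap - soc
        charge = [0.0] * len(window)
        # inner plan: charge in the cheapest hours of the window until full
        for h in sorted(range(len(window)), key=lambda i: window[i]):
            take = min(rate, need)
            charge[h] = take
            need -= take
            if need <= 0:
                break
        soc += charge[0]                       # commit only the first step
        committed.append(charge[0])
    return soc, committed

forecasts = [
    [0.30, 0.10, 0.20],   # window seen at step 0: wait, cheaper hours ahead
    [0.10, 0.25, 0.40],   # revised forecast at step 1: charge now
    [0.12, 0.50, 0.60],   # revised forecast at step 2: charge now
]
soc, log = mpc_battery(forecasts, horizon=3, cap=2.0, rate=1.0)
print(soc, log)
```

A day-ahead (DDA-style) scheme would instead solve once against the step-0 forecast and never react to the later revisions, which is exactly the difference the abstract highlights.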
Article
Demand-side management (DSM) has emerged as an important smart grid feature that allows utility companies to maintain desirable grid loads. However, the success of DSM is contingent on active customer participation. Indeed, most existing DSM studies are based on game-theoretic models that assume customers will act rationally and will voluntarily participate in DSM. In contrast, in this paper, the impact of customers' subjective behavior on each other's DSM decisions is explicitly accounted for. In particular, a noncooperative game is formulated between grid customers in which each customer can decide on whether or not to participate in DSM. In this game, customers seek to minimize a cost function that reflects their total payment for electricity. Unlike classical game-theoretic DSM studies, which assume that customers are rational in their decision-making, a novel approach is proposed, based on the framework of prospect theory (PT), to explicitly incorporate the impact of customer behavior on DSM decisions. To solve the proposed game under both conventional game theory and PT, a new algorithm based on fictitious play is proposed, under which the game reaches an epsilon-mixed Nash equilibrium. Simulation results assess the impact of customer behavior on demand-side management. In particular, the overall participation level and grid load can depend significantly on the rationality level of the players and their risk aversion tendency.
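As a concrete illustration of fictitious play in a participation game, the toy model below uses made-up payoffs (participating always nets 1; free-riding pays 3 only if the other customer participates), so the mixed equilibrium has each customer participating with probability 1/3 — the empirical play frequencies converge there:

```python
def fictitious_play(rounds=20000):
    """Fictitious play for a 2-player DSM participation game (toy payoffs).
    Each round, every player best-responds to the opponent's empirical
    action frequencies; ties are broken in favour of participating."""
    def counts_to_freq(c):
        tot = c[0] + c[1]
        return c[1] / tot if tot else 0.5       # empirical P(participate)

    counts = [[0, 0], [0, 0]]                   # counts[i][a]: times i played a
    for _ in range(rounds):
        acts = []
        for i in (0, 1):
            p = counts_to_freq(counts[1 - i])   # opponent's participation rate
            u_participate = 1.0                 # benefit 3 minus effort 2
            u_free_ride = 3.0 * p               # benefit only if other acts
            acts.append(1 if u_participate >= u_free_ride else 0)
        for i in (0, 1):
            counts[i][acts[i]] += 1
    return [c[1] / rounds for c in counts]      # empirical participation rates

print(fictitious_play())   # both rates approach 1/3
```

A PT variant would simply reweight `u_participate` and `u_free_ride` through a probability-weighting and value function before the comparison, shifting where the frequencies settle.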
Article
In this paper, we investigate a practical demand side management scenario where selfish consumers compete to minimize their individual energy cost through scheduling their future energy consumption profiles. We adopt an instantaneous load billing scheme to effectively convince the consumers to shift their peak-time consumption and to charge the consumers fairly for their energy consumption. For the considered DSM scenario, an aggregative game is first formulated to model the strategic behaviors of the selfish consumers. By resorting to variational inequality theory, we analyze the conditions for the existence and uniqueness of the Nash equilibrium (NE) of the formulated game. Subsequently, for the scenario where a central unit calculates and sends the real-time aggregated load to all consumers, we develop a one-timescale distributed iterative proximal-point algorithm with provable convergence to the NE of the formulated game. Finally, considering the alternative situation where the central unit does not exist, but the consumers are connected and willing to share their estimated information with others, we present a distributed synchronous agreement-based algorithm and a distributed asynchronous gossip-based algorithm, by which the consumers can achieve the NE of the formulated game through exchanging information with their immediate neighbors.
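The central-unit variant can be illustrated with a damped best-response iteration on a toy aggregative game — a simplified stand-in for the proximal-point scheme, with an assumed cost c_i = α·X·x_i + β·(x_i − d_i)², where X is the aggregate load the central unit broadcasts and d_i is consumer i's nominal demand:

```python
def aggregative_game_ne(demands, alpha=0.1, beta=1.0, tau=0.5, iters=200):
    """Damped best-response iteration for a toy aggregative DSM game.
    Each consumer minimises alpha*X*x_i + beta*(x_i - d_i)**2, where the
    aggregate X = sum(x) is broadcast by a central unit each round; the
    damping factor tau plays the stabilising role of the proximal term."""
    x = list(demands)                     # start from the nominal demands
    for _ in range(iters):
        X = sum(x)
        new = []
        for xi, di in zip(x, demands):
            # best response given the others' load X - xi (from d/dx_i = 0)
            br = (2 * beta * di - alpha * (X - xi)) / (2 * alpha + 2 * beta)
            new.append((1 - tau) * xi + tau * br)
        x = new
    return x

demands = [5.0, 3.0, 4.0]                 # hypothetical nominal loads, kWh
print([round(v, 3) for v in aggregative_game_ne(demands)])
```

At the fixed point every consumer's schedule equals its best response, i.e., a Nash equilibrium of the toy game; each consumer ends up slightly below its nominal demand because the aggregate-dependent price penalises total load.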
Article
In this paper, model predictive control methodologies are developed to address two main issues which arise in electric power distribution systems, namely the congestion of the distribution lines and the balancing problem. Consumer energy demand is divided into an uncontrollable part and a controllable part that can either be stored in energy storage devices to be consumed at later times or shifted in time, in the form of hourly consumption or consumption that maintains a pattern. Demand-response strategies involve consumers actively in the balancing effort and are part of the MPC methodologies, which are formulated as Mixed Integer Quadratic Programming optimization problems involving both continuous and binary variables. Finally, these new developments are tested on the IEEE European Low Voltage Test Feeder, which highlights the performance of the proposed control schemes.
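The binary side of such a mixed-integer formulation — choosing when a fixed consumption pattern starts — can be mimicked by brute force over the allowed start intervals against a quadratic congestion cost (the data and cost function below are hypothetical, not the paper's model):

```python
def best_start(base, pattern, allowed_starts):
    """Pick the start interval of a fixed consumption pattern that minimises
    a quadratic congestion cost sum(load_t**2), a brute-force stand-in for
    the binary start-time variables of an MIQP."""
    def cost(start):
        prof = base[:]
        for k, p in enumerate(pattern):
            prof[start + k] += p          # overlay the pattern at `start`
        return sum(v * v for v in prof)   # quadratic cost penalises peaks
    return min(allowed_starts, key=cost)

base = [5.0, 2.0, 1.0, 1.5, 4.0]          # uncontrollable load per interval
pattern = [2.0, 2.0]                      # two consecutive intervals of 2 kW
start = best_start(base, pattern, allowed_starts=[0, 1, 2, 3])
print(start)                              # pattern lands in the valley
```

A real MIQP solver handles this jointly with the continuous storage and curtailment variables; the quadratic objective is what steers the pattern into the low-load valley here.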
Article
This study presents a simulation of low-carbon electricity supply for Australia, contributing new knowledge by demonstrating the benefits of load shifting in residential buildings for downsizing renewable electricity grids comprising wind, hydro, biomass, and solar resources. The load-shifting potential for the whole of Australia is estimated, based on air-conditioner load data and an insulation model for residential buildings. Load shifting is applied to enable transferring residential air-conditioner load from peak to off-peak periods, assuming that air-conditioners can be turned on a few hours ahead of need, during periods where demand is low and renewable resource availability is high, and turned off during periods of high demand and low resource availability. Thus, load shifting can effectively reduce installed capacity requirements in renewable electricity grids. With 1 h of load shifting of residential air-conditioners, Australian electricity demand can be met at the current reliability standards by 130 GW of installed capacity, at a cost around 12.5 ¢/kWh and a capacity factor of 32%. The installed capacity can be further reduced by increasing the number of hours that loads can be shifted. The findings suggest that the application of load shifting in residential buildings can play a significant role for power networks with high renewable energy penetration.
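As a sanity check on the quoted figures, the capacity factor ties installed capacity to delivered energy (average output divided by installed capacity): 130 GW at a 32% capacity factor corresponds to roughly 364 TWh per year:

```python
def annual_energy_twh(installed_gw, capacity_factor, hours_per_year=8760):
    """Energy delivered per year from installed capacity and capacity factor
    (capacity factor = average output / installed capacity)."""
    return installed_gw * capacity_factor * hours_per_year / 1000.0  # GWh -> TWh

# Figures quoted in the abstract: 130 GW installed at a 32% capacity factor.
print(round(annual_energy_twh(130, 0.32), 1))   # -> 364.4 TWh/year
```

This is the gross generation implied by the simulation's sizing; reducing the required installed capacity via longer load-shifting windows raises the effective capacity factor for the same served demand.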
Article
This survey provides a structured and comprehensive overview of research on security and privacy in computer and communication networks that use game-theoretic approaches. We present a selected set of works to highlight the application of game theory in addressing different forms of security and privacy problems in computer networks and mobile applications. We organize the presented works in six main categories: security of the physical and MAC layers, security of self-organizing networks, intrusion detection systems, anonymity and privacy, economics of network security, and cryptography. In each category, we identify security problems, players, and game models. We summarize the main results of selected works, such as equilibrium analysis and security mechanism designs. In addition, we provide a discussion on the advantages, drawbacks, and future direction of using game theory in this field. In this survey, our goal is to instill in the reader an enhanced understanding of different research approaches in applying game-theoretic methods to network security. This survey can also help researchers from various fields develop game-theoretic solutions to current and emerging security problems in computer networking.