Smart Grid Optimization by Deep Reinforcement Learning over
Discrete and Continuous Action Space
Tomohiro Hirata1, Dinesh Bahadur Malla2, Katsuyoshi Sakamoto3, Koichi Yamaguchi3, Yoshitaka Okada1, Tomah Sogabe1,2,3
1Research Center for Advanced Science and Technology, The University of Tokyo, 153-8904, Japan
2Technology Solution Group, Grid Inc., Kita Aoyama, Minato-ku, Tokyo, 107-0061, Japan
3Info-Powered Energy System Research Center, Department of Engineering Science,
The University of Electro-Communications, Chofu, Tokyo, 182-8585, Japan
Abstract- In this work, we have applied two deep reinforcement learning (DRL) algorithms designed for discrete and continuous action spaces, respectively. These algorithms were embedded in a rigorous physical model using Simscape Power Systems™ (MATLAB/Simulink™ environment) for smart grid optimization. Benchmark tests were conducted by comparing the results from MILP (mixed-integer linear programming) and DRL. The results show that the agent successfully captured the energy demand and supply features in the training data and learned to choose behavior that maximizes its profit.
Keywords—deep reinforcement learning, smart grid,
optimization
I INTRODUCTION
Energy grid systems containing renewable energy resources (RES) such as photovoltaic energy, wind power and hydropower have been considered as an alternative power supply configuration. They are renovating conventional grid systems, aiming at reducing CO2 emissions while mitigating global warming. A decentralized energy system is also more robust and resilient against unexpected natural disasters, which frequently occur in countries such as Japan. However, due to the intermittent nature of RES, a mismatch between electricity supply and demand is often encountered, causing instability and limiting power output. As an effective approach to these challenges, the smart grid has been proposed and has shown great technological innovation towards an intelligent, robust and functional power grid [1][2].
Smart grids involve energy transmission among different sub-smart-grid utilities, which ultimately contributes to an efficient energy management ecosystem of energy storage, energy supply and balanced load demand over a large-scale grid configuration. The construction of an efficient smart grid system is in principle a mathematical control optimization problem. A wide range of methods has been proposed to tackle this challenge, including linear and dynamic programming as well as heuristic methods such as PSO, GA, and game or fuzzy theory [3]. In recent years, studies on energy optimization in smart grids have gradually shifted to agent-based machine learning methods represented by state-of-the-art deep learning and deep reinforcement learning. In particular, deep neural network based reinforcement learning methods are emerging and gaining popularity for smart grid applications [4][5].
In this work, we focus on the following issues and tasks:
(1) Different from previous reports, we have developed our deep reinforcement learning algorithms embedded in a rigorous physical model using Simscape Power Systems™ for smart power grid optimization. All the parameters used in the smart grid represent realistic electric circuits, so detailed fluctuations of voltage, frequency and phase can be fully revealed; such information is not available in previous reports, where the constructed smart grid systems could not output it.
(2) For RL, a model-free off-policy deep Q-learning algorithm was developed using MATLAB™. Actor-critic and DQN are suited to addressing a continuous state space and a discrete action space, respectively. Here we focus on the discrete action control designed for switching the grid power supply/sell and battery charge/discharge.
(3) For continuous state and continuous action spaces, we have developed our own H-DDPG (hybrid deep deterministic policy gradient) algorithm, which hybridizes the latest deep deterministic policy gradient with the deep actor-critic stochastic policy gradient.
II ALGORITHM AND MODEL
Fig. 1: Sketch of smart grid optimization using a deep neural network based reinforcement learning algorithm.
(i) Deep Q-Learning (DQN): A general model describing the main framework is given in Fig. 1. In this sketch, we adopt the deep Q-learning algorithm as an example to illustrate the learning principle: a physical model of the smart grid simulation environment was constructed based on Simscape Power Systems™. The state space is always continuous, while the action space is set to be discrete for off-policy Q-learning and continuous for the deep policy gradient algorithm. A detailed operation flow is given below in the form of pseudo-simulation code.
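The following is a minimal Python sketch of this operation flow, not the authors' actual code: the implementation in the paper is written in MATLAB/Simulink, and the `SmartGridEnv`-style `env` wrapper, its `reset`/`step` interface and all hyperparameters here are illustrative assumptions.

```python
# Sketch of the off-policy DQN loop over the discrete actions described in
# the text (grid buy/sell, battery charge/discharge). `env` is a hypothetical
# gym-style wrapper around the Simscape Power Systems model.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """State-action value approximator: four tanh hidden layers (Sec. II-iii)."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        layers, dim = [], state_dim
        for _ in range(4):
            layers += [nn.Linear(dim, hidden), nn.Tanh()]
            dim = hidden
        layers.append(nn.Linear(dim, n_actions))
        self.net = nn.Sequential(*layers)

    def forward(self, s):
        return self.net(s)

def train_dqn(env, episodes=500, gamma=0.99, eps=1.0, eps_decay=0.995, batch=32):
    q = QNetwork(env.state_dim, env.n_actions)
    opt = torch.optim.Adam(q.parameters(), lr=1e-3)
    buffer = deque(maxlen=10_000)                      # experience replay memory
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration over the discrete action set
            if random.random() < eps:
                a = random.randrange(env.n_actions)
            else:
                a = int(q(torch.as_tensor(s, dtype=torch.float32)).argmax())
            s2, r, done = env.step(a)                  # one step of the physical model
            buffer.append((s, a, r, s2, float(done)))
            s = s2
            if len(buffer) >= batch:                   # off-policy TD update
                S, A, R, S2, D = map(np.array, zip(*random.sample(buffer, batch)))
                S = torch.as_tensor(S, dtype=torch.float32)
                S2 = torch.as_tensor(S2, dtype=torch.float32)
                target = torch.as_tensor(R, dtype=torch.float32) + \
                         gamma * (1 - torch.as_tensor(D, dtype=torch.float32)) * \
                         q(S2).max(1).values.detach()
                pred = q(S).gather(1, torch.as_tensor(A).unsqueeze(1)).squeeze(1)
                loss = nn.functional.mse_loss(pred, target)
                opt.zero_grad(); loss.backward(); opt.step()
        eps *= eps_decay                               # anneal exploration over episodes
    return q
```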
(ii) Hybrid Deep Deterministic Policy Gradient (H-DDPG): A deterministic policy is in theory efficient at the late stage of simulation, because by then the policy distribution has low variance and is nearly deterministic. The policy gradient is usually formulated as follows, where $J$ is the policy objective function, $\theta$ is the function approximation parameter (in a neural network, the weights $w$), $s$ and $a$ correspond to the state and action, $Q^{\pi}(s,a)$ is the state-action value function under the policy, and $\pi(a|s;\theta)$ is the policy distribution function:

$$\nabla_{\theta} J(\theta) = \mathbb{E}\left[ Q^{\pi}(s,a)\,\nabla_{\theta} \log \pi(a|s;\theta) \right] \qquad (1)$$
Silver et al. [6] have shown that if the policy is treated as deterministic, the above equation can be reformed as:

$$\nabla_{\theta} J = \mathbb{E}\left[ \nabla_{\theta}\, Q(s,a|\theta^{Q}) \right] \qquad (2)$$

If the action $a$ is given by the deterministic policy function

$$a = \mu(s|\theta^{\mu}) \qquad (3)$$

then, using the chain rule, $\nabla_{\theta} Q(s,a)$ can be further expanded as

$$\nabla_{a} Q(s,a|\theta^{Q})\, \nabla_{\theta^{\mu}} \mu(s|\theta^{\mu}) \qquad (4)$$

and the policy parameter $\theta^{\mu}$ is updated by the usual gradient step (ascent on $J$):

$$\theta^{\mu} \leftarrow \theta^{\mu} + \alpha\, \nabla_{a} Q(s,a|\theta^{Q})\, \nabla_{\theta^{\mu}} \mu(s|\theta^{\mu}) \qquad (5)$$
However, implementing the deterministic policy at the early simulation stage inevitably causes high variance and slow convergence, because the policy is still far from the optimal one, so the policy distribution is fairly stochastic rather than deterministic and carries high bias. The hybridized algorithm is designed so that the advantages of both the deterministic and the stochastic policy are assimilated; a stable learning profile with fast convergence can thus be achieved.
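The paper does not specify the exact hybridization rule, so the sketch below is one plausible reading: a single actor loss whose gradient interpolates between the stochastic policy gradient of Eq. (1) and the deterministic gradient of Eqs. (2)-(5), with a blend weight `beta` annealed from 1 toward 0 as training proceeds. The `actor`/`critic` call signatures are assumptions.

```python
# Hedged sketch of a hybrid stochastic/deterministic actor update.
import torch

def hybrid_actor_loss(actor, critic, states, beta, noise_std=0.1):
    """Loss whose gradient mixes the stochastic (Eq. 1) and deterministic
    (Eqs. 2-5) policy gradients; anneal beta from 1 to 0 during training."""
    mu = actor(states)                                 # a = mu(s | theta_mu), Eq. (3)

    # Deterministic term: maximize Q(s, mu(s)); autograd applies the chain
    # rule grad_a Q * grad_theta mu of Eq. (4) automatically.
    det_loss = -critic(states, mu).mean()

    # Stochastic term: REINFORCE-style surrogate for Eq. (1) with a Gaussian
    # policy centered on mu; Q acts as the (detached) weight.
    dist = torch.distributions.Normal(mu, noise_std)
    a = dist.sample()
    sto_loss = -(dist.log_prob(a).sum(-1) * critic(states, a).detach().squeeze(-1)).mean()

    return beta * sto_loss + (1.0 - beta) * det_loss
```

Minimizing this loss with a standard optimizer performs the gradient-ascent update of Eq. (5) for the deterministic component.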
(iii) Neural Network Model: In this work, we use a multilayer neural network with four hidden layers to approximate the state-action value function. The activation function is fixed to the hyperbolic tangent. To enhance exploration, an epsilon-greedy algorithm is utilized in the case of DQN with discrete actions, while the re-parameterization trick is used with H-DDPG for the continuous action space.
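For the continuous case, the re-parameterization mentioned above can be sketched as follows; the Gaussian form and the layer width are assumptions for illustration, since the text fixes only the depth (four hidden layers) and the tanh activation.

```python
# Re-parameterized Gaussian actor: a = mu(s) + sigma * eps with eps ~ N(0, 1),
# so the sampled action stays differentiable w.r.t. the network parameters.
import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        blocks, dim = [], state_dim
        for _ in range(4):                    # four tanh hidden layers (as in the text)
            blocks += [nn.Linear(dim, hidden), nn.Tanh()]
            dim = hidden
        self.body = nn.Sequential(*blocks)
        self.mu = nn.Linear(hidden, action_dim)
        self.log_sigma = nn.Parameter(torch.zeros(action_dim))

    def forward(self, s):
        mu = self.mu(self.body(s))
        eps = torch.randn_like(mu)            # noise is sampled outside the graph
        return mu + self.log_sigma.exp() * eps
```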
(iv) MILP and DRL Model: In this work we performed a benchmark test and compared the results from the MILP and DRL algorithms. Both MILP and DRL were run on the same input data, including the solar power generation and electric consumption profiles as well as the purchase/sell prices for electricity. The soft constraints for battery charge/discharge were also set identically for both methods. The inter-conversion principle between the two methods is given in Fig. 2. For MILP we divide the constraints into two parts: soft constraints and hard constraints. The soft constraints are turned into rewards for the neural network learning process, whereas the hard constraints are difficult to learn, so we use them to terminate and restart the learning process, as sketched below.
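A minimal sketch of this conversion, with hypothetical names and limits (battery state of charge `soc`, a preferred soft band, and hard physical bounds); the actual constraint sets and penalty values used in the paper are not specified here.

```python
# Soft constraint -> reward penalty; hard constraint -> episode termination,
# after which the learning process is restarted (Fig. 2).
def shape_reward_and_done(profit, soc, soc_min=0.0, soc_max=1.0,
                          soft_low=0.2, soft_high=0.9, penalty=10.0):
    """Combine the trading profit with constraint handling for one step."""
    reward = profit
    if soc < soft_low or soc > soft_high:      # soft constraint violated
        reward -= penalty
    done = soc <= soc_min or soc >= soc_max    # hard constraint violated
    return reward, done
```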
III RESULTS
Here we present one representative simulation result, obtained by employing the DQN algorithm to optimize over a discrete action space. We mainly deploy the DRL (DQN) agent to maximize earnings, for comparison with MILP optimization, and we also use DRL to maintain the balance among different power sources.
There are many optimization methods for different types of problems; among them, MILP is a popular tool in the MATLAB environment, so we compared our results against the MATLAB-based energy hub optimization to assess their reliability. In Fig. 3, the upper graphs show the buying and selling schedule optimized by the MATLAB-based MILP. The optimized one-day profit, combining revenue from selling PV-produced power and the reduced cost of electricity bought from the power producer, is 74 yen. The lower two graphs show the results of our DRL agent; the optimal result obtained from DRL is 78 yen per day on the same basis. Comparing these results, the DQN agent is good enough to solve this power system optimization problem. Moreover, with a reinforcement learning agent we obtain not only an optimized result but also different options for solving the problem, options that are not available from other optimization tools.
The optimization tools in MATLAB deliver quite accurate results but fail when applied to large-scale systems: the MILP results calculated over 10 days of input data deviated greatly from the theoretical solution. In contrast, this is where DRL has its greatest advantage over MILP. The machine learning based DRL method learns the features of the system from big data and generalizes them using a neural network. The agent successfully learned to discharge its battery during the daytime instead of buying electricity from the grid, and also learned to purchase at low-price periods. We also compared the DRL agent and MILP optimization results on different PV production patterns and different selling/buying rates; we used the same DRL reward system for all comparison runs, and most of the time DRL obtained better results than the optimization tools.
Fig. 3. Buying and selling schedules as optimized by the MILP tool (upper graphs) and as learned by the agent trained with the DQN algorithm (lower graphs).
Fig. 2. Conversion between the MILP and DRL implementations.
The DRL agent is also able to maintain the balance among power sources for demand stability. In such problems there are many candidate power sources to fulfill the power demand, and a reinforcement learning agent can help maintain the supply-demand balance. In the MATLAB environment we created virtual power resources as well as power demand; with the help of the power source and demand data, the DRL agent was able to keep power demand and supply in balance.
Building on the above work and results, we are constructing a large-scale virtual power network comprising demand, grid power supply and power purchasers, together with many other electricity producers (PV, turbines, wind farms, CHP, etc.) and heat producers (gas boilers, heat tanks, etc.). This plan requires a good DRL algorithm as well as multiple agents. A simple diagram of the planned concept is shown in Fig. 5; more detailed design principles and preliminary results from trial experiments will be given at the conference.
IV CONCLUSION
We present here a deep reinforcement learning method applied to smart grid optimization. From the preliminary simulation results, the agent was able to capture the features involved in the balance of load demand, PV power surplus, battery charge/discharge and grid integration. The agent successfully learned how to tune its action profile to maximize the reward function during training. More detailed results regarding the comparison between DQN and H-DDPG, and the key role played by the reward function, will be given at the conference.
REFERENCES
[1] R. H. Khan and J. Y. Khan, "A comprehensive review of the application characteristics and traffic requirements of a smart grid communications network," Computer Networks, vol. 57, no. 3, pp. 825-845, 2013.
[2] H. E. Brown, S. Suryanarayanan, and G. T. Heydt, "Some characteristics of emerging distribution systems considering the smart grid initiative," The Electricity Journal, vol. 23, no. 5, pp. 64-75, 2010.
[3] M. R. Alam, M. St-Hilaire, and T. Kunz, "Computational methods for residential energy cost optimization in smart grids: A survey," ACM Comput. Surv., vol. 49, pp. 22-34, Apr. 2016.
[4] E. Mocanu, P. H. Nguyen, M. Gibescu, and W. L. Kling, "Deep learning for estimating building energy consumption," Sustainable Energy, Grids and Networks, vol. 6, pp. 91-99, 2016.
[5] V. François-Lavet, Q. Gemine, D. Ernst, and R. Fonteneau, "Towards the minimization of the levelized energy costs of microgrids using both long-term and short-term storage devices," Smart Grid: Networking, Data Management, and Business Models, pp. 295-319, 2016.
[6] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, "Deterministic policy gradient algorithms," ICML, 2014.
Fig. 4. The model for power balance in MATLAB (components: battery, PV, demand, grid; actions: sell, buy).
Fig. 5. Sketch of the large-scale virtual power plant using DRL. A market hub agent trades with several technology agents (heat pump, gas boiler, CHP units, PV, solar thermal, building, batteries, hot water tanks) through an energy utility hub handling energy conversion and storage. Each agent carries its own actor and critic neural networks; the market hub agent issues electricity/heat request quantities and prices (ERQ, ERP, HRQ, HRP), the technology agents answer with offer quantities and prices (EOQ, EOP, HOQ, HOP), and each is rewarded on Profit = Income − Expenditure, with Income = Eproduced × Eoffer price + Hproduced × Hoffer price and Expenditure = Eproduced × Ecost + Hproduced × Hcost.