All content in this area was uploaded by Mohammad Ashrafuzzaman on Sep 03, 2018
Detecting Stealthy False Data Injection Attacks in
Power Grids Using Deep Learning
Mohammad Ashrafuzzaman∗, Yacine Chakhchoukh†‡ , Ananth A. Jillepalli∗‡ ,
Predrag T. Tosic∗, Daniel Conte de Leon∗‡, Frederick T. Sheldon∗and Brian K. Johnson†‡
‡Center for Secure and Dependable Systems
∗Department of Computer Science
†Department of Electrical and Computer Engineering
University of Idaho, Moscow, ID, USA
Emails: {m.ashrafuzzaman, yacinec, ajillepalli, predrag.tosic, dcontedeleon, sheldon, b.k.johnson}@ieee.org
Abstract—The electric power grid, as a critical national in-
frastructure, is under constant threat from cyber-attacks. State
estimation (SE) is at the foundation of a series of critical
control processes in a power transmission system. A false data
injection (FDI) attack against SE can disrupt these control
processes, crippling a power system and wreaking havoc in a
region. With knowledge of the system topology, a cyber-attacker
can formulate and execute stealthy FDI attacks that are very
difficult to detect. Statistical and, more recently, machine learning
approaches have been undertaken to detect FDI attacks on SE
of the power grid. In this paper, we propose a Deep Learning
(DL) based method to accurately detect stealthy FDI attacks
on the SE of the power grid. We compare the performance of the
DL method with three popular machine learning algorithms:
gradient boosting machines (GBM), generalized linear
models (GLM) and distributed random forests (DRF). All four
algorithms analyze a dataset simulating the IEEE 14-bus system.
The results demonstrate that these algorithms perform well in
accurately and precisely detecting stealthy FDI attacks on the
smart grid, with the DL-based approach showing best results.
Index Terms—False data injection attack, power grid, state
estimation, machine learning, deep learning.
I. INTRODUCTION
The power grid, including generators, transmission systems,
distribution systems, and other numerous devices, is one of
the largest and most complex critical infrastructures. The
power grid and its various components have, for decades,
been undergoing evolution. Significant changes came when the
components of the grid and the supervisory control and data
acquisition (SCADA) systems that monitor and control these
components were updated with networking capabilities. This
essentially transformed power grids into cyber-physical sys-
tems (CPS), with the downside of inheriting issues associated
with being “cyber” (e.g., vulnerable to exploitation by cyber-
attackers). Nonetheless, the industry has attempted to “air-gap”
operational technology (OT) from information technology (IT)
networks toward protecting valuable CPS assets critical to
stable operations. Unfortunately, many OT networks are still
not fully insulated from the IT networks [1] and are vulnerable
to both internal and external threats [2]. As a result the power
grid has been subjected to a new set of exploits on top
of the increased complexity of internetworking of SCADA
systems [3], [4], [5], [6].
We illustrate the vulnerabilities of the power grid with a few
well-known incidents. In January 2008, the Central Intelligence
Agency (CIA) reported that a number of non-US cities
had suffered cyber-attacks on their distribution systems, causing
widespread blackouts [7]. In January 2015, 140 million
people in Pakistan were plunged into darkness because of
a militant attack [8] (not a cyber-attack). In December 2015,
three power distribution companies were taken down in a
coordinated cyber-attack, resulting in a power outage for about
225,000 Ukrainians [9].
A. Research Problem
One of the many ways a smart power grid can be attacked
over the IT network is through false data injection (FDI) attacks,
which Liu et al. [10] described as a new class of cyber-attacks against
state estimation in the power grid. The attack vectors
may include compromised substation meters, unauthorized
access or malicious activity by authorized insiders in the
control system, wireless communication intrusion, and external
penetration. State estimation (SE) is a fundamental tool in
the energy management system (EMS) at the control center.
The SE computes voltage magnitudes and phase angles at all
of the different buses of the power system [11] after collecting
measurements that are communicated to the control center
from remote terminal units (RTUs) equipped with SCADA
units.
In generalized false data injection attacks, an adversary
attempts to introduce measurements into the system with malicious
intent. If the incorrect measurement data affect the
outcome of state estimation, the resulting misinformation
can reduce the control center operators' level of situational
awareness [12]. This can mislead the operators into
taking wrong corrective actions in response to spoofed data.
A contaminated SE may disrupt the real-time operation of
the grid by impacting tools such as contingency analysis,
unit commitment, optimal power flow and computation of
locational marginal prices (LMPs) for electricity markets.
Cyber-attacks that impact the SE results have been presented in
several publications [10], [13], [14], [15], [16], [17]. As shown
by Xiang et al., false data injection is an important element
of a coordinated attack on the power grid and represents an
important class of attack on cyber-physical systems [18].
B. Proposed Solution
In this paper, we formulate detection of FDI attacks as a
machine learning problem of a binary classification variety. We
propose a methodology using deep neural networks to solve
this binary classification problem as reliably as possible. The
objective is to minimize the number of both false
positives and false negatives. Our target binary classification
can be summarized as follows: given a system “snapshot”
(i.e., the measurement vector at a given discrete time step),
determine whether this measurement vector has been modified by
an FDI attack or not. We define the vector as compromised
if at least one entry in it has been modified, that is, it
contains an injected component corresponding to the attack.
In the standard state estimation models used by the power
engineering community, such injected data value is assumed to
be added, component-wise, to the legitimate signal and to the
Gaussian noise values. We propose to use deep learning [19]
to solve this classification problem.
C. Contributions
In this paper, we generate datasets from a simulated IEEE
14-bus system and run feed-forward artificial neural networks
with different structures and configurations to demonstrate the
effectiveness of deep-learning-based models in accurately and
precisely detecting stealthy FDI attacks on the state estimation
of a power grid. We identify the structure and configuration
of the model that performs the best. In addition, we run three
other machine learning algorithms, namely gradient boosting
machines (GBM), generalized linear models (GLM) and dis-
tributed random forests classifier (DRF) [20] on the dataset,
and compare results from the deep-learning-based models with
those from the three methods. The results show that the deep
learning-based method outperforms the other three methods.
D. Outline of this Paper
The remainder of the paper is organized as follows. Section II
summarizes the previous work on detecting FDI attacks on
state estimation using machine learning. Section III describes
the system model used in this paper and the mathematical
formulation for static SE in the presence of cyber-attacks,
particularly stealthy FDI attacks. It also identifies the attack
model assumed for the current work. A brief overview of
deep learning is presented in Section IV. The experimental
setup is presented in Section V. We discuss the results of our
experiments in Section VI. Conclusions and future work are
presented in Section VII, followed by acknowledgment and
references.
II. RELATED WORK
Traditional statistical approaches, for example [21], [22],
have been proposed to detect FDI attacks on the state estima-
tion in power systems. However, attempts to explore machine
learning techniques are still quite limited.
Keshk et al. [23] used an expectation maximization clus-
tering mechanism to detect if any data in a SCADA system
have been altered by an adversary. The primary goal of the
work was to detect privacy violation in the system.
Tan et al. [24] studied the impact of FDI attacks on automatic
generation control systems used in power grids to maintain the
grid frequency at its normal operating value. They developed a
threshold-based regression model using an attack identification
approach that estimates the attacker’s write access.
Esmalifalak et al. [25] proposed two machine-learning based
models for FDI attack detection in smart grid systems. The first
model utilizes a multivariate Gaussian semi-supervised learning
algorithm and the second model utilizes a measurement-based
deviation analysis algorithm, which requires no learning.
Both models use principal component analysis to control the
dimensionality of complex simulations. This approach falls short
in identifying anomalies in transmission networks.
Chakhchoukh et al. [26] proposed a detection method using
a newly developed machine learning technique known as the
density ratio estimation (DRE). The DRE [27] is an effective
countermeasure against cyber-attacks, which does not require
supervision or an attack model.
Wang et al. [28] proposed a data-centric paradigm to detect
FDI attacks in large-scale smart grids. In this model, the
Margin Setting Algorithm (MSA) is used to process massive
amounts of domain-specific data from a large-scale smart grid.
This model falls short at identifying FDI attacks in real time
and in using data emerging from a sophisticated network of
Phasor Measurement Units (PMUs).
Hao et al. [29] proposed a sparse principal-component
analysis-approximation based model to identify stealthy FDI
attacks in smart grid systems. In this model, identification of
real measurements with the availability of sparse data sets is
achieved by using recovery functions. The recovery function’s
accuracy is directly proportional to the sparsity of available
data. As such, this model falls short at identifying FDI attacks
when data is too sparse to produce reliably accurate recovery
functions.
He et al. [30] proposed a deep-learning based model for FDI
attack detection in smart meters data, as opposed to power grid
data, to prevent electricity theft. The model utilizes a State
Vector Estimator (SVE) and a Deep Learning Based Identifi-
cation (DLBI) algorithm. This model compares the historical
measurement data and recognizes a pattern to identify FDI
attacks in real-time. Currently, for this model to work, data
from a large number of sensing units is required.
Wei and Mendis [31] proposed a strategy using Conditional
Deep Belief Network (CDBN) [32] to identify alteration in
data that may destabilize the wide area monitoring systems
(WAMS) in the power grid. They used PMU data for their
investigation.
Wang et al. [33] proposed a deep learning-based interval
state estimation technique to detect anomalies caused by sparse
cyber-attacks in smart grids. They used a 6-layer Stacked
Auto-Encoder (SAE) [32] as an advanced feature extractor
and then a classical predictor, e.g., a logistic regressor, as the
last layer to detect anomalies in electric load forecasting.
As the literature reviewed above shows, hardly any
work has been done on using deep learning techniques to detect
stealthy FDI attacks on the measurement data in the SCADA
system that impact the state estimation in AC power grids.
III. SYSTEM MODEL
This section introduces the mathematical formulation for
stealthy false data injection attacks on static state estimation
in power systems [10], [11].
In a power system, the static SE is run after collecting
measurements from the SCADA units at chosen time snapshots
and communicating those to the control center every 2-5
s. These measurements are power flows and injections, as
well as voltage magnitudes. The AC static SE estimates the
(state) vector $x \in \mathbb{R}^n$ that contains phase angles and voltage
magnitude at the different buses, where $n = 2k - 1$ and $k$ is
the number of buses in the system. The slack bus phase angle
is assumed to be the reference and is fixed to 0. The state
vector obeys the following nonlinear equation:
$$z = h(x) + e \tag{1}$$
The nonlinear vector function $h(\cdot)$ is computed from the grid
topology and the parameters for transmission lines, transformers
and other devices. The error vector $e \in \mathbb{R}^m$ is
assumed Gaussian with a covariance matrix $R$, where $m$ is
the number of measurements. The vector of measurements
$z \in \mathbb{R}^m$ contains communicated readings from SCADA units.
The AC SE is executed using an iterative algorithm based
on the weighted least squares (WLS) [11] to compute and
estimate the vector $x$, i.e.,
$$\hat{x}_k = \hat{x}_{k-1} + H_k^{\sharp}\,(z_k - h(x_{k-1})) \tag{2}$$
where $H_k^{\sharp} = (H_k^{\top} R^{-1} H_k)^{-1} H_k^{\top} R^{-1}$ and the matrix $H_k$ is the
Jacobian of $h$ with respect to $x$ at step $k$ (here $k$ indexes the
iteration, not the number of buses). The WLS algorithm
is optimal under Gaussian noise.
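The iteration in (2) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the two-state measurement function `h`, its `jacobian`, the covariance `R` and the initial guess are hypothetical toy choices, not a power-flow model and not the authors' implementation.

```python
import numpy as np

def wls_state_estimation(z, h, jacobian, R, x0, delta=1e-8, max_iter=50):
    """Iterate x_k = x_{k-1} + H_k^# (z - h(x_{k-1})) until the update is below delta."""
    R_inv = np.linalg.inv(R)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        H = jacobian(x)                     # Jacobian of h at the current estimate
        dz = z - h(x)                       # residual vector for this step
        gain = H.T @ R_inv @ H              # H^T R^-1 H
        dx = np.linalg.solve(gain, H.T @ R_inv @ dz)
        x = x + dx
        if np.linalg.norm(dx) < delta:      # convergence test on the state update
            break
    return x

# Hypothetical toy model with two states and four measurements.
def h(x):
    return np.array([x[0], x[1], x[0] * x[1], x[0] ** 2 + x[1]])

def jacobian(x):
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [x[1], x[0]],
                     [2.0 * x[0], 1.0]])

x_true = np.array([1.0, 0.5])
z = h(x_true)                               # noise-free measurements keep the demo deterministic
R = np.diag([1e-4, 1e-4, 4e-4, 4e-4])       # assumed measurement error covariance
x_hat = wls_state_estimation(z, h, jacobian, R, x0=[1.0, 0.0])
```

Solving the normal equations with `np.linalg.solve` avoids forming the pseudo-inverse explicitly, which is the usual numerical choice for this kind of update.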
Let $\Delta z_k = z_k - h(x_{k-1})$ be the $k$th residual vector. After
the convergence of the algorithm, i.e., once $\|\hat{x}_k - \hat{x}_{k-1}\| < \delta$
for some chosen threshold $\delta > 0$, the obtained residuals
are analyzed by practitioners to detect possible abnormal
measurements by checking for residuals that do not obey
the Gaussian assumption. These abnormal or bad data could
be due to natural failures such as sensor or communication
misbehavior, or to FDI attacks. The most practical bad data
detection rules are known as the chi-square ($\chi^2$) test and the
“$3\sigma$” rejection rule [11]. The iterative algorithm is equivalent
to an estimation that is run iteratively after linearizing the
regression in each step. The AC SE problem can also be
reformulated as:
$$z = Hx + e \tag{3}$$
If contamination occurs due to an FDI attack, then the measurement
vector $z$ received at the control center is replaced by
$z_a$ with $z_a = z + Hc$. The obtained new state is biased by
the contamination vector $c$, i.e., $\hat{x}_a = \hat{x} + c$. The conventional
methods detect such contamination by analyzing the residual,
i.e., the difference $z - H\hat{x}$ between the measurement vector $z$ and
the value calculated from the state estimation.
In the largest normalized residual test, if the largest absolute
value of the elements in the normalized residual is greater than a
pre-defined threshold $\alpha > 0$ ($\alpha$ is generally chosen to be 3),
the corresponding measurement is identified as bad data and
reported to system operators. The measurement is removed
and the estimation is re-executed.
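A minimal sketch of the largest normalized residual test for the linear model (3) follows. The paper does not spell out the normalization, so the residual covariance $\Omega = R - H(H^{\top}R^{-1}H)^{-1}H^{\top}$ used here is an assumption taken from the standard textbook formulation; the matrix `H`, the state, the noise vector and the gross error of 0.5 are hypothetical values chosen for illustration.

```python
import numpy as np

def wls_estimate(z, H, R):
    """Linear WLS estimate x_hat = (H^T R^-1 H)^-1 H^T R^-1 z."""
    R_inv = np.linalg.inv(R)
    return np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)

def largest_normalized_residual(z, H, R):
    """Return (index, value) of the largest normalized residual."""
    R_inv = np.linalg.inv(R)
    r = z - H @ wls_estimate(z, H, R)
    # Assumed residual covariance: Omega = R - H (H^T R^-1 H)^-1 H^T
    omega = R - H @ np.linalg.inv(H.T @ R_inv @ H) @ H.T
    r_norm = np.abs(r) / np.sqrt(np.diag(omega))
    i = int(np.argmax(r_norm))
    return i, float(r_norm[i])

ALPHA = 3.0                                   # detection threshold from the text
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])                   # hypothetical measurement matrix
R = (0.01 ** 2) * np.eye(4)                   # assumed noise covariance
x = np.array([1.0, 2.0])
e = np.array([0.005, -0.004, 0.006, -0.003])  # fixed small noise for a repeatable demo
z_clean = H @ x + e

z_bad = z_clean.copy()
z_bad[2] += 0.5                               # gross error on measurement 2

clean_idx, clean_val = largest_normalized_residual(z_clean, H, R)
bad_idx, bad_val = largest_normalized_residual(z_bad, H, R)
```

With these values the clean snapshot stays below the threshold, while the corrupted snapshot exceeds it and the largest normalized residual points at the corrupted measurement.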
In the case of FDI attacks, if the injected data are large
enough that the conventional residual tests can detect them,
then these are called non-stealthy FDI attacks. In the non-stealthy
case, the measurement matrix $H$ is not known to the
attackers, and they simply generate random attack vectors and
manipulate the meter readings.
On the other hand, if the attackers are familiar with the
power system topology information or know the measurement
matrix $H$, they can carefully craft the injected data
so that the residual $r_a$ of the attacked
measurement vector $z_a$ remains the same as the residual $r$ of
the original measurement vector $z$:
$$r_a = z_a - H\hat{x}_a = z + Hc - H(\hat{x} + c) = z - H\hat{x} = r \tag{4}$$
These are called stealthy FDI attacks as they cannot be
detected using the conventional methods based on residual
analysis [10].
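The residual invariance in (4) can be checked numerically. The sketch below is a toy illustration: the matrix `H`, the state `x`, the noise vector and the contamination vector `c` are hypothetical values, and the linear model (3) is assumed.

```python
import numpy as np

def wls_estimate(z, H, R):
    """Linear WLS estimate x_hat = (H^T R^-1 H)^-1 H^T R^-1 z."""
    R_inv = np.linalg.inv(R)
    return np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)

H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])                   # hypothetical measurement matrix
R = (0.01 ** 2) * np.eye(4)                   # assumed noise covariance
x = np.array([1.0, 2.0])
e = np.array([0.005, -0.004, 0.006, -0.003])  # fixed small noise
z = H @ x + e

# Stealthy attack: the injected vector lies in the column space of H.
c = np.array([0.3, -0.2])
z_a = z + H @ c

x_hat = wls_estimate(z, H, R)
x_hat_a = wls_estimate(z_a, H, R)
r = z - H @ x_hat
r_a = z_a - H @ x_hat_a

# Non-stealthy attack for contrast: an arbitrary change to one meter reading.
z_b = z + np.array([0.0, 0.0, 0.5, 0.0])
r_b = z_b - H @ wls_estimate(z_b, H, R)
```

Here `x_hat_a` is shifted by exactly `c` while `r_a` matches `r`, so any test that looks only at residuals sees nothing unusual. The arbitrary injection `z_b` does change the residual.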
A. Attack Model
False data injection attacks can be carried out on different parts
of the power grid, e.g., transmission systems, distribution
systems, advanced metering infrastructure, etc. [34]. In this
work, we consider FDI attacks on the static state estimation
in the AC power transmission system. This attack model
assumes that an attacker can surreptitiously change physical
data, e.g., voltages, currents and phase angles, communicated
by the SCADA and hence launch a stealthy FDI attack. The
model assumes that the adversary has partial knowledge of
the network topology, but knows the needed elements of the
measurement matrix H. It is assumed that the attacker has the
ability to learn this system information prior to devising the
attack, either because the attacker is a trusted insider or has
hacked into the system databases.
Figure 1 shows a simplified diagram of the power grid with
the transmission system, the SCADA and the wireless commu-
nication links. It also indicates the possible attack vectors. An
attacker can break into the remote sensors associated with the
buses and modify the measurement data and/or compromise
the communication channels, possibly through man-in-the-
middle attacks, to intercept and modify the network packets.
Our model also assumes that throughout the entire duration
of an attack, only one and the same bus is targeted, meaning
that this is a targeted attack as opposed to a random attack.
Fig. 1. A simplified diagram of a power grid with its SCADA and
communication system showing FDI attack vectors.
Fig. 2. Basic architecture of a deep neural network showing the input layer
(L1), two hidden layers (L2 and L3) and the output layer (L4) with two
responses.
IV. OVERVIEW OF DEEP LEARNING
Deep learning, one of the fastest growing and most exciting branches
of machine learning with applications in many diverse
fields, is a contemporary and trendy name for neural networks,
with the number of hidden layers as perhaps the only
trait that distinguishes it from the more commonplace
shallow neural networks with a single hidden layer.
The main idea for deep learning came from Hecht-Nielsen,
who proved the universal expressive power of a three-layer
neural network back in 1989 [35]. However, it needed the
development of a training algorithm by Hinton in 2006 to
make way for harnessing the power of this model and for
practical implementation architectures [36].
Fig. 3. Diagram of an IEEE 14-bus system under attack [39] (adapted
from [37]).
Deep learning employs a class of multi-layered neural
network based architectures that attempt to hierarchically
learn deep features and correlations in input data. The basic
architecture has an input layer of nodes equal to the number of
input data, one or more hidden layers with varying number of
nodes or neurons, and an output layer with number of nodes
equal to the number of responses. Figure 2 shows a basic
architecture of a deep neural network with the input layer, 2
hidden layers and the output layer.
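The layered architecture described above can be sketched as a forward pass in NumPy. This is a minimal illustration, not the authors' implementation: the layer sizes, the random initialization and the softmax output are assumptions, while tanh is the hidden activation the paper reports using in Section V.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create one (weights, bias) pair per layer transition."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Propagate one input vector through the network."""
    a = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)              # hidden layers with tanh activation
    W, b = params[-1]
    logits = a @ W + b
    exp = np.exp(logits - logits.max())     # numerically stable softmax (assumed output)
    return exp / exp.sum()

# Hypothetical sizes: 4 inputs, two hidden layers of 5 neurons, 2 responses.
params = init_mlp([4, 5, 5, 2])
probs = forward(params, [0.2, -0.1, 0.4, 0.0])
```

The two entries of `probs` can be read as the scores for the "attack" and "no attack" responses; training by back-propagation is omitted here.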
Deep learning architectures have many variants with each
finding success in specific domains of applications. An exten-
sive review of deep learning neural networks can be found in
the paper by Schmidhuber [19].
V. EXPERIMENTS
A. Simulation and Data Generation
For this work, we simulated a standard IEEE 14-bus (trans-
mission) system with 5 generators and 11 loads [37], as
shown in Figure 3, using the MATPOWER toolbox [38]. This
simulation generates the measurement vector $z$ at 5-second
intervals. The vector consists of 40 active power flows, 14
active power injections and 68 reactive power and voltage
measurements. Thus, in the 14-bus system, there are 122
measurement features. The dataset contains 100,000 sets of
measurement data.
B. Structures of Deep Neural Networks
We used a feed-forward artificial neural network (ANN)
architecture. Also known as multi-layer perceptron (MLP),
this architecture is the most common type of deep neural
network (DNN). This DNN is trained with stochastic gradient
descent using back-propagation. We used tanh as the activation
function and L1 regularization [40].
The following four different deep learning models, based
on this architecture, were used in this exercise:
1) DL1-1: This configuration had 1 hidden layer of 100
neurons and it did not use any regularization.
2) DL1-2: This configuration had 3 hidden layers of 150
neurons in each layer and it did not use any regularization.
3) DL2-1: Same as DL1-1 configuration, but it uses L1 regu-
larization with a value of 0.00001 and “misclassification”
as the stopping criterion.
4) DL2-2: Same as DL1-2 configuration, but it uses L1 regu-
larization with a value of 0.00001 and “misclassification”
as the stopping criterion.
All the DL models were trained for 100 epochs because, in our
experience, training for more than 100 epochs does
not improve the performance of the models.
C. Learning Models
In addition to the four configurations of deep neural
network, we used generalized linear models (GLM), gradient
boosting machines (GBM) and distributed random forests
(DRF) classifiers [20]. These seven models were first
trained on the dataset with all 122 features (predictor variables).
The 122-feature set was then reduced to 20 features using a
random forest classifier (RFC) for feature selection, and the
same seven models were trained on the reduced dataset
with 20 features to determine whether feature selection makes any
difference to training time and model performance.
We split the dataset into two sets: 80% for training and 20%
for testing. To avoid over-fitting and to obtain robust models,
we used 10-fold cross-validation over the randomly divided training
data during training of the models [40].
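The 80/20 split and the 10-fold cross-validation over the training portion can be sketched with index arrays. The function names and the random seeds below are hypothetical; only the proportions, the number of folds and the dataset size of 100,000 come from the text.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def k_fold_indices(train_idx, k=10, seed=0):
    """Yield (fit, validation) index arrays for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(train_idx), k)
    for i in range(k):
        fit = np.concatenate([folds[j] for j in range(k) if j != i])
        yield fit, folds[i]

train_idx, test_idx = train_test_split_indices(100_000)
splits = list(k_fold_indices(train_idx))
```

Each of the ten splits trains on nine folds and validates on the remaining one, and the test indices are never touched during cross-validation.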
D. Evaluation Metrics
A machine learning model for (binary) classification predicts
a class label as output for given input data. For our case
here, each prediction falls into one of four outcomes: 1) True positive (TP): the model
correctly identifies an attack, 2) True negative (TN): it
correctly identifies a normal (non-attack) instance, 3) False positive
(FP): a non-attack is incorrectly identified as an attack,
and 4) False negative (FN): an attack is incorrectly
identified as a non-attack. The counts of these four outcomes constitute the
so-called confusion matrix [20]. To evaluate the models in this
paper, we use the following metrics derived from the confusion
matrix:
1) Accuracy = (TP + TN) / Total
2) Precision = TP / (FP + TP)
3) False Alarm Rate (FAR) = FP / (FP + TN)
4) Recall = TP / (FN + TP)
5) F-Measure = 2TP / (2TP + FP + FN)
Accuracy is the fraction of correctly classified instances over all data
instances. Recall, also known as true-positive rate, sensitivity,
or detection rate, indicates how many of the attacks the
model identifies, while precision, also known as positive
predictive value, represents how often an instance flagged as an
attack is actually an attack. F-Measure is the harmonic mean
of precision and recall [41].
Fig. 4. The ROC curves for the seven methods used.
In addition to these five metrics, we use the ROC AUC score,
which is a measure of the diagnostic ability of binary classifier
systems. We also recorded the training run time of each
model for comparison.
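The five confusion-matrix metrics can be computed as in the short sketch below. The F-measure is written as the harmonic mean of precision and recall, i.e., 2TP / (2TP + FP + FN); the counts in the usage line are made-up numbers for illustration, not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five evaluation metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "far": fp / (fp + tn),                   # false alarm rate
        "recall": tp / (tp + fn),                # detection rate
        "f_measure": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts for illustration only.
m = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

Because FAR is normalized by the number of non-attack instances, it can be large even when accuracy is high if non-attack instances are a small share of the data.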
VI. RESULTS
In this section we present the numerical results from the
experiments in terms of the evaluation metrics.
Figure 4 shows the receiver operating characteristic (ROC)
curves [20] for the seven algorithms, showing the robustness
and the diagnostic ability of the models (i.e., the trade-off between detection rate
and false positive rate) with varying
discrimination threshold.
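The area under the ROC curve can be computed without plotting the curve, using its equivalence to the probability that a randomly chosen attack instance receives a higher score than a randomly chosen non-attack instance. The sketch below uses this pairwise formulation; it is not necessarily how the reported scores were computed, and the labels and scores in the usage line are made-up values.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0       # positive scored above negative
            elif p == n:
                wins += 0.5       # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = attack) and classifier scores.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of samples, so a rank-based implementation would be preferable for a dataset of the size used in this paper.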
Table I shows the results from running the seven algorithms
on the dataset with the full set of 122 feature variables. As
we can see from the table, the deep learning model DL1-2
(with 3 hidden layers and no regularization) performs the best
overall among the seven models. It has the highest accuracy,
precision and F-measure and the lowest false alarm rate (the FAR column),
although the regularized models DL2-1 and DL2-2 achieve slightly
higher detection rates, as indicated by the values in the
Recall column. Among the deep learning models, DL2-2 (with 3 hidden layers and
L1 regularization) has the highest ROC AUC score (0.9621), second
overall only to DRF (0.9629), which indicates that
this model is consistent in diagnosing an attack over
the entire dataset.
When the features are ranked by importance
using a random forest classifier and the 20 most
important features are selected, we see in Table II that the values of the
metrics improve across the board. In this case, the DL2-2
model (with 3 hidden layers and L1 regularization) achieves
the highest accuracy, recall and F-measure, while DL1-2 has the
highest precision and the lowest FAR, and DL2-1 has the highest ROC
AUC score.
The tables list the elapsed times, in seconds,
for training. Table II also shows the speed-up in time when
training with the reduced-feature-set dataset as compared to
that with the full-feature-set dataset. It shows that the speed-up in
training time for the deep learning models is not as high as for the
other three models.
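The speed-up figures can be reproduced approximately from the training times listed in the two tables. The paper does not define the speed-up formula, so the ratio of full-feature to reduced-feature training time used below is an assumption. It matches the tabulated values for GLM, GBM and DRF after rounding, while the deep learning rows differ from the tabulated values by up to about one percentage point.

```python
def speed_up_percent(time_full, time_reduced):
    """Assumed definition: full-feature time over reduced-feature time, in percent."""
    return 100.0 * time_full / time_reduced

# Training times in seconds (122 features, 20 features), copied from the tables.
times = {
    "GLM": (3.442, 0.750),
    "GBM": (35.955, 14.530),
    "DRF": (33.020, 20.364),
    "DL1-1": (92.061, 84.459),
    "DL1-2": (218.823, 181.061),
    "DL2-1": (669.584, 489.145),
    "DL2-2": (782.512, 556.405),
}
speed_ups = {name: speed_up_percent(full, reduced)
             for name, (full, reduced) in times.items()}
```

Under this definition the three non-neural models gain the most from feature reduction, which agrees with the observation in the text.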
TABLE I
RESULTS OF EVALUATION METRICS FOR THE VARIOUS MACHINE LEARNING METHODS ON ALL 122 FEATURES USING THE TEST DATASET.
Methods   ROC Score   Accuracy   Precision   FAR      Recall   F-Measure   Time (in seconds)
GLM       0.9413      0.9551     0.9669      0.3212   0.9841   0.9754      3.442
GBM       0.9403      0.9570     0.9680      0.3107   0.9851   0.9764      35.955
DRF       0.9629      0.9589     0.9703      0.2871   0.9847   0.9775      33.020
DL1-1     0.9520      0.9716     0.9802      0.1903   0.9885   0.9844      92.061
DL1-2     0.9531      0.9730     0.9809      0.1840   0.9895   0.9852      218.823
DL2-1     0.9610      0.9699     0.9758      0.2345   0.9913   0.9835      669.584
DL2-2     0.9621      0.9711     0.9779      0.2135   0.9905   0.9841      782.512
TABLE II
RESULTS OF EVALUATION METRICS FOR THE VARIOUS MACHINE LEARNING METHODS ON 20 FEATURES USING THE TEST DATASET.
Methods   ROC Score   Accuracy   Precision   FAR      Recall   F-Measure   Time (in seconds)   Speed Up
GLM       0.9615      0.9571     0.968       0.3107   0.9852   0.9765      0.750               459%
GBM       0.9604      0.9587     0.9686      0.3044   0.9863   0.9774      14.530              247%
DRF       0.9666      0.9627     0.9713      0.2781   0.9880   0.9796      20.364              162%
DL1-1     0.9839      0.9746     0.9812      0.1809   0.9910   0.9861      84.459              110%
DL1-2     0.9803      0.9768     0.9835      0.1583   0.9910   0.9872      181.061             120%
DL2-1     0.9853      0.9761     0.9812      0.1809   0.9926   0.9869      489.145             136%
DL2-2     0.9814      0.9777     0.9825      0.1688   0.9931   0.9878      556.405             140%
Table II also shows that the best-performing model, i.e.,
DL2-2 with feature reduction, takes 556 seconds or 9.3 minutes
to complete training with 10-fold cross-validation for a
14-bus system. Even in these days of high-performance computing,
this is a long time, and performing deep
learning training in real time is not feasible. However, in usual practice,
training is seldom done in real time. Instead, the deep
learning model is trained off-line to obtain a robust model, and
this “trained” model is then deployed online to detect any such
attacks. In addition, high-performance computing platforms
with graphics processing units (GPUs) can be used to speed up
the training process.
VII. CONCLUSION AND FUTURE WORK
Stealthy false data injection attacks on the state estimation
of a power grid can be disastrous, and early and accurate
detection of these attacks is critical. In this paper we have
demonstrated that machine learning based models, including deep
learning models, perform very well in detecting stealthy FDI
attacks on the SE in power transmission systems.
The next logical step in this research is to apply these
models to power systems with a larger number of buses to
ascertain whether the performance scales reasonably. We also
want to find out whether the training time reduction with the reduced
feature set is significant for large systems. In this work we
used data generated by MATPOWER simulation. In order to
gain more confidence we plan to run the models on realistic
datasets generated by Real-time Digital Simulation (RTDS)
and physical testbeds.
ACKNOWLEDGMENT
This research was partially supported by an Idaho Global
Entrepreneurial Mission (IGEM) Grant for Security Manage-
ment of Cyber-Physical Control Systems, 2016 (Grant Number
IGEM17-001), and the U.S. National Science Foundation
(NSF) CyberCorps® award 1565572.
REFERENCES
[1] E. Byres, “The air gap: SCADA’s enduring security myth,” Communi-
cations of the ACM, vol. 56, no. 8, pp. 29–31, 2013.
[2] S. McLaughlin, C. Konstantinou, X. Wang, L. Davi, A.-R. Sadeghi,
M. Maniatakos, and R. Karri, “The cybersecurity landscape in industrial
control systems,” Proceedings of the IEEE, vol. 104, no. 5, pp. 1039–
1057, 2016.
[3] Y. Zhang, L. Wang, Y. Xiang, and C.-W. Ten, “Power system reliability
evaluation with SCADA cybersecurity considerations,” IEEE Transactions
on Smart Grid, vol. 6, pp. 1707–1721, July 2015.
[4] J. Giraldo, E. Sarkar, A. Cardenas, M. Maniatakos, and M. Kantarcioglu,
“Security and privacy in cyber-physical systems: A survey of surveys,”
IEEE Design & Test, 2017.
[5] S. Paudel, P. Smith, and T. Zseby, “Attack models for advanced persistent
threats in smart grid wide area monitoring,” in Proceedings of the 2nd
Workshop on Cyber-Physical Security and Resilience in Smart Grids,
pp. 61–66, ACM, 2017.
[6] S. Tan, D. De, W.-Z. Song, J. Yang, and S. K. Das, “Survey of security
advances in smart grid: A data driven approach,” IEEE Communications
Surveys & Tutorials, vol. 19, pp. 397–422, First Quarter 2017.
[7] “CIA: cyberattack caused multiple-city blackout,” January 2008.
www.cnet.com/news/cia-cyberattack-caused-multiple-city-blackout/.
[8] “Massive power failure plunges 80% of Pakistan into darkness,”
January 2015.
www.theguardian.com/world/2015/jan/25/massive-power-failure-plunges-80-of-pakistan-into-darkness.
[9] G. Liang, S. R. Weller, J. Zhao, F. Luo, and Z. Y. Dong, “The 2015
Ukraine blackout: Implications for false data injection attacks,” IEEE
Transactions on Power Systems, vol. 32, no. 4, pp. 3317–3318, 2017.
[10] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against
state estimation in electric power grids,” ACM Transactions on Informa-
tion and System Security (TISSEC), vol. 14, no. 1, p. 13, 2011.
[11] A. Abur and A. Gomez-Exposito, Power System State Estimation:
Theory and Implementation. New York: CRC Press, 2004.
[12] C. Alcaraz and J. Lopez, “Wide-area situational awareness for critical
infrastructure protection,” Computer, vol. 46, no. 4, pp. 30–37, 2013.
[13] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, “Malicious data attacks
on the smart grid,” IEEE Transactions on Smart Grid, vol. 2, no. 4,
pp. 645–658, 2011.
[14] A. Teixeira, S. Amin, H. Sandberg, K. H. Johansson, and S. S. Sastry,
“Cyber security analysis of state estimators in electric power systems,” in
Decision and Control (CDC), 2010 49th IEEE Conference on, pp. 5991–
5998, IEEE, 2010.
[15] K. C. Sou, H. Sandberg, and K. H. Johansson, “Detection and identifica-
tion of data attacks in power system,” in American Control Conference
(ACC), pp. 3651–3656, June 2012.
[16] G. Hug and J. A. Giampapa, “Vulnerability assessment of AC state
estimation with respect to false data injection cyber-attacks,” IEEE
Transactions on Smart Grid, vol. 3, no. 3, pp. 1362–1370, 2012.
[17] Y. Chakhchoukh and H. Ishii, “Cyber attacks scenarios on the
measurement function of power state estimation,” in American Control
Conference (ACC), 2015, pp. 3676–3681, IEEE, 2015.
[18] Y. Xiang, L. Wang, and N. Liu, “Coordinated attacks on electric
power systems in a cyber-physical environment,” Electric Power Systems
Research, vol. 149, pp. 156–168, 2017.
[19] J. Schmidhuber, “Deep learning in neural networks: An overview,”
Neural networks, vol. 61, pp. 85–117, 2015.
[20] T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical
learning: data mining, inference, and prediction. New York: Springer-
Verlag, 2nd ed., 2008.
[21] Y. Chakhchoukh and H. Ishii, “Coordinated cyber-attacks on the
measurement function in hybrid state estimation,” IEEE Transactions on
Power Systems, vol. 30, pp. 2487–2497, September 2015.
[22] Y. Chakhchoukh, S. Liu, M. Sugiyama, and H. Ishii, “Statistical outlier
detection for diagnosis of cyber attacks in power state estimation,” in
IEEE Power and Energy Society General Meeting (PESGM), July 2016.
[23] M. Keshk, N. Moustafa, E. Sitnikova, and G. Creech, “Privacy preser-
vation intrusion detection technique for SCADA systems,” in Military
Communications and Information Systems Conference (MilCIS), 2017.
[24] R. Tan, H. H. Nguyen, E. Y. Foo, D. K. Yau, Z. Kalbarczyk, R. K.
Iyer, and H. B. Gooi, “Modeling and mitigating impact of false data
injection attacks on automatic generation control,” IEEE Transactions
on Information Forensics and Security, vol. 12, no. 7, pp. 1609–1624,
2017.
[25] M. Esmalifalak, L. Liu, N. Nguyen, R. Zheng, and Z. Han, “Detecting
stealthy false data injection using machine learning in smart grid,” IEEE
Systems Journal, 2014.
[26] Y. Chakhchoukh, S. Liu, M. Sugiyama, and H. Ishii, “Statistical outlier
detection for diagnosis of cyber attacks in power state estimation,” in
Proc. IEEE PES General Meeting, pp. 1–5, July 2016.
[27] M. Sugiyama, T. Suzuki, and T. Kanamori, Density ratio estimation in
machine learning. Cambridge University Press, 2012.
[28] Y. Wang, M. Amin, J. Fu, and H. Moussa, “A novel data analytical
approach for false data injection cyber-physical attack mitigation in
smart grids,” IEEE Access, 2017.
[29] J. Hao, R. J. Piechocki, D. Kaleshi, W. H. Chin, and Z. Fan, “Sparse
malicious false data injection attacks and defense mechanisms in smart
grids,” IEEE Transactions on Industrial Informatics, vol. 11, pp. 1–12,
October 2015.
[30] Y. He, G. J. Mendis, and J. Wei, “Real-time detection of false data
injection attacks in smart grid: A deep learning-based intelligent
mechanism,” IEEE Transactions on Smart Grid, 2017.
[31] J. Wei and G. J. Mendis, “A deep learning-based cyber-physical strategy
to mitigate false data injection attack in smart grids,” in Cyber-Physical
Security and Resilience in Smart Grids (CPSR-SG), Joint Workshop on,
pp. 1–6, IEEE, 2016.
[32] S. Marsland, Machine learning: an algorithmic perspective. CRC press,
2015.
[33] H. Wang, J. Ruan, G. Wang, B. Zhou, Y. Liu, X. Fu, and J.-C. Peng,
“Deep learning based interval state estimation of AC smart grids against
sparse cyber attacks,” IEEE Transactions on Industrial Informatics,
2018.
[34] X. Liu and Z. Li, “False data attack models, impact analyses and defense
strategies in the electricity grid (subsection 5.2),” The Electricity Journal,
vol. 30, pp. 35–42, 2017.
[35] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in
International Joint IEEE Conference on Neural Networks, pp. 593–605,
1989.
[36] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for
deep belief nets,” Neural computation, vol. 18, no. 7, pp. 1527–1554,
2006.
[37] University of Washington, Power System Test Case Archive. [Online].
Available: http://www.ee.washington.edu/research/pstca/.
[38] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas,
“MATPOWER: Steady-state operations, planning, and analysis tools for power
systems research and education,” IEEE Transactions on Power Systems,
vol. 26, no. 1, pp. 12–19, 2011.
[39] Y. Chakhchoukh and H. Ishii, “Enhancing robustness to cyber-attacks in
power systems through multiple least trimmed squares state estimations,”
IEEE Transactions on Power Systems, vol. 31, no. 6, pp. 4395–4405,
2016.
[40] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press,
2016. http://www.deeplearningbook.org.
[41] M. Sokolova and G. Lapalme, “A systematic analysis of performance
measures for classification tasks,” Information Processing & Manage-
ment, vol. 45, no. 4, pp. 427–437, 2009.