Detecting Stealthy False Data Injection Attacks in
Power Grids Using Deep Learning
Mohammad Ashrafuzzaman, Yacine Chakhchoukh†‡, Ananth A. Jillepalli∗‡,
Predrag T. Tosic, Daniel Conte de Leon∗‡, Frederick T. Sheldon and Brian K. Johnson†‡
Center for Secure and Dependable Systems
Department of Computer Science
Department of Electrical and Computer Engineering
University of Idaho, Moscow, ID, USA
Emails: {m.ashrafuzzaman, yacinec, ajillepalli, predrag.tosic, dcontedeleon, sheldon, b.k.johnson}@ieee.org
Abstract—The electric power grid, as a critical national in-
frastructure, is under constant threat from cyber-attacks. State
estimation (SE) is at the foundation of a series of critical
control processes in a power transmission system. A false data
injection (FDI) attack against SE can disrupt these control
processes, crippling a power system and wreaking havoc in a
region. With knowledge of the system topology, a cyber-attacker
can formulate and execute stealthy FDI attacks that are very
difficult to detect. Statistical and, more recently, machine learning
approaches have been undertaken to detect FDI attacks on SE
of the power grid. In this paper, we propose a Deep Learning
(DL) based method to accurately detect stealthy FDI attacks
on the SE of the power grid. We compare the performance of the
DL method with three popular machine learning algorithms:
gradient boosting machines (GBM), generalized linear
models (GLM) and distributed random forests (DRF). All four
algorithms analyze a dataset simulating the IEEE 14-bus system.
The results demonstrate that these algorithms perform well in
accurately and precisely detecting stealthy FDI attacks on the
smart grid, with the DL-based approach showing the best results.
Index Terms—False data injection attack, power grid, state
estimation, machine learning, deep learning.
I. INTRODUCTION
The power grid, including generators, transmission systems,
distribution systems, and other numerous devices, is one of
the largest and most complex critical infrastructures. The
power grid and its various components have, for decades,
been undergoing evolution. Significant changes came when the
components of the grid and the supervisory control and data
acquisition (SCADA) systems that monitor and control these
components were updated with networking capabilities. This
essentially transformed power grids into cyber-physical sys-
tems (CPS), with the downside of inheriting issues associated
with being “cyber” (e.g., vulnerable to exploitation by cyber-
attackers). Nonetheless, the industry has attempted to “air-gap”
operational technology (OT) from information technology (IT)
networks toward protecting valuable CPS assets critical to
stable operations. Unfortunately, many OT networks are still
not fully insulated from the IT networks [1] and are vulnerable
to both internal and external threats [2]. As a result the power
grid has been subjected to a new set of exploits on top
of the increased complexity of internetworking of SCADA
systems [3], [4], [5], [6].
We illustrate the vulnerabilities of the power grid with a few
well-known incidents. In January of 2008 the Central Intelli-
gence Agency (CIA) reported that a number of non-US cities
were under cyber-attack affecting distribution systems causing
a wide-spread blackout [7]. In January of 2015, 140 million
people in Pakistan were plunged into darkness because of
a militant attack [8] (not cyber-attack). In December 2015,
three power distribution companies were taken down in a
coordinated cyber-attack, resulting in a power outage for about
225,000 Ukrainians [9].
A. Research Problem
One of the many ways a smart power grid can be attacked
over the IT network is through false data injection (FDI) attacks,
which Liu et al. [10] described as a new class of cyber-attacks
against state estimation in the power grid. The attack vectors
may include compromised substation meters, unauthorized
access or malicious activity by authorized insiders in the
control system, wireless communication intrusion, and external
penetration. State estimation (SE) is a fundamental tool in
the energy management system (EMS) at the control center.
The SE computes voltage magnitudes and phase angles at all
of the different buses of the power system [11] after collecting
measurements that are communicated to the control center
from remote terminal units (RTUs) equipped with SCADA
units.
In generalized false data injection attacks, an adversary
attempts to introduce measurements in the system with
malicious intent. If the incorrect measurement data affect the
outcome of state estimation, the resulting misinformation
can reduce the control center operators' level of situational
awareness [12]. This can lead the operators to take wrong
corrective actions in response to spoofed data.
A contaminated SE may disrupt the real-time operation of
the grid by impacting tools such as contingency analysis,
unit commitment, optimal power flow and computation of
locational marginal prices (LMPs) for electricity markets.
Cyber-attacks that impact the SE results have been presented in
several publications [10], [13], [14], [15], [16], [17]. As shown
by Xiang et al., false data injection is an important element
of a coordinated attack on the power grid and represents an
important class of attack on cyber-physical systems [18].
B. Proposed Solution
In this paper, we formulate detection of FDI attacks as a
machine learning problem of a binary classification variety. We
propose a methodology using deep neural networks to solve
this binary classification problem as reliably as possible. The
objective is to minimize the number of both false
positives and false negatives. Our target binary classification
can be summarized as follows: given a system “snapshot”
(i.e., the measurement vector at a given discrete time step),
determine whether this measurement vector has been modified by
an FDI attack or not. We define the vector as compromised
if at least one entry in it has been modified, that is, it
contains an injected component corresponding to the attack.
In the standard state estimation models used by the power
engineering community, such injected data value is assumed to
be added, component-wise, to the legitimate signal and to the
Gaussian noise values. We propose to use deep learning [19]
to solve this classification problem.
C. Contributions
In this paper, we generate datasets from a simulated IEEE
14-bus system and run feed-forward artificial neural networks
with different structures and configurations to demonstrate the
effectiveness of deep-learning-based models in accurately and
precisely detecting stealthy FDI attacks on the state estimation
of a power grid. We identify the structure and configuration
of the model that performs the best. In addition, we run three
other machine learning algorithms, namely gradient boosting
machines (GBM), generalized linear models (GLM) and dis-
tributed random forests classifier (DRF) [20] on the dataset,
and compare results from the deep-learning-based models with
those from the three methods. The results show that the deep
learning-based method outperforms the other three methods.
D. Outline of this Paper
The remainder of the paper is organized as follows. Section II
summarizes the previous work on detecting FDI attacks on
state estimation using machine learning. Section III describes
the system model used in this paper and the mathematical
formulation for static SE in the presence of cyber-attacks,
particularly stealthy FDI attacks. It also identifies the attack
model assumed for the current work. A brief overview of
deep learning is presented in Section IV. The experimental
setup is presented in Section V. We discuss the results of our
experiments in Section VI. Conclusions and future work are
presented in Section VII, followed by acknowledgment and
references.
II. RELATED WORK
Traditional statistical approaches, for example [21], [22],
have been proposed to detect FDI attacks on the state estima-
tion in power systems. However, attempts to explore machine
learning techniques are still quite limited.
Keshk et al. [23] used an expectation maximization clus-
tering mechanism to detect if any data in a SCADA system
have been altered by an adversary. The primary goal of the
work was to detect privacy violation in the system.
Tan et al. [24] studied the impact of FDI attacks on automatic
generation control systems, which are used in power grids to
maintain the grid frequency at its normal operating value. They
developed a threshold-based regression model using an attack
identification approach that estimates the attacker’s write access.
Esmalifalak et al. [25] proposed two machine-learning based
models for FDI attack detection in smart grid systems. The first
model utilizes the multivariate Gaussian semi-supervised learning
algorithm and the second model utilizes a measurement-based
deviation analysis algorithm, which requires no learning.
Both models use principal component analysis to control the
dimensionality of complex simulations. This approach falls short
in identifying anomalies in transmission networks.
Chakhchoukh et al. [26] proposed a detection method using
a newly developed machine learning technique known as the
density ratio estimation (DRE). The DRE [27] is an effective
countermeasure against cyber-attacks, which does not require
supervision or an attack model.
Wang et al. [28] proposed a data-centric paradigm to detect
FDI attacks in large scale smart grids. In this model, the
Margin Setting Algorithm (MSA) is used to process massive
amounts of domain-specific data of a large scale smart grid.
This model falls short at identifying FDI attacks in real-time
and in using data emerging from a sophisticated network of
Phasor Measurement Units (PMUs).
Hao et al. [29] proposed a sparse principal-component
analysis-approximation based model to identify stealthy FDI
attacks in smart grid systems. In this model, identification of
real measurements with the availability of sparse data sets is
achieved by using recovery functions. The recovery function’s
accuracy is directly proportional to the sparsity of available
data. As such, this model falls short at identifying FDI attacks
when data is too sparse to produce reliably accurate recovery
functions.
He et al. [30] proposed a deep-learning based model for FDI
attack detection in smart meters data, as opposed to power grid
data, to prevent electricity theft. The model utilizes a State
Vector Estimator (SVE) and a Deep Learning Based Identifi-
cation (DLBI) algorithm. This model compares the historical
measurement data and recognizes a pattern to identify FDI
attacks in real-time. Currently, for this model to work, data
from a large number of sensing units is required.
Wei and Mendis [31] proposed a strategy using Conditional
Deep Belief Network (CDBN) [32] to identify alteration in
data that may destabilize the wide area monitoring systems
(WAMS) in the power grid. They used PMU data for their
investigation.
Wang et al. [33] proposed a deep learning-based interval
state estimation technique to detect anomaly caused by sparse
cyber-attacks in smart grids. They used a 6-layer Stacked
Auto-Encoders (SAE) [32] as an advanced feature extractor
and then a classical predictor, e.g., logistic regressor, as the
last layer detecting anomaly in electric load forecasting.
As the literature reviewed above shows, hardly any work has
been done on using deep learning techniques to detect
stealthy FDI attacks on the measurement data in the SCADA
system that impact the state estimation in AC power grids.
III. SYSTEM MODEL
This section introduces the mathematical formulation for
stealthy false data injection attacks on static state estimation
in power systems [10], [11].
In a power system, the static SE is run after collecting
measurements from the SCADA units at chosen time snapshots
and communicating those to the control center every 2-5
s. These measurements are power flows and injections, as
well as voltage magnitudes. The AC static SE estimates the
(state) vector $x \in \mathbb{R}^n$ that contains phase angles and voltage
magnitudes at the different buses, where $n = 2k - 1$ and $k$ is
the number of buses in the system. The slack bus phase angle
is assumed to be the reference and is fixed to 0. The state
vector obeys the following nonlinear equation:
$$z = h(x) + e \qquad (1)$$
The nonlinear vector function $h(\cdot)$ is computed from the grid
topology and the parameters for transmission lines, transformers
and other devices. The error vector $e \in \mathbb{R}^m$ is
assumed Gaussian with a covariance matrix $R$, where $m$ is
the number of measurements. The vector of measurements
$z \in \mathbb{R}^m$ contains communicated readings from SCADA units.
The AC SE is executed using an iterative algorithm based
on the weighted least squares (WLS) [11] to compute and
estimate the vector $x$, i.e.,
$$\hat{x}_k = \hat{x}_{k-1} + H_k^{\sharp}\,\bigl(z_k - h(x_{k-1})\bigr) \qquad (2)$$
where $H_k^{\sharp} = (H_k^{\top} R^{-1} H_k)^{-1} H_k^{\top} R^{-1}$ and the matrix $H_k$ is the
Jacobian of $h$ with respect to $x$ at step $k$. The WLS algorithm
is optimal under Gaussian noise.
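As an illustration only, the iteration in (2) can be sketched in Python on a toy problem. The measurement function `h`, its Jacobian, the covariance `R` and the starting point below are hypothetical stand-ins chosen for this sketch; they are not the power-flow equations of the IEEE 14-bus system, and the sketch uses a single fixed measurement vector `z`.

```python
import numpy as np

def h(x):
    # Hypothetical nonlinear measurement function (NOT a power-flow model).
    return np.array([x[0], x[1], x[0] * x[1], x[0] ** 2])

def jacobian(x):
    # Analytical Jacobian of h with respect to x, evaluated at x.
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [x[1], x[0]],
                     [2.0 * x[0], 0.0]])

def wls_state_estimation(z, R, x0, delta=1e-10, max_iter=50):
    """Iterate x <- x + (H^T R^-1 H)^-1 H^T R^-1 (z - h(x)) until the step is below delta."""
    R_inv = np.linalg.inv(R)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        H = jacobian(x)
        gain = H.T @ R_inv @ H
        step = np.linalg.solve(gain, H.T @ R_inv @ (z - h(x)))
        x = x + step
        if np.linalg.norm(step) < delta:  # convergence test on the state update
            break
    return x

x_true = np.array([1.0, 0.5])
z = h(x_true)                            # noise-free measurements for the toy case
R = np.diag([0.01, 0.01, 0.02, 0.02])    # assumed measurement error covariance
x_hat = wls_state_estimation(z, R, x0=[0.9, 0.4])
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse of the gain matrix explicitly, which is a common design choice for numerical stability.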
Let $\Delta z_k = z_k - h(x_{k-1})$ be the $k$th residual vector. After
the convergence of the algorithm, i.e., once $\|\hat{x}_k - \hat{x}_{k-1}\| < \delta$
for some chosen threshold $\delta > 0$, the obtained residuals
are analyzed by practitioners to detect possible abnormal
measurements by checking for residuals that do not obey
the Gaussian assumption. These abnormal or bad data could
be due to natural failures such as sensor or communication
misbehavior, or to FDI attacks. The most practical bad data
detection rules are known as the chi-square ($\chi^2$) test and the
“$3\sigma$” rejection rule [11]. The iterative algorithm is equivalent
to an estimation that is run iteratively after linearizing the
regression in each step. The AC SE problem can also be
reformulated as:
$$z = Hx + e \qquad (3)$$
If contamination occurs due to an FDI attack then the
measurement vector $z$ received at the control center is replaced by
$z_a$ with $z_a = z + Hc$. The obtained new state is biased by
the contamination vector $c$, i.e., $\hat{x}_a = \hat{x} + c$. The conventional
methods detect such contamination by analyzing the residual,
i.e., the difference $z - H\hat{x}$ between the measurement vector $z$
and the value calculated from the state estimate.
In the largest normalized residual test, if the largest absolute
value of the elements in the normalized residual is greater than a
pre-defined threshold $\alpha > 0$ ($\alpha$ is generally chosen to be 3),
the corresponding measurement is identified as bad data and
reported to system operators. The measurement is removed
and the estimation is re-executed.
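A minimal sketch of the largest normalized residual test for the linearized model (3) is given below. It assumes the standard residual covariance $\Omega = R - H (H^{\top} R^{-1} H)^{-1} H^{\top}$ from the state estimation literature [11]; the matrix `H`, the noise level, the random seed and the position of the corrupted measurement are hypothetical values chosen for illustration.

```python
import numpy as np

def largest_normalized_residual(z, H, R):
    """Return (index, value) of the largest normalized residual for z = Hx + e."""
    R_inv = np.linalg.inv(R)
    G = H.T @ R_inv @ H                          # gain matrix
    x_hat = np.linalg.solve(G, H.T @ R_inv @ z)  # WLS state estimate
    r = z - H @ x_hat                            # measurement residual
    omega = R - H @ np.linalg.solve(G, H.T)      # residual covariance matrix
    r_norm = np.abs(r) / np.sqrt(np.diag(omega))
    idx = int(np.argmax(r_norm))
    return idx, float(r_norm[idx])

rng = np.random.default_rng(1)
m, n = 12, 3                                     # assumed toy dimensions
H = rng.normal(size=(m, n))                      # hypothetical measurement matrix
sigma = 0.01
R = sigma ** 2 * np.eye(m)
x_true = rng.normal(size=n)
z = H @ x_true + sigma * rng.normal(size=m)

z_bad = z.copy()
z_bad[4] += 1.0                                  # one gross error (non-stealthy)

alpha = 3.0                                      # detection threshold
idx, value = largest_normalized_residual(z_bad, H, R)
is_bad_data = value > alpha
```

In this toy case the gross error is one hundred standard deviations in size, so the flagged index is expected to coincide with the corrupted measurement.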
In the case of FDI attacks, if the injected data are large
enough that the conventional residual tests can detect them,
they are called non-stealthy FDI attacks. In the non-stealthy
case, the measurement matrix $H$ is not known to the
attackers, and they simply generate random attack vectors and
manipulate the meter readings.
On the other hand, if the attackers are familiar with the
power system topology information or know the measurement
matrix $H$, they can carefully craft the data to be injected
in such a way that the residual $r_a$ of the attacked
measurement vector $z_a$ remains the same as the residual $r$ of
the original measurement vector $z$:
$$r_a = z_a - H\hat{x}_a = (z + Hc) - H(\hat{x} + c) = z - H\hat{x} = r \qquad (4)$$
These are called stealthy FDI attacks as they cannot be
detected using the conventional methods based on residual
analysis [10].
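The invariance in (4) can be checked numerically. The following sketch uses the linearized model (3) with a hypothetical random measurement matrix `H`, noise level and contamination vector `c`, all chosen for illustration. It compares a stealthy injection of the form $Hc$ with an unstructured random injection.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3                                   # assumed toy dimensions
H = rng.normal(size=(m, n))                   # hypothetical measurement matrix
sigma = 0.01
R_inv = np.eye(m) / sigma ** 2

def wls_estimate(z):
    # Closed-form WLS solution for the linear model z = Hx + e.
    return np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)

x = rng.normal(size=n)
z = H @ x + sigma * rng.normal(size=m)        # legitimate measurements

c = np.array([0.5, -0.2, 0.1])                # attacker's chosen state bias
z_a = z + H @ c                               # stealthy attack vector a = Hc

x_hat, x_hat_a = wls_estimate(z), wls_estimate(z_a)
r = z - H @ x_hat                             # residual without attack
r_a = z_a - H @ x_hat_a                       # residual under stealthy attack

z_random = z + rng.normal(size=m)             # non-stealthy random injection
r_random = z_random - H @ wls_estimate(z_random)
```

Here the estimate is shifted by exactly `c` while the residual stays the same, whereas the random injection changes the residual and is therefore visible to residual-based tests.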
A. Attack Model
False data injection attacks can be carried out on different parts
of the power grid, e.g., transmission systems, distribution
systems, advanced metering infrastructure, etc. [34]. In this
work, we consider FDI attacks on the static state estimation
in the AC power transmission system. This attack model
assumes that an attacker can surreptitiously change physical
data, e.g., voltages, currents and phase angles, communicated
by the SCADA and hence launch a stealthy FDI attack. The
model assumes that the adversary has partial knowledge of
the network topology, but knows the needed elements of the
measurement matrix H. It is assumed that the attacker has the
ability to learn this system information prior to devising the
attack, either because the attacker is a trusted insider or has
hacked into the system databases.
Figure 1 shows a simplified diagram of the power grid with
the transmission system, the SCADA and the wireless commu-
nication links. It also indicates the possible attack vectors. An
attacker can break into the remote sensors associated with the
buses and modify the measurement data and/or compromise
the communication channels, possibly through man-in-the-
middle attacks, to intercept and modify the network packets.
Our model also assumes that throughout the entire duration
of an attack, only one and the same bus is targeted, meaning
that this is a targeted attack as opposed to a random attack.
Fig. 1. A simplified diagram of a power grid with its SCADA and
communication system showing FDI attack vectors.
Fig. 2. Basic architecture of a deep neural network showing the input layer
(L1), two hidden layers (L2 and L3) and the output layer (L4) with two
responses.
IV. OVERVIEW OF DEEP LEARNING
Deep learning, one of the fastest growing and exciting bran-
ches of machine learning with applications in many diverse
fields, is a contemporary and trendy name for neural networks,
with perhaps the number of hidden layers as the only distin-
guishing trait that separates it from the more commonplace
single-layer, shallow, neural networks.
The main idea for deep learning came from Hecht-Nielsen,
who proved the universal expressive power of a three-layer
neural network back in 1989 [35]. However, it needed the
development of a training algorithm by Hinton in 2006 to
make way for harnessing the power of this model and for
practical implementation architectures [36].
Fig. 3. Diagram of an IEEE 14-bus system under attack [39] (adapted
from [37]).
Deep learning employs a class of multi-layered neural
network based architectures that attempt to hierarchically
learn deep features and correlations in input data. The basic
architecture has an input layer with as many nodes as there are
input features, one or more hidden layers with varying numbers of
nodes (neurons), and an output layer with as many nodes as
there are responses. Figure 2 shows a basic
architecture of a deep neural network with the input layer, 2
hidden layers and the output layer.
Deep learning architectures have many variants with each
finding success in specific domains of applications. An exten-
sive review of deep learning neural networks can be found in
the paper by Schmidhuber [19].
V. EXPERIMENTS
A. Simulation and Data Generation
For this work, we simulated a standard IEEE 14-bus
(transmission) system with 5 generators and 11 loads [37], as
shown in Figure 3, using the MATPOWER toolbox [38]. This
simulation generates the measurement vector $z$ at 5-second
intervals. The vector consists of 40 active power-flows, 14
active power-injections and 68 reactive power and voltage
measurements. Thus, in the 14-bus system, there are 122
measurement features. The dataset contains 100,000 sets of
measurement data.
B. Structures of Deep Neural Networks
We used a feed-forward artificial neural network (ANN)
architecture. Also known as multi-layer perceptron (MLP),
this architecture is the most common type of deep neural
network (DNN). This DNN is trained with stochastic gradient
descent using back-propagation. We used tanh as the activation
function and L1 regularization [40].
The following four different deep learning models, based
on this architecture, were used in this exercise:
1) DL1-1: This configuration had 1 hidden layer of 100
neurons and it did not use any regularization.
2) DL1-2: This configuration had 3 hidden layers of 150
neurons in each layer and it did not use any regularization.
3) DL2-1: Same as DL1-1 configuration, but it uses L1 regu-
larization with a value of 0.00001 and “misclassification”
as the stopping criterion.
4) DL2-2: Same as DL1-2 configuration, but it uses L1 regu-
larization with a value of 0.00001 and “misclassification”
as the stopping criterion.
All the DL models were run for 100 epochs, as we observed
that training for more than 100 epochs does
not improve the performance of the models.
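To make the relative sizes of the four configurations concrete, the sketch below records them as plain Python data and counts the trainable weights and biases of each. It assumes fully connected layers with bias terms, 122 inputs and a 2-unit output layer; the output size and the dictionary layout are assumptions made for this illustration and are not taken from the experimental code.

```python
N_FEATURES = 122   # measurement features in the full dataset
N_OUTPUTS = 2      # assumption: one output unit per class (attack / no attack)

# Hidden-layer sizes and L1 regularization strength of each configuration.
CONFIGS = {
    "DL1-1": {"hidden": (100,), "l1": 0.0},
    "DL1-2": {"hidden": (150, 150, 150), "l1": 0.0},
    "DL2-1": {"hidden": (100,), "l1": 1e-5},
    "DL2-2": {"hidden": (150, 150, 150), "l1": 1e-5},
}

def count_parameters(n_inputs, hidden_layers, n_outputs):
    """Count weights plus biases of a fully connected feed-forward network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))

param_counts = {
    name: count_parameters(N_FEATURES, cfg["hidden"], N_OUTPUTS)
    for name, cfg in CONFIGS.items()
}
```

Under these assumptions the single-hidden-layer configurations have 12,502 parameters and the three-hidden-layer configurations have 64,052, roughly a fivefold difference in model size.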
C. Learning Models
In addition to the four configurations of deep neural
network, we used generalized linear models (GLM), gradient
boosting machines (GBM) and the distributed random forests
classifier (DRF) [20]. These seven models were trained
on the dataset with all 122 feature (predictor) variables.
The 122-feature set was then reduced to 20 features using a
random forest classifier (RFC) for feature selection, and the
same seven models were trained on the reduced dataset
with 20 features to determine whether feature selection makes any
difference to training time and model performance.
We split the dataset into two sets: 80% for training and 20%
for testing. To avoid over-fitting and to obtain robust models,
we used 10-fold cross-validation over randomly divided trai-
ning data during training of the models [40].
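The data partitioning described above can be sketched as follows. This is an index-only illustration with NumPy; the function names, the random seed and the use of contiguous folds over a shuffled index array are assumptions of this sketch, not the procedure of any particular toolkit.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return perm[n_test:], perm[:n_test]

def k_fold_indices(indices, k=10):
    """Yield (training, validation) index arrays for k-fold cross-validation."""
    folds = np.array_split(indices, k)
    for i in range(k):
        validation = folds[i]
        training = np.concatenate([folds[j] for j in range(k) if j != i])
        yield training, validation

train_idx, test_idx = train_test_split_indices(100_000)
fold_sizes = [(len(tr), len(va)) for tr, va in k_fold_indices(train_idx)]
```

With 100,000 measurement sets this gives 80,000 training and 20,000 test samples, and each cross-validation round trains on 72,000 samples and validates on the remaining 8,000.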
D. Evaluation Metrics
A machine learning model for (binary) classification
predicts a class label for each input. Comparing the predicted
labels with the true labels yields four possible outcomes:
1) True positives (TP): when the model
correctly identifies an attack, 2) True negatives (TN): when it
correctly identifies a normal or non-attack, 3) False positives
(FP): when a non-attack is incorrectly identified as an attack,
and 4) False negatives (FN): when an attack is incorrectly
identified as a non-attack. The counts of these four outcomes
constitute the so-called confusion matrix [20]. To evaluate the
models in this paper, we use the following metrics derived from
the confusion matrix:
1) Accuracy = (TP + TN) / Total
2) Precision = TP / (FP + TP)
3) False Alarm Rate (FAR) = FP / (FP + TN)
4) Recall = TP / (FN + TP)
5) F-Measure = 2TP / (2TP + FP + FN)
Accuracy is the fraction of data instances that are classified
correctly. Recall, also known as true-positive rate, sensitivity,
or detection rate, indicates how many of the attacks the
model does identify, while precision, also known as positive
predictive value, indicates how many of the instances flagged
as attacks are actual attacks. F-Measure is the harmonic mean
of precision and recall [41].
Fig. 4. The ROC curves for the seven methods used.
In addition to these five metrics, we use ROC AUC score
which is a measure of the diagnostic ability of binary classifier
systems. We also clocked the run times for training of each
model for comparison.
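The five confusion-matrix metrics can be computed as in the sketch below. The function name and the example counts are hypothetical. The F-measure is written as $2TP/(2TP + FP + FN)$, which is algebraically the harmonic mean of precision and recall.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "far": fp / (fp + tn),                     # false alarm rate
        "recall": tp / (tp + fn),
        "f_measure": 2 * tp / (2 * tp + fp + fn),  # harmonic mean of precision and recall
    }

# Hypothetical counts, for illustration only (not results from the paper).
metrics = confusion_metrics(tp=90, tn=80, fp=20, fn=10)
p, r = metrics["precision"], metrics["recall"]
harmonic_mean = 2 * p * r / (p + r)
```

Because attacks and non-attacks need not be equally frequent, accuracy alone can look favorable even when the false alarm rate is high, which is why the metrics are reported together.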
VI. RESULTS
In this section we present the numerical results from the
experiments in terms of the evaluation metrics.
Figure 4 shows the receiver operating characteristic (ROC)
curves [20] for the seven algorithms, showing the robustness
and the diagnostic ability (i.e., the trade-off between detection
rate and false positive rate) of the models as the
discrimination threshold varies.
Table I shows the results from running the seven algorithms
on the dataset with the full set of 122 feature variables. As
we can see from the table, the deep learning model DL1-2
(with 3 hidden layers and no regularization) performs the best
overall among the seven models. It has the highest accuracy,
precision and F-measure and the lowest false alarm rate (the
FAR column), although DL2-1 attains a slightly higher detection
rate (the Recall column). Among the deep learning models, DL2-2
(with 3 hidden layers and regularization) has the highest ROC
AUC score, which means that it discriminates between attacks
and non-attacks most consistently across decision thresholds;
only DRF scores marginally higher.
When the features are ranked by importance using a random
forest classifier and the 20 most important features are
selected, we see in Table II that the values of the
metrics improve across the board. In this case, the DL2-2
model (with 3 hidden layers and L1 regularization) achieves the
best accuracy, recall and F-measure, while DL1-2 has the best
precision and false alarm rate and DL2-1 has the highest ROC
AUC score.
The tables list the elapsed times, in seconds,
for training. Table II also shows the speed-up in training time
on the reduced-feature dataset relative to the full-feature
dataset. The speed-up for the deep learning models is not as
high as for the other three models.
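The speed-up figures appear to correspond to the ratio of the full-feature training time to the reduced-feature training time, expressed as a percentage. That formula is an assumption inferred from the reported values; the sketch below applies it to the reported GLM, GBM and DRF training times.

```python
# Reported training times in seconds (122 features vs. 20 features).
time_full = {"GLM": 3.442, "GBM": 35.955, "DRF": 33.020}
time_reduced = {"GLM": 0.750, "GBM": 14.530, "DRF": 20.364}

def speed_up_percent(t_full, t_reduced):
    # Assumed definition: full-feature time relative to reduced-feature time.
    return 100.0 * t_full / t_reduced

speed_up = {
    name: round(speed_up_percent(time_full[name], time_reduced[name]))
    for name in time_full
}
```

Under this assumed definition the three values reproduce the reported 459%, 247% and 162%; applying the same formula to the deep learning timings gives values within about one percentage point of the reported ones.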
TABLE I
RESULTS OF EVALUATION METRICS FOR THE VARIOUS MACHINE LEARNING METHODS ON ALL 122 FEATURES USING THE TEST DATASET.
Methods  ROC Score  Accuracy  Precision  FAR     Recall  F-Measure  Time (in seconds)
GLM      0.9413     0.9551    0.9669     0.3212  0.9841  0.9754     3.442
GBM      0.9403     0.9570    0.9680     0.3107  0.9851  0.9764     35.955
DRF      0.9629     0.9589    0.9703     0.2871  0.9847  0.9775     33.020
DL1-1    0.9520     0.9716    0.9802     0.1903  0.9885  0.9844     92.061
DL1-2    0.9531     0.9730    0.9809     0.1840  0.9895  0.9852     218.823
DL2-1    0.9610     0.9699    0.9758     0.2345  0.9913  0.9835     669.584
DL2-2    0.9621     0.9711    0.9779     0.2135  0.9905  0.9841     782.512
TABLE II
RESULTS OF EVALUATION METRICS FOR THE VARIOUS MACHINE LEARNING METHODS ON 20 FEATURES USING THE TEST DATASET.
Methods  ROC Score  Accuracy  Precision  FAR     Recall  F-Measure  Time (in seconds)  Speed Up
GLM      0.9615     0.9571    0.968      0.3107  0.9852  0.9765     0.750              459%
GBM      0.9604     0.9587    0.9686     0.3044  0.9863  0.9774     14.530             247%
DRF      0.9666     0.9627    0.9713     0.2781  0.9880  0.9796     20.364             162%
DL1-1    0.9839     0.9746    0.9812     0.1809  0.9910  0.9861     84.459             110%
DL1-2    0.9803     0.9768    0.9835     0.1583  0.9910  0.9872     181.061            120%
DL2-1    0.9853     0.9761    0.9812     0.1809  0.9926  0.9869     489.145            136%
DL2-2    0.9814     0.9777    0.9825     0.1688  0.9931  0.9878     556.405            140%
Table II also shows that the best-performing model, i.e.,
DL2-2 with feature reduction, takes 556 seconds or 9.3 minutes
to complete training with 10-fold cross-validation for a
14-bus system. Even in these days of high-performance computing,
this is too long for deep learning training to be performed
in real time. However, as is usual practice,
training is seldom done in real time. Instead, the deep
learning model is trained off-line to obtain a robust model,
and this “trained” model is then deployed online to detect any
such attacks. In addition, high-performance computing platforms
with graphics processing units (GPUs) can be used to speed up
the training process.
VII. CONCLUSION AND FUTURE WORK
Stealthy false data injection attacks on the state estimation
of a power grid can be disastrous, and early and accurate
detection of such attacks is critical. In this paper we have
demonstrated that machine learning based models, including deep
learning models, perform very well in detecting stealthy FDI
attacks on the SE in power transmission systems.
The next logical step in this research is to employ these
models for power systems with a larger number of buses to
ascertain if the performance scales reasonably. We would also
want to find out if the training time reduction with reduced
feature-set is significant for large systems. In this work we
used data generated by MATPOWER simulation. In order to
gain more confidence we plan to run the models on realistic
datasets generated by Real-time Digital Simulation (RTDS)
and physical testbeds.
ACKNOWLEDGMENT
This research was partially supported by an Idaho Global
Entrepreneurial Mission (IGEM) Grant for Security Manage-
ment of Cyber-Physical Control Systems, 2016 (Grant Number
IGEM17-001), and the U.S. National Science Foundation
(NSF) CyberCorps® award 1565572.
REFERENCES
[1] E. Byres, “The air gap: SCADA’s enduring security myth,”
Communications of the ACM, vol. 56, no. 8, pp. 29–31, 2013.
[2] S. McLaughlin, C. Konstantinou, X. Wang, L. Davi, A.-R. Sadeghi,
M. Maniatakos, and R. Karri, “The cybersecurity landscape in industrial
control systems,” Proceedings of the IEEE, vol. 104, no. 5, pp. 1039–
1057, 2016.
[3] Y. Zhang, L. Wang, Y. Xiang, and C.-W. Ten, “Power system reliability
evaluation with SCADA cybersecurity considerations,” IEEE Transacti-
ons on Smart Grid, vol. 6, pp. 1707–1721, July 2015.
[4] J. Giraldo, E. Sarkar, A. Cardenas, M. Maniatakos, and M. Kantarcioglu,
“Security and privacy in cyber-physical systems: A survey of surveys,”
IEEE Design & Test, 2017.
[5] S. Paudel, P. Smith, and T. Zseby, “Attack models for advanced persistent
threats in smart grid wide area monitoring,” in Proceedings of the 2nd
Workshop on Cyber-Physical Security and Resilience in Smart Grids,
pp. 61–66, ACM, 2017.
[6] S. Tan, D. De, W.-Z. Song, J. Yang, and S. K. Das, “Survey of security
advances in smart grid: A data driven approach,” IEEE Communications
Surveys & Tutorials, vol. 19, pp. 397–422, First Quarter 2017.
[7] “CIA: cyberattack caused multiple-city blackout,” January 2008.
www.cnet.com/news/cia-cyberattack-caused-multiple-city-blackout/.
[8] “Massive power failure plunges 80% of Pakistan into darkness,”
January 2015. www.theguardian.com/world/2015/jan/25/massive-power-failure-plunges-80-of-pakistan-into-darkness.
[9] G. Liang, S. R. Weller, J. Zhao, F. Luo, and Z. Y. Dong, “The 2015
Ukraine blackout: Implications for false data injection attacks,” IEEE
Transactions on Power Systems, vol. 32, no. 4, pp. 3317–3318, 2017.
[10] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against
state estimation in electric power grids,” ACM Transactions on Information
and System Security (TISSEC), vol. 14, no. 1, p. 13, 2011.
[11] A. Abur and A. Gomez-Exposito, Power System State Estimation:
Theory and Implementation. New York: CRC Press, 2004.
[12] C. Alcaraz and J. Lopez, “Wide-area situational awareness for critical
infrastructure protection,” Computer, vol. 46, no. 4, pp. 30–37, 2013.
[13] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, “Malicious data attacks
on the smart grid,” IEEE Transactions on Smart Grid, vol. 2, no. 4,
pp. 645–658, 2011.
[14] A. Teixeira, S. Amin, H. Sandberg, K. H. Johansson, and S. S. Sastry,
“Cyber security analysis of state estimators in electric power systems,” in
Decision and Control (CDC), 2010 49th IEEE Conference on, pp. 5991–
5998, IEEE, 2010.
[15] K. C. Sou, H. Sandberg, and K. H. Johansson, “Detection and identification
of data attacks in power system,” in American Control Conference
(ACC), pp. 3651–3656, June 2012.
[16] G. Hug and J. A. Giampapa, “Vulnerability assessment of AC state
estimation with respect to false data injection cyber-attacks,” IEEE
Transactions on Smart Grid, vol. 3, no. 3, pp. 1362–1370, 2012.
[17] Y. Chakhchoukh and H. Ishii, “Cyber attacks scenarios on the measurement
function of power state estimation,” in American Control
Conference (ACC), 2015, pp. 3676–3681, IEEE, 2015.
[18] Y. Xiang, L. Wang, and N. Liu, “Coordinated attacks on electric
power systems in a cyber-physical environment,” Electric Power Systems
Research, vol. 149, pp. 156–168, 2017.
[19] J. Schmidhuber, “Deep learning in neural networks: An overview,”
Neural networks, vol. 61, pp. 85–117, 2015.
[20] T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical
learning: data mining, inference, and prediction. New York: Springer-
Verlag, 2nd ed., 2008.
[21] Y. Chakhchoukh and H. Ishii, “Coordinated cyber-attacks on the
measurement function in hybrid state estimation,” IEEE Transactions on
Power Systems, vol. 30, pp. 2487–2497, September 2015.
[22] Y. Chakhchoukh, S. Liu, M. Sugiyama, and H. Ishii, “Statistical outlier
detection for diagnosis of cyber attacks in power state estimation,” in
IEEE Power and Energy Society General Meeting (PESGM), July 2016.
[23] M. Keshk, N. Moustafa, E. Sitnikova, and G. Creech, “Privacy preservation
intrusion detection technique for SCADA systems,” in Military
Communications and Information Systems Conference (MilCIS), 2017.
[24] R. Tan, H. H. Nguyen, E. Y. Foo, D. K. Yau, Z. Kalbarczyk, R. K.
Iyer, and H. B. Gooi, “Modeling and mitigating impact of false data
injection attacks on automatic generation control,” IEEE Transactions
on Information Forensics and Security, vol. 12, no. 7, pp. 1609–1624,
2017.
[25] M. Esmalifalak, L. Liu, N. Nguyen, R. Zheng, and Z. Han, “Detecting
stealthy false data injection using machine learning in smart grid,” IEEE
Systems Journal, 2014.
[26] Y. Chakhchoukh, S. Liu, M. Sugiyama, and H. Ishii, “Statistical outlier
detection for diagnosis of cyber attacks in power state estimation,” in
Proc. IEEE PES General Meeting, pp. 1–5, July 2016.
[27] M. Sugiyama, T. Suzuki, and T. Kanamori, Density ratio estimation in
machine learning. Cambridge University Press, 2012.
[28] Y. Wang, M. Amin, J. Fu, and H. Moussa, “A novel data analytical
approach for false data injection cyber-physical attack mitigation in
smart grids,” IEEE Access, 2017.
[29] J. Hao, R. J. Piechocki, D. Kaleshi, W. H. Chin, and Z. Fan, “Sparse
malicious false data injection attacks and defense mechanisms in smart
grids,” IEEE Transactions on Industrial Informatics, vol. 11, pp. 1–12,
October 2015.
[30] Y. He, G. J. Mendis, and J. Wei, “Real-time detection of false data
injection attacks in smart grid: A deep learning-based intelligent
mechanism,” IEEE Transactions on Smart Grid, 2017.
[31] J. Wei and G. J. Mendis, “A deep learning-based cyber-physical strategy
to mitigate false data injection attack in smart grids,” in Cyber-Physical
Security and Resilience in Smart Grids (CPSR-SG), Joint Workshop on,
pp. 1–6, IEEE, 2016.
[32] S. Marsland, Machine learning: an algorithmic perspective. CRC press,
2015.
[33] H. Wang, J. Ruan, G. Wang, B. Zhou, Y. Liu, X. Fu, and J.-C. Peng,
“Deep learning based interval state estimation of AC smart grids against
sparse cyber attacks,” IEEE Transactions on Industrial Informatics,
2018.
[34] X. Liu and Z. Li, “False data attack models, impact analyses and defense
strategies in the electricity grid (subsection 5.2),” The Electricity Journal,
vol. 30, pp. 35–42, 2017.
[35] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in
International Joint IEEE Conference on Neural Networks, pp. 593–605,
1989.
[36] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for
deep belief nets,” Neural computation, vol. 18, no. 7, pp. 1527–1554,
2006.
[37] University of Washington, Power System Test Case Archive. [Online].
Available: http://www.ee.washington.edu/research/pstca/.
[38] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas,
“MATPOWER: Steady-state operations, planning, and analysis tools for power
systems research and education,” IEEE Transactions on Power Systems,
vol. 26, no. 1, pp. 12–19, 2011.
[39] Y. Chakhchoukh and H. Ishii, “Enhancing robustness to cyber-attacks in
power systems through multiple least trimmed squares state estimations,”
IEEE Transactions on Power Systems, vol. 31, no. 6, pp. 4395–4405,
2016.
[40] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press,
2016. http://www.deeplearningbook.org.
[41] M. Sokolova and G. Lapalme, “A systematic analysis of performance
measures for classification tasks,” Information Processing & Management,
vol. 45, no. 4, pp. 427–437, 2009.