Anomaly Detection in Electrical Substation Circuits via
Unsupervised Machine Learning
Alfonso Valdes
University of Illinois at Urbana-
Champaign
1308 W Main St
Urbana, IL 61801 USA
(217)244-5147
avaldes@illinois.edu
Richard Macwan
University of Illinois at Urbana-
Champaign
1308 W Main St
Urbana, IL 61801 USA
rmacwan@illinois.edu
Matt Backes
University of Illinois at Urbana-
Champaign
1308 W Main St
Urbana, IL 61801 USA
mbackes2@illinois.edu
ABSTRACT
Cyber-physical systems (CPS), such as smart grids, include
distributed cyber assets for monitoring, control, and
communication in order to maintain safe and efficient operation of
the physical system in question. Security in CPS may be able to
leverage physical laws that govern the CPS, providing a defense
strategy complementing conventional cybersecurity measures. CPS
intrusion detection systems (CPS IDS) should seek not just to detect
attacks in the host audit logs and network traffic (cyber plane), but
should consider how attacks are reflected in measurements from
diverse devices at multiple locations (physical plane). In electric
grids, voltage and current laws induce physical constraints that can
be leveraged in distributed agreement algorithms to detect
anomalous conditions where the physical and cyber states are
inconsistent. This can be done by explicitly coding the physical
constraints into a hybrid CPS IDS, but the detector is then specific
to a particular CPS. We propose an alternative approach using
machine learning to characterize normal, fault, and attack states in
a smart distribution substation CPS, using this as a component of a
CPS IDS. Our innovative approach does not require that attack
states be rare, nor does it require clean training data. Initial results
indicate that attack states are either learned as unique classes if they
are present in the training phase, or are easily detected as
anomalous by the trained system, and that normal and non-
malicious fault states are learned as well.
CCS Concepts
• Security and privacy → Intrusion/anomaly detection and malware mitigation • Hardware → Power and energy → Energy distribution → Smart grid • Computing methodologies → Machine learning → Unsupervised learning → Anomaly detection • Computing methodologies → Machine learning → Machine learning approaches → Neural networks.
Keywords
Anomaly Detection; Distributed Agreement; Machine Learning;
Smart Grid; Cyber/Physical System Security; Neural Networks;
Self-Organizing Maps; Adaptive Resonance Theory; Competitive
Learning; IEC 61850
1. INTRODUCTION
Modern infrastructure systems, such as those in energy delivery,
are rapidly evolving into cyber-physical systems (CPS) in which
distributed cyber assets for monitoring, communication, and
control interface with a physical process for safe and efficient
operation. Cyber assets include human-machine interfaces (HMI)
in control rooms, as well as embedded systems in substations or in
the field outside of any physical security perimeter. Increasingly,
these systems use commodity operating systems and networking
protocols and technology, with the hardware possibly hardened for
a harsh physical environment, but in many cases logically not
unlike those found in enterprise systems. Typically, legacy control
protocols such as MODBUS are adapted to modern CPS via
encapsulation (MODBUS over TCP [1]), while newer protocols
such as IEC 61850 are designed to be layered upon modern
networking protocols such as TCP and Ethernet.
There is concern that incorporation of extensive cyber assets in CPS
makes them vulnerable to cyber attack, which potentially leads to
failure of the physical process (for example, power outage in an
electrical system), threat to physical safety, damage to expensive
equipment, and environmental consequences. Embedded systems
in the field are frequently constrained with respect to
communication and computational capability. This limitation, as
well as strict real time requirements often found in CPS, renders
adoption of conventional security technology problematic. CPS
may be attacked via vectors similar to those used to attack
enterprise systems, such as malicious protocol commands or device
compromise. Another important class of attacks in CPS is to inject
incorrect measurement data, causing the CPS to undertake incorrect
and potentially destabilizing control action.
The special challenges to security in CPS are mitigated when the
laws governing the physical system induce constraints in what
should be observed in the cyber system. Understanding the physics
may afford an opportunity to enforce consistency across multiple
devices at different locations using diverse approaches for
measurement and control.
We consider the attacker who has some knowledge of the
underlying protocol and can inject a limited number of false
measurements that are correct as far as the protocol syntax. We
refer to this as a data injection attack. Our defenses make such an
attack significantly more difficult by leveraging the underlying
physical constraints, thereby requiring the simultaneous
compromise of a diversity of devices at multiple points in the
system. Merely injecting false measurements into the network
traffic, even if these are syntactically correct with respect to the
control protocol and pass rudimentary range checks, will not work
if the defender is able to quickly assess these for consistency with
the global system state, from a physical standpoint.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
ACM e-Energy Conference, June 21-24, Waterloo, Ontario, Canada
In order to implement security measures based on this approach,
we may encode the constraints imposed by the underlying physics.
This was the approach we took in prior work [6][8]. While these
works demonstrated the efficacy and feasibility of the approach, it
is the case that the implementation requires extensive configuration
for a specific CPS. In the present work, we demonstrate that a
machine learning/anomaly detection approach can learn system
states induced by the underlying physical laws, without explicitly
encoding these laws into our detector.
We present initial results of an unsupervised machine learning
approach to anomaly detection, specifically detection of
measurement patterns corresponding to normal operating
conditions, true non-malicious fault currents, and false data
injection in electrical distribution circuits. The learning approach
we use has aspects of Self-Organizing Maps (SOM) [4] and
Adaptive Resonance Theory (ART) [2]. Patterns of voltage and/or
current measurements at various points in the circuit are presented
to the classifier. The classifier ideally learns a number of pattern
exemplars that represent the observed measurement patterns, and
in particular learns patterns for normal and true fault conditions, as
well as for injected measurements falsely purporting to report a
fault condition.
In conventional cybersecurity, “anomaly detection” refers to
detection of extremely unusual events, where “unusual” may be
determined by observing the system to characterize normal
behavior. Anomaly detection in IDS is differentiated from
signature-based detection in which the IDS searches for patterns of
misuse, such as known malware byte sequences [5]. Although
anomaly detection potentially protects systems against unknown
attacks, such as zero-day attacks, in practice the false alarm and
missed detection rates frequently fall short of expectations. A
hypothesis underlying our work is that the constraints imposed by
the physical laws enable our system to overcome these
shortcomings.
Anomaly detection approaches typically try to construct a decision
surface between observations labeled “normal” and “anomalous”.
In practice, there is often the further (questionable) assumption that
malicious events are rare and thus anomalous. Moreover, many
learning-based approaches require clean training data with no
exemplars from the malicious class(es) [5]. The approach we
present has advantages over anomaly detection schemes that tend
to lump all “normal” cases into one class, in that our approach
explicitly allows for multiple patterns of normal conditions (in
electrical systems, these might correspond to different points in the
load curve). Another advantage is that the “anomalous” cases may
fall into several classes, and are potentially learned as different
pattern classes if present in the training phase.
While the approach is applicable to domains other than electrical
circuits, we believe that it is particularly effective in this domain
because the CPS is governed by well-understood physical laws,
namely, Kirchhoff current and voltage laws (KCL/KVL). We do
not claim that our classifier has learned KCL/KVL, but that these
laws induce measurement patterns for physically feasible system
states that are learned as pattern classes in our system. We
conjecture that the approach is applicable to other CPS where
physical laws constrain what is consistent among measurements,
even if those laws are not explicitly encoded into the classifier.
Our contribution is to demonstrate the feasibility of the approach in
the context of electrical distribution substation circuits, and in
particular smart grid systems. In these systems, modern protection
schemes (the rapid detection and isolation of a fault with minimum
outage while maintaining safe operation) depend on cyber assets
for ubiquitous measurement, communication, and control. A
classifier of measurement patterns for steady-state and fault
conditions is a form of distributed agreement among different
devices in the smart substation to detect and mitigate the effect of
an attacker who can falsify a limited number of measurements
available to the protective relays.
Moreover, as discussed in more detail in the results summary
below, our algorithm can be coupled with modern, simulation-
based design and analysis of electrical systems to identify points
where redundant measurements and/or enhanced cyber defenses
are most beneficial.
Throughout this paper we use the term “anomaly” to refer to a
cyber-physical anomaly, specifically, an anomalous pattern of
measurement values potentially indicative of a data injection attack
into the measurements in the substation protection environment. If
undetected, such an attack may trigger a system response that is
unwarranted, suboptimal, and potentially dangerous or
destabilizing.
2. SYSTEM UNDER EVALUATION
We model a distribution substation circuit that connects to the
upstream grid at 69 kV, includes a 7.5 MVA, 69 kV to 13.09 kV
transformer, then a circuit bus with two three-phase distribution
feeders that serve balanced, three-phase customer loads of 1 MW
and 1 MVAR peak. This is based on an actual distribution
substation on the US Eastern Interconnection. The substation
circuit topology is given in Figure 1. We have set up a hardware-
in-the-loop testbed in our laboratory, with the circuit defined in a
Real Time Digital Simulator (RTDS [7]) system connected to
physical ABB protective relays (ABB REF 615 series) at locations
90, 91, 92, and 93. The simulated substation operates under the IEC
61850 protocol [3]. For this work, we utilized only the substation
simulation in RTDS rather than a full hardware-in-the-loop setup
to generate simulated measurements for different contingencies and
injection attacks.
Figure 1. Distribution circuit under evaluation.
The KCL/KVL conditions will hold both in steady-state
system operation and in the case of true, physical short-circuit fault
currents (non-malicious faults). In the latter case, the relays should
undertake collective action to open breaker(s) so as to isolate the
fault and minimize the extent of the outage. The measurements are
simulated under varying load conditions and with measurement
noise that models the noise observed on the actual system in the field.
In addition to the simulated faults at various points in the circuit,
we also introduce some measurement samples corresponding to
injection attacks.
In our earlier work, we described distributed agreement algorithms
for detecting data injection attacks in this circuit [6], as well as in a
ring-bus topology [8] which is more typical of transmission
substations. In this earlier work, we established the efficacy of
detecting these attacks via applying KCL/KVL constraints,
explicitly coding these constraints in the detection algorithms.
A false injected measurement that evades detection may trigger an
unnecessary protective action, leading to economic consequences
and denial of service in a real system. A detection system based on
distributed agreement using KCL/KVL conditions raises adversary
work factor significantly, requiring the adversary to possess
detailed knowledge of the system and the ability to inject precise,
time-aligned false data at multiple points in the system. We now
demonstrate that our system, based on machine learning and
anomaly detection rather than explicitly encoding KCL/KVL
conditions, similarly raises adversary work factor. In either case,
the attack is not impossible, but it is significantly more difficult
than getting access to and compromising a measurement device.
Also, as shown in [6], fast remedial actions can be triggered by the
distributed agreement to mitigate the impact of the attack.
3. SIMULATION AND DATA GENERATION
We use the RTDS [7] to simulate the electrical distribution
substation circuit and collect measurements for the machine
learning algorithm at the locations where the relays are located.
RTDS and similar systems are widely used in the utility sector to
define numerous topologies, signal actual and virtual substation
components at high sample rates, and conduct high-fidelity
simulation analyses to enable effective system design.
The feature vector for the results discussed here consists of RTDS-
generated, time-aligned voltage and current magnitudes for all
three phases at the four locations, for a total of 24 features if we
simultaneously consider voltage and current, or 12 features if we
consider either separately.
We simulate the circuit for a total of 120 seconds, representing
circuit conditions and events in a compressed time frame. The
circuit has a time-varying load that is typical of a daily residential
customer load profile. The 24-hour load profile is compressed into
24 seconds, allowing the learning algorithm to process the entire
load profile and create corresponding patterns. The load profile is
continuously played in a 24 second loop for the duration of the
simulation. Load consumption levels are based on the actual
distribution substation feeder loads. Figure 2 shows the compressed
load curve.
Figure 2. Load curve compressed to 24 s.
In order to make the simulation realistic and to assess the ability of
the machine learning algorithm to identify patterns in presence of
noisy measurement, we introduce noise levels into the relay
measurements in the RTDS simulation. The noise is assumed to be
Gaussian with mean 0 and standard deviation corresponding to a
1% signal-to-noise ratio (SNR), typical of a distribution circuit.
The standard deviation for the Gaussian noise is given by
σ = (SNR / 100) × Signal_rms
where the Signal_rms is the RMS value of either the voltage, or
current sine wave. As can be seen from the equation, the standard
deviation changes with the signal level. We base the standard
deviation using the full-load current and nominal voltage values,
which we consider to be most challenging to the algorithm. In this
scenario, the noise level for the signals at a lower load level will be
higher than what it normally would be, but the assumption is that if
the machine learning algorithm is able to identify different system
states in this scenario, then it will be able to identify system states
for a noise level lower than this.
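As a minimal sketch of this noise model (the function name and the use of NumPy are our illustration, not the paper's implementation): the standard deviation is fixed at a percentage of the full-load/nominal RMS value, so the same sigma applies regardless of the instantaneous load level.

```python
import numpy as np

def add_measurement_noise(samples, signal_rms, snr_pct=1.0, rng=None):
    """Add zero-mean Gaussian noise whose standard deviation is a fixed
    percentage (snr_pct) of the full-load / nominal signal RMS value."""
    rng = rng or np.random.default_rng(0)
    sigma = signal_rms * snr_pct / 100.0  # sigma scales with the signal level
    return samples + rng.normal(0.0, sigma, size=np.shape(samples))

# 1% of a 13.09 kV nominal RMS voltage gives sigma of about 130.9 V
noisy = add_measurement_noise(np.zeros(8), signal_rms=13090.0)
```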
The sampling rate for the simulated measurements is approximately
260 Hz. Typically, phasor measurement units (PMU) sample at a
30 Hz rate, but we chose a higher sampling rate in order to emulate
the sampled value measurement scheme available under the IEC
61850 standard [3], which is the substation environment assumed
here.
We use the RTDS to simulate balanced, three-phase-to-ground
short-circuit faults at three different locations of the distribution
circuit. Two of the faults are located at the end of the distribution
feeders 1 and 2, and the third fault is located on the circuit bus 2.
We consider faults of this type for the sake of simplicity and in
order to test the efficacy of the machine learning algorithm at this
stage in our analysis, and also due to the severity of damage that
can occur with this type of fault. It is of more interest to ensure
proper protection relay operation for this type of fault as opposed
to, say, a single line-to-ground fault. In addition, three phase-to-
ground faults allow us to obtain much higher fault currents, which
will be used to test the machine learning algorithm’s ability to
correctly match patterns from faults at the same location but of
different magnitudes.
Since both the time of occurrence of the fault on the load curve
cycle and the fault magnitude are random and unpredictable, it
is necessary to account for both these variables while generating
data for machine learning. As stated before, the aim of the machine
learning algorithm is to identify the different states of the system
with minimum false positives for each state. Keeping this in mind,
the different faults were simulated for both the training phase and
the validation phase, varying magnitude and point on the load
curve.
The fault impedance takes on two values. The first is 70 Ohms. This
fault impedance was chosen to give a fault current near two times
the full-load current of the circuit at each fault location. The second
fault impedance is 20 Ohms. This impedance gives fault currents
of five times (relay 92), seven times (relay 91), and ten times (relay
93) the full load current, depending upon which location the fault
occurs. The duration of all faults is 0.05 seconds, or three full cycles
(at 60 Hz). Due to transition from fault states to normal, the trace
for an event in our simulation is non-deterministic, but typically
ranges from 14 to 20 samples. The injection attacks typically last
for 30 samples. We will use the term “trace” to refer to the sequence
of consecutive samples corresponding to a fault or attack event.
4. OVERVIEW OF LEARNING
APPROACH
In studies exploring machine learning, it is typical to assign samples
at random to training and test (or validation) data sets. The system
learns from samples in the training set, and claims of generalization
are based on results of the trained system as applied to the
validation set, without further learning. There are more complicated
assignments to validation and training, such as n-fold cross-
validation.
We will denote by “sample” a vector of time-aligned measurement
values at the various points of the circuit, and use “trace” to denote
the sequence of contiguous samples corresponding to a fault or data
injection event. For scoring detections and false alarms, we shall
use the nominal count of 15 samples per fault and 30 per injection
attack, although these are non-deterministic and event traces
exhibit transients at the start and/or end which can be difficult to
assign to the event or to normal. We believe that these values are
high, so that, for example, claiming 90% detection when we detect
27 samples of an attack may in fact be understating the actual
performance. Therefore, our claimed detection results are
conservative.
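The conservative trace scoring described above can be sketched as follows (the function name and list-of-flags representation are ours):

```python
def trace_detection(flags, nominal_len):
    """Score one event trace: the per-sample detection rate is computed
    against the nominal trace length (15 samples per fault, 30 per
    injection attack), and a trace counts as detected if any of its
    samples was flagged anomalous."""
    sample_rate = sum(flags) / nominal_len
    return sample_rate, any(flags)

# e.g., 27 of a nominal 30 attack samples flagged: 90% sample-level
# detection, and the trace itself counts as detected
rate, detected = trace_detection([True] * 27 + [False] * 3, 30)
```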
In this study, we generate a total of 31,250 time aligned samples
and assign the first 5000-6000 samples to training and the rest to
validation. Events of interest, like short circuit faults and data
injection attacks, are simulated at various points in the sample
stream. As will be further discussed in Section 5, the different
scenarios of machine learning are studied considering the inclusion
or exclusion of certain events in the training data. We claim that
this assignment is without significant loss of generality because the
introduced noise is stationary and we do not consider
autocorrelation between samples. Indeed, this lets us make
assertions about the ability of the system to generalize when events
of a particular type occur at previously unseen points on the load
curve.
In the training set, samples 1-5000 contain faults at relays 91 and
92, samples 5001-5500 contain a fault at relay 93, and samples
5501-6000 contain an injection attack on relay 90. We observe that these are
in the early part of the load curve of Figure 2, dipping to the first
local minimum, so that, if successful, our approach gives
confidence of generalizability beyond specific load characteristics.
We now have the option to choose our training set to be the first
5000, 5500, or 6000 traces/samples to give us a way to observe how
well the machine learning algorithm can identify different event
classes, depending on whether they are included in the training set
or not. In particular, we make the claim that, unlike many other
anomaly detection systems in cyber security, we do not require
clean (that is, attack-free) data in training. This claim can be
evaluated by training on samples 1-6000, which contain an attack.
The remainder of the data generated is the validation set, which
begins with steady-state circuit operation so we can verify that the
varying load curve does not cause anomaly detection. Following
this, the six true faults and eight data injection attacks occur
randomly in the remainder of the samples. Each event happens once
within the validation set.
We would like to observe how well the machine learning algorithm
can generalize to detect true faults and data injection attacks
regardless of where they happen on the load curve. For example,
we wish to confirm that a fault (at the same location) that occurs at
the peak of the load profile matches the pattern for the fault that
occurs at the trough of the load profile.
Next, we would like to observe how well the machine learning
algorithm can generalize to detect faults at the same location, but
with the fault current being a different magnitude from that seen in
training. In this case, we compare how well faults in the training
set, with fault current being twice the level of full-load current,
match with faults in the validation set having higher fault current,
e.g. five times the full-load current.
The algorithm has configurable parameters for learning rate,
goodness of pattern match, criteria for generating new learned
classes, and for blending similar learned pattern classes. The
algorithm includes outer and inner learning loops. On each pass
through the inner loop, a sample pattern (a measurement trace, after
feature normalization) is presented to the classifier. The outer loop
adjusts learning rates and match criteria, and prunes the learned
SOM by merging similar classes.
The various features are in different units (volts and amperes) or
are of varying magnitude (for example, voltages at Bus1 and Bus2
differ by more than a factor of 5 due to the transformer). For this
reason, we normalize each feature (column in the matrix) by
subtracting the mean and dividing by the standard deviation of the
feature. We then subtract the row mean from each row (time sample
in the matrix), which centers the sample about zero, removing some
of the effect of the location of the sample on the load curve. Finally,
we process the individual values through a squashing function,
given by
Squash(x) = 1 / (1 + e^(−ax))
where a is a scaling factor, set to 1.0 for our experiments.
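The full normalization pipeline can be sketched as follows; we assume the squashing function is the logistic form 1/(1 + e^(−ax)), and the NumPy formulation and function name are ours:

```python
import numpy as np

def normalize_features(M, a=1.0):
    """Normalize a (samples x features) measurement matrix:
    1) z-score each feature column (volts and amperes differ in scale),
    2) subtract each row's mean to reduce load-curve position effects,
    3) squash each value with a logistic function of scale a."""
    Z = (M - M.mean(axis=0)) / M.std(axis=0)
    Z = Z - Z.mean(axis=1, keepdims=True)
    return 1.0 / (1.0 + np.exp(-a * Z))
```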
Using terminology from Kohonen, the set of currently learned
patterns is referred to as the SOM (Self-organizing map), although
our technique for learning is different from that source. Kohonen
begins with multiple random patterns, whereas we begin with one.
New patterns are learned if a sample does not match a pattern in the
SOM. In this respect, our approach is more like that in ART. The
learned SOM may be considered the knowledge base of our system.
4.1 INNER LOOP
The inner loop operates by presentation of training patterns to the
classifier. Depending on how well sample patterns match learned
patterns, the learned patterns are reinforced or new learned pattern
classes are defined. In the inner loop, samples are presented at
random, with the total number of presentations equal to some
multiple of the number of training patterns. In this way, the classifier “sees”
each training pattern multiple times. This is a batch learning
operation, but with minor modifications could be adapted to
continuous or streaming-mode learning. Such a system would
likely run the inner loop for every sample in real time, and
periodically invoke instances of the outer loop in batch mode (the
outer loop is described below). In this case, it may be possible to
implement a system based on these concepts as a real-time CPS
security module.
In the following, X is a pattern currently learned in the SOM, and
Y is the training sample pattern presented to the classifier. The
pattern match we use is based on the normalized dot product or
vector cosine, given by
Match(X, Y) = XᵀY / (‖X‖ ‖Y‖)
This will be unity if the arguments are collinear. The winning
pattern is the currently learned pattern with the highest match score,
provided that score exceeds the goodness of pattern match
threshold. The winning pattern is adjusted slightly in the direction
of the presented pattern, according to the learning rate. If no
currently learned pattern matches the presented pattern to the
required goodness of match, the presented pattern becomes a new
exemplar for a learned pattern class. This is conceptually similar to
ART, where the match is the degree to which the new pattern
“resonates” with an already learned exemplar class, and a new
exemplar class is defined if no existing pattern class resonates
adequately. The learning operation modifies a learned pattern as
follows:
X ← (1 − W)X + WY
where W is the learning rate, initially set to 0.2 and decreased according to a
schedule that reduces it by half after each completion of all
iterations of the inner loop. As this parameter decreases, the
influence of new patterns on the learned pattern classes is lessened,
to reflect the expectation that further into the learning, the learned
classes should converge to the actual modes induced by the
physical laws.
Figure 3 shows pseudocode for the inner loop.
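One inner-loop presentation can be sketched as follows, using the cosine match and the update rule above (our illustrative code; the list-based SOM storage and function names are assumptions, not the authors' implementation):

```python
import numpy as np

def cosine_match(x, y):
    """Normalized dot product; equals 1.0 when the patterns are collinear."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def present(som, counts, y, w=0.2, t_win=0.85):
    """Present one sample pattern y: reinforce the best-matching learned
    pattern if its score exceeds the goodness-of-match threshold t_win,
    otherwise start a new learned pattern class from the sample."""
    scores = [cosine_match(x, y) for x in som]
    j = int(np.argmax(scores))
    if scores[j] > t_win:
        counts[j] += 1
        som[j] = (1.0 - w) * som[j] + w * y  # X <- (1 - W)X + W*Y
    else:
        som.append(y.copy())
        counts.append(1)

som, counts = [np.array([1.0, 0.0])], [1]
present(som, counts, np.array([0.0, 1.0]))    # no match: new class learned
present(som, counts, np.array([0.99, 0.02]))  # close match: class reinforced
```

The outer loop then halves w after each full pass and raises t_win toward its cap, so late presentations perturb the learned classes less.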
4.2 OUTER LOOP
An outer loop iteration consists of all required iterations of the inner
loop and then adjustments to various parameters.
We track the number of patterns that each learned pattern in the
SOM “wins”. If at the end of inner loop execution a pattern class in
the SOM wins too few data patterns, as determined by a pruning
threshold, it is removed (pruned) from the SOM.
At the end of each outer loop operation, we check the SOM for
similar patterns (effectively, we run the SOM through the SOM).
Patterns that match other patterns according to the goodness of
match criterion are optionally blended according to a weighted
average based on the number of patterns each has won. Figure 4
provides pseudocode for the pattern blending logic. Pattern
blending has the effect of further pruning the SOM. The results in
this paper reflect pruning and pattern blending for all but the last
iteration of the outer loop.
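The blending step, applied across the whole SOM, can be sketched as follows (our code; the pairwise sweep order is an assumption):

```python
import numpy as np

def blend_som(som, counts, t_blend=0.85):
    """Merge pairs of learned patterns that match each other to within
    t_blend, replacing them with a weighted average based on the number
    of patterns each has won, and pruning the absorbed pattern."""
    i = 0
    while i < len(som):
        j = i + 1
        while j < len(som):
            m = np.dot(som[i], som[j]) / (np.linalg.norm(som[i]) * np.linalg.norm(som[j]))
            if m > t_blend:
                som[i] = (counts[i] * som[i] + counts[j] * som[j]) / (counts[i] + counts[j])
                counts[i] += counts[j]
                del som[j], counts[j]  # PRUNE(Xj)
            else:
                j += 1
        i += 1
    return som, counts
```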
At the end of each outer loop iteration and after optional pattern
blending, the goodness of match criterion, initially 0.85, is adjusted
so that it is halfway between its present value and a configurable
capped value. For the results presented here, we used a cap value
of 0.95. We have experimented with values as high as unity, which
is a more stringent match requirement. This has the effect of
requiring a higher quality match of the SOM to the set of training
patterns as training proceeds. Using a more stringent match
criterion results in more normal patterns in the learned SOM, but
the results with respect to anomalous patterns do not change.
The code has an option to perform a simulated annealing operation
at outer loop iterations, wherein the learned SOM is randomized
slightly according to an annealing schedule wherein the
randomization is reduced later in the schedule. This made no
difference in the detection performance of anomalous cases, and
has been disabled for the results presented here. Pseudocode for the
outer loop is provided in Figure 5.
5. RESULTS
Our study used 31,250 time-aligned measurement patterns, with the
first n patterns assigned to the training set, and the remaining to the
test/validation set. As described above, the set includes
observations corresponding to true (non-malicious) ground faults at
the locations 91, 92, and 93 in Figure 1, and observations
corresponding to injection attacks at all four relay locations. By
changing the number of training samples n, we can generate
Let N_SOM = number of learned patterns in the SOM
Let N_j = number of patterns won by SOM pattern class j
For each input pattern Y
    j = ArgMax_i(Match(Y, X_i))
    For some threshold T_win
        If Match(Y, X_j) > T_win
            N_j = N_j + 1
            X_j = (1 − W)X_j + W·Y
        else
            N_SOM = N_SOM + 1
            X_(N_SOM) = Y
            N_(N_SOM) = 1
        endif
Figure 3. Pseudocode for inner loop.
X_i, X_j: two learned patterns in the SOM
N_i, N_j: number of patterns each of these has won
For some threshold T_blend
    If Match(X_i, X_j) > T_blend
        X_i = (N_i·X_i + N_j·X_j) / (N_i + N_j)
        PRUNE(X_j)
    endif
Figure 4. Pseudocode for pattern blending.
Execute inner loop
Adjust learning weight W
Adjust thresholds Twin and Tblend
Prune SOM
Blend Patterns
Figure 5. Pseudocode for outer loop.
training sets that include only two faults, all three faults, and zero
or one injection attacks.
The method is considered successful if the following conditions are
met (n is the number of samples in the training set for each
condition described):
For n=5000, the fault at position 93 is not in the training
set. The other faults are learned as patterns, and faults at
the same location in the validation set should not
generate an anomaly, even though they are of different
magnitudes and at different points on the load curve. All
instances of the fault at position 93 will appear
anomalous, as will all the injection attacks.
For n=5500, all faults are in the training set. These
should be learned as distinct patterns, and should not
trigger anomalies when encountered in the validation
set. All injection attacks should appear anomalous. This
and the n=5000 are considered “clean” in that they are
attack-free.
For n=6000, the injection attack at relay 90 is included
in the training set. This should be learned as a pattern
class containing only the injection attack. In the
validation phase, future instances of this attack should
match the learned attack pattern and not trigger an
anomaly, but the other injection attacks should trigger
anomalies. This demonstrates that the system can learn
an attack pattern without diluting its ability to classify
normal patterns or detect future attack patterns.
The event timeline is schematically presented in Figure 6. In the
detailed description of each experiment that follows, we establish
that these success criteria are met. The 120 seconds represent five
compressed days (cycles of the load curve), as discussed
previously.
The anomaly score for a sample is the match score for the class that
the sample most closely fits. For these runs, we use an anomaly
threshold of 0.90 (anything below is considered anomalous).
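In code, this scoring rule can be sketched as follows (illustrative; the function names are ours):

```python
import numpy as np

def anomaly_score(sample, som):
    """Anomaly score = the best match score over all learned classes."""
    return max(
        float(np.dot(x, sample) / (np.linalg.norm(x) * np.linalg.norm(sample)))
        for x in som
    )

def is_anomalous(sample, som, threshold=0.90):
    """A sample is anomalous when no learned class matches it well enough."""
    return anomaly_score(sample, som) < threshold
```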
Figure 6. Simulation Timeline with injected events
For all runs, we define a false alarm as a sample that is incorrectly
declared anomalous, and a missed detection (or false negative) as a
sample that should have been declared anomalous but was not. All
samples corresponding to steady-state operation should appear
normal, and they do for all runs. Depending on the training set
selection, the fault at relay 93 and the attack at relay 90 may or may
not be anomalous, as explained below.
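The bookkeeping behind these definitions can be sketched as follows; the data layout (boolean flags per sample, grouped by trace) is a hypothetical representation for illustration, not the paper's implementation.

```python
# Hedged sketch of the per-sample vs. per-trace metrics defined above.

def sample_detection_rate(attack_flags):
    """attack_flags: one boolean per attack sample (True = declared anomalous)."""
    return sum(attack_flags) / len(attack_flags)

def trace_detection_rate(traces):
    """traces: list of per-trace flag lists; a trace counts as detected
    if at least one of its samples was declared anomalous."""
    detected = sum(1 for flags in traces if any(flags))
    return detected / len(traces)

def false_alarm_rate(benign_flags):
    """benign_flags: flags for samples that should appear normal;
    any True here is a false alarm."""
    return sum(benign_flags) / len(benign_flags)

# Toy example: two attack traces, the second fully missed.
traces = [[True, True, False], [False, False, False]]
all_samples = [f for t in traces for f in t]
print(sample_detection_rate(all_samples))  # 2 of 6 attack samples flagged
print(trace_detection_rate(traces))        # 1 of 2 traces detected
```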
5.1 First Training Set: n=5000
The classifier learns three pattern classes: one for steady-state operation (4971 matches) and one for each of the non-malicious fault states at relays 91 and 92 (15 and 14 matches, respectively). It classifies all patterns from these classes correctly in both the training and validation sets. All patterns corresponding to non-malicious faults fall into the pattern class for their fault location, and no other samples are classified as belonging to either of these fault classes.
As the fault at relay 93 event is not in the training set, we expect it
to be detected as an anomaly in the instance starting near sample
5400 as well as all subsequent faults at this relay location in the
validation set, although they vary in magnitude and location on the
load curve. This is in fact the result obtained.
The attack at relay 90 is not in the training set, and the
corresponding event traces at samples 5920, 21807, and 28838 are
flagged as anomalies. All other attack traces are considered
anomalous, as expected.
Two samples for the F92 event at 17135 are declared anomalous.
As the training set included the fault at this location, we consider
these false alarms. Even for these cases, the false alarm samples
comprise a small part of the trace. Typically, the anomalous
samples in a non-malicious fault trace occur toward the end of the
trace, in the unusual patterns observed as the system transitions to
normal operation. This represents a false alarm rate of less than
0.01%.
The results for the validation set on this run are 92.06% detection
on attack samples, but 100% attack detection when considering
traces. None of the false alarms occurred in samples corresponding
to normal system operation, even as the load curve varies.
5.2 Second Training Set: n=5500
The classifier learns five patterns. All normal samples are learned as a single pattern class, as are faults F91 and F92 (one class each). Fault F93 is learned as two patterns, with the bulk of the trace (14 samples) as one pattern, and the samples at the start and end as another pattern with a count of two.
Two samples of the F92 fault near sample 15829 are considered
anomalous and counted as false alarms. All other fault and normal
samples were correctly classified, for a false alarm rate under
0.01%.
It appears that the pattern learned for the F93 fault in the training set degrades the detection performance of attack A93 (false data injection at position 93) at samples 24411 and 26233.
pattern for F93 is similar to at least some samples in these attack
traces. In the first of the A93 attacks in the validation set, which is
at 5x nominal magnitude, 21 samples of the approximately 30-
sample event trace were considered anomalous. For the second A93
event (near sample 26233, 3x magnitude), the detector misses the
entire trace. We note that even though the anomaly scores for the
first event were below threshold, they were on the high side,
matching the F93 pattern at scores of 0.85 or higher.
All other attack traces were detected, with results very similar to
those for the n=5000 experiment.
The results for this experiment were a false alarm rate under 0.01%
and a detection rate of 78.15% of samples, or 88.89% of traces.
5.3 Third Training Set: n=6000
The training set includes all fault exemplars above, as well as an
attack at relay 90 (event A90 at sample 5920). The expectation is
that all the fault exemplars would be learned, as would the A90
event. Subsequent instances of faults and also of the A90 event
should not appear anomalous.
The classifier learns seven patterns. As with the earlier results, the
normal samples are learned as a class, and the samples for faults
F91, and F92 are learned as distinct pattern classes. As before, F93
is learned as two pattern classes, with 14 samples in the bulk of the
trace as one pattern, and samples at the start and end of the trace
learned as a pattern with only these two samples. The attack trace
A90 is learned as two pattern classes, the larger with 26 samples,
and the smaller with 3 samples, once again at the start and end of
the trace.
We observe a single false alarm sample for the F92 event near
sample 17135.
As in the n=5500 experiment, samples for various instances of
attack A93 match the learned pattern for fault F93, degrading the
attack detection performance. As in the prior experiment, 21
samples of the A93 event near sample 24411 are considered
anomalous, but with relatively high scores, and the A93 trace near
sample 26233 is missed entirely.
For this run, the false alarm rate is under 0.01% and the detection
rate is 71.11% of samples or 83.33% of traces, with all missed
detections occurring at position 93.
5.4 Results Summary
We conjecture that, due to some quirk in the topology, learning the
pattern for the non-malicious fault at relay 93 inhibits the ability to
detect an injection attack at the same location. We are investigating
why this is the case. This finding is nonetheless a useful result, as a
power system designer can use our approach to identify by
simulation the points in the system where an injection attack is
more difficult to detect, and either modify topology slightly or
deploy extra defense or measurement redundancy at these points.
These results indicate that inclusion of an attack trace in the training
phase does not contaminate the results or impact detection
performance. In an actual implementation, one would label patterns
corresponding to observed or modeled attacks. This can be
achieved either by expert analysis of the algorithm result for an
event of interest that one wishes to incorporate into the system’s
knowledge base as a labeled exemplar, or by injecting such a trace
into the training set.
The following table summarizes these results. Entries marked with an asterisk (*) indicate either false alarms or failure to detect an entire attack trace (a missed detection, or false negative).

Table 1. Summary of Results: number of anomalous samples in each event trace

Event        | Starting Sample (Approx) | Training 5000 | Training 5500 | Training 6000
F92 (2x)     |  2006  |   0     |   0     |   0
F91 (2x)     |  3833  |   0     |   0     |   0
F93 (2x)     |  5400  |  15     |   0     |   0
A90 (5x)     |  5920  |  28     |  28     |   0
F91 (2x)     | 13126  |   0     |   0     |   0
F93 (2x)     | 14523  |  14     |   0     |   0
F92 (2x)     | 15829  |   0     |   2*    |   0
F92 (5x)     | 17135  |   2*    |   0     |   1*
F91 (7x)     | 18442  |   0     |   0     |   0
F93 (10x)    | 19748  |  15     |   0     |   0
A92 (5x)     | 20504  |  24     |  24     |  24
A90 (5x)     | 21807  |  25     |  25     |   0
A91 (5x)     | 23106  |  27     |  27     |  27
A93 (5x)     | 24411  |  29     |  21     |  21
A91 (10x)    | 24930  |  28     |  28     |  28
A93 (3x)     | 26233  |  27     |   0*    |   0*
A92 (7x)     | 27535  |  28     |  28     |  28
A90 (10x)    | 28838  |  30     |  30     |   0
FA rate      |        |  0.01%  |  0.01%  |  0.00%
Detection (samples) |  | 92.06%  | 78.15%  | 71.11%
Detection (traces)  |  | 100.00% | 88.89%  | 83.33%
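The trace-level detection rates in Table 1 can be reproduced from the per-trace anomalous-sample counts. This sketch assumes, per the text, that a trace counts as detected if any of its samples is flagged, and that the learned A90 traces are excluded from the n=6000 denominator because they are expected to match the learned attack class.

```python
# Hedged sketch reproducing the trace-level detection rates in Table 1 from
# the per-trace anomalous-sample counts (attack traces only).

# (event, count for n=5000, n=5500, n=6000); None marks the A90 traces that
# are matched by the class learned from the n=6000 training set and hence
# are not counted as attacks to detect in that run.
attacks = [
    ("A90@5920",  28, 28, None),
    ("A92@20504", 24, 24, 24),
    ("A90@21807", 25, 25, None),
    ("A91@23106", 27, 27, 27),
    ("A93@24411", 29, 21, 21),
    ("A91@24930", 28, 28, 28),
    ("A93@26233", 27, 0, 0),
    ("A92@27535", 28, 28, 28),
    ("A90@28838", 30, 30, None),
]

def trace_rate(column):
    """A trace is detected if at least one of its samples was flagged."""
    counts = [row[column] for row in attacks if row[column] is not None]
    return sum(1 for c in counts if c > 0) / len(counts)

for col, n in ((1, 5000), (2, 5500), (3, 6000)):
    print(f"n={n}: {100 * trace_rate(col):.2f}% of attack traces detected")
```

Running this yields 100.00%, 88.89%, and 83.33%, matching the Detection (traces) row.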
6. RELATED WORK
The work presented here proposes anomaly detection based on
machine learning to detect data injection attacks in electrical
systems, and distinguish these from normal and non-malicious fault
states in such systems. Data injection attacks into electrical systems
have been described in [6, 8, 9, 10, 11]. The consequences of such
attacks can range from incorrect system response, which leads to
suboptimal operation, all the way to potentially destabilizing
control actions.
Liu, Ning, and Reiter [9] derived algebraic conditions for injection attacks that are able to evade detection but result in incorrect system state estimation. State estimation uses an iterative approach to estimate state from observable measurements. Measurements are related to state via a Jacobian matrix. The rank of the Jacobian in typical transmission systems is such that an injected error vector lying in the column space of the matrix will lead to an incorrect state estimate, yet will not be flagged by the commonly used bad data detection algorithms.
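A minimal numerical sketch of this stealth condition under the standard DC state estimation model z = Hx + e with least-squares estimation (the matrix and values here are illustrative, not from the paper): an injected error a = Hc shifts the estimate by c while leaving the bad-data residual unchanged.

```python
# Hedged sketch: an injected error in the column space of the measurement
# Jacobian H shifts the least-squares state estimate but leaves the
# residual, which bad-data detection inspects, exactly unchanged.
# Matrix sizes and values are illustrative only.

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 measurements, 2 states
x_true = [1.0, 2.0]
noise = [0.05, -0.03, 0.1]                 # measurement noise, not in col(H)

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def estimate(z):
    # Least squares via normal equations for this fixed H:
    # H^T H = [[2, 1], [1, 2]], inverse = (1/3) * [[2, -1], [-1, 2]].
    Htz = [z[0] + z[2], z[1] + z[2]]
    return [(2 * Htz[0] - Htz[1]) / 3.0, (-Htz[0] + 2 * Htz[1]) / 3.0]

def residual(z):
    return [zi - hxi for zi, hxi in zip(z, matvec(H, estimate(z)))]

z = [hx + e for hx, e in zip(matvec(H, x_true), noise)]
c = [0.5, -0.2]                            # attacker's desired state shift
z_attacked = [zi + ai for zi, ai in zip(z, matvec(H, c))]  # a = Hc

# The residual is identical under attack...
print(max(abs(r1 - r2) for r1, r2 in zip(residual(z), residual(z_attacked))))
# ...while the state estimate has been silently shifted by c.
print([xa - xb for xa, xb in zip(estimate(z_attacked), estimate(z))])
```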
Bobba et al. [10] extended this result by considering detection and
countermeasures consisting of optimally placing a limited number
of costly but higher-fidelity, harder-to-compromise measurement
units (modern Phasor Measurement Units, or PMUs) so as to
achieve a degree of redundancy that greatly increases the attacker’s
burden. Teixeira and his collaborators [11] considered a radial distribution system in which Conservation Voltage Reduction (CVR) is applied. CVR is an energy conservation measure whereby voltage is reduced slightly at the head of the distribution feeder and controlled by distribution transformers along the feeder to maintain end-of-line voltage above the lower nominal limit. In this study, the authors demonstrated that bad data injection would lead to suboptimal CVR, but would not be likely to destabilize the system.
CPS offer the opportunity to leverage physical constraints to
advance cyber-physical security. The consistency of measurements
constrained by underlying physical laws can be leveraged to
complement conventional cyber defenses and implement a system
that requires an adversary to compromise network traffic as well as
measurements at multiple points, significantly raising adversary
work factor. This has been demonstrated by the use of KCL/KVL
constraints to distinguish fault and data injection conditions in
distribution and transmission substations [6, 8]. The present work
differs from these in that the KCL/KVL conditions are not
explicitly encoded in our algorithm, but we hypothesize that our
system learns the possible system states constrained by these
conditions.
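A toy sketch of such a physical-consistency check (the function names and tolerance are illustrative; the actual schemes in [6, 8] use distributed agreement over many measurement points):

```python
# Hedged sketch of a KCL consistency check: the signed currents entering a
# bus must sum to approximately zero, so falsifying a single measurement
# produces a physically inconsistent residual. Tolerance is illustrative.

def kcl_residual(branch_currents):
    """Absolute sum of signed currents into one bus; ~0 by Kirchhoff's
    current law when all measurements are consistent."""
    return abs(sum(branch_currents))

def consistent(branch_currents, tol=1e-3):
    return kcl_residual(branch_currents) <= tol

measured = [10.0, -6.0, -4.0]    # consistent set: currents sum to zero
falsified = [10.0, -6.0, -1.5]   # one injected false value
print(consistent(measured))      # physically plausible
print(consistent(falsified))     # flagged: violates KCL
```

To evade such a check, an adversary would have to falsify several measurements jointly so that the injected values still satisfy the constraint, which is the increased work factor argued for in the text.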
Anomaly detection in enterprise systems has arguably fallen short
of hopes that it would provide a “magic bullet” against such threats
as zero-day attacks. This is because enterprise systems are highly
variable with respect to many features of interest. Generally,
anomaly detection in enterprise systems has not achieved adequate
detection sensitivity at an acceptable false positive rate. CPS, by
contrast, are characterized by regularities in communications and
measurements. In [12], for example, it was demonstrated that
communication patterns in a SCADA system are sufficiently
regular that they can be learned, and a detector based on anomaly
detection would be feasible. The authors of [13] developed an
anomaly detection system that, like ours, looks for anomalies in the
measurements. Their technique is to fit a quadratic regression to a
window of measurement points, and then perform a principal
component analysis (PCA) on the regression coefficient. The
anomaly detection is based on clustering of the first few principal
components. Our approach considers measurement samples
directly.
7. SUMMARY
Success of our approach with respect to normal and anomalous patterns requires satisfying the following criteria:
- The classifier should be able to identify distinct states corresponding to normal operation, non-malicious fault (consistent with KCL/KVL), and false measurement injection.
- A pattern corresponding to an injection in the training phase should cause the system to learn a class that does not match any normal or non-malicious fault pattern (in other words, the class consists of the anomalous pattern only). Note that, in particular, we do not require a "clean" training set.
- A pattern in the validation phase corresponding to an injection should not match any normal or non-malicious fault pattern class well. In particular, a threshold based on the match score defined above should easily identify this pattern as an anomaly, even on the first presentation of the anomalous pattern to the classifier.
The results obtained indicate that the approach described above
largely meets these criteria, subject to the preceding discussion on
false alarms and missed detections. This gives us confidence that,
at least in this domain, the constraints induced by underlying
physics of a CPS lead to states amenable to anomaly detection
based on machine learning. In order for an attacker to inject false
measurements and evade detection, he or she would need to
manipulate not merely messages in the cyber plane, but
measurements at multiple points in the physical plane of the CPS.
Requiring the adversary to simultaneously falsify a number of
measurements at multiple points in the CPS in a manner consistent
with the underlying physics represents a significant increase in
adversary capability to perpetrate a successful, undetected injection
attack.
Our system may be implemented and deployed in a variety of ways.
The first deployment concept is off-line and simulation based, as
we have done here. For CPS in which there is a high-fidelity
simulation framework, our system may be trained entirely off-line.
This is the case in the electric power sector, where technologies
such as RTDS are widely used for power system design. In this
case, part of the design process can include our algorithm to support
design for security, training a system that can detect false data from
measurements consistent with the physical laws. As the results presented here illustrate, a security sensitivity analysis may identify measurement points where, due to topology or insufficient redundancy, false data may be difficult to distinguish from actual measurements. In this situation, analysis based on our system identifies points for cost-effective deployment of redundant measurements or enhanced defenses. Off-line, simulation-based
training also enables the system designer to label patterns
corresponding to event classes of interest, so that these can be
classified in operational use as something more specific than
“anomaly”. This allows an implementation in which the learning is
semi-supervised: ground truth for some of the event classes is
known, and the corresponding traces can be labeled before training.
We intend to explore this variant in future work.
In on-line deployment, our algorithms self-organize and learn
patterns of events as these arise in operation. In these cases, expert
operators would need to look at the traces contributing to various
learned patterns, and possibly label these as normal, non-malicious,
or attack.
Finally, it is possible to train a system off-line based on a high-
fidelity simulation and some event labels, then transition this
system to online operation, with parameters permitting new
patterns to be learned and labeled by experts.
8. ACKNOWLEDGMENTS
The work is sponsored by the Department of Energy under grant
DE-OE0000674, under subcontract to ABB Research. The views
expressed are solely those of the authors.
The authors would like to acknowledge the support of the ABB US
Corporate Research Center in Raleigh, NC, the ECE Department
and the Information Trust Institute of the University of Illinois at
Urbana-Champaign, Ameren Illinois, and the US DOE CEDS
program.
9. REFERENCES
[1] Acromag, Inc. (2005). Introduction to MODBUS TCP/IP.
https://www.acromag.com/sites/default/files/Acromag_Intro_
ModbusTCP_765A.pdf
[2] Grossberg, S. (ed.). 1988. Neural Networks and Natural
Intelligence, MIT Press.
[3] IEC 61850 Communication networks and systems in
substations, all parts, Reference number IEC 61850-SER.
http://www.iec.ch/smartgrid/standards/
[4] Kohonen, T. 2001. Self-Organizing Maps, 3rd edition,
Springer.
[5] Stolfo, S., Hershkop, S., Bui, L., Ferster, R., and Wang, K. 2005. "Anomaly Detection for Computer Security and an Application to File System Accesses." In M.-S. Hacid et al. (eds.), ISMIS 2005.
[6] Macwan, R., Drew, C., Panumpabi, P., Valdes, A., Vaidya,
N., Sauer, P., and Ischenko, D. "Collaborative Defense Against Data Injection Attacks in IEC 61850 Based Smart Substation." To appear in IEEE Power and Energy Society
(IEEE-PES), July 17-21, 2016.
[7] RTDS Technologies. 2015. https://www.rtds.com/
[8] Valdes, A., Cui Hang, Panumpabi, P., Vaidya, N., Drew, C.,
Ischenko, D. 2015. Design and simulation of fast substation
protection in IEC 61850 environments. In Modeling and
Simulation of Cyber-Physical Energy Systems (MSCPES),
Proceedings of the 2015 Workshop on Cyber-Physical
Systems (April 13, 2015), 1-6.
[9] Y. Liu, P. Ning, and M. Reiter, “False data injection attacks
against state estimation in electric power grids,” Proc. 16th
ACM Conf. on Computer and Communications Security
(CCS '09), Chicago, IL, 2009, pp. 21-32.
[10] R. Bobba, K. Rogers, Q. Wang, H. Khurana, K. Nahrstedt,
and T. Overbye, “Detecting false data injection attacks on
DC state estimation,” Proc. 1st Workshop on Secure Control
Systems (SCS), Stockholm, Sweden, 2010. [Online].
Available:
https://www.truststc.org/conferences/10/CPSWeek/program.
htm
[11] A. Teixeira, G. Dán, H. Sandberg, R. Berthier, R. B. Bobba,
and A. Valdes, “Security of smart distribution grids: Data
integrity attacks on integrated volt/VAR control and
countermeasures,” Proc. American Control Conference
(ACC), Portland, OR, 2014, pp. 4372-4378.
[12] Cheung, S. and Valdes, A. “Communication Pattern
Anomaly Detection in Process Control System Security”,
IEEE International Conference on Technologies for
Homeland Security, Waltham, MA , May 11-12, 2009
[13] Amidan, B.G., Follum, J.D., Freeman, K.A., and Dagle, J.E. 2015. "Baselining PMU Data to Find Patterns and Anomalies." In CIGRE US National Committee: 2015 Grid of the Future Symposium, Chicago, IL.
[14] Communication networks and systems for power utility automation - Part 9-2: Specific communication service mapping (SCSM) - Sampled values over ISO/IEC 8802-3, IEC International Standard 61850-9-2, Ed. 2.0, Nov. 2011.
... In the context of distribution systems, SVM models can be used to categorize data based on whether it reflects a normal or abnormal operating condition, enabling operators to engage in actions that will mitigate interruptions [21]. Finally, clustering and anomaly detection techniques within unsupervised learning have been utilized to enhance the resilience of distribution systems [22]. These algorithms can unearth patterns and aberrations in data that conventional methods might overlook, empowering operators to take preemptive steps in mitigating or eliminating any disruptions. ...
Chapter
Lately, distribution systems have grown increasingly intricate and vulnerable to a wide range of disruptions, such as natural disasters, cyberattacks, and equipment failures. As a result, there is a growing need for methods that can improve the resilience of these systems and minimize their downtime. In the realm of cutting-edge technology, novel methods of artificial intelligence are coming to the forefront, with machine learning (ML) leading the pack and becoming increasingly applied in many sectors including the energy sector. These automated techniques can help enhance the resilience of distribution systems by providing real-time data analysis, predictive modeling, and automated decision-making capabilities. Accordingly, this chapter delves into the role played by different ML techniques in Revitalizing the tenacity of distribution networks. Specifically, it provides a comprehensive review of the existing studies on the application of ML in distribution systems’ resilience and provides several case studies to illustrate the practical applications of these robust methods aimed at minimizing the frequency of disruptions from both natural and man-made disasters. Additionally, this chapter details the challenges of deploying ML techniques for distribution systems resilience along with highlighting the future directions of research in this area that will address the challenges to fully leverage the potential of AI-powered approaches for improving distribution system resilience. This chapter will act as an insightful resource for different key stakeholders, researchers, and students with a vested interest in this area.
... The algorithm combines fuzzy math with an SVM to separate noise and outliers from valid samples. In practical applications, researchers have made some [30,31], but many problems remain. For example, if there is a considerable amount of abnormal data or abnormal data with a certain distribution, FSVM loses information when separating the abnormal data. ...
... In this article, we present an alternative paradigm, namely, anomaly detection [19][20][21][22], which is particularly suitable for detecting such special configurations. The main advantage of anomaly detection with respect to conventional classification schemes is that here one does not need the a priori knowledge * jb.ghosh@outlook.com ...
Article
We present the application of classical and quantum-classical hybrid anomaly detection schemes to explore exotic configurations with anomalous features. We consider the Anderson model as a prototype, where we define two types of anomalies—a high conductance in the presence of strong impurity and a low conductance in the presence of weak impurity—as a function of random impurity distribution. Such anomalous outcome constitutes an imperceptible fraction of the data set and is not a part of the training process. These exotic configurations, which can be a source of rich new physics, usually remain elusive to conventional classification or regression methods and can be tracked only with a suitable anomaly detection scheme. We also present a systematic study of the performance of the classical and the quantum-classical hybrid anomaly detection method and show that the inclusion of a quantum circuit significantly enhances the performance of anomaly detection, which we quantify with suitable performance metrics. Our approach is quite generic in nature and can be used for any system that relies on a large number of parameters to find their new configurations, which can hold exotic new features.
... Self-Organizing Maps (SOM) is an unsupervised machine learning technique used to produce a low-dimensional representation of a higher dimensional data set while preserving the topological structure of the data. SOMs are used in [200] and [197] for detecting faults and FDI attacks using consumption data. ...
Article
Full-text available
The power grid is a constant target for attacks as they have the potential to affect a large geographical location, thus affecting hundreds of thousands of customers. With the advent of wireless sensor networks in the smart grids, the distributed network has more vulnerabilities than before, giving numerous entry points for an attacker. The power grid operation is usually not hindered by small-scale attacks; it is popularly known to be self-healing and recovers from an attack as the neighboring areas can mitigate the loss and prevent cascading failures. However, the attackers could target users, admins and other control personnel, disabling access to their systems and causing a delay in the required action to be taken. Termed as the biggest machine in the world, the US power grid has only been having an increased risk of outages due to cyber attacks. This work focuses on structuring the attack detection literature in power grids and provides a systematic review and insights into the work done in the past decade in the area of anomaly or attack detection in the domain.
... In this paper, we present a new paradigm, namely anomaly detection [13][14][15] which is particularly suitable for detecting such special configurations. The main advantage of anomaly detection with respect to the conventional classification scheme is that here one doesn't need the a priory knowledge of the data points that are uncharacteristic for a specific data set or the anomaly. ...
Preprint
Full-text available
In this paper we present the application of classical and quantum-classical hybrid anomaly detection schemes to explore exotic configuration with anomalous features. We consider the Anderson model as a prototype where we define two types of anomalies - a high conductance in presence of strong impurity and low conductance in presence of weak impurity - as a function of random impurity distribution. Such anomalous outcome constitutes less than 10% of a data set and is not a part of the training process. The anomaly detection is therefore more suitable to detect unknown features which is not possible with conventional classification or regression methods. We also present a systematic study of the performance of the classical and the hybrid method and show that the inclusion of a quantum circuit significantly enhances the performance of anomaly detection which we quantify with suitable performance metrics. Our approach is quite generic in nature and can be used for any system that relies on a large number of parameters to find their new configurations which can hold exotic new features.
Article
Full-text available
Energy systems require radical changes due to the conflicting needs of combating climate change and meeting rising energy demands. These revolutionary decentralization, decarbonization, and digitalization techniques have ushered in a new global energy paradigm. Waves of disruption have been felt across the electricity industry as the digitalization journey in this sector has converged with advances in artificial intelligence (AI). However, there are risks involved. As AI becomes more established, new security threats have emerged. Among the most important is the cyber-physical protection of critical infrastructure, such as the power grid. This article focuses on dueling AI algorithms designed to investigate the trustworthiness of power systems’ cyber-physical security under various scenarios using the phasor measurement units (PMU) use case. Particularly in PMU operations, the focus is on areas that manage sensitive data vital to power system operators’ activities. The initial stage deals with anomaly detection applied to energy systems and PMUs, while the subsequent stage examines adversarial attacks targeting AI models. At this stage, evaluations of the Madry attack, basic iterative method (BIM), momentum iterative method (MIM), and projected gradient descend (PGD) are carried out, which are all powerful adversarial techniques that may compromise anomaly detection methods. The final stage addresses mitigation methods for AI-based cyberattacks. All these three stages represent various uses of AI and constitute the dueling AI algorithm convention that is conceptualised and demonstrated in this work. According to the findings of this study, it is essential to investigate the trade-off between the accuracy of AI-based anomaly detection models and their digital immutability against potential cyberphysical attacks in terms of trustworthiness for the critical infrastructure under consideration.
Chapter
In Smart grid (SG), cyber-physical attacks (CPA) are the most critical hurdles to the use and development. False data injection attack (FDIA) is a main group among these threats, with a broad range of methods and consequences that have been widely documented in recent years. To overcome this challenge, several recognition processes have been developed in current years. These algorithms are mainly classified into model-based algorithms or data-driven algorithms. By categorizing these algorithms and discussing the advantages and disadvantages of each group, this analysis provides an intensive overview of them. The Chapter begins by introducing different types of CPA as well as the major stated incidents history. In addition, the chapter describes the use of Machine Learning (ML) techniques to distinguish false injection attacks in Smart Grids. A few remarks are made in the conclusion as to what should be considered when developing forthcoming recognition algorithms for fake data injection attacks.
Chapter
Anomaly detection is an observation of irregular, uncommon events that leads to a deviation from the expected behaviour of a larger dataset. When data is multiplied exponentially, it becomes sparse, making it difficult to spot anomalies. The fundamental aim of anomaly detection is to determine odd cases as the data may be properly evaluated and understood to make the best decision possible. A promising area of research is detecting anomalies using modern ML algorithms. Many machines learning models that are used to learn and detect anomalies in their respective applications across various domains are examined in this systematic review study.KeywordsAnomaliesAnomaly detectionMachine learning techniquesApplications
Chapter
Technological progression in communication and computing domains has led to the advent of cyber-physical systems (CPS). As an emerging technological advancement, CPS security is considered one of the prominent research directions these days. CPS is featured by its potential to integrate the cyber and physical data of the real world. CPS deployment in major infrastructure has shown the ability to reshape the world. Although, harnessing this ability is confined by their decisive nature and deep-seated consequences of cyber-attacks on surroundings or environment, infrastructure, and humans. In CPS, the substantial cyber concerns surge from the procedure of information transmission from multiple sensors to diverse actuators via the wireless medium, thus augmenting the attack region. Conventionally, CPS safety has been inspected from the standpoint of impending intruders from acquiring access to crucial systems using crypto-graphic or access control schemes. Thus, most research studies have emphasized attack detection in CPS. Although, in a sphere of growing adversaries, safe-guarding CPS from diverse adversarial attacks is becoming extremely sophisticated. Therefore, the need emerges for constructing resilient CPS which can con-front disruptions and stay functional despite adversarial attacks. Among the predominant methods investigated for constructing robust CPS, machine learning (ML) techniques have displayed greater suitability. However, from the latest studies regarding adversarial ML, it is advisable that for protecting CPS, ML techniques should themselves be robust. Therefore, this paper is intended at surveying the ML techniques employed for securing CPS and for detecting several attacks on CPS. It discusses the various design challenges, security objectives, security measures, security and reliability requirements of CPS, attack detection frameworks, and performance measures employed in prior works. 
Furthermore, it concludes with several research gaps and future directions for improving ML techniques and developing secure CPS.KeywordsCPSSecurity threatsMachine learning
Conference Paper
Full-text available
We examine the feasibility of an attack on the measurements that will be used by integrated volt-var control (VVC) in future smart power distribution systems. The analysis is performed under a variety of assumptions of adversary capability regarding knowledge of details of the VVC algorithm used, system topology, access to actual measurements, and ability to corrupt measurements. The adversary also faces an optimization problem, which is to maximize adverse impact while remaining stealthy. This is achieved by first identifying sets of measurements that can be jointly but stealthily corrupted. Then, the maximal impact of such data corruption is computed for the case where the operator is unaware of the attack and directly applies the configuration from the integrated VVC. Furthermore, since the attacker is constrained to remaining stealthy, we consider a game-theoretic framework where the operator chooses settings to maximize observability and constrain the adversary action space.
Article
Full-text available
Aging power industries together with increase in the demand from industrial and residential customers are the main incentive for policy makers to define a road map to the next generation power system called smart grid. In smart grid, the overall monitoring costs will be decreased but at the same time, the risk of cyber attacks might be increased. Recently a new type of attacks (called the stealth attack) has been introduced, which cannot be detected by the traditional bad data detection using state estimation. In this paper, we show how normal operations of power networks can be statistically distinguished from the case under stealthy attacks. We propose two machine learning based techniques for stealthy attack detection. The first method utilizes the supervised learning over labeled data and trains a distributed support vector machine. The design of the distributed SVM is based on the Alternating Direction Method of Multipliers, which offers provable optimality and convergence rate. The second method requires no training data and detects deviation in measurements. In both methods, principle component analysis is used to reduce the dimensionality of the data to be processed, which leads to lower computation complexities. The results of the proposed detection methods on the IEEE standard test systems demonstrate the effectiveness of both schemes.
Article
State estimation is an important power system application that is used to estimate the state of the power transmission network using (usually) a redundant set of sensor measurements and network topology information. Many power system applications, such as contingency analysis, rely on the output of the state estimator. Until recently it was assumed that the techniques used to detect and identify bad sensor measurements in state estimation could also thwart malicious sensor measurement modification. However, recent work by Liu et al. [1] demonstrated that an adversary, armed with knowledge of the network configuration, can inject false data into state estimation that uses DC power flow models without being detected. In this work, we explore the detection of the false data injection attacks of [1] by protecting a strategically selected set of sensor measurements and by having a way to independently verify or measure the values of a strategically selected set of state variables. Specifically, we show that it is necessary and sufficient to protect a set of basic measurements to detect such attacks.
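The "basic measurements" result above has a short linear-algebra core: a stealthy injection must take the form a = Hc, so if a protected (tamper-proof) set of measurement rows has full column rank, stealth forces c = 0 and no undetectable attack remains. The matrix and the choice of protected indices below are hypothetical, chosen only to illustrate the rank argument.

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.normal(size=(6, 3))   # 6 measurements, 3 state variables

# Hypothetical protected sensor indices forming a "basic" set: the
# corresponding rows of H have full column rank.
protected = [0, 2, 5]
H_p = H[protected]
assert np.linalg.matrix_rank(H_p) == H.shape[1]

# Any stealthy attack a = H c with nonzero c must then perturb at least
# one protected measurement, exposing the attack.
c = np.array([0.3, -0.1, 0.4])
a = H @ c
print(np.abs(a[protected]).max())  # nonzero -> detected
```

Intuitively, full column rank of the protected rows means H_p c = 0 only for c = 0, so the stealthy subspace available to the attacker collapses to the zero vector.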
Conference Paper
Digital control systems are increasingly being deployed in critical infrastructure such as electric power generation and distribution. To protect these process control systems, we present a learning-based approach for detecting anomalous network traffic patterns. These anomalous patterns may correspond to attack activities such as malware propagation or denial of service. Misuse detection, the mainstream intrusion detection approach used today, typically uses attack signatures to detect known, specific attacks, but may not be effective against new or variations of known attacks. Our approach, which does not rely on attack-specific knowledge, may provide a complementary detection capability for protecting digital control systems.
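A minimal sketch of the signature-free, learning-based idea above: because control-system traffic is highly regular, one can learn the frequency of flow patterns from clean traffic and flag rare or unseen patterns. The flow tuples, names (`hmi`, `plc1`), and threshold below are assumed for illustration, not the cited paper's actual model or feature set.

```python
from collections import Counter

def fit(flows):
    """Learn relative frequencies of (src, dst, port) flow patterns."""
    counts = Counter(flows)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}

def is_anomalous(model, flow, threshold=0.01):
    """Flag flows rarer than the threshold (unseen flows score 0)."""
    return model.get(flow, 0.0) < threshold

# Clean training traffic from a hypothetical control network.
training = [("hmi", "plc1", 502)] * 95 + [("hmi", "plc2", 502)] * 5
model = fit(training)

print(is_anomalous(model, ("hmi", "plc1", 502)))       # frequent -> False
print(is_anomalous(model, ("attacker", "plc1", 502)))  # unseen  -> True
```

An unseen source talking to a PLC (as in malware propagation or scanning) is flagged without any attack-specific signature, which is the complementary capability the abstract describes.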
Article
The IEC 61850 protocol suite provides significant benefits in electrical substation design and enables formal validation of complex device configurations to ensure that design objectives are met. One important benefit is the potential for protective relays to react in a collaborative fashion to an observed fault current. Modern relays are networked cyberphysical devices with embedded systems, capable of sophisticated protection schemes that are not possible on legacy overcurrent relays. However, they may be subject to error or cyber attack. Herein, we introduce the CODEF (Collaborative Defense) project examining distributed substation protection. Under CODEF, we derive algorithms for distributed protection schemes based on distributed agreement. By leveraging Kirchhoff's laws, we establish that certain fast agreement protocols have important equivalences to linear coding and error correction theory. In parallel, we describe a cyber-physical simulation environment in which these algorithms are being validated with respect to the strict time constraints of substation protection.
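The Kirchhoff's-law agreement idea above can be reduced to a one-line consistency check: the signed sum of the currents that networked relays report into a shared bus should be near zero, and a large residual flags a faulty or spoofed measurement set. This is a deliberately simplified sketch, not the CODEF agreement protocol itself.

```python
def kcl_residual(currents_in, currents_out):
    """Kirchhoff current law residual at a bus: |sum(in) - sum(out)|.

    Each value is one relay's reported current (amperes); a consistent
    set of honest measurements yields a residual near zero.
    """
    return abs(sum(currents_in) - sum(currents_out))

ok = kcl_residual([10.0, 5.0], [15.0])   # consistent measurements
bad = kcl_residual([10.0, 5.0], [12.0])  # 3 A unaccounted for
print(ok, bad)
```

Because the check is a linear constraint over all relays' reports, a single corrupted report shifts the residual by exactly its error, which is what gives the agreement protocol its error-correction flavor.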
Book
The Self-Organising Map (SOM) algorithm was introduced by the author in 1981. Its theory and many applications form one of the major approaches to the contemporary artificial neural networks field, and new technologies have already been based on it. The most important practical applications are in exploratory data analysis, pattern recognition, speech analysis, robotics, industrial and medical diagnostics, instrumentation and control, and literally hundreds of other tasks. In this monograph the mathematical preliminaries, background, basic ideas, and implications are expounded in a manner which is accessible without prior expert knowledge.
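The SOM described above has a compact core: each input pulls its best-matching unit, and that unit's neighbours on the map lattice, toward itself, so the map self-organizes to cover the input distribution. The following is a minimal 1-D sketch with a simple linear learning-rate decay, not Kohonen's full training schedule.

```python
import numpy as np

rng = np.random.default_rng(3)

def train_som(data, n_units=10, epochs=50, lr=0.5, sigma=2.0):
    """Minimal 1-D SOM: a chain of units whose weight vectors
    self-organize to cover the input distribution."""
    W = rng.normal(size=(n_units, data.shape[1]))
    idx = np.arange(n_units)
    for t in range(epochs):
        for x in data:
            # Best-matching unit: closest weight vector to the input.
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))
            # Gaussian neighborhood on the 1-D map lattice.
            h = np.exp(-((idx - bmu) ** 2) / (2 * sigma**2))
            # Pull the BMU and its neighbors toward x; lr decays linearly.
            W += lr * (1 - t / epochs) * h[:, None] * (x - W)
    return W

data = rng.uniform(-1, 1, size=(100, 2))
W = train_som(data)
# After training, the unit weights settle inside the data's range.
print(W.min(), W.max())
```

Anomaly detection with a trained SOM then amounts to thresholding the distance from a new sample to its best-matching unit, which is how SOMs are typically used in the exploratory-analysis applications the abstract lists.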
Article
Packed with real-time computer simulations and rigorous demonstrations of these phenomena, this book includes results on vision, speech, cognitive information processing, adaptive pattern recognition, adaptive robotics, conditioning and attention, cognitive-emotional interactions, and decision making under risk. "Neural Networks and Natural Intelligence" first discusses neural network architecture for preattentive 3-D vision and then shows how this architecture provides a unified explanation, through systematic computer simulations, of many classical and recent phenomena from psychophysics, visual perception, and cortical neurophysiology. It illustrates, within the domain of preattentive boundary segmentation and featural filling-in, how computer experiments help to develop and refine computational vision models.