Realizing Cyber-Physical Systems Resilience Frameworks and Security Practices

Md Ariful Haque, Sachin Shetty, Kimberly Gold, and Bheshaj Krishnappa
Abstract Cyber-Physical Systems (CPSs) are complex systems that evolve from
the integration of components dealing with real-time computations and physical
processes, along with networking. CPSs often incorporate approaches drawn from
different scientific fields such as embedded systems, control systems, operational
technology, information technology systems (ITS), and cybernetics. Major
cybersecurity concerns are rising around CPSs because of their expanding use in the
modern world. Often the security efforts are limited to deriving risk analytics and
security assessment. Others focus on the development of intrusion detection and
prevention systems. Making CPSs resilient requires a thorough understanding of
the current cybersecurity frameworks proposed by different governing bodies in this
domain. It is also imperative to understand how these frameworks apply established
security practices. To address the gap in understanding the defense-in-depth security
architectures and achieving them within the CPS domain, we analyze the
cybersecurity frameworks and the challenges in applying them. As background,
we start with a discussion of the differences between ITS and CPS. We then
present a state-of-the-art review of some of the existing cybersecurity frameworks
for risk and resilience management. Finally, we propose formal techniques to realize
the frameworks and security practices in the CPS domain by providing quantitative
resilience analytics.
M. A. Haque (✉) · S. Shetty
Computational Modeling and Simulation Engineering, Old Dominion University,
5115 Hampton Blvd, Norfolk, VA 23529, USA
e-mail: mhaqu001@odu.edu
S. Shetty
e-mail: sshetty@odu.edu
K. Gold
Naval Surface Warfare Center, Crane Division, Crane, IN 47522, USA
e-mail: kimberly.gold@navy.mil
B. Krishnappa
Risk Analysis and Mitigation, ReliabilityFirst Corporation, 3 Summit Park Drive, Suite 600,
Cleveland, OH 44131, USA
e-mail: bheshaj.krishnappa@rfirst.org
© Springer Nature Switzerland AG 2021
A. I. Awad et al. (eds.), Security in Cyber-Physical Systems, Studies in Systems,
Decision and Control 339, https://doi.org/10.1007/978-3-030-67361-1_1
Keywords Cyber-Physical Systems · Cybersecurity frameworks · Security
practices · Criticality assessment · Resilience metrics · Graphical modeling ·
Analytical hierarchy process · TOPSIS
1 Introduction
We observe a steep increase in the use of Cyber-Physical Systems (CPSs) in the
modern world. For example, critical infrastructures (e.g., energy delivery
systems, the oil and gas industry, healthcare systems, and transportation systems),
industrial manufacturing plants, autonomous vehicles, and smart cities all rely
heavily on CPSs. A CPS is a complex system of systems that integrates cyber
operations with physical processes. In CPSs, computing and networking devices
perform computation and communication. The networked devices also control the
underlying instrumental processes. The communication network is needed to monitor
and control the operation and performance of the physical devices in real time; some
of these devices may be located at remote field sites.
In a broad sense, a CPS consists of the cyber domain (the information technology
systems (ITS)) and the physical domain (the operational technology (OT)
network). The cyber domain consists of servers and hosting devices for
organization-wide communications. The physical domain, on the other hand, contains
the field devices and the industrial control systems (ICS), which in turn comprise
sensors, actuators, control functions, feedback systems, etc. The OT network handles
the production processes, and the ITS handles business communications.
The advancements in monitoring and controlling the production processes bring
the risk of malicious cyber attacks on these systems. The increased risk comes
from the tight interconnection between the cyber components and the physical
elements, more precisely, the amalgamation of ITS and ICS.
The CPS’s security concerns are often addressed by developing intrusion detection
and prevention systems (IDS/IPS) and generating security metrics for risk
assessment. While IDS and IPS are necessary for mitigating attack impact and
enabling quick recovery of the system, we cannot overlook the systems’ resilience
and reliability. The resilience posture indicates the overall system security and
guides network administrators in developing effective and optimized
mitigation strategies and remediation plans.
To make CPSs cyber resilient, regulatory bodies and researchers have proposed
several frameworks and provide essential instructions in standards. The standards
bodies we refer to are the National Institute of Standards and Technology (NIST),
the North American Electric Reliability Corporation Critical Infrastructure
Protection (NERC-CIP), and the Industrial Control Systems Cyber Emergency Response
Team (ICS-CERT). The open question is how to apply those frameworks and
security practices as comprehensively as possible without affecting regular business
operations.
In this chapter, we address the implementation challenges of the theoretical
frameworks. We discuss the threats and vulnerabilities that CPSs face today. We also
cover how the security frameworks, recommended defense architectures, and
standard practices can help design and develop resilient CPSs. This chapter aims to
mathematically realize the frameworks and security practices using established
theoretical analysis methodologies. The significant contributions of the chapter are:
• A detailed discussion of CPS threats, vulnerabilities, and cyber resilience
• A comprehensive review of cybersecurity and resilience frameworks and
  recommended defense-in-depth security practices for CPS
• A proposed qualitative approach for quantifying cyber resilience using the
  analytical hierarchy process (AHP)
• A quantitative realization of defense-in-depth security architectures using a
  multi-level directed acyclic graph modeling technique
• Identification of critical and cyber-vulnerable assets using the vulnerability
  graph model
• A concise discussion of the mapping of CPS security, resilience, and operational
  domains.
We organize the rest of the chapter as follows. Section 2 presents a brief
description of CPS and its components (i.e., IT network, supervisory control and data
acquisition (SCADA), and OT network), and cyber resilience. Section 3 provides
a state-of-the-art review of the cybersecurity and resilience frameworks. Section 4
discusses some of the critical security guidelines presented by the standards
bodies. Section 5 proposes different mathematical techniques for the realization of the
frameworks. Section 6 highlights the challenges in mapping CPS security, resilience,
and operational domains. Finally, Sect. 7 concludes the chapter with some significant
takeaways.
2 Cyber-Physical Systems
Cyber-Physical Systems (CPSs) represent a composite class of engineered systems
consisting of physical processes and computational resources. The National Institute
of Standards and Technology (NIST) CPS Public Working Group (CPS PWG) defines
CPS as “smart systems that include engineered interacting networks of physical and
computational components” (Griffor et al. [1]). CPS technologies continue to
transform how people interact with engineered systems. Advances in CPS
bring extended capability, adaptability, and usability, making them crucial in many
industries. Today, CPSs are used to implement most modern technologies,
such as the Internet of Things (IoT), the industrial internet, Industrial Control
Systems (ICS), smart devices, etc. In this chapter, we sometimes use the terms CPS
and ICS interchangeably to mean the same systems.
We present a conceptual representation of Cyber-Physical Systems in Fig. 1.
We divide the discussion area into feedback systems, application domains, system
security, and system challenges. CPS consists of control and feedback systems, which
are highly interconnected and heterogeneous. The control systems are either
networked or distributed and include physical components such as sensors and
actuators, which operate in real time. There may be human and environmental
interactions involved in the process.
We illustrate the CPS here using the example of power systems, which
researchers treat as cyber-physical power systems (CPPS) [2, 3]. The power
system’s physical domain consists of generation and distribution devices such as
generators, transformers, electric buses, etc. The physical part also comprises ICS
devices. Different ICS devices are in use depending on requirements, such as
phasor measurement units (PMU), intelligent electronic devices (IED), programmable
logic controllers (PLC), and remote terminal units (RTU). To monitor and control
the field devices’ performance, we need supervisory control and data acquisition
(SCADA) systems. SCADA is the central control system used to monitor and control
the equipment in industrial production systems. In general, SCADA contains the
master terminal unit (MTU), human-machine interface (HMI), input/output (I/O)
devices, etc. Field ICS devices such as RTUs send real-time system performance
data to the MTU. The SCADA operators observe the performance measures, compare
those values with the desired values, and, if necessary, issue control commands
through the HMI. The commands issued from the HMI control the system so that it
functions at the desired service level (Macaulay and Singer [4]).
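The monitor, compare, and command cycle described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the point name `bus_voltage_kv`, the setpoint, the tolerance band, and the command names are all hypothetical, and a real SCADA system involves operators, protocols, and safety interlocks that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One real-time reading reported by a field device (e.g., an RTU)."""
    tag: str      # hypothetical point name, not taken from the chapter
    value: float  # measured value in the point's engineering unit


def supervise(measurement: Measurement, setpoint: float, tolerance: float) -> str:
    """Compare a reading with its desired value and choose a control command.

    Returns "HOLD" when the reading is within the tolerance band, "LOWER" when
    it is too high, and "RAISE" when it is too low. The command names are
    illustrative only.
    """
    deviation = measurement.value - setpoint
    if abs(deviation) <= tolerance:
        return "HOLD"
    return "LOWER" if deviation > 0 else "RAISE"


# Hypothetical readings arriving at the MTU from one field device.
readings = [
    Measurement("bus_voltage_kv", 138.0),
    Measurement("bus_voltage_kv", 141.5),
    Measurement("bus_voltage_kv", 134.2),
]
commands = [supervise(m, setpoint=138.0, tolerance=2.0) for m in readings]
print(commands)
```

The design point is the separation of roles: field devices only report values, while the supervisory side holds the desired values and decides on commands.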
Due to its complicated operational requirements, the CPS itself poses challenges
such as modeling the underlying physical processes and real-time behavior, modeling
interconnectivity and interoperability in the heterogeneous system of systems (SoS),
and securely integrating the different components of the CPS. From the modeling
and simulation perspective, the CPS needs to handle the analysis of specifications,
design methodologies, scalability and complexity, and overall verification
and validation of the systems. On the other hand, because of the amalgamation of
the IT and OT domains, the CPS needs to handle many cyber threats. Thus,
understanding the proposed cyber frameworks and applying the recommended
practices in developing a resilient system are integral parts of CPS security analysis.
We start with a short discussion of the primary differences between ITS and CPS
in the next subsection. We then proceed to CPS threats, vulnerabilities,
and cyber resilience to ease the transition to the cyber framework analysis.
2.1 Primary Differences Between CPS and ITS Security
Today, the extensive access that ITS devices have to the control systems makes CPS
vulnerable to cyberattacks. Cyberattacks differ from physical attacks on several
points. In physical attacks, the defenders are aware of the system units under
attack, the impact is immediate, and there are policies to handle such attacks. On
the other hand, cyberattacks are remote, repeatable, and can occur over extended
periods. Cyber intruders can execute cyberattacks with the objective of long-term
intrusion and identification of potentially attractive targets (e.g., an advanced
persistent threat). The impact can be less intense for the time being but can lead to
disastrous consequences in the long run. We provide a summary of the fundamental
differences between ITS and ICS/CPS security in Table 1.
Fig. 1 Cyber-Physical Systems concept map
Overseeing cybersecurity in the CPS domain is far more daunting than in the
information technology context, because CPS has operational requirements that
differ from those of ITS. Firstly, for CPS, real-time availability and operational
continuity are of utmost importance, whereas for ITS, data confidentiality and
integrity are crucial. Momentary downtime in ITS does not hamper any production
process (see Macaulay and Singer [4]). Secondly, it is easy to apply patches in ITS:
anti-malware and anti-virus software often download and install the necessary
security patches or updates automatically. ICSs, by contrast, generally run old
proprietary technologies designed for functionality rather than security. ICSs have
limited memory and processing capacity. These hardware-level limitations make it
hard to install anti-malware or anti-virus solutions, which consume a lot of memory
for automatic updates and can delay the monitoring and control of the production
process. Thirdly, ICSs operate in diverse sectors such as the oil, gas, and electric
industries, so security measures should be adapted to fit the structure of each sector.
2.2 CPS Threats and Vulnerabilities
This section starts with brief definitions of vulnerability and threat, as found in
the literature, to give readers the necessary background for the discussion that
follows. In information systems, a vulnerability is a flaw in the software program
or system that an intruder may exploit to gain unauthorized access to a cyber asset.
NIST defines vulnerability as “weakness in an information system, system security
procedures, internal controls, or implementation that could be exploited or triggered
by a threat source” (see Johnson et al. [7]). On the other hand, a threat is anything
that “can exploit a vulnerability, intentionally or accidentally, and obtain, damage, or
destroy an asset” (Cyware [8]). The Joint Task Force Transformation Initiative defines
a threat as “any circumstance or event with the potential to adversely impact
organizational operations and assets, individuals, other organizations, or the Nation
through an information system via unauthorized access, destruction, disclosure, or
modification of information, and/or denial of service” (Blank [9]). Lewis [10] defines
vulnerability and threat as “the probability that a component or asset will fail when
attacked” and “the probability that an attack will happen”, respectively.
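Because the last pair of definitions treats threat and vulnerability as probabilities, they can be combined numerically. The sketch below assumes the common product form of probabilistic risk, risk = threat × vulnerability × consequence. The consequence term, the asset names, and all numbers are illustrative assumptions and do not come from this chapter.

```python
def expected_loss(threat: float, vulnerability: float, consequence: float) -> float:
    """Expected loss for one asset under the assumed product form of risk.

    threat        -- probability that an attack happens
    vulnerability -- probability that the asset fails when attacked
    consequence   -- loss if the attack succeeds (assumed extra term, any unit)
    """
    for name, p in (("threat", threat), ("vulnerability", vulnerability)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    return threat * vulnerability * consequence


# Hypothetical assets: (threat, vulnerability, consequence in arbitrary units).
assets = {
    "historian_server": (0.30, 0.50, 200_000.0),
    "plc_line_1": (0.10, 0.80, 1_000_000.0),
}

# Rank assets from highest to lowest expected loss.
ranked = sorted(assets, key=lambda name: expected_loss(*assets[name]), reverse=True)
print(ranked)
```

In this made-up example the PLC ranks first even though it is attacked less often, because both its failure probability and its consequence are higher.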
Table 1 Primary differences in operations and security in ITS and CPSs/ICS
| Category | Information Technology Systems (ITS) | Cyber-Physical Systems (CPSs/ICS) |
|---|---|---|
| Performance constraints (a) | High throughput is demanded | Modest throughput is allowable |
| | Non-real-time response is acceptable | Real-time response is essential |
| | High delay and jitter are tolerable | Delay and jitter over a certain threshold are not tolerable |
| Resource constraints (b) | Updated hardware and software products are used | Old and less secure proprietary products are used |
| | Systems have enough memory and processing capabilities | Products are designed with low memory and processing capabilities |
| | Regular security updates are maintained through patching | Security updates and patches are often not applied, to avoid system unavailability due to reboot requirements after configuration changes |
| Confidentiality, integrity, and availability | Data confidentiality and integrity are critical | Confidentiality and integrity are less important |
| | Temporary unavailability is tolerable | High availability is required. Momentary downtime may not be acceptable |
| Communication protocol | Standard communication protocols (e.g., TCP, UDP) | Proprietary or industry-specific protocols (e.g., MODBUS, DNP3) |
| Patch and change management (a) | Software updates and patching are applied regularly according to the organization’s security policy | Any configuration change needs to be tested and deployed in test mode before it is committed to the live system, to avoid unexpected outages |
| | Rebooting the system to re-initialize the hardware or software devices is acceptable | Unplanned rebooting of the system is not acceptable |
| Password and authentication (b) | Multi-factor authentication can be deployed | Sometimes there is no authentication requirement of any sort |
| | Passwords need to be changed after a certain time | Passwords are hard-wired in legacy ICS and cannot be changed |
| | Security is enhanced through encryption mechanisms | Message communication lacks encryption mechanisms |
| Component lifetime and technical support (a) | Lifetime generally spans 3 to 5 years | Lifetime varies between 15 and 20 years |
| | Ample technical support is available from either in-house IT experts or diversified managed services | Support is solely vendor dependent. The vendor may cease support for some products when their lifetime expires |
| Operational command and control | Mostly central monitoring | Distributed field operations, but central monitoring through SCADA |
(a) Stouffer et al. [5]
(b) Colbert et al. [6]
We have already highlighted that CPSs consist of physical, control, and
communication layers. CPS threat vectors can come from adversaries in any of those
layers. In the physical layer, the availability of the field devices’ services and
functionalities is of utmost concern. There is the risk of information alteration by
modifying the physical device code (e.g., PLC logic). In the control layer, most
attacks occur in the form of distributed denial of service (DDoS), eavesdropping
(man-in-the-middle attack), jamming, selective forwarding, etc. Threats in the
communication layer can lead to loss of confidentiality, stolen credentials,
unauthorized access to the system, social engineering, etc. Based on the type of
threat, we classify them in the discussion below, as pointed out by Haque et al. [11].
External Threats: By external threats, we mean any cyberattack coming from
outside the organization. External threats arise from different rival groups, including
nation-sponsored hackers, terrorist organizations, or industry competitors. Cyber
intruders may launch an advanced persistent threat attack, where the goal is to steal
crucial data and login information (e.g., passwords) from the network’s assets
without getting caught. One such example is the Stuxnet attack on the Iranian nuclear
centrifuges in 2010 (see Chen and Abu-Nimeh [12]).
Internal Threats: Internal threats come from either within the organization or
from affiliated parties. Today, the industry’s operating processes are segmented
and carried out by third-party vendors or contractors. Thus, organizations need to
share system information with outside business partners to some extent. Sharing
network information (e.g., network design documents) makes the CPS/ICS system
vulnerable to potential cyber threats. There is also the risk of insider attacks from
the organization’s employees, as some employees have authorized access to the ICS
network for managing the network operations. This type of insider threat falls in
the category of credentialed ICS insider attack [13, 14].
Technology Threats: Even today, most ICS systems run on old technologies, where
the primary concern is matching protocol-level message communications
among ICS products from different vendors. Thus, many ICSs lack strong
authentication and encryption mechanisms (see Laing [15]). Some ICSs use
authentication procedures, but the weak security mechanisms (e.g., insecure passwords,
default user accounts, and inadequate password policies) are not enough to protect
the system from intelligent adversaries [13, 14].
Integration and Inter-connectivity Threats: In enterprise networks, business
units are interconnected. Due to the interconnection of the corporate network with
the control system network, ICS devices become vulnerable to cyberattacks. This
vulnerability arises because part of the corporate network is open for communication
over the internet, and ITS hosts and servers contain vulnerabilities. Merely putting the
ICS devices behind firewalls does not necessarily protect the ICS components [13].
Next, we discuss cyber resilience from the CPS perspective to understand how
to protect the CPS from the threats discussed above.
2.3 Cyber Resilience: What Does It Mean for CPS?
Some of the early definitions of resiliency concentrated on disaster resiliency.
From the disaster resilience perspective, Bruneau et al. [16] proposed a conceptual
framework to define seismic resilience. Tierney and Bruneau [17] later introduced
the R4 framework for disaster resilience. The R4 model [17]
comprises four metrics: robustness, redundancy, resourcefulness, and rapidity.
‘Robustness’ means the system’s ability to function and provide services even under
degraded performance, probably with reduced quality of service. ‘Redundancy’
means identifying substitute elements that satisfy functional requirements in the
event of significant performance degradation or service disruption.
‘Resourcefulness’ is the ability to initiate solutions by identifying the required
resources based on the consequence, nature, or depth of degradation and by
prioritizing the problems that need to be solved. ‘Rapidity’ indicates the ability to
restore functions within the required time frame.
The National Academy of Sciences (NAS) defines resilience as “the ability to
prepare and plan for, absorb, respond, recover from, and more successfully adapt to
adverse events” (see National Research Council [18]). The National Institute of
Standards and Technology (NIST) defines information system resilience as follows.
Resilience is “the ability of an information system to continue to: (i) operate under
adverse conditions or stress, even if in a degraded or debilitated state while
maintaining essential operational capabilities; and (ii) recover to an effective
operational posture in a time frame consistent with mission needs” (see Ross [19]).
Considerable research is ongoing into the cyber resiliency of CPS. We
mention here a few works that deal with frameworks and security guidelines.
NIST provides a framework (Sedgewick [20]) for improving the cybersecurity
and resilience of critical infrastructures that supports both ITS and ICS. NIST
provides another framework specifically for Cyber-Physical Systems (Griffor et al. [1]).
We elaborate on the frameworks in Sect. 3.2. Haque et al. [21] illustrate the gap in
resilience analysis and propose a cyber resilience framework to quantify resilience
metrics. The framework considers the physical, technical, and organizational aspects
of cyber operations to assess ICS’s cyber resilience. Haque et al. also introduce a
qualitative cyber resilience assessment tool [22] based on the framework.
In the ICS domain, Stouffer et al. [5] provide detailed guidelines for ICS system
security. The guidelines cover secure ICS architecture and the methods for applying
security controls to the ICS environment. Barker et al. [23] propose resilience
analytics for social networks that depend on each other. The metrics describe how
risk analysis can help in modeling and quantifying system resiliency. DiMase
et al. [24] present a systems engineering framework for Cyber-Physical Systems
security and resiliency. The paper focuses on CPS security and relates it to resiliency
in order to handle integrated and targeted security measures and policies. We cover
some of the frameworks in Sect. 3 and thus omit the detailed discussion here to avoid
repetition.
In the modeling context, Haque et al. [25] highlight ways of modeling resilience
in CPS by considering the criticality of the cyber asset. Haque et al. [26] present
cyber modeling techniques that utilize the critical system functionality, specifically
for energy delivery systems. In resilience analytics, Clark and Zonouz [27]
present intrusion resilience metrics for Cyber-Physical Systems by segregating the
cyber and control layers of CPS. In another work, Haque et al. [14] explain the
challenges in resilience assessment in CPS and discuss ways to develop a simulation
platform for resilience assessment.
Wei and Ji [28] discuss a model named the resilient industrial control system
(RICS). The authors list the following characteristics of a resilient ICS:
• Capability to minimize the unexpected consequences or impact of a cyber incident
• Capability to mitigate a major portion of undesirable events
• Capability to recover normal operations within an expected time frame.
The R4 metrics [17] presented above are in line with the resilience characteristics
provided by Wei and Ji [28]. Most of the above works address resilience by developing
security frameworks and deriving quantitative analytics for the CPS or ICS. In this
chapter, we focus on understanding the cybersecurity frameworks and standard
practices proposed by the governing bodies. That discussion follows in Sect. 3
and Sect. 4, respectively.
3 State-of-the-Art Review of Cybersecurity Frameworks
In this section, we cover four crucial cybersecurity frameworks applicable to CPS.
These are (1) the NIST framework for improving critical infrastructure cybersecurity,
(2) the NIST framework for Cyber-Physical Systems, (3) the NIST risk management
framework for information systems cybersecurity, and (4) the cyber resiliency
engineering framework of the MITRE Corporation. We select these frameworks for
our analysis because researchers consider them the most widely adopted frameworks
in the cybersecurity domain. We also provide a comparative analysis of several other
cybersecurity frameworks in Sect. 3.5.
3.1 NIST Framework for Improving Critical Infrastructure Cybersecurity
The NIST cybersecurity framework (Sedgewick [20]) version 1.0 provides broad
guidelines to manage cybersecurity risk and resilience. It has three main components:
the core, the implementation tiers, and the profiles. It can also help to identify the
operations needed to reduce risks and enhance resilience. The NIST framework
identifies and proposes five security functions. These functions help in managing
system cybersecurity, as we illustrate in Fig. 2.
The core of the framework provides actions to achieve specific results in
the cybersecurity area. The element “Functions” organizes the necessary cybersecurity
activities at the uppermost level. These functions are identify, protect, detect,
respond, and recover. They help organizations manage
cybersecurity risk and resilience. Here, the ‘identify’ function implies developing an
understanding of system risks and managing assets, data, capabilities, skills, etc. The
‘protect’ function deals with developing the necessary defensive measures and
implementing them to ensure the continuity of services. The ‘detect’ function provides
the capability to capture the occurrence of a cyberattack incident. The ‘respond’
function refers to taking actions regarding a detected cyber breach incident. Lastly,
the ‘recover’ function means restoring any capabilities or services damaged by a
cyberattack incident.
Fig. 2 NIST cybersecurity framework core functions and categories. We present here only the
core functions and categories, adapted from the framework originally proposed by Sedgewick [20],
to explain the essential ideas
The framework presents a high-level risk and resilience assessment. It indicates what
to do during a cyber attack event. However, the framework does not explain
how to implement those actions. Also, the model needs to consider system differences
among different critical infrastructures. For example, if the same attack happens in the
energy and water sectors, the prescribed methodologies and actions are the same,
which may not account for the differences between the systems.
We adapt the resilience curve presented by Wei and Ji [28] and map the graph
to the five functions offered by the NIST framework. The curve is similar to
the duck curve in energy systems reliability analysis. In general, a resilient
system goes through five stages during an adverse event. These are plan/prepare,
absorb, analyze/respond, recover, and adapt (Linkov et al. [29]). In Fig. 3, we present
the resilience curve applicable to CPS by mapping it to the NIST functions.
The resilience curve indicates system behavior during a cyberattack incident. The
resilience graph presents the different phases of cyber operations by plotting system
functionality over time. The five stages complete the resilience cycle, and the
area formed by the enclosed curve is the quantitative measure of the system’s cyber
resilience.
Fig. 3 CPS cyber resilience graph with different phases of action. We adjust the graph from the
original graph presented in the RICS model by Wei and Ji [28] to incorporate the resilience phases
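One way to turn the area-based measure into a number is to integrate sampled functionality over time. The sketch below is a minimal illustration, not the chapter's method: it assumes functionality is sampled at discrete times, uses the trapezoidal rule, and normalizes by the area a fully functional system would cover. The normalization and the sample values are assumptions made for this example.

```python
def area_under_curve(times, functionality):
    """Trapezoidal-rule area under a sampled functionality-over-time curve."""
    if len(times) != len(functionality) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of one trapezoid between consecutive samples.
        area += 0.5 * (functionality[i] + functionality[i - 1]) * dt
    return area


def resilience_ratio(times, functionality, nominal=1.0):
    """Area under the curve divided by the area at nominal functionality.

    A value of 1.0 means no functionality was lost during the window.
    The normalization is an assumption made for this sketch.
    """
    duration = times[-1] - times[0]
    return area_under_curve(times, functionality) / (nominal * duration)


# Hypothetical incident: functionality drops after t=2, is held at a degraded
# level, and is restored by t=10 (time units are arbitrary).
times = [0, 2, 4, 6, 8, 10]
functionality = [1.0, 1.0, 0.4, 0.4, 0.8, 1.0]

print(resilience_ratio(times, functionality))
```

For these made-up samples the ratio comes out at roughly 0.72, meaning about 28% of the nominal functionality-time was lost during the incident window.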
3.2 NIST Framework for Cyber-Physical Systems
Griffor et al. [1] propose a framework for CPS that captures the generic CPS
functionalities. The framework focuses on the activities required to support the
conceptualization, realization, and assurance of CPS. The framework requires identifying
CPS domains, facets, aspects, concerns, activities, and artifacts [1]. Here, ‘domains’
represent the CPS application areas; ‘concerns’ are concepts that drive the CPS
framework methodology. Activities within the facets address the ‘aspects’, and each
aspect consists of a group of related concerns. There are nine defined aspects:
functional, business, human, trustworthiness, timing, data, boundaries, composition,
and lifecycle (see Griffor et al. [1]). ‘Facets’ encompass the identified activities to
perform in the systems engineering process within the CPS. Each facet contains a set of
well-defined activities and artifacts (i.e., outputs) for addressing the concerns. In Fig. 4,
the middle layer of rectangular boxes shows what to do at each facet step. The bottom
parallelograms indicate the outcomes (i.e., artifacts) of the facet steps.
In Fig. 4, we observe the three identified facets: conceptualization, realization,
and assurance. ‘Conceptualization’ concerns what things should do; it is the group
of activities that produce a CPS model. ‘Realization’ concerns how things are to be
made and operated. Realization encompasses the group of activities that create,
deploy, and manage a CPS. ‘Assurance’ is about achieving the desired level of
confidence that the system will work as planned. This facet includes the group of
activities that provide confidence that CPSs work as intended.
Fig. 4 Main facets of the NIST framework for Cyber-Physical Systems [1]. We have adapted the
figure from the original framework to focus only on the important ideas. Here, conceptualization,
realization, and assurance are the three facets, as proposed in the framework
The CPS framework is still in its first draft and is not yet fully established. The
framework’s primary goal is to be actionable. From a critic’s point of view, however,
the framework is little more than a systematic process for realizing a CPS.
The three facets on which the framework rests require activities that
depend solely on subject matter experts. The framework hardly
illustrates how to handle cyber and physical challenges from design, modeling, and
security perspectives. Defining the CPS aspects and updating the facet activities and
artifacts would differ from domain to domain and would require gathering a vast
amount of data and expertise from system administrators or subject matter experts.
Overall, the framework does not explain how to handle the security, reliability, and
resilience issues of a complex CPS.
3.3 NIST Risk Management Framework for Information
Systems Cybersecurity
In collaboration with the US Department of Defense, the Office of the Director
of National Intelligence, and the Committee on National Security Systems, NIST
has developed the Risk Management Framework (RMF) [30]. The RMF was conceived
to improve information and data security in a networked environment. It
encourages sharing data and information among organizations and strengthens
risk and resilience management processes. The RMF takes a three-layered,
pyramid-shaped approach to managing risks within the organization. The
bottom layer is the information systems layer, the middle layer deals with business
or mission processes, and the topmost layer handles the organizational processes.
Here we only analyze the core processes as proposed in the framework.
Figure 5 illustrates the RMF steps in the risk management process flow. We briefly
explain here the seven steps involved in the comprehensive risk assessment.

Fig. 5 Steps in NIST risk management framework for information systems cybersecurity [30]. It
consists of seven steps: (1) prepare, (2) categorize system, (3) select controls, (4) implement
controls, (5) assess controls, (6) authorize system, and (7) monitor controls. The steps are to be
followed sequentially, although the preparation phase needs to consider the constraints in other
stages. The figure is adapted from the proposed framework [30] to help in realizing the discussion

1. Prepare: The preparation step incorporates essential tasks at all three levels
of the enterprise network: the organizational level, the mission and business
process level, and the information systems level. The purpose of the ‘prepare’ step
is to keep the organization ready to manage risks associated with its security
and privacy.
2. Categorize: The categorization step classifies the system based on the impact
analysis. The classification considers the amount of information processed by the
system as well as the volume of data that the system stores and transmits.
3. Select: This step guides the organization to choose an initial set of baseline
security controls. The security controls follow from the analysis of the security
categorization. The ‘select’ step also covers managing and correcting the security
control baseline as needed, based on the organization’s assessment of its risk
conditions.
4. Implement: This step emphasizes the execution of the security controls from the
operational perspective. The step also incorporates documentation of the controls.
5. Assess: This step assesses the security controls using the right measures to
estimate how correctly the controls have been implemented. It considers
whether the system is operating as planned. The step also evaluates whether the
implemented actions produce the expected outcome with respect to the established
security requirements.
6. Authorize: This step authorizes system operations in the presence of an
identified risk to organizational assets and operations. The operation of the
system goes on if the assessment finds that the risk is acceptable.
7. Monitor: The last step is to monitor and assess the implemented security measures
regularly. This monitoring includes evaluating the effectiveness of the
implemented security controls and documenting any changes in the operational
environment. Other tasks, such as conducting impact analyses of the security
alterations and reporting to appropriate personnel, are part of the monitoring process.
This framework is one of the most advanced cybersecurity frameworks available today
for addressing cybersecurity and cyber resiliency concerns. The seven sequential steps
described above can be tailored to specific CPS domain areas
with the help of subject matter experts. One of the framework’s primary focuses is to
monitor and assess the security control mechanisms and evaluate the impact of any
cyberattack incident. We think the framework addresses that concern conclusively
and comprehensively.

Fig. 6 MITRE cyber resiliency engineering framework [31]. We present here only the goals and
objectives, adapted from the proposed framework, to help in realizing the essential ideas

3.4 MITRE Cyber Resiliency Engineering Framework
MITRE Corporation has proposed a cyber resiliency engineering framework (see
Bodeau and Graubart [31]). The framework consists of cyber resiliency goals, objec-
tives, and cyber resiliency practices. It also incorporates threat models associated with
cyber risk and resiliency. The framework focuses on characterizing cyber resilience
metrics. Figure 6 illustrates the framework. The elements of cyber resiliency consist
of four goals: (1) anticipate, (2) withstand, (3) recover, and (4) evolve [31]. There
are eight objectives: (1) understand, (2) prepare, (3) prevent, (4) continue, (5) con-
strain, (6) reconstitute, (7) transform, and (8) re-architect. The framework consists
of fourteen practices that intend to maximize cyber resiliency. These are (1) adap-
tive response, (2) privilege restriction, (3) deception, (4) diversity, (5) substantiated
integrity, (6) coordinated defense, (7) analytic monitoring, (8) non-persistence, (9)
dynamic positioning, (10) redundancy, (11) segmentation, (12) unpredictability, (13)
dynamic representation, and (14) realignment. In this framework, the different goals,
objectives, and practices may work together or operate separately.
Although the NIST frameworks presented earlier deal with cybersecurity broadly,
the MITRE framework focuses specifically on cyber resilience engineering and
assessment. The goals and objectives guide us to take the correct action at each
step of the resilience management cycle. The proposed goals align with the NAS
resilience definition, which includes plan, absorb, recover, and adapt [18]. The
framework offers several practices that, with careful consideration, can be applied to
the ICS/CPS domain by adjusting the rules to the system constraints and design
methodologies.
3.5 Comparison of the Frameworks
A close look at the above frameworks reveals that they approach the management of
cybersecurity and cyber resilience for infrastructures and systems from the following
perspectives:
• Plans, goals, objectives, practices, and strategies (risk and resilience perspective)
• Identify, protect, detect, respond, recover, and adapt (resilience perspective)
• Anticipate, withstand, recover, and evolve (resilience perspective)
In Table 2, we present a structured comparison of several major cybersecurity
and cyber resilience frameworks proposed by different standards bodies and
research organizations. A closer look shows that most of the frameworks address the
same common areas: identifying critical assets, securing the network
through multi-level access controls, and assessing cyber risks to the business and the
organization as a whole. Finally, the frameworks propose techniques to safeguard the
critical system functions or services by developing mitigation plans and strategies.
What these frameworks lack is a formalization of their guidance using established
mathematical methods. In this work, we address that need by developing mathematical
techniques for risk and resilience assessment, which we present in detail in Sect. 5.
4 Cyber Standards and Recommended Practices for CPS
In this section, we briefly discuss control system specific recommendations suitable
for ICS or CPS. The ICS-CERT provides the following critical control system specific
cyber recommendations that give a solid baseline regarding what to do and how to
prevent cyberattacks in ICS.
• Developing cyber forensics plans for control systems: Developing a cyber forensics
program is challenging for control systems environments. The challenges arise
from system limitations such as nonstandard protocols, old designs, and
irregular proprietary technologies. Cornelius and Fabro [32] address the challenges
of applying traditional forensics to ICS and provide detailed guidance for developing a
cyber forensics program by identifying the system environment and its uniqueness,
defining context-specific requirements, and identifying and collecting system data.
• Applying defense-in-depth strategies to improve industrial control systems
cybersecurity: The ICS defense-in-depth strategies (see Fabro et al. [33]) provide
comprehensive guidance for improving cybersecurity in control systems such as
CPS/ICS.
• ICS security incident response plan: The standard [35] primarily focuses on the
preparation and response mechanisms for a cyberattack incident on the ICS network.
The policy has four segments. The first segment concentrates on planning
for a potential cyber event. This part also incorporates establishing a response
team and setting up a response plan for cyber incidents. The plan should include
policies, procedures, and personnel as per the organization’s established standards.
The second segment focuses on incident prevention. The third segment is incident
management, which is subdivided into four operations: (1) detection of potential
threats; (2) containment of the event (e.g., quarantining malware installed on the
servers); (3) remediation, including the eradication of the risk (e.g., malware); and
(4) recovery from the event and restoration of the system to its full-service
capability. The fourth segment deals with the post-event analysis. This analysis
includes determining the root cause, access path, vulnerability, and other necessary
information to understand the incident better. The review, which includes cyber
forensics and data preservation, helps to protect the system against similar
incidents in the future.
• Patch management for control systems: There is no “one size fits all” solution
that adequately addresses the patch management processes of IT and OT networks.
There are some differences in implementing the patches in information technology
systems and industrial control systems, as discussed earlier in Table 1. The
recommended practices (see Tom et al. [36]) provide a detailed explanation of
the patch management program (e.g., backup, testing of a patch, disaster recovery,
etc.), patching analysis (e.g., vulnerability analysis), and deployment in the control
systems environment.
• Updating antivirus in industrial control systems: Antivirus software is more widely
used in information technology than in ICS. In ICS, antivirus software is applied to
comply with the defense-in-depth strategy. Thus, antivirus software and
patches need to be updated periodically in ICS. These recommendations [37]
guide how to update the antivirus software in the control system environment without
impacting the OT production systems.

Table 2 Major cybersecurity and cyber resilience frameworks proposed by standard bodies and research organizations

| Framework | Publishing organization | System | Intended use | Major functions, processes, and/or metrics | Year | Document version |
|---|---|---|---|---|---|---|
| Risk management framework for information systems cybersecurity | Joint Task Force | ITS | Managing security and privacy risk for organization-wide information systems | Prepare, categorize, select, implement, assess, authorize, and monitor | 2018 | NIST SP 800-37 Rev. 2 ^a |
| Framework for Cyber-Physical Systems | National Institute of Standards and Technology (NIST) | Mainly CPS | Broad design and security guidelines for CPS | Conceptualization, realization, and assurance | 2017 | NIST SP 1500-201 ^b |
| Framework for improving critical infrastructure cybersecurity | National Institute of Standards and Technology (NIST) | Critical Infrastructures (CI) | Managing risk and resilience of CI | Identify, protect, detect, respond, and recover | 2014 | Version 1.0 ^c |
| Conceptual Framework for developing resilience metrics for the electricity, oil and gas sectors in the United States | Sandia National Laboratories | Mainly energy and oil and gas sector. Also covers ICS, CPS, SCADA | Developing cyber resilience analytics for energy, and oil and gas sectors | Define goals and metrics, characterize threats, apply system model, evaluate and incorporate improvements | 2014 | Version not specified ^d |
| Cyber resiliency engineering framework | MITRE Corporation | ITS | Developing cyber resilience goals, objectives, and practices for ITS | Anticipate, withstand, recover, and evolve | 2011 | Version not specified ^e |
| R4 resilience framework | Multidisciplinary Center for Earthquake Engineering Research (MCEER) | Critical Infrastructure (CI) | Resilience assessment for CI using quantitative security metrics | Robustness, redundancy, resourcefulness, and rapidity | 2007 | Version not specified ^f |

^a Joint Task Force [30]
^b Griffor et al. [1]
^c Sedgewick [20]
^d Watson et al. [34]
^e Bodeau and Graubart [31]
^f Tierney and Bruneau [17]

Again, most of these standards are very generic and may vary from system to
system, depending on the area of applications. In this chapter, we want to provide
quantifiable resilience assessment methodologies that would help make informed
decisions by incorporating the security guidelines.
5 Formal Approaches for Realizing CPS Resilience
One way to realize the frameworks and security practices within the CPS domain is to
provide quantitative cyber resilience analytics. Such analytics can help network
administrators and operators in two ways: (1) assessing systems and identifying their
weak areas and (2) developing optimal mitigation strategies. Researchers utilize both
qualitative and quantitative modeling approaches for deriving quantitative cyber
resilience metrics. In this section, we
present formal mathematical methods and procedures to quantify cyber resilience for
the CPS. We first offer a subjective approach for quantifying cyber resilience utilizing
the analytic hierarchy process (AHP) in Sect. 5.1. We then propose a quantitative
resilience assessment approach using a multi-level vulnerability graph model and its
graph properties in Sect. 5.2. Next, we offer a plan for critical cyber asset
identification utilizing the technique for order of preference by similarity to ideal
solution (TOPSIS) method in Sect. 5.3.
Within this chapter’s scope, we can formally model only selected aspects of the
frameworks. We therefore model network
criticality, system functionality, and cyber resilience analytics for the CPS utilizing
the system’s vulnerabilities. We also provide methods to rank critical assets.
5.1 Cyber Resilience Quantification by Subjective Evaluation
Using Analytical Hierarchy Process (AHP)
The Analytic Hierarchy Process (AHP) is an organized technique for analyzing
complex decisions based on mathematical and psychological comparison (Saaty
[38]). AHP has been in use in the cybersecurity domain to assess security metrics
for a long time because of its ability to combine mathematical objectivity with the
psychological subjectivity to evaluate information and help make decisions [39, 40].
We use AHP to quantify cyber resilience analytics using the subjective evaluation
method based on specific questionnaires. In the next paragraphs, we first present
the mathematical process involved in AHP. We then discuss a case study that assesses
the robustness metric for a hypothetical ICS network in Sect. 5.1.1.
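In AHP, we form the hierarchy by setting a goal to evaluate, criteria to meet that
goal, and the available alternatives (options). Here we illustrate the AHP
procedure for the cyber resilience analytics following Haque et al. [21]. We collect
subjective judgment data from $N$ subject matter experts (SMEs). We compare the $m$
criteria pairwise and form a comparison matrix $P$ of dimension $m \times m$. An element
$P_{ij}$ of $P$ represents the subjective comparison between criteria $i$ and $j$. We
provide the pairwise comparison matrix $P$ in Eq. (1) below, where $P_{ij} = 1/P_{ji}$.

$$
P = \begin{bmatrix}
1 & P_{12} & \cdots & P_{1m} \\
\frac{1}{P_{12}} & 1 & \cdots & P_{2m} \\
\cdots & \cdots & \cdots & \cdots \\
\frac{1}{P_{1m}} & \frac{1}{P_{2m}} & \cdots & 1
\end{bmatrix} \tag{1}
$$

We then derive the normalized comparison matrix $\mathrm{NCM}_P$ from the original
comparison matrix $P$ above, where $\mathrm{NCM}_P(i,j) = P_{ij} / \sum_{k=1}^{m} P_{kj}$,
i.e., each element is divided by the sum of its column.

$$
\mathrm{NCM}_P = \begin{bmatrix}
\mathrm{NCM}_P(1,1) & \mathrm{NCM}_P(1,2) & \cdots & \mathrm{NCM}_P(1,m) \\
\mathrm{NCM}_P(2,1) & \mathrm{NCM}_P(2,2) & \cdots & \mathrm{NCM}_P(2,m) \\
\cdots & \cdots & \cdots & \cdots \\
\mathrm{NCM}_P(m,1) & \mathrm{NCM}_P(m,2) & \cdots & \mathrm{NCM}_P(m,m)
\end{bmatrix} \tag{2}
$$
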
Each criterion has a weight. We compute the weights of the criteria using the
normalized matrix $\mathrm{NCM}_P$. The resulting weights approximate the normalized
principal right eigenvector of the pairwise comparison matrix $P$.

$$
W = \begin{bmatrix} W_1 \\ W_2 \\ \cdots \\ W_m \end{bmatrix} \tag{3}
$$

where $W_i = \frac{1}{m} \sum_{j=1}^{m} \mathrm{NCM}_P(i,j)$. We also need to check the
consistency of the pairwise comparison. We can do that by computing the consistency
ratio $CR = CI/RI$, where $RI$ is the random index and $CI$ is the consistency
index. We calculate $CI$ by utilizing the principal eigenvalue $\lambda_{max}$, as given
in Eq. (4).

$$
CI = \frac{\lambda_{max} - m}{m - 1} \tag{4}
$$

Here, we compute $\lambda_{max}$ by

$$
\lambda_{max} = \sum_{j=1}^{m} \left( \sum_{i=1}^{m} P_{ij} \right) W_j \tag{5}
$$

We find the value of the random index $RI$ from Table 6 of the article by Saaty [38]. We
accept the comparison if the consistency ratio $CR \le 0.1$ (i.e., the consistency index
of the judgments is at most 10% of that of randomly generated judgments). Table 3
provides default $RI$ values for the corresponding $m$ values for cases $m < 10$. Next,
in Sect. 5.1.1, we present an illustration of assessing the robustness metric.
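To make the procedure of Eqs. (1) to (5) concrete, here is a minimal sketch of the weight and consistency computation. It uses the column-normalization and row-average approach described above and the standard Saaty form of the consistency index, $CI = (\lambda_{max} - m)/(m - 1)$. The 3×3 judgment matrix, the function name `ahp_weights`, and the handling of $m = 2$ (where $RI = 0$) are assumptions for illustration and do not come from the survey data of the chapter.

```python
# Random index values for m = 2..8, as listed in Table 3 of the chapter.
RANDOM_INDEX = {2: 0.00, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41}


def ahp_weights(P):
    """Return (weights, lambda_max, consistency_ratio) for a pairwise matrix P."""
    m = len(P)
    col_sums = [sum(P[i][j] for i in range(m)) for j in range(m)]
    # Eq. (2): divide every element by the sum of its column.
    ncm = [[P[i][j] / col_sums[j] for j in range(m)] for i in range(m)]
    # Eq. (3): the weight of criterion i is the average of row i.
    weights = [sum(row) / m for row in ncm]
    # Eq. (5): approximate principal eigenvalue.
    lambda_max = sum(col_sums[j] * weights[j] for j in range(m))
    # Eq. (4): consistency index, then CR = CI / RI.
    ci = (lambda_max - m) / (m - 1)
    ri = RANDOM_INDEX[m]
    # Assumption: a 2x2 reciprocal matrix is always consistent, so CR is set to 0.
    cr = ci / ri if ri > 0 else 0.0
    return weights, lambda_max, cr


# Hypothetical judgments on Saaty's 1-9 scale (reciprocal by construction).
P = [
    [1.0, 1 / 3, 1 / 5],
    [3.0, 1.0, 1 / 2],
    [5.0, 2.0, 1.0],
]

weights, lambda_max, cr = ahp_weights(P)
print("weights:", [round(w, 4) for w in weights])
print("lambda_max:", round(lambda_max, 4), "CR:", round(cr, 4))
print("accepted" if cr <= 0.1 else "rejected: judgments too inconsistent")
```

The row-average method is only an approximation of the principal eigenvector; an exact eigen-decomposition could be substituted without changing the rest of the procedure.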
5.1.1 A Hypothetical ICS Network ‘Robustness’ Assessment Using
AHP
To explain how the AHP mechanism applies in cyber resilience assessment,
we provide here an illustration using the ‘robustness’ metric (one of the four broad
resilience metrics of the R4 model [17]). Let us consider a hypothetical ICS network;
our goal is to evaluate the cyber robustness metric for that ICS network quantitatively.
Here, we utilize the robustness metric’s decomposition, as illustrated by Haque et al.
[22]. As shown in Fig. 7, the robustness metric has three assessment criteria: physical
robustness ($C_1$), technical robustness ($C_2$), and organizational robustness ($C_3$). Each
criterion has four sub-criteria $SC_1$, $SC_2$, $SC_3$, and $SC_4$. $SC_1$ is ICS security (e.g.,
using IDS/IPS or physical security), $SC_2$ is access control (e.g., using firewall policy),
$SC_3$ is ICS product diversity, and $SC_4$ is ICS risk mitigation strategies. Finally,
to be aligned with the Common Vulnerability Scoring System (CVSS) (Mell et al.
[41]), we design each sub-criterion to take one of three alternative values:
high (H), medium (M), and low (L). We present the meaning of high, medium, and
low in Table 4.

Table 3 Values of the random index (RI) for small problems (m < 10)

| m-factor ^a | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Random Index (RI) | 0.00 | 0.58 | 0.90 | 1.12 | 1.24 | 1.32 | 1.41 |

^a See Table 6 of Saaty [38]

Fig. 7 Decomposition of ‘robustness’ metric for ICS using the AHP process hierarchy. Robustness
is one of the four broad categories of cyber resilience metrics in R4 model

Table 4 List of possible values for each of the sub-criteria

| Alternative values | Interpretation of the options |
|---|---|
| High (H) | Specialized security measures are already implemented in the ICS for the associated sub-criterion |
| Medium (M) | Some or partial security measures are implemented in the ICS for the associated sub-criterion |
| Low (L) | No or very few measures are implemented in the ICS for the corresponding sub-criterion |

We first use the Likert scale to set up the pairwise comparison. In the comparison,
equal importance is the lowest level with a numeric value of 1, and
extreme importance is the highest level with a numeric value
of 9. We present a sample assessment question in Fig. 8. We prefer to use the same
numerical scores as the scale used by Saaty [38].

Fig. 8 Sample pairwise comparison between the physical and the technical criteria for the
‘Robustness’ metric using Likert scale

Table 5 Pairwise comparison matrix and eigenvector estimation for maximizing robustness with
respect to the three considered criteria: physical, technical, and organizational

| Criteria | Physical (C1) | Technical (C2) | Organizational (C3) | Normalized eigenvector (evaluated score) |
|---|---|---|---|---|
| Physical (C1) | 1 | 0.11 | 0.20 | 0.0578 |
| Technical (C2) | 8.95 | 1 | 5.101 | 0.7383 |
| Organizational (C3) | 4.89 | 0.20 | 1 | 0.2039 |

We then ask the subject matter experts to answer the questionnaires. We have
conducted the survey and collected a total of $N = 15$ sample data sets, from which
we exclude 5 because of inconsistency ($CR > 0.1$) in the responses. Table 5 contains
the aggregated pairwise comparison matrix that we have computed from the
consistent data set for the three criteria $C_1$, $C_2$, and $C_3$. From the normalized right
eigenvector in Table 5, we find the robustness as a function of the physical ($C_1$),
technical ($C_2$), and organizational ($C_3$) criteria, as given in Eq. (6):
$$
\text{Robustness} = 0.06 \times \text{Physical} + 0.74 \times \text{Technical} + 0.20 \times \text{Organizational}
$$
or,
$$
R_1 = 0.06 \times C_1 + 0.74 \times C_2 + 0.20 \times C_3 \tag{6}
$$
Similarly, we have made pairwise comparisons for the sub-criteria and alternatives.
We derive the weights of each criterion from the normalized eigenvector corresponding
to the options at the lower level of the hierarchy in the AHP model. Figure 9 shows
a sample pairwise comparison between the options high (H) and medium
(M). Equation (7) provides the numerical values that we have obtained for the weights
(normalized eigenvectors) of the four sub-criteria and the three alternatives in
matrix form.

$$
\begin{bmatrix}
\text{ICS Security} \\ \text{Access Control} \\ \text{ICS Product Diversity} \\ \text{ICS Risk Management}
\end{bmatrix}
=
\begin{bmatrix} 0.4 \\ 0.2 \\ 0.1 \\ 0.3 \end{bmatrix},
\quad \text{and} \quad
\begin{bmatrix}
\text{High (H)} \\ \text{Medium (M)} \\ \text{Low (L)}
\end{bmatrix}
=
\begin{bmatrix} 0.7 \\ 0.2 \\ 0.1 \end{bmatrix} \tag{7}
$$

Fig. 9 Sample pairwise comparison between the options high (H) and medium (M) for the access
control sub-criteria using Likert scale

This way, we can assess the four broad cyber resilience metrics one at a time
using the AHP, and then combine the assessment to reach a consolidated value for
the resilience metric.
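The chapter does not spell out how the weights of Eqs. (6) and (7) are combined with an actual assessment of a network. The following sketch therefore assumes a plain weighted sum: the score of a criterion is the sum, over its four sub-criteria, of the sub-criterion weight times the weight of the selected level (H, M, or L), and the robustness value is the criteria-weighted sum of Eq. (6). This aggregation rule, the example ratings, and all function and variable names are assumptions for illustration only.

```python
# Weights taken from Eq. (6) and Eq. (7) of the chapter.
CRITERIA_WEIGHTS = {"physical": 0.06, "technical": 0.74, "organizational": 0.20}
SUBCRITERIA_WEIGHTS = {
    "ics_security": 0.4,
    "access_control": 0.2,
    "product_diversity": 0.1,
    "risk_management": 0.3,
}
LEVEL_WEIGHTS = {"H": 0.7, "M": 0.2, "L": 0.1}


def criterion_score(ratings):
    """Assumed aggregation: sum of sub-criterion weight x level weight."""
    return sum(
        SUBCRITERIA_WEIGHTS[name] * LEVEL_WEIGHTS[level]
        for name, level in ratings.items()
    )


def robustness(assessment):
    """Eq. (6): criteria-weighted sum of the three criterion scores."""
    return sum(
        CRITERIA_WEIGHTS[criterion] * criterion_score(ratings)
        for criterion, ratings in assessment.items()
    )


# Hypothetical H/M/L ratings for one ICS network.
assessment = {
    "physical": {"ics_security": "M", "access_control": "H",
                 "product_diversity": "L", "risk_management": "M"},
    "technical": {"ics_security": "H", "access_control": "H",
                  "product_diversity": "M", "risk_management": "M"},
    "organizational": {"ics_security": "L", "access_control": "M",
                       "product_diversity": "L", "risk_management": "M"},
}

r1 = robustness(assessment)
print(f"robustness R1 = {r1:.4f}")
```

Under this assumed rule, the score ranges from 0.1 (every sub-criterion rated L) to 0.7 (every sub-criterion rated H), so a rescaling step may be desirable before combining it with the other R4 metrics.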
5.2 Cyber Resilience Assessment Using Multi-level Directed
Acyclic Vulnerability Graph Model
As we have already stated, our goal within the context of this chapter is to apply
the frameworks and security practices to evaluate the quantitative cyber resilience
metric. In this section, we describe a multi-level vulnerability graph model to assess
the cyber resilience quantitatively. We find graph-theoretic security analytics is one
of the most common methods to address cyber risk and resilience. Next, we present
the background information necessary to understand the cyber resilience modeling
approach.
5.2.1 Background Information for Graph-Theoretic Modeling and
Analysis
This section discusses the graph-based modeling approach to provide the readers with
the necessary background information about our resilience quantification methodology.
We take some of the definitions from one of our recent works [26].
We frequently refer to SCADA systems for illustration purposes as we formulate the
mathematical models by keeping in mind the energy delivery systems (EDS) as an
example of CPS. Readers may consider SCADA as a monitoring and control system
for the physical field devices. We find that researchers commonly refer to cyber-
physical power systems (CPPS) [2,3] when it comes to the discussion of energy
systems cybersecurity. That is why we take the power systems’ case to illustrate the
model that applies equally to other CPS.
(1) Vulnerability graph: We define a vulnerability graph as a directed acyclic graph
(DAG). In general, a vulnerability graph is a type of attack graph. Mathematically,
we represent the vulnerability graph as $G = (N, E, W)$, where $N$ is the set of
vertices; $E$ is the set of edges, where $E \subseteq N \times N$; and $W$ is the weight matrix
of the graph. If there exists an edge $e = (i, j)$ between vertices $i$ and $j$, then the
vertices $i$ and $j$ are adjacent to each other. An adjacency matrix $A$ of a graph
$G = (N, E, W)$ with $|N| = n$ is an $n \times n$ matrix, where $A_{ij} = W_{ij}$ if $(i, j) \in E$
and $A_{ij} = 0$ otherwise. The weight $W_{ij}$ of the edge $(i, j)$ comes
from the CVSS vulnerability base score (see Mell et al. [41]) of the node $j$. The
multi-level vulnerability graph is the same as the vulnerability graph, except that the
different layers (as per the defense-in-depth security strategy) are modeled
as separate graphs. There can be single or multiple perimeter devices, such as a
firewall, between two consecutive layers. We encourage
readers to explore more about the multi-level vulnerability graph in the article
by Haque [42].
(2) Network topology: In a CPS network, the network design follows specific system
architecture and security policies (e.g., firewall rule-sets). In the CPS, as per
the NIST guidelines (see Stouffer et al. [5]), ICS firewalls control the allowed
protocols or message communications among the field devices through the rule-
sets or policies. We consider that the adjacency matrix is sufficient to represent
the network connectivity in the vulnerability graph.
(3) Control function: We consider a control function to be a logical connection that
carries (or transmits) data from the field devices to SCADA and control commands
from SCADA to the field devices. These functions perform specific tasks such
as voltage regulation adjustment. Formally, we define a control function
$CF(i,j)$ between nodes $i$ and $j$ as $\{CF(i,j) = e(i,j) \mid \exists\, e(i,j) \in E,\ A_{ij} \neq 0 \text{ and } W_{ij} > 0\}$;
thus, the edges represent the control functions in
the vulnerability graph. As we utilize the CVSS base scores, the edge weight
indicates the possible exploitability and impact of exploiting the
particular control function. We do not consider the degree of operability of
the control functions in this model as described in FDNA [43], because that
raises a different research question of modeling and incorporating the functional
dependencies in the cyber resilience assessment.
(4) CVSS base, exploitability, and impact scores: CVSS [41] defines the exploitability
and impact metrics for every known vulnerability. The national vulnerability
database [44] provides the CVSS scores for all the reported (i.e., known)
vulnerabilities. The exploitability metric comprises three base metrics: access vector
$A_V$, access complexity $A_C$, and access authentication $A_U$. Similarly, the impact
metric is also composed of three base metrics: confidentiality impact $I_C$, integrity
impact $I_I$, and availability impact $I_A$. CVSS computes the exploitability $E_i$ and
impact $I_i$ of a vulnerability $i$ using Eq. (8).

$$
\begin{aligned}
E_i &= 20 \times A_V^i \times A_C^i \times A_U^i \\
I_i &= 10.41 \times \left(1 - (1 - I_C^i)(1 - I_I^i)(1 - I_A^i)\right)
\end{aligned} \tag{8}
$$

The exploitability, impact, and base scores are measured on a scale of
0–10. The higher the value, the higher the exploit capability or consequences.
To define the base score, CVSS defines an impact function as given below:

$$
f(I_i) = \begin{cases} 0 & \text{if } I_i = 0 \\ 1.176 & \text{otherwise} \end{cases} \tag{9}
$$

Finally, CVSS computes the base score (BS) of vulnerability $i$ using the equation
below (see [41]):

$$
BS_i = \mathrm{roundTo1Decimal}\left(\left((0.6 \times I_i) + (0.4 \times E_i) - 1.5\right) \times f(I_i)\right) \tag{10}
$$

(5) Multi-edge to single edge transformation: In a network, if a node has multiple
vulnerabilities, the graph becomes a multi-digraph. The number of paths from
source to destination increases exponentially and creates scalability problems for
large networks. To avoid this, we transform the multi-edged directed vulnerability
graph to a single-edged directed graph (simple graph) using the composite
exploitability score. As the severity of the exploitability and impact differs
across vulnerabilities, we use a severity-based weight approach (see
Table 3 of [25]) to incorporate the severity level of the vulnerability. The
composite exploitability score (ES), impact score (IS), and base score (BS) for node
$j$, having vulnerabilities $i = 1, \ldots, n$, are defined in Eqs. (11), (12), and (13).

$$
ES_j = \frac{\sum_{i=1}^{n} w_i^j \times E_i^j}{\sum_{i=1}^{n} w_i^j} \tag{11}
$$

$$
IS_j = \frac{\sum_{i=1}^{n} w_i^j \times I_i^j}{\sum_{i=1}^{n} w_i^j} \tag{12}
$$

$$
BS_j = \frac{\sum_{i=1}^{n} w_i^j \times BS_i^j}{\sum_{i=1}^{n} w_i^j} \tag{13}
$$

Here, $w_i^j$, $E_i^j$, $I_i^j$, and $BS_i^j$ are the severity weight, exploitability score,
impact score, and base score of vulnerability $i$ of node $j$. We find $BS_i^j$ from the
NVD [44] or by using Eq. (10), and we compute $BS_j$, the composite base score of
node $j$, using Eq. (13).
(6) Computation of edge weight: We utilize the composite CVSS base scores from
Eq. (13) as the edge weights. This way, we consider both the exploitability and
impact of a vulnerability in our edge weight. The weight matrix is as follows.

$$
W_{ij} = \begin{cases} BS_j & \text{if } (i,j) \in E \\ 0 & \text{otherwise, i.e., if } (i,j) \notin E \end{cases} \tag{14}
$$

(7) Betweenness Centrality (BC): Betweenness Centrality (BC) is a graph-theoretic
metric that measures the number of times a node acts as a bridge along the shortest
paths between two other nodes. If we translate a network into a graph-theoretic
model, then the BC of a node indicates the possibility of attack progression
through that node. Mathematically, the BC of node $n$ (i.e., $B_n$) is as follows:

$$
B_n = \sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}} \tag{15}
$$

Here, $\sigma_{st}$ is the total number of shortest paths from source node $s$ to target
node $t$, and $\sigma_{st}(n)$ is the number of those shortest paths that pass through
node $n$.
(8) Katz Centrality (KC): Katz Centrality (KC) is another graph-theoretic parameter
that gives the importance of a node considering the network structure and the node’s
position in the network. KC quantifies the number of nodes connected through a
path, while penalizing the contributions of distant nodes. Mathematically, we
define the KC of node $i$ as given in Eq. (16), where $\beta$ is an attenuation factor and
$0 \le \beta \le 1$.

$$
C_{Katz}(i) = \sum_{p=1}^{\infty} \sum_{m=1}^{n} \beta^{p} (A^{p})_{mi} \tag{16}
$$

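Items (4) to (6) above can be traced with a short sketch that computes a base score, a composite node score, and the weight matrix, following Eqs. (8) to (14). The numeric metric values used in the example are the CVSS version 2 constants for a network-accessible, low-complexity, no-authentication vulnerability with complete or partial impact. The severity weights, the three-node graph, and all function names are hypothetical; the actual severity weights of Table 3 in [25] are not reproduced here.

```python
def exploitability(av, ac, au):
    """Eq. (8): exploitability sub-score."""
    return 20 * av * ac * au


def impact(conf, integ, avail):
    """Eq. (8): impact sub-score."""
    return 10.41 * (1 - (1 - conf) * (1 - integ) * (1 - avail))


def base_score(imp, expl):
    """Eqs. (9) and (10): base score rounded to one decimal."""
    f = 0.0 if imp == 0 else 1.176
    return round((0.6 * imp + 0.4 * expl - 1.5) * f, 1)


def composite(scores, severity_weights):
    """Eqs. (11)-(13): severity-weighted average over a node's vulnerabilities."""
    return sum(w * s for w, s in zip(severity_weights, scores)) / sum(severity_weights)


def weight_matrix(n, edges, node_score):
    """Eq. (14): W[i][j] is the composite base score of target node j."""
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        W[i][j] = node_score[j]
    return W


# CVSS v2 metric values: AV network = 1.0, AC low = 0.71, Au none = 0.704,
# impact complete = 0.660, impact partial = 0.275.
expl = exploitability(1.0, 0.71, 0.704)
bs_complete = base_score(impact(0.660, 0.660, 0.660), expl)
bs_partial = base_score(impact(0.275, 0.275, 0.275), expl)

# Hypothetical node 1 with two vulnerabilities and assumed severity weights.
bs_node1 = composite([bs_complete, bs_partial], [1.0, 0.5])

# Hypothetical 3-node chain 0 -> 1 -> 2; node 2 has a single vulnerability.
W = weight_matrix(3, [(0, 1), (1, 2)], {1: bs_node1, 2: bs_partial})
print(bs_complete, bs_partial, round(bs_node1, 2))
print(W)
```

Because the edge weight depends only on the target node, all edges entering the same node share one weight, which is what collapses the multi-digraph into a simple graph.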
The following subsections present the derivation of system critical functional-
ity and resilience metrics, as shown by Haque et al. [26]. We utilize network
criticality to formulate system functionality.
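The two centrality measures of Eqs. (15) and (16) can be sketched in plain Python for a small directed acyclic graph. The sketch makes several assumptions that the chapter does not state: shortest paths are counted by hop count, Katz centrality uses a 0/1 adjacency matrix rather than the CVSS-weighted one, the attenuation factor is set to 0.5, and the four-node topology and node names are invented for illustration.

```python
from collections import deque


def bfs_counts(adj, source):
    """Hop distances and numbers of shortest paths from source to every node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, nodes):
    """Eq. (15): sum over s != n != t of sigma_st(n) / sigma_st."""
    info = {s: bfs_counts(adj, s) for s in nodes}
    bc = {n: 0.0 for n in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for n in nodes:
                if n in (s, t) or n not in dist_s:
                    continue
                dist_n, sigma_n = info[n]
                # n lies on a shortest s-t path iff the distances add up.
                if t in dist_n and dist_s[n] + dist_n[t] == dist_s[t]:
                    bc[n] += sigma_s[n] * sigma_n[t] / sigma_s[t]
    return bc


def katz(adj, nodes, beta=0.5, max_power=100):
    """Eq. (16) with a 0/1 adjacency matrix; the series is finite on a DAG."""
    walks = {n: 1.0 for n in nodes}   # walks of length 0 ending at each node
    score = {n: 0.0 for n in nodes}
    for p in range(1, max_power + 1):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            for v in adj.get(u, []):
                nxt[v] += walks[u]    # column sums of A^p
        if not any(nxt.values()):
            break
        for n in nodes:
            score[n] += beta ** p * nxt[n]
        walks = nxt
    return score


# Hypothetical topology: firewall -> {HMI, historian} -> PLC.
nodes = ["fw", "hmi", "hist", "plc"]
adj = {"fw": ["hmi", "hist"], "hmi": ["plc"], "hist": ["plc"]}

bc = betweenness(adj, nodes)
kc = katz(adj, nodes)
print("betweenness:", bc)
print("katz:", kc)
```

In this toy topology the HMI and the historian each carry half of the shortest paths from the firewall to the PLC, while the PLC, as the common target, has the highest Katz score.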
5.2.2 Critical System Functionality (CSF)
System critical functionality is the level of minimum functionality maintained by a
system during any adverse scenario. It depicts the extent to which the system’s typical
performance can degrade. While discussing resiliency, Arghandeh et al. [45] illustrate
resilience as a multi-dimensional property, which requires managing disturbances of
the network performance. Such a disturbance may originate from malfunctions or
failures of physical or cyber devices, or from a cyberattack incident. Arghandeh
et al. also describe critical system functionality as maintaining the system’s minimal
required services in the presence of unexpected extreme disturbances. In another
study, Bharali and Baruah [46] define average network functionality using the network
criticality metric. Bharali and Baruah consider random network failures while
determining network functionality using a graph-theoretic approach. We extend the
analysis of Bharali and Baruah [46] to the case of random cyberattacks on the CPS.
We assume that removing an edge in the vulnerability graph makes a service unavailable
or deactivates a control function because the logical connection is lost. Here
we adopt the same average network functionality metric as the system’s critical
functionality. This is the level of functionality maintained by the CPS under a
cyberattack.
Let us denote the original graph before any attack incident happens by $G_o = G$,
and the graph obtained by removing the edge $e$ during an attack incident by
$G_e = G \setminus e$. Let $\tau$ and $\tau_e$ be the network criticality of the graphs $G_o$ and
$G_e$, respectively. Then we define the critical system functionality by considering the
effect of the edges removed from the original graph as given by Eq. (17).
$$\eta = 1 - \frac{1}{m} \sum_{e \in E} \left[ I^{+}(\tau_e - \tau)\,\frac{\tau_e - \tau}{\tau_e} + I^{-}(\tau_e - \tau)\,\frac{\tau_e - \tau}{\tau_e + \frac{2n}{\mu}} \right] \tag{17}$$
where $m$ denotes the number of edges in $G_o$, $\mu$ is the smallest non-zero eigenvalue of
$G_o$, $I^{+}(x) = 1$ if $x \ge 0$ and 0 otherwise, and $I^{-}(x) = 1$ if $x < 0$ and 0 otherwise. For
a connected graph $G_o$, $\mu = \mu_1$ is the algebraic connectivity of $G_o$. Here, $0 \le \eta \le 1$.
Thus, $\eta$ indicates the system functionality of the CPS under cyberattack events, i.e.,
the functionality or services available during the attack event considering the impacts
on the links. A higher value of $\eta$ means a higher degree of system functionality
is maintained. We discuss the computation process of the network criticality $\tau$ in
Sect. 5.2.4.
5.2.3 Cyber Resilience Metric
Deriving resilience analytics requires understanding and incorporating system behav-
ior (linear or non-linear) during the recovery phase. It also needs to incorporate critical
system functionality while generating resilience metrics. Roberson et al. [47] define
resilience from the bulk power system perspective, where the authors consider that
the safeguarding and restoration of the system functionality subject to perturbations
are key elements of resilience. We compute the CPS’s cyber resilience by utilizing the
system performance or recovery curve, as given in Fig. 10, incorporating the critical
system functionality metric. Typically, during an adverse event, the recovery behavior
of a system is non-linear. This recovery is a function of the system ($S$) under
consideration, duration of recovery ($T$), recovery rate ($r$), time ($t$), and the functionality
level ($\eta$). Zobel [48] addresses the power system recovery behavior from disaster
resilience and proposes several functional forms to model the recovery over time. In
this work, we utilize the inverted exponential form of the recovery curve from Zobel
[48], which captures the non-linearity and is suitable for modeling the resilience of the
CPS. We model the time-dependent system recovery behavior $Q_r(t)$ by following
Eq. (6) of Zobel [48] to demonstrate the quantitative resilience metric under any
adverse event. Here the impact is equivalent to the loss of system performance, or
$1 - \eta$, where $0 \le \eta \le 1$.
$$Q_r(t) = (1-\eta)\left[1 - e^{-\frac{\left(T - (t - t_i^{ri})\right)\ln(n)}{T}} + \frac{T - (t - t_i^{ri})}{nT}\right] \tag{18}$$
Fig. 10 System performance recovery curve during a cyberattack incident $i$ on the CPS. We use
the graph from Haque et al. [26], which is a modified form of the resilience graph presented in Wei
and Ji [28]
Table 6 Notations used for resilience modeling

| Notation | Explanation |
|---|---|
| $Q_r(t)$ | Time-dependent system recovery behavior |
| $t_i^{ri}$ | Time instance of initiating system recovery for incident $i$ |
| $t_i^{cr}$ | Time instance of complete recovery of system functions for attack incident $i$ |
| $T = t_i^{cr} - t_i^{ri}$ | The period of recovery |
| $T^*$ | System-dependent maximum allowable time for the recovery |
| $n$ (in Eq. 18) | The level of concavity of the inverted exponential curve |
We provide the notations used in Eq. (18) in Fig. 10 and Table 6. Here, $T^*$ is the
system-dependent maximum allowable time to recover. Typically, system administrators
or designers select $T^*$ as the maximum acceptable time for the system to
recover. The area enclosed by the points e-a-d represents the loss in system
functionality over time due to the cyberattack incident $i$. Thus, the area enclosed by
the points a-b-c-d is the area of system resilience. To compute the resilience metric,
we first calculate the area enclosed by the points e-a-d as follows:
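$$A_{e\text{-}a\text{-}d} = (1-\eta) \int_{t_i^{ri}}^{t_i^{ri}+T} \left[1 - e^{-\frac{\left(T - (t - t_i^{ri})\right)\ln(n)}{T}} + \frac{T - (t - t_i^{ri})}{nT}\right] dt \tag{19}$$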
Simplifying the above equation, we find the following reduced form as in Eq. (20).
$$A_{e\text{-}a\text{-}d} = (1-\eta)\,T\left[1 - \frac{n-1}{n\ln(n)} + \frac{1}{2n}\right] \tag{20}$$
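The simplification can be checked with a short substitution, sketched here under the assumption that the integrand is the inverted exponential loss curve of Eq. (18). The auxiliary variable $v = T - (t - t_i^{ri})$ is introduced only for this derivation; it runs from $T$ down to 0 as $t$ runs from $t_i^{ri}$ to $t_i^{ri} + T$.

$$\begin{aligned}
\int_{t_i^{ri}}^{t_i^{ri}+T} \left[1 - e^{-\frac{\left(T - (t - t_i^{ri})\right)\ln(n)}{T}} + \frac{T - (t - t_i^{ri})}{nT}\right] dt
&= \int_{0}^{T} \left[1 - e^{-\frac{v \ln(n)}{T}} + \frac{v}{nT}\right] dv \\
&= T - \frac{T}{\ln(n)}\left(1 - \frac{1}{n}\right) + \frac{T}{2n} \\
&= T\left[1 - \frac{n-1}{n\ln(n)} + \frac{1}{2n}\right]
\end{aligned}$$

Multiplying by the loss magnitude $(1-\eta)$ gives the reduced form of the area e-a-d.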
From Fig. 10, the area of e-b-c-d is $1 \times T^* = T^*$, and the area of e-a-d is defined by Eq. (20).
Thus, the cyber resilience of the CPS is the area enclosed by the points a-b-c-d
over the period $T^*$, as given in Eq. (21).
$$\xi = \frac{1}{T^*}\left[T^* - (1-\eta)\,T\left(1 - \frac{n-1}{n\ln(n)} + \frac{1}{2n}\right)\right] \tag{21}$$
The term $1 - \frac{n-1}{n\ln(n)} + \frac{1}{2n}$ is a constant for a specific $n$, and is denoted by
$\gamma$. Thus, Eq. (21) becomes $\xi = \frac{1}{T^*}\left[T^* - (1-\eta)\,T\gamma\right]$.
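The resilience metric is straightforward to evaluate once $\eta$, $T$, $T^*$, and $n$ are known. The following sketch computes $\gamma$ and $\xi$ and compares the closed-form loss area with a numerical integration of the inverted exponential loss curve. All function names and the input values (`eta=0.6`, `T=10`, `T_max=24`, `n=5`) are hypothetical and chosen only for illustration; `T_max` stands for the maximum allowable recovery time, and $n$ must be greater than 1.

```python
import math


def gamma(n):
    """Constant term of Eq. (21) for concavity level n (requires n > 1)."""
    return 1.0 - (n - 1.0) / (n * math.log(n)) + 1.0 / (2.0 * n)


def loss_curve(elapsed, eta, T, n):
    """Loss of functionality at `elapsed` time units after recovery starts."""
    remaining = T - elapsed
    return (1.0 - eta) * (
        1.0 - math.exp(-remaining * math.log(n) / T) + remaining / (n * T)
    )


def cyber_resilience(eta, T, T_max, n):
    """Resilience: (T_max - (1 - eta) * T * gamma) / T_max."""
    return (T_max - (1.0 - eta) * T * gamma(n)) / T_max


def numeric_loss_area(eta, T, n, steps=20000):
    """Midpoint-rule integral of the loss curve over the recovery period."""
    h = T / steps
    return sum(loss_curve((k + 0.5) * h, eta, T, n) for k in range(steps)) * h


eta, T, T_max, n = 0.6, 10.0, 24.0, 5.0  # illustrative inputs only
closed_form_area = (1.0 - eta) * T * gamma(n)
numeric_area = numeric_loss_area(eta, T, n)
xi = cyber_resilience(eta, T, T_max, n)
print(closed_form_area, numeric_area, xi)
```

The two area values should agree to within the integration error, which offers a quick sanity check on the closed form before it is used in a larger analysis.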
5.2.4 Network Criticality
As we have seen earlier, to compute the CSF, we need the criticality metric. Bharali
and Baruah [46], and Tizghadam and Garcia [49] proposed a graph-based network
criticality metric. We apply the same here to measure the criticality of the overall CPS
network. We use the Moore-Penrose inverse of the Laplacian matrix Lto compute
the network criticality $\tau$. As we are using a directed weighted graph, we define the
Laplacian matrix $L$ as per Chung [50], as given in Eq. (22). In Eq. (22), $P$ is the graph
transition matrix, and $\Phi$ is a matrix with the Perron vector of $P$ on the diagonal and zeros
in all other matrix elements.
$$L = I - \frac{\Phi^{1/2} P\, \Phi^{-1/2} + \Phi^{-1/2} P^{T} \Phi^{1/2}}{2} \tag{22}$$
We can also derive $L$ by using the normalized symmetric Laplacian $L_{sym}$ and the random-walk
Laplacian $L_{rw}$, as below.
$$L_{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}, \qquad L_{rw} = D^{-1/2} L_{sym} D^{1/2} \tag{23}$$
$D$ is a diagonal matrix formed by the degrees of the nodes. We define it as
$D = \mathrm{diag}(d_1, d_2, \ldots, d_m)$, where $d_i = \sum_{j=1}^{m} W_{ij}$. We use Bernstein [51] to compute the
Moore-Penrose inverse of the Laplacian matrix $L$, i.e., $L^{+}$, as given in Eq. (24).
$$L^{+} = \left(L + \frac{J}{n}\right)^{-1} - \frac{J}{n} \tag{24}$$
where $J$ is an $n \times n$ matrix whose entries are all equal to 1. We then define the
network criticality metric $\tau$ by Eq. (25).
$$\tau = 2n\,\mathrm{trace}(L^{+}) \tag{25}$$
Here, $n$ is the number of nodes, and $\mathrm{trace}(L^{+}) = \sum_{i=1}^{n} (L^{+})_{ii}$. A larger
value of $\tau$ means the network is more vulnerable from the exploitability perspective.
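The computation of $\tau$ in Eqs. (24) and (25) can be sketched as follows. This is a simplified illustration, not the directed-graph formulation of Eq. (22): it assumes an undirected, connected graph with a symmetric weight matrix and uses the combinatorial Laplacian $D - W$. The function names and the three-node graphs are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; is the graph connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def network_criticality(weights):
    """tau = 2 n trace(L+), with L+ = (L + J/n)^-1 - J/n and L = D - W (assumed symmetric)."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    laplacian = [
        [(degree[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]
    shifted = [[laplacian[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    pinv_trace = sum(inverse[i][i] - 1.0 / n for i in range(n))
    return 2.0 * n * pinv_trace


# Hypothetical unit-weight triangle, and the same graph with edge (0, 2) removed.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
tau = network_criticality(triangle)
tau_e = network_criticality(path)
print(tau, tau_e)
```

In this toy case the criticality rises when the edge is removed, which matches the reading that a larger $\tau$ indicates a more vulnerable network.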
We can apply the above vulnerability graph-based resilience analytics derivation
approaches in the CPS context to assess the overall network functionality, criticality,
and resiliency. Next, we present a process to identify and rank the critical cyber assets
using the TOPSIS method in Sect. 5.3. We consider the ranking an essential step
towards realizing the security guidelines as identifying critical assets of the network
is among the criteria in the recommended defense-in-depth security measures.
5.3 Ranking Critical Assets Using TOPSIS Method
Determining criticality for the network devices is a multi-attribute decision analysis
(MADA) problem. Haque et al. [25] have identified some of the crucial parameters
for ranking the critical devices in the power system network from a cyberattack
perspective using the vulnerability graph model. Here we apply the TOPSIS method
as a MADM (Multiple-Attribute Decision Making) technique to rank the critical
devices in a CPS network.
Assessment Parameters: Here, we consider four parameters to assess each device’s
criticality, although it is possible to take $N$ parameters into the decision-making
process. The parameters are (1) the device’s asset value, represented by Katz centrality,
(2) the device’s bridging capability, which we model using betweenness centrality, (3)
attack occurrence exploitability, and (4) potential attack impact. One can compute
the attack exploitability and attack impact on a device using Eqs. (11) and (12). Also,
we find the BC and KC using Eqs. (15) and (16). For details on the meaning of asset
value, exploitability, and attack impact, we refer readers to the article
by Haque et al. [25]; we omit those details here to keep the focus on the specific
problem under consideration.
TOPSIS Method for Device Criticality Assessment: The Technique for Order of
Preference by Similarity to Ideal Solution (TOPSIS) is a MADM technique. It builds
on the concept that the chosen alternative should have the shortest geometric distance
from the positive ideal solution and the longest geometric distance from the negative
ideal solution (see Hwang et al. [52]). Kim and Kang [53] use and illustrate
TOPSIS to determine the device criticality. Here, we briefly present the steps of the
TOPSIS method to facilitate an understanding of the ranking process.
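Step I: At first, we form an $m \times n$ matrix with $m$ criteria (i.e., parameters) and $n$
alternatives (i.e., nodes/devices). The intersection of each criterion and alternative
contains a value $y_{ij}$, where $\text{Criteria}_i$ and $\text{Alternative}_j$ are the $i$th criterion
and $j$th alternative.
$$Y_{m \times n} = \begin{array}{c|cccc} & \text{Alternative}_1 & \text{Alternative}_2 & \cdots & \text{Alternative}_n \\ \hline \text{Criteria}_1 & y_{11} & y_{12} & \cdots & y_{1n} \\ \text{Criteria}_2 & y_{21} & y_{22} & \cdots & y_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \text{Criteria}_m & y_{m1} & y_{m2} & \cdots & y_{mn} \end{array} \tag{26}$$
Step II: In this step, we normalize the matrix $Y_{m \times n}$ to form a normalization matrix
$R_{m \times n} = (R_{ij})_{m \times n}$ using the below equation.
$$R_{ij} = \frac{y_{ij}}{\sqrt{\sum_{i=1}^{m} (y_{ij})^2}} \tag{27}$$
Step III: Here, we calculate the weighted normalized decision matrix $T$ as below.
The weights $W_i$ are computed using AHP. We illustrate an example in
Sect. 5.3.1.
$$T = (t_{ij})_{m \times n} = (W_i R_{ij})_{m \times n}, \quad j = 1, 2, \ldots, n \tag{28}$$
Step IV: We determine the worst alternative $A_w$ and the best alternative $A_b$.
$$A_w = \{\langle \max(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I^{-} \rangle, \langle \min(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I^{+} \rangle\} = \{t_{wi} \mid i = 1, 2, \ldots, m\} \tag{29}$$
$$A_b = \{\langle \min(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I^{-} \rangle, \langle \max(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I^{+} \rangle\} = \{t_{bi} \mid i = 1, 2, \ldots, m\} \tag{30}$$
where $I^{+} = \{i = 1, 2, \ldots, m \mid i\}$ represents the criteria having a positive impact
and $I^{-} = \{i = 1, 2, \ldots, m \mid i\}$ represents the criteria having a negative impact.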
Step V: We compute the L2-distance between the target alternative $j$ and the worst
condition $A_w$.
$$d_{jw} = \sqrt{\sum_{i=1}^{m} (t_{ij} - t_{wi})^2}, \quad j = 1, 2, \ldots, n \tag{31}$$
The distance between the alternative $j$ and the best condition $A_b$ is:
$$d_{jb} = \sqrt{\sum_{i=1}^{m} (t_{ij} - t_{bi})^2}, \quad j = 1, 2, \ldots, n \tag{32}$$
where $d_{jw}$ and $d_{jb}$ are L2-norm distances from the target alternative $j$ to the worst
and best conditions, respectively.
Step VI: Finally, we compute the criticality of device $j$ (alternative
$j$) using Eq. (33):
$$\eta_j = \frac{d_{jw}}{d_{jw} + d_{jb}}, \quad 0 \le \eta_j \le 1, \quad j = 1, 2, \ldots, n \tag{33}$$
Using the device criticality metric, we can identify and rank the critical network
devices.
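Steps I to VI can be sketched compactly in code. The example below is an illustration under stated assumptions rather than the authors’ implementation: it follows the common TOPSIS convention of scaling each criterion row by its Euclidean norm over the alternatives, it takes the weights as given instead of deriving them with AHP, and the function name `topsis` and the input data are hypothetical.

```python
import math


def topsis(values, weights, benefit):
    """Return a closeness score per alternative.

    values[i][j] is the value of criterion i for alternative j (m x n layout),
    weights[i] is the weight of criterion i, and benefit[i] is True when a
    larger value of criterion i is better (positive-impact criterion).
    """
    m, n = len(values), len(values[0])
    weighted = []
    for i in range(m):
        # Assumed convention: normalize each criterion across the alternatives.
        norm = math.sqrt(sum(v * v for v in values[i]))
        weighted.append([weights[i] * v / norm for v in values[i]])
    best = [max(row) if benefit[i] else min(row) for i, row in enumerate(weighted)]
    worst = [min(row) if benefit[i] else max(row) for i, row in enumerate(weighted)]
    scores = []
    for j in range(n):
        d_worst = math.sqrt(sum((weighted[i][j] - worst[i]) ** 2 for i in range(m)))
        d_best = math.sqrt(sum((weighted[i][j] - best[i]) ** 2 for i in range(m)))
        scores.append(d_worst / (d_worst + d_best))
    return scores


# Hypothetical data: two positive-impact criteria, three alternatives.
values = [
    [9.0, 5.0, 1.0],
    [8.0, 4.0, 2.0],
]
scores = topsis(values, weights=[0.5, 0.5], benefit=[True, True])
print(scores)
```

The first alternative is best on every criterion and the third is worst on every criterion, so they receive the extreme scores of 1 and 0, with the second alternative in between. The sketch does not guard against a criterion row of all zeros or against identical alternatives, both of which would cause a division by zero.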
5.3.1 Illustration of Ranking Critical Cyber Assets Using Vulnerability
Graph and TOPSIS Method
We illustrate an application of TOPSIS to CPS network asset ranking
using the vulnerability graph of Fig. 11, which we treat as the vulnerability
graph representation of a CPS with ten devices. Because of space constraints, we apply
TOPSIS and report the criticality only for the nodes (or devices) 3–8 of Fig. 11.
Each edge weight contains two parameters: exploitability
score and impact score. Table 7 shows the weights of the criteria and the parameter
values of the nodes. Table 8 shows the corresponding TOPSIS computation. In
Table 8, the bold italic underlined value is the maximum of the criteria, and the
bold-only value is the minimum of the criteria. Here we find that the most critical device
is 4, followed by 7 and 3, and the least critical one is 8 among the six nodes
that we have considered. Again, this is a sample illustration of how we can apply
TOPSIS by choosing some criteria and corresponding weights using a vulnerability
graph representation of a CPS.
Fig. 11 A sample vulnerability graph with arbitrary edge weights. We represent the edge weights as
(exploitability score, impact score). The number of nodes used in this illustration is ten. The edge
weights are within the range of 0–10 to keep them similar to CVSS scores
Table 7 Device criticality assessment parameters and values

| Parameters | Weight ($w_c$) | Node 3 | Node 4 | Node 5 | Node 6 | Node 7 | Node 8 |
|---|---|---|---|---|---|---|---|
| Exploitability | 0.25 | 7.9 | 4.3 | 5.7 | 3.1 | 2.5 | 4.6 |
| Impact | 0.5 | 6.4 | 7.8 | 3.4 | 5.2 | 8.9 | 3.2 |
| Betweenness centrality | 0.15 | 0.1273 | 0.0671 | 0.0231 | 0.0417 | 0.1018 | 0.1111 |
| Katz centrality | 0.1 | 0.3243 | 0.3299 | 0.3010 | 0.3004 | 0.3635 | 0.3643 |
Table 8 TOPSIS device criticality metrics computation

| Parameters/Metrics | Device 3 | Device 4 | Device 5 | Device 6 | Device 7 | Device 8 |
|---|---|---|---|---|---|---|
| Exploitability | 1.978 | 1.075 | 1.425 | 0.775 | 0.625 | 1.15 |
| Impact | 3.2 | 3.9 | 1.7 | 2.6 | 4.45 | 1.6 |
| Betweenness centrality | 0.019095 | 0.010065 | 0.003465 | 0.006255 | 0.01527 | 0.016665 |
| Katz centrality | 0.03243 | 0.03299 | 0.0301 | 0.03004 | 0.03635 | 0.03643 |
| $d_{jw}$ | 2.4546 | 2.7866 | 0.9708 | 1.4577 | 3.30 | 0.6917 |
| $d_{jb}$ | 1.2523 | 1.1196 | 2.8202 | 2.2469 | 1.4251 | 2.9887 |
| $\eta_j$ | 0.6622 | 0.7134 | 0.2561 | 0.3935 | 0.6984 | 0.1879 |
| Criticality rank | 3 | 1 | 5 | 4 | 2 | 6 |

The bold underline is the maximum of the criteria and bold only is the minimum of the criteria
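The last two rows of Table 8 follow from the distance rows through Eq. (33). As a small check, the sketch below recomputes $\eta_j$ and the ranking from the tabulated $d_{jw}$ and $d_{jb}$ values; the variable names are illustrative, and the distances are copied from the table rather than recomputed.

```python
devices = [3, 4, 5, 6, 7, 8]
d_worst = [2.4546, 2.7866, 0.9708, 1.4577, 3.30, 0.6917]  # d_jw row of Table 8
d_best = [1.2523, 1.1196, 2.8202, 2.2469, 1.4251, 2.9887]  # d_jb row of Table 8

# Eq. (33): eta_j = d_jw / (d_jw + d_jb)
criticality = {
    dev: dw / (dw + db) for dev, dw, db in zip(devices, d_worst, d_best)
}

# Rank devices from most to least critical.
ranking = sorted(devices, key=lambda dev: criticality[dev], reverse=True)

for dev in ranking:
    print(dev, round(criticality[dev], 4))
```

The resulting order, 4, 7, 3, 6, 5, 8, agrees with the criticality rank row of the table.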
6 Challenges in Mapping of CPS Resilience with Security
Concerns and Operational Domains
The frameworks and recommended practices that we cover in this chapter provide a
solid background for designing and implementing an effective cyber-resilience strategy
for the CPS. By correctly understanding and applying the guidance provided by the
frameworks and security practices, we can transform the challenges into opportunities
by using the mathematical analysis models. This section briefly discusses how to map
the standards and procedures into CPS security and operational resilience.
We think that cybersecurity and cyber resilience are better viewed in a
three-dimensional (3D) representation together with the CPS domains (i.e., cyber,
cyber-physical, and physical), as illustrated in Fig. 12. Each of the three CPS domains
has its own security requirements and its own security concerns (i.e., threats,
vulnerabilities, cyberattacks, etc.). Access control policies, organizational security
policies, and an overall security strategy are in place to address the security concerns,
and these vary from system to system and domain to domain. The security policies and
strategies evolve based on the organizational and business mission and on situational
knowledge and awareness.
On the other hand, the resilience of the systems against cyber incidents largely
depends on the organizational implementation of the policies and defense strategies
according to the different stages of cyber resilience (i.e., plan/prepare, absorb, recover,
and adapt). Researchers also utilize another set of resilience functions for the same
purpose: identify, protect, detect, respond, and recover. The frameworks presented in
Sect. 3 provide concrete references for the organizations to understand the security
requirements and develop cybersecurity models and strategies according to the system
needs. The recommended practices and the defense-in-depth policy, as illustrated
in Sect. 4, provide the practical knowledge and implementation experience required to
build resilient CPSs.
Fig. 12 Mapping of CPS resilience with the security concerns and operational domains
The overall challenge in implementing the defense-in-depth strategies in a CPS
is to map them to the particular system under consideration based on the functional
area. For example, if the functional domain is an autonomous vehicle system, then
the challenge would be to map the recommendations and strategies to the vehicle
system design and specifications under consideration. Thus, with a thorough
understanding of the system design specifications, devices, protocols, communications,
system limitations, etc., and with the help of the recommended practices and
mathematical modeling, one can design and implement resilient strategies for control systems,
critical infrastructures, etc. It is also imperative to utilize the formal analyses that we
have presented in Sect. 5 to evaluate the system’s criticality and cyber resilience to
have an overall assessment of the resilience posture of the whole system.
7 Conclusions
This chapter discusses cyber resilience in the context of available frameworks and
recommended practices proposed by different standards bodies and cyber
organizations. First, the chapter presents an in-depth analysis and review of existing
cyber frameworks and recommended security guidelines for handling the resiliency
of CPSs. It then discusses ways to transform the challenges into
opportunities by understanding and realizing the security standards and instructions.
The chapter provides a three-dimensional graphical illustration relating CPS security,
CPS components, and CPS resilience by mapping those with the frameworks and
standard practices. It also presents formal mathematical models to assess
and quantify cyber resilience analytics for CPSs to help network administrators and
researchers make informed decisions. Overall, the chapter should guide researchers
in the CPS domain to gain a good understanding of the relevant frameworks, CPS
security measures, and modeling and simulation (M&S) constraints to overcome the
challenges and utilize the opportunities within the frameworks and guidelines.
Acknowledgments This material is based upon work supported by the Department of Energy
under Award Number DE-OE0000780.
References
1. Griffor, E.R., Greer, C., Wollman, D.A., Burns, M.J.: Framework for cyber-physical systems:
vol. 1. Overview, Technical report (2017)
2. Shi, L., Dai, Q., Ni, Y.: Cyber-physical interactions in power systems: a review of models,
methods, and applications. Electr. Power Syst. Res. 163, 396–412 (2018)
3. Zhang, T., Wang, Y., Liang, X., Zhuang, Z., Xu, W.: Cyber attacks in cyber-physical power
systems: a case study with gprs-based scada systems. In: 2017 29th Chinese Control And
Decision Conference (CCDC), pp. 6847–6852. IEEE (2017)
4. Macaulay, T., Singer, B.L.: Cybersecurity for Industrial Control Systems: SCADA, DCS, PLC,
HMI, and SIS. Auerbach Publications (2016)
5. Stouffer, K., Falco, J., Scarfone, K.: Guide to Industrial Control Systems (ICS) Security, vol.
800, no. 82, p. 16. NIST Special Publication (2011)
6. Colbert, E.J.M., Kott, A.: Cyber-Security of SCADA and Other Industrial Control Systems,
vol. 66. Springer (2016)
7. Johnson, A., Dempsey, K., Ross, R., Gupta, S., Bailey, D.: Guide for Security-Focused Configu-
ration Management of Information Systems, vol. 800, no. 128, p. 16. NIST Special Publication
(2011)
8. Cyware: Understanding the difference between risk, threat, and vulnerability (2019).
https://cyware.com/news/understanding-the-difference-between-risk-threat-and-vulnerability-c5210e89
9. Blank, R.M.: Guide for conducting risk assessments (2011)
10. Lewis, T.G.: Network Science: Theory and Applications. Wiley (2011)
11. Haque, M.A., Gochhayat, S.P., Shetty, S., Krishnappa, B.: Cloud-Based Simulation Platform
for Quantifying Cyber-Physical Systems Resilience. Simulation Foundations, Methods
and Applications (SFMA) series. Springer (2020)
12. Chen, T., Abu-Nimeh, S.: Lessons from stuxnet. Computer 44(4), 91–93 (2011)
13. Mittal, S., Tolk, A.: Complexity Challenges in Cyber Physical Systems: Using Modeling and
Simulation (M&S) to Support Intelligence, Adaptation and Autonomy. Wiley (2019)
14. Haque, M.A., Shetty, S., Krishnappa, B.: Cyber-physical system resilience. In: Complexity
Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support
Intelligence, Adaptation and Autonomy (2019)
15. Laing, C.: Securing Critical Infrastructures and Critical Control Systems: Approaches for
Threat Protection. IGI Global (2012)
16. Bruneau, M., Chang, S.E., Eguchi, R.T., Lee, G.C., O’Rourke, T.D., Reinhorn, A.M., Shi-
nozuka, M., Tierney, K., Wallace, W.A., Winterfeldt, D.V.: A framework to quantitatively
assess and enhance the seismic resilience of communities. Earthquake Spectra 19(4), 733–752
(2003)
17. Tierney, K., Bruneau, M.: Conceptualizing and measuring resilience: a key to disaster loss
reduction. TR News (250) (2007)
18. National Research Council et al.: Disaster resilience: a national imperative (2012)
19. Ross, R.S.: Recommended security controls for federal information systems and organizations
[includes updates through 9/14/2009]. Technical report (2009)
20. Sedgewick, A.: Framework for improving critical infrastructure cybersecurity, version 1.0.
Technical report (2014)
21. Haque, M.A., De Teyou, G.K., Shetty, S., Krishnappa, B.: Cyber resilience framework for
industrial control systems: concepts, metrics, and insights. In: 2018 IEEE International Con-
ference on Intelligence and Security Informatics (ISI), pp. 25–30. IEEE (2018)
22. Haque, M.A., Shetty, S., Krishnappa, B.: ICS-CRAT: a cyber resilience assessment tool for
industrial control systems. In: 2019 IEEE 5th Intl Conference on Big Data Security on Cloud
(BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC)
and IEEE Intl Conference on Intelligent Data and Security (IDS), pp. 273–281. IEEE (2019)
23. Barker, K., Lambert, J.H., Zobel, C.W., Tapia, A.H., Ramirez-Marquez, J.E., Albert, L., Nichol-
son, C.D., Caragea, C.: Defining resilience analytics for interdependent cyber-physical-social
networks. Sustain. Resilient Infrastruct. 2(2), 59–67 (2017)
24. DiMase, D., Collier, Z.A., Heffner, K., Linkov, I.: Systems engineering framework for cyber
physical security and resilience. Environ. Syst. Decis. 35(2), 291–300 (2015)
25. Haque, M.A., Shetty, S., Kamdem, G.: Improving bulk power system resilience by ranking
critical nodes in the vulnerability graph. In: Proceedings of the Annual Simulation Symposium,
p. 8. Society for Computer Simulation International (2018)
26. Haque, M.A., Shetty, S., Krishnappa, B.: Modeling cyber resilience for energy delivery systems
using critical system functionality. In: IEEE Resilience Week 2019, pp. 33–41. IEEE (2019)
27. Clark, A., Zonouz, S.: Cyber-physical resilience: definition and assessment metric. IEEE Trans.
Smart Grid 10(2), 1671–1684 (2017)
28. Wei, D., Ji, K.: Resilient industrial control system (RICS): concepts, formulation, metrics, and
insights. In: 2010 3rd International Symposium on Resilient Control Systems, pp. 15–22. IEEE
(2010)
29. Linkov, I., Eisenberg, D.A., Bates, M.E., Chang, D., Convertino, M., Allen, J.H., Flynn, S.E.,
Seager, T.P.: Measurable resilience for actionable policy (2013)
30. JOINT TASK FORCE: Risk Management Framework for Information Systems and Organiza-
tions, vol. 800, p. 37. NIST Special Publication (2018)
31. Bodeau, D., Graubart, R.: Cyber Resiliency Engineering Framework. MTR110237,
MITRE Corporation (2011)
32. Cornelius, E., Fabro, M.: Recommended practice: Creating cyber forensics plans for control
systems. Technical report, Idaho National Laboratory (INL) (2008)
33. Fabro, M., Gorski, E., Spiers, N.: Recommended practice: improving industrial control system
cybersecurity with defense-in-depth strategies. In: DHS Industrial Control Systems Cyber
Emergency Response Team (2016)
34. Watson, J.-P., Guttromson, R., Silva-Monroy, C., Jeffers, R., Jones, K., Ellison, J., Rath, C.,
Gearhart, J., Jones, D., Corbet, T., et al.: Conceptual framework for developing resilience
metrics for the electricity oil and gas sectors in the united states. Technical report, Sandia
National Laboratories, Albuquerque, NM, USA (2014)
35. ICS-CERT: Recommended practice: developing an industrial control systems cybersecurity
incident response capability (2009)
36. Tom, S., Christiansen, D., Berrett, D.: Recommended practice for patch management of control
systems. Technical report, Idaho National Laboratory (INL) (2008)
37. ICS-CERT: Recommended practice: updating antivirus in an industrial control system (2018)
38. Saaty, T.L.: Relative measurement and its generalization in decision making why pairwise
comparisons are central in mathematics for the measurement of intangible factors the ana-
lytic hierarchy/network process. RACSAM-Revista de la Real Academia de Ciencias Exactas,
Fisicas y Naturales. Serie A. Matematicas 102(2), 251–318 (2008)
39. Wilamowski, G.C., Dever, J.R., Stuban, S.M.F.: Using analytical hierarchy and analytical network
processes to create cyber security metrics. Def. Acquisit. Res. J.: Publicat. Def. Acquisit.
Univ. 24(2) (2017)
40. Sun, K., Jajodia, S., Li, J., Cheng, Y., Tang, W., Singhal, A.: Automatic security analysis
using security metrics. In: 2011-MILCOM 2011 Military Communications Conference, pp.
1207–1212. IEEE (2011)
41. Mell, P., Scarfone, K., Romanosky, S.: A complete guide to the common vulnerability scoring
system version 2.0. In: Published by FIRST-Forum of Incident Response and Security Teams,
vol. 1, p. 23 (2007)
42. Haque, M.A.: Analysis of bulk power system resilience using vulnerability graph (2018)
43. Garvey, P.R., Ariel Pinto, C.: Introduction to functional dependency network analysis. In: The
MITRE Corporation and Old Dominion, Second International Symposium on Engineering
Systems, vol. 5. MIT, Cambridge, MA (2009)
44. NIST: National vulnerability database. https://nvd.nist.gov/vuln/data-feeds. Accessed 14 Jan
2020
45. Arghandeh, R., Von Meier, A., Mehrmanesh, L., Mili, L.: On the definition of cyber-physical
resilience in power systems. Renew. Sustain. Energy Rev. 58, 1060–1069 (2016)
46. Bharali, A., Baruah, D.: On network criticality in robustness analysis of a network structure.
Malaya J. Matematik (MJM) 7(2), 223–229 (2019)
47. Roberson, D., Clarisse Kim, H., Chen, B., Page, C., Nuqui, R., Valdes, A., Macwan, R., Johnson,
B.K.: Improving grid resilience using high-voltage dc: strengthening the security of power
system stability. IEEE Power Energy Mag. 17(3), 38–47 (2019)
48. Zobel, C.W.: Quantitatively representing nonlinear disaster recovery. Decis. Sci. 45(6), 1053–
1082 (2014)
49. Tizghadam, A., Leon-Garcia, A.: On robust traffic engineering in transport networks. In: IEEE
GLOBECOM 2008—2008 IEEE Global Telecommunications Conference, pp. 1–6. IEEE
(2008)
50. Chung, F.: Laplacians and the cheeger inequality for directed graphs. Ann. Combinatorics 9(1),
1–19 (2005)
51. Bernstein, D.S.: Scalar, Vector, and Matrix Mathematics: Theory, Facts, and Formulas, Revised
and Expanded edn. Princeton University Press (2018)
52. Hwang, C.-L., Lai, Y.-J., Liu, T.-Y.: A new approach for multiple objective decision making.
Comput. Oper. Res. 20(8), 889–899 (1993)
53. Kim, A., Kang, M.H.: Determining asset criticality for cyber defense. Technical report, Naval
Research Lab, Washington, DC (2011)
... It offers an up-to-date federal information since 1999s [23] common methodology for assessing risks that threaten the confidentiality of unpatched systems integrity and availability. As, NIST SP 800-53 sets the inspiration of technical management and operational controls that has got to be included within the system to a minimum and make sure the security of low, medium, high-risk systems [24]. Before deploying a wireless network, the organization should assess its security needs and therefore the potential consequences of a security violation. ...
Article
Full-text available
The potentials of the wireless network has made it possible to access a variety of technological operations through the folk of internet-connected devices. This exponential hike and trade of wireless carriers could be a platform to operate and deploy in highly radiated remotely accessed areas such as Nuclear Power Plants (NPPs). The internet-connected devices play the most significant role to improve and build up a smart NPP operating system. The promising internet of things (IoTs) method has enabled interaction between advanced instrumentation and control devices that contributes a new paradigm in the digital world towards the advanced futuristic wireless networks as 5G or 5G beyond (5GB) communication system in industrial research. From this point of view, we investigate the important features and arduously execute a novel approach to operate of smart NPP system for safety concerns by the deployment of internet-connected devices. Therefore, due to security reasons, the IoTs are patched up with a communication between the user and corresponding component at the site. Monitoring and surveillance of NPP safety concerns IoT have become the replacement solution of manpower. In this study, we are summarizing the affecting factors for smooth functioning of NPP, human health diseases cause radiation and a big impact of remotely accessed modern NPP system. This study is also carried out a wide discussion and comprised the existing operational views. Additionally, the major key components of wireless connections are used for security and safety monitoring in NPP systems through IoTs. Finally, the future extendable work is also summarized.
... However, resilience also requires techniques and strategies to perform recovery actions and ensure continuity in the system operability also in case of disruptions. A recent ever-growing interest has been devoted to resilience challenges [2,14,15], often related to the notion of self-adaptation, as witnessed by an increasing number of surveys on this topic [16][17][18][19]. Targets of recent approaches addressing CPS resilience vary from cyber-security [20][21][22][23], to cyberphysical power systems [24,25] and to cyber-physical production systems [26][27][28][29][30]. ...
Article
Full-text available
Cyber-physical systems are hybrid networked cyber and engineered physical elements that record data (e.g. using sensors), analyse them using connected services, influence physical processes and interact with human actors using multi-channel interfaces. Examples of CPS interacting with humans in industrial production environments are the so-called cyber-physical production systems (CPPS), where operators supervise the industrial machines, according to the human-in-the-loop paradigm. In this scenario, research challenges for implementing CPPS resilience, promptly reacting to faults, concern: (i) the complex structure of CPPS, which cannot be addressed as a monolithic system, but as a dynamic ecosystem of single CPS interacting and influencing each other; (ii) the volume, velocity and variety of data (Big Data) on which resilience is based, which call for novel methods and techniques to ensure recovery procedures; (iii) the involvement of human factors in these systems. In this paper, we address the design of resilient cyber-physical production systems (R-CPPS) in digital factories by facing these challenges. Specifically, each component of the R-CPPS is modelled as a smart machine, that is, a cyber-physical system equipped with a set of recovery services, a Sensor Data API used to collect sensor data acquired from the physical side for monitoring the component behaviour, and an operator interface for displaying detected anomalous conditions and notifying necessary recovery actions to on-field operators. A context-based mediator, at shop floor level, is in charge of ensuring resilience by gathering data from the CPPS, selecting the proper recovery actions and invoking corresponding recovery services on the target CPS. Finally, data summarisation and relevance evaluation techniques are used for supporting the identification of anomalous conditions in the presence of high volume and velocity of data collected through the Sensor Data API. 
The approach is validated in a real case study from the food industry.
Article
Full-text available
Due to the increasing intricacy of cyber‐physical systems (CPSs) and the severity of natural phenomena, upgrading network planning is vital to reduce the vulnerability of these systems. This study develops a novel preventive‐corrective resilient energy management strategy (PC‐REMS) for a CPS in two stages, exploiting network reconfiguration (NR) and energy storage system (ESS) capacity. The first stage of the proposed PC‐REMS takes preventive actions based on contingency faults, whereas the second stage applies corrective measures to improve the CPS resilience against natural physical disasters. Vulnerability assessment data is sent to the physical power system daily through the communication network. The first stage, which prepares the CPS for predictable faults, focuses on pre‐scheduled ESSs and preventive NR to minimise the expected energy curtailment cost. The second stage involves real‐time network recovery through corrective NR to minimise the energy curtailment cost after the faults. Three indices (resistance, recovery, and resilience) are introduced to evaluate the effectiveness of the model. The proposed model is examined by performing multiple simulations on the 33‐bus and 118‐bus radial test systems. The simulation results show the efficiency of the proposed PC‐REMS model in dealing with predictable disasters to improve the CPS resilience.
Technical Report
Full-text available
This report has been written for the Department of Energy's Office of Electricity Delivery and Energy Reliability to support the Office of Energy Policy and Systems Analysis in their writing of the Quadrennial Energy Review in the area of energy resilience. The topics of measuring and increasing energy resilience are addressed, including definitions, means of measuring, and analytic methodologies that can be used to make decisions for policy, infrastructure planning, and operations. A risk-based framework is presented which provides a standard definition of a resilience metric. Additionally, a process is identified which explains how the metrics can be applied. Research and development is articulated that will further accelerate the resilience of energy infrastructures.
Chapter
Full-text available
Cyber-Physical Systems (CPS) often involve trans-disciplinary approaches, merging theories of different scientific domains, such as cybernetics, control systems, and process design. Advances in CPS expand the horizons of these critical systems and, at the same time, bring concerns regarding safety, security, and resiliency. To minimize operating costs and maximize scalability, it is often preferable to use the cloud environment for deploying the CPS computation processes and simulation environments. With the expanding uses of the CPS and cloud computing, major cybersecurity concerns are also growing around these systems. The cloud itself has security and privacy issues. This chapter focuses on a cloud-based simulation platform for deriving the cyber resilience metrics for the CPS. First, it presents a detailed analysis of the modeling of the resilience metrics by mapping them to cloud security concerns. Then, it covers modeling and simulation (M&S) challenges in developing simulation platforms in the cloud environment and discusses a way forward. Overall, we aim to discuss resilience metrics modeling and automation using the proposed simulation platform for the CPS in the cloud environment.
Chapter
Full-text available
Cyber‐physical systems (CPSs) play a critical role in diversified fields. The integration of computation and physical processes makes CPS a vital part of different industries, e.g. autonomous automobile systems, smart grid systems, healthcare systems, and communication systems. The CPS often involves transdisciplinary approaches, merging theories of different scientific domains such as cybernetics, control systems, process design, and embedded systems. With the expanding uses of the CPS, major cybersecurity concerns are also growing around these systems. The computation of cyber resilience metrics is often omitted in the literature because of the complexity of the systems and the lack of a clear idea about the overall network security posture. The chapter focuses on the cyber resilience metrics and frameworks for the CPS. The chapter presents a detailed cyber resilience framework for CPS to be used across different industries. The framework also guides the methodologies to compute the resilience metrics for the CPS. The chapter presents both qualitative and quantitative modeling of cyber resilience for the CPS. A discussion on the automation process for the CPS resilience metrics computation is presented, which covers details of the qualitative and quantitative simulation tool architectures, vulnerability assessment, visualization, and reporting processes. The chapter also covers complexities in designing and developing simulation tools and resilience metrics computation methodologies. The chapter aims to provide an overall idea about the cyber resilience metrics computation process and a simulation platform for the CPS and how that would be beneficial across various industries.
Conference Paper
Full-text available
In this paper, we analyze the cyber resilience of energy delivery systems (EDS) using critical system functionality (CSF). Some research works focus on the identification of critical cyber components and services to address the resiliency of the EDS. An analysis based on devices and services that excludes the system behavior during an adverse event provides only a partial analysis of cyber resilience. To address this gap, we utilize the vulnerability graph representation of the EDS to compute the system functionality under adverse conditions. We use the network criticality metric to determine the CSF. We estimate the criticality metric using the graph Laplacian matrix and the network performance after removing links (i.e., disabling control functions or services). We model the resilience of the EDS using the CSF and the system recovery curve. We also provide a comprehensive analysis of cyber resilience by determining the critical devices using the TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) and AHP (Analytical Hierarchy Process) methods. We present use cases of EDS illustrating the way control functions and services in the EDS map to the vulnerability graph model. The simulation results show that we can estimate the resilience metric using different types of graphs, which may assist in making an informed decision about EDS resilience.
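As a rough illustration of the graph-Laplacian step mentioned in this abstract, the sketch below computes a network criticality value for a small connected, undirected graph before and after a link is removed. It assumes one definition used in the network-robustness literature, tau = 2n * trace(L+), where L+ is the pseudoinverse of the Laplacian and its trace equals the sum of the reciprocals of the nonzero Laplacian eigenvalues. The abstract does not state the paper's exact formula, and the function names and toy graphs here are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    """Build the Laplacian L = D - A of an undirected, unweighted graph."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def network_criticality(n, edges, tol=1e-9):
    """Assumed definition: tau = 2 * n * trace(L+); lower means more robust.

    trace(L+) is computed as the sum of 1/lambda over the nonzero
    eigenvalues of L, which is valid for a connected graph.
    """
    eigenvalues = np.linalg.eigvalsh(laplacian(n, edges))
    nonzero = eigenvalues[eigenvalues > tol]
    return 2.0 * n * float(np.sum(1.0 / nonzero))

# Toy example: a 3-node ring, then the same graph with link (0, 2) removed.
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
tau_full = network_criticality(3, triangle)
tau_cut = network_criticality(3, path)
```

In this toy case, removing the link raises the criticality value, which matches the intuition that a graph with fewer redundant paths is less robust.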
Conference Paper
Full-text available
In this work, we use a subjective approach to compute cyber resilience metrics for industrial control systems (ICS). We utilize the extended form of the R4 resilience framework and span the metrics over the physical, technical, and organizational domains of resilience. We develop a qualitative cyber resilience assessment tool using the framework and a subjective questionnaire method. We make sure the questionnaires are realistic, balanced, and pertinent to ICS by involving subject matter experts in the process and following security guidelines and standard practices. We provide a detailed mathematical explanation of the resilience computation procedure. We discuss several usages of the qualitative tool by generating simulation results. We provide a system architecture of the simulation engine and the validation of the tool. We think the qualitative simulation tool would give useful insights for the overall resilience assessment and security analysis of industrial control systems.
Book
This book provides the state-of-the-art in methods and technologies that aim to elaborate on the modeling and simulation support to cyber physical systems (CPS) engineering across many sectors such as healthcare, smart grid, or smart home. It presents a compilation of simulation-based methods, technologies, and approaches that encourage the reader to incorporate simulation technologies in their CPS engineering endeavors, supporting management of complexity challenges in such endeavors. Complexity Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support Intelligence, Adaptation and Autonomy is laid out in four sections. The first section provides an overview of complexities associated with the application of M&S to CPS Engineering. It discusses M&S in the context of autonomous systems involvement within the North Atlantic Treaty Organization (NATO). The second section provides a more detailed description of the challenges in applying modeling to the operation, risk and design of holistic CPS. The third section delves into the details of simulation support to CPS engineering, followed by the engineering practices to incorporate the cyber element to build resilient CPS sociotechnical systems. Finally, the fourth section presents a research agenda for handling complexity in the application of M&S for CPS engineering. In addition, this text:
- Introduces a unifying framework for hierarchical co-simulations of cyber physical systems (CPS)
- Provides understanding of the cycle of macro-level behavior dynamically arising from spatiotemporal interactions between parts at the micro-level
- Describes a simulation platform for characterizing resilience of CPS
Complexity Challenges in Cyber Physical Systems has been written for researchers, practitioners, lecturers, and graduate students in computer engineering who want to learn all about M&S support to addressing complexity in CPS and its applications in today’s and tomorrow’s world.
Thesis
Critical infrastructure such as a Bulk Power System (BPS) should have some quantifiable measure of resiliency and definite rule-sets to achieve a certain resilience value. Industrial Control System (ICS) and Supervisory Control and Data Acquisition (SCADA) networks are integral parts of BPS. BPS or ICS are themselves not vulnerable because of their proprietary technology, but when the control network and the corporate network need to communicate for performance measurements and reporting, the ICS or BPS become vulnerable to cyber-attacks. Thus, a systematic way of quantifying resiliency and identifying crucial nodes in the network is critical for addressing the cyber resiliency measurement process. This can help security analysts and power system operators in the decision-making process. This thesis focuses on the resilience analysis of BPS and proposes a ranking algorithm to identify critical nodes in the network. Although some ranking algorithms are already in place, they lack comprehensive inclusion of the factors that are critical in the cyber domain. This thesis analyzes a range of factors that are critical from the point of view of cyber-attacks and proposes a MADM (Multi-Attribute Decision Making) based ranking method. The node ranking process will not only help improve the resilience but also facilitate hardening the network against vulnerabilities and threats. The proposed method is called MVNRank, which stands for Multiple Vulnerability Node Rank. The MVNRank algorithm takes into account the asset value of the hosts and the exploitability and impact scores of vulnerabilities as quantified by CVSS (Common Vulnerability Scoring System). It also considers the total number of vulnerabilities and the severity level of each vulnerability, the degree centrality of the nodes in the vulnerability graph, and the attacker’s distance from the target node.
We use a multi-layered directed acyclic graph (DAG) model and rank the critical nodes in the corporate and control networks that fall on the paths to the target ICS. We do not rank the ICS nodes but use them to calculate the potential power loss capability of the control center nodes using the assumed ICS connectivity to BPS. Unlike most prior work, we consider multiple vulnerabilities for each node in the network while generating the rank by using a weighted average method. The resilience computation is highly time consuming, as it considers all the possible attack paths from the source to the target node, whose number grows multiplicatively with the number of nodes and vulnerabilities. Thus, one of the goals of this thesis is to reduce the simulation time needed to compute resilience, which is achieved as illustrated in the simulation results.
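To make the idea of a weighted-average, multi-attribute node ranking more concrete, the sketch below scores a few nodes from the kinds of factors this abstract lists (asset value, CVSS exploitability and impact, degree centrality, and attacker distance). It is not the MVNRank algorithm itself: the weights, field names, normalization choices, and sample nodes are all assumptions made for illustration, and the count of vulnerabilities is reflected only through per-node averaging.

```python
def node_score(node, weights):
    """Weighted average of normalized attributes for one node (illustrative only)."""
    vulns = node["cvss"]  # list of (exploitability, impact) pairs, each assumed on a 0-10 scale
    # Average over all vulnerabilities so that multi-vulnerability nodes are covered.
    avg_exploit = sum(e for e, _ in vulns) / len(vulns) / 10.0
    avg_impact = sum(i for _, i in vulns) / len(vulns) / 10.0
    attrs = {
        "asset_value": node["asset_value"],            # assumed already scaled to [0, 1]
        "exploitability": avg_exploit,
        "impact": avg_impact,
        "centrality": node["centrality"],              # assumed degree centrality in [0, 1]
        "proximity": 1.0 / node["attacker_distance"],  # assumed distance >= 1 hop
    }
    total_weight = sum(weights.values())
    return sum(weights[k] * attrs[k] for k in weights) / total_weight

# Hypothetical weights and nodes, chosen only to exercise the function.
weights = {"asset_value": 0.3, "exploitability": 0.2, "impact": 0.2,
           "centrality": 0.15, "proximity": 0.15}
nodes = {
    "hmi": {"asset_value": 0.9, "cvss": [(8.0, 9.0), (6.0, 7.0)],
            "centrality": 0.5, "attacker_distance": 2},
    "web": {"asset_value": 0.3, "cvss": [(9.0, 5.0)],
            "centrality": 0.8, "attacker_distance": 1},
    "historian": {"asset_value": 0.6, "cvss": [(4.0, 6.0)],
                  "centrality": 0.3, "attacker_distance": 3},
}

scores = {name: node_score(node, weights) for name, node in nodes.items()}
# Highest score first: these are the nodes to harden first under the assumed weights.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Changing the weights changes the ranking, so in practice they would need to be justified, for example through expert judgment or a method such as AHP.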