
Realizing Cyber-Physical Systems Resilience Frameworks and Security Practices

March 2021

DOI:10.1007/978-3-030-67361-1_1

In book: Security in Cyber-Physical Systems, Foundations and Applications (pp.1-37)

Authors:

Md Ariful Haque

Old Dominion University

Sachin Shetty

Old Dominion University

Bheshaj Krishnappa

Realizing Cyber-Physical Systems Resilience Frameworks and Security Practices

Md Ariful Haque, Sachin Shetty, Kimberly Gold, and Bheshaj Krishnappa

Abstract Cyber-Physical Systems (CPSs) are complex systems that evolve from the integration of components dealing with real-time computations and physical processes, along with networking. CPSs often merge approaches from different scientific fields such as embedded systems, control systems, operational technology, information technology systems (ITS), and cybernetics. Major cybersecurity concerns are rising around CPSs because of their expanding use in the modern world. Often the security concerns are limited to deriving risk analytics and security assessment. Others focus on the development of intrusion detection and prevention systems. Making CPSs resilient requires a thorough understanding of the current cybersecurity frameworks proposed by different governing bodies in this domain. It is also imperative to understand how these frameworks apply established security practices. To address the gap in understanding defense-in-depth security architectures and achieving them within the CPS domain, we analyze the cybersecurity frameworks and the challenges in applying them. As background, we start with a discussion of the differences between ITS and CPS. We then present a state-of-the-art review of some of the existing cybersecurity frameworks for risk and resilience management. Finally, we propose formal techniques to realize the frameworks and security practices in the CPS domain by providing quantitative resilience analytics.

M. A. Haque (corresponding author) · S. Shetty
Computational Modeling and Simulation Engineering, Old Dominion University, 5115 Hampton Blvd, Norfolk, VA 23529, USA
e-mail: mhaqu001@odu.edu

S. Shetty
e-mail: sshetty@odu.edu

K. Gold
Naval Surface Warfare Center, Crane Division, Crane, IN 47522, USA
e-mail: kimberly.gold@navy.mil

B. Krishnappa
Risk Analysis and Mitigation, ReliabilityFirst Corporation, 3 Summit Park Drive, Suite 600, Cleveland, OH 44131, USA
e-mail: bheshaj.krishnappa@rfirst.org

A. I. Awad et al. (eds.), Security in Cyber-Physical Systems, Studies in Systems, Decision and Control 339, https://doi.org/10.1007/978-3-030-67361-1_1


Keywords Cyber-Physical Systems · Cybersecurity frameworks · Security practices · Criticality assessment · Resilience metrics · Graphical modeling · Analytical hierarchy process · TOPSIS

1 Introduction

We observe a steep increase in the use of Cyber-Physical Systems (CPSs) in the modern world. For example, critical infrastructures (e.g., energy delivery systems, the oil and gas industry, healthcare systems, and transportation systems), industrial manufacturing plants, autonomous vehicles, and smart cities rely heavily on CPSs. CPS is a class of complex systems of systems that integrate cyber operations with physical processes. In CPSs, we use computing and networking devices to perform computation and communication. The networked devices also control the underlying instrumental processes. We need the communication network to monitor and control the physical devices' operations and performance in real time, and some of those devices may be based at remote field locations.

In the broad sense, a CPS consists of the cyber domain (i.e., the information technology systems (ITS)) and the physical domain (i.e., the operational technology (OT) network). The cyber domain consists of servers and hosting devices for organization-wide communications. The physical domain contains the field devices and the industrial control systems (ICS), which in turn comprise sensors, actuators, control functions, feedback systems, etc. We need the OT network for handling the production processes and the ITS for business communications. The advancements in monitoring and controlling the production processes bring the risk of malicious cyberattacks on these systems. The increased risk comes from the tight interconnection between the cyber components and the physical elements, more precisely, the amalgamation of ITS and ICS.

CPS security concerns are often addressed by developing intrusion detection and prevention systems (IDS/IPS) and generating security metrics for risk assessment. While IDS and IPS are necessary for mitigating the attack impact and recovering the system quickly, we cannot overlook the concern regarding the systems' resilience and reliability. The resilience posture indicates the overall system security and guides network administrators in developing effective and optimized mitigation strategies and remediation plans.

To make CPS cyber resilient, regulatory bodies and researchers have proposed several frameworks and provided essential instructions in standards. The standards bodies we refer to are the National Institute of Standards and Technology (NIST), the North American Electric Reliability Corporation critical infrastructure protection (NERC-CIP), and the Industrial Control Systems Cyber Emergency Response Team (ICS-CERT). The open question is how to apply those frameworks and security practices as comprehensively as possible without affecting regular business operations.


In this chapter, we address the implementation challenges of the theoretical frame-

works. We discuss the threats and vulnerabilities that CPS is facing today. We also

cover how the security frameworks, recommended defense architectures, and stan-

dard practices can help design and develop resilient CPS. This chapter aims to mathe-

matically realize the frameworks and security practices using established theoretical

analysis methodologies. Signiﬁcant contributions of the chapter are:

• A detailed discussion on the CPS threats, vulnerabilities, and cyber resilience

• A comprehensive review of cybersecurity and resilience frameworks and recommended defense-in-depth security practices for CPS

• A proposed qualitative approach for quantifying cyber resilience using the analytical hierarchy process (AHP)

• A quantitative realization of defense-in-depth security architectures using a multi-level directed acyclic graph modeling technique

• Identification of critical and cyber-vulnerable assets using the vulnerability graph model

• A concise discussion on the mapping of CPS security, resilience, and operational domains.

We organize the rest of the chapter as follows. Section 2 presents a brief description of CPS, the components of CPS (i.e., IT network, supervisory control and data acquisition (SCADA), and OT network), and cyber resilience. Section 3 provides a state-of-the-art review of the cybersecurity and resilience frameworks. Section 4 discusses some of the critical security guidelines presented by the standards bodies. Section 5 proposes different mathematical techniques for the realization of the frameworks. Section 6 highlights the challenges in mapping CPS security, resilience, and operational domains. Finally, Sect. 7 concludes the chapter with some significant takeaways.

2 Cyber-Physical Systems

Cyber-Physical Systems (CPSs) represent a composite class of engineered systems consisting of physical processes and computational resources. The National Institute of Standards and Technology (NIST) CPS Public Working Group (CPS PWG) defines CPS as “smart systems that include engineered interacting networks of physical and computational components” (Griffor et al. [1]). CPS technologies continue to transform how people interact with engineered systems. Advances in CPS bring extended capability, adaptability, and usability, making them crucial in many industries. Today, CPSs are used to implement most modern technologies such as the Internet of Things (IoT), the industrial internet, Industrial Control Systems (ICS), smart devices, etc. In this chapter, we sometimes use the terms CPS and ICS interchangeably to mean the same systems.

We present a conceptual representation of Cyber-Physical Systems in Fig. 1. We divide the discussion area into feedback systems, application domains, system security, and system challenges. CPS consists of control and feedback systems, which are highly interconnected and heterogeneous. The control systems are either networked or distributed and include physical components such as sensors and actuators, which operate in real time. There may be human and environmental interactions involved in the process.

We illustrate the CPS here by using the example of power systems, as researchers consider power systems to be cyber-physical power systems (CPPS) [2, 3]. The power system's physical domain consists of generation and distribution devices such as generators, transformers, electric buses, etc. The physical part also comprises ICS devices. Different ICS devices are in use depending on requirements, such as phasor measurement units (PMU), intelligent electronic devices (IED), programmable logic controllers (PLC), and remote terminal units (RTU). To monitor and control the field devices' performance, we need supervisory control and data acquisition (SCADA) systems. SCADA is the central control system used to monitor and control the equipment in industrial production systems. In general, SCADA contains the master terminal unit (MTU), human-machine interface (HMI), input/output (I/O) devices, etc. Field ICS devices such as RTUs send real-time system performance data to the MTU. The operators in SCADA observe the performance measures, compare those values with desired values, and, if necessary, issue control commands through the HMI. The commands issued from the HMI control the system so that it functions at the desired service level (Macaulay and Singer [4]).
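The monitoring loop described above (field devices report values, operators compare them with desired values, and commands are issued when needed) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not a SCADA implementation: the names `Reading` and `supervise`, the tag names, the setpoints, and the 5 % tolerance band are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One performance value reported by a field device (e.g., an RTU)."""
    tag: str      # hypothetical tag name, e.g. "bus_voltage_kv"
    value: float


def supervise(readings, setpoints, tolerance=0.05):
    """Compare reported values with desired values and return commands.

    A command is issued when the relative deviation from the (non-zero)
    setpoint exceeds `tolerance`. The 5 % default is an assumption made
    for this sketch, not a value taken from the chapter.
    """
    commands = []
    for r in readings:
        desired = setpoints[r.tag]
        deviation = (r.value - desired) / desired
        if abs(deviation) > tolerance:
            action = "lower" if deviation > 0 else "raise"
            commands.append((r.tag, action))
    return commands


# Hypothetical data reported to the MTU.
readings = [Reading("bus_voltage_kv", 118.0), Reading("frequency_hz", 60.01)]
setpoints = {"bus_voltage_kv": 110.0, "frequency_hz": 60.0}
print(supervise(readings, setpoints))
```

In this sketch only the voltage reading leaves its tolerance band, so a single command is produced. A real HMI would also handle operator confirmation, alarms, and communication failures, which are omitted here.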

Due to its complicated operational requirements, the CPS itself has challenges such as modeling the underlying physical processes and real-time behavior, modeling interconnectivity and interoperability in the heterogeneous system of systems (SoS), and securely integrating the different components of CPS. From the modeling and simulation perspective, the CPS needs to handle the analysis of specifications, design methodologies, scalability and complexity, and overall verification and validation of the systems. On the other hand, because of the amalgamation of the IT and OT domains, CPS needs to handle many cyber threats. Thus, understanding the proposed cyber frameworks and applying the recommended practices in developing a resilient system are integral parts of CPS security analysis.

We start with a short discussion on the primary differences between ITS and CPS in the next subsection. We then proceed to CPS threats, vulnerabilities, and cyber resilience to smooth the transition to the cyber framework analysis.

2.1 Primary Differences Between CPS and ITS Security

Today, the extensive integration of ITS devices into control systems makes CPS vulnerable to cyberattacks. Cyberattacks differ from physical attacks on several points. In physical attacks, the defenders are aware of the system units under attack, the impact is immediate, and there are policies to handle such attacks. Cyberattacks, on the other hand, are remote, repeatable, and can occur over extended periods. Cyber intruders can execute cyberattacks with the objective of a long-term intrusion and the identification of potentially attractive targets (e.g., an advanced persistent threat). The impact can be less intense at first but can lead to disastrous consequences in the long run. We provide a summary of the fundamental differences between ITS and ICS/CPS security in Table 1.

Fig. 1 Cyber-Physical Systems concept map

Overseeing cybersecurity in the CPS domain is far more daunting than doing so in the information technology context. The reason is that CPS has operational requirements that differ from those of ITS. Firstly, for CPS, real-time availability and operational continuity are of utmost importance, whereas for ITS, data confidentiality and integrity are crucial. Momentary downtime in ITS does not hamper any production processes (see Macaulay and Singer [4]). Secondly, patching is easy in ITS: anti-malware and anti-virus software often download and install the necessary security patches or updates automatically. ICSs, by contrast, are generally old proprietary technologies designed for functionality rather than security, with limited memory and processing capacity. These hardware-level limitations make it hard to install anti-malware or anti-virus solutions, which consume a lot of memory for automatic updates and can delay the monitoring and control of the production process. Thirdly, ICS operates in diverse fields such as the oil, gas, and electric industries, so security measures should be adapted to fit the structure of each sector.

2.2 CPS Threats and Vulnerabilities

This section starts with brief definitions of vulnerability and threat, as found in the literature, to give readers the background needed for the discussion that follows. In information systems, a vulnerability is a flaw in the software program or system that an intruder may exploit to gain unauthorized access to a cyber asset. NIST defines vulnerability as “weakness in an information system, system security procedures, internal controls, or implementation that could be exploited or triggered by a threat source” (see Johnson et al. [7]). On the other hand, a threat is anything that “can exploit a vulnerability, intentionally or accidentally, and obtain, damage, or destroy an asset” (Cyware [8]). The Joint Task Force Transformation Initiative defines a threat as “any circumstance or event with the potential to adversely impact organizational operations and assets, individuals, other organizations, or the Nation through an information system via unauthorized access, destruction, disclosure, or modification of information, and/or denial of service” (Blank [9]). Lewis [10] defines vulnerability and threat as “the probability that a component or asset will fail when attacked” and “the probability that an attack will happen”, respectively.
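Because the last pair of definitions is probabilistic, the two quantities combine directly: if the threat is the probability that an attack happens and the vulnerability is the probability that the asset fails given that it is attacked, their product is the probability that the asset is attacked and fails. The sketch below illustrates this arithmetic. The function names, asset names, and probability values are hypothetical and chosen only for the example; they do not come from the cited sources.

```python
def attack_success_probability(threat: float, vulnerability: float) -> float:
    """Probability that an attack happens AND the asset fails.

    threat        = P(attack happens)
    vulnerability = P(asset fails | asset is attacked)
    """
    for name, p in (("threat", threat), ("vulnerability", vulnerability)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    return threat * vulnerability


def rank_assets(assets):
    """Sort (name, threat, vulnerability) tuples, most exposed asset first."""
    return sorted(
        assets,
        key=lambda a: attack_success_probability(a[1], a[2]),
        reverse=True,
    )


# Hypothetical assets and probabilities, for illustration only.
assets = [
    ("HMI workstation", 0.30, 0.50),
    ("PLC", 0.10, 0.90),
    ("Historian server", 0.40, 0.20),
]
for name, t, v in rank_assets(assets):
    print(f"{name}: {attack_success_probability(t, v):.2f}")
```

Ranking by this product shows that the asset that is most likely to be attacked is not necessarily the most exposed one: here the historian server has the highest threat value but the lowest combined probability.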

Table 1 Primary differences in operations and security in ITS and CPSs/ICS

| Category | Information Technology Systems (ITS) | Cyber-Physical Systems (CPSs/ICS) |
|---|---|---|
| Performance constraints (a) | High throughput demanded | Modest throughput is allowable |
| | Non-real-time response is acceptable | Real-time response is essential |
| | High delay and jitter are tolerable | Delay and jitter above a certain threshold are not tolerable |
| Resource constraints (b) | Updated hardware and software products are used | Old and less secure proprietary products are used |
| | Systems have enough memory and processing capabilities | Products are designed with low memory and processing capabilities |
| | Regular security updates are maintained through patching | Security updates and patches are often not applied, to avoid system unavailability due to reboot requirements after configuration changes |
| Confidentiality, integrity, and availability | Data confidentiality and integrity are critical | Confidentiality and integrity have lower priority than availability |
| | Temporary unavailability is tolerable | High availability is required; momentary downtime may not be acceptable |
| Communication protocol | Standard communication protocols (e.g., TCP, UDP) | Proprietary protocols (e.g., MODBUS, DNP3) |
| Patch and change management (a) | Software updates and patching are applied regularly according to the organization's security policy | Any configuration change must be tested and deployed in test mode before being committed to the live system, to avoid unexpected outages |
| | Rebooting the system to re-initialize hardware or software devices is acceptable | Unplanned rebooting of the system is not acceptable |
| Password and authentication (b) | Multi-factor authentication is possible to deploy | Sometimes there is no authentication requirement of any sort |
| | Passwords need to be changed after a certain time | Passwords are hard-wired in legacy ICS and cannot be changed |
| | Security is enhanced through encryption mechanisms | Message communication lacks encryption mechanisms |
| Component lifetime and technical support (a) | Lifetime generally spans 3 to 5 years | Lifetime varies between 15 and 20 years |
| | Ample technical support is available from either in-house IT experts or diversified managed services | Support is solely vendor dependent; the vendor may cease support for some products when their lifetime expires |
| Operational command and control | Mostly central monitoring | Distributed field operations, but central monitoring through SCADA |

(a) Stouffer et al. [5]
(b) Colbert et al. [6]

We have already highlighted that CPS consists of physical, control, and communication layers. CPS threat vectors can come from adversaries in any of those layers. In the physical layer, the availability of the field devices' services and functionalities is of utmost concern. There is also the risk of information alteration through modification of the physical device code (e.g., PLC logic code). In the control layer, most attacks take the form of distributed denial of service (DDoS), eavesdropping (man-in-the-middle attack), jamming, selective forwarding, etc. Threats in the communication layer can lead to loss of confidentiality, stolen credentials, unauthorized access to the system, social engineering, etc. We classify the threats by type in the discussion below, as pointed out by Haque et al. [11].

External Threats: By external threats, we mean any cyberattack coming from outside the organization. External threats arise from different rival groups, including nation-sponsored hackers, terrorist organizations, or industry competitors. Cyber intruders may launch an advanced persistent threat attack, where the goal is to steal crucial data and login information (e.g., passwords) from the network's assets without getting caught. One such example is the Stuxnet attack on the Iranian nuclear centrifuges in 2010 (see Chen and Abu-Nimeh [12]).

Internal Threats: The internal threat comes from either within the organization or

from the afﬁliated parties. Today, the industry’s operating processes are segmented

and done by third-party vendors or contractors. Thus organizations need to share sys-

tem information with outside business partners to some extent. Sharing the network

information (e.g., network design documents) makes the CPS/ICS system vulnerable

to potential cyber threats. There is also the risk of insider attacks from the organiza-

tion’s employees as some employees have authorized access to the ICS network for

managing the network operations. This type of insider threat falls in the category of

credentialed ICS insider attack [13,14].

Technology Threats: Even today, most ICSs run on old technologies, where the primary concern is the matching of protocol-level message communications among ICS products from different vendors. Thus, many ICSs lack strong authentication and encryption mechanisms (see Laing [15]). Some ICSs use authentication procedures, but weak security mechanisms (e.g., insecure passwords, default user accounts, and inadequate password policies) are not enough to protect the system from intelligent adversaries [13, 14].

Integration and Inter-connectivity Threats: In enterprise networks, business units are interconnected. Due to the interconnection of the corporate network with the control system network, ICS devices become vulnerable to cyberattacks. This vulnerability arises because part of the corporate network is open for communication over the internet, and ITS hosts and servers contain vulnerabilities. Merely putting the ICS devices behind firewalls does not necessarily protect the ICS components [13].

Next, we discuss cyber resilience from the CPS perspective to understand how to protect the CPS from the threats discussed above.

2.3 Cyber Resilience: What Does It Mean for CPS?

Some of the early definitions of resiliency concentrated on disaster resiliency. From the disaster resilience perspective, Bruneau et al. [16] proposed a conceptual framework to define seismic resilience. Tierney and Bruneau [17] later introduced the R4 framework for disaster resilience. The R4 model [17] comprises four metrics: robustness, redundancy, resourcefulness, and rapidity. ‘Robustness’ means the system's ability to function and provide services even under degraded performance, possibly with reduced quality of service. ‘Redundancy’ means identifying substitute elements that satisfy functional requirements in the event of significant performance degradation or service disruption. ‘Resourcefulness’ means initiating solutions by identifying the required resources based on the consequence, nature, or depth of the degradation and by prioritizing the problems that need to be solved. ‘Rapidity’ indicates the ability to restore functions within the required time frame.

The National Academy of Sciences (NAS) defines resilience as “the ability to prepare and plan for, absorb, respond, recover from, and more successfully adapt to adverse events” (see National Research Council [18]). The National Institute of Standards and Technology (NIST) defines information system resilience as “the ability of an information system to continue to: (i) operate under adverse conditions or stress, even if in a degraded or debilitated state while maintaining essential operational capabilities; and (ii) recover to an effective operational posture in a time frame consistent with mission needs” (see Ross [19]).

Considerable research is under way on the cyber resilience of CPS. We mention here a few works that deal with frameworks and security guidelines. NIST provides a framework (Sedgewick [20]) for improving the cybersecurity and resilience of critical infrastructures that support both ITS and ICS. NIST provides another framework specifically for Cyber-Physical Systems (Griffor et al. [1]). We elaborate on the frameworks in Sect. 3.2. Haque et al. [21] illustrate the gap in resilience analysis and propose a cyber resilience framework to quantify resilience metrics. The framework considers the physical, technical, and organizational aspects of cyber operations to assess the cyber resilience of ICS. Haque et al. also introduce a qualitative cyber resilience assessment tool [22] based on the framework.

In the ICS domain, Stouffer et al. [5] provide detailed guidelines for ICS security. The guidelines cover secure ICS architecture and the methods for applying security controls to the ICS environment. Barker et al. [23] propose resilience analytics for social networks that depend on each other. The metrics describe how risk analysis can help in the modeling and quantification of system resiliency. DiMase et al. [24] present a systems engineering framework for Cyber-Physical Systems security and resiliency. The paper focuses on CPS security and relates it to resiliency in order to handle integrated and targeted security measures and policies. We cover some of the frameworks in Sect. 3 and thus omit the detailed discussion here to avoid repetition.

In the modeling context, Haque et al. [25] highlight ways of modeling resilience in CPS by considering the criticality of the cyber assets. Haque et al. [26] present cyber modeling techniques that use the critical system functionality, specifically for energy delivery systems. In resilience analytics, Clark and Zonouz [27] present intrusion resilience metrics for Cyber-Physical Systems by segregating the cyber and control layers of CPS. In another work, Haque et al. [14] explain the challenges in resilience assessment in CPS and discuss ways to develop a simulation platform for resilience assessment.

Wei and Ji [28] discuss a model named the resilient industrial control system (RICS). The authors mention the following characteristics of a resilient ICS:

• Capability to reduce the unexpected consequences or impact of a cyber incident as much as possible

• Capability to mitigate a major portion of undesirable events

• Capability to recover normal operations within an expected time frame.

The R4 metrics [17] presented above are in line with the characteristics of a resilient ICS provided by Wei and Ji [28]. Most of the above works address resilience by developing security frameworks and deriving quantitative analytics for the CPS or ICS. In this chapter, we want to focus on understanding the cybersecurity frameworks and standard practices proposed by the governing bodies. That discussion follows in Sects. 3 and 4, respectively.

3 State-of-the-Art Review of Cybersecurity Frameworks

In this section, we cover four crucial cybersecurity frameworks applicable to CPS. These are (1) the NIST framework for improving critical infrastructure cybersecurity, (2) the NIST framework for Cyber-Physical Systems, (3) the NIST risk management framework for information systems cybersecurity, and (4) the cyber resiliency engineering framework of the MITRE Corporation. We consider these frameworks for our analysis because researchers regard them as among the most widely adopted frameworks in the cybersecurity domain. We also provide a comparative analysis of several other cybersecurity frameworks in Sect. 3.5.

3.1 NIST Framework for Improving Critical Infrastructure Cybersecurity

The NIST cybersecurity framework (Sedgewick [20]) version 1.0 provides broad guidelines to manage cybersecurity risk and resilience. It has three main sections: core, implementation tiers, and profiles. It can also help to identify operations needed to reduce risks and enhance resilience. The NIST framework identifies and proposes five security functions. These functions help manage system cybersecurity, as we illustrate in Fig. 2.

Fig. 2 NIST cybersecurity framework core functions and categories. We present here only the core functions and categories, adapted from the framework initially proposed by Sedgewick [20], to explain the essential ideas

The core of the framework provides actions to achieve specific results in the cybersecurity area. The element “Functions” organizes the necessary cybersecurity activities at the uppermost level. These functions are identify, protect, detect, respond, and recover. They help organizations manage cybersecurity risk and resilience. Here, the ‘identify’ function implies developing an understanding of system risks and managing assets, data, capabilities, skills, etc. The ‘protect’ function deals with developing the necessary defensive measures and implementing them to ensure the continuity of services. The ‘detect’ function provides the capability to capture the occurrence of a cyberattack incident. The ‘respond’ function refers to taking action on a detected cyber breach incident. Lastly, the ‘recover’ function means restoring any capabilities or services damaged by a cyberattack incident.
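One simple way to work with the five functions is to treat them as an ordered enumeration and check which functions a security plan leaves uncovered. The sketch below is a hypothetical illustration and is not part of the NIST framework: the class name `CoreFunction`, the helper `uncovered_functions`, and the example activities are our own assumptions, and the descriptions merely paraphrase the text above.

```python
from enum import Enum


class CoreFunction(Enum):
    """The five core functions, with descriptions paraphrased from the text."""
    IDENTIFY = "understand system risks and manage assets, data, and capabilities"
    PROTECT = "develop and implement defensive measures to keep services running"
    DETECT = "capture the occurrence of a cyberattack incident"
    RESPOND = "take action on a detected cyber breach incident"
    RECOVER = "restore capabilities or services damaged by an incident"


def uncovered_functions(activities):
    """Return the core functions that no planned activity addresses.

    `activities` is an iterable of (activity name, CoreFunction) pairs.
    """
    covered = {function for _, function in activities}
    return [f for f in CoreFunction if f not in covered]


# Hypothetical security plan, for illustration only.
plan = [
    ("asset inventory", CoreFunction.IDENTIFY),
    ("network segmentation", CoreFunction.PROTECT),
    ("log monitoring", CoreFunction.DETECT),
]
print([f.name for f in uncovered_functions(plan)])
```

For this example plan, the check reports that nothing yet addresses the respond and recover functions.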

The framework presents a high-level risk and resilience assessment. It guides what to do during a cyberattack event. However, the framework does not specify how to implement those actions. The model also needs to consider system differences among critical infrastructures. For example, if the same attack happens in the energy and water sectors, the framework prescribes the same methodologies and actions for both, which may not account for the differences between the systems.

We adapt the resilience curve presented by Wei and Ji [28] and map the graph

with the ﬁve functions offered by the NIST framework. The curve is similar to

Fig. 3 CPS cyber resilience graph with different phases of action. We adjust the graph from the

original graph presented in RICS model by Wei and Ji [28] to incorporate the resilience phases

12 M. A. Haque et al.

the duck curve in energy systems reliability analysis. In general, a resilient sys-

tem goes through ﬁve stages during an adverse event. These are plan/prepare,

absorb, analyze/respond, recover, and adapt (Linkov et al. [29]). In Fig. 3, we present

the resilience curve applicable to CPS by mapping it with the NIST functions.

The resilience curve indicates system behavior during a cyberattack incident. The resilience graph plots system functionality over time across the different phases of cyber operations. The five stages complete the resilience cycle, and the area enclosed by the curve is the quantitative measure of the system’s cyber resilience.
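The area-based measure described above can be illustrated with a small numerical sketch. The example is only one possible reading and rests on our own assumptions: functionality is sampled at discrete times, scaled so that 1.0 denotes nominal service, and integrated with the trapezoidal rule. The function name `resilience_area` and the sample values are hypothetical and are not taken from Wei and Ji [28].

```python
def resilience_area(times, functionality):
    """Normalized area under a sampled functionality curve.

    Assumption (not from the source): functionality is scaled so that 1.0
    means nominal service, and the curve is integrated with the trapezoidal
    rule. A result of 1.0 means no functionality was lost in the window.
    """
    if len(times) != len(functionality) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    area = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        # Trapezoid between two consecutive samples
        area += dt * (functionality[k] + functionality[k + 1]) / 2.0
    return area / (times[-1] - times[0])


# Hypothetical incident: nominal service, a drop (absorb), a degraded
# plateau (respond), and a gradual restoration (recover/adapt).
times = [0, 2, 3, 5, 8, 10]
functionality = [1.0, 1.0, 0.4, 0.4, 0.9, 1.0]

score = resilience_area(times, functionality)
lost = 1.0 - score  # share of functionality lost over the window
print(f"resilience score: {score:.3f}, lost share: {lost:.3f}")
```

Under this reading, a higher score corresponds to a shallower or shorter dip in functionality; the complementary value is the area lost to the incident.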

3.2 NIST Framework for Cyber-Physical Systems

Griffor et al. [1] propose a framework for CPS that captures the generic CPS func-

tionalities. The framework focuses on the activities required to support conceptual-

ization, realization, and assurance of CPS. The framework requires identifying CPS

domains, facets, aspects, concerns, activities, and artifacts [1]. Here, ‘domains’ repre-

sent the CPS application areas; ‘concerns’ are concepts that drive the CPS framework

methodology. Activities within the facets address the ‘aspects,’ and each ‘aspect’ consists of a group of related concerns. There are nine defined aspects. These are functional,

business, human, trustworthiness, timing, data, boundaries, composition, and life-

cycle (see Griffor et al. [1]). ‘Facets’ encompass identiﬁed activities to perform in

the systems engineering process within the CPS. Each facet contains a set of well-defined activities and artifacts (i.e., outputs) for addressing the concerns. In Fig. 4,

the middle rectangular box layer shows what to do at each facet step. The bottom

parallelograms indicate the outcomes (i.e., artifacts) of the facet steps.

In Fig. 4, we observe the three identiﬁed facets: conceptualization, realization,

and assurance. ‘Conceptualization’ covers what is to be done: the group of activities that constitute a CPS model. ‘Realization’ covers how things are to be made

Fig. 4 Main facets of the NIST framework for Cyber-Physical Systems [1]. We have adapted the

ﬁgure from the original framework to focus only on the important ideas. Here, conceptualization,

realization, and assurance are the three facets, as proposed in the framework


and operate. Realization encompasses the group of measures that create, deploy, and

manage a CPS. ‘Assurance’ is to achieve the desired level of conﬁdence that the

system will work as planned. This facet includes the group of actions that provide

the belief that CPSs work as intended.

The CPS framework is still in draft form and is not yet fully established. Its primary goal is to be actionable. From a critic’s point of view, however, the framework is nothing more than a systematic approach to realizing the CPS process. The three main facets on which the framework rests require activities that depend solely on the expertise of subject matter experts. The framework hardly illustrates how to handle cyber and physical challenges from design, modeling, and security perspectives. Defining the CPS aspects and updating the facet activities and artifacts would differ from domain to domain and would require gathering a vast amount of data and expertise from system administrators or subject matter experts. Overall, the framework does not explain how to handle the security, reliability, and resilience issues of complex CPSs.

3.3 NIST Risk Management Framework for Information

Systems Cybersecurity

In collaboration with the US Department of Defense, the Ofﬁce of the Director

of National Intelligence, and the Committee on National Security Systems, NIST

has developed the Risk Management Framework (RMF) [30]. The RMF was conceived to improve information and data security in a networked environment. The RMF encourages sharing data and information among organizations and strengthens risk and resilience management processes. The RMF adopts a three-layered, pyramid-shaped approach to handle and manage risks within the organization. The

bottom layer is the information systems layer. The middle layer deals with business

or mission processes. Finally, the topmost layer handles the organizational processes.

Here we only analyze the core processes as proposed in the framework.

Figure 5 illustrates the RMF steps in the risk management process flow. We briefly explain here the seven steps involved in the comprehensive risk assessment.

1. Prepare: The preparation step incorporates essential tasks at all three levels of the enterprise network: the organizational level, the mission and business process level, and the information systems level. The ‘prepare’ step keeps the organization ready to manage the risks associated with its security and privacy.

2. Categorize: The categorization step classifies the system based on an impact analysis. The classification considers the amount of information processed by the system, as well as the volume of data stored and transmitted by the system.

3. Select: This step guides the organization to choose an initial set of baseline

security controls. The security controls come from the analysis of the security


Fig. 5 Steps in the NIST risk management framework for information systems cybersecurity [30]. It consists of seven steps: (1) prepare, (2) categorize system, (3) select controls, (4) implement controls, (5) assess controls, (6) authorize system, and (7) monitor controls. The steps are to be followed sequentially, although the preparation phase needs to consider the constraints in other stages. The figure is adapted from the proposed framework [30] to support the discussion

categorization. The ‘select’ step also covers managing and revising the security control baselines as needed. The baseline comes from an assessment of the organization’s risk conditions.

4. Implement: The implementation step emphasizes the execution of the security controls from an operational perspective. The step also incorporates documentation of the controls.

5. Assess: This step assesses the security controls using appropriate measures to estimate how correctly the controls are implemented. It considers whether the system is operating as planned. The step also evaluates whether the implemented actions produce the expected outcome with respect to the established security requirements.

6. Authorize: The authorization step authorizes system operation when there is an identified risk. This risk is directly related to organizational assets and operations. The system continues to operate if the assessment finds that the risk is acceptable.

7. Monitor: The last step is to monitor and assess the implemented security measures regularly. This monitoring includes evaluating the effectiveness of the implemented security controls and documenting any changes in the operational environment. Other tasks, such as conducting impact analyses of the security alterations and reporting to appropriate personnel, are part of the monitoring process.

This framework is one of the more advanced cybersecurity frameworks available today for addressing cybersecurity and cyber resiliency concerns. The seven sequential steps described above can be tailored to the CPS domain area with the help of subject matter experts. One of the framework’s primary focuses is to


Fig. 6 MITRE cyber resiliency engineering framework [31]. We only present here goals and

objectives, adapting from the proposed framework to help in realizing the essential ideas

monitor and assess the security control mechanisms and evaluate the impact of any

cyberattack incident. We think the framework addresses that concern conclusively

and comprehensively.

3.4 MITRE Cyber Resiliency Engineering Framework

MITRE Corporation has proposed a cyber resiliency engineering framework (see

Bodeau and Graubart [31]). The framework consists of cyber resiliency goals, objec-

tives, and cyber resiliency practices. It also incorporates threat models associated with

cyber risk and resiliency. The framework focuses on characterizing cyber resilience

metrics. Figure 6 illustrates the framework. The elements of cyber resiliency consist

of four goals: (1) anticipate, (2) withstand, (3) recover, and (4) evolve [31]. There

are eight objectives: (1) understand, (2) prepare, (3) prevent, (4) continue, (5) con-

strain, (6) reconstitute, (7) transform, and (8) re-architect. The framework consists

of fourteen practices that intend to maximize cyber resiliency. These are (1) adap-

tive response, (2) privilege restriction, (3) deception, (4) diversity, (5) substantiated

integrity, (6) coordinated defense, (7) analytic monitoring, (8) non-persistence, (9)

dynamic positioning, (10) redundancy, (11) segmentation, (12) unpredictability, (13)

dynamic representation, and (14) realignment. In this framework, the different goals,

objectives, and practices may work together or operate separately.

Although the NIST frameworks presented earlier deal with cybersecurity broadly, the MITRE framework focuses specifically on cyber resilience engineering and assessment. The goals and objectives guide us to take the correct action at each step of the resilience management cycle. The proposed goals align with the NAS resilience definition, which comprises plan, absorb, recover, and adapt [18]. The framework offers several practices that, with careful consideration, apply to the ICS/CPS domain once the rules are adjusted for the system constraints and design methodologies.


3.5 Comparison of the Frameworks

A close look at the above frameworks reveals that they approach the management of cybersecurity and cyber resilience for infrastructures and systems from the following perspectives.

• Plans, goals, objectives, practices, and strategies (risk and resilience perspective)

• Identify, protect, detect, respond, recover, and adapt (resilience perspective)

• Anticipate, withstand, recover, and evolve (resilience perspective)

In Table 2, we present a structured comparison of several major cybersecurity and cyber resilience frameworks proposed by different standards bodies and research organizations. A closer look shows that most of the frameworks address some common areas: identifying critical assets, securing the network through multi-level access controls, and assessing cyber risks to the business and the organization as a whole. Finally, the frameworks propose techniques to safeguard critical system functions or services by developing mitigation plans and strategies. What these frameworks lack is a formalization of their guidance using established mathematical methods. In this work, we recognize the need for formal approaches and address it in detail in Sect. 5 by developing mathematical techniques for risk and resilience assessment.

4 Cyber Standards and Recommended Practices for CPS

In this section, we brieﬂy discuss control system speciﬁc recommendations suitable

for ICS or CPS. The ICS-CERT provides the following critical control system speciﬁc

cyber recommendations that give a solid baseline regarding what to do and how to

prevent cyberattacks in ICS.

• Developing cyber forensics plans for control systems: Developing a cyber forensics program is challenging for control systems environments. The challenges arise from system limitations such as nonstandard protocols, old designs, and irregular proprietary technologies. Cornelius and Fabro [32] address the challenges of applying traditional forensics to ICS and provide detailed guidance for developing a cyber forensics program by identifying the system environment and its uniqueness, defining context-specific requirements, and identifying and collecting system data.

• Applying defense-in-depth strategies to improve industrial control systems cybersecurity: The ICS defense-in-depth strategies (see Fabro et al. [33]) provide comprehensive guidance for improving cybersecurity in control systems such as CPS/ICS.

• ICS security incident response plan: The standard [35] primarily focuses on the preparation and response mechanisms for a cyberattack incident on the ICS network. The policy has four segments. The first segment concentrates on planning


Table 2 Major cybersecurity and cyber resilience frameworks proposed by standard bodies and research organizations

Framework | Publishing organization | System | Intended use | Major functions, processes, and/or metrics | Year | Document version
Risk management framework for information systems cybersecurity | Joint Task Force | ITS | Managing security and privacy risk for organization-wide information systems | Prepare, categorize, select, implement, assess, authorize, and monitor | 2018 | NIST SP 800-37 Rev. 2 (a)
Framework for Cyber-Physical Systems | National Institute of Standards and Technology (NIST) | Mainly CPS | Broad design and security guidelines for CPS | Conceptualization, realization, and assurance | 2017 | NIST SP 1500-201 (b)
Framework for improving critical infrastructure cybersecurity | National Institute of Standards and Technology (NIST) | Critical Infrastructures (CI) | Managing risk and resilience of CI | Identify, protect, detect, respond, and recover | 2014 | Version 1.0 (c)
Conceptual framework for developing resilience metrics for the electricity, oil and gas sectors in the United States | Sandia National Laboratory | Mainly energy and oil and gas sector. Also covers ICS, CPS, SCADA | Developing cyber resilience analytics for energy, and oil and gas sectors | Define goals and metrics, characterize threats, apply system model, evaluate and incorporate improvements | 2014 | Version not specified (d)
Cyber resiliency engineering framework | MITRE Corporation | ITS | Developing cyber resilience goals, objectives, and practices for ITS | Anticipate, withstand, recover, and evolve | 2011 | Version not specified (e)
R4 resilience framework | Multidisciplinary Center for Earthquake Engineering Research (MCEER) | Critical Infrastructure (CI) | Resilience assessment for CI using quantitative security metrics | Robustness, redundancy, resourcefulness, and rapidity | 2007 | Version not specified (f)

(a) Joint Task Force [30]
(b) Griffor et al. [1]
(c) Sedgewick [20]
(d) Watson et al. [34]
(e) Bodeau and Graubart [31]
(f) Tierney and Bruneau [17]


for a potential cyber event. This part also incorporates establishing a response

team and setting up a response plan for cyber incidents. The plan should include

policies, procedures, and personnel as per the organization’s established standards.

The second segment focuses on incident prevention. The third segment is incident management, which subdivides into four operations: (1) detection of potential threats; (2) containment of the event (e.g., quarantining malware installed on the servers); (3) remediation, including eradication of the risk (e.g., malware); and finally (4) recovery from the event and restoration of the system to its full-service capability. The fourth segment deals with the post-event analysis. This analysis includes determining the root cause, access path, vulnerability, and other information necessary to understand the incident better, and it covers cyber forensics and data preservation. The review helps protect the system against similar incidents in the future.

• Patch management for control systems: There is no “one size fits all” solution that adequately addresses the patch management processes of both IT and OT networks. Patches are implemented differently in information technology systems and industrial control systems, as discussed earlier in Table 1. The recommended practices (see Tom et al. [36]) provide a detailed explanation of the patch management program (e.g., backup, patch testing, and disaster recovery), patching analysis (e.g., vulnerability analysis), and deployment in the control systems environment.

• Updating antivirus in industrial control systems: Antivirus software is more widely used in information technology than in ICS. In ICS, antivirus software is applied to comply with the defense-in-depth strategy, so antivirus software and patches need to be updated periodically. These recommendations [37] describe how to update the antivirus software in the control system environment without impacting the OT production systems.

Again, most of these standards are very generic and may vary from system to

system, depending on the area of applications. In this chapter, we want to provide

quantiﬁable resilience assessment methodologies that would help make informed

decisions by incorporating the security guidelines.

5 Formal Approaches for Realizing CPS Resilience

One way to realize the frameworks and security practices within the CPS domain is to

provide quantitative cyber resilience analytics. Such analytics could help network administrators and operators in two ways: (1) by assessing systems and identifying their weak areas, and (2) by supporting the development of optimal mitigation strategies. Researchers utilize both qualitative and quantitative modeling approaches for deriving quantitative cyber resilience metrics. In this section, we

present formal mathematical methods and procedures to quantify cyber resilience for

the CPS. We ﬁrst offer a subjective approach for quantifying cyber resilience utilizing

the analytical hierarchy process (AHP) in Sect. 5.1. We then propose a quantitative


resilience assessment approach using a multi-level vulnerability graph model utiliz-

ing the graph properties in Sect. 5.2. Next, we offer a plan for critical cyber asset

identiﬁcation utilizing the technique for order of preference by similarity to ideal

solution (TOPSIS) method in Sect. 5.3.

Within the scope of this chapter, we can formally model only specific aspects of the frameworks. We therefore model network criticality, system functionality, and cyber resilience analytics for the CPS based on the system’s vulnerabilities. We also provide methods to rank critical assets.

5.1 Cyber Resilience Quantiﬁcation by Subjective Evaluation

Using Analytical Hierarchy Process (AHP)

The Analytic Hierarchy Process (AHP) is an organized technique for analyzing

complex decisions based on mathematical and psychological comparison (Saaty

[38]). AHP has been in use in the cybersecurity domain to assess security metrics

for a long time because of its ability to combine mathematical objectivity with the

psychological subjectivity to evaluate information and help make decisions [39, 40]. We use AHP to quantify cyber resilience analytics using a subjective evaluation method based on specific questionnaires. In the following paragraphs, we first present the mathematical process involved in AHP. We then discuss a case study that assesses the robustness metric for a hypothetical ICS network in Sect. 5.1.1.

In AHP, we form the hierarchy by setting a goal to evaluate, criteria to meet that goal, and the available options or alternatives. Here we illustrate the AHP procedure for cyber resilience analytics following Haque et al. [21]. We collect subjective judgment data from $N$ subject matter experts (SMEs). We compare $m$ criteria pairwise and form a comparison matrix $P$ of dimension $m \times m$. An element $P_{ij}$ of $P$ represents the subjective comparison between the two criteria $P_i$ and $P_j$. We provide the pairwise comparison matrix $P$ in Eq. (1) below, where $P_{ij} = 1/P_{ji}$.

\[
P = \begin{bmatrix}
1 & P_{12} & \cdots & P_{1m} \\
\frac{1}{P_{12}} & 1 & \cdots & P_{2m} \\
\cdots & \cdots & \cdots & \cdots \\
\frac{1}{P_{1m}} & \frac{1}{P_{2m}} & \cdots & 1
\end{bmatrix}
\tag{1}
\]

We then derive the normalized comparison matrix $NCM_P$ from the original comparison matrix $P$ above, where $NCM_P(i,j) = P_{ij} / \sum_{i=1}^{m} P_{ij}$.

\[
NCM_P = \begin{bmatrix}
NCM_P(1,1) & NCM_P(1,2) & \cdots & NCM_P(1,m) \\
NCM_P(2,1) & NCM_P(2,2) & \cdots & NCM_P(2,m) \\
\cdots & \cdots & \cdots & \cdots \\
NCM_P(m,1) & NCM_P(m,2) & \cdots & NCM_P(m,m)
\end{bmatrix}
\tag{2}
\]


Each criterion has a weight. We compute the weights of the criteria using the normalized matrix $NCM_P$. The weights approximate the normalized principal right eigenvector of the pairwise comparison matrix $P$.

\[
W = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_m \end{bmatrix}
\tag{3}
\]

where $W_i = \frac{1}{m} \sum_{j=1}^{m} NCM_P(i,j)$. We also need to check the consistency of the pairwise comparison. We do that by computing the consistency ratio $CR = CI/RI$, where $RI$ is the random index and $CI$ is the consistency index. We calculate $CI$ from the principal eigenvalue $\lambda_{max}$, as given in Eq. (4).

\[
CI = \frac{\lambda_{max} - m}{m - 1}
\tag{4}
\]

Here, we compute $\lambda_{max}$ by

\[
\lambda_{max} = \sum_{j=1}^{m} \left[ \sum_{i=1}^{m} P_{ij} \right] W_j
\tag{5}
\]

We find the value of the random index $RI$ from Table 6 of the article by Saaty [38]. We accept the comparison if the consistency ratio $CR \le 0.1$, that is, if the inconsistency is at most 10% of what randomly generated judgments would produce. Table 3 provides default $RI$ values for the corresponding $m$ values for cases $m < 10$. Next, in Sect. 5.1.1, we present an illustration of assessing the robustness metric.
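The AHP computation described in Eqs. (2) to (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ahp_weights` and the example matrix are hypothetical, the weights are the row averages of the column-normalized matrix, and the consistency index uses the standard Saaty form $CI = (\lambda_{max} - m)/(m - 1)$. The random index values are those listed in Table 3.

```python
# Random index values for m = 2..8, as listed in Table 3
RANDOM_INDEX = {2: 0.00, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41}


def ahp_weights(P):
    """Return (weights, lambda_max, CI, CR) for a pairwise comparison matrix P.

    P is a square list of lists with P[i][j] = 1 / P[j][i] and P[i][i] = 1.
    """
    m = len(P)
    # Column sums used for the normalization in Eq. (2)
    col_sums = [sum(P[i][j] for i in range(m)) for j in range(m)]
    ncm = [[P[i][j] / col_sums[j] for j in range(m)] for i in range(m)]
    # Eq. (3): each weight is the average of one row of the normalized matrix
    weights = [sum(row) / m for row in ncm]
    # Eq. (5): column sums weighted by the criteria weights
    lam_max = sum(col_sums[j] * weights[j] for j in range(m))
    # Standard Saaty consistency index and ratio
    ci = (lam_max - m) / (m - 1)
    ri = RANDOM_INDEX[m]
    cr = 0.0 if ri == 0 else ci / ri
    return weights, lam_max, ci, cr


# Hypothetical judgments for three criteria (illustrative values only)
P_example = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]
weights, lam_max, ci, cr = ahp_weights(P_example)
print([round(w, 3) for w in weights], round(cr, 3))
print("accepted" if cr <= 0.1 else "rejected")
```

A response whose consistency ratio exceeds 0.1 would be set aside before the judgments are aggregated.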

5.1.1 A Hypothetical ICS Network ‘Robustness’ Assessment Using

AHP

To explain how AHP applies to cyber resilience assessment, we provide here an illustration using the ‘robustness’ metric (one of the four broad resilience metrics of the R4 model [17]). Let us consider a hypothetical ICS network; our goal is to evaluate the cyber robustness metric for that network quantitatively. Here, we utilize the robustness metric’s decomposition, as illustrated by Haque et al.

Table 3 Values of the random index (RI) for small problems (m < 10)

m-factor (a)      | 2    | 3    | 4    | 5    | 6    | 7    | 8
Random Index (RI) | 0.00 | 0.58 | 0.90 | 1.12 | 1.24 | 1.32 | 1.41

(a) See Table 6 of Saaty [38]


Fig. 7 Decomposition of ‘robustness’ metric for ICS using the AHP process hierarchy. Robustness

is one of the four broad categories of cyber resilience metrics in R4 model

Table 4 List of possible values for each of the sub-criteria

Alternative values | Interpretation of the options
High (H)   | Specialized security measures are already implemented in the ICS for the associated sub-criterion
Medium (M) | Some or partial security measures are implemented in the ICS for the associated sub-criterion
Low (L)    | No or very few measures are implemented in the ICS for the corresponding sub-criterion

[22]. As shown in Fig. 7, the robustness metric has three assessment criteria: physical robustness (C1), technical robustness (C2), and organizational robustness (C3). Each criterion has four sub-criteria: SC1, SC2, SC3, and SC4. SC1 is ICS security (e.g., using IDS/IPS or physical security), SC2 is access control (e.g., using firewall policy), SC3 is ICS product diversity, and SC4 is ICS risk mitigation strategies. Finally, to align with the Common Vulnerability Scoring System (CVSS) (Mell et al. [41]), we design each sub-criterion to take one of three alternative values: high (H), medium (M), and low (L). We present the meaning of high, medium, and low in Table 4.

We first use the Likert scale to set up the pairwise comparison. In the comparison, equal importance is the lowest level with a numeric value of 1, and extreme importance is the highest level with a numeric value of 9. We present a sample assessment question in Fig. 8. We prefer to use the same numerical scores as the Likert range used by Saaty [38].

We then ask subject matter experts to answer the questionnaires. We conducted the survey and collected a total of N = 15 sample data sets, from which we exclude 5 because of inconsistency (CR > 0.1) in the responses.

Fig. 8 Sample pairwise comparison between the physical and the technical criteria for the ‘Robustness’ metric using Likert scale

Table 5 Pairwise comparison matrix and eigenvector estimation for maximizing robustness with respect to the three considered criteria: physical, technical, and organizational

Criteria            | Physical (C1) | Technical (C2) | Organizational (C3) | Evaluated score (normalized eigenvector)
Physical (C1)       | 1             | 0.11           | 0.20                | 0.0578
Technical (C2)      | 8.95          | 1              | 5.101               | 0.7383
Organizational (C3) | 4.89          | 0.20           | 1                   | 0.2039

Table 5 contains the aggregated pairwise comparison matrix that we computed from the consistent data set for the three criteria C1, C2, and C3. From the normalized right eigenvector in Table 5, we find the robustness as a function of the physical (C1), technical (C2), and organizational (C3) criteria, as given in Eq. (6):

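\[
\left.
\begin{aligned}
\text{Robustness} &= 0.06 \times \text{Physical} + 0.74 \times \text{Technical} + 0.20 \times \text{Organizational} \\
\text{or, } R_1 &= 0.06 \times C_1 + 0.74 \times C_2 + 0.20 \times C_3
\end{aligned}
\right\}
\tag{6}
\]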

Similarly, we have made pairwise comparisons for sub-criteria and alternatives. We derive the weights of each criterion from the normalized eigenvector corresponding to the options at the lower level of the hierarchy in the AHP model. Figure 9 shows a sample pairwise comparison illustration between options high (H) and medium (M). Equation (7) provides the numerical values that we have obtained for the weights and the normalized eigenvector for the four sub-criteria and three alternatives in matrix form.

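\[
\begin{bmatrix}
\text{ICS Security} \\
\text{Access Control} \\
\text{ICS Product Diversity} \\
\text{ICS Risk Management}
\end{bmatrix}
=
\begin{bmatrix} 0.4 \\ 0.2 \\ 0.1 \\ 0.3 \end{bmatrix},
\quad \text{and} \quad
\begin{bmatrix}
\text{High (H)} \\
\text{Medium (M)} \\
\text{Low (L)}
\end{bmatrix}
=
\begin{bmatrix} 0.7 \\ 0.2 \\ 0.1 \end{bmatrix}
\tag{7}
\]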


Fig. 9 Sample pairwise comparison between the options high (H) and medium (M) for the access

control sub-criteria using Likert scale

This way, we can assess the four broad cyber resilience metrics one at a time

using the AHP, and then combine the assessment to reach a consolidated value for

the resilience metric.
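To make the roll-up concrete, the sketch below combines the weights from Eqs. (6) and (7) into a single robustness value. The aggregation rule is our assumption, since the text does not spell it out: each criterion score is taken as the weighted sum of the alternative weights of its four sub-criteria, and the same sub-criteria weights are reused for all three criteria. The names `criterion_score` and `robustness` and the sample assessment are hypothetical.

```python
# Weights taken from Eq. (6) and Eq. (7)
CRITERIA_WEIGHTS = {"physical": 0.06, "technical": 0.74, "organizational": 0.20}
SUB_CRITERIA_WEIGHTS = {
    "ics_security": 0.4,
    "access_control": 0.2,
    "product_diversity": 0.1,
    "risk_management": 0.3,
}
ALTERNATIVE_WEIGHTS = {"H": 0.7, "M": 0.2, "L": 0.1}


def criterion_score(ratings):
    """Weighted sum over sub-criteria (assumed aggregation rule).

    ratings maps each sub-criterion name to 'H', 'M', or 'L'.
    """
    return sum(
        SUB_CRITERIA_WEIGHTS[name] * ALTERNATIVE_WEIGHTS[rating]
        for name, rating in ratings.items()
    )


def robustness(assessment):
    """Eq. (6): weighted sum of the three criterion scores."""
    return sum(
        weight * criterion_score(assessment[criterion])
        for criterion, weight in CRITERIA_WEIGHTS.items()
    )


# Hypothetical assessment of an ICS network (illustrative ratings only)
assessment = {
    "physical": {"ics_security": "H", "access_control": "M",
                 "product_diversity": "L", "risk_management": "M"},
    "technical": {"ics_security": "M", "access_control": "H",
                  "product_diversity": "L", "risk_management": "M"},
    "organizational": {"ics_security": "L", "access_control": "M",
                       "product_diversity": "L", "risk_management": "H"},
}
print(round(robustness(assessment), 4))
```

With these weights the value ranges from 0.1 (every sub-criterion rated low) to 0.7 (every sub-criterion rated high), and the technical criterion dominates the result because of its 0.74 weight.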

5.2 Cyber Resilience Assessment Using Multi-level Directed

Acyclic Vulnerability Graph Model

As we have already stated, our goal within the context of this chapter is to apply

the frameworks and security practices to evaluate the quantitative cyber resilience

metric. In this section, we describe a multi-level vulnerability graph model to assess

the cyber resilience quantitatively. We ﬁnd graph-theoretic security analytics is one

of the most common methods to address cyber risk and resilience. Next, we present

the background information necessary to understand the cyber resilience modeling

approach.

5.2.1 Background Information for Graph-Theoretic Modeling and

Analysis

This section discusses the graph-based modeling approach to give readers the necessary background on our resilience quantification methodology. We take some of the definitions from one of our recent works [26]. We frequently refer to SCADA systems for illustration, as we formulate the mathematical models with energy delivery systems (EDS) in mind as an example of CPS. Readers may consider SCADA as a monitoring and control system for the physical field devices. Researchers commonly refer to cyber-physical power systems (CPPS) [2, 3] when discussing energy systems cybersecurity. That is why we use the power systems case to illustrate the model, which applies equally to other CPS.


(1) Vulnerability graph: We define a vulnerability graph as a directed acyclic graph (DAG). In general, a vulnerability graph is a type of attack graph. Mathematically, we represent the vulnerability graph as $G = (N, E, W)$, where $N$ is the set of vertices; $E$ is the set of edges, with $E \subseteq N \times N$; and $W$ is the weight matrix of the graph. If there exists an edge $e = (i, j)$ between vertices $i$ and $j$, then $i$ and $j$ are adjacent to each other. The adjacency matrix $A$ of a graph $G = (N, E, W)$ with $|N| = n$ is an $n \times n$ matrix, where $A_{ij} = W_{ij}$ if $(i, j) \in E$ and $A_{ij} = 0$ otherwise. The weight $W_{ij}$ of the edge $(i, j)$ comes from the CVSS vulnerability base score (see Mell et al. [41]) of node $j$. The multi-level vulnerability graph is the same as the vulnerability graph, except that the different layers (as per the defense-in-depth security strategy) are modeled as separate graphs. There can be one or more perimeter devices, such as a firewall, connecting two consecutive layers. We encourage readers to explore the multi-level vulnerability graph further in the article by Haque [42].

(2) Network topology: In a CPS network, the network design follows speciﬁc system

architecture and security policies (e.g., ﬁrewall rule-sets). In the CPS, as per

the NIST guidelines (see Stouffer et al. [5]), ICS ﬁrewalls control the allowed

protocols or message communications among the ﬁeld devices through the rule-

sets or policies. We consider that the adjacency matrix is sufﬁcient to represent

the network connectivity in the vulnerability graph.

(3) Control function: We consider a control function to be a logical connection that carries (or transmits) data from the field devices to SCADA and control commands from SCADA to the field devices. These functions perform specific tasks such as voltage regulation adjustment. Formally, we define a control function $CF(i, j)$ between nodes $i$ and $j$ as $\{CF(i, j) = e(i, j) \mid \exists\, e(i, j) \in E,\ A_{ij} \neq 0 \ \&\ W_{ij} > 0\}$; thus, the edges represent the control functions in the vulnerability graph. As we utilize the CVSS base scores, the edge weight (or importance) indicates the possible exploitability and impact of exploiting the particular control function. We do not consider the degree of operability of the control functions as described in FDNA [43] in this model, because doing so raises a different research question: modeling and incorporating the functional dependencies in the cyber resilience assessment.

(4) CVSS base, exploitability, and impact scores: CVSS [41] defines the exploitability and impact metrics for every known vulnerability. The National Vulnerability Database [44] provides the CVSS scores for all reported (i.e., known) vulnerabilities. The exploitability metric comprises three base metrics: access vector $A_V$, access complexity $A_C$, and access authentication $A_U$. Similarly, the impact metric is composed of three base metrics: confidentiality impact $I_C$, integrity impact $I_I$, and availability impact $I_A$. CVSS computes the exploitability $E_i$ and impact $I_i$ of a vulnerability $i$ using Eq. (8).

\[
\begin{aligned}
E_i &= 20 \times A_V^i \times A_C^i \times A_U^i \\
I_i &= 10.41 \times \left(1 - (1 - I_C^i)(1 - I_I^i)(1 - I_A^i)\right)
\end{aligned}
\tag{8}
\]


The exploitability, impact, and base scores are measured on a scale of 0–10. The higher the value, the higher the exploitability or the consequences. To define the base score, CVSS defines an impact function as given below:

\[
f(I_i) = \begin{cases}
0 & \text{if } I_i = 0 \\
1.176 & \text{otherwise}
\end{cases}
\tag{9}
\]

Finally, CVSS computes the base score ($BS$) of vulnerability $i$ using the equation below (see [41]):

\[
BS_i = \text{roundTo1Decimal}\left(\left((0.6 \times I_i) + (0.4 \times E_i) - 1.5\right) \times f(I_i)\right)
\tag{10}
\]
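The CVSS computation in Eqs. (8) to (10) can be checked with a short Python sketch. The function names are hypothetical, and the numeric metric values in the example (1.0 for a network access vector, 0.71 for low access complexity, 0.704 for no authentication, 0.275 for partial impact) are the CVSS version 2 constants as we recall them; readers should confirm them against the CVSS specification [41] before relying on them.

```python
def exploitability(av, ac, au):
    """Eq. (8), first line: exploitability sub-score."""
    return 20.0 * av * ac * au


def impact(c, i, a):
    """Eq. (8), second line: impact sub-score."""
    return 10.41 * (1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a))


def base_score(av, ac, au, c, i, a):
    """Eqs. (9) and (10): base score rounded to one decimal place."""
    imp = impact(c, i, a)
    f_impact = 0.0 if imp == 0 else 1.176  # Eq. (9)
    raw = (0.6 * imp + 0.4 * exploitability(av, ac, au) - 1.5) * f_impact
    return round(raw, 1)


# Example vulnerability: network access, low complexity, no authentication,
# partial confidentiality/integrity/availability impact (assumed v2 values)
score = base_score(av=1.0, ac=0.71, au=0.704, c=0.275, i=0.275, a=0.275)
print(score)
```

A vulnerability with no confidentiality, integrity, or availability impact receives a base score of zero regardless of its exploitability, which is the purpose of the impact function in Eq. (9).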

(5) Multi-edge to single-edge transformation: In a network, if a node has multiple vulnerabilities, the graph becomes a multi-digraph. The number of paths from source to destination increases exponentially and creates scalability problems for large networks. To avoid this, we transform the multi-edged directed vulnerability graph into a single-edged directed graph (simple graph) using the composite exploitability score. As the severity of the exploitability and impact differs across vulnerabilities, we use a severity-based weighting approach (see Table 3 of [25]) to incorporate the severity level of each vulnerability. The composite exploitability score (ES), impact score (IS), and base score (BS) for node $j$, having vulnerabilities $i = 1, \ldots, n$, are defined in Eqs. (11), (12), and (13).

$$
ES_j = \frac{\sum_{i=1}^{n} w_i^j \times E_i^j}{\sum_{i=1}^{n} w_i^j}
\tag{11}
$$

$$
IS_j = \frac{\sum_{i=1}^{n} w_i^j \times I_i^j}{\sum_{i=1}^{n} w_i^j}
\tag{12}
$$

$$
BS_j = \frac{\sum_{i=1}^{n} w_i^j \times BS_i^j}{\sum_{i=1}^{n} w_i^j}
\tag{13}
$$

Here, $w_i^j$, $E_i^j$, $I_i^j$, and $BS_i^j$ are the severity weight, exploitability score, impact score, and base score of vulnerability $i$ of node $j$. We obtain $BS_i^j$ from the NVD database [44] or by using Eq. (10), and we compute $BS_j$, the composite base score of node $j$, using Eq. (13).
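The composite scores of Eqs. (11)–(13) are all the same severity-weighted average, which the following sketch makes explicit. The weights and scores below are hypothetical placeholders chosen for easy arithmetic; they are not the severity weights of Table 3 in [25].

```python
def composite_score(scores, weights):
    """Severity-weighted average of per-vulnerability scores for one node."""
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equally long")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("severity weights must not sum to zero")
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


# Hypothetical node j with three vulnerabilities (assumed values).
base_scores = [9.0, 6.0, 3.0]        # BS_i^j
severity_weights = [3.0, 2.0, 1.0]   # w_i^j, placeholders only
bs_node = composite_score(base_scores, severity_weights)
print(bs_node)  # 7.0
```

The same function yields $ES_j$ or $IS_j$ when it is given the exploitability or impact scores instead of the base scores.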

(6) Computation of edge weight: We utilize CVSS base scores in computing the edge weights using Eq. (13). This way, we consider both the exploitability and impact of a vulnerability in our edge weight. The weight matrix is as follows.

$$
W_{ij} =
\begin{cases}
BS_j & \text{if } (i, j) \in E \\
0 & \text{otherwise, i.e., if } (i, j) \notin E
\end{cases}
\tag{14}
$$
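A small sketch of Eq. (14), assuming nodes are numbered from 0 and that the composite base scores have already been computed with Eq. (13). The edge list and score values are invented for illustration.

```python
def weight_matrix(num_nodes, edges, composite_bs):
    """Build W with W[i][j] = BS_j for every directed edge (i, j), else 0."""
    W = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        W[i][j] = composite_bs[j]
    return W


# Hypothetical three-node vulnerability graph (assumed values).
edges = [(0, 1), (0, 2), (1, 2)]
composite_bs = {0: 0.0, 1: 7.0, 2: 4.5}
W = weight_matrix(3, edges, composite_bs)
```

Because the weight of an edge depends only on its target node, every edge entering node $j$ carries the same weight $BS_j$.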


(7) Betweenness Centrality (BC): Betweenness Centrality (BC) is a graph-theoretic metric that measures the number of times a node acts as a bridge along the shortest paths between two other nodes. If we translate a network into a graph-theoretic model, then the BC of a node indicates the possibility of attack progression through that node. Mathematically, the BC of node $n$ (i.e., $B_n$) is as follows:

$$
B_n = \sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}}
\tag{15}
$$

Here, $\sigma_{st}$ is the total number of shortest paths from source node $s$ to target node $t$, and $\sigma_{st}(n)$ is the number of those shortest paths that pass through node $n$.
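Eq. (15) can be computed with Brandes' accumulation scheme. The sketch below is a pure-Python version for a directed graph and assumes, as a simplification not stated in the text, that shortest paths are measured in hops (unweighted edges). Every node must appear as a key of the adjacency dictionary.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness (Eq. 15) for a directed, unweighted graph.

    adj maps each node to the list of its successors.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}     # number of shortest s->v paths
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        order = []                      # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Diamond graph: two equally short paths from "a" to "d".
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
bc = betweenness_centrality(graph)
```

In the diamond example, nodes `b` and `c` each lie on one of the two shortest paths from `a` to `d`, so each receives a betweenness of 0.5.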

(8) Katz Centrality (KC): Katz Centrality (KC) is another graph-theoretic parameter that gives the importance of a node considering the network structure and the node's position in the network. KC quantifies the number of nodes connected through a path, while penalizing the contributions of distant nodes. Mathematically, we define the KC of node $i$ as given in Eq. (16), where $\beta$ is an attenuation factor with $0 \le \beta \le 1$, $A$ is the adjacency matrix of the graph, and $N$ is the number of nodes.

$$
C_{Katz}(i) = \sum_{p=1}^{\infty} \sum_{m=1}^{N} \beta^{p} \left(A^{p}\right)_{mi}
\tag{16}
$$
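The series in Eq. (16) can be approximated by truncating it after a finite number of powers. The sketch below does this in pure Python; the truncation depth `max_power` is an assumption of the sketch, and the series only converges when $\beta$ is smaller than the reciprocal of the largest eigenvalue of $A$.

```python
def katz_centrality(A, beta, max_power=50):
    """Truncated Katz series: sum over p = 1..max_power of beta^p * colsum(A^p)."""
    n = len(A)
    walk_counts = [1.0] * n      # row vector 1^T A^0
    centrality = [0.0] * n
    factor = 1.0
    for _ in range(max_power):
        # Multiply the row vector by A: entry i becomes sum_m (A^p)[m][i].
        walk_counts = [
            sum(walk_counts[m] * A[m][i] for m in range(n)) for i in range(n)
        ]
        factor *= beta
        for i in range(n):
            centrality[i] += factor * walk_counts[i]
    return centrality


# Directed chain 0 -> 1 -> 2 with attenuation factor 0.5.
A = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
kc = katz_centrality(A, beta=0.5)
```

For the chain, node 1 is reached by one walk of length one (contribution 0.5), and node 2 by one walk of length one and one of length two (0.5 + 0.25).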

The following subsections present the derivation of system critical functional-

ity and resilience metrics, as shown by Haque et al. [26]. We utilize network

criticality to formulate system functionality.

5.2.2 Critical System Functionality (CSF)

System critical functionality is the level of minimum functionality maintained by a

system during any adverse scenario. It depicts the extent to which the system’s typical

performance can degrade. While discussing resiliency, Arghandeh et al. [45] illustrate

resilience as a multi-dimensional property, which requires managing disturbances of

the network performance. This disturbance may originate either from physical or

cyber devices malfunctions or failures or due to a cyberattack incident. Arghandeh

et al. also describe critical system functionality as maintaining the system’s minimal

required services in the presence of unexpected extreme disturbances. In another

study, Bharali and Baruah [46] deﬁne average network functionality using the net-

work criticality metric. Bharali and Baruah consider random network failures while

determining network functionality using a graph-theoretic approach. We extend the

analysis of Bharali and Baruah [46] for the case of random cyberattacks on the CPS.

We assume that removing an edge in the vulnerability graph makes a service unavailable or deactivates a control function by disconnecting the logical connection. Here we adopt the same average network functionality metric as the system’s critical functionality, i.e., the level of functionality maintained by the CPS under a cyberattack.

Let us denote the original graph before any attack incident by $G_o = G$, and the graph obtained by removing the edge $e$ during an attack incident by $G_e = G \setminus e$. Let $\tau$ and $\tau_e$ be the network criticality of the graphs $G_o$ and $G_e$, respectively. We then define the critical system functionality by considering the effect of the edges removed from the original graph, as given by Eq. (17).

$$
\eta = 1 - \frac{1}{m} \sum_{e \in E} \left[ I_{+}(\tau_e - \tau)\,\frac{\tau}{\tau_e} + I_{-}(\tau_e - \tau)\,\frac{\tau}{\tau_e + \frac{2n}{\mu}} \right]
\tag{17}
$$

where $m$ denotes the number of edges in $G_o$, $\mu$ is the smallest non-zero eigenvalue of $G_o$, $I_{+}(x) = 1$ if $x \ge 0$ and 0 otherwise, and $I_{-}(x) = 1$ if $x < 0$ and 0 otherwise. For a connected graph $G_o$, $\mu = \mu_1$ is the algebraic connectivity of $G_o$. Here, $0 \le \eta \le 1$. Thus, $\eta$ indicates the system functionality of the CPS under cyberattack events, i.e., the functionality or services available during the attack event considering the impacts on the links. A higher value of $\eta$ means that a higher degree of system functionality is maintained. We discuss the computation of the network criticality $\tau$ in Sect. 5.2.4.

5.2.3 Cyber Resilience Metric

Deriving resilience analytics requires understanding and incorporating system behav-

ior (linear or non-linear) during the recovery phase. It also needs to incorporate critical

system functionality while generating resilience metrics. Roberson et al. [47] deﬁne

resilience from the bulk power system perspective, where the authors consider that

the safeguarding and restoration of the system functionality subject to perturbations

are key elements of resilience. We compute the CPS’s cyber resilience by utilizing the system performance (recovery) curve given in Fig. 10 and incorporating the critical system functionality metric. Typically, during an adverse event, the recovery behavior of a system is non-linear. This recovery is a function of the system ($S$) under consideration, the duration of recovery ($T$), the recovery rate ($r$), time ($t$), and the functionality level ($\eta$). Zobel [48] addresses the power system recovery behavior from a disaster resilience perspective and proposes several functional forms to model the recovery over time. In this work, we utilize the inverted exponential form of the recovery curve from Zobel [48], which captures the non-linearity and is suitable for modeling the resilience of the CPS. We model the time-dependent system recovery behavior $Q_r(t)$ by following Eq. (6) of Zobel [48] to demonstrate the quantitative resilience metric under any adverse event. Here the impact is equivalent to the loss of system performance, i.e., $1 - \eta$, where $0 \le \eta \le 1$.

$$
Q_r(t) = (1 - \eta) \left[ 1 - e^{-\frac{\left[T - (t - t_i^{ri})\right] \ln(n)}{T}} + \frac{T - (t - t_i^{ri})}{nT} \right]
\tag{18}
$$


Fig. 10 System performance recovery curve during a cyberattack incident $i$ on the CPS. We use the graph from Haque et al. [26], which is a modified form of the resilience graph presented in Wei and Ji [28]

Table 6 Notations used for resilience modeling

| Notation | Explanation |
|---|---|
| $Q_r(t)$ | Time-dependent system recovery behavior |
| $t_i^{ri}$ | Time instance of initiating system recovery for incident $i$ |
| $t_i^{cr}$ | Time instance of complete recovery of system functions for attack incident $i$ |
| $T = t_i^{cr} - t_i^{ri}$ | The period of recovery |
| $T^*$ | System-dependent maximum allowable time for the recovery |
| $n$ (in Eq. 18) | The level of concavity of the inverted exponential curve |

We provide the notations used in Eq. (18) in Fig. 10 and Table 6. Here, $T^*$ is the system-dependent maximum allowable time to recover. Typically, system administrators or designers select $T^*$ as the maximum acceptable time for the system to recover. The area enclosed by the points e-a-d represents the loss in system functionality over time due to the cyberattack incident $i$. Thus, the area enclosed by the points a-b-c-d is the area of system resilience. To compute the resilience metric, we first calculate the area enclosed by the points e-a-d as follows:


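$$
A_{e\text{-}a\text{-}d} = (1 - \eta) \int_{t_i^{ri}}^{t_i^{ri} + T} \left[ 1 - e^{-\frac{\left[T - (t - t_i^{ri})\right] \ln(n)}{T}} + \frac{T - (t - t_i^{ri})}{nT} \right] dt
\tag{19}
$$

Simplifying the above equation, we find the reduced form given in Eq. (20).

$$
A_{e\text{-}a\text{-}d} = (1 - \eta)\, T \left[ 1 - \frac{n - 1}{n \ln(n)} + \frac{1}{2n} \right]
\tag{20}
$$

From Fig. 10, the area of e-b-c-d is $1 \times T^* = T^*$, and the area of e-a-d is defined by Eq. (20). Thus, the cyber resilience of the CPS is the area enclosed by the points a-b-c-d over the period $T^*$, as given in Eq. (21).

$$
\xi = \frac{1}{T^*} \left[ T^* - (1 - \eta)\, T \left( 1 - \frac{n - 1}{n \ln(n)} + \frac{1}{2n} \right) \right]
\tag{21}
$$

The term $1 - \frac{n - 1}{n \ln(n)} + \frac{1}{2n}$ is constant for a specific $n$ and is denoted by $\gamma$. Thus, Eq. (21) becomes $\xi = \frac{1}{T^*} \left[ T^* - (1 - \eta)\, T \gamma \right]$.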

5.2.4 Network Criticality

As we have seen earlier, to compute the CSF, we need the criticality metric. Bharali

and Baruah [46], and Tizghadam and Garcia [49] proposed a graph-based network

criticality metric. We apply the same here to measure the criticality of the overall CPS

network. We use the Moore-Penrose inverse of the Laplacian matrix $L$ to compute the network criticality $\tau$. As we are using a directed weighted graph, we define the Laplacian matrix $L$ as per Chung [50], as given in Eq. (22). In Eq. (22), $P$ is the graph transition matrix, and $\Phi$ is a matrix with the Perron vector of $P$ on the diagonal and zeros in all other matrix elements.

$$
L = I - \frac{\Phi^{1/2} P\, \Phi^{-1/2} + \Phi^{-1/2} P^{T} \Phi^{1/2}}{2}
\tag{22}
$$

We can also derive $L$ by using the normalized graph Laplacian $L_{sym}$ and the random-walk Laplacian $L_{rw}$, as below.

$$
\begin{aligned}
L_{sym} &= D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2} \\
L_{rw} &= D^{-1/2} L_{sym} D^{1/2}
\end{aligned}
\tag{23}
$$

$D$ is a diagonal matrix formed by the degrees of the nodes. We define it as $D = \mathrm{diag}(d_1, d_2, \ldots, d_m)$, where $d_i = \sum_{j=1}^{m} W_{ij}$. We use Bernstein [51] to compute the Moore-Penrose inverse of the Laplacian matrix $L$, i.e., $L^{+}$, as given in Eq. (24).


L+=L+J

n−1

−J

n(24)

where Jis an n×nmatrix whose entries are all equal to 1. We then deﬁne the

network criticality metric τby Eq. (25).

τ=2n∗trace(L+)(25)

Here, nis the number of nodes, and trace(L+)=n

i=1(L+)ii. The larger the

value of τmeans the network is more vulnerable from the exploitability perspective.
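Eqs. (24) and (25) can be sketched in a few lines. As a simplifying assumption, the example below uses the ordinary Laplacian $D - W$ of a small connected undirected graph rather than the directed Laplacian of Eq. (22); the identity in Eq. (24) is exact when the null space of $L$ is spanned by the all-ones vector, which holds in that case. The helper `invert` is a plain Gauss-Jordan routine written for this sketch so that no external library is needed.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [
        [float(x) for x in row] + [1.0 if r == c else 0.0 for c in range(n)]
        for r, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def network_criticality(L):
    """tau = 2n * trace(L+), with L+ = (L + J/n)^-1 - J/n (Eqs. 24 and 25)."""
    n = len(L)
    shifted = [[L[r][c] + 1.0 / n for c in range(n)] for r in range(n)]
    inverse = invert(shifted)
    trace_l_plus = sum(inverse[k][k] - 1.0 / n for k in range(n))
    return 2 * n * trace_l_plus


# Undirected path graph on three nodes: 0 - 1 - 2 (unit edge weights).
L_path = [
    [1, -1, 0],
    [-1, 2, -1],
    [0, -1, 1],
]
tau_path = network_criticality(L_path)
```

The non-zero Laplacian eigenvalues of this path graph are 1 and 3, so $\mathrm{trace}(L^{+}) = 1 + 1/3$ and $\tau = 2 \cdot 3 \cdot 4/3 = 8$.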

We can apply the above vulnerability graph-based resilience analytics derivation

approaches in the CPS context to assess the overall network functionality, criticality,

and resiliency. Next, we present a process to identify and rank the critical cyber assets

using the TOPSIS method in Sect. 5.3. We consider the ranking an essential step

towards realizing the security guidelines as identifying critical assets of the network

is among the criteria in the recommended defense-in-depth security measures.

5.3 Ranking Critical Assets Using TOPSIS Method

Determining criticality for the network devices is a multi-attribute decision analysis

(MADA) problem. Haque et al. [25] have identiﬁed some of the crucial parameters

for ranking the critical devices in the power system network from a cyberattack

perspective using the vulnerability graph model. Here we apply the TOPSIS method

as a MADM (Multiple-Attribute Decision Making) technique to rank the critical

devices in a CPS network.

Assessment Parameters: Here, we consider four parameters to assess each device’s criticality, although it is possible to take $N$ parameters into the decision-making process. The parameters are (1) the device’s asset value, represented by Katz centrality, (2) the device’s bridging capability, which we model using betweenness centrality, (3) attack exploitability, and (4) potential attack impact. One can compute the attack exploitability and attack impact on a device using Eqs. (11) and (12). Also, we find the BC and KC using Eqs. (15) and (16). For details on the meaning of asset value, exploitability, and attack impact, we encourage readers to check the article by Haque et al. [25], which we omit here to narrow our focus to the specific problem under consideration.

TOPSIS Method for Device Criticality Assessment: The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is a MADM technique. It builds on the concept that the chosen alternative should have the shortest geometric distance from the positive ideal solution and the longest geometric distance from the negative ideal solution (see Hwang et al. [52]). Kim and Kang [53] use and illustrate TOPSIS to determine the device criticality. Here, we briefly present the steps of the TOPSIS method to facilitate an understanding of the ranking process.


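Step I: At first, we form an $m \times n$ matrix with $m$ criteria (i.e., parameters) and $n$ alternatives (i.e., nodes/devices), where the intersection of each criterion and alternative contains a value $y_{ij}$, and $\text{Criteria}_i$ and $\text{Alternative}_j$ are the $i$th criterion and $j$th alternative.

$$
Y_{m \times n} =
\begin{array}{l|cccc}
 & \text{Alternative}_1 & \text{Alternative}_2 & \cdots & \text{Alternative}_n \\
\hline
\text{Criteria}_1 & y_{11} & y_{12} & \cdots & y_{1n} \\
\text{Criteria}_2 & y_{21} & y_{22} & \cdots & y_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\text{Criteria}_m & y_{m1} & y_{m2} & \cdots & y_{mn}
\end{array}
\tag{26}
$$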

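Step II: In this step, we normalize the matrix $Y_{m \times n}$ to form a normalized matrix $R_{m \times n} = (R_{ij})_{m \times n}$, scaling each criterion across the alternatives using the equation below.

$$
R_{ij} = \frac{y_{ij}}{\sqrt{\sum_{k=1}^{n} (y_{ik})^2}}
\tag{27}
$$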

Step III: Here, we calculate the weighted normalized decision matrix $T$ as below. We need to compute the weights using AHP. We illustrate an example in Sect. 5.3.1.

$$
T = (t_{ij})_{m \times n} = (W_i R_{ij})_{m \times n}, \quad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n
\tag{28}
$$

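Step IV: We determine the worst alternative $A_w$ and the best alternative $A_b$.

$$
A_w = \left\{ \langle \max(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I_{-} \rangle, \; \langle \min(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I_{+} \rangle \right\} = \{ t_{wi} \mid i = 1, 2, \ldots, m \}
\tag{29}
$$

$$
A_b = \left\{ \langle \min(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I_{-} \rangle, \; \langle \max(t_{ij} \mid j = 1, 2, \ldots, n) \mid i \in I_{+} \rangle \right\} = \{ t_{bi} \mid i = 1, 2, \ldots, m \}
\tag{30}
$$

where $I_{+} \subseteq \{1, 2, \ldots, m\}$ is the set of criteria having a positive impact and $I_{-} \subseteq \{1, 2, \ldots, m\}$ is the set of criteria having a negative impact.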

Step V: We compute the L2-distance between the target alternative $j$ and the worst condition $A_w$.

$$
d_{jw} = \sqrt{\sum_{i=1}^{m} (t_{ij} - t_{wi})^2}, \quad j = 1, 2, \ldots, n
\tag{31}
$$

The distance between the alternative $j$ and the best condition $A_b$ is:

$$
d_{jb} = \sqrt{\sum_{i=1}^{m} (t_{ij} - t_{bi})^2}, \quad j = 1, 2, \ldots, n
\tag{32}
$$

where $d_{jw}$ and $d_{jb}$ are the L2-norm distances from the target alternative $j$ to the worst and best conditions, respectively.

Step VI: Finally, at this stage, we compute the criticality of device $j$ (alternative $j$) using Eq. (33):

$$
\eta_j = \frac{d_{jw}}{d_{jw} + d_{jb}}, \quad 0 \le \eta_j \le 1, \quad j = 1, 2, \ldots, n
\tag{33}
$$

Using the device criticality metric, we can identify and rank the critical network

devices.
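Steps I to VI can be sketched compactly as follows. The sketch stores the decision matrix with criteria as rows and alternatives as columns, as in Eq. (26), and normalizes each criterion across the alternatives, which is the usual vector normalization in TOPSIS; that axis choice, the function name `topsis`, and the toy input are assumptions of this sketch, and the weights are taken as given rather than derived with AHP.

```python
import math


def topsis(Y, weights, positive):
    """Return the closeness score eta_j of every alternative (column of Y).

    Y[i][j]     value of criterion i for alternative j
    weights[i]  weight W_i of criterion i
    positive[i] True if criterion i belongs to I+ (positive impact)
    """
    m, n = len(Y), len(Y[0])
    # Step II: vector-normalize each criterion across the alternatives.
    R = []
    for i in range(m):
        norm = math.sqrt(sum(y * y for y in Y[i])) or 1.0
        R.append([y / norm for y in Y[i]])
    # Step III: weighted normalized decision matrix.
    T = [[weights[i] * R[i][j] for j in range(n)] for i in range(m)]
    # Step IV: worst and best conditions per criterion.
    worst = [min(T[i]) if positive[i] else max(T[i]) for i in range(m)]
    best = [max(T[i]) if positive[i] else min(T[i]) for i in range(m)]
    # Steps V and VI: distances to both conditions and closeness score.
    scores = []
    for j in range(n):
        d_w = math.sqrt(sum((T[i][j] - worst[i]) ** 2 for i in range(m)))
        d_b = math.sqrt(sum((T[i][j] - best[i]) ** 2 for i in range(m)))
        total = d_w + d_b
        scores.append(d_w / total if total else 0.0)
    return scores


# Toy input: two positive-impact criteria, three alternatives.
Y = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 3.0],
]
scores = topsis(Y, weights=[0.5, 0.5], positive=[True, True])
```

In the toy input the third alternative is best on both criteria and the first is worst on both, so their scores are 1 and 0, with the second alternative halfway between.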

5.3.1 Illustration of Ranking Critical Cyber Assets Using Vulnerability Graph and TOPSIS Method

We illustrate the application of TOPSIS to CPS network asset ranking using the vulnerability graph of Fig. 11. Here, we consider Fig. 11 as a vulnerability graph representation of a CPS with ten devices. Because of space constraints, we apply TOPSIS to determine the criticality of the nodes (or devices) 3–8 only. The edge score contains two parameters: exploitability score and impact score. Table 7 shows the weights of the criteria and the parameter values of the nodes. Table 8 shows the corresponding TOPSIS computation. In Table 8, the bold italic underlined value is the maximum of the criteria, and the bold-only value is the minimum of the criteria. Here we find that the most critical device is 4, followed by 7 and 3, and the least critical one is 8 among the six nodes that we have considered. Again, this is a sample illustration of how we can apply TOPSIS by choosing some criteria and corresponding weights using a vulnerability graph representation of a CPS.

Fig. 11 A sample vulnerability graph with arbitrary edge weights. We represent the edge weights as (exploitability score, impact score). The number of nodes used in this illustration is ten. The edge weights are within the range of 0–10 to keep them similar to CVSS scores


Table 7 Device criticality assessment parameters and values

| Parameters | Weight ($w_c$) | Device 3 | Device 4 | Device 5 | Device 6 | Device 7 | Device 8 |
|---|---|---|---|---|---|---|---|
| Exploitability | 0.25 | 7.9 | 4.3 | 5.7 | 3.1 | 2.5 | 4.6 |
| Impact | 0.5 | 6.4 | 7.8 | 3.4 | 5.2 | 8.9 | 3.2 |
| Betweenness centrality | 0.15 | 0.1273 | 0.0671 | 0.0231 | 0.0417 | 0.1018 | 0.1111 |
| Katz centrality | 0.1 | 0.3243 | 0.3299 | 0.3010 | 0.3004 | 0.3635 | 0.3643 |

Table 8 TOPSIS device criticality metrics computation

| Parameters/Metrics | Device 3 | Device 4 | Device 5 | Device 6 | Device 7 | Device 8 |
|---|---|---|---|---|---|---|
| Exploitability | 1.978 | 1.075 | 1.425 | 0.775 | 0.625 | 1.15 |
| Impact | 3.2 | 3.9 | 1.7 | 2.6 | 4.45 | 1.6 |
| Betweenness centrality | 0.019095 | 0.010065 | 0.003465 | 0.006255 | 0.01527 | 0.016665 |
| Katz centrality | 0.03243 | 0.03299 | 0.0301 | 0.03004 | 0.03635 | 0.03643 |
| $d_{jw}$ | 2.4546 | 2.7866 | 0.9708 | 1.4577 | 3.30 | 0.6917 |
| $d_{jb}$ | 1.2523 | 1.1196 | 2.8202 | 2.2469 | 1.4251 | 2.9887 |
| $\eta_j$ | 0.6622 | 0.7134 | 0.2561 | 0.3935 | 0.6984 | 0.1879 |
| Criticality rank | 3 | 1 | 5 | 4 | 2 | 6 |

The bold underline is the maximum of the criteria and bold only is the minimum of the criteria
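The last two rows of Table 8 follow from the distance rows through Eq. (33). The short sketch below recomputes $\eta_j$ and the criticality rank from the tabulated $d_{jw}$ and $d_{jb}$ values; it takes those distances as given and does not recompute them from Table 7.

```python
devices = [3, 4, 5, 6, 7, 8]
d_w = [2.4546, 2.7866, 0.9708, 1.4577, 3.30, 0.6917]    # d_jw row of Table 8
d_b = [1.2523, 1.1196, 2.8202, 2.2469, 1.4251, 2.9887]  # d_jb row of Table 8

# Eq. (33): closeness to the best condition relative to both distances.
eta = {dev: w / (w + b) for dev, w, b in zip(devices, d_w, d_b)}

# Rank 1 is the most critical device (largest eta_j).
ranking = sorted(devices, key=lambda dev: eta[dev], reverse=True)
rank = {dev: position + 1 for position, dev in enumerate(ranking)}

for dev in devices:
    print(dev, round(eta[dev], 4), rank[dev])
```

The resulting order is 4, 7, 3, 6, 5, 8, which matches the criticality ranks reported in the table.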

6 Challenges in Mapping of CPS Resilience with Security Concerns and Operational Domains

The frameworks and recommended practices that we cover in this chapter provide a

solid background on designing and implementing an effective cyber resilient strategy

for the CPS. By correctly understanding and applying the guidance posted by the

frameworks and security practices, we can transform the challenges into opportunities

by using the mathematical analysis models. This section brieﬂy discusses how to map

the standards and procedures into CPS security and operational resilience.

We think that cybersecurity and cyber resilience are better viewed in a three-dimensional (3D) representation together with the CPS domains (i.e., cyber, cyber-physical, and physical), as illustrated in Fig. 12. Each of the three CPS domains has its own security requirements and its own security concerns (i.e., threats, vulnerabilities, cyberattacks, etc.). Access control policies, organizational security policies, and an overall security strategy are in place to address these concerns, and they vary from system to system and from domain to domain. The security policies and strategies evolve based on the organizational and business mission and on situational knowledge and awareness.

On the other hand, the resilience of the systems to cyber incidents largely depends on the organizational implementation of the policies and defense strategies according to the different stages of cyber resilience (i.e., plan/prepare, absorb, recover, and adapt). Researchers also utilize another set of resilience functions for the same purpose: identify, protect, detect, respond, and recover. The frameworks presented in Sect. 3 provide concrete references for organizations to understand the security requirements and develop cybersecurity models and strategies according to the system needs. The recommended practices and the defense-in-depth policy, as illustrated in Sect. 4, provide the practical knowledge and implementation experience required to build resilient CPSs.

Fig. 12 Mapping of CPS resilience with the security concerns and operational domains

The overall challenge in implementing defense-in-depth strategies in a CPS is to map them to the particular system under consideration based on its functional area. For example, if the functional domain is an autonomous vehicle system, then the challenge would be to map the recommendations and strategies to the vehicle system design and specifications under consideration. Thus, with a thorough understanding of the system design specifications, devices, protocols, communications, system limitations, etc., and with the help of the recommended practices and mathematical modeling, one can design and implement resilient strategies for control systems, critical infrastructures, etc. It is also imperative to utilize the formal analyses that we have presented in Sect. 5 to evaluate the system’s criticality and cyber resilience and to obtain an overall assessment of the resilience posture of the whole system.

7 Conclusions

This chapter discusses cyber resilience in the context of available frameworks and recommended practices proposed by different standards bodies and cyber organizations. First, the chapter presents an in-depth analysis and review of existing cyber frameworks and recommended security guidelines for handling the resiliency of CPSs. It then discusses ways to transform the challenges into opportunities by understanding and realizing the security standards and instructions. The chapter provides a three-dimensional graphical illustration relating CPS security, CPS components, and CPS resilience by mapping them to the frameworks and standard practices. It also presents formal mathematical models to assess and quantify cyber resilience analytics for CPSs to help network administrators and researchers make informed decisions. Overall, the chapter should guide researchers in the CPS domain toward a good understanding of the relevant frameworks, CPS security measures, and modeling and simulation (M&S) constraints so that they can overcome the challenges and utilize the opportunities within the frameworks and guidelines.

Acknowledgments This material is based upon work supported by the Department of Energy

under Award Number DE-OE0000780.

References

1. Griffor, E.R., Greer, C., Wollman, D.A., Burns, M.J.: Framework for cyber-physical systems:

vol. 1. Overview, Technical report (2017)

2. Shi, L., Dai, Q., Ni, Y.: Cyber-physical interactions in power systems: a review of models,

methods, and applications. Electr. Power Syst. Res. 163, 396–412 (2018)

3. Zhang, T., Wang, Y., Liang, X., Zhuang, Z., Xu, W.: Cyber attacks in cyber-physical power

systems: a case study with gprs-based scada systems. In: 2017 29th Chinese Control And

Decision Conference (CCDC), pp. 6847–6852. IEEE (2017)

4. Macaulay, T., Singer, B.L.: Cybersecurity for Industrial Control Systems: SCADA, DCS, PLC, HMI, and SIS. Auerbach Publications (2016)

5. Stouffer, K., Falco, J., Scarfone, K.: Guide to Industrial Control Systems (ICS) Security, vol.

800, no. 82, p. 16. NIST Special Publication (2011)

6. Colbert, E.J.M., Kott, A.: Cyber-Security of SCADA and Other Industrial Control Systems,

vol. 66. Springer (2016)

7. Johnson, A., Dempsey, K., Ross, R., Gupta, S., Bailey, D.: Guide for Security-Focused Conﬁgu-

ration Management of Information Systems, vol. 800, no. 128, p. 16. NIST Special Publication

(2011)

8. Cyware: Understanding the difference between risk, threat, and vulnerability (2019). https://cyware.com/news/understanding-the-difference-between-risk-threat-and-vulnerability-c5210e89

9. Blank, R.M.: Guide for conducting risk assessments (2011)

10. Lewis, T.G.: Network Science: Theory and Applications. Wiley (2011)

11. Haque, M.A., Gochhayat, S.P., Shetty, S., Krishnappa, B.: Cloud-Based Simulation Platform for Quantifying Cyber-Physical Systems Resilience. In: Simulation Foundations, Methods and Applications (SFMA) series. Springer (2020)

12. Chen, T., Abu-Nimeh, S.: Lessons from stuxnet. Computer 44(4), 91–93 (2011)

13. Mittal, S., Tolk, A.: Complexity Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support Intelligence, Adaptation and Autonomy. Wiley (2019)

14. Haque, M.A., Shetty, S., Krishnappa, B.: Cyber-physical system resilience. In: Complexity

Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support

Intelligence, Adaptation and Autonomy (2019)

15. Laing, C.: Securing Critical Infrastructures and Critical Control Systems: Approaches for

Threat Protection. IGI Global (2012)


16. Bruneau, M., Chang, S.E., Eguchi, R.T., Lee, G.C., O’Rourke, T.D., Reinhorn, A.M., Shi-

nozuka, M., Tierney, K., Wallace, W.A., Winterfeldt, D.V.: A framework to quantitatively

assess and enhance the seismic resilience of communities. Earthquake Spectra 19(4), 733–752

(2003)

17. Tierney, K., Bruneau, M.: Conceptualizing and measuring resilience: a key to disaster loss

reduction. TR News (250) (2007)

18. National Research Council et al.: Disaster resilience: a national imperative (2012)

19. Ross, R.S.: Recommended security controls for federal information systems and organizations

[includes updates through 9/14/2009]. Technical report (2009)

20. Sedgewick, A.: Framework for improving critical infrastructure cybersecurity, version 1.0.

Technical report (2014)

21. Haque, M.A., De Teyou, G.K., Shetty, S., Krishnappa, B.: Cyber resilience framework for

industrial control systems: concepts, metrics, and insights. In: 2018 IEEE International Con-

ference on Intelligence and Security Informatics (ISI), pp. 25–30. IEEE (2018)

22. Haque, M.A., Shetty, S., Krishnappa, B.: ICS-CRAT: a cyber resilience assessment tool for

industrial control systems. In: 2019 IEEE 5th Intl Conference on Big Data Security on Cloud

(BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC)

and IEEE Intl Conference on Intelligent Data and Security (IDS), pp. 273–281. IEEE (2019)

23. Barker, K., Lambert, J.H., Zobel, C.W., Tapia, A.H., Ramirez-Marquez, J.E., Albert, L., Nichol-

son, C.D., Caragea, C.: Deﬁning resilience analytics for interdependent cyber-physical-social

networks. Sustain. Resilient Infrastructu. 2(2), 59–67 (2017)

24. DiMase, D., Collier, Z.A., Heffner, K., Linkov, I.: Systems engineering framework for cyber

physical security and resilience. Environ. Syst. Decis. 35(2), 291–300 (2015)

25. Haque, M.A., Shetty, S., Kamdem, G.: Improving bulk power system resilience by ranking

critical nodes in the vulnerability graph. In: Proceedings of the Annual Simulation Symposium,

p. 8. Society for Computer Simulation International (2018)

26. Haque, M.A., Shetty, S., Krishnappa, B.: Modeling cyber resilience for energy delivery systems using critical system functionality. In: IEEE Resilience Week 2019, pp. 33–41. IEEE (2019)

27. Clark, A., Zonouz, S.: Cyber-physical resilience: deﬁnition and assessment metric. IEEE Trans.

Smart Grid 10(2), 1671–1684 (2017)

28. Wei, D., Ji, K.: Resilient industrial control system (RICS): concepts, formulation, metrics, and

insights. In: 2010 3rd International Symposium on Resilient Control Systems, pp. 15–22. IEEE

(2010)

29. Linkov, I., Eisenberg, D.A., Bates, M.E., Chang, D., Convertino, M., Allen, J.H., Flynn, S.E.,

Seager, T.P.: Measurable resilience for actionable policy (2013)

30. JOINT TASK FORCE: Risk Management Framework for Information Systems and Organiza-

tions, vol. 800, p. 37. NIST Special Publication (2018)

31. Bodeau, D., Graubart, R.: Cyber Resiliency Engineering Framework. MTR110237, MITRE Corporation (2011)

32. Cornelius, E., Fabro, M.: Recommended practice: Creating cyber forensics plans for control

systems. Technical report, Idaho National Laboratory (INL) (2008)

33. Fabro, M., Gorski, E., Spiers, N.: Recommended practice: improving industrial control system

cybersecurity with defense-in-depth strategies. In: DHS Industrial Control Systems Cyber

Emergency Response Team (2016)

34. Watson, J.-P., Guttromson, R., Silva-Monroy, C., Jeffers, R., Jones, K., Ellison, J., Rath, C.,

Gearhart, J., Jones, D., Corbet, T., et al.: Conceptual framework for developing resilience

metrics for the electricity oil and gas sectors in the united states. Technical report, Sandia

National Laboratories, Albuquerque, NM, USA (2014)

35. ICS-CERT: Recommended practice: developing an industrial control systems cybersecurity

incident response capability (2009)

36. Tom, S., Christiansen, D., Berrett, D.: Recommended practice for patch management of control

systems. Technical report, Idaho National Laboratory (INL) (2008)

37. ICS-CERT: Recommended practice: updating antivirus in an industrial control system (2018)


38. Saaty, T.L.: Relative measurement and its generalization in decision making why pairwise

comparisons are central in mathematics for the measurement of intangible factors the ana-

lytic hierarchy/network process. RACSAM-Revista de la Real Academia de Ciencias Exactas,

Fisicas y Naturales. Serie A. Matematicas 102(2), 251–318 (2008)

39. Wilamowski, G.C., Dever, J.R., Stuban, S.M.F.: Using analytical hierarchy and analytical network processes to create cyber security metrics. Def. Acquisit. Res. J.: Publicat. Def. Acquisit. Univ. 24(2) (2017)

40. Sun, K., Jajodia, S., Li, J., Cheng, Y., Tang, W., Singhal, A.: Automatic security analysis

using security metrics. In: 2011-MILCOM 2011 Military Communications Conference, pp.

1207–1212. IEEE (2011)

41. Mell, P., Scarfone, K., Romanosky, S.: A complete guide to the common vulnerability scoring

system version 2.0. In: Published by FIRST-Forum of Incident Response and Security Teams,

vol. 1, p. 23 (2007)

42. Haque, M.A.: Analysis of bulk power system resilience using vulnerability graph (2018)

43. Garvey, P.R., Ariel Pinto, C.: Introduction to functional dependency network analysis. In: The

MITRE Corporation and Old Dominion, Second International Symposium on Engineering

Systems, vol. 5. MIT, Cambridge, MA (2009)

44. NIST: National vulnerability database. https://nvd.nist.gov/vuln/data-feeds. Accessed 14 Jan 2020

45. Arghandeh, R., Von Meier, A., Mehrmanesh, L., Mili, L.: On the deﬁnition of cyber-physical

resilience in power systems. Renew. Sustain. Energy Rev. 58, 1060–1069 (2016)

46. Bharali, A., Baruah, D.: On network criticality in robustness analysis of a network structure.

Malaya J. Matematik (MJM) 7(2), 223–229 (2019)

47. Roberson, D., Clarisse Kim, H., Chen, B., Page, C., Nuqui, R., Valdes, A., Macwan, R., Johnson,

B.K.: Improving grid resilience using high-voltage dc: strengthening the security of power

system stability. IEEE Power Energy Mag. 17(3), 38–47 (2019)

48. Zobel, C.W.: Quantitatively representing nonlinear disaster recovery. Decis. Sci. 45(6), 1053–

1082 (2014)

49. Tizghadam, A., Leon-Garcia, A.: On robust trafﬁc engineering in transport networks. In: IEEE

GLOBECOM 2008—2008 IEEE Global Telecommunications Conference, pp. 1–6. IEEE

(2008)

50. Chung, F.: Laplacians and the cheeger inequality for directed graphs. Ann. Combinatorics 9(1),

1–19 (2005)

51. Bernstein, D.S.: Scalar, Vector, and Matrix Mathematics: Theory, Facts, and Formulas, Revised and Expanded edn. Princeton University Press (2018)

52. Hwang, C.-L., Lai, Y.-J., Liu, T.-Y.: A new approach for multiple objective decision making.

Comput. Oper. Res. 20(8), 889–899 (1993)

53. Kim, A., Kang, M.H.: Determining asset criticality for cyber defense. Technical report, Naval

Research Lab, Washington, DC (2011)

Smart Nuclear Power Plants Operating System Through IoTs

Article

Full-text available

Apr 2022

The potentials of the wireless network has made it possible to access a variety of technological operations through the folk of internet-connected devices. This exponential hike and trade of wireless carriers could be a platform to operate and deploy in highly radiated remotely accessed areas such as Nuclear Power Plants (NPPs). The internet-connected devices play the most significant role to improve and build up a smart NPP operating system. The promising internet of things (IoTs) method has enabled interaction between advanced instrumentation and control devices that contributes a new paradigm in the digital world towards the advanced futuristic wireless networks as 5G or 5G beyond (5GB) communication system in industrial research. From this point of view, we investigate the important features and arduously execute a novel approach to operate of smart NPP system for safety concerns by the deployment of internet-connected devices. Therefore, due to security reasons, the IoTs are patched up with a communication between the user and corresponding component at the site. Monitoring and surveillance of NPP safety concerns IoT have become the replacement solution of manpower. In this study, we are summarizing the affecting factors for smooth functioning of NPP, human health diseases cause radiation and a big impact of remotely accessed modern NPP system. This study is also carried out a wide discussion and comprised the existing operational views. Additionally, the major key components of wireless connections are used for security and safety monitoring in NPP systems through IoTs. Finally, the future extendable work is also summarized.

Context-Based Resilience in Cyber-Physical Production System

Article

Dec 2021

Cyber-physical systems are hybrid networked cyber and engineered physical elements that record data (e.g. using sensors), analyse them using connected services, influence physical processes and interact with human actors using multi-channel interfaces. Examples of CPS interacting with humans in industrial production environments are the so-called cyber-physical production systems (CPPS), where operators supervise the industrial machines, according to the human-in-the-loop paradigm. In this scenario, research challenges for implementing CPPS resilience, promptly reacting to faults, concern: (i) the complex structure of CPPS, which cannot be addressed as a monolithic system, but as a dynamic ecosystem of single CPS interacting and influencing each other; (ii) the volume, velocity and variety of data (Big Data) on which resilience is based, which call for novel methods and techniques to ensure recovery procedures; (iii) the involvement of human factors in these systems. In this paper, we address the design of resilient cyber-physical production systems (R-CPPS) in digital factories by facing these challenges. Specifically, each component of the R-CPPS is modelled as a smart machine, that is, a cyber-physical system equipped with a set of recovery services, a Sensor Data API used to collect sensor data acquired from the physical side for monitoring the component behaviour, and an operator interface for displaying detected anomalous conditions and notifying necessary recovery actions to on-field operators. A context-based mediator, at shop floor level, is in charge of ensuring resilience by gathering data from the CPPS, selecting the proper recovery actions and invoking corresponding recovery services on the target CPS. Finally, data summarisation and relevance evaluation techniques are used for supporting the identification of anomalous conditions in the presence of high volume and velocity of data collected through the Sensor Data API. 
The approach is validated in a real case study from the food industry.

Resilient energy management incorporating energy storage system and network reconfiguration: A framework of cyber‐physical system

Article

Apr 2022
IET GENER TRANSM DIS

Due to the increasing intricacy of cyber‐physical systems (CPSs) and the severity of natural phenomena, upgrading network planning is vital to reduce the vulnerability of these systems. This study develops a novel preventive‐corrective resilient energy management strategy (PC‐REMS) for a CPS in two stages, exploiting network reconfiguration (NR) and the capacity of energy storage systems (ESSs). The first stage of the proposed PC‐REMS takes preventive actions based on contingency faults. In contrast, the second stage applies corrective measures to improve the CPS resilience against natural physical disasters. Vulnerability assessment data is sent to the physical power system daily through the communication network. The first stage, which prepares the CPS for predictable faults, focuses on pre‐scheduled ESSs and preventive NR to minimise the expected energy curtailment cost. The second stage involves real‐time network recovery through corrective NR to minimise the energy curtailment cost after the faults. Three indices (resistance, recovery, and resilience) are introduced for evaluating the effectiveness of the model. The proposed model is examined by performing multiple simulations on the 33‐bus and 118‐bus radial test systems. The simulation results show the efficiency of the proposed PC‐REMS model in dealing with predictable disasters to improve the CPS resilience.

Conceptual framework for developing resilience metrics for the electricity oil and gas sectors in the United States

Technical Report

Sep 2015

This report has been written for the Department of Energy's Office of Electricity Delivery and Energy Reliability to support the Office of Energy Policy and Systems Analysis in their writing of the Quadrennial Energy Review in the area of energy resilience. The topics of measuring and increasing energy resilience are addressed, including definitions, means of measuring, and analytic methodologies that can be used to make decisions for policy, infrastructure planning, and operations. A risk-based framework is presented which provides a standard definition of a resilience metric. Additionally, a process is identified which explains how the metrics can be applied. Research and development is articulated that will further accelerate the resilience of energy infrastructures.

Cloud-Based Simulation Platform for Quantifying Cyber-Physical Systems Resilience

Chapter

Nov 2020

Cyber-Physical Systems (CPS) often involve trans-disciplinary approaches, merging theories of different scientific domains, such as cybernetics, control systems, and process design. Advances in CPS expand the horizons of these critical systems and, at the same time, raise concerns regarding safety, security, and resiliency. To minimize operating costs and maximize scalability, it is often preferable to use the cloud environment for deploying the CPS computation processes and simulation environments. With the expanding uses of CPS and cloud computing, major cybersecurity concerns are also growing around these systems. The cloud itself has security and privacy issues. This chapter focuses on a cloud-based simulation platform for deriving cyber resilience metrics for the CPS. First, it presents a detailed analysis of the modeling of the resilience metrics by mapping them to cloud security concerns. Then, it covers modeling and simulation (M&S) challenges in developing simulation platforms in the cloud environment and discusses a way forward. Overall, we aim to discuss resilience metrics modeling and automation using the proposed simulation platform for the CPS in the cloud environment.

Cyber‐Physical System Resilience

Chapter

Dec 2019

Cyber‐physical systems (CPSs) play a critical role in diversified fields. The integration of computation and physical processes makes CPS a vital part of different industries, e.g. autonomous automobile systems, smart grid systems, healthcare systems, and communication systems. The CPS often involves transdisciplinary approaches, merging theories of different scientific domains such as cybernetics, control systems, process design, and embedded systems. With the expanding uses of the CPS, major cybersecurity concerns are also growing around these systems. The computation of cyber resilience metrics is often omitted in the literature because of the complexity of the systems and the lack of a clear idea about the overall network security posture. The chapter focuses on the cyber resilience metrics and frameworks for the CPS. It presents a detailed cyber resilience framework for CPS to be used across different industries. The framework also guides the methodologies used to compute the resilience metrics for the CPS. The chapter presents both qualitative and quantitative modeling of cyber resilience for the CPS. It discusses the automation process for the CPS resilience metrics computation, covering details of the qualitative and quantitative simulation tool architectures, vulnerability assessment, visualization, and reporting processes. The chapter also covers complexities in designing and developing simulation tools and resilience metrics computation methodologies. The chapter aims to provide an overall idea of the cyber resilience metrics computation process and a simulation platform for the CPS, and of how these would be beneficial across various industries.

Modeling Cyber Resilience for Energy Delivery Systems Using Critical System Functionality

Conference Paper

Dec 2019

In this paper, we analyze the cyber resilience of energy delivery systems (EDS) using critical system functionality (CSF). Some research works focus on the identification of critical cyber components and services to address the resiliency of the EDS. An analysis based on devices and services that excludes the system behavior during an adverse event provides only a partial analysis of cyber resilience. To address this gap, we utilize the vulnerability graph representation of the EDS to compute the system functionality under adverse conditions. We use a network criticality metric to determine CSF. We estimate the criticality metric using the graph Laplacian matrix and the network performance after removing links (i.e., disabling control functions or services). We model the resilience of the EDS using CSF and the system recovery curve. We also provide a comprehensive analysis of cyber resilience by determining the critical devices using the TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) and AHP (Analytical Hierarchy Process) methods. We present use cases of EDS illustrating the way control functions and services in EDS map to the vulnerability graph model. The simulation results show that we can estimate the resilience metric using different types of graphs, which may assist in making an informed decision about EDS resilience.
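The abstract above names TOPSIS as one of the methods used to determine critical devices. As a rough illustration only, the following is a minimal sketch of the standard TOPSIS procedure (vector normalization, weighting, distance to the ideal and anti-ideal solutions, closeness score). The device names, criteria, values, and weights are hypothetical and are not taken from the paper; the weights are fixed by hand here, whereas a method such as AHP could be used to derive them.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with standard TOPSIS.

    matrix  -- list of rows, one row of criterion values per alternative
    weights -- one weight per criterion (normally summing to 1)
    benefit -- per criterion: True if larger is better, False if smaller is better
    Returns one closeness score per alternative in [0, 1]; higher is closer
    to the ideal solution. Assumes no all-zero column and at least two
    distinct alternatives (otherwise a division by zero occurs).
    """
    n_crit = len(weights)
    # Step 1: vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    # Step 2: ideal and anti-ideal value for every criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, benefit)]
    # Step 3: Euclidean distances and relative closeness to the ideal.
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical example: rank three devices by criticality.
# Criteria (assumed): asset value, vulnerability score, hops from the attacker.
devices = ["HMI", "Historian", "PLC"]
matrix = [
    [8.0, 7.5, 2.0],
    [5.0, 9.8, 1.0],
    [9.0, 5.0, 3.0],
]
weights = [0.5, 0.3, 0.2]          # assumed weights
benefit = [True, True, False]      # fewer hops makes a device more critical

scores = topsis(matrix, weights, benefit)
ranking = sorted(zip(devices, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Treating "hops from the attacker" as a cost criterion is a modeling choice made for this sketch; a different set of criteria or weights would change the ranking.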

On network criticality in robustness analysis of a network structure

Article

Apr 2019

ICS-CRAT: A Cyber Resilience Assessment Tool for Industrial Control Systems

Conference Paper

May 2019

In this work, we use a subjective approach to compute cyber resilience metrics for industrial control systems (ICS). We utilize the extended form of the R4 resilience framework and span the metrics over the physical, technical, and organizational domains of resilience. We develop a qualitative cyber resilience assessment tool using the framework and a subjective questionnaire method. We make sure the questionnaires are realistic, balanced, and pertinent to ICS by involving subject matter experts in the process and following security guidelines and standard practices. We provide a detailed mathematical explanation of the resilience computation procedure. We discuss several uses of the qualitative tool by generating simulation results. We provide a system architecture of the simulation engine and the validation of the tool. We think the qualitative simulation tool would give useful insights for the overall resilience assessment and security analysis of industrial control systems.

Cybersecurity for Industrial Control Systems

Book

Apr 2016

Complexity Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support Intelligence, Adaptation and Autonomy

Book

Dec 2019

This book provides the state of the art in methods and technologies that elaborate on modeling and simulation support for cyber physical systems (CPS) engineering across many sectors such as healthcare, smart grid, or smart home. It presents a compilation of simulation-based methods, technologies, and approaches that encourage the reader to incorporate simulation technologies in their CPS engineering endeavors, supporting management of complexity challenges in such endeavors. Complexity Challenges in Cyber Physical Systems: Using Modeling and Simulation (M&S) to Support Intelligence, Adaptation and Autonomy is laid out in four sections. The first section provides an overview of complexities associated with the application of M&S to CPS engineering. It discusses M&S in the context of autonomous systems involvement within the North Atlantic Treaty Organization (NATO). The second section provides a more detailed description of the challenges in applying modeling to the operation, risk, and design of holistic CPS. The third section delves into the details of simulation support for CPS engineering, followed by the engineering practices to incorporate the cyber element to build resilient CPS sociotechnical systems. Finally, the fourth section presents a research agenda for handling complexity in the application of M&S for CPS engineering. In addition, this text introduces a unifying framework for hierarchical co-simulations of CPS; provides an understanding of the cycle of macro-level behavior dynamically arising from spatiotemporal interactions between parts at the micro-level; and describes a simulation platform for characterizing the resilience of CPS. Complexity Challenges in Cyber Physical Systems has been written for researchers, practitioners, lecturers, and graduate students in computer engineering who want to learn about M&S support for addressing complexity in CPS and its applications in today’s and tomorrow’s world.

Complexity Challenges in Cyber Physical Systems

Book

Nov 2019

Analysis of Bulk Power System Resilience Using Vulnerability Graph

Thesis

Sep 2018

Md Ariful Haque

Critical infrastructure such as a Bulk Power System (BPS) should have some quantifiable measure of resiliency and definite rule-sets to achieve a certain resilience value. Industrial Control System (ICS) and Supervisory Control and Data Acquisition (SCADA) networks are integral parts of a BPS. BPS or ICS are themselves not vulnerable because of their proprietary technology, but when the control network and the corporate network need to communicate for performance measurements and reporting, the ICS or BPS become vulnerable to cyber-attacks. Thus, a systematic way of quantifying resiliency and identifying crucial nodes in the network is critical for addressing the cyber resiliency measurement process. This can help security analysts and power system operators in the decision-making process. This thesis focuses on the resilience analysis of BPS and proposes a ranking algorithm to identify critical nodes in the network. Although some ranking algorithms are already in place, they lack comprehensive inclusion of the factors that are critical in the cyber domain. This thesis analyzes a range of factors that are critical from the point of view of cyber-attacks and derives a MADM (Multi-Attribute Decision Making) based ranking method. The node ranking process will not only help improve the resilience but also facilitate hardening the network against vulnerabilities and threats. The proposed method is called MVNRank, which stands for Multiple Vulnerability Node Rank. The MVNRank algorithm takes into account the asset value of the hosts and the exploitability and impact scores of vulnerabilities as quantified by CVSS (Common Vulnerability Scoring System). It also considers the total number of vulnerabilities and the severity level of each vulnerability, the degree centrality of the nodes in the vulnerability graph, and the attacker’s distance from the target node. We use a multi-layered directed acyclic graph (DAG) model and rank the critical nodes in the corporate and control networks that fall on the paths to the target ICS. We do not rank the ICS nodes but use them to calculate the potential power loss capability of the control center nodes using the assumed ICS connectivity to the BPS. Unlike most other works, we consider multiple vulnerabilities for each node in the network while generating the rank, using a weighted average method. The resilience computation is highly time consuming because it considers all possible attack paths from the source to the target node, and the number of paths increases multiplicatively with the number of nodes and vulnerabilities. Thus, one of the goals of this thesis is to reduce the simulation time needed to compute resilience, which is achieved as illustrated in the simulation results.
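The abstract above notes that the number of attack paths from source to target grows multiplicatively in a multi-layered DAG. The following minimal sketch illustrates that growth by counting paths in a small layered graph. The graph builder, node names, and layer widths are hypothetical and chosen only for illustration; this is not the thesis's model or its MVNRank algorithm.

```python
from functools import lru_cache


def count_attack_paths(graph, source, target):
    """Count distinct directed paths from source to target in a DAG.

    graph maps each node to a list of its successor nodes.
    Memoization keeps the count linear in the number of edges,
    even though the number of paths itself can grow very quickly.
    """

    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == target:
            return 1
        return sum(paths_from(nxt) for nxt in graph.get(node, ()))

    return paths_from(source)


def layered_dag(widths):
    """Build a hypothetical fully connected layered DAG.

    The single "attacker" node connects to every node in the first layer,
    every node connects to every node in the next layer, and the last
    layer connects to the single "target" node.
    """
    layers = [["attacker"]]
    layers += [[f"L{i}N{j}" for j in range(w)] for i, w in enumerate(widths)]
    layers.append(["target"])
    graph = {}
    for upper, lower in zip(layers, layers[1:]):
        for node in upper:
            graph[node] = list(lower)
    return graph


# With fully connected layers, the path count is the product of the widths.
for widths in ([3, 4], [3, 4, 5], [3, 4, 5, 6]):
    total = count_attack_paths(layered_dag(widths), "attacker", "target")
    print(widths, total)
```

Counting paths is cheap with memoization, but enumerating each path individually, as a per-path resilience computation would, scales with the path count itself, which is why the runtime concern raised in the abstract arises.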
