Dependability and Security Quantification of an Internet of Medical Things
Infrastructure based on Cloud-Fog-Edge Continuum for Healthcare
Monitoring using Hierarchical Models
Tuan Anh Nguyen, Dugki Min*, and Eunmi Choi
Abstract—Rising aggressive virus pandemics urge comprehensive studies on the dependability and security of modern computing systems in order to secure autonomous and continuous operations of healthcare systems. In that regard, this paper quantifies dependability and security measures of an Internet of Medical Things (IoMT) infrastructure that relies on an integrated physical architecture of Cloud/Fog/Edge (CFE) computing paradigms. We propose a reliability/availability quantification
methodology for the IoMT infrastructure using a hierarchical
model of three levels: (i) fault tree (FT) of overall IoMT
infrastructure consisting of CFE member systems, (ii) FT of
subsystems within CFE member systems, and (iii) continuous
time Markov chain (CTMC) models of components/devices in
the subsystems. We incorporate a number of failure modes for
the underlying subsystems including Mandel-bug related failures
and non-Mandel bugs related failure, as well as failures due to
cyber-security attacks on software subsystems. Five case-studies of
configuration alternation and four operative scenarios of the IoMT
infrastructure are considered to comprehend the dependability
characteristics of the IoMT physical infrastructure. The metrics
of interest include reliability over time, steady state availability
(SSA), sensitivity of SSA wrt. selected Mean Time to Failure Equivalent (MTTFeq) and Mean Time to Recovery Equivalent (MTTReq), and sensitivity of SSA wrt. frequencies of cyber-
security attacks on software subsystems. Analysis results help
comprehend operative behaviors and properties of a typical IoMT
infrastructure. The findings of this study can help improve the
design and implementation of real-world IoMT infrastructures
consisting of cloud, fog and edge computing paradigms.
Index Terms—e-Health Monitoring, Internet of Medical Things,
Cloud Computing, Fog Computing, Edge Computing, Reliability,
Availability, Cyber Security Attack, Hierarchical Model
I. INTRODUCTION
Internet of Medical Things (IoMT) has emerged in recent
years as a mainstream computing infrastructure opening up
a world of possibilities in medical treatment and diagnosis
of diseases [?]. Specifically, due to the rise of world-wide aggressive virus pandemics, which cause overcrowding in medical centers and hospitals and overload the prevention and treatment capacity of traditional medical systems in general and of medical practitioners in particular, IoMT is expected to revolutionize digital healthcare systems (e-Health) in order to reduce the currently huge workload of autonomous healthcare monitoring tasks and to proactively respond to new virus pandemics with near-zero latency [?]. An IoMT
infrastructure for remote healthcare monitoring often consists of a huge number of heterogeneous internet-connected medical sensors/devices. When connected to the internet, ordinary medical sensors/devices can repeatedly collect invaluable health-related data and then transmit the collected big data
*Corresponding author.
Tuan Anh Nguyen was with the Office of Research, University-Industry Cooperation Foundation, Konkuk University, Seoul 05029, Korea, e-mail: anhnt2407@konkuk.ac.kr
Dugki Min was with the Department of Computer Science and Engineering, College of Engineering, Konkuk University, Seoul 05029, South Korea, e-mail: dkmin@konkuk.ac.kr
Eunmi Choi was with the School of Software, College of Computer Science, Kookmin University, Seoul 02707, South Korea, e-mail: emchoi@kookmin.ac.kr
instantly to high-performance computing platforms either in
local regions or remote areas for comprehensive data analysis
and prediction [?]. This methodology can give insightful assessments of symptoms and trends of specific epidemic problems in a community and thus enable remote healthcare solutions for an individual patient at an arbitrary place, bringing about more control over the patient's life and medical treatment in general. In terms of finance, the report [?] anticipated that the investment in the development and construction of healthcare IoMT infrastructures will grow rapidly to approximately one trillion USD per year by 2025. Moreover, the number of connected IoMT objects will reach up to 50 billion by 2025, a large portion of which will serve healthcare purposes. On the other hand,
in some severe pandemics on a global scale, many severely affected countries prefer to conduct stringent medical distancing and lock-down measures in order to diminish the rapid spreading of the disease outbreak, which therefore demands highly efficient and durable physical IoMT infrastructures for real-time e-Health remote monitoring and consultation. For this reason, a failure of an individual system/subsystem or of an underlying component/device of the e-Health monitoring IoMT infrastructure can cause a severe loss of patient lives and/or lead to an uncontrollable pandemic situation. Hence, reliability, availability and security are essential measures for e-Health monitoring IoMT infrastructures that must be theoretically proven in order to secure constant and low-latency medical operations with a high level of autonomy and satisfaction in terms of quality of service (QoS) in such circumstances. Thus, it is of paramount importance to develop
theoretical models and perform comprehensive assessment and
prediction of the above-mentioned measures of interest for a
specific e-Health monitoring IoMT infrastructure in a complete
manner.
Recent studies have proposed different architectures and
development solutions for Internet of Things (IoT) infrastruc-
tures across various ecosystems in practice such as smart
farms in digital precision agriculture (e-Agriculture), smart
factories (e-Manufacture), etc. [?]. It is commonly understood that IoMT infrastructures are built on top of a sophisticated, multi-level architecture of heterogeneous software/hardware systems, subsystems and underlying components. Computing power and storage capability of IoMT infrastructures typically rely on existing computing backbones, which are powerful computing paradigms (e.g., CFE), while sensing and data gathering rely on the pervasiveness and ubiquity of medical sensors/devices at the very edge of the overall network. In practice, IoMT infrastructures for e-Health monitoring are strictly required to (i) host latency-sensitive and often real-time applications and services across the infrastructure, (ii) seamlessly synchronize, store, and process a huge and highly heterogeneous data volume generated by IoMT medical sensors/devices, often at a high data transaction rate, and (iii) provide a high level of reliability, availability and security in order to satisfy a high level of data accuracy, economical operational costs and user expectations for all operations at all levels of the IoMT infrastructure. Moreover, the
concept of cloud-fog-edge inter-operability and/or integration
in medical IoMT infrastructures has been gradually adopted in
practice in order to resonate existing computing power and
storage capability of those individual computing paradigms
when integrated, while also to diminish the drawbacks of those
standalone ecosystems in specific medical purposes, all together
providing seamless services and applications in medical IoMT
infrastructures. Cloud computing, an accredited centralized
computing paradigm featured by its pricing models of pay-
as-you-go and its service platforms of Software as a Service
(SaaS), Platform as a Service (PaaS) and Infrastructure as a
Service (IaaS) [?], has revolutionized the world of computing
with centralization and consolidation solutions to maximize
availability, agility and adaptability of services and resources
[?]. However, the geographical centralization and allocation of
a limited number of huge cloud data centers across the globe
may hinder the rapid expansion of an enormous network of
heterogeneous IoMT sensors/devices in different remote areas
coupled with latency-sensitive services and applications that
require massive real-time data transactions. In that context, fog
computing, an emerging decentralized computing paradigm, en-
ables the dispersion of computing power and storage capabilities
of cloud computing centers to the edge of the computing net-
work in local areas (e.g., hospitals, houses, shopping malls, etc.)
in order to achieve a high-level of data processing efficiency
and latency sensitivity of real-time medical services/application
[?]. While fog computing plays an intermediate role between cloud centers and the edge of the IoMT infrastructure, mainly for local data processing and aggregation, edge computing, a promising distributed computing paradigm at the very edge of a computing network, relies on processing at the things (connected objects) with built-in standardized data processing methodologies at the edge, consisting of end devices (e.g., smart phones, smart objects, wearable devices, etc.) or edge devices (e.g., IoMT gateways, edge routers, IoMT sensors/devices, etc.) in order to
secure high-speed data processing and gathering and to vastly
reduce response-time in most real-time services and critical
applications for healthcare monitoring and treatment in which a
very short response time, ultra-low latency and real-time access
are non-negotiable [?].
Due to such a highly sophisticated and multi-fold archi-
tecture as described above, an IoMT infrastructure with CFE
interoperability/integration for healthcare monitoring often en-
counters an enormous number of either internal failures due to software/hardware faults or external failures due to cyber-security attacks. Setting aside failures due to human intervention by system operators, hardware components may suffer unexpected malfunctions and irregular faults causing a total failure that demands a hardware repair/replacement [?], while software components generally undergo three main types of faults: (i) Bohrbug-related faults, (ii) non-aging Mandelbug-related faults (or Heisenbug-related faults) and (iii) aging-related faults [?], which require appropriate software recovery solutions (e.g., restart, reset, aging rejuvenation, etc.). In addition, connected facilities in IoMT infrastructures naturally incentivize industrial espionage and cyber-hacks on software components at all levels due to the intrinsic exposure of world-wide internet networking [?]. For instance, Distributed Denial
of Service (DDoS) attacks, a cyber-security phenomenon of the
last few years, can be performed by certain malicious parties
on infrastructural points of IoMT infrastructures to cripple
communication of connected facilities and special services of
connected things. Without exceptions, IoMT infrastructures for
e-health monitoring contain a variety of security vulnerabilities
and challenges [?]. Intense cyber-attacks, either as a decoy for a hack or for political/activist reasons, fundamentally cause lasting downtime of time-sensitive applications/services of the whole healthcare system, which in turn creates financial losses or, even more severely, a blockage of the regional medical infrastructure. Therefore, taking into account as many failure modes due to software/hardware faults and due to the intensity of cyber-attacks, together with their appropriate recovery solutions, as
possible in a monolithic model of a specific architecture of
an e-health IoMT infrastructure is crucial for assessment and
prediction of the measures of interest in a complete manner, but
is also challenging and troublesome for system practitioners and
analysts.
In the literature, there have been a number of comprehensive studies on the modeling and evaluation of reliability, availability and security measures of dominant existing computing paradigms and their physical infrastructures using stochastic models. However, the number of studies on modeling and assessment of IoMT infrastructures in general is still limited. In particular, few comprehensive studies on the modeling and assessment of reliability, availability and security measures using hierarchical models have been performed for healthcare IoMT infrastructures that come along with the integration and interoperability of CFE computing paradigms as a whole. Many
previous works focused on modeling and analysis of depend-
ability and security measures for cloud systems. Kim et al. in [?]
presented a comprehensive hierarchical model for availability
analysis of a virtualized servers system (VSS) in which a top-
level FT model is used to capture the hierarchical architecture
of the system while the bottom-level CTMC models are to
capture operative details of the underlying components. The
work was extended in [?], in which the authors comprehended
the dynamic of operative states and transitions of the VSS using
a monolithic stochastic reward net (SRN) model taking into
account software aging and rejuvenation techniques. Similar
approaches were performed for the reliability and availability
assessment of clusters of VSS [?], data center networks (DCNs) [?], software defined networks (SDNs) [?], and disaster tolerant data centers (DTDCs) [?]. In the last few years, some recent works have studied the dependability of IoMT infrastructures in different
aspects. The work [?] was one of the first studies which
proposed the concept of integrating CFE computing paradigms
in e-health IoMT infrastructures. In [?], Cao et al. proposed
a similar architecture for IoMT infrastructures based on edge-
fog-cloud continuum for streaming analytics in smart parking
applications. In terms of modeling and evaluation, Andrade et
al. proposed a monolithic stochastic petri net (SPN) model
to assess availability and operative cost measures of an IoMT
infrastructure based on a two-site disaster tolerant cloud system
in [?]. Tigre et al. in [?] presented a two-fold hierarchical
model to analyze availability of an e-health IoMT infrastructure.
The study was then extended in [?] by proposing additional
SPN models to take into account performance along with
availability measures of an e-health IoMT infrastructure. The
above-mentioned works are some of the first studies on IoMT
infrastructures for healthcare based on CFE continuum. Never-
theless, these studies only considered simplified and truncated
failure modes and recovery strategies of the cloud, fog and edge
member systems.
To the best of our knowledge, the majority of the previous
works in the area have not presented comprehensive studies
on IoMT for healthcare monitoring which carefully consider
detailed operative states and transitions of the underlying com-
ponents/devices at the bottom-most level as well as the complete
hierarchical architectures of the member systems/subsystems in
a comprehensive manner. Furthermore, security measures and
the trade-offs between availability and security metrics were
not taken into account along with the dependability assessment
properly. There has not been a common modeling and analysis
framework for IoMT infrastructures with the integration and
interoperability of CFE continuum in previous works. Our
previous paper [?] was a preliminary study in which we
proposed a modeling and analysis framework for reliability,
availability and security assessment of IoT infrastructures in
general cases. Different from the existing works, in this paper,
we present a comprehensive study on reliability, availability and
security assessment of a specific e-health IoMT infrastructure for healthcare monitoring, whose architecture consists of a three-fold hierarchy of the cloud-fog-edge computing continuum.
Accordingly, we extend our previous work by using FT models
in the top and middle levels to incorporate more in-depth hier-
archical architectures of the member systems and subsystems.
Furthermore, we capture detailed operative states and transitions
of every underlying components at the bottom level in a
complete manner using Markov models. As per observed, the
quantification of reliability, availability and security measures
of IoMT infrastructures with CFE continuum for healthcare
monitoring is still at the very preliminary stage in the research
area. Our study on reliability, availability and security quan-
tification of an IoMT infrastructure based on CFE continuum
for healthcare monitoring as well as the proposal of three-
fold hierarchical modeling and analysis framework for IoMT
infrastructures can be distinguished as one of the first studies
in the research area of dependability and security assessment
for e-health IoMT infrastructures with CFE continuum.
This paper extends the related research area on IoMT infras-
tructures through the following key contributions:
(i.) We proposed a three-fold hierarchical modeling and anal-
ysis framework for reliability, availability and security evalua-
tion of IoMT infrastructures featured with a three-layer CFE ar-
chitecture. The formalism of the proposed framework is gener-
ally developed using a hierarchical graph {Ψ, Δ, Φ}, where Ψ represents the IoMT infrastructure's FT model containing its member systems and subsystems, Δ represents the FT models capturing internal architectures of subsystems containing the respective IoMT components/devices, and Φ represents the Markov models elaborating the underlying operative behaviors (e.g., operative states and transitions) of IoMT components/devices. The modeling and analysis framework is generalized using pseudo-code for further development of assessment tools for different IoMT infrastructures.
(ii.) We proposed a set of FT models and Markov models to
comprehensively capture sophisticated architectures of member
systems and their subsystems as well as to take into account
underlying operative behaviors of IoMT components for a
typical CFE continuum based IoMT infrastructure for healthcare
monitoring. An overall hierarchical infrastructure model for the
IoMT infrastructure is then developed based on the above-
mentioned modeling and analysis framework.
(iii.) Typical case-studies and operative scenarios are con-
sidered regarding configuration alterations and operational sit-
uations of the IoMT infrastructure. We also developed hierar-
chical infrastructure models of these case-studies and operative
scenarios.
(iv.) Various quantitative analysis experiments are performed
on the proposed hierarchical infrastructure models including (i)
reliability analyses of the default IoMT infrastructure (along
with reliability analyses of its member systems and subsystems)
and its case-studies wrt. default input parameters; (ii) steady-
state analyses of measures of interests including MTTFeq,
MTTReq, SSA and downtime hours in a year for the default
IoMT infrastructure and its member systems and subsystems;
(iii) steady-state analyses of the above-mentioned measures of
interest for case-studies and operative scenarios; (iv) parametric
sensitivity analyses of the default IoMT infrastructure under the variation of MTTFeq and MTTReq of all member systems and subsystems, respectively; and (v) security analyses wrt.
intensities of cyber-attacks on all software parts in the IoMT
infrastructure.
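For reference, the steady-state measures listed above are related by standard identities (not new results of this paper): for an equivalent alternating up/down behavior,

SSA = MTTF_eq / (MTTF_eq + MTTR_eq),    annual downtime = 8760 · (1 − SSA) hours,

so the downtime figures reported later follow directly from the SSA values.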
Through the analysis results, the findings of this study can be summarized as follows:
• Service reliability can drop due to uncertain failures of the cloud, while the presence of fog/edge can transparently improve this measure;
• Expanding the size of the cloud can vastly enhance service reliability in comparison to expanding other member systems, while expanding the edge decreases service reliability;
• Redundant roles of cloud and fog in providing services (relaxed configuration) can improve system availability compared to crashed scenarios of either cloud or fog;
• A higher time to failure and a shorter time to recovery of the edge can help avoid a severe drop of system availability; and
• More frequent attacks towards fog/edge gateways result in a much lower system availability of IoMT infrastructures.
The above findings can contribute to the development as well
as the management tasks of a real-world IoMT infrastructure.
The remainder of this paper is organized as follows. In Sec-
tion II, we discuss several related works on the topic. In
Section III, we propose a complete hierarchical modeling and
analysis framework for IoMT infrastructures with CFE contin-
uum. The architecture along with case-studies and operative
scenarios of a specific IoMT infrastructure with CFE con-
tinuum for healthcare monitoring is presented in Section IV.
Hierarchical models of overall architecture, member systems,
subsystems and underlying components are developed and
presented in Section V. Numerical model analysis results on
reliability, availability and security measures of the considered
IoMT infrastructure are presented in Section VI. The paper is
concluded in Section VII.
II. RELATED WORKS
Quantitative evaluation using hierarchical models is an effi-
cient analytical modeling and analysis methodology which re-
lies on the combination of combinatorial models and state-space
models to take into account structural architecture of a complex
system and to comprehensively incorporate sophisticated oper-
ative behaviors within the system. The main purpose of using
hierarchical models is to utilize the pros while restricting the
cons of combinatorial models and state-space models in dealing
with large and complex systems. Combinatorial models [?],
such as reliability block diagram (RBD), reliability graph (RG),
and fault tree (FT) offer an intuitive architectural representation
of a complicated system with statistical independence of its
subsystems in order to capture the conditions of operative
normality or failure. On the other hand, state-based models
[?], such as Markov chain models (continuous time Markov
chain (CTMC), discrete time Markov chain (DTMC), semi-Markov process (SMP), Markov reward model (MRM), etc.)
can comprehensively incorporate sophisticated operative be-
haviors (e.g., failure/recovery modes, transient and intermittent
faults, imperfect coverage, failure/repair dependencies, etc.) of
underlying micro-systems or components at the bottom level
but they often fail to represent architectural features of a
system and easily fall into largeness/stiffness problems in an
attempt to model a large/complex system, especially with a
hierarchical feature consisting of multiple levels of subsystems
and/or micro-systems. Numerous studies in literature success-
fully employed hierarchical models for quantitative assessment
of different practical infrastructures. Kim et al. [?] employed
a hierarchical model of (i) FT models in the top level to
capture a specific architecture of a virtualized two-server sys-
tem consisting of both software/hardware subsystems, and (ii)
CTMC models in the bottom level to take into account detailed
operations of various hardware/software subsystems in order
to quantitatively assess availability measures of interest (e.g.,
MTTFeq, MTTReq, SSA, downtime) of the virtualized system.
A similar modeling approach was employed in [?] for the reliability analysis of sensor networks. Smith et al. in [?]
presented a two-fold hierarchical availability model composed
of a high-level FT model with various lower-level Markov
models for availability modeling and analysis of a real-world
commercial blade server system, IBM BladeCenter®, which
consists of up to 14 blade servers within a chassis providing
shared subsystems (e.g., power and cooling). Interchangeable
models can be used at different levels of a hierarchical model.
Matos et al. in [?] employed a two-level hierarchical model
to evaluate availability of a Mobile Cloud Computing (MCC)
infrastructure considering the connection of a mobile device
to a cloud system consisting of numerous cloud nodes. The
top-level model is RBD to represent the connection between
the mobile device and the cloud system in a straightforward
manner while the lower level consists of CTMC models to
capture state transitions of underlying parts such as battery
discharging, upgrading of mobile applications, etc. A similar approach was also employed in [?] to quantitatively analyze performability metrics of private cloud storage services. The
top-level RBD model was still used to take into consideration
cloud storage’s subsystems while the lower-level SPN model
was interchangeably used to capture performance-related mea-
sures. In our work, we extend the above-mentioned hierarchical
modeling and analysis methodology for large and sophisti-
cated IoMT infrastructures by proposing a general hierarchical
evaluation framework featured with a three-fold hierarchical
model consisting of (i) FT models in the top and middle
levels to capture structural architectures of the infrastructure
and its member systems, and (ii) Markov models in the bottom
level to capture detailed operative states and transitions of the
underlying IoMT components/devices.
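As a reminder of the bottom-level computation that such hierarchical models rely on (standard CTMC theory rather than a contribution of any cited work), the steady-state probability vector π of a CTMC with infinitesimal generator matrix Q is obtained from the balance and normalization equations, and the steady-state availability is the probability mass of the operational (UP) states:

π·Q = 0,    Σ_i π_i = 1,    SSA = Σ_{i ∈ UP} π_i.

The FT (or RBD) level then combines these per-component results according to the system structure.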
Dependability and security attributes of a certain system
often include different metrics (e.g., reliability, availability,
performance, survivability, etc.) to assess the system’s operative
satisfaction in accordance with QoS terms in service level
agreement (SLA) between system developers and clients [?].
The assessment of these dependability attributes has been ac-
complished for various existing computing paradigms in recent
years, but a limited number of previous works considered de-
pendability assessment for IoMT infrastructures. Kafhali et al.
[?] analyzed performance attributes of an IoT-enabled healthcare
monitoring system using queuing network models. Santos et al.
[?] applied a multi-objective optimisation algorithm (NSGA-
II) to obtain maximal availabilities in accordance with recom-
mended architectural configurations of an e-health IoT infras-
tructure. The work [?] was one of the preliminary studies on
dependability assessment of IoT infrastructures using stochastic
models. In this work, Macedo et al. proposed Markov models to
assess the reliability and availability of IoT applications considering redundancy aspects, while disregarding detailed architectures and/or underlying operative behaviors within the IoT infrastructure.
Ever et al. [?] developed monolithic Markov models for
performance analysis under constraints of availability (e.g.,
performability) for Wireless Sensor Network (WSN) and IoT
with similar assumptions, respectively. Li et al. in [?] proposed
end-to-end models to estimate energy consumption considering
trade-offs between performance (response time) and reliability
(service accuracy) when offloading computation from the things
to the inner computing paradigms (cloud/fog). In [?], Testa
et al. presented an event-based formalism for dependability
assessment after detailed failure modes and effects analysis of
a mobile-based healthcare monitoring system (mHealth) based
on body area network (BAN). Araujo et al. in [?] presented
a two-level hierarchical model of RBD and SPN to model a
MCC based mHealth system considering an abstracted MCC
architecture associated with a mobile device. In the work [?],
Araujo et al. studied the impact of capacity and discharging
rate on battery life time of mobile devices running mHealth
applications considering various network connectivity and com-
munication protocols in a cloud-based mHealth system. The
above studies have built a concrete foundation for our research.
Nevertheless, as discussed, previous studies in the literature did not pay sufficient attention to elaborating both the overall structural architectures of IoT infrastructures and the key operative behaviors within them. In this work, we take into
account the above-mentioned features of IoMT infrastructures
in modeling and analysis of important dependability attributes
(e.g., reliability, availability and security) in a complete manner
via a hierarchical perspective. We further perform comprehen-
sive analyses on the impact of cyber-attack intensities on the
availability measures of IoMT infrastructures to realize the
trade-off between the two metrics.
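To make this availability-security trade-off concrete, consider a minimal two-state abstraction (our illustration, not a model taken from the cited works): a software subsystem is compromised by successful attacks arriving with rate λ_a and is restored with rate μ_c. Its steady-state availability and its sensitivity to the attack frequency are then

SSA(λ_a) = μ_c / (λ_a + μ_c),    ∂SSA/∂λ_a = −μ_c / (λ_a + μ_c)^2,

so availability decreases monotonically as attacks become more frequent, which is exactly the kind of sensitivity quantified for the full models in Section VI.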
III. A HIERARCHICAL MODELING AND ANALYSIS FRAMEWORK FOR IOMT INFRASTRUCTURES
Quantifying the reliability, availability and security measures of a multi-level, complex IoMT infrastructure requires a suitable hierarchical modeling and analysis framework. A number of different sub-models are organized in a hierarchical manner to form a multi-level system model, in order to model the heterogeneous features and integration of IoMT infrastructures while also capturing the operative behaviors of the bottom-most subsystems/components in a comprehensive manner. The proposed hierarchical modeling and analysis framework utilizes (i) the capability of generating combinatorial models in accordance with the considered system architecture with simplicity and rapidity, and (ii) the capability of elaborating underlying operative behaviors (states and transitions) into state-based models, all together forming an overall hierarchical model of the IoMT infrastructure. In this study, we propose a hierarchical modeling
and analysis framework, specifically for reliability, availability
and security assessment of multi-level IoMT infrastructures.
The proposed hierarchical system model encompasses three-fold heterogeneous models including (i) an FT system model (Ψ) at the top level to capture the overall system architecture of the IoMT infrastructure consisting of different member systems, (ii) FT subsystem models (Δ) at the middle level to model the subsystems' architectures in each member system, and (iii) state-based models (Φ) at the bottom-most level to comprehensively capture the operative behaviors, featured by failure modes and recovery strategies, of every component in each subsystem, as depicted in Fig. 1. Without loss of generality, the developed hierarchical model of an IoMT infrastructure is represented as a graph {Ψ, Δ, Φ}, where:
Ψ encompasses a set of FT models Z := {ζ_1, ..., ζ_l, ..., ζ_n, ζ_ft} (l = 1...n) of a specific number n of member systems, in which ζ_ft represents the overall FT graph connecting the FT models of the member systems to each other. In turn, each FT model ζ_l of a member system encompasses a set of subsystem models {ψ^{ζ_l}_1, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_l}_{m_l}, ψ^{ζ_l}_{ft}} (i_l = 1...m_l), under the assumption that the total number of subsystem models of the member system model ζ_l is m_l. Here, ψ^{ζ_l}_{ft} is the graph of the FT model ζ_l representing the architectural configuration of the member system ζ_l. For instance, as shown in Fig. 1, the subsystem models of the member system model ζ_1 are {ψ^{ζ_1}_1, ..., ψ^{ζ_1}_i}, where the number of subsystem models in the member system model ζ_1 is i, and the subsystem models of the member system ζ_n are {ψ^{ζ_n}_1, ..., ψ^{ζ_n}_j}, where the number of subsystem models in the member system model ζ_n is j. The computed output of interest of a certain member system ζ_l (or of its set of subsystem models {ψ^{ζ_l}_1, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_l}_{m_l}, ψ^{ζ_l}_{ft}}) is π^{ζ_l}_{ft}, while the ultimate output of interest of the whole model, called π_ft, is computed eventually once all outputs π^{ζ_l}_{ft} (l = 1...n) of the member system models have been generated. A certain subsystem model ψ^{ζ_l}_{i_l} and its output of interest (called π^{ψ^{ζ_l}_{i_l}}_{ft}) are developed and computed in the middle-level model.
Δ encompasses a set of separate FT models of the above-mentioned subsystems {ψ^{ζ_l}_1, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_l}_{m_l}}, where l = 1...n and i_l = 1...m_l. In particular, the model which represents a certain subsystem ψ^{ζ_l}_{i_l} is δ^{ψ^{ζ_l}_{i_l}}, which in turn encompasses a set of component models {δ^{ψ^{ζ_l}_{i_l}}_1, ..., δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, ..., δ^{ψ^{ζ_l}_{i_l}}_{h_{l i_l}}, δ^{ψ^{ζ_l}_{i_l}}_{ft}}, where j_{l i_l} = 1, ..., h_{l i_l} is the index of the corresponding component in the subsystem ψ^{ζ_l}_{i_l}, h_{l i_l} is the number of components of the subsystem ψ^{ζ_l}_{i_l}, and δ^{ψ^{ζ_l}_{i_l}}_{ft} is the FT graph description of those component models. The computed output of interest of a certain subsystem ψ^{ζ_l}_{i_l} is called π^{ψ^{ζ_l}_{i_l}}_{ft}. The component model δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} and its output of interest are developed and computed in the bottom-most level model.
Φ encompasses a set of separate CTMC models of the above-mentioned components {δ^{ψ^{ζ_l}_{i_l}}_1, ..., δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, ..., δ^{ψ^{ζ_l}_{i_l}}_{h_{l i_l}}}, where l = 1...n, i_l = 1...m_l, and j_{l i_l} = 1, ..., h_{l i_l}. The CTMC model of a certain component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} is called φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}, and its output of interest is denoted as π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}.
Based on the above description, the proposed hierarchical modeling and analysis framework is formed by the formal description presented in Eq. (1).
Ψ := ⟨ζ_1, ..., ζ_l, ..., ζ_n, ζ_ft, π_ft⟩
   = ⟨⟨ψ^{ζ_1}_1, ..., ψ^{ζ_1}_{i_1}, ..., ψ^{ζ_1}_{m_1}, ψ^{ζ_1}_{ft}, π^{ζ_1}_{ft}⟩, ...,
      ⟨ψ^{ζ_l}_1, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_l}_{m_l}, ψ^{ζ_l}_{ft}, π^{ζ_l}_{ft}⟩, ...,
      ⟨ψ^{ζ_n}_1, ..., ψ^{ζ_n}_{i_n}, ..., ψ^{ζ_n}_{m_n}, ψ^{ζ_n}_{ft}, π^{ζ_n}_{ft}⟩, ζ_ft, π_ft⟩

Δ := { {δ^{ψ^{ζ_l}_{i_l}}_1, ..., δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, ..., δ^{ψ^{ζ_l}_{i_l}}_{h_{l i_l}}, δ^{ψ^{ζ_l}_{i_l}}_{ft}, π^{ψ^{ζ_l}_{i_l}}_{ft}} | l = 1..n, i_l = 1..m_l }

Φ := { ⟨φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}, π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}⟩ | l = 1..n, i_l = 1..m_l, j_{l i_l} = 1..h_{l i_l} }        (1)
where n is the number of member systems in the IoMT infrastructure and l = 1, ..., n is the index of the corresponding member system ζ_l. m_1, ..., m_l, ..., m_n are the numbers of subsystems existing in the member systems ζ_1, ..., ζ_l, ..., ζ_n, respectively. The notations (i_1 = 1, ..., m_1), (i_2 = 1, ..., m_2), ..., (i_l = 1, ..., m_l), ..., (i_n = 1, ..., m_n) are the indices which indicate the corresponding subsystems ψ^{ζ_1}_{i_1}, ψ^{ζ_2}_{i_2}, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_n}_{i_n}. The notations h_{1 i_1}, ..., h_{l i_l}, ..., h_{n i_n} are the total numbers of components existing in each of the above-mentioned subsystems ψ^{ζ_1}_{i_1}, ψ^{ζ_2}_{i_2}, ..., ψ^{ζ_l}_{i_l}, ..., ψ^{ζ_n}_{i_n}, respectively. The notations j_{1 i_1}, ..., j_{l i_l}, ..., j_{n i_n} are the indices of the corresponding components δ^{ψ^{ζ_1}_{i_1}}_{j_{1 i_1}}, ..., δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, ..., δ^{ψ^{ζ_n}_{i_n}}_{j_{n i_n}} in the subsystems. Furthermore, φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} is the underlying Markov model of the corresponding component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, and its output of interest to be computed is π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}. δ^{ψ^{ζ_l}_{i_l}}_{ft} is the description of the FT model of the subsystem ψ^{ζ_l}_{i_l}, and its output of interest to be computed is π^{ψ^{ζ_l}_{i_l}}_{ft}. Finally, ζ_ft is the description of the overall infrastructure FT model and π_ft is the output measure of interest of the whole IoMT infrastructure.
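As a worked illustration of how the three levels compose (a simplified example under independence assumptions, not the exact gate structure of Section V): if a subsystem ψ^{ζ_l}_{i_l} fails whenever any of its h_{l i_l} components fails, and a member system ζ_l fails whenever any of its m_l subsystems fails, then the steady-state availabilities compose bottom-up as

π^{ψ^{ζ_l}_{i_l}}_{ft} = ∏_{j=1}^{h_{l i_l}} π^{δ^{ψ^{ζ_l}_{i_l}}_{j}}_{ctmc},    π^{ζ_l}_{ft} = ∏_{i_l=1}^{m_l} π^{ψ^{ζ_l}_{i_l}}_{ft},

whereas a k-out-of-n gate over n identical replicas of availability A contributes Σ_{r=k}^{n} C(n,r) A^r (1−A)^{n−r}. The top-level output π_ft is obtained by applying the same logic to ζ_ft.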
Fig. 1: A hierarchical modeling and analysis framework for reliability, availability and security evaluation of IoMT infrastructures (the top level Ψ, the middle level Δ, and the bottom level Φ).
In order to perform comprehensive computation and analyses of the output measures of interest for IoMT infrastructures based on the above-proposed hierarchical model {Ψ, Δ, Φ}, the proposed modeling and analysis framework is presented in Algorithm 1. Without loss of generality, the framework is presented without considering the data types of input/output or intermediate parameters/variables, which can be chosen appropriately in accordance with specific programming languages. It is supposed that a system architecture designer provides sufficient information about the IoMT infrastructure, such as the overall structure of member systems, subsystems and components (IoMTStructure), its detailed description (IoMTDescription) and realistic values of input parameters (IoMTParameters) for computing a specific output of interest π_ft. The empty-set initialization is performed at the beginning for the sets of member systems (Ψ), subsystems (Δ) and underlying components (Φ). Three main loops are used to browse every member system, its subsystems and their components, respectively. The first loop runs the index l one by one from 1 to n, where the number of member systems n is computed in advance by the function count with the two input variables IoMTStructure and the tag memSystem. The second loop runs the index i_l one by one from 1 to m_l, where (i) the total number of subsystems m_l in a member system ζ_l is computed by the function count with the two input parameters ζ_l and the tag subSystem; and (ii) the member system ζ_l is extracted in advance by the function extract with the two input parameters IoMTStructure and the tag memSystem. The third loop runs the index j_{l i_l} one by one from 1 to h_{l i_l}, where (i) the total number of components h_{l i_l} in a subsystem ψ^{ζ_l}_{i_l} is computed by the function count with the two input parameters ψ^{ζ_l}_{i_l} and the tag Component, and (ii) the corresponding subsystem ψ^{ζ_l}_{i_l} is extracted from the member system ζ_l by the function extract with the two input parameters ζ_l and the tag subSystem. Within the third loop, we browse every component and obtain the output of interest of its model. A specific component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} is extracted from the subsystem ψ^{ζ_l}_{i_l} by the function extract with the two input parameters ψ^{ζ_l}_{i_l} and the tag Component. The extracted component is then put into the set Δ by the function concat with the two input parameters of the current set Δ and the extracted δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}. As soon as the extracted component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} is obtained, its Markov model φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} is generated accordingly using the function generate with three input variables: the selected component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, its model description in IoMTDescription and the model tag ctmc. The output of interest π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc} of the above-generated model is computed by the function compute with two inputs: the component model φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} and the default input parameters IoMTParameters. The component model and its computed output of interest are then put into the set of components Φ by the function concat. After that, those items in the set Φ are fed to the corresponding element in the set of subsystems Δ which exclusively represents the component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, by the function replace with four input parameters: the current set Δ, the target component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} in Δ, the newly generated Φ and the selected pair of component model φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} and its computed output of interest π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}. When the third loop ends for a certain value i_l of the second-loop index, every single model φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} (j_{l i_l} = 1, ..., h_{l i_l}) in the set Φ of a specific component δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} in the set Δ, which belongs to the subsystem ψ^{ζ_l}_{i_l}, has been browsed and its output of interest π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc} computed, accordingly. Consequently, we can obtain the FT model δ^{ψ^{ζ_l}_{i_l}}_{ft} of the subsystem ψ^{ζ_l}_{i_l} by the function generate with three input parameters: (i) the considered subsystem ψ^{ζ_l}_{i_l} consisting of the above-replaced component models and output values of interest, (ii) the FT description within a subsystem in IoMTDescription, and (iii) the model-type tag ft. The generated subsystem model is used afterwards to compute the output of interest π^{ψ^{ζ_l}_{i_l}}_{ft} for the subsystem ψ^{ζ_l}_{i_l} by the function compute with two inputs: δ^{ψ^{ζ_l}_{i_l}}_{ft} and the input parameters IoMTParameters. The resulting subsystem model and its computed output of interest are put back into the set Δ by the function concat. Moreover, the generated model and computed output of interest of the subsystem ψ^{ζ_l}_{i_l} are forwarded to the corresponding position of the subsystem ψ^{ζ_l}_{i_l} in the upper-level set Ψ by the function replace with four main parameters: (i) the current set Ψ, (ii) the subsystem ψ^{ζ_l}_{i_l} in the set Ψ to be replaced, (iii) the newly constructed set Δ, and (iv) the above-mentioned FT model and its computed output of interest of the subsystem. When the second loop finishes for a certain value l of the first-loop index, all subsystem models in the set Δ which belong to a certain member system ζ_l in the set Ψ and their outputs of interest π^{ψ^{ζ_l}_{i_l}}_{ft} (i_l = 1, ..., m_l) have been browsed and computed, respectively. When all subsystems of a certain member system ζ_l in the set Ψ are replaced with their computed outputs of interest as mentioned above, we can obtain the individual FT model ψ^{ζ_l}_{ft} of the member system ζ_l by the function generate with three input parameters: the member system ζ_l, the structural description of the member system ζ_l in IoMTDescription, and the model-type tag ft. Its output of interest π^{ζ_l}_{ft} is then computed using the function compute with two input parameters: the generated model ψ^{ζ_l}_{ft} of the member system ζ_l and the default input parameters IoMTParameters. These results are subsequently placed at the position of the member system ζ_l in the set Ψ by the function replace with three input parameters: the current set Ψ, the considered member system ζ_l, and the pair of member system model ψ^{ζ_l}_{ft} and its output of interest π^{ζ_l}_{ft}. When the first loop ends, all member systems ζ_l along with their member system models ψ^{ζ_l}_{ft} and outputs of interest π^{ζ_l}_{ft} (l = 1, ..., n) have been browsed, generated and computed, respectively. After all, we can obtain the overall FT model ζ_ft of the IoMT infrastructure by the function generate with three input parameters: the overall structure IoMTStructure of the IoMT infrastructure, the structural description IoMTDescription and the model-type tag ft. At the end, the output of interest π_ft of the overall system is computed by the function compute with two input parameters: the above-generated infrastructure model ζ_ft and the default input parameters of the IoMT infrastructure IoMTParameters. Without loss of generality, the functions/procedures used in the above framework description, including count, extract, concat, generate and replace, are developed depending on the programming language and the data types of parameters and variables, while the function compute is inherited from existing tools for solving stochastic/deterministic models such as the symbolic hierarchical automated reliability and performance evaluator (SHARPE) [?].
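To make the traversal of Algorithm 1 concrete, the following minimal Python sketch (our illustration only, not the authors' SHARPE-based implementation) evaluates a toy infrastructure bottom-up: each component is abstracted as a two-state up/down CTMC whose steady-state availability is MTTF/(MTTF+MTTR), and subsystems, member systems and the overall infrastructure are combined with simple series (all children up) or parallel (at least one child up) fault-tree logic under independence. The nested dictionary IoMT_STRUCTURE and its numerical values are hypothetical placeholders standing in for IoMTStructure/IoMTParameters.

from math import prod

# Hypothetical toy inputs standing in for IoMTStructure / IoMTParameters of Algorithm 1.
# Each component carries (MTTF, MTTR) in hours; 'logic' tells how children are combined.
IoMT_STRUCTURE = {
    "logic": "series",                      # infrastructure fails if any member system fails
    "members": {
        "cloud": {"logic": "series",
                  "subsystems": {"cServer": {"cHW": (8760.0, 24.0), "cSW": (2190.0, 2.0)},
                                 "cGateway": {"cgHW": (17520.0, 24.0), "cgSW": (4380.0, 1.0)}}},
        "fog":   {"logic": "series",
                  "subsystems": {"fNode": {"fHW": (8760.0, 24.0), "fSW": (2190.0, 2.0)}}},
        "edge":  {"logic": "series",
                  "subsystems": {"iNode": {"iGateway": (8760.0, 12.0), "ihSensor": (4380.0, 6.0)}}},
    },
}

def component_ssa(mttf: float, mttr: float) -> float:
    """Bottom level (Phi): SSA of a two-state up/down CTMC, pi_up = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

def combine(availabilities, logic: str) -> float:
    """FT combination under independence: 'series' = all children up, 'parallel' = at least one up."""
    if logic == "series":
        return prod(availabilities)
    if logic == "parallel":
        return 1.0 - prod(1.0 - a for a in availabilities)
    raise ValueError(f"unknown logic: {logic}")

def evaluate(structure) -> float:
    """Top/middle levels (Psi, Delta): browse member systems, subsystems and components bottom-up."""
    member_ssas = []
    for member in structure["members"].values():
        subsystem_ssas = []
        for components in member["subsystems"].values():
            comp_ssas = [component_ssa(mttf, mttr) for mttf, mttr in components.values()]
            subsystem_ssas.append(combine(comp_ssas, "series"))   # subsystem needs all its components
        member_ssas.append(combine(subsystem_ssas, member["logic"]))
    return combine(member_ssas, structure["logic"])               # pi_ft of the whole infrastructure

if __name__ == "__main__":
    ssa = evaluate(IoMT_STRUCTURE)
    print(f"SSA = {ssa:.6f}, downtime = {8760.0 * (1.0 - ssa):.2f} h/year")

Replacing component_ssa with the solution of a richer CTMC (e.g., one that also captures Mandelbug-related failures, aging and cyber-attacks) and combine with the actual gate structures of Section V would recover the behavior intended by Algorithm 1.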
IV. A CLOUD, FOG AND EDGE CONTINUUM BASED IOMT INFRASTRUCTURE FOR HEALTHCARE MONITORING
A typical system design of an IoMT infrastructure for health-
care monitoring purposes is shown in Fig. 2. The physical
architecture design of the IoMT infrastructure is aimed to
provide sufficient information about the overall organization of
member systems, subsystems and underlying components/parts.
In this work, an architecture is proposed to represent the
behaviors of an IoMT infrastructure for e-health monitoring
purposes that relies on IoMT sensors, fog devices in the local
areas and cloud services at a remote distance to process and
store medical data of patients at home or in hospital, and to
provide real-time data analysis and recommendation function-
alities between patients and (local/remote) physicians/medical experts.

Algorithm 1: HIERARCHICAL MODELING AND ANALYSIS FRAMEWORK FOR IOMT INFRASTRUCTURES
Input: IoMTParameters, IoMTStructure, IoMTDescription
Output: π_ft
 1: Ψ ← ∅; Δ ← ∅; Φ ← ∅
 2: n ← count(IoMTStructure, memSystem)
 3: for l ← 1 to n by 1 do
 4:   ζ_l ← extract(IoMTStructure, memSystem)
 5:   m_l ← count(ζ_l, subSystem)
 6:   for i_l ← 1 to m_l by 1 do
 7:     ψ^{ζ_l}_{i_l} ← extract(ζ_l, subSystem)
 8:     h_{l i_l} ← count(ψ^{ζ_l}_{i_l}, Component)
 9:     for j_{l i_l} ← 1 to h_{l i_l} by 1 do
10:       δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}} ← extract(ψ^{ζ_l}_{i_l}, Component)
11:       Δ ← concat(Δ, δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}})
12:       φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}} ← generate(δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, IoMTDescription, ctmc)
13:       π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc} ← compute(φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}, IoMTParameters)
14:       Φ ← concat(Φ, ⟨φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}, π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}⟩)
15:       Δ ← replace(Δ, δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}, Φ, ⟨φ^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}, π^{δ^{ψ^{ζ_l}_{i_l}}_{j_{l i_l}}}_{ctmc}⟩)
16:     δ^{ψ^{ζ_l}_{i_l}}_{ft} ← generate(ψ^{ζ_l}_{i_l}, IoMTDescription, ft)
17:     π^{ψ^{ζ_l}_{i_l}}_{ft} ← compute(δ^{ψ^{ζ_l}_{i_l}}_{ft}, IoMTParameters)
18:     Δ ← concat(Δ, δ^{ψ^{ζ_l}_{i_l}}_{ft}, π^{ψ^{ζ_l}_{i_l}}_{ft})
19:     Ψ ← replace(Ψ, ψ^{ζ_l}_{i_l}, Δ, ⟨δ^{ψ^{ζ_l}_{i_l}}_{ft}, π^{ψ^{ζ_l}_{i_l}}_{ft}⟩)
20:   ψ^{ζ_l}_{ft} ← generate(ζ_l, IoMTDescription, ft)
21:   π^{ζ_l}_{ft} ← compute(ψ^{ζ_l}_{ft}, IoMTParameters)
22:   Ψ ← replace(Ψ, ζ_l, ⟨ψ^{ζ_l}_{ft}, π^{ζ_l}_{ft}⟩)
23: ζ_ft ← generate(IoMTStructure, IoMTDescription, ft)
24: π_ft ← compute(ζ_ft, IoMTParameters)

The IoMT infrastructure's architecture for healthcare
monitoring functionalities is presented in the perspectives of
(i) end-users who are physician and/or medical experts at a
remote distance or physicians and/or patients in local areas
interacting with the IoMT based e-health infrastructure via user-
based customized interface applications without knowing its
underlying operations, (ii) system managers and/or operative
practitioners who often pay more attention to the efficient
design trade-offs relating to the overall infrastructure, and
(iii) physical system developers who specifically consider the
details of the underlying layers of the IoMT infrastructure
from the bottom-most level of physical components/devices
to the top level of the architecture. On the end-user side,
remote physicians or medical experts can access medical data
of patients via a cloud-based web portal/application through
a remote access point, while local medical supervisors and
caregivers can instantly retrieve medical data of patients via a
fog-based web portal/application through a local access point.
It is assumed that security countermeasures are implemented
to diminish cyber-attacks from anonymous and unauthorized
accesses to the underlying infrastructure through the access
points. Due to stringent requirements for the medical physi-
cians/supervisors to constantly access trusted medical data at any time and from any place, the architecture design of the e-health IoMT infrastructure needs to provide the highest possible values of reliability/availability and security measures for data processing tasks and protection mechanisms. Thus, in the
perspective of system managers and operative practitioners, the
e-health IoMT infrastructure in consideration is composed of
three-fold CFE member systems with a specific configuration
as follows.
Cloud member system mainly consists of a specific c
number of cloud servers (cServer) interconnected to each other
and to a cloud storage system of s physical cloud storages
(cStorage) through a network of physical cloud-oriented gate-
ways (cGateway). The physical cloud infrastructure is often lo-
cated remotely in distant and centralized cloud centers equipped
with a persistent internet connection to local member systems.
Cloud services hosted on cloud virtualized servers include
advanced data processing capabilities and superior technologies.
The cloud services provide cloud solutions for data warehouse,
data analytics, and data broadcasting functionalities.
Fog member system is composed of a specific f number of
fog nodes (fNode) which are often geographically distributed
in the local area such as in an individual smart building of
a hospital or a personal house. A fog node in turn consists
of a number m of fog servers (fServer) which are able
to perform computing/data transactions with other fog nodes
and/or with other cloud/edge member systems via a fog gateway
(fGateway). Each fog node is also equipped with a fog storage
(fStorage) acting as a fog repository (local database).
Edge member system mainly encompasses a number n
of IoMT nodes (iNode). Each IoMT node in turn, contains
IoMT ubiquitous sensors including non-body-attached or wear-
able sensors (ihSensor) and ambient sensors (iaSensor) to
constantly collect real-time biomedical and surrounding envi-
ronment data of patients. Those biomedical and context data
represent a source of big data for statistical and epidemi-
ological medical studies [?]. The huge amount of collected
data is transmitted to an IoMT gateway (iGateway) for pre-
processing and further data transactions from/to fog and cloud
member systems. It is supposed that those IoMT sensors transmit and receive signals from/to the IoMT gateway via a single wireless connection protocol (which is called iWireless).
Physical system developers’ perspective focuses on detailed
composition of bottom-most components and devices through-
out the member systems to be broken down as follows.
Cloud server (cServer) is composed of physical hardware (cHW) and software (cSW) components. The physical hardware of a cloud server mainly includes a central processing unit (cCPU), memory banks (cMEM), network interface cards (cNET), a power supply unit (cPWR) and a cooling unit (cCOO). On top of the cloud server's hardware cHW, a cloud virtual machine monitor (VMM) (cVMM) is built in to host a cloud virtual machine (VM) subsystem consisting of a number n_cVM of cloud VMs (cVM). Without loss of generality, it is supposed that the cloud VM subsystem runs an overall number n_cAPP of cloud-based medical applications. Cloud gateway (cGateway) is simply composed of physical hardware (cgHW) and software (cgSW) components dedicated to cloud services. Cloud storage (cStorage) consists of a system of n_cSD cloud storage disks (cSD) and a cloud storage manager (cSM).
Fog server (fServer) comprises a physical hardware component (fHW) (similar to that of a cloud server) and fog software components (fSW). The fog server's hardware also consists of a central processing unit (fCPU), memory banks (fMEM), network interface cards (fNET), a power supply unit (fPWR) and a cooling unit (fCOO). Nevertheless, different from the cloud server's software, the software part of the fog server includes an underlying fog Operating System (OS) (fOS) and its upper fog-based medical applications (fAPP). Fog gateway (fGateway) and fog storage (fStorage) have a similar architecture design compared to the cloud gateway cGateway and cloud storage cStorage. Fog gateway fGateway comprises physical hardware (fgHW) and software (fgSW), while fog storage fStorage is composed of fog storage disks (fSD) with a number n_fSD of disks and a fog storage manager (fSM).
IoMT sensors (ihSensor & iaSensor) are usually embedded boards with an identical architecture containing a battery pack (iBAT), a sensing part (iSEN), an analog-to-digital conversion unit (iADC), a micro-controller unit (iMCU), a tiny memory part (iMEM) and a transceiver (iTRx). IoMT gateway (iGateway) comprises its physical hardware (igHW) and software (igSW) components.
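For illustration only, the component breakdown above could be encoded in the nested structure consumed by an Algorithm 1-style traversal; the Python dictionaries below are a hypothetical encoding of the identifiers listed in this section (counts such as n_cVM, n_cSD and n_fSD are symbolic placeholders), not a data format prescribed by this paper.

# Hypothetical encoding of the described component breakdown (illustrative only).
# Keys mirror the identifiers used in the text; string values such as "n_cVM" are
# symbolic placeholders for the replication counts mentioned in the text.
CLOUD_MEMBER_SYSTEM = {
    "cServer": {
        "cHW": ["cCPU", "cMEM", "cNET", "cPWR", "cCOO"],
        "cSW": {"cVMM": 1, "cVM": "n_cVM", "cAPP": "n_cAPP"},
    },
    "cGateway": ["cgHW", "cgSW"],
    "cStorage": {"cSD": "n_cSD", "cSM": 1},
}
FOG_NODE = {
    "fServer": {
        "fHW": ["fCPU", "fMEM", "fNET", "fPWR", "fCOO"],
        "fSW": ["fOS", "fAPP"],
    },
    "fGateway": ["fgHW", "fgSW"],
    "fStorage": {"fSD": "n_fSD", "fSM": 1},
}
EDGE_IOMT_NODE = {
    "ihSensor": ["iBAT", "iSEN", "iADC", "iMCU", "iMEM", "iTRx"],
    "iaSensor": ["iBAT", "iSEN", "iADC", "iMCU", "iMEM", "iTRx"],
    "iGateway": ["igHW", "igSW"],
    "iWireless": [],
}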
In this study, we consider several assumptions for the sim-
plification of modeling and analysis.
Identical parts: we assume that many parts of the IoMT
infrastructure possess the same design and operative features.
For instance, all fog nodes are identical to each other in terms
of their underlying architecture and operations. Nevertheless, in practice, one may design those parts in a different manner to perform different tasks.
Security countermeasures: it is supposed that cyber-security attacks may target all built-in software in every physical component, each equipped with different types of security countermeasures to mitigate severe attacks. Nevertheless, the details of attack types and their corresponding countermeasures are not taken into account in the modeling, in order to reduce the size of the models while still capturing sufficient operative behaviors for reliability/availability quantification purposes.
Inter-dependencies: dependencies among member systems and/or subsystems within a member system are minimized to reduce the size and complexity of the state-based models, in order to avoid state-space explosion problems of those models at the bottom-most level [?].
Networking: networking in real-world IoMT infrastructures is usually sophisticated in order to achieve efficient operations and high-performance services [?]. In the assessment of reliability/availability for the whole IoMT infrastructure, we inevitably disregard the networking among the gateways and components in order to develop overall non-state-based models of the IoMT infrastructure.
Connectivity: it is assumed that wired connections between/within member systems/subsystems and the internet connection between the physical infrastructure and end-users are secured to provide constant communication and services. Also, wireless communication from/to IoMT sensors and IoMT gateways is simplified to two states of connected and disconnected communication, and we disregard the different types of real-world wireless communication such as Bluetooth, 6LoWPAN or Zigbee [?].
Limited components: in practice, one may expect to involve many IoMT devices/sensors as well as subsystems/components in the member systems; nevertheless, attempting to consider a huge number of items in an infrastructure model and to take into account many factors can cause excessive and unnecessary computation and storage in the analysis and thus obscure significant features of the IoMT infrastructure. Therefore, we consider only limited numbers of components, such as the numbers (n_ihSensor, n_iaSensor) of IoMT sensors in the edge member system and the numbers of fog and edge nodes in the fog and edge member systems, as shown in Table VIII.
In order to explore operative characteristics of the IoMT
infrastructure as well as to comprehend the different factors impacting the availability and security of the IoMT infrastructures, we propose to specifically investigate five case-studies and four operative scenarios as shown in Table I. The case-
studies are aimed at investigating the effects of configuration
Fig. 2: A specific system design of IoMT infrastructure for e-health monitoring (cloud center with cloud servers, gateways and storage; fog nodes with fog servers, gateways and storage in hospitals/households; edge IoMT nodes with health/ambient sensors and IoT gateways; local and remote access points for physicians, medical experts and patients).
alterations, while the operative scenarios are to explore the loss
and gain of measures of interest in different operative situations.
The five case-studies include (i) case I: the proposed IoMT infrastructure (used for comparison); (ii) case II: the IoMT infrastructure with two redundant cloud centers; (iii) case III: the IoMT infrastructure with a double-sized fog member system; (iv) case IV: the IoMT infrastructure with a double-sized edge member system; and (v) case V: the IoMT infrastructure with all of the cloud, fog and edge member systems double-sized.
Four operative scenarios are also analyzed, including (i) scenario I: the proposed default IoMT infrastructure (for comparison purposes); (ii) scenario II: the IoMT infrastructure with only the cloud and edge systems and without the fog system; (iii) scenario III: the IoMT infrastructure with the fog and edge systems and without the cloud system; and (iv) scenario IV: the IoMT infrastructure fails only if both the cloud and fog systems fail or the edge system fails.
TABLE I: CASE-STUDIES AND OPERATIVE SCENARIOS

Case-studies (configuration alternation):
  Case I:    default IoMT infrastructure
  Case II:   with two redundant cloud centers
  Case III:  with a double-sized fog member system
  Case IV:   with a double-sized edge member system
  Case V:    with all cloud, fog and edge member systems double-sized

Scenarios (operational situations):
  Scenario I:                           default IoMT infrastructure
  Scenario II (crashed fog):            with only the cloud and edge member systems and without the fog member system
  Scenario III (crashed cloud):         with the fog and edge member systems and without the cloud member system
  Scenario IV (relaxed configuration):  the IoMT infrastructure fails if both the cloud and fog member systems fail or the edge member system fails
V. HIERARCHICAL MODELS
In this section, we present (i) the FT models of the overall IoMT infrastructure and of its member systems and subsystems, and (ii) the CTMC models of the bottom-most-level components/devices.
A. Top-Level FT models
(i) FT model of default IoMT infrastructure
Fig. 3 shows the top-level system FT model of the IoMT infrastructure. The reliability/availability of the IoMT infrastructure is basically captured by the true/false Boolean values of the events in the developed FT. The IoMT infrastructure is considered to be in a down state if the top event, labeled SystemFailure, is true. The FT model consists of three types of gates: (i) an AND gate generates a TRUE output if all of its inputs are TRUE, (ii) an OR gate generates a TRUE output if any of its inputs is TRUE, and (iii) a KOFN (k-out-of-n) gate generates a TRUE output if k or more of its n inputs are TRUE. The KOFN gate facilitates the evaluation of different combinations of input events. As shown in Fig. 3, the overall top-level FT model comprises a three-level hierarchy: (i) level 0, the system level, consists of FT models that capture the detailed architectures of the CFE member systems; (ii) level 1, the subsystem level, consists of FT models detailing the architecture of the components/devices within each subsystem; and (iii) level 2, the component level, consists of CTMC models that comprehensively capture the operative behaviors of the IoMT infrastructure's components/devices at the bottom-most level. When assessing a specific measure of interest (reliability/availability), the computed output of a CTMC model at the component level is forwarded to the corresponding component/device in the FT model of a specific subsystem at the subsystem level. In turn, the output of the FT model at the subsystem level is forwarded to the corresponding subsystem in the FT model of a specific member system at the system level. At last, the computed outputs of the FT models representing the member systems are the inputs of the SystemFailure event in the top-level FT model of the IoMT infrastructure. One can obtain the output measures of interest based on the computed output of the top-level event SystemFailure. In the default IoMT infrastructure, an OR gate whose output is the top-level event SystemFailure is used to capture a stringent design requirement: a complete failure of any single member system among the three CFE member systems is considered a total failure of the overall IoMT infrastructure. This top-level FT model
is used to evaluate the reliability/availability of the default IoMT infrastructure. One can modify the top-level FT models in accordance with the different architectures of the various case-studies or operative scenarios of the IoMT infrastructure; these top-level FT models are then used to assess the reliability/availability of the IoMT infrastructures in those case-studies and scenarios. It is worth noting that it is feasible to obtain a closed-form analytic solution for the FT models and the bottom-most Markov models. For instance, the closed-form solution for the default IoMT infrastructure can be derived based on the probability computation of unreliability/unavailability for AND, OR and k-out-of-n gates in Eq. (2), as presented in [?].
$$
U :=
\begin{cases}
\displaystyle\prod_{j=1}^{n} U_j(t) & \text{(AND gate with $n$ inputs)}\\[6pt]
\displaystyle 1-\prod_{j=1}^{n}\bigl(1-U_j(t)\bigr) & \text{(OR gate with $n$ inputs)}\\[6pt]
\displaystyle\sum_{j=k}^{n}\binom{n}{j}\,U_j(t)^{j}\bigl(1-U_j(t)\bigr)^{n-j} & \text{($k$-out-of-$n$ gate with $n$ identical inputs)}
\end{cases}
\tag{2}
$$
where U_j(t) is the unreliability/unavailability of input j of the corresponding gate. It is thus possible to infer the respective reliability/availability of input j and of the gate output as R|A_j(t) = 1 − U_j and R|A = 1 − U. In the case of the default IoMT infrastructure, a closed-form expression of the unreliability/unavailability is obtained as in Eq. (3):

$$
U_{IoMT} = U_C \cdot U_F \cdot U_E
\tag{3}
$$

where U_C, U_F and U_E are the unreliability/unavailability measures of the member systems, which are derived in the following sections.
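To make the gate formulas concrete, the following minimal Python sketch evaluates the three expressions of Eq. (2) for statistically independent inputs. It is an illustration only (not the authors' tool chain), and the numeric input values are hypothetical.

```python
# Minimal sketch of the gate formulas in Eq. (2); all input values are hypothetical.
from math import comb, prod

def and_gate(unrel):
    """AND gate: the output event is TRUE only if all n inputs are TRUE."""
    return prod(unrel)

def or_gate(unrel):
    """OR gate: the output event is TRUE if any of the n inputs is TRUE."""
    return 1.0 - prod(1.0 - u for u in unrel)

def k_out_of_n_gate(k, n, u):
    """k-out-of-n gate with n identical inputs of unreliability/unavailability u."""
    return sum(comb(n, j) * u**j * (1.0 - u)**(n - j) for j in range(k, n + 1))

# Usage: a 2-out-of-3 gate with identical inputs, and OR/AND combinations
# of three heterogeneous inputs.
print(k_out_of_n_gate(2, 3, 0.05))   # ~0.00725
print(or_gate([0.01, 0.02, 0.03]))   # 1 - 0.99*0.98*0.97
print(and_gate([0.01, 0.02, 0.03]))  # 6e-6
```

The member-system measures derived in the following sections can then be combined at the top level in the same spirit as Eq. (3) and Eqs. (4)-(5).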
(ii) FT models of case-studies and scenarios: In order to assess the reliability/availability of the different case-studies (regarding modifications of the system configuration) and scenarios (regarding operative circumstances at run-time), it is necessary to develop comprehensive overall hierarchical models, from the top-level FT models of the IoMT infrastructures down to the bottom-most-level CTMC models of the components/devices. Fig. 4 shows the top-level FT models of the IoMT infrastructures with the different configurations of the case-studies, and Fig. 5 presents the overall FT models of the IoMT infrastructures in the selected scenarios. Based on these FT models, we can derive the unreliability/unavailability of the IoMT infrastructures for the case-studies as shown in Eq. (4), under the following assumptions.
Case I: the default IoMT infrastructure, as shown in Fig. 4a.
Case II: two identical clouds, cloud A (C_A) and cloud B (C_B), of the cloud member system are redundant. Thus, an AND gate is used to model this assumption, as shown in Fig. 4b.
Case III: two identical fogs, fog A (F_A) and fog B (F_B), of the fog member system are redundant. Thus, an AND gate is used to model this assumption, as shown in Fig. 4c.
Case IV: two identical edges, edge A (E_A) and edge B (E_B), of the edge member system are not redundant. The failure of any edge is considered to cause the complete failure of the overall IoMT infrastructure. Thus, an OR gate is used to model this assumption, as shown in Fig. 4d.
Case V: all member systems are identically double-sized as in cases II, III and IV. Thus, AND gates are used to capture the redundancy of the double-sized cloud and fog member systems, while an OR gate is used for the double-sized edge member system, as shown in Fig. 4e.
$$
\begin{aligned}
&\text{Case I (default, } C\text{-}F\text{-}E\text{):} && U_{IoMT} = U_C \cdot U_F \cdot U_E && \text{(4a)}\\
&\text{Case II } ((C_A\,C_B)\text{-}F\text{-}E)\text{:} && U_{IoMT} = (U_{C_A} + U_{C_B}) \cdot U_F \cdot U_E && \text{(4b)}\\
&\text{Case III } (C\text{-}(F_A\,F_B)\text{-}E)\text{:} && U_{IoMT} = U_C \cdot (U_{F_A} + U_{F_B}) \cdot U_E && \text{(4c)}\\
&\text{Case IV } (C\text{-}F\text{-}(E_A\,E_B))\text{:} && U_{IoMT} = U_C \cdot U_F \cdot (U_{E_A} \cdot U_{E_B}) && \text{(4d)}\\
&\text{Case V } ((C_A\,C_B)\text{-}(F_A\,F_B)\text{-}(E_A\,E_B))\text{:} && U_{IoMT} = (U_{C_A} + U_{C_B}) \cdot (U_{F_A} + U_{F_B}) \cdot (U_{E_A} + U_{E_B}) && \text{(4e)}
\end{aligned}
$$
Accordingly, the unreliability/unavailability of the IoMT infrastructure in the case-studies can be derived as in Eq. (4). The corresponding reliability/availability is computed using the formula R|A = 1 − U, noting that the same measure is computed from the bottom-most-level CTMC models up to the top-level FT models.
For the operative scenarios, which consider different run-time situations or a relaxation of the requirements of the same default architecture of the considered IoMT infrastructure, the corresponding top-level FT models are developed as shown in Fig. 5. Based on these figures, it is possible to derive the unreliability/unavailability of the IoMT infrastructures for those scenarios as shown in Eq. (5), under the following assumptions.
Scenario I: default IoMT infrastructure, as shown in
Fig. 5a.
Scenario II: fog member system crashes, as shown in
Fig. 5b. Thus, the FT model of the fog member system is re-
moved from the top-level FT model of the IoMT infrastructure.
Scenario III: cloud member system crashes, as shown in
Fig. 5c. Thus, the FT model of the cloud member system is re-
moved from the top-level FT model of the IoMT infrastructure.
Scenario IV: the strict requirement on the IoMT infrastructure's configuration is relaxed so that the whole IoMT infrastructure is considered to be in total failure only if both back-end computing systems, the physical cloud and the fog, crash. Thus, an AND gate is used with the inputs of the cloud and fog FT models, as shown in Fig. 5d.
The expressions for the computation of the unreliability/unavailability of the IoMT infrastructure in the scenarios are obtained as in Eq. (5), respectively.
$$
\begin{aligned}
&\text{Scenario I (default, } C\text{-}F\text{-}E\text{):} && U_{IoMT} = U_C \cdot U_F \cdot U_E && \text{(5a)}\\
&\text{Scenario II } (C\text{-}E)\text{:} && U_{IoMT} = U_C \cdot U_E && \text{(5b)}\\
&\text{Scenario III } (F\text{-}E)\text{:} && U_{IoMT} = U_F \cdot U_E && \text{(5c)}\\
&\text{Scenario IV } ((C\,F)\text{-}E)\text{:} && U_{IoMT} = (U_C + U_F) \cdot U_E && \text{(5d)}
\end{aligned}
$$
In general, the tool SHARPE can be used to produce the closed-form expression for the overall hierarchical model of the IoMT infrastructure [?]. However, the size of the generated expression is enormous, which poses a storage challenge during computation. Hence, we use numerical solutions for all proposed models. The Markov models at the bottom-most component level are all solved numerically and separately. Their outputs are fed into the upper-level FT models of the subsystems, which are in turn solved numerically. Lastly, the computed results of the intermediate FT models of the subsystems are fed into the top-level FT model of the IoMT infrastructure to obtain the output measure of interest for the assessment of the IoMT infrastructure. Numerical methods for Markov models are presented comprehensively in [?], and those for FT models in [?]. SHARPE automates all of these solution steps, from solving the bottom-most Markov models to composing the sub-models.
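The bottom-up workflow described above can be sketched numerically as follows. This is not the SHARPE model of the paper; the two-state component CTMC, the 2-out-of-4 subsystem gate, the three-subsystem OR-type top event and all rate values are hypothetical placeholders used only to illustrate how CTMC outputs feed the FT levels.

```python
# Sketch of the bottom-up numerical composition (hypothetical structure and rates).
import numpy as np
from math import comb

def ctmc_steady_state(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1 for an infinitesimal generator Q."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])        # append the normalization condition
    b = np.zeros(n + 1); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Level 2: a two-state (UP/DOWN) component with failure rate lam, repair rate mu.
lam, mu = 1 / 2000.0, 1 / 8.0               # hypothetical MTTF = 2000 h, MTTR = 8 h
Q = np.array([[-lam, lam],
              [  mu, -mu]])
U_component = ctmc_steady_state(Q)[1]       # steady-state unavailability (DOWN)

# Level 1: subsystem is down if at least k of n identical components are down.
k, n = 2, 4
U_subsystem = sum(comb(n, j) * U_component**j * (1 - U_component)**(n - j)
                  for j in range(k, n + 1))

# Level 0: OR-type top event over three (here identical, hypothetical) subsystems,
# evaluated with the OR-gate formula of Eq. (2).
U_top = 1 - (1 - U_subsystem)**3
print(U_component, U_subsystem, U_top)
```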
[Fig. 3: System Fault Tree Model of the IoMT infrastructure (Level 0: system; Level 1: subsystems; Level 2: components)]
B. FT models of member systems
In this subsection, the FT models of three CFE member
systems as shown in Fig. 6 are developed and described in
a comprehensive manner. The corresponding expressions for
computation of unreliability/unavailability are also provided.
(i) FT model of cloud member system is depicted in Fig. 6a. A stringent design requirement is imposed on the cloud member system: a crash of any one of its three main parts (cloud servers, cloud gateways, and cloud storages) causes a total failure of the cloud member system. For this reason, an OR gate is used to model this requirement. It is assumed that there are initially c physical servers in the cloud.

[Fig. 4: Top-level FT models of case-studies: (a) Case I: default IoMT infrastructure; (b) Case II: double-sized cloud; (c) Case III: double-sized fog; (d) Case IV: double-sized edge; (e) Case V: double-sized member systems]

Redundancy is well implemented in the cloud, so that the cloud is considered to be in an outage period of providing sufficient computing power only if the number of crashed servers is greater than or equal to a given integer number c1. To capture this, a c1-out-of-c gate is used to model the system of cloud servers, with the inputs of the c sub-models of the corresponding c cloud servers. Furthermore, under the assumption of securing connectivity among the cloud servers and cloud storages, the cloud is required to avoid the case in which the number of crashed cloud gateways is greater than or equal to a specific number g1 among the total number g of running cloud gateways, regardless of the networking topology. For this reason, a g1-out-of-g gate is also used, with the inputs of the sub-models representing the g cloud gateways. Under similar assumptions, another requirement to secure continuous online services in the cloud is to maintain sufficient storage space at all times. An s1-out-of-s gate is thus used to model this requirement: if s1 or more cloud storages among the total s cloud storages hosting cloud services fail at a time, the cloud storage system is considered to reside in a downtime period of not providing sufficient cloud storage space for hosting cloud services. The models of the cloud server, gateway and storage subsystems are presented in the following sections.
The cloud unreliability/unavailability is computed using the closed-form expression in Eq. (6).

$$
\begin{aligned}
U_C &= U_{cServers} \times U_{cGateways} \times U_{cStorages}\\
    &= \sum_{\tau=c_1}^{c}\binom{c}{\tau}\,U_{cServer}^{\,\tau}\bigl(1-U_{cServer}\bigr)^{c-\tau}
       \times \sum_{\tau=g_1}^{g}\binom{g}{\tau}\,U_{cGateway}^{\,\tau}\bigl(1-U_{cGateway}\bigr)^{g-\tau}
       \times \sum_{\tau=s_1}^{s}\binom{s}{\tau}\,U_{cStorage}^{\,\tau}\bigl(1-U_{cStorage}\bigr)^{s-\tau}
\end{aligned}
\tag{6}
$$

where U_C is the unreliability/unavailability of the cloud member system; U_cServers, U_cGateways and U_cStorages are the output measures of the c1-out-of-c, g1-out-of-g and s1-out-of-s gates representing the systems of cloud servers, cloud gateways and cloud storages, respectively; and U_cServer, U_cGateway and U_cStorage are the unreliability/unavailability measures of the cloud server, cloud gateway and cloud storage subsystems, respectively. These measures are computed and fed by the FT sub-models of the subsystems.
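As an illustration of Eq. (6), the sketch below composes the cloud member system from its three k-out-of-n parts. The thresholds (c1, g1, s1), the pool sizes (c, g, s) and the per-subsystem unavailabilities are made-up numbers; in the paper these inputs come from the subsystem FT models described next.

```python
# Sketch of Eq. (6): cloud member system composed of three k-out-of-n parts.
from math import comb

def k_of_n(k, n, u):
    """Probability that at least k of n identical items are down (Eq. (2))."""
    return sum(comb(n, j) * u**j * (1 - u)**(n - j) for j in range(k, n + 1))

# Illustrative inputs only; the paper's values come from the subsystem models.
c, c1, U_cServer  = 10, 4, 0.02    # c servers, outage if >= c1 have crashed
g, g1, U_cGateway = 4,  2, 0.01    # g gateways, outage if >= g1 have crashed
s, s1, U_cStorage = 6,  3, 0.015   # s storages, outage if >= s1 have crashed

U_C = (k_of_n(c1, c, U_cServer)
       * k_of_n(g1, g, U_cGateway)
       * k_of_n(s1, s, U_cStorage))  # composition as written in Eq. (6)
print(U_C)
```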
(ii) FT model of fog member system is shown in Fig. 6b. It is assumed that the architectures of the fog nodes are identical and that service-redundancy techniques are employed in the fog member system in order to secure the number of fog nodes running at a time and thus provide sufficient computing resources. In this regard, the stringent requirement is that not more than a specific number f1 of crashed fog nodes out of the total f fog nodes exist at a time in the fog member system. Thus, an f1-out-of-f gate is used to capture this service requirement. The inputs of this f1-out-of-f gate are the outputs of the models representing the fog nodes.

[Fig. 5: Top-level FT models of scenarios: (a) Scenario I: default IoMT infrastructure; (b) Scenario II: crashed fog; (c) Scenario III: crashed cloud; (d) Scenario IV: relaxed configuration]

Since the fog nodes are
assumed to have an identical architecture, we describe the model of a fog node in general. As seen in Fig. 2, the total down state of a fog node is caused by a complete failure of either the fog servers, the fog gateway or the fog storage. This requirement is modeled by an OR gate with the inputs of the models of the fog servers, fog gateway and fog storage. In turn, the strict requirement for hosting fog computing services is that not more than a specific number m1 of crashed fog servers exist among the total m fog servers. The models of the fog servers, fog gateway and fog
storage are fed by the subsystem models as presented in the
following sections.
The unreliability/unavailability measures of the fog member
system are computed using a closed-form expression as shown
in Eq. (7).
$$
U_F = \sum_{\tau=f_1}^{f}\binom{f}{\tau}\,U_{fNode}^{\,\tau}\bigl(1-U_{fNode}\bigr)^{f-\tau}
\tag{7}
$$

where U_{fNode_τ} is the output measure of the gate Fog Node τ. In general, U_fNode of a given gate Fog Node is computed using the closed-form expression in Eq. (8).

$$
U_{fNode} = U_{fServers} \times U_{fGateway} \times U_{fStorage}
\tag{8}
$$

where U_fServers is the output measure of the gate Fog Servers, representing a cluster of fog servers within a fog node, and U_fGateway and U_fStorage are the output measures of the FT sub-models (shown in Fig. 7f and Fig. 7g) representing the fog gateway and fog storage subsystems, respectively. Furthermore, U_fServers is obtained as in Eq. (9).

$$
U_{fServers} = \sum_{\tau=m_1}^{m}\binom{m}{\tau}\,U_{fServer}^{\,\tau}\bigl(1-U_{fServer}\bigr)^{m-\tau}
\tag{9}
$$

where U_fServer is the output measure of a single fog server, which is obtained by analyzing the model depicted in Fig. 7b.
(iii) FT model of edge member system is presented in Fig. 6c.
In the edge member system, a stringent design requirement for e-health monitoring purposes is that all IoMT nodes must be in a healthy state for the edge member system to be considered in an UP state. If any single one of the n IoMT nodes fails, the whole edge member system is considered to be in a system failure. Therefore, an OR gate is used as the main gate, with the inputs of the models representing the IoMT nodes, to take this requirement into account in the modeling. Furthermore, it is also assumed that the designed architectures of the IoMT nodes are identical to each other; thus, the description of the modeling of an IoMT node is given for the general case. A complete failure of an IoMT node is caused by one, or a combination, of the failures of (i) the IoMT health sensors, (ii) the IoMT ambient sensors, (iii) the IoMT wireless communication and (iv) the IoMT gateway. To model this, an OR gate is used with the inputs of the models representing those subsystems. The wireless connectivity within an IoMT node is modeled by the Markov model shown in Fig. 10d, and the model of the IoMT gateway is depicted in Fig. 7h. For the IoMT health and ambient sensors, some of the sensors are allowed to fail at once, and only if a specific number of sensors or more, particularly i1 out of i IoMT health sensors and j1 out of j IoMT ambient sensors, are all in an outage at a time is the corresponding cluster of sensors considered to be in a down state. Thus, an i1-out-of-i gate and a j1-out-of-j gate are used to model the clusters of IoMT health and ambient sensors in an IoMT node. The model of a single IoMT sensor (both health and ambient sensors) is depicted in Fig. 7c. The unreliability/unavailability of the edge member system is then obtained as in Eq. (10).
$$
U_E = 1 - \prod_{\tau=1}^{n}\bigl(1 - U_{iNode_\tau}\bigr)
\tag{10}
$$

where U_{iNode_τ} is the output measure of the gate IoMT Node τ. In general, the output measure U_iNode of a corresponding gate IoMT Node is obtained as in Eq. (11).

$$
U_{iNode} = U_{ihSensors} \times U_{iWireless} \times U_{iGateway} \times U_{iaSensors}
\tag{11}
$$

where U_ihSensors and U_iaSensors are the output measures of the gates IoMT Health Sensors and IoMT Ambient Sensors, respectively, and U_iWireless and U_iGateway are the output measures of the sub-models representing the IoMT wireless connectivity and the IoMT gateway in an IoMT node, as depicted in Fig. 10d and Fig. 7h, respectively. It is also assumed that the clusters of IoMT health and ambient sensors are not able to accomplish their sensing and monitoring tasks if the number of crashed sensors is greater than or equal to a specific integer. Thus, the output measures U_ihSensors and U_iaSensors of the corresponding i1-out-of-i IoMT Health Sensors gate and j1-out-of-j IoMT Ambient Sensors gate are obtained as in Eq. (12) and Eq. (13).

$$
U_{ihSensors} = \sum_{\tau=i_1}^{i}\binom{i}{\tau}\,U_{ihSensor}^{\,\tau}\bigl(1-U_{ihSensor}\bigr)^{i-\tau}
\tag{12}
$$

$$
U_{iaSensors} = \sum_{\tau=j_1}^{j}\binom{j}{\tau}\,U_{iaSensor}^{\,\tau}\bigl(1-U_{iaSensor}\bigr)^{j-\tau}
\tag{13}
$$

where U_ihSensor and U_iaSensor are the output measures of the FT sub-models shown in Fig. 7c representing the IoMT health and ambient sensors.
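Analogously, Eqs. (10)-(13) can be evaluated directly once the sensor-, wireless- and gateway-level measures are available. The following sketch uses invented values for these inputs and for the cluster sizes/thresholds, purely to show the two-level composition from sensors to nodes to the edge member system.

```python
# Sketch of Eqs. (10)-(13): edge member system from sensor- and node-level inputs.
from math import comb

def k_of_n(k, n, u):
    return sum(comb(n, j) * u**j * (1 - u)**(n - j) for j in range(k, n + 1))

# Illustrative inputs; in the paper they are produced by the lower-level models.
i, i1, U_ihSensor = 4, 2, 0.03      # health sensors per node and their threshold
j, j1, U_iaSensor = 3, 2, 0.02      # ambient sensors per node and their threshold
U_iWireless, U_iGateway = 0.005, 0.01
n_nodes = 5                          # identical IoMT nodes in the edge system

U_ihSensors = k_of_n(i1, i, U_ihSensor)                           # Eq. (12)
U_iaSensors = k_of_n(j1, j, U_iaSensor)                           # Eq. (13)
U_iNode = U_ihSensors * U_iWireless * U_iGateway * U_iaSensors    # Eq. (11)
U_E = 1 - (1 - U_iNode) ** n_nodes                                # Eq. (10)
print(U_E)
```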
[Fig. 6: FT models of member systems: (a) Cloud; (b) Fog; (c) Edge]
C. FT models of subsystems
In this subsection, the FT models of subsystems in CFE
member systems are presented as in Fig. 7.
Cloud's subsystems: Fig. 7a shows the FT model of a cloud server. A cloud server experiences a service outage if either its hardware (cHW) or its software (cSW) fails; thus, an OR gate is used to connect the FT models of cHW and cSW. In turn, the cloud server's hardware/software enters a downtime period if any of its underlying components crashes, and two OR gates are used to model this behavior. The failures of a cloud server's hardware are those of the CPU (cCPU), memory banks (cMEM), network interface card (cNET), power supply unit (cPWR) and cooler part (cCOO) of the cloud server. The failures of a cloud server's software are due to the failures of the VMM (cVMM), the VM (cVM) and the cloud applications/services (cAPP), under an assumption of independence among those software components. In addition, Fig. 7d and Fig. 7e depict the FT models of the cloud gateway and cloud storage subsystems, respectively. The investigation of cloud gateway failures is simplified to the failures of the cloud gateway's hardware (cgHW) and software (cgSW) components; thus, an OR gate is used to model the hardware/software failures within a cloud gateway. On the other hand, the cloud storage crashes if either its storage disks or its storage management software fails to operate properly.
Based on the modeled FTs of the cloud's subsystems, analytical expressions are formed to compute the unreliability/unavailability-based metrics of interest, as in Eq. (14), Eq. (15) and Eq. (16).

$$
\begin{aligned}
U_{cServer} &= U_{cHW} \times U_{cSW}\\
&= U_{cCPU} \times U_{cMEM} \times U_{cNET} \times U_{cPWR} \times U_{cCOO} \times U_{cVMM} \times U_{cVM} \times U_{cAPP}
\end{aligned}
\tag{14}
$$

$$
U_{cGateway} = U_{cgHW} \times U_{cgSW}
\tag{15}
$$

$$
U_{cStorage} = U_{cSD} \times U_{cSM}
\tag{16}
$$
Fog's subsystems: Fig. 7b depicts the FT model of a fog server. Similar to the above description of a cloud server's FT model, the failures of a fog server are also caused by failures of its hardware/software components, and it is assumed that the hardware architecture of a fog server is similar to that of a cloud server. The failures of a fog server's hardware are due to the failure of any of its underlying hardware components, including the CPU (fCPU), memory banks (fMEM), network interface card (fNET), power supply unit (fPWR) and cooler part (fCOO) of the fog server. Meanwhile, the fog server's software can fail due to a failure of its OS (fOS) or of its fog services/applications (fAPP), which occur independently. Fig. 7f and Fig. 7g show the FT models of the gateway and storage subsystems in the fog member system. The assumption on these subsystems is that the fog gateway and fog storage consist of components similar to those of the cloud gateway and cloud storage detailed above. Thus, the two subsystems are modeled using an OR gate with the inputs of the underlying component CTMCs. The FT model of the fog gateway has the input CTMC models fgHW and fgSW representing the fog gateway's hardware and software components, respectively. The input CTMC models of the fog storage's FT model are fSD and fSM, representing the fog storage disks and fog storage management components, respectively.
The unreliability/unavailability of the fog's subsystems are computed using the closed-form expressions shown in Eq. (17), Eq. (18) and Eq. (19).

$$
\begin{aligned}
U_{fServer} &= U_{fHW} \times U_{fSW}\\
&= U_{fCPU} \times U_{fMEM} \times U_{fNET} \times U_{fPWR} \times U_{fCOO} \times U_{fOS} \times U_{fAPP}
\end{aligned}
\tag{17}
$$

$$
U_{fGateway} = U_{fgHW} \times U_{fgSW}
\tag{18}
$$

$$
U_{fStorage} = U_{fSD} \times U_{fSM}
\tag{19}
$$
Edge's subsystems: Fig. 7c shows the FT model of an IoMT sensor in the edge member system. It is assumed that the two types of IoMT sensors (IoMT health sensors and IoMT ambient sensors) have the same architectural design, which mainly consists of the hardware (iHW) and software (iSW) components of the IoMT sensor. In addition, the failures of an IoMT sensor are considered to be caused by a single failure of either iHW or iSW. Therefore, an OR gate is used with the inputs of the FT models representing the failures of iHW and iSW. Based on the FT models of iHW and iSW, failures of the IoMT sensor's hardware (iHW) encompass failures in at least one of the underlying components: the IoMT sensor's battery (iBAT), sensing part (iSEN), analog-to-digital converter (iADC), micro-controller unit (iMCU), memory (iMEM) and transceiver (iTRx); while failures of either the IoMT sensor's embedded operating system (iOS) or its sensing application (iAPP) cause the failure of the IoMT sensor's software (iSW). The IoMT gateway is modeled using the FT model shown in Fig. 7h. This model also uses an OR gate to express that a failure of the IoMT gateway is caused by failures in either the IoMT gateway's hardware (igHW) or software (igSW) components. These components are modeled using the CTMCs in Fig. 11a and Fig. 11b, respectively.
The closed-form expressions in Eq. (20) and Eq. (21) are used to compute the unreliability/unavailability of the IoMT sensor and the IoMT gateway, respectively. It is worth pointing out that the unreliability/unavailability of the IoMT health sensor (U_ihSensor) and of the IoMT ambient sensor (U_iaSensor) in Eq. (12) and Eq. (13) are computed using the same formula, Eq. (20), with different input values from the lower-level CTMC models when computing U_ihSensor or U_iaSensor.

$$
\begin{aligned}
U_{iSensor} &= U_{iHW} \times U_{iSW}\\
&= U_{iBAT} \times U_{iSEN} \times U_{iADC} \times U_{iMCU} \times U_{iMEM} \times U_{iTRx} \times U_{iOS} \times U_{iAPP}
\end{aligned}
\tag{20}
$$

$$
U_{iGateway} = U_{igHW} \times U_{igSW}
\tag{21}
$$
D. CTMC models of components/devices
The modeling of components/devices in cloud, fog and edge
member systems is carried out and described in the following.
Because the modeling of several components/devices is mostly identical in terms of operative behavior, the same models can be used for similar items with different values of the input parameters. For brevity, such models are combined in the same figures and described using common notations.
[Fig. 7: FT models of Subsystems: (a) cServers; (b) fServers; (c) iSensors; (d) cGateway; (e) cStorage; (f) fGateway; (g) fStorage; (h) iGateway]

(i) cCPU/fCPU: The CTMC models of the Central Processing Unit (CPU) components in the cloud and fog servers are presented in Fig. 8a. It is assumed that the servers in the cloud and fog member systems have the same CPU architecture consisting of a number of Processing Elements (PEs) for parallel processing [?]. Initially, a server operates with n_xCPU PEs in the normal state N_{n_xCPU}, where xCPU denotes the CPU component of a server in either the cloud or the fog member system and can be replaced by cCPU or fCPU, respectively. As soon as a PE fails due to a hardware error with the xCPU failure rate λ_xCPU, the state of the xCPU component transits from the initial state N_{n_xCPU} to the state N_{n_xCPU−1} with the rate n_xCPU·λ_xCPU. In this state, if another running PE suffers a hardware failure, which means that there are two failed PEs at the same time, the state transition of the xCPU continues from N_{n_xCPU−1} to N_{n_xCPU−2} with the rate (n_xCPU−1)·λ_xCPU. This goes on whenever a hardware failure occurs on one of the j_xCPU running PEs at the considered time, leading to a state transition from the state N_{j_xCPU} to the state N_{j_xCPU−1} with the rate j_xCPU·λ_xCPU. When the number of running PEs of the xCPU drops to m_xCPU or below, the computing power of the corresponding server is not guaranteed to be sufficient to hold the currently running computing services, which is therefore likely to lead to a service malfunction/crash; these states are thus considered down states of the xCPU with respect to providing computing services. Although the sufficiency of the xCPU's computing power is no longer met, the server keeps running under a service-downtime notification while awaiting maintenance. As soon as a repair person comes to diagnose the malfunctioning server, a server recovery process is performed to replace the xCPU immediately, leading to a state transition of the xCPU component from the down state N_{m_xCPU} to the normal state N_{n_xCPU}. Without a proper recovery process, the server keeps running with its xCPU component in malfunction. When only the last PE remains running in the state N_1, the xCPU can transit to the complete failure state F with the rate λ_xCPU, causing a complete crash of the corresponding server. A repair person is summoned to recover the completely failed server, and the state of the xCPU component is then transited from the complete failure state F back to the initial working state N_{n_xCPU} with n_xCPU healthy PEs.
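A compact numerical sketch of the xCPU chain described above is given below. It builds the generator matrix of the birth-death process with the two recovery transitions explicitly mentioned in the text (from the down-threshold state N_m and from F); the rate values and the repair-rate names mu_m and mu_f are assumptions of this sketch, not parameters taken from the paper.

```python
# Sketch of the xCPU CTMC: states N_n, ..., N_1, F; failure rate j*lam from N_j,
# repair only from the down-threshold state N_m and from F (as described above).
# All numeric values and the repair-rate names are illustrative assumptions.
import numpy as np

n, m = 8, 2                        # n PEs initially; down once <= m PEs remain
lam = 1 / 10000.0                  # per-PE failure rate (hypothetical)
mu_m, mu_f = 1 / 24.0, 1 / 48.0    # repair rates from N_m and from F (hypothetical)

states = [f"N{j}" for j in range(n, 0, -1)] + ["F"]
idx = {s: i for i, s in enumerate(states)}
Q = np.zeros((len(states), len(states)))

for j in range(n, 0, -1):                      # failures: N_j -> N_{j-1} (N_0 == F)
    dst = f"N{j-1}" if j > 1 else "F"
    Q[idx[f"N{j}"], idx[dst]] += j * lam
Q[idx[f"N{m}"], idx[f"N{n}"]] += mu_m          # recovery at the down threshold
Q[idx["F"], idx[f"N{n}"]] += mu_f              # recovery after a complete failure
Q -= np.diag(Q.sum(axis=1))                    # diagonal = -(row sum)

A = np.vstack([Q.T, np.ones(len(states))])
b = np.zeros(len(states) + 1); b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Up states are those with more than m running PEs.
availability = sum(pi[idx[f"N{j}"]] for j in range(m + 1, n + 1))
print(1 - availability)                        # steady-state unavailability
```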
(ii) cMEM/fMEM: The memory components of the cloud (cMEM) and fog (fMEM) servers are modeled by the common CTMC in Fig. 8b; the notation xMEM denotes both cMEM and fMEM. It is assumed that the initial number of memory banks is n_xMEM and that the initial state of the memory component xMEM is N_{n_xMEM}. In this state, one of the healthy memory banks may experience unrecoverable multi-bit errors, causing a state transition from N_{n_xMEM} to N_{n_xMEM−1} (i.e., the number of currently running DIMMs is reduced accordingly) with the rate n_xMEM·λ_xMEM. In this state, if another memory bank undergoes a hardware error, the state of the memory component xMEM transits to N_{n_xMEM−2} with the transition rate (n_xMEM−1)·λ_xMEM. This goes on whenever a hardware failure occurs on a memory bank in the state N_{j_xMEM} (i.e., there are j_xMEM running DIMMs at the considered time), causing a state transition of the memory component xMEM to the state N_{j_xMEM−1} with the transition rate j_xMEM·λ_xMEM. As soon as only the last DIMM remains in the state N_1, its failure causes a complete service failure of the memory component xMEM in the state F, under the supposition that a single memory bank's capacity can provide a sufficient amount of memory for the computing services. When a repair person is summoned to diagnose and replace the flawed DIMMs one after another in the state N_{j_xMEM}, the repair transition rate is (n_xMEM − j_xMEM)·µ_xMEM; thus, when the xMEM is in the complete failure state F, the repair transition rate is n_xMEM·µ_xMEM. When the xMEM resides in the state N_{j_xMEM}, one of the running DIMMs may encounter performance-degraded faults due to data corruption, represented by the transition with the rate σ_xMEMd from the state N_{j_xMEM} to the state D_{j_xMEM}. Correctable errors of the xMEM can be corrected using error-correcting memory controllers or by hardware enabling/disabling measures, which returns the degraded DIMM from D_{j_xMEM} to the normal state N_{j_xMEM}. If the error is not correctable, the degraded DIMM experiences a fatal failure within an uncertain period of time and the state of the xMEM transits to N_{j_xMEM−1} with the rate λ_xMEMdf. The unrecoverable DIMMs are often deconfigured by the server BIOS, and the crashed DIMMs are afterwards diagnosed and replaced by a repair person while the server utilises the remaining healthy memory banks.
(iii) cNET/fNET, cPWR/fPWR, cCOO/fCOO: Fig. 8c depicts a common CTMC model for the cloud/fog server components comprising (i) the cloud (cNET) or fog (fNET) Network Interface Controllers (NICs), (ii) the cloud (cPWR) or fog (fPWR) Power Supply Units (PSUs), and (iii) the cloud (cCOO) or fog (fCOO) Cooling Units (COOs). Without loss of generality, we denote any one of these components as X. Its initial state is the normal state N_{n_X}, which means that the component consists of n_X independent units, and it is assumed that an individual unit can be disabled or enabled for maintenance. In the initial normal state N_{n_X}, a unit may experience a hardware fault causing its complete failure and reconfiguration. As a consequence, the component's operative state transits from N_{n_X} to N_{n_X−1} with the rate n_X·λ_X. If another unit fails, the state transition is from N_{n_X−1} to N_{n_X−2} with the rate (n_X−1)·λ_X. This process continues in accordance with the failures and the number of running units at a time: if a failure occurs on a unit when the component is in the state N_{j_X}, its state transits to N_{j_X−1} with the rate j_X·λ_X. When only the last unit is running in the state N_1, its failure leads to a state transition from N_1 to the complete failure state F of the component with the rate λ_X. In the worst case, when all units of the component reside in the complete failure state F, a repair person is summoned to replace and recover the completely failed component to its initial normal state N_{n_X} with a mean time of 1/µ_X. In other cases, when the component is in the state N_{j_X}, meaning that j_X units of the component are working normally and n_X − j_X units are in a complete malfunction state, a repair person is casually summoned to perform a detection of the faulty units with a mean time of 1/σ_Xd. The component's operative state is transited from N_{j_X} to the repair state R_{j_X}, with the remaining healthy units providing continuous operation to the server. The repair, replacement and reconfiguration of the faulty units consume a mean time of (n_X − j_X)/µ_Xd, where 1/µ_Xd is the mean time of the recovery process of a single unit.
(iv) cVMM/fOS: The CTMC models of the cloud server's VMM (cVMM) and the fog server's OS (fOS) are presented together in Fig. 8d. One can replace X by cVMM or fOS to obtain the model of the cloud server's VMM or the fog server's OS, respectively. Without loss of generality, we assume that these two bottom-most software layers have similar operative behaviors and states, since a cloud server runs a bare VMM to host the upper VMs for higher availability, whereas a fog server runs an OS to directly host specific upper applications for higher performance. The failure modes in consideration include (i) an uncertain error causing a software failure [?], (ii) a failure due to a cyber-security attack [?] and (iii) a performance-degraded failure [?]. It is assumed that the cVMM or fOS initially resides in the normal state N. Uncertain failures occur regularly with a mean time of 1/λ_Xf; this behavior is represented by the state transition from the state N to the state F. Performance degradation causes the cVMM or fOS to encounter failure-probable issues in the performance-degraded state D; the mean time of this degradation is assumed to be 1/λ_Xd. In this state, software rejuvenation policies are carried out to remove the performance-degraded faults [?]. The completion likelihood of this process is captured by the coverage factor c_Xdrej, while its mean time is assumed to be 1/δ_Xdrej; as a result, the rate of the state transition from the performance-degraded state D back to the normal state N is c_Xdrej·δ_Xdrej. On the other hand, an unsuccessful software rejuvenation of the cVMM or fOS eventually causes a complete crash, represented by the state transition from the performance-degraded state D to the complete failure state F with the rate (1 − c_Xdrej)·δ_Xdrej. As soon as the IoMT infrastructure's cyber-security countermeasures fail to defend the multi-layer software stack against external security intrusions, the cVMM and fOS software may suffer ceaseless attacks causing a period of software service outage, represented by the state A. The cyber-security attacks on the cVMM and fOS software are depicted by the state transition from the normal state N to the under-attack state A, and the mean time of attacks is assumed to be 1/λ_Xa. When the software is under cyber-security attack, the software system's Intrusion Prevention System (IPS) carries out immediate adaptive reactions to adapt to the penetration scenarios and to thwart potential attackers, in order to mitigate the impact of the intrusions and thus to temporarily recover the existing operations and software services [?]. The success likelihood of this process for either the cVMM or fOS software is captured by the coverage factor c_Xad; the adaption process is thus represented by the state transition from the under-attack state A to the adaption state AD with a mean time of 1/(c_Xad·µ_Xad). Successful adaption reactions after cyber-security attacks facilitate the fully functional recovery of the software services, as captured by the state transition from AD to N with the rate δ_Xdr. Unsuccessful adaption to repeated cyber-attacks results in a complete failure of the software operations and services; this adaption-failure possibility is depicted by the partial coverage factor 1 − c_Xad, and the adaption failure event (in other words, the success of the cyber-attacks on the cVMM or fOS software) is captured by the state transition from A to F with the rate (1 − c_Xad)·µ_Xad. In severe cases, software administrators often conduct thorough cyber-security investigations of the attacks before carrying out appropriate repair/maintenance procedures. The investigation period after cyber-attacks costs a mean time of 1/ξ_Xarej, and this process is depicted by the state transition from A to R. As soon as the software's security leaks have been investigated in the state R, a rejuvenation process is performed on the cVMM or fOS to conduct cyber-security countermeasures against the attacks, such as patching upgrade packages, fixing security leaks and removing maliciously intruded code. The likelihood of recovering the attacked software from the state R to the normal state N is represented by the coverage factor c_Xarej, and the mean time of the rejuvenation process after attacks is 1/(c_Xarej·δ_Xarej); the state transition representing this after-attack rejuvenation is from the under-rejuvenation state R to the healthy state N. Nevertheless, if the after-attack rejuvenation process fails to fully recover the attacked software, the cyber-attacks are considered successful and the cVMM or fOS software is considered to be in a complete failure due to the cyber-attacks. The possibility of this case is depicted by the coverage factor 1 − c_Xarej, and the mean time of this failed after-attack rejuvenation is 1/((1 − c_Xarej)·δ_Xarej). As soon as the cVMM and/or fOS software is in a complete failure, it is essential to summon a repair person to recover the failed software to a healthy state. The recovery of the software from the failure state F to its normal running state N is assumed to take a mean time of 1/µ_Xr.
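The cVMM/fOS model above can likewise be solved numerically. The sketch below encodes the six states (N, D, F, A, AD, R) and the transition structure described in this subsection; all numeric rate values are hypothetical, and the choice to count N and D as operational states is an assumption of the sketch rather than a statement of the paper's reward definition.

```python
# Sketch of the cVMM/fOS CTMC (states N, D, F, A, AD, R). Numeric values are
# hypothetical; treating only N and D as up states is an assumption of this sketch.
import numpy as np

lam_f, lam_d, lam_a = 1/4000.0, 1/1500.0, 1/3000.0   # failure, degradation, attack
delta_drej, c_drej = 1/2.0, 0.9                      # rejuvenation after degradation
mu_ad, c_ad, delta_dr = 1/1.0, 0.85, 1/0.5           # attack adaption and recovery
xi_arej = 1/6.0                                      # post-attack investigation
delta_arej, c_arej = 1/3.0, 0.9                      # post-attack rejuvenation
mu_r = 1/12.0                                        # repair after complete failure

S = ["N", "D", "F", "A", "AD", "R"]
ix = {s: i for i, s in enumerate(S)}
Q = np.zeros((6, 6))
def add(src, dst, rate): Q[ix[src], ix[dst]] += rate

add("N", "F", lam_f);  add("N", "D", lam_d);  add("N", "A", lam_a)
add("D", "N", c_drej * delta_drej); add("D", "F", (1 - c_drej) * delta_drej)
add("A", "AD", c_ad * mu_ad);       add("A", "F", (1 - c_ad) * mu_ad)
add("A", "R", xi_arej)
add("AD", "N", delta_dr)
add("R", "N", c_arej * delta_arej); add("R", "F", (1 - c_arej) * delta_arej)
add("F", "N", mu_r)
Q -= np.diag(Q.sum(axis=1))

A = np.vstack([Q.T, np.ones(6)]); b = np.zeros(7); b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
ssa = pi[ix["N"]] + pi[ix["D"]]        # assumed up states
print(1 - ssa)                          # steady-state unavailability of cVMM/fOS
```

A sensitivity study with respect to the attack rate can then be obtained simply by re-solving the chain over a range of lam_a values.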
(v) cVM: Fig. 8e depicts the CTMC model of the VMs running on the cVMMs (called cVMs). Because the cloud cVMs play the role of virtual OSs hosting the cloud applications (cAPPs), we consider failure modes similar to those of the cloud cVMM or fog fOS, including (i) failures due to uncertain causes, (ii) failures due to cyber-attacks, and (iii) failures due to performance-degradation related faults. We assume that the initial number of cVMs is n_cVM, all running in the healthy state N_{n_cVM}. Since multiple cVMs operate at the same time, the running cVMs are considered to compete with each other to fail first; therefore, the rate of the state transition from the initial normal state N_{n_cVM} to the normal state N_{n_cVM−1}, which involves the failure of one cVM, is n_cVM·λ_cVMf. In the case that a cVM confronts performance-degradation faults with a mean time of 1/λ_cVMd, the state of the cVMs transits from the initial normal state N_{n_cVM} to the performance-degraded state D_{n_cVM} (implying that one cVM is in a failure-probable state while the remaining cVMs still operate in their normal state). Performance-related software rejuvenation techniques are applied in this situation with a coverage factor of c_cVMdrej, and the performance-degraded cVM is returned to its normal state; thus, the state of the cVMs goes back to the normal state N_{n_cVM} with a mean time of 1/(c_cVMdrej·δ_cVMdrej). If the application of these techniques is not successful (which is represented by the partial coverage factor 1 − c_cVMdrej), the performance-degraded cVM eventually encounters a failure, resulting in a state transition of the cVMs from D_{n_cVM} to N_{n_cVM−1} with the rate (1 − c_cVMdrej)·δ_cVMdrej. In another case, cyber-attacks on a healthy cVM in the initial normal state N_{n_cVM} are assumed to have a mean time of 1/λ_cVMa, leading to a state transition of the cVMs from N_{n_cVM} to the under-cyber-attack state A_{n_cVM}. The adaption process carried out by the IPS to avoid further penetration of attackers and to temporarily recover partial operations and services is represented by the state transition from the under-attack state A_{n_cVM} to the adaption state AD_{n_cVM} with a mean time of 1/(c_cVMad·µ_cVMad), where c_cVMad is the coverage factor depicting the success possibility of the adaption reactions to cyber-attacks. The success of this adaption results in the recovery of the cVM after the cyber-attacks, depicted by the state transition from AD_{n_cVM} to the normal state N_{n_cVM} with the transition rate δ_cVMdr. In the case that the adaption process is not achieved, the afflicted cVM is penetrated further, and it is then critical to rapidly conduct a software investigation of the security leaks before initiating an after-attack rejuvenation process, as captured by the state transition from the under-attack state A_{n_cVM} to the after-attack rejuvenation state R_{n_cVM} with a mean time of 1/((1 − c_cVMad)·µ_cVMad). The after-attack rejuvenation process of the afflicted cVM is captured by the state transition from R_{n_cVM} to N_{n_cVM} with the transition rate c_cVMarej·δ_cVMarej, where c_cVMarej is the coverage factor of the success possibility of the after-attack rejuvenation process. When the penetration of the attackers is not successfully prevented by the after-attack rejuvenation techniques, with the partial coverage factor 1 − c_cVMarej, the cyber-attack affliction on the cVM causes a complete failure of that cVM, resulting in a reconfiguration of the afflicted cVM within the software services and operations of the cloud VMs. As a consequence, the state of the cVMs transits from the after-attack rejuvenation state R_{n_cVM} to the normal state N_{n_cVM−1}, implying that one cVM suffers a complete failure due to the cyber-attacks while the remaining cVMs operate in a healthy state. The recovery of the failed cVM in the state N_{n_cVM−1} costs a mean time of 1/µ_cVMr, and the state of the cVMs returns to the initial healthy state N_{n_cVM}. Without loss of generality, the above detailed description of the CTMC model for the case of n_cVM cVMs initially running on the cloud servers applies in the same manner to the other cases with a certain number of running cVMs. For instance, in the general case when the cVMs reside in the normal state N_{j_cVM}, which implies j_cVM running cVMs at once, the failure of a cVM causes a state transition to the state N_{j_cVM−1} with a mean time of 1/(j_cVM·λ_cVMf). In the state N_{j_cVM−1}, the number of running cVMs is j_cVM − 1, so the number of cVMs in complete failure is n_cVM − j_cVM + 1; therefore, the recovery of a failed cVM from the state N_{j_cVM−1} to the state N_{j_cVM} costs a mean time of 1/((n_cVM − j_cVM + 1)·µ_cVMr). Performance-degradation issues are represented by the state D_{j_cVM}; cyber-attacks on a cVM are depicted by the state A_{j_cVM}; the adaption process is captured by the state AD_{j_cVM}; and the after-attack rejuvenation process is represented by the state R_{j_cVM}. The transition rates between these states, and the mean times of the above-mentioned processes, are as described for the initial case. When only one cVM remains running in the healthy state N_1, it suffers either an uncertain failure with a mean time of 1/λ_cVMf, or performance-degraded failures, or failures due to cyber-security attacks, all of which lead to a complete failure of all cVMs in the state F. The recovery of an individual cVM when all cVMs reside in the downtime state F takes a mean time of 1/(n_cVM·µ_cVMr).
(vi) cAPP/fAPP: Fig. 8f presents a common CTMC model for the medical applications running on either the cloud (cAPP) or fog (fAPP) servers, denoted as xAPP in the model. For the applications, the operations and failure modes impacting reliability/availability that are considered include (i) an uncertain fault causing a complete failure, and (ii) regular upgrade processes leading to a service outage. Initially, n_xAPP cloud or fog applications (cAPPs/fAPPs) are assumed to be running in the initial healthy state N_{n_xAPP}. The competition among multiple xAPPs to fail first due to an uncertain fault makes the failure rate of one xAPP equal to n_xAPP·λ_xAPPf; the state of the xAPPs transits from the initial state N_{n_xAPP} to N_{n_xAPP−1}, in which the index indicates that the remaining number of healthy xAPPs is n_xAPP − 1. We assume that a regular update process is scheduled, so that a medical application does not become out of date, with the rate δ_xAPPu, and the state of the xAPPs transits from N_{n_xAPP} to the update-requested state R_{n_xAPP}. When an update of a cloud/fog application is requested, the to-be-updated application packages are uploaded to the cloud/fog servers; the readiness of updating a cloud/fog application requires a mean time of 1/λ_xAPPu and implies a state transition from the update-requested state R_{n_xAPP} to the updating state U_{n_xAPP}. As soon as the updating resources are ready, the updating processes are carried out. The completion of the updating process of an xAPP costs a mean time of 1/µ_xAPPu, and the state of the xAPPs thus transits from the updating state U_{n_xAPP} back to the initial healthy state N_{n_xAPP}. While the updating process is going on for one xAPP, the remaining n_xAPP − 1 xAPPs are still operating normally, but one of the running xAPPs may experience an uncertain failure, leading to a state transition of the xAPPs from U_{n_xAPP} to N_{n_xAPP−1} with the transition rate (n_xAPP − 1)·λ_xAPPf, assuming that the updating process on the previously mentioned xAPP completes beforehand. When the state of the xAPPs is in N_{n_xAPP−1}, the recovery of a failed xAPP is assumed to take a mean time of 1/µ_xAPPr. The above description repeats while the state of the xAPPs resides in N_{n_xAPP−1} (implying that the number of xAPPs currently running in a healthy state is n_xAPP − 1), and continues as the number of healthy xAPPs decreases. Without loss of generality, when the state of the xAPPs is in N_{j_xAPP} (indicating j_xAPP healthy xAPPs at a time), the above model description for the initial case of n_xAPP healthy xAPPs applies similarly. In particular, an uncertain failure of an xAPP causes a state transition from N_{j_xAPP} to N_{j_xAPP−1} with a mean time of 1/(j_xAPP·λ_xAPPf), and the recovery of a failed xAPP from N_{j_xAPP−1} to N_{j_xAPP} costs a mean time of 1/((n_xAPP − j_xAPP + 1)·µ_xAPPr). The update-requested state and the updating state are R_{j_xAPP} and U_{j_xAPP}, respectively, and the state transition from U_{j_xAPP} to N_{j_xAPP−1} is associated with the transition rate (j_xAPP − 1)·λ_xAPPf. The other state transitions carry the same transition rates as the respective state transitions in the initial case described above. When only the last healthy xAPP resides in the normal state N_1, its failure due to uncertain causes with a mean time of 1/λ_xAPPf leads to a complete service downtime period of the xAPPs in F. The updating process of this last xAPP also causes a service outage of the xAPPs, in the state U_1. The recovery of the xAPPs from the complete failure state F, under the requirement of hosting n_xAPP xAPPs, costs a mean time of 1/(n_xAPP·µ_xAPPr).
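For intuition, the update/failure cycle can be reduced to a single application (the last-application case above), giving a four-state chain N, R, U and F. The sketch below does exactly that; the rate values are invented, and treating N and R as service-up states is an assumption of this sketch.

```python
# Sketch of the xAPP update/failure cycle for a single application
# (states N, R, U, F); all rates are hypothetical.
import numpy as np

lam_f   = 1 / 2500.0      # uncertain failure rate
delta_u = 1 / (24 * 14.0) # update requests, roughly bi-weekly (hypothetical)
lam_u   = 1 / 0.5         # readiness of the uploaded update package
mu_u    = 1 / 0.25        # completion of the updating process
mu_r    = 1 / 4.0         # recovery after a complete failure

S = ["N", "R", "U", "F"]; ix = {s: i for i, s in enumerate(S)}
Q = np.zeros((4, 4))
Q[ix["N"], ix["F"]] = lam_f;   Q[ix["N"], ix["R"]] = delta_u
Q[ix["R"], ix["U"]] = lam_u
Q[ix["U"], ix["N"]] = mu_u
Q[ix["F"], ix["N"]] = mu_r
Q -= np.diag(Q.sum(axis=1))

A = np.vstack([Q.T, np.ones(4)]); b = np.zeros(5); b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi[ix["U"]] + pi[ix["F"]])  # fraction of time the application is unavailable
```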
(vii) cSD/fSD: Fig. 9a shows a CTMC model of the storage hardware components in both the cloud and fog member systems, denoted as cSD/fSD for the storage disks of the cloud/fog storages, respectively, and commonly denoted as xSD in the model. The model is similar to the model of the cloud and fog servers' CPUs in Fig. 8a. It is assumed that the initial number of storage disks (memory devices) is n_xSD and that these physical devices all operate in the initial healthy state N_{n_xSD}. Regardless of the cloud/fog storage's detailed architecture and data-access protocols, the failure modes and recovery strategies of the physical hardware are simplified when investigated and captured in the model. An uncertain failure of an individual storage disk has a mean time of 1/λ_xSD; thus, the competition to fail first among the n_xSD initially healthy xSDs causes a state transition of the xSDs from N_{n_xSD} to N_{n_xSD−1} with the transition rate n_xSD·λ_xSD. This behavior repeats whenever another xSD fails among the multiple xSDs. Without loss of generality, the failure of an xSD in the state N_{j_xSD} (indicating j_xSD xSDs currently residing in a healthy state), where 1 ≤ j_xSD ≤ n_xSD and N_0 ≡ F, causes a state transition from N_{j_xSD} to N_{j_xSD−1} with the transition rate j_xSD·λ_xSD. When the xSDs reside in N_1, the failure of this last storage disk results in a complete failure of the storage hardware layer in F. The recovery of the whole failed storage hardware requires a mean time of 1/µ_xSDf, represented by the state transition from the complete failure state F to the initial healthy state N_{n_xSD}. In another case, under stringent requirements on the storage performance, if the number j_xSD of xSDs residing in the state N_{j_xSD} is less than or equal to a certain integer m_xSD, the corresponding states are considered down states with respect to sufficient storage services and operations of the cloud/fog member systems. For that reason, a hardware technician is often summoned to diagnose and replace the broken storage disks when the storage services suddenly crash due to the hardware failures in the state N_{m_xSD}. The timely recovery of the storage hardware when m_xSD storage disks remain in normal operation costs a mean time of 1/µ_xSDm and is represented by the state transition from the state N_{m_xSD} back to the initial healthy state N_{n_xSD}.
(viii) cSM/fSM: Fig. 9b depicts a CTMC model of the redundant twin storage managers of the cloud/fog member systems (commonly denoted as xSM), in which the two machines are referred to as X and Y. The storage managers are two-node physical machines operating in an active-active redundancy to achieve a high level of continuous availability [?]. The storage managers are assumed to initially operate in the healthy state UU, which implies that both machines X and Y are in up states (denoted by U). As soon as one of the two storage managers fails due to an uncertain cause, the state of the storage managers transits from UU to either FU (X fails) or UF (Y fails) with a transition rate of 2λxSMf. A repair person is summoned to recover a failed storage manager. A diagnosis of the failure causes on the failed storage manager is carried out with a mean time of 1/ξxSMd, and the diagnosis processes are represented by the state transitions from FU to DU or from UF to UD for the fault-diagnosis processes on X or Y, respectively. Successful recovery of the failed X is depicted by the state transition from DU to UU, while that of the failed Y is represented by the state transition from UD to UU; both of these recovery transitions are associated with a transition rate of µxSMr. During the period of fault handling on a crashed storage manager, either in the diagnosis states (FU or UF) or in the repair states (DU or UD), the remaining healthy storage manager in up state U can also experience an uncertain failure, depicted by the state transitions with rate λxSMf to the complete-failure state of the cloud/fog storage management, FF. The recovery of the storage management services and operations takes a mean time of 1/µxSM and is represented by the state transition from the complete-failure state FF to the initially healthy state UU.
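A similar steady-state computation can be sketched for the storage-manager pair. The sketch below lumps the symmetric states of Fig. 9b (FU/UF into one "one failed" state and DU/UD into one "one under repair" state) and treats only FF as a down state; all rates are hypothetical placeholders.

```python
import numpy as np

# Lumped states: UU = both up, 1F = one failed awaiting diagnosis (FU or UF),
# 1R = one under repair after diagnosis (DU or UD), FF = both failed.
def storage_manager_ssa(lam=1.0 / 8760, xi_d=1.0, mu_r=0.5, mu=0.25):
    """SSA of the cSM/fSM pair, counting only FF as down (hypothetical rates per hour)."""
    Q = np.zeros((4, 4))
    Q[0, 1] = 2 * lam   # UU -> 1F : either manager fails first
    Q[1, 2] = xi_d      # 1F -> 1R : diagnosis of the failed manager completes
    Q[2, 0] = mu_r      # 1R -> UU : failed manager repaired
    Q[1, 3] = lam       # 1F -> FF : the surviving manager also fails
    Q[2, 3] = lam       # 1R -> FF : the surviving manager also fails
    Q[3, 0] = mu        # FF -> UU : full recovery of the management layer
    Q -= np.diag(Q.sum(axis=1))
    A = np.vstack([Q.T, np.ones(4)])
    b = np.zeros(5); b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return 1.0 - pi[3]

print(f"SSA(cSM/fSM pair): {storage_manager_ssa():.6f}")
```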
(ix) IoMT sensors: The operations of the IoMT sensors' components in the edge member system are modeled in Fig. 10. The description of each of the component sub-models is as follows. The models are used commonly for all sensors in the IoMT infrastructure. We specifically consider in detail the practical operating states of the battery part as encountered in industry, and the security-related states of the embedded operating system and application on the IoMT sensors.
iBAT: Fig. 10a depicts the CTMC model of an IoMT sensor's battery (denoted as iBAT). Our assumptions in modeling the power consumption of an IoMT sensor's battery, without loss of generality, are: (i) the battery's discharge process complies with a nearly deterministic behavior and is captured by a 10-stage Erlang random variable; (ii) the processing load consumes the battery's energy evenly on average as time goes by; (iii) the operating states of a battery are simplified to consist of sleep mode, energy draining and failure due to uncertain causes; and (iv) the power consumption in sleep mode is infinitesimal enough to be disregarded and/or counted as part of the draining and/or discharging. Initially, the IoMT sensor's battery is fully charged in the full-charge state 100%. The battery discharge process occurs in 10 steps with an individual discharging mean time of 1/δiBdr, passing through the corresponding states, which represent the remaining energy of the battery in percentages. Furthermore, we assume the energy usage of the battery is reduced to its lowest level in the corresponding sleep modes of an IoMT sensor, which occur after a mean time of 1/ξiBsl and are represented by the state transition from the state depicting the energy percentage of the battery to the sleep-mode state at the respective energy amount, for instance, from the initial full-charge state 100% to the sleep-mode state SL100%. The corresponding wake-up process is depicted by the transition, for instance, from SL100% to 100% with a mean time of 1/ζiBw. On the other hand, at a certain level of energy, the battery may encounter uncertain faults causing a complete failure, which is represented by the state transition from the energy-level state X% (for instance, the full-charge state 100%) to the complete-failure state F. In the complete-failure state F, it is necessary to summon a repair person to replace the failed battery with a mean time of 1/µiBr, and the replacement is represented by the state transition from the complete-failure state F to the initial full-charge state 100%. At the energy level of zero percent in the state 0%, the battery is assumed to also experience a complete failure F before a replacement is needed. The down states of the battery are the states 10% and 0%, and the complete-failure state F, which indicates that the sensing operations of an IoMT sensor are not secured as soon as the energy level of the battery is less than or equal to 10%, or when an uncertain failure occurs on the battery in the complete-failure state F.
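As an illustration of the 10-stage Erlang discharge assumption, the sketch below computes the battery's transient reliability by matrix exponentiation, treating the down states (10%, 0% and F) as absorbing and ignoring the sleep transitions as a simplification. The discharge and failure rates are assumed values, not the paper's parameters.

```python
import numpy as np
from scipy.linalg import expm

def battery_reliability(t_hours, delta=1.0 / 24, lam_f=1.0 / 3000):
    """Transient reliability of the iBAT sub-model, sleep mode ignored.

    Up states: energy levels 100%..20% (nine Erlang discharge stages);
    reaching 10%/0% or the uncertain-failure state F is absorbing.
    delta : discharge rate per 10% stage (assumed), lam_f : failure rate (assumed)."""
    n_up = 9
    Q = np.zeros((n_up + 1, n_up + 1))            # last index = absorbing "down" macro-state
    for i in range(n_up):
        nxt = i + 1 if i < n_up - 1 else n_up     # next discharge stage, or absorption from 20%
        Q[i, nxt] += delta                        # Erlang discharge step
        Q[i, n_up] += lam_f                       # uncertain failure to F
    Q -= np.diag(Q.sum(axis=1))
    p0 = np.zeros(n_up + 1); p0[0] = 1.0          # start fully charged (100%)
    p_t = p0 @ expm(Q * t_hours)
    return 1.0 - p_t[n_up]                        # probability of not yet being absorbed

for t in (24, 120, 240):
    print(f"R_iBAT({t} h) = {battery_reliability(t):.4f}")
```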
iSEN/iADC/iMCU: The modeling of the physical parts of an IoMT sensor, including the sensor part (iSEN), the analog-to-digital converter (iADC) and the micro-controller unit (iMCU), is simplified and presented in Fig. 10c. Since these parts of an IoMT sensor often come as completely isolated modules, even though they may consist of a number of other sub-parts, we simplify the modeling by considering only the operative and crashed states of those physical parts. The initial state of those physical parts is denoted as the initially normal state N. After a mean time of 1/λiX (where iX represents iSEN, iADC or iMCU), an uncertain failure exposed in the physical part causes a malfunction, and the state transits to the failure state F. The replacement of the physical part is assumed to take a mean time of 1/µiX, and the state returns to the initially normal state N.
iMEM: The memory part of an IoMT sensor (iMEM) is modeled in Fig. 10b. iMEM is also subject to uncertain failures and performance-degradation failures, since an IoMT sensor in practice often uses a memory card for its temporary data storage. When an uncertain failure occurs on the IoMT sensor with a mean time of 1/λiMEMf, its state transits from the initially healthy state N to the complete-failure state F. In the case of failures due to performance degradation, the IoMT sensor's state transits from N to the performance-degraded state D with a rate of σiMEMd. In this state, access requests to retrieve and store data from/to iMEM are prone to interruptions/crashes, ending up in a complete failure in state F after a mean time of 1/λiMEMdf. Nevertheless, the IoMT sensor is assumed to have the capability to reset itself, wipe the data on iMEM and return to its initially normal state N with a mean time of 1/ξiMEMd. In the complete-failure state F, iMEM is recovered by a repair person with the rate µiMEMr, and its state returns to the initially normal state N.
iTRx: An IoMT sensor is assumed to contain a transceiver for duplex communication (denoted as iTRx). We suppose that the transmitter (Tx) and receiver (Rx) on an IoMT sensor are independent of each other, in the sense that an uncertain failure of Tx does not cause a dependent failure of Rx and vice versa. We also assume that iTRx is still considered to be in a running state even during an outage of Rx, as long as Tx is in its healthy state. But if Tx does not operate properly, the whole transceiver iTRx enters a service downtime. Under the above assumptions, the transceiver iTRx is modeled as in Fig. 10f. A failure due to an uncertain cause can occur with an assumed mean time of 1/λTRxf, upon which iTRx's state transits from its initially healthy state N to the complete-failure state F. In the case that Rx fails with a failure rate of λRxf, iTRx's state transits from N to the state Rxf, which is still considered a running state of iTRx. In this state, if Tx fails, iTRx's state transits to the complete-failure state F with a failure rate of λTxf. On the other hand, if Tx fails first, iTRx's state transits from its initially normal state N to the state Txf, which is a down state of iTRx. In this state, Rx still operates so that the IoMT sensor can receive health-check and maintenance commands. If Rx then fails with a mean time of 1/λRxf, iTRx's state transits to the complete-failure state F of the IoMT sensor. The failed transceiver iTRx is repaired/replaced with a mean time of 1/µTRxr, and its state returns to the initially healthy state N.
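The MTTFeq of such a small sub-model can be obtained by making the down states absorbing and computing the mean time to absorption. The sketch below does this for iTRx with hypothetical failure rates; N and Rxf are the up states, Txf and F are treated as absorbing.

```python
import numpy as np

def itrx_mttf(lam_trx=1.0 / 50000, lam_tx=1.0 / 20000, lam_rx=1.0 / 20000):
    """Mean time to a service-affecting failure of iTRx (hypothetical rates per hour)."""
    # Generator restricted to the up states, ordered [N, Rxf]
    q_uu = np.array([
        [-(lam_trx + lam_tx + lam_rx), lam_rx],   # N: whole-device failure, Tx loss (down) or Rx loss (still up)
        [0.0,                          -lam_tx],  # Rxf: only a Tx failure (to F) leaves this up state
    ])
    # Expected times to absorption: solve (-Q_uu) m = 1
    m = np.linalg.solve(-q_uu, np.ones(2))
    return m[0]                                   # starting from the healthy state N

print(f"MTTFeq(iTRx) = {itrx_mttf():.1f} hours (illustrative)")
```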
iOSa: Fig. 10e shows the model of the IoMT sensor's embedded software subsystem. We consider a tiny embedded OS running on a single sensor node, integrated with an application for collecting data from the sensor part. It is assumed that a failure of the tiny embedded OS causes an outage of the IoMT sensor's software part, in the sense that one may not be able to health-check the status of the IoMT sensor. Initially, the iOSa resides in the normal state N. Its state transits to other states in three cases: (i) if the embedded OS fails with a mean time of 1/λiOS, the iOSa enters a downtime period in state DNos; (ii) if the embedded application fails with a mean time of 1/λiA, the iOSa goes to the state DNa (an up state) from its initially normal state; (iii) if the iOSa has a scheduled sleep to reduce the power consumption of the IoMT sensor with a mean time of 1/ξiOSa, the iOSa's state transits to the state SL. When the embedded application fails to collect sensing data in the state DNa, the running tiny OS tries to reset and restart another instance of the embedded application immediately with a coverage factor of successful recovery ciA and a recovery rate of µiA, and the state of iOSa returns to its initially healthy state N from the current state DNa. If the reset of the crashed application fails to complete, the state transits from DNa to the complete-failure state F of the iOSa with a rate of (1 - ciA)·µiA. Moreover, when the iOSa is in the state DNa, its tiny embedded OS may fail in the middle of the recovery process of the failed embedded application, and thus the state of iOSa transits from DNa to DNos with a rate of λiOS. In the case that the iOSa encounters a failure of its tiny embedded OS in the state DNos, a reset and recovery process with a mean time of 1/µiOS is performed on the IoMT sensor's main board to restart/reinstall the crashed embedded OS. The success of this process is featured by a coverage factor of ciOS. Accordingly, the state of iOSa transits from DNos to DNa to indicate that the tiny embedded OS is recovered but without a running embedded application for sensing purposes; in this state, a new application is initiated with a coverage factor of ciA and a rate of µiA, as described above. If the attempt to recover the failed tiny embedded OS does not succeed (featured by the coverage factor (1 - ciOS)), the IoMT sensor's iOSa part is considered to be in a complete failure F, reached with a rate of (1 - ciOS)·µiOS. In the last case, when the IoMT sensor's embedded software goes into its sleep mode to reduce power consumption in the state SL, a quick wake-up process with a successful coverage factor of ciOSa and a rate of ζiOSa returns the embedded software iOSa to its initially healthy state N. Therefore, the state transition from the sleep-mode state SL to the initially normal state N represents the wake-up action with a rate of ciOSa·ζiOSa. If the wake-up process encounters an uncertain failure, the state of iOSa transits instead to the complete-failure state F with a rate of (1 - ciOSa)·ζiOSa. A completely failed IoMT sensor software iOSa should be recovered/reinstalled by a summoned repair person with a mean time of 1/µiOSa to return its state to its initially normal state N.
(x) iWireless: The modeling of the wireless communication within an edge is adopted from the Gilbert-Elliott (GE) model in [?], as shown in Fig. 10d. A DTMC model with two states (connected C and disconnected D) is used to represent the communication quality of the IoMT wireless connection in the edge member system. The state C represents error-free data transmission, and the state D captures the unavailability of data transmission due to a high probability of errors in the data streams. The probability of a wireless disconnection, represented by the state transition from C to D, is denoted pd; thus the probability of retaining a wireless connection in the state C is 1 - pd. Similarly, the re-connection of the IoMT wireless communication is represented by the state transition from D to C with the probability of a wireless connection event pc, and thus the probability of remaining in the disconnected state is 1 - pc. It is worth noticing that pc + pd = 1 [?].
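A minimal sketch of the Gilbert-Elliott DTMC follows; it solves πP = π for the stationary connection/disconnection probabilities, using illustrative values of pc and pd that satisfy the assumption pc + pd = 1.

```python
import numpy as np

def gilbert_elliott_stationary(p_d=0.05, p_c=0.95):
    """Stationary probabilities of the two-state GE DTMC used for iWireless
    (illustrative p_d, p_c; closed form: pi_C = p_c / (p_c + p_d))."""
    P = np.array([[1 - p_d, p_d],      # row: from C (connected)
                  [p_c,     1 - p_c]]) # row: from D (disconnected)
    # Solve pi P = pi together with sum(pi) = 1
    A = np.vstack([P.T - np.eye(2), np.ones(2)])
    b = np.array([0.0, 0.0, 1.0])
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return {"C": pi[0], "D": pi[1]}

print(gilbert_elliott_stationary())
```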
(xi) Gateways: The gateways of the CFE member systems are assumed to exhibit identical operative behaviors, and thus it is possible to capture those behaviors using common states and state transitions in the same models, as shown in Fig. 11.
cgHW/fgHW/igHW: Fig. 11a shows a common model of the gateways' hardware component. Similar to other hardware components in the IoMT infrastructure, the modeling is simplified by using a two-state model with normal N and failure F states representing the up and down states in operation. The MTTFeq and MTTReq of a gateway's hardware component are denoted as 1/λxHW and 1/µxHW, respectively, where the notation xHW indicates the hardware component of the respective member system being considered in the model; it can be replaced by cgHW, fgHW or igHW to indicate the hardware components of the cloud, fog and edge gateways, respectively.
cgSW/fgSW/igSW: Fig. 11b depicts the modeling of the gateways' software components, adopting a security model in [?]. A common notation xSW is used to indicate the software part of a specific gateway x; the notation can be replaced in accordance with the gateway being considered, i.e., cgSW/fgSW/igSW for the software part of the cloud, fog and edge gateways, respectively. Regarding the failure modes of the gateways' software part, we consider failure causes due to cyber-security attacks and long-term operations, including (i) failures due to uncertain causes, (ii) performance-degradation related failures, and (iii) failures due to cyber-security attacks. The uncertain failures with a mean time of 1/λxSW are captured by the state transition from the initially healthy state N to the complete-failure state F. In the case that the performance of the gateway software degrades over time, its state transits from N to the failure-probable state D due to performance degradation, with an assumed mean time of 1/λxSWd [?]. A software rejuvenation is applied to remove the performance-degradation errors in the gateway software [?]. The successful removal is covered by a coverage factor of cxSWdrej and a mean time of 1/δxSWdrej. After cleaning the performance-degradation related errors, the gateway software residing in D returns to the initially normal state N. If the removal fails to clean the errors residing in the gateway software, the software encounters a complete failure in the state F with a mean time of 1/((1 - cxSWdrej)·δxSWdrej).
In the case that the gateway software encounters a calculated cyber-security attack, attackers repeatedly intrude with malicious code to identify the vulnerabilities of the gateway software, with an assumed mean time of 1/ξxSWv. At this point, the state of the gateway software is considered to be the vulnerable state V under the malicious intrusion of attackers. After that, attackers may perform compromising actions against the pre-installed defense mechanisms of the gateway software. This phase is supposed to take a mean time of 1/ξxSWc, and the gateway software's state transits from the vulnerable state V to the compromised state C. When the gateway software is compromised, attackers perform repeated attacks to disguise communication and steal significant information in pre-assumed-to-be-trusted data transactions, or even to disrupt running services on the gateways. One may recognize the attacks with a success coverage factor of cxSWa and a mean sojourn time of 1/λxSWa. Therefore, the gateway software is considered to be under an attack, and its state transits from C to the under-attack state A with a rate of cxSWa·λxSWa. If the gateway software is compromised and not able to detect a certain attack, we consider this case a complete failure, and it is necessary to call a repair person to recover the compromised gateway software under the ongoing cyber-security attack actions; for that reason, the gateway software's state transits from C to F with a mean time of 1/((1 - cxSWa)·λxSWa). As soon as the gateway software is under repeated attacks, an adaptation mechanism is performed to partially recover damaged services temporarily, with a graceful degradation of performance. The attempt to adapt to the detected attacks features a success coverage factor of cxSWad and a mean time of 1/µxSWad; then, the state of the gateway software transits from the under-attack state A to the adaptation state AD. The adaptation mechanism helps recognize vulnerable points, isolate attacked parts and apply software patches to fully recover the running services on the gateway software. The recovery of the attacked gateway software after the executed adaptation mechanism takes a mean time of 1/δxSWdr, and the gateway software's state transits from AD to its initially normal state N. If the adaptation mechanism is not effective against the attacks, the gateway software is considered to suffer a failure due to security attacks, and its state transits from A to F with a mean time of 1/((1 - cxSWad)·µxSWad). Furthermore,
when the gateway software is experiencing an attack in the state A, a security professional may intervene to halt the attacks, investigate the existing vulnerabilities and perform rejuvenation strategies to recover the whole damaged gateway software. This case takes a mean time of 1/ξxSWarej, and the gateway software's state transits from the under-attack state A to the after-attack rejuvenation state R. The rejuvenation after attacks of the gateway software may succeed or fail depending on the severity of the attacks, which is represented by a coverage factor of cxSWarej. A successful rejuvenation of the gateway software includes clearing malicious code, applying software patches to remove the exploited vulnerabilities and restarting the gateway software. In this case, the mean time of rejuvenation is 1/δxSWarej, and thus the recovery of a damaged gateway software after security attacks takes a mean time of 1/(cxSWarej·δxSWarej), upon which the gateway software's state returns from the under-rejuvenation state R to its initially healthy state N. The failure of the after-attack rejuvenation process is considered to cause a complete failure of the gateway software with a mean time of 1/((1 - cxSWarej)·δxSWarej). Finally, when the gateway software resides in the complete failure F, a repair person is summoned to recover the crashed gateway software with a mean time of 1/µxSW.
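To show how the gateway-software model of Fig. 11b can be evaluated, the sketch below assembles its eight-state generator from the transitions just described and solves for the steady-state availability, counting only the complete-failure state F as down (a simplifying assumption). All rate and coverage values are hypothetical placeholders rather than the paper's Table XV inputs.

```python
import numpy as np

STATES = ["N", "D", "V", "C", "A", "AD", "R", "F"]   # states of the cgSW/fgSW/igSW model

def gateway_sw_ssa(lam=1/2000, lam_d=1/500, c_drej=0.9, d_drej=1/2,
                   xi_v=1/1000, xi_c=1/200, c_a=0.9, lam_a=1/24,
                   c_ad=0.9, mu_ad=1/2, d_dr=1.0, xi_arej=1/4,
                   c_arej=0.9, d_arej=1/2, mu=1/8):
    """SSA of a gateway software CTMC; only F is counted as down (assumption)."""
    idx = {s: i for i, s in enumerate(STATES)}
    Q = np.zeros((8, 8))
    def add(src, dst, rate): Q[idx[src], idx[dst]] += rate
    add("N", "F", lam);             add("N", "D", lam_d);              add("N", "V", xi_v)
    add("D", "N", c_drej * d_drej); add("D", "F", (1 - c_drej) * d_drej)
    add("V", "C", xi_c)
    add("C", "A", c_a * lam_a);     add("C", "F", (1 - c_a) * lam_a)
    add("A", "AD", c_ad * mu_ad);   add("A", "F", (1 - c_ad) * mu_ad); add("A", "R", xi_arej)
    add("AD", "N", d_dr)
    add("R", "N", c_arej * d_arej); add("R", "F", (1 - c_arej) * d_arej)
    add("F", "N", mu)
    Q -= np.diag(Q.sum(axis=1))
    A_mat = np.vstack([Q.T, np.ones(8)])
    b = np.zeros(9); b[-1] = 1.0
    pi = np.linalg.lstsq(A_mat, b, rcond=None)[0]
    return 1.0 - pi[idx["F"]]

print(f"SSA(gateway software): {gateway_sw_ssa():.5f} (illustrative parameters)")
```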
VI. NUMERICAL ANALYSIS RESULTS
The default input parameters of the model are shown in Table XV. We adopted several values from studies in previous works [?], [?], [?]. We performed the analyses not only for the proposed IoMT infrastructure under the default parameters, but also for the extensive case-studies and operative scenarios described in Table I. Reliability analyses are presented in Section VI-A; the reliability values of the IoMT infrastructures are computed and analyzed while disregarding the recovery/maintenance processes of failed subsystems/components. SSA analyses are performed in Section VI-B to comprehend the IoMT infrastructure's availability properties in the long run. Availability sensitivity analyses wrt. the variation of impacting factors/parameters (e.g., MTTFeq and MTTReq) are shown in Section VI-C.
A. Reliability Analysis
We performed a number of reliability analyses as shown in
Fig. 12 and Fig. 13 to comprehend the impact of failures in
member systems, subsystems and components on the overall
reliability of the IoMT infrastructure.
Reliability of IoMT infrastructure wrt. default input parameters:: The reliability of the IoMT infrastructure is analyzed under default parameters as depicted in Fig. 12. Fig. 12a, Fig. 12b and Fig. 12c present a comparison of the reliability values among the subsystems and the impact of those subsystems' reliability values on the overall reliability of the corresponding cloud, fog and edge member systems, respectively. Moreover, Fig. 12d depicts the overall reliability values of the member systems and of the whole IoMT infrastructure, to comprehend the differences in the reliability impact of the member systems on the IoMT infrastructure.
In Fig. 12a, the cloud storage and cloud gateway subsystems exhibit high reliability under any failure even without a recovery strategy. In more detail, the reliability of the cloud gateway over time is nevertheless lower than the reliability of the cloud storage, as seen in the small subplot. The figure also shows that the cloud servers are prone to failures over time if recovery operations are not planned in advance: the reliability of the cloud server subsystem drops quickly, with a nearly vertical slope.
[Fig. 8: Component sub-models of cloud and fog servers. Panels: (a) cCPU/fCPU, (b) cMEM/fMEM, (c) cNET/fNET, cPWR/fPWR, cCOO/fCOO, (d) cVMM/fOS, (e) cVM, (f) cAPP/fAPP]
And it is clear that the drop of
the cloud server subsystem's reliability severely causes the loss of the overall cloud member system's reliability, as seen from the steeper vertical slope of the graph with five-pointed star markers representing the cloud member system's reliability.
Fig. 12b presents the reliability analyses for the subsystems in the fog member system and for the overall fog system as well. Similar to the reliability analyses of the cloud member system in Fig. 12a, the fog storage and gateway subsystems are also highly reliable over time (represented by the graphs with circle and square markers, respectively), but the fog server subsystem is severely prone to failures without accompanying recovery operations (represented by the graph with diamond markers). It thus also pulls down the reliability of the fog member system, which is represented by the graph with five-pointed star markers. In comparison to the cloud member system, the fog member system is slightly more reliable due to the slightly higher reliability values of its subsystems, as seen by comparing the curves and slopes of the respective graphs.
[Fig. 9: Sub-models of cloud and fog storage components. Panels: (a) cSD/fSD, (b) cSM/fSM]
[Fig. 10: Component sub-models of the edge member system. Panels: (a) iBAT, (b) iMEM, (c) iSEN/iADC/iMCU, (d) iWireless, (e) iOSa, (f) iTRx]
[Fig. 11: Component sub-models of cloud, fog and edge gateways. Panels: (a) cgHW/fgHW/igHW, (b) cgSW/fgSW/igSW]
Fig. 12c shows the reliability analysis results for the edge member system. In the edge layer, the IoMT gateway clearly exhibits the highest reliability under uncertain failures without planned recovery strategies, in comparison to the reliability values of the IoMT sensors. The IoMT health sensors have a higher reliability over time than the IoMT ambient sensors, as expected. However, since the reliability values of those IoMT devices are not as high as the reliability values of the gateway and storage subsystems in the cloud and fog member systems (shown in Fig. 12a and Fig. 12b), the overall reliability of the edge member system is likewise characterized by a five-pointed star graph with a nearly vertical decline.
In Fig. 12d, we synthesize the above reliability analyses of the individual cloud, fog and edge member systems in a common graph to see the impact of the member systems' reliability values on the overall IoMT infrastructure's reliability. In general, due to the complexity of the overall IoMT infrastructure, which consists of multi-level systems, subsystems and components/devices, and due to the stringent operational requirements to avoid emergency cases, in which any failure is considered a severe matter causing a system failure, the reliability values of the member systems and of the overall IoMT infrastructure drop sharply as time goes by under failures without recovery actions. Looking at the curves of the graphs in detail, the fog member system clearly exhibits the highest reliability among the member systems over the course of time, whereas the cloud member system has the lowest reliability over time, represented by the sharply falling graph with diamond markers. Moreover, the cloud and edge member systems exhibit clearly lower reliability over time in comparison with the fog member system, as seen by the relative distances between the graphs. Because of this, the overall reliability of the IoMT infrastructure drops steeply as time goes by, even below the reliability of the cloud member system, as depicted by the graph with five-pointed star markers.
The above detailed reliability analyses of the IoMT infrastructure under default parameters are extended by considering various system configuration changes in the design phase of the IoMT infrastructure. We considered the five case-studies of different system configurations, based on expanding the size of each member system one after another as described in Table I. The analysis results are shown in Fig. 13, which depicts the comparison between case I (the default configuration) and the other case-studies, respectively. Fig. 13a - Fig. 13c show the detailed comparisons of the corresponding reconfigured member systems and the overall IoMT infrastructure between case I and each of the remaining case-studies.
Fig. 13a shows the reliability comparison between case I and case II. When comparing the cloud member systems of the two cases, we can see that expanding the cloud size obviously enhances the reliability of the cloud member system itself, as clearly depicted by the large distance between the base graph with diamond markers, representing the cloud member system in the original IoMT infrastructure of the base case I, and the graph with circle markers, representing the cloud member system in the reconfigured IoMT infrastructure of case II. As a result, the reliability of the reconfigured IoMT infrastructure in case II (represented by the graph with five-pointed star markers) is clearly improved in comparison with the reliability of the default configuration of the original IoMT infrastructure (represented by the graph with square markers).
[Fig. 12: Reliability of the IoMT infrastructure under default parameters. Panels: (a) Cloud (inset: reliability of cStorage and cGateway), (b) Fog (inset: reliability of fStorage), (c) Edge, (d) IoMT]
[Fig. 13: Reliability analyses of IoMT infrastructures with configuration changes. Panels: (a) Case I vs. case II, (b) Case I vs. case III, (c) Case I vs. case IV, (d) All cases]
Fig. 13b presents the reliability comparison of the two case-studies comprising the base case I and case III, which features an expansion in size of the fog member system. As seen from the distance between the graph with circle markers, representing
the reliability of the reconfigured fog member system in case III, and the graph with diamond markers, representing the reliability of the default fog member system in the base case I, the expansion in size of the fog member system clearly improves the reliability of the fog member system. This improvement of the fog member system, however, only slightly enhances the overall reliability of the IoMT infrastructure, as seen by the small gap between the graph with five-pointed markers depicting the reconfigured IoMT infrastructure in case III and that of the original infrastructure in the base case I.
Fig. 13c depicts the reliability analysis results of the edge member system and of the overall IoMT infrastructure in the base case I and the reconfigured case-study case IV. By comparison, an expansion in size of the edge member system in the reconfigured IoMT infrastructure actually diminishes the reliability of the edge member system itself and of the overall reconfigured IoMT infrastructure, due to the stringent operational requirement that a failure of IoMT devices at the edge level causes a severe malfunction of the whole IoMT infrastructure in terms of QoS. As observed from the distances between the graphs, a large decrease in the reliability of the edge member system (seen in the big gap between the graph with diamond markers, representing the reliability of the reconfigured edge member system in case IV, and the graph with circle markers, representing the reliability of the default edge member system in case I) causes only a small decrease in the overall reliability of the IoMT infrastructure over the course of time (depicted by the small gap between the graph with five-pointed star markers, representing the reliability of the reconfigured IoMT infrastructure in case IV, and the graph with square markers, representing the reliability of the default IoMT infrastructure in the base case I).
Fig. 13d synthesizes the reliability analysis results of all five case-studies in a common figure. When observing the relative distances between the pairs of graphs, we recognize the impact of the system architecture's reconfiguration on the overall reliability of the IoMT infrastructure relative to the base case I, represented by the graph with diamond markers. The expansion in size of only the cloud member system in case II vastly enhances the IoMT infrastructure's reliability in comparison to the other system reconfigurations, as depicted by the distance from the other graphs to the topmost graph with plus markers, representing the reliability of the reconfigured IoMT infrastructure with a size-expanded cloud member system. In contrast, the size-expansion of only the edge member system clearly lessens the overall reliability of the IoMT infrastructure in case IV, as seen by the bottom-most graph with star markers. Further observation of the inset in Fig. 13d shows that the expansion in size of only the fog member system very slightly improves the overall reliability of the reconfigured IoMT infrastructure in case III, as depicted by the small gap between the graph with circle markers, representing the reliability of the reconfigured IoMT infrastructure with the size-changed fog member system, and the graph with diamond markers, representing the reliability of the default IoMT infrastructure in the base case I. Last but not least, the expansion in size of both the cloud and fog member systems (in case V) not only diminishes the negative impact of the size-expansion of the reconfigured edge member system but also clearly enhances the overall reliability of the reconfigured IoMT infrastructure, as shown by the graph with five-pointed star markers.
B. Steady State Availability Analyses
MTTFeq, MTTReq and SSA under default parameters:: To comprehend the failure and repair characteristics of the subsystems and components, we performed comprehensive computations of different measures, including MTTFeq, MTTReq and SSA. The values for components (e.g., cCPU, fCPU, etc.) are the analysis results of the corresponding CTMC component models, while those for subsystems (e.g., cServer, fServer, etc.) are computed by using the subsystem FTs at the middle level, in which the above-mentioned outputs of the component models are used as inputs of the basic nodes in the subsystem FT. The computation results of MTTFeq, MTTReq and SSA for the components and subsystems in the Cloud, Fog and Edge systems are shown in Table III, Table IV and Table V, respectively. The measures computed for the overall IoMT infrastructure are shown in Table II. To easily see the difference of the availability values in
the tables, we computed the number of nines (denoted by #9s) in the values of SSA using Eq. (22):
#9s = -log10(1 - A_SS)    (22)
where A_SS is the above-mentioned SSA computed in the tables. We also computed the downtime hours in a year of the IoMT infrastructure using Eq. (23), provided that the corresponding SSA has been computed in advance and that a full year of operation has 365 days of 24 hours:
T_DN (hours) = (1 - A_SS) · 8760    (23)
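Eqs. (22) and (23) translate directly into code; the snippet below applies them to the overall IoMT SSA of about 0.997686 reported in the tables.

```python
import math

def number_of_nines(ssa):
    """Eq. (22): #9s = -log10(1 - A_SS)."""
    return -math.log10(1.0 - ssa)

def downtime_hours_per_year(ssa):
    """Eq. (23): T_DN = (1 - A_SS) * 8760 hours."""
    return (1.0 - ssa) * 8760.0

ssa_iomt = 0.997686   # overall IoMT SSA under default parameters
print(f"#9s = {number_of_nines(ssa_iomt):.3f}, downtime = {downtime_hours_per_year(ssa_iomt):.2f} hours/year")
```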
Comparing the output values of the hardware and software components, we observe that the MTTFeqs of hardware components are clearly much higher than those of software components, whereas their MTTReqs are also relatively higher. If a software component has multiple redundant parts in its operation, its MTTFeq is nevertheless increased vastly in comparison to those of the hardware components, owing to its much lower repair time. As a result, the SSAs of hardware components are generally much higher than those of software components.
In Table II, we show the computed measures of the IoMT infrastructure at the system level. The SSAs of the systems in the IoMT infrastructure are not significantly different from each other. The numbers of nines in the SSAs of the member systems (Cloud, Fog and Edge) are all about 3, but the overall number of nines of the IoMT infrastructure is about 2. As a consequence, the downtime hours in a year of those member systems are about 6-7 hours per year, whereas that of the overall IoMT infrastructure is vastly longer, at about 20 hours. This happens because the IoMT infrastructure is strictly required to be operative only when all member systems are in a normal state at the same time. In other words, the IoMT infrastructure fails if any of the three member systems (Cloud, Fog or Edge) falls into a down state. This strict requirement leads to a huge decrease in the SSA (and therefore a corresponding increase in the downtime per year) of the IoMT infrastructure.
Availability analyses of case-studies and operative scenarios:: We performed computations of MTTFeq, MTTReq, SSA (additionally associated with #9s) and downtime hours in a year for the four operative scenarios (as shown in Table VI) and the five case-studies (as shown in Table VII). For intuitive understanding, we plot the SSAs associated with downtime hours per year for the scenario analyses using bar charts in Fig. 14a, and the SSAs of all member systems of the IoMT infrastructure in the case-studies in Fig. 14b.
As observed, when the system is designed to satisfy the strict requirement of scenario I (the default architecture), in which the whole IoMT infrastructure is considered to be in a system failure as soon as one of the underlying computing systems (Cloud or Fog) goes down, the overall SSA of the IoMT infrastructure is lower (#9s at about 2.636), and therefore its downtime is much longer (about 20.27 hours per year), than that of the IoMT infrastructure in the other operative scenarios. When we relax this constraint and allow scenario IV, in which the overall IoMT infrastructure experiences a system failure only if both underlying computing systems (Cloud and Fog) stay in a downtime period at the same time, the SSA of the whole IoMT infrastructure clearly increases (#9s at about 3.105) and the corresponding downtime hours in a year decrease vastly (to about 6.88 hours) in comparison to those measures of the IoMT infrastructure in the other scenarios. When we allow the worst-case scenarios, in which only one of the underlying computing systems is available for a long period (Fog being unavailable or not taken into account in scenario II, and Cloud in scenario III), the SSAs of the IoMT infrastructure in those scenarios are higher than that in scenario I and lower than the measure in scenario IV. This is due to the fact that the IoMT infrastructure is strictly considered to be in a system failure if any of its member systems fails; therefore, the more member systems are involved in the overall architecture, the lower the SSA of the overall IoMT infrastructure. As observed further, the IoMT with Cloud (and without Fog) in scenario II possesses a slightly higher SSA (#9s at about 2.813) than the IoMT with Fog (and without Cloud) in scenario III does (#9s at about 2.807). Thus, it costs about 13.66 hours of downtime in a year for scenario III, whereas it is slightly lower at about 13.49 hours for scenario II.
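As a back-of-the-envelope check of the four scenarios, the snippet below combines the member-system SSAs of Table II under the structure functions described above (assuming independence of the member systems); it reproduces the ordering and the approximate downtime figures of Table VI.

```python
# Member-system steady-state availabilities taken from Table II (Cloud, Fog, Edge)
A_c, A_f, A_e = 0.999244421, 0.999224198, 0.999215584

ssa_I   = A_c * A_f * A_e                        # Scenario I : Cloud AND Fog AND Edge must be up
ssa_II  = A_c * A_e                              # Scenario II : Fog not taken into account
ssa_III = A_f * A_e                              # Scenario III: Cloud not taken into account
ssa_IV  = (1 - (1 - A_c) * (1 - A_f)) * A_e      # Scenario IV : fails only if Cloud AND Fog are both down

for name, a in [("I", ssa_I), ("II", ssa_II), ("III", ssa_III), ("IV", ssa_IV)]:
    print(f"Scenario {name}: SSA = {a:.6f}, downtime = {(1 - a) * 8760:.2f} h/year")
```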
In Table VII, the major measures of interest are computed for the comparison of the member systems, as well as of the overall IoMT infrastructure, across the case-studies. We use the analysis results of case I (the default IoMT infrastructure) as a baseline for the comparison with the other cases. Observing the relative differences in Fig. 14b, we find that the configuration changes clearly lead to respective increases or decreases of the overall values of the measures of interest. In case II, when the IoMT infrastructure operates on top of two redundant Cloud centers, the SSA of the Cloud system is vastly improved, and thus the downtime hours per year are clearly shortened in comparison to those measures of the Cloud system in case I. Thanks to the SSA enhancement of the cloud member system, the overall SSA of the IoMT infrastructure is clearly improved; the IoMT infrastructure's downtime hours in a year also decrease accordingly, from about 20.271 hours down to 13.667 hours per year. In case III, we see a relative increase of the SSA (and thus a corresponding decrease of downtime hours per year) of the IoMT infrastructure with a double-sized configuration of the Fog member system in comparison to those measures in case II. This emphasizes the significance of scaling up the size of the Fog member system to achieve higher availability. In case IV, however, we can see a clear decrease of the SSA and a severe increase of downtime hours per year in comparison to those measures of case III and case II, as well as of case I. The cause of this clear decrease is the use of a larger number of IoMT medical sensors/devices at the Edge level under a strict requirement in which a failure of a medical device at the Edge level is considered a severe failure of the system, as in modern medical systems with a high level of operational autonomy. In those severe circumstances, the use of both scaled-up architectures of the Cloud and Fog systems can clearly enhance the measures, as shown by the analysis output of case V.
C. Sensitivity Analyses
Fig. 15a, Fig. 15c, Fig. 15e and Fig. 15g show the variation of the overall system's steady-state availability with respect to the changes of the MTTFeq of the main subsystems in the member systems Cloud, Fog and Edge, respectively, and with respect to the changes of the MTTFeq of the member systems themselves. Fig. 15b, Fig. 15d, Fig. 15f and Fig. 15h show the corresponding sensitivity analyses of the system's SSA with respect to the MTTReq of the same subsystems and member systems, respectively. Last but not least, Fig. 15i shows the impact of the wireless connection in the edge member system on the overall SSA of the IoMT infrastructure.
TABLE II: Steady state analysis of IoMT infrastructure under default parameters
Systems | MTTFeq | MTTReq | SSA | #9s | Downtime (hours/year)
Cloud | 5.385 × 10^3 | 4.072 × 10^0 | 9.992 × 10^-1 | 3.122 | 6.62
Fog | 4.844 × 10^3 | 3.761 × 10^0 | 9.992 × 10^-1 | 3.110 | 6.80
Edge | 1.119 × 10^3 | 8.788 × 10^-1 | 9.992 × 10^-1 | 3.105 | 6.87
IoMT | 7.779 × 10^2 | 1.804 × 10^0 | 9.977 × 10^-1 | 2.636 | 20.27

TABLE III: Steady state availability analysis of Cloud under default parameters
Components | MTTFeq | MTTReq | SSA | #9s
cCPU | 7.695 × 10^6 | 3.232 × 10^2 | 9.99957997 × 10^-1 | 4.377
cMEM | 1.673 × 10^8 | 9.784 × 10^0 | 9.99999942 × 10^-1 | 7.233
cNET | 2.416 × 10^7 | 1.397 × 10^1 | 9.99999421 × 10^-1 | 6.238
cPWR | 6.221 × 10^7 | 1.913 × 10^1 | 9.99999693 × 10^-1 | 6.512
cCOO | 7.924 × 10^7 | 1.786 × 10^1 | 9.99999775 × 10^-1 | 6.647
cVMM | 1.559 × 10^2 | 1.221 × 10^0 | 9.92232804 × 10^-1 | 2.110
cVM | 1.684 × 10^2 | 1.558 × 10^0 | 9.90833169 × 10^-1 | 2.038
cAPP | 9.383 × 10^6 | 1.347 × 10^-1 | 9.99999986 × 10^-1 | 7.843
cServer | 8.097 × 10^1 | 1.392 × 10^0 | 9.83094715 × 10^-1 | 1.772
cGatewayHW | 2.0 × 10^5 | 7.2 × 10^1 | 9.99640129 × 10^-1 | 3.444
cGatewaySW | 1.062 × 10^3 | 1.626 × 10^1 | 9.84925031 × 10^-1 | 1.822
cGateway | 1.057 × 10^3 | 1.657 × 10^1 | 9.84570585 × 10^-1 | 1.812
cStorage | 5.278 × 10^4 | 3.933 × 10^1 | 9.99255374 × 10^-1 | 3.128
Cloud | 5.385 × 10^3 | 4.072 × 10^0 | 9.99244421 × 10^-1 | 3.122

TABLE IV: Steady state availability analysis of Fog under default parameters
Components | MTTFeq | MTTReq | SSA | #9s
fCPU | 1.989 × 10^6 | 1.319 × 10^2 | 9.99933661 × 10^-1 | 4.178
fMEM | 1.339 × 10^8 | 7.336 × 10^0 | 9.99999945 × 10^-1 | 7.261
fNET | 4.014 × 10^7 | 6.944 × 10^0 | 9.99999827 × 10^-1 | 6.762
fPWR | 4.511 × 10^7 | 7.867 × 10^0 | 9.99999825 × 10^-1 | 6.759
fCOO | 5.579 × 10^7 | 6.571 × 10^0 | 9.99999882 × 10^-1 | 6.929
fOS | 1.501 × 10^2 | 6.177 × 10^-1 | 9.95901012 × 10^-1 | 2.387
fServer | 1.499 × 10^2 | 6.274 × 10^-1 | 9.95833256 × 10^-1 | 2.380
fGatewayHW | 1.0 × 10^5 | 2.40 × 10^1 | 9.99760057 × 10^-1 | 3.620
fGatewaySW | 4.592 × 10^2 | 7.356 × 10^0 | 9.84231902 × 10^-1 | 1.802
fGateway | 4.571 × 10^2 | 7.434 × 10^0 | 9.83995743 × 10^-1 | 1.796
fStorage | 1.764 × 10^5 | 2.931 × 10^1 | 9.99833870 × 10^-1 | 3.780
Fog | 4.844 × 10^3 | 3.761 × 10^0 | 9.99224198 × 10^-1 | 3.110

TABLE V: Steady state analysis of Edge under default parameters
Components | MTTFeq | MTTReq | SSA | #9s
ihBAT | 3.497 × 10^2 | 1.196 × 10^1 | 9.669249 × 10^-1 | 1.481
ihSEN | 4.320 × 10^4 | 3.333 × 10^-1 | 9.999923 × 10^-1 | 5.113
ihADC | 8.640 × 10^3 | 7.500 × 10^-1 | 9.999132 × 10^-1 | 4.061
ihMCU | 8.640 × 10^3 | 6.667 × 10^-1 | 9.999228 × 10^-1 | 4.113
ihMEM | 2.577 × 10^4 | 2.50 × 10^-1 | 9.999903 × 10^-1 | 5.013
ihTRx | 1.191 × 10^4 | 8.0 × 10^0 | 9.993289 × 10^-1 | 3.173
ihOSapp | 7.427 × 10^2 | 1.021 × 10^-1 | 9.998625 × 10^-1 | 3.862
ihSensor | 2.182 × 10^2 | 7.687 × 10^0 | 9.659680 × 10^-1 | 1.468
iWireless | 2.0 × 10^4 | 1.0 × 10^0 | 9.999500 × 10^-1 | 4.301
iGatewayHW | 1.50 × 10^5 | 6.0 × 10^0 | 9.999600 × 10^-1 | 4.398
iGatewaySW | 3.088 × 10^2 | 9.994 × 10^-1 | 9.967742 × 10^-1 | 2.491
iGateway | 3.082 × 10^2 | 1.009 × 10^0 | 9.967343 × 10^-1 | 2.486
iaBAT | 2.346 × 10^2 | 6.493 × 10^0 | 9.730653 × 10^-1 | 1.570
iaSEN | 4.320 × 10^4 | 2.50 × 10^-1 | 9.999942 × 10^-1 | 5.238
iaADC | 4.320 × 10^3 | 4.167 × 10^-1 | 9.999036 × 10^-1 | 4.016
iaMCU | 4.320 × 10^3 | 2.50 × 10^-1 | 9.999421 × 10^-1 | 4.238
iaMEM | 1.725 × 10^4 | 1.667 × 10^-1 | 9.999903 × 10^-1 | 5.015
iaTRx | 9.760 × 10^3 | 4.0 × 10^0 | 9.995903 × 10^-1 | 3.388
iaOSapp | 6.723 × 10^2 | 1.059 × 10^-1 | 9.998425 × 10^-1 | 3.803
iaSensor | 1.563 × 10^2 | 4.446 × 10^0 | 9.723483 × 10^-1 | 1.558
Edge | 1.119 × 10^3 | 8.788 × 10^-1 | 9.992156 × 10^-1 | 3.105

SSA wrt. MTTFeq:: Observing the graphs one after another, we notice that the SSA of the IoMT infrastructure has a common tendency of increasing gradually
towards a stable availability as the respective MTTFeq increases and reaches a high value. Therefore, when the operations of the subsystems and member systems in the IoMT infrastructure can last longer before a failure occurs, the overall IoMT infrastructure actually gains higher availability. On closer observation, we also see that the SSA does not increase in proportion to the increase of the MTTFeq. Rather, the SSA of the IoMT infrastructure is enhanced vastly by a small increase of the MTTFeq when the value
of the MTTFeq is in a range of small values.
TABLE VI: Steady state availability analysis of IoMT infrastructure in different scenarios
Sce. | MTTFeq | MTTReq | SSA | #9s | DN (hrs/y)
I | 7.779 × 10^2 | 1.804 × 10^0 | 9.97686 × 10^-1 | 2.636 | 20.271
II | 9.268 × 10^2 | 1.429 × 10^0 | 9.98461 × 10^-1 | 2.813 | 13.485
III | 9.093 × 10^2 | 1.420 × 10^0 | 9.98440 × 10^-1 | 2.807 | 13.662
IV | 1.119 × 10^3 | 1.119 × 10^3 | 9.99215 × 10^-1 | 3.105 | 6.877
[Fig. 14: Steady state availability values of IoMT infrastructures in case-studies and operative scenarios. Panels: (a) four operative scenarios (SSA and downtime, hours/year), (b) five case-studies (SSA of Cloud, Fog, Edge and IoMT)]
But the SSA
changes only a little with respect to the MTTFeq in the range of high values. That is to say, the SSA of the IoMT infrastructure has a low sensitivity with respect to the MTTFeq of the subsystems and/or member systems in the range of high values, and vice versa.
SSA wrt. MTTReq:: On the other hand, the sensitivity of the SSA of the IoMT infrastructure wrt. the MTTReq of the subsystems and member systems is opposite to the sensitivity of the SSA wrt. the MTTFeq detailed in the above analysis. Particularly, the SSA decreases as soon as the MTTReq increases, and it drops quickly if the MTTReq increases in the range of high values. Therefore, the SSA has a smaller sensitivity wrt. the MTTReq in the range of small values and a much higher sensitivity wrt. high values of the MTTReq. This means that the longer the recovery of a subsystem and/or member system under a certain failure takes, the more deeply and quickly the overall SSA of the IoMT infrastructure drops.
The above analyses of the sensitivity of the IoMT infrastructure's SSA with respect to the MTTFeq and MTTReq are significant for the trade-off between designing the system architecture and selecting components and devices, in order not only to achieve high availability but also to balance the related costs of maintenance and device purchases.
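The opposite sensitivities can be seen already for a single two-state repairable block with A = MTTF/(MTTF + MTTR), for which dA/dMTTF = MTTR/(MTTF + MTTR)^2 shrinks as MTTF grows, while dA/dMTTR = -MTTF/(MTTF + MTTR)^2 keeps a large magnitude. The sketch below, with arbitrary numbers rather than the paper's parameters, illustrates the diminishing returns.

```python
def two_state_ssa(mttf, mttr):
    """SSA of a single repairable unit: A = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

def d_ssa_d_mttf(mttf, mttr):
    """Partial derivative of A wrt. MTTF: MTTR / (MTTF + MTTR)^2."""
    return mttr / (mttf + mttr) ** 2

def d_ssa_d_mttr(mttf, mttr):
    """Partial derivative of A wrt. MTTR: -MTTF / (MTTF + MTTR)^2."""
    return -mttf / (mttf + mttr) ** 2

mttr = 4.0
for mttf in (100.0, 1000.0, 10000.0):
    print(f"MTTF={mttf:7.0f} h  A={two_state_ssa(mttf, mttr):.6f}  "
          f"dA/dMTTF={d_ssa_d_mttf(mttf, mttr):.2e}  dA/dMTTR={d_ssa_d_mttr(mttf, mttr):.2e}")
```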
Detailed analyses of the figures:: Fig. 15a shows the variation of the SSA of the IoMT infrastructure wrt. the changes of the MTTFeq of three subsystems: the server, gateway and storage of the cloud member system. Under the default system configuration, the variation of the cloud server's MTTFeq has the greatest impact on the IoMT infrastructure's SSA, in the sense that the IoMT infrastructure always has the highest SSA for any value of the cloud server's MTTFeq in comparison with the corresponding MTTFeq values of the cloud gateway and storage. However, the sensitivity of the SSA wrt. the cloud server is lower than that wrt. the cloud gateway and storage, since the difference in SSA in the case of the cloud server is smaller than the corresponding measures in the cases of the cloud gateway and storage when the MTTFeq varies from low values towards high values. The cloud storage yields a slightly higher SSA under the variation of the MTTFeq in comparison with the cloud gateway.
Fig. 15b presents the dependency of the IoMT infrastructure's SSA on the recovery of the subsystems (servers, gateways and storages) in the cloud member system. In general, the longer it takes to recover a failure in the cloud member system, the more steeply the overall SSA of the IoMT infrastructure drops. Particularly, the MTTReq of the cloud servers causes the greatest sensitivity of the IoMT infrastructure's SSA, as seen by the huge drop of the SSA with an increase of the cloud server's MTTReq in comparison with the other subsystems of the cloud. On the other hand, the variation of the cloud storage subsystem's MTTReq has the lowest negative impact and the lowest sensitivity on the overall SSA of the IoMT infrastructure; therefore, the SSA remains high under any increase or decrease of the cloud storage's MTTReq. The graph representing the case of the cloud gateway shows a slightly higher sensitivity of the IoMT infrastructure's SSA wrt. its MTTReq compared with the case of the cloud storage, but a much lower sensitivity in comparison with the case of the cloud server.
Fig. 15c shows the IoMT infrastructure's SSA wrt. the variation of the MTTFeq of the servers, gateways and storages in the fog member system. The variation of the fog servers' MTTFeq again has the greatest impact on the IoMT infrastructure's SSA, in that the SSA maintains its highest values in comparison with the cases of the fog gateway and fog storage; but the SSA clearly has a lower sensitivity to the variation of the MTTFeq when compared with that of the fog gateway and storage. The SSA in the case of the fog gateway is generally higher than that in the case of the fog storage, and it also has a higher sensitivity wrt. the variation of the MTTFeq, as indicated by the steeper slope of the graph representing the case of the fog gateway.
Fig. 15d depicts the impact of the recovery time (MTTReq) of the fog subsystems on the overall SSA of the IoMT infrastructure. Similar to the cases of the cloud subsystems in Fig. 15b, the fog server has the greatest impact on the overall SSA of the IoMT infrastructure in comparison with the other fog subsystems, and thus the SSA has the highest sensitivity wrt. the variation of the fog server's MTTReq. The MTTReq of the fog storage again has the least impact on the SSA of the IoMT infrastructure; thus the sensitivity of the IoMT infrastructure's SSA wrt. the variation of the fog storage's MTTReq is the lowest among the fog subsystems.
TABLE VII: Steady state availability analyses of IoMT infrastructures in different case-studies
Case | Metrics | Cloud | Fog | Edge | IoMT
Case I | MTTFeq | 5.385 × 10^3 | 4.844 × 10^3 | 1.119 × 10^3 | 7.779 × 10^2
Case I | MTTReq | 4.072 × 10^0 | 3.761 × 10^0 | 8.788 × 10^-1 | 1.804 × 10^0
Case I | SSA | 9.99244421 × 10^-1 | 9.99224199 × 10^-1 | 9.99215584 × 10^-1 | 9.97685991 × 10^-1
Case I | Downtime (hours/year) | 6.619 | 6.796 | 6.871 | 20.271
Case II | MTTFeq | 3.566 × 10^6 | x | x | 9.091 × 10^2
Case II | MTTReq | 2.036 × 10^0 | x | x | 1.421 × 10^0
Case II | SSA | 9.99999429 × 10^-1 | x | x | 9.98439821 × 10^-1
Case II | Downtime (hours/year) | 0.005 | x | x | 13.667
Case III | MTTFeq | x | 3.124 × 10^6 | x | 9.265 × 10^2
Case III | MTTReq | x | 1.880 × 10^0 | x | 1.429 × 10^0
Case III | SSA | x | 9.99999398 × 10^-1 | x | 9.98459997 × 10^-1
Case III | Downtime (hours/year) | x | 0.00527 | x | 13.490
Case IV | MTTFeq | x | x | 5.597 × 10^2 | 4.589 × 10^2
Case IV | MTTReq | x | x | 8.791 × 10^-1 | 1.426 × 10^0
Case IV | SSA | x | x | 9.98431783 × 10^-1 | 9.96903390 × 10^-1
Case IV | Downtime (hours/year) | x | x | 13.738 | 27.126
Case V | MTTFeq | 3.566 × 10^6 | 3.124 × 10^6 | 5.597 × 10^2 | 5.595 × 10^2
Case V | MTTReq | 2.036 × 10^0 | 1.880 × 10^0 | 8.791 × 10^-1 | 8.795 × 10^-1
Case V | SSA | 9.99999429 × 10^-1 | 9.99999398 × 10^-1 | 9.98431783 × 10^-1 | 9.98430612 × 10^-1
Case V | Downtime (hours/year) | 0.005 | 0.00527 | 13.73758 | 13.74784
(x denotes the same value as in case I)
Fig. 15e shows the variation of the IoMT infrastructure's SSA wrt. the MTTFeq of the IoMT sensors and the IoMT gateway. The graphs have a steep incline when the MTTFeq is in a range of small values; thus the SSA of the IoMT infrastructure increases
by a vast amount when there is an increase of the MTTFeq of the IoMT sensors and gateway. In more detail, the IoMT health sensor has a negative impact on the IoMT infrastructure's SSA, in the sense that low values of its MTTFeq pull the IoMT SSA down severely in comparison with the other subsystems of the edge. For greater values of the MTTFeq of the edge subsystems, the IoMT infrastructure's overall SSA gradually reaches a steady and high value.
Fig. 15f depicts the dependency of the IoMT infrastructure's SSA on the MTTReq of the IoMT devices in the edge member system. As observed from the graphs, the overall SSA of the IoMT infrastructure has the highest sensitivity wrt. the IoMT ambient sensors, depicted by the steepest fall of the corresponding graph in comparison with the other cases. The least steep drop of the overall SSA wrt. an increase of the MTTReq is in the case of the IoMT gateway. This emphasizes that, under the default system configuration, the IoMT ambient sensors encounter more frequent failures and thus require faster recovery actions (i.e., a smaller MTTReq) to maintain high availability of the IoMT infrastructure. Observing the curves of the graphs in more detail, it is interesting that if recovery operations can be performed as fast as possible (i.e., a shorter mean time to recover a failure), system operators should pay the most attention to the recovery of the IoMT health sensors first (represented by the uppermost graph with dot markers), the IoMT ambient sensors second, and the IoMT gateway last (represented by the lowermost graph with five-pointed star markers) in order to maintain higher values of the IoMT infrastructure's availability. Nevertheless, in the case that the recovery processes take a longer time, one may prioritize the recovery of the IoMT gateway first (depicted by the topmost graph with five-pointed star markers), then the recovery of the IoMT health sensors (depicted by the graph with dot markers), and finally the recovery of the IoMT ambient sensors (depicted by the bottom-most graph with diamond markers). Furthermore, one needs to be aware that the recovery of the IoMT ambient and health sensors should be conducted quickly to avoid a steep drop of the IoMT infrastructure's availability. Therefore, depending on the recovery capabilities and resources of the IoMT infrastructure, system operators can plan appropriate recovery strategies for the IoMT devices in the edge member system, with proper trade-offs, to gain the highest availability for the overall IoMT infrastructure.
Fig. 15g presents the variation of the IoMT infrastructure's SSA over the changes of the member systems' MTTFeq. We computed the IoMT infrastructure's SSA while considering each member system as a whole individual part of the IoMT infrastructure characterized by its MTTFeq. As observed from the graphs, the IoMT infrastructure's SSA has a much higher sensitivity wrt. the MTTFeqs of the cloud and fog member systems in comparison with that of the edge member system, especially when the MTTFeqs are in the range of small values. Low values of the MTTFeq of the cloud or fog systems cause a severe drop of the IoMT infrastructure's SSA. The variation of the edge member system's MTTFeq, however, performs better in enhancing the IoMT infrastructure's SSA, since the values of the SSA in the case of the edge member system remain visibly above the rest under the variation of the MTTFeq of the member systems.
Fig. 15h shows the variation of the IoMT infrastructure's SSA wrt. the overall MTTReq of the cloud, fog and edge member systems. In comparison to the other systems, the variation of the edge member system's MTTReq has the greatest impact on the overall SSA of the IoMT infrastructure, and thus the SSA has the highest sensitivity wrt. the MTTReq of the edge member system, as clearly depicted by the steep incline of the graph with five-pointed star markers. The least sensitive case is the variation of the IoMT infrastructure's SSA wrt. the cloud member system's MTTReq, as shown by the graph with dot markers. In addition to the above sensitivity analyses, it is worth pointing out that one should perform rapid recovery processes for a failure in the edge member system so as not to drop the overall SSA severely, and should also prioritize the recovery operations of the cloud member system first and of the fog member system second, in order to keep the IoMT infrastructure's availability at higher values.
SSA wrt. wireless connectivity:: Fig. 15i presents the dependence of the IoMT infrastructure's SSA on the variation of the probability of wireless connection in the edge member system. As observed from the graph, the variation of the IoMT infrastructure's SSA is proportional to any change of the wireless connectivity in the edge member system. We assume that the conditions of the wireless connection (e.g., signal strength, an open line of sight, etc.) are good enough to achieve high values of the probability of wireless connection. As seen in the figure, even a small decrease of the wireless connectivity causes a proportionally clear drop in the IoMT infrastructure's SSA.
D. Cyber-security Analysis
Fig. 16 shows the security analysis based on the variation of SSA wrt. attack intensity on different software subsystems in the IoMT infrastructure under default parameters. Severe cyber-security attacks often aim directly at causing service disruption and/or operational termination [?]. Vulnerabilities exposed in software subsystems are likely to be exploited over and over again through repeated attacks within a period of time. For that reason, attack intensity clearly impacts the availability attribute of the IoMT infrastructure. The attack intensity is characterized by the number of attacks in a specific duration (i.e., the attack frequency), roughly computed as the multiplicative inverse of the mean time to an attack (MTTA), $f_A = 1/\mathrm{MTTA}$. The software subsystems considered to be vulnerable to cyber-attacks include cVMM, cVM, cgSW, fOS, fgSW and igSW. The attack intensities of those software subsystems are denoted by the frequencies fcVMMa, fcVMa, fcgSWa, ffOSa, ffgSWa and figSWa, respectively.
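As a simplified illustration of how an attack frequency enters such an analysis, the sketch below builds a three-state CTMC (up, compromised, recovering) for a single software subsystem, converts the attack frequency from times/day into a per-hour rate, and solves the balance equations for the steady-state availability. The state space and the detection/recovery times are assumptions made for this sketch; the paper's subsystem CTMCs contain more states (degradation, rejuvenation, adaptation, etc.).

```python
import numpy as np

# Simplified illustration only: 0 = up, 1 = compromised by an attack,
# 2 = recovering. The timing values below are assumed, not taken from
# the default parameter tables.
MTT_DETECT_H = 3.0   # assumed mean time to trigger recovery after an attack (hours)
MTTR_H = 1.5         # assumed mean recovery duration (hours)

def ssa_vs_attack_frequency(freq_per_day: float) -> float:
    """Steady-state probability of the 'up' state for a given attack frequency."""
    lam_a = freq_per_day / 24.0      # attack rate per hour, f_A = 1/MTTA
    mu_d = 1.0 / MTT_DETECT_H        # compromised -> recovering
    mu_r = 1.0 / MTTR_H              # recovering  -> up
    Q = np.array([[-lam_a, lam_a, 0.0],
                  [0.0,   -mu_d,  mu_d],
                  [mu_r,   0.0,  -mu_r]])
    # Solve pi Q = 0 with sum(pi) = 1 by replacing the last balance equation.
    A = np.vstack([Q.T[:-1], np.ones(3)])
    b = np.array([0.0, 0.0, 1.0])
    pi = np.linalg.solve(A, b)
    return pi[0]

for f in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"f_A = {f:.1f} attacks/day  ->  SSA ~ {ssa_vs_attack_frequency(f):.6f}")
```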
Fig. 16a - Fig. 16f show the analysis results of the cases one after another. As observed, when the number of attacks within a specific duration of 24 hours increases, the attack consequences generally grow, causing large drops in the availability values of the IoMT infrastructure. Fig. 16a, Fig. 16b and Fig. 16d show that (i) when the attack intensity is low, the software subsystems cVMM, cVM and fOS can still secure high availability values for the IoMT infrastructure, (ii) but when the number of attacks in the same period of time increases (in other words, a higher attack frequency), those software subsystems fail to maintain the availability values and the availability rapidly drops, as depicted by a nearly vertical falling curve. Fig. 16c, Fig. 16e and Fig. 16f show that the software subsystems cgSW, fgSW and igSW are very sensitive to cyber-attacks, in that even a few attacks at the beginning cause a severe drop in availability, as depicted by the vertically falling graphs. Fig. 16g shows a relative comparison of the impact of cyber-attacks on the IoMT infrastructure's availability across the software-subsystem cases. As observed, fgSW is the most vulnerable to cyber-attacks: a small number of attacks on fgSW clearly causes a large drop in the IoMT infrastructure's availability, and the higher the attack intensity incurred on fgSW, the more availability the IoMT infrastructure loses in comparison with the cases of the remaining software subsystems, as depicted by the graph with square markers. The second most vulnerable software is igSW, while cVM and cVMM are progressively less vulnerable. The least vulnerable software, in comparison with the other cases, is fOS, depicted by the smallest deviation of the graph with right-pointing triangle markers. The above analyses help pinpoint the critical subsystems prone to cyber-security attacks: (i) early attacks on the gateways' software, including cgSW, fgSW and igSW, cause an immediate loss of the IoMT infrastructure's availability, (ii) a large number of attacks on the software subsystems cVMM, cVM and fOS within a specific period of time also causes a substantial loss of the IoMT infrastructure's availability, and (iii) the fog and edge gateways' software (fgSW and igSW) are the most critical to securing the availability of the IoMT infrastructure.
[Fig. 16: Security analysis wrt. attack intensity. Panels (a)-(f) plot the steady state availability of the IoMT infrastructure against the attack frequency (times/day, from 0 to 0.5) for fcVMMa, fcVMa, fcgSWa, ffOSa, ffgSWa and figSWa, respectively; panel (g) overlays all six cases.]
VII. CONCLUSION
Reliability, availability and security have always been important dependability measures for quantifying the level of QoS of IoMT infrastructures for healthcare monitoring in the circumstances of rising health-related pandemics. In this paper, we presented a theoretical approach for the dependability quantification of IoMT infrastructures using a three-level hierarchical modeling and analysis framework. A comprehensive hierarchical model was also developed in accordance with a three-level architecture of a typical IoMT infrastructure for healthcare monitoring featuring a continuum of the three computing paradigms of cloud, fog and edge. The overall hierarchical infrastructure model consists of (i) FT models at the top and middle levels to capture the corresponding structural architectures of the overall infrastructure, its member systems and subsystems, and (ii) Markov models at the bottom level to incorporate various operative behaviors (failure and recovery) of both hardware and software components within
every subsystem in a comprehensive manner.
[Fig. 15: Steady state availability of IoMT wrt. impacting factors. Panels: (a) MTTFeq of Cloud; (b) MTTReq of Cloud; (c) MTTFeq of Fog; (d) MTTReq of Fog; (e) MTTFeq of Edge; (f) MTTReq of Edge; (g) MTTFeq of all cases; (h) MTTReq of all cases; (i) probability of IoMT wireless connection. Each panel plots the steady state availability against the varied equivalent mean time (hours) of the indicated subsystems or against the wireless connection probability.]
Specific case-
studies and operative scenarios with corresponding hierarchical
models were also presented to comprehend the dependability
properties of the IoMT infrastructure under various configura-
tions as well as in different circumstances. Numerous analyses
were performed in detail for the developed models of the default
architecture of the considered IoMT infrastructure and the
alternated architectures of case-studies and scenarios, including
(i) reliability analyses, (ii) steady-state availability analyses, (iii)
availability sensitivity analyses wrt. impacting factors and (iv)
security analyses considering the impact of attack intensities
on system availability. The proposed hierarchical modeling and analysis framework, along with the developed hierarchical models and comprehensive analyses of a typical IoMT infrastructure presented in this work, is expected (i) to provide a novel approach towards the dependability evaluation of sophisticated, multi-level architectures of IoMT infrastructures in practice, and (ii) to help guide system developers and practitioners either to design an optimised architecture of IoMT infrastructures that obtains a proper level of QoS or to operate a practical IoMT infrastructure so as to provide the best possible operative services to clients. Further work, including (i) performance analysis under availability constraints based on hierarchical stochastic models (performability analysis) and (ii) the development of automated evaluation tools for different structural architectures of IoMT infrastructures, can be a fruitful extension of this study that provides a multi-perspective comprehension of the cloud/fog/edge continuum-based IoMT infrastructure.
ACKNOWLEDGEMENT
This paper was supported by KU Brain Pool 2019, Konkuk
University, Seoul, South Korea.
The authors would like to thank Kishor S. Trivedi and his
research team at Duke University, Durham, NC 27708, United
States, for providing the tool SHARPE.
APPENDIX A: TABLES OF DEFAULT INPUT PARAMETERS
TABLE VIII: DEFAULT CONFIGURATION PARAMETERS OF FT MODELS
Name | Description | Values
c, ncServer | Total number of cloud servers | 5
c1, mcServer | Minimum number of cloud servers for the pool of cloud servers to operate normally | 2
g, ncGateway | Total number of cloud gateways | 3
g1, mcGateway | Minimum number of cloud gateways for the pool of cloud gateways to be considered in operational state | 1
s, ncStorage | Total number of cloud storage units | 3
s1, mcStorage | Minimum number of cloud storage units for the network of cloud storage units to be considered in operational state | 1
f, nfNode | Total number of fog nodes | 3
f1, mfNode | Minimum number of fog nodes for the pool of fog nodes to be considered in operational state | 1
m, nfServer | Total number of fog servers | 3
m1, mfServer | Minimum number of fog servers for the pool of fog servers to be considered in operational state | 1
e, niNode | Total number of edge nodes | 3
i, nihSensor | Total number of IoMT health sensors | 5
i1, mihSensor | Minimum number of IoMT health sensors for the pool of IoMT health sensors to be considered in operational state | 3
j, niaSensor | Total number of IoMT ambient sensors | 5
j1, miaSensor | Minimum number of IoMT ambient sensors for the pool of IoMT ambient sensors to be considered in operational state | 3
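Several entries in Table VIII define k-out-of-n redundancy (for example, at least 2 of 5 cloud servers, at least 3 of 5 IoMT health sensors, at least 1 of 3 cloud gateways). As a hedged illustration of how such FT configuration parameters translate into a pool availability, the sketch below evaluates the standard k-out-of-n formula for assumed per-unit steady-state availabilities; the per-unit values come from the lower-level CTMC models in the paper and the numbers used here are only placeholders.

```python
from math import comb

def k_out_of_n_availability(n: int, k: int, unit_avail: float) -> float:
    """Probability that at least k of n independent, identical units are up."""
    return sum(
        comb(n, i) * unit_avail**i * (1.0 - unit_avail) ** (n - i)
        for i in range(k, n + 1)
    )

# Illustrative per-unit availabilities (assumptions, not the paper's results).
print(k_out_of_n_availability(n=5, k=2, unit_avail=0.995))  # cloud server pool (2-of-5)
print(k_out_of_n_availability(n=5, k=3, unit_avail=0.990))  # IoMT health sensor pool (3-of-5)
print(k_out_of_n_availability(n=3, k=1, unit_avail=0.998))  # cloud gateway pool (1-of-3)
```

The same helper applies to every n/m pair in Table VIII once the corresponding per-unit availabilities have been obtained from the bottom-level models.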
TABLE IX: DEFAULT INPUT PARAMETERS FOR MODELS OF CLOUD GATEWAYS (cGateway)
Name | Description | Values
Cloud Gateway Hardware (cgHW)
1cgHW | Mean time to a failure of a cloud gateway hardware | 200000 hours
1cgHW | Mean time to recover a failed cloud gateway hardware | 8 hours
Cloud Gateway Software (cgSW)
1cgSW | Mean time to a failure of a cloud gateway software | 2760 hours
1cgSWa | Mean time to a cyber-attack on a cloud gateway software after it has been compromised | 4320 hours
1cgSWd | Mean time to a performance degradation of a cloud gateway software from its normal state | 1560 hours
1cgSWv | Mean time to a vulnerability of a cloud gateway software | 600 hours
1cgSWc | Mean time to a compromise of a cloud gateway software after becoming vulnerable to cyber-attacks | 120 hours
1cgSWarej | Mean time to trigger a recovery process after a cyber-attack on a cloud gateway software | 3 hours
1cgSWdr | Mean time to recover an attacked cloud gateway software after its adaptation process | 45 minutes
1cgSWarej | Mean time of a recovery process of a cloud gateway software due to a cyber-attack | 1.5 hours
1cgSWdrej | Mean time of a rejuvenation process of a cloud gateway software due to a performance degradation | 2.5 hours
1cgSW | Mean time to recover a failed cloud gateway software from the failure state to the normal state | 3.5 hours
1cgSWad | Mean time of an adaptation process after a cyber-attack on a cloud gateway software | 5.5 hours
ccgSWa | Coverage factor of an attack on a cloud gateway software | 0.95
ccgSWad | Coverage factor of an adaptation process after a cyber-attack on a cloud gateway software | 0.90
ccgSWarej | Coverage factor of a recovery process after a cyber-attack on a cloud gateway software | 0.90
ccgSWdrej | Coverage factor of a rejuvenation process after a performance degradation of a cloud gateway software | 0.95
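The mean times in Table IX (and in the analogous tables below) are given in mixed units; before they can parameterize a CTMC generator matrix they must be converted to rates in a common time base. The small helper below is a sketch that assumes hours as the base unit and uses a few cgSW entries as examples; the lambda/mu naming of the resulting rates is an illustrative convention, not a reproduction of the paper's symbol definitions.

```python
# Minimal sketch: convert tabulated mean times (minutes/hours/years)
# into per-hour transition rates for CTMC parameterization.
UNIT_IN_HOURS = {"minutes": 1.0 / 60.0, "hours": 1.0, "years": 8760.0}

def rate_per_hour(value: float, unit: str) -> float:
    """Rate = 1 / (mean time expressed in hours)."""
    return 1.0 / (value * UNIT_IN_HOURS[unit])

# A few cloud-gateway-software entries taken from Table IX.
rates = {
    "lambda_cgSW":  rate_per_hour(2760, "hours"),   # software failure
    "lambda_cgSWd": rate_per_hour(1560, "hours"),   # performance degradation
    "lambda_cgSWv": rate_per_hour(600, "hours"),    # becomes vulnerable
    "mu_cgSWdr":    rate_per_hour(45, "minutes"),   # recovery after adaptation
    "mu_cgSW":      rate_per_hour(3.5, "hours"),    # repair from failure
}
for name, r in rates.items():
    print(f"{name:13s} = {r:.6e} per hour")
```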
TABLE X: DEFAULT INPUT PARAMETERS FOR MODELS OF FOG GATEWAYS (fGateway)
Name | Description | Values
Fog Gateway Hardware (fgHW)
1fgHW | Mean time to a failure of a fog gateway hardware | 300000 hours
1fgHW | Mean time to recover a failed fog gateway hardware | 12 hours
Fog Gateway Software (fgSW)
1fgSW | Mean time to a failure of a fog gateway software | 2280 hours
1fgSWa | Mean time to a cyber-attack on a fog gateway software after it has been compromised | 4320 hours
1fgSWd | Mean time to a performance degradation of a fog gateway software from its normal state | 1080 hours
1fgSWv | Mean time to a vulnerability of a fog gateway software | 360 hours
1fgSWc | Mean time to a compromise of a fog gateway software after becoming vulnerable to cyber-attacks | 72 hours
1fgSWarej | Mean time to trigger a recovery process after a cyber-attack on a fog gateway software | 1 hour
1fgSWdr | Mean time to recover an attacked fog gateway software after its adaptation process | 25 minutes
1fgSWarej | Mean time of a recovery process of a fog gateway software due to a cyber-attack | 45 minutes
1fgSWdrej | Mean time of a rejuvenation process of a fog gateway software due to a performance degradation | 75 minutes
1fgSW | Mean time to recover a failed fog gateway software from the failure state to the normal state | 95 minutes
1fgSWad | Mean time of an adaptation process after a cyber-attack on a fog gateway software | 135 minutes
cfgSWa | Coverage factor of an attack on a fog gateway software | 0.98
cfgSWad | Coverage factor of an adaptation process after a cyber-attack on a fog gateway software | 0.85
cfgSWarej | Coverage factor of a recovery process after a cyber-attack on a fog gateway software | 0.90
cfgSWdrej | Coverage factor of a rejuvenation process after a performance degradation of a fog gateway software | 0.90
TABLE XI: DEFAULT INPUT PARAMETERS FOR MODELS OF CLOUD STORAGE (cStorage)
Name | Description | Values
Cloud Storage Pool (cSD)
ncSD | Total number of cloud storage disks in a cloud storage pool | 10
mcSD | Minimum number of cloud storage disks running in normal state for a cloud storage pool to be in operative state | 3
1cSD | Mean time to a failure of a storage disk in a cloud storage pool | 3 years
1cSDm | Mean time to recover a failed cloud storage pool due to a shortage of running storage disks | 12 hours
1cSDf | Mean time to recover all failed disks in a cloud storage pool at once | 36 hours
Cloud Storage Manager (cSM)
1cSMf | Mean time to a failure of a storage manager unit in a cloud storage | 2160 hours
1cSM | Mean time to recover all failed storage managers at once in a cloud storage | 8 hours
1cSMr | Mean time to recover a failed storage manager unit after its completed failure detection | 3 hours
1cSMd | Mean time to detect a failure of a storage manager unit in a cloud storage | 45 minutes
TABLE XII: DEFAULT INPUT PARAMETERS FOR MODELS OF FOG STORAGE (fStorage)
Name | Description | Values
Fog Storage Pool (fSD)
nfSD | Total number of fog storage disks in a fog storage pool | 10
mfSD | Minimum number of fog storage disks running in normal state for a fog storage pool to be in operative state | 3
1fSD | Mean time to a failure of a storage disk in a fog storage pool | 2 years
1fSDm | Mean time to recover a failed fog storage pool due to a shortage of running storage disks | 8 hours
1fSDf | Mean time to recover all failed disks in a fog storage pool at once | 24 hours
Fog Storage Manager (fSM)
1fSMf | Mean time to a failure of a storage manager unit in a fog storage | 1800 hours
1fSM | Mean time to recover all failed storage managers at once in a fog storage | 6 hours
1fSMr | Mean time to recover a failed storage manager unit after its completed failure detection | 90 minutes
1fSMd | Mean time to detect a failure of a storage manager unit in a fog storage | 25 minutes
TABLE XIII: DEFAULT INPUT PARAMETERS FOR MODELS OF IOMT GATEWAYS (iGateway)
Name | Description | Values
IoMT Gateway Hardware (igHW)
1igHW | Mean time to a failure of an IoMT gateway hardware | 150000 hours
1igHW | Mean time to recover a failed IoMT gateway hardware | 6 hours
IoMT Gateway Software (igSW)
1igSW | Mean time to a failure of an IoMT gateway software | 1800 hours
1igSWa | Mean time to a cyber-attack on an IoMT gateway software after it has been compromised | 4320 hours
1igSWd | Mean time to a performance degradation of an IoMT gateway software from its normal state | 600 hours
1igSWv | Mean time to a vulnerability of an IoMT gateway software | 120 hours
1igSWc | Mean time to a compromise of an IoMT gateway software after becoming vulnerable to cyber-attacks | 24 hours
1igSWarej | Mean time to trigger a recovery process after a cyber-attack on an IoMT gateway software | 25 minutes
1igSWdr | Mean time to recover an attacked IoMT gateway software after its adaptation process | 15 minutes
1igSWarej | Mean time of a recovery process of an IoMT gateway software due to a cyber-attack | 35 minutes
1igSWdrej | Mean time of a rejuvenation process of an IoMT gateway software due to a performance degradation | 45 minutes
1igSW | Mean time to recover a failed IoMT gateway software from the failure state to the normal state | 65 minutes
1igSWad | Mean time of an adaptation process after a cyber-attack on an IoMT gateway software | 85 minutes
cigSWa | Coverage factor of an attack on an IoMT gateway software | 0.98
cigSWad | Coverage factor of an adaptation process after a cyber-attack on an IoMT gateway software | 0.80
cigSWarej | Coverage factor of a recovery process after a cyber-attack on an IoMT gateway software | 0.80
cigSWdrej | Coverage factor of a rejuvenation process after a performance degradation of an IoMT gateway software | 0.85
TABLE XIV: DEFAULT INPUT PARAMETERS FOR MODELS OF IOMT WIRELESS CHANNEL (iWireless)
Name | Description | Values
pd | Probability of wireless disconnection | 0.00005
pc | Probability of wireless connection | 0.99995
TABLE XV: DEFAULT INPUT PARAMETERS FOR MODELS OF CLOUD SERVERS (cServer)
Name | Description | Values
Processors (cCPU)
ncCPU | Total number of CPUs in a cloud server | 8
mcCPU | Minimum number of CPUs in operation for a cloud server to operate normally | 3
1cCPU | Mean time to a failure of a CPU in a cloud server | 10 years
1cCPU | Mean time to recover all failed CPUs in a cloud server at once | 24 hours
1cCPUm | Mean time to recover a number of failed CPUs in a cloud server | 8 hours
Memory banks (cMEM)
ncMEM | Total number of memory banks in a cloud server | 8
mcMEM | Minimum number of memory banks in normal state for a cloud server to operate normally | 4
1cMEM | Mean time to a failure of a memory bank in a cloud server | 3,000 hours
1cMEMd | Mean time to performance degradation of a memory bank in a cloud server | 1,800 hours
1cMEMdf | Mean time to a failure after performance degradation of a memory bank in a cloud server | 360 hours
1cMEMdr | Mean time to detect and perform a maintenance of a performance-degraded memory bank in a cloud server | 8 hours
1cMEMr | Mean time to detect and perform a maintenance of a failed memory bank due to uncertain causes in a cloud server | 16 hours
Network interface controllers (cNET)
ncNET | Number of NICs in a cloud server | 4
mcNET | Minimum number of NICs in normal state for a cloud server to operate normally | 2
1cNET | Mean time to a failure of a NIC in a cloud server | 3720 hours
1cNET | Mean time to detect a failure of a network interface controller | 3.5 hours
1cNET | Mean time to recover operations of all failed NICs in a cloud server | 12 hours
1cNETd | Mean time to recover operations of a failed NIC after completing its failure detection | 3 hours
Power supply units (cPWR)
ncPWR | Total number of power supply units in a cloud server | 4
mcPWR | Minimum number of normally running power supply units for a cloud server to operate normally | 2
1cPWR | Mean time to a failure of a power supply unit in a cloud server | 2040 hours
1cPWRd | Mean time to detect a failure of a failed power supply unit in a cloud server | 2.5 hours
1cPWR | Mean time to recover all failed power supply units at once | 15.6 hours
1cPWRd | Mean time to recover a failed power supply unit after completing its failure detection | 3.6 hours
Cooling units (cCOO)
ncCOO | Total number of cooling units in a cloud server | 4
mcCOO | Minimum number of cooling units in normal operation for a cloud server to operate normally | 2
1cCOO | Mean time to a failure of a cooling unit in a cloud server | 1560 hours
1cCOOd | Mean time to detect a failure of a failed cooling unit in a cloud server | 1.5 hours
1cCOO | Mean time to recover all failed cooling units in a cloud server | 10.8 hours
1cCOOd | Mean time to recover a failed cooling unit in a cloud server after completing its failure detection | 2.4 hours
Virtual machine monitor (cVMM)
1cVMMf | Mean time to a failure of a VMM in a cloud server due to a certain cause | 2760 hours
1cVMMd | Mean time to performance degradation of a VMM in a cloud server | 2040 hours
1cVMMa | Mean time to a cyber security attack on a VMM in a cloud server | 4320 hours
1cVMMarej | Mean time of a diagnosis process and failure detection for a rejuvenation process after a cyber security attack on a VMM | 2.5 hours
1cVMMarej | Mean time of a rejuvenation process after a cyber security attack on a VMM | 50 minutes
1cVMMdrej | Mean time of a rejuvenation process after a performance degradation due to a certain cause | 35 minutes
1cVMMad | Mean time of an adaptation process after a cyber security attack on a VMM | 1.5 hours
1cVMMdr | Mean time to a reconfiguration after an adaptation process due to a cyber security attack on a VMM | 25 minutes
cVMMr | Mean time to recover a failed VMM in a cloud server | 55 minutes
ccVMMarej | Coverage factor of a rejuvenation process on a VMM due to a cyber security attack | 0.93
ccVMMdrej | Coverage factor of a rejuvenation process on a VMM due to a performance degradation | 0.95
ccVMMad | Coverage factor of an adaptation process on a VMM after a cyber security attack | 0.90
Virtual machine (cVM)
ncVM | Number of VMs hosted on a VMM in a cloud server | 4
1cVMf | Mean time to a failure of a VM in a cloud server due to a certain cause | 2280 hours
1cVMd | Mean time to performance degradation of a VM in a cloud server | 1800 hours
1cVMa | Mean time to a cyber security attack on a VM in a cloud server | 4320 hours
1cVMarej | Mean time of a diagnosis process and failure detection for a rejuvenation process after a cyber security attack on a VM | 1.5 hours
1cVMarej | Mean time of a rejuvenation process after a cyber security attack on a VM | 35 minutes
1cVMdrej | Mean time of a rejuvenation process after a performance degradation due to a certain cause | 30 minutes
1cVMad | Mean time of an adaptation process after a cyber security attack on a VM | 1.5 hours
1cVMdr | Mean time to a reconfiguration after an adaptation process due to a cyber security attack on a VM | 25 minutes
1cVMr | Mean time to recover all failed VMs at once in a cloud server | 30 minutes
ccVMarej | Coverage factor of a rejuvenation process on a VM due to a cyber security attack | 0.95
ccVMdrej | Coverage factor of a rejuvenation process on a VM due to a performance degradation | 0.98
ccVMad | Coverage factor of an adaptation process on a VM after a cyber security attack | 0.90
Medical application (cAPP)
ncAPP | Total number of apps running on a cloud server | 8
mcAPP | Minimum number of apps running at once for the whole apps unit to be considered in operative state | 4
λcAPPf | Mean time to a failure of medical apps in a cloud server | 72 hours
δcAPPu | Mean time to a periodical upgrade event of cloud apps | 168 hours
λcAPPu | Mean time of a preparation process for an upgrade event of cloud apps | 15 minutes
µcAPPr | Mean time to restart/repair a failed application (APP) on a cloud server | 10 minutes
µcAPPu | Mean time of an upgrade process of cloud apps | 35 minutes
TABLE XVI: DEFAULT INPUT PARAMETERS FOR MODELS OF FOG SERVER (fServer)
Name | Description | Values
nfCPU | Total number of CPUs in a fog server | 8
mfCPU | Minimum number of CPUs in operation for a fog server to operate normally | 4
1fCPU | Mean time to a failure of a CPU in a fog server | 6 years
1fCPU | Mean time to recover a failed slot of a CPU in a fog server | 12 hours
1fCPUm | Mean time to recover a number of failed CPUs in a fog server | 6 hours
Memory banks (fMEM)
nfMEM | Total number of memory banks in a fog server | 8
mfMEM | Minimum number of memory banks in normal state for a fog server to operate normally | 4
1fMEM | Mean time to a failure of a memory bank in a fog server | 2280 hours
1fMEMd | Mean time to performance degradation of a memory bank in a fog server | 1320 hours
1fMEMdf | Mean time to a failure after performance degradation of a memory bank in a fog server | 180 hours
1fMEMdr | Mean time to detect and perform a maintenance of a performance-degraded memory bank in a fog server | 6 hours
1fMEMr | Mean time to detect and perform a maintenance of a failed memory bank due to uncertain causes in a fog server | 12 hours
Network interface controllers (fNET)
nfNET | Number of NICs in a fog server | 4
mfNET | Minimum number of NICs in normal state for a fog server to operate normally | 2
1fNET | Mean time to a failure of a NIC in a fog server | 2280 hours
1fNET | Mean time to detect a failure of a network interface controller | 2.5 hours
1fNET | Mean time to recover operations of all failed NICs in a fog server | 8 hours
1fNETd | Mean time to recover operations of a failed NIC after completing its failure detection | 2 hours
Power supply units (fPWR)
nfPWR | Total number of power supply units in a fog server | 4
mfPWR | Minimum number of normally running power supply units for a fog server to operate normally | 2
1fPWR | Mean time to a failure of a power supply unit in a fog server | 1320 hours
1fPWRd | Mean time to detect a failure of a failed power supply unit in a fog server | 1.5 hours
1fPWR | Mean time to recover all failed power supply units at once | 12 hours
1fPWRd | Mean time to recover a failed power supply unit after completing its failure detection | 2.4 hours
Cooling units (fCOO)
nfCOO | Total number of cooling units in a fog server | 4
mfCOO | Minimum number of cooling units in normal operation for a fog server to operate normally | 2
1fCOO | Mean time to a failure of a cooling unit in a fog server | 1320 hours
1fCOOd | Mean time to detect a failure of a failed cooling unit in a fog server | 1 hour
1fCOO | Mean time to recover all failed cooling units in a fog server | 12 hours
1fCOOd | Mean time to recover a failed cooling unit in a fog server after completing its failure detection | 2 hours
Operating system (fOS)
1fOSf | Mean time to a failure of an OS in a fog server due to a certain cause | 2292 hours
1fOSd | Mean time to performance degradation of an OS in a fog server | 1800 hours
1fOSa | Mean time to a cyber security attack on an OS in a fog server | 4320 hours
1fOSarej | Mean time of a diagnosis process and failure detection for a rejuvenation process after a cyber security attack on an OS | 1.5 hours
1fOSarej | Mean time of a rejuvenation process after a cyber security attack on an OS | 40 minutes
1fOSdrej | Mean time of a rejuvenation process after a performance degradation due to a certain cause | 25 minutes
1fOSad | Mean time of an adaptation process after a cyber security attack on an OS | 30 minutes
1fOSdr | Mean time to a reconfiguration after an adaptation process due to a cyber security attack on an OS | 10 minutes
fOSr | Mean time to recover a failed OS in a fog server | 35 minutes
cfOSarej | Coverage factor of a rejuvenation process on an OS due to a cyber security attack | 0.90
cfOSdrej | Coverage factor of a rejuvenation process on an OS due to a performance degradation | 0.90
cfOSad | Coverage factor of an adaptation process on an OS after a cyber security attack | 0.85
Medical application (cAPP)
ncAPP | Total number of apps running on a fog server | 8
mcAPP | Minimum number of apps running at once for the whole apps unit to be considered in operative state | 4
λcAPPf | Mean time to a failure of medical apps in a fog server | 48 hours
δcAPPu | Mean time to a periodical upgrade event of fog apps | 120 hours
λcAPPu | Mean time of a preparation process for an upgrade event of fog apps | 10 minutes
µcAPPr | Mean time to restart/repair a failed APP on a fog server | 45 minutes
µcAPPu | Mean time of an upgrade process of fog apps | 30 minutes
TABLE XVII: DEFAULT INPUT PARAMETERS FOR MODELS OF IOMT SENSORS (ihSensor/iaSensor)
ihSensor Notation | Value | iaSensor Notation | Value | Description
Batteries (ihBAT/iaBAT)
1ihBdr | 720 hours | 1iaBdr | 1440 hours | Mean time of a draining duration of IoMT batteries
1ihBsl | 24 hours | 1iaBsl | 48 hours | Mean time to a sleep event of medical IoMT sensors at a certain level of battery power
1ihBw | 5 minutes | 1iaBw | 10 minutes | Mean time for an IoMT sensor to wake up at a certain level of battery power
1ihBf | 1080 hours | 1iaBf | 1560 hours | Mean time to a failure of an IoMT battery at a certain level of power due to a certain cause
1ihBr | 10 minutes | 1iaBr | 5 minutes | Mean time to recover a failed battery back to the 100% level of power
Sensors (ihSEN/iaSEN)
λihSEN | 10 years | λiaSEN | 10 years | Mean time to failure of a sensor hardware component
µihSEN | 10 minutes | µiaSEN | 5 minutes | Mean time to recover a failed sensor part
Analog to Digital Converters (ihADC/iaADC)
1ihADC | 5 years | 1iaADC | 5 years | Mean time to failure of an analog-to-digital converter in a sensor device
1ihADC | 15 minutes | 1iaADC | 10 minutes | Mean time to recover a failed analog-to-digital converter in a sensor device
Micro-Controller Units (ihMCU/iaMCU)
1ihMCU | 10 years | 1iaMCU | 10 years | Mean time to failure of a micro-controller unit in a sensor device
1ihMCU | 10 minutes | 1iaMCU | 5 minutes | Mean time to recover a failed micro-controller unit in a sensor device
Memories (ihMEM/iaMEM)
1ihMEMf | 10 years | 1iaMEMf | 10 years | Mean time to failure of a memory in a sensor device due to a certain cause
1ihMEMdf | 720 hours | 1iaMEMdf | 1080 hours | Mean time to a performance-degraded failure of a memory in a sensor device
1ihMEMd | 4320 hours | 1iaMEMd | 5400 hours | Mean time to performance degradation of a memory in a sensor device
1ihMEMr | 15 minutes | 1iaMEMr | 10 minutes | Mean time to recover a failed memory in a sensor device
1ihMEM | 35 minutes | 1iaMEM | 25 minutes | Mean time to recover a performance-degraded memory in a sensor device
Transceivers (ihTRx/iaTRx)
1ihTxf | 7200 hours | 1iaTxf | 8640 hours | Mean time to failure of the transmitter part of a transceiver in an IoMT sensor
1ihRxf | 8640 hours | 1iaRxf | 10800 hours | Mean time to failure of the receiver part of a transceiver in an IoMT sensor
µihTRxr | 8 hours | µiaTRxr | 4 hours | Mean time to recover a failed transceiver in an IoMT sensor
IoMT Operating System and Embedded Apps (ihOS&ihApp/iaOS&iaApp)
1ihA | 360 hours | 1iaA | 600 hours | Mean time to failure of an embedded sensing app running on an IoMT sensor
1ihOS | 2040 hours | 1iaOS | 2760 hours | Mean time to failure of a tiny operating system running on an IoMT sensor
1ihOSa | 1440 hours | 1iaOSa | 1080 hours | Mean time to sleep of both the tiny operating system and the integrated sensing apps
1ihOSa | 5 minutes | 1iaOSa | 5 minutes | Mean time to wake up of both the tiny operating system and the integrated sensing apps
1ihA | 10 minutes | 1iaA | 5 minutes | Mean time to recover a failed sensing app on an IoMT sensor
1ihOS | 25 minutes | 1iaOS | 15 minutes | Mean time to recover a failed tiny operating system on an IoMT sensor
1ihOSa | 35 minutes | 1iaOSa | 20 minutes | Mean time to recover a failed tiny operating system integrated with a failed sensing app on an IoMT sensor
1/cihA | 0.95 | 1/ciaA | 0.90 | Coverage factor of a recovery process for a failed sensing app on an IoMT sensor
1/cihOS | 0.95 | 1/ciaOS | 0.90 | Coverage factor of a recovery process for a failed tiny operating system on an IoMT sensor
1/cihOSa | 0.98 | 1/ciaOSa | 0.95 | Coverage factor of waking up a tiny operating system integrated with a sensing app on an IoMT sensor
TUAN ANH NGUYEN (Ph.D.) is currently a (Re-
search) Assistant Professor with the Department of
Computer Science and Engineering, College of Infor-
mation and Telecommunication, Konkuk University
(KU), Seoul, South Korea. He received his Ph.D.
degree in computer science and system engineering
at the Department of Computer Engineering, Korea
Aerospace University (KAU), Seoul, Korea, in 2015.
His M.Sc and B.Eng were in Mechatronics with
Hanoi University of Science and Technology (HUST),
Hanoi, Vietnam in 2010 and 2008, respectively. His
current research interests include (i) dependability and security of systems
and networks, (ii) fault tolerance of embedded systems in aerospace and
mechatronics, (iii) disaster tolerance and recovery of computing systems, (iv)
integration of cloud/fog/edge computing paradigms, (v) dependability and
security analytical quantification for Internet of things, cloud data centers,
unmanned vehicles, mechatronic production chains, e-logistics, (vi) uncertainty
propagation in stochastic processes and imprecise probability in analytical
models of computer systems, and unmanned vehicles.
DUGKI MIN (Ph.D.) received the B.S. degree in
industrial engineering from Korea University, in 1986,
and the M.S. and Ph.D. degrees in computer science
from Michigan State University, in 1995. He is cur-
rently a Professor with the Department of Computer
Science and Engineering, Konkuk University. His re-
search interests include cloud computing, distributed
and parallel processing, big data processing, intelli-
gent processing, software architecture, and modeling
and simulation.
EUNMI CHOI (Ph.D.) received the B.S. degree
(Hons.) in computer science from Korea University,
in 1988, and the M.S. and Ph.D. degrees in computer
science from Michigan State University, USA, in 1991
and 1997, respectively. She is currently a Professor
with Kookmin University, South Korea. Since 1998,
she has been an Assistant Professor with Handong
University, South Korea, before joining Kookmin
University, in 2004. She is also the Head of the
Distributed Information System and Cloud Computing
Laboratory, Kookmin University. Her current research
interests include big data infra system and analysis, cloud computing, intelli-
gent systems, information security, parallel and distributed systems, and SW
architecture and modeling.
... In literature, there are several papers [10,11,12,13,14] that introduce models that jointly model security and dependability of different information systems, but none of these works address a whole 5G-MEC system. Sallhammer et al. [10,11,12] focus on the interplay between security and dependability in information systems. ...
... Although this work models devices in a network, it does not include anything regarding 5G or MEC. Nguyen et al. [14] model the security and dependability of Internet-of-Things (IoT) medical monitoring systems, broadly discussing the edge computing platform. They assess the security and dependability of medical IoT devices, communication channels, and data storage mechanisms, emphasizing the importance of secure and reliable IoT systems in healthcare. ...
... A joint model must accommodate both needs, which is demanding. Our work follows the line of previous papers [8,14,15,13] and considers a two-layer approach that makes use of commonly used techniques for the quantitative evaluation of the availability of a 5G-MEC system. More precisely, we construct our models starting from [8], which makes use of a Fault Tree (FT) and Stochastic Activity Networks (SANs). ...
Preprint
Full-text available
Multi-access Edge Computing (MEC) is an essential technology for the fifth generation (5G) of mobile networks. MEC enables low-latency services by bringing computing resources close to the end-users. The integration of 5G and MEC technologies provides a favorable platform for a wide range of applications, including various mission-critical applications, such as smart grids, industrial internet, and telemedicine, which require high dependability and security. Ensuring both security and dependability is a complex and critical task, and not achieving the necessary goals can lead to severe consequences. Joint modeling can help to assess and achieve the necessary requirements. Under these motivations, we propose an extension of a two-level availability model for a 5G-MEC system. In comparison to the existing work, our extended model (i) includes the failure of the connectivity between the 5G-MEC elements and (ii) considers attacks against the 5G-MEC elements or their interconnection. We implement and run the model in M\"{o}bius. The results show that a three-element redundancy, especially of the management and core elements, is needed and still enough to reach around 4-nines availability even when connectivity and security are considered. Moreover, the evaluation shows that slow detection of attacks, slow recovery from attacks, and bad connectivity are the most significant factors that influence the overall system availability.
... This model aims to help ensure the availability and resilience of these infrastructures in adverse situations, providing a systematic approach to their analysis. Nguyen et al. [21] proposes a methodology to quantify reliability and security in an Internet of Medical Things (IoMT) infrastructure with cloud/fog/edge (CFE) computing. It uses hierarchical models and considers failures, including cyber-attacks. ...
Article
Full-text available
There is a growing importance of the Internet of Medical Things (IoMT), an emerging aspect of the Internet of Things (IoT), in smart healthcare. With the emergence of the Coronavirus (COVID-19) pandemic, healthcare systems faced extreme pressure, leading to the need for advancements and research focused on IoMT. Smart hospital infrastructures face challenges regarding availability and reliability measures, especially in the event of local server failures or disasters. Unpredictable malfunctions in any aspect of medical computing infrastructure, from the power system in a remote area to local computing systems in a smart hospital, can result in critical failures in medical monitoring services. These failures can have serious consequences, including potentially fatal loss of life in the most serious cases. Therefore, we propose a disaster analysis and recovery measures using Stochastic Petri Nets (SPN) to resolve these critical issues. The proposed model aims to identify the system’s most critical components, develop strategies to mitigate failures and ensure system resilience. Our results show that the disaster recovery system demonstrated availability and reliability. The sensitivity analysis indicated the components that had the greatest impact on availability-for example, the failure time of the Standby Edge Server proved to be a very relevant component in the proposed architecture. The present work can help system architects develop distributed architectures considering points of failure and recovery measures.
... The Healthcare system is a critical system in which simple mistakes may affect the patient's life [4]. Healthcare systems are transitioning from architectural practice centered around hospitals to programs based on patientcentered electronic medical records [5]. Nowadays, there is a strong demand for a modern quality healthcare system. ...
Article
Full-text available
Software development methodology for healthcare projects may be a major cause of late schedules, over-budgeting, dissatisfaction with requirements, and poor quality. Therefore, there should be a methodology for managing and improving software development processes as a major solution to these problems. Managing software development process requires a guideline for developing software products on schedule within budget, avoiding risks, and meeting requirements. In recent years, software process improvement has been extensively studied in traditional or rapid software development, and its strengths and weaknesses have been recognized. Rapidly methodologies have challenged the traditional ways of developing a critical system such as healthcare. The main objective of this research is to improve the productivity of healthcare systems and avoid risks by improving development processes. The research proposed a development methodology called “Rapid Healthcare System Development )RHSD(”. The proposed methodology acts fast and takes care of risks during development processes by intensifying the testing and validation mechanism. RHSD also facilitates the development of health care systems within the required requirements and achieves the highest levels of quality.
... The work of Nguyen et al. [27] proposes quantifying the security measures of an IoMT infrastructure based on an integrated physical architecture of cloud/fog/edge (CFE) computing paradigms. The work of Valentim et al. [28] presents a modeling approach based on Generalized Stochastic Petri Nets (GSPN) to evaluate the availability of an IoMT architecture based on a private cloud. ...
Preprint
Full-text available
The variety of sensors for health monitoring helps professionals in the field make decisions. For certain clinical conditions, it is interesting to monitor the patient's health after discharge. Wearable health monitoring devices (smartwatches) can send data to the hospital. However, to monitor many patients (external and internal), a resilient and high-performance computing infrastructure is required. Such characteristics require high monetary cost equipment. This paper presents an SPN (Stochastic Petri Net) model for evaluating the performance and performability of a multi-tier hospital system architecture (edge-fog-cloud). The model allows you to evaluate the Mean Response Time (MRT), resource utilization level (U), and probability of data loss (DP). A very specific characteristic is to consider two data sources (internal and external). Performance metrics are evaluated in two scenarios based on varying the number of containers for simultaneous processing in the cloud and fog. The model evaluates performability of the MRT metric over the variation of parameters that have the greatest influence on availability (container MTTF) and performance (cloud processing capacity).
... Nguyen et al. [12] conducted a study involving reliability and availability analysis for the infrastructure of the Internet of Medical Things (IoMT) in a healthcare system, utilizing hierarchical models such as Fault Tree (FT) and Continuous Time Markov Chain (CTMC). The study incorporated failure modes for systems, including cybersecurity attacks on software subsystems. ...
Article
Full-text available
This paper evaluates the impact of battery charging and discharging times on the availability of mechanical respirators in the Intensive Care Unit (ICU). The availability of these life-saving devices is crucial for ensuring optimal patient care in critical situations. This study aims to assess how the duration of battery charging and discharging cycles affects the availability of mechanical respirators and explore potential strategies to optimize their maintainability. We analyze the system’s behavior in eight scenarios that consider changes to optimize repair times, battery charge and discharge times, and power system redundancy. The results showed 98% improvements in availability and reduced system downtime. The outcomes of this research contribute to understanding the critical factors impacting the availability of mechanical respirators in the ICU. By addressing the issues related to battery charging and discharging times and maintaining these devices, healthcare facilities can enhance the availability and reliability of respiratory support systems. Ultimately, this study aims to improve patient outcomes and promote efficient resource utilization in the ICU setting.
... Nguyen et al. [Nguyen et al. 2021b] proposed a hierarchical modeling approach for assessing IoMT infrastructures. The approach adopts fault trees and Markov chains, focusing on availability and security issues. ...
Conference Paper
Investments in smart health applications are expected to rise to US$ 960 billion by 2030, and Internet of Things (IoT) have a prominent role in implementing such applications. For instance, hospitals have adopted IoT to collect and transmit patient data to health professionals, as critical patients must be monitored uninterruptedly. Therefore, health systems commonly require high availability, but availability assessment of health systems’ architecture is not a common approach. This paper presents a modeling approach based on generalized stochastic Petri nets (GSPN) to evaluate the availability of Internet of Medical Things (IoMT) architecture based on a private cloud. A case study is adopted to demonstrate the feasibility of the proposed approach.
Article
Full-text available
System dependability is pivotal for the reliable execution of designated computing functions. With the emergence of cloud-fog computing and microservices architectures, new challenges and opportunities arise in evaluating system dependability. Enhancing dependability in microservices often involves component replication, potentially increasing energy costs. Thus, discerning optimal redundancy strategies and understanding their energy implications is crucial for both cost efficiency and ecological sustainability. This paper presents a model-driven approach to evaluate the dependability and energy consumption of cloud-fog systems, utilizing Kubernetes, a container application orchestration platform. The developed model considers various determinants affecting system dependability, including hardware and software reliability, resource accessibility, and support personnel availability. Empirical studies validate the model’s effectiveness, demonstrating a 22.33% increase in system availability with only a 1.33% rise in energy consumption. Moreover, this methodology provides a structured framework for understanding cloud-fog system dependability, serves as a reference for comparing dependability across different systems, and aids in resource allocation optimization. This research significantly contributes to the efforts to enhance cloud-fog system dependability.
Article
Full-text available
In recent years, the healthcare-IT systems have undergone numerous technological advancements. With the advent of implanted medical devices, the ubiquitous health is possible and quite simple. As a result, locating, monitoring, and treating patients, no matter where they are, have become an easy task. Additionally, the electronic health record system digitalized medical information and provides collaboration, real-time decision support, and permanent patient health records. Despite the fact that these capabilities significantly improve the quality of healthcare, their vulnerability to viruses/malicious attacks has become a major challenge and one of the serious concerns. Literature review shows that number of such attacks is increasing rapidly in healthcare organizations. Therefore, an immediate attention is required as personal health information may be exposed, making healthcare infrastructures less dependable or even cease to function. TMA is a technique/method that thoroughly examines the system, its corresponding flaws, and possible potential attackers. It may be instrumental for making well-informed decisions for security measure. The paper presents a critical review and systematic evaluation of pertinent TMA methods. Each method has been evaluated on the key features of modeling and assessment. Additionally, it includes the relevance and applicability of each method to the healthcare domain based on key factors. This work may be a useful guide for researchers and practitioners working in this area. It may significantly facilitate them for addressing security-related issues and concerns in healthcare domain.
Article
Full-text available
Introduction: Internet of Things (IoT), which provides smart services and remote monitoring across healthcare systems according to a set of interconnected networks and devices, is a revolutionary technology in this domain. Due to its nature to sensitive and confidential information of patients, ensuring security is a critical issue in the development of IoT-based healthcare system. Aim: Our purpose was to identify the features and concepts associated with security requirements of IoT in healthcare system. Methods: A survey study on security requirements of IoT in healthcare system was conducted. Four digital databases (Web of Science, Scopus, PubMed and IEEE) were searched from 2005 to September 2019. Moreover, we followed international standards and accredited guidelines containing security requirements in cyber space. Results: We identified two main groups of security requirements including cyber security and cyber resiliency. Cyber security requirements are divided into two parts: CIA Triad (three features) and non-CIA (seven features). Six major features for cyber resiliency requirements including reliability, safety, maintainability, survivability, performability and information security (cover CIA triad such as availability, confidentiality and integrity) were identified. Conclusion: Both conventional (cyber security) and novel (cyber resiliency) requirements should be taken into consideration in order to achieve the trustworthiness level in IoT-based healthcare system.
Article
Full-text available
Objectives: This study aimed to investigate electronic medical record (EMR) implementation in a busy urban academic emergency department (ED) and to determine the frequency, duration, and predictors of EMR downtime episodes. Materials and methods: This study retrospectively analyzed data collected real time by the EMR and by the operations group at the study ED from May 2016 to December 2017. The study center has used the First Net Millennium EMR (Cerner Corporation, Kansas City, Missouri, USA). The ED operations data have been downloaded weekly from the EMR and transferred to the analytics software Stata (version 15MP, StataCorp, College Station, Texas, USA). Results: During the study period, 12 episodes of EMRD occurred, with a total of 58 hours and a mean of 4.8 ± 2.7 hours. The occurrence of EMRD event has not been associated with on-duty physician coverage levels (p = 0.831), month (p = 0.850), or clinical shift (morning, evening, or night shift) (p = 0.423). However, EMRD occurrence has been statistically significantly associated with weekdays (p = 0.020). Discussion: In a real-world implementation of EMR in a busy ED, EMRD episodes averaging approximately 5 hours occurred at unpredictable intervals, with a frequency that remained unchanged over the first 20 months of the EMR deployment. Conclusion: The study could define downtime characteristics at the study center. The EMRD episodes have been associated with inaccuracies in hourly census reporting, with a rebound phenomenon of over-reporting in the first hour or two after restoration of EMR operations.
Article
Full-text available
E-health systems can be used to monitor people in real-time, offering a range of multimedia-based health services, at the same time reducing the cost since cheaper devices can be used to compose it. However, any downtime, mainly in the case of critical health services, can result in patient health problems and in the worst case, loss of life. In this paper, we use an interdisciplinary approach combining stochastic models with optimisation algorithms to analyse how failures impact e-health monitoring system availability. We propose surrogate models to estimate the availability of e-health monitoring systems that rely on edge, fog, and cloud infrastructures. Then, we apply a multi-objective optimisation algorithm, NSGA-II, to improve system availability considering component costs as constraint. Results suggest that replacing components with more reliable ones is more effective in improving the availability of an e-health monitoring system than adding more redundant components.
Article
Full-text available
Internet of Things (IoT) forms the foundation of next generation infrastructures, enabling development of future cities that are inherently sustainable. Intrusion detection for such paradigms is a non-trivial challenge which has attracted further significance due to extraordinary growth in the volume and variety of security threats for such systems. However, due to unique characteristics of such systems i.e., battery power, bandwidth and processor overheads and network dynamics, intrusion detection for IoT is a challenge, which requires taking into account the trade-off between detection accuracy and performance overheads. In this context, we are focused at highlighting this trade-off and its significance to achieve effective intrusion detection for IoT. Specifically, this paper presents a comprehensive study of existing intrusion detection systems for IoT systems in three aspects: computational overhead, energy consumption and privacy implications. Through extensive study of existing intrusion detection approaches, we have identified open challenges to achieve effective intrusion detection for IoT infrastructures. These include resource constraints, attack complexity, experimentation rigor and unavailability of relevant security data. Further, this paper is envisaged to highlight contributions and limitations of the state-of-the-art within intrusion detection for IoT, and aid the research community to advance it by identifying significant research directions.
Article
Full-text available
The impact of the Internet of Things (IoT) on the advancement of the healthcare industry is immense. The ushering of the Medicine 4.0 has resulted in an increased effort to develop platforms, both at the hardware level as well as the underlying software level. This vision has led to the development of Healthcare IoT (H-IoT) systems. The basic enabling technologies include the communication systems between the sensing nodes and the processors; and the processing algorithms for generating an output from the data collected by the sensors. However, at present, these enabling technologies are also supported by several new technologies. The use of Artificial Intelligence (AI) has transformed the H-IoT systems at almost every level. The fog/edge paradigm is bringing the computing power close to the deployed network and hence mitigating many challenges in the process. While the big data allows handling an enormous amount of data. Additionally, the Software Defined Networks (SDNs) bring flexibility to the system while the blockchains are finding the most novel use cases in H-IoT systems. The Internet of Nano Things (IoNT) and Tactile Internet (TI) are driving the innovation in the H-IoT applications. This paper delves into the ways these technologies are transforming the H-IoT systems and also identifies the future course for improving the Quality of Service (QoS) using these new technologies.
Article
Full-text available
Modeling a complete Internet of Things (IoT) infrastructure is crucial to assess its availabilityand security characteristics. However, modern IoT infrastructures often consist of a complex andheterogeneous architecture and thus taking into account both architecture and operative details ofthe IoT infrastructure in a monolithic model is a challenge for system practitioners and developers.In that regard, we propose a hierarchical modeling framework for the availability and securityquantification of IoT infrastructures in this paper. The modeling methodology is based on ahierarchical model of three levels including (i) reliability block diagram (RBD) at the top levelto capture the overall architecture of the IoT infrastructure, (ii) fault tree (FT) at the middle level toelaborate system architectures of the member systems in the IoT infrastructure, and (iii) continuoustime Markov chain (CTMC) at the bottom level to capture detailed operative states and transitionsof the bottom subsystems in the IoT infrastructure. We consider a specific case-study of IoT smartfactory infrastructure to demonstrate the feasibility of the modeling framework. The IoT smartfactory infrastructure is composed of integrated cloud, fog, and edge computing paradigms. Acomplete hierarchical model of RBD, FT, and CTMC is developed. A variety of availability andsecurity measures are computed and analyzed. The investigation of the case-study’s analysis resultsshows that more frequent failures in cloud cause more severe decreases of overall availability, whilefaster recovery of edge enhances the availability of the IoT smart factory infrastructure. On theother hand, the analysis results of the case-study also reveal that cloud servers’ virtual machinemonitor (VMM) and virtual machine (VM), and fog server’s operating system (OS) are the mostvulnerable components to cyber-security attack intensity. The proposed modeling and analysisframework coupled with further investigation on the analysis results in this study help develop andoperate the IoT infrastructure in order to gain the highest values of availability and security measuresand to provide development guidelines in decision-making processes in practice.
Article
Exploring Internet of Things (IoT) data streams generated by smart cities means not only transforming data into better business decisions in a timely way but also generating long-term location intelligence for developing new forms of urban governance and organization policies. This paper proposes a new architecture based on the edge-fog-cloud continuum to analyze IoT data streams for delivering data-driven insights in a smart parking scenario.
Article
There are millions of base stations distributed across China, each containing many support devices and monitoring sensors. Conventional base station management systems tend to be hosted in the cloud, but cloud-based systems are difficult to reprogram, and performing tasks in real time is sometimes problematic, for example, sounding a combination of alarms or executing linked tasks. To overcome these drawbacks, we propose a hybrid edge-cloud IoT base station system, called BSIS. This paper includes a theoretical mathematical model that demonstrates the dynamic characteristics of BSIS, along with a formulation for implementing BSIS in practice. Embedded programmable logic controllers serve as the edge nodes; a dynamic programming method creates a seamless integration between the edge nodes and the cloud. The paper concludes with a series of comprehensive analyses on scalability, responsiveness, and reliability. These analyses indicate a possible 60% reduction in the number of alarms, an edge response time of less than 0.1 s, and an average downtime ratio of 0.66%.
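To put the reported figures in perspective, the short calculation below converts the 0.66% average downtime ratio quoted above into an equivalent steady-state availability and an expected annual downtime; the 8,760-hour year is the only assumption added.

```python
# Convert the reported average downtime ratio into availability and annual downtime.
downtime_ratio = 0.0066                  # 0.66% as reported for BSIS
hours_per_year = 24 * 365                # 8,760 h (non-leap year assumed)

availability = 1.0 - downtime_ratio
annual_downtime_hours = downtime_ratio * hours_per_year

print(f"Equivalent availability: {availability:.4f}")              # ~0.9934
print(f"Expected annual downtime: {annual_downtime_hours:.1f} h")  # ~57.8 h
```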
Chapter
As a solution for protecting and defending a system against inside attacks, many intrusion detection systems (IDSs) have been developed to identify and react to such attacks. However, an IDS is reactive by nature: it detects intrusions that are already in the system. Hence, reactive mechanisms lag behind and are not effective against the actions taken by agile and smart attackers. Due to this inherent limitation of an IDS, intrusion prevention systems (IPSs) have been developed to thwart potential attackers and/or mitigate the impact of intrusions before they penetrate the system. In this chapter, we introduce an integrated defense mechanism to achieve intrusion prevention in a software-defined Internet of Things (IoT) network by leveraging the technologies of cyber deception (i.e., a decoy system) and moving target defense (MTD), namely network topology shuffling. In addition, we validate their effectiveness and efficiency using a devised graphical security model (GSM)-based evaluation framework. To develop an adaptive, proactive intrusion prevention mechanism, we employ fitness functions based on a genetic algorithm (GA) to identify an optimal network topology, where the topology can be shuffled based on the detected level of system vulnerability. Our simulation results show that GA-based shuffling schemes outperform random shuffling schemes in terms of the number of attack paths toward decoy targets. In addition, we observe that there exists a trade-off between the system lifetime (i.e., mean time to security failure, MTTSF) and the defense cost introduced by the proposed MTD technique for fixed and adaptive shuffling schemes. That is, fixed GA-based shuffling achieves a higher MTTSF at a higher cost, while adaptive GA-based shuffling obtains a lower MTTSF at a lower cost.
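The GA-driven shuffling described above can be pictured with a toy fitness function such as the one below. The adjacency encoding, the node roles (entry point, decoy, real asset), and the rewiring penalty are illustrative assumptions rather than the chapter's actual implementation: a candidate topology scores higher when more of the enumerated attack paths end at decoys, fewer reach real assets, and fewer edges have to be rewired.

```python
# Toy fitness function for GA-based topology shuffling (illustrative only).
# A topology is an adjacency map; the score rewards attack paths that end at
# decoys and penalises paths to real assets and the number of rewired edges.

def attack_paths(adj, src, targets, path=None):
    """Enumerate simple paths from src to any node in `targets` (DFS)."""
    path = (path or []) + [src]
    if src in targets:
        return [path]
    found = []
    for nxt in adj.get(src, []):
        if nxt not in path:
            found += attack_paths(adj, nxt, targets, path)
    return found

def fitness(adj, baseline_adj, entry, decoys, real_assets, w_rewire=0.1):
    to_decoys = len(attack_paths(adj, entry, decoys))
    to_assets = len(attack_paths(adj, entry, real_assets))
    rewired = sum(len(set(adj[n]) ^ set(baseline_adj[n])) for n in adj)
    return to_decoys - to_assets - w_rewire * rewired

# Hypothetical 5-node topology: node 0 is the attacker's entry point,
# node 4 is a decoy, and node 3 is the real asset.
baseline = {0: [1, 2], 1: [3], 2: [3], 3: [], 4: []}
shuffled = {0: [1, 2], 1: [4], 2: [3], 3: [], 4: []}

print(fitness(baseline, baseline, entry=0, decoys={4}, real_assets={3}))  # -2.0
print(fitness(shuffled, baseline, entry=0, decoys={4}, real_assets={3}))  # -0.2
```

A GA would then evolve a population of such adjacency maps and select the highest-scoring topologies, which mirrors the trade-off between MTTSF and shuffling cost noted in the abstract.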
Book
Do you need to know what technique to use to evaluate the reliability of an engineered system? This self-contained guide provides comprehensive coverage of all the analytical and modeling techniques currently in use, from classical non-state and state space approaches, to newer and more advanced methods such as binary decision diagrams, dynamic fault trees, Bayesian belief networks, stochastic Petri nets, non-homogeneous Markov chains, semi-Markov processes, and phase type expansions. Readers will quickly understand the relative pros and cons of each technique, as well as how to combine different models together to address complex, real-world modeling scenarios. Numerous examples, case studies and problems provided throughout help readers put knowledge into practice, and a solutions manual and PowerPoint slides for instructors accompany the book online. This is the ideal self-study guide for students, researchers and practitioners in engineering and computer science.