Journal of Network and Systems Management (2021) 29:27
https://doi.org/10.1007/s10922-021-09592-x
Data Processing on Edge and Cloud: A Performability
Evaluation and Sensitivity Analysis

Lucas Santos1 · Benedito Cunha1 · Iure Fé2 · Marco Vieira3 ·
Francisco Airton Silva1
Received: 2 November 2020 / Revised: 6 January 2021 / Accepted: 10 February 2021
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC,
part of Springer Nature 2021
Abstract
Nowadays, the Internet of Things (IoT) allows monitoring and automation in diverse
contexts, such as hospitals, homes, or even smart cities, to name a few examples.
IoT data processing may occur at the edge of the network or in the cloud, but
frequently the processing must be divided between the two layers. To guarantee
that IoT systems work efficiently, it is essential to evaluate the system even in
initial design stages. However, evaluating hybrid systems composed of multiple
layers is not an easy task, as a myriad of parameters are involved in the process.
Thus, this paper presents two SPN models (one base and one extended) that can
represent an abstract distributed system composed of IoT, edge, and cloud layers.
The models are highly configurable and can be used in diverse simulation scenarios.
Besides, a sensitivity analysis evidenced the most impactful components in the
studied architecture and made it possible to optimize the base SPN model. Finally,
a case study explores multiple metrics of interest concurrently and works as a
guide for model utilization. Ultimately, the proposed approach can assist system
designers in avoiding unnecessary investment in original equipment.
Keywords Internet of Things (IoT) · Edge computing · Cloud computing ·
Stochastic Petri nets
* Francisco Airton Silva
faps@ufpi.edu.br
Extended author information available on the last page of the article
1 Introduction
The Internet of Things (IoT) is one of the great recent advances in technology. IoT
devices are intercommunicable and can extract information about different
environments. IoT devices are used, for example, to monitor patients in smart
hospitals, to monitor smart homes, or to collect data at power stations. As
predicted by Ericsson Inc. in 2010,1 more than 50 billion devices will connect to
the Internet by the year 2025. Most of these devices will be located at the edge of
the Internet and could provide new applications, changing many aspects of both
traditional industrial production and our everyday living.
Most IoT devices are resource constrained. Sometimes these devices need additional
processing capability that can be found in remote data centers. Edge computing and
cloud computing [1, 2] are examples of such additional capability. Edge computing
has emerged as a solution to bring data processing closer to users. Edge computing
can be seen as a cloud server running at the edge of a network and performing
specific tasks with even lower latency for end-users [3]. Edge computing also
raises challenges regarding high data throughput, energy efficiency, security
mechanisms, and so on; for example, many recent works focus on data throughput
issues [4, 5]. Edge computing moves computation, data, and applications from the
remote cloud to the network edge, thus enabling numerous real-time smart city
services, and has been used in the literature to bring the striking features of
cloud computing to the edge.
Cloud computing, in turn, delivers computing services, including databases,
networks, software, and data analysis, over the Internet, aiming at faster
responses [6]. Cloud computing can meet many IoT requirements, such as service
monitoring, sensor data flow processing, and visualization tasks [7]. The cloud
also has advantages in terms of processing scalability, being a burgeoning
computing scheme where applications can be offloaded to centralized cloud data
centers and the cloud manager provisions elastic and on-demand resources for
their execution [8].
The arrival of requests may overtake the edge/cloud capacity, making it essential
to evaluate the performance of the architecture and avoid bottlenecks. However,
evaluating hybrid systems with IoT, edge, and cloud can be costly, making it
necessary to use simulation or analytical modeling methods. Stochastic Petri nets
(SPNs) are a powerful analytical modeling approach. SPNs can represent complex
systems with diverse characteristics, including parallelism and concurrency
[9, 10]. In previous work, we evaluated hybrid fog and cloud systems [11];
however, we focused only on performance issues. We are now interested in
evaluating performability, which combines performance and availability in the
same model. Performability is the study of system performance when subjected to
the effect of failures in its subcomponents. The performance of a system is said
to be degradable if failure events may affect it negatively [12]. For instance,
a mesh network of routers can tolerate a certain number of failures, but the
overall performance will be affected as some routers may be subject to overheads.
In this paper we explore the effect of
1 http://www.ericsson.com/thecompany/press/releases/2010/04/1403231.
failures in the edge layer and the consequent impact on the cloud layer. All
simulations consider performance in terms of throughput.
Sensitivity analysis is often adopted by system designers to evaluate how
“sensitive” a metric is to changes in the model. If the evaluation modifies a
parameter value (e.g., a component failure rate) to check the impact on the
output, the sensitivity analysis is called parametric. On the other hand, if the
model structure (e.g., the number of system components) changes during system
evaluation, the sensitivity analysis is structural. Traditionally, parametric
sensitivity analysis is performed by a discrete variation of input parameters
over their value ranges, graphing the effects on output measures. Another
technique for performing parametric sensitivity analysis is differential
sensitivity analysis. This approach calculates the partial derivatives of the
measure of interest (e.g., system availability) with respect to each input
parameter. The main advantage of differential sensitivity analysis is the reduced
computation time when compared with other methods (e.g., the discrete method)
[13]. This paper presents a sensitivity analysis based on the DoE [14] method,
which revealed the components with the most significant impact in the proposed
architecture. Therefore, the contributions of this work include:
• Two SPN models (one base and one extended) that can represent an abstract
  distributed system composed of IoT, edge, and cloud layers. The models are
  highly configurable and can be used in diverse simulation scenarios. The base
  model allows calibrating six timed transitions and four capacity places. The
  extended model allows calibrating ten timed transitions and six capacity
  places. These parameters enable a system analyst to analyse metrics such as
  throughput, availability, drop rate, and mean response time, among others.
• A sensitivity analysis that evidenced the most impactful components in the
  studied architecture and made it possible to optimize the base SPN model. The
  sensitivity analysis employed a powerful method called Design of Experiments,
  which revealed the dependency between levels of distinct factors through
  interaction analysis.
• A case study that explores multiple metrics of interest concurrently and works
  as a guide for model utilization. The case study explored, for example, the
  relationship between the mean time to repair and the mean time to fail of the
  most impactful component of the system, the edge. In another analysis, the
  cross-correlation included the mean time to fail of the edge and the service
  time of the cloud, observing the throughput-dependent metric. Ultimately, the
  proposed approach can assist system designers in avoiding unnecessary
  investment in original equipment.
The remainder of the paper is organized as follows. Section 2 presents the
related work, highlighting the positive and negative points of each study.
Section 3 presents the proposed IoT architecture, which is described from
distinct perspectives. Section 4 discusses the SPN models, describing their
components and how the workflow operates. Section 5 conducts a sensitivity
analysis, which identified the most impactful components in the model. Section 6
presents a case study exploring the throughput and availability metrics. Section 7
Table 1 Related works

Work      | Type of model                                   | Context                                                                   | Metrics                                                               | Sensitivity analysis | Redundancy
[15]      | Petri Net (PN)                                  | Logistic framework to optimize the profit improvement in cloud computing  | Availability, repair time                                             | Yes                  | Yes
[21]      | Markov Chain                                    | Cloud gaming systems and optimization                                     | Performance, response time                                            | No                   | No
[16]      | Reliability Block Diagram (RBD), Petri Net (PN) | Availability in cloud data center infrastructure                          | Availability                                                          | Yes                  | Yes
[20]      | Priced Timed Petri Nets (PTPN)                  | Fog resource allocation strategy                                          | Service response rate, execution efficiency, reboot rate, reliability | No                   | No
[22]      | Resource Preservation Net (RPN)                 | Resource management in smart healthcare                                   | Arrival and service time, resource utilization rate                   | No                   | No
[23]      | Generalized Stochastic Petri Nets (GSPN)        | Performability analysis in cloud computing with NoSQL DBMS storage system | Availability, performance                                             | Yes                  | Yes
[24]      | Petri Net (PN), Reliability Block Diagram (RBD) | IoT in e-health architecture                                              | Throughput, service time, availability                                | Yes                  | No
[17]      | Petri Net (PN)                                  | Availability of cooling subsystem                                         | Availability                                                          | No                   | Yes
[18]      | Petri Net (PN), Reliability Block Diagram (RBD) | Modeling cloud infrastructure                                             | Dependability, cost model                                             | No                   | Yes
[19]      | Petri Net (PN), Reliability Block Diagram (RBD) | Availability on a private cloud computing platform                        | Availability                                                          | Yes                  | No
This work | Petri Net (PN)                                  | IoT architecture and data processing on edge and cloud                    | Availability, throughput                                              | Yes                  | Yes
explains some limitations of the work. Section 8 traces some conclusions and
future work.
2 Related Work
Table 1 summarizes a comparison of related works considering four aspects: the
type of model, the metrics used, whether sensitivity analysis was applied, and
whether redundancy was applied to the proposed model. To identify the most
closely related work, we used keywords in advanced search engines and read the
references of the main state-of-the-art papers. We searched for and collected
papers that used some type of analytical model to represent and evaluate a
system with characteristics similar to ours (e.g., IoT/cloud). We also observed
similarities regarding the adopted metrics, in which the majority used
availability, for example.
The first comparison criterion focuses on the type of model built. Related works
mostly used Petri nets, such as [15–19]. In some works, however, variations of
Petri nets were applied, in addition to other types of models. In Ni et al. [20],
Priced Timed Petri Nets (PTPN) were used to analyze the performance and cost of
the system. Yates et al. [21] developed a model based on a Markov chain to
represent the process of delivering video frames in an edge gaming system.
Oueida et al. [22] proposed a framework also derived from Petri nets, called
Resource Preservation Net (RPN), applied to a smart healthcare structure for
resource management. The work of Rodrigues et al. [23] adopted Generalized
Stochastic Petri Nets (GSPN) for a performance assessment of cloud computing
systems. Another type of modeling adopted by some of the related works was the
Reliability Block Diagram (RBD), used by [16, 18, 19, 24]. The level of
abstraction is one of the differences between the adopted models. The Continuous
Time Markov Chain (CTMC) has the most significant power of representation among
the adopted models. A CTMC describes systems as a stochastic process with
discrete states, characterizing them by their states and how they alternate.
The Stochastic Petri net (SPN) is a type of model equivalent to the CTMC.
However, an SPN uses fewer elements to describe the same system. We have adopted
SPNs because they can represent complex systems including aspects such as
concurrency and parallelism.
The most studied metric among the related works is availability [15–17, 19, 23,
24], which is also one of the metrics adopted in this work. Other metrics were
also studied in the related works, such as service time [22, 24], which refers
to the time for a given task to be performed within the system and is directly
linked to the Quality of Service provided to the user. Yates et al. [21], Ni
et al. [20], and Rodrigues et al. [23] focused on performance models aiming to
increase the Quality of Experience (QoE) metric. The work in [18] exceptionally
included metrics directed towards the cost of the cloud infrastructure proposed
in its model. Although most of the related work explored the availability
metric, only one investigated throughput. Besides, we analysed the relationship
between both metrics under the system behaviour.
Another aspect analyzed is whether a sensitivity analysis was performed on the
proposed model. Among the related works, a sensitivity analysis was performed
only by [15, 16, 19, 23, 24]. The work of [15] performed a sensitivity analysis
on the failure rate, showing its impact on the profit profile in a cloud
environment. The work of [16] also performed a sensitivity analysis on its
model, simulating and varying the values of the components to verify which
parameters most affect the general availability of a data center. In [23], the
analysis is performed for both the performance and availability of the system,
based on the design of experiments (DoE). Santos et al. [24] also carried out
sensitivity analyses to find out which components have the greatest impact on
the availability of an electronic health monitoring system. In [19], to perform
the sensitivity analysis and detect the most critical components of the model,
a percentage difference calculation was used to compute the sensitivity index.
Although most of the related work explored sensitivity analysis, only two
([23, 24]) used DoE, and these did not focus on edge computing.
The last aspect analyzed was the application of redundancy in the model. Among
the related works, redundancy was applied only by [15, 16, 18, 23]. In the work
by Jiang et al. [15], three levels of warm-standby configuration were used to
carry out the evaluation process of the proposed model. In the work of Santos
et al. [16], redundancy was also applied to the proposed architecture,
duplicating the hardware components of its cloud data center architecture. In
Rodrigues et al. [23], double redundancy was applied to a component of the
architecture. In Sousa et al. [18], the proposed modeling presented three types
of redundancy: hot standby, cold standby, and warm standby.
3 Architecture
This section presents an architecture for processing information from IoT sensors.
In our architecture, data processing occurs first on the edge and later on the cloud
[25]. Services that require limited resources are processed only at the edge,
while services with higher demand are processed in two stages: pre-processing at
the edge and a second stage in the cloud.

Fig. 1 Base architecture composed of three stages: admission, edge, and cloud

Figure 1 presents an overview of the architecture, which is composed of three
parts:
1. Admission, which deals with the generation of requests and forwarding to the
edge;
2. Edge, which performs primary data processing;
3. Cloud, responsible for the final processing of data and subsequent storage.
In a simplified view, it can be said that Admission is composed of IoT devices
and an Access Point. IoT sensors are used to capture information from the envi-
ronment in which they are located, such as a smart home [26], a smart hospital
[27], a water treatment plant [28], among other environments. In this work, we
implement the generation of requests based on general parameters of IoT sensors,
without a specific monitoring focus. Meanwhile, the Access Point is in charge of
transferring requests from IoT devices to the edge components.
Processing nodes on the edge can be servers, virtual machines, or server cores.
These are responsible for bringing the processing closer, thus mitigating problems
related to latency and limited network bandwidth. Edge processing brings advan-
tages over the cloud due to its location, closer to the data generation. Data loads
will be reduced before being sent to the cloud server through the gateway [29].
We address the requirements of latency-sensitive applications in three ways.
First, the edge component is positioned as the component closest to the admission
block, intending to provide a faster response time at this point of the system.
Second, the edge component has a respective transition that can be configured
with a very small processing time (0.01 s, for example, as adopted in the case
study—Table 10), depending on the real system. Third, we performed the
availability study focused on the edge component, since we consider it the most
critical one.
The last part of the architecture is the cloud. This part consists of two compo-
nents, a gateway and the cloud server. The first is intended to receive the work-
load from the edge and forward it to the next component. The gateway has a
queue capable of storing workloads when the arrival rate exceeds the throughput,
and the cloud server is responsible for the final processing of the generated data.
After processing the workload, the processor in the cloud permanently stores data
in a data center.
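The gateway's admission rule sketched above (queue the workload only while
capacity remains) can be illustrated as follows; the capacity value NG = 3 and
all names are hypothetical, not taken from the paper's case study:

```python
from collections import deque

# Hypothetical bounded gateway queue: requests are accepted while gateway
# capacity (NG, mirroring tokens in Gateway_E) remains; otherwise they must
# wait upstream. NG = 3 is an illustrative value only.
NG = 3
queue = deque()

def admit(request):
    """Accept a request if a gateway channel is free."""
    if len(queue) < NG:
        queue.append(request)
        return True   # request queued for the cloud
    return False      # no channel available: request waits at the edge

accepted = [admit(f"req-{i}") for i in range(5)]
print(accepted)  # [True, True, True, False, False]
```

Once the cloud finishes a request, the corresponding channel would be released
(a token returned to Gateway_E in the SPN), which this sketch omits.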
While recent years have seen significant advances in system instrumentation
and data center energy efficiency and automation, computational resources and
network capacity are often provisioned using best-effort models and coarse-
grained quality-of-service (QoS) mechanisms. In a future networked society per-
meated by connected objects, such approaches will not be sustainable given the
increased loads on networks and data centers. A similar manifestation of the lim-
its of today’s large-scale computing architectures is offered by the limited adop-
tion of cloud infrastructures to deploy systems with low latency demands, such as
telecommunications services. The next generation of distributed cloud architec-
tures should be based on the modeling of complex applications and infrastructures
using fine-grained and accurate application deployment and behavior models. For
all these discussed aspects we consider that the use of edge computing is essential
to have a more optimized service and such a layered architecture is important to
be evaluated.
4 SPN Models
Petri nets are a tool for formal modeling of quantitative properties of
concurrent and synchronized systems. Petri nets with random firing delays
applied to transitions are called stochastic Petri nets (SPNs) [11, 30–33].
Over the last decade, SPNs have attracted researchers' attention for the
modeling and performance analysis of discrete event systems. SPNs are convenient
modeling tools for the performance analysis of parallel, concurrent, dynamic,
and distributed systems. The nature of the temporal specification can be
deterministic or probabilistic. A stochastic process X can be considered a
family of functions of time, whose sample paths trace the events of the process;
it is formally defined as {X(t), t ∈ T}, where T = [0, ∞). Each sample path
denotes a particular trajectory over the state space and consists of a possible
observed behavior of the process events.
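As a small illustration of this formalism (a sketch in Python, not the authors'
tooling), one sample path of a single exponentially timed transition can be
drawn by accumulating random firing delays:

```python
import random

def sample_path(rate, horizon):
    """Simulate the firing instants of one exponentially timed SPN transition.

    rate    -- firing rate (1 / mean delay) of the transition
    horizon -- length of the observation window, in the same time unit as 1/rate
    Returns the list of firing instants in [0, horizon).
    """
    t, firings = 0.0, []
    while True:
        t += random.expovariate(rate)  # exponential inter-firing delay
        if t >= horizon:
            return firings
        firings.append(t)

random.seed(42)
path = sample_path(rate=2.0, horizon=10.0)
# With rate 2.0 over a window of 10 time units, about 20 firings are expected.
print(len(path))
```

Each run of `sample_path` is one trajectory {X(t), t ∈ T} restricted to a finite
window, in the sense described above.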
This section presents the two SPN models proposed considering the scenario pre-
sented before. The first model, taken as a basis, includes no redundancy. Such a first
model encompasses four blocks representing the real architecture: admission, edge,
gateway and cloud. The second model extends the first model and considers redun-
dancy in a component that has the most significant impact on the metrics of interest.
The goal is to provide a simple and accurate way to assess the relationship between
the two metrics. In practice, the proposed SPN models must be used to predict the
behavior of the distributed system by scaling resources or changing the capacity of
the equipment represented by the model. The software designer must collect input
parameters with simple test-beds and feed the model to perform simulations by var-
ying only the model components.
4.1 Base Model

The base SPN model to evaluate the performability of the hybrid system is
presented in Fig. 2, and Table 2 overviews the components of the model.

The data processing flow between the components of the model is described next.
The Admission subnet is made up of two places, IoT and Access_Point, which
represent the wait between data generation and the acceptance of this data in
the queue. Tokens generated in IoT represent any request that involves data
generated by IoT devices for future processing and storage. The AD transition
represents the time between arrivals, and T1 represents the receipt of the
request. Note that T1 is an immediate transition, with no associated delay. T1
fires as soon as there is a token in Access_Point and at least one token in ECS.
When T1 fires, the Edge subnet is reached. A token is taken from Access_Point
and ECS. In T1 firing, a token is returned to the IoT, allowing a new firing.

Fig. 2 Base SPN model, with the respective architecture components

A token
is then added to the EC_REQUEST place. The number of tokens in EC_REQUEST
represents the number of requests being processed at the Edge. The component
associated with T1 represents the availability of the Edge. A token in EC_ON
means that the Edge is available and T1 can be triggered. However, a token in
EC_OFF means that the Edge is down and T1 cannot be fired. The inhibitor arc
serves to inhibit T1 from firing if there is a token in EC_OFF. EC_MTTF and
EC_MTTR represent the edge failure time and recovery time, respectively. ECTS
represents the service time at the Edge. When ECTS is triggered, a token is
taken from EC_REQUEST and a token is added to the EC_RESPONSE place.
When T3 is triggered, the Gateway subnet is reached: a token is added to
Gateway_REQUEST and ECS, and a token is removed from Gateway_E. The Gateway_E
place represents the queuing capability of the Gateway. The capacity of the
gateway can be interpreted as the available channels to the Cloud. Queuing at
the gateway occurs when there is no capacity available to serve newly-arrived
requests. The transition Gateway_TS represents the service time of a request in
the Gateway. When Gateway_TS
Table 2 Description of the components of the base model

Component          Description
IoT Entry of requests
AD Request input delay
Access_Point Edge computing access point
EC_REQUEST Entry of requests in edge computing
ECTS Edge computing processing time
EC_RESPONSE Outbound requests from edge computing
ECS Edge computing processing power
NEC Edge capability
EC_OFF Inactive edge computing
EC_MTTF Average edge computing failure time
EC_MTTR Average edge computing repair time
EC_ON Active edge computing
Gateway_TS Gateway processing time
Gateway_RESPONSE Outbound requests from the gateway
Gateway_E Gateway processing capacity
NG Gateway capability
CC_REQUEST Entry of requests in cloud computing
CCTS Cloud computing processing time
CC_RESPONSE Output of cloud computing requests
CCE Cloud computing processing capacity
NCC Cloud capability
Table 3 Calculated metrics for the base model

Model           Throughput              Availability
No Redundancy   E{#CC_REQUEST}/CCTS     P{#EC_ON > 0}
is triggered, a token is removed from the place Gateway_REQUEST and added to
Gateway_RESPONSE.
The last component of the model is the Cloud Computing subnet, which is reached
when the T4 transition is triggered. When T4 is triggered, a token is added to
the places CC_REQUEST and Gateway_E, and a token is taken from CCE. The CCE
place represents the processing power of the Cloud. CCTS represents the
processing time of a request in the Cloud. When CCTS is triggered, a token is
taken from CC_REQUEST and added to CC_RESPONSE. When T4 is triggered, a token is
taken from CC_RESPONSE and a token is added to the place CCE.
Table 3 presents the metrics for the base model. The throughput is calculated
based on the cloud component, which is the last component of the system. In the
equations, P stands for probability, # stands for the number of tokens in a
given place, and E stands for the expected number of tokens in the corresponding
place. Thus, the throughput equation considers the expected number of tokens in
the cloud divided by the cloud service time. Regarding availability, the system
is considered working if the edge component is working. As aforementioned, we
focus on the edge component as it is considered a constrained point of failure
in this work.
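For intuition on the availability expression P{#EC_ON > 0}, the edge's up/down
behavior can be approximated as an alternating renewal process. The sketch below
assumes exponential times and uses illustrative MTTF/MTTR values consistent with
the ±50% levels of Table 6 (they are not the paper's exact base values); it
estimates availability by simulation and checks it against the steady-state
formula MTTF / (MTTF + MTTR):

```python
import random

def edge_availability(mttf, mttr, n_cycles=100_000):
    """Estimate P{#EC_ON > 0}: the long-run fraction of time the edge is up,
    simulating alternating exponential up (mean MTTF) and down (mean MTTR)
    periods."""
    up = down = 0.0
    for _ in range(n_cycles):
        up += random.expovariate(1.0 / mttf)
        down += random.expovariate(1.0 / mttr)
    return up / (up + down)

# Illustrative midpoints of the Table 6 levels (hypothetical base values).
mttf, mttr = 4765.79, 3.47
random.seed(0)
a_sim = edge_availability(mttf, mttr)
a_exact = mttf / (mttf + mttr)
print(round(a_sim, 5), round(a_exact, 5))
```

The throughput metric of Table 3 is then a simple division: the expected token
count in CC_REQUEST, obtained from the stationary solution of the SPN, divided
by CCTS.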
4.2 Extended Model
The extended model replicates the part of the base model that has the most
significant impact on system availability. The sensitivity analysis in the next
section will show that edge failure and recovery times have the most significant
impact on system availability. Therefore, the extended model applies redundancy
at the edge, following a cold standby pattern [18]. Figure 3 shows the extended
model. For the sake of simplicity, we next provide details addressing only the
additional redundancy in the model. The inhibitor arc present in EC_ON ensures
that if the main component of the Edge is working, the EC2_Switch transition
cannot be triggered, thus preventing both components from working at the same
time. When a token exists in EC2_OFF, it means that the secondary machine is
down and T42 cannot be triggered, thus avoiding processing in EC_REQUEST2 while
EC_REQUEST is in operation.

The place EC_D was included. A token in EC_D means that the redundant edge
component is off. When a token is added to EC_D, that place will immediately
inhibit the EC2_Switch transition from being triggered until EC_OFF is reached
and the Edge is in a down state. The times used for the transitions were
obtained from [34]. Table 4 overviews the components present in the extended
model.
Table 5 presents the metrics for the extended model. Again, the throughput is
calculated based on the cloud component: the throughput equation considers the
expected number of tokens in the cloud divided by the cloud service time.
Regarding availability, the system is considered working if at least one of the
redundant edge components is working.
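A rough Monte Carlo sketch of the cold-standby availability
P{(#EC_ON > 0) OR (#EC2_ON > 0)} can be written as follows. It assumes
exponential times, ignores failures of the spare while it is active, and uses
illustrative parameter values (the switch delay is in the range of the SWITCH
factor in Table 6), so it is an approximation rather than the SPN solution:

```python
import random

def cold_standby_availability(mttf, mttr, switch_time, horizon=1_000_000.0):
    """Estimate availability of a cold-standby pair: when the active edge
    fails, the spare is switched in after an exponential delay (mean
    switch_time) while the failed unit is repaired in the background."""
    t, downtime = 0.0, 0.0
    while t < horizon:
        t += random.expovariate(1.0 / mttf)              # active unit runs until failure
        repair = random.expovariate(1.0 / mttr)          # repair of the failed unit
        switch = random.expovariate(1.0 / switch_time)   # activation of the spare
        # The system is down until the spare takes over or the unit is
        # repaired, whichever happens first (double faults are ignored).
        outage = min(switch, repair)
        downtime += outage
        t += outage
    return 1.0 - downtime / t

random.seed(1)
a = cold_standby_availability(mttf=4765.79, mttr=3.47, switch_time=0.83)
print(round(a, 6))
```

Because each outage is only the shorter of the switch and repair delays, the
redundant configuration spends far less time down than a single edge, which is
what the extended model's availability metric captures.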
Fig. 3 Extended SPN model focusing on replicating the edge component
Table 4 Description of the extended model components

Component            Description
IoT Entry of requests
AD Request input delay
Access_Point Edge computing access point
EC_REQUEST Entry of requests in edge computing
ECTS Edge computing processing time
EC_RESPONSE Outbound requests from edge computing
ECS Edge computing processing power
NEC Edge capability
EC_OFF Inactive edge computing
EC_MTTF Average edge computing failure time
EC_MTTR Average edge computing repair time
EC_ON Active edge computing
EC_D Edge computing Down
EC2_Switch Second edge computing enabler
EC_REQUEST2 Entry of requests on the second edge computing
ECTS2 Second edge computing processing time
EC_RESPONSE2 Output of requests from the second edge computing
ECS2 Second edge computing processing capacity
EC2_OFF Second edge computing inactive
EC2_MTTF Second edge computing average failure time
EC2_MTTR Average repair time of the second edge computing
EC2_ON Second active edge computing
Gateway_REQUEST Entry of requests at the Gateway
Gateway_TS Gateway processing time
Gateway_RESPONSE Outbound requests from the Gateway
Gateway_E Gateway processing capacity
NG Gateway capability
CC_REQUEST Entry of requests in cloud computing
CCTS Cloud computing processing time
CC_RESPONSE Output of cloud computing requests
CCE Cloud computing processing capacity
NCC Cloud capability
Table 5 Calculated metrics for the extended model

Model             Throughput              Availability
With Redundancy   E{#CC_REQUEST}/CCTS     P{(#EC_ON > 0) OR (#EC2_ON > 0)}
Table 6 DoE factors and levels

Factor       Level 1 (−50%)   Level 2 (+50%)
EC_MTTR      1.735            5.205
EC_MTTF      2382.90          7148.69
ECTS         0.005            0.015
Gateway_TS   0.00005          0.00015
CCTS         0.01             0.03
SWITCH       0.41667          1.25000
Table 7 Combination table for DoE analysis of the base model

Comb.  EC_MTTR  EC_MTTF   ECTS   Gateway_TS  CCTS
#1 1.735 2382.90 0.005 0.00015 0.03
#2 5.205 2382.90 0.005 0.00015 0.03
#3 1.735 7148.69 0.015 0.00005 0.03
#4 5.205 2382.90 0.015 0.00015 0.01
#5 1.735 2382.90 0.005 0.00015 0.01
#6 1.735 2382.90 0.015 0.00005 0.03
#7 1.735 2382.90 0.005 0.00005 0.03
#8 5.205 7148.69 0.005 0.00015 0.01
#9 5.205 7148.69 0.015 0.00015 0.03
#10 1.735 2382.90 0.015 0.00015 0.03
#11 1.735 2382.90 0.015 0.00015 0.01
#12 1.735 7148.69 0.005 0.00015 0.03
#13 1.735 2382.90 0.015 0.00005 0.01
#14 5.205 2382.90 0.005 0.00015 0.01
#15 5.205 2382.90 0.015 0.00015 0.03
#16 1.735 7148.69 0.015 0.00015 0.01
#17 5.205 7148.69 0.015 0.00005 0.03
#18 5.205 2382.90 0.005 0.00005 0.01
#19 5.205 7148.69 0.005 0.00005 0.01
#20 1.735 7148.69 0.005 0.00005 0.01
#21 5.205 7148.69 0.015 0.00015 0.01
#22 1.735 7148.69 0.005 0.00015 0.01
#23 5.205 2382.90 0.015 0.00005 0.01
#24 5.205 7148.69 0.005 0.00005 0.03
#25 1.735 7148.69 0.005 0.00005 0.03
#26 5.205 7148.69 0.015 0.00005 0.01
#27 1.735 2382.90 0.005 0.00005 0.01
#28 5.205 2382.90 0.015 0.00005 0.03
#29 5.205 7148.69 0.005 0.00015 0.03
#30 5.205 2382.90 0.005 0.00005 0.03
#31 1.735 7148.69 0.015 0.00015 0.03
#32 1.735 7148.69 0.015 0.00005 0.01
5 Sensitivity Analysis with DoE

Design of Experiments (DoE) corresponds to a collection of statistical
techniques that allow deepening the knowledge about the product or process under
study [35]. It can also be defined as a series of tests in which the researcher
changes the set of variables or input factors to observe and identify the
reasons for changes in the output response. The parameters to be changed are
defined using an experiment plan. The goal is to generate the most significant
amount of information with the fewest possible experiments. The behavior of the
system under parameter changes can then be observed through sets of outputs.
This section presents a sensitivity analysis study using the DoE method. The DoE execution seeks to identify the factors that most influence the system and whether there are interactions between factors. In our work, availability and throughput are the metrics considered. To evaluate these metrics, the timed transitions were established as factors, with their levels varying by −50% and +50% of their base values. Table 6 presents the chosen factors and respective levels. Table 7 presents the thirty-two combinations of factors for the base model. The table of combinations for the extended model was omitted because it is largely redundant.
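The thirty-two combinations of Table 7 form a two-level full factorial design (2^5 = 32 runs). A minimal sketch of how such a design can be generated is shown below; the base values are taken from the parameters used elsewhere in the paper and are assumptions for illustration, not a reproduction of Table 6.

```python
from itertools import product

# Assumed base values for the five timed transitions (factors); the paper's
# actual base values are given in Table 6.
base = {
    "EC_MTTR": 3.47,       # h
    "EC_MTTF": 4765.9,     # h
    "ECTS": 0.01,          # s
    "Gateway_TS": 0.0001,  # s
    "CCTS": 0.02,          # s
}

# Two-level full factorial design: each factor at -50% and +50% of its base.
levels = {name: (0.5 * v, 1.5 * v) for name, v in base.items()}
names = list(levels)
design = [dict(zip(names, combo)) for combo in product(*levels.values())]

print(len(design))   # 2^5 = 32 runs, as in Table 7
print(design[0])     # first run: every factor at its low level
```

Each generated run is then fed to the SPN model to obtain the corresponding availability and throughput responses.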
5.1 Results for Availability

Table 8 shows the impact of each factor on the availability (the most significant ones are highlighted in blue). EC_MTTF and EC_MTTR are the most significant factors in both cases; however, in the base model the MTTR stands out slightly, whereas in the extended model the MTTF does. One possible cause is that the edge is the first component to receive requests: if the component fails sooner (smaller EC_MTTF) or takes longer to recover from a failure (larger EC_MTTR), the input transition is blocked for longer, and the system is unavailable for a longer time. Only the edge has MTTF and MTTR parameters because it is more likely to fail than the more mature cloud technologies.
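The effect values in Table 8 follow the usual two-level DoE definition: the effect of a factor is the mean response at its high level minus the mean response at its low level. The sketch below illustrates this computation on a synthetic 2x2 toy design; the response function and helper names (`availability`, `main_effect`) are illustrative assumptions, since the paper's responses come from solving the SPN models.

```python
# 2^2 toy design over coded levels (-1 = -50%, +1 = +50%) for two factors,
# EC_MTTR and EC_MTTF.
runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

def availability(mttr_lvl, mttf_lvl):
    # Synthetic response: availability falls with MTTR and rises with MTTF.
    return 0.99 - 0.003 * mttr_lvl + 0.0029 * mttf_lvl

def main_effect(responses, levels):
    # Mean response at the high level minus mean response at the low level.
    high = [r for r, l in zip(responses, levels) if l > 0]
    low = [r for r, l in zip(responses, levels) if l < 0]
    return sum(high) / len(high) - sum(low) / len(low)

resp = [availability(a, b) for a, b in runs]
eff_mttr = main_effect(resp, [a for a, _ in runs])
eff_mttf = main_effect(resp, [b for _, b in runs])
print(eff_mttr, eff_mttf)  # ≈ -0.006 and ≈ 0.0058
```

With this convention, a larger absolute effect means a more sensitive factor, which is how Table 8 ranks EC_MTTR and EC_MTTF above the service-time transitions.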
Interaction graphs identify interactions between factors. An interaction occurs when the influence of a given factor on the result is altered (amplified or reduced) by a change in another factor. If the lines on the graph are parallel, there is no interaction between the factors; if they are not parallel, the factors interact [35].

Table 8 Sensitivity indexes for availability

Factor       Effect—base model   Effect—ext. model
EC_MTTR      0.05706             0.016435
EC_MTTF      0.05699             0.016457
Gateway_TS   0.00005             0.000001
ECTS         0.00000             0.000007
CCTS         0.00000             0.000012
SWITCH       –                   0.000015

Fig. 4 Interaction plots for availability. a Interaction—base model. b Interaction—extended model

Table 9 Sensitivity indexes for throughput

Factor       Effect—base model   Effect—ext. model
EC_MTTR      4.36500             0.90400
EC_MTTF      4.35000             0.91050
ECTS         0.00000             0.67780
Gateway_TS   0.00000             0.00000
CCTS         0.00000             16.1934
SWITCH       –                   0.00800
Figure 4a, b shows the interaction plots for the availability considering the factors EC_MTTR and EC_MTTF, the most sensitive ones. In both cases (base and extended models), there are interactions. In the base model (Fig. 4a), the effect of EC_MTTF changes for distinct values of EC_MTTR: the variation in availability is higher when EC_MTTR is 22.56 h. In other words, for the same MTTF variation (150 -> 450), the impact on availability is higher with a higher MTTR. In absolute numbers, however, the highest MTTF (450) combined with the lowest MTTR (7.5) reaches the best availability result, about 98%. Observing now the interaction plot for the extended model (Fig. 4b), the interaction is stronger than in the base model, since the lines are further from parallel.

Fig. 5 Interaction plots for throughput. a Interaction—base model. b Interaction—extended model
5.2 Results for Throughput

Table 9 shows the impact of each factor on the throughput. Similar to the availability results, EC_MTTR is the most sensitive factor in the base model. In the extended model, however, the transition CCTS (related to the cloud service time) impacts the throughput the most. CCTS is highly relevant because the cloud is the last component of the system, and the throughput equation considers this component (one place and one transition), as discussed in Sect. 4.
Figure 5 shows the interaction of factors for the throughput metric. The most sensitive factors were chosen to study their interaction. For the base model, the result is similar to the availability one: EC_MTTF and EC_MTTR have a significant impact on throughput and interact. In the extended model, however, the factors CCTS and EC_MTTF do not interact significantly, as the lines are almost parallel.
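The "parallel lines" reading of Figs. 4 and 5 has a direct numerical counterpart: in a 2x2 design, the interaction effect is half the "difference of differences" between the two lines. The sketch below illustrates this with synthetic throughput values; the helper name `interaction_effect` and the numbers are assumptions for illustration only.

```python
def interaction_effect(y_ll, y_hl, y_lh, y_hh):
    """Interaction effect of factors A and B in a 2x2 design.

    y_ab: response with factor A at level a and factor B at level b
    (l = low, h = high). Parallel plot lines mean the result is near zero.
    """
    slope_b_low = y_hl - y_ll    # effect of A when B is low
    slope_b_high = y_hh - y_lh   # effect of A when B is high
    return (slope_b_high - slope_b_low) / 2.0

# Nearly parallel lines -> negligible interaction (synthetic req/s values):
print(interaction_effect(70.0, 75.0, 40.0, 45.2))  # ≈ 0.1
# Clearly non-parallel lines -> strong interaction:
print(interaction_effect(70.0, 75.0, 40.0, 60.0))  # ≈ 7.5
```

A value near zero corresponds to the almost-parallel CCTS/EC_MTTF lines of the extended model, while a large value corresponds to the crossing lines seen for EC_MTTR/EC_MTTF.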
6 Case Study

This section presents a case study. First, the simulation parameters used in this work are shown, and then the results obtained through the simulations are detailed. The models and simulations were executed using the Mercury tool [36], a modeling software appropriate not only for Petri net models but also for Markov chains. Table 10 shows the values used for the transitions and places of the model. The times used for the transitions were obtained from [34].
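As a rough plausibility check on these parameters, each station's throughput can be bounded by its number of parallel servers (tokens) divided by its mean service time, with the slowest station capping the system. This is only a back-of-the-envelope sketch under that assumption, not the SPN stationary analysis performed by Mercury:

```python
# Upper-bound capacity of each station: tokens / mean service time (req/s).
# Values follow Table 10; the bound ignores failures and queueing, so it only
# indicates where the bottleneck lies.
stations = {
    "edge":    (2, 0.01),    # NEC = 2 servers, ECTS = 0.01 s
    "gateway": (2, 0.0001),  # NG = 2 servers, Gateway_TS = 0.0001 s
    "cloud":   (2, 0.02),    # NCC = 2 servers, CCTS = 0.02 s
}

caps = {name: k / s for name, (k, s) in stations.items()}
bottleneck = min(caps, key=caps.get)
print(caps)        # edge ≈ 200, gateway ≈ 20000, cloud ≈ 100 req/s
print(bottleneck)  # the cloud stage
```

The cloud stage having the lowest capacity is consistent with the high sensitivity index of CCTS found in Sect. 5.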
Figure 6 presents 3D surface plots showing the system behavior in terms of availability (Fig. 6a and b show the results for the base model and for the extended model, respectively). We have varied the MTTF and MTTR parameters of the edge computing component. The values were based on the literature [34] (MTTR = 3.47 h, MTTF = 4765.9 h), in increments of 25% up and down. Therefore, the MTTR ranged from 1.7 to 5.2 h, and the MTTF ranged from 2382 to 7147 h. The availability is presented in number-of-nines form. The following equation is used to calculate the

Table 10 Transition and place values

Component            Type         Value
AD                   Transition   0.02 s
EC_MTTF, EC2_MTTF    Transition   4765.9 h
EC_MTTR, EC2_MTTR    Transition   3.47 h
ECTS, ECTS2          Transition   0.01 s
Gateway_TS           Transition   0.0001 s
CCTS                 Transition   0.02 s
EC_Switch            Transition   0.833333 s
NEC, NG, NCC         Place        2
Fig. 6 Availability analysis by varying concomitant factors. a Base model. b Extended model
number of nines: A_nines = −log10(1 − Availability) [37]. First of all, it is essential
to say that the colors are related to the level of availability: purple represents the lowest availability, and red the highest. In both models, the MTTF and MTTR significantly impact availability, but there are some peculiarities. The availability of the base model is lower than that of the extended one: the highest availability is 3.6 nines in the base model, while the lowest availability of the extended model is 4.8 nines. As the sensitivity analysis indicated, MTTF has more impact than MTTR. In the base model, this difference is not so evident; in the extended model, however, the MTTF indeed has a higher impact than the MTTR. For MTTF > 6000 h, the color becomes red (A > 5.4 nines) independently of the MTTR value. Such a higher impact of MTTF on the extended model is reinforced by the sensitivity indexes presented in Sect. 5 (Table 8): in the base model the MTTR is more relevant, while in the extended model the MTTF is more relevant. Also, the extended model presents a small inflection at the bottom of the graph. We believe that this may be caused by the stronger interaction in the extended model (see the interaction plots in Fig. 4). Another difference between the base and extended results is the color distribution: blue is more prominent in the extended model graph, meaning that an availability of about 5.011 nines occurs more frequently than the other values.
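The number-of-nines conversion used for these surfaces can be sketched directly; the helper names below (`nines`, `from_nines`) are illustrative, not from the paper.

```python
import math

# A_nines = -log10(1 - A) [37]: e.g., A = 0.999 yields 3 nines, and the
# base-model peak of ~3.6 nines corresponds to A ≈ 0.99975.
def nines(availability):
    return -math.log10(1.0 - availability)

def from_nines(n):
    # Inverse conversion: availability from the number of nines.
    return 1.0 - 10.0 ** (-n)

print(round(nines(0.999), 3))     # 3.0
print(round(from_nines(3.6), 5))  # 0.99975
```

The logarithmic scale is what makes the color bands of Fig. 6 spread evenly even though the underlying availabilities differ only in the fourth or fifth decimal place.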
Figure 7a and b show the results for the base and extended models, respectively, in terms of throughput. The factors were chosen based on the sensitivity analysis results: the base model analysis varies the MTTF/MTTR of the edge computing component, whereas the extended model analysis varies the MTTF of the edge together with the cloud service time (CCTS). In the first graph, for the base model, the throughput variation is extremely low, ranging from 76.36 to 76.52 req/s. Even so, it is possible to visualize the tendency of the throughput behavior: when the MTTR increases, the throughput decreases, and when the MTTF increases, the throughput increases. The low throughput variability happens because, in this scenario, the availability is very high (more than 2.6 nines), as can be seen in Fig. 6. This way, we can assume that the system is almost "always" available, so varying the MTTF/MTTR does not cause much difference. In the second graph (Fig. 7b), the variability is significant: the throughput ranges from 20 to 70 req/s. However, the EC_MTTF does not impact the result as much as the CCTS factor. In fact, the sensitivity index of CCTS (16.19) is much higher than all the others (< 1) (see Table 9). Besides, there is no interaction between the two factors, differently from all previous results.
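The near-flat base-model surface can be sanity-checked with a simple estimate: if throughput is roughly capacity times availability, then moving availability between the extremes observed in Fig. 6a barely changes it. The relation and the ~76.5 req/s capacity below are assumptions for illustration, not the SPN computation:

```python
# Throughput ≈ capacity * availability (illustrative assumption).
capacity = 76.52  # req/s, approximate base-model peak throughput

for n in (2.6, 3.6):  # availability range of Fig. 6a, in number of nines
    availability = 1.0 - 10.0 ** (-n)
    print(round(capacity * availability, 2))  # 76.33, then 76.5
```

The estimate spans only about 0.2 req/s, close to the 76.36 to 76.52 req/s range reported for the base model, which explains why the MTTF/MTTR variation is barely visible in Fig. 7a.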
7 Limitations of the Work

In the following, we list some of the limitations of this work:

• Although our model enables scalability, it does not do so automatically. We did not model such a requirement because, unlike the related work [38], we did not focus on response time.
• We did not evaluate the availability of the gateway and the cloud, only that of the edge, believing that the edge has a higher impact on the system due to its client
Fig. 7 Throughput analysis by varying concomitant factors. a Base model. b Extended model
proximity. We also decided to evaluate only the edge availability, rather than that of all components, to avoid the state-space explosion problem. It is possible that the non-evaluated components are equally important.
• Other metrics could also be investigated, such as drop rate and usability; however, we have investigated such metrics in a previous work [30].
• Although important, cloud elasticity was not evaluated in the model proposed in this work, because we decided to simplify the cloud component, abstracting internal details to focus on the multi-layer edge/cloud architecture problem.
8 Conclusion

This paper proposed two stochastic Petri net models (one base and one extended) to measure the performability of an architecture for an IoT system with processing at the edge and in the cloud. The models can represent an abstract distributed system composed of IoT, edge, and cloud layers, and are highly configurable so that they can be used in diverse simulation scenarios. A sensitivity analysis based on the DoE method revealed the components with the most significant impact on the studied architecture and made it possible to optimize the base SPN model. We also performed a case study that explored multiple metrics of interest concurrently and serves as a guide for model utilization. Ultimately, the proposed approach can assist system designers in avoiding unnecessary investment in original equipment. As future work, we intend to extend the model focusing on the other components and to explore new metrics such as mean response time and resource utilization.
References
1. Aceto, G., Persico, V., Pescapé, A.: Industry 4.0 and health: Internet of things, big data, and cloud
computing for healthcare 4.0. J. Ind. Inf. Integr. 100129 (2020)
2. Zhao, Z., Zhou, W., Deng, D., Xia, J., Fan, L.: Intelligent mobile edge computing with pricing in
internet of things. IEEE Access 8, 37727–37735 (2020)
3. Patel, M., Naughton, B., Chan, C, Sprecher, N., Abeta, S., Neal, A. etal.: Mobile-edge computing
introductory technical white paper. White paper, mobile-edge computing (MEC) industry initiative,
1089–7801 (2014)
4. Wei, H., Luo, H., Sun, Y.: Mobility-aware service caching in mobile edge computing for internet of
things. Sensors 20(3), 610 (2020)
5. Khan, L.U., Yaqoob, I., Tran, N.H., Kazmi, S.M., Ahsan, D., Nguyen, T., Hong, C.S.: Edge computing enabled smart cities: a comprehensive survey. IEEE Internet Things J. (2020)
6. Dang, L.M., Piran, M., Han, D., Min, K., Moon, H.: A survey on internet of things and cloud com-
puting for healthcare. Electronics 8(7), 768 (2019)
7. Dizdarević, J., Carpio, F., Jukan, A., Masip-Bruin, X.: A survey of communication protocols for
internet of things and related challenges of fog and cloud computing integration. ACM Comput.
Surv. (CSUR) 51(6), 1–29 (2019)
8. Sharma, G., Kalra, S.: A lightweight user authentication scheme for cloud-IoT based healthcare ser-
vices. Iran. J. Sci. Technol. Trans. Electr. Eng. 43(1), 619–636 (2019)
9. Silva, F.F.F.A., Kosta, S., Rodrigues, M., Oliveira, D., Maciel, T., Mei, A., Maciel, P.: Mobile cloud
performance evaluation using stochastic models. IEEE Trans. Mob. Comput. 17(5), 1134–1147
(2017)
10. Silva, F.A., Rodrigues, M., Maciel, P., Kosta, S., Mei, A.: Planning mobile cloud infrastructures
using stochastic petri nets and graphic processing units. In 2015 IEEE 7th International Conference
on Cloud Computing Technology and Science (CloudCom), pp. 471–474. IEEE (2015)
11. Silva, F.A., Fé, I., Gonçalves, G.: Stochastic models for performance and cost analysis of a hybrid
cloud and fog architecture. J. Supercomput. (2020)
12. Oliveira, D., Brinkmann, A., Rosa, N., Maciel, P.: Performability evaluation and optimization of
workflow applications in cloud environments. J. Grid Comput. 17(4), 749–770 (2019)
13. Silva, B., Matos, R., Tavares, E., Maciel, P., Zimmermann, A.: Sensitivity analysis of an availability
model for disaster tolerant cloud computing system. Int. J. Netw. Manag. 28(6), e2040 (2018)
14. Li, A., Kusuma, G., James, D., Lim, R.: Design of experiment (doe) approach to identify critical
parameters in a counterflow centrifugation system. Cytotherapy 22(5), S151–S152 (2020)
15. Jiang, F.-C., Hsu, C.-H., Wang, S.: Logistic support architecture with petri net design in cloud envi-
ronment for services and profit optimization. IEEE Trans. Serv. Comput. 10(6), 879–888 (2016)
16. Santos, G.L., Endo, P.T., Gonçalves, G., Rosendo, D., Gomes, D., Kelner, J., Sadok, D., Mahloo,
M.: Analyzing the it subsystem failure impact on availability of cloud services. In: 2017 IEEE Sym-
posium on Computers and Communications (ISCC), pp. 717–723. IEEE (2017)
17. Gomes, D.M., Endo, P.T., Gonçalves, G., Rosendo, D., Santos, G.L., Kelner, J., Sadok, D., Mahloo,
M.: Evaluating the cooling subsystem availability on a cloud data center. In: 2017 IEEE Symposium
on Computers and Communications (ISCC), pp. 736–741. IEEE (2017)
18. Sousa, E., Lins, F., Tavares, E., Cunha, P., Maciel, P.: A modeling approach for cloud infrastructure
planning considering dependability and cost requirements. IEEE Trans. Syst. Man Cybern. Syst.
45(4), 549–558 (2014)
19. Melo, C., Matos, R., Dantas, J., Maciel, P.: Capacity-oriented availability model for resources esti-
mation on private cloud infrastructure. In: 2017 IEEE 22nd Pacific Rim International Symposium
on Dependable Computing (PRDC), pp. 255–260. IEEE (2017)
20. Ni, L., Zhang, J., Jiang, C., Yan, C., Kan, Yu.: Resource allocation strategy in fog computing based
on priced timed petri nets. IEEE Internet Things J. 4(5), 1216–1228 (2017)
21. Yates, R.D., Tavan, M., Hu, Y., Raychaudhuri, D.: Timely cloud gaming. In: IEEE INFOCOM
2017-IEEE Conference on Computer Communications, pp. 1–9. IEEE (2017)
22. Oueida, S., Kotb, Y., Aloqaily, M., Jararweh, Y., Baker, T.: An edge computing based smart health-
care framework for resource management. Sensors 18(12), 4307 (2018)
23. Rodrigues, M., Vasconcelos, B., Gomes, C., Tavares, E.: Evaluation of nosql dbms in private cloud
environment: an approach based on stochastic modeling. In: 2019 IEEE International Systems Con-
ference (SysCon), pp. 1–7. IEEE (2019)
24. Santos, G.L., Endo, P.T., Lisboa, M.F.F.S., Silva, L.G.F., Sadok, D., Kelner, J., Lynn, T.: Analyzing
the availability and performance of an e-health system integrated with edge, fog and cloud infra-
structures. J. Cloud Comput. 7(1), 16 (2018)
25. Ahsan, U., Bais, A.: Distributed smart home architecture for data handling in smart grid. Can. J.
Electr. Comput. Eng. 41(1), 17–27 (2018)
26. Mokhtari, G., Anvari-Moghaddam, A., Zhang, Q.: A new layered architecture for future big data-
driven smart homes. IEEE Access 7, 19002–19012 (2019)
27. Rodrigues, L.A., Endo, P.T., da Silva, F.A.P.: Modelo estocástico para avaliação de desempenho de
hospitais inteligentes. In: Anais da VII Escola Regional de Computação Aplicada à Saúde, pp. 7–12.
SBC (2019)
28. Andrade, E., Machida, F.: Analysis of software aging impacts on plant anomaly detection with edge
computing. In: 2019 IEEE International Symposium on Software Reliability Engineering Work-
shops (ISSREW), pp. 204–210. IEEE (2019)
29. Morabito, R., Cozzolino, V., Ding, A.Y., Beijar, N., Ott, J.: Consolidate IoT edge computing with
lightweight virtualization. IEEE Netw. 32(1), 102–111 (2018)
30. Carvalho, D., Rodrigues, L., Endo, P.T., Kosta, S., Silva, F.A.: Mobile edge computing performance
evaluation using stochastic petri nets. In: 2020 IEEE Symposium on Computers and Communica-
tions (ISCC), pp. 1–6. IEEE (2020)
31. Santos, G.L., Gomes, D., Kelner, J., Sadok, D., Silva, F.A., Endo, P.T., Lynn, T.: The internet of
things for healthcare: optimising e-health system availability in the fog and cloud. Int. J. Comput.
Sci. Eng. 21(4), 615–628 (2020)
32. Ferreira, L., da Silva Rocha, E., Monteiro, K.H.C., Santos, G.L., Silva, F.A., Kelner, J., Sadok, D., Bastos-Filho, C.J.A., Rosati, P., Lynn, T., et al.: Optimizing resource availability in composable data center infrastructures. In: 2019 9th Latin-American Symposium on Dependable Computing (LADC), pp. 1–10. IEEE (2019)
33. Rodrigues, L., Endo, P.T., Silva, F.A.: Stochastic model for evaluating smart hospitals performance.
In: 2019 IEEE Latin-American Conference on Communications (LATINCOM), pp. 1–6. IEEE
(2019)
34. El Kafhali, S., Salah, K.: Efficient and dynamic scaling of fog nodes for IoT devices. J. Supercom-
put. 73(12), 5261–5284 (2017)
35. Kleijnen, J.P.C.: Sensitivity analysis and optimization in simulation: design of experiments and case
studies. In: Winter Simulation Conference Proceedings, 1995., pp. 133–140. IEEE (1995)
36. Silva, B., Matos, R., Callou, G., Figueiredo, J., Oliveira, D., Ferreira, J., Dantas, J., Junior, A.L.,
Alves, V., Maciel, P.: Mercury: An integrated environment for performance and dependability eval-
uation of general systems. In: Proceedings of Industrial Track at 45th Dependable Systems and Net-
works Conference (DSN) (2015)
37. Callou, G., Maciel, P., Tutsch, D., Araújo, J., Ferreira, J., Souza, R.: A petri net-based approach to
the quantification of data center dependability. In: Petri Nets-Manufacturing and Computer Science,
pp. 313–336 (2012)
38. Fe, I., Matos, R., Dantas, J., Melo, C., Maciel, P.: Stochastic model of performance and cost for
auto-scaling planning in public cloud. In: 2017 IEEE International Conference on Systems, Man,
and Cybernetics (SMC), pp. 2081–2086. IEEE (2017)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
Authors and Affiliations

Lucas Santos¹ · Benedito Cunha¹ · Iure Fé² · Marco Vieira³ · Francisco Airton Silva¹

Lucas Santos
vinicius.lucas@ufpi.edu.br

Benedito Cunha
bene_rodrigo@ufpi.edu.br

Iure Fé
iuresf@gmail.com

Marco Vieira
mvieira@dei.uc.pt

1 Universidade Federal do Piauí (UFPI), R. Cícero Duarte, no. 905 - Junco, Picos, PI 64607-670, Brazil
2 Departamento de Engenharia Informática, Universidade de Coimbra, Polo II - Pinhal de Marrocos, 3030-290 Coimbra, Portugal
3 Exército Brasileiro, Terceiro BEC, Picos, Piauí, Brazil