Reinforcing the Edge: Autonomous Energy
Management for Mobile Device Clouds
Venkatraman Balasubramanian, Faisal Zaman, Moayad Aloqaily§, Saed Alrabaee,
Maria Gorlatova, Martin Reisslein
Arizona State University, AZ, USA, {Vbalas11, Reisslein}@asu.edu
University of Ottawa, Ottawa, ON, Canada, Fzama075@uottawa.ca
§Canadian University Dubai, UAE, maloqaily@ieee.org
United Arab Emirates University (UAEU), UAE, Salrabaee@uaeu.ac.ae
Duke University, NC, USA, Maria.Gorlatova@duke.edu
Abstract—The collaboration among mobile devices to form an
edge cloud for sharing computation and data can drastically
reduce the tasks that need to be transmitted to the cloud. More-
over, reinforcement learning (RL) research has recently begun to
intersect with edge computing to reduce the amount of data (and
tasks) that needs to be transmitted over the network. For battery-
powered Internet of Things (IoT) devices, the energy consumption
in collaborating edge devices emerges as an important problem.
To address this problem, we propose an RL-based Droplet
framework for autonomous energy management. Droplet learns
the power-related statistics of the devices and forms a reliable
group of resources for providing a computation environment
on-the-fly. We compare the energy reductions achieved by two
different state-of-the-art RL algorithms. Further, we model a
reward strategy for edge devices that participate in the mobile
device cloud service. The proposed strategy effectively achieves
a 10% gain in the rewards earned compared to state-of-the-art
strategies.
Index Terms—Mobile Edge Computing, Device Clouds, Inter-
net of Things, Reinforcement Learning.
I. INTRODUCTION
As the IoT revolution expands further into the 5G realm, low-latency services, connected and autonomous systems, and data and service management have been the focus of today's research [1]–[6]. Computation and caching at the edge of the cellular network have been proposed as main strategies for achieving sub-millisecond latencies [7]–[9]. Past research has demonstrated that bringing computation closer to the user results in manageable response times [10], [11]. However, the deployment overheads and costs of these edge entities can be prohibitive for last-mile users. To cope with these costs, the authors of [12], [13] have proposed composing idle mobile device resources into a collaborative environment, thereby producing a computation infrastructure [14]. However, none of these solutions consider the energy drain inside mobile devices, which affects the stability of the mobile environment. Our interest in this paper is the energy consumption in Mobile Device Clouds (MDC) at the time of task execution. Consider, for example, a coffee shop where people have gathered and linger for some time. In such scenarios, instead of requesting a costly public cloud service, it is more feasible to form a local device cloud among socially bonded users [15]. However, the battery-powered devices need a local energy manager to decide whether a device can continue to participate or should be excluded.
Thus, this problem involves a decision-making process at the device end, where a decision maker module can either choose to stop (for which the device receives a reward and the system terminates) or pay a certain cost (in terms of energy dissipation) to continue. If the decision is to terminate, the device receives a termination reward, whereas if the decision is to continue, task execution proceeds until the request is complete. All these procedures should be executed without any user intervention. Our primary motive here is not to identify the battery-draining periods, but to investigate the energy constraints within the device-defined power thresholds, which are estimated over a period of time. Implicitly, this decision is also responsible for maximizing rewards over that period. Moreover, to the best of our knowledge, this work is the first of its kind to employ learning models to resolve such an issue. To this end, we simulate two state-of-the-art algorithms, namely greedy and near-greedy. We also deploy a gradient-based approach to compare and show the resulting rewards. We observe an approximately 10% increase in rewards earned using the gradient approach.
A. Motivation
To reduce the overheads in terms of the deployment costs of cloudlets and edge data centers, leveraging local idle device resources is essential. Hence, opportunistic MDC environments are an important alternative to edge cloud deployments. These devices are hand-held, user-specific nodes that change location based on user movements. Therefore, the challenges here are twofold. First, there is device movement: both the user who requests the service and the serving entity are subject to motion. Second, there are constant disruptions due to the drain-out periods of devices. Earlier works such as [16] have presented offloading-based solutions. Thus, in this paper we target only the problem of energy maintenance in such environments, where rewards are granted through a continuous learning process.
B. Problem Definition
In various scenarios, such as smart homes and smart buildings (e.g., libraries and coffee shops), where devices may or may not be connected to a power source, there are always times when devices remain idle. In such scenarios, discovering nodes that are willing to provide a cohesive service by establishing a local computation environment yields a cost-effective compute location compared to a centralized public cloud. However, it has recently been found that such opportunistic edge cloud deployments require energy maintenance. Although task execution is a priority, if a device runs out of power, the resiliency of such a model fails. Thus, our motivation to use reinforcement learning stems from the challenge of finding proper energy thresholds. Further, this technique provides the twofold benefit of reward calculation along with reliable device selection in such volatile environments. Thus, training enables a highly robust system.
C. Contributions
To this end, our contributions in this paper are as follows:
1) We design a framework for autonomous energy management relying on a reinforcement learning approach [17]. The design is based on the interaction of agents and proceeds without human input.
2) The result of such a design is the production of rewards commensurate with job completion for the nodes participating in an MDC service. As the termination of a service does not depend on any external inputs, it is completely autonomous.
The remainder of this document delineates the related work in Section II, followed by the design in Section III. In Section IV we discuss the performance evaluation. We provide concluding remarks in Section V.
Fig. 1: Droplet Architecture
II. RELATED WORKS
In [18], the authors propose a distributed topology control algorithm that combines design theories, focusing on asynchronous and asymmetric neighbor discovery. According to this concept, neighbor discovery schedules are made based on a combinatorial design of "multiples of 2," after which target duty cycling is performed. In addition, the multiples of 2 are applied to overcome the challenge of the block design and to support asymmetric operation. The proposed method achieves the smallest total number of slots and wake-up slots among existing representative neighbor discovery protocols. In contrast to this work, we propose an autonomous device environment where the decision to terminate is taken solely by the device itself, based on its native energy statistics (including duty cycling periods at the time of discovery). Further, the authors do not consider a device cloud environment, where the dynamics of movement affect the overall energy of the device due to continuous radio module operation.
In [19], Lee et al. propose a cloud-based energy management approach that eliminates the prediction overhead by offloading it to a remote cloud. This framework pre-computes the execution time by profiling web applications on dedicated mobile devices in the cloud. Behaving like typical caches, when mobile web applications request data from servers, both the data and its execution times are delivered to the users' mobile devices. A performance control agent on the mobile device then selects an operating point to meet the response time requirement. Contrary to this approach, our model does not make any decisions remotely in the cloud; the decisions are made within the composition environment.
In [15], Habak et al. propose a femto-cloud system which provides a dynamic, self-configuring, multi-device mobile cloud from a collection of mobile devices. The authors present the femto-cloud system architecture, which enables multiple mobile devices to be composed into a cloud computing service despite churn in mobile device participation. Different from this framework, we develop a learning-based approach to energy management that repeatedly updates its local energy decision controller and self-heals by withdrawing a device from the execution, rather than taking over a task and dropping it due to a lack of energy.
In [12], [14], the authors propose a novel distributed device cloud strategy with emphasis on mobility and maintaining seamless connectivity. Both of these works, however, fail to consider the energy component of the devices while providing the service, which leaves the model incomplete. Different from these, our model proposes, for the first time, an energy-aware MDC that is based on a reinforcement learning mechanism.
III. SYSTEM DESIGN
A. Overall Droplet Architecture
Fig. 1 shows the design. Our design mainly consists of three modules: the user application, the controller, and the network interface.
1) The User Application: This refers to any application that can avail itself of the MDC. Via this module, the user profiles and resource information are obtained. As elaborated before in [20], the user can configure the amount of resources to be shared before participating in the device cloud formation.
2) The Network Interface: This is the built-in network card (e.g., Wi-Fi) for communication with the other devices and network nodes.
The controller resides in every device below the application layer. It mainly comprises the following modules:
1) Network Monitor and Communication Agent: Once every device offers its network location and resource capacities (how much storage can be shared) to form the device cloud, the network monitor keeps track of this resource information. The information is passed to the decision control module, which tracks the energy dissipation and maintains a threshold for the resources that are part of the current composition. A 'renew'/'top-up' message is sent to any resource that drops below this value (a minimal sketch of this threshold check is given after this list). Due to space limitations, we do not detail the control messages exchanged at this juncture. Once a 'renew' message is received, the affected device must either find a new device before exiting the current cloud or allot its tasks back to the requester, making the requester aware of its power status via the communication agent.
The communication agent and the network monitor operate cohesively. This module performs the evaluation of the network topology and the actual maintenance updates of the task execution. Discovered neighbours and newly obtained user profiles are kept in the database through this module.
2) Decision Control: This is the core of the controller and maintains the energy information. The two modules above provide periodic updates to the decision control to maintain accuracy while the tasks are being executed. A scheduling policy that operates based on device movements and energy is required. It receives inputs from the resource estimation module about the load and the overall behavior of the resources participating in the device cloud. It performs the main learning mechanism, as elaborated in the next section.
3) Task Manager and Resource Estimation: These two modules hold the main database associations. Here, information is retrieved from all arriving tasks and mapped to the resource scheduler. The energy profiles are the sensed data of the devices participating in the MDC. When a device is connected to a power source, the task manager records a local approval that the device's 'capability' is 100%, meaning it has the most reliable device profile. This module updates the task execution progress and maintains the status of incoming jobs.
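As a minimal illustration of the renew/top-up logic described above, the following Python sketch flags devices whose remaining energy falls below the decision-control threshold; the message format, threshold value, and field names are illustrative assumptions, not specified in this paper.

    # Minimal sketch (assumption: message formats, thresholds, and field
    # names are illustrative; the paper does not specify them).
    from dataclasses import dataclass

    @dataclass
    class DeviceProfile:
        device_id: str
        energy_joules: float   # remaining energy reported by the device
        plugged_in: bool       # connected to a power source

    RENEW_THRESHOLD_J = 50.0   # hypothetical threshold kept by decision control

    def monitor_step(devices):
        """Network-monitor pass: flag members whose energy crosses the threshold."""
        messages = []
        for d in devices:
            if d.plugged_in:
                continue       # plugged-in devices are treated as 100% capable
            if d.energy_joules < RENEW_THRESHOLD_J:
                # Decision control asks the device to renew/top-up or hand tasks back.
                messages.append({"to": d.device_id, "type": "renew"})
        return messages

    if __name__ == "__main__":
        cloud = [DeviceProfile("dev-a", 120.0, False),
                 DeviceProfile("dev-b", 30.0, False),
                 DeviceProfile("dev-c", 10.0, True)]
        print(monitor_step(cloud))   # -> [{'to': 'dev-b', 'type': 'renew'}]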
B. Modelling Decision Control
In our system, a collection of devices that discover each
other over Wi-Fi is considered. These devices are idle and
provide a composed service as recognized before in [20]. We
TABLE I: Notations and Definitions

Notation: Definition
R(i): Points offered for participating in the device cloud; accrued over time for MDC services (1 point per unit time spent in task execution).
C(i): Cost in Joules/sec; the amount of energy dissipated.
i: State of execution.
O_Γ(i): Energy policy for state i.
Ō_Γ: Expected cost function.
P_ij: Probability of going from state i to state j.
O_n(i): Decision maker for n devices.
θ, π: θ is, in Alg. 3, the proxy for uncertainty, which we vary from 1 to 2; π is, in Alg. 2, the preference for an action.
ε: Probability of exploration.
S_i: Selection parameter for devices.
P_i: Initial probability.
assume that N devices together provide a device cloud service; the user may offload tasks based on the required resources. We assume that the energy constraints on a device are directly linked to the cost of resource usage. For example, if f_i is the CPU usage of a device, then the cost C(i) for a task i is directly related as C(i) = αf_i, where α is the constant of the power-computing relation. Owing to space limitations, the derivations have been omitted. Table I shows the employed notations. Consider a terminal reward R(i), 0 ≤ R(i) < ∞, and a continuation cost C(i), C(i) > 0. We define a transition probability P_ij for the times of continuation. When the execution of state i is initialized, the controller calculates the local energy profiles and either stops, receiving the terminal cost −R(i) ≤ 0, or pays the continuing cost C(i) and moves to state j ≥ 0 with probability P_ij. Now, for an energy policy Γ, let O_Γ and Ō_Γ represent the expected cost functions. For any policy Γ that stops with Pr(stopping) = 1,

    O_Γ(i) = Ō_Γ(i) + R,  where i = 0, 1, 2, . . .    (1)

This is the only process that needs to be considered for a non-negative probability. The related processes can be modelled as a Markov decision process [21]. Thus,

    O(i) = min[ −R(i), C(i) + Σ_j P_ij O(j) ],  i ≥ 0.    (2)

For n stages,

    O_n(i) = min[ −R(i), C(i) + Σ_j P_ij O_{n−1}(j) ],  i ≥ 0.    (3)

Now, let O_n(i) be the set decision for the initial state before reaching the n-th stage, where the control decision is to stop:

    O_n(i) ≥ O_{n+1}(i) ≥ O(i).    (4)

Thus, a stable state is achieved if

    lim_{n→∞} O_n(i) = O(i).    (5)
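As a minimal illustration of the recursion in Eq. (3), the following Python sketch iterates the n-stage cost function on a small synthetic state space; the state space, rewards, costs, and transition matrix are illustrative assumptions, not values from the paper.

    # Minimal sketch of the n-stage recursion in Eq. (3) (assumption: a small
    # finite state space and synthetic R, C, P values).
    import numpy as np

    def stopping_value_iteration(R, C, P, n_stages=100):
        """Iterate O_n(i) = min[-R(i), C(i) + sum_j P[i,j] * O_{n-1}(j)]."""
        O = np.zeros(len(R))               # O_0
        for _ in range(n_stages):
            O = np.minimum(-R, C + P @ O)  # Eq. (3); approaches O(i), Eq. (5)
        return O

    R = np.array([1.0, 2.0, 5.0])          # terminal rewards R(i)
    C = np.array([0.5, 0.5, 0.5])          # continuation costs C(i) in J/s
    P = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.3, 0.6]])        # transition probabilities P_ij
    O = stopping_value_iteration(R, C, P)
    stop = -R <= C + P @ O                 # stop in state i when -R(i) attains the min
    print(O, stop)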
C. Autonomous Energy Management
As shown in Fig. 2, an agent takes an action based on the learning process. For any action, the local agent prepares a reward R. The Learner module learns the decision control strategy as time progresses and provides resource estimate updates to the user application for the best action to be taken.

Fig. 2: Learning Agent Model (the decision control agent exchanges action, state, and reward signals with the environment).

This could be in terms of a new discovery or, in the worst case, dropping a task if keeping it would come at the cost of the device shutting down. The reward received is mapped to the present value of the action; e.g., to estimate the value of an action, we simply average all the rewards obtained when that action is selected: value = (total rewards) / (number of times the action was chosen).
As suggested in [22], a near-greedy strategy behaves greedily most of the time but, with a small probability, explores; this ensures the value estimates converge to near certainty. We explore this by randomly making functional calls to the greedy algorithm, as shown in Alg. 1. Our modeling follows a multi-armed bandit approach inspired by [22]. We model the bandit problem to make the power decision on the devices forming the cloud; this also enables the calculation of rewards. We believe the bandit formulation fits our proposed framework because the controller has no prior knowledge of the actual reward of forming the cloud, nor complete knowledge of the power requirement. The drawback of Alg. 1 is that the choice is made immediately, without any exploration of other nodes. Although it is not an economical approach, the strategy is simple to implement.
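As a concrete illustration, the following Python sketch implements near-greedy (ε-greedy) device selection with incremental value averaging; the exploration probability, reward model, and names are illustrative assumptions rather than the paper's exact procedure.

    # Minimal sketch of near-greedy (epsilon-greedy) device selection with
    # incremental reward averaging (assumption: epsilon and the reward
    # model are illustrative).
    import random

    def near_greedy(values, epsilon=0.1):
        """With probability epsilon explore a random device; else exploit the max."""
        if random.random() < epsilon:
            return random.randrange(len(values))
        return max(range(len(values)), key=lambda a: values[a])

    def update(values, counts, a, reward):
        """Incremental average: Q_{n+1} = Q_n + (r - Q_n) / n."""
        counts[a] += 1
        values[a] += (reward - values[a]) / counts[a]

    n_devices = 5
    values, counts = [0.0] * n_devices, [0] * n_devices
    true_means = [random.gauss(0, 1) for _ in range(n_devices)]  # synthetic rewards
    for t in range(2000):                                        # 2000 steps, as in Sec. IV
        a = near_greedy(values)
        update(values, counts, a, random.gauss(true_means[a], 1))
    print(max(range(n_devices), key=lambda a: values[a]))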
IV. PERFORMANCE EVALUATION
Our evaluation suggests that the design is not very sensitive to parameter changes. We observe that both the greedy and near-greedy methods use the probability of exploration to converge to an optimal selection of actions. That is, as the tasks were changed, more fluctuating rewards were noticed, requiring more training. A positive observation is that the exploration of the near-greedy strategy usually produces actions faster, at the cost of being sub-optimal. We produce synthetic measurements for our interpretation. Traffic is generated according to a Poisson distribution, with rewards of zero mean and unit variance; we use standard Python libraries for this purpose. To analyze the different algorithms, we complete 4000 independent runs, i.e., 2000 time steps for each.
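A minimal sketch of this evaluation loop, assuming NumPy and a near-greedy learner (the paper does not name the exact library or learner configuration):

    # Minimal sketch of the evaluation harness: independent runs of 2000
    # steps, rewards drawn with zero mean and unit variance (assumption:
    # NumPy and the epsilon value are illustrative choices).
    import numpy as np

    rng = np.random.default_rng(0)
    n_runs, n_steps, n_devices, eps = 4000, 2000, 5, 0.1
    avg_reward = np.zeros(n_steps)

    for _ in range(n_runs):
        true_means = rng.normal(0.0, 1.0, n_devices)   # zero mean, unit variance
        values, counts = np.zeros(n_devices), np.zeros(n_devices)
        for t in range(n_steps):
            a = rng.integers(n_devices) if rng.random() < eps else int(values.argmax())
            r = rng.normal(true_means[a], 1.0)
            counts[a] += 1
            values[a] += (r - values[a]) / counts[a]
            avg_reward[t] += r

    avg_reward /= n_runs                               # reward curve as in Fig. 3a
    print(avg_reward[-1])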
In Fig. 3a we show the average rewards obtained with varying θ values. Further, in Fig. 3b, we analyze the sensitivity of the reward with respect to θ, i.e., the proxy used to avoid the unnecessary selection of actions compared to Alg. 1. Using
Algorithm 1: Near-Greedy Approach
Input: S_i, P_i
/* Estimating reward R */
L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
begin
    /* In case of exploitation, i.e., with probability 1 - P, the for loop is skipped and the max of L_P^R is returned */
    foreach S_i in MobileDeviceSelectionList do
        if constraints for the device cloud are satisfied then
            /* Refer to Table I for the R_i calculation */
            L_P^R <- [update the reward R_i = R_i^previous + R_i^new]
        end
    end
    return max(L_P^R)
end
Output: Average reward for participation in device cloud formation
Algorithm 2: Gradient Approach
Input: S_i, P_i
/* Estimating reward R, which considers relative action preference */
π_t(a): preference for action a
Estimate_Reward(action)  /* returns the estimated reward for time t */
L_A: list of actions
L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
begin
    /* In case of exploitation, i.e., with probability 1 - P, the for loop is skipped and the max of L_P^R is returned */
    foreach S_i in MobileDeviceSelectionList do
        if constraints for the device cloud are satisfied then
            foreach action_i in L_A do
                R_i = Estimate_Reward(i)
                if R_i >= R_average then
                    increase π_t(a)
                else
                    reduce π_t(a)
                end
            end
            L_P^R <- [update the reward R_i = R_i^previous + R_i^new]
        end
    end
    return max(L_P^R)
end
Output: Average reward for participation in device cloud formation
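A minimal sketch of the preference update in Alg. 2, assuming a softmax over preferences and the standard gradient-bandit step from [22]; the step size and reward model are illustrative assumptions:

    # Minimal sketch of a gradient-bandit preference update in the spirit
    # of Alg. 2 (assumption: softmax action probabilities and step size
    # follow [22] and are not specified by the paper).
    import math, random

    def softmax(prefs):
        m = max(prefs)
        exps = [math.exp(p - m) for p in prefs]
        s = sum(exps)
        return [e / s for e in exps]

    def gradient_step(prefs, a, reward, avg_reward, alpha=0.1):
        """Increase pi_t(a) when reward beats the running average, else reduce it."""
        probs = softmax(prefs)
        for b in range(len(prefs)):
            if b == a:
                prefs[b] += alpha * (reward - avg_reward) * (1 - probs[b])
            else:
                prefs[b] -= alpha * (reward - avg_reward) * probs[b]

    prefs = [0.0] * 5                      # one preference per candidate device
    avg_reward, t = 0.0, 0
    for _ in range(2000):
        a = random.choices(range(5), weights=softmax(prefs))[0]
        reward = random.gauss(a * 0.1, 1)  # synthetic: higher index, higher mean
        t += 1
        avg_reward += (reward - avg_reward) / t
        gradient_step(prefs, a, reward, avg_reward)
    print(softmax(prefs))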
Algorithm 3: Action(S_i)-Reward(R_i): reducing the probability of selecting unnecessary actions
Input: S_i, P_i
Estimate_Reward(action)  /* returns the estimated reward for time t */
/* Estimating reward R */
L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
begin
    /* Every time an action is taken, exploration happens to ensure that all actions are considered when determining the optimal solution, unlike Alg. 1, where exploration happens only ε of the time */
    foreach S_i in MobileDeviceSelectionList do
        if constraints for the device cloud are satisfied then
            /* Refer to Table I for the R_i calculation */
            R_i^new = Estimate_Reward(i) + θ_i  /* θ_i is the proxy for the uncertainty estimate; it converges and reduces unnecessary action selections. It is the root-mean-square value of the action. */
            L_P^R <- [update the reward R_i = R_i^previous + R_i^new]
        end
    end
    return max(L_P^R)
end
Output: Average reward for participation in device cloud formation.
the relation E = θ × √(ln(t) / N_t(a)), we show how the uncertainty is reduced, where N_t(a) represents the number of times an action has been executed before time t and N_t(0) is the maximum action selection. Thus, as θ increases, the exploration increases. The spike at approximately step 6 arises because, for simplicity, we consider at most 5 devices forming the cloud. In the beginning, all the devices are explored and given a reward; in the next step, the maximum of the already selected 5 is chosen, hence the sudden spike. From then on, the algorithm only builds on what has already been chosen, so we do not see any more spikes.
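A minimal sketch of this uncertainty-augmented selection, where the bonus θ√(ln t / N_t(a)) is added to the estimated reward before taking the maximum; θ, the reward model, and all names are illustrative assumptions:

    # Minimal sketch of Alg. 3's uncertainty-augmented selection
    # (assumption: theta and the reward model are illustrative; the bonus
    # term follows the relation E = theta * sqrt(ln t / N_t(a))).
    import math, random

    def select_action(values, counts, t, theta=1.0):
        """Pick the argmax of estimated reward plus the uncertainty bonus."""
        def score(a):
            if counts[a] == 0:
                return float("inf")      # explore every device at least once
            return values[a] + theta * math.sqrt(math.log(t) / counts[a])
        return max(range(len(values)), key=score)

    values, counts = [0.0] * 5, [0] * 5  # 5 candidate devices, as in Sec. IV
    for t in range(1, 2001):
        a = select_action(values, counts, t)
        reward = random.gauss(a * 0.1, 1)  # synthetic reward draw
        counts[a] += 1
        values[a] += (reward - values[a]) / counts[a]
    print(counts)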
Additionally, Fig. 3b represents the percentage of optimal actions selected. A θ of zero represents a completely greedy approach, so no exploration happens; thus the solution determined with θ = 0 is sub-optimal. As θ increases, the time the system takes to explore increases, and so does the system's confidence that all the devices have been explored before exploiting. Thus, with increasing θ, the system converges more slowly to an optimal solution.
Two things can be concluded from Fig. 3c. First, the relative comparison factor added to Alg. 1 shows that the greedy approach, although it may improve performance, is not optimal. Second, a change of mean has no effect on the proposed algorithm, as it adapts to the new conditions. Fig. 3c also shows the performance of Alg. 3 with varying θ. Likewise, we evaluate the changing rewards with varying π values. In [23], Sun et al. arrive at a conclusion that is not directly comparable with our work; hence, we use as baseline the scenario with distribution parameter 4, as shown in Fig. 4. Owing to the random nature of this scenario, it needs to be investigated further. In Fig. 5, our approach (θ) produces better exploration compared to near-greedy and greedy. On the other hand, Alg. 1 and Alg. 2 follow an immediate reward selection, which is sub-optimal. Fig. 6 compares the averages of all optimal actions. It is observed that Droplet yields a 10% gain in rewards achieved.
V. CONCLUSION & FUTURE WORK
Reinforcement learning is primarily an action-based training methodology compared to other forms of learning. The energy available inside the devices is important to study in an edge device cloud setting, because the collaborating devices may drain out in the middle of the execution and computation process. In this paper, we proposed Droplet, an autonomous energy management architecture that learns the power-related statistics of the devices. To this end, it produces a reliable group of resources for providing a computation environment on-the-fly. Additionally, we compared Droplet with two state-of-the-art approaches and analyzed the energy-based improvements in each case. Further, we modeled a reward strategy for the devices participating in the mobile device cloud service. We observed that the proposed architecture effectively produces a 10% gain in the rewards earned. In future work, a real-world implementation will be investigated and compared with the simulation results.
REFERENCES
[1] A. A. Alkheir, M. Aloqaily, and H. T. Mouftah, “Connected and
autonomous electric vehicles (CAEVs),” IT Professional, vol. 20, no. 6,
pp. 54–61, 2018.
[2] M. Aloqaily, I. Al Ridhawi, H. B. Salameh, and Y. Jararweh, “Data and service management in densely crowded environments: Challenges, opportunities, and recent developments,” IEEE Commun. Mag., March 2019.
[3] M. Aloqaily et al., “Congestion mitigation in densely crowded environ-
ments for augmenting QoS in vehicular clouds,” in Proc. ACM Symp.
on Des. and Analysis of Intel. Vehi. Netw. and App., 2018, pp. 49–56.
[4] A. Nasrallah et al., “Ultra-low latency (ULL) networks: The IEEE TSN and IETF DetNet standards and related 5G ULL research,” IEEE Commun. Surv. & Tut., vol. 21, no. 1, pp. 88–145, 2019.
[5] I. Parvez et al., “A survey on low latency towards 5G: RAN, core network and caching solutions,” IEEE Commun. Surv. & Tut., vol. 20, no. 4, pp. 3098–3130, 2018.
[6] Z. Xiang et al., “Reducing latency in virtual machines: Enabling tactile
internet for human-machine co-working,” IEEE J. Sel. Areas Commun.,
vol. 37, no. 5, 2019.
(a) Reward assignment process. (b) Estimated power selection process. (c) Reward vs. time cycles with varying θ.
Fig. 3: Droplet sensitivity with respect to the given θ.
Fig. 4: Varying π values.
Fig. 5: Average of the optimal action for a selected parameter range: θ for Droplet, ε for near-greedy, and π for greedy.
Fig. 6: Average of all the optimal actions.
[7] M. Chen et al., “Data-driven computing and caching in 5G networks:
Architecture and delay analysis,” IEEE Wirel. Commun., vol. 25, no. 1,
pp. 70–75, 2018.
[8] S. Sukhmani et al., “Edge caching and computing in 5G for mobile
AR/VR and tactile internet,” IEEE MultiMedia, vol. 26, no. 1, pp. 21–
30, 2019.
[9] X. Wang et al., “Content-centric collaborative edge caching in 5G mobile
internet,” IEEE Wirel. Commun., vol. 25, no. 3, pp. 10–11, 2018.
[10] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case for VM-based cloudlets in mobile computing,” IEEE Pervasive Computing, vol. 8, no. 4, pp. 14–23, 2009.
[11] P. Shantharama et al., “LayBack: SDN management of multi-access edge computing (MEC) for network access services and radio resource sharing,” IEEE Access, vol. 6, pp. 57545–57561, 2018.
[12] V. Balasubramanian and A. Karmouch, “Managing the mobile ad-hoc
cloud ecosystem using software defined networking principles,” in Proc.
Int. Symp. on Netw., Comp. and Commun. (ISNCC), May 2017, pp. 1–6.
[13] A. Mtibaa, K. A. Harras, and A. Fahim, “Towards computational offloading in mobile device clouds,” in Proc. IEEE Int. Conf. on Cloud Computing Technology and Science (CloudCom), 2013, pp. 331–338.
[14] V. Balasubramanian, M. Aloqaily, F. Zaman, and Y. Jararweh, “Explor-
ing computing at the edge: A multi-interface system architecture enabled
mobile device cloud,” in Proc. IEEE Int. Conf. on Cloud Networking
(CloudNet), Oct 2018, pp. 1–4.
[15] K. Habak, M. Ammar, K. A. Harras, and E. Zegura, “Femto clouds: Leveraging mobile devices to provide cloud service at the edge,” in Proc. IEEE Int. Conf. on Cloud Comp. (CLOUD), 2015, pp. 9–16.
[16] D. V. Le and C. Tham, “A deep reinforcement learning based offloading scheme in ad-hoc mobile clouds,” in Proc. IEEE Infocom Workshops, April 2018, pp. 760–765.
[17] N. Mastronarde and M. van der Schaar, “Fast reinforcement learning for energy-efficient wireless communication,” IEEE Transactions on Signal Processing, vol. 59, no. 12, pp. 6262–6266, 2011.
[18] G. Yi, J. H. Park, and S. Choi, “Energy-efficient distributed topology control algorithm for low-power IoT communication networks,” IEEE Access, vol. 4, pp. 9193–9203, 2016.
[19] W. Lee, D. Sunwoo, A. Gerstlauer, and L. K. John, “Cloud-guided QoS and energy management for mobile interactive web applications,” in Proc. IEEE Int. Conf. on Mobile Softw. Eng. and Sys., 2017, pp. 25–29.
[20] V. Balasubramanian and A. Karmouch, “An infrastructure as a service
for mobile ad-hoc cloud,” in Proc. IEEE Comp. and Commun. Workshop
and Conf. (CCWC), 2017, pp. 1–7.
[21] S. M. Ross, Introduction to Stochastic Dynamic Programming. Aca-
demic Press, 2014.
[22] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction.
MIT Press, 2018.
[23] K. Sun, Z. Chen, J. Ren, S. Yang, and J. Li, “M2C: Energy efficient mo-
bile cloud system for deep learning,” in Proc. IEEE Infocom Workshops,
April 2014, pp. 167–168.
The fifteen articles in this special section explores expositions on novel functionalities and technologies in various wireless and mobile network caching equipment, with respect to caching algorithms, caching implementations, system integration, cache modeling, optimization, and so on. In recent years, with the rapid development of information and communication technologies (ICTs), mobile network operators are increasingly suffering from the traffic explosion problem. And this problem remains to be tackled in the upcoming fifth generation (5G) networks. However, due to the limited bandwidth, the need to save energy, and the fact that advancements in transmission techniques are approaching the Shannon limit with diminishing returns, it is clear that these advancements need to be complemented by improvements in the network, transport, and application layers to provide sustainable solutions toward richer network capacity.