18 Computational task off-loading using deep Q-learning in
mobile edge computing
Tanuja Satish Dhope^a, Tanmay Dikshit, Unnati Gupta and Kumar Kartik
Department of Electronics and Communication, Bharati Vidyapeeth (Deemed to be University) College of Engineering,
Pune, India
Abstract
Because of the growing proliferation of networked Internet of Things (IoT) devices and the demanding requirements of IoT
applications, existing cloud computing (CC) architectures have encountered significant challenges. Mobile edge computing
(MEC) can bring cloud computing capabilities to the edge network and support computationally expensive applications.
By shifting local workloads to edge servers, it enhances both the functionality of mobile devices and the user experience.
Computation off-loading (CO) is a crucial MEC technology for enhancing performance and minimizing delay. In this paper,
the deep Q-learning method is utilized to make off-loading decisions whenever numerous workloads run concurrently on
one user equipment (UE) or across a cellular network, enabling better resource management in MEC. The proposed technique
determines which tasks should be assigned to the edge server by examining the CPU utilization needs of each task, thereby
reducing both power consumption and execution time.
Keywords: Computation off-loading, edge server, mobile edge server, deep Q-learning
I. Introduction
Existing cloud computing (CC) architectures have faced considerable hurdles as a result of the ongoing proliferation of networked Internet of Things (IoT) devices and the demanding needs of IoT applications, notably in terms of network congestion and data privacy. Relocating computing resources closer to end users can help overcome these problems and improve cloud efficiency by boosting its processing power (Elhadj Benkhelifa et al., 2015). This strategy has developed through the introduction of several paradigms, such as fog computing and edge computing, all of which share the objective of increasing the deployment of resources at the network edge. The significant issues that traditional, centralized cloud computing is experiencing include increased latency in real-time applications and low spectral efficiency. As a result of new technologies, distributed computing capabilities are increasingly being deployed at the edge devices of organizations and networks in an effort to address these challenges. Mobile edge computing (MEC) enables certain applications to be off-loaded from resource-constrained devices such as smartphones, saving their resources. MEC's characteristics set it apart from typical cloud computing because, unlike remote cloud servers, the network can aggregate tasks in areas near the user and device. By moving cloud processing to local servers, MEC improves user quality of experience (QoE), in addition to reducing congestion in cellular infrastructure and cutting delay (Khadija Akher et al., 2016).
^a tanuja_dhope@yahoo.com
Analyzing the load on mobile edge servers is required before jobs are off-loaded to edge servers, including whether to off-load at all and, if so, to which edge server. The task/data off-loading decision is important because it directly affects the QoS of the user application, including the latency introduced by the off-loading mechanism (Yeongjin Kim et al., 2018). When there is heavy stress at the edge node, caused by a staggeringly high number of user devices using the same edge network for every task, the result can be considerable processing delays and the termination of some processes (Fengxian Guo et al., 2018). The logical architecture of the MEC concept is used to bring cloud computing applications closer to users: by placing several data centers at the network's edge nodes, services become more accessible to smartphone users. The network edge can refer to a multitude of places, including indoor Wi-Fi areas and 3G/4G cells. Today, computation off-loading (CO) concerns both boosting smartphone performance and attempting to guarantee energy savings at the same time (Abbas Kiani et al., 2018). While meeting delay prerequisites, MEC allows the edge, rather than the user equipment, to perform computation-intensive applications. Additionally, IoT users will participate in later sensing and processing duties in user-centric 5G networks (Liang Huang et al., 2019; Thakur et al., 2021). In practice, using MEC to off-load computation means that wireless networks are used to transmit data. Wireless connections can become severely congested if many user stations aggressively dump their processing loads onto the edge node, which would dramatically slow down MEC (Gagandeep Kaur et al., 2021). A unified management system for CO and the accompanying wireless resource distribution is required in order to benefit from computation off-loading (Khadija Akher et al., 2016). Section 2 of this paper describes job off-loading research in MEC. The task off-loading system model is described in section 3 in terms of local computing, edge computing, and the deep Q-learning method. Section 4 elaborates on the results and charts for various task off-loading techniques. Finally, the conclusion is presented in section 5.
II. Related work
In (Khadija Akher et al., 2016) many edge com-
puting paradigms and their various applications, as
well as the difculties that academics and industry
professionals encounter in this fast-paced area has
been examined. Author suggested options, including
establishing a middleware-based design employing an
optimizing off-loading mechanism, which might help
to improve the current frameworks and provide the
mobile cloud computing (MCC) users more effective
and adaptable solutions by conserving energy, speed-
ing up reaction times, and lowering execution costs.
Ke Zhang et al. (2016) proposed an energy-efficient computation off-loading (EECO) method, which jointly optimizes CO decisions and radio-resource allocation, thereby minimizing system energy cost within delay constraints in 5G heterogeneous networks.
An energy-efficient caching (EEC) technique for a backhaul capacity-limited cellular network, reducing power consumption while meeting a cost limitation on computation latency, has been proposed (Zhaohui Luo et al., 2019; Gera et al., 2021). The numerical findings demonstrate a 20% increase in delay efficiency; the proposed method comes very close to the ideal answer and is far superior to the approximation bound.
Using two time-optimized sequential decision-making models and optimal stopping theory, Ibrahim Alghamdi et al. (2019) address the issues of where to off-load from and when to do so. A performance evaluation on real-world data sets is offered and contrasted with baseline deterministic and stochastic models. The outcomes demonstrate that the technique optimizes such decisions in both single-user and competing-user cases.
Kai Peng et al. (2019) examine the multi-objective computation off-loading approach for workflow applications (MCOWA) in MEC, which discovers the best application strategy while adhering to the deadline constraints of workflow applications. Numerous experimental evaluations have been carried out to demonstrate the usefulness and efficiency of the suggested strategy.
A Software Defined Networking (SDN)-based solution for computation off-loading in MEC wireless networks has been assessed; based on reinforcement learning, it addresses the energy-conservation problem while considering both incentives and penalties (Nahida Kiran et al., 2020).
A distributed off-loading method with deep reinforcement learning, which allows mobile devices to make their off-loading decisions in a decentralized way, has been proposed. Simulation findings demonstrated that the suggested technique may decrease the ratio of dropped jobs and the average latency when compared to numerous benchmark methods (Ming Tang et al., 2020). A multi-layer CO optimization framework appropriate for multi-user, multi-channel, and multi-server situations in MEC has also been suggested. Energy consumption and latency parameters are used for the CO decision from the perspective of edge users, and a multi-objective decision-making technique has been proposed to decrease the energy consumption and delay of the edge client (Nanliang Shan et al., 2020).
III. Methodology
We consider energy-sensitive UEs in this paper, such as IoT devices and sensor nodes, which have low power budgets but are not delay-sensitive. We consider N energy-sensitive UEs running concurrently on a server, and the server must choose which task from the task queue to execute first in order to reduce the power and execution time for each task. When a user device lacks the energy resources to complete a computation-intensive task locally, an edge server can step in. To make off-loading decisions, we employ the deep Q-learning algorithm, which acts based on the state and reward of the Q function at time t.
A. Local computing
Let E_n denote the energy needed for User Equipment (UE) n to compute locally, p_n the power coefficient of energy used per CPU cycle for local computing, c_n the number of CPU cycles required per bit, β_n the fraction of the task computed locally, and S_n the size of the computation task in bits. The energy needed for UE n to operate locally can then be determined as

E_n = p_n · c_n · β_n · S_n        (1)
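Equation (1) can be sketched directly in code. This helper uses the paper's symbols; the parameter values in the example are illustrative, not taken from the paper.

```python
def local_energy(p_n: float, c_n: float, beta_n: float, s_n: float) -> float:
    """Energy for UE n to compute a beta_n fraction of an s_n-bit task
    locally: E_n = p_n * c_n * beta_n * s_n (Eq. 1)."""
    return p_n * c_n * beta_n * s_n

# Illustrative values: energy per cycle, 1000 cycles/bit, 60% computed
# locally, a 1 MB (8e6-bit) task.
energy = local_energy(p_n=1e-27, c_n=1000.0, beta_n=0.6, s_n=8e6)
```

Note that the model is linear in each factor, so halving the locally computed fraction β_n halves the local energy.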
Applied Data Science and Smart Systems 131
Figure 18.1 Block diagram of edge computing model
Table 18.1 Parameters for deep Q-learning algorithm

Parameter                                          Value
Learning rate for the neural network's optimizer   0.1
Reward decay factor in the Q-learning update       0.001
Initial epsilon-greedy exploration probability     0.99
Frequency of updating target network parameters    200
Size of replay memory                              10 KB
Batch size for training                            32
Exploration probability                            0.9
Maximum number of episodes for training            3000
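For reference, the hyperparameters of Table 18.1 can be collected into a single configuration object. The dict below is an illustrative sketch; the key names are assumptions, only the values come from the table.

```python
# Hyperparameters from Table 18.1 gathered into one configuration dict
# (key names are assumed; values match the table).
DQN_PARAMS = {
    "learning_rate": 0.1,              # optimizer step size
    "reward_decay": 0.001,             # decay factor in the Q-learning update
    "epsilon_initial": 0.99,           # initial epsilon-greedy probability
    "target_update_freq": 200,         # steps between target-network updates
    "replay_memory_bytes": 10 * 1024,  # 10 KB replay memory
    "batch_size": 32,                  # minibatch size for training
    "epsilon": 0.9,                    # exploration probability
    "max_episodes": 3000,              # training episode budget
}
```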
B. Edge computing model
Let us assume that N UEs are waiting for tasks to be off-loaded and executed on an edge server, since the local device does not have enough power resources. The task queue contains every single task. Each time the Q-value function is modified based on reward and state, the task queue is updated. The number of active processes grows as the number of tasks (component list/UEs) in the task queue rises.

The system model consists of workload off-loading in MEC (see Figure 18.1). Each task arriving from a UE is uploaded to one of three servers, and the server is selected by the deep Q-learning algorithm. We use TCP/IP or the User Datagram Protocol (UDP) to transmit tasks from the UE to the edge server, depending on the type of application, viz., image processing, AR/VR, healthcare, or agriculture applications. The server tracks the Q value using the Q-learning algorithm and updates the entire task queue if one task consumes less power than the others.
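The server-selection step described above can be sketched as an epsilon-greedy choice over per-server Q estimates. This is a minimal illustration, assuming a simple dict of server names to Q values; it is not the paper's implementation.

```python
import random

def select_server(q_values: dict, epsilon: float) -> str:
    """Epsilon-greedy server selection: with probability epsilon pick a
    random server (explore), otherwise pick the server with the highest
    estimated Q value (exploit)."""
    if random.random() < epsilon:
        return random.choice(list(q_values))
    return max(q_values, key=q_values.get)

# With epsilon = 0 the choice is purely greedy:
best = select_server({"server1": 0.2, "server2": 0.9, "server3": 0.5},
                     epsilon=0.0)
```

In practice epsilon starts high (0.99 in Table 18.1) and decays, so early episodes explore the servers while later episodes exploit the learned estimates.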
Q-learning is a reinforcement learning algorithm used in MEC that trains itself based on parameters supplied during environment building and server-allocation algorithms. Later, an optimum job distribution across the servers can be accomplished using the learned model. Tasks with off-loading delay constraints are more difficult to serve with local computing alone. In MEC, the state space can stand in for the present network conditions, device status, and resource availability (such as CPU, memory, and bandwidth); making judgments about how to allocate resources requires access to this state information. Q-learning assesses the effectiveness of actions taken in a specific state using a reward mechanism. Rewards in MEC can be determined from a variety of performance indicators, including latency, energy use, throughput, or user satisfaction. Higher rewards are associated with better decisions.

The learning method entails exploring the state-action space iteratively and updating the Q-table based on the rewards gained. Q-learning uses the Bellman equation to update Q-values iteratively:
Q(s, a) ← Q(s, a) + α [R(s, a) + γ max_a' Q(s', a') − Q(s, a)]        (2)

where Q(s, a) is the Q-value for state s and action a, α is the learning rate, R(s, a) is the immediate reward for taking action a in state s, γ is the discount factor, and max_a' Q(s', a') is the maximum Q-value over all possible actions a' in the next state s'.
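A single tabular step of update (2) looks like this in code. The state and action names are hypothetical placeholders for MEC states (e.g. queue/CPU status) and off-loading actions.

```python
def q_update(q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning step following Eq. (2):
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[s_next].values()) if q.get(s_next) else 0.0
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])

# Toy example with two states and one "offload" action:
q = {"s0": {"offload": 0.0}, "s1": {"offload": 1.0}}
q_update(q, "s0", "offload", r=1.0, s_next="s1", alpha=0.5, gamma=0.9)
# q["s0"]["offload"] moves from 0.0 toward r + gamma * max Q(s1, .)
```

The deep variant replaces the table `q` with a neural network that approximates Q(s, a), trained toward the same target.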
Algorithm
Input: Pt, Pt0, Pre_node, Comp_list, Trans_amount
Output: Trans_energy
Initialization: Trans_energy ← 0
if Pt ≠ ∅ and Pre_node(Pt(0)(0)) ≤ Comp_list then
    Trans_energy += ε · p_tr
    if Trans_amount ≥ Pt(0)(2) then
        Pt0.append(Pt(0))
        sort tasks on Pt0
    else
        Pt(0)(2) −= Trans_amount
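The pseudocode above can be read as follows in Python. The tuple layout of queue entries (node id, task id, remaining transmission amount) and the per-transmission energy cost E_TR are assumptions made for illustration, not values from the paper.

```python
E_TR = 0.5  # assumed energy cost per transmission step (epsilon * p_tr)

def step_transmission(pt, pt0, pre_node, comp_list, trans_amount):
    """Process the head of task queue `pt`: if its predecessor count is
    within the computing limit, spend transmission energy; a fully
    transmitted task moves to `pt0` (kept sorted by remaining amount),
    otherwise its remaining amount is reduced. Returns the energy spent."""
    trans_energy = 0.0
    if pt and pre_node[pt[0][0]] <= comp_list:
        trans_energy += E_TR
        if trans_amount >= pt[0][2]:
            pt0.append(pt.pop(0))            # task fully transmitted
            pt0.sort(key=lambda t: t[2])     # "sort tasks on Pt0"
        else:
            node, task, remaining = pt[0]
            pt[0] = (node, task, remaining - trans_amount)
    return trans_energy
```

Calling this once per scheduling round reproduces the queue behavior described in the Results section: tasks are sorted by the transmission amount still needed before being executed on the edge server.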
IV. Results
We have considered three edge servers in total, along with tasks requesting edge-server services. The parameters listed in Table 18.1 have been used for the deep Q-learning algorithm.
Other parameters, such as transmit power, bandwidth, and noise power spectral density (PSD), have also been taken into consideration. We analyzed the number of UEs currently in the task queue. If there is only one UE, we decide whether to off-load the job and compute the transmission energy using the epsilon-greedy model. If there are multiple UEs in the task queue and the transmission amount needed for the task ahead in the queue is greater than that needed for the task behind it, we sort the task queue by the transmission amount needed for processing, off-load the task in question, and execute it on the edge server from the sorted queue, using less power in the process. The Q-value function produced this transmission quantity according to the prior state and maximum.

Figure 18.2 Number of tasks in queue with respect to time (s)

Figure 18.3 Server utilization with respect to time (ms)
Figure 18.2 shows the number of tasks in the queue with respect to time, on the basis of the provided number of nodes, environment variables, CPU requested, and processing time. The deep Q-learning algorithm assigns the requested task to any of the three servers based on the reward. Figure 18.3 shows the utilization of the three servers under consideration with respect to time.

The CPU utilization of the different CPUs and the number of tasks in the queue during a single window execution are shown in Figure 18.4.

Figure 18.5 reflects the number of tasks that can be off-loaded to the edge server and the tasks that can be computed locally, based on the deep Q-learning algorithm.

Figure 18.4 CPU utilization of different CPUs and number of tasks in queue during a single window execution, time (s) vs. number of tasks

Figure 18.5 Episodes vs. number of tasks for local computation and edge server off-loading using Q-learning algorithm
Figure 18.6 Number of tasks vs. average delay (s) for
local and off-loading using Q-learning algorithm
As the number of tasks rises, the average time taken to complete each job also rises (see Figure 18.6). When more tasks run simultaneously, the Q-learning method requires less time for task execution than local computing.
V. Conclusion
In MEC, deep Q-learning algorithms play a crucial role in the off-loading of computational tasks. In this study we assume that numerous tasks run concurrently on various user devices, and that the jobs are power-sensitive but not delay-sensitive. We use TCP/IP or UDP, depending on the application, to link the user equipment to the server. The task queue on the server adjusts based on the amount of transmission needed to complete jobs, and the reward value is updated in the Q-learning process. The Q-learning algorithm, which is based on reinforcement learning, considers both rewards and penalties to minimize power usage. Off-loading workload to an edge-node server instead of a remote server increases power efficiency, lowers processing delay, and lowers total infrastructure costs.
References
Benkhelifa, E., Welsh, T., Tawalbeh, L., Jararweh, Y., and Basalamah, A. (2015). User profiling for energy optimisation in mobile cloud computing. Procedia Computer Science, 52, 1275–1278. doi: 10.1016/j.procs.2015.05.151.
Akher, K., Gerndt, M., and Harroud, H. (2016). Mobile cloud computing for computation offloading: Issues and challenges. Applied Computing and Informatics, 14(1), 1–16. doi: 10.1016/j.aci.2016.11.002.
Kim, Y., Lee, H.-W., and Chong, S. (2018). Mobile computation offloading for application throughput fairness and energy efficiency. IEEE Transactions on Wireless Communications, 1–16. doi: 10.1109/TWC.2018.2868679.
Guo, F., Zhang, H., Ji, H., Li, X., and Victor, C. M. L. (2018). Energy efficient computation offloading for multi-access MEC enabled small cell networks. IEEE International Conference on Communications Workshops, 1–6. doi: 10.1109/ICCW.2018.8403701.
Gera, T., Singh, J., Mehbodniya, A., Webber, J. L., Shabaz, M., and Thakur, D. (2021). Dominant feature selection and machine learning-based hybrid approach to analyze android ransomware. Security and Communication Networks, 1–22. doi: 10.1155/2021/7035233.
Kiani, A. and Ansari, N. (2018). Edge computing aware NOMA for 5G networks. IEEE Internet of Things Journal, 5(2), 1299–1306. doi: 10.1109/JIOT.2018.2796542.
Huang, L., Feng, X., Zhang, C., Qian, L., and Wu, Y. (2019). Deep reinforcement learning based joint task offloading and bandwidth allocation for multi-user MEC. Digital Communications and Networks, 5, 10–17. doi: 10.1016/j.dcan.2018.10.003.
Kaur, G. and Batth, R. S. (2021). Edge computing: classification, applications, and challenges. 2nd International Conference on Intelligent Engineering and Management, 1–6. doi: 10.1109/iciem51511.2021.94453.
Zhang, K., Mao, Y., Leng, S., Zhao, Q., Li, L., Peng, X., Pan, L., Maharjan, S., and Zhang, Y. (2016). Energy-efficient offloading for mobile edge computing in 5G heterogeneous networks. IEEE Access, 4, 5896–5907. doi: 10.1109/access.2016.259716.
Thakur, D., Singh, J., Dhiman, G., Shabaz, M., and Gera, T. (2021). Identifying major research areas and minor research themes of android malware analysis and detection field using LSA. Complexity, 1–28. doi: 10.1155/2021/4551067.
Luo, Z., LiWang, M., Lin, Z., Huang, L., Du, X., and Guizani, M. (2017). Energy-efficient caching for mobile edge computing in 5G networks. Applied Sciences, 7(6), 1–13. doi: 10.3390/app7060557.
Alghamdi, I., Anagnostopoulos, C., and Pezaros, D. P. (2019). On the optimality of task offloading in mobile edge computing environments. IEEE Global Communications Conference, 1–6. doi: 10.1109/GLOBECOM38437.2019.9014081.
Peng, K., Zhu, M., Zhang, Y., Liu, L., Zhang, J., Leung, V. C. M., and Zheng, L. (2019). An energy- and cost-aware computation offloading method for workflow applications in mobile edge computing. EURASIP Journal on Wireless Communications and Networking, 1, 1–15. doi: 10.1109/globecom38437.2019.9014081.
Kiran, N., Pan, C., and Changchuan, Y. (2020). Reinforcement learning for task offloading in mobile edge computing for SDN based wireless networks. Seventh International Conference on Software Defined Systems (SDS), 1–6. doi: 10.1109/sds49854.2020.9143888.
Tang, M. and Wong, V. W. S. (2020). Deep reinforcement learning for task offloading in mobile edge computing systems. IEEE Transactions on Mobile Computing, 1, 1–12. doi: 10.1109/TMC.2020.3036871.
Shan, N., Li, Y., and Cu, X. (2020). A multilevel optimization framework for computation offloading in mobile edge computing. Mathematical Problems in Engineering, 1–17. doi: 10.1155/2020/4124791.