Reinforcing the Edge: Autonomous Energy
Management for Mobile Device Clouds
Venkatraman Balasubramanian∗, Faisal Zaman†, Moayad Aloqaily§, Saed Alrabaee¶,
Maria Gorlatova‡, Martin Reisslein∗
∗Arizona State University, AZ, USA, {Vbalas11, Reisslein}@asu.edu
†University of Ottawa, Ottawa, ON, Canada, Fzama075@uottawa.ca
§Canadian University Dubai, UAE, maloqaily@ieee.org
¶United Arab Emirates University (UAEU), UAE, Salrabaee@uaeu.ac.ae
‡Duke University, NC, USA, Maria.Gorlatova@duke.edu
Abstract—The collaboration among mobile devices to form an
edge cloud for sharing computation and data can drastically
reduce the tasks that need to be transmitted to the cloud. More-
over, reinforcement learning (RL) research has recently begun to
intersect with edge computing to reduce the amount of data (and
tasks) that needs to be transmitted over the network. For battery-
powered Internet of Things (IoT) devices, the energy consumption
in collaborating edge devices emerges as an important problem.
To address this problem, we propose an RL-based Droplet
framework for autonomous energy management. Droplet learns
the power-related statistics of the devices and forms a reliable
group of resources for providing a computation environment
on-the-fly. We compare the energy reductions achieved by two
different state-of-the-art RL algorithms. Further, we model a
reward strategy for edge devices that participate in the mobile
device cloud service. The proposed strategy effectively achieves
a 10% gain in the rewards earned compared to state-of-the-art
strategies.
Index Terms—Mobile Edge Computing, Device Clouds, Inter-
net of Things, Reinforcement Learning.
I. INTRODUCTION
As the IoT revolution expands further into the 5G realm,
low latency service, connected and autonomous systems, data
and service management have been the focus of today’s
research [1]–[6]. Computation and caching at the edge of
the cellular network have been proposed as main strategies
for achieving sub-millisecond latencies [7]–[9]. Past research
has demonstrated that bringing computation closer to the user
results in manageable response times [10], [11].
However, the deployment overheads and costs of these edge
entities can be prohibitive for last-mile users. To cope with
these costs, the authors of [12], [13] have proposed composing
the idle resources of nearby mobile devices into a collaborative
environment, thereby creating a computation infrastructure [14].
However, none of these solutions considers the energy draining
inside the mobile devices, which affects the stability of the
mobile environment. Our interest in this paper is the energy
consumption in Mobile Device Clouds (MDCs) at the time of
task execution. Consider, for example, a coffee shop where
people have gathered and linger for some time. In such scenarios,
instead of requesting a costly public cloud service, forming
a local device cloud among socially bonded users is more
feasible [15]. However, the battery-powered devices need a
local energy manager to decide whether a device can continue
to participate or should be excluded.
Thus, this problem involves a decision-making process at
the device end, where a decision-maker module can either
choose to stop (for which the device receives a reward and
the system terminates) or pay a certain cost (in terms of
energy dissipation) to continue. If the decision is to terminate,
the device receives a termination reward; if the decision is to
continue, task execution proceeds until the request is complete.
All these procedures should be executed without any user
intervention. Our primary motive here is not to identify the
battery-draining periods but to investigate the energy constraints
within the device-defined power thresholds that are estimated
over a period of time. Implicitly, this decision is also responsible
for maximizing rewards over that period. Moreover, to the
best of our knowledge, this work is the first of its kind to
employ learning models to resolve such an issue. To this end,
we simulate two state-of-the-art algorithms, namely greedy
and near-greedy, and deploy a gradient-based approach to
compare the resulting rewards. We observe an approximately
10% increase in rewards earned using the gradient approach.
A. Motivation
In order to reduce the deployment costs of cloudlets and
edge data centers, leveraging local idle device resources is
essential. Hence, opportunistic MDC environments are an
important replacement for edge cloud deployments. These
devices are hand-held, user-specific nodes that change location
based on user movements. Therefore, the challenges here are
twofold. First, device movement: both the user who requests
the service and the serving entity are subject to motion. Second,
there are constant disruptions due to the drain-out periods of
devices. Earlier works such as [16] have presented offloading-based
solutions. Thus, in this paper we target only the problem of
energy maintenance in such environments, where reward granting
is achieved by a continuous learning process.
B. Problem Definition
In various scenarios, such as smart homes and smart
buildings (e.g., libraries and coffee shops), where devices may
or may not be connected to a power source, there are always
times when devices remain idle. In such scenarios, discovering
nodes that are willing to provide a cohesive service by
establishing a local computation environment yields a cost-effective
compute location compared to a centralized public cloud.
However, it has recently been found that such opportunistic
edge cloud deployments require energy maintenance. Although
task execution is a priority, if a device runs out of power, the
resiliency of such a model fails. Thus, our motivation to use
reinforcement learning stems from the challenge of finding
proper energy thresholds. Further, this technique provides the
two-fold benefit of reward calculation along with reliable device
selection in such volatile environments. Thus, training enables
a highly robust system.
C. Contributions
To this end, our contributions in this paper are as follows:
1) We design a framework for autonomous energy manage-
ment relying on a reinforcement learning approach [17].
The design is based on the interaction of the agents and
proceeds without human input.
2) The result of such a design is the production of rewards
commensurate with job completion for the nodes par-
ticipating in an MDC service. As the termination of a
service does not depend on any external inputs, it is
completely autonomous.
The remainder of this document delineates the related work
in Section II, followed by design in Section III. In Section IV
we discuss the performance evaluation. We provide concluding
remarks in Section V.
Fig. 1: Droplet Architecture
II. RELATED WORK
In [18], the authors propose a distributed topology control
algorithm that combines design theories, focusing on
asynchronous and asymmetric neighbor discovery. According
to this concept, neighbor discovery schedules are made based
on a combinatorial design of “multiples of 2,” after which
target duty cycling is performed. In addition, the multiples
of 2 are applied to overcome the challenge of the block design
and to support asymmetric operation. The proposed method
achieves the smallest total number of slots and wake-up slots
among existing representative neighbor discovery protocols.
In contrast to this work, we propose an autonomous device
environment where the decision to terminate is taken solely by
the device itself, based on its native energy statistics (including
duty cycling periods at the time of discovery). Further, the
authors of [18] fail to consider a device cloud environment
where the dynamics of movement affect the overall energy in
the device due to continuous radio module operation.
In [19], Lee et al. propose a cloud-based energy management
approach that eliminates the prediction overhead by offloading
it to a remote cloud. This framework pre-computes execution
times by profiling web applications on dedicated mobile devices
in the cloud. Behaving like typical caches, when mobile
web applications request data from servers, both the data and
its execution times are delivered to the users' mobile devices.
A performance control agent on the mobile device selects
an operating point to meet the response time requirement.
Contrary to this approach, our model does not make any
decisions remotely in the cloud; the decisions are made within
the composition environment.
In [15], Habak et al. propose a femto-cloud system that
provides a dynamic, self-configuring, multi-device mobile
cloud from a collection of mobile devices. The authors present
the femto-cloud system architecture, which enables multiple
mobile devices to be composed into a cloud computing service
despite churn in mobile device participation. Different
from this framework, we pursue a learning-based approach
to energy management that repeatedly updates its local energy
decision controller and self-heals by removing itself from the
execution, rather than taking over a task and dropping it for
lack of energy.
In [12], [14], the authors propose novel distributed device
cloud strategies with an emphasis on mobility and maintaining
seamless connectivity. Both of these works fail to consider
the energy component of the devices while providing the
service, which makes their models incomplete. In contrast,
our model proposes for the first time an energy-aware MDC
based on a reinforcement learning mechanism.
III. SYSTEM DESIGN
A. Overall Droplet Architecture
Fig. 1 shows the design. Our design mainly consists of
three modules: the user application, the controller, and the
network interface.
1) The User Application refers to any application that can
avail itself of the MDC. Via this module, the user profiles
and resource information are obtained. As elaborated before
in [20], the user can configure the amount of resources
to be shared before participating in the device cloud
formation.
2) The Network Interface is the built-in network card for
communication with the other devices and network nodes
(e.g., Wi-Fi).
The controller resides in every device below the application
layer. It mainly comprises the following modules:
1) Network Monitor and Communication Agent: Once
every device offers its network location and resource
capacities (how much storage can be shared) to form
the device cloud, the network monitor keeps track of
this resource information. The information is passed
to the decision control module, which decides on the energy
dissipation and maintains a threshold for the resources
that are part of the current composition. A 'renew'/'top-up'
message is sent to any resource that drops below
this value. Due to space limitations, we avoid going
into detail about the control messages exchanged at this
juncture. Once a 'renew' message is received, the
affected device should either find a new device before exiting
the current cloud or allot the tasks back to the
requester, making the requester aware of its power status
via the communication agent.
The communication agent and the network monitor
operate cohesively. This module performs
the evaluation of the network topology and the actual
maintenance updates of the task execution. Discovered
neighbours and newly obtained user profiles are kept
in the database through this module.
2) Decision Control: This is the core of the controller,
maintaining the energy information. The above two modules
provide periodic updates to the decision control to
maintain accuracy while tasks are being executed.
A scheduling policy that operates based on device
movements and energy is required. The decision control
receives inputs from the resource estimation module about
the load and the overall behavior of the resources
participating in the device cloud. It performs the main
learning mechanism, as elaborated in the next section.
3) Task Manager and Resource Estimation: These two
modules hold the main database associations. Here, the
information is retrieved from all arriving tasks and
mapped to the resource scheduler. The energy profiles
are the sensed data of the devices participating in the
MDC. When a device is connected to a power source,
the task manager obtains a local approval that the
device's "capability" is 100%, meaning that it has the
most reliable device profile.
This module updates the task execution progress and
maintains the incoming job status.
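To make the threshold behavior concrete, the 'renew'/'top-up' logic of the decision control module can be sketched as follows (a minimal Python sketch; the class and method names are our own illustrations, not part of the Droplet implementation):

```python
# Minimal sketch of the decision-control threshold check described above.
# All names (DecisionControl, check, etc.) are illustrative, not from Droplet.

class DecisionControl:
    def __init__(self, energy_threshold_j):
        self.energy_threshold_j = energy_threshold_j  # minimum energy to stay in the cloud
        self.devices = {}  # device id -> last reported energy (Joules)

    def update(self, device_id, energy_j):
        """Periodic update received from the network monitor."""
        self.devices[device_id] = energy_j

    def check(self):
        """Return the ids that must receive a 'renew'/'top-up' message."""
        return [d for d, e in self.devices.items() if e < self.energy_threshold_j]

ctrl = DecisionControl(energy_threshold_j=50.0)
ctrl.update("phone-a", 120.0)
ctrl.update("phone-b", 30.0)
print(ctrl.check())  # phone-b is below threshold and gets a 'renew' message
```

A device flagged by `check()` would then either recruit a replacement before exiting or return its tasks to the requester, as described above.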
B. Modelling Decision Control
In our system, a collection of devices that discover each
other over Wi-Fi is considered. These devices are idle and
provide a composed service as recognized before in [20].
Table I shows the employed notations.

TABLE I: Notations and Definitions

  R(i)     Points offered for participating in the device cloud. Accrued over time for MDC services; 1 point per unit time spent in task execution.
  C(i)     Cost in Joules/sec; amount of energy dissipated.
  i        State of execution.
  O_Γ(i)   Energy policy for i.
  0_Γ      Expected cost function.
  P_ij     Probability of going from state i to state j.
  O_n(i)   Decision maker for n devices.
  θ, π     θ: in Alg. 3, the proxy for uncertainty, which we vary from 1 to 2. π: in Alg. 2, the preference for an action.
  P        Probability of exploration.
  S_i      Selection parameter for devices.
  P_i      Initial probability.

We assume that N devices together provide a device cloud service;
the user may offload tasks based on the resources required. We
assume that the energy constraints on the device are directly
linked to the cost of resource usage. For example, if f_i is
the CPU usage of a device, then the cost C(i) for a task i
is directly related as C(i) = α f_i, where α is the constant
of the power-computing relation. Owing to space limitations,
the derivations have been omitted.
Consider a terminal reward R(i), 0 ≤ R(i) ≤ ∞,
and a continuation cost C(i), C(i) > 0. We define a transition
probability P_ij for the case of continuation. Suppose that as
we begin execution i, a controller calculates the local energy
profiles and can either stop and incur the terminal cost
R − R(i) ≥ 0, or pay the continuing cost C(i) and move to
state j with probability P_ij, j ≥ 0. Now, for an energy policy
Γ, let O_Γ and 0_Γ represent the expected cost functions. For any
policy Γ that stops with Pr(stopping) = 1,

  O_Γ(i) = 0_Γ(i) + R,  i = 0, 1, 2, . . .     (1)

This is the only process that needs to be considered for a non-
negative probability. The related processes can be modelled as
a Markov decision process [21]. Thus,

  O(i) = min[ R − R(i), C(i) + Σ_j P_ij O(j) ],  i ≥ 0.     (2)

For n stages,

  O_n(i) = min[ R − R(i), C(i) + Σ_j P_ij O_{n−1}(j) ],  i ≥ 0.     (3)

Now, let 0_n(i) be the set decision for the initial state before
reaching the n-th stage, where the control decision is to stop:

  0_n(i) ≥ 0_{n+1}(i) ≥ O(i).     (4)

Thus, a stable state is achieved if

  lim_{n→∞} O_n(i) = O(i).     (5)
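The n-stage recursion (3) and the limit (5) can be computed by straightforward value iteration. The following sketch uses illustrative transition probabilities, rewards, and costs (none of these numbers are from the paper):

```python
import numpy as np

# Value iteration for the optimal-stopping recursion of Eq. (3):
# O_n(i) = min[ R - R(i), C(i) + sum_j P_ij O_{n-1}(j) ].
# The transition matrix, rewards, and costs below are illustrative only.

R_total = 10.0                      # terminal reward constant R
R = np.array([0.0, 2.0, 5.0])       # R(i): accrued points per state
C = np.array([1.0, 1.5, 2.0])       # C(i): continuation cost (Joules/sec)
P = np.array([[0.7, 0.2, 0.1],      # P_ij: state transition probabilities
              [0.1, 0.7, 0.2],
              [0.1, 0.2, 0.7]])

O = np.zeros(3)                     # O_0(i)
for _ in range(200):                # iterate until Eq. (5) stabilizes
    O_next = np.minimum(R_total - R, C + P @ O)
    if np.max(np.abs(O_next - O)) < 1e-9:
        break
    O = O_next

# The converged O(i) tells each state whether stopping (R - R(i)) or
# continuing (C(i) + sum_j P_ij O(j)) is the cheaper decision.
print(O)
```

Because the continuation cost is strictly positive, the iterates increase monotonically toward the stopping values, so the limit in Eq. (5) exists.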
C. Autonomous Energy Management
Fig. 2: Learning Agent Model

As shown in Fig. 2, an agent takes an action based on the
learning process. For any action, the local agent prepares a
reward (R). The Learner module learns the decision control
strategy as time progresses and provides resource estimate
updates to the user application for the best action to be taken.
This could be in terms of a new discovery or, in the worst case,
dropping a task if keeping it would come at the cost of the
device shutting down. The reward received is mapped to the
present value of the action (e.g., to estimate the value of an
action, we simply average all the rewards received when that
action was selected: Total Rewards / Number of Choices).
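This sample-average estimate is usually maintained incrementally, so that each new reward updates the value in constant time. A minimal sketch:

```python
# Incremental form of the sample-average value estimate described above:
# Q_{n+1} = Q_n + (R_n - Q_n) / n, which equals (total rewards) / (number of choices).

class ActionValue:
    def __init__(self):
        self.q = 0.0   # current estimate of the action's value
        self.n = 0     # number of times the action was selected

    def update(self, reward):
        self.n += 1
        self.q += (reward - self.q) / self.n  # running average
        return self.q

v = ActionValue()
for r in [1.0, 3.0, 2.0]:
    v.update(r)
print(v.q)  # average of the three rewards: 2.0
```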
As suggested in [22], a near-greedy strategy behaves
greedily most of the time but explores with a small probability,
which makes the value estimates converge with near certainty.
We explore this by randomly making functional calls to the
greedy algorithm, as shown in Alg. 1. Our modeling follows a
multi-armed bandit approach inspired by [22]. We model the
bandit problem to make the power decision on the device forming
the cloud; further, this enables the calculation of the reward.
We believe the bandit formulation fits our proposed framework
because the controller has no prior knowledge of the actual
reward of forming the cloud, nor complete knowledge of the
power requirement. The drawback of Alg. 1 is that the choice
is made immediately, without any exploration of other nodes.
Although it is not an economical approach, the strategy is
simple to implement.
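The near-greedy selection rule of Alg. 1 can be sketched as follows, with the exploration probability P as the only tunable parameter (the function name and numbers are illustrative):

```python
import random

# Near-greedy (epsilon-greedy) device selection in the spirit of Alg. 1:
# with probability `explore_prob` a random device is tried, otherwise the
# device with the best reward estimate is exploited.

def near_greedy_select(estimates, explore_prob, rng=random):
    """estimates: list of reward estimates, one per candidate device."""
    if rng.random() < explore_prob:                               # explore
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=estimates.__getitem__)  # exploit

random.seed(0)
q = [0.2, 0.9, 0.4]
picks = [near_greedy_select(q, explore_prob=0.1) for _ in range(1000)]
print(picks.count(1) / len(picks))  # mostly the greedy arm (index 1)
```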
IV. PERFORMANCE EVALUATION
Our evaluation suggests that the design is not very sensitive to
parameter changes. We observe that the exploration probability
of the greedy and near-greedy methods is used to converge to
an optimal selection of actions. That is, as the tasks were
changed, more fluctuating rewards were noticed, requiring
more training. A positive observation is that the exploration of
the near-greedy strategy usually produces actions faster, at the
cost of being sub-optimal. We produce synthetic measurements
for our interpretation. Traffic is generated according to a
Poisson distribution, with rewards of zero mean and unit
variance. We use the Python library for this purpose. To
analyze the different algorithms, we complete 4000 independent
runs, i.e., 2000 time steps for each.
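For reference, the synthetic setup can be reproduced along these lines (a sketch only; the exact distribution parameters and the placeholder policy are our assumptions):

```python
import numpy as np

# Sketch of the synthetic testbed: per-device mean rewards are drawn from a
# zero-mean, unit-variance Gaussian, task arrivals follow a Poisson process,
# and each independent run lasts 2000 time steps.

rng = np.random.default_rng(42)
n_arms, n_runs, n_steps = 5, 10, 2000   # 5 devices, as in Sec. IV

rewards = np.zeros((n_runs, n_steps))
for run in range(n_runs):
    true_means = rng.normal(0.0, 1.0, n_arms)   # zero mean, unit variance
    arrivals = rng.poisson(1.0, n_steps)        # Poisson task arrivals
    for t in range(n_steps):
        arm = rng.integers(n_arms)              # placeholder selection policy
        if arrivals[t] > 0:                     # reward only when a task arrives
            rewards[run, t] = rng.normal(true_means[arm], 1.0)

print(rewards.shape)  # (10, 2000)
```

Any of the three algorithms can be plugged in where the placeholder policy draws a random arm.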
In Fig. 3a we show the average rewards obtained with
varying θ values. Further, in Fig. 3b, we analyze the sensitivity
of the reward with respect to θ, i.e., the proxy used to avoid
the unnecessary selection of actions compared to Alg. 1. Using
Algorithm 1: Near Greedy Approach
  Input: S_i, P_i
  /* Estimating reward R */
  L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
  begin
    /* In case of exploitation, i.e., with probability 1 − P, the for loop is skipped and max(L_P^R) is returned */
    foreach S_i in MobileDeviceSelectionList do
      if constraints for device cloud are satisfied then
        /* Refer to Table I for the R_i calculation */
        L_P^R ← [update the reward R_i = R_i^previous + R_i^new]
      end
    end
    return max(L_P^R)
  end
  Output: Average reward for participation in device cloud formation
Algorithm 2: Gradient Approach
  Input: S_i, P_i
  /* Estimating reward R, which considers relative action preference */
  π_t(a) ← preference for action a
  EstimateReward(action) /* returns the estimated reward for time t */
  L_A ← list of actions
  L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
  begin
    /* In case of exploitation, i.e., with probability 1 − P, the for loop is skipped and max(L_P^R) is returned */
    foreach S_i in MobileDeviceSelectionList do
      if constraints for device cloud are satisfied then
        foreach action_i in L_A do
          R_i = EstimateReward(i)
          if R_i ≥ R_average then
            increase π_t(a)
          else
            reduce π_t(a)
          end
        end
        L_P^R ← [update the reward R_i = R_i^previous + R_i^new]
      end
    end
    return max(L_P^R)
  end
  Output: Average reward for participation in device cloud formation
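The preference updates of Alg. 2 correspond to the standard gradient-bandit update of [22], which can be sketched as follows (the per-device mean rewards and step size are illustrative):

```python
import numpy as np

# Gradient-bandit update in the spirit of Alg. 2 (see Sutton & Barto [22]):
# each action keeps a preference pi_t(a) that is increased when its reward
# beats the running-average baseline and reduced otherwise.

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

rng = np.random.default_rng(0)
true_means = np.array([0.0, 2.0, 0.0])  # device 1 is actually the best
h = np.zeros(3)                          # action preferences pi_t(a)
baseline, alpha = 0.0, 0.1               # average-reward baseline, step size

for t in range(1, 2001):
    probs = softmax(h)
    a = rng.choice(3, p=probs)                            # sample a device
    r = rng.normal(true_means[a], 1.0)                    # observe its reward
    baseline += (r - baseline) / t                        # running average
    h += alpha * (r - baseline) * (np.eye(3)[a] - probs)  # raise/lower preferences

print(int(np.argmax(h)))  # index of the device the policy ends up preferring
```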
Algorithm 3: Action(S_i)–Reward(R_i). Reducing the probability of selection of unnecessary actions.
  Input: S_i, P_i
  EstimateReward(action) /* returns the estimated reward for time t */
  /* Estimating reward R */
  L_P^R = []  /* L_P^R keeps track of the time duration each device participates in an MDC formation */
  begin
    /* Every time an action is taken, exploration happens to ensure that all actions are considered for determining the optimal solution, unlike Alg. 1, where exploration happens only at times */
    foreach S_i in MobileDeviceSelectionList do
      if constraints for device cloud are satisfied then
        /* Refer to Table I for the R_i calculation */
        R_i^new = EstimateReward(i) + θ_i /* θ_i is a proxy for the uncertainty estimate; it converges and reduces unnecessary action selection. It is the root mean square value of the action. */
        L_P^R ← [update the reward R_i = R_i^previous + R_i^new]
      end
    end
    return max(L_P^R)
  end
  Output: Average reward for participation in device cloud formation
the relation E = θ × √(ln(t) / N_t(a)), we show how the
uncertainty is reduced, where N_t(a) represents the number of
times an action has been executed before time t and N_t(0) is
the maximum action selection. Thus, as θ increases, the
exploration increases. The spike at approximately step 6 arises
because, for simplicity, we have considered at most 5 devices
forming the cloud. In the beginning all the devices are explored
and given a reward, and in the next step the next maximum of
the already selected 5 is chosen, hence the sudden spike. From
then on, the policy only builds on what has already been
chosen, so we do not see any more spikes.
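Applying this relation as in Alg. 3, an action is chosen by maximizing the estimated reward plus the uncertainty bonus E. A minimal sketch (the numbers are illustrative):

```python
import math

# The exploration bonus E = theta * sqrt(ln(t) / N_t(a)) described above,
# applied as in Alg. 3: pick the action maximizing estimate + bonus.

def ucb_select(estimates, counts, t, theta):
    scores = []
    for q, n in zip(estimates, counts):
        if n == 0:
            return counts.index(0)  # try every action at least once
        scores.append(q + theta * math.sqrt(math.log(t) / n))
    return scores.index(max(scores))

q = [0.5, 0.4]        # reward estimates
n = [100, 2]          # the second action was rarely tried...
a = ucb_select(q, n, t=102, theta=1.0)
print(a)  # ...so its uncertainty bonus dominates and it gets explored
```

As θ grows, the bonus term dominates the estimate, which matches the slower but more confident convergence reported above.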
Additionally, Fig. 3b presents the percentage of optimal actions
selected. θ = 0 represents a completely greedy approach, so
no exploration happens; thus, the solution determined with
θ = 0 is sub-optimal. As θ increases, the time the system takes
to explore increases, and so does the system's confidence that
all the devices have been explored before exploiting. Thus,
with increasing θ, the system converges slowly to an optimal
solution.
Two things can be concluded from Fig. 3c. First, the
relative comparison factor added to Alg. 1 shows that the greedy
approach, although it may improve performance, is not optimal.
Second, a change of the mean has no effect on the proposed
algorithm, as it adapts to the new changes. Fig. 3c shows the
performance of Alg. 3 with varying θ. Likewise, we evaluate
the changing rewards with varying π values. In [23], Sun et al.
arrive at a conclusion that is not directly comparable with our
work; hence, we use as the baseline the scenario in which the
distribution parameter is 4, as shown in Fig. 4. Owing to the
random nature of this scenario, it needs to be investigated
further. In Fig. 5, our approach (θ) produces better exploration
compared to near-greedy and greedy. On the other hand, Alg. 1
and Alg. 2 follow an immediate reward selection, which is
sub-optimal. In Fig. 6, a comparison of all average optimal
actions is presented. We observe that Droplet yields a 10%
gain in rewards achieved.
V. CONCLUSION & FUTURE WORK
Reinforcement learning is primarily an action-based training
methodology, in contrast to other forms of learning. The issue
of energy inside the devices is important to study in an edge
device cloud setting, because the collaborating devices may
drain out in the middle of the execution and computation
process. In this paper, we proposed an autonomous energy
management architecture, called Droplet, that learns the
power-related statistics of the devices. To this end, it produces
a reliable group of resources for providing a computation
environment on-the-fly. Additionally, we compared Droplet
with two state-of-the-art approaches and analyzed the energy-based
improvements in each case. Further, we modelled a reward
strategy for the devices participating in the mobile device
cloud service. We observed that the proposed architecture
effectively produces a 10% gain in the rewards earned. In
future work, a real-world implementation will be investigated
and a comparative study with the simulation results will be
produced.
REFERENCES
[1] A. A. Alkheir, M. Aloqaily, and H. T. Mouftah, “Connected and
autonomous electric vehicles (CAEVs),” IT Professional, vol. 20, no. 6,
pp. 54–61, 2018.
[2] M. Aloqaily, I. Al Ridhawi, H. B. Salameh, and Y. Jararweh, “Data
and service management in densely crowded environments: Challenges,
opportunities, and recent developments,” IEEE Commun. Mag., March
2019.
[3] M. Aloqaily et al., “Congestion mitigation in densely crowded environ-
ments for augmenting QoS in vehicular clouds,” in Proc. ACM Symp.
on Des. and Analysis of Intel. Vehi. Netw. and App., 2018, pp. 49–56.
[4] A. Nasrallah et al., “Ultra-low latency (ULL) networks: The IEEE TSN and
IETF DetNet standards and related 5G ULL research,” IEEE Commun.
Surv. & Tut., vol. 21, no. 1, pp. 88–145, 2019.
[5] I. Parvez et al., “A survey on low latency towards 5G: RAN, core
network and caching solutions,” IEEE Commun. Surv. & Tut., vol. 20,
no. 4, pp. 3098–3130, 2018.
[6] Z. Xiang et al., “Reducing latency in virtual machines: Enabling tactile
internet for human-machine co-working,” IEEE J. Sel. Areas Commun.,
vol. 37, no. 5, 2019.
(a) Reward assignment process. (b) Estimated power selection process. (c) Reward vs. time cycle variation with θ.
Fig. 3: Droplet sensitivity with respect to the given θ.
Fig. 4: Varying π values.
Fig. 5: Average of optimal action for a selected parameter
range: θ for Droplet, for near-greedy, and π for greedy.
Fig. 6: Average of all the optimal actions.
[7] M. Chen et al., “Data-driven computing and caching in 5G networks:
Architecture and delay analysis,” IEEE Wirel. Commun., vol. 25, no. 1,
pp. 70–75, 2018.
[8] S. Sukhmani et al., “Edge caching and computing in 5G for mobile
AR/VR and tactile internet,” IEEE MultiMedia, vol. 26, no. 1, pp. 21–
30, 2019.
[9] X. Wang et al., “Content-centric collaborative edge caching in 5G mobile
internet,” IEEE Wirel. Commun., vol. 25, no. 3, pp. 10–11, 2018.
[10] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case for
VM-based cloudlets in mobile computing,” IEEE Pervasive Computing,
vol. 8, no. 4, pp. 14–23, 2009.
[11] P. Shantharama et al., “LayBack: SDN management of multi-access
edge computing (MEC) for network access services and radio resource
sharing,” IEEE Access, vol. 6, pp. 57545–57561, 2018.
[12] V. Balasubramanian and A. Karmouch, “Managing the mobile ad-hoc
cloud ecosystem using software defined networking principles,” in Proc.
Int. Symp. on Netw., Comp. and Commun. (ISNCC), May 2017, pp. 1–6.
[13] A. Mtibaa, K. A. Harras, and A. Fahim, “Towards computational
offloading in mobile device clouds,” in Proc. IEEE Int. Conf. on Cloud
Computing Technology and Science (CloudCom), 2013, pp. 331–338.
[14] V. Balasubramanian, M. Aloqaily, F. Zaman, and Y. Jararweh, “Explor-
ing computing at the edge: A multi-interface system architecture enabled
mobile device cloud,” in Proc. IEEE Int. Conf. on Cloud Networking
(CloudNet), Oct 2018, pp. 1–4.
[15] K. Habak, M. Ammar, K. A. Harras, and E. Zegura, “Femto clouds:
Leveraging mobile devices to provide cloud service at the edge,” in
Proc. IEEE Int. Conf. on Cloud Comp. (CLOUD), 2015, pp. 9–16.
[16] D. V. Le and C. Tham, “A deep reinforcement learning based offloading
scheme in ad-hoc mobile clouds,” in Proc. IEEE Infocom Workshops,
April 2018, pp. 760–765.
[17] N. Mastronarde and M. van der Schaar, “Fast reinforcement learning for
energy-efficient wireless communication,” IEEE Transactions on Signal
Processing, vol. 59, no. 12, pp. 6262–6266, 2011.
[18] G. Yi, J. H. Park, and S. Choi, “Energy-efficient distributed topology
control algorithm for low-power IoT communication networks,” IEEE
Access, vol. 4, pp. 9193–9203, 2016.
[19] W. Lee, D. Sunwoo, A. Gerstlauer, and L. K. John, “Cloud-guided QoS
and energy management for mobile interactive web applications,” in
Proc. IEEE Int. Conf. on Mobile Softw. Eng. and Sys., 2017, pp. 25–29.
[20] V. Balasubramanian and A. Karmouch, “An infrastructure as a service
for mobile ad-hoc cloud,” in Proc. IEEE Comp. and Commun. Workshop
and Conf. (CCWC), 2017, pp. 1–7.
[21] S. M. Ross, Introduction to Stochastic Dynamic Programming. Aca-
demic Press, 2014.
[22] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction.
MIT Press, 2018.
[23] K. Sun, Z. Chen, J. Ren, S. Yang, and J. Li, “M2C: Energy efficient mo-
bile cloud system for deep learning,” in Proc. IEEE Infocom Workshops,
April 2014, pp. 167–168.