
Hybrid P2P-CDN Architecture for Live Video Streaming: An Online Learning Approach

Reza Farahani*, Abdelhak Bentaleb§, Ekrem Çetinkaya*,
Christian Timmerer*, Roger Zimmermann§, and Hermann Hellwagner*
*Christian Doppler Laboratory ATHENA, Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Austria
§School of Computing, National University of Singapore, Singapore
Abstract—Designing a cost-effective, scalable, and flexible
architecture that supports low latency and high quality live
video streaming is still a challenge for Over-The-Top (OTT)
service providers. To cope with this issue, this paper leverages
Peer-to-Peer (P2P), Content Delivery Network (CDN), edge computing,
Network Function Virtualization (NFV), and distributed
video transcoding paradigms to introduce a hybRId P2P-CDN
arcHiTecture for livE video stReaming (RICHTER). We first
introduce RICHTER’s multi-layer architecture and design an
action tree that considers all feasible resources provided by peers,
edge, and CDN servers for serving peer requests with minimum
latency and maximum quality. We then formulate the problem
as an optimization model executed at the edge of the network.
We present an Online Learning (OL) approach that leverages an
unsupervised Self Organizing Map (SOM) to (i) alleviate the time
complexity issue of the optimization model and (ii) make it a
suitable solution for large-scale scenarios, by enabling decisions
for groups of requests instead of for single requests. Finally, we
implement the RICHTER framework, conduct our experiments on
a large-scale cloud-based testbed including 350 HAS players, and
compare its effectiveness with baseline systems. The experimental
results illustrate that RICHTER outperforms baseline schemes in
terms of users’ Quality of Experience (QoE), latency, and network
utilization, by at least 59%, 39%, and 70% respectively.
Index Terms—HAS; Edge Computing; NFV; CDN; P2P; Low
Latency; QoE; Video Transcoding; Online Learning.
I. INTRODUCTION
Motivation: The proliferation of novel video streaming
technologies, advancement of networking paradigms, and
steadily increasing numbers of users who prefer to watch video
content over the Internet rather than using classical TV have
made video the predominant traffic on the Internet. Among all
types of video traffic, live video streaming has become signif-
icantly popular, accounting for about 17% of the total video
traffic by 2022 [1]. HTTP Adaptive Streaming (HAS) delivery
systems, (e.g., based on the Dynamic Adaptive Streaming over
HTTP (DASH) standard or Apple’s HTTP Live Streaming
(HLS)) have become the prevalent technologies employed by
OTT service providers (e.g., Facebook, YouTube, Twitch) for
live video streaming delivery [2]. In HAS, videos are split
into short segments with fixed duration, and each segment
is encoded at various qualities/bitrates (i.e., representations);
then, HAS clients adapt to the available bandwidth and/or
playout buffer status to download appropriate segments from
CDN servers, using an adaptive bitrate algorithm [2]. Although
utilizing CDN services to scale HAS delivery systems has been
a step forward, tremendous growth in demand for high-quality,
low-latency live video creates several challenges for OTT
services. For instance, when CDN servers are overloaded,
OTT services fail to deliver satisfactory quality and latency to
end-users [3]. Recent studies have revealed that using clients’
capabilities within a P2P network to form hybrid P2P-CDN
video delivery systems addresses the aforementioned issues
and brings many advantages, like alleviating network con-
gestion, increasing streaming stability, and reducing delivery
costs [4]–[6]. Considering these benefits, many companies,
e.g., Peer5 and Livepeer, have been utilizing peer-assisted
networks with some promising networking protocols (e.g., We-
bRTC) to offload CDNs and accomplish the aforementioned
goals. Some works [7] reveal that existing hybrid P2P-CDN
live streaming systems do not consider the full capability of
peers to provide high quality and low latency live streaming,
consequently suffering from inefficient resource utilization and
unpleasant users’ QoE. Therefore, the primary motivation for
our work is devising a hybrid P2P-CDN live streaming system
to (i) employ both computing and bandwidth capabilities
provided by the P2P network, (ii) leverage modern networking
paradigms (i.e., NFV and edge computing) and an OL-based
approach to utilize P2P and CDN resources efficiently, and (iii)
satisfy HAS client requests with high QoE and low latency.
Related Work: Hybrid P2P-CDN systems generally include
three main components: (i) the media servers (i.e., origin or
CDN servers) for distributing the video contents to the peers,
(ii) peers that stream the same video contents with the same
quality, and (iii) a tracker server including a matching table to
find the best peers with minimum latency who are watching
the same video content and quality level. Some works like [8]
customize HAS players and propose such a hybrid system
in order to reduce CDN bandwidth usage and transmission
costs. Muscat et al. [9] utilize the server push functionality
of HTTP/2 to propose a hybrid P2P-CDN low latency live
video streaming system. In our previous works [10]–[12],
we propose edge- and Software-Defined Networking (SDN-)
assisted video streaming frameworks that do not leverage P2P
capability and focus on Video on Demand (VoD) scenarios.
Nacakli et al. [13] leverage SDN and edge computing to
present a novel hybrid P2P-CDN service that is hosted at SDN-
enabled edge data centers. Our previous work [14] proposes
a hybrid P2P-CDN architecture for low latency live video
streaming, but without an implementation, an evaluation, or an
online learning approach. Ma et al. [15] propose machine
learning-based approaches for hybrid P2P-CDN systems that
enable their trackers to perform peer selection. However, their
system does not employ edge- and OL-supported approaches
Hybrid Event - 2022 IEEE Global Communications Conference: Communications Software and Multimedia
1911
GLOBECOM 2022 - 2022 IEEE Global Communications Conference | 978-1-6654-3540-6/22/$31.00 ©2022 IEEE | DOI: 10.1109/GLOBECOM48099.2022.10001091
Authorized licensed use limited to: Universitaet Klagenfurt. Downloaded on January 16,2023 at 09:35:01 UTC from IEEE Xplore. Restrictions apply.
Figure 1: RICHTER system architecture [layers: Media Organization Layer (encoding and packaging), CDN Layer, Edge Layer (gNodeB; Virtual Tracker Server (VTS) with Edge Transcoder and Partial Cache (PC)), and P2P Layer (Seeders and Leechers with Peer Transcoder); diagram labels: CMCD, CMSD, Media]
and does not include transcoding-based actions. To the best of
our knowledge, none of the existing hybrid P2P-CDN video
streaming frameworks proposes a system to (i) use peers’
potential idle computational resources for serving HAS clients
through running video transcoding and (ii) make the virtual
edge trackers intelligent by employing an OL approach.
Contributions: To tackle these challenges, in this paper,
we leverage HAS, P2P, CDN, NFV, and edge computing
technologies and propose a hybRId P2P-CDN arcHiTecture
for livE video stReaming (RICHTER). Our solution aims to
minimize HAS clients’ latency and network costs. Besides
considering resource limitations, we design an Action Tree
including all possible actions for serving clients’ requests
employed by Virtual Tracker Servers (VTSs) at the edge of
a P2P-CDN network. We formulate the problem as a mixed-
integer linear programming (MILP) optimization model. Due
to the NP-hardness of the proposed MILP model, we
design an OL-based approach that uses an unsupervised SOM
technique [16] for action selection decisions. To test the
practical deployment of our solution, we implement RICHTER
and analyze its performance through experiments conducted
in a large-scale testbed including 350 clients and compare its
results with selected baseline approaches. The experimental
results demonstrate the effectiveness of RICHTER for achiev-
ing high users’ QoE, low latency, and optimized network
utilization.
Paper Outline: The remainder of this paper is structured
as follows. Section II-A explains the proposed architecture;
we formulate the problem as a MILP optimization model in
Section II-B, and explain our proposed OL-enabled method in
Section II-C. The evaluation setup, methods, metrics, and results
are described in Section III. Section IV concludes the paper
and gives an outlook on future work.
II. RICHTER DESIGN
A. System Model
The proposed architecture of RICHTER includes four core
layers and is shown in Fig. 1.
Media Organization Layer. In this layer, the raw live
videos are encoded and packaged into DASH format, then
stored on the origin server. Note that this layer is able to
package the encoded videos to other formats like HLS or
Common Media Application Format (CMAF).
Figure 2: RICHTER action tree [clients' requests branch into actions 1–7, served by a Peer, a Peer (Tran.), the VTS (PC), the VTS (Tran.), the Origin Server, or a CDN Server]
CDN Layer. This layer is constructed by a group of CDN
servers (either OTT servers or a purchased service from CDN
providers), each of which contains various parts of video
sequences. Inspired by the Consumer Technology Association
CTA-5004 standard [17], [18], CDN servers periodically in-
form the edge layer about their cache occupancy via Common
Media Server Data (CMSD) messages.
P2P Layer. Given the continuous increases in smartphone
capabilities, e.g., high-bandwidth access to the Internet, en-
ergy resources, and hardware-accelerated video transcoding,
RICHTER utilizes the peers’ idle resources to provide a
distributed video transcoding approach besides video trans-
mission. Like most hybrid P2P-CDN schemes, we construct
the P2P layer based on the tree-mesh structure, including
two types of peers: Seeders and Leechers. In this scheme,
seeders’ requests can be served by all nodes (i.e., CDNs,
origin, edge, or other seeders) except leechers, while leechers’
requests can be served by all nodes. Inspired by the CTA-5004
standard [17], peers periodically inform the edge layer about
their cache occupancies through Common Media Client Data
(CMCD) messages and receive updates from the edge layer
via CMSD messages.
Edge Layer. This layer leverages the capabilities of NFV
and edge computing and presents virtualized edge components
called Virtual Tracker Servers (VTSs) close to base stations
(e.g., gNodeB in 5G). Note that, in the proposed system,
during a live session, clients’ requests are directed to a VTS,
and then they get responses based on the VTS’s decisions.
As shown in Fig. 1, a VTS is equipped with transcoding
and partial cache functions to serve clients’ requests from
existing higher content qualities (by transcoding) or directly
from cached qualities, respectively. Note that because the VTS
has a broader view of both P2P and CDN layers (based on the
received CMCD/CMSD messages and monitored information),
it can track clients’ requests and store a mapping between
all transmitted content and all served clients in its peer-map
lists. Thus, it must respond to the following vital questions
whenever it needs to decide to serve received requests:
1) Where is the optimal place (i.e., adjacent peers, VTS, CDN
servers, or origin server) in terms of lowest latency for
fetching each client’s requested content quality level from,
while efficiently utilizing the available resources?
2) What is the optimal approach for responding to the re-
quested quality level (i.e., fetch or transcode)?
Among other tasks, a VTS monitors the system frequently to
obtain precise information about the available resources (e.g.,
bandwidth, peers’ computational and power resources), and
peers’ joining/leaving times. Therefore, when a VTS receives
a new request, it can find the optimal solution (i.e., in terms
of minimum latency) from the action tree (Fig. 2) (action
numbering as in the figure): (1) Use the P2P network and
transmit the requested quality directly from the best adjacent
peer with maximum stability (i.e., the least recent joining
time). (2) Transcode the requested quality from a higher
quality at the most stable adjacent peer and transmit it through
the P2P network. (3) Fetch the requested quality directly from
the edge, i.e., the VTS. (4) Transcode the requested quality
from a higher quality at the VTS. (5) Fetch the requested
quality from the origin server. (6) Fetch a higher quality from
the best CDN server and transcode it at the VTS. (7) Fetch
the requested quality from the best CDN server.
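The seven actions of the action tree can be written down as an enumeration, as in the following minimal sketch. The names `Action` and `pick_action`, the use of `None` to mark an infeasible action, and the tie-breaking by action number are assumptions made for illustration; they are not taken from the RICHTER implementation.

```python
from enum import IntEnum


class Action(IntEnum):
    """Action numbering follows Fig. 2; the member names are hypothetical."""
    PEER_FETCH = 1               # fetch the requested quality from the best adjacent peer
    PEER_TRANSCODE = 2           # transcode from a higher quality at an adjacent peer
    VTS_FETCH = 3                # fetch the requested quality from the VTS
    VTS_TRANSCODE = 4            # transcode from a higher quality at the VTS
    ORIGIN_FETCH = 5             # fetch the requested quality from the origin server
    CDN_FETCH_VTS_TRANSCODE = 6  # fetch a higher quality from a CDN server, transcode at the VTS
    CDN_FETCH = 7                # fetch the requested quality from the best CDN server


def pick_action(latency_by_action):
    """Return the feasible action with the lowest estimated latency.

    latency_by_action maps an Action to an estimated latency in seconds,
    or to None when the action is infeasible (an assumption of this sketch).
    Ties are broken by the lower action number (also an assumption).
    """
    feasible = {a: t for a, t in latency_by_action.items() if t is not None}
    if not feasible:
        raise ValueError("no feasible action for this request")
    return min(feasible, key=lambda a: (feasible[a], a))
```

For example, with estimates `{Action.PEER_FETCH: None, Action.VTS_TRANSCODE: 0.4, Action.CDN_FETCH: 0.25}`, the sketch selects `Action.CDN_FETCH`.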
B. Problem Formulation
We introduce an MILP optimization model that includes
four groups of constraints: Action Selection (AS), Serving Time
(ST), Origin/Peer (CP), and Resource Usage (RU).
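(i) AS constraint. Constraint (1) selects an appropriate
action from the proposed action tree (Fig. 2) for the request
issued by peer $j$. It chooses a suitable value of the binary
variable $X_{i,j}^{q,t}$ (refer to Table I for the definition):
$$\sum_{i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}} \; \sum_{t \in \mathcal{T}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \alpha_{i,j}^{q} = 1, \quad \forall j \in \mathcal{P} \tag{1}$$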
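(ii) ST constraint. Constraint (2) determines the transmission
time $T_{i,j}^{q}$ needed to transmit quality level $q \in \mathcal{Q}_j$ from source node $i$
to peer $j$:
$$\frac{\sum_{t \in \mathcal{T}} X_{i,j}^{q,t} \times \delta_{j}^{q}}{\omega_{i,j}} \le T_{i,j}^{q}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\},\; q \in \mathcal{Q}_j \tag{2}$$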
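Constraint (3) determines the required transcoding time $\tau_{i,j}^{q}$
at node $i$ in case of serving the quality requested by peer $j$
from a higher quality $q$ by transcoding:
$$\sum_{t \in \mathcal{T} \setminus \{0\}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \mu_{i,j}^{q} \le \tau_{i,j}^{q}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\} \tag{3}$$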
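Therefore, the serving time $\Psi$, i.e., fetching time
plus transcoding time, can be expressed as follows:
$$\sum_{i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}} \; \sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} \left( T_{i,j}^{q} + \tau_{i,j}^{q} \right) \le \Psi \tag{4}$$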
(iii) CP constraint. Constraint (5) forces the model to fetch
exactly the requested quality from the origin or CDNs when one of them
is chosen to serve peer $j \in \mathcal{P}$.
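Table I: Notation for RICHTER
Notation | Description
Input Parameters
$\mathcal{C}$ | Set of $k$ CDN servers and an origin server (i.e., $c = 0$)
$\mathcal{P}$ | Set of $n$ peers including $s$ seeders and $l$ leechers in subsets $\mathcal{P}_1$ and $\mathcal{P}_2$, respectively
$\mathcal{V}$ | Set of Virtual Tracker Servers (VTSs)
$\mathcal{Q}_j$ | Set of possible quality levels for serving quality $q^{\star}$ requested by $j \in \mathcal{P}$, where $\mathcal{Q}_j = \{q^{\star}, q^{\star}+1, \ldots, q_{\max}\}$ and $q_{\max}$ is the maximum quality level for the demanded segment
$\mathcal{T}$ | Set of possible transcoding statuses, where $\mathcal{T} = \{0, 1, 2\}$ and $t = 1$ or $t = 2$ if the requested quality is transcoded from a higher quality $q \in \mathcal{Q}_j$ at a VTS $i \in \mathcal{V}$ or a peer $i \in \{\mathcal{P}\} \setminus \{j\}$, respectively; otherwise $t = 0$
$\mathcal{R}$ | Set of $\rho$ peer regions
$Q$ | Set of quality level queues in the VTS
$\alpha_{i,j}^{q}$ | Available quality levels in $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\}$; $\alpha_{i,j}^{q} = 1$ means node $i$ hosts quality $q$ requested by peer $j \in \mathcal{P}$; otherwise $\alpha_{i,j}^{q} = 0$
$\omega_{i,j}$ | Available bandwidth on the path between $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$ and $j \in \mathcal{P}$
$\theta_{i,j}^{q}$ | Required resources (i.e., CPU usage in %) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$
$\eta_{i,j}^{q}$ | Required power (in milliampere-hour) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \{\mathcal{P}\} \setminus \{j\}$
$\mu_{i,j}^{q}$ | Required time (in seconds) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$
$\delta_{j}^{q}$ | Size of segment in quality $q$ (in bytes) requested by $j \in \mathcal{P}$
$\lambda_{j}^{q}$ | Bitrate for quality level $q \in \mathcal{Q}_j$ requested by $j \in \mathcal{P}$
$\Omega_i$ | Available computation resources (available CPU) of $i \in \{\mathcal{P} \cup \mathcal{V}\}$
$\varphi_i$ | Available power resources of $i \in \mathcal{P}$
Variables
$X_{i,j}^{q,t}$ | Binary variable where $X_{i,j}^{q,t} = 1$ indicates source $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$ transmits quality $q \in \mathcal{Q}_j$ requested by peer $j \in \mathcal{P}$ with transcoding status $t$; otherwise $X_{i,j}^{q,t} = 0$
$\tau_{i,j}^{q}$ | Required transcoding time at source $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$ to serve quality $q \in \mathcal{Q}_j$ requested by $j \in \mathcal{P}$
$T_{i,j}^{q}$ | Required time of transmitting quality level $q \in \mathcal{Q}_j$ in response to peer $j \in \mathcal{P}$ from server $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$
$\Psi$ | Serving time consisting of $\tau_{i,j}^{q}$ and $T_{i,j}^{q}$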
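$$\sum_{i \in \mathcal{C}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t=0} \times q = q^{\star}, \quad \forall j \in \mathcal{P} \tag{5}$$
Note that $q^{\star}$ denotes the quality level requested by peer $j$ and that $i = 0$ in Eq. (5) denotes the origin server.
Moreover, we should prevent seeders from fetching requested
qualities from leechers, expressed in Eq. (6):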
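$$\sum_{i \in \mathcal{P}_2} \; \sum_{t \in \mathcal{T} \setminus \{1\}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times q = 0, \quad \forall j \in \mathcal{P}_1 \tag{6}$$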
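(iv) RU constraint. Constraint (7) guarantees that the
required bandwidth for transmitting segments on the link
between nodes $i$ and $j$ must respect the available bandwidth:
$$\sum_{t \in \mathcal{T}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \lambda_{j}^{q} \le \omega_{i,j}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\} \tag{7}$$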
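Constraint (8) limits the maximum required processing
capacity for the transcoding operation to the available computational
resource of node $i$, denoted here by $\Omega_i$:
$$\sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} \left( X_{i \in \mathcal{V},j}^{q,t=1} + X_{i \in \mathcal{P} \setminus \{j\},j}^{q,t=2} \right) \times \theta_{i,j}^{q} \le \Omega_i, \quad \forall i \in \{\mathcal{P} \cup \mathcal{V}\} \tag{8}$$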
Similarly, constraint (9) limits the maximum required peers’
power resources for running the transcoding function to the
available power resource.
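$$\sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t=2} \times \eta_{i,j}^{q} \le \varphi_i, \quad \forall i \in \mathcal{P} \setminus \{j\} \tag{9}$$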
MILP Optimization Model. The following model min-
imizes the requests’ serving times (i.e., fetching time plus
transcoding time), denoted by Ψ:
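$$\begin{aligned} &\text{Minimize } \Psi \\ &\text{s.t. constraints Eq. (1)--Eq. (9)} \\ &\text{vars. } T_{i,j}^{q},\; \tau_{i,j}^{q},\; \Psi \ge 0,\quad X_{i,j}^{q,t} \in \{0, 1\} \end{aligned} \tag{10}$$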
By running the MILP model (10), an optimal action will be
selected for each request issued by peer j P such that the
total serving time is minimized. However, the MILP model
(10) is NP-hard [19], and suffers from high time complexity.
The next section introduces an OL-based approach based on
SOM [16] to remedy this issue.
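To make the constraints more concrete, the following minimal sketch evaluates a handful of candidate sources for a single request by exhaustive enumeration. It is not the MILP model and does not use a solver. The class `Candidate`, its field names, the node labels, and the unit convention (bits and bits per second) are assumptions of this sketch; Eqs. (2) and (3) are treated as equalities, only the bandwidth and CPU checks of Eqs. (7) and (8) are applied, and the power constraint of Eq. (9) is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str                 # hypothetical node label, e.g. "peer-3", "vts", "cdn-1"
    t: int                      # transcoding status: 0 none, 1 at a VTS, 2 at a peer
    segment_bits: float         # delta: size of the transmitted segment (bits, assumed unit)
    bitrate_bps: float          # lambda: bitrate of the transmitted quality
    bandwidth_bps: float        # omega: available bandwidth on the path to the peer
    transcode_s: float = 0.0    # mu: transcoding time in seconds
    cpu_needed: float = 0.0     # theta: CPU needed for transcoding, in %
    cpu_available: float = 0.0  # available CPU at the source, in %


def serving_time(c):
    """Return transmission plus transcoding time, or None if a constraint is violated."""
    if c.bitrate_bps > c.bandwidth_bps:               # bandwidth check, cf. Eq. (7)
        return None
    if c.t != 0 and c.cpu_needed > c.cpu_available:   # CPU check, cf. Eq. (8)
        return None
    transmit = c.segment_bits / c.bandwidth_bps       # Eq. (2) taken as an equality
    transcode = c.transcode_s if c.t != 0 else 0.0    # Eq. (3) taken as an equality
    return transmit + transcode


def best_candidate(candidates):
    """Return (serving_time, candidate) with the minimum serving time, or None."""
    timed = [(serving_time(c), c) for c in candidates]
    feasible = [(s, c) for s, c in timed if s is not None]
    if not feasible:
        return None
    return min(feasible, key=lambda pair: pair[0])
```

Enumerating candidates per request in this way grows quickly with the number of peers and requests, which is consistent with the motivation given above for replacing the exact model with a learning-based approach.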
C. Proposed Online Learning Approach
We design an OL-based solution depicted in Fig. 3(a),
which works in a time-slotted fashion. As shown in Fig. 3(b),
the proposed time slot structure consists of two intervals:
(i) Collecting Data (CD) and (ii) Serving Requests (SR). In
addition to the transcoding and partial cache functions, a
VTS is equipped with the four following modules: Resource
monitoring Module (RM), Manager Module (MM), Queuing
Module (QM), and OL Agent. Moreover, the VTS hosts
multiple queues, one each per peer region, live video channel,
and bitrate in each channel. In the CD interval, the following
modules are called to prepare inputs for the OL agent:
RM. This module is responsible for collecting received
CMCD and CMSD messages, monitoring available resources
(i.e., bandwidth, power, computation, joining/leaving times),
queues, and notifying the MM module.
MM. This module is used to (i) receive HTTP requests from
players, (ii) extract regions based on IP addresses, requested
channels, and bitrates from the incoming HTTP requests, (iii)
aggregate and forward the incoming HTTP requests and the
extracted information (i.e., region/channel/bitrate) to the QM
module, (iv) update the OL agent based on the items received
by the RM, (v) control the correctness of decisions made by
the OL agent before fetching or transcoding qualities from
nodes, (vi) communicate with the peers, CDN and/or origin
server regarding the decisions made by the OL agent, and
(vii) store popular segments fetched from CDN/origin server
into the partial cache. Note that the MM module immediately
responds to a requested segment that exists in the partial cache.
Furthermore, it includes an on the fly list that is used to prevent
delivering a request to the QM module if a response to the
request is in flight from the CDN/origin server.
QM. This module receives extracted features of requests from
the MM module and places requests in separate queues based
on peer regions, requested channel IDs, and bitrates.
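The queue organization of the QM module can be sketched as a dictionary of FIFO queues keyed by (region, channel, bitrate). The class and method names below are hypothetical and are not taken from the RICHTER code base; the longest-queue-first ordering mirrors the priority rule that the OL agent applies when several queues compete for resources.

```python
from collections import defaultdict, deque


class QueuingModule:
    """Keeps one FIFO queue per (region, channel, bitrate) combination."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def enqueue(self, region, channel, bitrate_kbps, request_id):
        """Place a request in the queue that matches its extracted features."""
        self._queues[(region, channel, bitrate_kbps)].append(request_id)

    def pending(self, region, channel, bitrate_kbps):
        """Number of waiting requests in one queue (0 if the queue does not exist)."""
        return len(self._queues.get((region, channel, bitrate_kbps), ()))

    def queues_by_priority(self, region):
        """Non-empty queue keys of a region, the queue with most requests first."""
        keys = [k for k, q in self._queues.items() if k[0] == region and q]
        return sorted(keys, key=lambda k: len(self._queues[k]), reverse=True)
```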
Considering the system’s current state, i.e., available in-
formation on resources and queues of requests provided by
the MM module, the OL agent in the SR interval must run
multiple threads of an OL algorithm (one thread per peer
region) to answer the questions mentioned in Section II-A.
Since SOM [16] (i) is one of the widely used techniques for
unsupervised classification problems, (ii) can be applied to
solve NP-hard problems [20], (iii) does not require a prepared
dataset for supervised model training, (iv) allows online real-
time decision making, and (v) evolves its model quickly over
time, it is adopted as the request management solution in the
OL agent. For each queue $Q_b \in Q$ with requested bitrate level
$b$, a set of SOM neurons (black circles in Fig. 3(a)) is created,
each of which is a feasible node holding the requested quality
(i.e., same-region peer, VTS, CDN/origin server) or a higher
quality (i.e., same-region peer, VTS) for serving $b$ through
fetching or running transcoding, respectively. Note that since
more than one queue can proceed and might violate all/several
resource constraints (e.g., the bandwidth, computation, or
power limitations), they are evaluated in a priority order where
the queue with a higher number of requests comes first.
Each SOM’s neuron has two features (i.e., feature map)
that are defined as a <latency, penalty> tuple. The latency
feature indicates fetching plus transcoding times, while the
penalty feature is used to penalize the neuron whenever the
agent makes an incorrect decision (due to violating one/several
constraints (1)–(9)). For the sake of simplicity, we assume that
each violating action increases the sum of penalties by one.
Moreover, in order to represent the SOM features in the same
space, we use normalized features in the range between 0
and 1. When the SOM thread is executed, it will consider
the neurons’ feature map and classify neurons to find the
best matching unit (BMU) with the maximum reward, i.e.,
minimum <latency, penalty> values. The Euclidean distance
function
$$D_{Q_b}(i, j) = \sqrt{\sum_{n=1}^{2} w_{Q_b}^{n} \left( i[n] - j[n] \right)^{2}}$$
is used as a simple discriminant function to calculate the best matching
of the features used in each neuron $j$ compared to BMU
$i$, where $w_{Q_b}^{n}$ in weight matrix $w_{Q_b}$ is used for the $n$th
feature of each feature list. Usually, after selecting the BMU,
the corresponding neuron and its neighbors must be updated.
Note that the neighborhood function employed in the SOM
is the Gaussian distribution function
$$H_{Q_b}(i, j) = e^{-\frac{D_{Q_b}(i,j)^{2}}{2\sigma^{2}}},$$
where $\sigma$ is the learning rate. Finally, an output list of tuples
(N, A, R, V ) sorted in ascending order (in terms of latency)
is sent to the MM module, where each tuple indicates the
determined node N, action A, the maximum number of
requests Rthat can be served via that node/action, and a
violation signal V, respectively (Fig. 3(c)). For instance, tuple
(p1, 1, 2, 0) of the output list shows that peer 1 using action 1 can
serve two requests without violating the defined constraints.
The MM module follows the SOM decisions for serving
requests with tuples with V = 0, while it ignores tuples with
V ≥ 1. Note that the MM module updates the inputs of the OL
agent (i.e., available resources) regarding the accepted outputs
of the OL agent since the SOM threads might execute several
times during two consecutive CD intervals.
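The weighted distance and the Gaussian neighborhood function described above can be sketched in a few lines. The function names are hypothetical, and the choice to pick the BMU as the neuron closest to the ideal point (0, 0), i.e., zero normalized latency and zero penalty, is an assumption of this sketch; the neuron update step is not shown because the text does not specify it.

```python
import math


def weighted_distance(a, b, weights):
    """D(a, b) = sqrt(sum_n w_n * (a[n] - b[n])^2) over the two features."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def gaussian_neighborhood(distance, sigma):
    """H = exp(-D^2 / (2 * sigma^2)); equals 1 when the distance is 0."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def best_matching_unit(neurons, weights=(0.5, 0.5)):
    """neurons maps a node label to a normalized (latency, penalty) pair.

    Assumption of this sketch: the BMU is the neuron closest to (0, 0).
    """
    ideal = (0.0, 0.0)
    return min(neurons, key=lambda name: weighted_distance(neurons[name], ideal, weights))
```

With neurons `{"p1": (0.2, 0.0), "v": (0.1, 0.5), "c1": (0.6, 0.0)}` and equal weights of 0.5, the sketch selects `"p1"`: node `"v"` has a lower latency but carries a penalty.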
This process will be repeated in each SR interval until the
Figure 3: (a) Proposed online learning structure, (b) time slot structure, and (c) a sample of the OL agent output
[Panel (a): Resource monitoring Module (RM), Manager Module (MM) with Partial Cache, Queuing Module (QM) holding queues per region, channel, and quality, and the Online Learning Agent running one thread per region over a feature map of neurons (Peers, VTS, CDN/Origin) with actions 1–7. Panel (b): a Collecting Data (CD) interval followed by a Serving Requests (SR) interval, repeated over time. Panel (c): sample output tuples (N, A, R, V) such as (p1, 1, 2, 0), (c1, 7, 900, 0), (v, 4, 100, 0), (p2, 2, 2, 1), (v, 3, 500, 0), and (o, 5, 300, 0).]
live streaming session ends and all queues are served. Assume
ρ, β, and γ indicate the number of peer regions, number of
live channels, and number of bitrates per channel, respectively. In the worst
case, the time complexity of the multi-thread SOM method
employed by the OL agent would be O(ρ × β × γ) in each
time slot.
III. PERFORMANCE EVALUATION
Evaluation Setup: To assess the effectiveness of RICHTER
in a realistic large-scale environment, InternetMCI¹ is considered
as a real backbone network topology. We instantiate
our testbed including 375 elements, i.e., 350 AStream²
DASH players running the BOLA [21] adaptive bitrate (ABR)
algorithm (seven groups of 50 peers), five Apache HTTP
servers (i.e., four CDN servers with a total cache size of
40% of the video dataset and an origin server, containing all
video sequences), 19 OpenFlow (OF) backbone switches, 45
backbone layer-2 links, and a VTS server (with a partial cache
size of only 5% of the video sequences) on the CloudLab [22]
environment. Each element is run on Ubuntu 18.04 LTS inside
Xen virtual machines. RICHTER is independent of the caching
policy and is compatible with various caching strategies. For
simplicity, Least Recently Used (LRU) is considered in all
CDN and partial caches as the cache replacement policy. Note
that we assume each peer can cache five segments of the
videos at most. We implement all modules of VTS in Python
to serve clients’ requests for five live channels (i.e., CH I–
CH V). Each live channel plays a unique video [23] with 300
seconds duration, comprising two-second segments in bitrate
ladder {(0.089,320p), (0.262,480p), (0.791,720p), (2.4,1080p),
(4.2,1080p)}[Mbps, content resolution].
¹ http://www.topology-zoo.org/dataset.html; last access: 2022-05-16.
² https://github.com/pari685/AStream; last access: 2022-05-16.
The Docker image jrottenberg/ffmpeg³ is utilized to measure
the segment transcoding time on the VTS. To measure the
transcoding time on the heterogeneous P2P network, we run
the transcoding function via FFmpegKit⁴ on an iPhone 11
(Apple A13 Bionic, iOS 15.3), a Xiaomi Mi11 (Snapdragon
888, Android 11), and a PC (Apple M1, macOS 12.0.1).
Moreover, power consumption is measured via device tools,
such as Android Energy Profiler and Android Battery Manager.
The bandwidths of all links on the paths from the CDN
and origin servers to the VTS are set to 50 and 100 Mbps,
respectively. To emulate the mobile network conditions, we
assume 250 peers initiate the experiments, and then, every
three seconds, a new peer joins the sessions. The VTS directs
the first peer to the best CDN server (in terms of lowest
latency), while other participating peers can be connected
on both CDN and P2P links. A real 4G network trace [24]
collected on bus rides is employed for links between peers to
edge servers in all experiments. The average bandwidth of this
trace is approximately 3780 kbps with a standard deviation
of 3190 kbps. The channel access probability is generated
following a Zipf distribution with the skew parameter α= 0.7,
i.e., the probability of an incoming request for the ith channel
in each peer group is given as prob(i) = 1/iα
PK
j=1 1/jα, where
K= 5. The learning rate and weighting parameters associ-
ated with latency and penalty are set to 0.01, 0.5, and 0.5,
respectively.
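The Zipf channel access probabilities used in the setup can be reproduced with a short computation. This is a minimal sketch; the function name `zipf_probabilities` is hypothetical, while the default values K = 5 and α = 0.7 are the ones stated in the setup.

```python
def zipf_probabilities(k=5, alpha=0.7):
    """Return prob(i) = (1 / i**alpha) / sum_j (1 / j**alpha) for i = 1..k."""
    weights = [1.0 / (i ** alpha) for i in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

The returned list sums to one and decreases with the channel index, so the first channel is the most frequently requested one in each peer group. A request generator could draw channels from it with, e.g., `random.choices(range(1, 6), weights=zipf_probabilities())`.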
Evaluation Methods: The results achieved by
RICHTER are compared with the following baseline methods:
(i) Non Hybrid (NOH): regular CDN-based streaming
with no P2P support. (ii) Non Transcoding-enabled Hybrid
(NTH): Like in most works, there is no transcoding capa-
³ https://hub.docker.com/r/jrottenberg/ffmpeg; last access: 2022-05-16.
⁴ https://tanersener.github.io/ffmpeg-kit/; last access: 2022-05-16.
Figure 4: Evaluation results for the NOH, NTH, ECT, and RICHTER systems for 350 clients
[Panels (a)–(e): plots of ASB (Mbps), AQS, ASD (sec.), ANS, APQ, AST (sec.), CHR, ETR, and BTL (Gbps) for each of the four systems.]
bility in this approach. In an NTH-based system, peers only
can be served via one of the actions 1, 5, or 7 (Fig. 2).
(iii) Edge Caching/Transcoding Hybrid (ECT): In this ap-
proach, transcoding at the peer side is not considered, and
requests can be served via all actions except action 2. For fair
comparisons, our testbed with a similar setup is used in all
systems. Moreover, the NOH, NTH, and ECT systems em-
ploy lightweight heuristic approaches to answer the questions
mentioned in Section II-A by considering Eqs. (1)–(10).
Evaluation Metrics: The performance of these systems is
evaluated through the following metrics: (i) Average Segment
Bitrate (ASB) of all the downloaded segments; (ii) Average
Number of Quality Switches (AQS), the average number of
segments whose bitrate level changed compared to the pre-
vious one; (iii) Average Stall Duration (ASD), the average
of total video freeze time of all clients; (iv) Average Number
of Stalls (ANS), the average number of rebuffering events;
(v) Average Perceived Overall QoE (APQ) calculated by the
ITU-T Rec. P.1203 model in mode 0⁵; (vi) Average Serving
Time (AST), defined as the overall time for serving all clients,
including fetching time plus transcoding time; (vii) Backhaul
Traffic Load (BTL), the volume of segments downloaded
from the origin server; (viii) Edge Transcoding Ratio (ETR),
the fraction of segments transcoded at the VTS or peers;
(ix) Cache Hit Ratio (CHR), defined as the fraction of seg-
ments fetched from the CDN or edge servers or peers. Each
experiment is executed 20 times, and the average and standard
deviation values are reported in the experimental results.
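Some of the player-side metrics can be computed directly from per-client logs, as in the following minimal sketch. The input layout (one pair of segment bitrates and stall durations per client) and the function names are assumptions made for illustration; the APQ metric is not covered because it requires the ITU-T P.1203 model.

```python
from statistics import mean


def quality_switches(bitrates):
    """Count segments whose bitrate differs from that of the previous segment."""
    return sum(1 for prev, cur in zip(bitrates, bitrates[1:]) if cur != prev)


def aggregate(sessions):
    """sessions: list of (segment_bitrates_mbps, stall_durations_s), one per client."""
    all_bitrates = [b for bitrates, _ in sessions for b in bitrates]
    return {
        "ASB": mean(all_bitrates),                                   # over all downloaded segments
        "AQS": mean([quality_switches(b) for b, _ in sessions]),      # switches per client
        "ANS": mean([len(stalls) for _, stalls in sessions]),         # stall events per client
        "ASD": mean([sum(stalls) for _, stalls in sessions]),         # total stall time per client
    }
```

For two clients with logs `([0.791, 0.791, 2.4, 2.4], [0.5])` and `([0.262, 0.791, 0.262, 0.262], [])`, the first client switches quality once and stalls once, while the second switches twice and never stalls.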
Evaluation Results: Running transcoding on peers must
be fast enough, must not impose a significant delay on the live
system, and must not consume much battery; otherwise, the clients’
requests may use other actions that congest the network
and edge server. In the first scenario, we run experiments
to investigate the latency and energy overheads of running
transcoding tasks on peers. To evaluate the latency overheads,
we measure transcoding times for a five-minute video in
different resolutions/bitrates on the mobile devices. In fact,
transcoding demands decoding video into raw frames and then
re-encoding those frames into new frames. Thus, transcoding
time at the peer-side is equal to the encoding time due to
leveraging the video processing that is already being done
to capture or view video. As shown in Table II, running
transcoding for the whole video takes 8.5–254.2 seconds on
these devices (0.056–1.69 seconds per segment) and is fast
enough in action 2. In another experiment, we measure the
battery consumption of peers when they (i) play a video, (ii)
transcode a video from a higher quality, or (iii) play video
⁵ https://github.com/itu-p1203/itu-p1203; last access: 2022-05-16.
Table II: Average transcoding times (in seconds) for a 5-min. video on peers
Resolution | Bitrate | iOS | Android | PC
1080p → 240p | 4219k → 89k | 34.2 | 53.25 | 18.85
1080p → 360p | 4219k → 262k | 42.7 | 61 | 24.9
1080p → 720p | 4219k → 791k | 166.5 | 130.9 | 53.1
1080p → 1080p | 4219k → 2484k | 254.2 | 249.6 | 87.2
1080p → 240p | 2484k → 89k | 35 | 55.1 | 18.9
1080p → 360p | 2484k → 262k | 45 | 62.7 | 21.8
1080p → 720p | 2484k → 791k | 172.2 | 132.5 | 52
720p → 240p | 791k → 89k | 16.25 | 34.75 | 10.8
720p → 360p | 791k → 262k | 25.1 | 49.3 | 15.4
360p → 240p | 262k → 89k | 8.5 | 19.5 | 8.8
I, transcode video II, and transmit video III, simultaneously.
The average values for ve-minute videos are approximately
0.8%, 0.4%, and 1.3% of peers’ battery usage, respectively.
Thus, a combination of playing, transcoding, and transmitting
tasks does not put a significant burden on the peers’ batteries
compared to the energy used to play or transcode video.
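
The per-segment figures quoted for Table II can be checked with a short calculation. The sketch below assumes 2-second segments (150 segments per five-minute video); this value is not stated in this section and is inferred from the reported per-segment range, so treat it as an assumption. The helper names are hypothetical.

```python
VIDEO_DURATION_S = 300      # five-minute test video (from the text)
SEGMENT_DURATION_S = 2      # assumption: inferred from the per-segment figures
NUM_SEGMENTS = VIDEO_DURATION_S // SEGMENT_DURATION_S  # 150 segments

def per_segment_time(total_transcode_s):
    """Average transcoding time per segment for a whole-video measurement."""
    return total_transcode_s / NUM_SEGMENTS

def keeps_up_with_live(total_transcode_s):
    """True if a segment is transcoded in less time than it takes to play."""
    return per_segment_time(total_transcode_s) < SEGMENT_DURATION_S

# Extremes of Table II, in seconds for the whole video.
fastest, slowest = 8.5, 254.2
for total in (fastest, slowest):
    print(f"{total:6.1f} s total -> {per_segment_time(total):.3f} s per segment, "
          f"keeps up with live: {keeps_up_with_live(total)}")
```

Under this assumption, even the slowest measured configuration transcodes a segment in less time than the segment takes to play, which is consistent with the claim that peer transcoding is fast enough.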
In the second scenario, we evaluate RICHTER's effectiveness in terms of the aforementioned metrics and compare the results with the baseline systems. As illustrated in Fig. 4(a–c), RICHTER downloads segments with higher ASB, decreases AQS and ANS, and shortens ASD; it thus improves APQ and AST by at least 59% and 39%, respectively, compared to the baseline approaches (Fig. 4(c)). Shortening the ASD and AST values in turn significantly reduces the average latency. These gains arise because RICHTER utilizes all of the peers' available resources for serving clients. The performance of RICHTER regarding the CHR, BTL, and ETR metrics is shown in Fig. 4(d–e). Note that a cache miss event occurs when (i) the requested or higher quality levels are not available in the partial caches or on CDN servers, (ii) available bandwidth values are insufficient to fetch the requested or higher quality levels from CDN servers or peers, or (iii) the available edge or peers' processing capabilities are not sufficient to transcode the requested quality from a higher quality. The CHR and BTL metrics indicate that RICHTER outperforms other systems due to its ability to fetch requested or higher quality levels in a hybrid system or to use distributed transcoding. Although RICHTER downloads fewer segments from the origin server and improves backhaul bandwidth usage (by about 70%) compared to ECT, it uses more computation resources of the edge and P2P layers because it employs a distributed transcoding approach.
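
One common way to obtain "at least X%" figures like those above is to compute the relative gain against each baseline and report the smallest one. The sketch below illustrates that calculation; the formula is an assumption (the section does not spell it out), the function names are hypothetical, and the numbers are made up for illustration rather than taken from Fig. 4.

```python
def relative_gain(ours, baseline, higher_is_better=True):
    """Relative improvement of `ours` over `baseline`, as a fraction."""
    delta = ours - baseline if higher_is_better else baseline - ours
    return delta / baseline

def at_least(ours, baselines, higher_is_better=True):
    """Worst-case gain over all baselines, i.e. the 'at least' figure."""
    return min(relative_gain(ours, b, higher_is_better) for b in baselines)

# Hypothetical values for illustration only (not measurements from the paper).
quality_ours, quality_baselines = 4.0, [2.5, 2.0]   # higher is better
latency_ours, latency_baselines = 3.0, [5.0, 6.0]   # seconds, lower is better

print(f"quality gain >= {at_least(quality_ours, quality_baselines):.0%}")
print(f"latency gain >= {at_least(latency_ours, latency_baselines, False):.0%}")
```

Taking the minimum over baselines makes the reported figure a lower bound: every baseline is outperformed by at least that margin.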
IV. CONCLUSION AND FUTURE WORK
This paper presents RICHTER, a hybrid P2P-CDN architecture
for HAS-based live video streaming services. We (i)
introduce an action tree that defines all the possible actions
to serve HAS clients (from peers, CDN, edge, or origin
servers) with maximum user QoE and minimum latency,
(ii) formulate the action decision problem as an MILP
optimization, and (iii) solve the formulated problem using an
Hybrid Event - 2022 IEEE Global Communications Conference: Communications Software and Multimedia
online learning, SOM-based approach. Experimental results
on a large-scale testbed indicate the superiority of RICHTER
compared to its competitors. Extending the proposed action
tree and employing a reinforcement learning approach are our
future research directions.
ACKNOWLEDGMENT
The financial support of the Austrian Federal Ministry for
Digital and Economic Affairs, the National Foundation for
Research, Technology and Development, and the Christian
Doppler Research Association is gratefully acknowledged.
Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.
This research has been supported in part by the
Singapore Ministry of Education Academic Research Fund
Tier 2 under MOE's official grant number MOE2018-T2-1-103.
REFERENCES
[1] Sandvine, “The Global Internet Phenomena Report,” White Paper, Jan-
uary 2022. [Online]. Available: https://www.sandvine.com/phenomena
[2] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann,
“A survey on bitrate adaptation schemes for streaming media over
HTTP,” IEEE Communications Surveys & Tutorials, 2018.
[3] A. A. Barakabitze, N. Barman, A. Ahmad, S. Zadtootaghaj, L. Sun,
M. G. Martini, and L. Atzori, “QoE management of multimedia
streaming services in future networks: a tutorial and survey,” IEEE
Communications Surveys & Tutorials, 2019.
[4] N.-N. Dao, A.-T. Tran, N. H. Tu, T. T. Thanh, V. N. Q. Bao, and S. Cho,
“A Contemporary Survey on Live Video Streaming from a Computation-
Driven Perspective, ACM Computing Surveys (CSUR), 2022.
[5] R. Farahani, “CDN and SDN Support and Player Interaction for HTTP
Adaptive Video Streaming,” in Proc. 12th ACM Multimedia Systems
Conf., 2021.
[6] S. Budhkar and V. Tamarapalli, “An overlay management strategy
to improve QoS in CDN-P2P live streaming systems, Peer-to-Peer
Networking and Applications, 2020.
[7] N. Anjum, D. Karamshuk, M. Shikh-Bahaei, and N. Sastry, “Survey on
peer-assisted content delivery networks, Computer Networks, 2017.
[8] H. Yousef, J. Le Feuvre, P.-L. Ageneau, and A. Storelli, “Enabling
adaptive bitrate algorithms in hybrid CDN/P2P networks, in Proc. 11th
ACM Multimedia Systems Conf., 2020.
[9] N. Muscat and C. J. Debono, “A Hybrid CDN-P2P Architecture for
Live Video Streaming,” in IEEE EUROCON Int’l. Conf. on Smart
Technologies, 2021.
[10] R. Farahani, F. Tashtarian, A. Erfanian, C. Timmerer, M. Ghanbari,
and H. Hellwagner, “ES-HAS: An Edge- and SDN-Assisted Framework
for HTTP Adaptive Video Streaming,” in Proc. 31st ACM NOSSDAV
Workshop, 2021.
[11] R. Farahani, F. Tashtarian, H. Amirpour, C. Timmerer, M. Ghanbari,
and H. Hellwagner, “CSDN: CDN-Aware QoE Optimization in SDN-
Assisted HTTP Adaptive Video Streaming,” in Proc. 46th IEEE Conf. on
Local Computer Networks (LCN), 2021.
[12] R. Farahani, F. Tashtarian, C. Timmerer, M. Ghanbar, and H. Hellwag-
ner, “LEADER: A Collaborative Edge- and SDN-Assisted Framework
for HTTP Adaptive Video Streaming,” in Proc. IEEE Int’l. Conf. on
Communications (ICC), 2022.
[13] S. Nacakli and A. M. Tekalp, “Controlling P2P-CDN live streaming
services at SDN-enabled multi-access edge datacenters,” IEEE Trans.
on Multimedia, 2020.
[14] R. Farahani, H. Amirpour, F. Tashtarian, A. Bentaleb, C. Timmerer,
H. Hellwagner, and R. Zimmermann, “RICHTER: hybrid P2P-CDN
architecture for low latency live video streaming, in Proc. of the 1st
Mile-High Video Conference, 2022, pp. 87–88.
[15] Z. Ma, S. Roubia, F. Giroire, and G. Urvoy-Keller, “When Locality is not
enough: Boosting Peer Selection of Hybrid CDN-P2P Live Streaming
Systems using Machine Learning,” in Network Traffic Measurement and
Analysis Conf (IFIP TMA), 2021.
[16] T. Kohonen, “Self-Organizing Maps, Springer Science & Business
Media, 2012.
[17] CTA-5004, “Web application video ecosystem–common media client
data.” 2020. [Online]. Available: https://cdn.cta.tech/cta/media/media/
resources/standards/pdfs/cta-5004- final.pdf
[18] CTA-WAVE, “Common-media-server Data.” 2021. [Online]. Available:
https://github.com/cta-wave/common-media-client-data/issues/19
[19] M. R. Garey et al.,Computers and Intractability. A Guide to the Theory
of NP-Completeness. W.H. Freeman, 1979.
[20] A. Bentaleb, M. N. Akcay, M. Lim, A. C. Begen, and R. Zimmermann,
“Catching the moment with LoL+ in Twitch-like low-latency live
streaming platforms,” IEEE Trans. on Multimedia, 2021.
[21] K. Spiteri, R. Urgaonkar, and R. K. Sitaraman, “BOLA: Near-optimal
bitrate adaptation for online videos,” in 35th IEEE Int’l. Conf. on
Computer Communications, 2016.
[22] R. Ricci, E. Eide, and C. Team, “Introducing CloudLab: Scientific
infrastructure for advancing cloud architectures and applications,” login::
The Magazine of USENIX & SAGE, 2014.
[23] S. Lederer, C. M¨
uller, and C. Timmerer, “Dynamic adaptive streaming
over HTTP dataset, in Proc. 3rd ACM Multimedia Systems Conf., 2012.
[24] D. Raca, J. J. Quinlan, A. H. Zahran, and C. J. Sreenan, “Beyond
throughput: a 4G LTE dataset with channel and context metrics,” in
Proc. 9th ACM Multimedia Systems Conf., 2018.