Hybrid P2P-CDN Architecture for Live Video Streaming: An Online Learning Approach

December 2022

DOI:10.1109/GLOBECOM48099.2022.10001091

Conference: GLOBECOM 2022 - 2022 IEEE Global Communications Conference

Authors:

Reza Farahani

Alpen-Adria-Universität Klagenfurt

Abdelhak Bentaleb

Concordia University Montreal

Ekrem Çetinkaya

Ozyegin University


Hybrid P2P-CDN Architecture for Live Video Streaming: An Online Learning Approach

Reza Farahani*, Abdelhak Bentaleb§, Ekrem Çetinkaya*, Christian Timmerer*, Roger Zimmermann§, and Hermann Hellwagner*

*Christian Doppler Laboratory ATHENA, Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Austria

§School of Computing, National University of Singapore, Singapore

Abstract: Designing a cost-effective, scalable, and flexible architecture that supports low latency and high quality live video streaming is still a challenge for Over-The-Top (OTT) service providers. To cope with this issue, this paper leverages Peer-to-Peer (P2P), Content Delivery Network (CDN), edge computing, Network Function Virtualization (NFV), and distributed video transcoding paradigms to introduce a hybRId P2P-CDN arcHiTecture for livE video stReaming (RICHTER). We first introduce RICHTER’s multi-layer architecture and design an action tree that considers all feasible resources provided by peers, edge, and CDN servers for serving peer requests with minimum latency and maximum quality. We then formulate the problem as an optimization model executed at the edge of the network. We present an Online Learning (OL) approach that leverages an unsupervised Self Organizing Map (SOM) to (i) alleviate the time complexity issue of the optimization model and (ii) make it a suitable solution for large-scale scenarios, by enabling decisions for groups of requests instead of for single requests. Finally, we implement the RICHTER framework, conduct our experiments on a large-scale cloud-based testbed including 350 HAS players, and compare its effectiveness with baseline systems. The experimental results illustrate that RICHTER outperforms baseline schemes in terms of users’ Quality of Experience (QoE), latency, and network utilization, by at least 59%, 39%, and 70%, respectively.

Index Terms: HAS; Edge Computing; NFV; CDN; P2P; Low Latency; QoE; Video Transcoding; Online Learning.

I. INTRODUCTION

Motivation: The proliferation of novel video streaming technologies, the advancement of networking paradigms, and the steadily increasing number of users who prefer to watch video content over the Internet rather than on classical TV have made video the predominant traffic on the Internet. Among all types of video traffic, live video streaming has become significantly popular, accounting for about 17% of the total video traffic by 2022 [1]. HTTP Adaptive Streaming (HAS) delivery systems (e.g., based on the Dynamic Adaptive Streaming over HTTP (DASH) standard or Apple’s HTTP Live Streaming (HLS)) have become the prevalent technologies employed by OTT service providers (e.g., Facebook, YouTube, Twitch) for live video streaming delivery [2]. In HAS, videos are split into short segments of fixed duration, and each segment is encoded at various qualities/bitrates (i.e., representations); HAS clients then use an adaptive bitrate algorithm to adapt to the available bandwidth and/or playout buffer status and download appropriate segments from CDN servers [2]. Although utilizing CDN services to scale HAS delivery systems has been a step forward, the tremendous growth in demand for high-quality, low-latency live video creates several challenges for OTT services. For instance, CDN servers can become overloaded, in which case OTT services fail to deliver satisfactory quality and latency to end-users [3]. Recent studies have revealed that using clients’ capabilities within a P2P network to form hybrid P2P-CDN video delivery systems addresses the aforementioned issues and brings many advantages, such as alleviating network congestion, increasing streaming stability, and reducing delivery costs [4]–[6]. Considering these benefits, many companies, e.g., Peer5 and Livepeer, have been utilizing peer-assisted networks with promising networking protocols (e.g., WebRTC) to offload CDNs and accomplish the aforementioned goals. However, some works [7] reveal that existing hybrid P2P-CDN live streaming systems do not consider the full capability of peers to provide high quality and low latency live streaming, and consequently suffer from inefficient resource utilization and poor user QoE. Therefore, the primary motivation for our work is to devise a hybrid P2P-CDN live streaming system that (i) employs both the computing and bandwidth capabilities provided by the P2P network, (ii) leverages modern networking paradigms (i.e., NFV and edge computing) and an OL-based approach to utilize P2P and CDN resources efficiently, and (iii) satisfies HAS client requests with high QoE and low latency.

Related Work: Hybrid P2P-CDN systems generally include three main components: (i) the media servers (i.e., origin or CDN servers) for distributing the video contents to the peers, (ii) peers that stream the same video contents with the same quality, and (iii) a tracker server including a matching table to find the best peers with minimum latency who are watching the same video content and quality level. Some works like [8] customize HAS players and propose such a hybrid system in order to reduce CDN bandwidth usage and transmission costs. Muscat et al. [9] utilize the server push functionality of HTTP/2 to propose a hybrid P2P-CDN low latency live video streaming system. In our previous works [10]–[12], we propose edge- and Software-Defined Networking (SDN)-assisted video streaming frameworks that do not leverage P2P capability and focus on Video on Demand (VoD) scenarios. Nacakli et al. [13] leverage SDN and edge computing to present a novel hybrid P2P-CDN service that is hosted at SDN-enabled edge data centers. Our previous work [14] proposes a hybrid P2P-CDN architecture for low latency live video streaming, but without an implementation, an evaluation, or an online learning approach. Ma et al. [15] propose machine learning-based approaches for hybrid P2P-CDN systems that enable their trackers to perform peer selection. However, their

Hybrid Event - 2022 IEEE Global Communications Conference: Communications Software and Multimedia


Figure 1: RICHTER system architecture (four layers: Media Organization Layer with encoding and packaging; CDN Layer; Edge Layer with the Virtual Tracker Server (VTS), its edge transcoder and Partial Cache (PC), located at the gNodeB; and P2P Layer with seeders, leechers, and peer transcoders; CMCD/CMSD and media flows connect the layers)

and does not include transcoding-based actions. To the best of

our knowledge, none of the existing hybrid P2P-CDN video

streaming frameworks proposes a system to (i) use peers’

potential idle computational resources for serving HAS clients

through running video transcoding and (ii) make the virtual

edge trackers intelligent by employing an OL approach.

Contributions: To tackle these challenges, in this paper, we leverage HAS, P2P, CDN, NFV, and edge computing technologies and propose a hybRId P2P-CDN arcHiTecture for livE video stReaming (RICHTER). Our solution aims to minimize HAS clients’ latency and network costs. Taking resource limitations into account, we design an Action Tree that includes all possible actions for serving clients’ requests and is employed by Virtual Tracker Servers (VTSs) at the edge of a P2P-CDN network. We formulate the problem as a mixed-integer linear programming (MILP) optimization model. Because the proposed MILP model is NP-hard, we design an OL-based approach that uses an unsupervised SOM technique [16] for action selection decisions. To test the practical deployment of our solution, we implement RICHTER, analyze its performance through experiments conducted in a large-scale testbed including 350 clients, and compare its results with selected baseline approaches. The experimental results demonstrate the effectiveness of RICHTER in achieving high user QoE, low latency, and optimized network utilization.

Paper Outline: The remainder of this paper is structured as follows. Section II-A explains the proposed architecture; we formulate the problem as a MILP optimization model in Section II-B and explain our proposed OL-enabled method in Section II-C. The evaluation setup, methods, metrics, and results are described in Section III. Section IV concludes the paper and gives an outlook on future work.

II. RICHTER DESIGN

A. System Model

The proposed architecture of RICHTER includes four core

layers and is shown in Fig. 1.

Media Organization Layer. In this layer, the raw live

videos are encoded and packaged into DASH format, then

stored on the origin server. Note that this layer is able to

package the encoded videos to other formats like HLS or

Common Media Application Format (CMAF).

Figure 2: RICHTER action tree (clients are served by a peer, a peer with transcoding, the VTS partial cache, the VTS with transcoding, the origin server, or a CDN server; the branches are numbered as actions 1 to 7)

CDN Layer. This layer is constructed by a group of CDN servers (either OTT servers or a purchased service from CDN providers), each of which contains various parts of video sequences. Inspired by the Consumer Technology Association CTA-5004 standard [17], [18], CDN servers periodically inform the edge layer about their cache occupancy via Common Media Server Data (CMSD) messages.

P2P Layer. Given the continuous increases in smartphone capabilities, e.g., high-bandwidth access to the Internet, energy resources, and hardware-accelerated video transcoding, RICHTER utilizes the peers’ idle resources to provide a distributed video transcoding approach in addition to video transmission. Like most hybrid P2P-CDN schemes, we construct the P2P layer based on the tree-mesh structure, including two types of peers: Seeders and Leechers. In this scheme, seeders’ requests can be served by all nodes (i.e., CDNs, origin, edge, or other seeders) except leechers, while leechers’ requests can be served by all nodes. Inspired by the CTA-5004 standard [17], peers periodically inform the edge layer about their cache occupancies through Common Media Client Data (CMCD) messages and receive updates from the edge layer via CMSD messages.

Edge Layer. This layer leverages the capabilities of NFV and edge computing and presents virtualized edge components called Virtual Tracker Servers (VTSs) close to base stations (e.g., gNodeB in 5G). Note that, in the proposed system, during a live session, clients’ requests are directed to a VTS, and they then receive responses based on the VTS’s decisions. As shown in Fig. 1, a VTS is equipped with transcoding and partial cache functions to serve clients’ requests from existing higher content qualities (by transcoding) or directly from cached qualities, respectively. Note that because the VTS has a broader view of both the P2P and CDN layers (based on the received CMCD/CMSD messages and monitored information), it can track clients’ requests and store a mapping between all transmitted content and all served clients in its peer-map lists. Thus, it must answer the following vital questions whenever it decides how to serve received requests:

1) Where is the optimal place (i.e., adjacent peers, VTS, CDN servers, or origin server), in terms of lowest latency, from which to fetch each client’s requested content quality level while efficiently utilizing the available resources?

2) What is the optimal approach for responding to the requested quality level (i.e., fetch or transcode)?

Among other tasks, a VTS monitors the system frequently to obtain precise information about the available resources (e.g., bandwidth, peers’ computational and power resources) and peers’ joining/leaving times. Therefore, when a VTS receives a new request, it can find the optimal solution (i.e., in terms of minimum latency) from the action tree (Fig. 2), with actions numbered as in the figure:

1) Use the P2P network and transmit the requested quality directly from the best adjacent peer with maximum stability (i.e., the least recent joining time).

2) Transcode the requested quality from a higher quality at the most stable adjacent peer and transmit it through the P2P network.

3) Fetch the requested quality directly from the edge, i.e., the VTS.

4) Transcode the requested quality from a higher quality at the VTS.

5) Fetch the requested quality from the origin server.

6) Fetch a higher quality from the best CDN server and transcode it at the VTS.

7) Fetch the requested quality from the best CDN server.
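The seven actions listed above can be sketched as a small lookup table plus a minimum-latency choice. This is only an illustrative sketch, not the RICHTER implementation: the `ACTIONS` mapping, the `Candidate` class, and `pick_action` are hypothetical names, and the latency values in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List, Optional

# Action numbers follow Fig. 2; each entry is (source, needs_transcoding).
# This encoding is an assumption made for illustration.
ACTIONS = {
    1: ("peer", False),
    2: ("peer", True),
    3: ("vts", False),
    4: ("vts", True),
    5: ("origin", False),
    6: ("cdn", True),    # higher quality fetched from a CDN, transcoded at the VTS
    7: ("cdn", False),
}


@dataclass
class Candidate:
    action: int                  # key into ACTIONS
    fetch_time: float            # seconds
    transcode_time: float = 0.0  # seconds, zero for fetch-only actions

    @property
    def latency(self) -> float:
        # Serving latency = fetching time plus transcoding time.
        return self.fetch_time + self.transcode_time


def pick_action(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the lowest serving latency, or None if empty."""
    return min(candidates, key=lambda c: c.latency, default=None)
```

For example, given `[Candidate(7, 0.40), Candidate(2, 0.10, 0.20), Candidate(3, 0.35)]`, `pick_action` would select action 2 (transcoding at a peer), because its combined latency is the smallest of the three.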

B. Problem Formulation

We introduce an MILP optimization model that includes

four groups of constraints: Action Selection (AS), Serving Time

(ST), Origin/Peer (CP), and Resource Usage (RU).

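(i) AS constraint. Constraint (1) selects an appropriate action from the proposed action tree (Fig. 2) for the request issued by peer $j$. It chooses a suitable value of the binary variable $X_{i,j}^{q,t}$ (refer to Table I for the definition):

$$\sum_{i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}} \; \sum_{t \in \mathcal{T}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \alpha_{i,j}^{q} = 1, \quad \forall j \in \mathcal{P} \tag{1}$$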

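(ii) ST constraint. Constraint (2) determines the transmitting time $T_{i,j}^{q}$ required to transmit quality level $q \in \mathcal{Q}_j$ from source node $i$ to peer $j$:

$$\frac{\sum_{t \in \mathcal{T}} X_{i,j}^{q,t} \times \delta_{j}^{q}}{\omega_{i,j}} \leq T_{i,j}^{q}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\},\; q \in \mathcal{Q}_j \tag{2}$$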

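Constraint (3) determines the required transcoding time $\tau_{i,j}^{q}$ at node $i$ in case of serving the quality requested by peer $j$ from a higher quality $q$ by transcoding:

$$\sum_{t \in \mathcal{T} \setminus \{0\}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \mu_{i,j}^{q} \leq \tau_{i,j}^{q}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\} \tag{3}$$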

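Therefore, the serving time, namely $\Psi$, i.e., fetching time plus transcoding time, can be expressed as follows:

$$\sum_{i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}} \; \sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} \left( T_{i,j}^{q} + \tau_{i,j}^{q} \right) \leq \Psi \tag{4}$$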
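As a worked illustration of Eqs. (2)–(4), the serving time of a single request is the segment size divided by the available bandwidth, plus any transcoding time. The sketch below assumes the segment size is given in bytes and the bandwidth in bits per second, so a factor of 8 is needed; the function names and the numbers in the usage note are illustrative assumptions, not values from the paper.

```python
def transmit_time(segment_bytes: int, bandwidth_bps: float) -> float:
    """Transmitting time in seconds for one segment, in the spirit of Eq. (2).

    Assumes segment size in bytes and bandwidth in bits per second.
    """
    return segment_bytes * 8 / bandwidth_bps


def serving_time(segment_bytes: int, bandwidth_bps: float,
                 transcode_s: float = 0.0) -> float:
    """Fetching time plus transcoding time for one request (one term of Eq. (4))."""
    return transmit_time(segment_bytes, bandwidth_bps) + transcode_s
```

For instance, a 600,000-byte segment sent over a 4 Mbps path takes `transmit_time(600_000, 4_000_000)`, i.e., about 1.2 seconds; if the segment first has to be transcoded for 0.3 seconds, `serving_time` adds that on top.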

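Table I: Notation for RICHTER

| Notation | Description |
|---|---|
| Input parameters | |
| $\mathcal{C}$ | Set of $k$ CDN servers and an origin server (i.e., $c = 0$) |
| $\mathcal{P}$ | Set of $n$ peers including $s$ seeders and $l$ leechers in subsets $\mathcal{P}_1$ and $\mathcal{P}_2$, respectively |
| $\mathcal{V}$ | Virtual Tracker Servers (VTSs) |
| $\mathcal{Q}_j$ | Set of possible quality levels for serving quality $q^*$ requested by $j \in \mathcal{P}$, where $\mathcal{Q}_j = \{q^*, q^*+1, \ldots, q^*_{max}\}$ and $q^*_{max}$ is the maximum quality level for the demanded segment |
| $\mathcal{T}$ | Set of possible transcoding statuses, where $\mathcal{T} = \{0, 1, 2\}$ and $t = 1$ or $t = 2$ if the requested quality is transcoded from a higher quality $q \in \mathcal{Q}_j$ at a VTS $i \in \mathcal{V}$ or a peer $i \in \mathcal{P} \setminus \{j\}$, respectively; otherwise $t = 0$ |
| $\mathcal{R}$ | Set of $\rho$ peer regions |
| $Q$ | Set of quality level queues in the VTS |
| $\alpha_{i,j}^{q}$ | Available quality levels in $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\}$; $\alpha_{i,j}^{q} = 1$ means node $i$ hosts quality $q$ requested by peer $j \in \mathcal{P}$; otherwise $\alpha_{i,j}^{q} = 0$ |
| $\omega_{i,j}$ | Available bandwidth on the path between $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$ and $j \in \mathcal{P}$ |
| $\theta_{i,j}^{q}$ | Required resources (i.e., CPU usage in %) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$ |
| $\eta_{i,j}^{q}$ | Required power (in milliampere-hour) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \mathcal{P} \setminus \{j\}$ |
| $\mu_{i,j}^{q}$ | Required time (in seconds) for transcoding quality $q \in \mathcal{Q}_j$ into the quality requested by $j \in \mathcal{P}$ in $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$ |
| $\delta_{j}^{q}$ | Size of segment in quality $q$ (in bytes) requested by $j \in \mathcal{P}$ |
| $\lambda_{j}^{q}$ | Bitrate for quality level $q \in \mathcal{Q}_j$ requested by $j \in \mathcal{P}$ |
| $\Omega_i$ | Available computation resources (available CPU) of $i \in \{\mathcal{P} \cup \mathcal{V}\}$ |
| $\phi_i$ | Available power resources of $i \in \mathcal{P}$ |
| Variables | |
| $X_{i,j}^{q,t}$ | Binary variable where $X_{i,j}^{q,t} = 1$ indicates that source $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$ transmits quality $q \in \mathcal{Q}_j$ requested by peer $j \in \mathcal{P}$ with transcoding status $t$; otherwise $X_{i,j}^{q,t} = 0$ |
| $\tau_{i,j}^{q}$ | Required transcoding time at source $i \in \{\mathcal{P} \cup \mathcal{V}\} \setminus \{j\}$ to serve quality $q \in \mathcal{Q}_j$ requested by $j \in \mathcal{P}$ |
| $T_{i,j}^{q}$ | Required time of transmitting quality level $q \in \mathcal{Q}_j$ in response to peer $j \in \mathcal{P}$ from server $i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\}$ |
| $\Psi$ | Serving time consisting of $\tau_{i,j}^{q}$ and $T_{i,j}^{q}$ |

(iii) CP constraint. Constraint (5) forces the model to fetch the exact quality $q^*$ from the origin or CDNs when one of them is chosen to serve peer $j \in \mathcal{P}$:

$$\sum_{i \in \mathcal{C}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t=0} \times q = q^*, \quad \forall j \in \mathcal{P} \tag{5}$$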

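Note that $i = 0$ in Eq. (5) denotes the origin server. Moreover, we should prevent seeders from fetching requested qualities from leechers, as expressed in Eq. (6):

$$\sum_{i \in \mathcal{P}_2} \; \sum_{t \in \mathcal{T} \setminus \{1\}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times q = 0, \quad \forall j \in \mathcal{P}_1 \tag{6}$$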

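(iv) RU constraint. Constraint (7) guarantees that the bandwidth required for transmitting segments on the link between nodes $i$ and $j$ respects the available bandwidth:

$$\sum_{t \in \mathcal{T}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t} \times \lambda_{j}^{q} \leq \omega_{i,j}, \quad \forall j \in \mathcal{P},\; i \in \{\mathcal{P} \cup \mathcal{V} \cup \mathcal{C}\} \setminus \{j\} \tag{7}$$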

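Constraint (8) limits the maximum processing capacity required for the transcoding operation to the available computational resource:

$$\sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} \left( X_{i \in \mathcal{V},\, j}^{q,t=1} + X_{i \in \mathcal{P} \setminus \{j\},\, j}^{q,t=2} \right) \times \theta_{i,j}^{q} \leq \Omega_i, \quad \forall i \in \{\mathcal{P} \cup \mathcal{V}\} \tag{8}$$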

Similarly, constraint (9) limits the maximum peers’ power resources required for running the transcoding function to the available power resource:

$$\sum_{j \in \mathcal{P}} \; \sum_{q \in \mathcal{Q}_j} X_{i,j}^{q,t=2} \times \eta_{i,j}^{q} \leq \phi_i, \quad \forall i \in \mathcal{P} \setminus \{j\} \tag{9}$$
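The resource usage constraints (7)–(9) can be read as three budget checks on a candidate decision. The sketch below checks them for one request in isolation, which is a simplification: the constraints in the model sum over all requests served by the same node or link. The `Node` class and `is_feasible` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_free: float    # available CPU in %, cf. Omega_i
    power_free: float  # available power in mAh, cf. phi_i (peers only)


def is_feasible(bitrate_bps: float, link_bps: float, node: Node,
                cpu_needed: float = 0.0, power_needed: float = 0.0) -> bool:
    """Single-request version of the bandwidth, CPU, and power checks."""
    bandwidth_ok = bitrate_bps <= link_bps        # cf. Eq. (7)
    cpu_ok = cpu_needed <= node.cpu_free          # cf. Eq. (8)
    power_ok = power_needed <= node.power_free    # cf. Eq. (9)
    return bandwidth_ok and cpu_ok and power_ok
```

A fetch-only action leaves `cpu_needed` and `power_needed` at zero, so only the bandwidth check can fail; a transcoding action at a peer has to pass all three.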

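MILP Optimization Model. The following model minimizes the requests’ serving times (i.e., fetching time plus transcoding time), denoted by $\Psi$:

$$\begin{aligned} \text{Minimize} \quad & \Psi \\ \text{s.t.} \quad & \text{constraints Eq. (1)--Eq. (9)} \\ \text{vars.} \quad & T_{i,j}^{q},\; \tau_{i,j}^{q},\; \Psi \geq 0, \quad X_{i,j}^{q,t} \in \{0, 1\} \end{aligned} \tag{10}$$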

By running the MILP model (10), an optimal action will be

selected for each request issued by peer j∈ P such that the

total serving time is minimized. However, the MILP model

(10) is NP-hard [19], and suffers from high time complexity.

The next section introduces an OL-based approach based on

SOM [16] to remedy this issue.
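To see why the exact model scales poorly, consider a toy exhaustive search that tries every combination of per-request options and keeps the cheapest one that respects a CPU budget. This is not the authors' MILP or solver; it is a deliberately naive stand-in, with hypothetical names, whose search space grows exponentially with the number of requests.

```python
from itertools import product


def solve_exhaustively(options, cpu_capacity):
    """Brute-force the cheapest feasible plan.

    options[j] is a list of (source, serving_time_s, cpu_percent) tuples for
    request j; cpu_capacity maps a source name to its available CPU in %.
    Returns (total_serving_time, plan), or (inf, None) if nothing is feasible.
    """
    best_cost, best_plan = float("inf"), None
    for plan in product(*options):           # one option per request
        used = {}
        for source, _, cpu in plan:
            used[source] = used.get(source, 0.0) + cpu
        if any(used[s] > cpu_capacity.get(s, 0.0) for s in used):
            continue                          # CPU budget violated, cf. Eq. (8)
        cost = sum(t for _, t, _ in plan)     # total serving time
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    return best_cost, best_plan
```

With two requests that could each be transcoded at the VTS (60% CPU, 0.3 s) or fetched from a CDN (0.5 s), and a VTS budget of 100%, only one request can use the VTS, so the best plan mixes both sources. With $m$ options for each of $n$ requests the loop visits $m^n$ plans, which motivates the grouped, learned decisions of the OL approach.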

C. Proposed Online Learning Approach

We design an OL-based solution, depicted in Fig. 3(a), which works in a time-slotted fashion. As shown in Fig. 3(b), the proposed time slot structure consists of two intervals: (i) Collecting Data (CD) and (ii) Serving Requests (SR). In addition to the transcoding and partial cache functions, a VTS is equipped with the following four modules: Resource monitoring Module (RM), Manager Module (MM), Queuing Module (QM), and OL Agent. Moreover, the VTS hosts multiple queues, one per peer region, live video channel, and bitrate in each channel. In the CD interval, the following modules are called to prepare inputs for the OL agent:

RM. This module is responsible for collecting received CMCD and CMSD messages, monitoring available resources (i.e., bandwidth, power, computation, joining/leaving times) and queues, and notifying the MM module.

MM. This module is used to (i) receive HTTP requests from players, (ii) extract regions based on IP addresses, requested channels, and bitrates from the incoming HTTP requests, (iii) aggregate and forward the incoming HTTP requests and the extracted information (i.e., region/channel/bitrate) to the QM module, (iv) update the OL agent based on the items received by the RM, (v) control the correctness of decisions made by the OL agent before fetching or transcoding qualities from nodes, (vi) communicate with the peers, CDN, and/or origin server regarding the decisions made by the OL agent, and (vii) store popular segments fetched from the CDN/origin server into the partial cache. Note that the MM module immediately responds to a request for a segment that exists in the partial cache. Furthermore, it includes an on-the-fly list that is used to prevent delivering a request to the QM module if a response to the request is in flight from the CDN/origin server.

QM. This module receives extracted features of requests from the MM module and places requests in separate queues based on peer regions, requested channel IDs, and bitrates.
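A minimal sketch of the queuing behaviour described for the QM module follows, assuming requests are identified by a string and keyed by a (region, channel, bitrate) tuple. The class and method names are hypothetical; the largest-queue-first ordering mirrors the priority rule this section gives for evaluating queues in the SR interval.

```python
from collections import defaultdict, deque


class QueuingModule:
    """Toy stand-in for the QM: one FIFO queue per (region, channel, bitrate)."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def enqueue(self, request_id, region, channel, bitrate_kbps):
        # Requests with the same region, channel, and bitrate share a queue.
        self.queues[(region, channel, bitrate_kbps)].append(request_id)

    def by_priority(self):
        """Queue keys ordered so that the queue with the most requests is first."""
        return sorted(self.queues, key=lambda k: len(self.queues[k]), reverse=True)
```

Grouping requests this way is what lets the OL agent decide once per queue rather than once per request.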

Considering the system’s current state, i.e., available information on resources and queues of requests provided by the MM module, the OL agent in the SR interval must run multiple threads of an OL algorithm (one thread per peer region) to answer the questions mentioned in Section II-A. Since SOM [16] (i) is one of the widely used techniques for unsupervised classification problems, (ii) can be applied to solve NP-hard problems [20], (iii) does not require a prepared dataset for supervised model training, (iv) allows online real-time decision making, and (v) evolves its model quickly over time, it is adopted as the request management solution in the OL agent. For each queue $Q_b \in Q$ with requested bitrate level $b$, a set of SOM neurons (black circles in Fig. 3(a)) is created, each of which is a feasible node holding the requested quality (i.e., same-region peer, VTS, CDN/origin server) or a higher quality (i.e., same-region peer, VTS) for serving $b$ through fetching or running transcoding, respectively. Note that since more than one queue can be processed and together they might violate one or several resource constraints (e.g., the bandwidth, computation, or power limitations), the queues are evaluated in priority order, where the queue with the higher number of requests comes first.

Each SOM neuron has two features (i.e., a feature map) that are defined as a <latency, penalty> tuple. The latency feature indicates fetching plus transcoding times, while the penalty feature is used to penalize the neuron whenever the agent makes an incorrect decision (due to violating one or several of constraints (1)–(9)). For the sake of simplicity, we assume that each violating action increases the sum of penalties by one. Moreover, in order to represent the SOM features in the same space, we use normalized features in the range between 0 and 1. When the SOM thread is executed, it considers the neurons’ feature map and classifies neurons to find the best matching unit (BMU) with the maximum reward, i.e., minimum <latency, penalty> values. The Euclidean distance function

$$D_{Q_b}(i, j) = \sqrt{\sum_{n=1}^{2} w_{Q_b}^{n} \left( i[n] - j[n] \right)^2}$$

is used as a simple discriminant function to calculate the best matching of the features of each neuron $j$ compared to the BMU $i$, where $w_{Q_b}^{n}$ in weight matrix $w_{Q_b}$ is used for the $n$th feature of each feature list. Usually, after selecting the BMU, the corresponding neuron and its neighbors must be updated. Note that the neighborhood function employed in the SOM is the Gaussian distribution function

$$H_{Q_b}(i, j) = e^{-\frac{D_{Q_b}(i, j)^2}{2\sigma^2}},$$

where $\sigma$ is the learning rate. Finally, an output list of tuples $(N, A, R, V)$ sorted in ascending order (in terms of latency) is sent to the MM module, where each tuple indicates the determined node $N$, action $A$, the maximum number of requests $R$ that can be served via that node/action, and a violation signal $V$, respectively (Fig. 3(c)). For instance, tuple $(p_1, 1, 2, 0)$ of the output list shows that peer 1 using action 1 can serve two requests without violating the defined constraints. The MM module follows the SOM decisions for serving requests with tuples with $V = 0$, while it ignores tuples with $V \geq 1$. Note that the MM module updates the inputs of the OL agent (i.e., available resources) according to the accepted outputs of the OL agent, since the SOM threads might execute several times between two consecutive CD intervals.
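The distance and neighborhood functions above translate directly into code. The sketch below is an illustration under stated assumptions rather than the authors' agent: it treats the ideal neuron as the point (0, 0) in normalized <latency, penalty> space and picks the neuron closest to it as the BMU, and all function names are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature tuples."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def neighborhood(distance, sigma):
    """Gaussian neighborhood value for a given distance."""
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


def best_matching_unit(neurons, weights=(0.5, 0.5), target=(0.0, 0.0)):
    """Return the name of the neuron closest to the target features.

    neurons maps a node name to normalized (latency, penalty) features.
    Using (0, 0) as the target is an assumption made for this sketch.
    """
    return min(neurons, key=lambda n: weighted_distance(neurons[n], target, weights))
```

For example, with neurons `{"peer1": (0.2, 0.0), "vts": (0.1, 0.5), "cdn1": (0.6, 0.0)}` and equal weights, `best_matching_unit` returns `"peer1"`: the VTS has lower latency but carries a penalty, and the CDN is penalty-free but slower.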

This process is repeated in each SR interval until the live streaming session ends and all queues are served. Assume $\rho$, $\beta$, and $\gamma$ indicate the number of peer regions, the number of live channels, and the number of bitrates per channel, respectively. In the worst case, the time complexity of the multi-thread SOM method employed by the OL agent is $O(\rho \times \beta \times \gamma)$ in each time slot.

Figure 3: (a) Proposed online learning structure, (b) time slot structure, and (c) a sample of the OL agent output (panel (a) shows the Resource monitoring Module (RM), Manager Module (MM), partial cache, Queuing Module (QM) with per-region, per-channel queues, and the Online Learning Agent running one thread per region over a feature map of actions 1 to 7; panel (b) shows Collecting Data (CD) and Serving Requests (SR) intervals along the time axis)

III. PERFORMANCE EVALUATION

Evaluation Setup: To assess the effectiveness of RICHTER in a realistic large-scale environment, InternetMCI¹ is considered as a real backbone network topology. We instantiate our testbed including 375 elements, i.e., 350 AStream² DASH players running the BOLA [21] adaptive bitrate (ABR) algorithm (seven groups of 50 peers), five Apache HTTP servers (i.e., four CDN servers with a total cache size of 40% of the video dataset and an origin server containing all video sequences), 19 OpenFlow (OF) backbone switches, 45 backbone layer-2 links, and a VTS server (with a partial cache size of only 5% of the video sequences) on the CloudLab [22] environment. Each element runs on Ubuntu 18.04 LTS inside Xen virtual machines. RICHTER is independent of the caching policy and is compatible with various caching strategies. For simplicity, Least Recently Used (LRU) is considered in all CDN and partial caches as the cache replacement policy. Note that we assume each peer can cache five segments of the videos at most. We implement all modules of the VTS in Python to serve clients’ requests for five live channels (i.e., CH I–CH V). Each live channel plays a unique video [23] with 300 seconds duration, comprising two-second segments in bitrate ladder {(0.089, 320p), (0.262, 480p), (0.791, 720p), (2.4, 1080p), (4.2, 1080p)} [Mbps, content resolution].

¹ http://www.topology-zoo.org/dataset.html; last access: 2022-05-16.

² https://github.com/pari685/AStream; last access: 2022-05-16.

The Docker image jrottenberg/ffmpeg³ is utilized to measure the segment transcoding time on the VTS. To measure the transcoding time on the heterogeneous P2P network, we run the transcoding function via FFmpegKit⁴ on an iPhone 11 (Apple A13 Bionic, iOS 15.3), a Xiaomi Mi11 (Snapdragon 888, Android 11), and a PC (Apple M1, MacOS 12.0.1). Moreover, power consumption is measured via device tools, such as Android Energy Profiler and Android Battery Manager. The bandwidths of all links on the paths from the CDN servers and from the origin server to the VTS are set to 50 and 100 Mbps, respectively. To emulate the mobile network conditions, we assume 250 peers initiate the experiments, and then, every three seconds, a new peer joins the sessions. The VTS directs the first peer to the best CDN server (in terms of lowest latency), while other participating peers can be connected on both CDN and P2P links. A real 4G network trace [24] collected on bus rides is employed for links between peers and edge servers in all experiments. The average bandwidth of this trace is approximately 3780 kbps with a standard deviation of 3190 kbps. The channel access probability is generated following a Zipf distribution with the skew parameter $\alpha = 0.7$, i.e., the probability of an incoming request for the $i$th channel in each peer group is given as

$$prob(i) = \frac{1/i^{\alpha}}{\sum_{j=1}^{K} 1/j^{\alpha}},$$

where $K = 5$. The learning rate and the weighting parameters associated with latency and penalty are set to 0.01, 0.5, and 0.5, respectively.
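The Zipf channel access probabilities can be computed in a few lines. This is a small worked example of the formula with the stated parameters (five channels, skew 0.7); the function name is an illustrative choice.

```python
def zipf_probabilities(k: int = 5, alpha: float = 0.7):
    """Probability of a request for channel i = 1..k under a Zipf distribution."""
    weights = [1 / i ** alpha for i in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

With the default parameters, the first channel receives roughly 36% of the requests and the probabilities decrease monotonically for the remaining channels, so popular channels dominate but the less popular ones still see traffic.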

Evaluation Methods: The results achieved by RICHTER are compared with the following baseline methods: (i) Non Hybrid (NOH): regular CDN-based streaming with no P2P support. (ii) Non Transcoding-enabled Hybrid (NTH): as in most works, this approach has no transcoding capability. In an NTH-based system, peers can only be served via one of the actions 1, 5, or 7 (Fig. 2). (iii) Edge Caching/Transcoding Hybrid (ECT): in this approach, transcoding at the peer side is not considered, and requests can be served via all actions except action 2. For fair comparisons, our testbed with a similar setup is used for all systems. Moreover, the NOH, NTH, and ECT systems employ lightweight heuristic approaches to answer the questions mentioned in Section II-A by considering Eqs. (1)–(10).

³ https://hub.docker.com/r/jrottenberg/ffmpeg; last access: 2022-05-16.

⁴ https://tanersener.github.io/ffmpeg-kit/; last access: 2022-05-16.

Figure 4: Evaluation results for the NOH, NTH, ECT, and RICHTER systems for 350 clients (panels (a)–(e) report ASB (Mbps), AQS, ANS, ASD (sec.), APQ, AST (sec.), CHR, ETR, and BTL (Gbps))

Evaluation Metrics: The performance of these systems is evaluated through the following metrics: (i) Average Segment Bitrate (ASB) of all the downloaded segments; (ii) Average Number of Quality Switches (AQS), the average number of segments whose bitrate level changed compared to the previous one; (iii) Average Stall Duration (ASD), the average of total video freeze time of all clients; (iv) Average Number of Stalls (ANS), the average number of rebuffering events; (v) Average Perceived Overall QoE (APQ), calculated by the ITU-T Rec. P.1203 model in mode 0⁵; (vi) Average Serving Time (AST), defined as the overall time for serving all clients, including fetching time plus transcoding time; (vii) Backhaul Traffic Load (BTL), the volume of segments downloaded from the origin server; (viii) Edge Transcoding Ratio (ETR), the fraction of segments transcoded at the VTS or peers; (ix) Cache Hit Ratio (CHR), defined as the fraction of segments fetched from the CDN or edge servers or peers. Each experiment is executed 20 times, and the average and standard deviation values are reported in the experimental results.
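Two of the metrics above, ASB and the number of quality switches, follow directly from a client's per-segment bitrate log. The sketch below shows the per-client computation under the assumption that the log is a list of bitrates in Mbps in playback order; averaging across clients then gives the reported values. The function names and the sample log are illustrative.

```python
def average_segment_bitrate(bitrates_mbps):
    """Mean bitrate of all downloaded segments for one client."""
    return sum(bitrates_mbps) / len(bitrates_mbps)


def quality_switches(bitrates_mbps):
    """Number of segments whose bitrate differs from the previous segment."""
    return sum(1 for prev, cur in zip(bitrates_mbps, bitrates_mbps[1:])
               if cur != prev)
```

For the log `[0.791, 0.791, 2.4, 2.4, 0.791]`, `quality_switches` counts two switches (one up, one down).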

Footnote 5: https://github.com/itu-p1203/itu-p1203; last access: 2022-05-16.

Evaluation Results: Transcoding on peers must be fast, must not add significant delay to the live system, and must not consume much battery; otherwise, clients' requests may resort to other actions that congest the network and the edge server. In the first scenario, we run experiments to investigate the latency and energy overheads of running transcoding tasks on peers. To evaluate the latency overhead, we measure transcoding times for a five-minute video in different resolutions/bitrates on the peer devices (iOS, Android, and PC). Transcoding requires decoding the video into raw frames and then re-encoding those frames. Since peers can leverage the video processing that is already performed to capture or view the video, the transcoding time at the peer side is equal to the encoding time. As shown in Table II, transcoding the whole video takes 8.5–254.2 seconds on these devices (0.056–1.69 seconds per segment), which is fast enough for action 2. In another experiment, we measure the battery consumption of peers when they (i) play a video, (ii) transcode a video from a higher quality, or (iii) play video I, transcode video II, and transmit video III simultaneously. The average values for five-minute videos are approximately 0.8%, 0.4%, and 1.3% of the peers' battery usage, respectively. Thus, a combination of playing, transcoding, and transmitting tasks does not put a significant burden on the peers' batteries compared to the energy used to play or transcode video alone.

Table II: Average transcoding times (in seconds) for a 5-min. video on peers

Resolution       Bitrate          iOS      Android   PC
1080p → 240p     4219k → 89k      34.2     53.25     18.85
1080p → 360p     4219k → 262k     42.7     61        24.9
1080p → 720p     4219k → 791k     166.5    130.9     53.1
1080p → 1080p    4219k → 2484k    254.2    249.6     87.2
1080p → 240p     2484k → 89k      35       55.1      18.9
1080p → 360p     2484k → 262k     45       62.7      21.8
1080p → 720p     2484k → 791k     172.2    132.5     52
720p → 240p      791k → 89k       16.25    34.75     10.8
720p → 360p      791k → 262k      25.1     49.3      15.4
360p → 240p      262k → 89k       8.5      19.5      8.8
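The per-segment range quoted for Table II can be checked with a few lines of arithmetic. The sketch below assumes a segment duration of 2 seconds (150 segments for the 300-second video); this value is not stated in this passage and is chosen only because it approximately reproduces the quoted 0.056–1.69 seconds per segment. The helper names are hypothetical.

```python
VIDEO_S = 300    # five-minute test video, as stated in the text
SEGMENT_S = 2.0  # ASSUMPTION: segment duration is not given in this passage

# Table II: (conversion, iOS, Android, PC), average times in seconds.
TABLE_II = [
    ("1080p@4219k -> 240p@89k", 34.2, 53.25, 18.85),
    ("1080p@4219k -> 360p@262k", 42.7, 61.0, 24.9),
    ("1080p@4219k -> 720p@791k", 166.5, 130.9, 53.1),
    ("1080p@4219k -> 1080p@2484k", 254.2, 249.6, 87.2),
    ("1080p@2484k -> 240p@89k", 35.0, 55.1, 18.9),
    ("1080p@2484k -> 360p@262k", 45.0, 62.7, 21.8),
    ("1080p@2484k -> 720p@791k", 172.2, 132.5, 52.0),
    ("720p@791k -> 240p@89k", 16.25, 34.75, 10.8),
    ("720p@791k -> 360p@262k", 25.1, 49.3, 15.4),
    ("360p@262k -> 240p@89k", 8.5, 19.5, 8.8),
]


def per_segment_s(total_s, video_s=VIDEO_S, segment_s=SEGMENT_S):
    """Average transcoding time per segment, given the whole-video time."""
    return total_s * segment_s / video_s


def keeps_up(total_s, video_s=VIDEO_S):
    """True if transcoding the video takes less time than playing it."""
    return total_s / video_s < 1.0


# Flatten the three device columns into one list of measurements.
times = [t for _, *row in TABLE_II for t in row]
fastest, slowest = min(times), max(times)

print(f"per segment: {per_segment_s(fastest):.3f} to {per_segment_s(slowest):.3f} s")
print("all conversions faster than playback:", all(keeps_up(t) for t in times))
```

The `keeps_up` check does not depend on the assumed segment duration: every entry in Table II is below the 300-second playback time, which is consistent with the claim that peer transcoding is fast enough for a live setting.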

In the second scenario, we evaluate RICHTER's effectiveness in terms of the aforementioned metrics and compare the results with the baseline systems. As illustrated in Fig. 4(a–c), RICHTER downloads segments with higher ASB, decreases AQS and ANS, shortens ASD, and thus improves APQ and AST by at least 59% and 39%, respectively, compared to the baseline approaches (Fig. 4(c)). Shortening ASD and AST in turn significantly reduces the average latency. These gains arise because RICHTER utilizes all of the peers' available resources for serving clients. The performance of RICHTER regarding the CHR, BTL, and ETR metrics is shown in Fig. 4(d–e). Note that a cache miss event occurs when (i) the requested or higher quality levels are not available in the partial caches or on CDN servers, (ii) the available bandwidth is insufficient to fetch the requested or higher quality levels from CDN servers or peers, or (iii) the available edge or peer processing capabilities are not sufficient to transcode the requested quality from a higher quality. The CHR and BTL metrics indicate that RICHTER outperforms the other systems due to its ability to fetch the requested or higher quality levels in a hybrid system or to use distributed transcoding. Although RICHTER downloads fewer segments from the origin server and improves backhaul bandwidth usage (by about 70%) compared to ECT, it uses more computation resources of the edge and P2P layers because it employs a distributed transcoding approach.
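The three cache-miss conditions listed above can be read as a sequence of checks. The sketch below is one possible interpretation and not the authors' decision logic: it assumes that a higher quality level must be transcoded down before delivery, that one free transcoding slot is enough, and that a source is usable when its available bandwidth covers the bitrate of the level it holds. The bitrate ladder reuses the bitrates listed in Table II; the `Candidate` type and `classify` function are hypothetical.

```python
from dataclasses import dataclass

# Bitrate ladder in kbps, taken from the bitrates that appear in Table II.
LADDER_KBPS = [89, 262, 791, 2484, 4219]


@dataclass
class Candidate:
    """A peer, edge, or CDN source holding one quality level (hypothetical)."""
    level: int             # index into LADDER_KBPS
    bandwidth_kbps: float  # available bandwidth from this source to the client


def classify(requested, candidates, free_transcoders):
    """Return 'hit' or the first listed miss condition that applies."""
    # (i) neither the requested nor a higher level is held by any source
    usable = [c for c in candidates if c.level >= requested]
    if not usable:
        return "miss: (i) level unavailable"
    # (ii) no usable source has enough bandwidth for the level it holds
    fetchable = [c for c in usable
                 if c.bandwidth_kbps >= LADDER_KBPS[c.level]]
    if not fetchable:
        return "miss: (ii) insufficient bandwidth"
    # The requested level can be delivered directly.
    if any(c.level == requested for c in fetchable):
        return "hit"
    # (iii) only higher levels are fetchable, so transcoding is required
    if free_transcoders < 1:
        return "miss: (iii) insufficient processing"
    return "hit"


print(classify(1, [Candidate(3, 3000)], free_transcoders=1))
```

Under this reading, a miss means the request falls through to the origin server, which is what the BTL metric counts.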

IV. CONCLUSION AND FUTURE WORK

This paper presents RICHTER, a hybrid P2P-CDN architecture for HAS-based live video streaming services. We (i) introduce an action tree that defines all the possible actions to serve HAS clients (from peers, CDN, edge, or origin servers) with maximum user QoE and minimum latency, (ii) formulate the action decision problem as an MILP optimization model, and (iii) solve the formulated problem using an online learning, SOM-based approach. Experimental results on a large-scale testbed indicate that RICHTER outperforms the baseline approaches. Extending the proposed action tree and employing a reinforcement learning approach are our future research directions.

ACKNOWLEDGMENT

The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/. This research has been supported in part by the Singapore Ministry of Education Academic Research Fund Tier 2 under MOE's official grant number MOE2018-T2-1-103.

REFERENCES

[1] Sandvine, “The Global Internet Phenomena Report,” White Paper, January 2022. [Online]. Available: https://www.sandvine.com/phenomena

[2] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann, “A survey on bitrate adaptation schemes for streaming media over HTTP,” IEEE Communications Surveys & Tutorials, 2018.

[3] A. A. Barakabitze, N. Barman, A. Ahmad, S. Zadtootaghaj, L. Sun, M. G. Martini, and L. Atzori, “QoE management of multimedia streaming services in future networks: a tutorial and survey,” IEEE Communications Surveys & Tutorials, 2019.

[4] N.-N. Dao, A.-T. Tran, N. H. Tu, T. T. Thanh, V. N. Q. Bao, and S. Cho, “A Contemporary Survey on Live Video Streaming from a Computation-Driven Perspective,” ACM Computing Surveys (CSUR), 2022.

[5] R. Farahani, “CDN and SDN Support and Player Interaction for HTTP Adaptive Video Streaming,” in Proc. 12th ACM Multimedia Systems Conf., 2021.

[6] S. Budhkar and V. Tamarapalli, “An overlay management strategy to improve QoS in CDN-P2P live streaming systems,” Peer-to-Peer Networking and Applications, 2020.

[7] N. Anjum, D. Karamshuk, M. Shikh-Bahaei, and N. Sastry, “Survey on peer-assisted content delivery networks,” Computer Networks, 2017.

[8] H. Yousef, J. Le Feuvre, P.-L. Ageneau, and A. Storelli, “Enabling adaptive bitrate algorithms in hybrid CDN/P2P networks,” in Proc. 11th ACM Multimedia Systems Conf., 2020.

[9] N. Muscat and C. J. Debono, “A Hybrid CDN-P2P Architecture for Live Video Streaming,” in IEEE EUROCON Int'l. Conf. on Smart Technologies, 2021.

[10] R. Farahani, F. Tashtarian, A. Erfanian, C. Timmerer, M. Ghanbari, and H. Hellwagner, “ES-HAS: An Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming,” in Proc. 31st ACM NOSSDAV Workshop, 2021.

[11] R. Farahani, F. Tashtarian, H. Amirpour, C. Timmerer, M. Ghanbari, and H. Hellwagner, “CSDN: CDN-Aware QoE Optimization in SDN-Assisted HTTP Adaptive Video Streaming,” in Proc. 46th IEEE Conf. on Local Computer Networks (LCN), 2021.

[12] R. Farahani, F. Tashtarian, C. Timmerer, M. Ghanbari, and H. Hellwagner, “LEADER: A Collaborative Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming,” in Proc. IEEE Int'l. Conf. on Communications (ICC), 2022.

[13] S. Nacakli and A. M. Tekalp, “Controlling P2P-CDN live streaming services at SDN-enabled multi-access edge datacenters,” IEEE Trans. on Multimedia, 2020.

[14] R. Farahani, H. Amirpour, F. Tashtarian, A. Bentaleb, C. Timmerer, H. Hellwagner, and R. Zimmermann, “RICHTER: hybrid P2P-CDN architecture for low latency live video streaming,” in Proc. of the 1st Mile-High Video Conference, 2022, pp. 87–88.

[15] Z. Ma, S. Roubia, F. Giroire, and G. Urvoy-Keller, “When Locality is not enough: Boosting Peer Selection of Hybrid CDN-P2P Live Streaming Systems using Machine Learning,” in Network Traffic Measurement and Analysis Conf. (IFIP TMA), 2021.

[16] T. Kohonen, “Self-Organizing Maps,” Springer Science & Business Media, 2012.

[17] CTA-5004, “Web application video ecosystem–common media client data,” 2020. [Online]. Available: https://cdn.cta.tech/cta/media/media/resources/standards/pdfs/cta-5004-final.pdf

[18] CTA-WAVE, “Common-media-server Data,” 2021. [Online]. Available: https://github.com/cta-wave/common-media-client-data/issues/19

[19] M. R. Garey et al., Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, 1979.

[20] A. Bentaleb, M. N. Akcay, M. Lim, A. C. Begen, and R. Zimmermann, “Catching the moment with LoL+ in Twitch-like low-latency live streaming platforms,” IEEE Trans. on Multimedia, 2021.

[21] K. Spiteri, R. Urgaonkar, and R. K. Sitaraman, “BOLA: Near-optimal bitrate adaptation for online videos,” in 35th IEEE Int'l. Conf. on Computer Communications, 2016.

[22] R. Ricci, E. Eide, and the CloudLab Team, “Introducing CloudLab: Scientific infrastructure for advancing cloud architectures and applications,” ;login: The Magazine of USENIX & SAGE, 2014.

[23] S. Lederer, C. Müller, and C. Timmerer, “Dynamic adaptive streaming over HTTP dataset,” in Proc. 3rd ACM Multimedia Systems Conf., 2012.

[24] D. Raca, J. J. Quinlan, A. H. Zahran, and C. J. Sreenan, “Beyond throughput: a 4G LTE dataset with channel and context metrics,” in Proc. 9th ACM Multimedia Systems Conf., 2018.

