Adapting Caching to Audience Retention Rate:
Which Video Chunk to Store?
Lorenzo Maggi*, Lazaros Gkatzikis*, Georgios Paschos*, and Jérémie Leguay*
*Mathematical and Algorithmic Sciences Lab, France Research Center, Huawei Technologies Co. Ltd.
Abstract—Rarely do users watch online content entirely. We study how to take this into account to improve the performance of cache systems for video-on-demand and video-sharing platforms, in terms of traffic reduction on the core network. We exploit the notion of "audience retention rate", introduced by mainstream online content platforms, which measures the popularity of different parts of the same video content. We first characterize the performance limits of a cache able to store parts of videos, when the popularity and the audience retention rate of each video are available to the cache manager. We then relax the assumption of known popularity and propose an LRU (Least Recently Used) cache replacement policy that operates on the first chunks of each video. We characterize its performance by extending the well-known Che's approximation to this case. We prove that refining the chunk granularity improves the performance of the chunk-LRU policy. It is shown numerically that even for a small number of chunks (N = 20), the gains of chunk-LRU over the standard LRU policy that caches entire files remain significant, and they are almost optimal.
Index Terms—cache, audience retention rate, chunk, LRU
I. INTRODUCTION
Content Distribution Networks (CDN) and Video-on-Demand applications use network caches to store the most popular contents near the user and reduce backhaul bandwidth expenditure. The future projections for the cost of memory and bandwidth promote the use of caching to satisfy the ever-increasing network traffic [1]. Since the bandwidth-saving potential of caching is restricted by the number of files that fit in the cache (the cache capacity), it is interesting to maximize the caching effectiveness under this constraint. Here we consider the use of partial caching, a technique according to which we may cache specific parts of files instead of whole ones.
We focus on video files, which represent a significant fraction of the global Internet traffic (64% according to [2]). Videos are the most representative example of contents that are only partially retrieved, since specific parts of a video are viewed more than others. Typically, the average user will "crawl" several video files before watching one in its entirety. The above implies that most of the time it is not necessary to cache the entire video. Indeed, Fig. 1 shows the video watch-time from a trace of 7000 YouTube videos. The histogram emphasizes that the vast majority of files are only partially watched, and motivates the design of caching algorithms that avoid caching rarely accessed video parts, e.g., the tail.
Figure 1: Histogram of watch-time in YouTube (based on a data sample of 7000 video files from [5]). On average 60% of a file is watched.

Optimization of caching is often based on file popularity. Storing the most popular files results in more cache hits, which reduces the traffic on the core network. Nevertheless, not all parts of a file are equally popular [3]. Hence, a natural generalization of "store the most popular files" is to split the video files into chunks and "store the most popular chunks" instead. To differentiate the popularity of each video chunk we use the metric of the audience retention rate [4], which measures the popularity of different parts of the same file. It has many advantages: it is file specific, it is available in most content distribution platforms, e.g., YouTube [4], and it evolves very slowly over time, which facilitates its estimation¹. The latter is not generally true for chunk popularities, which are affected by the time-varying popularity of the corresponding file.
In this paper we establish a link between the audience retention rate and the efficiency of partial caching. Our approach is based on decomposing popularity into video popularity and video retention rate. More specifically, we address the following questions: i) How much bandwidth could we save via partial caching of video content? and ii) Is this gain achievable by practical caching algorithms?
A. Related Work
Partial caching techniques were first reported in the context of proxy caching, where it was proposed to store the file headers to improve latency performance [6]. To capture both latency and bandwidth improvements, [7] splits the files into segments of exponentially increasing size. More generally, it is possible to cache specific chunks in order to capture the different popularity of sections within a file (a.k.a. internal popularity) [3], [8].

¹The quasi-static nature of audience retention rate relates to file particularities, e.g., a movie may become uninteresting towards the end.
Intuitively, extreme chunking (e.g., at the byte level) offers finer granularity and potentially leads to the optimal caching performance. However, tracking popularity at such fine granularity is impractical and leads to algorithms of prohibitively high complexity [9]. A series of works suggest splitting each file into a small number of chunks and treating each chunk independently [7], [10]. Alternatively, it has been proposed to model internal popularity as a parametric k-transformed Zipf distribution [9], [11]. Knowing the distribution type simplifies the estimation task, but still requires parameter estimation individually for each file. Deducing the optimal size and number of chunks is not straightforward. It was recently shown that restricting to n homogeneous chunks incurs a loss which is bounded by O(n^{-2}) [8]. Alternative heuristic approaches suggest that only a specific segment of each file should be cached and dynamically adjust its size. For instance, [12] proposes a segmentation scheme where initially the whole object is cached but the segment size is gradually set equal to its estimated average watch-time. Similar adaptive strategies have also been considered for peer-to-peer networks [13], where, starting from a small segment, the portion to be cached is increased according to the number of requests and the watch-time. The caching of several segments of each file was proposed in [14], since users may be interested only in specific, non-contiguous parts of files. In this case the segment size has to be selected accordingly.
In this paper we prove that the performance of partial caching indeed improves when the file is split into chunks. We develop an analytical framework for LRU performance under partial caching, and we use it to show that the performance gains of partial caching remain significant even for a small number of chunks. To the best of the authors' knowledge, there are no studies assessing analytically the actual performance of such cache management strategies and their inherent performance limits under the partial viewing assumption.
B. Main contributions
We first investigate a trace of YouTube data [5] and conclude that partial caching has great potential to improve performance, mainly because: (i) the average video watch-time is no more than 70%, and (ii) the longer the video, the lower its average watch-time. Motivated by this, in Section IV we present an analysis of traffic bandwidth reduction which is based on the audience retention rate. Combining the theoretical analysis with the YouTube data, we show that in realistic settings the traffic reduction of partial caching over traditional caching may reach up to 50%.
The above analysis compares the performance limits of the two caching approaches assuming known popularity and retention rates. Therefore, it is also interesting to investigate the bandwidth benefits of partial caching in a more realistic setting. In Section V we design a class of practical chunk-LRU (Least Recently Used) policies, which split files into different chunks and always drop (i.e., never cache) the last chunk at the tail of files. Chunk-LRU policies harness the realistic gain of partial caching due to video watch-time. Moreover, we gain intuition into designing optimal chunking and we show that the maximum performance can be approached with a small number of chunks of equal size.
Our main technical contributions to the literature are:
- We formulate the traffic reduction optimization problem and provide a waterfilling algorithm to solve it efficiently. For the special case where users watch each video continuously until they abandon it, we derive the optimal waterfilling partial allocation in closed form. It consists of caching a compact interval [0; ν] of the file, where ν is given in closed form.
- We propose a novel chunk-LRU algorithm that splits each file into N+1 chunks, where the last one is never cached. We build an analytical framework to analyze the chunk-LRU performance under partial viewing, subject to Che's approximation for LRU performance [15].
- We provide a sufficient condition on retention rates such that sub-splitting chunks is always beneficial.
- We characterize the optimal performance of chunk-LRU as a simple optimization problem over the tail drop factor and with infinitesimal chunking.
II. YOUTUBE VIDEO WATCH-TIME
In this section we examine YouTube access traces² [5] in order to analyze the average video watch-time, which is the portion (∈ [0; 1]) of each file watched by the users. Watch-times are crucial for caching: using partial caching we may avoid caching rarely watched parts of videos and use the freed cache space to store more files.
Since most strategies try to cache the most popular files, we first investigate the relationship between average watch-time and file popularity. We classify videos into 10 groups according to their average daily views. Fig. 2 depicts the estimated probability density function of watch-time for three representative groups: the 10% most popular videos, the 10% least popular, and the intermediate ones. Interestingly, we observe that the more popular a video is, the higher its average watch-time. However, even for the most popular ones, on average only 72% of each video is watched, which leaves room for caching optimization.
Figure 2: Watch-time distribution for different classes of video popularity. The average watch-time of a video increases with its popularity.

²The dataset is publicly available and was crawled using the YouTube Data API in 2013. It contains information about 7000 files, including daily views, watch-time, duration, genre, and title of each file.
Figure 3: Average watch-time is increasing with the popularity of files, but steeply decreasing with their duration.
Next, we investigate the relationship between watch-time and video duration. The latter is a critical parameter for caching due to the cache capacity constraint, which ultimately determines caching performance. If longer videos are only partially watched, avoiding caching their unwatched parts will yield a greater benefit. In Fig. 3 we depict with dots the YouTube data for the 20% most popular files. In order to identify how the watch-time is affected by the video duration and its popularity, we use locally weighted polynomial regression [16] to fit a smoothed surface to the corresponding data. Notice that the most beneficial regime for caching purposes corresponds to the upper left corner of the plot, namely highly popular videos of long duration. We observe that in this region the average watch-time is around 0.7. In addition, independently of the video popularity, watch-time decreases rapidly with video duration.
We then group the available data into 10 classes according to their popularity and duration (with a small/large threshold of 200 sec). We report the details of the derived classes in Table I, namely, for each class, the average watch-time, the fraction of videos belonging to the class, and its average duration in seconds. We observe that large and popular videos amount to a non-negligible percentage of 5%. In addition, the average watch-time of large files is significantly smaller than that of smaller ones. To precisely evaluate the impact of watch-time on caching, we use these data in the subsequent Sections IV and V to quantify the theoretical maximum and the practically feasible caching performance.
III. SYSTEM MODEL
We consider a communication system where users download video contents from the network. Let M = {1, . . . , M} be the video content (or simply, video) catalog. Each video i ∈ M is of size S_i bytes. Content requests are generated according to the well-known Independent Reference Model (IRM) [17], under which the requests for the videos in M are independent of each other. We call p_i the probability that video i is requested, given that a video request has arrived. Equivalently, the sequence of video requests can be thought of as M independent homogeneous Poisson processes with intensity rates proportional to the probability vector {p_i}_i. For convenience of notation, we assume that the probabilities are in decreasing order, i.e., p_1 ≥ p_2 ≥ ··· ≥ p_M.

Table II: Table of notation symbols

M : video catalog, of cardinality |M| = M
C : cache size
p_i : popularity of video i
R_i(τ) : audience retention rate of video i
π_i(τ) : viewing abandonment p.d.f. of video i
S_i : size of video i
B(Y) : traffic on the core network when the portion Y_i of video i is statically stored in the cache (see Eq. (2))
N+1 : number of chunks for chunk-LRU
B* : minimum core network traffic achieved by optimal partial caching
[x_{k-1}; x_k] : k-th chunk of a video
x : collection of chunk boundaries
ν : tail drop factor for chunk-LRU; the last chunk [ν; 1] is never stored in the cache
h_{k,i} : hit rate of the k-th chunk of video i
t_C : characteristic time for chunk-LRU
B_cLRU(x, ν) : traffic on the core network with chunk-LRU (see Eq. (9)), subject to the chunking x and the tail drop factor ν
B*_cLRU : optimal traffic performance of chunk-LRU (see Eq. (11))
One cache of size C bytes is deployed in the network³. Whenever a requested video is found in the cache, the cache itself can directly serve the user. Otherwise, the video needs to be retrieved through the core network, which provides access to a central video content store containing the entire video catalog; see Fig. 4. Hence, good caching performance has a profound impact on the traffic reduction on the core network. The goal of this paper is to determine the extra bandwidth benefits that may be gained by exploiting the fact that videos are rarely watched entirely.
Figure 4: System model
A. Viewing Behavior Model: Audience Retention Rate
To mathematically analyze the impact of watch-time, we introduce the central notion of audience retention rate R_i(τ). According to YouTube's definition, the audience retention rate R_i(τ) measures the percentage of users that are still watching video i at the corresponding (normalized) instant τ, out of the overall number of views [4]. As we will see, in our analysis the retention rate has a prominent role in determining the caching performance.

Typically, a user may watch video i from instant a_i(1) up to b_i(1), then possibly skip to a_i(2) and watch until b_i(2), and so forth⁴. The watched part W_i, which equals the minimum portion of video i that the user needs to download, is the union of all watch intervals:

    W_i = ∪_j [a_i(j); b_i(j)].

We call |W_i| the watch-time of a user watching video i. For ease of notation we consider a_i, b_i ∈ [0; 1] as portions of the whole video duration. The audience retention rate⁵ function R_i(τ) can then be formally defined as the probability that a user has watched the (normalized) instant τ of the video, i.e.,

    R_i(τ) = Pr(τ ∈ W_i),  τ ∈ [0; 1].

Alternatively, we may think of R_i(τ) as the fraction of users that watch the (normalized) instant τ of video i. We remark that, thanks to the definition of R_i, we can easily evaluate the average watch-time for video i as ∫_0^1 R_i(τ) dτ.

³Our analysis can be extended to a cache hierarchy by letting p_i express the probability that a request for video i is missed by the caches at all the child nodes [1].
⁴We remark that such intervals may also overlap, i.e., a user may rewind the video and watch a part of it multiple times. We assume that, if this occurs, the user can directly retrieve the video portion that she has already watched from her terminal's cache.

Table I: The characteristics of each class of videos (each cell: av. watch-time / fraction of population / av. duration in sec). These data will be used to derive realistic and class-specific retention rates for our numerical evaluation.

Popularity   Small videos          Large videos
Lowest       0.52 / 0.179 / 81     0.37 / 0.020 / 220
Low          0.60 / 0.162 / 112    0.47 / 0.036 / 220
Medium       0.64 / 0.153 / 128    0.57 / 0.045 / 223
High         0.67 / 0.152 / 130    0.60 / 0.047 / 222
Highest      0.72 / 0.145 / 124    0.65 / 0.053 / 235
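To make the definition of R_i operational, the sketch below (our illustration, on hypothetical toy data) estimates the retention rate on a grid from per-user watch intervals [a_i(j); b_i(j)] and recovers the average watch-time as ∫_0^1 R_i(τ) dτ.

```python
import numpy as np

def retention_rate(users_intervals, grid):
    """Empirical R(tau): fraction of users whose watched part W covers tau.

    users_intervals: one entry per user, each a list of (a, b) watch
    intervals with 0 <= a <= b <= 1; the union of a user's intervals is W.
    """
    R = np.zeros_like(grid)
    for intervals in users_intervals:
        covered = np.zeros_like(grid, dtype=bool)
        for a, b in intervals:
            covered |= (grid >= a) & (grid <= b)
        R += covered
    return R / len(users_intervals)

grid = np.linspace(0.0, 1.0, 101)
users = [[(0.0, 0.6)], [(0.0, 0.3), (0.8, 1.0)], [(0.0, 1.0)]]  # toy data
R = retention_rate(users, grid)
avg_watch_time = np.trapz(R, grid)  # average watch-time = integral of R
```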
Next we devise a realistic and more specific viewing behavior model, and we derive its relationship to the audience retention rate.
1) Viewing Abandonment Model: This is a special instance of the viewing model presented above. It assumes that users always start watching each video i from its beginning, and abandon it after a random time portion b_i ∈ [0; 1]. Hence, in this case the watched part W_i takes the simple form W_i = [0; b_i], and thus b_i equals the watch-time. We call π_i(·) the probability density function of the abandonment time variable b_i. The relationship between the abandonment distribution π_i and the audience retention rate R_i is described by the expression:

    R_i(τ) = 1 − ∫_0^τ π_i(t) dt.    (1)

Hence, in this case the audience retention rate R_i(τ) measures the fraction of users with watch-time higher than τ for the particular video i. We first observe from (1) that R_i is inherently non-increasing, with R_i(0) = 1. We also remark that, under the viewing abandonment assumption, the audience retention rate R_i uniquely describes the random watch behavior [0; b_i] of a user via π_i. This observation does not hold for the general case described in Section III-A, where the same retention rate R_i may result from an arbitrary distribution of watch behaviors.
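Eq. (1) makes R_i immediate to compute from any abandonment density. As a small sketch (ours), here is the retention function of the truncated exponential density that reappears in Corollary 2; note that it satisfies R(0) = 1 and, in this case, R(1) = 0.

```python
import numpy as np

def retention_from_abandonment(lmbda, tau):
    """R(tau) = 1 - int_0^tau pi(t) dt for pi truncated exponential on [0; 1]."""
    # CDF of pi(t) = lmbda * exp(-lmbda * t) / (1 - exp(-lmbda))
    cdf = (1.0 - np.exp(-lmbda * tau)) / (1.0 - np.exp(-lmbda))
    return 1.0 - cdf

tau = np.linspace(0.0, 1.0, 101)
R = retention_from_abandonment(lmbda=2.0, tau=tau)
assert np.isclose(R[0], 1.0) and np.isclose(R[-1], 0.0)
```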
In order to come up with a realistic audience retention rate function from the estimated parameters in Tab. I, for our numerical investigations in Sections IV-C and V-D we assume that the viewing abandonment model holds.

⁵Our definition is in accordance with the definition of audience retention (or "engagement") rate by Wistia.com [18]. YouTube's audience retention rate [4] actually counts video rewinds as multiple views inside the same video.
Figure 5: Instance of audience retention rate from YouTube.
IV. PERFORMANCE LIMITS OF PARTIAL CACHING
This section analyzes the performance limits of partial caching in the context of audience retention rate. Our performance metric is core network traffic, and we tackle the off-line problem of finding the optimal static (partial) file cache allocation⁶. In particular, we compare the maximum network traffic saved by caching entire videos versus caching arbitrary portions of each of them. In both cases it is idealistically assumed that the video popularity distribution {p_i}_{i∈M} and the audience retention rate functions {R_i}_{i∈M} are perfectly known to the cache manager. This analysis serves as an upper bound for any cache management strategy with more limited information, such as the one devised in Section V.
Let us first formalize our problem. We define the partial allocation Y_i ⊆ [0; 1] of video i to be the collection of (possibly) non-adjacent bytes that are selected to be permanently stored in the cache. Under a partial allocation Y_i, any request for the remaining portions [0; 1] \ Y_i needs to be served by the origin video store. Due to the specific retention rate of this video, this happens with probability ∫_{[0;1]\Y_i} R_i(τ) dτ. Therefore, under a partial allocation vector Y, we may express the expected traffic on the core network per request B(Y) as

    B(Y) = Σ_{i∈M} S_i p_i ∫_{[0;1]\Y_i} R_i(τ) dτ.    (2)

Considering the video sizes S_i and the cache size C, a partial allocation vector Y is feasible whenever Σ_{i∈M} S_i ∫_{Y_i} 1 dx = C. Our goal is to select a feasible vector Y that minimizes the incurred traffic B(Y), i.e.,

    Y* = argmin_Y B(Y)    (3)
    s.t. Σ_{i∈M} S_i ∫_{Y_i} 1 dx = C,
         Y_i ⊆ [0; 1].
If users always watch the whole video, i.e., R_i(τ) = 1 for all τ ∈ [0; 1] and i ∈ M, then the optimization (3) takes a simple form, which is solved by the well-known "store the most popular videos" policy. In this case, we would fully store (Y_i = [0; 1]) the videos of highest p_i up to the cache capacity, and store no portion of the rest (Y_i = ∅ otherwise). As indicated by the previous section, however, in reality this is not the case; hence we expect Y* to bring a certain improvement, which we evaluate in Section IV-C.

⁶We remark that in our analysis of the optimal traffic B(Y) we assume that the allocations Y are already present in the cache, and we do not account for the traffic needed to fill the cache. To incorporate this aspect, one can regard B(Y) as the expected traffic achieved asymptotically as the number of requests tends to infinity.
Technically speaking, if we lift any assumption on the shape of the audience retention rate, the best cache allocation intuitively prescribes to partition all videos at the finest granularity (at the byte level, say), order the resulting pieces according to their popularity, and fill the cache with the most popular bytes. We now provide an equivalent waterfilling characterization of the optimal partial video allocation Y*. The main advantage of this formulation lies in the fact that it leads to an efficient algorithm to compute Y*, which we present at the end of the section.
Theorem 1. The optimal partial video allocation Y* can be expressed as

    Y*_i(µ) = {τ : p_i R_i(τ) ≥ µ}  ∀i ∈ M,    (4)

where µ is such that Σ_{i∈M} S_i |Y*_i(µ)| = C, and |·| denotes the size⁷ of a subset of [0; 1].

⁷Formally defined as the Lebesgue measure.

Informally speaking, the water level µ determines a popularity threshold above which a byte of any video deserves to be stored in the cache.
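Theorem 1 translates directly into a numerical procedure: bisect on the water level µ until the cached mass matches C. A minimal sketch (ours; it assumes vectorized retention functions, a catalog larger than the cache, and discretizes [0; 1] on a uniform grid):

```python
import numpy as np

def optimal_allocation(p, S, R_list, C, grid, tol=1e-9):
    """Bisection on the water level mu of Theorem 1.

    R_list[i] is a vectorized retention function on [0; 1]. Returns the
    measure |Y_i| of the cached set of each video, computed on `grid`.
    """
    dx = grid[1] - grid[0]

    def cached_sizes(mu):
        # |Y_i(mu)| = measure of {tau : p_i R_i(tau) >= mu}
        return np.array([np.sum(p[i] * R(grid) >= mu) * dx
                         for i, R in enumerate(R_list)])

    lo, hi = 0.0, float(np.max(p))    # p_i R_i(tau) never exceeds max(p)
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if np.dot(S, cached_sizes(mu)) > C:
            lo = mu                   # water level too low: cache overfull
        else:
            hi = mu
    return cached_sizes(hi)
```

For non-increasing R_i (the abandonment model), the returned sets are the intervals [0; η*_i] of Corollary 1 below.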
A. Viewing Abandonment Model
In the special case of the viewing abandonment model, we have already observed that the audience retention rate R_i is non-increasing for all i ∈ M. This allows us to specialize our result in Theorem 1 as follows.
Corollary 1. Consider the viewing abandonment model with strictly decreasing R_i for all i ∈ M. The optimal video allocation writes Y*_i = [0; η*_i] for all i ∈ M, where

    η*_i(µ) = 1                  if p_i R_i(1) ≥ µ,
              0                  if p_i ≤ µ,
              R_i^{-1}(µ/p_i)    otherwise,      (µ ≥ 0)

and µ is such that Σ_{i∈M} S_i η*_i(µ) = C.    (5)
A remarkable observation here is that the optimum bandwidth performance is achieved by splitting every video into only two parts and caching the first one. We may determine the exact splits if the abandonment distribution is given. For instance, if π_i is a truncated exponential with parameter λ_i, i.e.,

    π_i(τ) = λ_i e^{-λ_i τ} / (1 − e^{-λ_i}),  τ ∈ [0; 1],

then the following holds.
Corollary 2. Under the exponential viewing abandonment model the optimal video allocation writes Y*_i = [0; η*_i] for all i ∈ M, where

    η*_i(µ) = [ −(1/λ_i) ln( (µ/p_i)(1 − e^{-λ_i}) + e^{-λ_i} ) ]^+,      (µ ≥ 0)

and µ is such that Σ_{i∈M} S_i η*_i(µ) = C.    (6)
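In the truncated-exponential case no set computation is needed at all: Corollary 2 gives each η*_i in closed form as a function of the scalar µ, which can again be found by bisection. A short sketch (ours):

```python
import numpy as np

def eta_exponential(mu, p, lmbda):
    """Closed form of Corollary 2: eta_i = [-(1/lambda_i) ln(...)]^+ ."""
    arg = (mu / p) * (1.0 - np.exp(-lmbda)) + np.exp(-lmbda)
    return np.maximum(-np.log(arg) / lmbda, 0.0)

def water_level(p, S, lmbda, C, tol=1e-9):
    """Bisect on mu so that sum_i S_i * eta_i(mu) = C (p, S, lmbda: arrays)."""
    lo, hi = 0.0, float(np.max(p))
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if np.dot(S, eta_exponential(mu, p, lmbda)) > C:
            lo = mu
        else:
            hi = mu
    return hi
```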
B. Computation of Optimal Performance

To solve (3), we observe that it can be expressed as a separable convex optimization problem with linear and box constraints. If we further assume that the functions R_i do not have any plateau, then the objective function becomes strictly convex, and we can adapt the algorithm presented in (Section 7.2, [19]) to our scope in order to efficiently compute the optimal partial video cache allocation Y*. We present below a high-level description of the algorithm; the interested reader may find the implementation details in the Appendix.
Waterfilling algorithm

Set k := 0. Set M(0) := M.
while M(k) ≠ ∅:
  refine the search of the set of indices M(k) for which the optimal solution is deemed to lie in the interior of the box constraint;
  if the approximated solution for a video i ∈ M(k) falls beyond the box [0; 1], round it to the nearest boundary; it is now optimal and is discarded from M(k);
  set k := k + 1
end
C. Performance Evaluation with Real Data

In order to evaluate the performance of the optimal partial allocation in a realistic scenario, we utilize the average watch-time parameters shown in Tab. I. In Fig. 6 we compare the core network traffic B* = B(Y*) generated by the optimal partial caching strategy with the traffic produced by the most natural strategy, which stores the most popular videos in their entirety. We observe that remarkable gains from partial caching are achieved for cache-to-catalog size ratios higher than 10^{-2}, which we typically find in current CDN scenarios.

We then show in Fig. 7 the optimal portion of videos that should be stored according to the same optimal caching strategy, for different values of the cache size. Interestingly, only very popular videos are stored in their entirety, even for large cache sizes.
We finally remark that in this paper we normalize all the core network traffic figures with respect to the minimum bandwidth per video request B_nc required to serve the users when no cache is deployed in the system, which equals

    B_nc = Σ_{i=1}^{M} S_i p_i ∫_0^1 R_i(τ) dτ.    (7)
Figure 6: Core traffic generated by the optimal partial caching strategy in a realistic scenario vs. the traffic produced by storing the most popular videos in their entirety. We show in red the resulting performance gain of the first strategy. We utilize the parameters obtained via the real data shown in Tab. I. The video popularity distribution follows a Zipf law with parameter 0.8 [17]. S denotes the average video size.

Figure 7: Optimal portion of videos that should be stored according to the same optimal caching strategy as in Fig. 6. Given a certain C/SM, the video with popularity rank x should be stored from its beginning up to portion y.
V. A PRACTICAL CHUNK-LRU SCHEME FOR DECREASING RETENTION RATES
After analyzing the best performance, which can only be achieved with full information on the system parameters, we turn to the study of a practical cache update scheme that shows good performance even when the popularity p_i and the audience retention rate R_i are unknown for each video i.

It is widely understood that the Least Recently Used (LRU) cache replacement policy represents a good trade-off between hit-rate performance and implementation complexity in a real scenario where no statistics on video popularity are available to the cache manager. Moreover, thanks to its short memory, it reacts quickly to variations in video popularity. In its simplest form, though, each time a video is requested (even only partially) by a user and is not found in the cache, LRU prescribes to cache it in its entirety (and to update the LRU recency table accordingly). Since users rarely watch videos entirely, as previously observed, standard LRU would generate extra traffic in the core network and would waste precious cache space storing unpopular portions of files.
Figure 8: Video split into N+1 chunks. Only the first N are considered for chunk-LRU; the last one is never stored in the cache.

In order to counter this, we propose a new cache management policy that generalizes the classic LRU policy. We first suggest splitting each video into N+1 consecutive and non-overlapping chunks. We denote by [x_{k-1}; x_k] the k-th chunk. Moreover, we argue that the last (i.e., the (N+1)-th) chunk of each video, which is the least popular part under the assumption of decreasing audience retention rate, should never be stored in the cache, even if requested by a user. Intuitively, this frees up space for more popular chunks of less popular videos. We call ν the tail drop factor, which pinpoints the position of the last chunk. Hence, the first N chunks of each video are stored only if requested, and are then evicted from the cache in an LRU fashion.

Remark 1. For the sake of analytical simplicity we assume that the chunk splitting x, ν does not depend on the identity of the file. We leave this as a future extension.
Performing LRU on the first N chunks presents two main benefits. On the one hand, it reduces the extra traffic on the core network caused by the retrieval of video portions that are not requested. For instance, whenever a user watches a video from its beginning up to portion b, only the first k̄ = min{k : x_k ≥ b} chunks are downloaded. Hence, only the portion x_{k̄} − b is stored in the cache without being accessed. On the other hand, we exploit the fact that the tail of a video is generally less popular than the rest [9]. Hence, by systematically discarding the tail of each video we avoid evicting from the cache the first chunks, which are likely to be more popular.⁸

We now formally describe our algorithm, which uses as input the chunking of files and the tail drop factor. The impact of these parameters on the actual performance is analyzed in the following subsections.
chunk-LRU Algorithm

Step 1 (Initialization):
1.1) Set the tail drop factor ν ∈ (0; 1].
1.2) Partition each video i ∈ M into N+1 chunks of the form [x_0 = 0; x_1], [x_1; x_2], ..., [x_{N-1}; x_N = ν], [x_N = ν; x_{N+1} = 1], where x_k ∈ [0; 1] (see Fig. 8).
1.3) An initial chunk request recency vector is available.
Step 2: A request for a packet of video i ∈ M belonging to its k-th chunk [x_{k-1}; x_k] arrives.
2.1) If k = N+1, then the request is handled by the core network and the cache is not updated (i.e., the tail is never cached).
2.2) Else, if 1 ≤ k ≤ N, then
  2.2.1) If the requested chunk is stored in the cache, then the cache sends the packet to the user.
  2.2.2) If the requested chunk is not stored in the cache, then it is retrieved from the core network and then stored in the cache, after evicting the minimum number of least recently used chunks. Finally, the cache sends the packet to the user.
2.3) The recency vector of the chunks stored in the cache is updated in an LRU fashion.
2.4) Return to Step 2.

⁸Additionally, although this is not the focus of this paper, performing LRU on chunks would allow keeping track of the evolution of the popularity of each chunk. Nevertheless, the resulting benefits would be minor, since the retention rate varies on a time scale much slower than the video popularity dynamics.
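For illustration (ours; unit file sizes and chunk-level accounting), the policy can be prototyped with an ordered dictionary serving as the LRU recency table:

```python
from collections import OrderedDict

def simulate_chunk_lru(requests, x, cache_size):
    """Simulate chunk-LRU. `x` = [0, x_1, ..., x_N = nu] are the chunk
    boundaries; the tail [nu; 1] is appended as the never-cached chunk.
    `requests` yields (video id, abandonment point b). Returns core traffic."""
    bounds = list(x) + [1.0]
    cache = OrderedDict()                 # (video, chunk index) -> chunk size
    used = traffic = 0.0
    for i, b in requests:
        for k in range(len(bounds) - 1):
            if bounds[k] >= b:            # user abandons before this chunk
                break
            size = bounds[k + 1] - bounds[k]
            if k == len(bounds) - 2:      # tail chunk: core serves it, no caching
                traffic += b - bounds[k]
                continue
            if (i, k) in cache:
                cache.move_to_end((i, k))       # hit: refresh recency
            elif size <= cache_size:
                traffic += size                 # miss: fetch from the core
                while used + size > cache_size:
                    _, old = cache.popitem(last=False)  # evict least recent
                    used -= old
                cache[(i, k)] = size
                used += size
    return traffic
```

Feeding it the IRM stream of Section III together with abandonment points drawn from π_i reproduces, in spirit, the setting evaluated in Section V-D.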
A. Chunk-LRU Performance under Viewing Abandonment

Having described our chunk-LRU algorithm, we now turn to the analysis of its performance. To this purpose, in this section we assume that the viewing abandonment model holds. Moreover, in order to derive our analytical results we make the common simplifying assumption that all videos have the same size S = S_i. This is well justified by the fact that we can break large videos into equal-size fragments and perform chunk-LRU over the chunks of the video fragments.
We first observe that, under the viewing abandonment model (Section III-A1), the probability that the k-th chunk of video i is requested by a user, given that the user has already started watching video i, equals R_i(x_{k-1}) = ∫_{x_{k-1}}^1 π_i(τ) dτ. Since the requests for video i follow by assumption a Poisson process of intensity (proportional to) p_i, the request process for the k-th chunk is also Poisson, with reduced intensity p_i R_i(x_{k-1}). Thus, thanks to an adaptation of the popular Che's approximation [15], we can compute the hit rate of a specific chunk, i.e., the probability that the chunk is found in the cache when requested.

Let us elaborate on this. Che's approximation was originally proposed in [15] to compute the hit rate of files whose request sequences follow independent Poisson processes. It approximates the characteristic time t_C, measuring the time that a file spends in the cache, as a constant. When shifting the request granularity from the video to the chunk level, the independence property of the request streams is unavoidably lost. Nevertheless, we can still rely on the intuition that, when the cache size is significantly larger than the video size, the characteristic time of each chunk is approximately equal and constant; hence Che's approximation still holds, as shown to be valid in [1]. Therefore, the hit rate h_{k,i} of the k-th chunk of video i can be approximated as

    h_{k,i} = 1 − e^{-p_i R_i(x_{k-1}) t_C},

where the characteristic time t_C obeys the following relation [17]:

    C/S = Σ_{k=1}^{N} Δx_k Σ_{i=1}^{M} h_{k,i},    (8)
where Δx_k = x_k − x_{k-1}. Finally, the expected traffic per video request B_cLRU forwarded to the core network when the chunk-LRU cache management policy is employed writes

    B_cLRU(x, ν) = S Σ_{i=1}^{M} p_i [ Σ_{k=1}^{N} R_i(x_{k-1}) (1 − h_{k,i}) Δx_k + ∫_ν^1 R_i(τ) dτ ],    (9)

where x = {x_1, . . . , x_{N-1}}.
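Since the right-hand side of Eq. (8) is monotone in t_C, the characteristic time can be obtained with a standard scalar root finder, after which Eq. (9) is a direct evaluation. A sketch of both steps (ours; inputs are NumPy arrays, and it presumes ν ≥ C/(MS) so that a root exists in the bracket):

```python
import numpy as np
from scipy.optimize import brentq

def characteristic_time(p, R_at, x, C_over_S):
    """Root of Eq. (8). R_at[i, k] = R_i(x_{k-1}); x = [x_0=0, ..., x_N=nu]."""
    dx = np.diff(x)                                  # Delta x_k
    rates = p[:, None] * R_at                        # p_i R_i(x_{k-1})
    f = lambda t: np.sum(dx * np.sum(1.0 - np.exp(-rates * t), axis=0)) - C_over_S
    return brentq(f, 1e-12, 1e12)                    # f is increasing in t

def chunk_lru_traffic(p, R_at, R_tail, x, t_C, S=1.0):
    """Evaluate Eq. (9). R_tail[i] = int_nu^1 R_i(tau) dtau."""
    dx = np.diff(x)
    miss = np.exp(-p[:, None] * R_at * t_C)          # 1 - h_{k,i}
    return S * np.sum(p * (np.sum(R_at * miss * dx, axis=1) + R_tail))
```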
B. Benefits of Chunk Sub-Splitting

We now focus on the impact of the chunk size on the chunk-LRU performance, measured as the traffic B_cLRU generated at the core network. Intuitively speaking, shrinking the chunk size should translate into better traffic performance, since this reduces the traffic surplus generated when users do not watch a chunk in its entirety. Nevertheless, this intuition alone is not a proof, since modifying the chunk size also has an impact on the characteristic time t_C in a non-trivial way via Eq. (8).

Before stating the main result of this section, we first need to introduce some notation. Let t_C^{(1)} and t_C^{(∞)} be the characteristic times when only one chunk (i.e., [0; ν]) and chunks of infinitesimal size dx (say, at the byte level) are employed, respectively. More formally, t_C^{(1)} and t_C^{(∞)} are the unique roots of the two following equations, respectively:

    C/S = ν Σ_{i=1}^{M} (1 − e^{-p_i t_C^{(1)}}),
    C/S = Σ_{i=1}^{M} ∫_0^ν (1 − e^{-p_i R_i(x) t_C^{(∞)}}) dx.

Moreover, we say that the chunk split x′ is a sub-split with respect to x whenever ∪_i {x_i} ⊂ ∪_i {x′_i}. We finally observe that if ν = C/(MS) then the cache can store the first portion ν of all the videos; hence, it is reasonable to constrain ν within the interval [C/(MS); 1].

We are now ready to prove that any refinement of the chunk granularity produces a decrease in the expected traffic load on the core network.

Theorem 2. Let ν ∈ [C/(MS); 1] and let x be a video chunk split. Assume that

    (d/dτ) Σ_{i=1}^{M} p_i R_i(τ) e^{-p_i R_i(τ) t_C} < 0,  ∀t_C ∈ [t_C^{(1)}; t_C^{(∞)}], ∀τ ∈ [0; 1].    (10)

Then, any video chunk sub-split x′ outperforms x in terms of traffic generated on the core network, i.e., the following holds:

    B_cLRU(x′, ν) < B_cLRU(x, ν).

Numerical experiments suggest that our sufficient condition (10) is very loose, and that it generally holds for realistic popularity distributions and retention rates. It is not satisfied only in pathological cases where the popularity distribution is extremely concentrated around a few popular files and the cache size is very small, close to the size of a single file.
C. Optimal Performance of Chunk-LRU

In this section we focus on the computation of the best performance of chunk-LRU, optimized over the chunk split and the tail drop factor ν. We will utilize it as a benchmark for the performance evaluation of practical chunk-LRU policies in realistic scenarios in Section V-D.

In order to come up with the best performance achievable by chunk-LRU, we need to solve the following optimization problem:

    B*_cLRU = min_{N, x, ν, t_C} B_cLRU(x, ν)    (11)
    s.t. C/S = Σ_{k=1}^{N} Δx_k Σ_{i=1}^{M} (1 − e^{-p_i R_i(x_{k-1}) t_C}),
         C/(MS) ≤ ν ≤ 1,
         0 = x_0 ≤ x_1 ≤ ··· ≤ x_{N-1} ≤ x_N = ν.

It follows from Theorem 2 that, if condition (10) holds, then the bandwidth utilization of any video chunk split x with ν ∈ [C/(MS); 1] is lower bounded by the performance B^∞_cLRU(ν) of the infinitesimal split (say, at the byte level). This greatly simplifies the formulation of (11) into a two-variable constrained optimization problem (see Eq. (12)). Below we formalize this result.
Corollary 3. Assume that condition (10) holds. For any video chunk split x and tail drop factor ν, the traffic performance B_cLRU(x, ν) is lower bounded by the performance B^∞_cLRU of the infinitesimal chunking approach:

    B^∞_cLRU ≤ B_cLRU(x, ν),

where B^∞_cLRU is computed as

    B^∞_cLRU = min_{ν, t_C} Σ_{i=1}^{M} [ ∫_0^ν p_i R_i(x) e^{-p_i R_i(x) t_C} dx + ∫_ν^1 p_i R_i(τ) dτ ]    (12)
    s.t. C/S = Σ_{i=1}^{M} ∫_0^ν (1 − e^{-p_i R_i(x) t_C}) dx,
         C/(MS) ≤ ν ≤ 1.
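Under condition (10), solving (12) reduces to a one-dimensional search over ν, with t_C recovered from the constraint at each candidate. A brute-force sketch (ours; for brevity it assumes a single retention function R shared by all videos, the class-specific R_i of Tab. I being a direct extension):

```python
import numpy as np
from scipy.optimize import brentq

def optimal_infinitesimal_chunk_lru(p, R, C_over_S, M, pts=200, nu_pts=50):
    """Grid search over nu for Eq. (12); R is a vectorized retention function."""
    best = (np.inf, None)
    for nu in np.linspace(C_over_S / M + 1e-6, 1.0, nu_pts):
        xs = np.linspace(0.0, nu, pts)
        rates = p[:, None] * R(xs)[None, :]          # p_i R(x)
        g = lambda t: np.trapz(np.sum(1.0 - np.exp(-rates * t), axis=0), xs) \
            - C_over_S
        t_C = brentq(g, 1e-12, 1e12)                 # constraint of Eq. (12)
        taus = np.linspace(nu, 1.0, pts)
        obj = np.sum(np.trapz(rates * np.exp(-rates * t_C), xs, axis=1)) \
            + np.sum(p * np.trapz(R(taus), taus))
        if obj < best[0]:
            best = (obj, nu)
    return best   # (B_cLRU^infinity, optimal nu)
```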
We stress the fact that B^∞_cLRU is the lowest core network traffic achievable by a chunk-LRU cache management policy.

Thanks to the formulation in (12), we can prove the following two results via standard Lagrangian optimization techniques.

Corollary 4. If R_i is continuous and R_i(1) = 0 for all i ∈ M, then the optimal ν* < 1.

Corollary 5. If R_i(τ) = 1 for all τ ∈ [0; 1] and i ∈ M, then standard LRU (only one chunk for each video and ν = 1) achieves optimal performance.

The former result states that if users never watch videos in their entirety, then it is always optimal to never cache a non-negligible portion of each file, i.e., ν* < 1. The latter claims that, as intuition suggests, if all users watch the whole video then the best chunk-LRU policy is the standard LRU.
D. Performance Evaluation with Real Data

In this section we numerically evaluate the core network traffic performance of the proposed class of chunk-LRU cache management policies. We compare them with the optimal performance B* under full information derived in Section IV. We also take the performance of standard LRU as a second term of comparison. As in Section IV, we consider the audience retention rate scenario shown in Tab. I, estimated from a real YouTube dataset, with the only difference that the file size is assumed to be uniform. We show our results⁹ in Fig. 9. We first notice that, as hinted by Theorem 2, the traffic generated by chunk-LRU decreases as the number N of chunks increases (N = 4, 20). The infinitesimal chunk size approach (N = ∞) is shown to achieve the optimal performance B^∞_cLRU, as claimed in Corollary 3. Notably, chunk-LRU performs close to its optimal performance even with a limited number of chunks (N = 20, or even N = 4). Moreover, a suboptimal value of the tail drop factor ν = 1 still performs close to optimal for N sufficiently high (see Sect. V-E for further details). On the other hand, as expected, standard LRU performs poorly. In fact, the traffic generated by retrieving parts of files that are not requested by the users outweighs the benefits obtained through cache hits, even for medium-size caches. This explains why the traffic generated by LRU can be even higher than the traffic without any cache deployed.

The best tail drop factor ν* = ν*(N) used to produce Fig. 9 is optimized for each value of N and cache size C, as shown in Fig. 10. We notice that ν* is closely related to the average watch-time, since it captures the portion of files with the lowest popularity, which needs to be systematically discarded from the cache. For small cache sizes, simulations show that ν* is lower than the watch-time: in fact, to compensate for the reduced cache size, low values of ν* allow squeezing into the cache a significant number of different, and popular, file headers.
E. Tuning the Chunk-LRU Parameters

Although the optimization of the chunk-LRU parameters is beyond the scope of this paper, we next provide guidelines on how reasonable values could be selected.

a) Choosing the number of chunks: Increasing the number of chunks translates into an increase of the frequency at which the cache content and the associated recency list are updated, as well as an increase of the recency table size. Therefore, the design of the optimal number N of chunks in real systems should capture the trade-off between the actual performance of the policy (for which high values of N are preferable, see Cor. 3) and the required processing/memory resources, which increase with N. Our numerical results in Fig. 9 suggest that even a small number of chunks (around 4), which would result in a low-complexity policy, can achieve reasonably good traffic performance.
b) Choosing the tail drop factor ν: The exact optimal value ν*(N) can be computed by solving the problem in (12) only if all the system parameters, i.e., the file popularities p_i and the retention rates R_i, are known to the cache controller. For comparison purposes¹⁰, we then show in Fig. 9 the performance achieved in the extreme case where the cache manager is agnostic to p_i and R_i and the tail drop parameter ν is blindly set to 1, i.e., no chunks are ever discarded. Remarkably, if the number of chunks is sufficiently high (N = 20 in this case), the loss in performance incurred by such a sub-optimal choice is limited: the fine granularity of the chunk splitting compensates for the loss incurred by setting ν = 1.

⁹The traffic performance is normalized w.r.t. the traffic B_nc generated when no cache is present, as in Section IV. The chunk-LRU policies have chunks of equal size.
¹⁰If the full information assumption holds, then using chunk-LRU would be highly suboptimal, since the theoretically optimal solution computed in Section IV can actually be implemented.

Figure 9: Normalized core network traffic generated by chunk-LRU vs. the theoretical optimum B* and vs. standard LRU. The optimal ν* = ν*(N) is computed for each value of N and cache size C, as depicted in Fig. 10. We also evaluate the performance achieved when the sub-optimal value ν = 1 is utilized. The video popularity distribution follows a Zipf law with parameter 0.8 [17].

Figure 10: Optimal tail drop factor ν* for different numbers of chunks N = 4, 20, ∞. We notice that the optimal ν*(N) is within a neighborhood of the average watch-time of 0.61.
Remark 2. We claim that a reasonable choice of ν (< 1) can still be made in realistic scenarios, based on an estimation of the parameters p_i, R_i. First of all, the optimal ν* is not strictly a function of the popularity of each individual video, but only of the rank-dependent popularity p_i of the i-th most popular video, for each i. It has been shown [17] that such a rank-popularity relation depends on the class of traffic and is slowly varying over time, hence it is easily predictable. Secondly, we argue that the video retention rate functions R_i vary on a much slower time scale than that of video popularity, which greatly facilitates their estimation.
VI. CONCLUSIONS

In this paper we investigated the potential of partial caching towards minimizing core network traffic. Our numerical results based on real YouTube access data reveal that big caches benefit the most from such strategies, namely up to 50% over the classic approach of storing the most popular files. Interestingly, partial caching is beneficial even when the actual popularity of videos is not known. In this case, practical chunk-based LRU strategies which never cache the tail of videos were shown to perform well as long as a sufficient number of chunks is used.

The introduction of the audience retention rate in caching decisions opens up interesting research directions. The retention rate is generally available in online video distribution systems and evolves very slowly over time. Thus, it can be used to decompose the problems of file popularity estimation and optimal chunking without loss of optimality. In this context, the generalization of existing caching mechanisms so as to optimally exploit the benefits of partial caching is an interesting topic for future study.
VII. APPENDIX

A. Waterfilling Algorithm

Algorithm to compute η*

Step 1 (Initialization): Let k := 0, C(0) := C, M(0) := M, M^µ_a := ∅, M^µ_b := ∅. Define R̃″_i : R → R as a strictly decreasing extension of R″_i = p_i R′_i over the whole real axis, i.e., R̃″_i(τ) = p_i R′_i(τ) for all τ ∈ [0; 1], and R̃″_i is strictly decreasing over R.
Step 2: Compute µ(k) via the equation Σ_{i∈M(k)} S_i [R̃″_i]^{-1}(µ(k)) = C(k). Compute the sets M^{µ(k)}_a = {i : [R̃″_i]^{-1}(µ(k)) < 0}, M^{µ(k)}_b = {i : [R̃″_i]^{-1}(µ(k)) > 1}, and M^{µ(k)} = {i : 0 ≤ [R̃″_i]^{-1}(µ(k)) ≤ 1}. Compute δ(µ(k)) = Σ_{i∈M^{µ(k)}_b} S_i + Σ_{i∈M^{µ(k)}} S_i [R″_i]^{-1}(µ(k)) − C(k).
Step 3: If δ(µ(k)) = 0 or M^{µ(k)} = ∅, then set µ* := µ(k), M^µ_a := M^µ_a ∪ M^{µ(k)}_a, M^µ_b := M^µ_b ∪ M^{µ(k)}_b, M^µ := M^{µ(k)}, and go to Step 6. Else, if δ(µ(k)) > 0, go to Step 4. Else, if δ(µ(k)) < 0, go to Step 5.
Step 4: Set η*_i := 0 for all i ∈ M^{µ(k)}_a. Set C(k+1) := C(k). Compute M(k+1) := M(k) \ M^{µ(k)}_a, M^µ_a := M^µ_a ∪ M^{µ(k)}_a, k := k + 1. Go to Step 2.
Step 5: Set η*_i := 1 for all i ∈ M^{µ(k)}_b. Compute C(k+1) := C(k) − Σ_{i∈M^{µ(k)}_b} S_i, M(k+1) := M(k) \ M^{µ(k)}_b, M^µ_b := M^µ_b ∪ M^{µ(k)}_b, k := k + 1. Go to Step 2.
Step 6: Set η*_i := 0 for all i ∈ M^µ_a; η*_i := 1 for all i ∈ M^µ_b; η*_i := [R̃″_i]^{-1}(µ*) for all i ∈ M^µ. Stop.
B. Proof of Theorem 1

Proof. As a first step, let us define f_i : [0; 1] → [0; 1] as a one-to-one function such that the permuted audience retention rate function R′_i(τ) := R_i(f_i^{-1}(τ)) is non-increasing. The function f_i is a permutation that orders the video parts in order of decreasing popularity, such that f_i(τ) < f_i(τ′) if and only if R_i(τ) > R_i(τ′)¹¹; R′_i is the outcome of such a permutation. As a second step, we reformulate the optimization problem in (3) as

    Y* = argmax_Y Σ_{i∈M} S_i ∫_{Y_i} p_i R_i(τ) dτ    (13)
    s.t. Σ_{i∈M} S_i ∫_{Y_i} 1 dτ = C,
         Y_i ⊆ [0; 1].

We can recast the bandwidth saving optimization problem in (13) in terms of the permuted retention rates R′_i, by considering only right-intervals of 0 of the kind Y_i = [0; η_i], as follows:

    max_{η∈R^M} Σ_{i∈M} p_i S_i ∫_0^{η_i} R′_i(τ) dτ    (14)
    s.t. Σ_{i∈M} η_i S_i = C,
         η_i ∈ [0; 1].

In fact, it is not profitable to consider a larger search domain, e.g., more complicated subsets Y_i of [0; 1]: for any collection of subsets Y it is possible to replace each Y_i with the interval [0; |Y_i|] with an increase of the objective function, while feasibility is still preserved. We can further simplify (14) by defining the function R″_i(τ) := p_i R′_i(τ), as follows:

    min_{η∈R^M} Σ_{i∈M} ∫_0^{η_i} −R″_i(τ) dτ    (15)
    s.t. Σ_{i∈M} η_i S_i = C,
         η_i ∈ [0; 1].

We notice that (d/dη_i) ∫_0^{η_i} −R″_i(τ) dτ = −p_i R′_i(η_i), which is non-decreasing in η_i. Thus we recognize in (15) a convex optimization problem with linear and box constraints, where the objective function is separable in the optimization variables η. It is known that such problems can be solved via a classic water-filling technique (see [19], Chapter 6): more specifically, there exists a positive "water level" µ such that the optimal portions η*(µ) can be computed as

    η*_i(µ) = 1                    if min_{τ∈[0;1]} R″_i(τ) ≥ µ,
              0                    if max_{τ∈[0;1]} R″_i(τ) ≤ µ,
              [R″_i]^{-1}(µ)       else,
    with Σ_{i∈M} S_i η*_i(µ) = C.    (16)

By rewriting (16) in terms of R′_i we obtain the expressions

    η*_i = 1                       if p_i min_{τ∈[0;1]} R′_i(τ) ≥ µ,
           0                       if p_i max_{τ∈[0;1]} R′_i(τ) ≤ µ,
           [R′_i]^{-1}(µ/p_i)      else,
    with Σ_{i∈M} S_i |Y*_i| = C,

and we can finally claim that

    Y*_i = f_i^{-1}([0; η*_i]) = {τ : p_i R_i(τ) ≥ µ}  ∀i ∈ M.

The thesis follows.

¹¹We notice that such an f_i always exists, even though it is not unique, since it can arbitrarily break ties among equally popular parts of a single video; in general it is discontinuous.
C. Proof of Corollary 1

Proof. Since R_i is already strictly decreasing, we can consider f_i(τ) = τ and R′_i = R_i. Moreover, in this case min_{τ∈[0;1]} R_i(τ) = R_i(1) and max_{τ∈[0;1]} R_i(τ) = R_i(0) = 1. The thesis easily follows.
D. Proof of Corollary 2

Proof. Define

    R̃_i^{-1}(τ) = −(1/λ_i) ln( τ(1 − e^{-λ_i}) + e^{-λ_i} ).

We notice that R̃_i^{-1}(µ/p_i) = R_i^{-1}(µ/p_i) when 0 < µ ≤ p_i, and that R̃_i^{-1}(µ/p_i) < 0 whenever µ > p_i. Then, we can rewrite (5) as

    η*_i = [ R̃_i^{-1}(µ/p_i) ]^+,
    Σ_{i∈M} S_i η*_i = C.

The thesis easily follows.
E. Proof of Theorem 2

Proof. Let us first introduce the function

    ξ^{(t_C)}(τ) = Σ_{i=1}^{M} p_i R_i(τ) e^{-p_i R_i(τ) t_C}.

We then denote by I(f)|_x, where f is a continuous function defined over R, the integral approximation of f via Riemann sums of the type:

    I(f)|_x = Σ_{k=1}^{N} f(x_{k-1}) Δx_k.

We notice that, if f is increasing (decreasing), then I(f)|_x < (>) I(f)|_{x′} for any sub-splitting x′. We can now rewrite B_cLRU(x, ν) as (compare with (9))

    B_cLRU(x, ν) = I(ξ^{(t_C)})|_x
    s.t. Mν − C/S = I(h^{(t_C)})|_x,

where h^{(t_C)}(τ) = Σ_{i=1}^{M} e^{-p_i R_i(τ) t_C}. Since h^{(t_C)}(τ) is increasing in τ, it easily follows from an induction argument that the characteristic time of any chunk splitting is found within [t_C^{(1)}; t_C^{(∞)}].

Consider now a sub-splitting x′ with associated characteristic time t′_C. Since h^{(t_C)}(τ) is increasing, then I(h^{(t_C)})|_{x′} > I(h^{(t_C)})|_x. Also, since I(h^{(t′_C)})|_{x′} = I(h^{(t_C)})|_x and h^{(t)}(τ) is decreasing in t, then t′_C > t_C. We then have

    B_cLRU(x, ν) = I(ξ^{(t_C)})|_x > I(ξ^{(t′_C)})|_x > I(ξ^{(t′_C)})|_{x′} = B_cLRU(x′, ν),

where the second inequality follows from the fact that ξ^{(t)}(τ) is decreasing in τ for any value t of the characteristic time. The thesis is proven.
F. Proof of Corollary 4

Proof. The derivative with respect to ν of the objective function in (12), along the direction in which the constraint remains satisfied, writes

    q(ν) = −Σ_{i=1}^{M} (1 − e^{-p_i R_i(ν) t_C}) p_i R_i(ν) + (A/B) Σ_{i=1}^{M} (1 − e^{-p_i R_i(ν) t_C}),    (17)

where

    A = ∫_0^ν Σ_{i=1}^{M} p_i² R_i²(τ) e^{-p_i R_i(τ) t_C} dτ > 0,
    B = ∫_0^ν Σ_{i=1}^{M} p_i R_i(τ) e^{-p_i R_i(τ) t_C} dτ > 0.

Let us calculate q(1−). Since R_i is continuous with R_i(1) = 0, a first-order expansion of 1 − e^{-p_i R_i(ν) t_C} around ν = 1 yields

    q(ν) ≈ t_C (1 − ν) [ (A/B) Σ_{i=1}^{M} p_i |R′_i(1)| − t_C (1 − ν) Σ_{i=1}^{M} p_i² |R′_i(1)|² ],

with R′_i denoting the derivative of R_i. Since A > 0 and B > 0, the leading term is positive; hence q(1−) > 0 and the thesis is proven.
G. Proof of Corollary 5

Proof. We first observe that, if R_i(τ) = 1, then for all ν we have B_cLRU([0; ν], ν) = B_cLRU(x, ν) for any chunk splitting x. Then it suffices to prove that q(ν) < 0 holds for all ν ∈ (0; 1), i.e., (multiplying q(ν) by B > 0 and setting R_i ≡ 1) that the following inequality holds:

    [Σ_{i=1}^{M} p_i² e^{-p_i t_C}] [Σ_{i=1}^{M} (1 − e^{-p_i t_C})] − [Σ_{i=1}^{M} p_i (1 − e^{-p_i t_C})] [Σ_{i=1}^{M} p_i e^{-p_i t_C}] < 0.
REFERENCES

[1] J. Roberts and N. Sbihi, "Exploring the memory-bandwidth tradeoff in an information-centric network," in Proc. of ITC, 2013, pp. 1–9.
[2] "Cisco visual networking index: Forecast and methodology, 2014–2019," http://www.cisco.com/c/en/us/solutions/collateral/service-provider/ip-ngn-ip-next-generation-network/white_paper_c11-481360.html.
[3] K. W. Hwang, D. Applegate, A. Archer, V. Gopalakrishnan, S. Lee, V. Misra, K. K. Ramakrishnan, and D. F. Swayne, "Leveraging video viewing patterns for optimal content placement," in Proc. of IFIP Networking, 2012, pp. 44–58.
[4] http://support.google.com/youtube/answer/1715160?hl=en-GB.
[5] M. Zeni, D. Miorandi, and F. De Pellegrini, "YOUStatAnalyzer: a tool for analysing the dynamics of YouTube content popularity," in Proc. of VALUETOOLS '13. ICST, 2013, pp. 286–289.
[6] S. Sen, J. Rexford, and D. Towsley, "Proxy prefix caching for multimedia streams," in Proc. of IEEE INFOCOM '99, vol. 3, Mar 1999, pp. 1310–1319.
[7] K.-L. Wu, P. Yu, and J. Wolf, "Segmentation of multimedia streams for proxy caching," IEEE Transactions on Multimedia, vol. 6, no. 5, pp. 770–780, Oct 2004.
[8] L. Wang, S. Bayhan, and J. Kangasharju, "Optimal chunking and partial caching in information-centric networks," Computer Communications, vol. 61, pp. 48–57, 2015.
[9] J. Yu, C. T. Chou, Z. Yang, X. Du, and T. Wang, "A dynamic caching algorithm based on internal popularity distribution of streaming media," Multimedia Systems, vol. 12, no. 2, pp. 135–149, 2006.
[10] K. Agrawal, T. Venkatesh, and D. Medhi, "A dynamic popularity-based partial caching scheme for video on demand service in IPTV networks," in Proc. of COMSNETS '14, Jan 2014, pp. 1–8.
[11] S.-H. Lim, Y.-B. Ko, G.-H. Jung, J. Kim, and M.-W. Jang, "Inter-chunk popularity-based edge-first caching in content-centric networking," IEEE Communications Letters, vol. 18, no. 8, pp. 1331–1334, Aug 2014.
[12] S. Chen, H. Wang, X. Zhang, B. Shen, and S. Wee, "Segment-based proxy caching for Internet streaming media delivery," IEEE Multimedia, vol. 12, no. 3, pp. 59–67, 2005.
[13] M. Hefeeda and O. Saleh, "Traffic modeling and proportional partial caching for peer-to-peer systems," IEEE/ACM Transactions on Networking, vol. 16, no. 6, pp. 1447–1460, Dec 2008.
[14] U. Devi, R. Polavarapu, M. Chetlur, and S. Kalyanaraman, "On the partial caching of streaming video," in Proc. of IEEE IWQoS, June 2012, pp. 1–9.
[15] H. Che, Y. Tung, and Z. Wang, "Hierarchical web caching systems: Modeling, design and experimental results," IEEE Journal on Selected Areas in Communications, vol. 20, no. 7, pp. 1305–1314, 2002.
[16] W. S. Cleveland, "Robust locally weighted regression and smoothing scatterplots," Journal of the American Statistical Association, vol. 74, no. 368, pp. 829–836, 1979.
[17] C. Fricker, P. Robert, and J. Roberts, "A versatile and accurate approximation for LRU cache performance," in Proc. of the 24th International Teletraffic Congress (ITC 24), Sept 2012, pp. 1–8.
[18] http://wistia.com/doc/audience-engagement-graph.
[19] S. M. Stefanov, Separable Programming: Theory and Methods. Springer Science & Business Media, 2013, vol. 53.
... One aspect that has not received sufficient attention is the fact that videos are often not viewed in their entirety. Empirical studies have shown that on average, only 60% of a video is watched by viewers [8], [9]. A direct consequence of this is that not all parts of a video are equally popular and this motivates dividing videos into chunks/segments for caching [8], [10]- [13]. ...
... Empirical studies have shown that on average, only 60% of a video is watched by viewers [8], [9]. A direct consequence of this is that not all parts of a video are equally popular and this motivates dividing videos into chunks/segments for caching [8], [10]- [13]. ...
... Several empirical studies have found that most users do not watch videos in their entirety [8], [9], [11], [14], [15]. In [8], the authors state that, on average, only 60% of a video is watched by viewers. ...
Conference Paper
Numerous empirical studies have shown that users of video-on-demand platforms do not always watch videos in their entirety. A direct consequence of this is that not all parts of a video are equally popular. Motivated by this, we explore the benefits of dividing files into smaller segments for caching. We treat incoming requests as requests for segments of files and propose a Markovian request model which captures the time-correlation in requests. We characterize the fundamental limit on the performance of caching policies which only cache full files. Next, we propose and analyze the performance of policies which cache partial files. Using this, we characterize the potential for improvement in performance due to caching partial files and analyze its dependence on various system parameters like cache size and the popularity profile of the files being cached.
... For efficient caching and delivery, this nonuniform viewing behaviour calls for partial caching, where only the most viewed portion of each video file is cached. Audience retention rate aware partial caching is shown to improve the performance of uncoded caching in [15]. ...
... To model this, we employ the notion of audience retention rate, defined as the fraction of users that request chunk W ij among all the users that have requested W i , denoted by p ij , for i ∈ [N ] and j ∈ [B] [15]. Alternatively, we can regard p ij as the probability that a user who requested video W i will watch the jth chunk 3 . ...
... The detailed proof can be found in Appendix A. (1 − q ij )F/B bits of chunk W ij if it is requested. We note that for any demand combination, the Uncoded scheme sends the same number of bits as the RAN delivery scheme, for the placement scheme described in Section III-A, which results in the same average delivery rate given in (15). ...
Preprint
Full-text available
Most results on coded caching focus on a static scenario, in which a fixed number of users synchronously place their requests from a content library, and the performance is measured in terms of the latency in satisfying all of these demands. In practice, however, users start watching an online video content asynchronously over time, and often abort watching a video before it is completed. The latter behaviour is captured by the notion of audience retention rate, which measures the portion of a video content watched on average. In order to bring coded caching one step closer to practice, asynchronous user demands are considered in this paper, by allowing user demands to arrive randomly over time, and both the popularity of video files, and the audience retention rates are taken into account. A decentralized partial coded caching (PCC) scheme is proposed, together with two cache allocation schemes; namely the optimal cache allocation (OCA) and the popularity-based cache allocation (PCA), which allocate users' caches among different chunks of the video files in the library. Numerical results validate that the proposed PCC scheme, either with OCA or PCA, outperforms conventional uncoded caching as well as the state-of-the-art decentralized caching schemes, which consider only the file popularities, and are designed for synchronous demand arrivals. An information-theoretical lower bound on the average delivery rate is also presented.
... We denote with w_n the utility obtained when file n is requested and found in the cache (also called a hit). This file-dependent utility can be used to model bandwidth savings from cache hits [25], QoS improvement [24], or any other cache-related benefit. We will also be concerned with the special case w_n = w for all n ∈ N, i.e., cache hit ratio maximization. ...
... Why does caching of file fractions make sense? Large video files are composed of chunks stored independently; see the literature on partial caching [25]. Also, the fractional variables may represent caching probabilities [2], [26], or coded equations of chunks [24]. ...
Preprint
Full-text available
This paper introduces a novel caching analysis that, contrary to prior work, makes no modeling assumptions for the file request sequence. We cast the caching problem in the framework of Online Linear Optimization (OLO), and introduce a class of minimum regret caching policies, which minimize the losses with respect to the best static configuration in hindsight when the request model is unknown. These policies are very important since they are robust to popularity deviations in the sense that they learn to adjust their caching decisions when the popularity model changes. We first prove a novel lower bound for the regret of any caching policy, improving existing OLO bounds for our setting. Then we show that the Online Gradient Ascent (OGA) policy guarantees a regret that matches the lower bound, hence it is universally optimal. Finally, we shift our attention to a network of caches arranged to form a bipartite graph, and show that the Bipartite Subgradient Algorithm (BSA) has no regret.
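The fractional-caching view in the excerpts above turns hit-utility maximization with known request rates into a linear program: maximize the sum of λ_n w_n x_n subject to Σ_n s_n x_n ≤ C and 0 ≤ x_n ≤ 1. Its optimum is the classical greedy-by-density fractional-knapsack solution, sketched below under assumed per-file request rates λ_n and sizes s_n (these symbols are not part of the quoted excerpt):

```python
def fractional_cache(rates, utilities, sizes, capacity):
    """Maximize sum(rates[n] * utilities[n] * x[n])
    s.t. sum(sizes[n] * x[n]) <= capacity and 0 <= x[n] <= 1.
    Greedy by utility density is optimal for this LP
    (the classical fractional knapsack)."""
    n = len(rates)
    by_density = sorted(range(n),
                        key=lambda i: rates[i] * utilities[i] / sizes[i],
                        reverse=True)
    x, room = [0.0] * n, capacity
    for i in by_density:
        take = min(1.0, room / sizes[i])   # fill fully, or partially at the end
        x[i] = take
        room -= take * sizes[i]
        if room <= 0:
            break
    return x
```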
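As a rough illustration of the OGA policy in the abstract above: update the fractional configuration along the reward gradient of the current request, then project back onto the feasible set {x : 0 ≤ x_n ≤ 1, Σ_n x_n = C}. The step size, the linear reward model, and the bisection-based projection below are illustrative choices, not the paper's exact construction.

```python
def project(x, capacity, lo=0.0, hi=1.0, iters=60):
    """Euclidean projection onto {y : lo <= y_i <= hi, sum(y) = capacity}.
    The solution has the form y_i = clip(x_i - tau); tau is found by
    bisection, since the total occupancy is non-increasing in tau."""
    clip = lambda v: max(lo, min(hi, v))
    a, b = min(x) - hi, max(x) - lo          # bracket for tau
    for _ in range(iters):
        tau = (a + b) / 2
        if sum(clip(v - tau) for v in x) > capacity:
            a = tau                          # too full: raise tau
        else:
            b = tau
    return [clip(v - tau) for v in x]

def oga_step(x, requested, weight, capacity, eta=0.1):
    """One online-gradient-ascent update: the reward at time t is
    weight * x[requested], so the gradient is `weight` on that coordinate."""
    x = list(x)
    x[requested] += eta * weight
    return project(x, capacity)
```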
... The average percentage viewed continues dropping for longer videos (45% for 20-minute videos). Maggi et al. (2018) also report that the average video watch-time is less than 70%, and that the longer the video, the shorter its watch-time. ...
Article
Full-text available
This article explores video-viewing behavior when videos are wrapped in interactive content, in the case of iOtok, a 13-episode web documentary series. The interaction and viewing data were collected over a period of one year, providing a dataset of more than 12,200 total video views by 6000 users. Standard metrics (video views, percentage viewed, number of sessions) show higher active participation for registered users compared to unregistered users. Results also indicate that serialization over multiple weeks is an effective strategy for audience building over a long period of time without negatively affecting video views. In the viewing behavior analysis, we focused on three perspectives: (i) regularity (watching on a weekly basis or not), (ii) intensity (number of videos per session), and (iii) order of watching. We performed per-perspective and combined analyses involving manual coding techniques, rule-based algorithms, and k-means clustering to reveal different user profiles (intermittent, exemplary, detached, enthusiastic users, and nibblers) and to highlight further differences in viewing behavior (e.g., post-series users binge-watched more than concurrent users during the first 13 weeks, while the series was being released weekly). We discuss how these results can be used to inform the design and promotion of future web documentaries.
... Along with the most popular content, cached only in a few SBSs, the joint caching and scheduling algorithm also caches less popular content, which, when delivered to the users, improves the reconstruction quality of the views. It is worth noting that this range of capacity values is of great practical interest, as SBSs are typically assumed to cache only 5-10% of the total video catalogue [13], [14]. The performance of all the algorithms becomes limited by the insufficient transmission capacity of the network. ...
Conference Paper
The emergence of novel interactive multimedia applications with high data rate and low latency requirements has led to a drastic increase in the video data traffic over wireless cellular networks. Endowing the small base stations of a macro-cell with caches that can store some of the content is a promising technology to cope with the increasing pressure on the backhaul connections, and to reduce the delay for demanding video applications. In this work, delivery of an interactive multiview video over a heterogeneous cellular network is studied. Unlike existing works that focus on the optimization of the delivery delay and ignore the video characteristics, the caching and scheduling policies are jointly optimized, taking into account the quality of the delivered video and the video delivery time constraints. We formulate our joint caching and scheduling problem via submodular set function maximization and propose efficient greedy approaches to find a well-performing joint caching and scheduling policy. Numerical evaluations show that our solution significantly outperforms benchmark algorithms based on popularity caching and independent scheduling.
... Along with the most popular content, cached only in a few SBSs, the joint caching and scheduling algorithm also caches the less popular content, which, when delivered to the users, improves the reconstruction quality of the views. It is worth noting that this range of capacity values is of great practical interest, as SBSs are typically assumed to cache only 5-10% of the total video catalogue [19], [20]. The performance of all the algorithms becomes limited by the insufficient transmission capacity of the network. ...
Article
The emergence of novel interactive multimedia applications with high rate and low latency requirements has led to a drastic increase in the video data traffic over wireless cellular networks. Endowing the small base stations of a macro-cell with caches that can store some of the content is a promising technology to cope with the increasing pressure on the backhaul connections, and to reduce the delay for demanding video applications. In this work, delivery of an interactive multiview video to a set of wireless users is studied in a heterogeneous cellular network. Unlike existing works that focus on the optimization of the delivery delay and ignore the video characteristics, the caching and scheduling policies are jointly optimized, taking into account the quality of the delivered video and the video delivery time constraints. We formulate our joint caching and scheduling problem as the minimization of the average expected video distortion, and show that this problem is NP-hard. We then provide an equivalent formulation based on submodular set function maximization and propose a greedy solution with a (1/2)(1 - 1/e) approximation guarantee. The evaluation of the proposed joint caching and scheduling policy shows that it significantly outperforms benchmark algorithms based on popularity caching and independent scheduling. Another important contribution of this paper is a new constant approximation ratio for greedy submodular set function maximization subject to a d-dimensional knapsack constraint.
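The flavor of such greedy solutions can be conveyed with a generic density-greedy sketch for monotone submodular maximization under a knapsack (budget) constraint, applied here to a toy weighted-coverage objective. This is a stand-in illustration, not the paper's algorithm; in particular, the cited (1/2)(1 - 1/e) guarantee is obtained by taking the better of the density-greedy output and the best single feasible element.

```python
def greedy_knapsack(ground, cost, budget, marginal_gain):
    """Density greedy for monotone submodular maximization under a budget:
    repeatedly add the feasible element with the best marginal gain per
    unit cost, until no element improves the objective within budget."""
    chosen, spent = set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for e in ground - chosen:
            if spent + cost[e] <= budget:
                ratio = marginal_gain(chosen, e) / cost[e]
                if ratio > best_ratio:
                    best, best_ratio = e, ratio
        if best is None:
            return chosen
        chosen.add(best)
        spent += cost[best]

# toy weighted-coverage objective (monotone submodular): each cacheable
# item "covers" a set of users, and each user has a utility weight
covers = {"a": {1, 2}, "b": {2, 3, 4}, "c": {4, 5}}
weight = {1: 3.0, 2: 1.0, 3: 2.0, 4: 2.0, 5: 1.0}

def gain(chosen, e):
    covered = set().union(*(covers[s] for s in chosen)) if chosen else set()
    return sum(weight[u] for u in covers[e] - covered)

picked = greedy_knapsack(set(covers), {"a": 1.0, "b": 2.0, "c": 1.0},
                         budget=3.0, marginal_gain=gain)
```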
Article
Given the rapid growth of user-generated videos, internet traffic has been heavily dominated by online video streaming. Caching videos on edge servers in close proximity to users has been an effective approach to reduce the backbone traffic and the request response time, as well as to improve the video quality on the user side. Video popularity, however, can be highly dynamic over time. The cost of cache replacement at edge servers, particularly that related to service interruption during replacement, is not yet well understood. This paper presents a novel lightweight video caching algorithm for edge servers, seeking to optimize the hit rate with real-time decisions and minimized cost. Inspired by recent advances in deep Q-learning, our DQN-based online video caching (DQN-OVC) makes effective use of the rich and readily available information from users and networks. We decompose the Q-value function as a product of the video value function and the action function, which significantly reduces the state space. We instantiate the action function for cost-aware caching decisions with low complexity so that the cached videos can be updated continuously and instantly with dynamic video popularity. We used video traces from Tencent, one of the largest online video providers in China, to evaluate the performance of our DQN-OVC and to compare it with state-of-the-art solutions. The results demonstrate that DQN-OVC significantly outperforms the baseline algorithms in the edge caching context.
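As a loud simplification of the deep Q-learning machinery described in this abstract, the sketch below runs tabular Q-learning on an admit/skip decision at each cache miss. The state encoding, per-action rewards, and all hyperparameters are hypothetical; a real DQN-OVC-style system would replace the table with a neural network and the toy reward with measured hits and replacement costs.

```python
import random
from collections import defaultdict

def q_admission(transitions, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning for an admit(1)/skip(0) decision on each cache
    miss. Each transition is (state, rewards, next_state), where `state`
    is a coarse feature of the missed video (e.g. a popularity bucket) and
    `rewards[a]` is the observed payoff of action a (later hits minus a
    replacement cost). All of this is a hypothetical toy stand-in for the
    paper's decomposed deep Q-network."""
    rng = random.Random(seed)
    Q = defaultdict(float)                      # Q[(state, action)]
    for state, rewards, nxt in transitions:
        a = rng.choice((0, 1)) if rng.random() < eps else \
            max((0, 1), key=lambda act: Q[(state, act)])
        target = rewards[a] + gamma * max(Q[(nxt, 0)], Q[(nxt, 1)])
        Q[(state, a)] += alpha * (target - Q[(state, a)])
    return Q
```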
Conference Paper
There exist many aspects involved in a video turning viral on YouTube. These include properties of the video such as the attractiveness of its title and thumbnail, the recommendation policy of YouTube, marketing and advertising policies, and the influence that the video's creator or owner has in social networks. In this work, we study the audience retention measures provided by YouTube to video creators, which may provide valuable information for improving videos and for better understanding viewers' potential interest in them. We then study the question of when a video is too long and could gain from being shortened. We examine the consistency between several existing audience retention measures. We end with a proposal for a new audience retention measure and identify its advantages.
Conference Paper
Full-text available
Understanding the dynamics of on-line content popularity is an active research field with applications in sectors as diverse as media advertising, content replication and caching, and on-line marketing. In most cases, scientists have focused on user-generated content, which is freely accessible through different on-line services. Among such services, the incumbent one is YouTube. This online platform was launched in 2005 and currently features more than 6 billion hours of video watched every month (almost one hour per person on Earth), with more than 100 hours of video uploaded every minute and 1 billion unique users per month. In order to analyze or predict content popularity, statistics about viewers, watch time and shares must be retrieved. The YouTube APIs, however, do not allow third parties to retrieve such information in an open and accessible way. To overcome this problem, we have developed a framework, based on Web scraping techniques and big data tools, for the collection and analysis of YouTube video content popularity at scale. Our framework, called YOUStatAnalyzer, enables researchers to create their own datasets according to a number of different search criteria, and to analyse them to extract relevant features and significant statistics.
Conference Paper
Full-text available
As IP becomes the predominant choice for video delivery, storing the ever-increasing number of videos for delivery will become a challenge. In this paper we focus on how to take advantage of user viewing patterns to place content in provider networks so as to reduce their storage and network utilization. We first characterize user viewing behavior using data collected from a nationally deployed Video-on-Demand service. We show that users watch only a small portion of videos (not just short clips, but even full-length movies). We use this information and a highly flexible Mixed Integer Programming (MIP) formulation to solve the placement problem, in contrast to traditional popularity-based placement and caching strategies. We perform detailed simulations using real traces of user viewing sessions (including stream control operations such as Pause, Skip, etc.). Our results show that segment-based placement yields substantial savings both in storage and in network bandwidth. For example, compared to a simple caching scheme using full videos, our MIP-based placement using segments can achieve up to a 71% reduction in peak link bandwidth usage.
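A drastically reduced version of such a placement formulation is a 0/1 knapsack over segments: cache the segments that save the most backhaul bytes subject to a storage budget. The sketch below uses the pulp modeling library as an assumed tool and omits everything network-specific (link constraints, multiple caches, session traces) that the paper's MIP includes.

```python
import pulp

def place_segments(demand_bytes, sizes, storage):
    """0/1 placement MIP: maximize backhaul bytes saved by cached segments
    subject to a storage budget; a heavily simplified, single-cache
    stand-in for the paper's placement formulation."""
    segs = range(len(sizes))
    prob = pulp.LpProblem("segment_placement", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", segs, cat="Binary")
    prob += pulp.lpSum(demand_bytes[i] * x[i] for i in segs)   # objective
    prob += pulp.lpSum(sizes[i] * x[i] for i in segs) <= storage
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in segs if x[i].value() == 1]
```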
Article
Full-text available
In a 2002 paper, Che and co-authors proposed a simple approach for estimating the hit rates of a cache operating the least recently used (LRU) replacement policy. The approximation proves remarkably accurate and is applicable to quite general distributions of object popularity. This paper provides a mathematical explanation for the success of the approximation, notably in configurations where the intuitive arguments of Che, et al clearly do not apply. The approximation is particularly useful in evaluating the performance of current proposals for an information centric network where other approaches fail due to the very large populations of cacheable objects to be taken into account and to their complex popularity law, resulting from the mix of different content types and the filtering effect induced by the lower layers in a cache hierarchy.
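The approximation itself is easy to compute: under the independent reference model with per-object request rates λ_n, the occupancy constraint Σ_n (1 − e^(−λ_n T)) = C pins down the characteristic time T, and the hit probability of object n is then h_n = 1 − e^(−λ_n T). The sketch below solves for T by bisection; the Zipf parameters are illustrative.

```python
import math

def che_hit_rates(rates, cache_size, iters=100):
    """Che's approximation for LRU: find the characteristic time T solving
    sum_n (1 - exp(-rates[n] * T)) = cache_size, then return the per-object
    hit probabilities h_n = 1 - exp(-rates[n] * T)."""
    assert cache_size < len(rates)       # otherwise everything fits
    occupancy = lambda T: sum(1 - math.exp(-r * T) for r in rates)
    lo, hi = 0.0, 1.0
    while occupancy(hi) < cache_size:    # bracket the root
        hi *= 2
    for _ in range(iters):               # bisection on the monotone occupancy
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if occupancy(mid) < cache_size else (lo, mid)
    T = (lo + hi) / 2
    return [1 - math.exp(-r * T) for r in rates]

# Zipf(0.8) popularity over 10^4 objects, cache holding 10^3 of them
rates = [1.0 / k ** 0.8 for k in range(1, 10001)]
h = che_hit_rates(rates, cache_size=1000)
overall_hit_ratio = sum(r * hn for r, hn in zip(rates, h)) / sum(rates)
```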
Conference Paper
Caching video objects closer to the users in the delivery of on-demand video services in IPTV networks reduces the load on the network and improves the latency of video delivery. Partial caching of video objects is attractive due to the space constraints of the cache, and also because some parts of a video may be more popular than others. However, fixed segment-based caching of videos does not take into account the changing popularity of the segments or changes in the viewing patterns of the users. In this work, we propose a partial caching strategy that considers the changes in the popularity of segments over time and the access patterns of the users to compute the utility of the objects in the cache. We also propose to partition the cache to prevent popular objects (which may be accessed only at long intervals) from being evicted by unpopular ones that arrive at a higher short-term frequency. We measured the popularity distribution and the ageing of popularity from two online datasets and use the resulting parameters in simulations. Our simulation results show that the proposed caching scheme improves the byte hit ratio compared to LRU caching, both for static and dynamic object pools and under ageing of popularity.
Article
Caching is widely used to reduce network traffic and improve user experience. Traditionally caches store complete objects, but video files and the recent emergence of information-centric networking have highlighted a need for understanding how partial caching could be beneficial. In partial caching, objects are divided into chunks which are cached either independently or by exploiting common properties of chunks of the same file. In this paper, we identify why partial caching is beneficial, and propose a way to quantify the benefit. We develop an optimal n-chunking algorithm for an s-byte file and compare it with near-optimal homogeneous chunking. Our analytical results and comparison lead to the surprising conclusion that neither sophisticated partial caching algorithms nor high-complexity optimal chunking are needed in information-centric networks. Instead, a simple utility-based in-network caching algorithm and low-complexity homogeneous chunking are sufficient to achieve most of the benefits of partial caching.
Article
Content-centric networking (CCN) is considered promising for the efficient support of ever-increasing streaming multimedia services. Inter-chunk popularity-based caching is a key requirement for CCN multimedia services, because some chunks of a content file tend to be requested more frequently than others. For multimedia content, the early chunks often have higher popularity than later ones, as users may interrupt and abort playback before the content finishes. This paper presents a novel cache replication scheme, which places the more popular chunks ahead on the edge router and establishes cache pipelining on the relaying routers along the path to reduce user-perceived delay. Simulation results show that the proposed scheme incurs less delay and reduces the overall redundant network traffic while guaranteeing a higher cache hit ratio.
Article
An information-centric network should realize significant economies by exploiting a favourable memory-bandwidth tradeoff: it is cheaper to store copies of popular content close to users than to fetch them repeatedly over the Internet. We evaluate this tradeoff for some simple cache network structures under realistic assumptions concerning the size of the content catalogue and its popularity distribution. Derived cost formulas reveal the relative impact of various cost, traffic and capacity parameters, allowing an appraisal of possible future network architectures. Our results suggest it probably makes more sense to envisage the future Internet as a loosely interconnected set of local data centers than a network like today's with routers augmented by limited capacity content stores.
Article
Video objects are much larger in size than traditional web objects and tend not to be viewed in their entirety. Hence, caching them partially is a promising approach. Also, the projected growth in video traffic over wireless cellular networks calls for resource-efficient caching mechanisms at the wireless edge to lower traffic over the cellular backhaul and peering links and the associated costs. An evaluation of traditional partial caching solutions proposed in the literature shows that known solutions are not robust to video viewing patterns, increasing object pool size, changing object popularity, or limitations in the resources available for caching at the wireless network elements. In this paper, to overcome these limitations, we propose a novel approach that adopts a flexible segmentation policy and generalizes both LRU and LFU when applied to segmented accesses; in our simulations, it is shown to significantly lower wireless backhaul traffic (by around 20-30%, and in some cases even more).
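The common baseline underlying such segment-aware policies is an LRU cache whose replacement unit is a (video, segment) pair rather than a whole file, so heavily watched opening segments survive while rarely watched tails get evicted. A minimal sketch:

```python
from collections import OrderedDict

class SegmentLRU:
    """LRU cache whose units are (video, segment) pairs, so the heavily
    watched early segments of a video can stay cached while its rarely
    watched tail is evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()           # (video, segment) -> None

    def request(self, video, segment):
        """Return True on a hit; insert (and evict if full) on a miss."""
        key = (video, segment)
        if key in self.store:
            self.store.move_to_end(key)      # mark as most recently used
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict least recently used
        self.store[key] = None
        return False
```

A viewing session that stops after segment k touches segments 0..k, so over many sessions the hits concentrate on low segment indices, which is exactly the behavior whole-file LRU cannot exploit.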
Article
Proxy caching of large multimedia objects at the edge of the Internet has become increasingly important for reducing network latency. For a large media object, such as a two-hour video, treating the whole media file as a single object for caching is not appropriate. In this paper, we study three media segmentation approaches to proxy caching: fixed, pyramid, and skyscraper. Blocks of a media stream are grouped into segments for cache management. The cache admission and replacement policies attach different caching priorities to individual segments, taking into account the access frequency of the media object and the segment's distance from the start of the media. These caching policies give preferential treatment to the beginning segments, so that most user requests can be played back quickly from the proxy servers without delay. Event-driven simulations are conducted to evaluate the segmentation approaches and compare them with whole-media caching. The results show that: 1) compared with whole-media caching, segmentation-based caching is more effective not only in increasing the byte-hit ratio but also in lowering the fraction of requests that require a delayed start; 2) pyramid segmentation, where segment size increases exponentially, is the best segmentation approach; and 3) segmentation-based caching is especially advantageous when the cache size is limited, when the set of hot media objects changes over time, when the media file size is large, and when there are a large number of distinct media objects.
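A minimal sketch of the fixed and pyramid segmentations discussed above, assuming the common doubling instantiation of "segment size increases exponentially" (the exact growth factor, and the skyscraper variant, are not fixed by the abstract):

```python
def fixed_segments(n_blocks, seg_len):
    """Fixed segmentation: consecutive groups of seg_len blocks."""
    sizes = []
    while sum(sizes) < n_blocks:
        sizes.append(min(seg_len, n_blocks - sum(sizes)))
    return sizes

def pyramid_segments(n_blocks, base=1, factor=2):
    """Pyramid segmentation: segment sizes grow geometrically (here
    doubling), so the beginning of the video is split into small,
    individually cacheable segments while the tail is coarse-grained."""
    sizes, size = [], base
    while sum(sizes) < n_blocks:
        sizes.append(min(size, n_blocks - sum(sizes)))
        size *= factor
    return sizes

# a 100-block video: [1, 2, 4, 8, 16, 32, 37] under the doubling pyramid
print(pyramid_segments(100))
```

The fine granularity of the early segments is what lets the cache keep just the popular openings of many videos, matching the preferential treatment of beginning segments described in the abstract.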