Popularity Prediction Caching Using Hidden Markov Model for Vehicular Content Centric Networks

Authors:
Lin Yao, Yuqi Wang, QiuFen Xia, and Rui Xu‡§
International School of Information Science & Engineering, Dalian University of Technology, China
School of Software, Dalian University of Technology, China
Cyberspace Security Technology Laboratory of CETC, Chengdu Sichuan, China
§China Electronic Technology Cyber Security Co., Ltd, Chengdu Sichuan, China
Abstract—Vehicular Content Centric Network (VCCN) is pro-
posed to cope with mobility and intermittent connectivity issues
of vehicular ad hoc networks by enabling the Content Centric
Network (CCN) model in vehicular networks. The ubiquitous
in-network caching of VCCN allows nodes to cache
frequently accessed data items, improving the hit ratio of content
retrieval and reducing the data access delay. Furthermore, it can
significantly mitigate bandwidth pressure. Therefore, it is crucial
to cache more popular contents at various caching nodes. In
this paper, we propose a novel cache replacement scheme named
Popularity-based Content Caching (PopCC), which incorporates
the future popularity of contents into our decision making. We
adopt a Hidden Markov Model (HMM) to predict the content
popularity based on the inherent characteristics of the received
interests, request ratio, request frequency and content priority.
To evaluate the performance of our proposed scheme PopCC,
we compare it with some state-of-the-art schemes in terms of
cache hit, average access delay, average hop count and average
storage usage. Simulations demonstrate that the proposed scheme
achieves better performance than the compared schemes.
Keywords—Popularity Prediction, Hidden Markov Model, VCCN
I. INTRODUCTION
Though Vehicular Ad-hoc Network (VANET) can provide
entertainment and road safety services to drivers and passen-
gers, it has difficulties in maintaining end-to-end connections
due to its dynamic network topology and harsh propagation
environment [1]. Communication in VANET is therefore
challenging. IP-based host-centric protocols work poorly in
mobile environments, and functional patches such as Mobile IP
have been shown to increase complexity and degrade the
performance of VANET [2]. CCN is concerned with contents, not
the actual carriers of the contents. It can overcome the
inefficiencies of TCP/IP in handling node mobility, unreliable
wireless links, and resource-constrained devices [3]. In
particular, the network capacity becomes increasingly limited as
the number of vehicles grows. The in-network caching property of
CCN allows each vehicle to retrieve contents from nearby
providers.
Recently, Content Centric Network (CCN) has been advo-
cated to address the mentioned challenges and design the next
generation vehicular network called Vehicular Content Centric
Network (VCCN) [2]. A node requests a content by sending
an interest packet containing the content name. The data
packet will be returned if any other node has a replica of the
corresponding content. Each node forwarding the data packet
can decide whether to cache it based on its cache replacement
strategy. This intra-network caching allows multiple contents
to be distributed through the VCCN to improve the network
performance.
Different from most caching strategies in CCN, a data
packet in VCCN might be returned by a different path instead
of the reverse path of the corresponding interest due to
the vehicle mobility. Moreover, the store-and-forward routing
may bring longer delay on remote access of information if
the requested content is cached at a node farther from the
requester. Due to the high cost of deploying roadside units (RSUs) and their
relatively limited storage capacities, it is impossible to store
all contents on RSUs. In order to reduce communication cost
among nodes, minimize bandwidth usage, reduce access delay
and improve hit ratio, each vehicle or RSU has to potentially
cache contents which are frequently accessed by other nodes.
Obviously, the network performance can be improved if the
requested content can be retrieved from the most convenient
providers.
In this paper, we consider taking advantage of the idea
of cache eviction and present an efficient cache replacement
policy, Popularity-based Content Caching (PopCC). We incor-
porate the future content popularity into our caching decision,
which is predicted with Hidden Markov Model (HMM) by
learning the past traffic patterns. Contents are cached in the
descending order of content popularity. We aim to cache more
popular contents in each node to achieve a higher performance.
Given all the above considerations, this paper has the following
contributions:
• To the best of our knowledge, we are the first to adopt the
  idea of future content popularity to design and implement
  a cache replacement policy for VCCN. Using the forecast
  popularity, PopCC makes cache replacement decisions so that
  contents are cached in descending order of content
  popularity. When the buffer is full, contents with lower
  popularity are evicted automatically.
• We are the first to learn the recent access patterns of
  interests to obtain the content popularity. To predict it more
  accurately, PopCC integrates request ratio, request
  frequency and content priority to calculate the popularity
  of an object with a general function. To compute the
  content priority, we adopt the TF-IDF algorithm [4] and
  K-means [5]. TF-IDF is used to weight the content names
  to establish the vector space model, and K-means is then
  used to cluster the content names to obtain the content
  priority.
2019 20th IEEE International Conference on Mobile Data Management (MDM)
2375-0324/19/$31.00 ©2019 IEEE
DOI 10.1109/MDM.2019.00115
The remainder of this paper is organized as follows. In
Section II, we discuss the related work. Problem statement
is given in Section III. In Section IV, we present the details
of our approach. We evaluate the performance of PopCC in
Section V, and conclude the work in Section VI.
II. RELATED WORK
In this section, we review some literature works related to
our proposed method.
Cache Eviction Policies- Policies based on cache eviction
mainly aim to remove contents when new contents are received,
because the caching space is limited. Cache eviction policies can
be classified into recency-based ones, frequency-based ones,
recency/frequency-based ones, randomized ones and
function-based ones [6]. In [7], Least Recently Used (LRU) as the
typical recency-based strategy and Least Frequently Used
(LFU) as the typical frequency-based strategy were proposed,
which consider the most recent access time and the request count
respectively. TLRU [8] was proposed by combining recency and
frequency. Though recency and frequency information can be
obtained easily from past interests, cache replacement
policies based on these fixed factors lack flexibility and
struggle to meet the requirements of different workload
situations. Randomized approaches aim to reduce the algorithm
complexity by choosing a random content object for replacement.
In [9], Zhou et al. proposed a random replacement policy by
efficiently approximating the hit probability in linear time with
moderate space in a single round. The above algorithms are
easy to implement, but they ignore the fact that more popular
contents tend to stay longer in the cache, and they therefore
suffer from major performance degradation [10].
Function-based approaches can address the above challenge
by considering multiple factors to calculate the popularity of
an object with a general function. In [11], a cache replace-
ment policy, named as “Recent Usage Frequency (RUF)” was
proposed to deal with dynamic traffic patterns and alleviate
temporal traffic variations. The popularity is updated based
on the hit ratio and utilized time of each requested content.
PopCache was proposed to evaluate a content object’s Zipf
popularity and only contents with higher popularity were
cached in [12]. In [13], a value-based cache replacement
approach was proposed to calculate the popularity based on
access delay, frequency and aging time. In [14], the neural
network was first adopted to design a cache replacement policy
by integrating the cached content's longevity, access frequency,
the standard deviation of the content access frequency, the last
access to the content and hit ratio. [10] presented a cache
replacement method by incorporating the future content pop-
ularity into the caching decision. [15] proposed a centralized
control caching strategy based on popularity and betweenness
centrality.
VANET- A caching scheme for mobile ad hoc networks
(MANETs) was first proposed in [16], where the cross-layer
design is exploited to improve the caching performance with
the cooperative caching. The available cache replacement
mechanisms for ad hoc network are categorized into coor-
dinated and uncoordinated [17]. In uncoordinated schemes,
the replacement decision is made by individual nodes, while
neighbors cooperate and serve each other’s requests in co-
ordinated schemes. Mobility-aware replacement schemes [18]
were proposed to design the cache replacement by detecting
their movement patterns such as current location and moving
direction, which aims to cache more contents related to future
locations. The replacement policies in [19] predicted the future
regions based on users' movement and stored the
location-dependent data first.
To improve the network performance, cooperative
caching seeks to coordinate the contents of each client's cache
in VANET [20]. In [21], a cache replacement with content
popularity in VCCN was proposed to allow a cluster head
to make caching decisions cooperatively based on the past
popularity and the current cache hit of each content. The head
serves data to its members within the cluster. The LDCC
strategy in [22] applied a prediction model to approximate
client movement behavior, such as the future location, and
designed a cooperative cache replacement scheme. Most
other works focused on selecting caching nodes [23], [24].
III. PROBLEM STATEMENT
In VCCN, each node maintains three key data structures, listed
in Table I: the Content Store (CS), the Pending Interest Table
(PIT), and the Forwarding Information Base (FIB). The CS
caches the contents forwarded by the node, together with the
content name, the corresponding content category, content
priority and content popularity. The PIT records the interests
that have been forwarded but not yet answered. The FIB is a table
for routing incoming interest packets based on name prefixes.
When an interest is received, the node first checks its PIT. If
the interest already exists in the PIT, the node updates the
arrival time and discards the interest to avoid duplicate
storage; otherwise, it adds an entry containing the content name
and arrival time to its PIT. Then, it checks its own CS. If
the content is found in the CS, the corresponding data is
returned; otherwise, the node forwards the interest if the
request is received for the first time. When receiving a data
packet, each node decides whether to cache it based on its
cache replacement policy.
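The interest-handling procedure described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the class `Node`, the method names, the slash-separated name format, the longest-prefix lookup in the FIB, and the removal of the PIT entry on a cache hit are all hypothetical choices that the text does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical VCCN node holding the three tables of Table I."""
    cs: dict = field(default_factory=dict)   # content name -> cached content
    pit: dict = field(default_factory=dict)  # content name -> time of latest request
    fib: dict = field(default_factory=dict)  # name prefix -> list of forwarding node ids

    def longest_prefix_match(self, name):
        """Assumed FIB lookup: try the longest slash-separated prefix first."""
        parts = name.strip("/").split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:end])
            if prefix in self.fib:
                return self.fib[prefix]
        return []

    def on_interest(self, name, now):
        """Return ("drop", None), ("data", content) or ("forward", next_hops)."""
        if name in self.pit:
            # Duplicate interest: refresh the arrival time and discard it.
            self.pit[name] = now
            return ("drop", None)
        # First time this interest is seen: record it in the PIT.
        self.pit[name] = now
        if name in self.cs:
            # Cache hit: reply with the data. Clearing the PIT entry here is
            # an assumption; the text does not say when entries are removed.
            del self.pit[name]
            return ("data", self.cs[name])
        # Cache miss on a first-time request: forward the interest.
        return ("forward", self.longest_prefix_match(name))
```

For example, a node that caches `/mall/coupon/1` and has a FIB entry for `/mall` answers the first name from its CS, forwards a first request for `/mall/map`, and drops a repeated request for `/mall/map` while refreshing its PIT time.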
In this paper, our goal is to allow contents to be queried and
delivered efficiently among different nodes. As different ve-
hicles have different moving trajectories and receive different
interests, they may not cache the same contents. For example,
a vehicle node that usually drives near a shopping mall
may help others forward more interests on some promotional
information of goods. How to determine the popular contents
so that they can be accessed with the best efficiency? How
to exploit the recent access pattern to evaluate the content
popularity accurately and design the cache replacement policy?
TABLE I
DATA STRUCTURES MAINTAINED AT EACH NODE

(a) Content Store
Notation      Description
name          Content name
content       Cached content
category      Content name category
priority      Content priority
popularity    Content popularity

(b) Pending Interest Table
Notation      Description
name          Content name
time          Time of the latest request

(c) Forwarding Information Base
Notation      Description
name          Content name
uids          List of forwarding nodes
IV. POPULARITY-BASED CONTENT CACHING (POPCC)
This section presents the Popularity-based Content
Caching (PopCC) strategy. We first give an overview of
PopCC, and then present our algorithm in detail.
A. Overview
As vehicles move between infrastructures, each node, whether
an RSU or a vehicle, receives different interests from
different users. Each interest is usually strongly related
to the data consumer's hobbies, current location, etc. For
example, one vehicle near a shopping mall may receive a large
number of requests for coupons for certain goods during
some period. After this vehicle drives away, similar requests
will drop sharply. This example shows that the popular contents
at each node change over time. In order to react quickly to
changes in popular contents and provide better service for users,
we predict the content popularity by analyzing the inherent
statistics of incoming interests through the following parameters.
The frequently used notations are listed in Table II.
Definition 1 (Request Ratio): $\gamma(c_i)$ is defined as the ratio
between the number of interests for the same content $c_i$ and
that of all the requests received in a slice,
\[
\gamma(c_i) = \frac{n(c_i)}{\sum_{c_k \in C} n(c_k)}, \tag{1}
\]
where $n(c_i)$ is the number of requests for $c_i$.
TABLE II
FREQUENT NOTATIONS
Notation          Description
$C$               The set of content
$c_i$             The $i$-th content in $C$
$\gamma_k(c_i)$   The request ratio of $c_i$ in the $k$-th slice
$f_k(c_i)$        The request frequency of $c_i$ in the $k$-th slice
$t(c_i)$          The last time of receiving $c_i$
$\rho_k(c_i)$     The content priority of $c_i$ in the $k$-th slice
$P(c_i)$          The popularity of $c_i$
Definition 2 (Request Frequency): $f(c_i)$ is defined as the
average frequency of interests for $c_i$ in the past $m$ slices,
\[
\begin{aligned}
f_j(c_i) &= \frac{1}{n(c_i)} \sum_{k=1}^{n(c_i)} \frac{1}{t(c_i)_{k+1} - t(c_i)_k}, \\
f(c_i) &= \frac{1}{m} \sum_{j=1}^{m} f_j(c_i),
\end{aligned}
\tag{2}
\]
where $t(c_i)_{k+1}$ and $t(c_i)_k$ are the times of two successive
requests for $c_i$, and $f_j(c_i)$ is the frequency in the $j$-th slice.
Definition 3 (Content Priority): $\rho(c_i)$ is defined as the
priority of $c_i$ during a slice,
\[
\rho(c_i) = \frac{N_i}{N_T}, \tag{3}
\]
where $N_i$ is the number of interests in the category which $c_i$
belongs to, and $N_T$ is the total number of interests in all the
categories in the current slice.
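As a worked illustration of Definitions 1 to 3, the three factors can be computed from a log of received interests roughly as follows. This is a sketch under assumptions that are not in the paper: interests are represented as hypothetical `(name, time)` tuples, the per-slice frequency averages over the gaps actually observed between successive requests (slices with fewer than two requests yield 0), and `category_of` is an assumed mapping from content name to its category.

```python
from collections import Counter


def request_ratio(requests):
    """Eq. (1): share of each content among all requests in one slice."""
    counts = Counter(name for name, _ in requests)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def slice_frequency(times):
    """Per-slice frequency: mean of 1 / (gap between successive requests).

    Assumption: the mean is taken over the observed positive gaps.
    """
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    gaps = [g for g in gaps if g > 0]
    if not gaps:
        return 0.0
    return sum(1.0 / g for g in gaps) / len(gaps)


def request_frequency(per_slice_times):
    """Eq. (2): average of the per-slice frequencies over the past m slices."""
    m = len(per_slice_times)
    if m == 0:
        return 0.0
    return sum(slice_frequency(times) for times in per_slice_times) / m


def content_priority(requests, category_of):
    """Eq. (3): N_i / N_T, the share of interests in each content's category."""
    per_category = Counter(category_of[name] for name, _ in requests)
    total = sum(per_category.values())
    names = {name for name, _ in requests}
    return {name: per_category[category_of[name]] / total for name in names}
```

With the requests `[("a", 0.0), ("a", 2.0), ("b", 3.0), ("a", 4.0)]` in one slice, content `a` has a request ratio of 0.75 and, if `a` and `b` fall in different categories, a priority of 0.75 as well.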
Among the three factors, the request ratio and request frequency
of a content are obtained from the statistics of incoming
interests, while the content priority is calculated during the
training stage. PopCC works as follows:
1) During the training stage of the HMM, each node collects
statistics on the number of requests and the content
name contained in each interest. Once the training is over,
all the statistics are reported to the nearest RSUs. To ensure
the accuracy and reliability of training, all statistics need
to be synchronized between RSUs.
2) Each RSU is responsible for determining the content
priority defined in Eq. (3). Each RSU first adopts TF-IDF
to establish the vector space model of content
names. All the content names in the model are then clustered
with the K-means method. Based on the number of elements
in each cluster, the priority of the contents in each cluster
is obtained.
3) During the prediction stage, each node calculates the
request ratio and request frequency of each content
according to Eq. (1) and Eq. (2).
4) Each node builds an HMM to calculate the future
popularity of each content based on the request ratio,
request frequency and content priority. Then, contents
are cached in descending order of content popularity.
If the cache space is full, the contents with lower future
popularity are evicted.
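The replacement rule in step 4 can be illustrated with the following Python sketch. It is a minimal example under assumptions, not the authors' implementation: the class name `PopularityCache`, the size unit, and the rule that a new content is rejected when caching it would require evicting something at least as popular are hypothetical choices; the predicted popularity is simply passed in as a number.

```python
class PopularityCache:
    """Hypothetical Content Store that evicts the least popular contents first."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.entries = {}  # content name -> (size in MB, predicted popularity)

    def used_mb(self):
        return sum(size for size, _ in self.entries.values())

    def ranked(self):
        """Cached names in descending order of predicted popularity."""
        return sorted(self.entries, key=lambda n: self.entries[n][1], reverse=True)

    def offer(self, name, size_mb, popularity):
        """Try to cache a content; return True if it is in the cache afterwards."""
        if name in self.entries:
            # Already cached: only refresh the predicted popularity.
            self.entries[name] = (self.entries[name][0], popularity)
            return True
        if size_mb > self.capacity_mb:
            return False
        free = self.capacity_mb - self.used_mb()
        to_evict = []
        # Walk candidate victims from the least popular upwards.
        for victim in reversed(self.ranked()):
            if free >= size_mb:
                break
            if self.entries[victim][1] >= popularity:
                # Assumed rule: never evict a content that is at least as popular.
                return False
            to_evict.append(victim)
            free += self.entries[victim][0]
        for victim in to_evict:
            del self.entries[victim]
        self.entries[name] = (size_mb, popularity)
        return True
```

In this sketch a full cache holding contents with popularity 0.9 and 0.2 rejects a newcomer with popularity 0.1, but admits one with popularity 0.5 by evicting the 0.2 entry.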
B. Calculation of Content Priority
This subsection briefly describes how the content priority is
obtained, which involves the following two steps:
1) Getting VSM with TF-IDF: To analyze the collected
content names in $C$, we adopt an algebraic model, the Vector
Space Model (VSM), to transform the text document containing
names into vectors of identifiers,
\[
V(d) = ((t_1, W_1), (t_2, W_2), (t_3, W_3), \cdots, (t_n, W_n)),
\]
where $t_i$ represents the feature vector of $c_i$, $W_i$ represents the
weight of $c_i$ in the text document $d$, and $V(d)$ is the vector
representation of the text document $d$.
Then, TF-IDF is adopted to calculate $W_i$. TF-IDF, short for
term frequency-inverse document frequency, is
a numerical statistic that is intended to reflect how important a
word is to a document in a collection or corpus. It is often used
as a weighting factor in information retrieval, text
mining, and user modeling. TF measures how frequently a
term occurs in a document, and IDF measures how important
the term is across the collection. The TF-IDF value increases
proportionally to the number of times a word appears in the
document,
\[
W_i = TF_i \times IDF_i,
\]
where $IDF_i$ is calculated as $\lg\left(\frac{N}{n_i + 1}\right)$ with $N$ the total number
of names and $n_i$ the number of texts including $c_i$ in $d$.
For multiple documents, the text vector needs to be normalized
by the weight function,
\[
W_{ik} = \frac{TF_{ik} \times \lg\left(\frac{N}{n_i + 1}\right)}{\sqrt{\sum_{k=1}^{M} \left(TF_{ik} \times \lg\left(\frac{N}{n_i + 1}\right)\right)^2}},
\]
where $W_{ik}$ represents the weight of the $k$-th feature vector
in $d_i$ and $W_i = (W_{i1}, W_{i2}, \cdots, W_{iM})$ with $M$ representing
the number of different content names.
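The normalized weighting above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: content names are assumed to be slash-separated (for example `/mall/coupon/shoes`), each name component is treated as a term, TF is taken as the relative term count within a name, and the function name `tfidf_vectors` is hypothetical. The IDF uses the base-10 logarithm of N / (n + 1) as in the formula above, so a term that appears in almost every name gets a weight near zero.

```python
import math
from collections import Counter


def tfidf_vectors(names):
    """Return (vocabulary, vectors): one L2-normalised TF-IDF vector per name."""
    docs = [name.strip("/").split("/") for name in names]
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Number of names in which each term occurs at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        raw = [
            (tf[term] / len(doc)) * math.log10(n_docs / (doc_freq[term] + 1))
            for term in vocab
        ]
        norm = math.sqrt(sum(w * w for w in raw))
        # Divide by the Euclidean norm; an all-zero vector stays all-zero.
        vectors.append([w / norm if norm else 0.0 for w in raw])
    return vocab, vectors
```

Each returned vector has unit length (or is all zeros), so the similarity between two names reduces to a dot product of their vectors.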
2) Clustering names with K-means: We use K-means to cluster
the content names in the VSM, so that content names with higher
similarity are grouped into the same cluster,
\[
sim(c_i, c_j) = \frac{\sum_{k=1}^{n} (W_{ik} \times W_{jk})}{\sqrt{\left(\sum_{k=1}^{n} (W_{ik})^2\right) \times \left(\sum_{k=1}^{n} (W_{jk})^2\right)}}, \tag{4}
\]
where $sim(c_i, c_j)$ is the similarity between $c_i$ and $c_j$.
We follow the steps below to complete the clustering:
Step 1: We first determine the number of categories ($K$)
as the number of contents whose weight is larger than the
threshold $\omega$, and set the corresponding content names as the
initial cluster centroids.
Step 2: For any other name except the centroids, we calculate
the similarity between it and each cluster centroid by Eq. (4),
and assign it to the cluster with the highest similarity.
Step 3: We recalculate the mean of the feature vectors in each
cluster and set it as the new cluster centroid.
Step 4: Step 2 and Step 3 are repeated until there is no
feature vector left.
Step 5: After clustering is over, the content priority is
calculated according to Eq. (3).
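The clustering steps above can be sketched as follows. This is a simplified illustration under assumptions, not the authors' implementation: the indices of the initial centroids (`seeds`) are passed in directly instead of being derived from the threshold, the loop stops when assignments no longer change or after a hypothetical `max_rounds` limit, and `interest_counts` is an assumed per-name count of received interests used to evaluate Eq. (3) per cluster.

```python
import math


def cosine(u, v):
    """Eq. (4): cosine similarity between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_names(vectors, seeds, max_rounds=20):
    """Return one cluster label per vector, starting from the seed centroids."""
    centroids = [list(vectors[s]) for s in seeds]
    labels = None
    for _ in range(max_rounds):
        # Step 2: assign every vector to its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assumed stopping rule: assignments are stable
        labels = new_labels
        # Step 3: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_priority(labels, interest_counts):
    """Eq. (3) per name: interests in the name's cluster over all interests."""
    total = sum(interest_counts)
    per_cluster = {}
    for lab, n in zip(labels, interest_counts):
        per_cluster[lab] = per_cluster.get(lab, 0) + n
    return [per_cluster[lab] / total for lab in labels]
```

For four vectors that form two obvious groups, seeding with one vector from each group yields the labels `[0, 0, 1, 1]`, and a cluster that received 6 of 8 interests gives each of its names a priority of 0.75.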
C. Popularity Prediction
Each node adopts HMM [25] to predict the content pop-
ularity. All contents are cached in the descending order of
the predicted popularity. If the buffer is full, the contents
with lower future popularity will be evicted. We first introduce
the following terms:
Definition 4 (Content Popularity): $P(c_i)$ is defined as the
predicted popularity of $c_i$ according to recent access patterns.
Definition 5 (Factor Sequence): A factor sequence
$\{F_1, F_2, \cdots, F_k, \cdots\}$ is made up of a series of triples, where
$F_k$ is a triple $\langle \gamma_k(c_i), f_k(c_i), \rho_k(c_i) \rangle$ in the $k$-th slice.
By applying the Forward-Backward algorithm to train the HMM,
we can predict $P(c_i)$ by exploiting the recent
access patterns to generate the factor sequence. The HMM is
denoted by $\lambda = (\pi, A, B)$, and the associated variables are
introduced as follows:
$\pi = \{\pi_i\}$, the initial hidden state probabilities, where
$\pi_i = P(S_i)$.
$A = \{a_{ij}\}$, the transition probabilities between the hidden
states $S_i$ and $S_j$, where $a_{ij} = P(S_j \mid S_i)$.
$B = \{b_j(k)\}$, the probabilities of the observable states $O_k$
in the hidden state $S_j$, where $b_j(k) = P(O_k \mid S_j)$.
$\zeta_t(i, j) = P(S_i, S_j \mid O_1 O_2 \ldots O_T)$, the probability of
transiting from the hidden state $S_i$ at time $t$ to the hidden state
$S_j$ at time $t+1$, given the model $\lambda$ and the observation
sequence.
$\eta_t(i) = P(S_i \mid O_1 O_2 \ldots O_T)$, the probability of the hidden
state $S_i$ at time $t$, given the model $\lambda$ and the observation
sequence.
We can obtain the trained model by calculating the associated
variables in Eq. (5),
\[
\begin{aligned}
\pi_i &= \eta_t(i), \\
a_{ij} &= \frac{\sum_{t=1}^{T} \zeta_t(i, j)}{\sum_{t=1}^{T} \eta_t(i)}, \\
b_j(k) &= \frac{\sum_{t=1,\,O_k}^{T} \eta_t(j)}{\sum_{t=1}^{T} \eta_t(j)},
\end{aligned}
\tag{5}
\]
where $\pi_i$ represents the expected number of times that the
content stays at the hidden state $S_i$ at $t$, $a_{ij}$ represents the
probability of transition from $S_i$ to $S_j$, and $b_j(k)$ represents
the probability of observing the state $O_k$ when the hidden state
is $S_j$. The numerator of $b_j(k)$ sums only over the time steps at
which $O_k$ is observed.
In PopCC, $F_k$ represents the observed state $O_k$ in the $k$-th
slice, and the predicted popularity in the $(k+1)$-th slice is the
hidden state $S_{k+1}$.
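To make the prediction step concrete, the following Python sketch shows how a trained model $\lambda = (\pi, A, B)$ could be used to estimate the hidden state of the next slice. It is an illustration under assumptions and not the authors' implementation: the factor triples are assumed to have already been discretized into integer observation symbols (the paper does not describe this step), hidden states are assumed to be a small set of popularity levels, and the function names are hypothetical. The sketch runs the standard forward recursion with per-step normalization and then applies one transition step.

```python
def normalise(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    if total == 0:
        raise ValueError("observation sequence has zero probability under the model")
    return [v / total for v in values]


def predict_next_state(pi, A, B, observations):
    """Return P(hidden state in the next slice | observations so far).

    pi[i]   : initial probability of hidden state i
    A[i][j] : transition probability from state i to state j
    B[j][k] : probability of observing symbol k in state j
    observations : list of symbol indices, one per past slice (assumed discretised)
    """
    n = len(pi)
    # Filtered distribution after the first observation.
    alpha = normalise([pi[i] * B[i][observations[0]] for i in range(n)])
    # Forward recursion over the remaining observations.
    for obs in observations[1:]:
        alpha = normalise([
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs]
            for j in range(n)
        ])
    # One more transition step gives the distribution for the next slice.
    return [sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]
```

A node could then take the most probable next state, or the expected popularity level under this distribution, as the predicted popularity used for ranking cached contents; which of the two PopCC uses is not stated in the text.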
V. PERFORMANCE EVALUATION
In order to evaluate the performance of our caching scheme,
we conduct simulations with the Opportunistic Network
Environment (ONE) simulator [26]. The simulated network has
300 nodes, including 30 RSUs and 270 users (50 people, 100
buses, and 120 taxis), distributed over the map. All nodes move
following the Working-Day-Movement Model in ONE with a
daily routine. A person drives a car with probability $p_v$;
otherwise, he or she takes a bus or taxi to reach different
destinations. Buses follow the Route-Based Movement model and
taxis follow the Random Waypoint Model in ONE. All nodes have
the same caching buffer size, movement speed range, transmission
range and data rate. We list the important simulation parameters
in Table III.
TABLE III
SIMULATION PARAMETERS
Parameter Description          Value
Caching Buffer                 1000 MB
Request Interval               [50 minutes, 100 minutes]
Message TTL                    10 minutes
Simulation Time                15 weeks
RSU Number                     30
RSU Transmission Range         500 m
RSU Transmission Speed         10 Mbps
Vehicle Speed                  [7 m/s, 10 m/s]
Vehicle Transmission Range     50 m
Vehicle Transmission Speed     2 Mbps
$p_v$                          0.3
A. Performance Metrics
We compare the proposed PopCC scheme with three reference
schemes: TLRU [8], CRCP [21] and PopCaching [10]. PopCC
incorporates the future content popularity into the caching
decision; the popularity is predicted with a Hidden Markov Model
(HMM) by learning the past traffic patterns, including content
priority, request ratio and request frequency. We therefore
choose TLRU, which considers the past request time patterns and
combines recency and frequency to make replacement decisions.
We also choose the predictive caching schemes CRCP and
PopCaching as references. CRCP allows the cluster head to combine
the past popularity with the current cache hit of each content
so as to make decisions cooperatively. In PopCaching, a
method combining region partition and access pattern update
is designed to predict the content popularity. The following
metrics are used to compare these schemes:
• Success Ratio: The ratio of queries that successfully
  obtain the requested contents.
• Average Access Delay: The average delay of obtaining
  responses in successful queries.
• Average Hop Count: The average hop count between the
  requester and provider in successful queries.
• Average Storage Usage: The average storage usage of all
  the caching nodes in the network.
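The four metrics can be computed from simulation logs along the following lines. This is a sketch with assumed record formats that are not taken from the paper or from the ONE simulator: each query is a hypothetical dict with the keys `success`, `delay_h` and `hops`, and storage usage is given as one number per caching node.

```python
def summarise_queries(queries):
    """Success ratio plus average delay and hop count over successful queries."""
    successful = [q for q in queries if q["success"]]
    n_ok = len(successful)
    return {
        "success_ratio": n_ok / len(queries) if queries else 0.0,
        # Delay and hop count are only defined when at least one query succeeded.
        "avg_delay_h": sum(q["delay_h"] for q in successful) / n_ok if n_ok else None,
        "avg_hops": sum(q["hops"] for q in successful) / n_ok if n_ok else None,
    }


def average_storage_usage(used_mb_per_node):
    """Mean storage usage over all caching nodes."""
    if not used_mb_per_node:
        return 0.0
    return sum(used_mb_per_node) / len(used_mb_per_node)
```

Note that failed queries lower the success ratio but do not enter the delay and hop-count averages, which matches the definitions above.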
In the remaining sections, we provide a number of studies
to evaluate the performance of our proposed scheme.
B. Effect of Content Size
In Fig. 1(a) and Fig. 1(b), we compare the success ratio and
average access delay under different content sizes. As the content
size increases, each node caches fewer contents, causing a lower
success ratio and a higher access delay. PopCC has the best
performance, with the highest success ratio (up to 10.62%
gain) and the lowest access delay (up to 19.33% drop), because
it takes request patterns into account to predict
the content popularity. Among the four schemes, TLRU only
considers the request recency and frequency without predicting
the content popularity, so it has the lowest success
ratio and the largest delay. CRCP shows better performance than
PopCaching on success ratio and average access delay when
the content size is larger than 40 MB, because the amount
of cached content decreases with the content size. The content
size has relatively less impact on these two mechanisms,
PopCaching and CRCP, because PopCaching makes the replacement
decision based on a context learning mechanism and the cluster
heads in CRCP are responsible for caching contents. Fig. 1(c)
shows the effect of content size on average storage usage. PopCC
needs slightly more storage space, because it stores more
interests to compute the three factors, while only the request
recency and frequency are necessary in TLRU. Though CRCP
and PopCaching predict the future popularity, they do not
require statistics on past interests.
[Fig. 1. Effect of Content Size. Panels: (a) Success Ratio, (b) Average Access Delay (hours), (c) Average Storage Usage (MB). Horizontal axis: Content Size (MB), 10 to 50. Schemes: PopCC, TLRU, CRCP, PopCaching.]
[Fig. 2. Effect of Caching Node Ratio. Panels: (a) Success Ratio, (b) Average Access Delay (hours), (c) Average Hop Counts. Horizontal axis: Caching Node Ratio, 0.2 to 1. Schemes: PopCC, TLRU, CRCP, PopCaching.]
C. Effect of Caching Node Ratio
In Fig. 2, we compare the success ratio, average access delay,
and average hop count among the four schemes under
different caching node ratios. In Fig. 2(a), the success ratio of
the four schemes improves with the caching node ratio. As
discussed before, PopCC performs the best overall by combining
the request patterns to predict the content popularity. TLRU
does not predict the content popularity, which leads to the lowest
success ratio. Fig. 2(b) and Fig. 2(c) show that the average
access delay and average hop count decrease with the ratio of
caching nodes, and that PopCC performs best.
[Fig. 3. Effect of Content Number. Panels: (a) Success Ratio, (b) Average Access Delay (hours). Horizontal axis: Content Number, 100 to 100,000. Schemes: PopCC, TLRU, CRCP, PopCaching.]
D. Effect of Content Number
Fig. 3 shows the success ratio and average access delay under
different content numbers. It can be seen that the success ratio
decreases with the number of contents, because a larger number of
content types makes cache hits less likely. PopCC
always achieves the best performance among the four schemes
in terms of the success ratio and the average access delay,
even when the number of contents increases 100-fold.
VI. CONCLUSION
In this paper, we propose PopCC, a cache replacement
policy for VCCN based on predicted content popularity.
Combining the three content factors of frequency, ratio, and
priority, PopCC adopts an HMM to predict the content popularity.
We have presented extensive simulations of the PopCC
scheme and compared its performance with several
competing schemes. The results show that PopCC achieves
better performance in success ratio and average access delay.
In our future work, we plan to optimize our algorithm to
minimize the energy consumption and to design an incentive
policy that encourages each vehicle to actively cache content.
ACKNOWLEDGMENT
This research is sponsored in part by National Key Research
and Development Project of China (2017YFC0704100) and
the National Natural Science Foundation of China (contract/grant
numbers: 61772113, 61872053, and 61802047).
REFERENCES
[1] S. H. Bouk, S. H. Ahmed, and D. Kim, “Vehicular content centric
network (vccn): a survey and research challenges, in ACM Symposium
on Applied Computing, 2015, pp. 695–700.
[2] M. Amadeo, C. Campolo, and A. Molinaro, “Information-centric net-
working for connected vehicles: a survey and future perspectives, IEEE
Communications Magazine, vol. 54, no. 2, pp. 98–104, 2016.
[3] M. Amadeo, C. Campolo, A. Molinaro, and G. Ruggeri, “Content-centric
wireless networking: A survey,” Computer Networks, vol. 72, no. 7, pp.
1–13, 2014.
[4] B. Trstenjak, S. Mikac, and D. Donko, “Knn with tf-idf based framework
for text categorization, Procedia Engineering, vol. 69, pp. 1356–1364,
2014.
[5] B. Aaron, D. E. Tamir, N. D. Rishe, and A. Kandel, “Dynamic incremen-
tal k-means clustering,” in Computational Science and Computational
Intelligence (CSCI), 2014 International Conference on, vol. 1. IEEE,
2014, pp. 308–313.
[6] H. Jin, D. Xu, C. Zhao, and D. Liang, “Information-centric mobile
caching network frameworks and caching optimization: a survey,”
EURASIP Journal on Wireless Communications and Networking, vol.
2017, no. 1, pp. 33–64, Feb 2017.
[7] D. Lee, J. Choi, J. H. Kim, S. H. Noh, L. M. Sang, Y. Cho, and S. K.
Chong, “On the existence of a spectrum of policies that subsumes the
least recently used (lru) and least frequently used (lfu) policies,” in ACM
SIGMETRICS International Conference on Measurement and Modeling
of Computer Systems, 1999, pp. 134–143.
[8] M. Bilal and S.-G. Kang, “Time aware least recent used (tlru) cache
management policy in icn,” in Advanced Communication Technology
(ICACT), 2014 16th International Conference on. IEEE, 2014, pp.
528–532.
[9] S. Zhou, “An efficient simulation algorithm for cache of random
replacement policy, in Network and Parallel Computing. Springer
Berlin Heidelberg, 2010, pp. 144–154.
[10] S. Li, J. Xu, M. V. D. Schaar, and W. Li, “Popularity-driven content
caching,” in INFOCOM 2016 - the IEEE International Conference on
Computer Communications, IEEE, 2016, pp. 1–9.
[11] S. J. Kang, S. W. Lee, and Y. B. Ko, A recent popularity based
dynamic cache management for content centric networking,” in Fourth
International Conference on Ubiquitous and Future Networks, 2012, pp.
219–224.
[12] K. Suksomboon, S. Yamada, S. Tarnoi, Y. Ji, M. Koibuchi, K. Fukuda,
S. Abe, N. Motonori, M. Aoki, and S. Urushidani, “Popcache: Cache
more or less based on content popularity for information-centric net-
working,” in Local Computer Networks, 2014, pp. 236–243.
[13] F. M. Al-Turjman, A. E. Al-Fagih, and H. S. Hassanein, “A value-based
cache replacement approach for information-centric networks,” in Local
Computer Networks Workshops. IEEE, 2014, pp. 874–881.
[14] A. Karami and M. Guerrero-Zapata, “An anfis-based cache replacement
method for mitigating cache pollution attacks in named data network-
ing,” Computer Networks, vol. 80, pp. 51–65, 2015.
[15] Y. Cui, M. Zhao, and M. Wu, A centralized control caching strategy
based on popularity and betweenness centrality in ccn,” in International
Symposium on Wireless Communication Systems, 2016, pp. 286–291.
[16] M. K. Denko and J. Tian, “Cross-layer design for cooperative caching
in mobile ad hoc networks,” in Consumer Communications and NET-
WORKING Conference, 2008. Ccnc, 2008, pp. 375–380.
[17] P. T. Joy and K. P. Jacob, “Cache replacement policies for cooperative
caching in mobile ad hoc networks,” International Journal of Computer
Science Issues, vol. 9, no. 3, pp. 2012–2017, 2012.
[18] R. Wang, X. Peng, J. Zhang, and K. B. Letaief, “Mobility-aware caching
for content-centric wireless networks: modeling and methodology,
IEEE Communications Magazine, vol. 54, no. 8, pp. 77–83, 2016.
[19] A. K. Gupta and U. Shanker, “Spmc-crp:a cache replacement policy for
location dependent data in mobile environment, Procedia Computer
Science, vol. 125, pp. 632–639, 2018.
[20] M. Chaqfeh, A. Lakas, and I. Jawhar, A survey on data dissemination
in vehicular ad hoc networks,” Vehicular Communications,vol. 1, no. 4,
pp. 214–225, 2014.
[21] S. Chootong and J. Thaenthong, “Cache replacement mechanism with
content popularity for vehicular content-centric networks (vccn),” in
International Joint Conference on Computer Science and Software
Engineering, 2017, pp. 1–6.
[22] E. Chan, W. Li, and S. Lu, “Movement prediction based cooperative
caching for location dependent information service in mobile ad hoc
networks,” Journal of Supercomputing, vol. 59, no. 1, pp. 297–322,
2012.
[23] S. E. El Khawaga, A. I. Saleh, and H. A. Ali, An administrative cluster-
based cooperative caching (accc) strategy for mobile ad hoc networks,
Journal of Network and Computer Applications, vol. 69, pp. 54–76,
2016.
[24] L. Yao, A. Chen, J. Deng, J. Wang, and G. Wu, A cooperative
caching scheme based on mobility prediction in vehicular content centric
networks,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99,
pp. 1–1, 2017.
[25] W. Khreich, E. Granger, A. Miri, and R. Sabourin, “On the memory
complexity of the forward–backward algorithm,” Pattern Recognition
Letters, vol. 31, no. 2, pp. 91–99, 2010.
[26] A. Keranen, J. Ott, and T. Karkkainen, “The one simulator for dtn
protocol evaluation,” in International Conference on Simulation Tools
and Techniques, 2009, p. 55.