
Popularity Prediction Caching Using Hidden Markov Model for Vehicular Content Centric Networks

June 2019


DOI:10.1109/MDM.2019.00115

Conference: 2019 20th IEEE International Conference on Mobile Data Management (MDM)

Authors:

Lin Yao

Nanjing Agricultural University

Qiufen Xia

Dalian University of Technology


Popularity Prediction Caching Using Hidden Markov Model for Vehicular Content Centric Networks

Lin Yao∗, Yuqi Wang†, QiuFen Xia∗, and Rui Xu‡§

∗International School of Information Science & Engineering, Dalian University of Technology, China

†School of Software, Dalian University of Technology, China

‡Cyberspace Security Technology Laboratory of CETC, Chengdu Sichuan, China

§China Electronic Technology Cyber Security Co., Ltd, Chengdu Sichuan, China

Abstract—Vehicular Content Centric Network (VCCN) is proposed to cope with the mobility and intermittent connectivity issues of vehicular ad hoc networks by enabling the Content Centric Network (CCN) model in vehicular networks. The ubiquitous in-network caching of VCCN allows nodes to cache frequently accessed data items, improving the hit ratio of content retrieval and reducing the data access delay. Furthermore, it can significantly mitigate bandwidth pressure. Therefore, it is crucial to cache more popular contents at the various caching nodes. In this paper, we propose a novel cache replacement scheme named Popularity-based Content Caching (PopCC), which incorporates the future popularity of contents into the caching decision. We adopt a Hidden Markov Model (HMM) to predict the content popularity based on inherent characteristics of the received interests: request ratio, request frequency and content priority. To evaluate the performance of PopCC, we compare it with several state-of-the-art schemes in terms of cache hit ratio, average access delay, average hop count and average storage usage. Simulations demonstrate that the proposed scheme achieves better performance.

Keywords—Popularity Prediction, Hidden Markov Model, VCCN

I. INTRODUCTION

Although Vehicular Ad-hoc Networks (VANETs) can provide entertainment and road safety services to drivers and passengers, they have difficulty maintaining end-to-end connections because of their dynamic network topology and harsh propagation environment [1]. Communication in VANETs is therefore challenging. IP-based host-centric protocols work awkwardly in mobile environments, and functional patches such as Mobile IP have been shown to increase complexity and degrade VANET performance [2]. CCN is concerned with contents rather than with the hosts that carry them. It can overcome the inefficiencies of TCP/IP in handling node mobility, unreliable wireless links, and resource-constrained devices [3]. In particular, network capacity becomes increasingly limited as the number of vehicles grows. The in-network caching property of CCN allows each vehicle to retrieve contents from nearby providers.

Recently, Content Centric Network (CCN) has been advocated to address these challenges and to design the next-generation vehicular network, called Vehicular Content Centric Network (VCCN) [2]. A node requests a content by sending an interest packet containing the content name. A data packet is returned if any other node holds a replica of the corresponding content. Each node that forwards the data packet can decide whether to cache it based on its cache replacement strategy. This in-network caching allows multiple contents to be distributed throughout the VCCN, improving network performance.

Unlike in conventional CCN, a data packet in VCCN might be returned along a path different from the reverse path of the corresponding interest because of vehicle mobility. Moreover, store-and-forward routing may introduce longer delays for remote information access if the requested content is cached at a node far from the requester. Because roadside units (RSUs) are costly to deploy and have relatively limited storage capacity, it is impossible to store all contents on RSUs. To reduce the communication cost among nodes, minimize bandwidth usage, reduce access delay and improve the hit ratio, each vehicle or RSU should cache the contents that are frequently accessed by other nodes. The network performance improves if the requested content can be retrieved from the most convenient providers.

In this paper, we take advantage of the idea of cache eviction and present an efficient cache replacement policy, Popularity-based Content Caching (PopCC). We incorporate the future content popularity into the caching decision; this popularity is predicted with a Hidden Markov Model (HMM) by learning past traffic patterns. Contents are cached in descending order of content popularity. We aim to cache more popular contents at each node to achieve higher performance. Given the above considerations, this paper makes the following contributions:

• To the best of our knowledge, we are the first to adopt the idea of future content popularity to design and implement a cache replacement policy for VCCN. Using the forecast popularity, PopCC makes cache replacement decisions so that contents are cached in descending order of content popularity. When the buffer is full, contents with lower popularity are evicted automatically.

• We are the first to learn the recent access patterns of interests to obtain the content popularity. To predict it more accurately, PopCC integrates request ratio, request frequency and content priority to calculate the popularity of an object with a general function. To compute the content priority, we adopt the TF-IDF algorithm [4] and K-means [5]. TF-IDF is used to weight the content names and establish the vector space model, and K-means is then used to cluster the content names to obtain the content priority.

The remainder of this paper is organized as follows. In Section II, we discuss the related work. Problem statement is given in Section III. In Section IV, we present the details of our approach. We evaluate the performance of PopCC in Section V, and conclude the work in Section VI.

II. RELATED WORK

In this section, we review the literature related to our proposed method.

Cache Eviction Policies- Policies based on cache eviction mainly aim to remove contents when new contents are received, because the caching space is limited. Cache eviction policies can be classified into recency-based, frequency-based, recency/frequency-based, randomized and function-based ones [6]. In [7], Least Recently Used (LRU), the typical recency-based strategy, and Least Frequently Used (LFU), the typical frequency-based strategy, were proposed; they consider the most recent access time and the request count, respectively. TLRU [8] was proposed by combining recency and frequency. Although recency and frequency information can be obtained easily from past interests, cache replacement policies based on these fixed factors lack flexibility and have difficulty meeting the requirements of different workloads. Randomized approaches aim to reduce algorithmic complexity by choosing a random content object for replacement. In [9], Zhou et al. proposed a random replacement policy that efficiently approximates the hit probability in linear time with moderate space in a single round. The above algorithms are easy to implement, but they ignore the fact that more popular contents tend to stay longer in the cache, which causes them to suffer major performance degradation [10].

Function-based approaches address the above challenge by considering multiple factors to calculate the popularity of an object with a general function. In [11], a cache replacement policy named “Recent Usage Frequency (RUF)” was proposed to deal with dynamic traffic patterns and alleviate temporal traffic variations. The popularity is updated based on the hit ratio and utilized time of each requested content. PopCache [12] evaluates a content object’s Zipf popularity and caches only contents with higher popularity. In [13], a value-based cache replacement approach was proposed to calculate the popularity based on access delay, frequency and aging time. In [14], a neural network was first adopted to design a cache replacement policy by integrating the longevity of cached contents, the access frequency, the standard deviation of the access frequency, the last access to the content and the hit ratio. [10] presented a cache replacement method that incorporates the future content popularity into the caching decision. [15] proposed a centralized control caching strategy based on popularity and betweenness centrality.

VANET- A caching scheme for mobile ad hoc networks (MANETs) was first proposed in [16], where a cross-layer design is exploited to improve caching performance through cooperative caching. The available cache replacement mechanisms for ad hoc networks are categorized into coordinated and uncoordinated ones [17]. In uncoordinated schemes, the replacement decision is made by individual nodes, while in coordinated schemes neighbors cooperate and serve each other’s requests. Mobility-aware replacement schemes [18] design the cache replacement by detecting the movement patterns of nodes, such as current location and moving direction, with the aim of caching more contents related to future locations. The replacement policies in [19] predict the future regions based on users’ movement and store the location-dependent data first.

To improve network performance, cooperative caching seeks to coordinate the contents of each client’s cache in VANET [20]. In [21], a cache replacement scheme with content popularity in VCCN was proposed, which allows a cluster head to make caching decisions cooperatively based on the past popularity and the current cache hit of each content. The head serves data to the members of its cluster. The LDCC strategy in [22] applies a prediction model to approximate client movement behavior, such as the future location, and designs a cooperative cache replacement policy accordingly. Most other works, in contrast, focus on selecting caching nodes [23][24].

III. PROBLEM STATEMENT

In VCCN, each node maintains three key data structures, listed in Table I: the CS (Content Store), the PIT (Pending Interest Table), and the FIB (Forwarding Information Base). The CS caches the contents forwarded by the node; each entry includes the content name, the corresponding content category, the content priority and the content popularity. The PIT records the interests that have been forwarded but not yet answered. The FIB is a table for routing incoming interest packets based on name prefixes. When a node receives an interest whose name already exists in its PIT, it updates the arrival time and discards the interest to avoid duplicate storage; otherwise, it adds an item containing the content name and arrival time to its PIT. Then it checks its own CS. If the content is found in the CS, the corresponding data is returned; otherwise, the node forwards the interest if the request is received for the first time. When receiving a data packet, each node decides whether to cache it based on the cache replacement policy.
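The interest-handling steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `Node` class, the `on_interest` and `next_hops` names, the "/"-separated name format and the longest-prefix lookup in the FIB are all assumptions made for the example. Removing the PIT entry when the CS answers the interest is likewise an assumption, since the text does not say when entries are cleared.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # The three structures of Table I, reduced to plain dictionaries.
    cs: dict = field(default_factory=dict)   # content name -> cached content
    pit: dict = field(default_factory=dict)  # content name -> time of latest request
    fib: dict = field(default_factory=dict)  # name prefix -> list of forwarding node ids

    def next_hops(self, name):
        """Longest-prefix match on '/'-separated names (an assumed name format)."""
        parts = name.strip("/").split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:end])
            if prefix in self.fib:
                return self.fib[prefix]
        return []

    def on_interest(self, name, now):
        """Return (action, payload) for an incoming interest."""
        if name in self.pit:
            # Already pending: refresh the arrival time and drop the duplicate.
            self.pit[name] = now
            return ("discard", None)
        self.pit[name] = now
        if name in self.cs:
            # Assumption: an interest answered locally is no longer pending.
            del self.pit[name]
            return ("reply", self.cs[name])
        # First time this request is seen and no local copy: forward it.
        return ("forward", self.next_hops(name))
```

A node holding `/mall/coupon` replies to an interest for that name, forwards a first interest for `/mall/map` to the hops registered under the `/mall` prefix, and discards a repeated interest for `/mall/map` while refreshing its arrival time.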


In this paper, our goal is to allow contents to be queried and delivered efficiently among different nodes. Because vehicles have different trajectories and receive different interests, they may not cache the same contents. For example, a vehicle that usually drives near a shopping mall may help others forward more interests for promotional information about goods. This raises two questions: how can the popular contents be determined so that they can be accessed most efficiently, and how can the recent access pattern be exploited to evaluate content popularity accurately and to design the cache replacement policy?

TABLE I
DATA STRUCTURES MAINTAINED AT EACH NODE

(a) Content Store

Notation      Description
name          Content name
content       Cached content
category      Content name category
priority      Content priority
popularity    Content popularity

(b) Pending Interest Table

Notation      Description
name          Content name
time          Time of the latest request

(c) Forwarding Information Base

Notation      Description
name          Content name
uids          List of forwarding nodes

IV. POPULARITY-BASED CONTENT CACHING (POPCC)

In this section, we propose the Popularity-based Content Caching (PopCC) strategy. We first give an overview of PopCC and then present our algorithm in detail.

A. Overview

As vehicles move between infrastructures, each node, whether an RSU or a vehicle, receives different interests from different users. Each interest is usually strongly related to the data consumer’s hobbies, current location, and so on. For example, a vehicle near a shopping mall may receive a large number of requests for coupons for certain goods during some period. After this vehicle drives away, similar requests drop sharply. This example shows that the popular contents at each node are always changing. To react quickly to changes in the popular contents and provide better service to users, we predict the content popularity by analyzing the inherent statistics of incoming interests through the following parameters. The frequently used notations are listed in Table II.

Definition 1 (Request Ratio): $\gamma(c_i)$ is defined as the ratio between the number of interests for the same content $c_i$ and the number of all requests received in a slice,

\[ \gamma(c_i) = \frac{n(c_i)}{\sum_{c_k \in C} n(c_k)}, \tag{1} \]

where $n(c_i)$ is the number of requests for $c_i$.

TABLE II
FREQUENT NOTATIONS

Notation           Description
$C$                The set of contents
$c_i$              The $i$-th content in $C$
$\gamma_k(c_i)$    The request ratio of $c_i$ in the $k$-th slice
$f_k(c_i)$         The request frequency of $c_i$ in the $k$-th slice
$t(c_i)$           The last time of receiving $c_i$
$\rho_k(c_i)$      The content priority of $c_i$ in the $k$-th slice
$P_{c_i}$          The popularity of $c_i$

Definition 2 (Request Frequency): $f(c_i)$ is defined as the average frequency of interests for $c_i$ in the past $m$ slices,

\[ f_j(c_i) = \frac{1}{n(c_i)} \sum_{k=1}^{n(c_i)} \left( t(c_i)_{k+1} - t(c_i)_k \right), \qquad f(c_i) = \frac{1}{m} \sum_{j=1}^{m} f_j(c_i), \tag{2} \]

where $t(c_i)_{k+1}$ and $t(c_i)_k$ are the times of two consecutive requests for $c_i$, and $f_j(c_i)$ is the frequency in the $j$-th slice.

Definition 3 (Content Priority): $\rho(c_i)$ is defined as the priority of $c_i$ during a slice,

\[ \rho(c_i) = \frac{N_i}{N_T}, \tag{3} \]

where $N_i$ is the number of interests in the category to which $c_i$ belongs, and $N_T$ is the total number of interests in all the categories in the current slice.
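As a rough illustration of Definitions 1 to 3, the three factors could be computed from a log of interests as follows. The function names and the input layout (a list of requested names, a list of arrival times, and a name-to-category mapping) are assumptions introduced for this sketch. For Eq. (2), the text does not state how the term $t(c_i)_{n(c_i)+1}$ is obtained, so the sketch sums only the gaps actually observed in the slice and divides by $n(c_i)$ as written in the equation.

```python
from collections import Counter


def request_ratio(names):
    """Eq. (1): share of the slice's interests that ask for each content."""
    counts = Counter(names)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def slice_frequency(times):
    """Per-slice term of Eq. (2) for one content, from its sorted arrival times.

    Assumption: only the gaps between observed consecutive requests are summed.
    """
    if len(times) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return sum(gaps) / len(times)


def request_frequency(per_slice_values):
    """Averaging term of Eq. (2): mean of the per-slice values over m slices."""
    return sum(per_slice_values) / len(per_slice_values)


def content_priority(names, category_of):
    """Eq. (3): interests in the content's category over all interests in the slice."""
    per_category = Counter(category_of[name] for name in names)
    total = sum(per_category.values())
    return {name: per_category[category_of[name]] / total for name in set(names)}
```

For a slice with the interests `["a", "a", "b", "c"]`, where `a` and `b` belong to one category and `c` to another, the request ratio of `a` is 0.5 and its content priority is 0.75.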

Among the three factors, the request ratio and request frequency of a content are obtained from the statistics of incoming interests, while the content priority is calculated during the training stage. PopCC works as follows:

1) During the training stage of the HMM, each node collects statistics on the number of requests and on the content name contained in each interest. Once training is over, all statistics are reported to the nearest RSUs. To ensure the accuracy and reliability of training, all statistics need to be synchronized between RSUs.

2) Each RSU is responsible for determining the content priority defined in Eq. (3). Each RSU first adopts TF-IDF to establish the vector space model of content names. All the content names in the model are then clustered with the K-means method. Based on the number of elements in each cluster, the priority of the contents in each cluster is obtained.

3) During the prediction stage, each node calculates the request ratio and request frequency of each content according to Eq. (1) and Eq. (2).

4) Each node establishes an HMM to calculate the future popularity of each content based on the request ratio, request frequency and content priority. Contents are then cached in descending order of content popularity. If the cache space is full, the contents with lower future popularity are evicted.
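The last step, caching in descending order of popularity and evicting the least popular contents when space runs out, might look like the following minimal sketch. The `PopularityCache` class and its method names are hypothetical, and the popularity values are taken as given (in PopCC they would come from the HMM prediction). A newly offered content competes on equal terms with cached ones, so it is itself dropped if it is the least popular; the paper does not spell out this detail, so it is an assumption.

```python
class PopularityCache:
    """Keeps the most popular contents that fit into a fixed buffer."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.items = {}  # content name -> (predicted popularity, size in MB)

    def used_mb(self):
        return sum(size for _, size in self.items.values())

    def offer(self, name, popularity, size_mb):
        """Insert a content, then evict the least popular ones until it all fits.

        Returns the list of evicted names (possibly the new content itself).
        """
        self.items[name] = (popularity, size_mb)
        evicted = []
        while self.used_mb() > self.capacity_mb:
            victim = min(self.items, key=lambda n: self.items[n][0])
            del self.items[victim]
            evicted.append(victim)
        return evicted

    def ranked(self):
        """Cached names in descending order of predicted popularity."""
        return sorted(self.items, key=lambda n: self.items[n][0], reverse=True)
```

With a 100 MB buffer holding two 50 MB contents of popularity 0.9 and 0.2, offering a third 50 MB content of popularity 0.5 evicts the one rated 0.2.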


B. Calculation of Content Priority

This subsection briefly describes how the content priority is obtained, which involves the following two procedures:

1) Getting VSM with TF-IDF: To analyze the collected content names in $C$, we adopt an algebraic model, the Vector Space Model (VSM), to transform the text document containing the names into vectors of identifiers,

\[ V(d) = \big( (t_1, W_1), (t_2, W_2), (t_3, W_3), \cdots, (t_n, W_n) \big), \]

where $t_i$ represents the feature vector of $c_i$, $W_i$ represents the weight of $c_i$ in the text document $d$, and $V(d)$ is the vector representation of the text document $d$.

Then, TF-IDF is adopted to calculate $W_i$. TF-IDF, short for term frequency-inverse document frequency, is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval, text mining, and user modeling. TF measures how frequently a term occurs in a document, and IDF measures how important it is. The TF-IDF value increases proportionally to the number of times a word appears in the document.

\[ W_i = TF_i \times IDF_i, \]

where $IDF_i$ is calculated as $\lg\left(\frac{N}{n_i + 1}\right)$, with $N$ the total number of names and $n_i$ the number of texts in $d$ that include $c_i$.

For multiple documents, the text vector needs to be normalized by the weight function

\[ W_{ik} = \frac{TF_{ik} \times \lg\left(\frac{N}{n_i + 1}\right)}{\sqrt{\sum_{k=1}^{M} \left( TF_{ik} \times \lg\left(\frac{N}{n_i + 1}\right) \right)^2}}, \]

where $W_{ik}$ represents the weight of the $k$-th feature vector in $d_i$, and $W_i = (W_{i1}, W_{i2}, \cdots, W_{iM})$, with $M$ representing the number of different content names.
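A small sketch may make the weighting concrete. It assumes, purely for illustration, that content names are hierarchical strings such as `/mall/coupon/shoes`, that each name is treated as one document, and that its "/"-separated components are the terms; none of these choices is stated in the paper. The IDF uses the base-10 logarithm of $N/(n+1)$ as in the formula above, and each vector is normalized to unit length.

```python
import math
from collections import Counter


def tfidf_vectors(names):
    """Return (vocabulary, unit-length TF-IDF vectors), one vector per content name."""
    # Assumed tokenisation: split hierarchical names on '/'.
    docs = [name.strip("/").split("/") for name in names]
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Number of names in which each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log10(n_docs / (doc_freq[term] + 1)) for term in vocab}

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        raw = [tf[term] / len(doc) * idf[term] for term in vocab]
        norm = math.sqrt(sum(w * w for w in raw))
        vectors.append([w / norm if norm else 0.0 for w in raw])
    return vocab, vectors
```

Note that with this IDF a term occurring in every name receives a negative weight, because $N/(n+1) < 1$; a real deployment would have to decide how to treat such terms.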

2) Clustering names with K-means: We use K-means to cluster the content names in the VSM, so that content names with higher similarity are grouped into the same cluster,

\[ sim(c_i, c_j) = \frac{\sum_{k=1}^{n} (W_{ik} \times W_{jk})}{\sqrt{\left( \sum_{k=1}^{n} (W_{ik})^2 \right) \times \left( \sum_{k=1}^{n} (W_{jk})^2 \right)}}, \tag{4} \]

where $sim(c_i, c_j)$ is the similarity between $c_i$ and $c_j$.

We follow the steps below to complete the clustering:

Step 1: We first determine the number of categories ($K$) as the number of contents whose weight is bigger than the threshold $\omega$, and set the corresponding content names as the initial cluster centroids.

Step 2: For every name other than the centroids, we calculate its similarity to each cluster centroid by Eq. (4) and assign it to the cluster with the highest similarity.

Step 3: We recalculate the mean of the feature vectors in each cluster and set it as the new cluster centroid.

Step 4: Steps 2 and 3 are repeated until there is no feature vector left.

Step 5: After clustering is over, the content priority is calculated according to Eq. (3).
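Steps 1 to 4 can be sketched as a similarity-based K-means loop. The function names below are hypothetical. The initial centroids are passed in as indices (`seed_indices`), standing in for the names selected by the threshold $\omega$ in Step 1, and the loop stops when the assignment no longer changes or after `max_rounds` iterations; this stopping rule is an assumption, since Step 4 does not specify one precisely.

```python
import math


def cosine(u, v):
    """Eq. (4): cosine similarity between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_names(vectors, seed_indices, max_rounds=20):
    """Assign every vector to the most similar centroid; return the cluster labels."""
    centroids = [list(vectors[i]) for i in seed_indices]  # Step 1 (seeds given)
    labels = None
    for _ in range(max_rounds):
        # Step 2: pick the centroid with the highest similarity for each name.
        new_labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignment is stable
        labels = new_labels
        # Step 3: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Once the labels are known, counting the interests that fall into each cluster gives the $N_i$ and $N_T$ needed for Eq. (3).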

C. Popularity Prediction

Each node adopts an HMM [25] to predict the content popularity. All contents are cached in descending order of the predicted popularity. If the buffer is full, the contents with lower future popularity are evicted. We first introduce several terms:

Definition 4 (Content Popularity): $P(c_i)$ is defined as the predicted popularity of $c_i$ according to recent access patterns.

Definition 5 (Factor Sequence): A factor sequence $\{F_1, F_2, \cdots, F_k, \cdots\}$ is made up of a series of triples, where $F_k$ is the triple $\langle \gamma_k(c_i), f_k(c_i), \rho_k(c_i) \rangle$ in the $k$-th slice.

By applying the Forward-Backward algorithm to train the HMM, we can predict $P(c_i)$ by exploiting the recent access patterns to generate the factor sequence. The HMM is denoted by $\lambda = (\pi, A, B)$, and the related variables are introduced as follows:

$\pi = \{\pi_i\}$, the initial hidden state probabilities, where $\pi_i = P(S_i)$.

$A = \{a_{ij}\}$, the transition probabilities between the hidden states $S_i$ and $S_j$, where $a_{ij} = P(S_j \mid S_i)$.

$B = \{b_j(k)\}$, the probabilities of the observable states $O_k$ in the hidden state $S_j$, where $b_j(k) = P(O_k \mid S_j)$.

$\zeta_t(i, j) = P(S_i, S_j \mid O_1 O_2 \ldots O_T, \lambda)$, the probability of transiting from the hidden state $S_i$ at time $t$ to the hidden state $S_j$ at time $t+1$, given the model $\lambda$ and the observation sequence.

$\eta_t(i) = P(S_i \mid O_1 O_2 \ldots O_T, \lambda)$, the probability of the hidden state $S_i$ at time $t$, given the model $\lambda$ and the observation sequence.

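We can obtain the accurate model by calculating the related variables in Eq. (5),

\[ \pi_i = \eta_t(i), \qquad a_{ij} = \frac{\sum_{t=1}^{T} \zeta_t(i, j)}{\sum_{t=1}^{T} \eta_t(i)}, \qquad b_j(k) = \frac{\sum_{t=1,\,O_k}^{T} \eta_t(j)}{\sum_{t=1}^{T} \eta_t(j)}, \tag{5} \]

where $\pi_i$ represents the expected number of times that the content stays at the hidden state $S_i$ at $t$, $a_{ij}$ represents the probability of transition from $S_i$ to $S_j$, and $b_j(k)$ represents the probability of observing the state $O_k$ when the hidden state is $S_j$. The numerator of $b_j(k)$ sums only over the time steps at which $O_k$ is observed.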

In PopCC, $F_k$ represents the observed state $O_k$ in the $k$-th slice, and the predicted popularity in the $(k+1)$-th slice is the hidden state $S_{k+1}$.
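To illustrate the training and prediction steps, the sketch below implements one Forward-Backward (Baum-Welch) re-estimation pass for a discrete HMM in plain Python. Several points are assumptions rather than details from the paper: the factor triples $F_k$ are taken to be quantised into a finite set of observation symbols (integers `0..M-1`), the hidden states stand for popularity levels, the initial distribution is re-estimated from the first time step and the transition denominator runs over $t = 1, \ldots, T-1$ as in the textbook form of the algorithm, and no numerical scaling is applied, so the code is only suitable for short sequences. All function names are hypothetical.

```python
def forward_backward(pi, A, B, obs):
    """Return (eta, zeta, likelihood) for a discrete HMM and an observation list."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    likelihood = sum(alpha[T - 1])
    # eta[t][i]: probability of being in state i at time t given all observations.
    eta = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]
    # zeta[t][i][j]: probability of moving from state i at t to state j at t + 1.
    zeta = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
              for j in range(n)] for i in range(n)] for t in range(T - 1)]
    return eta, zeta, likelihood


def reestimate(pi, A, B, obs):
    """One re-estimation pass in the spirit of Eq. (5)."""
    n, m = len(pi), len(B[0])
    eta, zeta, _ = forward_backward(pi, A, B, obs)
    new_pi = eta[0][:]
    new_A = [[sum(z[i][j] for z in zeta) / sum(e[i] for e in eta[:-1])
              for j in range(n)] for i in range(n)]
    new_B = [[sum(e[j] for e, o in zip(eta, obs) if o == k) / sum(e[j] for e in eta)
              for k in range(m)] for j in range(n)]
    return new_pi, new_A, new_B


def predict_next_state(pi, A, B, obs):
    """Most likely hidden state in the next slice and its full distribution."""
    eta, _, _ = forward_backward(pi, A, B, obs)
    n = len(pi)
    dist = [sum(eta[-1][i] * A[i][j] for i in range(n)) for j in range(n)]
    return max(range(n), key=lambda j: dist[j]), dist
```

In use, `reestimate` would be called repeatedly on the training sequence until the likelihood stops improving, after which `predict_next_state` gives the popularity level expected in the next slice.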

V. PERFORMANCE EVALUATION

To evaluate the performance of our caching scheme, we conduct simulations with the Opportunistic Network Environment (ONE) simulator [26]. In our design, there are 300 nodes, including 30 RSUs and 270 users (50 people, 100 buses, and 120 taxis), distributed over the map. All nodes move according to the Working Day Movement model in ONE with a daily routine. A person drives a car with probability $p_v$; otherwise, he or she must take a bus or taxi to reach different destinations. Buses follow the Route-Based Movement model and taxis follow the Random Waypoint model in ONE. All nodes have the same caching buffer size, movement speed range, transmission range and data rate. The important simulation parameters are listed in Table III.

TABLE III
SIMULATION PARAMETERS

Parameter                     Value
Caching Buffer                1000 MB
Request Interval              [50 minutes, 100 minutes]
Message TTL                   10 minutes
Simulation Time               15 weeks
RSU Number                    30
RSU Transmission Range        500 m
RSU Transmission Speed        10 Mbps
Vehicle Speed                 [7 m/s, 10 m/s]
Vehicle Transmission Range    50 m
Vehicle Transmission Speed    2 Mbps
$p_v$                         0.3

A. Performance Metrics

Our main comparisons are made between the proposed PopCC scheme and the reference schemes TLRU [8], CRCP [21] and PopCaching [10]. In PopCC, we incorporate the future content popularity into the caching decision; it is predicted with a Hidden Markov Model (HMM) by learning past traffic patterns, including content priority, request ratio and request frequency. We therefore choose TLRU, which considers past request-time patterns and combines recency and frequency to make replacement decisions. We also choose the predictive caching schemes CRCP and PopCaching as references. CRCP allows the cluster head to combine the past popularity with the current cache hit of each content so as to make decisions cooperatively. PopCaching predicts the content popularity by combining region partition with access pattern updates. The following metrics are used to compare these schemes:

• Success Ratio: The ratio of queries that successfully obtain the requested contents.

• Average Access Delay: The average delay of obtaining responses in successful queries.

• Average Hop Count: The average hop count between the requester and provider in successful queries.

• Average Storage Usage: The average storage usage of all the caching nodes in the network.
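The four metrics above can be computed from a simple query log, as in the following sketch. The `Query` record and the function names are hypothetical and are not part of the ONE simulator; a failed query is represented by `None` in both fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    delay_hours: Optional[float]  # None when the query never obtained the content
    hops: Optional[int]           # hop count between requester and provider


def summarize(queries):
    """Success ratio over all queries; delay and hop count over successful ones."""
    ok = [q for q in queries if q.delay_hours is not None]
    return {
        "success_ratio": len(ok) / len(queries) if queries else 0.0,
        "avg_delay_hours": sum(q.delay_hours for q in ok) / len(ok) if ok else None,
        "avg_hops": sum(q.hops for q in ok) / len(ok) if ok else None,
    }


def average_storage_usage(used_mb_per_node):
    """Mean buffer occupancy (MB) over all caching nodes."""
    return sum(used_mb_per_node) / len(used_mb_per_node)
```

Averaging delay and hop count over successful queries only follows the metric definitions; failed queries affect the success ratio alone.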

In the remainder of this section, we present a number of studies to evaluate the performance of our proposed scheme.

B. Effect of Content Size

In Fig. 1(a) and Fig. 1(b), we compare the success ratio and average access delay under different content sizes. As the content size increases, each node caches fewer contents, which lowers the success ratio and raises the access delay. PopCC performs best, with the highest success ratio (up to 10.62% gain) and the lowest access delay (up to 19.33% drop), because it takes request patterns into account to predict the content popularity. Among the four schemes, TLRU considers only request recency and frequency without predicting the content popularity, which gives it the lowest success ratio and the largest delay. CRCP performs better than PopCaching in success ratio and average access delay when the content size is larger than 40 MB, because the amount of cached content decreases with the content size. The impact on these two mechanisms, PopCaching and CRCP, is relatively small, because PopCaching bases its replacement policy on a context learning mechanism and the cluster heads in CRCP are responsible for caching contents. Fig. 1(c) shows the effect of content size on average storage usage. PopCC needs slightly more storage space because it stores more interests to compute the three factors, whereas TLRU needs only the request recency and frequency. Although CRCP and PopCaching predict the future popularity, they do not require statistics on past interests.

Fig. 1. Effect of Content Size: (a) Success Ratio, (b) Average Access Delay (hours), and (c) Storage Usage (MB) versus Content Size (MB) for PopCC, TLRU, CRCP and PopCaching.

Fig. 2. Effect of Caching Node Ratio: (a) Success Ratio, (b) Average Access Delay (hours), and (c) Average Hop Counts versus Caching Node Ratio for PopCC, TLRU, CRCP and PopCaching.

C. Effect of Caching Node Ratio

In Fig. 2, we compare the success ratio, average access delay, and average hop count of the four schemes under different caching node ratios. In Fig. 2(a), the success ratio of all four schemes improves with the caching node ratio. As discussed before, PopCC performs best overall because it combines the request patterns to predict the content popularity. TLRU does not predict the content popularity, which results in the lowest success ratio. Fig. 2(b) and Fig. 2(c) show that the average access delay and average hop count decrease as the ratio of caching nodes increases, and that PopCC performs best.

Fig. 3. Effect of Content Number: (a) Success Ratio and (b) Average Access Delay (hours) versus Content Number (100 to 100,000) for PopCC, TLRU, CRCP and PopCaching.

D. Effect of Content Number

Fig. 3 shows the success ratio and average access delay under different numbers of contents. The success ratio decreases with the number of contents, because a larger number of content types makes cache hits more difficult. PopCC always achieves the best performance among the four schemes in terms of success ratio and average access delay, even as the number of contents grows from 100 to 100,000.

VI. CONCLUSION

In this paper, we propose PopCC, a cache replacement policy for VCCN based on predicted popularity. Combining three content factors, namely request frequency, request ratio, and content priority, PopCC adopts an HMM to predict the content popularity. We have presented extensive simulations of the PopCC scheme and compared its performance with that of several competing schemes. The results show that PopCC achieves better performance in success ratio and average access delay. In future work, we plan to optimize our algorithm to minimize energy consumption and to design incentive policies that encourage each vehicle to actively cache content.

ACKNOWLEDGMENT

This research is sponsored in part by the National Key Research and Development Project of China (2017YFC0704100) and the National Natural Science Foundation of China (contract/grant numbers: 61772113, 61872053, and 61802047).

REFERENCES

[1] S. H. Bouk, S. H. Ahmed, and D. Kim, “Vehicular content centric

network (vccn): a survey and research challenges,” in ACM Symposium

on Applied Computing, 2015, pp. 695–700.

[2] M. Amadeo, C. Campolo, and A. Molinaro, “Information-centric net-

working for connected vehicles: a survey and future perspectives,” IEEE

Communications Magazine, vol. 54, no. 2, pp. 98–104, 2016.

[3] M. Amadeo, C. Campolo, A. Molinaro, and G. Ruggeri, “Content-centric

wireless networking: A survey,” Computer Networks, vol. 72, no. 7, pp.

1–13, 2014.

[4] B. Trstenjak, S. Mikac, and D. Donko, “Knn with tf-idf based framework

for text categorization,” Procedia Engineering, vol. 69, pp. 1356–1364,

2014.

[5] B. Aaron, D. E. Tamir, N. D. Rishe, and A. Kandel, “Dynamic incremen-

tal k-means clustering,” in Computational Science and Computational

Intelligence (CSCI), 2014 International Conference on, vol. 1. IEEE,

2014, pp. 308–313.

[6] H. Jin, D. Xu, C. Zhao, and D. Liang, “Information-centric mobile

caching network frameworks and caching optimization: a survey,”

EURASIP Journal on Wireless Communications and Networking, vol.

2017, no. 1, pp. 33–64, Feb 2017.

[7] D. Lee, J. Choi, J. H. Kim, S. H. Noh, L. M. Sang, Y. Cho, and S. K.

Chong, “On the existence of a spectrum of policies that subsumes the

least recently used (lru) and least frequently used (lfu) policies,” in ACM

SIGMETRICS International Conference on Measurement and Modeling

of Computer Systems, 1999, pp. 134–143.

[8] M. Bilal and S.-G. Kang, “Time aware least recent used (tlru) cache

management policy in icn,” in Advanced Communication Technology

(ICACT), 2014 16th International Conference on. IEEE, 2014, pp.

528–532.

[9] S. Zhou, “An efﬁcient simulation algorithm for cache of random

replacement policy,” in Network and Parallel Computing. Springer

Berlin Heidelberg, 2010, pp. 144–154.

[10] S. Li, J. Xu, M. V. D. Schaar, and W. Li, “Popularity-driven content

caching,” in INFOCOM 2016 - the IEEE International Conference on

Computer Communications, IEEE, 2016, pp. 1–9.

[11] S. J. Kang, S. W. Lee, and Y. B. Ko, “A recent popularity based

dynamic cache management for content centric networking,” in Fourth

International Conference on Ubiquitous and Future Networks, 2012, pp.

219–224.

[12] K. Suksomboon, S. Yamada, S. Tarnoi, Y. Ji, M. Koibuchi, K. Fukuda,

S. Abe, N. Motonori, M. Aoki, and S. Urushidani, “Popcache: Cache

more or less based on content popularity for information-centric net-

working,” in Local Computer Networks, 2014, pp. 236–243.

[13] F. M. Al-Turjman, A. E. Al-Fagih, and H. S. Hassanein, “A value-based

cache replacement approach for information-centric networks,” in Local

Computer Networks Workshops. IEEE, 2014, pp. 874–881.

[14] A. Karami and M. Guerrero-Zapata, “An anﬁs-based cache replacement

method for mitigating cache pollution attacks in named data network-

ing,” Computer Networks, vol. 80, pp. 51–65, 2015.

[15] Y. Cui, M. Zhao, and M. Wu, “A centralized control caching strategy

based on popularity and betweenness centrality in ccn,” in International

Symposium on Wireless Communication Systems, 2016, pp. 286–291.

[16] M. K. Denko and J. Tian, “Cross-layer design for cooperative caching

in mobile ad hoc networks,” in Consumer Communications and NET-

WORKING Conference, 2008. Ccnc, 2008, pp. 375–380.

[17] P. T. Joy and K. P. Jacob, “Cache replacement policies for cooperative

caching in mobile ad hoc networks,” International Journal of Computer

Science Issues, vol. 9, no. 3, pp. 2012–2017, 2012.

[18] R. Wang, X. Peng, J. Zhang, and K. B. Letaief, “Mobility-aware caching

for content-centric wireless networks: modeling and methodology,”

IEEE Communications Magazine, vol. 54, no. 8, pp. 77–83, 2016.

[19] A. K. Gupta and U. Shanker, “Spmc-crp:a cache replacement policy for

location dependent data in mobile environment,” Procedia Computer

Science, vol. 125, pp. 632–639, 2018.

[20] M. Chaqfeh, A. Lakas, and I. Jawhar, “A survey on data dissemination

in vehicular ad hoc networks,” Vehicular Communications, vol. 1, no. 4,

pp. 214–225, 2014.

[21] S. Chootong and J. Thaenthong, “Cache replacement mechanism with

content popularity for vehicular content-centric networks (VCCN),” in

International Joint Conference on Computer Science and Software

Engineering, 2017, pp. 1–6.

[22] E. Chan, W. Li, and S. Lu, “Movement prediction based cooperative

caching for location dependent information service in mobile ad hoc

networks,” Journal of Supercomputing, vol. 59, no. 1, pp. 297–322,

2012.

[23] S. E. El Khawaga, A. I. Saleh, and H. A. Ali, “An administrative cluster-based

cooperative caching (ACCC) strategy for mobile ad hoc networks,”

Journal of Network and Computer Applications, vol. 69, pp. 54–76,

2016.

[24] L. Yao, A. Chen, J. Deng, J. Wang, and G. Wu, “A cooperative

caching scheme based on mobility prediction in vehicular content centric

networks,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99,

pp. 1–1, 2017.

[25] W. Khreich, E. Granger, A. Miri, and R. Sabourin, “On the memory

complexity of the forward–backward algorithm,” Pattern Recognition

Letters, vol. 31, no. 2, pp. 91–99, 2010.

[26] A. Keranen, J. Ott, and T. Karkkainen, “The ONE simulator for DTN

protocol evaluation,” in International Conference on Simulation Tools

and Techniques, 2009, p. 55.


