Adaptive Reinforcement Routing in Software
Defined Vehicular Networks
Ankur Nahar
Computer Science and Engineering Department
Indian Institute of Technology
Jodhpur, India
nahar.1@iitj.ac.in
Debasis Das
Computer Science and Engineering Department
Indian Institute of Technology
Jodhpur, India
debasis@iitj.ac.in
Abstract—The integration of a learning architecture with SDN-based VANETs (SDVN) is beneficial for utilizing computing power by decoupling network management services from data transfer services. However, fast safety-message dissemination in a highly dynamic vehicular environment is a challenging and complex problem due to bi-directional traffic and the directional movement of vehicles. It is also challenging to obtain, through clustering, an effective solution against bottleneck situations and a reliable, fault-tolerant SDN network. Considering the features of adaptive learning, in this paper we propose an adaptive self-learning clustering algorithm with reinforcement routing in SDVN, known as RL-SDVN. An Expectation-Maximization model is used to predict a vehicle's movement, and a Q-learning model is then used to route data packets, so that vehicles in the same cluster coordinate with each other to find optimum routes. We evaluate our experimental results by comparing our approach with clustering-based and self-learning-based schemes proposed in the past. The outcomes show that the proposed scheme improves cluster stability and the lifetime of a cluster-member vehicle, with better performance in terms of low average transmission delay and high throughput compared to the existing routing protocols used in this research.
Index Terms—Vehicular ad-hoc networks (VANETs),
Clustering, Routing, Software defined network, Q-learning,
Adaptive learning.
I. INTRODUCTION
The heterogeneity of VANETs makes it challenging to cope with the rising number of accidents and vehicles on the road [1]–[4]. Vehicular ad hoc networks (VANETs) therefore aim to enhance road safety, information dissemination, and traffic management applications [5]–[8]. Through VANETs, we can
provide vehicle-to-vehicle (V2V) communication, which can avert crashes and hazardous conditions by propagating alert messages to other vehicles in the network.
Unfortunately, current routing protocol techniques make it difficult to satisfy Quality of Service (QoS) requirements [9]–[11]. Warning message propagation is
a primary concern in providing traffic safety regulations.
A warning message should travel quickly and reliably to the group of nearby vehicles to alert them to the hazardous circumstance. Numerous research efforts have attempted to address the complexity of routing warning messages using probabilistic approaches [1], moving direction and relative speed [2], link stability [3], [9], and social-aware clustering [10]. Various learning-based schemes have also been proposed in the past to
serve QoS routing purposes. J. Wu [4] and L. Zhao [5] used adaptive and machine learning techniques to find the optimal route to the destination. V. Vashishth et al. [6] used the Gaussian Mixture Model (GMM) and machine learning for soft clustering, whereas A. Maio et al. [7] and K. Liyanage et al. [8] used the management capabilities of SDN and proposed opti-
mal route-finding techniques. However, designing an effective routing protocol remains challenging, as real-world high-mobility traffic situations, propagation speed, bandwidth constraints, and traffic complexity are influential parameters that affect the performance of a vehicular network [12]–[14]. Motivated by the need for fast warning-message dissemination and performance enhancement, this work proposes optimum-convergence clustering and reinforcement learning-based routing. To overcome the routing challenges and achieve tightly bound clustering, this paper proposes an SDN-based VANET with reinforcement learning, known as RL-SDVN. We adopt an SDN-controlled VANET and propose a Q-learning based method for routing decisions and a self-learning architecture for clustering. We further tabulate the RL-SDVN performance analysis and compare it with previously proposed routing techniques. The simulation results demonstrate a low average transmission delay, an improved cluster-member lifetime, and fewer cluster transitions. The scheme also enhances network throughput and shows improvement over the other existing protocols used in this paper.
The paper is organized as follows. Section II provides the objectives and contributions of the research. Sections III and IV contain the details of the architectural model and our proposed RL-SDVN protocol. In Section V, we present a performance analysis of the simulated protocols and the simulation results.
II. OBJECTIVE AND CONTRIBUTION
The principal objective of this work is to improve the message dissemination process and lower the average transmission time. Efficient clustering plays a vital role in grouping vehicles and in the routing process, so a Gaussian mixture model (GMM) along with reinforcement learning is used to find a vehicle's movement and orientation pattern based on feature selection, which is then used for forming clusters and selecting optimal routes. The objectives of this work are listed below:
• Tackle the instability of clusters and identify the circumstances that introduce delays in message dissemination.
• Propose a GMM-based clustering and reinforcement-based routing scheme, which can regulate the coordination and control of the network and enhance network performance.
In this paper, we first train a self-learning classifier based on optimal feature selection and fitness-value generation. A Q-learning classifier is then used for Q-value generation, which governs packet-forwarding decisions. We also use SDN capabilities for controlling forwarding, management, and optimal path decisions. The research findings demonstrate that, using this technique, we achieve more stable and efficient clusters with lower average delay and higher throughput in the network. Precisely, this paper provides the following vital contributions:
• An efficient RL-SDVN scheme is proposed for vehicular communications, where stable clusters are formed using machine learning (GMM) and feature extraction, i.e., connectivity, distance, transmission range, and queue occupancy.
• An SDN-based Q-learning scheme is proposed to improve reliability, average transmission time, and throughput. The learning process also improves cluster stability and network optimization.
III. ARCHITECTURAL MODEL
In this research, we represent the traffic flow using the spatio-temporal propagation of the vehicles. We define the traffic flow model as a function of vehicle density, traffic flow, and the velocity of vehicles with respect to space x and time t. The assumptions for this traffic model can be represented as:
\frac{\partial \rho}{\partial t} + \frac{\partial (\rho V)}{\partial x} = \nu(x, t) \qquad (1)
Here, ρ represents the traffic density, V represents the velocity of the vehicles, and ν represents the inflows and outflows of the road. If we use second-order derivatives to represent traffic instability, the dynamic acceleration can be given as:
\frac{\partial V}{\partial t} + V \frac{\partial V}{\partial x} + \frac{1}{\rho}\frac{dP(\rho, V)}{dx} = A\left\{\rho, V, \rho_a, V_a, \frac{\partial \rho}{\partial x}, \frac{\partial V}{\partial x}\right\} \qquad (2)
Here, P(ρ, V) represents the traffic pressure (related to the velocity variance), and A{·} is the acceleration function of the local density ρ and velocity V, their look-ahead values ρ_a and V_a, and their spatial gradients. A real-world vehicular network exhibits the characteristics of a heterogeneous traffic environment. For our research purposes, we formulate the traffic flow model as a time-continuous car-following model, as shown in Fig. 1.
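For intuition, the continuity equation (1) can be stepped numerically. The following sketch uses an explicit upwind finite-difference scheme with illustrative grid sizes, a constant velocity field, and zero inflow; none of these numerical choices come from the paper:

import numpy as np

# Minimal sketch: explicit upwind discretization of the continuity
# equation (1), d(rho)/dt + d(rho*V)/dx = nu(x, t), on a 1-D road segment.
dx, dt = 10.0, 0.1            # cell length (m) and time step (s), illustrative
x = np.arange(0.0, 1000.0, dx)
rho = 0.05 + 0.02 * np.exp(-((x - 500.0) ** 2) / 5000.0)  # density (veh/m)
V = np.full_like(x, 15.0)     # constant velocity field (m/s)

def step(rho, V, nu):
    """Advance the density by one time step using backward (upwind) differences."""
    flux = rho * V
    drho = -(flux - np.roll(flux, 1)) / dx   # valid for V > 0, periodic boundary
    return rho + dt * (drho + nu)

nu = np.zeros_like(x)          # no on/off ramps: zero inflow/outflow
for _ in range(100):
    rho = step(rho, V, nu)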
We assume that each vehicle in the network communicates using the IEEE 802.11p and dedicated short-range communication (DSRC) standards, with a 300-1000 m transmission range. Each vehicle in a VANET can communicate with other vehicles directly using V2V communication or through RSUs (V2I). A vehicle broadcasts
Fig. 1. Traffic Flow Arrangement With Vehicle Dynamics
a traffic safety message every 100-300 ms, which carries the vehicle's driving-related information, such as location, speed, turning intention, and driving status (e.g., regular driving, waiting at a traffic light, traffic jam), to other vehicles [3].
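The paper does not specify a packet layout for this beacon; as a rough illustration, its driving-related fields could be modelled as below (the field names and types are assumptions, not a wire format):

from dataclasses import dataclass
from enum import Enum

class DrivingStatus(Enum):
    # Status values named in the beacon description above.
    REGULAR = 0
    WAITING_AT_LIGHT = 1
    TRAFFIC_JAM = 2

@dataclass
class SafetyBeacon:
    """Fields a periodic (100-300 ms) safety message carries, per Sec. III."""
    vehicle_id: int
    x: float                 # position (m)
    y: float
    speed: float             # m/s
    turning_intention: str   # e.g., "left", "right", "straight"
    status: DrivingStatus
    timestamp_ms: int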
IV. RL-SDVN: ADAPTIVE-REINFORCEMENT
ROUTING IN SDVN
This paper proposes an adaptive-reinforcement learning method for clustering and routing using a classifier named RL-SDVN, which enhances the neighbour selection method and improves cluster stability, prompted by RREQ packets. Our approach operates in two phases. First, it formulates distinct clusters based on feature extraction, with the extracted data stored in a database; this data is used to train RL-SDVN with the Expectation-Maximization learning method, and the resulting classifier labels the clusters and assigns each vehicle to its most suitable cluster. Second, we use SDN to handle network operations and find the most suitable route to the destination. As stated above, the proposed algorithm works in two steps.
A. Adaptive Self-Learning Model Training
Our technique is inspired by GMM-based soft clustering in the network [6]. GMM is beneficial because it uses a probabilistic method for generating clusters: the clusters formed with this method follow a probability distribution in an n-dimensional space. GMM assumes the data are sampled from Gaussian distributions with unknown parameters, and using a learning method we can formulate different clusters from the values of these parameters. In GMM, the probability distribution of a vehicle (a vector representation) can be represented by the sum of all probabilities of the vehicles in the cluster. The equation for the distribution is given as:
\hat{\rho}(V_e) = \sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n) \qquad (3)
Here, \hat{\rho}(V_e) is the probability distribution of a vehicle V_e, \lambda_n is the mixture coefficient, and (\sigma_n, \Sigma_n) are the mean and covariance of the normal distribution. The Expectation-Maximization (EM) algorithm is now used to train the data model. The EM approach is based on determining the maximum-likelihood parameters
from the selected parameters for training the model. The GMM can be utilized to estimate unobserved data points in the distribution and to increase the distribution likelihood until it reaches a local maximum. In the GMM, a vehicle is first assigned to a cluster using a random value of the mixture coefficient, and then the expectation step is performed. In this step, the likelihood value of the samples is calculated for each of the clusters; this value signifies how strongly a vehicle is associated with a cluster. The likelihood value is given as:
\phi_{in} = \frac{\lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n)}{\sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n)} \qquad (4)
Here, \phi_{in} represents the likelihood that the i-th vehicle belongs to cluster n. The maximization step is then performed for each vehicle in a cluster: the mean and covariance coefficients are recalculated and their values updated. These updated values are used to reassign each vehicle to the cluster with the strongest likelihood. The mean coefficient update can be written as:
\sigma_n = \frac{\sum_{i=1}^{X} \phi_{in} \cdot V_f}{\sum_{i=1}^{X} \phi_{in}} \qquad (5)
The updated covariance coefficient is given by:
\Sigma_n = \frac{\sum_{i=1}^{X} \phi_{in} \,(V_f - \sigma_n)(V_f - \sigma_n)^{T}}{\sum_{i=1}^{X} \phi_{in}} \qquad (6)
The updated mixture coefficient \lambda_n is given as:

\lambda_n = \frac{\sum_{i=1}^{X} \phi_{in}}{X} \qquad (7)
After updating all coefficient values, the log-likelihood is determined for the X samples. If the log-likelihood value remains stable over consecutive iterations, the learning algorithm is stopped; otherwise, cluster refinement continues until convergence.
\ln \rho(V_e \mid V_f) = \sum_{i=1}^{I} \ln \sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n) \qquad (8)
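The E-step (4) and M-step updates (5)-(7), iterated until the log-likelihood (8) stabilizes, are the standard EM procedure for a GMM, so an off-the-shelf implementation can reproduce the training loop. A minimal sketch using scikit-learn (an illustrative library choice; the authors do not name one):

import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of the training step in Sec. IV-A: fit a GMM by EM on the
# per-vehicle feature vectors V_f and soft-assign vehicles to clusters.
rng = np.random.default_rng(0)
features = rng.random((200, 4))   # rows: vehicles; cols: the 4 features of
                                  # Sec. IV-B (adjacency, distance, range, queue)

gmm = GaussianMixture(n_components=5, covariance_type="full",
                      tol=1e-4, max_iter=200)  # stop when log-likelihood stabilizes
gmm.fit(features)                 # EM: alternates eq. (4) with eqs. (5)-(7)

phi = gmm.predict_proba(features) # likelihood phi_in of vehicle i in cluster n, eq. (4)
cluster_of = phi.argmax(axis=1)   # assign each vehicle to its strongest cluster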
B. Cluster Formation using Self-Learning
In the cluster formation step, data concerning the chosen network features are extracted and stored in the vehicle's OBU, and this contextual information is further used to create a training instance. A vehicle receives a beacon message from each neighbour every 100 ms, which contains the selected feature data in the packet header. This data is then used by the classifier for computation and for assigning a vehicle to a cluster. Four features are selected for training purposes.
Adjacency Feature: This connectivity matrix is used to identify the adjacency (connectivity to other vehicles) of a vehicle. The adjacency matrix represents the network as an N×N matrix:
neiMat = [f(i, j)] \qquad (9)

where each element f(i, j) represents the connectivity between vehicles i and j.
Distance Feature: In this phase, we compute the cosine similarity index. The similarity index is used to determine the cluster size and the cluster members, which fall into similar index values. The cosine distance model is used to find the distance between the communicating vehicles. It can be defined as:
\hat{k} = \{1 - \varepsilon\} \qquad (10)

where \hat{k} is the cosine distance and \varepsilon is the cosine similarity. Using eq. (10), we can find the distance between two vehicles i and j in a vehicle set V_e.
The cosine similarity can be defined as:
\varepsilon = \frac{\sum_{i=1}^{N} V_{e_i} \, V_{e_j}}{\sqrt{\sum_{i=1}^{N} V_{e_i}^{2}} \, \sqrt{\sum_{i=1}^{N} V_{e_j}^{2}}} \qquad (11)
Here, V_{e_i} and V_{e_j} are the i-th and j-th vehicles in the communication.
Transmission Range Feature: The transmission range is used as another feature for cluster formation; the transmission range of each vehicle is calculated. In general, each vehicle communicates using the dedicated short-range communication (DSRC) standard, with a 100-1000 m transmission range.
Queue Occupancy: This feature denotes the number of packets queued for processing at the given node.
The vehicle with high adjacency and low queue occupancy is selected as the cluster head, as sketched below.
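Putting the four features together, a vehicle's training vector and the cluster-head election could look like the following sketch. The containers and the head-election score (adjacency minus queue occupancy) are assumptions; the paper states the criterion only qualitatively:

import numpy as np

def cosine_distance(vi, vj):
    """Cosine distance k_hat = 1 - epsilon between two vectors, eqs. (10)-(11)."""
    eps = vi @ vj / (np.linalg.norm(vi) * np.linalg.norm(vj))
    return 1.0 - eps

def build_features(i, nei_mat, vecs, tx_range, queue_occ, dest_vec):
    """Assemble the four training features of Sec. IV-B for vehicle i.
    nei_mat: N x N adjacency matrix (eq. 9); vecs[i]: vehicle i's vector
    representation. These containers are illustrative, not the paper's."""
    adjacency = nei_mat[i].sum()                    # connectivity degree
    distance = cosine_distance(vecs[i], dest_vec)   # distance feature
    return np.array([adjacency, distance, tx_range[i], queue_occ[i]])

def elect_cluster_head(members, nei_mat, queue_occ):
    """Pick the member with the highest adjacency and lowest queue occupancy."""
    score = {m: nei_mat[m].sum() - queue_occ[m] for m in members}
    return max(score, key=score.get)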
C. Route selection in RL-SDVN
After selecting the cluster head for each cluster in the network, the route discovery phase takes place. With the help of the cluster head and the geographical information, the SDN controller calculates the optimum route to the destination based on the quality of the available paths and reinforcement learning. The learning process iterates at each forwarding vehicle until the packet reaches the destination. The SDN controller computes the route propagation using a Q-learning algorithm. Fig. 2 represents the Q-learning model for vehicular networks: in the context of the learning method, the vehicles in the network are considered the states of the agent, \Gamma_S = \{s_i\}, i = 1, 2, ..., V_n, and a possible action a means that a vehicle V_i sends a packet to a vehicle V_j. The Q-value is based on a reward function, which is obtained through exploration of the environment. Every vehicle in the network maintains a Q-table, which is updated based on future negative or positive reward values. Packet-forwarding decisions are based on the maximum Q-value among candidate vehicles.
The amendments to the Q-table are performed upon reception of a hello message. Each vehicle maintains Q-values for one-hop and two-hop information, which are attached to the route request messages. When a vehicle wants to send a packet to another vehicle in the network, it checks the Q-value in the packet and then determines whether it can make progress in forwarding the packet. If so, a positive reward is awarded; otherwise, a negative reward is applied and the packet is dropped. The distance from the destination vehicle and the delay
Algorithm 1: Adaptive self-learning network training
Input: V_f, neiMat_ij, vehicle set V_e = {V_e1, V_e2, ..., V_en}
       // feature vector V_f and neighbour adjacency matrix neiMat_ij of
       // each vehicle V_ei in the defined vehicle set V_e
Output: Adaptive learning classifier RL-SDVN
Data: vehicle set V_ei, map, threshold σ_th
1   for each discovery packet Pkt_hdr received by a neighbour do
        // find the destination node in the discovery packet header Pkt_hdr
2       Set destination node DestN = Pkt_hdr(d)
3       Feature selection procedure {V_f[i]}
4       Set classifier feature V_f[1] = neiMat_ij
        // neiMat[i, j] = 0 if i = j; neiMat(i, j) if i ≠ j and (i, j) ∈ E;
        // ∞ if i ≠ j and (i, j) ∉ E
5       Set classifier feature V_f[2] = DisN_i(neiDst, DestN)
        // cosine distance k̂ = 1 − ε, with cosine similarity ε from eq. (11)
6       Set classifier feature V_f[3] = V_i^Rg (transmission range)
7       Set classifier feature V_f[4] = QueOcc_i (queue occupancy)
8       Train RL-SDVN procedure {RL_SDVN(V_f)}
9       for each vehicle in V_ei do
10          Assign the vehicle to a cluster according to its fitness value
11      return RL_SDVN(V_f)
12  Exit
Fig. 2. Q-Learning Model for Vehicular Network
is used as the parameter to map Q-values. The progress
reward can be calculated as:
R = \begin{cases} \dfrac{D_{\chi_i V_d} - D_{VM_\beta V_d}}{D_{\chi_i V_d}}, & \text{if an ACK is received} \\ -\dfrac{D_{\chi_i VM_\beta}}{D_{\chi_i V_d}}, & \text{if no ACK is received} \end{cases} \qquad (12)
Here, D_{\chi_i V_d} is the distance between the cluster-head vehicle and the destination vehicle, D_{VM_\beta V_d} is the distance between a member vehicle of the cluster and the destination vehicle, and D_{\chi_i VM_\beta} is the distance between the requesting member vehicle and the cluster head.
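A minimal sketch of this reward and the associated Q-table update is given below. The learning rate and discount factor are assumed values, since the paper does not report them:

from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9          # learning rate and discount factor (assumed)
q_table = defaultdict(float)     # Q[(state, action)] maintained per vehicle

def progress_reward(d_ch_dest, d_member_dest, d_member_ch, ack_received):
    """Reward of eq. (12): positive when the chosen member moves the packet
    closer to the destination, negative when no ACK comes back."""
    if ack_received:
        return (d_ch_dest - d_member_dest) / d_ch_dest
    return -d_member_ch / d_ch_dest

def update_q(state, action, reward, next_state, next_actions):
    """Standard Q-learning update, applied on hello/ACK reception."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                         - q_table[(state, action)])

def choose_next_hop(state, candidate_neighbours):
    """Forwarding decision: the neighbour with the maximum Q-value."""
    return max(candidate_neighbours, key=lambda a: q_table[(state, a)])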
Algorithm 2: Route Selection at SDN Controller
Input: all clusters ClustCS_α, cluster member vehicles ClustM_β
       // α = 1, 2, ..., N; β = 1, 2, ..., M
Output: Minimized path to the destination (OptimRoute_γ)
Data: cluster set ClustCS_α, DestVe_N, neiMat_ij
    // Route discovery using SDN
1   Find route {OptimRoute_γ}
        // send a route discovery request to the SDN controller
        // select the denser cluster for packet forwarding and apply the
        // Q-learning method for route discovery
2   for i = 1 to β do
        // collect the geographical information about vehicles in the
        // clusters and perform the next-hop calculation
3       Forward route request (Q-value, agent state Γ_S, action A, DisVe_N)
4       Initialise Q-values for the neighbour vehicles
5       if a hello packet is received then
            // check the neighbour table for the maximum Q-value
6           if a route exists to the destination then
                // select the route, update the Q-table, and receive a reward
7               R = (D_{χiVd} − D_{VMβVd}) / D_{χiVd}
8           else
9               Drop the packet
10              Receive a negative reward R = −D_{χiVMβ} / D_{χiVd}
11      Forward the RREQ with the updated Q-value to the nearest cluster
        head (χ_i) and wait for RREP messages
12  Select the shortest path
13  Store {OptimRoute_γ}   // minimized path
14  Exit
Algorithm 1 (i.e., cluster formation using GMM in the RL-SDVN protocol) and Algorithm 2 (i.e., route selection using Q-learning in the RL-SDVN protocol), presented above, illustrate the selection of the shortest route using link reliability and SDN.
V. PERFORMANCE ANALYSIS AND SIMULATION RESULTS
In this paper, we consider multiple existing protocols, namely CPB [1], DMMCA [2], M. Ren et al. [3], MFCAR [7], and SCF [8], alongside RL-SDVN. These protocols are examined in a VANET scenario in the simulation.
A. Simulation Environment
For the simulation design, we use the ns-3.28 tool, a network simulator scriptable in C++ and Python. We created the scenario using the SUMO tool, which reproduces live traffic situations. In our simulation scenario, a block of the city of Oslo, Norway, is used for traffic simulation, as it reflects a smart-city transportation system. A 100-300 m transmission range is used, since vehicles generally broadcast their information to all other vehicles within this range. The data rate is set to 12 Mbps, and each vehicle sends periodic messages every 100 ms, including the vehicle's position and velocity information.
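One way to reproduce such a co-simulation is to drive SUMO through its TraCI Python API and feed the per-vehicle position and velocity into the network simulator; the config file name below is hypothetical:

import traci  # SUMO's Python API (ships with SUMO under tools/)

# Sketch of driving the SUMO side of the co-simulation; "oslo.sumocfg" is a
# hypothetical config name for the Oslo block scenario described above.
traci.start(["sumo", "-c", "oslo.sumocfg"])
while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()                     # advance one simulation step
    for vid in traci.vehicle.getIDList():
        pos = traci.vehicle.getPosition(vid)   # positional information
        speed = traci.vehicle.getSpeed(vid)    # velocity information
        # ...feed (pos, speed) to the ns-3 side as the 100 ms beacon data
traci.close()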
B. Vehicles Parameter Configurations
In this section, we define the various parameters used for the simulation. The simulation parameters are listed in Table I.
TABLE I
VEHICLES CONFIGURATION IN URBAN ENVIRONMENT

Parameter Name                            Value of the Parameter
Outgoing packet size                      512 bytes (based on 1500-byte MTU)
DSRC range                                100-1000 m
Runtime                                   150 seconds
Send and receive buffer                   50
Traffic source                            TCP/UDP
Area                                      5000 m * 5000 m
MAC standards                             IEEE 802.11p, WAVE
Tool                                      NS-3.28
Mobility model                            Car-following model
Classifier model                          Gaussian mixture model
Data rates                                12 Mbps
Transmission range of a vehicle's OBU     100-300 m
Vehicle density                           25-200 vehicles
Velocity of vehicles                      0-20 m/s
C. Simulation Results
The simulation results, based on the performance parameters outlined earlier, are depicted in the following figures. In this research, we compare our proposed scheme in two ways. First, we compare RL-SDVN with the clustering-based schemes CPB, DMMCA, and M. Ren [3] on cluster stability and cluster lifetime. Second, we compare RL-SDVN with the SDN-based schemes MFCAR and SCF on average transmission delay and throughput.
1) Cluster Stability: Our research strives for cluster stability, since more stable clusters produce more robust routing optimization. The stability of a cluster can be estimated by considering the movement of the vehicles. RL-SDVN is designed so that the number of cluster changes made by a vehicle is minimized. The number of cluster transitions made by a vehicle during its lifetime can be used to analyze cluster stability; a transition occurs when a vehicle leaves a cluster and forms a new cluster, or when it merges with another cluster. We compare RL-SDVN with the existing approaches based on the number of cluster changes during the lifetime of a vehicle, using different transmission ranges at a speed of 20 m/s. Fig. 3 shows that RL-SDVN has a low number of cluster transitions per vehicle; the resulting graph shows an average reduction of 20-30% in cluster transitions using RL-SDVN. We can see that cluster transitions decrease as the transmission range increases, because with a higher transmission range the chance of a vehicle remaining connected to its neighbours increases.
Fig. 3. Cluster Stability Comparison in Cluster Based Schemes
2) Cluster Lifetime: Another significant performance metric for a cluster-based scheme is the lifetime of a cluster. The cluster lifetime is directly associated with the cluster head, which plays a vital role in packet-forwarding decisions. The cluster lifetime is compared with the existing approaches under different transmission ranges at a speed of 20 m/s. Fig. 4 shows an increase in the lifetime of a cluster by a margin of 15-25%, due to the iterative learning method of the scheme, which assigns a vehicle to a cluster based on feature selection and strong correlation.
3) Average Transmission Delay and Throughput: Fig. 5 indicates the average delay under the simulated conditions. As represented, the RL-SDVN cluster-based protocol has a markedly lower delay than the other protocols. We use SDN for controlling packet-forwarding operations and route selection, and the most suitable cluster heads are used for packet transmission; due to this, the average delay is quite low in RL-SDVN. Our proposed scheme substantially outperforms all the other existing protocols used in this paper in terms of average delay. M. Ren [3] has the highest delay among the protocols, as it demands substantial processing time in the neighbour discovery process, whereas RL-SDVN has the lowest delay in the
Fig. 4. Cluster Lifetime Comparison in Cluster Based Schemes
network due to stable clustering.
Fig. 5. Effect of Traffic Density on Average Transmission Delay
Fig. 6 indicates the effect of network density on throughput. As depicted, both the RL-SDVN and MFCAR protocols perform well and achieve high throughput. SCF and CPB place more emphasis on the neighbour discovery process and lack corresponding performance enhancements: more bandwidth is consumed in the discovery process, and a low delivery ratio results in low throughput.
Fig. 6. Effect of Traffic Density on Throughput
VI. CONCLUSION
In this paper, we proposed a Q-learning based clustered routing with an SDN-based approach, named RL-SDVN. In this approach, the Q-learning classifier is responsible for optimizing the routing estimation. The vehicles are categorized into different clusters using GMM. In the second phase, SDN is used to obtain an efficient route to the destination vehicle in a topological scenario. The cluster head is elected by minimum queue occupancy and the connectivity matrix. A reward function is applied to determine whether data packets are forwarded or dropped. To estimate the performance of the proposed RL-SDVN scheme, we conducted numerous simulations using vehicular network scenarios of varying density. We observed that, in the car-following model, existing routing protocols suffer from performance degradation, whereas RL-SDVN performs more efficiently under the given circumstances.
REFERENCES
[1] Liu, Lei, Chen Chen, Tie Qiu, Mengyuan Zhang, Siyu Li, and
Bin Zhou. ”A data dissemination scheme based on clustering and
probabilistic broadcasting in VANETs.” Vehicular Communica-
tions 13 (2018): 78-88.
[2] K. Huang and B. Hu, ”A New Distributed Mobility-Based Multi-
Hop Clustering Algorithm for Vehicular Ad Hoc Networks in
Highway Scenarios,” 2019 IEEE 90th Vehicular Technology
Conference (VTC2019-Fall), Honolulu, HI, USA, 2019, pp. 1-6.
[3] Ren, Mengying, Lyes Khoukhi, Houda Labiod, Jun Zhang,
and Veronique Veque. ”A mobility-based scheme for dynamic
clustering in vehicular ad-hoc networks (VANETs).” Vehicular
Communications 9 (2017): 233-241.
[4] Wu, Jinqiao, Min Fang, and Xiao Li. "Reinforcement learning based mobility adaptive routing for vehicular ad-hoc networks." Wireless Personal Communications 101, no. 4 (2018): 2143-2171.
[5] Zhao, Liang, Yujie Li, Chao Meng, Changqing Gong, and
Xiaochun Tang. ”A SVM based routing scheme in VANETs.”
In 2016 16th International Symposium on Communications and
Information Technologies (ISCIT), pp. 380-383. IEEE, 2016.
[6] Vashishth, Vidushi, Anshuman Chhabra, and Deepak Kumar
Sharma. ”GMMR: A Gaussian mixture model based unsuper-
vised machine learning approach for optimal routing in oppor-
tunistic IoT networks.” Computer Communications 134 (2019):
138-148.
[7] Di Maio, Antonio, Maria Rita Palattella, and Thomas Engel.
”Multi-flow congestion-aware routing in software-defined ve-
hicular networks.” In 2019 IEEE 90th Vehicular Technology
Conference (VTC2019-Fall), pp. 1-6. IEEE, 2019.
[8] Liyanage, Kushan Sudheera Kalupahana, Maode Ma, and Peter Han Joo Chong. "Connectivity aware tribrid routing framework for a generalized software defined vehicular network." Computer Networks 152 (2019): 167-177.
[9] H. Wang, W. Cheng, X. Lu and H. Qin, ”A Improved Routing
Scheme based on Link Stability for VANET,” 2019 14th IEEE
Conference on Industrial Electronics and Applications (ICIEA),
Xi’an, China, 2019, pp. 542-546.
[10] Qi, Weijing, Qingyang Song, Xiaojie Wang, Lei Guo, and
Zhaolong Ning. ”SDN-enabled social-aware clustering in 5G-
VANET systems.” IEEE Access 6 (2018): 28213-28224.
[11] Moore, Garret L., and Peixiang Liu. ”A Hybrid (Active-Passive)
Clustering Technique for VANETs.” IEEE ComSoc International
Communications Quality and Reliability Workshop (CQR), pp. 1-
6. IEEE, 2019.
[12] Zhao, Liang, Zhuhui Li, Jiajia Li, Ahmed Al-Dubai, Geyong
Min, and Albert Y. Zomaya. ”A Temporal-information-based
Adaptive Routing Algorithm for Software Defined Vehicular
Networks.” In ICC 2019-2019 IEEE International Conference on
Communications (ICC), pp. 1-6. IEEE, 2019.
[13] Zhao, Liang, Weiliang Zhao, Ahmed Al-Dubai, and Geyong
Min. ”A Novel Adaptive Routing and Switching Scheme for
Software-Defined Vehicular Networks.” In ICC 2019-2019 IEEE
International Conference on Communications (ICC), pp. 1-6.
IEEE, 2019.
[14] Correia, Sergio, Azzedine Boukerche, and Rodolfo I.
Meneguette. ”An architecture for hierarchical software-defined
vehicular networks.” IEEE Communications Magazine 55, no. 7
(2017): 80-86.
[15] Zhang, Degan, Ting Zhang, and Xiaohuan Liu. ”Novel self-
adaptive routing service algorithm for application in VANET.”
Applied Intelligence 49, no. 5 (2019): 1866-1879.
[16] Mammeri, Zoubir. "Reinforcement Learning Based Routing in Networks: Review and Classification of Approaches." IEEE Access 7 (2019): 55916-55950.