Adaptive Reinforcement Routing in Software
Defined Vehicular Networks
Ankur Nahar
Computer Science and Engineering Department
Indian Institute of Technology
Jodhpur, India
nahar.1@iitj.ac.in
Debasis Das
Computer Science and Engineering Department
Indian Institute of Technology
Jodhpur, India
debasis@iitj.ac.in
Abstract—The integration of a learning architecture with SDN-based VANETs (SDVN) is beneficial for utilizing computing power by decoupling network management services from data transfer services. However, fast safety-message dissemination in a highly dynamic vehicular environment is a challenging and complex problem due to bi-directional traffic and the directional movement of vehicles. It is also challenging to obtain, through clustering, an effective solution against bottleneck situations and a reliable, fault-tolerant SDN network. Considering the features of adaptive learning, in this paper we propose an adaptive self-learning clustering algorithm with reinforcement routing in SDVN, known as RL-SDVN. An Expectation-Maximization model is used to predict a vehicle's movement, and a Q-learning model is then used to route data packets, so that vehicles in the same cluster coordinate with each other to find optimum routes. We evaluate our experimental results by comparing our approach with clustering-based and self-learning-based schemes proposed in the past. The outcomes show that the proposed scheme improves cluster stability and the lifetime of a cluster-member vehicle, with better performance in terms of low average transmission delay and high throughput compared to the existing routing protocols used in this research.
Index Terms—Vehicular ad-hoc networks (VANETs),
Clustering, Routing, Software defined network, Q-learning,
Adaptive learning.
I. INTRODUCTION
The heterogeneity of VANETs makes it challenging to cope with the rising number of accidents and vehicles on the road [1]–[4]. Vehicular ad hoc networks (VANETs) therefore aim to enhance road safety, information dissemination, and traffic management applications [5]–[8]. Through VANETs, we can
provide vehicle-to-vehicle (V2V) communication, which can avert crashes and hazardous conditions by propagating alert messages to other vehicles in the network.
Unfortunately, current routing protocol techniques make it difficult to satisfy Quality of Service (QoS) requirements [9]–[11]. Warning message propagation is
a primary concern in providing traffic safety regulations.
A warning message should travel quickly and reliably to the group of nearby vehicles to alert them to the hazardous circumstance. Numerous research efforts have attempted to address the complexity of routing warning messages using probabilistic approaches [1], moving direction and relative speed [2], link stability [3], [9], and social-aware clustering [10]. Various learning-based schemes have also been proposed in the past to
serve QoS routing purposes. J. Wu [4] and L. Zhao [5] used adaptive and machine learning techniques to find the optimal route to the destination. V. Vashishth et al. [6] used the Gaussian Mixture Model (GMM) and machine learning for soft clustering, whereas A. Maio et al. [7] and K. Liyanage et al. [8] used the management capabilities of SDN and proposed opti-
mal route-finding techniques. However, designing an effective routing protocol remains challenging, as real-world high-mobility traffic situations, propagation speed, bandwidth constraints, and traffic complexity are influential parameters that affect the performance of a vehicular network [12]–[14]. Motivated by the need for fast warning-message dissemination and performance enhancement, this work proposes optimum-convergence clustering and reinforcement learning-based routing. To overcome the routing challenges and achieve tightly bound clustering, this paper proposes an SDN-based VANET with reinforcement learning, known as RL-SDVN. We adopt an SDN-controlled VANET and propose a Q-learning based method for routing decisions and a self-learning architecture for clustering. We further tabulate the RL-SDVN performance analysis and compare it with previously proposed routing techniques. The simulation results demonstrate a low average transmission delay, an improved cluster-member lifetime, and fewer cluster transitions. The scheme also enhances network throughput and shows improvement over the other existing protocols used in this paper.
The paper is organized as follows. Section II provides the objectives and contributions of the research. Sections III and IV contain the details of the architectural model and our proposed RL-SDVN protocol. In Section V, we present a performance analysis of the simulated protocols and the simulation results.
II. OBJECTIVE AND CONTRIBUTION
The principal objective of this work is to improve the message dissemination process and lower the average transmission time. Efficient clustering plays a vital role in grouping vehicles and in the routing process, so a Gaussian mixture model (GMM) along with reinforcement learning is used to find a vehicle's movement and orientation pattern based on feature selection, which is then used for forming clusters and selecting optimal routes. The objectives of this work are listed below:
• Tackle the instability of clusters and identify the circumstances that introduce delays in message dissemination.
• Propose a GMM-based clustering and reinforcement-based routing scheme, which can regulate the coordination and control of the network and enhance network performance.
In this paper, we first train a self-learning classifier based on optimal feature selection and fitness-value generation. A Q-learning classifier is then used for Q-value generation, which governs packet-forwarding decisions. We also use SDN capabilities for controlling forwarding, management, and optimal path decisions. The research findings demonstrate that, using this technique, we achieve more stable and efficient clusters with lower average delay and higher throughput in the network. Precisely, this paper provides the following vital contributions:
• An efficient RL-SDVN scheme is proposed for vehicular communications, where stable clusters are formed using machine learning (GMM) and feature extraction, i.e., connectivity, distance, transmission range, and queue occupancy.
• An SDN-based Q-learning scheme is proposed to improve reliability, average transmission time, and throughput. The learning process also improves cluster stability and network optimization.
III. ARCHITECTURAL MODEL
In this research, we represent the traffic flow using the spatio-temporal propagation of the vehicles. We define the traffic flow model as a function of vehicle density, traffic flow, and the velocity of vehicles with respect to space x and time t. The assumptions for this traffic model can be represented as:
\frac{\partial \rho}{\partial t} + \frac{\partial (\rho V)}{\partial x} = \nu(x, t) \qquad (1)
Here, ρ represents the traffic density, V represents the velocity of the vehicles, and ν represents the inflows and outflows of the road. If we use second-order derivatives to represent traffic instability, the dynamic acceleration can be given as:
\frac{\partial V}{\partial t} + V \frac{\partial V}{\partial x} + \frac{1}{\rho}\frac{dP(\rho, V)}{dx} = A\left\{\rho, V, \rho_a, V_a, \frac{\partial \rho}{\partial x}, \frac{\partial V}{\partial x}\right\} \qquad (2)
Here, P(ρ, V) represents the traffic pressure (related to the velocity variance), and A{·} is the acceleration function of the local density ρ and velocity V, their look-ahead values ρ_a and V_a, and their spatial gradients. A real-world vehicular network exhibits the characteristics of a heterogeneous traffic environment. For our research purposes, we formulate the traffic flow model as a time-continuous car-following model, as shown in Fig. 1.
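For intuition, the continuity equation (1) can be stepped numerically. The following sketch uses an explicit upwind finite-difference scheme with illustrative grid sizes, a constant velocity field, and zero inflow; none of these numerical choices come from the paper:

import numpy as np

# Minimal sketch: explicit upwind discretization of the continuity
# equation (1), d(rho)/dt + d(rho*V)/dx = nu(x, t), on a 1-D road segment.
dx, dt = 10.0, 0.1            # cell length (m) and time step (s), illustrative
x = np.arange(0.0, 1000.0, dx)
rho = 0.05 + 0.02 * np.exp(-((x - 500.0) ** 2) / 5000.0)  # density (veh/m)
V = np.full_like(x, 15.0)     # constant velocity field (m/s)

def step(rho, V, nu):
    """Advance the density by one time step using backward (upwind) differences."""
    flux = rho * V
    drho = -(flux - np.roll(flux, 1)) / dx   # valid for V > 0, periodic boundary
    return rho + dt * (drho + nu)

nu = np.zeros_like(x)          # no on/off ramps: zero inflow/outflow
for _ in range(100):
    rho = step(rho, V, nu)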
We assume that each vehicle in the network communicates using the IEEE 802.11p and dedicated short-range communication (DSRC) standards, with a 300-1000 m transmission range. Each vehicle in a VANET can communicate with other vehicles directly using V2V communication or through RSUs (V2I). A vehicle broadcasts
Fig. 1. Traffic Flow Arrangement With Vehicle Dynamics
a traffic safety message every 100-300 ms, which carries the vehicle's driving-related information, such as location, speed, turning intention, and driving status (e.g., regular driving, waiting at a traffic light, traffic jam), to other vehicles [3].
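The paper does not specify a packet layout for this beacon; as a rough illustration, its driving-related fields could be modelled as below (the field names and types are assumptions, not a wire format):

from dataclasses import dataclass
from enum import Enum

class DrivingStatus(Enum):
    # Status values named in the beacon description above.
    REGULAR = 0
    WAITING_AT_LIGHT = 1
    TRAFFIC_JAM = 2

@dataclass
class SafetyBeacon:
    """Fields a periodic (100-300 ms) safety message carries, per Sec. III."""
    vehicle_id: int
    x: float                 # position (m)
    y: float
    speed: float             # m/s
    turning_intention: str   # e.g., "left", "right", "straight"
    status: DrivingStatus
    timestamp_ms: int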
IV. RL-SDVN: ADAPTIVE-REINFORCEMENT
ROUTING IN SDVN
This paper proposes an adaptive-reinforcement learning method for clustering and routing using a classifier named RL-SDVN, which enhances the neighbour selection method and improves cluster stability, prompted by RREQ packets. Our approach operates in two phases. First, it formulates distinct clusters based on feature extraction, with the extracted data stored in a database; this data is used to train RL-SDVN with the Expectation-Maximization learning method, and the resulting classifier labels the clusters and assigns each vehicle to its most suitable cluster. Second, we use SDN to handle network operations and find the most suitable route to the destination. As stated above, the proposed algorithm works in two steps.
A. Adaptive Self-Learning Model Training
Our technique is inspired by GMM-based soft clustering in the network [6]. GMM is beneficial because it uses a probabilistic method for generating clusters: the clusters formed with this method follow a probability distribution in an n-dimensional space. GMM assumes the data are sampled from Gaussian distributions with unknown parameters, and using a learning method we can formulate different clusters from the values of these parameters. In GMM, the probability distribution of a vehicle (a vector representation) can be represented by the sum of all probabilities of the vehicles in the cluster. The equation for the distribution is given as:
\hat{\rho}(V_e) = \sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n) \qquad (3)
Here, \hat{\rho}(V_e) is the probability distribution of a vehicle V_e, \lambda_n is the mixture coefficient, and (\sigma_n, \Sigma_n) are the mean and covariance of the normal distribution. The Expectation-Maximization (EM) algorithm is now used to train the data model. The EM approach is based on determining the maximum-likelihood parameters
from the selected parameters for training the model. The GMM can be utilized to estimate unobserved data points in the distribution and to increase the distribution likelihood until it reaches a local maximum. In the GMM, a vehicle is first assigned to a cluster using a random value of the mixture coefficient, and then the expectation step is performed. In this step, the likelihood value of the samples is calculated for each of the clusters; this value signifies how strongly a vehicle is associated with a cluster. The likelihood value is given as:
\phi_{in} = \frac{\lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n)}{\sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n)} \qquad (4)
Here, \phi_{in} represents the likelihood that the i-th vehicle belongs to cluster n. The maximization step is then performed for each vehicle in a cluster: the mean and covariance coefficients are recalculated and their values updated. These updated values are used to reassign each vehicle to the cluster with the strongest likelihood. The mean coefficient update can be written as:
\sigma_n = \frac{\sum_{i=1}^{X} \phi_{in} \cdot V_f}{\sum_{i=1}^{X} \phi_{in}} \qquad (5)
The updated covariance coefficient is given by:
\Sigma_n = \frac{\sum_{i=1}^{X} \phi_{in} \,(V_f - \sigma_n)(V_f - \sigma_n)^{T}}{\sum_{i=1}^{X} \phi_{in}} \qquad (6)
The updated mixture coefficient \lambda_n is given as:

\lambda_n = \frac{\sum_{i=1}^{X} \phi_{in}}{X} \qquad (7)
After updating all coefficient values, the log-likelihood is determined for the X samples. If the log-likelihood value remains stable over consecutive iterations, the learning algorithm is stopped; otherwise, cluster refinement continues until convergence.
\ln \rho(V_e \mid V_f) = \sum_{i=1}^{I} \ln \sum_{n=1}^{N} \lambda_n \, \mathcal{N}(V_f \mid \sigma_n, \Sigma_n) \qquad (8)
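The E-step (4) and M-step updates (5)-(7), iterated until the log-likelihood (8) stabilizes, are the standard EM procedure for a GMM, so an off-the-shelf implementation can reproduce the training loop. A minimal sketch using scikit-learn (an illustrative library choice; the authors do not name one):

import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of the training step in Sec. IV-A: fit a GMM by EM on the
# per-vehicle feature vectors V_f and soft-assign vehicles to clusters.
rng = np.random.default_rng(0)
features = rng.random((200, 4))   # rows: vehicles; cols: the 4 features of
                                  # Sec. IV-B (adjacency, distance, range, queue)

gmm = GaussianMixture(n_components=5, covariance_type="full",
                      tol=1e-4, max_iter=200)  # stop when log-likelihood stabilizes
gmm.fit(features)                 # EM: alternates eq. (4) with eqs. (5)-(7)

phi = gmm.predict_proba(features) # likelihood phi_in of vehicle i in cluster n, eq. (4)
cluster_of = phi.argmax(axis=1)   # assign each vehicle to its strongest cluster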
B. Cluster Formation using Self-Learning
In the cluster formation step, data concerning the chosen network features are extracted and stored in the vehicle's OBU, and this contextual information is further used to create a training instance. A vehicle receives a beacon message from each neighbour every 100 ms, which contains the selected feature data in the packet header. This data is then used by the classifier for computation and for assigning a vehicle to a cluster. Four features are selected for training purposes.
Adjacency Feature: This connectivity matrix is used to identify the adjacency (connectivity to other vehicles) of a vehicle. The adjacency matrix represents the network as an N×N matrix:
neiMat = [f(i, j)] \qquad (9)

where each element f(i, j) represents the connectivity between vehicles i and j.
Distance Feature: In this phase, we compute the cosine similarity index. The similarity index is used to determine the cluster size and the cluster members, which fall into similar index values. The cosine distance model is used to find the distance between the communicating vehicles. It can be defined as:
\hat{k} = \{1 - \varepsilon\} \qquad (10)

where \hat{k} is the cosine distance and \varepsilon is the cosine similarity. Using eq. (10), we can find the distance between two vehicles i and j in a vehicle set V_e.
The cosine similarity can be defined as:
\varepsilon = \frac{\sum_{i=1}^{N} V_{e_i} \, V_{e_j}}{\sqrt{\sum_{i=1}^{N} V_{e_i}^{2}} \, \sqrt{\sum_{i=1}^{N} V_{e_j}^{2}}} \qquad (11)
Here, V_{e_i} and V_{e_j} are the i-th and j-th vehicles in the communication.
Transmission Range Feature: The transmission range is used as another feature for cluster formation; the transmission range of each vehicle is calculated. In general, each vehicle communicates using the dedicated short-range communication (DSRC) standard, with a 100-1000 m transmission range.
Queue Occupancy: This feature denotes the number of packets queued for processing at the given node.
The vehicle with high adjacency and low queue occupancy is selected as the cluster head, as sketched below.
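Putting the four features together, a vehicle's training vector and the cluster-head election could look like the following sketch. The containers and the head-election score (adjacency minus queue occupancy) are assumptions; the paper states the criterion only qualitatively:

import numpy as np

def cosine_distance(vi, vj):
    """Cosine distance k_hat = 1 - epsilon between two vectors, eqs. (10)-(11)."""
    eps = vi @ vj / (np.linalg.norm(vi) * np.linalg.norm(vj))
    return 1.0 - eps

def build_features(i, nei_mat, vecs, tx_range, queue_occ, dest_vec):
    """Assemble the four training features of Sec. IV-B for vehicle i.
    nei_mat: N x N adjacency matrix (eq. 9); vecs[i]: vehicle i's vector
    representation. These containers are illustrative, not the paper's."""
    adjacency = nei_mat[i].sum()                    # connectivity degree
    distance = cosine_distance(vecs[i], dest_vec)   # distance feature
    return np.array([adjacency, distance, tx_range[i], queue_occ[i]])

def elect_cluster_head(members, nei_mat, queue_occ):
    """Pick the member with the highest adjacency and lowest queue occupancy."""
    score = {m: nei_mat[m].sum() - queue_occ[m] for m in members}
    return max(score, key=score.get)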
C. Route selection in RL-SDVN
After selecting the cluster head for each cluster in the network, the route discovery phase takes place. With the help of the cluster head and the geographical information, the SDN controller calculates the optimum route to the destination based on the quality of the available paths and reinforcement learning. The learning process iterates at each forwarding vehicle until the packet reaches the destination. The SDN controller computes the route propagation using a Q-learning algorithm. Fig. 2 represents the Q-learning model for vehicular networks: in the context of the learning method, the vehicles in the network are considered the states of the agent, \Gamma_S = \{s_i\}, i = 1, 2, ..., V_n, and a possible action a means that a vehicle V_i sends a packet to a vehicle V_j. The Q-value is based on a reward function, which is obtained through exploration of the environment. Every vehicle in the network maintains a Q-table, which is updated based on future negative or positive reward values. Packet-forwarding decisions are based on the maximum Q-value among candidate vehicles.
The amendments to the Q-table are performed upon reception of a hello message. Each vehicle maintains Q-values for one-hop and two-hop information, which are attached to the route request messages. When a vehicle wants to send a packet to another vehicle in the network, it checks the Q-value in the packet and then determines whether it can make progress in forwarding the packet. If so, a positive reward is awarded; otherwise, a negative reward is applied and the packet is dropped. The distance from the destination vehicle and the delay
Algorithm 1: Adaptive self-learning network training
Input: V_f, neiMat_ij, vehicle set V_e = {V_e1, V_e2, ..., V_en}
       // feature vector V_f and neighbour adjacency matrix neiMat_ij of
       // each vehicle V_ei in the defined vehicle set V_e
Output: Adaptive learning classifier RL-SDVN
Data: vehicle set V_ei, map, threshold σ_th
1   for each discovery packet Pkt_hdr received by a neighbour do
        // find the destination node in the discovery packet header Pkt_hdr
2       Set destination node DestN = Pkt_hdr(d)
3       Feature selection procedure {V_f[i]}
4       Set classifier feature V_f[1] = neiMat_ij
        // neiMat[i, j] = 0 if i = j; neiMat(i, j) if i ≠ j and (i, j) ∈ E;
        // ∞ if i ≠ j and (i, j) ∉ E
5       Set classifier feature V_f[2] = DisN_i(neiDst, DestN)
        // cosine distance k̂ = 1 − ε, with cosine similarity ε from eq. (11)
6       Set classifier feature V_f[3] = V_i^Rg (transmission range)
7       Set classifier feature V_f[4] = QueOcc_i (queue occupancy)
8       Train RL-SDVN procedure {RL_SDVN(V_f)}
9       for each vehicle in V_ei do
10          Assign the vehicle to a cluster according to its fitness value
11      return RL_SDVN(V_f)
12  Exit
Fig. 2. Q-Learning Model for Vehicular Network
is used as the parameter to map Q-values. The progress
reward can be calculated as:
R = \begin{cases} \dfrac{D_{\chi_i V_d} - D_{VM_\beta V_d}}{D_{\chi_i V_d}}, & \text{if an ACK is received} \\ -\dfrac{D_{\chi_i VM_\beta}}{D_{\chi_i V_d}}, & \text{if no ACK is received} \end{cases} \qquad (12)
Here, D_{\chi_i V_d} is the distance between the cluster-head vehicle and the destination vehicle, D_{VM_\beta V_d} is the distance between a member vehicle of the cluster and the destination vehicle, and D_{\chi_i VM_\beta} is the distance between the requesting member vehicle and the cluster head.
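A minimal sketch of this reward and the associated Q-table update is given below. The learning rate and discount factor are assumed values, since the paper does not report them:

from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9          # learning rate and discount factor (assumed)
q_table = defaultdict(float)     # Q[(state, action)] maintained per vehicle

def progress_reward(d_ch_dest, d_member_dest, d_member_ch, ack_received):
    """Reward of eq. (12): positive when the chosen member moves the packet
    closer to the destination, negative when no ACK comes back."""
    if ack_received:
        return (d_ch_dest - d_member_dest) / d_ch_dest
    return -d_member_ch / d_ch_dest

def update_q(state, action, reward, next_state, next_actions):
    """Standard Q-learning update, applied on hello/ACK reception."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                         - q_table[(state, action)])

def choose_next_hop(state, candidate_neighbours):
    """Forwarding decision: the neighbour with the maximum Q-value."""
    return max(candidate_neighbours, key=lambda a: q_table[(state, a)])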
Algorithm 2: Route Selection at SDN Controller
Input: all clusters ClustCS_α, cluster member vehicles ClustM_β
       // α = 1, 2, ..., N; β = 1, 2, ..., M
Output: Minimized path to the destination (OptimRoute_γ)
Data: cluster set ClustCS_α, DestVe_N, neiMat_ij
    // Route discovery using SDN
1   Find route {OptimRoute_γ}
        // send a route discovery request to the SDN controller
        // select the denser cluster for packet forwarding and apply the
        // Q-learning method for route discovery
2   for i = 1 to β do
        // collect the geographical information about vehicles in the
        // clusters and perform the next-hop calculation
3       Forward route request (Q-value, agent state Γ_S, action A, DisVe_N)
4       Initialise Q-values for the neighbour vehicles
5       if a hello packet is received then
            // check the neighbour table for the maximum Q-value
6           if a route exists to the destination then
                // select the route, update the Q-table, and receive a reward
7               R = (D_{χiVd} − D_{VMβVd}) / D_{χiVd}
8           else
9               Drop the packet
10              Receive a negative reward R = −D_{χiVMβ} / D_{χiVd}
11      Forward the RREQ with the updated Q-value to the nearest cluster
        head (χ_i) and wait for RREP messages
12  Select the shortest path
13  Store {OptimRoute_γ}   // minimized path
14  Exit
Algorithm 1 (i.e., cluster formation using GMM in the RL-SDVN protocol) and Algorithm 2 (i.e., route selection using Q-learning in the RL-SDVN protocol), presented above, illustrate the selection of the shortest route using link reliability and SDN.
V. PERFORMANCE ANALYSIS AND SIMULATION RESULTS
In this paper, we consider multiple existing protocols, namely CPB [1], DMMCA [2], M. Ren et al. [3], MFCAR [7], and SCF [8], alongside RL-SDVN. These protocols are examined in a VANET scenario in the simulation.
A. Simulation Environment
For the simulation design, we use the ns-3.28 tool, a network simulator scriptable in C++ and Python. We created the scenario using the SUMO tool, which reproduces live traffic situations. In our simulation scenario, a block of the city of Oslo, Norway, is used for traffic simulation, as it reflects a smart-city transportation system. A 100-300 m transmission range is used, since vehicles generally broadcast their information to all other vehicles within this range. The data rate is set to 12 Mbps, and each vehicle sends periodic messages every 100 ms, including the vehicle's position and velocity information.
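One way to reproduce such a co-simulation is to drive SUMO through its TraCI Python API and feed the per-vehicle position and velocity into the network simulator; the config file name below is hypothetical:

import traci  # SUMO's Python API (ships with SUMO under tools/)

# Sketch of driving the SUMO side of the co-simulation; "oslo.sumocfg" is a
# hypothetical config name for the Oslo block scenario described above.
traci.start(["sumo", "-c", "oslo.sumocfg"])
while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()                     # advance one simulation step
    for vid in traci.vehicle.getIDList():
        pos = traci.vehicle.getPosition(vid)   # positional information
        speed = traci.vehicle.getSpeed(vid)    # velocity information
        # ...feed (pos, speed) to the ns-3 side as the 100 ms beacon data
traci.close()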
B. Vehicles Parameter Configurations
In this section, we define the various parameters used for the simulation. The simulation parameters are listed in Table I.
TABLE I
VEHICLES CONFIGURATION IN URBAN ENVIRONMENT

Parameter Name                            Value of the Parameter
Outgoing packet size                      512 bytes (based on 1500-byte MTU)
DSRC range                                100-1000 m
Runtime                                   150 seconds
Send and receive buffer                   50
Traffic source                            TCP/UDP
Area                                      5000 m * 5000 m
MAC standards                             IEEE 802.11p, WAVE
Tool                                      NS-3.28
Mobility model                            Car-following model
Classifier model                          Gaussian mixture model
Data rates                                12 Mbps
Transmission range of a vehicle's OBU     100-300 m
Vehicle density                           25-200 vehicles
Velocity of vehicles                      0-20 m/s
C. Simulation Results
The simulation results, based on the performance parameters outlined earlier, are depicted in the following figures. In this research, we compare our proposed scheme in two ways. First, we compare RL-SDVN with the clustering-based schemes CPB, DMMCA, and M. Ren [3] on cluster stability and cluster lifetime. Second, we compare RL-SDVN with the SDN-based schemes MFCAR and SCF on average transmission delay and throughput.
1) Cluster Stability: Our research strives for cluster stability, since more stable clusters produce more robust routing optimization. The stability of a cluster can be estimated by considering the movement of the vehicles. RL-SDVN is designed so that the number of cluster changes made by a vehicle is minimized. The number of cluster transitions made by a vehicle during its lifetime can be used to analyze cluster stability; a transition occurs when a vehicle leaves a cluster and forms a new cluster, or when it merges with another cluster. We compare RL-SDVN with the existing approaches based on the number of cluster changes during the lifetime of a vehicle, using different transmission ranges at a speed of 20 m/s. Fig. 3 shows that RL-SDVN has a low number of cluster transitions per vehicle; the resulting graph shows an average reduction of 20-30% in cluster transitions using RL-SDVN. We can see that cluster transitions decrease as the transmission range increases, because with a higher transmission range the chance of a vehicle remaining connected to its neighbours increases.
Fig. 3. Cluster Stability Comparison in Cluster Based Schemes
2) Cluster Lifetime: Another significant performance metric for a cluster-based scheme is the lifetime of a cluster. The cluster lifetime is directly associated with the cluster head, which plays a vital role in packet-forwarding decisions. The cluster lifetime is compared with the existing approaches under different transmission ranges at a speed of 20 m/s. Fig. 4 shows an increase in the lifetime of a cluster by a margin of 15-25%, due to the iterative learning method of the scheme, which assigns a vehicle to a cluster based on feature selection and strong correlation.
3) Average Transmission Delay and Throughput: Fig. 5 indicates the average delay under the simulated conditions. As represented, the RL-SDVN cluster-based protocol has a markedly lower delay than the other protocols. We use SDN for controlling packet-forwarding operations and route selection, and the most suitable cluster heads are used for packet transmission; due to this, the average delay is quite low in RL-SDVN. Our proposed scheme substantially outperforms all the other existing protocols used in this paper in terms of average delay. M. Ren [3] has the highest delay among the protocols, as it demands substantial processing time in the neighbour discovery process, whereas RL-SDVN has the lowest delay in the
Fig. 4. Cluster Lifetime Comparison in Cluster Based Schemes
network due to stable clustering.
Fig. 5. Effect of Traffic Density on Average Transmission Delay
Fig. 6 indicates the effect of network density on throughput. As depicted, both the RL-SDVN and MFCAR protocols perform well and achieve high throughput. SCF and CPB place more emphasis on the neighbour discovery process and lack corresponding performance enhancements: more bandwidth is consumed in the discovery process, and a low delivery ratio results in low throughput.
Fig. 6. Effect of Traffic Density on Throughput
VI. CONCLUSION
In this paper, we proposed a Q-learning based clustered routing with an SDN-based approach, named RL-SDVN. In this approach, the Q-learning classifier is responsible for optimizing the routing estimation. The vehicles are categorized into different clusters using GMM. In the second phase, SDN is used to obtain an efficient route to the destination vehicle in a topological scenario. The cluster head is elected by minimum queue occupancy and the connectivity matrix. A reward function is applied to determine whether data packets are forwarded or dropped. To estimate the performance of the proposed RL-SDVN scheme, we conducted numerous simulations using vehicular network scenarios of varying density. We observed that, in the car-following model, existing routing protocols suffer from performance degradation, whereas RL-SDVN performs more efficiently under the given circumstances.
REFERENCES
[1] Liu, Lei, Chen Chen, Tie Qiu, Mengyuan Zhang, Siyu Li, and
Bin Zhou. ”A data dissemination scheme based on clustering and
probabilistic broadcasting in VANETs.” Vehicular Communica-
tions 13 (2018): 78-88.
[2] K. Huang and B. Hu, ”A New Distributed Mobility-Based Multi-
Hop Clustering Algorithm for Vehicular Ad Hoc Networks in
Highway Scenarios,” 2019 IEEE 90th Vehicular Technology
Conference (VTC2019-Fall), Honolulu, HI, USA, 2019, pp. 1-6.
[3] Ren, Mengying, Lyes Khoukhi, Houda Labiod, Jun Zhang,
and Veronique Veque. ”A mobility-based scheme for dynamic
clustering in vehicular ad-hoc networks (VANETs).” Vehicular
Communications 9 (2017): 233-241.
[4] Wu, Jinqiao, Min Fang, and Xiao Li. "Reinforcement learning based mobility adaptive routing for vehicular ad-hoc networks." Wireless Personal Communications 101, no. 4 (2018): 2143-2171.
[5] Zhao, Liang, Yujie Li, Chao Meng, Changqing Gong, and
Xiaochun Tang. ”A SVM based routing scheme in VANETs.”
In 2016 16th International Symposium on Communications and
Information Technologies (ISCIT), pp. 380-383. IEEE, 2016.
[6] Vashishth, Vidushi, Anshuman Chhabra, and Deepak Kumar
Sharma. ”GMMR: A Gaussian mixture model based unsuper-
vised machine learning approach for optimal routing in oppor-
tunistic IoT networks.” Computer Communications 134 (2019):
138-148.
[7] Di Maio, Antonio, Maria Rita Palattella, and Thomas Engel.
”Multi-flow congestion-aware routing in software-defined ve-
hicular networks.” In 2019 IEEE 90th Vehicular Technology
Conference (VTC2019-Fall), pp. 1-6. IEEE, 2019.
[8] Liyanage, Kushan Sudheera Kalupahana, Maode Ma, and Peter Han Joo Chong. "Connectivity aware tribrid routing framework for a generalized software defined vehicular network." Computer Networks 152 (2019): 167-177.
[9] H. Wang, W. Cheng, X. Lu and H. Qin, ”A Improved Routing
Scheme based on Link Stability for VANET,” 2019 14th IEEE
Conference on Industrial Electronics and Applications (ICIEA),
Xi’an, China, 2019, pp. 542-546.
[10] Qi, Weijing, Qingyang Song, Xiaojie Wang, Lei Guo, and
Zhaolong Ning. ”SDN-enabled social-aware clustering in 5G-
VANET systems.” IEEE Access 6 (2018): 28213-28224.
[11] Moore, Garret L., and Peixiang Liu. ”A Hybrid (Active-Passive)
Clustering Technique for VANETs.” IEEE ComSoc International
Communications Quality and Reliability Workshop (CQR), pp. 1-
6. IEEE, 2019.
[12] Zhao, Liang, Zhuhui Li, Jiajia Li, Ahmed Al-Dubai, Geyong
Min, and Albert Y. Zomaya. ”A Temporal-information-based
Adaptive Routing Algorithm for Software Defined Vehicular
Networks.” In ICC 2019-2019 IEEE International Conference on
Communications (ICC), pp. 1-6. IEEE, 2019.
[13] Zhao, Liang, Weiliang Zhao, Ahmed Al-Dubai, and Geyong
Min. ”A Novel Adaptive Routing and Switching Scheme for
Software-Defined Vehicular Networks.” In ICC 2019-2019 IEEE
International Conference on Communications (ICC), pp. 1-6.
IEEE, 2019.
[14] Correia, Sergio, Azzedine Boukerche, and Rodolfo I.
Meneguette. ”An architecture for hierarchical software-defined
vehicular networks.” IEEE Communications Magazine 55, no. 7
(2017): 80-86.
[15] Zhang, Degan, Ting Zhang, and Xiaohuan Liu. ”Novel self-
adaptive routing service algorithm for application in VANET.”
Applied Intelligence 49, no. 5 (2019): 1866-1879.
[16] Mammeri, Zoubir. "Reinforcement Learning Based Routing in Networks: Review and Classification of Approaches." IEEE Access 7 (2019): 55916-55950.