Clustering and Reinforcement-Learning-Based Routing for Cognitive Radio Networks

August 2017
IEEE Wireless Communications 24(4):146-151

DOI:10.1109/MWC.2017.1600117

Authors:

Yasir Saleem

Inria Lille - Nord Europe

Kok-Lim Alvin Yau

Sunway University

Hafizal Mohamad

Malaysian Institute of Microelectronic Systems

Nordin Ramli

Malaysian Institute of Microelectronic Systems

The CRN is a future generation wireless communication system that allows SUs to use the underutilized or unused spectrum, known as white spaces, in licensed spectrum with minimum interference to PUs. However, the dynamic conditions of CRNs (e.g., PUs' activities and channel availability) make routing more challenging compared to traditional wireless networks. In this tutorial, we focus on solving the routing problem in CRNs with the help of a clustering mechanism. Cluster-based routing in CRNs enhances network scalability by reducing the flooding of routing overheads, as well as network stability by reducing the effects of dynamicity of channel availability. Additionally, RL, an artificial intelligence approach, is applied as a tool to further enhance network performance. We present SMART, which is a cluster-based routing scheme designed for the CRN, and evaluate its performance via simulations in order to show the effectiveness of cluster-based routing in CRNs using RL.

Clustering and Reinforcement Learning based

Routing for Cognitive Radio Networks

Yasir Saleem (a,b,c), Kok-Lim Alvin Yau (b), Hafizal Mohamad (c), Nordin Ramli (c), Mubashir Husain Rehmani (d), Qiang Ni (e)

(a) Institute Mines-Telecom, Telecom SudParis, France.

(b) Sunway University, Selangor, Malaysia.

(c) MIMOS Berhad, Kuala Lumpur, Malaysia.

(d) COMSATS Institute of Information Technology, Wah Cantt, Pakistan.

(e) Lancaster University, Lancashire, United Kingdom.

Abstract

A cognitive radio network (CRN) is a future-generation wireless communication system that allows secondary users (SUs) to use underutilized or unused spectrum, known as white spaces, in licensed spectrum with minimum interference to primary users (PUs). However, the dynamic conditions of CRNs (such as PUs’ activities and channel availability) make routing more challenging than in traditional wireless networks. In this tutorial, we focus on solving the routing problem in CRNs with the help of a clustering mechanism. Cluster-based routing in CRNs enhances network scalability by reducing the flooding of routing overheads, as well as network stability by reducing the effects of the dynamicity of channel availability. Additionally, reinforcement learning (RL), an artificial intelligence approach, is applied as a tool to further enhance network performance. We present SMART, a cluster-based routing scheme designed for CRNs, and evaluate its performance via simulations in order to show the effectiveness of cluster-based routing in CRNs using RL.

Index Terms

Cognitive radio, clustering, routing, reinforcement learning, cluster-based routing

I. INTRODUCTION

A cognitive radio network (CRN) is a future-generation wireless communication system that addresses the problem of spectrum scarcity caused by static channel assignment policies. It does so by allowing secondary users (SUs), or unlicensed users, to explore and exploit underutilized licensed channels, known as white spaces, which are owned by primary users (PUs), or licensed users, thereby improving overall channel utilization. Whenever a PU re-appears on the operating channel of an SU, the SU must switch to another available channel or wait for the PU’s transmission to cease.

With the emergence of CRN applications such as cognitive radio sensor networks and cognitive vehicular networks, multi-hop routing for wide-area coverage is becoming essential. Multi-hop routing in CRNs is challenging for several reasons. Firstly, a CRN is characterized by the dynamicity of channel availability (or white spaces) due to different levels of PUs’ activities. Secondly, broadcasting routing control messages over the distinct available channels causes higher routing overhead and limits network scalability. Thirdly, the dynamicity of channel availability can result in the lack of a common control channel (CCC) for exchanging control information in routing.

Routing protocols for traditional wireless networks that maintain end-to-end paths, e.g., the ad-hoc on-demand distance vector (AODV) routing protocol, are not well suited to CRNs because they do not consider the challenges of multi-hop routing in CRNs and they substantially increase network overhead by constantly flooding routing messages. Hence, such protocols cannot be directly applied in CRNs. Routing protocols for CRNs must instead be spectrum-aware in order to establish stable routes, so that SUs can communicate for long durations without much disruption from PUs and with minimal interference to PUs. Furthermore, only a limited number of cluster-based routing schemes have been proposed in the literature in the context of CRNs.

In this tutorial, we solve the routing problem in CRNs with the help of a clustering mechanism and reinforcement learning (RL), an artificial intelligence approach; this is our main contribution. Cluster-based routing for CRNs enhances network scalability by reducing the flooding of routing overheads, as well as network stability by reducing the effects of the dynamicity of channel availability. RL is a tool that further enhances network performance through observing and learning from the environment. We have proposed a cluster-based routing scheme using RL, known as SMART, which is designed for CRNs. Through cluster maintenance (i.e., cluster merging and splitting), SMART fulfills the requirement on the minimum number of common channels in a cluster, which enhances network stability as well as network performance. Since cluster-based routing has not been well investigated in the context of CRNs, it is the focus of this article.

The organization of this article is as follows. In Sections II and III, we present an overview of clustering and cluster-based routing for CRNs, respectively, highlighting their advantages and importance for solving the problem of multi-hop routing in CRNs. In Section IV, we present an overview of RL. Subsequently, in Section V, we present SMART (SpectruM-Aware clusteR-based rouTing), a cluster-based routing scheme that applies RL for CRNs. In Section VI, we evaluate the performance of SMART. Finally, we conclude in Section VII.

II. CLUSTERING IN CRNS

Clustering, a topology management mechanism, provides network stability and scalability by organizing nodes into logical groups called clusters. The cluster structure provides a suitable network model for supporting cooperative tasks, which are very important for CR operations (e.g., routing and channel sensing). Fig. 1 shows a cluster structure in which nodes are grouped into three clusters. Each cluster consists of four types of nodes: clusterhead, member node, relay node and gateway node. The clusterhead is the central point for cooperative tasks within the cluster. Each member node is associated with a clusterhead. The clusterhead and member nodes communicate among themselves using a common channel known as the operating channel, which is available to all nodes in the cluster; this is known as intra-cluster communication. A relay node is a member node that provides a connection to another member node located outside the transmission range of the clusterhead. A gateway node is also a member node, located at the boundary of a cluster, that can hear from neighboring cluster(s). It provides inter-cluster communication over two or more hops.

Fig. 1. Cluster structure.

Cluster size is the number of nodes in a cluster, and it affects various performance metrics. A larger cluster size minimizes routing overhead, since the flooding of routing overheads involves only clusterheads and gateway nodes along a backbone. It also reduces the error probability in the final decision on channel availability, since that decision is based on channel sensing outcomes collected from a higher number of nodes in a cluster. A smaller cluster size (or a higher number of clusters in a network) maximizes the number of common channels, and hence connectivity among nodes in a cluster, because physically close nodes are more likely to have a similar list of available channels. Since clusters may use different operating channels, the contention and interference levels in the network can be reduced, which subsequently improves routing and network performance. A higher number of common channels in a cluster minimizes the occurrence of re-clustering due to improved connectivity between nodes within a cluster.

While achieving larger cluster size may seem to be more favourable in traditional wireless networks in order to

improve scalability, the same cannot be said for CRNs. Since smaller cluster size maximizes the number of common

channels in a cluster, it enhances the connectivity among member nodes and clusterhead in a cluster. This improves

stability and addresses the challenge of dynamicity of channel availability in CRNs [1].
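As a minimal sketch of why smaller clusters tend to retain more common channels, the snippet below computes the common channels of a cluster as the intersection of its nodes’ available-channel sets. The node names and channel lists are hypothetical and are not taken from Fig. 1.

```python
from functools import reduce


def common_channels(available):
    """Return the channels that are available to every node in a cluster.

    `available` maps a node name to the set of channels it senses as free.
    """
    sets = [set(chs) for chs in available.values()]
    if not sets:
        return set()
    return reduce(set.intersection, sets)


# Hypothetical availability lists for four SUs (illustration only).
cluster = {
    "n1": {1, 2, 3},
    "n2": {1, 2, 3, 4},
    "n3": {2, 3, 5},
    "n4": {3, 5},
}

# One large cluster: only the channels shared by all four nodes survive.
whole = common_channels(cluster)

# A smaller cluster of two nearby nodes keeps more channels in common.
small = common_channels({name: cluster[name] for name in ("n1", "n2")})

print(sorted(whole), sorted(small))
```

With these example lists the four-node cluster shares a single channel, while the two-node cluster shares three, which mirrors the trade-off between scalability and stability discussed above.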

III. CLUSTER-BASED ROUTING IN CRNS

Cluster-based routing protocols run over a clustered network. In the literature, there have been a large number of separate investigations into clustering [2], [3] and routing [4], [5], whereas there have been only perfunctory attempts to investigate cluster-based routing schemes for CRNs. Readers are referred to surveys on clustering [6], [7] and routing algorithms [8] in CRNs for a comprehensive review of the literature.

Cluster-based routing is preferred in CRNs for the following reasons. Firstly, it provides network stability by reducing the effects of dynamic channel availability: any change to channel availability affects the network at the cluster level, so only local updates are required instead of a network-wide reconfiguration. Secondly, it provides network scalability, as routing control messages, such as route request (RREQ) and route reply (RREP), are exchanged only among some nodes, particularly clusterheads and gateway nodes. As clusterheads and gateway nodes share an operating channel, and gateway nodes are aware of the operating channels of neighboring clusters, broadcasting can be done with a single transceiver; it is no longer necessary to broadcast on the distinct available channels used by neighboring nodes, as in non-clustered networks. Thirdly, it reduces the need for a common control channel for exchanging control information in routing, since the operating channel, which is available to all nodes in a cluster, is used instead. Fourthly, it supports cooperative tasks and improves channel sensing outcomes. For example, a clusterhead collects channel sensing outcomes from its member nodes and subsequently makes a final decision on channel availability. This improves the accuracy of the channel availability decision compared to a decision based on the outcome of a single node.
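The cooperative sensing step can be sketched as follows. The article does not specify how a clusterhead fuses the sensing outcomes, so the majority-vote rule below, the function name `fuse_sensing` and the report format are assumptions made only for illustration.

```python
def fuse_sensing(reports):
    """Fuse per-node sensing reports into one decision per channel.

    `reports` maps a member node to {channel: True if sensed idle}.
    Assumed rule: a channel is declared idle only when a strict majority
    of the reporting nodes sensed it idle; ties count as busy, which is
    the conservative choice for protecting PUs.
    """
    votes = {}  # channel -> (idle_votes, total_votes)
    for node_report in reports.values():
        for channel, idle in node_report.items():
            idle_votes, total = votes.get(channel, (0, 0))
            votes[channel] = (idle_votes + (1 if idle else 0), total + 1)
    return {ch: 2 * idle > total for ch, (idle, total) in votes.items()}


# Hypothetical reports from three member nodes on two channels.
reports = {
    "m1": {1: True, 2: False},
    "m2": {1: True, 2: False},
    "m3": {1: False, 2: True},
}
decision = fuse_sensing(reports)
print(decision)
```

Here a single node that disagrees with its neighbors (m3) does not change the cluster-level decision, which is the accuracy benefit described above.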

IV. REINFORCEMENT LEARNING: A TOOL TO ENHANCE NETWORK PERFORMANCE

RL [9] is an artificial intelligence approach that enables an agent, or decision maker, to observe its state and reward, learn, and then perform an action in order to improve the state and reward at the next time instant. In the RL model, an action affects (improves or deteriorates) the state and reward, which in turn affects the agent’s next choice of action. As time passes, an agent estimates the reward for each state-action pair, which constitutes its knowledge, and subsequently carries out a proper action at the next time instant, given a particular state, to maximize accumulated rewards. The important representations for the agent in the RL model are state, action and reward.

• State represents the decision-making factors observed by an agent in the operating environment. It can affect the reward (or network performance).

• Action represents the choice made by an agent; by taking actions, an agent learns about the optimal actions. It can affect the state (or operating environment) and reward (or network performance).

• Reward represents the positive or negative consequence on the operating environment, in the form of network performance, caused by the agent’s action at the previous time instant.

Q-routing, a prominent RL scheme, has been applied to routing. In the Q-routing model, the state represents the destination node, the action represents the next-hop neighbor node of the decision-making node that relays data towards the destination node, and the reward represents network performance (e.g., throughput). Each link of a route is associated with a cost (e.g., delay), and a node computes a Q-value for each state-action pair (i.e., each pair of destination and next-hop neighbor node) in order to estimate the cost of transmitting data towards the destination node along the route.
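A cost-based Q-routing update of this kind can be sketched as below. This is a simplified illustration in the spirit of the original Q-routing scheme [14], not its exact formulation: the single `link_cost` term lumps together whatever per-hop cost is used (e.g., queueing plus transmission delay), and the function names and table layout are assumptions.

```python
def q_routing_update(q_table, dst, nbr, link_cost, nbr_best_estimate, alpha=0.5):
    """Update the estimated cost of reaching `dst` through neighbor `nbr`.

    q_table maps (destination, next_hop) to an estimated cost.
    link_cost is the observed cost of the hop to `nbr`.
    nbr_best_estimate is the neighbor's own best (lowest) estimate to `dst`.
    """
    old = q_table.get((dst, nbr), 0.0)
    target = link_cost + nbr_best_estimate
    q_table[(dst, nbr)] = (1 - alpha) * old + alpha * target
    return q_table[(dst, nbr)]


def best_next_hop(q_table, dst):
    """Pick the neighbor with the lowest estimated cost towards `dst`."""
    candidates = {n: q for (d, n), q in q_table.items() if d == dst}
    return min(candidates, key=candidates.get)


q = {}
q_routing_update(q, "D", "A", link_cost=2.0, nbr_best_estimate=4.0)
q_routing_update(q, "D", "B", link_cost=1.0, nbr_best_estimate=3.0)
print(best_next_hop(q, "D"))
```

Because the Q-value here is a cost, the best next hop is the one with the lowest estimate; the modified update used by SMART in Section V instead maximizes a probability.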

There are two main advantages of applying RL to routing in CRNs. Firstly, rather than considering each factor that affects network performance individually, RL models the network performance itself, which captures the various factors in the operating environment or network conditions (i.e., the channel utilization level by PUs and channel quality); hence, it is a simple modeling approach. Secondly, prior knowledge of the operating environment or network conditions is not necessary, so an SU can learn about the operating environment on the fly. Hence, the application of RL to cluster-based routing in CRNs, which is novel, can improve both routing and clustering performance. Since a CRN is characterized by the dynamicity of channel availability due to PUs’ activities, cluster maintenance is imperative in CRNs to adapt the cluster structure and cluster size. RL reduces the effects of dynamic channel availability by observing, learning and taking the optimal or near-optimal actions that minimize cluster maintenance.

V. SMART

We present SMART for overcoming the challenges of multi-hop routing in CRNs through cluster-based routing and RL. In SMART, clustering aims to form clusters that fulfill the requirement on the number of common channels in a cluster and allow nodes to forward routing control messages efficiently without needing to broadcast on all available channels, while RL aims to find a route that increases the usage of white spaces in order to maximize SUs’ network performance. Moreover, in order to cope with the dynamicity of channel availability, SMART extends clustering with cluster merging and splitting. SMART thus adjusts cluster size over time, so that a cluster fulfills the requirement on cluster size for improving scalability, as well as the requirement on the number of common channels in a cluster for improving stability. SMART estimates the OFF-state probability of a channel at the next time instant [10], and uses this estimate to rank and select operating channels in clustering and routes in routing.
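The ranking step can be illustrated with a short sketch. The estimator of [10] is not reproduced here; the OFF-state probabilities are hypothetical inputs, and treating the top-ranked channel as the master and the next one as the backup is an assumption for illustration.

```python
def select_operating_channels(off_prob, common, count=2):
    """Rank the common channels of a cluster by OFF-state probability.

    off_prob maps a channel to its estimated probability of being free
    of PU activity at the next time instant (hypothetical values here).
    Returns the `count` best channels, most likely idle first.
    """
    ranked = sorted(common, key=lambda ch: off_prob[ch], reverse=True)
    return ranked[:count]


off_prob = {1: 0.3, 2: 0.8, 3: 0.6}
chosen = select_operating_channels(off_prob, common={1, 2, 3})
print(chosen)
```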

A. Clustering

A significant amount of work exists on cluster formation and gateway node selection in CRNs, so readers are referred to the existing work [11], [12], [13] on these topics. However, cluster maintenance (i.e., cluster merging and splitting) and cluster-based routing using RL are novel and have not been investigated before. We therefore mainly present them in this paper; they are our main contribution.

Cluster maintenance adjusts the cluster size in order to reduce the dynamic effects of the network, and it consists of cluster merging and splitting. These are best explained with the illustrations in Fig. 2. In this figure, the labels of the nodes are revised after each clustering event for better understanding. We assume that the threshold for the minimum number of common channels is 2 in both cluster merging and splitting. Cluster merging combines two clusters into one and is possible when the two clusters satisfy the threshold for the minimum number of common channels. Fig. 2(a) presents the initial clusters formed after cluster formation, in which gateway node 1 in cluster 2 discovers that the number of common channels between clusters 1 and 2 is two, which satisfies the threshold. Thus, it informs the clusterheads of clusters 1 and 2 about the potential cluster merging. Suppose both clusterheads agree to merge; gateway node 1 in cluster 2 then becomes the new clusterhead, as presented in Fig. 2(b). The existing clusterheads in Fig. 2(a) relinquish their roles and become member nodes of the new clusterhead, and then inform their respective member nodes to join the new clusterhead. Member nodes in Fig. 2(a) that are within the transmission range of the new clusterhead join it directly. Member nodes that are not within the transmission range of the new clusterhead request their previous clusterheads to provide a connection towards the new clusterhead, and so the relinquished clusterheads become relay nodes for such member nodes, as presented in Fig. 2(b). Finally, the new clusterhead selects the operating channel of the new cluster, and gateway nodes are subsequently selected to provide inter-cluster communication for the newly merged cluster.
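The merging procedure can be sketched as follows, under stated assumptions: "common channels between the two clusters" is interpreted as channels available to every node of both clusters, the relinquished clusterheads are assumed to be within range of the proposing gateway node, and the data layout and names are hypothetical rather than taken from Fig. 2.

```python
def merge_clusters(cluster_a, cluster_b, gateway, in_range, min_common=2):
    """Try to merge two clusters around a gateway node.

    Each cluster is {"head": name, "channels": {node: set_of_channels}}.
    `gateway` proposes the merge and becomes the new clusterhead.
    `in_range` is the set of nodes within its transmission range.
    Returns a description of the merged cluster, or None when the number
    of common channels is below the threshold.
    """
    channels = {**cluster_a["channels"], **cluster_b["channels"]}
    common = set.intersection(*(set(avail) for avail in channels.values()))
    if len(common) < min_common:
        return None

    old_heads = {cluster_a["head"], cluster_b["head"]}
    relays = {}  # out-of-range member -> former clusterhead acting as relay
    for cluster in (cluster_a, cluster_b):
        for node in cluster["channels"]:
            if node == gateway or node in old_heads:
                continue
            if node not in in_range:
                relays[node] = cluster["head"]

    members = sorted(n for n in channels if n != gateway)
    return {"head": gateway, "members": members, "common": common, "relays": relays}


# Hypothetical clusters: g is a gateway node of cluster 2 that hears cluster 1.
c1 = {"head": "h1", "channels": {"h1": {1, 2, 3}, "m1": {1, 2, 3}}}
c2 = {"head": "h2", "channels": {"h2": {2, 3, 4}, "g": {2, 3, 4}, "m2": {2, 3}}}
merged = merge_clusters(c1, c2, gateway="g", in_range={"h1", "h2", "m2"})
print(merged)
```

In this example m1 cannot hear the new clusterhead directly, so its former clusterhead h1 serves as its relay node, as in the description above.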

Cluster splitting splits one cluster into two, and it is performed when a clusterhead realizes that its cluster cannot satisfy the threshold for the minimum number of common channels. Fig. 2(c) shows the clusters after cluster splitting is performed on Fig. 2(b). Suppose common channels 3 and 4 of cluster 2 in Fig. 2(b) are re-occupied by PUs, so the clusterhead of cluster 2 in Fig. 2(b) initiates cluster splitting. Since the clusterhead is aware of the list of available channels for all nodes in its cluster, it counts the number of nodes that have each channel available and ranks the channels by node degree, highest first. Subsequently, the clusterhead selects the highest-ranked channels and identifies the nodes that have these channels available. In Fig. 2(b), such channels are available to four nodes in cluster 2, so the clusterhead forms one cluster comprising these nodes, shown as cluster 2 in Fig. 2(c). For the remaining nodes, the clusterhead identifies the common channels among them and creates another cluster consisting of them, shown as cluster 3 in Fig. 2(c). Finally, clusterheads and gateway nodes for the newly split clusters are selected.

Fig. 2. An illustration of cluster maintenance: (a) initial clusters formed after cluster formation; (b) new clusters after cluster merging is performed on clusters 1 and 2 in the initial clusters; (c) new clusters after cluster splitting is performed on cluster 2 due to the re-appearance of PUs on common channels 3 and 4 of cluster 2 in Fig. 2(b).
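The splitting steps can be sketched as below. The choice of exactly `min_common` top-ranked channels, the tie-breaking by channel number and the example availability lists are assumptions for illustration; selecting clusterheads and gateway nodes for the new clusters is omitted.

```python
from collections import Counter


def split_cluster(channels, min_common=2):
    """Split a cluster that no longer meets the common-channel threshold.

    `channels` maps each node to its set of currently available channels.
    Returns ((first_nodes, first_common), (rest_nodes, rest_common)).
    """
    # Node degree of a channel: how many nodes have it available.
    degree = Counter(ch for avail in channels.values() for ch in avail)
    # Highest degree first; ties broken by channel number for determinism.
    ranked = sorted(degree, key=lambda ch: (-degree[ch], ch))
    top = set(ranked[:min_common])

    # Nodes that have all the top-ranked channels form the first cluster.
    first = {n for n, avail in channels.items() if top <= set(avail)}
    rest = set(channels) - first
    rest_common = (
        set.intersection(*(set(channels[n]) for n in rest)) if rest else set()
    )
    return (first, top), (rest, rest_common)


# Hypothetical availability after PUs re-occupy the old common channels 3 and 4.
available = {
    "n1": {1, 2},
    "n2": {1, 2, 5},
    "n3": {1, 2},
    "n4": {1, 2, 6},
    "n5": {5, 6},
    "n6": {5, 6, 7},
}
(first, first_common), (rest, rest_common) = split_cluster(available)
print(sorted(first), sorted(first_common), sorted(rest), sorted(rest_common))
```

In this example channels 1 and 2 have the highest node degree, so the four nodes that hold both stay together, and the two remaining nodes form a second cluster on their own common channels 5 and 6.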

B. Cluster-based Routing using Reinforcement Learning

In this section, we present cluster-based routing based on Q-routing, an RL model, which is performed on the clustered network. A source clusterhead estimates the Q-value for each neighbor node to reach the destination node and subsequently updates its routing table of Q-values. The traditional Q-value equation [14] is modified to incorporate the OFF-state probability of the bottleneck channel along a route. The bottleneck channel is the channel with the lowest OFF-state probability for the next time instant along a route towards the destination node, connecting two clusters via an SU neighbor node. The Q-value equation can be generally described as follows:

$$Q^{\text{next}}_{src}(dst, nbr) \leftarrow (1 - \alpha) \times Q^{\text{current}}_{src}(dst, nbr) + \alpha \times \min\left(\mathit{chanProb}^{\text{current}}_{src,nbr},\; Q^{\text{current}}_{nbr,\max}(dst)\right) \qquad (1)$$

where $0 \le \alpha \le 1$ is the learning rate, $src$ is the source clusterhead, $nbr$ is the SU neighbor node of the source clusterhead, $dst$ is the destination node, $\mathit{chanProb}^{\text{current}}_{src,nbr}$ is the OFF-state probability of the operating channel between the source clusterhead and its SU neighbor node, and $Q^{\text{current}}_{nbr,\max}(dst)$ is the OFF-state probability of the bottleneck channel along a route from a SU neighbor node of the source clusterhead’s neighbor node to the SU destination node. The minimum of $\mathit{chanProb}^{\text{current}}_{src,nbr}$ and $Q^{\text{current}}_{nbr,\max}(dst)$ represents the channel availability probability of the bottleneck channel along the route.
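The update in Eq. (1) translates directly into a few lines of code. The sketch below is illustrative only; the function name and the example values are hypothetical.

```python
import math


def smart_q_update(q_current, chan_prob, q_nbr_max, alpha=0.5):
    """Apply the Q-value update of Eq. (1).

    q_current : current Q-value at the source for (dst, nbr)
    chan_prob : OFF-state probability of the channel between src and nbr
    q_nbr_max : bottleneck OFF-state probability reported by nbr for dst
    alpha     : learning rate, 0 <= alpha <= 1
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    bottleneck = min(chan_prob, q_nbr_max)
    return (1 - alpha) * q_current + alpha * bottleneck


# Hypothetical values: the downstream route (0.7) is the bottleneck,
# not the first hop (0.9), so the Q-value moves halfway from 0.6 to 0.7.
q_next = smart_q_update(q_current=0.6, chan_prob=0.9, q_nbr_max=0.7, alpha=0.5)
print(round(q_next, 3))
```

With $\alpha = 0.5$, as used in Section VI, the new Q-value weights the previous estimate and the most recent bottleneck observation equally.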

Fig. 3 presents an example of cluster-based routing using RL in which clusterhead 1 wants to send data packets to the SU destination node BS. Initially, clusterhead 1 initiates an RREQ towards the SU destination node BS in order to discover a route. The procedure of RREQ propagation is traditional, so we do not go into its details. When the SU destination node BS receives two RREQ messages, from clusterheads 2 and 4, it generates RREP messages and sends them back to clusterhead 1 along the reverse routes traversed by the RREQ messages. When clusterhead 4 receives the RREP message from the SU destination node BS via its gateway node, it updates its Q-value with the channel OFF-state probability of the link between cluster 4 and the SU destination node BS. Subsequently, clusterhead 4 embeds this Q-value in the RREP and forwards it towards clusterhead 3. When clusterhead 3 receives the RREP from clusterhead 4, it compares the OFF-state probability provided in the RREP with the OFF-state probability of the channel on the link between clusters 3 and 4, and finds that its own link channel has the lower OFF-state probability. Therefore, it updates the Q-value and forwards it to the SU source node by embedding it in the RREP message. When the SU source node receives the RREP from clusterhead 3, it updates its routing table of Q-values. A similar procedure runs on clusterhead 2 to process its RREP. Finally, there are two entries in the routing table of Q-values at the SU source node. The SU source node selects clusterhead 3 as its next-hop SU node because it provides the highest Q-value for the route leading to the destination node BS. It is important to note that the lower route is selected because it is more stable, although it is longer than the alternative (upper) route.
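The RREP processing in this example can be sketched as a backward pass that keeps the lowest OFF-state probability seen so far. The probabilities below are hypothetical (the values of Fig. 3 are not reproduced), and the Q-values are set directly to the bottleneck, which corresponds to Eq. (1) with $\alpha = 1$ on a first route discovery; that simplification is an assumption.

```python
def bottleneck_q(link_probs):
    """Propagate an RREP from the destination back to the source.

    link_probs lists the OFF-state probabilities of the links on a route,
    ordered from the source towards the destination. Each hop keeps the
    smaller of its own link probability and the value carried in the RREP.
    """
    q = 1.0
    for prob in reversed(link_probs):
        q = min(prob, q)
    return q


# Hypothetical link probabilities for the two routes of the example.
routes = {
    "clusterhead 2": [0.9, 0.4],        # upper route: shorter, weak last link
    "clusterhead 3": [0.8, 0.7, 0.85],  # lower route: longer, more stable
}
q_table = {nbr: bottleneck_q(probs) for nbr, probs in routes.items()}
next_hop = max(q_table, key=q_table.get)
print(q_table, next_hop)
```

As in the example above, the longer route through clusterhead 3 is chosen because its bottleneck channel is more likely to remain free of PU activity.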

Fig. 3. Cluster-based routing example.

VI. PERFORMANCE EVALUATION

The performance of SMART is evaluated in the network simulator QualNet 6.1, into which CR functionality is incorporated. The total number of SUs is 10 and the total number of channels is 5. The SU learning rate $\alpha$ is set to 0.5 to maintain a balance between the estimated and the recent value. Since a cluster must have at least two common channels (i.e., master and backup channels), the threshold for the minimum number of common channels is set to 2. Whenever a master channel is re-occupied by PUs, all member nodes and the clusterhead in a cluster switch to a backup channel. The simulation time for each run is 550 s, and a total of 100 simulation runs, each with a random topology, were performed for each measurement. Each result shown in a graph is the average of the values gathered in the 100 runs. We assume perfect channel sensing because the main focus of our work is on the network layer. The ON-OFF transitions of PU activity follow a Poisson model with exponentially distributed ON and OFF periods with rates $\lambda^{k}_{ON,j}$ and $\lambda^{k}_{OFF,j}$, respectively.
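A PU activity model of this kind can be sketched as an alternating sequence of exponentially distributed ON and OFF periods. The sketch assumes that each rate is the parameter of the corresponding period’s duration (so the mean ON period is 1/rate_on); under that reading the long-run fraction of OFF time is rate_on / (rate_on + rate_off). The function names, rates and horizon are hypothetical and unrelated to the QualNet setup.

```python
import random


def simulate_off_fraction(rate_on, rate_off, horizon, seed=0):
    """Simulate one PU channel and return the fraction of time spent OFF."""
    rng = random.Random(seed)
    t, off_time, is_on = 0.0, 0.0, False
    while t < horizon:
        rate = rate_on if is_on else rate_off
        # Duration of the current period, clipped at the end of the run.
        dwell = min(rng.expovariate(rate), horizon - t)
        if not is_on:
            off_time += dwell
        t += dwell
        is_on = not is_on
    return off_time / horizon


def stationary_off_probability(rate_on, rate_off):
    """Long-run OFF probability: mean OFF time over mean cycle time."""
    return rate_on / (rate_on + rate_off)


expected = stationary_off_probability(rate_on=2.0, rate_off=1.0)
observed = simulate_off_fraction(rate_on=2.0, rate_off=1.0, horizon=20000.0, seed=1)
print(round(expected, 3), round(observed, 3))
```

Over a long horizon the simulated OFF fraction approaches the closed-form value, which gives a quick sanity check on any OFF-state probability estimate fed into the clustering and routing decisions.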

The network performance of SMART is compared with clustered and non-clustered schemes. The clustered scheme, known as SMART-NO-MNT (SMART no maintenance), operates similarly to SMART but does not have the functionality of cluster maintenance (i.e., cluster merging and splitting). The non-clustered scheme, called SA-AODV (spectrum-aware AODV), is a variant of the AODV routing protocol designed for the CR environment, which has been used for comparison in the literature [15]. SA-AODV is spectrum-aware and operates in a multi-channel environment. It selects a random channel from the list of available channels for operation. There are two performance metrics of SMART, specifically, SU-PU interference ratio and route discovery frequency. SU-PU interference ratio is the ratio of the total number of SU-PU interfered packets to the total number of packets transmitted by an SU source node. Route discovery frequency is the number of route discoveries (or RREQ messages) initiated by an SU source node.

Fig. 4. Evaluation results: a) SU-PU interference ratio; b) route discovery frequency; c) end-to-end delay.

Fig. 4(a) presents the SU-PU interference ratio, Fig. 4(b) shows the route discovery frequency, and Fig. 4(c) illustrates the end-to-end delay, all obtained by varying the number of SUs. SMART achieves a significantly lower SU-PU interference ratio as well as route discovery frequency. This is because SMART is a cluster-based routing scheme that adopts cluster maintenance (i.e., cluster merging and splitting) and RL. The cluster maintenance mechanisms reduce the dynamic effects of the network caused by PUs’ activities, and RL helps in selecting the right SU next-hop node in routing by learning from the environment and previous actions. Therefore, the selected routes are stable, with lower chances of PUs’ re-appearance. However, SMART achieves higher end-to-end delay. This is because a higher number of SUs increases the number of clusters and the cluster size in the network, resulting in more frequent cluster maintenance. SMART-NO-MNT causes higher route discovery frequency than SMART due to the lack of cluster maintenance mechanisms. As a result, there is a higher chance that clusters lack inter-cluster connections as the number of PUs increases, so the SU source node initiates re-routing more often, and hence a higher number of RREQ messages are sent in order to discover a route. Additionally, SMART-NO-MNT drops a higher number of packets due to the lack of inter-cluster connections; with fewer transmissions, its SU-PU interference and end-to-end delay are naturally lower. SA-AODV causes higher SU-PU interference and route discovery frequency because it lacks both the stability achieved by clustering and the benefit of RL in learning from the environment and previous actions. Moreover, since SA-AODV is a non-clustered scheme, it does not incur delays caused by clustering, which contributes to its lower end-to-end delay. The results show the effectiveness and feasibility of cluster-based routing and of applying RL to routing for CRNs.

VII. CONCLUSION

In this article, we focus on the routing problem in CRNs caused by an intrinsic characteristic of cognitive radio, specifically dynamic channel availability. The problem is addressed by clustering mechanisms, particularly cluster merging and splitting, and an artificial intelligence approach, specifically RL. Together, clustering and RL solve the routing problem in CRNs and improve network scalability and stability. We also propose SMART, a cluster-based routing scheme for CRNs, and evaluate it through simulations. One of the main goals of a CRN is to minimize SUs’ interference to PUs. The simulation results confirm that cluster-based routing minimizes SUs’ interference to PUs, selects more stable routes and achieves significantly lower route discovery frequency.

ACKNOWLEDGEMENT

This work was supported by the Ministry of Education Malaysia under Fundamental Research Grant Scheme

(FRGS) FRGS/1/2014/ICT03/SYUC/02/2. Kok-Lim Alvin Yau and Qiang Ni were also funded under the Small

Grant Scheme (Sunway-Lancaster), grant agreement number SGSSL-FST-CSNS-0114-05 and PVM1204. The work

of Qiang Ni was also supported by the U.K. EPSRC under Grant EP/K011693/1 and by the European FP7 CROWN

project under Grant number PIRSES-GA-2013-610524.

REFERENCES

[1] M.-R. Kim and S.-J. Yoo, “Distributed coordination protocol for common control channel selection in multichannel ad-hoc cognitive radio networks,” in IEEE WIMOB, 2009, pp. 227–232.

[2] W. Zhang and C. K. Yeo, “Cluster-based adaptive multispectrum sensing and access in cognitive radio networks,” Wireless Communications and Mobile Computing, vol. 15, no. 1, pp. 100–114, 2015.

[3] M. Bradonjic and L. Lazos, “Graph-based criteria for spectrum-aware clustering in cognitive radio networks,” Ad Hoc Networks, vol. 10, no. 1, pp. 75–94, 2012.

[4] A. Bourdena, C. X. Mavromoustakis, G. Kormentzas, E. Pallis, G. Mastorakis, and M. B. Yassein, “A resource intensive traffic-aware scheme using energy-aware routing in cognitive radio networks,” Future Generation Computer Systems, vol. 39, pp. 16–28, 2014.

[5] A. C. Talay and D. T. Altilar, “Self adaptive routing for dynamic spectrum access in cognitive radio networks,” Journal of Network and Computer Applications, vol. 36, no. 4, pp. 1140–1151, 2013.

[6] J. Y. Yu and P. H. J. Chong, “A survey of clustering schemes for mobile ad hoc networks,” IEEE Communications Surveys & Tutorials, vol. 7, no. 1, pp. 32–48, 2005.

[7] K.-L. A. Yau, N. Ramli, W. Hashim, and H. Mohamad, “Clustering algorithms for cognitive radio networks: A survey,” Journal of Network and Computer Applications, vol. 45, pp. 79–95, 2014.

[8] M. Youssef, M. Ibrahim, M. Abdelatif, L. Chen, and A. V. Vasilakos, “Routing metrics of cognitive radio networks: A survey,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 92–109, 2014.

[9] R. S. Sutton and A. G. Barto, Introduction to reinforcement learning. MIT Press, 1998.

[10] H. Kim and K. G. Shin, “Efficient discovery of spectrum opportunities with MAC-layer sensing in cognitive radio networks,” IEEE Transactions on Mobile Computing, vol. 7, no. 5, pp. 533–545, 2008.

[11] X.-L. Huang, G. Wang, F. Hu, and S. Kumar, “Stability-capacity-adaptive routing for high-mobility multihop cognitive radio networks,” IEEE Transactions on Vehicular Technology, vol. 60, no. 6, pp. 2714–2729, 2011.

[12] A. C. Talay and D. T. Altilar, “United nodes: cluster-based routing protocol for mobile cognitive radio networks,” IET Communications, vol. 5, no. 15, pp. 2097–2105, 2011.

[13] T. Chen, H. Zhang, G. M. Maggio, and I. Chlamtac, “Cogmesh: a cluster-based cognitive radio network,” in IEEE DySPAN, 2007, pp. 168–178.

[14] J. A. Boyan and M. L. Littman, “Packet routing in dynamically changing networks: A reinforcement learning approach,” in Advances in Neural Information Processing Systems, 1994, pp. 671–671.

[15] W. Kim, M. Gerla, S. Y. Oh, K. Lee, and A. Kassler, “Coroute: a new cognitive anypath vehicular routing protocol,” Wireless Communications and Mobile Computing, vol. 11, no. 12, pp. 1588–1602, 2011.
