Reinforcement Learning Based Routing Protocols
Analysis for Mobile Ad-Hoc Networks
Global Routing versus Local Routing
Redha Mili1 and Salim Chikhi2
1&2 Constantine 2 Abdelhamid Mehri University, Constantine, Algeria
MISC Laboratory
1 redha.mili@gmail.com
2 slchikhil@yahoo.fr
Abstract. Energy consumption and maximum-lifetime routing in Mobile Ad hoc Networks (MANETs) are among the most important issues. In this paper, we compare a global routing approach with a local routing approach, both using reinforcement learning to maximize the routing lifetime. We first propose a global routing algorithm based on the reinforcement learning algorithm called Q-learning, and then we compare its results with a local routing algorithm called AODV-SARSA. Average delivery ratio, end-to-end delay and time to half energy depletion are used as metrics to compare both approaches.
Keywords: Reinforcement Learning, Ad-Hoc Network, MANETs, Energy, AODV, Q-learning.
1 Introduction
In Mobile Ad-hoc Networks (MANETs) [1], the end-to-end delay, the delivery rate, the network lifetime and the energy consumption are indicators of good network management and of the quality of service offered. In order to satisfy the strict requirements on these parameters, MANET nodes must handle routing in an efficient and adaptive way.
Indeed, the routing protocol must perform efficiently in mobile environments; it must be able to adapt automatically to high mobility, dynamic network topology and link changes. Simple rules are not enough to extend the lifetime of the network.
Hence, Reinforcement Learning (RL) [2] methods can be used to control both packet routing decisions and node mobility.
Energy-efficient routing is a real challenge and may be the most important design criterion for MANETs, since mobile nodes are powered by batteries with different and limited capacities.
Generally, MANET routing protocols using reinforcement learning can be classified into two different approaches: Global Routing and Local Routing.
This paper presents a performance evaluation comparing the designed Global Routing protocol EQ-AODV (Energy Q-Learning AODV) with AODV-SARSA, a Local Routing protocol using reinforcement learning.
The EQ-AODV protocol that we present in this paper is a hybridization of AODV (Ad hoc On Demand Distance Vector) [3] and the reinforcement learning algorithm Q-Learning [4].
The remainder of the paper is organized as follows. In Section II, we discuss related work covering adaptive energy-aware routing in MANETs. In Section III we give a general description of the EQ-AODV protocol. In Section IV we present a performance evaluation comparing EQ-AODV and AODV-SARSA; in this simulation we captured several metrics: lifetime, battery energy, end-to-end delay and delivery rate. Finally, Section V concludes the paper.
2 Energy Aware Routing in MANETs
Maximum-lifetime routing protocols perform energy-aware route discovery in two different ways [5-6]: Global Routing and Local Routing. In this section, we survey related work on modeling routing behavior in ad hoc networks. Most of these papers are based on intelligent routing; they combine well-known routing algorithms with well-known learning techniques.
2.1 Global Routing Protocol
In Global Routing, all mobile nodes participate in the route discovery process by forwarding RREQ (Route Request) packets. Subsequently, discovered paths are evaluated according to an energy-aware metric, either by the source or by the destination nodes.
In [7], the key idea is to delay the route request sent by each node: a node holds the RREQ packet for some time, and this time is inversely proportional to its residual battery energy. Hence, paths with nodes that are poor in energy have a minimal chance of being chosen.
In [8], the authors aim to maximize the nodes' lifetime while minimizing the energy consumption. Every source node runs the first-visit ONMC RL algorithm in order to choose the best path based on three main parameters: the minimum-energy path, the max-min residual battery path, and the minimum-cost path.
Like the work in [8], the authors in [9] also choose to combine the routing protocol with an RL algorithm. They first model the issue as a sequential decision-making problem; then, they show how to map routing into a reinforcement learning problem involving a partially observable Markov decision process.
The work in [10] presents a new algorithm called the Energy-Aware Span Routing Protocol (EASRP), which uses energy-saving approaches such as Span and the Adaptive Fidelity Energy Conservation Algorithm (AFECA) [11]. Energy consumption is further optimized by using a hardware circuit called the Remote Activated Switch (RAS) to wake up sleeping nodes. These energy-saving approaches are well established in reactive protocols.
However, certain issues must be addressed when using EASRP in a hybrid protocol, especially a proactive protocol.
2.2 Local Routing Protocol
In Local Routing, each intermediate node, according to its energy profile, makes its own decision in order:
- to participate or not in route discovery,
- to delay RREQ forwarding, or
- to eventually adjust its RREQ forwarding rate.
The routing model proposed in [12] gives nodes two possible modes of behavior: to
cooperate (forward packets) or to defect (drop packets).
In [13], each node j forwards packets with a probability µj. When a packet is sent, each node computes the current equilibrium strategy and uses it as the forwarding probability. A punishment mechanism is also proposed, in which nodes decrease their forwarding probabilities when someone deviates from the equilibrium strategy.
Despite the proven effectiveness of these works, the authors in [14-15] offer other efficient routing techniques by applying Reinforcement Learning to enable each node to learn an appropriate forwarding rate reflecting its willingness to participate in the route discovery process.
In [16], the authors propose a Dynamic Fuzzy Energy State based AODV (DFES-AODV) routing protocol for Mobile Ad-hoc Networks (MANETs) in which, during the route discovery phase, each node uses a Mamdani fuzzy logic system (FLS) to decide its Route Request (RREQ) forwarding probability.
Unlike the work in [16], a Fuzzy Logic System (FLS) is used in [17] to adjust the willingness parameter of the OLSR protocol. Decisions made at each mobile node by the FLS take into account its remaining energy and its expected residual lifetime.
The authors in [18] propose a new intelligent routing protocol for MANETs based on the combination of a Multi-Criteria Decision Making (MCDM) technique with an intelligent method, namely the Intuitionistic Fuzzy Soft Set (IFSS), which reduces uncertainty related to the mobile node and offers energy-efficient routes.
The MCDM technique used in that paper is based on entropy and on the Preference Ranking Organization Method for Enrichment of Evaluations-II (PROMETHEE-II) to determine an efficient route.
3 Proposed Protocol Design
To maximize the network lifetime, we present in this section a reactive protocol called EQ-AODV (Energy Q-learning AODV), which uses a reinforcement learning algorithm.
Based on the original AODV [3], EQ-AODV is an enhanced routing protocol that uses the Q-learning algorithm [4] to acquire whole-network link status information from local communication and to change routes preemptively using the information so learned.
In our approach, the network is modeled as a Markov Decision Process (MDP), as described in [19] (Fig. 1).
Fig. 1. AODV as a Markov Decision Process
Two new values, Qmax and R, are added to the original AODV: respectively, the best Q-value extracted from the routing table of the neighbor that sends the RREQ or RREP, and the reward R, calculated at each RREQ and RREP reception based on energy.
Before presenting the proposed RL model, we compare AODV-SARSA [6] and EQ-AODV:
Table 1. Comparison between AODV-SARSA and EQ-AODV

| Comparison criterion      | AODV-SARSA               | EQ-AODV                |
|---------------------------|--------------------------|------------------------|
| Global vs. Local routing  | Local                    | Global                 |
| Set of states             | Residual Lifetime % (RT) | Nodes                  |
| Set of actions            | Ratio of RREQs forwarded | RREQ (Route Request)   |
| Reward regime             | Average of Drain Rates   | Residual Lifetime (RT) |
| Metric                    | Min-Hop                  | Q-Value                |
The RL model proposed in this article can be described as follows:
3.1 The Set of States
Each node in the network is considered as a state, and the set of all nodes is the state space. Each node:
- calculates the reward R,
- calculates the Q-values with its neighbors,
- selects the next hop to which it should forward packets.
3.2 The Set of Actions
An action corresponds to a packet being delivered from one node to one of its neighbors. The set of neighbors changes due to node mobility. Each node only needs to select its best next hop. The metric used by AODV to choose the best next hop is the hop count. In EQ-AODV, the best next hop is chosen based on the Q-value estimated from the origin to the destination with the Q-learning algorithm.
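As an illustration only (and not the authors' NS-2 implementation), the following Python sketch contrasts the two next-hop choices; the neighbor-table structure and field names are assumptions made for the example.

```python
# Hypothetical next-hop selection: plain AODV minimizes hop count,
# while an EQ-AODV-style choice maximizes the learned Q-value toward the destination.

def aodv_next_hop(neighbors):
    """neighbors: list of dicts such as {'id': 3, 'hop_count': 2, 'q_value': 0.7}."""
    return min(neighbors, key=lambda n: n['hop_count'])['id']

def eq_aodv_next_hop(neighbors):
    """Pick the neighbor whose estimated Q-value toward the destination is highest."""
    return max(neighbors, key=lambda n: n['q_value'])['id']

if __name__ == "__main__":
    table = [{'id': 1, 'hop_count': 2, 'q_value': 0.40},
             {'id': 2, 'hop_count': 3, 'q_value': 0.85}]
    print(aodv_next_hop(table))     # -> 1 (shortest path)
    print(eq_aodv_next_hop(table))  # -> 2 (energy-aware, higher Q-value)
```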
3.3 Reward
To calculate the Q-value for a destination, we chose a reward signal based on the energy drain rate (DR) of the node and on its residual energy. The drain rate is calculated using the exponential moving average method [20]:

$DR_{new} = \alpha \cdot DR_{old} + (1 - \alpha) \cdot DR_{sample}$   (1)

where $DR_{old}$ and $DR_{new}$ indicate, respectively, the old and the newly calculated energy drain rate values, and $DR_{sample}$ is the drain rate measured over the last interval. The weighting factor $\alpha$ is chosen so that more weight is given to the current drain rate value. To measure the energy drain rate per second, each node monitors its energy consumption during a sampling interval of T seconds [20].

We use $DR_{new}$ together with RE (Residual Energy) to calculate RT (Residual Lifetime), which is taken as our reward:

$RT = \dfrac{RE}{DR_{new}}$   (2)
3.4 RL Algorithm
We chose the Q-Learning algorithm [4], one of the most popular RL algorithms, and we define an experience tuple (st, at, Rt, st+1, at+1) summarizing a single transition of the RL agent in its environment, where:
- st is the state before the transition,
- at is the action taken,
- Rt is the immediate reward,
- st+1 is the resulting state,
- and at+1 is the action chosen at the next time step t+1.
Let α ∈ [0, 1] and γ ∈ [0, 1] be the learning rate and the discount factor, respectively. The action-value function Q(s, a) estimates the expected future reward to the agent when it performs a given action in a given state and then follows the learned policy π.
Algorithm. Q-Learning algorithm
Initializations:
  Initialize Q(s, a);
  Initialize s;
Repeat for each time step:
  Choose an action a in s using a policy derived from Q (e.g., ε-greedy);
  Take action a;
  Observe the reward r and the next state s';
  Update Q(s, a): Q(s, a) ← Q(s, a) + α (r + γ max_a' Q(s', a') − Q(s, a));
  s ← s';
Until the terminal state is reached
In this paper, we assume that Q-Learning is distributed and that each node holds the part of the Q(s, a) table that concerns its neighbors.
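The sketch below illustrates, under stated assumptions, how such a per-node, per-neighbor slice of the Q-table could be updated with the standard Q-learning rule when a RREQ/RREP arrives carrying the neighbor's best Q-value (Qmax) and the reward; the table layout and method names are illustrative, not the authors' exact implementation.

```python
from collections import defaultdict

class NodeQTable:
    """Each node keeps Q-values only for its own neighbors (its slice of Q(s, a))."""

    def __init__(self, alpha=0.9, gamma=0.1):
        self.alpha = alpha           # learning rate (value used in the simulations)
        self.gamma = gamma           # discount factor (value used in the simulations)
        self.q = defaultdict(float)  # key: (destination, neighbor) -> Q-value

    def update(self, dest, neighbor, reward, neighbor_qmax):
        """One Q-learning step when a RREQ/RREP from `neighbor` carries its Qmax and reward R."""
        key = (dest, neighbor)
        td_target = reward + self.gamma * neighbor_qmax
        self.q[key] += self.alpha * (td_target - self.q[key])

    def best_next_hop(self, dest, neighbors):
        """Greedy next-hop choice toward `dest` among the current neighbors."""
        return max(neighbors, key=lambda n: self.q[(dest, n)])

# Example: a node learns from a RREP received from neighbor 4 toward destination 9
table = NodeQTable()
table.update(dest=9, neighbor=4, reward=120.0, neighbor_qmax=80.0)
print(table.best_next_hop(9, [4, 5]))  # -> 4
```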
4 Experimental Results and Discussion
In this section, we first describe the simulation environment used in our study and then discuss the results in detail. Our simulations are implemented in the Network Simulator (NS-2) [21]. At this stage of our study, we discuss the results of both EQ-AODV and AODV-SARSA.
In brief, the simulation parameters were set as shown in Table 2.
Table 2. Simulation Parameters Setting

| Simulation parameter     | Value           |
|--------------------------|-----------------|
| Network scale            | 800 x 800       |
| Simulation time          | 900 s           |
| Number of nodes          | 50              |
| Mobility model           | Random Waypoint |
| Pause time               | 0 s             |
| Traffic type             | CBR             |
| Number of connections    | 10, 20, 30      |
| Packet transmission rate | 4 packets/s     |
| Initial energy           | 10 joules       |
| Transmission power       | 0.6 Watt        |
| Reception power          | 0.3 Watt        |
| Sampling interval T      | 6 s             |
| Learning rate            | 0.9             |
| Discount factor          | 0.1             |
To evaluate the performance of EQ-AODV, we compare the EQ-AODV algorithm with AODV-SARSA using the following metrics (a sketch of how they could be computed is given after the list):
- Delivery Rate: the ratio of packets reaching the destination node to the total packets generated at the source node.
- Average End-to-End Delay: the interval between sending by the source node and reception by the destination node, which includes processing time and queuing time.
- Time to Half Nodes Depletion: the time at which 50% of the nodes in the network have exhausted their batteries [14].
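For reference, here is a hedged Python sketch of how these three metrics could be computed from simulation traces; the trace fields and the example values are assumptions, not data from our simulations.

```python
def delivery_rate(sent, received):
    """Percentage of packets that reached the destination out of packets generated at the source."""
    return 100.0 * received / sent if sent else 0.0

def average_end_to_end_delay(delays):
    """Mean of (receive_time - send_time) over all delivered packets, in seconds."""
    return sum(delays) / len(delays) if delays else 0.0

def time_to_half_depletion(depletion_times, num_nodes):
    """Time at which 50% of the nodes have exhausted their batteries.

    depletion_times: instants at which nodes ran out of energy (one entry per dead node).
    """
    half = num_nodes // 2
    times = sorted(depletion_times)
    return times[half - 1] if len(times) >= half else None

print(delivery_rate(sent=1000, received=853))          # -> 85.3
print(average_end_to_end_delay([0.08, 0.09, 0.10]))    # -> 0.09 s
print(time_to_half_depletion([90, 110, 122, 140], 8))  # -> 140 (4th node of 8 to die)
```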
Tables 3, 4 and 5 show the performance of EQ-AODV and AODV-SARSA with 50 nodes and a maximum velocity of 10 m/s under low, medium and high traffic.
Table 3. Simulation Results for Delivery Rate

| Delivery Rate                    | AODV-SARSA  | EQ-AODV     |
|----------------------------------|-------------|-------------|
| Low traffic - 10 connections     | 82.45798333 | 85.34965    |
| Medium traffic - 20 connections  | 68.74455517 | 71.30487333 |
| High traffic - 30 connections    | 54.77744    | 57.02822    |
Fig. 2. Average Delivery Ratio
The results (Table 3 and Fig. 2) show that EQ-AODV has the best delivery ratio: the ratio of packets reaching the destination is higher under low, medium and high traffic.
Table 4. Simulation Results for End-to-End Delay

| End-to-End Delay                 | AODV-SARSA  | EQ-AODV     |
|----------------------------------|-------------|-------------|
| Low traffic - 10 connections     | 0.046278313 | 0.085645713 |
| Medium traffic - 20 connections  | 0.079130017 | 0.095483247 |
| High traffic - 30 connections    | 0.152210963 | 0.21413287  |
Fig. 3. End to End Delay
By changing the hop-count metric of AODV, which represents the shortest path, we expected the end-to-end delay to degrade. The results (Table 4 and Fig. 3) show that the end-to-end delay is clearly the weak point of EQ-AODV: AODV-SARSA is better under low, medium and high traffic.
Table 5. Simulation Results for Time to Half Energy Depletion

| Time to Half Energy Depletion    | AODV-SARSA  | EQ-AODV     |
|----------------------------------|-------------|-------------|
| Low traffic - 10 connections     | 118.2996865 | 122.0999305 |
| Medium traffic - 20 connections  | 86.9253407  | 90.1144598  |
| High traffic - 30 connections    | 74.99828117 | 76.31837853 |
Fig. 4. Time to Half Energy Depletion
Regarding energy consumption, the results (Table 5 and Fig. 4) show that EQ-AODV is better than AODV-SARSA in all simulations. The Time to Half Energy Depletion is clearly better under low, medium and high traffic. The longer the Time to Half Energy Depletion, the longer the network lifetime.
5 Conclusion
In this paper we have addressed the issue of energy-aware routing while maximizing the network lifetime in MANETs.
Using simulation, we chose to compare two types of routing algorithms: Global and Local Routing. Both algorithms are based on reinforcement learning techniques.
The results show that both algorithms achieve encouraging performance for MANETs. EQ-AODV gives better performance than AODV-SARSA for most metrics, such as the packet delivery ratio and energy consumption. However, AODV-SARSA has better end-to-end delay performance.
We can conclude that the choice of routing algorithm should be made according to the metric that the network wants to optimize, depending on the service demand.
Our future work will focus on implementing other temporal-difference reinforcement learning algorithms for both the local and the global approach, and on testing the proposal under different network conditions (high/low mobility, high/low density, ...).
References
1. Giordano, S.: Mobile ad hoc networks. Handbook of Wireless Networks and Mobile Computing, pp. 325-346. (2002).
2. Sutton, R.S., Barto, A.G.: Reinforcement Learning, Second edition, in progress, MIT
Press (2014).
3. Perkins, C., Belding-Royer, E., Das, S.: Ad hoc On-Demand Distance Vector (AODV)
Routing. Network Working Group, ftp://ftp.nordu.net/rfc/rfc3561.txt, July (2003).
4. Watkins, C. J. C. H., Dayan, P.: Q-learning. Machine Learning, 8: 279-292, (1992).
5. Chettibi, S., Chikhi, S.: A Survey of Reinforcement Learning Based Routing Protocols for Mobile Ad-Hoc Networks. Recent Trends in Wireless and Mobile Networks, Communications in Computer and Information Science, Springer, vol. 162, pp. 1-13. (2011).
6. Vassileva, N., Barcelo-Arroyo, F.: A Survey of Routing Protocols for Maximizing the
Lifetime of Ad Hoc Wireless Networks. International Journal of Software Engineering
and Its Applications, Vol. 2, No. 3, pp. 77 9. (2008).
7. Cho, W., Kim, S. L.: A fully distributed routing algorithm for maximizing lifetime of a wireless ad hoc network. In Proc. IEEE 4th Int. Workshop on Mobile & Wireless Commun. Network, pp. 670-674. Sep. (2002).
8. Naruephiphat, W., Usaha, W.: Balancing tradeoffs for energy-efficient routing in MANETs based on reinforcement learning. In: The IEEE 67th Vehicular Technology Conference, (2008).
9. Nurmi, P.: Reinforcement learning for routing in ad-hoc networks. In: Proceedings of the
Fifth International Symposium on Modeling and Optimization in Mobile, Ad-Hoc, and
Wireless Networks (WiOpt), (2007).
10. Ravi, G., Kashwan, K. R.: A new routing protocol for energy efficient mobile applications for ad hoc networks. Computers & Electrical Engineering, 48, 77-85. (2015).
11. Xu, Y., Heidemann, J., Estrin, D.: Geography-informed Energy Conservation for Ad-Hoc Routing. Proceedings of the 7th Annual International Conference on Mobile Computing and Networking, pp. 70-84. (2001).
12. Srinivasan, V., Nuggehalli, P., Chiasserini, C. F., Rao, R. R.: Cooperation in wireless ad hoc networks. In Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM). IEEE Computer Society, pp. 808-817. (2003).
13. Altman, E., Kherani, A. A., Michiardi, P., Molva, R.: Non-cooperative forwarding in ad-
hoc networks. In Proceedings of the 15th IEEE International Symposium On Personal,
Indoor and Mobile Radio Communications, (2004).
14. Chettibi, S., Chikhi, S.: An Adaptive Energy-Aware Routing Protocol for MANETs Using the SARSA Reinforcement Learning Algorithm. Evolving and Adaptive Intelligent Systems (EAIS), IEEE Conference on, pp. 84-89. (2012).
15. Chettibi, S., Chikhi, S.: Adaptive maximum-lifetime routing in mobile ad-hoc networks
using temporal difference reinforcement learning. Evol. Syst. 5, Springer Berlin Heidel-
berg, (2014).
16. Chettibi, S., Chikhi, S.: Dynamic fuzzy logic and reinforcement learning for adaptive energy efficient routing in mobile ad-hoc networks. Applied Soft Computing, 38, 321-328. (2016).
17. Chettibi, S., Chikhi, S.: FEA-OLSR: An adaptive energy-aware routing protocol for MANETs using zero-order Sugeno fuzzy system. International Journal of Computer Science Issues (IJCSI), 10(2), 136-141. (2013).
18. Das, S. K., Tripathi, S.: Intelligent energy-aware efficient routing for MANET. Wireless Networks 24(4): 1139-1159. (2018).
19. Sutton, R., Barto, A.: Reinforcement Learning, MIT Press, Cambridge, MA, (1998).
20. Kim, D., Garcia-Luna-Aceves, J. J., Obraczka, K., Cano, J. C., Manzoni, P.: Power-aware routing based on the energy drain rate for mobile ad-hoc networks. In: 11th International Conference on Computer Communications and Networks, (2002).
21. NS, The UCB/LBNL/VINT Network Simulator (NS), http://www.isi.edu/nsnam/ns/,
(2004).