Figure 5: BlueSky user interface.

Source publication
Article
Full-text available
Deep reinforcement learning (DRL) has been widely adopted recently for its ability to solve decision-making problems that were previously out of reach due to a combination of nonlinearity and high dimensionality. In the last few years, it has spread in the field of air traffic control (ATC), particularly in conflict resolution. In this work, we conduc...

Context in source publication

Context 1
... is now a full-blown, user-friendly air traffic simulator that can run at higher update rates than expected for a high number of aircraft. The user interface is shown in Figure 5. To sum up, the authors recommend that the independent development environment be used to carry out the research in the free flight mode, and the environment based on the BlueSky platform be used to carry out the research in the en-route mode. ...

Citations

... Reinforcement learning has already been suggested, along with other approaches, for automating the conflict detection and resolution function in ATM [149]. In summary, the proposals currently have limitations in handling complex traffic conditions. ...
Article
Full-text available
In the contemporary landscape, the escalating deployment of drones across diverse industries has ushered in consequential concerns, chief among them the security of drone operations. These concerns extend to a spectrum of challenges, encompassing collisions with stationary and mobile obstacles and encounters with other drones. Moreover, the inherent limitations of drones, namely constraints on energy consumption, data storage capacity, and processing power, present formidable obstacles in developing collision avoidance algorithms. This review paper explores the challenges of ensuring safe drone operations, focusing on collision avoidance. We explore collision avoidance methods for UAVs from various perspectives, categorizing them into four main groups: obstacle detection and avoidance, collision avoidance algorithms, drone swarms, and path optimization. Additionally, our analysis delves into machine learning techniques, discusses metrics and simulation tools to validate collision avoidance systems, and delineates local and global algorithmic perspectives. Our evaluation reveals significant challenges in current drone collision prevention algorithms. Despite advancements, critical UAV network and communication challenges are often overlooked, prompting a reliance on simulation-based research due to cost and safety concerns. The challenges encompass precise detection of small and moving obstacles, minimizing path deviations at minimal cost, the high expense of machine learning and automation, the prohibitive cost of real testbeds, limited environmental comprehension, and security apprehensions. By addressing these key areas, future research can advance the field of drone collision avoidance and pave the way for safer and more efficient UAV operations.
... By learning from historical experiences in an interactive manner, RL can extract decision-making knowledge and generalize it to unseen scenarios, effectively tackling new, similar problems [26]-[30]. Learning-based methods have gained prominence recently in addressing aircraft conflict resolution challenges [31], [32]. These methods follow the framework of the Markov decision process (MDP), wherein the aircraft are modeled as interactive agents [33]. ...
Article
Full-text available
The escalating density of airspace has led to sharply increased conflicts between aircraft. Efficient and scalable conflict resolution methods are crucial to mitigate collision risks. Existing learning-based methods become less effective as the number of aircraft increases due to their redundant information representations. In this paper, to accommodate the increased airspace density, a novel graph reinforcement learning (GRL) method is presented to efficiently learn deconfliction strategies. A time-evolving conflict graph is exploited to represent the local state of individual aircraft and the global spatiotemporal relationships between them. Equipped with the conflict graph, GRL can efficiently learn deconfliction strategies by selectively aggregating aircraft state information through a multi-head attention-boosted graph neural network. Furthermore, a temporal regularization mechanism is proposed to enhance learning stability in highly dynamic environments. Comprehensive experimental studies have been conducted on an OpenAI Gym-based flight simulator. Compared with existing state-of-the-art learning-based methods, the results demonstrate that GRL saves considerable training time while achieving significantly better deconfliction strategies in terms of safety and efficiency metrics. In addition, GRL exhibits strong scalability and robustness as the number of aircraft increases.
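The core mechanism of this citing work, attention-weighted aggregation of neighbouring aircraft states, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the state dimensions and random projection matrices stand in for learned parameters, and a single attention head replaces the paper's multi-head graph neural network.

```python
# Illustrative sketch (not the paper's code): scaled dot-product attention used
# to aggregate the states of a variable number of neighbouring aircraft into a
# fixed-size embedding for the ownship. Dimensions and weights are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def attention_aggregate(own_state, intruder_states, d_k=16):
    """Aggregate N intruder states into one context vector via attention."""
    d = own_state.shape[0]
    # Hypothetical learned projections; random here for illustration.
    W_q, W_k, W_v = (rng.standard_normal((d, d_k)) for _ in range(3))
    q = own_state @ W_q                      # (d_k,)  query from ownship
    K = intruder_states @ W_k                # (N, d_k) keys from intruders
    V = intruder_states @ W_v                # (N, d_k) values from intruders
    scores = K @ q / np.sqrt(d_k)            # (N,) similarity scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax attention weights
    return weights @ V                       # (d_k,) permutation-invariant summary

own = rng.standard_normal(6)                 # e.g. [x, y, z, vx, vy, vz]
intruders = rng.standard_normal((5, 6))      # works for any number of intruders
context = attention_aggregate(own, intruders)
print(context.shape)                         # (16,)
```

Because the softmax-weighted sum is permutation-invariant, the ownship embedding does not depend on the order in which intruders are listed, which is the property that makes graph attention attractive for variable traffic.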
... Conflict-avoidance strategies formulated as a reinforcement learning problem have been studied by numerous research groups, many of which represent the strategies as neural networks. A recent literature review is provided in [17]. In the present work, the maneuver strategy is computed offline and stored as a look-up table for the "value" function. ...
Technical Report
Full-text available
This report describes a novel approach to the development of a horizontal maneuver guidance strategy for Detect-and-Avoid systems. The maneuver guidance strategy provides a directive turn action that can be automatically executed by the vehicle's autopilot system, taking into account the cost of recapturing the flight plan path. Pairwise conflict scenarios with non-accelerating intruders are simulated to validate the effectiveness of the maneuver guidance strategy. Initial results suggest the strategy is more effective for a faster ownship than for a slower one, which is unable to avoid conflict in certain scenarios against fast intruders. These findings indicate the novel approach shows great potential, but its performance requires improvement, which will be addressed in future work.
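The look-up-table idea referenced in the excerpt above (an offline-computed value function queried online for a directive turn) can be sketched as follows. The state discretisation, action set, and value table here are hypothetical placeholders, not the report's actual tables.

```python
# Minimal sketch of an offline value table over a discretised relative geometry,
# queried online to pick the turn command with the best value. All bins and
# values are illustrative assumptions.
import numpy as np

bearings = np.linspace(-180, 180, 37)        # relative bearing bins (deg)
ranges = np.linspace(0.5, 10.0, 20)          # range bins (NM)
turns = np.array([-30, -15, 0, 15, 30])      # candidate turn actions (deg)

# Offline phase: fill V[s, a] by simulation/dynamic programming (stubbed here).
V = np.random.default_rng(1).random((len(bearings), len(ranges), len(turns)))

def best_turn(rel_bearing_deg, range_nm):
    """Online phase: nearest-bin lookup, then greedy action selection."""
    i = np.abs(bearings - rel_bearing_deg).argmin()
    j = np.abs(ranges - range_nm).argmin()
    return turns[V[i, j].argmax()]

print(best_turn(42.0, 3.2))                  # e.g. a directive turn in degrees
```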
... The reinforcement learning (RL) model uses the training information to evaluate the actions taken rather than to prescribe the correct actions [23]. This is what distinguishes the RL technique from other learning strategies. ...
Preprint
Full-text available
The improvement of energy and spectral efficiency in networks can be realized by seamlessly integrating energy harvesting, cognitive radio technologies, and NOMA techniques. These complementary strategies work together to optimize resource usage and address challenges related to energy consumption. Additionally, the adaptability and versatility of UAVs offer an innovative solution for enhancing coverage performance, improving not only connectivity but also overall efficiency and reliability. This study introduces a novel approach, the Deep Reinforcement Learning-Random Walrus (DRL-RW) algorithm, to enhance energy efficiency. The developed method combines deep reinforcement learning and the Random Walrus optimization technique to efficiently allocate spectrum resources and manage energy harvesting in a dynamic environment. The DRL-RW algorithm empowers UAVs to learn optimal spectrum sharing strategies and energy harvesting policies, while the Random Walrus optimization enhances the algorithm's adaptability and speed in exploring diverse solutions. Simulation results demonstrate the effectiveness of the DRL-RW algorithm, indicating improvements in various performance metrics, including reduced energy consumption, shorter computation time, faster convergence, better signal-to-noise ratio, increased throughput, longer network lifetime, more harvested energy, and overall superior network performance compared to baseline techniques. These findings highlight the efficacy of the DRL-RW approach in addressing challenges associated with energy management in cognitive radio networks. The integration of UAVs, NOMA networks, and the novel algorithm represents a promising direction for advancing energy-efficient communication systems.
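The excerpt above notes what sets RL apart: the learner receives evaluative feedback (a reward for the action it tried) rather than instructive feedback (the correct action). A toy ε-greedy bandit, with arbitrary arm means, makes the distinction concrete.

```python
# Toy sketch of evaluative feedback: the learner is never told the correct
# action, only a reward for the action it tried. Arm means and hyperparameters
# are arbitrary illustration values.
import numpy as np

rng = np.random.default_rng(2)
true_means = np.array([0.2, 0.5, 0.8])       # unknown to the agent
Q = np.zeros(3)                              # estimated action values
counts = np.zeros(3)
eps = 0.1

for _ in range(2000):
    a = rng.integers(3) if rng.random() < eps else int(Q.argmax())
    r = rng.normal(true_means[a], 0.1)       # evaluative feedback: a reward only
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a]           # incremental mean update

print(Q.round(2))                            # estimates approach true_means
```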
... A detailed analysis of these methods can be found in our previous paper (Chen et al., 2023), where we also discussed the contributions of Guo et al. (2021), Brittain and Wei (2022), Pham et al. (2019), Yilmaz et al. (2021) and Ribeiro et al. (2020a) in the field of RL-based CR methods. Additionally, Wang et al. have comprehensively reviewed the research on RL-based CR methods up to 2021 (Wang et al., 2022). This paper will discuss the specific challenges that may arise while extending the 2D RL-based methods listed in Table 2 to the 3D space. ...
... In terms of success rate, it is challenging for RL-based CR methods to reach 100% (Wang et al., 2022). This is most likely because the agent does not know what the conflict detection model (or minimum safety separation) is, and it is simply trained to act as appropriately as possible based on feedback from the RL environment. ...
... 1. Our developed 2D model has achieved competitive results in terms of success rate and extra flight distance compared with state-of-the-art RL-based CR models in previous studies (Chen et al., 2023); 2. Neural network parameters for other state-of-the-art RL-based CR models are not publicly available; 3. The current research on RL methods in CR lacks a baseline (Wang et al., 2022); 4. 3D RL-based CR methods (especially those that consider flight uncertainty and wind) are rarely reported. ...
Article
Full-text available
Reinforcement learning (RL) techniques have been studied for solving the conflict resolution (CR) problem in air traffic management, leveraging their computational potential and ability to handle uncertainty. However, challenges remain that impede the application of RL methods to CR in practice, including three-dimensional manoeuvres, generalisation, trajectory recovery, and success rate. This paper proposes a general multi-agent reinforcement learning approach for real-time three-dimensional multi-aircraft conflict resolution, in which agents share a neural network and are deployed on each aircraft to form a distributed decision-making system. To address the challenges, several technologies are introduced, including a partial observation model based on imminent threats for generalisation, a safety separation relaxation model for multiple flight levels for three-dimensional manoeuvres, an adaptive manoeuvre strategy for trajectory recovery, and a conflict buffer model for success rate. The Rainbow Deep Q-learning Network (DQN) is used to enhance the efficiency of the RL process. A simulation environment that considers flight uncertainty (resulting from mechanical and navigation errors and wind) is constructed to train and evaluate the proposed approach. The experimental results demonstrate that the proposed method can resolve conflicts in scenarios with much higher traffic density than in today’s real-world situations.
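A plausible sketch of the "imminent threats" partial observation described in this abstract is shown below: it ranks intruders by distance at the closest point of approach (CPA) and packs the k most urgent into a fixed-size vector. The state layout, ranking criterion, and k are assumptions for illustration, not the authors' exact model.

```python
# Hedged sketch of a partial observation built from imminent threats, ranked by
# distance at the closest point of approach (CPA). Not the authors' code.
import numpy as np

def cpa(rel_pos, rel_vel):
    """Time to and distance at closest point of approach for one intruder."""
    v2 = rel_vel @ rel_vel
    t = 0.0 if v2 < 1e-9 else max(0.0, -(rel_pos @ rel_vel) / v2)
    return t, np.linalg.norm(rel_pos + t * rel_vel)

def partial_observation(own, intruders, k=3):
    """Fixed-size observation: the k intruders with the smallest CPA distance."""
    rel = [(i[:3] - own[:3], i[3:] - own[3:]) for i in intruders]
    scores = [cpa(p, v) for p, v in rel]
    order = np.argsort([d for _, d in scores])[:k]
    obs = np.zeros((k, 6))                   # zero-padded if fewer than k
    for row, idx in enumerate(order):
        obs[row] = intruders[idx] - own      # relative state of each threat
    return obs.ravel()

rng = np.random.default_rng(3)
own = rng.standard_normal(6)                 # [x, y, z, vx, vy, vz]
intruders = rng.standard_normal((8, 6))
print(partial_observation(own, intruders).shape)   # always (18,)
```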
... The last couple of years have shown increasing interest in research on utilizing Deep Reinforcement Learning (DRL) methods for conflict resolution and safe multi-agent navigation in Air Traffic Control (ATC) operations [1]. One issue with conventional neural networks underlying DRL methods, however, is the requirement of a fixed-length input vector. ...
... One issue with conventional neural networks underlying DRL methods, however, is the requirement of a fixed-length input vector. Most current research therefore pre-selects the number of aircraft that will be considered for the model's input or keeps the number of agents/aircraft in the environment artificially constant, which limits the application of trained models in environments with a variable number of aircraft [1]. ...
Conference Paper
Full-text available
Deep Reinforcement Learning has seen increasing usage in the field of Air Traffic Control over the last couple of years. As the number of aircraft in a given sector of airspace is not constant, there is a need for methods that are invariant to the number of agents in the system. Often this is done by making a selection of the aircraft that will be included in the state, which introduces human biases. Another option that has been used is Recurrent Neural Networks to process the entire sequence of aircraft present. These methods, however, are sequence-dependent and can give different results depending on the order in which the aircraft are given, which is undesirable. Methods that rely solely on attention mechanisms, such as Transformers, allow sequential data to be processed in a sequence-invariant manner by using multi-head attention. However, because traditional Transformers operate on individual tokens, this does not allow relative state information to be encoded into the hidden state. This paper shows that by performing a transformation operation on the key and value tokens, it is possible to use Transformers on relative states, at the cost of a factor of (N-1) additional attention computations, where N is the number of agents in the system. This adaptation allows relative-state Transformers to obtain significantly higher performance than standard Transformers. The results also showed that using attention mechanisms to construct the initial observation vector out of a total of 20 agents results in performance similar to, but slightly lower than, handcrafted observation vectors, without requiring manual selection of the important agents. Future research should investigate whether additional changes to the attention mechanisms and their training can result in higher performance.
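The sequence-invariance property this paper exploits can be demonstrated in a few lines: attention is computed over tokens built from relative states, so the ownship embedding is the same under any permutation of the other agents. The dimensions and random projections below are illustrative stand-ins for learned weights, and a single head replaces multi-head attention.

```python
# Illustrative sketch only: attention over per-pair *relative* states, so the
# ownship embedding is invariant to the order of the other aircraft and works
# for any number of agents N. Projections are random stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(4)
d, d_k = 4, 8
W_k, W_v = rng.standard_normal((d, d_k)), rng.standard_normal((d, d_k))
q = rng.standard_normal(d_k)                 # learned query for the ownship

def embed_own(own, others):
    rel = others - own                       # (N-1, d) relative-state tokens
    K, V = rel @ W_k, rel @ W_v              # keys/values from relative states
    s = K @ q / np.sqrt(d_k)
    w = np.exp(s - s.max()); w /= w.sum()    # softmax over the other agents
    return w @ V                             # order-invariant embedding

states = rng.standard_normal((6, d))         # 6 agents; any N works
emb = embed_own(states[0], states[1:])
shuffled = states[1:][rng.permutation(5)]
print(np.allclose(emb, embed_own(states[0], shuffled)))   # True
```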
... Alternatively, a learning method that is intrinsically motivated to minimize the number of secondary conflicts resulting from successive resolution manoeuvres could be used. Deep Reinforcement Learning (DRL) is one such method that has been researched in various studies for conflict resolution [5]. However, one main drawback of DRL is the 'black-box problem', which makes it challenging to certify and predict behaviour in all stages of flight. ...
Conference Paper
Full-text available
The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation and different conflict resolution methods. One of the main disadvantages of analytical conflict resolution methods, in high-traffic-density scenarios, is that they can cause instabilities of the airspace due to a domino effect of secondary conflicts. Therefore, many studies have also investigated other methods of conflict resolution, such as Deep Reinforcement Learning, which have shown positive results but tend to be hard to explain due to their black-box nature. This paper investigates whether it is possible to explain the behaviour of a Soft Actor-Critic model, trained for resolving vertical conflicts in a layered urban airspace, by interpreting the policy through a heat map of the selected actions. It was found that the model actively changes its policy depending on the degrees of freedom and has a tendency to adopt preventive behaviour on top of conflict resolution. This behaviour can be directly linked to a decrease in secondary conflicts when compared to analytical methods and can potentially be incorporated into these methods to improve them while maintaining explainability.
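The heat-map interpretation described in this abstract amounts to sweeping a policy over a grid of two state variables and colouring each cell by the selected action. The sketch below uses a stub policy and hypothetical state variables (relative altitude and time to conflict) purely to show the procedure; the trained Soft Actor-Critic model would replace the stub.

```python
# Sketch of the heat-map interpretation idea: evaluate a (stub) policy over a
# grid of two state variables and plot the selected action per cell. The state
# variables and policy are hypothetical placeholders, not the trained SAC model.
import numpy as np
import matplotlib.pyplot as plt

def policy(rel_alt, t_conflict):
    """Stub vertical-speed command; a trained actor network would go here."""
    return np.tanh(-rel_alt / 50.0) * np.exp(-t_conflict / 60.0)

rel_alts = np.linspace(-200, 200, 81)        # relative altitude (m)
times = np.linspace(0, 120, 61)              # time to conflict (s)
grid = np.array([[policy(a, t) for t in times] for a in rel_alts])

plt.imshow(grid, aspect="auto", origin="lower", cmap="coolwarm",
           extent=[times[0], times[-1], rel_alts[0], rel_alts[-1]])
plt.xlabel("time to conflict (s)"); plt.ylabel("relative altitude (m)")
plt.colorbar(label="commanded vertical action")
plt.show()
```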
... Recent advances in machine learning have led to the emergence of deep reinforcement learning (DRL) [27], a powerful technique that combines reinforcement learning and deep learning to solve a wide range of optimization problems [28][29][30][31][32]. The deep Q-network (DQN) was the first DRL algorithm [27]. ...
Article
Full-text available
The need for reliable wireless communication in remote areas has led to the adoption of unmanned aerial vehicles (UAVs) as flying base stations (FlyBSs). FlyBSs hover over a designated area to ensure continuous communication coverage for mobile users on the ground. Moreover, rate-splitting multiple access (RSMA) has emerged as a promising interference management scheme in multi-user communication systems. In this paper, we investigate an RSMA-enhanced FlyBS downlink communication system and formulate an optimization problem to maximize the sum-rate of users, taking into account the three-dimensional FlyBS trajectory and RSMA parameters. To address this continuous non-convex optimization problem, we propose a TD3-RFBS optimization framework based on the twin-delayed deep deterministic policy gradient (TD3). This framework overcomes the limitations associated with the overestimation issue encountered in the deep deterministic policy gradient (DDPG), a well-known deep reinforcement learning method. Our simulation results demonstrate that TD3-RFBS outperforms existing solutions for FlyBS downlink communication systems, indicating its potential as a solution for future wireless networks.
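The overestimation fix TD3 adds over DDPG, taking the minimum of two critics when forming the bootstrap target, can be sketched compactly. The critics below are random linear stubs, and the smoothing noise is applied to the critic input for brevity; in TD3 proper it perturbs the target action.

```python
# Minimal numpy sketch of the TD3 ingredient the abstract refers to: the target
# uses the minimum of two critics, which counteracts DDPG's overestimation.
# Networks are stubbed as random linear critics purely for illustration.
import numpy as np

rng = np.random.default_rng(5)
w1, w2 = rng.standard_normal(8), rng.standard_normal(8)  # two target critics

def td3_target(reward, next_sa, gamma=0.99, noise_std=0.2, clip=0.5):
    """Clipped target smoothing + clipped double-Q target."""
    noise = np.clip(rng.normal(0, noise_std, next_sa.shape), -clip, clip)
    sa = next_sa + noise                     # smoothed critic input (simplified)
    q1, q2 = sa @ w1, sa @ w2                # two independent Q estimates
    return reward + gamma * min(q1, q2)      # min(...) curbs overestimation

print(td3_target(reward=1.0, next_sa=rng.standard_normal(8)))
```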
... The autonomous AAM necessitates real-time decision-making to resolve conflicts while upholding safety and mission requirements. Deep reinforcement learning (DRL) algorithms offer a promising alternative, as they can handle real-time uncertainty and dynamic interactions among autonomous aircraft during conflict resolution [7], while leveraging real-time navigation information [8]. However, DRL is vulnerable to real-time data integrity issues arising from adversarial disturbances and communication constraints. ...
Conference Paper
The increasing utilization of unmanned aerial vehicles (UAVs) in advanced air mobility (AAM) necessitates highly automated conflict resolution and collision avoidance strategies. Consequently, reinforcement learning (RL) algorithms have gained popularity in addressing conflict resolution strategies among UAVs. However, increasing digitization introduces challenges related to packet drop constraints and various adversarial cyber threats, rendering AAM fragile. Adversaries can introduce perturbations into the system states, reducing the efficacy of learning algorithms. Therefore, it is crucial to systematically investigate the impact of increased digitization, including adversarial cyber threats and packet drop constraints, to study the fragile characteristics of AAM infrastructure. This study examines the performance of artificial intelligence (AI) based path planning and conflict resolution strategies under different adversarial and stochastic packet drop constraints in UAV systems. The fragility analysis focuses on the number of conflicts, collisions, and fuel consumption of the UAVs with respect to their mission, considering various adversarial attacks and packet drop constraint scenarios. A safe deep Q-network (DQN) architecture is utilized to navigate the UAVs, mitigating the adversarial threats, and is benchmarked against a vanilla DQN using the necessary metrics. The findings provide a foundation for investigating the modifications to learning paradigms necessary to develop antifragile strategies against emerging adversarial threats.
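The packet-drop constraint studied in this paper can be emulated with a simple hold-last-sample model: with some probability an observation update is lost and the agent keeps acting on the last state it received. The Bernoulli drop model and probability below are assumptions for illustration.

```python
# Hedged sketch of a packet-drop constraint: with probability p_drop an
# observation update is lost and the agent must act on the last successfully
# received state (a common hold-last-sample model; an assumption here).
import numpy as np

rng = np.random.default_rng(6)

def degraded_stream(true_states, p_drop=0.3):
    """Yield what the agent actually sees under Bernoulli packet loss."""
    last = true_states[0]
    for s in true_states:
        if rng.random() >= p_drop:           # packet delivered
            last = s
        yield last                           # else: stale observation persists

states = rng.standard_normal((10, 4))
for seen in degraded_stream(states):
    pass                                     # feed `seen` to the DQN agent here
```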
... Air traffic management (ATM) research traditionally focuses on macroscopic aspects of air transportation such as airspace design, traffic flow management, airport planning and scheduling, and more (Wu & Caves, 2002). Recently, with the development of new aerial vehicle concepts, including urban air mobility (UAM) and unmanned aircraft systems (UAS), there has been growing interest in ATM research topics such as conflict resolution using reinforcement learning (Wang et al., 2022), 4D-trajectory optimization (Tian et al., 2020), and even unmanned traffic management (UTM). Eurocontrol U-space (Barrado et al., 2020) and the FAA/NASA UTM project (Kopardekar et al., 2016) are examples of existing industry efforts in this area. ...