Context in source publication

Context 1
... proposed HRL framework focuses on automated overtaking. As shown in Fig. 2, a typical overtaking maneuver consists of three phases: (1) the host vehicle (HV) moves to the passing lane; (2) the HV drives in parallel with the overtaken vehicle (OV) and performs the pass; (3) the HV moves back to the original lane. Generally, overtaking can be divided into continuous overtaking and ...

Citations

... The action space is defined as follows: to enhance the integration of decision-making and planning, and to avoid the jitter caused by direct controller tracking [13], the RL agent outputs selectable continuous trajectories, which follow the lateral and longitudinal rules below. When the agent chooses to change lanes, the estimated lane-change time is divided into long, normal, and short categories to enhance flexibility [14]. Based on common driving practice, the normal expected lane-change time is set within 3 seconds, short within 2.7 seconds, and long within 3.3 seconds [15]. ...
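The trajectory rules themselves are elided above. As a rough illustration of how the discrete duration categories could parameterize a lane change, below is a minimal Python sketch assuming a quintic lateral profile, which is a common choice in trajectory planning rather than the cited paper's actual rule, and an assumed 3.5 m lane width:

```python
import numpy as np

# Hypothetical sketch: the duration categories quoted above select the horizon T
# of a quintic lateral profile. The polynomial form and lane width are assumptions.
LANE_CHANGE_DURATIONS = {"short": 2.7, "normal": 3.0, "long": 3.3}  # seconds

def lateral_profile(t: np.ndarray, T: float, lane_width: float = 3.5) -> np.ndarray:
    """Quintic profile y(t): zero lateral velocity and acceleration at both ends."""
    s = np.clip(t / T, 0.0, 1.0)
    return lane_width * (10 * s**3 - 15 * s**4 + 6 * s**5)

T = LANE_CHANGE_DURATIONS["normal"]
t = np.linspace(0.0, T, 50)
y = lateral_profile(t, T)  # lateral offset from the original lane center
```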
... Overtaking is an important way for AVs to improve driving efficiency, especially in mixed-traffic environments that contain AVs and human-driven vehicles (HDVs). Several works [24][25][26] use reinforcement learning to handle single-vehicle overtaking problems. ...
Preprint
Full-text available
With recent advancements in Vehicle-to-Vehicle communication technology, autonomous vehicles are able to connect and collaborate in platoons, minimizing accident risk, cost, and energy consumption. The significant benefits of vehicle platooning have gained increasing attention in the automation and artificial intelligence areas. However, few studies have focused on platooning with overtaking. To address this problem, a NoisyNet multi-agent deep Q-learning algorithm is developed in this paper, in which NoisyNet is employed to improve exploration of the environment. By considering overtaking, speed, collision, time headway, and following vehicles, a domain-tailored reward function is proposed to accomplish safe platoon overtaking at high speed. Finally, simulation results show that the proposed method achieves successful overtaking in various traffic-density situations.
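The abstract names NoisyNet as the exploration mechanism. As a minimal sketch of that idea, assuming PyTorch and the factorized-Gaussian formulation of Fortunato et al. (2018), a noisy layer that replaces epsilon-greedy exploration could look as follows; the initialization constants and noise scheme may differ from the cited implementation:

```python
import math
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Factorized-Gaussian noisy layer: exploration comes from learned
    per-weight noise scales instead of an epsilon-greedy schedule."""
    def __init__(self, in_features: int, out_features: int, sigma0: float = 0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.w_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.w_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.b_mu = nn.Parameter(torch.empty(out_features))
        self.b_sigma = nn.Parameter(torch.empty(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.w_mu, -bound, bound)
        nn.init.uniform_(self.b_mu, -bound, bound)
        nn.init.constant_(self.w_sigma, sigma0 / math.sqrt(in_features))
        nn.init.constant_(self.b_sigma, sigma0 / math.sqrt(in_features))

    @staticmethod
    def _f(x: torch.Tensor) -> torch.Tensor:
        return x.sign() * x.abs().sqrt()  # noise-shaping function from the paper

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        eps_in = self._f(torch.randn(self.in_features, device=x.device))
        eps_out = self._f(torch.randn(self.out_features, device=x.device))
        weight = self.w_mu + self.w_sigma * torch.outer(eps_out, eps_in)
        bias = self.b_mu + self.b_sigma * eps_out
        return nn.functional.linear(x, weight, bias)
```

Replacing the final layers of each agent's Q-network with such layers lets the network learn how much to explore on a per-weight basis.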
... Various methods for constructing the node feature matrix are summarized in TABLE I, and methods for constructing the adjacency matrix are given in TABLE II. 2) Driving Behaviors: According to the output level, driving behaviors can be separated into two categories: high-level behaviors and low-level control commands [32]. High-level behaviors mostly include merging, overtaking, and lane keeping, whereas low-level control commands include velocity and acceleration in various vehicle control directions. ...
Preprint
Full-text available
Proper functioning of connected and automated vehicles (CAVs) is crucial for the safety and efficiency of future intelligent transport systems. Meanwhile, transitioning to fully autonomous driving requires a long period of mixed autonomy traffic, including both CAVs and human-driven vehicles. Thus, collaborative decision-making for CAVs is essential to generate appropriate driving behaviors and enhance the safety and efficiency of mixed autonomy traffic. In recent years, deep reinforcement learning (DRL) has been widely used to solve decision-making problems. However, existing DRL-based methods have mainly focused on the decision-making of a single CAV; used in mixed autonomy traffic, they cannot accurately represent the mutual effects of vehicles or model dynamic traffic environments. To address these shortcomings, this article proposes a graph reinforcement learning (GRL) approach for multi-agent decision-making of CAVs in mixed autonomy traffic. First, a generic and modular GRL framework is designed. Then, a systematic review of DRL and GRL methods is presented, focusing on the problems addressed in recent research. Moreover, a comparative study of different GRL methods is conducted based on the designed framework to verify their effectiveness. Results show that the GRL methods can effectively optimize the performance of multi-agent decision-making for CAVs in mixed autonomy traffic compared to the DRL methods. Finally, challenges and future research directions are summarized. This study can provide a valuable research reference for solving the multi-agent decision-making problems of CAVs in mixed autonomy traffic and can promote the implementation of GRL-based methods into intelligent transportation systems. The source code of our work can be found at https://github.com/Jacklinkk/Graph_CAVs.
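To make the graph-to-Q-values idea concrete, here is a minimal sketch assuming a dense adjacency matrix and a single GCN layer with symmetric normalization (Kipf and Welling); the layer sizes, feature dimensions, and action set are placeholders, not the framework's actual configuration:

```python
import torch
import torch.nn as nn

class GraphQNetwork(nn.Module):
    """Minimal GRL-style Q-network: one GCN layer over vehicle nodes,
    then a per-node head producing one row of Q-values per vehicle."""
    def __init__(self, feat_dim: int, hidden_dim: int, n_actions: int):
        super().__init__()
        self.gcn = nn.Linear(feat_dim, hidden_dim)   # shared node transform
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, X: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # Symmetric normalization: A_hat = D^-1/2 (A + I) D^-1/2.
        A_hat = A + torch.eye(A.size(0))
        d = A_hat.sum(dim=1)
        D_inv_sqrt = torch.diag(d.pow(-0.5))
        H = torch.relu(D_inv_sqrt @ A_hat @ D_inv_sqrt @ self.gcn(X))
        return self.q_head(H)  # shape [n_vehicles, n_actions]

# Toy usage: 4 vehicles, 6 node features each, 5 high-level behaviors.
X = torch.randn(4, 6)
A = (torch.rand(4, 4) > 0.5).float()
A = ((A + A.T) > 0).float()  # symmetric adjacency
q_values = GraphQNetwork(6, 32, 5)(X, A)
```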
... As a result, with Double DQN as the learning method, continuous states can be used directly to learn a more precise policy without discretization; this was insufficient in that study, however, since a simple simulation was used instead of experiments. A novel hierarchical reinforcement learning method based on the semi-Markov decision process and motion primitives for overtaking is proposed in [32]. In this method, high-level decision-making is combined with low-level control through motion primitives. ...
Preprint
Full-text available
Autonomous vehicles are a growing technology that aims to enhance safety, accessibility, efficiency, and convenience through autonomous maneuvers ranging from lane change to overtaking. Overtaking is one of the most challenging maneuvers for autonomous vehicles, and current techniques for autonomous overtaking are limited to simple situations. This paper studies how to increase safety in autonomous overtaking by allowing the maneuver to be aborted. We propose a decision-making process based on a deep Q-Network to determine if and when the overtaking maneuver needs to be aborted. The proposed algorithm is empirically evaluated in simulation with varying traffic situations, indicating that the proposed method improves safety during overtaking maneuvers. Furthermore, the approach is demonstrated in real-world experiments using the autonomous shuttle iseAuto.
... HRL introduces hierarchical structures and temporal abstraction into learning, which makes it possible to learn policies separately for distinct subtasks [19]. Moreover, some researchers combine the idea of motion primitives (MPs) with HRL [20]. MP-based methods decompose a complex problem into multiple simpler subtasks and arrange corresponding MPs for each subtask. ...
Preprint
Full-text available
In urban environments, complex and uncertain intersection scenarios are challenging for autonomous driving. To ensure safety, it is crucial to develop an adaptive decision-making system that can handle interaction with other vehicles. Manually designed model-based methods are reliable in common scenarios but not in uncertain environments, so learning-based methods, especially reinforcement learning (RL) methods, have been proposed. However, current RL methods need retraining when scenarios change: they cannot reuse accumulated knowledge and forget what they have learned when new scenarios are given. To solve this problem, we propose a hierarchical framework that can autonomously accumulate and reuse knowledge. The proposed method combines the idea of motion primitives (MPs) with hierarchical reinforcement learning (HRL), decomposing complex problems into multiple basic subtasks to reduce difficulty. The proposed method and other baseline methods are tested in a challenging intersection scenario based on the CARLA simulator. The intersection scenario contains three different subtasks that reflect the complexity and uncertainty of real traffic flow. After offline learning and testing, the proposed method is shown to have the best performance among all methods.
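As a toy illustration of that decomposition (the subtask names and primitive library below are invented, not the paper's), a high-level policy can pick a subtask whose motion primitives then run to termination, in the SMDP option style:

```python
import random

# Hypothetical subtask -> motion-primitive mapping for an intersection.
MOTION_PRIMITIVES = {
    "approach": ["cruise", "decelerate"],
    "negotiate": ["yield", "creep_forward"],
    "exit": ["accelerate", "keep_lane"],
}

def high_level_policy(state: dict) -> str:
    """Stand-in for the learned high-level policy: map state to a subtask."""
    return random.choice(list(MOTION_PRIMITIVES))

def execute_subtask(subtask: str, state: dict) -> dict:
    """Run the subtask's primitives until termination (one SMDP option)."""
    for mp in MOTION_PRIMITIVES[subtask]:
        state = {**state, "last_mp": mp}  # placeholder for low-level control
    return state

state = {"scenario": "unsignalized_intersection"}
state = execute_subtask(high_level_policy(state), state)
```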
... As the "brains" of connected autonomous vehicles (CAVs) [5], the decision-making module formulates the most reasonable control strategy according to the state feature matrix transmitted by the sensing module, the vehicle state, and information transmitted from the cloud [6]. It then sends the determined control strategy, including high-level behavior and low-level control requirements, to the motion-control module [7,8]. Making reasonable decisions based on the other modules is crucial for completing autonomous driving tasks safely and efficiently [9]. ...
Article
Full-text available
As one of the main elements of reinforcement learning, the design of the reward function is often not given enough attention when reinforcement learning is used in concrete applications, which leads to unsatisfactory performance. In this study, a reward function matrix is proposed for training various decision-making modes, with emphasis on decision-making styles and, further, on incentives and punishments. Additionally, we model the traffic scene as a graph to better represent the interaction between vehicles, and adopt a graph convolutional network (GCN) to extract the features of the graph structure and help the connected autonomous vehicles make decisions directly. Furthermore, we combine GCN with deep Q-learning and multi-step double deep Q-learning to train four decision-making modes, named the graph convolutional deep Q-network (GQN) and the multi-step double graph convolutional deep Q-network (MDGQN). In the simulation, the superiority of the reward function matrix is demonstrated by comparison with a baseline, and evaluation metrics are proposed to verify the performance differences among the decision-making modes. Results show that the trained decision-making modes can satisfy various driving requirements, including task completion rate, safety requirements, comfort level, and completion efficiency, by adjusting the weight values in the reward function matrix. Finally, the decision-making modes trained by MDGQN outperformed those trained by GQN in an uncertain highway exit scene.
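The abstract does not spell out the matrix itself. One plausible reading is a weight matrix with a row per decision-making mode and a column per reward component; the component set and weight values below are invented purely for illustration:

```python
import numpy as np

# Hypothetical reward-function matrix: rows are decision-making modes,
# columns are reward components. Values are illustrative, not the paper's.
COMPONENTS = ["task_completion", "safety", "comfort", "efficiency"]
MODES = ["aggressive", "conservative", "balanced"]

W = np.array([
    [1.0, 0.5, 0.2, 1.0],   # aggressive: emphasizes progress and speed
    [0.8, 1.5, 0.8, 0.4],   # conservative: punishes risk more heavily
    [1.0, 1.0, 0.5, 0.7],   # balanced
])

def reward(mode: str, signals: np.ndarray) -> float:
    """Scalar step reward as a weighted sum of per-component signals."""
    return float(W[MODES.index(mode)] @ signals)

# Example: a slightly unsafe but efficient step under the conservative mode.
r = reward("conservative", np.array([0.0, -1.0, -0.1, 0.3]))
```

Adjusting a row of W then retunes that mode's trade-off between incentives and punishments without touching the learning algorithm.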
... This paper proposes an MDP-based module to ensure that a proper overtaking decision can be made when OVs have different social preferences. In Ref. [19], an SMDP-based decision-making module and motion primitives (MPs) were proposed, which solve the motion-planning and control problems of a complete classic overtaking framework. Hence, by combining the two decision-making modules, a complete hierarchical-reinforcement-learning-based (HRL-based) AOS considering social preferences can be developed. ...
... (2) A complete HRL-based AOS combining the two modules is proposed. In this AOS, the high-level module is the MDP-based module proposed in this paper, and the low-level module is the semi-Markov decision process based (SMDP-based) decision module proposed in Ref. [19]. The two modules operate in their corresponding phases during overtaking, thus completing a safe overtaking task. ...
... In Ref. [19], an SMDP-based decision-making module was developed for selecting MPs to achieve safe and efficient motion planning and control. To clearly illustrate the procedure of the AOS later on, this section briefly introduces this module and the MPs. ...
Article
Full-text available
As intelligent vehicles usually have a complex overtaking process, a safe and efficient automated overtaking system (AOS) is vital to avoid accidents caused by driver error. Existing AOSs rarely consider the longitudinal reactions of the overtaken vehicle (OV) during overtaking. This paper proposes a novel AOS based on hierarchical reinforcement learning, in which the longitudinal reaction is given by a data-driven social-preference estimation. This AOS incorporates two modules that function in different overtaking phases. The first module, based on a semi-Markov decision process (SMDP) and motion primitives (MPs), is built for motion planning and control. The second module, based on a Markov decision process (MDP), is designed to enable vehicles to make proper decisions according to the social preference of the OV. The proposed AOS and its modules are verified experimentally on realistic overtaking data. The test results show that the proposed AOS can realize safe and effective overtaking in scenes built from realistic data, and can flexibly adjust lateral driving behavior and lane-changing position when OVs have different social preferences.
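To make the two-module structure concrete, here is an illustrative Python sketch of how the modules might be dispatched across the three overtaking phases of Fig. 2; the phase names, preference labels, and phase-to-module mapping are assumptions, not details from the paper:

```python
# Hypothetical dispatch of the two AOS modules over the overtaking phases.
PHASES = ["move_to_passing_lane", "pass_overtaken_vehicle", "return_to_lane"]

def mdp_module(state: dict) -> str:
    """High-level MDP module: pick a behavior from the OV's estimated
    social preference (placeholder rule)."""
    return "keep_wide_gap" if state["ov_preference"] == "aggressive" else "pass_close"

def smdp_module(state: dict, behavior: str) -> dict:
    """Low-level SMDP module: select a motion primitive realizing the behavior."""
    return {"primitive": f"{behavior}:{state['phase']}"}

state = {"phase": PHASES[0], "ov_preference": "aggressive"}
behavior = mdp_module(state)
for phase in PHASES:
    state["phase"] = phase
    command = smdp_module(state, behavior)  # sent to the vehicle controller
```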
... planning system. In general, the inputs of the decision-making system are environmental cues and the status of the ego vehicle, while the outputs are a series of strategies, including high-level behaviors and low-level control commands, that are fed into the motion planning system [12]. ...
Preprint
Full-text available
Autonomous vehicles have great potential in both civil and military applications and have become a research focus with the rapid development of science and the economy. This article presents a brief review of learning-based decision-making technology for autonomous vehicles, since decision-making is significant for their safe and efficient operation. First, the basic outline of decision-making technology is provided. Second, related work on learning-based decision-making methods for autonomous vehicles is reviewed and compared with classical decision-making methods. In addition, applications of decision-making methods in existing autonomous vehicles are summarized. Finally, promising topics for future research on decision-making technology for autonomous vehicles are outlined.
... These problems are typically addressed separately. Exceptions can be found in some recent works that propose end-to-end solutions using neural-network-based controllers trained with reinforcement learning [3], [4], [5]. However, the development of end-to-end reinforcement learning methods is still in its infancy, and the methods lack stability, robustness, or optimality. ...
Preprint
Full-text available
Overtaking is one of the most challenging tasks in driving, and current solutions to autonomous overtaking are limited to simple and static scenarios. In this paper, we present a method for behaviour and trajectory planning for safe autonomous overtaking. The proposed method optimizes the trajectory by simultaneously enforcing safety and minimizing intrusion onto the adjacent lane. Furthermore, the method allows the overtaking to be aborted, enabling the autonomous vehicle to merge back into its lane if safety is compromised, e.g., because of oncoming traffic appearing during the maneuver. A finite state machine is used to select an appropriate maneuver at each time, and a combination of safe and reachable sets is used to iteratively generate intermediate reference targets based on the current maneuver. A nonlinear model predictive controller then plans dynamically feasible and collision-free trajectories to these intermediate reference targets. Simulation experiments demonstrate that the combination of intermediate reference generation and model predictive control is able to handle multiple behaviors, including following a lead vehicle, overtaking, and aborting the overtake, within a single framework.
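The abstract names a finite state machine that selects among following, overtaking, and aborting. Below is a minimal transition-table sketch of that idea, with states and guard events as illustrative placeholders rather than the paper's actual conditions:

```python
# Illustrative maneuver-selection FSM; guard events are placeholders.
TRANSITIONS = {
    ("follow", "gap_safe"): "overtake",
    ("overtake", "oncoming_traffic"): "abort",
    ("overtake", "pass_complete"): "follow",
    ("abort", "merged_back"): "follow",
}

def step(state: str, event: str) -> str:
    """Return the next maneuver; stay in the current one if no rule fires."""
    return TRANSITIONS.get((state, event), state)

state = "follow"
for event in ["gap_safe", "oncoming_traffic", "merged_back"]:
    state = step(state, event)  # follow -> overtake -> abort -> follow
```

In the paper's pipeline, the selected maneuver would then drive the intermediate reference generation and the nonlinear MPC tracking described above.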