Probabilistic multi-modal expected trajectory prediction based on LSTM for autonomous driving

March 2024

DOI:10.1177/09544070231167906

Authors:

Mingxi Bao

Jilin University

Fei Gao

Jilin University

Original Article

Proc IMechE Part D:

J Automobile Engineering

1–12

© IMechE 2023

Article reuse guidelines:

sagepub.com/journals-permissions

DOI: 10.1177/09544070231167906

journals.sagepub.com/home/pid

Probabilistic multi-modal expected

trajectory prediction based on LSTM

for autonomous driving

Zhenhai Gao, Mingxi Bao , Fei Gao and Minghong Tang

Abstract

Autonomous vehicles (AVs) need to adequately predict the trajectory space of surrounding vehicles (SVs) in order to make reasonable decisions and improve driving safety. In this paper, we build a driving behavior intention recognition module and a traffic vehicle expected trajectory prediction module using deep learning. The driving behavior intention recognition module identifies the probabilities of lane keeping, left lane changing, right lane changing, left acceleration lane changing, and right acceleration lane changing of the predicted vehicle (PV). The expected trajectory prediction module adopts an encoder-decoder architecture, in which the encoder encodes the historical environment information of the surrounding agents as a context vector, and the decoder and a mixture density network (MDN) combine the context vector and the identified driving behavior intention to predict the probability distribution of future trajectories. The model thereby produces the multiple behaviors and trajectories that may occur in the next 6 s for the PV. The proposed model is trained, validated and tested with the HighD dataset. The experimental results show that the intention recognition module of the proposed probabilistic multi-modal expected trajectory prediction model achieves high accuracy with full consideration of interactive information. At the same time, the multi-modal probability distribution generated by the expected trajectory prediction module is more consistent with the real trajectories, which significantly improves trajectory prediction accuracy compared with other approaches and shows clear advantages in long-horizon trajectory prediction.

Keywords

Trajectory prediction, behavioral intent recognition, LSTM, interactive behavior

Date received: 15 December 2022; accepted: 20 March 2023

Introduction

To navigate complex traffic scenarios safely and effectively, a vehicle needs to predict the intentions and future trajectories of surrounding agents. Good trajectory prediction capability not only allows decisions to be made in advance but also improves the safety and efficiency of agents.[1–4] Many researchers have studied the future trajectories of AVs in recent years. Trajectory prediction approaches can be roughly divided into models based on physical constraints[5–9] and data-driven models.[10,11] Models based on physical constraints mainly consider the vehicle’s motion state, road environment factors, and the vehicle’s characteristics to predict the future motion trend of the agent using kinematic models. However, such models rely too heavily on the certainty of the current vehicle state and the completeness of the model input. State estimation of the host vehicle is still a significant challenge for autonomous driving (AD) due to dynamic model uncertainties, sensor noise, and bias.[5–8] Models based on physical constraints are also unable to capture the high nonlinearity of the vehicle trajectory. As a result, they cannot accurately predict over long time horizons. To address the low accuracy of long-horizon prediction in dynamic scenes, deep learning has been widely applied to trajectory prediction. Kim et al. used an LSTM to predict the position of vehicles in the next 2 s. Khakzar et al. built a dual learning model (DLM) based on LSTM, but increasing the dimension of the input feature space increases the difficulty of training the model, which makes it hard to meet the real-time requirements of intelligent agents. Xie et al. constructed a data-driven lane change model based on LSTM only, without considering the influence of driving behaviors such as lane keeping. Lin et al. analyzed the influence of historical trajectories and adjacent vehicles on the PV using a spatial-temporal attention LSTM, which lacked interpretation of driving intentions. Xiao et al. used a behavioral intention module and a trajectory prediction module in a highway scene to predict future single-mode trajectories of vehicles, which can effectively identify the vehicle’s future behavioral intention. However, the model’s output deviated considerably from the real trajectories, so the vehicle’s trajectories were further fitted through optimization.

State Key Laboratory of Automotive Simulation and Control, School of Vehicle Engineering, Jilin University, Changchun, China

Corresponding author:

Fei Gao, State Key Laboratory of Automotive Simulation and Control, School of Vehicle Engineering, Jilin University, 5988 Renmin Road, Changchun 130022, China.

Email: gaofei123284123@jlu.edu.cn

However, the trajectory prediction models described above predict a single-mode future trajectory from historical time-domain information, which neither comprehensively represents the future prediction space of the PV nor analyzes the influence of driving behavior intentions on the model. Therefore, in this paper, we investigate multi-modal trajectory prediction in terms of the diversity of the future prediction space and the influence of driving behavior intentions on the model. Different self-driving cars behave differently in the same scenario; that is, there are multiple possible future outcomes due to the inherent uncertainty in predicting the future. For example, a blue vehicle may continue to go straight or turn left based on the current environment, forming different patterns in the trajectory space, as shown in Figure 1. This uncertainty gives motion forecasting its multi-modal nature and makes trajectory prediction a challenging problem.

To model this uncertainty, many scholars have learned latent variables to represent the multi-modal properties of the trajectory, for example with VAEs[18,19] and GANs. Tang and Salakhutdinov constructed a model architecture to capture multi-modal attributes by introducing latent variables and parallel neural networks. A large amount of work has also focused on raster images to process interactions for environmental modeling, applying convolutional neural networks (CNN)[17,21–24] and recurrent neural networks (RNN) to extract scenario information. Deo et al.[25–27] proposed a convolutional LSTM model based on social pooling. It predicted the distribution of future traffic vehicle trajectories, but ignored the effect of interactions between agents. Cui et al. encoded each participant’s surroundings as a raster image, which was used as the input to a deep CNN. However, the image-based approach raises two problems: (1) it leads to sparse convolutions and wastes computational resources; (2) it is difficult to interpret.

In response to the above problems, and in order to adequately represent the vehicle’s behavior prediction space, reduce model complexity, address the inherent uncertainty of prediction, and lessen the safety issues of motion planning, we propose a Probabilistic Multi-modal Expected Trajectory Prediction (PMETP) model. The contributions of this paper are two-fold.

1. The motion state of the PV and the interaction information of the surrounding environment are deeply extracted as the model input. Meanwhile, the interaction information between the SVs of the target agent is considered.

2. A framework is proposed that classifies the specific behavior intention with a neural network and predicts the probability distribution of the future trajectories of the PV with the MDN network.

Problem formulation

In this paper, expected trajectory prediction is formulated as predicting the probability distribution of a vehicle’s future position at each time step from the historical characteristic information of the PV and its surrounding traffic vehicles. PMETP aims to generate multiple possible and safe trajectories for traffic agents in complex and highly dynamic scenarios to adequately represent the future prediction space. This involves two main tasks: (1) representing the multi-modal nature of the prediction results, since different targets may have different future trajectories for the same historical trajectories; and (2) modeling the interactions between targets, since the behavior of a target is influenced not only by its own intentions but also by the other targets around it.

Frame of reference

The traffic scenario uses a fixed coordinate system whose origin is fixed to the predicted vehicle at time t, as shown in Figure 2. The x-axis points in the driving direction of the vehicle, and the y-axis is perpendicular to the driving direction. PMETP does not rely on high-precision maps; it requires only lane parameters and vehicle status information to complete the expected trajectory prediction.
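As a minimal sketch of the frame of reference described above, the following Python snippet re-expresses positions relative to the predicted vehicle at the current time step. It assumes that the driving direction can be given as a heading angle in the source frame; the function name `to_pv_frame` and the `heading` parameter are illustrative and do not come from the paper.

```python
import math


def to_pv_frame(points, origin, heading=0.0):
    """Express (x, y) points in a frame centred on the predicted vehicle.

    origin  -- (x, y) of the predicted vehicle at time t in the source frame
    heading -- driving direction in radians in the source frame (an
               assumption of this sketch; 0.0 on a straight road aligned
               with the source x-axis)
    """
    ox, oy = origin
    c, s = math.cos(heading), math.sin(heading)
    out = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        # Rotate by -heading so the driving direction maps onto +x.
        out.append((c * dx + s * dy, -s * dx + c * dy))
    return out
```

With `heading=0.0` the transform reduces to a pure translation, which is likely sufficient for straight highway segments.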

Environment characteristic information

In a complex dynamic traffic environment, predicting the trajectory of AVs should consider not only the motion state of the PV but also the environmental information of the target vehicle, that is, the characteristic information of the surrounding traffic vehicles and of the interaction between the predicted vehicle and the surrounding traffic vehicles. So that the motion prediction model can understand the interactive behavior between vehicles, the input information includes the historical and environmental characteristics of the predicted agents, as shown in formula (1).

Figure 1. Multiple possible future trajectories.

$$M^t = \left(P^t_{ego}, E^t\right), \quad t \in (0, T) \tag{1}$$

where $M^t$ is the input of the motion prediction model at time $t$, and $P^t_{ego} = (x^t, y^t, v^t_x, v^t_y, a^t_x, a^t_y, dhw, thw, ttc)$ denotes the characteristic information of the predicted vehicle at time $t$. $T$ is the length of the vehicle’s historical track. $x^t, y^t$ are the longitudinal and lateral coordinates of the target vehicle, respectively. $v^t_x, v^t_y$ are the longitudinal and lateral speeds of the agent, respectively. $a^t_x, a^t_y$ are the longitudinal and lateral accelerations of the predicted vehicle at time $t$, respectively. $dhw$, $thw$, and $ttc$ are the distance headway, time headway, and time to collision between the predicted vehicle and the vehicle ahead, respectively. $E^t$ is the environmental information of the target agent at time $t$.

The environmental information is represented by the eight directions of the predicted vehicle, as shown in Figure 3. It is characterized as

$$E^t = \left(S^t_{LF}, S^t_{LA}, S^t_{LB}, S^t_{MF}, S^t_{MB}, S^t_{RF}, S^t_{RA}, S^t_{RB}, I^t\right), \qquad S^t_p = \left\{x^t_p, y^t_p, v^t_{px}, v^t_{py}, a^t_{px}, a^t_{py}\right\}$$

where $p \in \{LF, LA, LB, MF, MB, RF, RA, RB\}$ refers to the location number of the vehicles around the predicted agent. The interaction information is $I^t = \{\Delta S^t_i, \Delta S^t_I\}$. $\Delta S^t_i = (\Delta x^t_i, \Delta y^t_i, \Delta acc^t_i, \Delta v^t_i)$, with $i \in$ (LF_TV, LA_TV, LB_TV, MF_TV, MB_TV, RF_TV, RA_TV, RB_TV), indicates the interaction information between the predicted vehicle and the surrounding traffic vehicles. $\Delta S^t_I = (\Delta l^t_{LF\,LA}, \Delta l^t_{LA\,LB}, \Delta l^t_{LF\,LB}, \Delta l^t_{RF\,RA}, \Delta l^t_{RA\,RB}, \Delta l^t_{RF\,RB})$ represents the interaction information between the surrounding traffic vehicles, reflecting the absolute distance between the vehicles in the left and right lanes of the predicted agent:

$$\Delta l^t_{mn} = \sqrt{(x^t_m - x^t_n)^2 + (y^t_m - y^t_n)^2}, \qquad m, n \in (LF, LA, LB, RF, RA, RB)$$
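The distance between surrounding vehicles in one adjacent lane can be sketched as follows. This is an illustration only: the dictionary layout, the function name `lane_gaps`, and the choice to skip empty slots are assumptions, since the paper does not state how missing neighbours are handled.

```python
import math
from itertools import combinations


def lane_gaps(positions, lane=("LF", "LA", "LB")):
    """Euclidean distance between each pair of vehicles in one adjacent lane.

    positions -- maps a slot name such as "LF" to an (x, y) position
    lane      -- slot names of that lane; use ("RF", "RA", "RB") for the right
    Slots with no vehicle are skipped (an assumption of this sketch).
    """
    gaps = {}
    for m, n in combinations(lane, 2):
        if m in positions and n in positions:
            (xm, ym), (xn, yn) = positions[m], positions[n]
            gaps[(m, n)] = math.hypot(xm - xn, ym - yn)
    return gaps
```

Calling the function once with the left-lane slots and once with the right-lane slots yields the six pairwise distances listed in the text.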

Probabilistic motion forecasting

The flowchart of the trajectory forecasting model is shown in Figure 4. The data pre-processing is described in Section 4.2, the PMETP in Section 3, and the model training results in Section 4.

The PMETP proposed in this paper predicts the probability distribution $P(O \mid I)$ of future locations from historical environmental information and driving behavior intentions.

$$P(O \mid I) = \sum_i P_{\mu,\sigma,\rho,\theta}(o_i \mid c_i, I)\, P(c_i \mid I) \tag{2}$$

where $O$ represents the model output. $P_{\mu,\sigma,\rho,\theta}(o_i \mid c_i, I)$ is the probability distribution of the model’s trajectory conditioned on each driving intention. $\mu, \sigma, \rho, \theta$ are the parameters of the bivariate Gaussian distribution function at each future time step, which are the mean, variance, probability, and correlation coefficient, respectively.
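To make the weighted sum in equation (2) concrete, the sketch below evaluates the density of one future position as a sum of bivariate Gaussians weighted by intention probabilities. The function names, the argument layout, and the use of standard deviations (rather than variances) as spread parameters are assumptions of this example, not details taken from the paper.

```python
import math


def bivariate_gaussian_pdf(x, y, mu_x, mu_y, sd_x, sd_y, corr):
    """Density of a bivariate Gaussian at (x, y).

    sd_x, sd_y are standard deviations and corr is the correlation
    coefficient with |corr| < 1 (parameter naming is illustrative).
    """
    zx = (x - mu_x) / sd_x
    zy = (y - mu_y) / sd_y
    one_minus_c2 = 1.0 - corr * corr
    norm = 2.0 * math.pi * sd_x * sd_y * math.sqrt(one_minus_c2)
    expo = -(zx * zx - 2.0 * corr * zx * zy + zy * zy) / (2.0 * one_minus_c2)
    return math.exp(expo) / norm


def position_density(x, y, intention_probs, gaussians):
    """Sum over intentions of P(o | c_i, I) * P(c_i | I) for one time step.

    intention_probs -- P(c_i | I) for each intention, summing to 1
    gaussians       -- one (mu_x, mu_y, sd_x, sd_y, corr) tuple per intention
    """
    return sum(p * bivariate_gaussian_pdf(x, y, *g)
               for p, g in zip(intention_probs, gaussians))
```

Because the intention probabilities sum to 1, the result is itself a valid density over positions.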

Driving behavior intention classification

When an AV drives in a dynamic traffic environment, it changes lanes either to obtain a larger driving space or to avoid collision risk. Therefore, in this paper, left and right lane changes are each further classified as acceleration or general lane changes. As a result, we divide the driving behavior intentions into five categories: left lane-changing, lane keeping, right lane-changing, left acceleration lane-changing, and right acceleration lane-changing, as shown in Figure 5. If the average speed in the predicted time domain exceeds the average speed in the historical time domain by more than 7.2 km/h, the maneuver is defined as an accelerated lane change; otherwise, the vehicle’s intention is to change lanes under normal running conditions.

Figure 2. Diagram of the coordinate system of the traffic scene.

Figure 3. Schematic diagram of the target vehicle orientation.
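The five-way labelling rule of this section can be sketched as below. The sketch assumes that the lane-change direction is already known and reads the 7.2 km/h criterion as a gain of the future average speed over the historical average speed; the function name and argument names are hypothetical. The label numbering follows the (0, 1, 2, 3, 4) order given in the data preprocessing section.

```python
from statistics import fmean

SPEED_GAIN_KMH = 7.2  # threshold quoted in the text


def label_intention(direction, hist_speeds_kmh, future_speeds_kmh):
    """Return the intention label 0..4 for one trajectory sample.

    direction -- "keep", "left" or "right" (assumed to be determined
                 beforehand, e.g. from lane-line crossings)
    Labels: 0 lane keeping, 1 left change, 2 right change,
            3 left acceleration change, 4 right acceleration change.
    """
    if direction == "keep":
        return 0
    gain = fmean(future_speeds_kmh) - fmean(hist_speeds_kmh)
    accelerated = gain > SPEED_GAIN_KMH
    if direction == "left":
        return 3 if accelerated else 1
    if direction == "right":
        return 4 if accelerated else 2
    raise ValueError(f"unknown direction: {direction!r}")
```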

PMETP model

Model framework

The PMETP proposed in this paper is composed of a driving behavior intention recognition module and a traffic vehicle expected trajectory prediction module, as shown in Figure 6. In machine learning terms, driving behavior intention recognition is a classification problem and expected trajectory prediction is a regression problem. Based on the historical coded state information of the vehicle, the driving behavior intention recognition module outputs the probabilities of five driving behaviors at the current moment: lane keeping, left and right lane-changing, and left and right acceleration lane-changing. The traffic vehicle expected trajectory prediction module produces a probability distribution of the vehicle’s future trajectories in the time domain based on the historical coded information and the probabilities of the driving behavior intentions.

The behavior intention recognition module, which is built on a long short-term memory (LSTM) network and a multilayer perceptron (MLP), calculates the probability of each driving behavior intention with the Softmax function. Let the historical state feature information $M$ of the overall environment of AVs at the current moment be the input vector of the vehicle motion forecasting model. $C = (c_1, c_2, c_3, c_4, c_5)$ is the intention category vector, where $c_1$ to $c_5$ represent lane keeping, left lane changing, right lane changing, left acceleration lane changing, and right acceleration lane changing, respectively. The driving intention category probability vector is

$$\Omega = (\omega_1, \omega_2, \omega_3, \omega_4, \omega_5) \tag{3}$$

where $\omega_i = P(c_i \mid M)$, $i = 1, 2, 3, 4, 5$, represents the probability of driving intention $c_i$.

The traffic vehicle expected trajectory prediction module is made up of a fully connected layer, an encoder, a decoder, an MLP, and a mixture density network (MDN). First, the fully connected layer extracts the feature information of the historical state of the traffic vehicle as the input of the encoder. Second, the encoder uses LSTM to encode the input feature information into a context vector. To improve the backward and forward correlation of the current state, the encoder combines a bidirectional LSTM with a unidirectional LSTM to enhance the extraction of the environmental feature information in the forward process, so that the encoder pays more attention to the feature information of the forward process. The MLP and MDN take the output vector of the decoder as input, enabling the model to predict the probability distribution of future trajectories based on intent recognition.

Figure 4. Trajectory forecasting flowchart.

Figure 5. Diagram of driving behavior intention.

Figure 6. Architecture of the PMETP.

LSTM is a gated recurrent neural network with a forget gate, an input gate, an output gate and memory cells with the same shape as the hidden state, as shown in Figure 7. The inputs of the LSTM are the input $X_t$ of the current time step, the hidden state $H_{t-1}$ of the previous time step, and the memory cell $C_{t-1}$. The fully connected layer activation functions $\sigma$ and $\tanh$ are:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{4}$$

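The forget gate determines the proportion of the cell state from the previous moment that is forgotten, the input gate determines whether the current input contributes to the cell state, and the output gate controls how much of the memory cell is passed to the hidden state. The input gate $I_t$, the forget gate $F_t$, and the output gate $O_t$ are:

$$\begin{aligned} I_t &= \sigma(X_t W_{xi} + H_{t-1} W_{hi} + b_i) \\ F_t &= \sigma(X_t W_{xf} + H_{t-1} W_{hf} + b_f) \\ O_t &= \sigma(X_t W_{xo} + H_{t-1} W_{ho} + b_o) \end{aligned} \tag{5}$$

where $W_{xi}, W_{xf}, W_{xo} \in \mathbb{R}^{d \times h}$ and $W_{hi}, W_{hf}, W_{ho} \in \mathbb{R}^{h \times h}$ denote the weight parameters, and $b_i, b_f, b_o \in \mathbb{R}^{1 \times h}$ represent the bias parameters.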

Candidate memory cells select new information, and memory cells combine information from the memory cells of the previous time step and the candidate memory cells of the current time step. The flow of information into the memory cell is controlled by both the input gate and the forget gate.

$$\begin{aligned} \tilde{C}_t &= \tanh(X_t W_{xc} + H_{t-1} W_{hc} + b_c) \\ C_t &= F_t \odot C_{t-1} + I_t \odot \tilde{C}_t \end{aligned} \tag{6}$$

where $W_{xc} \in \mathbb{R}^{d \times h}$ and $W_{hc} \in \mathbb{R}^{h \times h}$ indicate the weight parameters, and $b_c \in \mathbb{R}^{1 \times h}$ corresponds to the bias parameters.

The hidden state at the current time step is controlled by the output gate.

$$H_t = O_t \odot \tanh(C_t) \tag{7}$$

Behavior intention recognition module

The traffic vehicle behavior intention recognition module learns the motion patterns of the PV and its surrounding traffic vehicles from their motion states and interaction information, and can accurately identify the vehicle’s driving intention in the current state. The model framework is shown in Figure 6. The overall model is built from a combination of MLP and LSTM. The LSTM unit reads the input feature information $M$ of the current time step and the hidden state $H_{t-1}$ of the previous time step and updates the hidden state of the current time step, that is, $H_t = f(H_{t-1}, M)$. Finally, the MLP and Softmax produce the probability matrix C of the five driving intention categories: lane keeping, left lane-changing, right lane-changing, left acceleration lane-changing, and right acceleration lane-changing. Softmax is represented as:

$$\mathrm{softmax}(z_n) = \frac{e^{z_n}}{\sum_{k=1}^{m} e^{z_k}} \tag{8}$$

where $z_n$ is the output value for the n-th driving behavior intention; the resulting softmax values lie in the range [0,1] and sum to 1. $m$ is the number of categories for the classification of driving behavior intention.

The intention recognition module employs multi-class cross entropy as the loss function, with an Adam optimizer and a learning rate of 0.0002. The loss function is:

$$L_c = -\sum_{n=1}^{m} y_n \times \log(p_n) \tag{9}$$

where $L_c$ is the loss of the behavioral intention recognition module, $y_n$ denotes the real value of the n-th sample label, and $p_n$ represents the prediction probability of the n-th observed sample.
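A minimal sketch of the softmax and cross-entropy computations for one sample is given below. The max-subtraction and the small `eps` are standard numerical safeguards added for this example and are not described in the paper.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that are positive and sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs, eps=1e-12):
    """Multi-class cross entropy for one sample with a one-hot label."""
    return -sum(y * math.log(p + eps) for y, p in zip(one_hot, probs))
```

For five equally likely intentions the loss equals log 5, which is a useful sanity check for an untrained classifier.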

Encoder-decoder

The expected trajectory prediction module is composed of a fully connected layer, an encoder, a decoder, an MLP and an MDN. The encoder consists of a deep bidirectional LSTM and a deep unidirectional LSTM, and the input to the decoder contains the output of the encoder and the probability vector X output by the behavioral intention recognition module after feature extraction by the fully connected layer. The state of the agent at the current time step is related not only to the temporal state of the previous historical moment but also to the future state. The feature information of the predicted vehicle at the current moment is therefore obtained simultaneously from the historical and future timing information via the bidirectional LSTM, composing the contextual information that determines the current state characteristics of the agent. The structure of the bidirectional RNN is shown in Figure 8.

Figure 7. LSTM structure.

The fundamental concept of the bidirectional LSTM is that each training sequence is processed by two RNNs, one forward and one backward, both of which are connected to the same output layer. This structure provides the output layer with complete past and future contextual information for each point in the input sequence. Six unique weights are reused at each time step: the input to the forward and backward hidden layers ($\omega_1, \omega_3$), the hidden layers to the hidden layers ($\omega_2, \omega_5$), and the forward and backward hidden layers to the output layer ($\omega_4, \omega_6$). There is no information exchange between the forward and backward hidden layers.

Mixture density network

The MDN was proposed by Christopher Bishop in 1994 to tackle multi-valued mapping problems using Gaussian mixture models and neural networks. To better reflect the diversity of driving behaviors and the uncertainty in predicting future trajectories, the probability distribution of future trajectories is forecast by the MDN to generate multiple possible future behaviors and trajectories of the PV. In this paper, a combination of six Gaussian functions is selected as the kernel function of the MDN. The probability of the trajectory distribution is

$$p(o \mid x) = \sum_{i=1}^{n} \alpha_i(x)\, \theta_i(o \mid x), \qquad \theta_i(o \mid x) = \frac{1}{(2\pi)^{c/2}\, \sigma_i(x)^{c}} \exp\left\{ -\frac{\lVert o - \mu_i(x) \rVert^2}{2 \sigma_i(x)^2} \right\} \tag{10}$$

where $x$ is the input characteristic parameter, $o$ denotes the location of the agent at a given time, $n$ indicates the number of mixture kernel functions, $\alpha_i(x)$ refers to the model weight coefficient, $\sigma_i(x)$ is the variance parameter, and $\mu_i(x)$ corresponds to the center of the i-th kernel function.

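The weighting coefficients of the model must add up to 1 and each must be greater than 0, which the softmax form of $\alpha_i$ guarantees, while the exponential operation ensures that $\sigma_i$ is positive.

$$\sum_{i=1}^{n} \alpha_i(x) = 1, \qquad \alpha_i(x) = \frac{\exp(z_i^{\alpha})}{\sum_{j=1}^{n} \exp(z_j^{\alpha})}, \qquad \sigma_i = \exp(z_i^{\sigma}) \tag{11}$$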
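The constraints of equation (11) and the mixture density of equation (10) can be illustrated with the following sketch. It assumes isotropic Gaussian kernels as written in equation (10) and treats `z_alpha` and `z_sigma` as the raw network outputs for the weights and spreads; both names and the function signatures are hypothetical.

```python
import math


def mdn_params(z_alpha, z_sigma):
    """Map raw network outputs to valid mixture weights and spreads."""
    m = max(z_alpha)  # subtracting the max keeps exp() in range
    exps = [math.exp(z - m) for z in z_alpha]
    total = sum(exps)
    alphas = [e / total for e in exps]       # positive and sum to 1
    sigmas = [math.exp(z) for z in z_sigma]  # strictly positive
    return alphas, sigmas


def mixture_density(o, alphas, mus, sigmas):
    """Evaluate p(o | x) for isotropic Gaussian kernels.

    o   -- target vector, e.g. an (x, y) position
    mus -- one centre vector per kernel, same length as o
    """
    c = len(o)  # dimension of the target
    p = 0.0
    for a, mu, s in zip(alphas, mus, sigmas):
        sq_dist = sum((oi - mi) ** 2 for oi, mi in zip(o, mu))
        norm = (2.0 * math.pi) ** (c / 2) * s ** c
        p += a * math.exp(-sq_dist / (2.0 * s * s)) / norm
    return p
```

With six kernels, as selected in the paper, `z_alpha`, `z_sigma` and `mus` each hold six entries per predicted time step.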

The negative log-likelihood is minimized as the optimization objective. The loss function $L$ is

$$L = -\log\left( \sum_k P_{\theta}(G \mid C_k, X_{obs})\, P(C_k \mid X_{obs}) \right) \tag{12}$$

where $X_{obs}$ denotes the historical trajectory sequence of the PV, $C_k$ is the driving behavior forecast by the driving behavior prediction module, and $G$ is the Gaussian distribution of the future trajectories.
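For a single training sample, the loss in equation (12) reduces to the negative logarithm of an intention-weighted likelihood. The sketch below assumes the per-intention likelihoods have already been computed elsewhere; the function name and the `eps` safeguard are additions for this example.

```python
import math


def trajectory_nll(likelihoods, intention_probs, eps=1e-12):
    """Negative log of the intention-weighted trajectory likelihood.

    likelihoods     -- P(G | C_k, X_obs) for each intention k
    intention_probs -- P(C_k | X_obs) for each intention k
    eps             -- guards against log(0); not part of the paper
    """
    mixed = sum(l * p for l, p in zip(likelihoods, intention_probs))
    return -math.log(mixed + eps)
```

Averaging this value over a batch gives the quantity that an optimizer would minimize.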

Experimental evaluation

Dataset

We use the HighD dataset published by the Institute of Automotive Engineering at RWTH Aachen University, Germany, for training, validating and testing the proposed motion forecasting model. The dataset provides an extensive set of test data for AD, including a total of 16.5 h of measurement data, a total distance of 45,000 km, and 5600 complete lane changes. The sampling frequency of the original trajectory data is 25 Hz. To suit the experimental scene and reduce the computational cost, the data are resampled to 8 Hz. The scene diagram of the dataset is shown in Figure 9.
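The paper does not state how the change from 25 Hz to 8 Hz was carried out. Because 25/8 is not an integer, simply keeping every k-th frame does not give exactly 8 Hz, so one plausible approach is linear interpolation onto the new time grid. The following sketch shows that approach for one signal; it is an assumption, not the authors’ procedure.

```python
def resample(values, src_hz=25.0, dst_hz=8.0):
    """Linearly interpolate a uniformly sampled 1-D signal onto a new rate."""
    if len(values) < 2:
        return list(values)
    duration = (len(values) - 1) / src_hz
    n_out = int(duration * dst_hz + 1e-9) + 1
    out = []
    for k in range(n_out):
        pos = k / dst_hz * src_hz            # fractional index in the source
        i = min(int(pos), len(values) - 2)   # left neighbour
        frac = pos - i
        out.append(values[i] * (1.0 - frac) + values[i + 1] * frac)
    return out
```

Each coordinate, speed and acceleration channel would be passed through the function separately.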

Figure 8. Bidirectional circulatory neural network.

Figure 9. The scene diagram of the HighD dataset.

Implementation details

The proposed learning framework was implemented using PyTorch 1.12.1 and Python 3.8, and the model was trained on Nvidia GeForce GTX 1650 Ti GPU cards. In the behavioral intention recognition module, the historical state information $M$ first passes through a fully connected (FC) layer with 128 neurons and Leaky ReLU activation with $\alpha = 0.1$, and the encoded vectors are passed to the deep RNN, which uses a two-layer LSTM network structure with 256 hidden features and a dropout ratio of 0.5. In the expected trajectory prediction module, the historical state information $M$ passes through two fully connected layers with a 256-dimensional state and Tanh activation, and is then passed into the bidirectional LSTM and the unidirectional LSTM, each with 512-dimensional hidden features and a dropout ratio of 0.5. Finally, the MDN and MLP produce the trajectory data of the PV for the following 6 s.

Data preprocessing

The behavioral intent recognition module requires the trajectories of lane keeping, left lane changing, right lane changing, left acceleration lane changing, and right acceleration lane changing to be extracted from the HighD dataset and assigned the corresponding labels (0, 1, 2, 3, 4). Each input sequence spans 3 s, and each prediction sequence spans 6 s. The steps to classify the lane-changing trajectory of the vehicle, illustrated in Figure 10, are:

1. Extract the intersection point of the track and the lane line and record the time.

2. Calculate the yaw angle $\theta$:

$$\theta = \frac{x_{t+1} - x_t}{y_{t+1} - y_t} \tag{13}$$

where $x_t, y_t$ are the longitudinal and lateral coordinates of the vehicle at time $t$, and $x_{t+1}, y_{t+1}$ are the longitudinal and lateral coordinates of the vehicle at time $t+1$.

3. If $\theta < \theta_b$ (the heading angle threshold at the start), the trajectory is defined as lane keeping; otherwise it is defined as lane changing.

4. Determine the start point and end point of the lane change.

Because the driving conditions are unevenly distributed, the extracted sequences contain far more straight-line driving than lane changing. The whole data set was randomly split, with 80% taken as the training set, 10% as the validation set, and 10% as the test set. Finally, all the extracted data must be standardized to facilitate the training of the proposed model.
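The random 80/10/10 split and the standardization step can be sketched as follows. The fixed seed, the function names, and the use of zero-mean, unit-variance scaling are assumptions of this example; the paper only states that the data were split randomly and standardized.

```python
import random


def split_dataset(samples, train=0.8, val=0.1, seed=0):
    """Shuffle samples and split them into train, validation and test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def standardize(column):
    """Scale one feature column to zero mean and unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # avoid dividing by zero for constant columns
    return [(v - mean) / std for v in column]
```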

Performance analysis of behavior intention

recognition module

The accuracy of the behavioral intention recognition module of a traffic vehicle plays a crucial role in the predicted trajectories of the agent. We adopt the negative log-likelihood (NLL) loss as the loss function of this module, and the loss values of the behavioral intention recognition model are displayed in Figure 11. The training loss stabilizes at 0.0584 and the validation loss stabilizes at 0.0796, indicating that the behavioral intention recognition module converges well. The confusion matrix is a common visualization approach for supervised learning, and the confusion matrix for behavioral intention recognition is given in Table 1. We use the precision ratio, recall ratio, F1-score, and accuracy ratio as the evaluation metrics of the classifier. Taking binary classification as an example, the precision ratio $p$, recall ratio $r$, F1-score $F_1$, and accuracy ratio $a$ are:

Figure 10. Diagram of lane changing trajectory.

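$$p = \frac{TP}{TP + FP}, \qquad r = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2}{\frac{1}{p} + \frac{1}{r}}, \qquad a = \frac{TP + TN}{TP + FN + FP + TN} \tag{14}$$

where $TP$, $TN$, $FP$, and $FN$ represent the numbers of true positive cases, true negative cases, false positive cases, and false negative cases, respectively.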
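For the five-class case, the same metrics can be computed per class directly from a confusion matrix. The sketch below uses the counts reported in Table 1, with rows as predicted intentions and columns as real intentions; the function names are illustrative.

```python
LABELS = ["lane keeping", "left change", "right change",
          "left acceleration change", "right acceleration change"]

# Rows: predicted intention, columns: real intention (counts from Table 1).
CONFUSION = [
    [201598, 683, 497, 462, 596],
    [473, 28560, 37, 274, 7],
    [692, 21, 35169, 17, 239],
    [171, 89, 1, 3346, 18],
    [205, 8, 116, 3, 3855],
]


def per_class_metrics(cm, k):
    """Precision, recall and F1-score of class k, treated one-vs-rest."""
    tp = cm[k][k]
    fp = sum(cm[k]) - tp                 # predicted k, actually another class
    fn = sum(row[k] for row in cm) - tp  # actually k, predicted another class
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2.0 / (1.0 / p + 1.0 / r)
    return p, r, f1


def accuracy(cm):
    """Share of all samples that lie on the diagonal."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)
```

Calling `per_class_metrics(CONFUSION, k)` for each index `k` in `range(len(LABELS))` yields one precision, recall and F1 triple per intention class.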

As can be seen from Table 2, all the performance indicators of behavioral intention recognition are high: $p$ is over 90% for every class; $r$ is over 97% for lane keeping, left lane changing and right lane changing, and over 81% for left and right acceleration lane changing. The F1-score reflects the average level of $p$ and $r$; $F_1$ is 97% or more for lane keeping, left lane changing and right lane changing, and 85% or more for left and right acceleration lane changing. The accuracy ratio reflects the overall quality of the model, and it exceeds 98%. Figure 12 shows the accuracy of the behavioral intention recognition model. It should be noted that, because the irregular trajectories of straight-line driving are removed during the data pre-processing stage, the accuracy, recall, and F1-score of straight-line driving are higher, and the indicators of left and right lane changing are close to each other. Due to the small sample size of the extracted left and right acceleration lane changes, each of their performance indexes is lower than those of lane keeping and left and right lane changing, but the indexes of left and right acceleration lane changing are close to each other. Although the intention recognition module produces some misjudgments, it rarely predicts the opposite type, which shows that the behavior intention recognition module has good intention recognition capability and meets the requirements of the vehicle motion forecasting module.

PMETP performance analysis

Figure 13 shows the multi-modal prediction results of the agents in different time domains for both straight-line driving and lane change scenarios. Each plot presents the historical trajectories of the vehicles for the past 3 s and the predicted trajectories for the next 6 s. The shades of color in the plot are proportional to the probability of the predicted behavioral intentions and reveal the complete heat map of the predicted multi-modal distribution.

Figure 11. Behavioral intention recognition model loss.

Table 1. Confusion matrix for behavioral intention identification.

| Predicted intention (rows) / Real intention (columns), in items | Lane keeping | Left change | Right change | Left acceleration lane change | Right acceleration lane change |
|---|---|---|---|---|---|
| Lane keeping | 201598 | 683 | 497 | 462 | 596 |
| Left change | 473 | 28560 | 37 | 274 | 7 |
| Right change | 692 | 21 | 35169 | 17 | 239 |
| Left acceleration lane change | 171 | 89 | 1 | 3346 | 18 |
| Right acceleration lane change | 205 | 8 | 116 | 3 | 3855 |

Table 2. The performance measures for behavioral intention identification.

| Predicted intention | Precision ratio p | Recall ratio r | F1-score F1 | Accuracy ratio a |
|---|---|---|---|---|
| Lane keeping | 0.989 | 0.992 | 0.991 | 0.983 (all classes) |
| Left change | 0.973 | 0.973 | 0.973 | |
| Right change | 0.973 | 0.982 | 0.978 | |
| Left acceleration lane change | 0.923 | 0.816 | 0.866 | |
| Right acceleration lane change | 0.921 | 0.818 | 0.872 | |
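The precision, recall, and accuracy values in Table 2 can be recomputed from the counts in Table 1. The following is a minimal sketch, assuming rows are predicted intentions and columns are real intentions, as the table headers indicate; the function and variable names are illustrative and are not taken from the paper. F1 is computed here as the harmonic mean of p and r, and the recomputed F1 values can differ slightly from Table 2 for some classes.

```python
import numpy as np

# Counts from Table 1: rows = predicted intention, columns = real intention.
LABELS = [
    "lane keeping",
    "left change",
    "right change",
    "left acceleration lane change",
    "right acceleration lane change",
]
CONFUSION = np.array([
    [201598, 683, 497, 462, 596],
    [473, 28560, 37, 274, 7],
    [692, 21, 35169, 17, 239],
    [171, 89, 1, 3346, 18],
    [205, 8, 116, 3, 3855],
])


def per_class_metrics(cm):
    """Return per-class precision, recall, F1 and the overall accuracy."""
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)
    precision = correct / cm.sum(axis=1)  # correct / everything predicted as the class (row sum)
    recall = correct / cm.sum(axis=0)     # correct / everything truly in the class (column sum)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of p and r
    accuracy = correct.sum() / cm.sum()
    return precision, recall, f1, accuracy


precision, recall, f1, accuracy = per_class_metrics(CONFUSION)
for name, p, r, f in zip(LABELS, precision, recall, f1):
    print(f"{name:32s} p={p:.3f} r={r:.3f} F1={f:.3f}")
print(f"overall accuracy a={accuracy:.3f}")
```

Reading the matrix by rows gives precision and by columns gives recall, which is why the acceleration lane changing classes show a clear gap between the two: many of their real samples are predicted as lane keeping, which lowers recall more than precision.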

Proc IMechE Part D: J Automobile Engineering 00(0)

Figure 13(a) shows the effect of the lane keeping scenario on PMETP, with an interval of 0.84 s. In the first example (top-left) and the second example (top-middle), the PV and the vehicles trailing it intend to overtake the vehicle in front; PMETP therefore outputs a predicted trajectory that goes around the vehicle in front (without causing a collision) and then continues straight. The third example (top-right) indicates that the current environment does not leave enough room to complete the overtaking maneuver, and the historical trajectories are approximately parallel to the lane, so there is no obvious overtaking tendency. Therefore, the predicted vehicle will stay in its lane, and PMETP predicts two probable trajectories for the later part of the horizon, along both of which the vehicle continues driving. Throughout the prediction process, based on the historical trajectories, neither the predicted behavioral intention nor the predicted future trajectories involve a lane change.

Figure 13(b) shows the effect of vehicles in adjacent lanes and in the same lane on PMETP in the lane changing scenario, with an interval of 0.84 s. As can be seen in Figure 13(b), the PV is in the congested rightmost lane, and the middle lane is moving faster than the right lane. The first example (top-left) demonstrates that, because the vehicle’s historical trajectories show no significant tendency to change lanes, PMETP assigns the highest probability to remaining in the lane and a lower probability to moving to the middle lane, given the current environment. In the second example (top-middle), the PV travels to a position between the front and rear vehicles and has a clear tendency to change lanes. PMETP predicts that the future trajectories of the PV all move toward the middle lane, and the probabilities of the two predicted trajectories are approximately equal. The third example (top-right) shows the historical trajectory of the red car continuing toward the middle lane. The main trend predicted by PMETP for the current scene is consistent with this, but the model predicts two trajectories with similar probabilities in the future time domain.

Figure 12. The accuracy of the behavioral intention recognition model.

Figure 13. Multi-modal trajectory prediction of the PMETP model: (a) prediction results of PMETP in the lane keeping scenario and (b) prediction results of PMETP in the lane change scenario.

The results on the HighD test dataset show that our proposed PMETP performs well in predicting the multi-modal distribution.

In this paper, the Root Mean Square Error (RMSE) and the Negative Log Likelihood (NLL) between the predicted trajectories and the true trajectories over the 6 s horizon are employed as the evaluation metrics of the PMETP prediction results. For a trajectory prediction model with a multi-modal distribution, the RMSE is calculated from the trajectory with the maximum probability. The advantages and disadvantages of uni-modal and multi-modal distributions are compared using the NLL of the real trajectories under the trajectory distribution generated by PMETP.
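The two metrics can be sketched as follows. This is an illustrative example only, assuming that each of K predicted modes is described by a probability, per-step mean positions, and per-step standard deviations with independent Gaussians in x and y. That parameterization, the function names, and the toy values are assumptions made for the sketch and are not the paper's exact MDN formulation.

```python
import numpy as np


def rmse_max_prob(probs, means, truth):
    """RMSE (m) between the most probable mode and the true trajectory.

    probs: (K,) mode probabilities; means: (K, T, 2); truth: (T, 2).
    """
    best = int(np.argmax(probs))
    sq_dist = np.sum((means[best] - truth) ** 2, axis=-1)  # squared Euclidean error per step
    return float(np.sqrt(np.mean(sq_dist)))


def mixture_nll(probs, means, stds, truth):
    """NLL of the true trajectory under a K-mode mixture of diagonal Gaussians."""
    # Per-coordinate Gaussian log-density, broadcast over modes -> (K, T, 2).
    log_norm = (-0.5 * ((truth - means) / stds) ** 2
                - np.log(stds) - 0.5 * np.log(2.0 * np.pi))
    # Sum over time and coordinates, then add the log mode probability -> (K,).
    log_mode = log_norm.sum(axis=(1, 2)) + np.log(probs)
    # Log-sum-exp over modes for numerical stability.
    peak = log_mode.max()
    return float(-(peak + np.log(np.sum(np.exp(log_mode - peak)))))


# Toy example: T = 3 steps, K = 2 modes (lane keeping and lane change).
truth = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
keep = truth + np.array([0.0, 0.3])    # lane keeping mode, 0.3 m lateral offset
change = truth + np.array([0.0, 3.5])  # lane change mode, assumed 3.5 m lateral shift
means = np.stack([keep, change])
stds = np.ones_like(means)
probs = np.array([0.7, 0.3])

print("RMSE of the most probable mode:", rmse_max_prob(probs, means, truth))
print("NLL of the mixture:", mixture_nll(probs, means, stds, truth))
```

In this toy case a mixture that contains the correct mode yields a much lower NLL than a single mode placed on the wrong maneuver, which is the kind of difference the NLL comparison between uni-modal and multi-modal distributions is meant to expose.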

We compare the RMSE and NLL of the following models over the 6 s horizon to test the validity of the models.

Constant Velocity (CV): A constant-velocity Kalman filter is used as the basic model.

LSTM with convolutional social pooling and maneuvers (CS-LSTM): This method was proposed by Deo and Trivedi. It includes a maneuver-based decoder that generates a multi-modal predictive distribution. Each vehicle is modeled using an LSTM, and the hidden states are pooled in each iteration using a social pooling layer. The model was trained using Adam with a learning rate of 0.001. The encoder LSTM has 64-dimensional states, while the decoder has 128-dimensional states.

XY-LSTM: Based on the architecture of PMETP, we use the location feature information of the predicted vehicle and the surrounding vehicles. The parameter values are derived from the model proposed in this paper.

V-LSTM: We add the speed feature information of the vehicles to XY-LSTM. The parameter values are derived from the model proposed in this paper.

E1-LSTM: Additional information on the interaction between the predicted vehicle and the surrounding vehicles is added to V-LSTM. The parameter values are derived from the model proposed in this paper.

E2-LSTM: We augment E1-LSTM with information about the interaction between surrounding vehicles. The parameter values are derived from the model proposed in this paper.

PMETP(M): The complete model described in this paper, which includes behavioral intention recognition and the multi-modal prediction distribution generated by the encoders and decoders. The parameter values are provided in the Implementation details section.

Table 3 lists the RMSE and NLL results for each model on the HighD dataset. The RMSE of the models that consider the interaction between the predicted vehicle and the surrounding vehicles (CS-LSTM, E1-LSTM, E2-LSTM, and PMETP) is significantly lower than that of XY-LSTM and V-LSTM, indicating that the interaction between agents is one of the most influential factors in motion forecasting. The RMSE of the CV model and of the LSTM-based models is close in the short-term domain but diverges as the horizon grows, indicating that the CV model is suitable only for short-term trajectory prediction and that LSTM has a stronger advantage in long-term prediction. Furthermore, the comparison of RMSE and NLL across the models on the HighD dataset shows that the PMETP proposed in this paper has clear advantages. Relative to CV, CS-LSTM, XY-LSTM, V-LSTM, E1-LSTM, and E2-LSTM, the average RMSE over the 6 s horizon decreased by 45.93%, 5.97%, 34.76%, 29.6%, 10.7%, and 5.51%, respectively; relative to CS-LSTM, XY-LSTM, V-LSTM, E1-LSTM, and E2-LSTM, the average NLL over the 6 s horizon decreased by 6.61%, 28.29%, 22.49%, 16.06%, and 8.7%, respectively. The generated multi-modal probability distribution is more consistent with the real trajectory.

Table 3. Test results of each model based on HighD dataset.

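| Evaluation metric | Prediction horizon (s) | CV | CS-LSTM | XY-LSTM | V-LSTM | E1-LSTM | E2-LSTM | PMETP |
|---|---|---|---|---|---|---|---|---|
| RMSE (m) | 1 | 0.86 | 0.59 | 0.72 | 0.69 | 0.65 | 0.63 | 0.61 |
| RMSE (m) | 2 | 2.08 | 1.22 | 1.59 | 1.46 | 1.33 | 1.28 | 1.24 |
| RMSE (m) | 3 | 4.52 | 2.33 | 3.68 | 3.25 | 2.43 | 2.16 | 2.03 |
| RMSE (m) | 4 | 5.79 | 3.42 | 4.79 | 4.33 | 3.48 | 3.23 | 3.14 |
| RMSE (m) | 5 | 6.93 | 4.19 | 6.26 | 5.84 | 4.31 | 4.22 | 3.95 |
| RMSE (m) | 6 | 8.67 | 4.84 | 6.87 | 6.59 | 5.27 | 4.99 | 4.63 |
| NLL | 1 | - | 0.59 | 2.11 | 1.65 | 1.26 | 0.72 | 0.52 |
| NLL | 2 | - | 2.16 | 2.98 | 2.69 | 2.43 | 2.23 | 1.98 |
| NLL | 3 | - | 2.78 | 3.83 | 3.49 | 3.06 | 2.85 | 2.56 |
| NLL | 4 | - | 3.25 | 4.39 | 4.03 | 3.76 | 3.46 | 3.08 |
| NLL | 5 | - | 4.41 | 5.29 | 5.06 | 4.97 | 4.66 | 4.31 |
| NLL | 6 | - | 5.25 | 5.82 | 5.67 | 5.38 | 5.26 | 5.06 |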
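The average error reductions quoted in the text can be recomputed from Table 3 as the relative decrease of the mean error over the six horizons. The sketch below assumes that this is how the averages were formed; the helper names are illustrative. Because the table entries are rounded to two decimals, recomputed values may deviate from the percentages quoted in the text.

```python
# Per-horizon errors (1 s to 6 s) copied from Table 3.
RMSE = {
    "CV": [0.86, 2.08, 4.52, 5.79, 6.93, 8.67],
    "CS-LSTM": [0.59, 1.22, 2.33, 3.42, 4.19, 4.84],
    "XY-LSTM": [0.72, 1.59, 3.68, 4.79, 6.26, 6.87],
    "V-LSTM": [0.69, 1.46, 3.25, 4.33, 5.84, 6.59],
    "E1-LSTM": [0.65, 1.33, 2.43, 3.48, 4.31, 5.27],
    "E2-LSTM": [0.63, 1.28, 2.16, 3.23, 4.22, 4.99],
    "PMETP": [0.61, 1.24, 2.03, 3.14, 3.95, 4.63],
}
NLL = {
    "CS-LSTM": [0.59, 2.16, 2.78, 3.25, 4.41, 5.25],
    "XY-LSTM": [2.11, 2.98, 3.83, 4.39, 5.29, 5.82],
    "V-LSTM": [1.65, 2.69, 3.49, 4.03, 5.06, 5.67],
    "E1-LSTM": [1.26, 2.43, 3.06, 3.76, 4.97, 5.38],
    "E2-LSTM": [0.72, 2.23, 2.85, 3.46, 4.66, 5.26],
    "PMETP": [0.52, 1.98, 2.56, 3.08, 4.31, 5.06],
}


def mean(values):
    return sum(values) / len(values)


def reductions(table, proposed="PMETP"):
    """Percent decrease of the mean error of `proposed` relative to each baseline."""
    ours = mean(table[proposed])
    return {
        name: 100.0 * (mean(values) - ours) / mean(values)
        for name, values in table.items()
        if name != proposed
    }


rmse_reduction = reductions(RMSE)
nll_reduction = reductions(NLL)
for name, value in rmse_reduction.items():
    print(f"RMSE reduction vs {name}: {value:.2f}%")
for name, value in nll_reduction.items():
    print(f"NLL reduction vs {name}: {value:.2f}%")
```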


Conclusion

To adequately represent the vehicle behavior prediction space and address the inherent uncertainty in prediction, we propose a multi-modal expected trajectory prediction model based on probability density. The major contents and results of this paper are as follows.

1. In this paper, we consider not only the motion state of the predicted vehicle but also the environmental information of the target vehicle, that is, the characteristic information of the surrounding traffic vehicles, the characteristic information of the interaction between the predicted vehicle and the surrounding traffic vehicles, and the association between the surrounding traffic vehicles. Meanwhile, we also extract the classification label of the agent based on the yaw angle, which provides the data support for the PMETP model.

2. The driving behavior intention recognition module is used to predict the probability of the target vehicle performing lane keeping, left lane changing, right lane changing, left acceleration lane changing, and right acceleration lane changing. The probability distribution of the future trajectory position is predicted by the MDN Gaussian kernel function.

3. The driving behavior intention recognition module is analyzed using evaluation metrics such as accuracy, recall, F1-score, and precision, and the model achieves an accuracy rate of over 98%. Meanwhile, PMETP can produce multi-modal prediction results in the driving scenarios of lane keeping and lane changing. In this paper, we compare the RMSE and NLL of the proposed PMETP with those of different models. Relative to CV, CS-LSTM, XY-LSTM, V-LSTM, E1-LSTM, and E2-LSTM, the average RMSE over the 6 s horizon decreased by 45.93%, 5.97%, 34.76%, 29.6%, 10.7%, and 5.51%, respectively; relative to CS-LSTM, XY-LSTM, V-LSTM, E1-LSTM, and E2-LSTM, the average NLL over the 6 s horizon decreased by 6.61%, 28.29%, 22.49%, 16.06%, and 8.7%, respectively.

However, we focus only on multi-modal trajectory prediction on motorways and do not consider complex scenarios such as congestion, intersections, and mixed pedestrian and vehicle traffic. Future research will take complex scenarios into account in motion prediction and adopt state-of-the-art methods such as the transformer.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Natural Science Foundation of China (grant No. 52202494).

ORCID iDs

Mingxi Bao https://orcid.org/0000-0003-4929-5260

Fei Gao https://orcid.org/0000-0003-4195-5033

References

1. Huang Z, Mo X and Lv C. Multi-modal motion prediction with transformer-based neural network for autonomous driving. In: 2022 International conference on robotics and automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022, pp.2605–2611. New York, NY: IEEE.
2. Luo C, Sun L, Dabiri D, et al. Probabilistic multi-modal trajectory prediction with lane attention for autonomous vehicles. In: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021, pp.2370–2376. New York, NY: IEEE.
3. Casas S, Gulino C, Suo S, et al. The importance of prior knowledge in precise multimodal prediction. In: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021, pp.2295–2302. New York, NY: IEEE.
4. Huang Y, Du J, Yang Z, et al. A survey on trajectory-prediction methods for autonomous driving. IEEE Trans Intell Vehicles 2022; 7: 652–674.
5. Xiong L, Xia X, Lu Y, et al. IMU-based automated vehicle body sideslip angle and attitude estimation aided by GNSS using parallel adaptive Kalman filters. IEEE Trans Vehicular Technol 2020; 69(10): 10668–10680.
6. Liu W, Xia X, Xiong L, et al. Automated vehicle sideslip angle estimation considering signal measurement characteristic. IEEE Sens J 2021; 21(19): 21675–21687.
7. Liu W, Xiong L, Xia X, et al. Vision-aided intelligent vehicle sideslip angle estimation based on a dynamic model. IET Intell Transp Syst 2020; 14(10): 1183–1189.
8. Zhenhai G. Soft sensor application in vehicle yaw rate measurement based on Kalman filter and vehicle dynamics. In: Proceedings of the 2003 IEEE international conference on intelligent transportation systems, Shanghai, China, 12–15 October 2003, pp.1352–1354. New York, NY: IEEE.
9. Polychronopoulos A, Tsogas M, Amditis AJ, et al. Sensor fusion for predicting vehicles’ path for collision avoidance systems. IEEE Trans Intell Transp Syst 2007; 8(3): 549–562.
10. Mozaffari S, Al-Jarrah OY, Dianati M, et al. Deep learning-based vehicle behavior prediction for autonomous driving applications: A review. IEEE Trans Intell Transp Syst 2022; 23(1): 33–47.
11. Lefèvre S, Vasquez D and Laugier C. A survey on motion prediction and risk assessment for intelligent vehicles. Robomech J 2014; 1(1): 1–14.
12. Kim BD, Kang CM, Kim J, et al. Probabilistic vehicle trajectory prediction over occupancy grid map via recurrent neural network. In: 2017 IEEE 20th international conference on intelligent transportation systems (ITSC), Yokohama, Japan, 16–19 October 2017, pp.399–404. New York, NY: IEEE.


13. Khakzar M, Rakotonirainy A, Bond A, et al. A dual learning model for vehicle trajectory prediction. IEEE Access 2020; 8: 21897–21908.
14. Xie DF, Fang ZZ, Jia B, et al. A data-driven lane-changing model based on deep learning. Transp Res Part C Emerg Technol 2019; 106: 41–60.
15. Lin L, Li W, Bi H, et al. Vehicle trajectory prediction using LSTMs with spatial–temporal attention mechanisms. IEEE Intell Transp Syst Mag 2022; 14(2): 197–208.
16. Xiao H, Wang C, Li Z, et al. UB-LSTM: a trajectory prediction method combined with vehicle behavior recognition. J Adv Transport 2020; 2020: 1–12.
17. Tang C and Salakhutdinov RR. Multiple futures prediction. Adv Neural Inf Process Syst 2019; 32: 15398–15408.
18. Lee N, Choi W, Vernaza P, et al. Desire: distant future prediction in dynamic scenes with interacting agents. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Honolulu, HI, USA, 21–26 July 2017, pp.336–345. New York, NY: IEEE.
19. Yuan Y and Kitani K. Diverse trajectory forecasting with determinantal point processes. arXiv preprint arXiv:1907.04967, 2019.
20. Gupta A, Johnson J, Fei-Fei L, et al. Social GAN: socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Salt Lake City, UT, USA, 18–23 June 2018, pp.2255–2264. New York, NY: IEEE.
21. Cui H, Radosavljevic V, Chou FC, et al. Multimodal trajectory predictions for autonomous driving using deep convolutional networks. In: 2019 International conference on robotics and automation (ICRA), Montreal, QC, Canada, 20–24 May 2019, pp.2090–2096.
22. Phan-Minh T, Grigore EC, Boulton FA, et al. CoverNet: multimodal behavior prediction using trajectory sets. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, Seattle, WA, USA, 13–19 June 2020, pp.14074–14083. New York, NY: IEEE.
23. Rhinehart N, McAllister R, Kitani K, et al. PRECOG: prediction conditioned on goals in visual multi-agent settings. In: Proceedings of the IEEE/CVF international conference on computer vision, Seoul, South Korea, 27 October–2 November 2019, pp.2821–2830. New York, NY: IEEE.
24. Biktairov Y, Stebelev M, Rudenko I, et al. Prank: motion prediction based on ranking. Adv Neural Inf Process Syst 2020; 33: 2553–2563.
25. Deo N and Trivedi MM. Convolutional social pooling for vehicle trajectory prediction. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, Salt Lake City, UT, USA, 18–22 June 2018, pp.1468–1476. New York, NY: IEEE.
26. Deo N and Trivedi MM. Multi-modal trajectory prediction of surrounding vehicles with maneuver based LSTMs. In: 2018 IEEE intelligent vehicles symposium (IV), Changshu, China, 26–30 June 2018, pp.1179–1184. New York, NY: IEEE.
27. Deo N, Wolff E and Beijbom O. Multimodal trajectory prediction conditioned on lane-graph traversals. In: Proceedings of the 5th Conference on Robot Learning, London, UK, 2022, pp.203–212.
28. Gers FA, Schmidhuber J and Cummins F. Learning to forget: Continual prediction with LSTM. Neural Comput 2000; 12(10): 2451–2471.
29. Bishop CM. Mixture density networks. Technical report, Aston University, 1994.
30. Krajewski R, Bock J, Kloeker L, et al. The highD dataset: a drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: 2018 21st international conference on intelligent transportation systems (ITSC), Maui, HI, 2018, pp.2118–2125. New York, NY: IEEE.
