Data-Driven Predictive Control for Connected and Autonomous Vehicles in Mixed Traffic

Abstract

Cooperative control of Connected and Autonomous Vehicles (CAVs) promises great benefits for mixed traffic. Most existing research focuses on model-based control strategies, assuming that the car-following dynamics of human-driven vehicles (HDVs) are explicitly known. In this paper, instead of relying on a parametric car-following model, we introduce a data-driven predictive control strategy to achieve safe and optimal control for CAVs in mixed traffic. We first present a linearized dynamical model for mixed traffic systems and investigate its controllability and observability. Based on these control-theoretic properties, we then propose a novel DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control) strategy for CAVs, which uses measurable driving data to smooth mixed traffic. Our method is implemented in a receding horizon manner, in which input/output constraints are incorporated to achieve collision-free guarantees. Nonlinear traffic simulations show that DeeP-LCC can save up to 24.96% of fuel consumption during a braking scenario of the Extra-Urban Driving Cycle while ensuring safety.
Jiawei Wang¹, Yang Zheng², Qing Xu¹ and Keqiang Li¹
I. INTRODUCTION
The emergence of connected and autonomous vehicles (CAVs) has provided new opportunities for smoothing traffic flow [1]. One typical technology is Cooperative Adaptive Cruise Control (CACC), which regulates a series of CAVs to achieve higher traffic efficiency and better fuel economy [2]–[4]. Given the gradual deployment of CAVs, there will be a transition phase of mixed traffic flow, where human-driven vehicles (HDVs) and CAVs coexist. Recently, it has been shown theoretically and experimentally that incorporating the HDVs’ behavior into CAVs’ controller design can significantly improve traffic performance in mixed traffic [5]–[8].
The mixed traffic flow is essentially a complex human-in-the-loop cyber-physical system, where HDVs are controlled by human drivers with uncertain and stochastic behaviors. Most existing research exploits microscopic car-following models to describe HDVs’ behavior and designs model-based control strategies for CAVs [7]–[9]. In practice, however, human car-following dynamics for individual vehicles are complex and nonlinear, and are non-trivial to identify accurately. Indeed, model-free or data-driven methods, which bypass model identification, have recently received significant attention [10], [11]. For example, reinforcement learning [12] and adaptive dynamic programming [13] have been recently utilized for mixed traffic control, which employ large-scale driving data of HDVs to train control strategies of CAVs. However, these methods typically bring a heavy computation burden and have difficulties in including safety constraints to achieve collision-free guarantees in practical deployment.

This work is supported by National Key R&D Program of China with 2018YFE0204302, National Natural Science Foundation of China with 52072212, China Intelligent and Connected Vehicles (Beijing) Research Institute Co., Ltd., and Dongfeng Automobile Co., Ltd. All correspondence should be sent to Y. Zheng and K. Li.
¹J. Wang, Q. Xu and K. Li are with the School of Vehicle and Mobility, Tsinghua University, Beijing, China (wang-jw18@mails.tsinghua.edu.cn, {qingxu,likq}@tsinghua.edu.cn).
²Y. Zheng is with the Department of Electrical and Computer Engineering, University of California San Diego, CA 92093 (zhengy@eng.ucsd.edu).

Recent advancements in data-driven predictive control have provided effective techniques towards safe learning [14]. One promising strategy is the recent Data-EnablEd Predictive Control (DeePC) method [15], which is able to achieve safe and optimal control for unknown systems using input/output measurements. Instead of identifying a parametric system model, DeePC relies on Willems’ fundamental lemma [16] to directly predict future trajectories. DeePC also allows one to incorporate input/output constraints to ensure safety. Moreover, DeePC has been shown to be equivalent to sequential system identification and Model Predictive Control (MPC) for deterministic linear time-invariant (LTI) systems [15], and has demonstrated better control performance for stochastic and nonlinear systems [17], [18]. More recently, practical applications have been seen in quadcopter systems [15], power grids [19] and electric motor drives [20]. To the best of our knowledge, data-driven predictive control methods such as DeePC have not been utilized in mixed traffic control, and the results above are not directly applicable due to distinct dynamical properties of mixed traffic systems.
In this paper, we aim to design safe and optimal control strategies for CAVs to smooth mixed traffic flow that require no prior knowledge of HDVs’ car-following dynamics. In particular, motivated by DeePC [15], we introduce a Data-EnablEd Predictive Leading Cruise Control (DeeP-LCC) strategy, in which the CAVs utilize measurable driving data for controller design with collision-free guarantees. Our main contributions include: 1) We establish a linearized state-space model for a general mixed traffic system with multiple CAVs and HDVs under the Leading Cruise Control (LCC) framework [21]. We define measurable driving data as the LCC system output, highlighting the fact that HDVs’ equilibrium spacing is practically unknown. This issue has been neglected in many results on mixed traffic that require state-feedback control [7], [9], [13]. We show that the linearized mixed traffic system is not completely controllable unless the first vehicle is a CAV, but is observable under a mild condition. These control-theoretic results serve as the foundation for our reformulation of DeePC [15] for mixed traffic. 2) We propose a DeeP-LCC method for mixed traffic control, which directly utilizes trajectory data of HDVs without identifying an explicit parametric car-following model. The standard DeePC requires the underlying system to be controllable [15], [16], and thus cannot be directly applied to mixed traffic. To address this issue, we introduce an external input signal to record the data of the head vehicle. Together with CAVs’ control input, this contributes to system controllability. In addition, we directly incorporate spacing constraints on the driving behavior for safety guarantees.

Fig. 1. Schematic of mixed traffic flow. The head vehicle, at the very beginning, is indexed as 0, behind which there exist $n$ vehicles consisting of $m$ CAVs and $n-m$ HDVs with unknown driving dynamics.

arXiv:2110.10097v1 [eess.SY] 19 Oct 2021

The rest of this paper is organized as follows. Section II introduces the modeling for the mixed traffic system, and Section III presents the controllability and observability analysis results. We then present DeeP-LCC in Section IV. Traffic simulations are presented in Section V, and Section VI concludes this paper.
II. THEORETICAL MODELING FRAMEWORK
In this section, we first introduce the parametric modeling of HDVs’ car-following behavior, and then present the linearized dynamics of a general mixed traffic system under the LCC framework; see [21] for a detailed motivation of LCC.
As shown in Fig. 1, we consider a general mixed traffic system with $n+1$ individual vehicles, among which there exist one head vehicle, indexed as 0, and $m$ CAVs and $n-m$ HDVs. Define $\Omega = \{1, 2, \ldots, n\}$ as the set of all the vehicle indices, ordered from front to end, and $S = \{i_1, i_2, \ldots, i_m\} \subseteq \Omega$ as the set of the CAV indices, where $i_1 < i_2 < \ldots < i_m$ also represent the spatial locations of the CAVs in mixed traffic flow. The position, velocity and acceleration of the $i$-th vehicle at time $t$ are denoted as $p_i(t)$, $v_i(t)$ and $a_i(t)$, respectively.
A. Car-Following Dynamics of HDVs
There exist many well-established continuous-time models for car-following dynamics, including the optimal velocity model (OVM), the intelligent driver model (IDM) and their variants [22]. Most of these models can be written in the following nonlinear form
$$
\dot{v}_i(t) = F\left(s_i(t), \dot{s}_i(t), v_i(t)\right), \quad i \in \Omega \setminus S, \tag{1}
$$
where $s_i(t) = p_{i-1}(t) - p_i(t)$ denotes the car-following spacing of vehicle $i$, and $\dot{s}_i(t) = v_{i-1}(t) - v_i(t)$ denotes the relative velocity. The nonlinear function $F(\cdot)$ represents that the acceleration of an HDV depends on the relative distance, relative velocity and its own velocity.
In an equilibrium traffic state, each vehicle moves with the same equilibrium velocity $v^*$ and the corresponding equilibrium spacing $s^*$. According to (1), we have the following equilibrium equation
$$
F(s^*, 0, v^*) = 0. \tag{2}
$$
Upon assuming that the mixed traffic flow is around an equilibrium state $(s^*, v^*)$, we define the error state between actual and equilibrium point as $\tilde{s}_i(t) = s_i(t) - s^*$, $\tilde{v}_i(t) = v_i(t) - v^*$, $i \in \Omega$, where $\tilde{s}_i$, $\tilde{v}_i$ represent the spacing error and velocity error of vehicle $i$, respectively. Then a linearized dynamics model for each HDV can be derived by using (2) and applying the first-order Taylor expansion to (1)
$$
\begin{cases}
\dot{\tilde{s}}_i(t) = \tilde{v}_{i-1}(t) - \tilde{v}_i(t), \\
\dot{\tilde{v}}_i(t) = \alpha_1 \tilde{s}_i(t) - \alpha_2 \tilde{v}_i(t) + \alpha_3 \tilde{v}_{i-1}(t),
\end{cases} \quad i \notin S, \tag{3}
$$
where $\alpha_1 = \frac{\partial F}{\partial s}$, $\alpha_2 = \frac{\partial F}{\partial \dot{s}} - \frac{\partial F}{\partial v}$, $\alpha_3 = \frac{\partial F}{\partial \dot{s}}$, with the partial derivatives evaluated at the equilibrium state $(s^*, v^*)$. To reflect the real stable driving behavior of human drivers, we have $\alpha_1 > 0$, $\alpha_2 > \alpha_3 > 0$. More linearization details can be found in [5], [9].
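To illustrate how the coefficients in (3) arise from (1) and (2), the following sketch linearizes one common form of the OVM by central finite differences. The cosine-type desired-velocity function, the function names and all parameter values (`ALPHA`, `BETA`, `V_MAX`, `S_ST`, `S_GO`) are assumptions made for illustration and are not taken from this paper.

```python
import math

# Illustrative OVM parameters (assumed values, not taken from this paper).
ALPHA, BETA = 0.6, 0.9                 # driver sensitivity gains
V_MAX, S_ST, S_GO = 30.0, 5.0, 35.0    # max velocity [m/s], stop / free-flow spacing [m]

def desired_velocity(s):
    """Spacing-dependent desired velocity V(s) (cosine form, assumed)."""
    if s <= S_ST:
        return 0.0
    if s >= S_GO:
        return V_MAX
    return 0.5 * V_MAX * (1.0 - math.cos(math.pi * (s - S_ST) / (S_GO - S_ST)))

def F(s, s_dot, v):
    """Car-following law in the form of (1): acceleration of an HDV."""
    return ALPHA * (desired_velocity(s) - v) + BETA * s_dot

def linearize(s_eq, h=1e-6):
    """Coefficients (alpha_1, alpha_2, alpha_3) of (3) at the equilibrium (s*, v*)."""
    v_eq = desired_velocity(s_eq)      # chosen so that F(s*, 0, v*) = 0, cf. (2)
    dF_ds = (F(s_eq + h, 0.0, v_eq) - F(s_eq - h, 0.0, v_eq)) / (2 * h)
    dF_dsdot = (F(s_eq, h, v_eq) - F(s_eq, -h, v_eq)) / (2 * h)
    dF_dv = (F(s_eq, 0.0, v_eq + h) - F(s_eq, 0.0, v_eq - h)) / (2 * h)
    return dF_ds, dF_dsdot - dF_dv, dF_dsdot

a1, a2, a3 = linearize(s_eq=20.0)
stable_driver = a1 > 0 and a2 > a3 > 0   # sign pattern stated after (3)
```

For this assumed model the derivatives are also available in closed form ($\alpha_1 = \alpha V'(s^*)$, $\alpha_2 = \alpha + \beta$, $\alpha_3 = \beta$), which gives a convenient check on the numerical values.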
B. State-Space Model of Mixed Traffic Systems
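Similar to [5], [9], [13], [23], we use the acceleration of each CAV as the control input, i.e., $\dot{v}_i(t) = u_i(t)$, $i \in S$. A second-order model is used to describe the linear longitudinal dynamics of each CAV
$$
\begin{cases}
\dot{\tilde{s}}_i(t) = \tilde{v}_{i-1}(t) - \tilde{v}_i(t), \\
\dot{\tilde{v}}_i(t) = u_i(t),
\end{cases} \quad i \in S. \tag{4}
$$
To derive a linearized model of the mixed traffic system shown in Fig. 1, we lump the error states of all the vehicles as the mixed traffic system state ($x(t) \in \mathbb{R}^{2n}$)
$$
x(t) = \begin{bmatrix} \tilde{s}_1(t), \tilde{v}_1(t), \tilde{s}_2(t), \tilde{v}_2(t), \ldots, \tilde{s}_n(t), \tilde{v}_n(t) \end{bmatrix}^{\mathsf{T}},
$$
and lump the acceleration signal of all the CAVs as the collective control input $u(t) = \begin{bmatrix} u_{i_1}(t), u_{i_2}(t), \ldots, u_{i_m}(t) \end{bmatrix}^{\mathsf{T}} \in \mathbb{R}^{m}$. Then, based on the linearized HDVs’ car-following model (3) and the CAV’s dynamics (4), the linearized state-space model for the mixed traffic system is obtained as
$$
\dot{x}(t) = A x(t) + B u(t) + H \epsilon(t), \tag{5}
$$
where $\epsilon(t) = \tilde{v}_0(t) = v_0(t) - v^*$, i.e., the velocity error of the head vehicle, is regarded as an external signal to the system. The system matrices in (5) are given by (see [21] for details)
$$
A = \begin{bmatrix}
A_{1,1} & & & & \\
A_{2,2} & A_{2,1} & & & \\
& \ddots & \ddots & & \\
& & A_{n-1,2} & A_{n-1,1} & \\
& & & A_{n,2} & A_{n,1}
\end{bmatrix} \in \mathbb{R}^{2n \times 2n},
$$
$$
B = \begin{bmatrix} e_{2n}^{2i_1}, e_{2n}^{2i_2}, \ldots, e_{2n}^{2i_m} \end{bmatrix} \in \mathbb{R}^{2n \times m}, \qquad
H = \begin{bmatrix} h_1^{\mathsf{T}}, h_2^{\mathsf{T}}, \ldots, h_n^{\mathsf{T}} \end{bmatrix}^{\mathsf{T}} \in \mathbb{R}^{2n \times 1},
$$
where $e_p^r$ denotes a $p \times 1$ unit vector, with the $r$-th entry being one and the others being zeros, and
$$
A_{i,1} = \begin{cases} P_1, & i \notin S; \\ S_1, & i \in S; \end{cases} \qquad
A_{i,2} = \begin{cases} P_2, & i \notin S; \\ S_2, & i \in S; \end{cases}
$$
$$
h_1 = \begin{bmatrix} 1 \\ \alpha_3 \end{bmatrix}, \qquad
h_j = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad j \in \{2, 3, \ldots, n\},
$$
with
$$
P_1 = \begin{bmatrix} 0 & -1 \\ \alpha_1 & -\alpha_2 \end{bmatrix}, \quad
P_2 = \begin{bmatrix} 0 & 1 \\ 0 & \alpha_3 \end{bmatrix}, \quad
S_1 = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}, \quad
S_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.
$$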
Measurable driving data: Note that the state in (5) cannot be directly measured. In practice, the equilibrium velocity $v^*$ can be obtained from the steady-state velocity of the head vehicle. However, the equilibrium spacing $s^*$ for the HDVs is non-trivial to accurately estimate, since the car-following behavior of each human driver is unknown. It is thus impractical to observe the information of the spacing error signal of the HDVs, i.e., $\tilde{s}_i$ ($i \notin S$). For the CAVs, by contrast, their equilibrium spacing can be designed [5], and thus their spacing error signal can be directly measured. Accordingly, we introduce an output signal $y(t)$ as follows
$$
y(t) = \begin{bmatrix} \tilde{v}_1(t), \tilde{v}_2(t), \ldots, \tilde{v}_n(t), \tilde{s}_{i_1}(t), \tilde{s}_{i_2}(t), \ldots, \tilde{s}_{i_m}(t) \end{bmatrix}^{\mathsf{T}},
$$
where $y(t) \in \mathbb{R}^{n+m}$ consists of the velocity error signal of both the HDVs and the CAVs, i.e., $\tilde{v}_i$ ($i \in \Omega$), and the spacing error of all the CAVs, i.e., $\tilde{s}_i$ ($i \in S$). The output of the system (5) is given by
$$
y(t) = C x(t), \tag{6}
$$
where $C = \begin{bmatrix} e_{2n}^{2}, e_{2n}^{4}, \ldots, e_{2n}^{2n}, e_{2n}^{2i_1-1}, e_{2n}^{2i_2-1}, \ldots, e_{2n}^{2i_m-1} \end{bmatrix}^{\mathsf{T}} \in \mathbb{R}^{(n+m) \times 2n}$ is the output matrix.
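The block structure of (5) and (6) can be assembled mechanically. The sketch below is one possible NumPy construction; the function name `build_mixed_traffic_model` and the numerical coefficients in the usage lines are assumptions for illustration. Vehicle indices follow the 1-based numbering of the text, so the unit vector $e_{2n}^{2i}$ becomes row `2*i - 1` in 0-based indexing.

```python
import numpy as np

def build_mixed_traffic_model(n, cav_indices, a1, a2, a3):
    """Assemble (A, B, H, C) of (5)-(6); cav_indices is the 1-based set S."""
    S = sorted(cav_indices)
    m = len(S)
    P1 = np.array([[0.0, -1.0], [a1, -a2]])    # HDV: own spacing/velocity errors
    P2 = np.array([[0.0, 1.0], [0.0, a3]])     # HDV: coupling to the predecessor
    S1 = np.array([[0.0, -1.0], [0.0, 0.0]])   # CAV: own states
    S2 = np.array([[0.0, 1.0], [0.0, 0.0]])    # CAV: coupling to the predecessor

    A = np.zeros((2 * n, 2 * n))
    for i in range(1, n + 1):
        r = 2 * (i - 1)
        A[r:r + 2, r:r + 2] = S1 if i in S else P1
        if i > 1:
            A[r:r + 2, r - 2:r] = S2 if i in S else P2

    B = np.zeros((2 * n, m))
    for col, i in enumerate(S):
        B[2 * i - 1, col] = 1.0                # acceleration enters the velocity row

    H = np.zeros((2 * n, 1))
    H[0, 0], H[1, 0] = 1.0, a3                 # h_1 = [1, alpha_3]^T as given in the text

    C = np.zeros((n + m, 2 * n))
    for i in range(1, n + 1):
        C[i - 1, 2 * i - 1] = 1.0              # velocity errors of all vehicles
    for row, i in enumerate(S):
        C[n + row, 2 * i - 2] = 1.0            # spacing errors of the CAVs only
    return A, B, H, C

# Usage with assumed coefficients and a platoon of 8 followers with CAVs at 3 and 6.
A, B, H, C = build_mixed_traffic_model(8, {3, 6}, a1=0.94, a2=1.5, a3=0.9)
```

Only the diagonal and first sub-diagonal blocks of `A` are non-zero, which reflects that each vehicle reacts solely to its immediate predecessor.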
III. CONTROLLABILITY AND OBSERVABILITY OF MIXED TRAFFIC SYSTEMS
Controllability and observability serve as foundations in dynamical systems [24]. For mixed traffic systems, existing research has revealed the controllability for the scenario of one single CAV and multiple HDVs, i.e., $|S| = 1$ [9], [21]. These results have been unified in the recent LCC framework with one single CAV [21].
For general mixed traffic systems with multiple HDVs and CAVs, given by (5) and (6), we have the following results.
Theorem 1 (Controllability): Consider the general mixed traffic system given by (5) and (6), where there exist $m$ ($m \geq 1$) CAVs with indices $S = \{i_1, i_2, \ldots, i_m\}$, $i_1 < i_2 < \ldots < i_m$. The following statements hold.
1) When $1 \in S$, i.e., $i_1 = 1$, the mixed traffic system is controllable if the following condition holds
$$
\alpha_1 - \alpha_2 \alpha_3 + \alpha_3^2 \neq 0. \tag{7}
$$
2) When $1 \notin S$, the mixed traffic system is not completely controllable but is stabilizable, if (7) holds. The subsystem consisting of the states $\tilde{s}_1, \tilde{v}_1, \ldots, \tilde{s}_{i_1-1}, \tilde{v}_{i_1-1}$ is not controllable but is stable, while the subsystem consisting of the states $\tilde{s}_{i_1}, \tilde{v}_{i_1}, \ldots, \tilde{s}_n, \tilde{v}_n$ is controllable.
The proof idea is based on a controllability invariance property under state feedback with respect to the result [21, Theorem 2]; the technical proof is postponed to the Appendix. Theorem 1 indicates that the general mixed traffic system is not completely controllable unless the vehicle immediately behind the head vehicle is a CAV. This is expected, since the motion of the HDVs between the head vehicle and the first CAV cannot be influenced by the CAVs’ control.
Theorem 2 (Observability): The general mixed traffic system, given by (5) and (6), is observable when (7) holds.
The observability result can be proved by adapting [21, Theorem 4]. The slight asymmetry between Theorems 1 and 2 is due to the fact that the control input in (5) includes only the acceleration signal of all the CAVs, while the system output in (6) consists of the velocity error signal of all the vehicles and the spacing error signal of the CAVs. Theorem 2 reveals the observability of the full state $x(t)$ in mixed traffic under a mild condition. This observability result facilitates the design of our DeeP-LCC controller.
Still, the controllability of a dynamical system is a desired property, which guarantees the data-driven behavior representation in the next section. Note that the velocity error signal of the head vehicle $\epsilon(t) = v_0(t) - v^*$ is an external input. This signal is not controlled, but can be measured in practice. Define $\hat{u}(t) = \mathrm{col}(\epsilon(t), u(t))$ as a combined input signal and $\hat{B} = [H, B]$ as the corresponding input matrix. Then, a reformulated model for the mixed traffic system is
$$
\begin{cases}
\dot{x}(t) = A x(t) + \hat{B} \hat{u}(t), \\
y(t) = C x(t).
\end{cases} \tag{8}
$$
For this system model, we have the following result.
Corollary 1: Consider the mixed traffic system given by (8), where there exist $m$ ($m \geq 1$) CAVs. Then, the system $(A, \hat{B}, C)$ is controllable and observable if (7) holds.
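The statements of Theorems 1 and 2 and Corollary 1 can be checked numerically on a small instance with the Kalman rank test. The sketch below uses $n = 3$ followers with vehicle 2 as the only CAV, so that $1 \notin S$; the coefficient values and helper names are assumptions for illustration. A small platoon is used on purpose, because Kalman matrices become numerically ill-conditioned as the state dimension grows.

```python
import numpy as np

def ctrb_rank(A, B):
    """Rank of the Kalman controllability matrix [B, AB, ..., A^{k-1}B]."""
    blocks, M = [], B
    for _ in range(A.shape[0]):
        blocks.append(M)
        M = A @ M
    return np.linalg.matrix_rank(np.hstack(blocks))

def obsv_rank(A, C):
    """Observability rank, obtained by duality from the controllability test."""
    return ctrb_rank(A.T, C.T)

a1, a2, a3 = 0.94, 1.5, 0.9                    # assumed HDV coefficients
P1 = np.array([[0.0, -1.0], [a1, -a2]])
P2 = np.array([[0.0, 1.0], [0.0, a3]])
S1 = np.array([[0.0, -1.0], [0.0, 0.0]])
S2 = np.array([[0.0, 1.0], [0.0, 0.0]])
Z = np.zeros((2, 2))

# State order: (s1, v1, s2, v2, s3, v3); vehicle 2 is the CAV.
A = np.block([[P1, Z, Z], [S2, S1, Z], [Z, P2, P1]])
B = np.zeros((6, 1)); B[3, 0] = 1.0            # CAV acceleration
H = np.array([[1.0], [a3], [0.0], [0.0], [0.0], [0.0]])
C = np.zeros((4, 6))
C[0, 1] = C[1, 3] = C[2, 5] = 1.0              # velocity errors of vehicles 1..3
C[3, 2] = 1.0                                  # spacing error of the CAV

condition_7 = abs(a1 - a2 * a3 + a3 ** 2) > 1e-9
rank_u = ctrb_rank(A, B)                       # CAV input only
rank_uhat = ctrb_rank(A, np.hstack([H, B]))    # combined input of (8)
rank_obs = obsv_rank(A, C)
```

With these values, `rank_u` equals 4 (the two states of the HDV ahead of the CAV are unreachable), while `rank_uhat` and `rank_obs` both equal the full state dimension 6.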
IV. DEEP-LCC FOR MIXED TRAFFIC FLOW
In this section, we first introduce a non-parametric data-centric representation of the mixed traffic system behavior based on Willems’ fundamental lemma [16], and then present the Data-EnablEd Predictive Leading Cruise Control (DeeP-LCC) strategy for mixed traffic control.
A. Non-Parametric Representation of System Behavior
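The system model in (5) and (6) is in continuous time. It can be straightforwardly transformed into the discrete time domain
$$
\begin{cases}
x(k+1) = A_d x(k) + B_d u(k) + H_d \epsilon(k), \\
y(k) = C_d x(k),
\end{cases} \tag{9}
$$
where $A_d = e^{A \Delta t} \in \mathbb{R}^{2n \times 2n}$, $B_d = \int_0^{\Delta t} e^{A t} B \, dt \in \mathbb{R}^{2n \times m}$, $H_d = \int_0^{\Delta t} e^{A t} H \, dt \in \mathbb{R}^{2n \times 1}$, $C_d = C \in \mathbb{R}^{(n+m) \times 2n}$, and $\Delta t > 0$ is the sampling interval.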
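The integrals in (9) need not be evaluated separately: exponentiating the augmented matrix $\begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix} \Delta t$ yields $A_d$ and $B_d$ in its top block row. The sketch below applies this identity with a truncated Taylor series so that it depends on NumPy only; the helper names are assumptions, and in practice a library routine such as `scipy.linalg.expm` would normally replace `expm_taylor`.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential by a truncated Taylor series (adequate for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k                    # M^k / k!
        result = result + term
    return result

def discretize(A, B, dt):
    """Return (A_d, B_d) of (9), using exp([[A, B], [0, 0]] dt) = [[A_d, B_d], [0, I]]."""
    nx, nu = B.shape
    M = np.zeros((nx + nu, nx + nu))
    M[:nx, :nx] = A
    M[:nx, nx:] = B
    E = expm_taylor(M * dt)
    return E[:nx, :nx], E[:nx, nx:]

# Check on a double integrator, whose exact discretization is known in closed form.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Ad, Bd = discretize(A, B, dt=0.05)
```

Calling `discretize(A, H, dt)` with the matrix $H$ of (5) gives $H_d$ in the same way.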
Model-based control strategies typically follow the sequential system identification and control procedure: learning the parametric model (i.e., $A_d, B_d, H_d, C_d$ in (9)) in advance and then performing model-based controller design. By contrast, the recently proposed DeePC method [15] is a non-parametric method that bypasses the system identification process and directly designs the control input compatible with history system data. In particular, DeePC directly uses past data to predict the system “behavior” based on Willems’ fundamental lemma [16]. The standard DeePC requires the system to be completely controllable. Thus, we rely on the reformulated system model (9) with two input signals $u, \epsilon$ combined into one control input $\hat{u}$ to design our DeeP-LCC method for mixed traffic control. The following notion of persistent excitation is needed.
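Definition 1: The signal $\omega = \mathrm{col}(\omega_1, \omega_2, \ldots, \omega_T)$ of length $T$ is persistently exciting of order $l$ ($l \leq T$) if the following Hankel matrix is of full row rank
$$
\mathcal{H}_l(\omega) := \begin{bmatrix}
\omega_1 & \omega_2 & \cdots & \omega_{T-l+1} \\
\omega_2 & \omega_3 & \cdots & \omega_{T-l+2} \\
\vdots & \vdots & \ddots & \vdots \\
\omega_l & \omega_{l+1} & \cdots & \omega_T
\end{bmatrix}. \tag{10}
$$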
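Data collection: We begin by collecting a trajectory data sequence of length $T$ from the system (9) with sampling interval $\Delta t$. The collected data includes:
1) the combined input sequence $\hat{u}_d = \mathrm{col}(\hat{u}_d(1), \ldots, \hat{u}_d(T)) \in \mathbb{R}^{(m+1)T}$, consisting of CAVs’ acceleration sequence $u_d = \mathrm{col}(u_d(1), \ldots, u_d(T)) \in \mathbb{R}^{mT}$ and the velocity error sequence of the head vehicle $\epsilon_d = \mathrm{col}(\epsilon_d(1), \ldots, \epsilon_d(T)) \in \mathbb{R}^{T}$;
2) the output sequence of the mixed traffic system $y_d = \mathrm{col}(y_d(1), \ldots, y_d(T)) \in \mathbb{R}^{(n+m)T}$.
These data samples could be generated offline, or collected online from the trajectory data of those involved vehicles. Then, we partition the pre-collected data into two parts, corresponding to “past data” of length $T_{\mathrm{ini}} \in \mathbb{N}$ and “future data” of length $N \in \mathbb{N}$. Precisely, define
$$
\begin{bmatrix} U_p \\ U_f \end{bmatrix} := \mathcal{H}_{T_{\mathrm{ini}}+N}(u_d), \quad
\begin{bmatrix} E_p \\ E_f \end{bmatrix} := \mathcal{H}_{T_{\mathrm{ini}}+N}(\epsilon_d), \quad
\begin{bmatrix} Y_p \\ Y_f \end{bmatrix} := \mathcal{H}_{T_{\mathrm{ini}}+N}(y_d), \tag{11}
$$
where $U_p$ and $U_f$ consist of the first $T_{\mathrm{ini}}$ block rows and the last $N$ block rows of $\mathcal{H}_{T_{\mathrm{ini}}+N}(u_d)$, respectively (similarly for $E_p, E_f$ and $Y_p, Y_f$).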
System behavior representation: Motivated by Willems’ fundamental lemma [16] and the DeePC formulation [15], we have the following result. Given time $t$, we define $u_{\mathrm{ini}} = \mathrm{col}(u(t-T_{\mathrm{ini}}), u(t-T_{\mathrm{ini}}+1), \ldots, u(t-1))$ as the control input sequence within a past time horizon of length $T_{\mathrm{ini}}$, and $u = \mathrm{col}(u(t), u(t+1), \ldots, u(t+N-1))$ as the control input sequence within a predictive time horizon of length $N$ (similarly for $\epsilon_{\mathrm{ini}}, \epsilon$ and $y_{\mathrm{ini}}, y$).
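Proposition 1: Suppose that (7) holds, and the combined input sequence $\hat{u}_d$ is persistently exciting of order $T_{\mathrm{ini}} + N + 2n$. Then, any trajectory of the mixed traffic system (9) of length $T_{\mathrm{ini}} + N$, denoted as $\mathrm{col}(u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}}, u, \epsilon, y)$, can be constructed via
$$
\begin{bmatrix} U_p \\ E_p \\ Y_p \\ U_f \\ E_f \\ Y_f \end{bmatrix} g =
\begin{bmatrix} u_{\mathrm{ini}} \\ \epsilon_{\mathrm{ini}} \\ y_{\mathrm{ini}} \\ u \\ \epsilon \\ y \end{bmatrix}, \tag{12}
$$
where $g \in \mathbb{R}^{T - T_{\mathrm{ini}} - N + 1}$. If $T_{\mathrm{ini}} \geq 2n$, $y$ is uniquely determined from (12), $\forall (u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}}, u, \epsilon)$.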
This proposition is adapted from Willems’ fundamental lemma [16] and the DeePC method [15] for the mixed traffic system (9). It reveals that we can use past trajectories to predict the future trajectory of the mixed traffic system without identifying an explicit parametric model. Specifically, given a past trajectory $(u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}})$ and a future input sequence $(u, \epsilon)$, the formulation (12) allows one to predict the future output sequence $y$ directly from pre-collected data $(u_d, \epsilon_d, y_d)$. Therefore, we can bypass a parametric system model and directly use a non-parametric data-centric representation for the behaviors of the mixed traffic system.
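The prediction mechanism of Proposition 1 can be demonstrated on a toy deterministic LTI system. In the sketch below, the two-state system, its dimensions and all names are assumptions for illustration; it has a single input, so the head-vehicle signal $\epsilon$ and the blocks $E_p, E_f$ are omitted. The model matrices are used only to generate data and never enter the prediction step.

```python
import numpy as np

def block_hankel(w, l):
    """Hankel matrix with l block rows for a signal w of shape (T, d)."""
    T = w.shape[0]
    return np.column_stack([w[j:j + l].reshape(-1) for j in range(T - l + 1)])

def simulate(A, B, C, x0, u):
    """Roll out x(k+1) = A x(k) + B u(k), y(k) = C x(k)."""
    x, ys = x0.copy(), []
    for uk in u:
        ys.append(C @ x)
        x = A @ x + B @ uk
    return np.array(ys)

# Toy system (assumed): 2 states, 1 input, 1 output, controllable and observable.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
T, T_ini, N = 60, 4, 5
rng = np.random.default_rng(1)

# Offline: one random-input trajectory, partitioned into past/future as in (11).
u_d = rng.uniform(-1, 1, size=(T, 1))
y_d = simulate(A, B, C, np.zeros(2), u_d)
Hu, Hy = block_hankel(u_d, T_ini + N), block_hankel(y_d, T_ini + N)
Up, Uf = Hu[:T_ini], Hu[T_ini:]                # scalar signals: one row per time step
Yp, Yf = Hy[:T_ini], Hy[T_ini:]

# Online: a fresh trajectory starting from a different initial state.
u_new = rng.uniform(-1, 1, size=(T_ini + N, 1))
y_new = simulate(A, B, C, np.array([1.0, -2.0]), u_new)
u_ini, u_fut = u_new[:T_ini].ravel(), u_new[T_ini:].ravel()
y_ini, y_true = y_new[:T_ini].ravel(), y_new[T_ini:].ravel()

# Solve [Up; Yp; Uf] g = [u_ini; y_ini; u] in the least-squares sense, then y = Yf g.
lhs = np.vstack([Up, Yp, Uf])
rhs = np.concatenate([u_ini, y_ini, u_fut])
g, *_ = np.linalg.lstsq(lhs, rhs, rcond=1e-10)
y_pred = Yf @ g
max_error = float(np.max(np.abs(y_pred - y_true)))
```

Because $T_{\mathrm{ini}} = 4$ is not smaller than the state dimension of the toy system, the past window fixes the initial state implicitly and the prediction agrees with the simulated output up to numerical precision.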
Note that the velocity error of the head vehicle $\epsilon(t)$ is an external input signal, which is under human control. It is always oscillating around zero in practice considering the real behavior of human drivers, i.e., the drivers always attempt to maintain the equilibrium velocity while also suffering from small perturbations. Accordingly, given a trajectory with length $T \geq (m+1)(T_{\mathrm{ini}} + N + 2n) - 1$ and a persistently exciting acceleration input signal $u(t)$ of the CAVs (e.g., white noise with zero mean), the persistent excitation requirement in Proposition 1 for the combined input $\hat{u}(t)$ is naturally satisfied.
Remark 1: Willems’ fundamental lemma [16] reveals that the subspace consisting of all valid trajectories is identical to the range space of the Hankel matrix of the same order generated by sufficiently rich inputs. DeePC has recently applied this fundamental lemma to predictive control [15]. However, DeePC requires the underlying system to be controllable. This requirement can be relaxed to checking the rank of data Hankel matrices [18]. For mixed traffic control, we introduce the external input signal $\epsilon$, i.e., the velocity error of the head vehicle, to make the reformulated system controllable, and rely on (12) for representation of the mixed traffic system behavior. In addition, for an observable system, one can estimate the system state from past data whose length is not smaller than the state dimension. Given the observability of the mixed traffic system, the underlying initial state for the future trajectory is implicitly fixed from (12) when $T_{\mathrm{ini}} \geq 2n$, which guarantees the uniqueness property in Proposition 1; see [15] for more discussions on DeePC.
B. Design of Cost Function and Constraints in DeeP-LCC
Here, we show how to utilize the non-parametric behavior representation (12) to design the control input of the CAVs in mixed traffic flow, motivated by the standard DeePC [15]. Precisely, we aim to design the future behavior $(u, \epsilon, y)$ for the mixed traffic system in a receding horizon manner based on pre-collected data $(u_d, \epsilon_d, y_d)$ and the most recent past data $(u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}})$, which is updated online.
The past external input sequence $\epsilon_{\mathrm{ini}}$ can be collected in the control process, but the future external input sequence $\epsilon$ cannot be designed (it is controlled by a human driver). Although the future human behavior might be predicted using the traffic conditions ahead, it is still non-trivial to achieve accurate prediction. Considering that the driver always attempts to maintain the equilibrium velocity, one natural approach is to assume that the future velocity error of the head vehicle is zero, i.e.,
$$
\epsilon = 0_N, \tag{13}
$$
where $0_N$ denotes an $N \times 1$ vector of all zeros.
Similar to the LCC framework [21], we consider the performance of the entire mixed traffic system for CAVs’ controller design. Precisely, we consider a quadratic-form cost function $J(y, u)$, which penalizes the output deviation from the equilibrium state and the energy of control input from time $t$, defined as follows
$$
J(y, u) = \sum_{k=t}^{t+N-1} \left( \|y(k)\|_Q^2 + \|u(k)\|_R^2 \right), \tag{14}
$$
where the coefficient matrices $Q, R$ are set as $Q = \mathrm{diag}(Q_v, Q_s)$ with $Q_v = \mathrm{diag}(w_v, \ldots, w_v) \in \mathbb{R}^{n \times n}$, $Q_s = \mathrm{diag}(w_s, \ldots, w_s) \in \mathbb{R}^{m \times m}$ and $R = \mathrm{diag}(w_u, \ldots, w_u) \in \mathbb{R}^{m \times m}$, where $w_v, w_s, w_u$ represent the penalty for the velocity error of all the vehicles, spacing error of all the CAVs, and the control input of the CAVs, respectively.
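As a concrete reading of (14), the sketch below evaluates the cost for given output and input sequences. The function name and array layout are assumptions for illustration; the default weights are the values reported in Section V.

```python
import numpy as np

def deep_lcc_cost(y, u, n, m, w_v=1.0, w_s=0.5, w_u=0.1):
    """Evaluate J(y, u) of (14).

    y: shape (N, n + m), each row ordered as [v~_1..v~_n, s~_{i_1}..s~_{i_m}]
    u: shape (N, m), CAV accelerations
    """
    Q = np.diag(np.concatenate([w_v * np.ones(n), w_s * np.ones(m)]))
    R = w_u * np.eye(m)
    stage_y = np.einsum("ki,ij,kj->", y, Q, y)   # sum over k of y(k)^T Q y(k)
    stage_u = np.einsum("ki,ij,kj->", u, R, u)   # sum over k of u(k)^T R u(k)
    return float(stage_y + stage_u)

# Two-step horizon with n = 2 vehicles, of which m = 1 is a CAV.
y = np.array([[1.0, 0.0, 2.0],
              [0.0, -1.0, 0.0]])
u = np.array([[1.0], [2.0]])
J = deep_lcc_cost(y, u, n=2, m=1)   # (1 + 0.5*4) + 1 + 0.1*(1 + 4) = 4.5
```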
We further incorporate several constraints for safety guarantees. In particular, a minimal spacing constraint for CAVs is required. Accordingly, we impose a lower bound on the spacing error
$$
\tilde{s}_i \geq \tilde{s}_{\min}, \quad i \in S, \tag{15}
$$
with $\tilde{s}_{\min}$ denoting the minimum spacing error for each CAV. With an appropriate choice of $\tilde{s}_{\min}$, the rear-end collision of the CAVs is avoided. In addition, for CAV control with the aim of attenuating traffic perturbations, existing CAV controllers tend to leave an extremely large spacing from the preceding vehicle in the control procedure (see, e.g., [6] and the discussions in [7, Section V-D]), which in practice might cause vehicles from adjacent lanes to cut in. To address this problem, we also include an upper bound on the spacing error of each CAV, shown as follows
$$
\tilde{s}_i \leq \tilde{s}_{\max}, \quad i \in S. \tag{16}
$$
Recall that the spacing error signal of the CAVs is contained in the system output (6), and thus the constraints (15) and (16) can be converted to the following compact form
$$
\tilde{s}_{\min} \leq \left( I_N \otimes \begin{bmatrix} 0_{m \times n} & I_m \end{bmatrix} \right) y \leq \tilde{s}_{\max}, \tag{17}
$$
by exploiting $y$ as the decision variable. In addition, the control input of each CAV is constrained as
$$
a_{\min} \leq u \leq a_{\max}, \tag{18}
$$
where $a_{\min}$ and $a_{\max}$ denote the minimum and the maximum acceleration, respectively.
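The compact spacing constraint can be written with a Kronecker product that picks the CAV spacing errors out of the stacked output vector. The sketch below assumes the selector has the form $I_N \otimes [0_{m \times n} \; I_m]$, which is the form the dimension of $y \in \mathbb{R}^{(n+m)N}$ requires; the function names are assumptions, and the bounds in the usage lines are the values reported in Section V.

```python
import numpy as np

def spacing_selector(n, m, N):
    """Matrix I_N (x) [0_{m x n}, I_m] extracting CAV spacing errors from stacked y."""
    pick = np.hstack([np.zeros((m, n)), np.eye(m)])
    return np.kron(np.eye(N), pick)

def satisfies_constraints(y, u, n, m, N, s_min, s_max, a_min, a_max):
    """Check the spacing bounds and the acceleration bounds on stacked y and u."""
    s = spacing_selector(n, m, N) @ y
    return bool(np.all(s >= s_min) and np.all(s <= s_max)
                and np.all(u >= a_min) and np.all(u <= a_max))

n, m, N = 2, 1, 2
y = np.array([0.3, -0.2, 4.0,      # step 1: v~_1, v~_2, spacing error of the CAV
              0.1,  0.0, 25.0])    # step 2
u = np.array([0.5, -1.0])
sel = spacing_selector(n, m, N)
spacing = sel @ y                  # picks 4.0 and 25.0
ok = satisfies_constraints(y, u, n, m, N, s_min=-15, s_max=20, a_min=-5, a_max=2)
```

Here the second step violates the upper spacing bound of 20, so `ok` is `False` even though both accelerations are admissible.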
C. Formulation of DeeP-LCC
We are now ready to present the following optimization problem to obtain the optimal control input of the CAVs at each time step
$$
\begin{aligned}
\min_{g, u, y} \quad & J(y, u) \\
\text{subject to} \quad & (12), (13), (17), (18).
\end{aligned} \tag{19}
$$
Algorithm 1 DeeP-LCC
Input: Pre-collected traffic data $(u_d, \epsilon_d, y_d)$, initial time $t_0$, terminal time $t_f$;
1: Construct data Hankel matrices $U_p, U_f, E_p, E_f, Y_p, Y_f$;
2: Initialize past traffic data $(u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}})$ before the initial time $t_0$;
3: while $t_0 \leq t \leq t_f$ do
4:   Solve (20) for optimal predicted input $u^* = \mathrm{col}(u^*(t), u^*(t+1), \ldots, u^*(t+N-1))$;
5:   Apply the input $u(t) \leftarrow u^*(t)$ to the CAVs;
6:   $t \leftarrow t+1$ and update past traffic data $(u_{\mathrm{ini}}, \epsilon_{\mathrm{ini}}, y_{\mathrm{ini}})$;
7: end while
Our formulation (19) is similar to the standard DeePC [15], with one significant distinction being the introduction of the future velocity error sequence of the head vehicle, i.e., the external input of the mixed traffic system. Note that $\epsilon$ is not a decision variable in (19), unlike $u$ and $y$. Instead, it is fixed as a constant value, as shown in (13).
For implementation, the optimal control problem (19) is solved in a receding horizon manner (see Algorithm 1), similarly to standard MPC. Unlike MPC, however, the optimal control problem (19) does not rely on an explicit parametric system model, but utilizes the data-centric representation (12) to predict future behaviors. The formulation (12) is nevertheless only valid for deterministic LTI mixed traffic systems. In practice, the car-following behavior of HDVs is nonlinear as shown in Section II-A, and also has uncertainties, leading to a nonlinear and non-deterministic system. Moreover, practical traffic data collected online or offline is always noise-corrupted, and thus the equality constraint (12) becomes inconsistent, i.e., the subspace spanned by the columns of the data Hankel matrices fails to coincide with the subspace of all valid trajectories of the underlying system.
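The receding horizon procedure of Algorithm 1 amounts to a sliding window over the most recent $T_{\mathrm{ini}}$ samples. The skeleton below is a sketch under assumed interfaces: the callbacks `solve`, `measure` and `apply` are hypothetical placeholders for the optimization step, the sensing step and the actuation step, and the scalar toy plant in the usage lines exists only to exercise the loop.

```python
from collections import deque

def receding_horizon(solve, measure, apply, t0, tf, T_ini, past):
    """Skeleton of Algorithm 1.

    solve(u_ini, eps_ini, y_ini) -> planned input sequence over the horizon
    apply(u)                     -> sends one input to the CAVs
    measure()                    -> (eps, y) observed after the input is applied
    past                         -> iterable of (u, eps, y) tuples before time t0
    """
    window = deque(past, maxlen=T_ini)         # the oldest sample drops out automatically
    applied = []
    for _t in range(t0, tf + 1):
        u_ini, eps_ini, y_ini = (list(c) for c in zip(*window))
        plan = solve(u_ini, eps_ini, y_ini)
        u_t = plan[0]                          # only the first planned input is applied
        apply(u_t)
        eps_t, y_t = measure()
        window.append((u_t, eps_t, y_t))       # update the past data for the next step
        applied.append(u_t)
    return applied, list(window)

# Toy usage: scalar integrator plant and a proportional stand-in for the optimizer.
state = {"x": 4.0}

def toy_apply(u):
    state["x"] += u

def toy_measure():
    return 0.0, state["x"]

def toy_solve(u_ini, eps_ini, y_ini):
    return [-0.5 * y_ini[-1]] * 3

applied, window = receding_horizon(toy_solve, toy_measure, toy_apply,
                                   t0=0, tf=2, T_ini=2, past=[(0.0, 0.0, 4.0)] * 2)
```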
Similar to the regularization for standard DeePC [15], we introduce a slack variable $\sigma_y \in \mathbb{R}^{(n+m)T_{\mathrm{ini}}}$ for the system past output to ensure the feasibility of the equality constraint, yielding the following optimization problem
$$
\begin{aligned}
\min_{g, u, y, \sigma_y} \quad & J(y, u) + \lambda_g \|g\|_2^2 + \lambda_y \|\sigma_y\|_2^2 \\
\text{subject to} \quad &
\begin{bmatrix} U_p \\ E_p \\ Y_p \\ U_f \\ E_f \\ Y_f \end{bmatrix} g =
\begin{bmatrix} u_{\mathrm{ini}} \\ \epsilon_{\mathrm{ini}} \\ y_{\mathrm{ini}} \\ u \\ \epsilon \\ y \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \\ \sigma_y \\ 0 \\ 0 \\ 0 \end{bmatrix}, \\
& (13), (17), (18),
\end{aligned} \tag{20}
$$
which is suitable for nonlinear and non-deterministic mixed traffic flow. This is our final DeeP-LCC formulation at each time step. In (20), the slack variable $\sigma_y$ is penalized with a weighted two-norm penalty function, and the weight coefficient $\lambda_y > 0$ can be chosen sufficiently large such that $\sigma_y \neq 0$ only if the equality constraint is infeasible. In addition, a two-norm penalty on $g$ with a weight coefficient $\lambda_g > 0$ is also incorporated to avoid overfitting in case of noise-corrupted data samples. As discussed in [17], [19], such regularization on $g$ coincides with distributional two-norm robustness. The receding horizon implementation of DeeP-LCC is listed in Algorithm 1.
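To show the structure of the regularized problem, the sketch below solves a simplified variant in which the inequality constraints (17) and (18) are dropped. Substituting $u = U_f g$, $y = Y_f g$ and $\sigma_y = Y_p g - y_{\mathrm{ini}}$ leaves an equality-constrained quadratic program in $g$, which is solved here through its KKT system. This is not the full DeeP-LCC problem: handling (17) and (18) requires a QP solver. The function name, the random placeholder data and the small regularization weights in the usage lines are assumptions for illustration.

```python
import numpy as np

def deep_lcc_equality_qp(Up, Ep, Yp, Uf, Ef, Yf, u_ini, eps_ini, y_ini,
                         Q, R, lam_g, lam_y):
    """Regularized problem with eps = 0 and WITHOUT the inequality constraints."""
    K, N = Up.shape[1], Ef.shape[0]
    Qbar = np.kron(np.eye(N), Q)               # stage weights stacked over the horizon
    Rbar = np.kron(np.eye(N), R)
    # Objective in g: g^T P g - 2 q^T g + const.
    P = Yf.T @ Qbar @ Yf + Uf.T @ Rbar @ Uf + lam_g * np.eye(K) + lam_y * Yp.T @ Yp
    q = lam_y * Yp.T @ y_ini
    # Hard equalities: past inputs, past head-vehicle signal, and eps = 0 in the future.
    A_eq = np.vstack([Up, Ep, Ef])
    b_eq = np.concatenate([u_ini, eps_ini, np.zeros(N)])
    r = A_eq.shape[0]
    kkt = np.block([[P, A_eq.T], [A_eq, np.zeros((r, r))]])
    sol = np.linalg.solve(kkt, np.concatenate([q, b_eq]))
    g = sol[:K]
    return Uf @ g, Yf @ g, Yp @ g - y_ini, g   # u, y, sigma_y, g

# Placeholder data with consistent shapes (random matrices, not real Hankel data):
# n = 2 vehicles, m = 1 CAV, T_ini = 2, N = 3, K = 20 columns.
rng = np.random.default_rng(0)
n, m, T_ini, N, K = 2, 1, 2, 3, 20
Up, Ep, Yp = rng.normal(size=(m * T_ini, K)), rng.normal(size=(T_ini, K)), rng.normal(size=((n + m) * T_ini, K))
Uf, Ef, Yf = rng.normal(size=(m * N, K)), rng.normal(size=(N, K)), rng.normal(size=((n + m) * N, K))
u_ini, eps_ini, y_ini = rng.normal(size=m * T_ini), rng.normal(size=T_ini), rng.normal(size=(n + m) * T_ini)
Q, R = np.diag([1.0, 1.0, 0.5]), 0.1 * np.eye(m)
u_opt, y_opt, sigma_y, g = deep_lcc_equality_qp(
    Up, Ep, Yp, Uf, Ef, Yf, u_ini, eps_ini, y_ini, Q, R, lam_g=1.0, lam_y=10.0)
```

The weights `lam_g=1.0` and `lam_y=10.0` are chosen only to keep this toy problem well-conditioned; Section V reports considerably larger values for the actual simulations.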
Remark 2: In our DeeP-LCC for mixed traffic, we introduce the external input signal $\epsilon$ and utilize (13) to straightforwardly predict its future value. Besides (13), another potential approach to address the unknown future external input is to assume a bounded future velocity error of the head vehicle. This idea is similar to robust DeePC against unknown external disturbances; see, e.g., [19], [25]. It is an interesting topic for future research to design robust DeeP-LCC for mixed traffic when the head vehicle is oscillating around a particular equilibrium velocity.
V. TRAFFIC SIMULATIONS
We now present nonlinear and non-deterministic traffic simulations to validate the performance of DeeP-LCC. Our simulation setup is motivated by the standard Extra-Urban Driving Cycle (EUDC) [26]. The nonlinear OVM model in [7] is used for the HDV dynamics (1), and a noise signal following the uniform distribution $\mathbb{U}[-0.1, 0.1]$ is added to the dynamics model (1) of each HDV in our simulations¹.
A. Experimental Setup
We consider eight vehicles with two CAVs and six HDVs, i.e., $n = 8$, $m = 2$ in Fig. 1. The two CAVs are located at the third and the sixth vehicles respectively among the following eight vehicles, i.e., $S = \{3, 6\}$. In our DeeP-LCC strategy, we use the following parameters.
• Offline data collection: the length for the pre-collected data is chosen as $T = 2000$ with a sampling interval $\Delta t = 0.05$ s. We collect a single trajectory around 15 m/s, and there exists a uniformly distributed signal of $\mathbb{U}[-1, 1]$ on both the control input signal $u$ and the external input signal $\epsilon$. This naturally satisfies the persistent excitation requirement in Proposition 1.
• Online control procedure: the time horizons for the future signal sequence and past signal sequence are set to $N = 50$, $T_{\mathrm{ini}} = 20$, respectively. In the cost function (14), the weight coefficients are set to $w_v = 1$, $w_s = 0.5$, $w_u = 0.1$; for constraints, the boundaries for the spacing error of the CAVs are set to $\tilde{s}_{\max} = 20$, $\tilde{s}_{\min} = -15$, and the limits for the acceleration of the CAVs are set to $a_{\max} = 2$, $a_{\min} = -5$. The regularization parameters in (20) are set to $\lambda_g = 100$, $\lambda_y = 10000$.
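The sizes implied by these parameters can be worked out directly. The sketch below collects the reported values in a small configuration class (the class and property names are assumptions) and evaluates the data-length bound $T \geq (m+1)(T_{\mathrm{ini}} + N + 2n) - 1$ from Section IV-A together with the dimensions of the data matrix in (12).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeepLccSetup:
    n: int = 8            # following vehicles
    m: int = 2            # CAVs, S = {3, 6}
    T: int = 2000         # length of the pre-collected data
    dt: float = 0.05      # sampling interval [s]
    T_ini: int = 20       # past horizon
    N: int = 50           # future horizon

    @property
    def min_data_length(self) -> int:
        """Lower bound (m + 1)(T_ini + N + 2n) - 1 on T for persistent excitation."""
        return (self.m + 1) * (self.T_ini + self.N + 2 * self.n) - 1

    @property
    def hankel_columns(self) -> int:
        """Dimension of g in (12): T - T_ini - N + 1."""
        return self.T - self.T_ini - self.N + 1

    @property
    def hankel_rows(self) -> int:
        """Rows of the stacked data matrix in (12): m inputs, 1 head-vehicle signal, n + m outputs."""
        return (self.m + 1 + self.n + self.m) * (self.T_ini + self.N)

setup = DeepLccSetup()
data_duration_s = setup.T * setup.dt           # 100 s of recorded driving
long_enough = setup.T >= setup.min_data_length
```

With the reported values, the bound evaluates to 257 samples, well below $T = 2000$, and the stacked data matrix has 910 rows and 1931 columns, so $g$ is a 1931-dimensional decision variable.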
For the HDVs’ OVM model, we assume a heterogeneous parameter setup around the nominal value. We also consider the standard MPC for comparison in our simulations, assuming that the explicit parametric linearized model is known. The corresponding parameter setup remains the same as that in DeeP-LCC. Note that in the simulations, the traffic flow has different equilibrium states in different time periods, which is common in real traffic. Accordingly, we use the average velocity of the head vehicle over the past horizon of $T_{\mathrm{ini}}$ as the equilibrium velocity for the CAVs, and the corresponding equilibrium spacing is manually set. Due to the page limit, more details on the numerical implementation will be presented in an extended version.

¹Our code is available at https://github.com/soc-ucsd/DeeP-LCC.

TABLE I
VELOCITY PROFILE OF THE HEAD VEHICLE IN EXPERIMENT A
Time [s]          0-10    18-38   51-71   106-126   136-156
Velocity [km/h]   70      50      70      100       70
Note: This table shows the cruise velocity during different time periods. Between these time periods, the head vehicle takes a uniform acceleration or deceleration.

TABLE II
FUEL CONSUMPTION IN EXPERIMENT A
                   All HDVs   MPC                DeePC
Phase 1            172.59     158.79 (7.99%)     159.17 (7.78%)
Phase 2            379.13     374.35 (1.26%)     374.60 (1.19%)
Phase 3            817.16     812.91 (0.52%)     812.71 (0.54%)
Phase 4            399.86     377.66 (5.55%)     377.58 (5.57%)
Total Simulation   1977.16    1928.19 (2.48%)    1929.09 (2.43%)
Note: All the values have been rounded and the unit is mL in this table.

B. Numerical Results
Experiment A: We first design a comprehensive simulation scenario to validate the capability of DeeP-LCC in improving traffic performance. Specifically, motivated by the standard Extra-Urban Driving Cycle (EUDC) from the New European Driving Cycle (NEDC), we design a velocity trajectory for the head vehicle, as shown in Table I. To capture the traffic performance, we consider the fuel consumption for the vehicles indexed from 3 to 8, given that the first two HDVs would not be influenced by the following CAVs (recall that $n = 8$ and $S = \{3, 6\}$). The fuel consumption model proposed in [27] is employed.
The simulation results are shown in Fig. 2. Compared to
the case where all the vehicles are HDVs, DeeP-LCC clearly
mitigates velocity perturbations and smooths traffic flow with
only two CAVs in mixed traffic. The fuel consumption results
are presented in Table II, with the whole simulation separated
into four phases (as clarified in Fig. 2). Both MPC and
DeeP-LCC reduce the fuel consumption throughout the four
phases, and the two controllers contribute to a
greater improvement in traffic performance in the braking
phases (Phases 1 and 4) than in the accelerating phases (Phases
2 and 3). In particular, DeeP-LCC saves 7.78% and 5.57% of
fuel consumption during Phases 1 and 4, respectively.
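The percentages quoted above are relative reductions with respect to the all-HDV case. As a quick check, the following sketch recomputes them from the first and last columns of Table II; the variable and function names are illustrative. Because the table entries are already rounded, a recomputed percentage may differ from the printed one in the last digit.

```python
# Fuel consumption in mL copied from Table II: (all HDVs, last column).
FUEL_ML = {
    "Phase 1": (172.59, 159.17),
    "Phase 2": (379.13, 374.60),
    "Phase 3": (817.16, 812.71),
    "Phase 4": (399.86, 377.58),
    "Total": (1977.16, 1929.09),
}


def reduction_percent(baseline: float, controlled: float) -> float:
    """Relative fuel saving of the controlled case, in percent."""
    return 100.0 * (baseline - controlled) / baseline


reductions = {
    phase: round(reduction_percent(hdv, ctrl), 2)
    for phase, (hdv, ctrl) in FUEL_ML.items()
}
```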
Note that the MPC controller utilizes the nominal model
to design the control input, while the DeeP-LCC controller
relies on the trajectory data to directly predict the future
system behavior. In practice, MPC might be inapplicable,
since the nominal model for individual HDVs is non-trivial
to identify. By contrast, DeeP-LCC achieves performance
comparable to MPC using only trajectory data,
without explicitly identifying a parametric model. Hence,
DeeP-LCC shows great potential to improve
traffic performance in practical mixed traffic flow.
Fig. 2. Velocity profiles in Experiment A. The black profile represents the
head vehicle, and the gray profiles represent the HDVs. The red profile and
the blue profile represent the first and the second CAV, respectively. (a) All
the vehicles are HDVs. (b) The CAVs utilize DeeP-LCC.
Experiment B: To further validate the safety performance
of DeeP-LCC, we design a braking scenario
motivated by the EUDC cycle, where the head vehicle
takes a sudden emergency brake with maximum deceleration
capability, maintains the low velocity for a while, and finally
accelerates back to the original velocity. This is a typical
emergency case in real traffic flow, which requires the CAVs'
control to strictly guarantee safety against rear-end collisions.
The results are shown in Fig. 3. When all the vehicles are
HDVs, they exhibit a large velocity fluctuation in response
to the brake perturbation
of the head vehicle. By contrast, when two vehicles utilize
DeeP-LCC, they have a quite distinct response pattern from
the HDVs. Precisely, the CAVs decelerate immediately when
the head vehicle starts to brake, thus leaving a relatively large
safe distance from the preceding vehicle (see the time period
before 10 s). Then, the CAVs accelerate slowly when the
head vehicle begins to return to the original velocity (see the
time period 9-12 s), while in the case of all HDVs,
the vehicles take a delayed rapid acceleration (see the time
period 12-20 s), which could lead to driving discomfort
and collision risk. In addition, a 24.96% reduction in fuel
consumption is observed after introducing DeeP-LCC
compared with the case of all HDVs. Our strategy allows the
CAVs to eliminate velocity overshoot, improve fuel economy,
and keep the spacing within the safe range, contributing
to smoother traffic flow with safety guarantees.
Fig. 3. Simulation results in Experiment B. (a)(c)(e) show the velocity,
spacing, and acceleration profiles, respectively, when all the vehicles are
HDVs, while (b)(d)(f) show the corresponding profiles when the two CAVs
utilize DeeP-LCC. In (c)-(f), the profiles of the other HDVs are hidden. The
color of each profile has the same meaning as that in Fig. 2.
VI. CONCLUSIONS
In this paper, we have presented the DeeP-LCC strategy
for CAV control in mixed traffic. Our dynamical modeling
and controllability/observability analysis support its
rationale. In particular, the proposed strategy relies directly on
the trajectory data of the HDVs rather than a parametric
HDV model to design the control input of the CAVs, and
our method is applicable to nonlinear and non-deterministic
traffic systems. Traffic simulations have shown that our
strategy significantly improves traffic efficiency and
fuel economy, with safety guarantees. Some interesting
future directions include incorporating delayed trajectory data
caused by communication delay, exploring the scalability
of DeeP-LCC for large-scale mixed traffic, and addressing
the problem of time-varying equilibrium traffic states.
REFERENCES
[1] J. Guanetti, Y. Kim, and F. Borrelli, “Control of connected and
automated vehicles: State of the art and future challenges,” Annu. Rev.
Control, vol. 45, pp. 18–40, 2018.
[2] S. E. Li, Y. Zheng, K. Li, Y. Wu, J. K. Hedrick, F. Gao, and
H. Zhang, “Dynamical modeling and distributed control of connected
and automated vehicles: Challenges and opportunities,” IEEE Intell.
Transp. Syst. Mag., vol. 9, no. 3, pp. 46–58, 2017.
[3] Y. Zheng, S. E. Li, J. Wang, D. Cao, and K. Li, “Stability and
scalability of homogeneous vehicular platoon: Study on the influence
of information flow topologies,” IEEE Trans. Intell. Transp. Syst.,
vol. 17, no. 1, pp. 14–26, 2016.
[4] V. Milanés, S. E. Shladover, J. Spring, C. Nowakowski, H. Kawazoe,
and M. Nakamura, “Cooperative adaptive cruise control in real traffic
situations,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 1, pp. 296–
305, 2013.
[5] Y. Zheng, J. Wang, and K. Li, “Smoothing traffic flow via control
of autonomous vehicles,” IEEE Internet Things J., vol. 7, no. 5, pp.
3882–3896, 2020.
[6] R. E. Stern, S. Cui, M. L. Delle Monache, R. Bhadani et al.,
“Dissipation of stop-and-go waves via control of autonomous vehicles:
Field experiments,” Transp. Res. Part C Emerging Technol., vol. 89,
pp. 205–221, 2018.
[7] J. Wang, Y. Zheng, Q. Xu, J. Wang, and K. Li, “Controllability analysis
and optimal control of mixed traffic flow with human-driven and
autonomous vehicles,” IEEE Trans. Intell. Transp. Syst., pp. 1–15,
2020.
[8] G. Orosz, “Connected cruise control: modelling, delay effects, and
nonlinear behaviour,” Veh. Syst. Dyn., vol. 54, no. 8, pp. 1147–1176,
2016.
[9] I. G. Jin and G. Orosz, “Optimal control of connected vehicle systems
with communication delay and driver reaction time,” IEEE Trans.
Intell. Transp. Syst., vol. 18, no. 8, pp. 2056–2070, 2017.
[10] B. Recht, “A tour of reinforcement learning: The view from continuous
control,” Annu. Rev. Control Rob. Auton. Syst., vol. 2, pp. 253–279,
2019.
[11] L. Furieri, Y. Zheng, and M. Kamgarpour, “Learning the globally
optimal distributed lq regulator,” in 2020 L4DC. PMLR, pp. 287–297.
[12] C. Wu, A. Kreidieh, K. Parvate, E. Vinitsky, and A. M. Bayen, “Flow:
Architecture and benchmarking for reinforcement learning in traffic
control,” arXiv preprint arXiv:1710.05465, 2017.
[13] M. Huang, Z.-P. Jiang, and K. Ozbay, “Learning-based adaptive
optimal control for connected vehicles in mixed traffic: Robustness
to driver reaction time,” IEEE Trans. Cybern., pp. 1–11, 2020.
[14] L. Hewing, K. P. Wabersich, M. Menner, and M. N. Zeilinger,
“Learning-based model predictive control: Toward safe learning in
control,” Annu. Rev. Control Rob. Auton. Syst., vol. 3, pp. 269–296,
2020.
[15] J. Coulson, J. Lygeros, and F. Dörfler, “Data-enabled predictive
control: In the shallows of the DeePC,” in 2019 IEEE ECC, pp. 307–
312.
[16] J. C. Willems, P. Rapisarda, I. Markovsky, and B. L. De Moor, “A
note on persistency of excitation,” Syst. Control Lett., vol. 54, no. 4,
pp. 325–329, 2005.
[17] J. Coulson, J. Lygeros, and F. Dörfler, “Regularized and
distributionally robust data-enabled predictive control,” in 2019 IEEE CDC,
pp. 2696–2701.
[18] F. Dörfler, J. Coulson, and I. Markovsky, “Bridging direct & indirect
data-driven control formulations via regularizations and relaxations,”
arXiv preprint arXiv:2101.01273, 2021.
[19] L. Huang, J. Coulson, J. Lygeros, and F. Dörfler, “Decentralized
data-enabled predictive control for power system oscillation damping,”
arXiv preprint arXiv:1911.12151, 2019.
[20] P. G. Carlet, A. Favato, S. Bolognani, and F. Dörfler, “Data-driven
predictive current control for synchronous motor drives,” in 2020 IEEE
ECCE, pp. 5148–5154.
[21] J. Wang, Y. Zheng, C. Chen, Q. Xu, and K. Li, “Leading cruise
control in mixed traffic flow: System modeling, controllability, and
string stability,” IEEE Trans. Intell. Transp. Syst., pp. 1–16, 2021,
accepted, to appear.
[22] G. Orosz, R. E. Wilson, and G. Stépán, “Traffic jams: dynamics and
control,” Philos. Trans. R. Soc. London, Ser. A, vol. 368, no. 1928,
pp. 4455–4479, 2010.
[23] W. Gao, Z.-P. Jiang, and K. Ozbay, “Data-driven adaptive optimal
control of connected vehicles,” IEEE Trans. Intell. Transp. Syst.,
vol. 18, no. 5, pp. 1122–1133, 2017.
[24] S. Skogestad and I. Postlethwaite, Multivariable feedback control:
analysis and design. New York: Wiley, 2007, vol. 2.
[25] Y. Lian, J. Shi, M. P. Koch, and C. N. Jones, “Adaptive robust data-
driven building control via bi-level reformulation: an experimental
result,” arXiv preprint arXiv:2106.05740, 2021.
[26] E. Tzirakis, K. Pitsas, F. Zannikos, and S. Stournas, “Vehicle emissions
and driving cycles: comparison of the Athens driving cycle (ADC) with
ECE-15 and European driving cycle (EDC),” Global NEST Journal, vol. 8,
no. 3, pp. 282–290, 2006.
[27] D. P. Bowyer, R. Akçelik, and D. Biggs, Guide to fuel consumption
analysis for urban traffic management. Vermont South, Australia:
ARRB Transport Research Ltd, 1985, no. 32.
APPENDIX
In this appendix, we provide some auxiliary results on
controllability. First, we present the results of the mixed
traffic system with one CAV, which has been revealed in [21].
Lemma 1 ([21, Theorem 2]): Given the linearized mixed
traffic system (5), if condition (7) holds, then we have:
1) When $S = \{1\}$, the system is controllable.
2) When $S = \{i_1\}$ with $1 < i_1 \le n$, the system is not completely
controllable but is stabilizable. Precisely, the subsystem
consisting of the states $\tilde{s}_1, \tilde{v}_1, \ldots, \tilde{s}_{i_1-1}, \tilde{v}_{i_1-1}$ is not
controllable but is stable, while the subsystem consisting of the
states $\tilde{s}_{i_1}, \tilde{v}_{i_1}, \ldots, \tilde{s}_n, \tilde{v}_n$ is controllable.
The physical interpretation of Lemma 1 is that the control
input of the CAV has no influence on the state of its
preceding HDVs, but has full control of the motion of its following
HDVs, when condition (7) holds. This condition is satisfied
with probability one with random choices of $\alpha_1, \alpha_2, \alpha_3$.
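The statement of Lemma 1 can be illustrated numerically on a small platoon. The sketch below is not the authors' code. It assumes that system (5) takes the usual linearized car-following form, with spacing error rate $\tilde{v}_{i-1} - \tilde{v}_i$ and HDV acceleration $\alpha_1 \tilde{s}_i - \alpha_2 \tilde{v}_i + \alpha_3 \tilde{v}_{i-1}$, that the CAV acceleration is the control input, that the head vehicle's velocity error is an external signal, and that condition (7) reads $\alpha_1 - \alpha_2 \alpha_3 + \alpha_3^2 \neq 0$ as in [21]. The gains are illustrative values, and exact rational arithmetic is used so that the rank of the controllability matrix does not depend on a numerical tolerance.

```python
from fractions import Fraction


def rank(rows):
    """Exact rank by Gaussian elimination (entries are Fractions)."""
    rows = [list(r) for r in rows]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[found])]
        found += 1
    return found


def controllability_rank(a, b):
    """Rank of [b, Ab, ..., A^(N-1) b] for a single-input pair (A, b)."""
    columns, col = [], list(b)
    for _ in range(len(a)):
        columns.append(col)
        col = [sum(a_ij * x_j for a_ij, x_j in zip(row, col)) for row in a]
    return rank(columns)  # rank of the transpose equals rank of the matrix


def mixed_traffic(n, cav, a1, a2, a3):
    """Assumed model: state [s1, v1, ..., sn, vn]; vehicle `cav` (1-based) is the CAV."""
    zero = Fraction(0)
    a = [[zero] * (2 * n) for _ in range(2 * n)]
    b = [zero] * (2 * n)
    for i in range(1, n + 1):
        s, v = 2 * (i - 1), 2 * (i - 1) + 1
        a[s][v] = Fraction(-1)          # spacing error decreases with own velocity
        if i > 1:
            a[s][v - 2] = Fraction(1)   # and increases with the predecessor's velocity
        if i == cav:
            b[v] = Fraction(1)          # CAV acceleration is the control input
        else:
            a[v][s], a[v][v] = a1, -a2  # HDV car-following response
            if i > 1:
                a[v][v - 2] = a3
    return a, b


# Illustrative gains; the assumed form of condition (7) must hold.
A1, A2, A3 = Fraction(47, 50), Fraction(3, 2), Fraction(9, 10)
assert A1 - A2 * A3 + A3**2 != 0
ranks = {cav: controllability_rank(*mixed_traffic(3, cav, A1, A2, A3)) for cav in (1, 2, 3)}
```

For three vehicles (six states), the rank is 6 when the CAV is first, 4 when it is second, and 2 when it is last: the controllable part covers exactly the CAV and its followers, in line with Lemma 1.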
We are now ready to present the proof of Theorem 1. The
following lemma is useful.
Lemma 2 (Controllability invariance [24]): Controllability
is invariant under state feedback for a linear system
$(A, B)$. Precisely, $(A, B)$ is controllable if and only if
$(A - BK, B)$ is controllable for any matrix $K$ with compatible
dimensions.
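One way to see why Lemma 2 holds is the Popov-Belevitch-Hautus (PBH) rank test, which states that $(A, B)$ is controllable if and only if $[\lambda I - A, \; B]$ has full row rank for every $\lambda \in \mathbb{C}$. The following is a standard argument given here as a sketch, not a reproduction of the proof in [24]:

$$
\begin{bmatrix} \lambda I - (A - BK) & B \end{bmatrix}
=
\begin{bmatrix} \lambda I - A & B \end{bmatrix}
\begin{bmatrix} I & 0 \\ K & I \end{bmatrix}.
$$

The rightmost matrix is invertible for any $K$, so the two PBH matrices have the same rank for every $\lambda$, and the two pairs are either both controllable or both uncontrollable.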
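Based on Lemma 2, we transform system $(A, B)$ in (5)
into $(\bar{A}, B)$ by introducing a virtual input $\bar{u}(t)$, defined as
$$\bar{u}(t) = \begin{bmatrix} u_{i_1}(t), \bar{u}_{i_2}(t), \ldots, \bar{u}_{i_m}(t) \end{bmatrix}^{\mathsf{T}}, \tag{21}$$
where for $r = 2, \ldots, m$, we define
$$\bar{u}_{i_r}(t) = u_{i_r}(t) - \left( \alpha_1 \tilde{s}_{i_r}(t) - \alpha_2 \tilde{v}_{i_r}(t) + \alpha_3 \tilde{v}_{i_r - 1}(t) \right).$$
Then, we have
$$\bar{u}(t) = u(t) - K x(t), \tag{22}$$
where $K = \begin{bmatrix} 0_n, e_n^{i_2}, \ldots, e_n^{i_m} \end{bmatrix}^{\mathsf{T}} \bar{K}$, and $\bar{K}$ is given by
$$\bar{K} = \begin{bmatrix} 0 & & & \\ k_{2,2} & k_{2,1} & & \\ & \ddots & \ddots & \\ & & k_{n,2} & k_{n,1} \end{bmatrix} \in \mathbb{R}^{n \times 2n},$$
with $k_{i,1} = \begin{bmatrix} \alpha_1, -\alpha_2 \end{bmatrix}$, $k_{i,2} = \begin{bmatrix} 0, \alpha_3 \end{bmatrix}$.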
According to (22), it holds that $A = \bar{A} - BK$. By
Lemma 2, the controllability property is consistent between
$(A, B)$ and $(\bar{A}, B)$. For system $(\bar{A}, B)$, the physical
interpretation of the virtual input $\bar{u}(t)$ in (21) is that,
except for the first CAV, the control input
of every other CAV contains a signal that follows
the linearized car-following dynamics of HDVs (3). Letting
$\bar{u}_{i_r}(t) = 0$ ($r = 2, \ldots, m$), system $(\bar{A}, B)$ is converted to
a mixed traffic system with one single CAV: only the
CAV indexed as $i_1$, i.e., the first CAV in the mixed traffic
flow, has a control input that can be arbitrarily designed. By
Lemma 1, which states the controllability of the mixed traffic
system with one single CAV, system $(\bar{A}, B)$ thus has the same
controllability property. Considering the consistency between
the controllability of $(\bar{A}, B)$ and that of $(A, B)$, we have the
results in Theorem 1.
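For completeness, the relation between $A$ and $\bar{A}$ used in the proof follows from substituting the virtual input into the state equation. The short derivation below is a sketch that assumes system (5) has the form $\dot{x}(t) = A x(t) + B u(t)$, leaving out any disturbance term from the head vehicle, which the substitution does not affect. Since (22) gives $u(t) = \bar{u}(t) + K x(t)$, we have

$$
\dot{x}(t) = A x(t) + B \left( \bar{u}(t) + K x(t) \right) = (A + BK)\, x(t) + B \bar{u}(t),
$$

so that $\bar{A} = A + BK$, or equivalently $A = \bar{A} - BK$. Applying Lemma 2 with the feedback gain $-K$ then shows that $(A, B)$ and $(\bar{A}, B)$ share the same controllability property.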
Finally, we note that the proof of Corollary 1 is similar
to that of Lemma 1 when S={1}. We refer the interested
readers to [21] for details.