Received: 10 May 2017 Revised: 29 January 2018 Accepted: 31 January 2018
DOI: 10.1002/acs.2862
LITERATURE SURVEY
Model-based vs data-driven adaptive control: An overview
Mouhacine Benosman
Mitsubishi Electric Research Laboratories
(MERL), Cambridge, MA 02139, USA
Correspondence
Mouhacine Benosman, Mitsubishi
Electric Research Laboratories (MERL),
Cambridge, MA 02139, USA.
Email: m_benosman@ieee.org
Summary
In this paper, we present an overview of adaptive control by contrasting model-based approaches with data-driven approaches. Indeed, we propose to classify adaptive controllers into two main subfields, namely, model-based adaptive control and data-driven adaptive control. In each subfield, we cite monographs, survey papers, and recent research papers published in the last few years. We also include a few simple examples to illustrate some general concepts in each subfield.
KEYWORDS
adaptive control, data-driven, learning-based, model-based
1 INTRODUCTION
The field of adaptive control is vast, which makes the task of surveying all the existing results nearly impossible. Thus, to
simplify the presentation, we have decided to structure the paper based on how much model information is needed for
the adaptive control method. What we mean by model information is whether the adaptive approach is model based or
data driven, also known as model-free. We underline here that in the sequel of this presentation, by model we mean either
the plant model, ie, for adaptive controllers dealing with plant uncertainties (see, eg, the work of Ioannou and Sun1), or
the disturbance model, ie, for adaptive controllers dealing with disturbance estimation and rejection (see, eg, the work of
Landau et al2).
There have been many survey papers on adaptive control theory and its applications. Some survey papers were writ-
ten in the 1970s and 1980s3-8; other surveys were written in the 1990s.9-13 However, to the best of our knowledge, there
have been far fewer survey papers on adaptive control published recently.14-19 Instead, we found several books on
this topic, some focusing on a general presentation of adaptive control10,20-22 and others more specialized in a specific
subtopic in the adaptive control field, eg, robust adaptive control,1 direct adaptive control,23 linear adaptive control,2,9,10,24-32
nonlinear adaptive control,33-38 stochastic adaptive control,39 partial differential equation–based adaptive control,40 and
learning-based adaptive control.41-46
To the best of our knowledge, interest in adaptive control started in the 1950s with the emergence of servosystems and
flight controllers (see, eg, the works of Benner and Drenick,47 Drenick and Shahbender,48 and Gregory49). Since then, there
have been great efforts in theoretical research and practical applications of adaptive control. On the theoretical side, early
advances were made in systems parametric identification50 and in dynamic programming and stochastic control theory,51-54
where the basic foundations have been formulated for adaptive and learning control theory as we know it today. As for
the application side, there has been a large number of applications of adaptive control and learning, either in laboratory
testbeds or in real-life industrial applications (see the works of Seborg et al5 and Åström and Wittenmark10 for some
surveys of applications).
We can count a large number of references on adaptive and learning control theory. One way to classify these results,
which is not meant to be unique, is to group them in terms of their correlation with the model of the system. Indeed, we
can decompose adaptive control theory into the following two main classes: model-based adaptive control and data-driven
adaptive control. To make sure that there is no confusion in the mind of the reader, let us first define what we mean by each
class. In this paper, when we refer to model-based adaptive controllers, we mean controllers that are entirely or partially
based on a given model of the system. The model can be derived from physics or can be an input-output model. However,
the key point here is that, all or part of, the controller is based on a model of the system. Alternatively, when we refer to
data-driven adaptive controllers, we mean controllers that do not rely on any model of the system; instead, they are fully
based on learning from direct interaction with the environment, eg, trial-and-error approaches.
Furthermore, we can decompose the model-based class into two subclasses, namely, fully model-based or classical
adaptive control and partially model-based or learning-based adaptive control. Indeed, in the fully model-based adaptive
control, the controller is entirely based on a model of the system. In this case, both the controller and the adaptation filters
are based on the model. In contrast, in the partially model-based or learning-based subclass, the controller is based on a
model of the system, but the adaptation filters are data driven, eg, based on machine learning algorithms. The key point
here is that the adaptation layer is solely based on direct interaction with the environment, without any assumption on
the system model, ie, the design of the adaptation filters is not based on the model.
We emphasize that this classification is not meant to be universal; it is solely used here to simplify the citation of about
350 references gathered in this survey, while putting in parallel model-based and data-driven methods.
Each of these classes can be further decomposed into subclasses. Indeed, when talking about (fully/partially)
model-based adaptive control, we can identify the following main subclasses: direct adaptive control and indirect adaptive
control. Direct adaptive control is based on the idea of designing a controller that adapts to the model uncertainties, with-
out explicitly having to identify the true values of these uncertainties. On the other hand, indirect adaptive control aims
at estimating the true values of the uncertainties and then uses the uncertainty estimates and their associated estimated
model to design the controller. Other deeper subclasses can be identified based on the mathematical nature of the model,
eg, linear, nonlinear, continuous, discrete, and hybrid. For the sake of clarity, we have summarized this classification
in Figure 1.
In this paper, we will present the surveyed references following the classification given above. We underline again
that this paper does not mean to be an exhaustive presentation of all existing adaptive control algorithms, but rather an
overview pointing out to monographs, surveys, and recent research papers, in which the readers can find a more detailed
presentation of the specific results.
FIGURE 1 A classification of adaptive control
In the sequel, we first start by reviewing works related to model-based adaptive control together with a few open chal-
lenges, in Section 2. In Section 3, we report results related to data-driven adaptive control and conclude the section with
some open problems pertaining to this class of adaptive controllers. Section 4 contains a few points of comparison between
these two classes. Finally, this overview ends with concluding remarks and some general open problems in adaptive
control, in Section 5.
2 MODEL-BASED ADAPTIVE CONTROL
As we explained earlier, by model-based adaptive controllers, we mean controllers that are designed completely based
on a model of the system. The model can be the system's uncertain model, used in works related to adaptation to plants'
uncertainties (see, eg, the work of Ioannou and Sun1), or the disturbance uncertain model, used in adaptive disturbance
rejection, also known in some literature as adaptive regulation dealing with uncertain disturbance rejection (see, eg, the
work of Landau et al2). As depicted in Figure 1, we decompose this class into the following two subclasses.
2.1 Fully model-based adaptive control
In this subclass, we include all the methods that are fully based on a model of the system, in the sense that both the
controller and the adaptation filters are based on a model of the system. These methods can be further classified in terms
of the approach used to compensate for the model uncertainties, ie, either direct or indirect, and also in terms of the
nature of the model and the controller equations. Indeed, we can classify the model-based adaptive algorithm as linear
controllers (continuous, discrete, or hybrid) and nonlinear controllers (continuous, discrete, or hybrid).
There have been myriads of textbooks and survey papers about model-based linear (direct/indirect) adaptive control.
For instance, the reader is referred to some related books1,2,9,10,24-29,31,55 and related survey papers.4-7,13,15,17
A noticeable recent model-based adaptive control is the L1 adaptive control, proposed first for linear systems (see, eg, the work of Cao and Hovakimyan56,57). The main idea in L1 adaptive control can be seen as an extension of the well-known direct model reference adaptive control (d-MRAC), where a strictly proper stable filter is inserted before the d-MRAC controller to guarantee, on top of the asymptotic performance obtained by classical d-MRAC, some transient tracking performance (provided high-gain adaptation feedback). The performance of this approach has been discussed in the works of
Ioannou et al58 and van Heusden et al.59
We can also cite here works that focus on designing model-based adaptive controllers for linear systems under input
constraints60-65 or in infinite dimension.66 Furthermore, we could classify as direct linear adaptive control the concept of
multiple-model adaptive switching control, where a set of robust linear controllers is designed to deal with a range of
plant uncertainties, and a supervisor is used to switch online between these controllers, while maintaining performance
and stability.67-70
Another interesting approach is the so-called concurrent adaptive control, which uses recorded and instantaneous data
concurrently for adaptation. It is shown to converge, under some conditions on the recorded data, to the true uncertain
model parameters without the need for a classical persistent-excitation condition, required by traditional model-based
adaptive laws that use only instantaneous data for parametric estimation.71,72
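To make this mechanism concrete, the following minimal Python sketch (our illustration, with an assumed regressor and assumed gains, not taken from the cited works71,72) estimates the parameters of a linearly parameterized uncertainty using recorded and instantaneous data concurrently; note that the estimate converges even though the instantaneous input is frozen, ie, not persistently exciting.

```python
import numpy as np

# Concurrent-learning estimation sketch for y = theta^T phi(x): the update is
# driven by the instantaneous prediction error plus the prediction errors
# replayed on a stack of recorded data, so convergence does not require a
# persistently exciting input (only a stack whose regressors span R^2 here).
theta_true = np.array([1.5, -0.7])                    # unknown parameters
phi = lambda x: np.array([np.sin(x), np.cos(2 * x)])  # known regressor (assumed)

# Recorded data: two points whose regressors are linearly independent.
stack = [(phi(x), theta_true @ phi(x)) for x in (0.2, 1.3)]

theta_hat = np.zeros(2)
gamma, dt = 5.0, 1e-3
x_frozen = 0.0                                        # no excitation at all
for _ in range(20000):
    p = phi(x_frozen)
    e_inst = theta_true @ p - theta_hat @ p           # instantaneous error
    replay = sum(pj * (yj - theta_hat @ pj) for pj, yj in stack)
    theta_hat += gamma * dt * (p * e_inst + replay)   # concurrent update law

print(theta_hat)  # approaches theta_true despite the frozen, nonexciting input
```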
Another important problem in adaptive control is the one dealing with unknown disturbance rejection, often referred
to as “adaptive regulation.”2,73 Indeed, in these works, the plant's model is known or identified, but the disturbance model
is unknown or uncertain. In the direct adaptive methods, adaptive filters are used to tune the feedback controller (for
example, via adaptive observers of the disturbance) to reject the unknown disturbance, without explicitly estimating its
model. Several linear direct adaptive controllers have been proposed in this context (see, eg, the works of Marino et al,74
Marino and Tomei,75 and Aranovskiy and Freidovich76). Under the indirect adaptive disturbance rejection methods, we
can include numerous algorithms that use the internal model principle to model unknown disturbances and then use
adaptive filters to estimate the coefficients of the internal model (refer to the works of Bodson and Douglas,77 Ding,78
Landau et al,79 and Serrani80 as well as to the nice survey papers by Landau et al15,81).
Some of the concepts mentioned above have been extended to the more challenging case of nonlinear systems; one can refer, for example, to the following books: chapter 8 in the work of Slotine and Li,82 Krstić et al,33 Spooner et al,34 Astolfi et al,35 Fradkov et al,36 and Yakubovich83 and the following survey papers: Ortega and Tang,6 Barkana,17 Astolfi,37 Ilchmann and Ryan,84 and Fekri et al.85
For example, following a formulation that appeared in the Russian literature in the late 1960s (see, eg, the works of
Yakubovich83 and Fradkov86), we can state the model-based nonlinear adaptive control problem as follows: consider a nonlinear plant modeled in the state space by the nonlinear differential equation, ie,
$$\dot{x} = F(t, x, u, p), \quad y = G(t, x, u, p), \qquad (1)$$
where $t \in \mathbb{R}_+$, $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, and $u \in \mathbb{R}^{n_c}$ are the scalar time variable, the state vector, the output vector, and the control vector, respectively; $p \in \mathbb{R}^{n_p}$ represents a vector of unknown parameters, an element of an a priori known set $\mathcal{P}$, and $F$ and $G$ are two smooth functions.
We then define a control objective function $Q(t, x, u): \mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}^{n_c} \to \mathbb{R}$, such that the control objective is deemed reached if and only if
$$Q(t, x, u) \le 0, \quad \text{for } t \ge t^* > 0. \qquad (2)$$
The adaptive problem is to find a (model-based) two-level control algorithm of the form
$$u(t) = U_t\left(u(s), y(s), \theta(s)\right), \quad s \in [0, t[,$$
$$\theta(t) = \Theta_t\left(u(s), y(s), \theta(s)\right), \quad s \in [0, t[, \qquad (3)$$
where $U_t$ and $\Theta_t$ are nonanticipative operators (or causal operators). The goal of the two-level control (3) is to meet the control objective (2), while maintaining the closed-loop trajectories of (1) and (3) bounded for any $p \in \mathcal{P}$ and for a given initial condition set $\Omega$, subject to $(x(0), \theta(0)) \in \Omega$. This formulation is indeed fairly general. For instance, if one is concerned with a regulation adaptive problem, the objective function $Q$ can be chosen as follows.
For state regulation to a constant vector $x^*$:
$$Q_{xr} = \|x - x^*\|_R^2 - \delta, \quad R = R^T > 0, \; \delta > 0. \qquad (4)$$
For output regulation to a constant vector $y^*$:
$$Q_{yr} = \|y - y^*\|_R^2 - \delta, \quad R = R^T > 0, \; \delta > 0. \qquad (5)$$
Similarly, if state or output tracking is the control target, then the objective function can be formulated as
$$Q_{xt} = \|x - x^*(t)\|_R^2 - \delta, \quad R = R^T > 0, \quad \text{state tracking of } x^*(t),$$
$$Q_{yt} = \|y - y^*(t)\|_R^2 - \delta, \quad R = R^T > 0, \quad \text{output tracking of } y^*(t), \qquad (6)$$
where the reference trajectories are solutions of a desired reference model
$$\dot{x}^* = F^*(t, x^*), \quad y^* = G^*(t, x^*). \qquad (7)$$
Of course, to be able to obtain any useful analysis of the nonlinear controllers, some assumptions on the type of nonlinear-
ities had to be formulated. For instance, a very well-known assumption in nonlinear (both direct and indirect) adaptive
control is the linear parameterization of the plant model, ie, $F$ and $G$ in (1) are linearly dependent on the unknown plant parameters $p$. This assumption is important enough to be used as a metric to further classify direct nonlinear controllers
in terms of the nature of the plant's parameterization in the unknown parameters.
Indeed, there have been a lot of efforts in dealing with the challenging case of nonlinear uncertainty parameterization
(see, eg, references87-102).
Let us present, through a simple example, one of these results, namely, the speed gradient–based adaptive method for nonlinear parameterization, which was first introduced in the Russian literature by Fradkov et al,36 Krasovskii,103 and Krasovskii and Shendrik.104
As an example of adaptive control with nonlinear parameterization, we report below an application of the
speed-gradient approach to a passivity-based adaptive control, as introduced in the work of Seron et al.87 The goal here is
to use passivity characteristics to solve nonlinear adaptive control problems of the general form (1), (2), and (3). The con-
trol objective is to render an uncertain nonlinear system passive, also known as passifying the system, using an adaptive
controller of the form (3). For instance, based on the definition of input-output passivity, one can choose the control
objective function in (2) as
$$Q_p = V(x(t)) - V(x(0)) - \int_0^t u^T(\tau)\, y(\tau)\, d\tau, \qquad (8)$$
where $V$ is a $C^r$ real-valued nonnegative function, subject to $V(0) = 0$. The reader can infer that this choice of control objective function is due to the definition of input-output $C^r$ passivity. Indeed, if the control objective (2) is (adaptively) achieved, then one can write the inequality
$$\int_0^t u^T(\tau)\, y(\tau)\, d\tau \ge V(x(t)) - V(x(0)), \qquad (9)$$
which implies passivity of the system. Since the stability properties of passive systems are well documented (see, eg, the
work of Byrnes et al105), one can then easily claim asymptotic stability of the adaptive feedback system. For example, let
us consider the following nonlinear system in normal form (see, eg, the work of Isidori and Willems106):
$$\dot{z} = f_1(z, p) + f_2(z, y, p)\, y,$$
$$\dot{y} = f(z, y, p) + g(z, y)\, u, \qquad (10)$$
where $z \in \mathbb{R}^{n-m}$ are the zero-dynamics states, $y \in \mathbb{R}^m$, and $p \in \mathbb{R}^{n_p}$ is the vector of constant unknown parameters. We assume that $g$ is nonsingular and that system (10) is weakly minimum-phase. Then, one can propose the following nonlinear (passifying) adaptive controller:
$$u(z, y, \theta) = g^{-1}(z, y)\left(w^T(z, y, \theta) - \rho y + v\right), \quad \rho > 0,$$
$$\dot{\theta} = -\Gamma\, \nabla_{\theta}\!\left(w(z, y, \theta)\, y\right), \quad \Gamma > 0, \qquad (11)$$
where $w$ is defined as
$$w(z, y, \theta) = -f^T(z, y, \theta) - \frac{\partial W(z, \theta)}{\partial z} f_2(z, y, \theta), \qquad (12)$$
with $W$ satisfying the following (weak minimum-phase) condition:
$$\frac{\partial W(z, \theta)}{\partial z} f_1(z, \theta) \le 0, \qquad (13)$$
locally in $z, \theta$. Under the assumption of convexity of the function $w(z, y, \theta)\, y$ with respect to $\theta$, the adaptive closed-loop system (10), (11), and (12) is passive from the new input $v$ to the output $y$. Finally, asymptotic stabilization can be easily achieved by a simple output feedback from $y$ to $v$.
Other recent results on nonlinear direct adaptive control can be found, for example, in the works of Wang et al107 and
Cao et al,108 where the concept of L1 adaptive control has been extended to nonlinear models.
We also include in this section the so-called composite (or combined) adaptive control.109-112 In this approach, the idea is
to combine the advantages of both direct and indirect adaptive control, leading to combined adaptive controllers that have been claimed to give better transient behavior than direct or indirect MRAC adaptive controllers.
To conclude this section about fully model-based adaptation, we can cite other recent works, ie, post the latest gen-
eral survey paper,18 which can be classified under the model-based paradigm: for nonlinear models,113-126 for models with
time delay,127-137 with parameter-independent realization controllers,138 with input/output quantization,139-144 under state
constraints,145,146 under inputs and actuator-bandwidth constraints,147-149 for Markovian jump systems,150-152 for switched
systems,153-156 for partial differential equation (PDE)–based models,157-162 for nonminimum/minimum-phase systems,163-167
to achieve adaptive regulation and disturbance rejection,168-173 multiple-model and switching adaptive control,174-178 linear
quadratic regulator (LQR)–based adaptive control,179 model predictive control–based adaptive control,180,181 applications
of model-based adaptive control,182-192 for sensor/actuator fault mitigation,193-195 for rapidly time-varying uncertainties,196
nonquadratic Lyapunov function–based MRAC,197 for stochastic systems,198,199 retrospective cost adaptive control,200 per-
sistent excitation–free/data accumulation–based control or concurrent adaptive control,201 sliding mode–based adaptive
control,202-204 set-theoretic–based adaptive controller with performance guarantees,205 sampled data systems,206 and robust
adaptive control.207
2.2 Learning-based adaptive control
As we mentioned in the Introduction section, by learning-based controllers, we mean controllers that are partly based on a
model of the system and partly based on a data-driven learning algorithm. The data-driven learning is used to complement
the model-based part and compensate for the uncertain or missing information from the model. This compensation is
done either directly by learning the uncertain part or indirectly by tuning the controller to deal with the uncertainty. We
have to emphasize that the learning algorithm is solely based on the interaction with the system, and not on the model.
Examples of such learning algorithms include machine learning and neural networks (NNs), data-driven optimization
algorithms like extremum seekers, etc.
Recently, there have been a lot of efforts in this direction of adaptive control. One of the reasons behind this increas-
ing interest is that the field of data-driven learning has reached a certain maturity, which led to a good analysis and
understanding of the main properties of the available data-driven learning algorithms. Indeed, the idea of combining a model-based controller with a data-driven learning algorithm is attractive. The reason is that, due to this
combination, one could take advantage of the model-based design, with its stability guarantees, and add to it the advan-
tages of the data-driven learning, with its fast convergence and robustness to uncertainties. This combination is usually
referred to as the dual or modular design for adaptive control. In this line of research in adaptive control, we can cite some
related references.34,41-44,208-248
For example, in the NN-based modular adaptive control design, the idea is to write the model of the system as a combi-
nation of a known part and an unknown part (also known as the disturbance part). The NN is then used to approximate
the unknown part of the model. Finally, a controller, based on both the known part and the NN estimate of the unknown
part, is determined to realize some desired regulation or tracking performance (see, eg, references34,42,208,214,235-237,245,249-255).
Let us use a simple second-order example to illustrate this idea. Consider the state-space model (in Brunovsky form)
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = f(x) + u, \qquad (14)$$
where $f$ is an unknown smooth nonlinear scalar function of the state variables $x = (x_1, x_2)^T$, and $u$ is a scalar control variable. The unknown part of the model, $f$, is approximated by an NN-based estimate $\hat{f}$, as follows:
$$\hat{f} = \hat{W}^T S(x), \qquad (15)$$
where $\hat{W} = (\hat{w}_1, \ldots, \hat{w}_N)^T \in \mathbb{R}^N$ is the estimate of the weight vector of dimension $N$, which is the number of the NN nodes. The vector $S(x) = (s_1(x), \ldots, s_N(x))^T \in \mathbb{R}^N$ is the regressor vector, with $s_i$, $i = 1, \ldots, N$, representing some radial basis functions. Next, we consider the reference model
$$\dot{x}_{\mathrm{ref}1} = x_{\mathrm{ref}2}, \quad \dot{x}_{\mathrm{ref}2} = f_{\mathrm{ref}}(x_{\mathrm{ref}}), \qquad (16)$$
where $f_{\mathrm{ref}}$ is a known nonlinear smooth function of the desired state trajectories $x_{\mathrm{ref}} = (x_{\mathrm{ref}1}, x_{\mathrm{ref}2})^T$. We assume that $f_{\mathrm{ref}}$ is chosen such that the reference trajectories are uniformly bounded in time and orbital, ie, a repetitive motion, starting from any desired initial conditions $x_{\mathrm{ref}}(0)$. Then, a simple learning-based controller is given by
$$u = -e_1 - c_1 e_2 - \hat{W}^T S(e) + \dot{v},$$
$$e_1 = x_1 - x_{\mathrm{ref}1}, \quad e_2 = x_2 - v,$$
$$v = -c_2 e_1 + x_{\mathrm{ref}2},$$
$$\dot{v} = -c_2(-c_2 e_1 + e_2) + f_{\mathrm{ref}}(x_{\mathrm{ref}}), \quad c_1, c_2 > 0,$$
$$\dot{\hat{W}} = \Gamma\left(S(e)\, e_2 - \sigma \hat{W}\right), \quad \sigma > 0, \; \Gamma = \Gamma^T > 0. \qquad (17)$$
This is a modular controller because the expression of the control $u$ in (17) is dictated by the Brunovsky control form of the model (this is the known model-based information used in the control), and the remainder of the control is based on a model-free NN estimating the unknown part of the model, $f$. This controller has been proven to ensure uniform bound-
edness of the closed-loop signals, practical exponential convergence of the state trajectories to a neighborhood of the
reference orbital trajectories, and convergence of the regressor vector to its optimal value, ie, optimal estimation of the
model uncertainty (see, eg, Theorem 4.1 in the work of Wang and Hill42). This is only a very simple example of NN-based
learning–based adaptive controllers. More algorithms and analysis of this type of controllers can be found, for example,
in the works of Spooner et al,34 Wang and Hill,42 and Lewis et al208 as well as the references therein.
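As a qualitative illustration, a minimal Python simulation of the controller (17) on the plant (14) can be sketched as follows; the "unknown" nonlinearity f, the reference dynamics f_ref, the RBF centers, and all gains are illustrative assumptions of ours, not the designs used in the cited works.

```python
import numpy as np

# Simulation sketch of the modular NN controller (17) for the Brunovsky-form
# plant (14). The function f is used only to simulate the plant; the controller
# sees it solely through the RBF estimate W^T S(e). All tunings are assumed.
f = lambda x: -np.sin(x[0]) - 0.5 * x[1]       # "unknown" plant nonlinearity
f_ref = lambda xr: -xr[0]                       # harmonic (orbital) reference

centers = np.array([[i, j] for i in range(-2, 3) for j in range(-2, 3)], float)
S = lambda z: np.exp(-np.sum((z - centers) ** 2, axis=1))  # RBF regressor

c1, c2, sigma, dt = 2.0, 2.0, 0.01, 1e-3
Gamma = 10.0 * np.eye(len(centers))
x, xr, W = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.zeros(len(centers))

for _ in range(30000):
    e1 = x[0] - xr[0]
    v = -c2 * e1 + xr[1]
    e2 = x[1] - v
    vdot = -c2 * (-c2 * e1 + e2) + f_ref(xr)
    e = np.array([e1, e2])
    u = -e1 - c1 * e2 - W @ S(e) + vdot         # control law of (17)
    # Euler steps: plant (14), reference model (16), weight update of (17)
    x = x + dt * np.array([x[1], f(x) + u])
    xr = xr + dt * np.array([xr[1], f_ref(xr)])
    W = W + dt * (Gamma @ (S(e) * e2 - sigma * W))

print(e1, e2)  # tracking errors stay uniformly bounded, near zero
```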
Note that the NN-based learning part does not use any part of the model; instead, it is completely data driven, which is
in contrast with the fully model-based adaptive controllers that use model-based filters, eg, gradient descent filters,33 to
estimate the uncertainties.
We also want to mention here some of the learning-based control methods based on extremum seeking (ES) algorithms.
For instance, the so-called numerical optimization–based ES algorithms (see, eg, the work of Zhang and Ordonez43) are
used to optimize a desired performance cost function (the data-driven part) under the constraints of the system dynamics
(the model-based part). These algorithms rely on the measurement of the cost function to generate a sequence of desired
states that can lead the system to an optimal value for the performance cost function. The system model (assumed known)
is then used to design a model-based controller that forces the system states to track these desired states. Hence, if we
examine these algorithms closely, we see that they use a data-driven step, to optimize an unknown performance function,
and then use a model-based step to guide the system dynamics toward the optimal performance. Due to these two-step
designs, we classify these algorithms as learning-based adaptive controllers.
We present here a simple example of a numerical optimization–based ES control (see, eg, chapter 4 in the work of Zhang
and Ordonez43).
Consider the following nonlinear dynamics:
$$\dot{x} = f(x, u), \qquad (18)$$
where $x \in \mathbb{R}^n$ is the state vector, $u$ is the control (assumed to be a scalar, to simplify the presentation), and $f$ is a known smooth (possibly nonlinear) vector function. We associate with the system modeled by Equation (18) a desired performance cost function $Q(x)$, where $Q$ is a scalar smooth function of $x$. However, the explicit expression of $Q$ as a function of $x$ (or $u$) is not known. In other words, the only available information is direct measurements of $Q$ (and maybe its gradient). The goal is then to iteratively learn a control input that seeks the minimum of $Q$.
One example of such an ES algorithm is given below.
Step 1: Initiate: $t_0 = 0$, $x(t_0) = x^s_0$ (chosen in $\mathbb{R}^n$), and the iteration step $k = 0$.
Step 2: Use a numerical optimization algorithm to generate $x^s_{k+1}$, based on the measurements $Q(x(t_k))$ (and, if available, $\nabla Q(x(t_k))$), st, $x^s_{k+1} = \arg\min(Q(x(t_k)))$.
Step 3: Using the known model (18), design a state regulator $u$ to regulate the actual state $x(t_{k+1})$ to the desired (optimal) state $x^s_{k+1}$ in a finite time $\delta_k$, ie, $x(t_{k+1}) = x^s_{k+1}$, $t_{k+1} = t_k + \delta_k$.
Step 4: Increment the iteration index $k$ to $k + 1$, and go back to Step 2.
Under some assumptions ensuring the existence of a global minimum, which is a stabilizable equilibrium point of
(18), it has been shown that this type of algorithms converges to the minimum of the performance cost function (see, eg,
chapters 3, 4, and 5 in the work of Zhang and Ordonez43).
For example, consider the simple linear time-invariant model
$$\dot{x} = Ax + Bu, \quad x \in \mathbb{R}^n, \; u \in \mathbb{R}. \qquad (19)$$
We assume that the pair $(A, B)$ is controllable. In this case, the previous numerical optimization–based ES algorithm specializes as follows.
Step 1: Initiate: $t_0 = 0$, $x(t_0) = x^s_0$ (chosen in $\mathbb{R}^n$), and the iteration step $k = 0$.
Step 2: Use a descent optimization algorithm to generate $x^s_{k+1}$, based on the measurements $Q(x(t_k))$ and $\nabla Q(x(t_k))$, st,
$$x^s_{k+1} = x(t_k) - \alpha_k \nabla Q(x(t_k)), \quad \alpha_k > 0.$$
Step 3: Choose a finite regulation time $\delta_k$, and compute the model-based regulation control
$$u(t) = -B^T e^{A^T (t_{k+1} - t)} G^{-1}(\delta_k)\left(e^{A \delta_k} x(t_k) - x^s_{k+1}\right),$$
$$G(\delta_k) = \int_0^{\delta_k} e^{A\tau} B B^T e^{A^T \tau}\, d\tau,$$
for $t_k \le t \le t_{k+1} = t_k + \delta_k$.
Step 4: Test $\|\nabla Q(x(t_{k+1}))\| < \varepsilon$ ($\varepsilon > 0$, a convergence threshold); if yes, end; otherwise, increment the iteration index $k$ to $k + 1$, and go back to Step 2.
It has been shown in theorem 4.1.5 in the work of Zhang and Ordonez43 that this numerical optimization–based ES
controller converges (globally and asymptotically) to the first-order stationary point of the cost function Q.
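To make the two-step design concrete, the following minimal Python sketch runs the above loop on a second-order instance of (19); the plant matrices, the quadratic cost Q (whose minimizer is hidden from the designer), and all tunings are illustrative assumptions of ours.

```python
import numpy as np
from scipy.linalg import expm

# Numerical optimization-based ES sketch for the LTI model (19), following
# Steps 1-4 above: the cost and its gradient enter only through measurements,
# and the known model (A, B) is used only in the regulation step (Step 3).
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]])
x_opt = np.array([1.0, -0.5])                  # unknown to the designer
Q = lambda x: float(np.sum((x - x_opt) ** 2))  # measured cost Q(x)
gradQ = lambda x: 2.0 * (x - x_opt)            # measured gradient of Q

delta, alpha, n_sub = 1.0, 0.3, 500
dt = delta / n_sub
# Controllability Gramian G(delta) by midpoint quadrature, plus the state
# transition matrices used by the Step 3 control law.
G = sum(expm(A * (i + 0.5) * dt) @ B @ B.T @ expm(A.T * (i + 0.5) * dt) * dt
        for i in range(n_sub))
Phis = [expm(A.T * (delta - i * dt)) for i in range(n_sub)]
eAd = expm(A * delta)

x = np.array([0.0, 0.0])
for k in range(50):
    if Q(x) < 1e-4:                            # Step 4: convergence test
        break
    xs = x - alpha * gradQ(x)                  # Step 2: gradient-descent target
    lam = np.linalg.solve(G, eAd @ x - xs)
    for i in range(n_sub):                     # Step 3: simulate u(t)
        u = -B.T @ Phis[i] @ lam               # minimum-energy regulation control
        x = x + dt * (A @ x + B @ u)

print(x, Q(x))  # x reaches a small neighborhood of the minimizer x_opt
```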
Many other ES-based adaptive controllers, which fall into the class of learning-based adaptive controllers, have been
proposed in the literature. We cannot possibly present here all of these results; instead, we refer the readers to a few other
references for more examples on this topic.44,215,216,222-225,227-229,232,256
To end this section, we want to mention some other recent results in the field of learning-based adaptive control, namely,
the (model-based)*reinforcement-learning (RL) controllers, also known as, depending on the specific research commu-
nity, (model-based) approximate/adaptive dynamic programming (ADP) algorithms, or (model-based) neuro-dynamic programming (NDP); see references41,46,210,211,213,217,218,231,238-241,257-268.
RL and (A,N)-DP algorithms are all based on the fundamental principles of dynamic programming (DP) (see, eg, the
work of Bellman51). However, DP solutions based on the Bellman optimality equation can only be computed efficiently for problems with small state, action, and outcome spaces, which is referred to as the three curses of
dimensionality in the work of Powell.258 To overcome these curses of dimensionality, researchers have developed several
algorithms to approximate the solution of the Bellman equation (also referred to as the Hamilton-Jacobi or Hamilton-Jacobi-Bellman equation in the control community). These algorithms are mainly known as RL algorithms
in the computer science community or as (A,N)-DP in the control community.
Indeed, one popular way to implement RL and (A,N)-DP algorithms is by using an “actor-critic” structure that
involves two approximator functions, namely, the actor, which parameterizes the control policy, and the critic, which
parameterizes the cost function describing the performance of the control system.41,269
We can illustrate one of these algorithms on the simple case of linear time-invariant continuous systems described by
the model
$$\dot{x} = Ax + Bu, \qquad (20)$$
with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, and the pair $(A, B)$ assumed to be stabilizable. We underline here that the main assumption that makes this case classifiable as learning-based adaptive control is that the drift matrix $A$ is not known, while the control matrix $B$ is known, ie, partial knowledge of the model. Thus, the controller will be partly based on the model (based on $B$) and partly data driven (using RL to compensate for the unknown part $A$). This model is then associated with an LQR-type cost function of the form
$$V(u) = \int_{t_0}^{\infty} \left(x^T(\tau) R_1 x(\tau) + u^T(\tau) R_2 u(\tau)\right) d\tau, \quad R_1 \ge 0, \; R_2 > 0, \qquad (21)$$
where $R_1$ is chosen such that $\left(R_1^{1/2}, A\right)$ is detectable. The LQR optimal control is the controller satisfying
$$u^*(t) = \arg\min_{u(t)} V(u), \quad t \in [t_0, \infty[. \qquad (22)$$
It is well known (see, eg, the work of Kailath270) that the solution (in the nominal case with the known model) is given by
$$u^*(t) = -K x(t), \quad K = R_2^{-1} B^T P, \qquad (23)$$
where $P$ is the solution of the algebraic Riccati equation
$$A^T P + P A - P B R_2^{-1} B^T P + R_1 = 0, \qquad (24)$$
which has a unique semidefinite solution, under the detectability condition mentioned above. However, the above (clas-
sical) solution relies on the full knowledge of model (20). In our case, we assumed that $A$ is unknown, which requires some learning steps. One way to learn $P$, and thus learn the optimal control $u^*$, is based on an iterative RL algorithm
called the integral reinforcement learning policy iteration algorithm (see, eg, chapter 3 in the work of Vrabie et al41). This
algorithm is based on the iterative solution of the following equations:
$$x^T P_i x = \int_t^{t+T} x^T(\tau)\left(R_1 + K_i^T R_2 K_i\right) x(\tau)\, d\tau + x^T(t+T)\, P_i\, x(t+T),$$
$$K_{i+1} = R_2^{-1} B^T P_i, \quad i = 1, 2, \ldots, \qquad (25)$$
*The reason why we insisted on adding the term (model-based) before each of the RL and (A,N)-DP is that we want to focus in this section on the
RL and (A,N)-DP algorithms that use some knowledge about the model of the system. Other RL and (A,N)-DP algorithms exist that do not use any
information about the system model and will be referred to in the next section.
where the initial gain $K_1$ is chosen such that $A - BK_1$ is stable. It has been proven in theorem 3.4 in the work of Vrabie et al41 that the policy iteration (25), if initiated from a stabilizing gain $K_1$, under the assumptions of stabilizability of $(A, B)$ and detectability of $\left(R_1^{1/2}, A\right)$, converges to the optimal LQR solution $u^*$ given by (23) and (24).
Note that the more challenging case of nonlinear systems has also been studied; see, for example, other works41,213,231,238,261,262,264 and the references therein.
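For concreteness, the following minimal Python sketch implements the policy iteration (25) from simulated data; the drift matrix A appears only inside the trajectory simulator (playing the role of the real plant) and never in the learning update, while B is assumed known, matching the partial-model setting above. All numerical values are illustrative assumptions of ours.

```python
import numpy as np

# Data-driven sketch of the integral RL policy iteration (25) for (20)-(21):
# the learner observes states and integrated costs along closed-loop
# trajectories; A is only used to generate those trajectories.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])    # "unknown" drift (stable here)
B = np.array([[0.0], [1.0]])                 # known control matrix
R1, R2 = np.eye(2), np.array([[1.0]])
T, dt = 0.5, 1e-3

def rollout(K, x0):
    # Simulate xdot = (A - B K) x over [0, T]; return x(T) and the cost integral.
    x, cost = x0.copy(), 0.0
    for _ in range(int(T / dt)):
        cost += dt * (x @ R1 @ x + (K @ x) @ R2 @ (K @ x))
        x = x + dt * ((A - B @ K) @ x)
    return x, cost

vech = lambda x: np.array([x[0] ** 2, 2 * x[0] * x[1], x[1] ** 2])

K = np.zeros((1, 2))                         # K1 = 0 is stabilizing here
x0s = [np.array(v, float) for v in ([1, 0], [0, 1], [1, 1], [1, -1])]
for _ in range(8):
    rows, rhs = [], []
    for x0 in x0s:                           # policy evaluation from data
        xT, cost = rollout(K, x0)
        rows.append(vech(x0) - vech(xT))     # x0'P x0 - xT'P xT = cost
        rhs.append(cost)
    p = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    P = np.array([[p[0], p[1]], [p[1], p[2]]])
    K = np.linalg.solve(R2, B.T @ P)         # improvement: K_{i+1} = R2^{-1} B^T P_i

print(P)  # approaches the Riccati solution of (24)
print(K)  # approaches the optimal LQR gain of (23)
```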
2.3 Open problems and future work
We want to summarize in this section some of the open problems in model-based adaptive control. For instance, in
PDE-based adaptive control, some of the most recent available results deal only with linear or semilinear PDEs with linear
uncertainty parameterization (see, eg, the works of Ahmed-Ali et al,159,271 Anfinsen et al,160 and Anfinsen and Aamo161).
The extension of these results to more general nonlinear PDEs or to nonlinear uncertainty parameterization remains an
open problem.
For systems with time delays affecting the control input, recent works have focused on constant time delays (see, eg, the
works of Ma et al,134 Nguyen et al,135 and Hussain et al136); however, the case of time-varying time delays is a challenging
problem, which is important in many applications, eg, control over networks with time-varying communication delays
of a group of moving robots.
Another interesting area in adaptive control is control under input constraints and input bandwidth limitations. The
available recent model-based algorithms consider linear plants with linear actuator dynamics (see, eg, the works of Gruenwald et al147 and Thiel et al148); taking into account nonlinearities in the plant and/or in the actuator dynamics is an open problem (see the work of Chen et al272 for a recent work considering hysteresis effects on the actuators and
the sensors). As far as state constraints are concerned, the case of systems in strict-feedback form with linear parameterization has been considered recently146; however, an extension to more general types of nonlinearities remains a challenging
problem. Of course, in many real applications, constraints should be enforced both on the actuators and the states. Some
recent attempts in that direction have been proposed in other works180,181,232,242,246,247; however, these results are limited to plants with
constant or slowly varying uncertainties, whereas the case of nonlinear plants under input and state constraints with
rapidly time-varying uncertainties remains an open problem.
A relatively new paradigm in adaptive control is the one aiming at a priori performance guarantees, eg, upper bounds
on the tracking error imposed a priori by the user. This problem has been studied, for example, in the work of Arabi et al205
where the authors proposed a set-theoretic (fully model-based) adaptive controller using generalized restricted potential
functions. However, these results could be extended in the context of learning-based adaptive control, by learning the
uncertain part of the model, for instance, by using universal function approximation, eg, NN, instead of assuming that it
is linearly parameterized by known basis functions.
In learning-based adaptive control, ADP approaches have been very successful in dealing with (unknown) linear plants
with quadratic costs over an infinite time-horizon (see, eg, the works of Vamvoudakis and Ferraz265 and Vamvoudakis266);
one possible way to extend these results is to consider different types of cost functions, in finite horizon, with nonlinear
terms, and delays in the plant's model.
3 DATA-DRIVEN ADAPTIVE CONTROL
As we mentioned earlier, by data-driven adaptive controllers, we mean all those that do not rely on any mathematical
model of the system. These data-driven controllers are solely based on online measurements collected directly from the
system. The term “adaptive” here means that the controller can adapt and, in principle, deal with any uncertainty of the
system, since it does not rely on any specific model. For example, one well-known approach, which can be classified under
a data-driven control framework, is the so-called (model-free) ES methods† (see, eg, the works of Ariyur and Krstić273 and Scheinker and Krstić274). This type of data-driven optimization method was first proposed in the French literature related to train systems, in the 1920s.275 Their basic goal is to seek an extremum, ie, a maximum (or minimum), of a given function without closed-form knowledge of the function or its gradient. There have been a lot of results on ES algorithms,273,274,276-309 following the appearance of a rigorous convergence analysis in the work of Krstić and Wang.310
†As opposed to the ES controllers presented in Section 2.2, which are based on some knowledge of the system model.
To give the reader a sense of how ES methods work, let us present below a simple ES algorithm. Consider the following general dynamics:
$$\dot{x} = f(x, u), \qquad (26)$$
where $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}$ is the scalar control (for simplicity), and $f: \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ is a smooth function. Now, assume that Equation (26) represents the model of a real system and that the goal of the control is to optimize a given performance of the system. This performance can be as simple as a regulation of a given output of the system to a desired constant value, or a more involved output tracking of a desired time-varying trajectory, etc. Let us now model this desired performance as a smooth function $J(x, u): \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$, which we simply denote $J(u)$, since the state vector $x$ is driven by $u$. To be able to derive some convergence results, we need the following assumptions.
Assumption 1. There exists a smooth function $l: \mathbb{R} \to \mathbb{R}^n$ such that
$$f(x, u) = 0, \quad \text{if and only if } x = l(u). \qquad (27)$$
Assumption 2. For each $u \in \mathbb{R}$, the equilibrium $x = l(u)$ of system (26) is locally exponentially stable.
Assumption 3. There exists (a maximum) $u^* \in \mathbb{R}$, such that
$$(J \circ l)^{(1)}(u^*) = 0, \quad (J \circ l)^{(2)}(u^*) < 0. \qquad (28)$$
Then, based on these assumptions, one can design some simple extremum seekers with proven convergence bounds. Indeed, one of the simplest ways to maximize $J$ is to use a gradient-based ES control as follows:
$$\dot{u} = k \frac{dJ}{du}, \quad k > 0. \qquad (29)$$
We can analyze the convergence of the ES algorithm (29) by using the Lyapunov function
$$V = J(u^*) - J(u) > 0, \quad \text{for } u \ne u^*. \qquad (30)$$
The derivative of $V$ leads to
$$\dot{V} = -\frac{dJ}{du}\, \dot{u} = -k\left(\frac{dJ}{du}\right)^2 \le 0. \qquad (31)$$
This proves that algorithm (29) drives $u$ to the invariant set defined by $\frac{dJ}{du} = 0$, which is (by Assumption 3) equivalent to $u = u^*$. However, as simple as algorithm (29) might seem, it still requires the knowledge of the gradient of $J$. To overcome
this requirement, one can use instead an algorithm motivated by sliding-mode control. For instance, if we define the
tracking error
$$e = J(u) - \mathrm{ref}(t), \qquad (32)$$
where “ref” denotes a time function that is monotonically increasing. The idea is that if $J$ tracks “ref,” then it will increase until it reaches an invariant set centered around the equality $\frac{dJ}{du} = 0$. A simple way to achieve this goal is by choosing the following sliding-mode–inspired ES law:
$$\dot{u} = k_1\, \mathrm{sgn}\!\left(\sin\!\left(\frac{\pi e}{k_2}\right)\right), \quad k_1, k_2 > 0. \qquad (33)$$
This controller is shown (see, eg, chapter 3 in the work of Zhang and Ordonez43) to steer $u$ to the set defined by $\left|\frac{dJ}{du}\right| < \frac{\dot{\mathrm{ref}}(t)}{k_1}$, which can be made arbitrarily small by the proper tuning of $k_1$. Note that the controller requires only measurements of the performance cost, without any need of the system's model.
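A minimal Python sketch of the law (32)-(33) is given below; the hidden map J, the reference rate, and the gains are illustrative assumptions of ours.

```python
import numpy as np

# Sliding-mode-inspired ES sketch implementing (32)-(33): only measured values
# of J are used, without any gradient information or system model.
J = lambda u: 2.0 - (u - 1.0) ** 2       # hidden map, maximum at u* = 1
k1, k2, dt = 1.0, 0.02, 1e-4
u = 0.0
ref = J(u)                                # monotonically increasing reference
for _ in range(150000):                   # 15 s of simulated time
    ref += 0.2 * dt
    e = J(u) - ref                        # tracking error (32)
    u += dt * k1 * np.sign(np.sin(np.pi * e / k2))  # ES law (33)

print(u)  # settles where |dJ/du| < ref_rate/k1, ie, near the maximizer u* = 1
```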
Another well-known class of ES approaches is the so-called perturbation-based ES. It uses a perturbation signal
(often sinusoidal) to explore the space of control and steers the control variable toward its local optimum, by implic-
itly following a gradient update. This type of ES algorithms has been thoroughly analyzed, for example, in the works
of Ariyur and Krstić,273 Krstić and Wang,310 Tan et al,281 and Rotea.283 For instance, a simplified version of a sinusoidal disturbance–based ES algorithm writes as follows:
$$\dot{z} = a \sin\left(\omega t + \frac{\pi}{2}\right) J(u),$$
$$u = z + a \sin\left(\omega t - \frac{\pi}{2}\right), \quad a > 0, \; \omega > 0, \qquad (34)$$
where $J: \mathbb{R} \to \mathbb{R}$ is a scalar cost function, and $a$ and $\omega$ are two tuning parameters. It has been shown, using averaging
theory and singular perturbation theory, that this simple algorithm under some assumptions (of at least local optimality
and smoothness of $J$) can (locally) converge to a neighborhood of the optimal argument of $J$ (see, eg, the works of Krstić and Wang310 and Rotea283). There are, of course, many other ES algorithms; however, it is not the purpose of this paper to
review all the ES results. Instead, we refer the interested reader to the ES literature cited above for more details.
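For illustration, a minimal Python simulation of (34) can be sketched as follows; averaging the closed loop gives $\dot{z} \approx -(a^2/2) J'(z)$, ie, a gradient descent on $J$, so the scheme seeks the minimizer of the (illustrative, assumed) cost below.

```python
import numpy as np

# Perturbation-based ES sketch implementing (34): the sinusoidal probing signal
# both perturbs the input and demodulates the measured cost, so that z follows,
# on average, a gradient descent on the (hidden) cost J.
J = lambda u: (u - 2.0) ** 2 + 1.0        # hidden cost, minimum at u* = 2
a, w, dt = 0.4, 50.0, 1e-4
z = 0.0
for step in range(400000):                 # 40 s of simulated time
    t = step * dt
    u = z + a * np.sin(w * t - np.pi / 2)  # perturbed input of (34)
    z += dt * a * np.sin(w * t + np.pi / 2) * J(u)

print(z)  # z ends up oscillating in a small neighborhood of the minimizer u* = 2
```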
Let us now talk about another well-known data-driven control method, namely, the model-free RL algorithms,‡ also known as, as mentioned before, approximate dynamic programming, adaptive dynamic programming, or NDP algorithms.259,269,311-320
‡There are model-based RL and (A,N)-DP algorithms that rely partly on a given model of the system. We have reported them in Section 2.2 dedicated to “learning-based approaches.”
The idea behind data-driven RL is that by trying random control actions, the controller can eventually build a predictive
model of the system on which it is operating. RL is a class of machine learning algorithms, which learns how to map
states to actions in such a way as to maximize a desired reward. In these algorithms, the controller has to discover the
best actions by trial and error. This idea was motivated by the field of psychology where it has been realized that animals
have the tendency to reselect (or not) actions based on their good (or bad) outcomes.321
In RL, the controller learns an optimal policy (or action), which defines the system's way of behaving at a given time and state. The best policy is obtained by optimizing, through trial and error, a desired value function.
The value function represents the value of a policy in the long run. Simply put, the value function at a given state is the
total amount of immediate reward a controller can expect to accumulate over the future, starting from that state. The
trial-and-error process leads to the well-known exploration-versus-exploitation tradeoff. Indeed, to maximize the value
function, the controller has to select actions (or policy) that have been tried before, which leads to a high immediate
reward and, most importantly, to a high long-term value. However, to discover these actions with a high reward, the
controller has to try as many different actions as needed. This trial versus application of control actions is the exploitation
(application) versus exploration (trial) dilemma, which characterizes most of the data-driven learning controllers.
There are a lot of RL methods available in the literature; they all use the same main ingredients mentioned above, but
they differ in their algorithms, eg, in the way they estimate the long-term value function, etc. Note that the implementation of some of the model-free RL or model-free (A,N)-DP algorithms also follows an actor-critic structure, similar to what we presented in the learning-based methods. The main difference is that, here, no prior knowledge about the model
is required to “guide” the structure of the actor or the critic, ie, they are purely based on the interaction with the system.
Examples of such algorithms are action-dependent heuristic dynamic programming and action-dependent dual heuristic
dynamic programming, as defined in chapter 13 in the work of Werbos.269
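To make these ingredients concrete, the following minimal tabular Q-learning sketch (entirely illustrative, not tied to any specific cited algorithm) learns, by trial and error, to reach the rewarding end of a small chain of states, balancing exploration and exploitation through an epsilon-greedy action selection.

```python
import numpy as np

# Minimal tabular Q-learning sketch: the controller knows nothing about the
# environment dynamics and learns state-action values from interaction only.
rng = np.random.default_rng(1)
n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))        # learned state-action value function
alpha, gamma, eps = 0.1, 0.95, 0.3         # step size, discount, exploration rate

for episode in range(3000):
    s = 0
    for _ in range(50):
        # epsilon-greedy: mostly exploit the value estimate, sometimes explore
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0  # reward only at the goal
        # temporal-difference update of the long-term value estimate
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next
        if r > 0:
            break

print(np.argmax(Q, axis=1))  # greedy policy: move right in every nonterminal state
```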
We can also mention other approaches that have been used in data-driven adaptive control, like, for instance, the
evolutionary methods, eg, genetic algorithms,322 NN deep learning algorithms and deep RL algorithms,319,323-327 kernel
function–based parameterization,328 particle filters,329 and iterative learning control (ILC).330-343
3.1 Open problems and future work
We conclude this section by citing some of the open problems in the data-driven methods. For instance, in the extremum
seeker–based methods, global ES in the presence of local extrema has been studied in the work of Tan et al282 for the case of
a single input; extension of these types of results to the general case of multiple inputs, under input and state constraints,
is still missing. The general setting of ES for hybrid maps has been proposed in recent works,308,309 where input and state
constraints have been considered. However, possible extensions could include nonsmooth cost functions, codesign of the
extremum seekers together with the stabilizing state feedback loop, and considering stochastic hybrid plants.
Another important improvement direction in data-driven adaptive methods is the fact that these algorithms rely on
some tuning parameters, eg, amplitudes and frequencies of the dither signals in some extremum seekers,310 or the choice
of the kernel parameters in kernel function–based approaches328; these tuning parameters can strongly influence the
performances of these data-driven approaches. An open research area is to design algorithms that are robust with respect
to these tuning parameters, with performance guarantees. We will reemphasize this point in the final conclusion of
this paper.
In recent ILC results, the important problem of varying iteration length has been successfully studied for linear systems,
or for some particular classes of nonlinear plants (see, eg, the work of Shen et al342 and the references therein). Possible
extensions could deal with more general nonlinear models, for example, by relaxing the global Lipschitz conditions that
are often used in ILC papers.
Regarding RL data-driven methods, one of the main missing parts is transient stability, or stability while learning. Indeed,
the existing data-driven RL algorithms, often developed within the computer science community, lack stability guaran-
tees. However, to be able to implement these algorithms on real safety-critical systems, eg, autonomous cars or unmanned
aerial vehicles, etc, one has to impose some kind of stability guarantees at the design phase. Using tools from dynami-
cal systems and control theory to tackle this important problem is an open direction. Some preliminary attempts in this
direction have been reported in recent works.344,345 Another important point of improvement for existing data-driven RL
methods is robustness with respect to training data and feedback delays. Indeed, RL data-driven optimal control policies
that are learned from simulation data should be able to perform well on the real testbed without retraining. One possible
research direction to enable this is to use tools from robust control theory to design data-driven RL policies that are robust
with respect to bounded errors on the training data and bounded feedback delays.
Similarly, the data-driven NN control algorithms, better known recently as deep learning algorithms, have demonstrated
interesting performances on several testbeds (refer to the papers of Porikli et al346,347 for the latest image and video pro-
cessing applications); however, no rigorous constructive design approach has been proposed so far, no a priori stability
guarantees have been obtained, and no robustness to training data has been formally established (refer to the recent
survey on robustness of deep learning networks in image processing applications in the work of Fawzi et al348 and to
the discussion about machine learning theory and control theory in the paper of Lamnabhi-Lagarrigue et al349(pp16-18)).
Stability concepts in dynamical systems and control theory, as well as robustness theoretical tools, could be used to better
understand the existing deep learning algorithms, but most importantly, these tools could be used to propose a construc-
tive theory to design stable and robust deep learning–based controllers. Along this line of thought, we can refer to some
recent papers.350-355
4 COMPARISON BETWEEN MODEL-BASED AND DATA-DRIVEN
ADAPTIVE CONTROL
We have seen that there have been numerous works both on the (fully or partially) model-based methods and on the
data-driven methods. The fully model-based adaptive control field has a long history of theoretical analysis and, as such,
is considered to be very mature in terms of theoretical guarantees. However, due to its model-based formulations, the
obtained results are rather restricted to some known types of models, and the remaining extensions to more general mod-
els are very challenging. Some relaxations of these restrictions have been obtained in the partially model-based adaptive
control or learning-based control paradigm, where only part of the model is needed to design the adaptive controller,
whereas the unmodeled part is handled by some data-driven optimization and learning algorithms. Still, this flexibility, gained relative to the fully model-based approaches, remains constrained in comparison with the fully data-driven
adaptive methods. Indeed, fully data-driven methods, as we saw in Section 3, learn the best control policies by direct inter-
action with the system, without any prior knowledge about the model of the system. This allows a great deal of flexibility,
indeed. However, it does come at the cost of extensive measurements and probing of the system or data collection. It is
also prone to high computational power needs, since, by discarding any form of model, these data-driven methods do not
use any prior knowledge about the physics of the system and, thus, have to explore a larger space of action to find the
optimal policies. In contrast, the learning-based methods, which rely partially on some prior knowledge of the system,
by using a partial model, explore a smaller or a parameterized space of actions to search for optimal policies or optimal
parameters, and by doing so, these learning-based methods prove to be faster or less computationally demanding than
the fully data-driven methods. Finally, an important point of comparison of all these adaptive control approaches is the
stability and performance guarantees, which have been obtained in the fully model-based and learning-based methods
but are largely lacking in the fully data-driven approaches, as we discussed in Section 3.1.
5 CONCLUSION
In this paper, we wanted to give a brief overview of the adaptive control field. We chose to decompose the field of adap-
tive control into two main streams, namely, model-based adaptive control and data-driven adaptive control. Indeed, we
defined the model-based adaptive control as being fully or partially relying on some knowledge of a system model. On
the opposite side, we defined the data-driven adaptive control as adaptive algorithms relying entirely on data collected
from direct interaction with the system. In each case, we presented a few simple examples and cited relevant references
in each subfield, focusing mainly on monographs, surveys, and recent research papers, with a total number of about
350 references.
As documented by the large number of monographs and papers in adaptive control reported here, we can say that this
field is well studied. However, many challenging problems remain open for investigation. For instance, we found very
few papers addressing the problem of adaptive control for hybrid dynamical systems, modeled by the general mathe-
matical class of differential inclusions (see, eg, the works of Poveda and Teel308 and Goebel et al356). Another challenging
area, which we did not discuss here, is the field of adaptive control of multiagent systems. Indeed, this field has attracted
much research interest in recent years (see, eg, the works of Panait and Luke,357 Dörfler and Bullo,358 and Poveda et al359).
However, several challenging problems remain open, for example, robustness of the adaptive controllers to clock syn-
chronization, which could be obtained by developing fully decentralized (in terms of clock synchronization) algorithms.
Another problem that seems open relates to the problem of codesigning multiagent-system adaptive algorithms together
with the communication network, in terms not only of topological graph constraints but also of communication band-
widths, communication delays, and communication nodes, eg, transmitter/receiver optimization using adaptive control
techniques.
We can also underline here one common drawback of all data-driven and learning-based adaptive controllers: although they are, by design, more forgiving in terms of the system's model knowledge, their learning algorithms often rely on a proper choice of weight functions and other coefficients defined in the learning algorithm, eg, excursion amplitudes and dither frequencies in the ES algorithms, choice of the basis functions in NN algorithms, etc, which, in some sense, defeats the purpose of not needing a good tuning of the model in the first place. Maybe an interesting direction to improve these
algorithms would be to use tools from robust control theory, merged with the learning-based and data-driven adaptive
control, to design learning and adaptive algorithms that are robust with respect to their tuning parameters. This will make
these adaptive algorithms less sensitive to the designer/user tuning.
Finally, as we discussed earlier in Section 4, the fully data-driven approaches, like the deep learning methods, would
benefit immensely from the theoretical tools in dynamical systems theory, as well as in nonlinear and robust control
theory, to achieve constructive design aiming for stability and performance guarantees.
ORCID
Mouhacine Benosman http://orcid.org/0000-0002-0154-454X
REFERENCES
1. Ioannou P, Sun J. Robust Adaptive Control. Mineola, NY: Dover Publications, Inc; 2012.
2. Landau ID, Airimițoaie T-B, Castellanos-Silva A, Constantinescu A. Adaptive and robust active vibration control: Methodology and tests. Advances in Industrial Control. Springer International Publishing Switzerland; 2017.
3. Jarvis RA. Optimization strategies in adaptive control: a selective survey. IEEE Trans Syst Man Cybern. 1975;5(1):83-94.
4. Åström KJ. Theory and applications of adaptive control survey. Automatica. 1983;19(5):471-486.
5. Seborg DE, Edgar TF, Shah SL. Adaptive control strategies for process control: a survey. AIChE J. 1986;32(6):881-913.
6. Ortega R, Tang Y. Robustness of adaptive controllers—a survey. Automatica. 1989;25(5):651-677.
7. Isermann R. Parameter adaptive control algorithms—a tutorial. Automatica. 1982;18(5):513-528.
8. Kumar PR. A survey of some results in stochastic adaptive control. SIAM J Control Optim. 1985;23(3):329-380.
9. Egardt B. Stability of Adaptive Controllers. Berlin Heidelberg: Springer-Verlag; 1979.
10. Åström KJ, Wittenmark B. A survey of adaptive control applications. Paper presented at: IEEE Conference on Decision and Control;
1995; New Orleans, LA.
11. Ljung L, Gunnarsson S. Adaptation and tracking in system identification—a survey. Automatica. 1990;26(1):7-21.
12. Filatov NM, Unbehauen H. Survey of adaptive dual control methods. IET Control Theory Appl. 2000;147(1):118-128.
13. Åström KJ, Hagglund T, Hang CC, Ho WK. Automatic tuning and adaptation for PID controllers - a survey. Control Eng Pract. 1994;1(4):699-714.
14. Chen YQ, Moore KL, Yu J, Zhang T. Iterative learning control and repetitive control in hard disk drive industry - A tutorial. Int J Adapt
Control Signal Process. 2008;22(4):325-343.
15. Landau ID, Alma M, Constantinescu A, Martinez JJ, Noë M. Adaptive regulation—rejection of unknown multiple narrow band
disturbances (a review on algorithms and applications). Control Eng Pract. 2011;19(10):1168-1181.
16. Martín-Sànchez JM, Lemos JM, Rodellar J. Survey of industrial optimized adaptive control. Int J Adapt Control Signal Process.
2015;26(10):881-918.
17. Barkana I. Simple adaptive control - a stable direct model reference adaptive control methodology - brief survey. Int J Adapt Control Signal Process. 2014;28(7-8):567-603.
18. Tao G. Multivariable adaptive control: a survey. Automatica. 2014;50(11):2737-2764.
19. Nageshrao SP, Lopes GAD, Jeltsema D, Babuška R. Port-Hamiltonian systems in adaptive and learning control: a survey. IEEE Trans
Autom Control. 2016;61(5):1223-1238.
20. Feng G, Lozano R. Adaptive Control Systems. Oxford, UK: Newnes; 1999.
21. Goodwin GC, ed. Model Identification and Adaptive Control: From Windsurfing to Telecommunications. Springer-Verlag London Limited;
2001.
22. Ioannou P, Fidan B. Adaptive Control Tutorial. Philadelphia, PA: SIAM; 2006.
23. Kaufman H, Barkana I, Sobel K. Direct adaptive control algorithms: Theory and applications. Communications and Control Engineering.
New York, NY: Springer; 1998.
24. Landau ID. Adaptive Control. New York, NY: Marcel Dekker; 1979.
25. Goodwin GC, Sin KS. Adaptive Filtering Prediction and Control. Englewood Cliffs, NJ: Prentice-Hall; 1984.
26. Narendra KS, Annaswamy AM. Stable Adaptive Systems. Mineola, NY: Dover Publications, Inc; 1989.
27. Tsakalis KS, Ioannou PA. Linear Time Varying Systems: Control and Adaptation. Upper Saddle River, NJ: Prentice-Hall, Inc; 1993.
28. Landau ID, Lozano R, M'Saad M, Karimi A. Adaptive control: Algorithms, analysis and applications. Communications and Control
Engineering. Springer-Verlag London Limited; 2011.
29. Sastry S, Bodson M. Adaptive Control: Stability, Convergence and Robustness. Mineola, NY: Dover Publications, Inc; 2011.
30. Tao G. Adaptive Control Design and Analysis. New York, NY: John Wiley & Sons, Inc; 2003.
31. Goodwin GC, Sin KS. Adaptive Filtering Prediction and Control. Mineola, NY: Dover Publications, Inc; 2014.
32. Mosca E. Optimal, Predictive, and Adaptive Control. Upper Saddle River, NJ, USA: Prentice Hall; 1995.
33. Krstić M, Kanellakopoulos I, Kokotović PV. Nonlinear and Adaptive Control Design. New York, NY: Wiley; 1995.
34. Spooner JT, Maggiore M, Ordóñez R, Passino KM. Stable Adaptive Control and Estimation for Nonlinear Systems. New York, NY: John
Wiley & Sons, Inc; 2002.
35. Astolfi A, Karagiannis D, Ortega R. Nonlinear and Adaptive Control with Applications. London, UK: Springer; 2008.
36. Fradkov A, Miroshnik I, Nikiforov V. Nonlinear and Adaptive Control of Complex Systems. Dordrecht, The Netherlands: Kluwer Academic
Publishers; 1999.
37. Astolfi A. Nonlinear adaptive control. Encyclopedia of Systems and Control. New York, NY: Springer; 2015:866-870.
38. Guay M, Adetola V, DeHaan D. Robust and Adaptive Model Predictive Control of Nonlinear Systems. London, UK: The Institution of
Engineering and Technology; 2015.
39. Sragovich V. Mathematical Theory of Adaptive Control. Interdisciplinary Mathematical Sciences. Vol. 4. Singapore: World Scientific;
2006.
40. Smyshlyaev A, Krstic M. Adaptive Control of Parabolic PDEs. Princeton, NJ: Princeton University Press; 2010.
41. Vrabie D, Vamvoudakis K, Lewis FL. Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. England:
IET Digital Library; 2013.
42. Wang C, Hill DJ. Deterministic Learning Theory for Identification, Recognition, and Control. Boca Raton, FL: CRC Press; 2010.
43. Zhang C, Ordonez R. Extremum-Seeking Control and Applications: A Numerical Optimization-Based Approach. New York, NY: Springer;
2012.
44. Benosman M. Learning-Based Adaptive Control: An Extremum Seeking Approach–Theory and Applications. Oxford, UK:
Butterworth-Heinemann; 2016.
45. Vamvoudakis KG, Jagannathan S. Control of Complex Systems-Theory and Applications. Oxford, UK: Butterworth-Heinemann; 2016.
46. Liu D, Wei Q, Wang D, Yang X, Li H. Adaptive dynamic programming with applications in optimal control. Advances in Industrial Control.
Cham, Switzerland: Springer; 2017.
47. Benner AH, Drenick RF. An Adaptive Servo System. IRE Convention Record, Part 4; 1955.
48. Drenick RF, Shahbender RA. Adaptive servomechanism. Trans Am Inst Electr Eng. 1957;76(2):286-292.
49. Gregory PC. Proceedings of the Self Adaptive Flight Control Systems Symposium. WADC [Technical Report]. Dayton, Ohio: Wright Air
Development Centre; 1959.
50. Åström KJ, Eykhoff P. System identification—a survey. Automatica. 1971;7:123-162.
51. Bellman R. Dynamic Programming. Princeton, NJ: Princeton University Press; 1957.
52. Bellman R. Adaptive Control Processes: A Guided Tour. Princeton, NJ: Princeton University Press; 1961.
53. Tsypkin YZ. Adaptation and Learning in Automatic Systems. New York, NY: Academic Press; 1971.
54. Tsypkin YZ. Foundations of the Theory of Learning Systems. New York, NY: Academic Press; 1975.
55. Åström KJ, Wittenmark B. Adaptive Control. North Chelmsford, MA: Courier Corporation; 2013.
56. Cao C, Hovakimyan N. Design and analysis of a novel L1 adaptive control architecture with guaranteed transient performance. IEEE
Trans Autom Control. 2008;53(2):586-591.
57. Cao C, Hovakimyan N. L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. Advances in Design and Control.
Philadelphia, PA: SIAM Series; 2010.
58. Ioannou PA, Annaswamy AM, Narendra KS, et al. L1-adaptive control: stability, robustness, and interpretations. IEEE Trans Autom
Control. 2014;59(11):3075-3080.
59. van Heusden K, Talebian K, Dumont GA. Analysis of L1 adaptive state feedback control. Why does it approximate an implementable
LTI controller? Eur J Control. 2015;23:1-7.
60. Park BS, Lee JY, Park JB, Choi YH. Adaptive control for input-constrained linear systems. Int J Control Autom Syst. 2012;10(5):890-896.
61. Yeh P-C, Kokotović PV. Adaptive tracking designs for input-constrained linear systems using backstepping. Paper presented at: IEEE
American Control Conference; 1995; Seattle, WA.
62. Rojas OJ, Goodwin GC, Desbiens A. Study of an adaptive anti-windup strategy for cross-directional control systems. Paper presented at:
IEEE Conference on Decision and Control; 2002; Las Vegas, NV.
63. Ajami AF. Adaptive Flight Control in the Presence of Input Constraints [master's thesis]. Blacksburg, VA: Virginia Polytechnic Institute
and State University; 2005.
64. Lavretsky E, Hovakimyan N. Stable adaptation in the presence of input constraints. Syst Control Lett. 2007;56(11-12):722-729.
65. Turner MC. Positive μ-modification as an anti-windup mechanism. Syst Control Lett. 2017;102:15-21.
66. Wen J, Balas M. Robust adaptive control in Hilbert space. J Math Anal Appl. 1989;143(1):1-26.
67. Narendra KS, Balakrishnan J. Adaptive control using multiple models. IEEE Trans Autom Control. 1997;42(2):171-187.
68. Narendra KS, Driollet OA, Feiler M, George K. Adaptive control using multiple models, switching and tuning. Int J Adapt Control Signal
Process. 2003;17(2):87-102.
69. Giovanini L, Sanchez G, Benosman M. Observer-based adaptive control using multiple-models switching and tuning. IET Control Theory
Appl. 2014;8(4):235-247.
70. Narendra KS. Hierarchical adaptive control of rapidly time-varying systems using multiple models. In: Vamvoudakis K, Jagannathan S,
eds. Control of Complex Systems: Theory and Applications. Oxford, UK: Butterworth-Heinemann; 2016.
71. Chowdhary G. Concurrent Learning for Convergence in Adaptive Control Without Persistency of Excitation [PhD thesis]. Atlanta, GA:
Georgia Institute of Technology; 2010.
72. Chowdhary G, Yucelen T, Mühlegg M, Johnson E. Concurrent learning adaptive control of linear systems with exponentially convergent
bounds. Int J Adapt Control Signal Process. 2013;27(4):280-301.
73. Carnevale D, Galeani S, Sassano M, Astolfi A. Robust hybrid estimation and rejection of multi-frequency signals. Int J Adapt Control
Signal Process. 2016;30(12):1649-1673.
74. Marino R, Santosuosso G, Tomei P. Robust adaptive compensation of biased sinusoidal disturbances with unknown frequency.
Automatica. 2003;39(10):1755-1761.
75. Marino R, Tomei P. Output regulation for linear minimum phase systems with unknown order exosystem. IEEE Trans Autom Control.
2007;52(10):2000-2005.
76. Aranovskiy S, Freidovich LB. Adaptive compensation of disturbances formed as sums of sinusoidal signals with application to an active
vibration control benchmark. Eur J Control. 2013;19(4):253-265.
77. Bodson M, Douglas S. Adaptive algorithms for the rejection of sinusoidal disturbances with unknown frequency. Automatica.
1997;33(12):2213-2221.
78. Ding Z. Global stabilization and disturbance suppression of a class of nonlinear systems with uncertain internal model. Automatica.
2003;39(3):471-479.
79. Landau I, Constantinescu A, Rey D. Adaptive narrow band disturbance rejection applied to an active suspension—an internal model
principle approach. Automatica. 2005;41(4):563-574.
80. Serrani A. Rejection of harmonic disturbances at the controller input via hybrid adaptive external models. Automatica.
2006;42(7):1977-1985.
81. Landau I, Airimitoaie T-B, Silva AC. Adaptive attenuation of unknown and time-varying narrow band and broadband disturbances. Int
J Adapt Control Signal Process. 2015;29(11):1367-1390.
82. Slotine J, Li W. Applied Nonlinear Control. Englewood Cliffs, NJ: Prentice-Hall; 1991:68-73.
83. Yakubovich V. Theory of adaptive systems. Sov Phys Dokl. 1969;13:852-855.
84. Ilchmann A, Ryan EP. High-gain control without identification: a survey. GAMM-Mitteilungen. 2008;31(1):115-125.
85. Fekri S, Athans M, Pascoal A. Issues, progress and new results in robust adaptive control. Int J Adapt Control Signal Process.
2006;20(10):519-579.
86. Fradkov AL. Nonlinear adaptive control: regulation-tracking-oscillations. Paper presented at: First IFAC Workshop: New Trends in
Design of Control Systems; 1994; Smolenice, Slovakia.
87. Seron M, Hill D, Fradkov A. Nonlinear adaptive control of feedback passive systems. Automatica. 1995;31(7):1053-1057.
88. Ortega R, Fradkov A. Asymptotic stability of a class of adaptive systems. Int J Adapt Control Signal Process. 1993;7(4):255-260.
89. Fradkov AL. Speed-gradient scheme and its application in adaptive control problems. Autom Remote Control. 1980;40(9):1333-1342.
90. Karsenti L, Lamnabhi-Lagarrigue F, Bastin G. Adaptive control of nonlinear systems with nonlinear parameterization. Syst Control Lett.
1996;27(2):87-97.
91. Fradkov AL, Stotsky AA. Speed gradient adaptive control algorithms for mechanical systems. Int J Adapt Control Signal Process.
1992;6(3):211-220.
92. Fradkov A, Ortega R, Bastin G. Semi-adaptive control of convexly parametrized systems with application to temperature regulation of
chemical reactors. Int J Adapt Control Signal Process. 2001;15(4):415-426.
93. Liu X, Ortega R, Su H, Chu J. Immersion and invariance adaptive control of nonlinearly parameterized nonlinear systems. IEEE Trans
Autom Control. 2010;55(9):2209-2214.
94. Flores-Perez A, Grave I, Tang Y. Contraction based adaptive control for a class of nonlinearly parameterized systems. Paper presented at:
IEEE American Control Conference; 2013; Washington, DC.
95. Hung NVQ, Tuan HD, Narikiyo T, Apkarian P. Adaptive control for nonlinearly parameterized uncertainties in robot manipulators. IEEE
Trans Control Syst Technol. 2008;16(3):458-468.
96. Adetola V, Guay M, Lehrer D. Adaptive estimation for a class of nonlinearly parameterized dynamical systems. IEEE Trans Autom
Control. 2014;59(10):2818-2824.
97. Cao C, Annaswamy A, Kojic A. Parameter convergence in nonlinearly parameterized systems. IEEE Trans Autom Control.
2003;48(3):397-412.
98. Kojic A, Annaswamy A, Loh A-P, Lozano R. Adaptive control of a class of nonlinear systems with convex/concave parameterization. Syst
Control Lett. 1999;37(5):267-274.
99. Netto M, Annaswamy AM. Adaptive control of a class of multilinearly parameterized systems by using noncertainty equivalence control.
Paper presented at: IEEE Conference on Decision and Control; 2012; Maui, HI.
100. Tyukin I, Prokhorov D, Terekhov V. Adaptive control with nonconvex parameterization. IEEE Trans Autom Control. 2003;48(4):554-567.
101. Wang J, Qu Z. Robust adaptive control of a class of nonlinearly parameterised time-varying uncertain systems. IET Control Theory Appl.
2008;3(6):617-630.
102. Zhang T, Ge S, Hang C, Chai T. Adaptive control of first-order systems with nonlinear parameterization. IEEE Trans Autom Control.
2000;45(8):1512-1516.
103. Krasovskii AA. Optimal algorithms in problems of identification with an adaptive model. Avtom Telemekh. 1976;12:75-82.
104. Krasovskii AA, Shendrik VS. A universal algorithm for optimal control of continuous processes (in Russian). Avtom Telemekh. 1977;2:5-13.
105. Byrnes C, Isidori A, Willems JC. Passivity, feedback equivalence, and the global stabilization of minimum phase nonlinear systems. IEEE
Trans Autom Control. 1991;36(11):1228-1240.
106. Isidori A. Nonlinear Control Systems. Communications and Control Engineering Series. 2nd ed. London, UK: Springer-Verlag; 1989.
107. Wang X, Kharisov E, Hovakimyan N. Real-time L1 adaptive control for uncertain networked control systems. IEEE Trans Autom Control.
2015;60(9):2500-2505.
108. Cao C, Xargay E, Hovakimyan N. L1 adaptive control for a class of uncertain nonaffine-in-control nonlinear systems. IEEE Trans Autom
Control. 2016;61(3):840-846.
109. Duarte MA, Narendra KS. Combined direct and indirect approach to adaptive control. IEEE Trans Autom Control. 1989;34(10):1071-1075.
110. Slotine J-JE, Li W. Composite adaptive control of robot manipulators. Automatica. 1989;25(4):509-519.
111. Lavretsky E. Combined/composite model reference adaptive control. IEEE Trans Autom Control. 2009;54(11):2692-2697.
112. Roy AB, Bhasin S, Kar IN. Combined MRAC for unknown MIMO LTI systems with parameter convergence. IEEE Trans Autom Control.
2018;63(1):283-290.
113. Liu Y, Lin Y, Huang R. Decentralised adaptive output feedback control for interconnected nonlinear systems preceded by unknown
hysteresis. Int J Control. 2015;88(9):1712-1725.
114. Wang C, Lin Y. Adaptive dynamic surface control for MIMO nonlinear time-varying systems with prescribed tracking performance. Int
J Control. 2015;88(4):832-843.
115. Shang F, Liu Y. Adaptive output feedback stabilisation for planar nonlinear systems with unknown control coefficients. Int J Control.
2013;88(8):1609-1618.
116. Sun Z-Y, Li T, Yang S-H. A unified time-varying feedback approach and its applications in adaptive stabilization of high-order uncertain
nonlinear systems. Automatica. 2016;70:246-257.
117. Pisano A, Tanelli M, Ferrara A. Switched/time-based adaptation for second-order sliding mode control. Automatica. 2016;64:126-132.
118. Efimov D, Edwards C, Zolghadri A. Enhancement of adaptive observer robustness applying sliding mode techniques. Automatica.
2016;72:53-56.
119. Zhao X, Zheng X, Niu B, Liu L. Adaptive tracking control for a class of uncertain switched nonlinear systems. Automatica.
2015;52:185-191.
120. Xing L, Wen C, Liu Z, Su H, Cai J. Event-triggered adaptive control for a class of uncertain nonlinear systems. IEEE Trans Autom Control.
2017;62(4):2071-2076.
121. Dashkovskiy S, Pavlichkov S. Constructive design of adaptive controllers for nonlinear MIMO systems with arbitrary switchings. IEEE
Trans Autom Control. 2016;61(7):2001-2007.
122. Pan Y, Yu H. Composite learning from adaptive dynamic surface control. IEEE Trans Autom Control. 2016;61(9):2603-2609.
123. Song Y, Zhao K, Krstić M. Adaptive backstepping with exponential regulation in the absence of persistent excitation. Paper presented at: IEEE American Control Conference; 2016; Boston, MA.
124. Sun Z-Y, Zhang C-H, Wang Z. Adaptive disturbance attenuation for generalized high-order uncertain nonlinear systems. Automatica.
2017;80:102-109.
125. Michailidis I, Baldi S, Kosmatopoulos EB, Ioannou PA. Adaptive optimal control for large-scale nonlinear systems. IEEE Trans Autom
Control. 2017;62(11):5567-5577.
126. Huang C, Yu CB. Tuning function design for nonlinear adaptive control systems with multiple unknown control directions. Automatica.
2018;89:259-265.
127. Li Z, Chen Z, Fu J, Sun C. Direct adaptive controller for uncertain MIMO dynamic systems with time-varying delay and dead-zone inputs.
Automatica. 2016;63:287-291.
128. Zhang X, Lin Y. Adaptive control of nonlinear time-delay systems with application to a two-stage chemical reactor. IEEE Trans Autom
Control. 2015;60(4):1074-1079.
129. Selivanov A, Fridman E, Fradkov A. Passification-based adaptive control: uncertain input and output delays. Automatica.
2015;54:107-113.
130. Jia X, Xu S, Ma Q, Li Y, Chu Y. Universal adaptive control of feedforward nonlinear systems with unknown input and state delays. Int J
Control. 2016;89(11):2311-2321.
131. Zhu Y, Krstić M, Su H. Adaptive output feedback control for uncertain linear time-delay systems. IEEE Trans Autom Control. 2017;62(2):545-560.
132. Zhang X, Lin Y. Adaptive output feedback control for a class of large-scale nonlinear time-delay systems. Automatica. 2015;52:87-94.
133. Shi X, Xu S, Li Y, Chen W, Chu Y. Robust adaptive control of strict-feedback nonlinear systems with unmodelled dynamics and
time-varying delays. Int J Control. 2017;90(2):334-347.
134. Ma J, Ding F, Xiong W, Yang E. Combined state and parameter estimation for Hammerstein systems with time delay using the Kalman
filtering. Int J Adapt Control Signal Process. 2017;31(8):1139-1151.
135. Nguyen K-D, Li Y, Dankowicz H. Delay robustness of an L1 adaptive controller for a class of systems with unknown matched
nonlinearities. IEEE Trans Autom Control. 2017;62(10):5485-5491.
136. Hussain H, Yildiz Y, Matsutani M, Annaswamy A, Lavretsky E. Computable delay margins for adaptive systems with state variables
accessible. IEEE Trans Autom Control. 2017;62(10):5039-5054.
137. Liu Z-G, Wu Y-Q. Universal strategies to explicit adaptive control of nonlinear time-delay systems with different structures. Automatica.
2018;89:151-159.
138. Ortega R, Panteley E. When is a parameterized controller suitable for adaptive control? Eur J Control. 2015;22:13-16.
139. Selivanov A, Fradkov A, Liberzon D. Adaptive control of passifiable linear systems with quantized measurements and bounded
disturbances. Syst Control Lett. 2016;88:62-67.
140. Li Y-X, Yang G-H. Adaptive asymptotic tracking control of uncertain nonlinear systems with input quantization and actuator faults.
Automatica. 2016;72:177-185.
141. Lai G, Liu Z, Chen CLP, Zhang Y. Adaptive asymptotic tracking control of uncertain nonlinear system with input quantization. Syst
Control Lett. 2016;96:23-29.
142. Yu X, Lin Y. Adaptive backstepping quantized control for a class of nonlinear systems. IEEE Trans Autom Control. 2017;62(2):981-985.
143. Li G, Lin Y. Adaptive output feedback control for a class of nonlinear systems with quantised input and output. Int J Control.
2017;90(2):239-248.
144. Wang C, Wen C, Lin Y, Wang W. Decentralized adaptive tracking control for a class of interconnected nonlinear systems with input
quantization. Automatica. 2017;81:359-368.
145. Liu Y-J, Tong S. Barrier Lyapunov functions-based adaptive control for a class of nonlinear pure-feedback systems with full state
constraints. Automatica. 2016;64:70-75.
146. Liu Y-J, Tong S. Barrier Lyapunov functions for Nussbaum gain adaptive control of full state constrained nonlinear systems. Automatica.
2017;76:143-152.
147. Gruenwald BC, Wagner D, Yucelen T, Muse JA. Computing actuator bandwidth limits for model reference adaptive control. Int J Control.
2016;89(12):2434-2452.
148. Thiel M, Schwarzmann D, Annaswamy A, Schultalbers M, Jeinsch T. Improved performance for adaptive control of systems with input
saturation. Paper presented at: IEEE American Control Conference; 2016; Boston, MA.
149. López-Araujo D, Loria A, Zavala-Río A. Adaptive tracking control of Euler-Lagrange systems with bounded controls. Int J Adapt Control
Signal Process. 2016;31(3):299-313.
150. Li H, Shi P, Yao D, Wu L. Observer-based adaptive sliding mode control for nonlinear Markovian jump systems. Automatica.
2016;64:133-142.
151. Li H, Shi S, Yao D. Adaptive sliding-mode control of Markov jump nonlinear systems with actuator faults. IEEE Trans Autom Control.
2017;62(4):1933-1939.
152. Li Y, Sun H, Zong G, Hou L. Composite adaptive anti-disturbance resilient control for Markovian jump systems with partly known
transition rate and multiple disturbances. Int J Adapt Control Signal Process. 2017;31(7):1077-1097.
153. Wang CY, Jiao XH. Adaptive control under arbitrary switching for a class of switched nonlinear systems with nonlinear parameterisation.
Int J Control. 2015;88(10):2044-2054.
154. Kersting S, Buss M. Direct and indirect model reference adaptive control for multivariable piecewise affine systems. IEEE Trans Autom
Control. 2017;62(11):5634-5649.
155. Yuan S, De Schutter B, Baldi S. Adaptive asymptotic tracking control of uncertain time-driven switched linear systems. IEEE Trans Autom Control. 2017;62(11):5802-5807.
156. Fu J, Ma R, Chai T. Adaptive finite-time stabilization of a class of uncertain nonlinear systems via logic-based switchings. IEEE Trans
Autom Control. 2017;62(11):5998-6003.
157. Ascencio P, Astolfi A, Parisini T. An adaptive observer for a class of parabolic PDEs based on a convex optimization approach for
backstepping PDE design. Paper presented at: IEEE American Control Conference; 2016; Boston, MA.
158. Bialy JB, Chakraborty I, Cekic SC, Dixon WE. Adaptive boundary control of store induced oscillations in a flexible aircraft wing.
Automatica. 2016;70:230-238.
159. Ahmed-Ali T, Giri F, Krstić M, Burlion L, Lamnabhi-Lagarrigue F. Adaptive boundary observer for parabolic PDEs subject to domain and boundary parameter uncertainties. Automatica. 2016;72:115-122.
160. Anfinsen H, Diagne M, Aamo OM, Krstić M. An adaptive observer design for n+1 coupled linear hyperbolic PDEs based on swapping. IEEE Trans Autom Control. 2016;61(12):3979-3990.
161. Anfinsen H, Aamo OM. Adaptive stabilization of n+1 coupled linear hyperbolic systems with uncertain boundary parameters using
boundary sensing. Syst Control Lett. 2017;99:72-84.
162. Anfinsen H, Aamo OM. Adaptive control of linear 2 × 2 hyperbolic systems. Automatica. 2018;87:69-82.
163. Gibson TE, Qu Z, Annaswamy AM, Lavretsky E. Adaptive output feedback based on closed-loop reference models. IEEE Trans Autom
Control. 2015;60(10):2728-2733.
164. Bartolini G, Estrada A, Punta E. Observation and output adaptive tracking for a class of nonlinear non-minimum phase systems. Int J
Control. 2016;89(9):1807-1820.
165. Sokolov VF. Adaptive stabilization of parameter-affine minimum-phase plants under Lipschitz uncertainty. Automatica. 2016;73:64-70.
166. Dai S, Ren Z, Bernstein DS. Adaptive control of nonminimum-phase systems using shifted Laurent series. Int J Control. 2017;90(3):407-427.
167. Johnson CR, Goodwin GC, Sin KS. Global convergence of direct adaptive input matching control of some nonminimum phase plants.
Paper presented at: Annual Allerton Conference on Communication, Control, and Computing; 2017; Monticello, IL.
168. Jafari S, Ioannou PA. Robust adaptive attenuation of unknown periodic disturbances in uncertain multi-input multi-output systems.
Automatica. 2016;70:32-42.
169. Basturk HI, Krstić M. Adaptive sinusoidal disturbance cancellation for unknown LTI systems despite input delay. Automatica. 2015;58:131-138.
170. Pyrkin AA, Bobtsov AA. Adaptive controller for linear system with input delay and output disturbance. IEEE Trans Autom Control.
2016;61(12):4229-4234.
171. Jafari S, Ioannou P, Fitzpatrick B, Wang Y. Robustness and performance of adaptive suppression of unknown periodic disturbances.
IEEE Trans Autom Control. 2015;60(8):2166-2171.
172. Wen L, Tao G, Yang H, Zhang Y. An adaptive disturbance rejection control scheme for multivariable nonlinear systems. Int J Control.
2016;89(3):594-610.
173. Marino R, Tomei P. Hybrid adaptive multi-sinusoidal disturbance cancellation. IEEE Trans Autom Control. 2017;62(8):4023-4030.
174. Buchstaller D, French M. Robust stability for multiple model adaptive control: part I—the framework. IEEE Trans Autom Control.
2016;61(3):677-692.
175. Buchstaller D, French M. Robust stability for multiple model adaptive control: part II—gain bounds. IEEE Trans Autom Control.
2016;61(3):693-708.
176. Baldi S, Ioannou PA. Stability margins in adaptive mixing control via a Lyapunov-based switching criterion. IEEE Trans Autom Control.
2016;61(5):1194-1207.
177. Huang M, Wang X, Wang Z. Multiple model adaptive control for a class of linear-bounded nonlinear systems. IEEE Trans Autom Control.
2015;60(1):271-276.
178. Tan C, Tao G, Qi R, Yang H. A direct MRAC based multivariable multiple-model switching control scheme. Automatica. 2017;84:190-198.
179. Kanieski JM, Tambara RV, Pinheiro H, Cardoso R, Gründling HA. Robust adaptive controller combined with a linear quadratic regulator
based on Kalman filtering. IEEE Trans Autom Control. 2016;61(5):1373-1378.
180. Zhu B, Xia X. Adaptive model predictive control for unconstrained discrete-time linear systems with parametric uncertainties. IEEE
Trans Autom Control. 2016;61(10):3171-3176.
181. Heirung T, Ydstie B, Foss B. Dual adaptive model predictive control. Automatica. 2017;80:340-348.
182. Escareno J-A, Rakotondrabe M, Habineza D. Backstepping-based robust-adaptive control of a nonlinear 2-DOF piezoactuator. Control
Eng Pract. 2015;41:57-71.
183. Benosman M, Atinc G. Nonlinear adaptive control of electromagnetic actuators. IET Control Theory Appl. 2015;9(2):258-269.
184. Qiu Z, Santillo M, Jankovic M, Sun J. Composite adaptive internal model control and its application to boost pressure control of a
turbocharged gasoline engine. IEEE Trans Control Syst Technol. 2015;23(6):2306-2315.
185. Wang N, Qian C, Sun J-C, Liu Y-C. Adaptive robust finite-time trajectory tracking control of fully actuated marine surface vehicles. IEEE
Trans Control Syst Technol. 2016;24(4):1454-1462.
186. Luspay T, Grigoriadis KM. Adaptive parameter estimation of blood pressure dynamics subject to vasoactive drug infusion. IEEE Trans
Control Syst Technol. 2016;24(3):779-787.
187. de Ruiter AHJ. Observer-based adaptive spacecraft attitude control with guaranteed performance bounds. IEEE Trans Autom Control.
2016;61(10):3146-3151.
188. Amini MR, Shahbakhti M, Pan S, Hedrick JK. Bridging the gap between designed and implemented controllers via adaptive robust
discrete sliding mode control. Control Eng Pract. 2017;59:1-15.
189. Evangelista A, Pisano A, Puleston P, Usai E. Receding horizon adaptive second-order sliding mode control for doubly-fed induction
generator based wind turbine. IEEE Trans Control Syst Technol. 2017;25(1):73-84.
190. Zhang Y, Xu Q. Adaptive sliding mode control with parameter estimation and Kalman filter for precision motion control of a piezo-driven
microgripper. IEEE Trans Control Syst Technol. 2017;25(2):728-735.
191. Chaoui H, Gualous H. Adaptive state of charge estimation of lithium-ion batteries with parameter and thermal uncertainties. IEEE Trans
Control Syst Technol. 2017;25(2):752-759.
192. Hu Y, Chen MZQ, Xu S, Liu Y. Semiactive inerter and its application in adaptive tuned vibration absorbers. IEEE Trans Control Syst Technol. 2017;25(1):294-300.
193. Jin X, Haddad WM, Yucelen T. An adaptive control architecture for mitigating sensor and actuator attacks in cyber-physical systems.
IEEE Trans Autom Control. 2017;62(11):6058-6064.
194. Xie C-H, Yang G-H. Decentralized adaptive fault-tolerant control for large-scale systems with external disturbances and actuator faults.
Automatica. 2017;85:83-90.
195. Ouyang H, Lin Y. Adaptive fault-tolerant control for actuator failures: a switching strategy. Automatica. 2017;81:87-95.
196. Miller DE. The role of convexity in the adaptive control of rapidly time-varying systems. Syst Control Lett. 2017;97:91-97.
197. Hosseinzadeh M, Yazdanpanah MY. Performance enhanced model reference adaptive control through switching non-quadratic Lyapunov functions. Syst Control Lett. 2015;76:47-55.
198. Farokhi F, Johansson KH. Adaptive control design under structured model information limitation: a cost-biased maximum-likelihood
approach. Syst Control Lett. 2015;75:8-13.
199. Zhang J, Liu Y, Mu X. Further results on global adaptive stabilisation for a class of uncertain stochastic nonlinear systems. Int J Control.
2015;88(3):441-450.
200. Sumer ED, Bernstein DS. On the role of subspace zeros in retrospective cost adaptive control of non-square plants. Int J Control.
2015;88(2):295-323.
201. Mishkov R, Darmonski S. Nonlinear adaptive control system design with asymptotically stable parameter estimation error. Int J Control.
2018;91(1):181-203.
202. Shtessel Y, Fridman L, Plestan F. Adaptive sliding mode control and observation. Int J Control. 2016;89(9):1743-1746.
203. Incremona GP, Ferrara A. Adaptive model-based event-triggered sliding mode control. Int J Adapt Control Signal Process.
2016;30(8):1298-1316.
204. Yang H, Wang Y, Yang Y. Adaptive finite-time control for high-order nonlinear systems with mismatched disturbances. Int J Adapt
Control Signal Process. 2017;31(9):1296-1307.
205. Arabi E, Gruenwald BC, Yucelen T, Nguyen NT. A set-theoretic model reference adaptive control architecture for disturbance rejection
and uncertainty suppression with strict performance guarantees. Int J Control. 2017:1-14.
206. Abidi K, Yildiz Y, Annaswamy A. Control of uncertain sampled-data systems: an adaptive posicast control approach. IEEE Trans Autom
Control. 2017;62(5):2597-2602.
207. Hussain HS, Annaswamy A, Lavretsky E. A new approach to robust adaptive control. Paper presented at: IEEE American Control
Conference; 2016; Boston, MA.
208. Lewis FW, Jagannathan S, Yesildirak A. Neural Network Control of Robot Manipulators and Non-Linear Systems. London, UK: Taylor &
Francis; 1999.
209. Al-Tamimi A, Lewis FL, Abu-Khalaf M. Model-free Q-learning designs for linear discrete-time zero-sum games with application to
H-infinity control. Automatica. 2007;43(3):473-481.
210. Vamvoudakis KG, Lewis FL. Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem.
Automatica. 2010;46(5):878-888.
211. Lewis FL, Vamvoudakis KG. Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using
measured output data. IEEE Trans Syst Man Cybern B Cybern. 2011;41(1):14-25.
212. Hou Z, Jin S. A novel data-driven control approach for a class of discrete-time nonlinear systems. IEEE Trans Control Syst Technol.
2011;19(6):1549-1558.
213. Lewis FL, Vrabie D, Vamvoudakis KG. Reinforcement learning and feedback control: using natural decision methods to design optimal
adaptive controllers. IEEE Control Syst Mag. 2012;32(6):76-105. http://doi.org/10.1109/MCS.2012.2214134
214. Wang C, Hill DJ, Ge SS, Chen G. An ISS-modular approach for adaptive neural control of pure-feedback systems. Automatica.
2006;42(5):723-731.
215. Guay M, Zhang T. Adaptive extremum seeking control of nonlinear dynamic systems with parametric uncertainties. Automatica.
2003;39:1283-1293.
216. DeHaan D, Guay M. Extremum-seeking control of state-constrained nonlinear systems. Automatica. 2005;41(9):1567-1574.
217. Vrabie D, Pastravanu O, Lewis F, Abu-Khalaf M. Adaptive optimal control for continuous-time linear systems based on policy iteration.
Automatica. 2009;45(2):477-484.
218. Koszaka L, Rudek R, Pozniak-Koszalka I. An idea of using reinforcement learning in adaptive control systems. Paper presented at: Inter-
national Conference on Networking, International Conference on Systems and International Conference on Mobile Communications
and Learning Technologies (ICNICONSMCL'06); April 2006; Morne, Mauritius.
219. Guay M, Dochain D, Perrier M, Hudon N. Flatness-based extremum-seeking control over periodic orbits. IEEE Trans Autom Control.
2007;52(10):2005-2012.
220. Lee JY, Park JB, Choi YH. Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear
systems. Automatica. 2012;48(11):2850-2859.
221. Haring M, van de Wouw N, Nešić D. Extremum-seeking control for nonlinear systems with periodic steady-state outputs. Automatica. 2013;49(6):1883-1891.
222. Frihauf P, Krstić M, Basar T. Finite-horizon LQ control for unknown discrete-time linear systems via extremum seeking. Eur J Control. 2013;19(5):399-407.
223. Haghi P, Ariyur K. On the extremum seeking of model reference adaptive control in higher-dimensional systems. Paper presented at:
IEEE American Control Conference; 2011; San Francisco, CA.
224. Haghi P, Ariyur K. Adaptive feedback linearization of nonlinear MIMO systems using ES-MRAC. Paper presented at: IEEE American
Control Conference; 2013; Washington, DC.
225. Benosman M, Atinc G. Extremum seeking-based adaptive control for electromagnetic actuators. Int J Control. 2015;88(3):517-530.
226. Modares H, Lewis F, Yucelen T, Chowdhary G. Adaptive optimal control of partially-unknown constrained-input systems using policy
iteration with experience replay. Paper presented at: AIAA Guidance, Navigation, and Control Conference; 2013; Boston, MA. http://
doi.org/10.2514/6.2013-4519
227. Benosman M, Xia M. Extremum seeking-based indirect adaptive control for nonlinear systems with time-varying uncertainties. Paper
presented at: IEEE European Control Conference; 2015; Linz, Austria.
228. Angulo MT. Nonlinear extremum seeking inspired on second order sliding modes. Automatica. 2015;57:51-55.
229. Benosman M. Multi-parametric extremum seeking-based auto-tuning for robust input-output linearization control. Int J Robust
Nonlinear Control. 2016;26(18):4035-4055.
230. Gruenwald B, Yucelen T. On transient performance improvement of adaptive control architectures. Int J Control. 2015;88(11):2305-2315.
231. Vamvoudakis KG, Miranda MF, Hespanha JP. Asymptotically stable adaptive–optimal control algorithm with saturating actuators and
relaxed persistence of excitation. IEEE Trans Neural Netw Learn Syst. 2016;27(11):2386-2398.
232. Subbaraman A, Benosman M. Extremum seeking-based iterative learning model predictive control (ESILC-MPC). Paper presented at:
12th IFAC Workshop on Adaptation and Learning in Control and Signal; 2016; Eindhoven, The Netherlands.
233. Benosman M, Farahmand A-M. Bayesian optimization-based modular indirect adaptive control for a class of nonlinear systems. Paper
presented at: 12th IFAC International Workshop on Adaptation and Learning in Control and Signal Processing; 2016; Eindhoven,
The Netherlands.
234. Song Y, Huang X, Wen C. Tracking control for a class of unknown nonsquare MIMO nonaffine systems: a deep-rooted information based
robust adaptive approach. IEEE Trans Autom Control. 2016;61(10):3227-3233.
235. Jiang B, Shen Q, Shi P. Neural-networked adaptive tracking control for switched nonlinear pure-feedback systems under arbitrary switching. Automatica. 2015;61:119-125.
236. Zhao X, Shi P, Zheng X, Zhang L. Adaptive tracking control for switched stochastic nonlinear systems with unknown actuator dead-zone.
Automatica. 2015;60:193-200.
237. Yang Q, Ge SS, Sun Y. Adaptive actuator fault tolerant control for uncertain nonlinear systems with multiple actuators. Automatica. 2015;60:92-99.
238. Jiang Y, Jiang Z-P. Global adaptive dynamic programming for continuous-time nonlinear systems. IEEE Trans Autom Control.
2015;60(11):2917-2929.
239. Gao W, Jiang ZP. Adaptive dynamic programming and adaptive optimal output regulation of linear systems. IEEE Trans Autom Control.
2016;61(12):4164-4169.
240. Bian T, Jiang Y, Jiang ZP. Adaptive dynamic programming for stochastic systems with state and control dependent noise. IEEE Trans
Autom Control. 2016;61(12):4170-4175.
241. Lu W, Zhu P, Ferrari S. A hybrid-adaptive dynamic programming approach for the model-free control of nonlinear switched systems.
IEEE Trans Autom Control. 2016;61(10):3203-3208.
242. Brüggemann S, Possieri C, Poveda JI, Teel AR. Robust constrained model predictive control with persistent model adaptation. Paper
presented at: IEEE Conference on Decision and Control; 2016; Las Vegas, NV.
243. Narendra KS, Wang Y, Mukhopadhay S. Fast reinforcement learning using multiple models. Paper presented at: IEEE Conference on
Decision and Control; 2016; Las Vegas, NV.
244. Zhang T, Xia M, Yi Y. Adaptive neural dynamic surface control of strict-feedback nonlinear systems with full state constraints and
unmodeled dynamics. Automatica. 2017;81:232-239.
245. Song Y, Zhang B, Zhao K. Indirect neuroadaptive control of unknown MIMO systems tracking uncertain target under sensor failures.
Automatica. 2017;77:103-111.
246. Limon D, Calliess J, Maciejowski JM. Learning-based nonlinear model predictive control. IFAC-PapersOnLine. 2017;50(1):7769-7776.
247. Rosolia U, Borrelli F. Learning model predictive control for iterative tasks: a computationally efficient approach for linear system.
IFAC-PapersOnLine. 2017;50(1):3142-3147.
248. Farrell JA, Polycarpou MM. Adaptive Approximation Based Control: Unifying Neural, Fuzzy and Traditional Adaptive Approximation
Approaches. Hoboken, New Jersey: John Wiley & Sons; 2006.
249. Narendra KS, Parthasarathy K. Identification and control of dynamical systems using neural networks. IEEE Trans Neural Netw.
1990;1(1):4-27.
250. Narendra KS, Mukhopadhyay S. Adaptive control using neural networks and approximate models. IEEE Trans Neural Netw.
1997;8(3):475-485.
251. Narendra KS, Lewis FL. Special issue on neural network feedback control. Automatica. 2001;37(8).
252. Bechlioulis CP, Rovithakis GA. Robust adaptive control of feedback linearizable MIMO nonlinear systems with prescribed performance.
IEEE Trans Autom Control. 2008;53(9):2090-2099.
253. Ren B, Ge SS, Tee KP, Lee TH. Adaptive neural control for output feedback nonlinear systems using a barrier Lyapunov function. IEEE
Trans Neural Netw. 2010;21(8):1339-1345.
254. Theodorakopoulos A, Rovithakis GA. A simplified adaptive neural network prescribed performance controller for uncertain MIMO
feedback linearizable systems. IEEE Trans Neural Netw Learn Syst. 2015;26(3):589-600.
255. Si W-J. Adaptive neural control for nonstrict-feedback stochastic nonlinear time-delay systems with input and output constraints. Int J
Adapt Control Signal Process. 2017;31(10):1401-1417.
256. Tehrani RD, Shabaninia F, Khayatian A, Asemani MH. Transient performance improvement in indirect model reference adaptive control
using perturbation-based extremum seeking identifier. Int J Adapt Control Signal Process. 2017;31(8):1152-1161.
257. Yoshida W, Ishii S. Model-based reinforcement learning: a computational model and an fMRI study. Neurocomputing. 2005;63:253-269.
258. Powell WB. Approximate Dynamic Programming: Solving the Curses of Dimensionality. Hoboken, NJ: John Wiley & Sons; 2007.
259. Busoniu L, Babuska R, De Schutter B, Ernst D. Reinforcement learning and dynamic programming using function approximators.
Automation and Control Engineering. Boca Raton, FL: CRC Press; 2010.
260. Jiang Z-P, Jiang Y. Robust adaptive dynamic programming for linear and nonlinear systems: an overview. Eur J Control.
2013;19(5):417-425.
261. Vamvoudakis KG. Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J Automatica
Sin. 2014;1(3):282-293.
262. Vamvoudakis KG, Vrabie D, Lewis FL. Online adaptive algorithm for optimal control with integral reinforcement learning. Int J Robust
Nonlinear Control. 2014;24(17):2686-2710.
263. Polydoros AS, Nalpantidis L. Survey of model-based reinforcement learning: applications on robotics. J Intell Robot Syst.
2017;86(2):153-173.
264. Vamvoudakis KG, Mojoodi A, Ferraz H. Event-triggered optimal tracking control of nonlinear systems. Int J Robust Nonlinear Control.
2017;27(4):596-619.
265. Vamvoudakis KG, Ferraz H. Model-free event-triggered control algorithm for continuous-time linear systems with optimal performance.
Automatica. 2018;87:412-420.
266. Vamvoudakis KG. Q-learning for continuous-time linear systems: a model-free infinite horizon optimal control approach. Syst Control
Lett. 2017;100:14-20.
267. Wang D, He H, Liu D. Adaptive critic nonlinear robust control: a survey. IEEE Trans Cybern. 2017;47(10):3429-3451.
268. Lewis FL, Liu D. Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. Hoboken, New Jersey: John
Wiley/IEEE Press, Computational Intelligence Series; 2012.
269. Werbos PJ. Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DA, eds. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches. New York, NY: Van Nostrand Reinhold; 1992.
270. Kailath T. Linear Systems. Upper Saddle River, NJ: Prentice-Hall Edition; 1980.
271. Ahmed-Ali T, Giri F, Krstić M, Lamnabhi-Lagarrigue F, Burlion L. Adaptive observer for a class of parabolic PDEs. IEEE Trans Autom Control. 2016;61(10):3083-3090.
272. Chen X, Feng Y, Su C-Y. Adaptive control for continuous-time systems with actuator and sensor hysteresis. Automatica. 2016;64:196-207.
273. Ariyur KB, Krstić M. Real-Time Optimization by Extremum-Seeking Control. New York, NY: John Wiley & Sons, Inc; 2003.
274. Scheinker A, Krstić M. Model-Free Stabilization by Extremum Seeking. Cham, Switzerland: Springer; 2016.
275. Leblanc M. Sur l'Électrification des Chemins de fer au Moyen de Courants Alternatifs de Fréquence Élevée. Revue Générale de l'Electricité;
1922.
276. Krstić M. Performance improvement and limitations in extremum seeking. Syst Control Lett. 2000;39:313-326.
277. Ariyur KB, Krstić M. Multivariable extremum seeking feedback: analysis and design. Paper presented at: Proceedings of the Mathematical Theory of Networks and Systems; August 2002; South Bend, IN.
278. Coito F, Lemos J, Alves S. Stochastic extremum seeking in the presence of constraints. IFAC World Congress. 2005;16(1):266-271.
279. Tan Y, Nešić D, Mareels I. On non-local stability properties of extremum seeking control. Automatica. 2006;42:889-903.
280. Nešić D. Extremum seeking control: convergence analysis. Eur J Control. 2009;15(3-4):331-347.
281. Tan Y, Nešić D, Mareels I. On the dither choice in extremum seeking control. Automatica. 2008;44:1446-1450.
282. Tan Y, Nešić D, Mareels I, Astolfi A. On global extremum seeking in the presence of local extrema. Automatica. 2009;45(1):245-251.
283. Rotea M. Analysis of multivariable extremum seeking algorithms. Paper presented at: IEEE American Control Conference; 2000;
Chicago, IL.
284. Stanković MS, Stipanović DM. Extremum seeking under stochastic noise and applications to mobile sensors. Automatica. 2010;46(8):1243-1251.
285. Nešić D, Nguyen T, Tan Y, Manzie C. A non-gradient approach to global extremum seeking: an adaptation of the Shubert algorithm. Automatica. 2013;49(3):809-815.
286. Jones DR, Perttunen CD, Stuckman BE. Lipschitzian optimization without the Lipschitz constant. J Optim Theory Appl.
1993;79(1):157-181.
287. Scheinker A, Krstić M. Maximum-seeking for CLFs: universal semiglobally stabilizing feedback under unknown control directions. IEEE Trans Autom Control. 2013;58:1107-1122.
288. Scheinker A. Simultaneous stabilization and optimization of unknown, time-varying systems. Paper presented at: IEEE American
Control Conference; 2013; Washington, DC.
289. Khong SZ, Nešić D, Tan Y, Manzie C. Unified frameworks for sampled-data extremum seeking control: global optimisation and multi-unit systems. Automatica. 2013;49(9):2720-2733.
290. Noase W, Tan Y, Nešić D, Manzie C. Non-local stability of a multi-variable extremum-seeking scheme. Paper presented at: IEEE Australian Control Conference; November 2011; Melbourne, Australia.
291. Ye M, Hu G. Extremum seeking under input constraint for systems with a time-varying extremum. Paper presented at: IEEE Conference
on Decision and Control; 2013; Florence, Italy.
292. Tan Y, Li Y, Mareels I. Extremum seeking for constrained inputs. IEEE Trans Autom Control. 2013;58(9):2405-2410.
293. Liu S-J, Krstić M. Newton-based stochastic extremum seeking. Automatica. 2014;50(3):952-961.
294. Guay M, Dochain D. A minmax extremum-seeking controller design technique. IEEE Trans Autom Control. 2014;59(7):1874-1886.
295. Poveda JI, Quijano N. Shahshahani gradient-like extremum seeking. Automatica. 2015;58:51-59.
296. Guay M, Moshksar E, Dochain D. A constrained extremum-seeking control approach. Int J Robust Nonlinear Control.
2015;25(16):3132-3153.
297. Guay M, Dochain D. A time-varying extremum-seeking control approach. Automatica. 2015;51:356-363.
298. Guay M, Dochain D. A multi-objective extremum-seeking controller design technique. Int J Control. 2015;88(1):38-53.
299. Guay M. A perturbation-based proportional integral extremum-seeking control approach. IEEE Trans Autom Control.
2016;61(11):3370-3381.
300. Wang L, Chen S, Ma K. On stability and application of extremum seeking control without steady-state oscillation. Automatica.
2016;68:18-26.
301. Liu S-J, Krstić M. Stochastic averaging in discrete time and its applications to extremum seeking. IEEE Trans Autom Control. 2016;61(1):90-102.
302. Radenkovic MS, Altman T. Stochastic adaptive stabilization via extremum seeking in case of unknown control directions. IEEE Trans
Autom Control. 2016;61(11):3681-3686.
303. Radenkovic MS, Altman T. Almost sure convergence of extremum seeking algorithm using stochastic perturbation. Syst Control Lett.
2016;94:133-141.
304. Atta KT, Hostettler R, Brik W, Johansson A. Phasor extremum seeking control with adaptive perturbation amplitude. Paper presented
at: IEEE 55th Conference on Decision and Control; 2016; Las Vegas, NV.
305. Liu S-J, Krstić M, Basar T. Batch-to-batch finite-horizon LQ control for unknown discrete-time linear systems via stochastic extremum seeking. IEEE Trans Autom Control. 2017;62(8):4116-4123.
306. Haring M, Johansen T-A. Asymptotic stability of perturbation-based extremum-seeking control for nonlinear plants. IEEE Trans Autom
Control. 2017;62(5):2302-2317.
307. Poveda J, Vamvoudakis K, Benosman M. A neuro-adaptive architecture for extremum seeking control using hybrid learning dynamics.
Paper presented at: IEEE American Control Conference; 2017; Seattle, WA.
308. Poveda JI, Teel AR. A framework for a class of hybrid extremum seeking controllers with dynamic inclusions. Automatica.
2017;76:113-126.
309. Poveda JI, Teel AR. A robust event-triggered approach for fast sampled-data extremization and learning. IEEE Trans Autom Control.
2017;62(10):4949-4964.
310. Krstić M, Wang H-H. Stability of extremum seeking feedback for general nonlinear dynamic systems. Automatica. 2000;36(4):595-601.
311. Bertsekas D, Tsitsiklis J. Neuro-Dynamic Programming. Belmont, MA: Athena Scientific; 1996.
312. Sutton RS, Barto AG. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press; 1998.
313. Busoniu L, Babuska R, De Schutter B. A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst Man Cybern Part C
Appl Rev. 2008;38(2):156-172.
314. Szepesvári C. Algorithms for Reinforcement Learning. San Rafael, CA: Morgan & Claypool Publishers; 2010.
315. Kormushev P, Calinon S, Caldwell DG. Robot motor skill coordination with EM-based reinforcement learning. Paper presented at: 2010
IEEE/RSJ International Conference on Intelligent Robots and Systems; 2010; Taipei, Taiwan.
316. Farahmand A-M. Regularization in Reinforcement Learning [Dissertation]. Edmonton, Canada: University of Alberta; 2011.
317. Geramifard A, Walsh JT, Tellex S, Chowdhary G, Roy N, How J. A tutorial on linear function approximators for dynamic programming
and reinforcement learning. Found Trends Mach Learn. 2013;6(4):375-451.
318. Dann C, Neumann G, Peters J. Policy evaluation with temporal differences: a survey and comparison. J Mach Learn Res.
2014;15(1):809-883.
319. Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA. Deep reinforcement learning: a brief survey. IEEE Signal Process Mag.
2017;34(6):26-38.
320. Luo B, Liu D, Wu HN, Wang D, Lewis FL. Policy gradient adaptive dynamic programming for data-based optimal control. IEEE Trans
Cybern. 2017;47(10):3341-3354.
321. Thorndike EL. Animal Intelligence: Experimental Studies. New York, NY: The Macmillan Company; 1911.
322. Dracopoulos D. Evolutionary Learning Algorithms for Neural Adaptive Control. London, UK: Springer; 2013.
323. Prabhu SM, Garg DP. Artificial neural network based robot control: an overview. J Intell Robot Syst. 1996;15(4):333-365.
324. Martinetz T, Schulten K. A neural network for robot control: cooperation between neural units as a requirement for learning. Comput
Electr Eng. 1993;19(4):315-332.
325. Levine S. Exploring deep and recurrent architectures for optimal control. Paper presented at: 2013 Neural Information Processing Systems
(NIPS) Workshop on Deep Learning; 2013; Lake Tahoe, CA.
326. Wang Z, Liu Z, Zheng C. Qualitative Analysis and Control of Complex Neural Networks with Delays. Vol. 34. Berlin, Germany:
Springer-Verlag; 2016.
327. Lv Y, Na J, Yang Q, Wu X, Guo Y. Online adaptive optimal control for continuous-time nonlinear systems with completely unknown
dynamics. Int J Control. 2016;89(1):99-112.
328. Tanaskovic M, Fagiano L, Novara C, Morari M. Data-driven control of nonlinear systems: an on-line direct approach. Automatica.
2017;75:1-10.
329. Teixeira FC, Quintas J, Maurya P, Pascoal A. Robust particle filter formulations with application to terrain-aided navigation. Int J Adapt
Control Signal Process. 2017;31(4):608-651.
330. Bristow DA, Tharayil M, Alleyne AG. A survey of iterative learning control. IEEE Control Syst. 2006;26(3):96-114.
331. Moore KL. Iterative learning control: an expository overview. In: Applied and Computational Control, Signals, and Circuits. Boston, MA:
Springer Birkhäuser; 1999:151-214.
332. Ahn H-S, Chen Y, Moore KL. Iterative learning control: brief survey and categorization. IEEE Trans Syst Man Cybern Part C.
2007;37(6):1099-1121.
333. Owens DH, Feng K. Parameter optimisation in iterative learning control. Int J Control. 2003;76(11):1059-1069.
334. Xu J-X, Yan R. On initial conditions in iterative learning control. IEEE Trans Autom Control. 2005;50(9):1349-1354.
335. Tan Y, Yang SPD, Xu JX. On P-type iterative control for nonlinear systems without global Lipschitz continuity condition. Paper presented
at: 2015 IEEE American Control Conference; 2015; Chicago, IL.
336. Lin T, Owens DH, Hätönen J. Newton method based iterative learning control for discrete non-linear systems. Int J Control.
2006;79(10):1263-1276.
337. Khong SZ, Nešić D, Krstić M. Iterative learning control based on extremum seeking. Automatica. 2016;66:238-245.
338. Owens DH, Hätönen J. Iterative learning control - an optimization paradigm. Annu Rev Control. 2005;29(1):57-70.
339. Xu J-X. A survey on iterative learning control for nonlinear systems. Int J Control. 2011;84(7):1275-1294.
340. Owens DH. Iterative Learning Control: An Optimization Paradigm. London, UK: Springer; 2015.
341. Zhang R, Hou Z, Chi R, Ji H. Adaptive iterative learning control for nonlinearly parameterised systems with unknown time-varying
delays and input saturations. Int J Control. 2015;88(6):1133-1141.
342. Shen D, Zhang W, Xu J-X. Iterative learning control for discrete nonlinear systems with randomly iteration varying lengths. Syst Control
Lett. 2016;96:81-87.
343. Shen D, Zhang W, Wang Y, Chien C-J. On almost sure and mean square convergence of P-type ILC under randomly varying iteration
lengths. Automatica. 2016;63:359-365.
344. Farahmand A-M, Benosman M. Towards stability in learning-based control: A Bayesian optimization-based adaptive controller. Paper
presented at: 2017 Multi-Disciplinary Conference on Reinforcement Learning and Decision Making (RLDM); 2017; Ann Arbor, MI.
345. Berkenkamp F, Turchetta M, Schoellig AP, Krause A. Safe model-based reinforcement learning with stability guarantees. Paper presented
at: 2017 Conference on Neural Information Processing Systems (NIPS); 2017; Long Beach, CA.
346. Porikli F, Shan S, Snoek C, Sukthankar R, Wang X. Deep learning for visual understanding: recent advances: part 1. IEEE Signal Process
Mag. 2017;34(6).
347. Porikli F, Shan S, Snoek C, Sukthankar R, Wang X. Deep learning for visual understanding: recent advances: part 2. IEEE Signal Process
Mag. 2017;35(1).
348. Fawzi A, Moosavi-Dezfooli S-M, Frossard P. The robustness of deep networks: a geometrical perspective. IEEE Signal Process Mag.
2017;34(6):50-62.
349. Lamnabhi-Lagarrigue F, Annaswamy A, Engell S, et al. Systems and control for the future of humanity, research agenda: current and
future roles, impact and grand challenges. Annu Rev Control. 2017;43:1-64.
350. Saxe AM, McClelland JL, Ganguli S. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. Paper
presented at: 6th International Conference on Learning Representations; 2014; Edinburgh, Scotland.
351. Ollivier Y. Riemannian metrics for neural networks I: feedforward networks. Inf Inference: J IMA. 2015;4(2):108-153.
352. Ollivier Y. Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences. Inf Inference: J IMA.
2015;4(2):154-193.
353. Hauser M, Ray A. Principles of Riemannian geometry in neural networks. Paper presented at: 31st Conference on Neural Information
Processing Systems (NIPS 2017); 2017; Long Beach, CA.
354. Chaudhari P, Oberman A, Osher S, Soatto SG, Carlier G. Deep relaxation: partial differential equations for optimizing deep neural
networks. arXiv preprint arXiv:1704.04932; 2017.
355. Vidal R, Bruna J, Giryes R, Soatto S. Mathematics of deep learning. Paper presented at: 2017 IEEE Conference on Decision and Control;
2017; Melbourne, Australia.
356. Goebel R, Sanfelice RG, Teel AR. Hybrid dynamical systems. IEEE Control Syst Mag. 2009;29(2):28-93.
357. Panait L, Luke S. Cooperative multi-agent learning: the state of the art. Auton Agent Multi-Agent Syst. 2005;11(3):387-434.
358. Dörfler F, Bullo F. Synchronization in complex networks of phase oscillators: a survey. Automatica. 2014;50(6):1539-1564.
359. Poveda J, Benosman M, Teel AR. Hybrid online learning control in networked multi-agent systems: a survey. Int J Adapt Control Signal
Process. 2018. In press.
How to cite this article: Benosman M. Model-based vs data-driven adaptive control: An overview. Int J Adapt Control Signal Process. 2018:1–24. https://doi.org/10.1002/acs.2862
... As is well known, the Lyapunov method is an efficient and powerful tool for stability analysis and synthesis of control systems. The Lyapunov-type theorems have been developed for stability analysis and application to feedback stabilization of various systems [4,13,14,24,28,37,40,66]. In the enormous literature, there is a number of Lyapunovtype theorems on stability and feedback stabilization of impulsive systems [12,20,31,58,74]. ...
... Clearly, the generalized canonical form (87) has a much wider range of applications. The proposed CPS theory may be adapted for various control strategies such as saturated control [12], adaptive control [4] and model predictive control [17]. Particularly, over past a few decades, many investigations have been conducted into fault-tolerant control systems, fault detection and diagnosis, and reconfigurable control [22,56,59,64,67,70]. ...
... It would also privide a solid base for the development of CPS theory for sampled-data (fault-tolerant) control combined with learning algorithms in the future. It is of great theoretic and practical importance to develop CPS theory for computer control systems by datadriven approaches [59,73] and also for those by combined model-based and data-driven approaches [4]. Just name a few among future work to develop the systems science of design for CPS. ...
Article
Full-text available
Inspired by the cyber-physical systems (CPS) of numerical methods for stochastic differential equations, we present a CPS model of sampled-data control systems (typically a synonym for computer control systems), which regards the intersection of the physical and the cyber (the key feature of CPS). As a theoretic foundation, we develop by the Lyapunov method a stability theory for a general class of stochastic impulsive differential equations (SiDE) which is formulated as a canonical form for CPS that may work in feedback loops and thus include those of sampled-data control systems. Applying the fundamental theory, we study stability of the CPS, which implies that of the sampled-data control system. By our CPS approach, we not only obtain stability criteria for the CPS of sampled-data control systems but also reveal the equivalence and intrinsic relationship between the two main approaches (viz. controller emulation and discrete-time approximation) in the literature. As the applications of our CPS theory, we propose a control design method for feedback stabilization of the CPS of sampled-data stochastic systems. Illustrative examples are conducted to verify that our method significantly improves the existing results. In this paper, we initiate the study of a systems science of design for CPS. This provokes many open and interesting problems.
... Adaptive control belongs to crucial control methodology to deal with time-varying model parameters [2,11]. In addition, it can be seamlessly integrated with various control algorithms in the SBW systems [12][13][14][15][16][17][18][19]. ...
Article
Full-text available
In this paper, an adaptive active disturbance rejection control is newly designed for precise angular steering position tracking of the uncertain and nonlinear SBW system with time delay communications. The proposed adaptive active disturbance rejection control comprises the following two elements: (1) An adaptive extended state observer and (2) an adaptive state error feedback controller. The adaptive extended state observer with adaptive gains is employed for estimating the unmeasured velocity, acceleration, and compound disturbance which consists of system parameter uncertainties, nonlinearities, exterior disturbances, and time delay in which the observer gains are dynamically adjusted based on the estimation error to enhance estimation performances. Based on the accurate estimations of the adaptive extended state observer, the proposed adaptive full state error feedback controller is equipped with variable gains driven by the tracking error to develop control precision. The integration of the advantages of the adaptive extended state observer and the adaptive full state error feedback controller can improve the dynamic transient and static steady-state effectiveness, respectively. To assess the superior performance of the proposed adaptive active disturbance rejection control, a comparative analysis is conducted between the proposed control scheme and the classical active disturbance rejection control in two different cases. It is worth noting that the active disturbance rejection control serves as a benchmark for evaluating the performance of the proposed control approach. The results from the comparison studies executing two simulated cases validate the superiority of the suggested control, in which estimation, tracking response rate, and steering angle precision are greatly improved by the scheme proposed in this article.
... A forgetting-data MRE and mixing (MREM) method was proposed in [36], where regressor mixing means that the N-dimensional extended regression equation (9) is multiplied by adj(Θ) to generate a set of N decoupled scalar equations in the N unknown parameters. Asymptotic parameter convergence is established under a non-square-integrability condition, which is weaker than persistent excitation (PE), but exponential parameter convergence still depends on the PE condition in [36]. ...
Article
Online data memory is essential for adaptive estimation and control, as it can enhance the performance and robustness of adaptive systems relative to those without data memory. We provide a historical overview of four data memory-driven parameter estimation schemes for adaptive systems: forgetting-data memory regression extension (MRE), full-data MRE, discrete-data MRE, and interval-data MRE. For clarity of presentation, a general class of nonlinear systems with linear-in-the-parameters uncertainties is used as a unifying framework to demonstrate the motivation, synthesis, and characteristics of each MRE scheme for parameter estimation in adaptive control. Extensive comparisons of the four MRE schemes reveal their technical natures, and real-world applications are discussed to show their practicability. It is concluded that all the MRE schemes can achieve exponential parameter convergence under relaxed excitation conditions, rather than under the classical condition of persistent excitation, which is too stringent to satisfy in practice. The distinctive features of interval-data MRE, termed composite learning, are highlighted with respect to computational simplicity, estimation accuracy, robustness against perturbations, and widespread real-world applications to robot learning and control. Possible directions for future research in this area are suggested to conclude the survey.
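The regressor extension and mixing mechanism mentioned above can be sketched in a few lines of Python. The example below is illustrative only (a two-parameter linear regression, a simple delay-based extension, and arbitrary gains; it is not any particular scheme from the survey): delayed copies of the regression are stacked into a matrix equation, and multiplying by the adjugate yields decoupled scalar regressions, one per parameter.

```python
# Illustrative sketch: delay-based regressor extension and mixing for y = phi^T theta.
import numpy as np

theta = np.array([2.0, -1.0])            # true parameters (unknown to the estimator)
theta_hat = np.zeros(2)
gamma, dt = 20.0, 1e-3                   # adaptation gain and integration step
steps, lag = int(10.0 / dt), 200         # horizon and extension delay (0.2 s)

phi_hist, y_hist = [], []
for i in range(steps):
    t = i * dt
    phi = np.array([np.sin(t), np.cos(2.0 * t)])   # regressor
    y = phi @ theta                                 # scalar measurement
    phi_hist.append(phi); y_hist.append(y)
    if i >= lag:
        Phi = np.vstack([phi, phi_hist[i - lag]])   # extended (stacked) regressor
        Y = np.array([y, y_hist[i - lag]])
        # mixing: adj(Phi) @ Phi = det(Phi) * I, so adj(Phi) @ Y = det(Phi) * theta
        adjPhi = np.array([[Phi[1, 1], -Phi[0, 1]],
                           [-Phi[1, 0], Phi[0, 0]]])
        delta = Phi[0, 0] * Phi[1, 1] - Phi[0, 1] * Phi[1, 0]
        Ymix = adjPhi @ Y
        # independent scalar gradient law per parameter; convergence needs
        # delta to be non-square-integrable, a condition weaker than PE
        theta_hat += dt * gamma * delta * (Ymix - delta * theta_hat)
print("estimate:", theta_hat)                       # approaches [2, -1]
```

The point of the mixing step is visible in the update law: each component of `theta_hat` evolves independently, driven by the same scalar excitation signal `delta`.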
Article
Today, the control of complex artificial and natural objects has come to the fore in most subject domains. The efficiency and effectiveness of solving these control tasks depend directly on the efficiency and effectiveness of data fusion (DF). Data fusion methods are designed to integrate data from multiple sources and transform them to produce information that is more consistent, accurate, and useful than that provided by any individual data source. Although DF has been studied extensively for a considerable period of time, it remains hard to apply in practice to the control of real-world objects with complex structure and behavior, because the data produced by such objects are, in the majority of cases, heterogeneous, multimodal, imperfect, and of huge volume. To ensure a proper response to changes in the state and behavior of controlled objects, which can be caused by both internal and external influencing factors, the data should be processed with high accuracy and minimal delay. Despite the importance of complex object control, no existing study clarifies to what extent the DF problem has been solved from the perspective of its application to object control based on data received from the objects. In this survey, we define the requirements for DF in the interest of controlling complex artificial and natural objects, consider the structure of the multilevel process of intelligent object control, and identify the neural networks (NNs) that can be used for data fusion in the control process. Despite the wide capabilities of existing NNs, we find that they still do not meet all the requirements for DF in complex object control. Based on an analysis of NN architectures, we define requirements for advanced NN architectures and discuss future research directions. To facilitate our literature analysis, we also perform a conceptual exploration of the collected papers with lattices of closed itemsets and implications from Formal Concept Analysis and Data Mining, as used for knowledge processing in similar large-scale studies.
Chapter
This paper aims to design a neural network-based model reference intelligent adaptive control for a quadrotor UAV and to implement a machine learning approach for recognizing the severity level of yellow wheat rust in precision agriculture. Yellow wheat rust is a fungal disease that can cause massive losses in wheat production and quality. Obtaining accurate data from large-scale crops and detecting such diseases against specific standards via visual inspection is labor-intensive, time-consuming, and prone to human error. Addressing these issues involves deploying a quadrotor for data acquisition and training a cutting-edge convolutional neural network (CNN) for image analysis. Since existing control techniques are computationally intensive and tolerate unmatched uncertainty poorly, the proposed controller is designed using a nested control approach. In this control architecture, feedforward neural networks are trained to estimate the position controller parameters online, whereas recurrent neural networks are trained to estimate the model and control the attitude of the quadrotor. An Xception CNN is then trained using a transfer learning approach. To verify controller performance, numerical simulations were conducted in various scenarios. The results show that the designed controller achieves high tracking precision, robustness, and enhanced anti-disturbance ability both in nominal scenarios and in the presence of matched and unmatched uncertainties, and the retrained model achieves an accuracy of 97.28%. The suggested controller is therefore a promising quadrotor control technique, and the retrained Xception model can be used for detecting the severity level of yellow wheat rust.
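The transfer-learning step can be sketched as follows in Python with Keras; the class count, dataset pipeline, and training schedule are placeholders, since the chapter does not specify them here. The pretrained backbone comes from `tf.keras.applications.Xception`:

```python
# Sketch under stated assumptions: freeze a pretrained Xception backbone and
# train a small classification head to grade rust severity.
import tensorflow as tf

NUM_CLASSES = 4                      # hypothetical number of severity levels

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False               # freeze the pretrained feature extractor

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds / val_ds would be tf.data pipelines of (image, label) batches, e.g.
# built with tf.keras.utils.image_dataset_from_directory and preprocessed with
# tf.keras.applications.xception.preprocess_input:
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```

Freezing the backbone first and fine-tuning only the head is the standard way to exploit ImageNet features when the labeled crop-disease dataset is comparatively small.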
Article
This survey paper studies deterministic control systems that integrate three of the most active research areas of recent years: (1) online learning control systems, (2) distributed control of networked multiagent systems, and (3) hybrid dynamical systems (HDSs). Interest in these types of systems has been motivated mainly by two developments: first, cheap, massive computational power and advanced communication technologies, which make it possible to carry out large computations in complex networked systems; and second, the recent development of a comprehensive theory for HDSs that integrates continuous-time and discrete-time dynamical systems in a unified manner, thus providing a unifying modeling language for complex learning-based control systems. In this paper, we aim to give a comprehensive survey of the current state of the art in online learning control in multiagent systems, presenting an overview of the different types of problems that can be addressed, as well as the most representative control architectures found in the literature. These control architectures are modeled as HDSs, which include continuous-time and discrete-time dynamical systems as special cases. We highlight the advantages and limitations of existing results, as well as some interesting potential future directions and open problems.
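The flow/jump formalism underlying HDSs can be illustrated with a toy sampled-data loop written as a hybrid system; this is a hypothetical example of the modeling language, not an architecture from the survey. The plant state and held input flow between samples, while a timer reaching the sampling period triggers the jump that refreshes the feedback:

```python
# Toy hybrid system: state flows via f in the flow set C (tau < h) and
# jumps via g in the jump set D (tau >= h).
h, dt, T = 0.2, 1e-3, 3.0        # sampling period, flow step, horizon
x, u, tau = 1.0, 0.0, h          # plant state, held control, timer (jump at t=0)

for _ in range(int(T / dt)):
    if tau >= h:                 # jump set D: the timer has expired
        u, tau = -2.0 * x, 0.0   # jump map g: refresh feedback, reset timer
    x += dt * (x + u)            # flow map f: plant x' = x + u with held u
    tau += dt                    # the timer flows at unit rate

print(f"|x(T)| = {abs(x):.3f}")  # decays when the gain and period match
```

Writing the loop this way makes the continuous dynamics (flow map) and the discrete update (jump map) first-class citizens of one model, which is exactly what the unified HDS theory exploits.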
Chapter
Indirect adaptive control is a widely applicable adaptive control strategy. In real time, it combines closed-loop estimation of the plant model parameters with redesign of the controller. Adaptive pole placement and its robustified version, together with adaptive generalized predictive control, constitute the core of the chapter. Adaptive linear quadratic control is also presented. The application of these various strategies to the indirect adaptive control of a flexible transmission illustrates the methodology presented in this chapter.
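A minimal Python sketch of the indirect scheme, assuming a first-order discrete-time plant and illustrative numbers (not the chapter's flexible-transmission example): recursive least squares with forgetting estimates the plant parameters in closed loop, and a pole-placement control law is redesigned from the current estimates at every step.

```python
# Illustrative sketch: RLS estimation of y[k+1] = a*y[k] + b*u[k] combined
# with per-step pole-placement redesign (indirect adaptive control).
import numpy as np

a, b = 0.95, 0.5                      # true (unknown) plant parameters
a_m = 0.5                             # desired closed-loop pole
theta = np.array([0.5, 1.0])          # initial estimates [a_hat, b_hat]
P = 100.0 * np.eye(2)                 # RLS covariance
lam = 0.98                            # forgetting factor (tracks slow drift)

y, rng = 0.0, np.random.default_rng(1)
for k in range(300):
    r = 1.0 if (k // 50) % 2 == 0 else -1.0        # square-wave reference
    a_hat, b_hat = theta
    b_hat = max(b_hat, 0.1)                        # crude projection: avoid division by ~0
    u = (a_m * (y - r) + r - a_hat * y) / b_hat    # pole-placement redesign
    y_next = a * y + b * u + 1e-3 * rng.normal()   # plant step (small noise)
    phi = np.array([y, u])                         # RLS regressor
    K = P @ phi / (lam + phi @ P @ phi)
    theta = theta + K * (y_next - phi @ theta)     # parameter update
    P = (P - np.outer(K, phi @ P)) / lam           # covariance update
    y = y_next

print("estimates:", theta)            # approaches [0.95, 0.5]
```

The square-wave reference keeps the closed loop excited enough for the estimates to converge, which is the usual practical caveat of indirect schemes.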
Article
To overcome the drawback of overparametrization in existing nonlinear adaptive control designs with multiple unknown control directions, we propose a new algorithm that combines nonlinear integrator backstepping, tuning function design, and a logic-based switching mechanism that tunes the control directions online in a switching manner. Global asymptotic tracking is achieved for parametric-strict-feedback systems without overparametrization. The logic-based switching criterion monitors the incremental error incurred between two consecutive switching moments and can thus identify the true control direction quickly.
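The switching idea can be illustrated on a scalar toy plant; the thresholds, gains, and plant below are hypothetical, and the paper's design for parametric-strict-feedback systems is considerably more involved. The controller flips its guessed control direction whenever the incremental error since the last switching moment exceeds an allowed growth factor:

```python
# Toy sketch: logic-based switching for an unknown control direction in
# x' = x + g*u, where sign(g) is unknown to the controller.
g = -2.0                        # true control gain: negative direction, unknown
k, dt, T = 5.0, 1e-3, 4.0       # feedback gain, integration step, horizon
mu = 1.5                        # allowed growth factor between switches

x, s = 1.0, +1                  # state and current direction guess (wrong here)
x_mark, switches = abs(x), 0    # |x| recorded at the last switching moment
for _ in range(int(T / dt)):
    u = -s * k * x              # certainty-equivalence feedback with guessed sign
    x += dt * (x + g * u)
    if abs(x) > mu * x_mark + 1e-6:   # incremental error too large: flip direction
        s, x_mark, switches = -s, abs(x), switches + 1

print(f"switches = {switches}, final |x| = {abs(x):.2e}")  # stabilized after flipping
```

When the guess is wrong, the closed loop is unstable and `|x|` quickly breaches the growth bound, triggering a flip; once the direction is correct, the monitoring condition is never violated again, so the switching stops in finite time.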