Iterative Learning Control With
Data-Driven-Based Compensation
Shaoying He, Wenbo Chen, Dewei Li, Yugeng Xi, Senior Member, IEEE, Yunwen Xu, Member, IEEE, and Pengyuan Zheng, Member, IEEE
Abstract—Robust iterative learning control (RILC) can deal with systems with unknown time-varying uncertainty to track a repeated reference signal. However, the existing robust designs consider all possibilities of the uncertainty, which makes the design conservative and causes the controlled process to converge to the reference trajectory slowly. To eliminate this weakness, a data-driven method is proposed. The new design employs more information from the past input–output data to compensate the robust control law and thereby improve performance. The proposed control law is proved to guarantee convergence and to accelerate the convergence rate. Finally, experiments on a robot manipulator verify the good convergence of the trajectory errors under the proposed method.
Index Terms—Data-driven method, iterative learning control (ILC), robot manipulator, robustness.
I. INTRODUCTION
AS THE name suggests, iterative learning control (ILC) is an algorithm that imitates the human learning process to improve the performance of a controller. It aims to improve the tracking performance and the precision of a repeated task under the same executing conditions by learning from previous executions. The strategy of iterative learning for repeating a given task with high precision was first proposed by Uchiyama [1] and was later mathematically formulated by Arimoto et al. [2]. Since the ILC task is to track a specific command trajectory repeatedly [3], [4], the systems under the ILC framework all have repetitive operations [5], such as trajectory tracking [6], [7], robot manipulators [8], subway train systems [9], agent formation [10], chemical processes [11], turbine control [12], motor control [13], and so on.
Manuscript received July 16, 2020; revised November 5, 2020; accepted November 24, 2020. This work was supported in part by the National Key Research and Development Project under Grant 2018YFB1305902; in part by the National Science Foundation of China under Grant 61973214 and Grant 61963030; and in part by the Natural Science Foundation of Shanghai under Grant 19ZR1476200. This article was recommended by Associate Editor Z.-G. Hou. (Corresponding author: Yunwen Xu.)
Shaoying He, Wenbo Chen, Dewei Li, Yugeng Xi, and Yunwen Xu are with the Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: shaoyinghe@foxmail.com; chenwenbo860@126.com; dwli@sjtu.edu.cn; ygxi@sjtu.edu.cn; willing419@sjtu.edu.cn).
Pengyuan Zheng is with the College of Automation Engineering, Shanghai University of Electric Power, Shanghai 200090, China (e-mail: pyzheng@shiep.edu.cn).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCYB.2020.3041705.
Digital Object Identifier 10.1109/TCYB.2020.3041705
By making full use of the repeated properties of the iteration process, ILC can keep updating the control inputs and gradually eliminate the tracking error [14].
The typical ILC updates the control law using the information from the previous error and control sequences so that the output trajectory converges asymptotically to the reference. Hillenbrand and Pandit [15] introduced an ILC law for linear time-invariant system models that guarantees an exponential rate of convergence by reducing the sampling rate. Longman [16] presented general iterative control laws together with parameter-tuning guidelines for the practicing control engineer. A design strategy for ILC based on optimal control theory was given in [17]. Xiong et al. [18] proposed an optimal ILC with model predictive control (MPC) for linear time-varying models without modeling error. These strategies have achieved great success, but model uncertainty was not considered: if uncertainty exists in the model, the convergence of the system cannot be rigorously guaranteed.
To eliminate the effect of model uncertainty on convergence, Gao et al. [19] proposed a robust ILC (RILC) that rejects bounded uncertainty and disturbance by a robust selection of the weighting matrices. An ILC based on the singular value decomposition of the lifted system matrix, together with guidelines on how to tune the learning gains to reduce the effect of model errors, was introduced in [20]. Tayebi and Zaremba [21] proposed an RILC design procedure for uncertain linear time-invariant systems by the μ-analysis and synthesis approach. Later, Shi et al. [22] formulated the robust design as matrix inequality conditions, which can be solved by an algorithm based on linear matrix inequalities (LMIs). Meng et al. [23] proposed an RILC for time-delay systems with uncertainty by an LMI approach. Besides, an RILC for the polytopic uncertain system was proposed in [24]. In the above methods, the control laws were designed only for feasibility, not optimality. Thus, the performance weighting in the control law may have to be chosen large, and the achievable performance is limited.
For better performance, an RILC design framework with an efficient combination of LMIs and an appropriate parameter optimization was proposed in [25], which extended the LMI approach by minimizing a suitable cost function and led to an improved convergence rate. Considering that ILC can be treated as a special class of 2-D model, which is advantageous for investigating ILC, Wang et al. [26] designed a robust iterative learning MPC based on the 2-D model, and Hladowski et al. [27] designed an RILC for a 2-D
model with a robustness analysis. A robust norm-optimal ILC design was introduced in [28]; it takes the robust convergence conditions as constraints in the optimization problem, which maximizes the available performance while ensuring convergence. However, the existing robust ILC designs consider all possibilities of the uncertainty, which makes the design conservative and causes the controlled process to converge to the reference trajectory slowly.
Since the input–output data of a system directly reflect the system characteristics, data-driven methods, such as identification methods, can obtain more information than robust methods and thereby reduce model uncertainty. Identification methods have been used in much research to compensate the ILC and improve the design performance. Janssens et al. [29] proposed a data-driven norm-optimal ILC whose key contribution was the estimation of the system impulse response for the controller design using input–output data from previous iterations, but full rank of the past trial input sequences was required. Chi et al. used the past input–output data to identify the lifted model and proposed a convergence condition for model updating [30], [31], but the identification result is sensitive to the initial parameters, so their choice is very important. The above two methods can obtain an accurate model as the number of iterations and the amount of past data increase, but the system identification algorithms need a large amount of past trial input–output data, which makes the system converge slowly. Thus, Li et al. [32] proposed a novel data-driven method for ILC combined with MPC, in which past data are directly used to update the control output, avoiding identification.
To overcome the disadvantages of RILC and adaptive ILC, we design a data-driven method for ILC that employs a special optimal linear combination of the past input sequences to compensate a traditional RILC law, which can be developed based on a rough model. Since the proposed method needs no model identification, we call it ILC with data-driven-based compensation (ILC-DDC). The main contribution of this article is that the data-driven module is embedded into the traditional RILC to overcome its conservatism and improve performance. Besides, the convergence of the proposed algorithm is strictly guaranteed in theory, and it is also proved that the proposed data-driven method accelerates the convergence rate of the classical RILC.
This article is organized as follows. Section II introduces the considered system model and the classical RILC. Section III presents the design procedure of the proposed data-driven compensation method for ILC in detail. Subsequently, the proof of the monotonic convergence of the proposed method and the analysis of the convergence rate compared with RILC are given in Section IV. Finally, experiments on a six-degree-of-freedom manipulator that test the proposed method are reported in Section V.
II. PROBLEM FORMULATION
In this section, we first present the considered system model and then introduce the classical RILC method as well as its convergence condition. The notations of the symbols used in this article are presented in Table I.

TABLE I
NOTATIONS OF SYMBOLS
Consider a discrete linear time-varying system model in the repeated iterative process:

$$x(k,t+1) = A(t)x(k,t) + B(t)u(k,t) \tag{1}$$
$$y(k,t) = Cx(k,t) \tag{2}$$

where $k$ and $t$ are the trial number and the discrete-time instant, respectively. The time instants of each trial are denoted by $t \in \{0, 1, \ldots, N-1\}$, where $N$ is the number of samples in a trial. $x(k,t) \in \mathbb{R}^n$ is the measurable system state, $u(k,t) \in \mathbb{R}^m$ is the control input, and $y(k,t) \in \mathbb{R}^h$ ($h \le m$) is the system output at time $t$ of the $k$th trial. $A(t)$ and $B(t)$ are the system dynamic matrices at time $t$, but they are independent of the trial number. $C$ is a constant matrix. We assume that $(A, C)$ is observable and $(A, B)$ is controllable. $A$ and $B$ can be written as

$$A(t) = A_0(t) + \Delta A(t) \tag{3}$$
$$B(t) = B_0(t) + \Delta B(t) \tag{4}$$

where $A_0(t)$ and $B_0(t)$ are the known model matrices of the system, and $\Delta A(t)$ and $\Delta B(t)$ are unknown and time-varying within one trial. Note that $\Delta A(t)$ and $\Delta B(t)$ are iteration invariant. The control goal is to steer the system output to track a reference trajectory $\bar{\gamma}(k)$. In addition, the initial conditions of all repetition trials are the same; without loss of generality, we set $x(1,0) = \cdots = x(k,0)$ and $y(1,0) = \cdots = y(k,0)$.
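For concreteness, a one-trial simulation of (1) and (2) can be sketched as below; the helper name simulate_trial and the list-of-matrices representation of $A(t)$ and $B(t)$ are our own conventions, not from the paper.

```python
import numpy as np

def simulate_trial(A, B, C, x0, u):
    """Simulate one trial of the time-varying system (1)-(2).

    A, B are lists of matrices A(t), B(t) for t = 0..N-1; C is constant.
    u has shape (N, m); returns the outputs y(k,1..N), shape (N, h).
    """
    x = x0.copy()
    ys = []
    for t in range(len(u)):
        x = A[t] @ x + B[t] @ u[t]   # state update (1)
        ys.append(C @ x)             # output (2), i.e., y(k, t+1)
    return np.array(ys)
```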
According to the system model (1)–(4), the states of the trial process at the $k$th trial can be characterized by

$$x(k,1) = A(0)x(k,0) + B(0)u(k,0)$$
$$\vdots$$
$$x(k,i+1) = A|_0^i x(k,0) + B(i)u(k,i) + \sum_{j=1}^{i} A|_j^i B(j-1)u(k,j-1) \tag{5}$$

and the corresponding outputs are

$$y(k,1) = CA(0)x(k,0) + CB(0)u(k,0)$$
$$\vdots$$
$$y(k,i+1) = CA|_0^i x(k,0) + CB(i)u(k,i) + C\sum_{j=1}^{i} A|_j^i B(j-1)u(k,j-1) \tag{6}$$

where $A|_j^i = \prod_{g=j}^{i} A(g)$.
Therefore, the output sequence at the $k$th trial can be expressed in vector form as

$$\bar{Y}(k) = Px(k,0) + G\bar{u}(k) = Px(k,0) + (G_0 + \Delta G)\bar{u}(k) \tag{7}$$

where $\bar{Y}(k) \triangleq [y^T(k,1), y^T(k,2), \cdots, y^T(k,N)]^T \in \mathbb{R}^{Nh}$, $\bar{u}(k) \triangleq [u^T(k,0), u^T(k,1), \cdots, u^T(k,N-1)]^T \in \mathbb{R}^{Nm}$

$$P \triangleq \begin{bmatrix} CA|_0^0 \\ CA|_0^1 \\ \vdots \\ CA|_0^{N-1} \end{bmatrix}, \quad
G \triangleq \begin{bmatrix} CB(0) & O & \cdots & O \\ CA|_1^1 B(0) & CB(1) & \ddots & \vdots \\ \vdots & \ddots & \ddots & O \\ CA|_1^{N-1} B(0) & \cdots & \cdots & CB(N-1) \end{bmatrix}$$

$$G_0 \triangleq \begin{bmatrix} CB_0(0) & O & \cdots & O \\ CA_0|_1^1 B_0(0) & CB_0(1) & \ddots & \vdots \\ \vdots & \ddots & \ddots & O \\ CA_0|_1^{N-1} B_0(0) & \cdots & \cdots & CB_0(N-1) \end{bmatrix}$$

$A_0|_j^i = \prod_{g=j}^{i} A_0(g)$. $P \in \mathbb{R}^{Nh \times n}$ and $G \in \mathbb{R}^{Nh \times Nm}$ are the system matrices, $G_0 \in \mathbb{R}^{Nh \times Nm}$ is the certain part of $G$, determined by $A_0(t)$ and $B_0(t)$, and $\Delta G \triangleq G - G_0$ is the uncertainty in the model, affected by $\Delta A(t)$ and $\Delta B(t)$.
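The block-lower-triangular structure of the lifted model makes $P$ and $G_0$ easy to assemble from the nominal matrices. Below is a minimal sketch under our own conventions (the paper gives no code): A0 and B0 are Python lists holding $A_0(t)$ and $B_0(t)$ for $t = 0, \ldots, N-1$.

```python
import numpy as np

def lifted_matrices(A0, B0, C, N):
    """Assemble P and G0 of the lifted model (7) from the nominal
    model matrices A0(t), B0(t) (lists of length N) and constant C."""
    n = A0[0].shape[0]
    h, m = C.shape[0], B0[0].shape[1]
    P = np.zeros((N * h, n))
    G0 = np.zeros((N * h, N * m))
    Phi = np.eye(n)                        # running product of A0
    for i in range(N):                     # block row i holds y(k, i+1)
        Phi = A0[i] @ Phi                  # A0|_0^i = A0(i)...A0(0)
        P[i*h:(i+1)*h, :] = C @ Phi
        Prod = np.eye(n)
        for j in range(i, -1, -1):         # influence of u(k, j) on y(k, i+1)
            G0[i*h:(i+1)*h, j*m:(j+1)*m] = C @ Prod @ B0[j]
            Prod = Prod @ A0[j]            # extend product A0|_{j+1}^{i}
    return P, G0
```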
Without loss of generality, for the uncertain matrix $\Delta G$, we make the following assumption, which is also used in [25].

Assumption 1: The uncertain matrices $\Delta A(t)$ and $\Delta B(t)$ are assumed to belong to a convex bounded uncertainty domain $D$:

$$D = \left\{ [\Delta A(t), \Delta B(t)] = \sum_{j=1}^{L} a_j [\Delta A_j, \Delta B_j];\ \sum_{j=1}^{L} a_j = 1;\ a_j \ge 0 \right\} \tag{8}$$

where $\Delta A_j$ and $\Delta B_j$ are the vertices and $L$ denotes the corresponding number of vertices.

Therefore, the uncertain system matrix $\Delta G$ belongs to a larger convex bounded uncertainty domain $\bar{D}$:

$$\bar{D} = \left\{ \Delta G = \sum_{j=1}^{\bar{L}} b_j \Delta G_j;\ \sum_{j=1}^{\bar{L}} b_j = 1;\ b_j \ge 0 \right\} \tag{9}$$

where $\Delta G_j$ are the vertices, $\bar{L}$ denotes the corresponding number of vertices, and $\bar{L}$ is much larger than $L$. The polytope $\bar{D}$ confines the uncertainty $\Delta G$ to a certain range.
The reference trajectory can be defined as $\bar{\gamma}(k) = [\gamma^T(k,1), \gamma^T(k,2), \ldots, \gamma^T(k,N)]^T$ and the error at time $t$ of the $k$th trial as $e(k,t) = \gamma(k,t) - y(k,t)$. Since the initial output $y(k,0)$ is fixed and the aim is to track $\bar{\gamma}(k)$ from $1$ to $N$, the initial error can be set as $e(k,0) = 0$. Referring to (7), the trajectory tracking error of the system in the $k$th trial is

$$\bar{e}(k) = \bar{\gamma}(k) - \bar{Y}(k) = \bar{\gamma}(k) - Px(k,0) - (G_0 + \Delta G)\bar{u}(k) \tag{10}$$

where $\bar{e}(k) \triangleq [e^T(k,1), e^T(k,2), \ldots, e^T(k,N)]^T$.

Due to the repeatability of $\bar{\gamma}(k)$, that is, $\bar{\gamma}(k+1) = \bar{\gamma}(k)$, $Px(k+1,0) = Px(k,0)$, and (10), the error model at the $(k+1)$th trial and the input increment between two adjacent trials can be written as

$$\bar{e}(k+1) = \bar{\gamma}(k+1) - \bar{Y}(k+1) = \bar{e}(k) - (G_0 + \Delta G)\Delta\bar{u}(k) \tag{11}$$
$$\Delta\bar{e}(k) \triangleq \bar{e}(k+1) - \bar{e}(k) = -(G_0 + \Delta G)\Delta\bar{u}(k) \tag{12}$$

where $\Delta\bar{u}(k) \triangleq \bar{u}(k+1) - \bar{u}(k)$ is the updating control law at the $(k+1)$th trial. It is obvious that the control update $\Delta\bar{u}(k)$ drives the change $\Delta\bar{e}(k)$ of the error.
During the iteration process, the classical ILC aims at reducing the error between the output and the reference at the next trial using the current error information, so the next control input sequence is usually obtained as [5]

$$\bar{u}(k+1) = \bar{u}(k) + \Delta\bar{u}(k) = \bar{u}(k) + K\bar{e}(k) \tag{13}$$

where $K \in \mathbb{R}^{Nm \times Nh}$ is the feedback gain matrix for the current error $\bar{e}(k)$. Many robust methods, such as the linear matrix inequality or norm-optimal method, have been utilized to design the robust feedback gain $K$ of RILC; see, for example, [13], [24], and [28]. In the literature, the robust monotonic convergence conditions of the RILC control law are similar and defined as

$$\|I - (G_0 + \Delta G)K\| < 1, \quad \forall \Delta G \in \bar{D}. \tag{14}$$

In this article, we choose the method introduced in [24] to design $K$, as follows. The RILC with polytopic uncertainty is first given by

$$u(k+1,t) = u(k,t) + k_t\big(e(k,t+1) - e(k,t)\big) \tag{15}$$

where $t = 0, \ldots, N-1$ and $k_t \in \mathbb{R}^{m \times h}$. Writing the above equations in matrix form, we have

$$\Delta\bar{u}(k) = \bar{u}(k+1) - \bar{u}(k) = K\bar{e}(k) \tag{16}$$
where

$$K = \begin{bmatrix} k_0 & O & \cdots & \cdots & O \\ -k_1 & k_1 & \ddots & & \vdots \\ O & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & O \\ O & \cdots & O & -k_{N-1} & k_{N-1} \end{bmatrix}. \tag{17}$$

According to [24], the learning gain $k_t$ can be designed such that

$$\max_{j \in [1, L]} \|I - C(B_0(t) + \Delta B_j)k_t\| < 1 \tag{18}$$

where $\Delta B_j$ are the vertices of the convex bounded uncertainty domain $D$ and $L$ is the number of vertices. Condition (18) can be further converted to the LMIs

$$\begin{bmatrix} I & \big(I - C(B_0(t) + \Delta B_j)k_t\big)^T \\ I - C(B_0(t) + \Delta B_j)k_t & I \end{bmatrix} > 0 \tag{19}$$

where $t \in \{0, \ldots, N-1\}$ and $j \in \{1, \ldots, L\}$. By solving (19), the feedback law $K$ can be obtained from the $k_t$, which satisfies the convergence condition (14). A feasible $k_t$ for (19) exists if the system matrix $CB(t)$ has full row rank [24].
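As a numerical illustration of this design step, the vertex condition (18) (equivalent to the LMIs (19)) can be posed directly as a convex feasibility problem over $k_t$. The sketch below uses CVXPY; the function name design_kt, the margin parameter, and the choice of the SCS solver are our own assumptions, not the paper's.

```python
import numpy as np
import cvxpy as cp

def design_kt(C, B0_t, dB_vertices, margin=1e-3):
    """Find a learning gain k_t satisfying the vertex condition (18).

    C: (h, n), B0_t: (n, m) nominal input matrix at time t,
    dB_vertices: list of vertex perturbations Delta B_j, each (n, m).
    Returns k_t (m, h), or None if the problem is infeasible."""
    h, m = C.shape[0], B0_t.shape[1]
    kt = cp.Variable((m, h))
    # one spectral-norm constraint per uncertainty vertex j
    cons = [cp.norm(np.eye(h) - C @ (B0_t + dB) @ kt, 2) <= 1 - margin
            for dB in dB_vertices]
    prob = cp.Problem(cp.Minimize(0), cons)   # pure feasibility problem
    prob.solve(solver=cp.SCS)
    return kt.value if prob.status == cp.OPTIMAL else None
```

Since in the later experiments $B_0(t)$ and the uncertainty vertices are the same at every time instant, a single such solve would yield the $k_t$ used for all $t$.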
III. DATA-DRIVEN COMPENSATION METHOD FOR ILC
Although a robust control law satisfying (14) ensures the convergence of the RILC, it is conservative and its convergence rate is slow because all possible uncertainties are considered in the design. Considering that the relationship between the output and the input is governed by the model with uncertainty, we employ the past input–output data to improve the performance in this article.
From (12), the relation between the past data and the model in past trials is

$$\Delta\bar{e}(k-1) = -(G_0 + \Delta G)\Delta\bar{u}(k-1)$$
$$\Delta\bar{e}(k-2) = -(G_0 + \Delta G)\Delta\bar{u}(k-2)$$
$$\vdots \tag{20}$$

Since $G_0$ and $\Delta G$ are fixed, according to the data-driven method in [32], the following general relation can be obtained:

$$\lambda\Delta\bar{e}(i) = -(G_0 + \Delta G)\lambda\Delta\bar{u}(i) \tag{21}$$

where $i$ is less than $k$ and $\lambda$ is a scalar to be determined by the controller.
During the iteration process, we can obtain the past data of the control updates $\Delta\bar{u}$ and the corresponding $\Delta\bar{e}$. So we define

$$\Delta U(k-1) \triangleq [\Delta\bar{u}(k-1), \cdots, \Delta\bar{u}(k-l)]$$
$$\Delta E(k-1) \triangleq [\Delta\bar{e}(k-1), \cdots, \Delta\bar{e}(k-l)]$$

where $\Delta U(k-1) \in \mathbb{R}^{Nm \times l}$, $\Delta E(k-1) \in \mathbb{R}^{Nh \times l}$, and $l$ is the length of the moving window; if $k - i < 0$ ($i \in \{1, \ldots, l\}$), set $\Delta\bar{u}(k-i)$ and $\Delta\bar{e}(k-i)$ to zero. Then, according to (21), it is obvious that

$$\Delta E(k-1)\bar{\lambda}_{k-1} = -(G_0 + \Delta G)\Delta U(k-1)\bar{\lambda}_{k-1} \tag{22}$$

where $\bar{\lambda}_{k-1} = [\lambda_{k-1}, \ldots, \lambda_{k-l}]^T \in \mathbb{R}^{l}$ is a vector.
At the $k$th trial, since the past control updates $\Delta\bar{u}(k-1), \Delta\bar{u}(k-2), \Delta\bar{u}(k-3), \ldots$ are available, Li et al. [32] adopted these past data to design the control law of the ILC with MPC. In this article, we instead employ a linear combination of the past update data to compensate the original robust iterative learning update law. That is, we use a moving window with a fixed length to acquire the past data near the current trial for compensating the original robust control law (13). At the $k$th trial, the control update $\Delta\bar{u}(k)$ is

$$\Delta\bar{u}(k) \triangleq \Delta\hat{u}(k) + \sum_{j=1}^{l} \lambda_{k-j}\Delta\bar{u}(k-j) \tag{23}$$

where $\Delta\hat{u}(k)$ represents the robust update-feedback control law from (13), the term $\sum_{j=1}^{l} \lambda_{k-j}\Delta\bar{u}(k-j)$ represents the update from the combination of the past update data, $l$ denotes the length of the moving window (i.e., the number of past data selected to compensate the control), and $\lambda_{k-j}$ is the linear combination parameter of the past data, to be determined by the controller.
According to the increment error model (12) and the update law (23), the tracking error increment becomes

$$\Delta\bar{e}(k) = -(G_0 + \Delta G)\Delta\bar{u}(k) = -(G_0 + \Delta G)\Big(\Delta\hat{u}(k) + \sum_{j=1}^{l} \lambda_{k-j}\Delta\bar{u}(k-j)\Big). \tag{24}$$

For brevity, the update law (23) and the error model (24) are written in matrix–vector form as

$$\Delta\bar{u}(k) \triangleq \Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1} \tag{25}$$
$$\Delta\bar{e}(k) = -(G_0 + \Delta G)\big(\Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1}\big). \tag{26}$$

Then the error at the $(k+1)$th trial can be described by

$$\bar{e}(k+1) = \bar{e}(k) - (G_0 + \Delta G)\Delta\bar{u}(k) = \bar{e}(k) - (G_0 + \Delta G)\big(\Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1}\big). \tag{27}$$

According to (22), the relation between the past control updates $\Delta U(k-1)$ and the error increments $\Delta E(k-1)$ can be written as

$$-(G_0 + \Delta G)\Delta U(k-1)\bar{\lambda}_{k-1} = \Delta E(k-1)\bar{\lambda}_{k-1}. \tag{28}$$

Therefore, substituting (28) into (27), the error at the $(k+1)$th trial is

$$\bar{e}(k+1) = \bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1} - (G_0 + \Delta G)\Delta\hat{u}(k). \tag{29}$$
Then, to guarantee the convergence of the system, $\Delta\hat{u}(k)$ is designed as in the previous RILC method and required to satisfy (14). Define $\hat{e}(k) \triangleq \bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1}$, so $\bar{e}(k+1)$ can be rewritten as

$$\bar{e}(k+1) = \hat{e}(k) - (G_0 + \Delta G)\Delta\hat{u}(k). \tag{30}$$

The robust error-feedback update law is similar to (13):

$$\Delta\hat{u}(k) = K\hat{e}(k) \tag{31}$$

where $K$ is the RILC law designed to satisfy the robust convergence condition (14).
Considering that the optimization target of the ILC-DDC is to reduce the next-trial error $\|\bar{e}(k+1)\|^2$, the optimization problem can be defined as

$$J = \min_{\bar{\lambda}_{k-1}} \|\bar{e}(k+1)\|^2 = \min_{\bar{\lambda}_{k-1}} \big\|\bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1} - (G_0 + \Delta G)\Delta\hat{u}(k)\big\|^2 \tag{32}$$

where the linear combination parameter vector $\bar{\lambda}_{k-1}$ is the optimization variable, $\bar{e}(k)$ is the current error, and $\Delta E(k-1)$ and $\Delta U(k-1)$ are the past error and control increments directly acquired from past data. The solution of (32) can be directly obtained as

$$\bar{\lambda}_{k-1}^{*} = -\big(\Delta E^T(k-1)\Delta E(k-1)\big)^{\dagger}\Delta E^T(k-1)\big[\bar{e}(k) - (G_0 + \Delta G)\Delta\hat{u}(k)\big]. \tag{33}$$

The optimal solution $\bar{\lambda}_{k-1}^{*}$ contains the uncertain part $\Delta G$ in $(G_0 + \Delta G)\Delta\hat{u}(k)$, which complicates both calculation and theoretical analysis. Thus, we remove the uncertain term $(G_0 + \Delta G)\Delta\hat{u}(k)$ from (33) and simplify the solution to

$$\bar{\lambda}_{k-1} = -\big(\Delta E^T(k-1)\Delta E(k-1)\big)^{\dagger}\Delta E^T(k-1)\bar{e}(k). \tag{34}$$

After removing the uncertain part, $\bar{\lambda}_{k-1}$ can be easily calculated from the past data. Then, the ultimate data-driven-based robust update law (25) can be rewritten as

$$\Delta\bar{u}(k) = \Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1} = K\big(\bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1}\big) + \Delta U(k-1)\bar{\lambda}_{k-1}. \tag{35}$$

The convergence of the control law (35) with the compensation coefficients $\bar{\lambda}_{k-1}$ will be discussed in Section IV.
In the iteration process, as the number of iterations increases, the error $\bar{e}(k)$ and the increment errors in $\Delta E(k-1)$ become smaller and smaller and approach $0$, which makes $\rho_{\max}(\Delta E^T(k-1)\Delta E(k-1))$ approach $0$, so that the pseudo-inverse of $\Delta E^T(k-1)\Delta E(k-1)$ becomes ill defined. $\Delta E(k-1)$ being close to $0$ implies that the iteration process is coming to an end. Therefore, in the iteration process, when $\rho_{\max}(\Delta E^T(k-1)\Delta E(k-1))$ is less than a very small value $\varepsilon$, we directly set $\bar{\lambda}_{k-1}$ to $0$.
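In code, (34) with the above $\varepsilon$ guard reduces to a pseudo-inverse of an $l \times l$ Gram matrix. A minimal sketch (the helper name lambda_bar and the default $\varepsilon$ are our own choices):

```python
import numpy as np

def lambda_bar(dE, e_bar, eps=1e-10):
    """Compute the compensation coefficients of (34) with the epsilon
    guard: return zeros once the increment-error data have nearly
    vanished. dE is Delta E(k-1), shape (Nh, l); e_bar is e_bar(k)."""
    M = dE.T @ dE                              # l x l Gram matrix
    if np.max(np.linalg.eigvalsh(M)) < eps:    # rho_max(dE^T dE) < eps
        return np.zeros(dE.shape[1])
    return -np.linalg.pinv(M) @ dE.T @ e_bar
```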
From the proposed ILC-DDC strategy, the block diagram of how the ILC-DDC controller works is shown in Fig. 1, and the complete steps of the ILC-DDC are given in Algorithm 1. Compared with the work in [32], the proposed data-driven method in Algorithm 1 can compensate any kind of ILC law and improve its performance. In addition, it can be proved that the ILC with data-driven compensation is convergent and that its convergence rate is faster than that of the original algorithm without the compensation.

Fig. 1. ILC-DDC.
Algorithm 1 ILC-DDC
1: Construct the feedback law $K$ satisfying (14) and set the initial values $l$, $\Delta E(0) = 0$, $\bar{\lambda}_0 = 0$, $\Delta U(0) = 0$, and $\bar{u}(0) = 0$ offline.
2: At the beginning of trial $k$, compute the solution $\bar{\lambda}_{k-1}$ by (34). If $\rho_{\max}(\Delta E^T(k-1)\Delta E(k-1)) < \varepsilon$ ($\varepsilon$ is a very small value), set $\bar{\lambda}_{k-1} = 0$.
3: Calculate $\hat{e}(k) = \bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1}$ and $\Delta\bar{u}(k) = K\hat{e}(k) + \sum_{j=1}^{l}\lambda_{k-j}\Delta\bar{u}(k-j)$; then implement $\bar{u}(k) = \bar{u}(k-1) + \Delta\bar{u}(k)$ on the controlled process.
4: At the end of trial $k$, add $\Delta\bar{u}(k)$ and $\Delta\bar{e}(k)$ to the memory unit and obtain the new $\Delta U(k)$ and $\Delta E(k)$. Let $k = k+1$ and return to Step 2.
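To illustrate how the steps of Algorithm 1 fit together, here is a minimal closed-loop sketch on a static lifted model $\bar{Y} = G\bar{u}$ (with $x(k,0)$ folded into the reference). It is a toy stand-in, not the experimental setup: the matrix G_true plays $G_0 + \Delta G$ and is used only to generate measurements, and a gain K satisfying (14) is assumed to be given.

```python
import numpy as np

def ilc_ddc(G_true, K, gamma, trials=30, l=1, eps=1e-10):
    """Toy run of Algorithm 1; returns the trial error norms ||e_bar(k)||."""
    u = np.zeros(K.shape[0])              # u_bar(0) = 0
    dU, dE = [], []                       # moving-window memory
    e_prev, errors = None, []
    for k in range(trials):
        e = gamma - G_true @ u            # measured trial error e_bar(k)
        errors.append(np.linalg.norm(e))
        if e_prev is not None:
            dE.append(e - e_prev)         # store Delta e_bar(k-1)
            dE, dU = dE[-l:], dU[-l:]     # keep the window of length l
        lam = np.zeros(len(dE))
        if dE:
            E = np.column_stack(dE)
            M = E.T @ E
            if np.max(np.linalg.eigvalsh(M)) >= eps:   # epsilon guard
                lam = -np.linalg.pinv(M) @ E.T @ e     # (34)
        e_hat = e + (np.column_stack(dE) @ lam if dE else 0)
        du = K @ e_hat                                 # (31)
        if dU:
            du = du + np.column_stack(dU) @ lam        # compensation in (35)
        u = u + du                                     # implement trial k
        dU.append(du)
        e_prev = e
    return errors
```

With a monotonically convergent K, the returned error norms decrease monotonically, and in our toy runs faster than the plain update du = K @ e, mirroring Theorems 1 and 2 below.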
Remark 1: The main computation of the above data-driven-based compensation method in the iterative process is (34), which is a matrix (pseudo-)inverse operation and can be solved by a standard matrix-inversion software package. The computational burden of (34) depends on the data length $l$. The LMI problem (19) for obtaining $K$ can be solved offline by a standard LMI software package and does not affect the computations in the iterative process.
IV. CONVERGENCE AND CONVERGENCE RATE ANALYSIS
In this section, the convergence of Algorithm 1 is analyzed. In addition, it is proved that the convergence rate of Algorithm 1 is faster than that of RILC. The main results of this article are as follows.

Theorem 1: Consider the system model (7) with polytopic uncertainty (8), controlled by the input generated from Algorithm 1. When the original RILC law $K$ meets the convergence condition (14), the system with the data-driven design proposed in Algorithm 1 is convergent.

Proof: First, the traditional RILC law $K$ can be designed to satisfy the convergence condition (14) by the method from [24]. Define $\tilde{e}(k)$ as

$$\tilde{e}(k) \triangleq -\Delta E(k-1)\bar{\lambda}_{k-1}. \tag{36}$$
Fig. 2. Decomposition of $\bar{e}(k)$.

Then, according to the definition $\hat{e}(k) \triangleq \bar{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1}$, $\bar{e}(k)$ can be divided into the following two parts:

$$\bar{e}(k) = \hat{e}(k) + \tilde{e}(k). \tag{37}$$

In addition, substituting (34) into (36), $\tilde{e}(k)$ can be expressed as

$$\tilde{e}(k) = \Delta E(k-1)\big(\Delta E^T(k-1)\Delta E(k-1)\big)^{\dagger}\Delta E^T(k-1)\bar{e}(k) = Z\bar{e}(k) \tag{38}$$

which implies that $\tilde{e}(k)$ is the projection of $\bar{e}(k)$ onto $\Delta E(k-1)$. Meanwhile, we write the singular value decomposition of $\Delta E(k-1)$ as $\Delta E(k-1) = L_o \Sigma_e L_c$. The following can be obtained:

$$Z = \Delta E(k-1)\big(\Delta E^T(k-1)\Delta E(k-1)\big)^{\dagger}\Delta E^T(k-1) = L_o \Sigma_e L_c\big((L_o \Sigma_e L_c)^T L_o \Sigma_e L_c\big)^{\dagger}(L_o \Sigma_e L_c)^T = L_o I_m L_o^T$$
$$Z^T Z = (L_o I_m L_o^T)^T L_o I_m L_o^T = L_o I_m L_o^T \tag{39}$$

where $I_m = \begin{bmatrix} I & O \\ O & O \end{bmatrix}$ and the identity block $I$ has dimension equal to the rank of $\Sigma_e$. Then the product between $\hat{e}(k)$ and $\tilde{e}(k)$ is

$$\hat{e}^T(k)\tilde{e}(k) = \big(\bar{e}(k) - \tilde{e}(k)\big)^T\tilde{e}(k) = \bar{e}^T(k)Z\bar{e}(k) - \bar{e}^T(k)Z^T Z\bar{e}(k) = 0. \tag{40}$$

Therefore, $\hat{e}(k)$ and $\tilde{e}(k)$ are orthogonal in the multidimensional space, as shown in Fig. 2. $\hat{e}^T(k)\tilde{e}(k) = 0$ is the key to the later theoretical demonstration.
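The projection argument in (38)–(40) can be checked numerically: the matrix $Z$ built from any $\Delta E(k-1)$ is idempotent, and the resulting decomposition of $\bar{e}(k)$ is orthogonal. A small self-contained check on random data (not the experimental data):

```python
import numpy as np

# Numerical check of (38)-(40): Z is the orthogonal projector onto the
# column space of Delta E(k-1), so e_bar = e_hat + e_tilde is orthogonal.
rng = np.random.default_rng(0)
dE = rng.standard_normal((12, 3))          # a random Delta E(k-1)
e = rng.standard_normal(12)                # a random current error
Z = dE @ np.linalg.pinv(dE.T @ dE) @ dE.T
e_tilde = Z @ e                            # projection part, (38)
e_hat = e - e_tilde
print(np.allclose(Z, Z @ Z), np.isclose(e_hat @ e_tilde, 0.0))  # True True
```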
Next, the error model at the $(k+1)$th trial (27) can also be divided into two subterms:

$$\bar{e}(k+1) = \hat{e}(k+1) + \tilde{e}(k+1). \tag{41}$$

From (29), the subterms of $\bar{e}(k+1)$ can be separately denoted by

$$\hat{e}(k+1) \triangleq \hat{e}(k) - (G_0 + \Delta G)\Delta\hat{u}(k) \tag{42}$$
$$\tilde{e}(k+1) = \tilde{e}(k) - (G_0 + \Delta G)\Delta U(k-1)\bar{\lambda}_{k-1}. \tag{43}$$

For (43), due to (28) and (36), $\tilde{e}(k+1) = 0$ is obtained. Thus, from (41), (42), and the precondition (14), we obtain the relationship between $\|\bar{e}(k+1)\|^2$ and $\|\hat{e}(k)\|^2$:

$$\begin{aligned}
\|\bar{e}(k+1)\|^2 &= \hat{e}^T(k+1)\hat{e}(k+1) \\
&= \big[\hat{e}(k) - (G_0+\Delta G)\Delta\hat{u}(k)\big]^T\big[\hat{e}(k) - (G_0+\Delta G)\Delta\hat{u}(k)\big] \\
&= \big[\hat{e}(k) - (G_0+\Delta G)K\hat{e}(k)\big]^T\big[\hat{e}(k) - (G_0+\Delta G)K\hat{e}(k)\big] \\
&= \hat{e}^T(k)\big[I - (G_0+\Delta G)K\big]^T\big[I - (G_0+\Delta G)K\big]\hat{e}(k) \\
&< \|\hat{e}(k)\|^2. \tag{44}
\end{aligned}$$

Ultimately, from the error at the $k$th trial and (40), we obtain another relation between $\|\bar{e}(k)\|^2$ and $\|\hat{e}(k)\|^2$:

$$\begin{aligned}
\|\bar{e}(k)\|^2 &= \big(\hat{e}(k) + \tilde{e}(k)\big)^T\big(\hat{e}(k) + \tilde{e}(k)\big) \\
&= \hat{e}^T(k)\hat{e}(k) + \tilde{e}^T(k)\tilde{e}(k) + 2\hat{e}^T(k)\tilde{e}(k) \\
&= \hat{e}^T(k)\hat{e}(k) + \tilde{e}^T(k)\tilde{e}(k) \\
&\ge \|\hat{e}(k)\|^2. \tag{45}
\end{aligned}$$

Therefore, utilizing the intermediate quantity $\|\hat{e}(k)\|^2$, we have

$$\|\bar{e}(k)\|^2 \ge \|\hat{e}(k)\|^2 > \|\bar{e}(k+1)\|^2. \tag{46}$$

In conclusion, $\|\bar{e}(k)\|^2 > \|\bar{e}(k+1)\|^2$ for $k \in \{0, 1, \ldots\}$, which means that the trial tracking error is monotonically decreasing as the number of iterations increases. Therefore, the iteration strategy proposed in this article is monotonically convergent.
Theorem 2: Consider the system model (7) with polytopic uncertainty (8), controlled by the input generated from Algorithm 1. At the $k$th trial, when the ILC-DDC has the same error $\bar{e}(k)$ as the RILC, that is, $\|\bar{e}(k)_{\mathrm{RILC}}\|^2 = \|\bar{e}(k)_{\mathrm{ILC\text{-}DDC}}\|^2$, the error of the ILC-DDC is smaller than that of the RILC at the next trial, that is, $\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 \ge \|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2$, where $\bar{e}(k+1)_{\mathrm{RILC}}$ and $\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}$ are the errors of the RILC and the ILC-DDC at the $(k+1)$th trial, respectively.
Proof: To compare $\bar{e}(k+1)_{\mathrm{RILC}}$ and $\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}$, the preconditions at the $k$th trial, namely, the error $\bar{e}(k)$, the system model $G_0 + \Delta G$, and the feedback law $K$, are kept the same for the RILC and the ILC-DDC. In this situation, the errors of the RILC and the ILC-DDC at the $(k+1)$th trial after the action of the control (13) or (25) are

$$\bar{e}(k+1)_{\mathrm{RILC}} = \bar{e}(k) - (G_0 + \Delta G)K\bar{e}(k) \tag{47}$$
$$\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}} = \bar{e}(k) - (G_0 + \Delta G)\big(\Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1}\big). \tag{48}$$

Considering $\bar{e}(k) = \hat{e}(k) + \tilde{e}(k)$, $\Delta\hat{u}(k) = K\hat{e}(k)$, and $-(G_0 + \Delta G)\Delta U(k-1)\bar{\lambda}_{k-1} = \Delta E(k-1)\bar{\lambda}_{k-1}$, (47) and (48) can be transformed into

$$\begin{aligned}
\bar{e}(k+1)_{\mathrm{RILC}} &= \hat{e}(k) + \tilde{e}(k) - (G_0+\Delta G)K\big(\hat{e}(k) + \tilde{e}(k)\big) \\
&= \big(I - (G_0+\Delta G)K\big)\hat{e}(k) + \big(I - (G_0+\Delta G)K\big)\tilde{e}(k) \tag{49}
\end{aligned}$$

$$\begin{aligned}
\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}} &= \hat{e}(k) + \tilde{e}(k) - (G_0+\Delta G)\big(\Delta\hat{u}(k) + \Delta U(k-1)\bar{\lambda}_{k-1}\big) \\
&= \big(I - (G_0+\Delta G)K\big)\hat{e}(k) + \tilde{e}(k) - (G_0+\Delta G)\Delta U(k-1)\bar{\lambda}_{k-1} \\
&= \big(I - (G_0+\Delta G)K\big)\hat{e}(k) + \tilde{e}(k) + \Delta E(k-1)\bar{\lambda}_{k-1} \\
&= \big(I - (G_0+\Delta G)K\big)\hat{e}(k). \tag{50}
\end{aligned}$$

Therefore, the norms of the RILC error and the ILC-DDC error can be written as

$$\begin{aligned}
\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 &= \big\|\big(I-(G_0+\Delta G)K\big)\hat{e}(k)\big\|^2 + \big\|\big(I-(G_0+\Delta G)K\big)\tilde{e}(k)\big\|^2 \\
&\quad + 2\big[\big(I-(G_0+\Delta G)K\big)\hat{e}(k)\big]^T\big[\big(I-(G_0+\Delta G)K\big)\tilde{e}(k)\big] \tag{51}
\end{aligned}$$

$$\|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2 = \big\|\big(I-(G_0+\Delta G)K\big)\hat{e}(k)\big\|^2. \tag{52}$$

According to (40), $\big[(I-(G_0+\Delta G)K)\hat{e}(k)\big]^T\big[(I-(G_0+\Delta G)K)\tilde{e}(k)\big] = 0$ is obtained. Thus, the relation between the error norms of the RILC and the ILC-DDC is

$$\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 \ge \|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2. \tag{53}$$
Remark 2: The performance improvement of our method can be quantified by $Q = \|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 - \|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2$, which reduces to $Q = \|(I - (G_0+\Delta G)K)\tilde{e}(k)\|^2$. We have $\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 = \|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2$ if and only if $Q = 0$. From (34) and (36), $Q = 0$ means that $\Delta E^T(k-1) = 0$ or $\bar{e}(k) = 0$. When the increment error or the error becomes $0$, the iteration process has come to an end. Thus, at the end of the iteration the two ILCs have equivalent performance, while in all other situations $\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2$ is always larger than $\|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2$.

In conclusion, $\|\bar{e}(k+1)_{\mathrm{RILC}}\|^2 \ge \|\bar{e}(k+1)_{\mathrm{ILC\text{-}DDC}}\|^2$ means that the convergence rate of the ILC-DDC is faster than that of the RILC; that is, the data-driven-based compensation accelerates the convergence rate of the RILC.
V. EXPERIMENTAL STUDY
In applications, traditional tasks of industrial manipulators include welding, stacking, and spraying on assembly lines, which have repetitive characteristics and high-precision requirements. Thus, in this section, we design experiments on a six-degree-of-freedom manipulator platform that moves repeatedly, to verify the proposed ILC-DDC and make a comparison with existing approaches. In the experiments, the manipulator is designed to track a repetitive reference trajectory.

The robot employed is shown in Fig. 3. For this plant, the repeated positioning accuracy of the joint motors is 0.01°. To keep the model uncertainty in a certain range, the range of the joint movement needs to be limited to a small region. Thus, the upper bounds of the joint angles are selected as

$$\overline{\mathrm{Joint}} = [20, 75, 100, 5, 80, 25]^{\circ} \tag{54}$$

and the lower bounds of the joint angles are selected as

$$\underline{\mathrm{Joint}} = [5, 20, 50, -5, 20, -25]^{\circ}. \tag{55}$$
Fig. 3. Six degree-of-freedom manipulator.

TABLE II
D-H MODEL OF MANIPULATOR

Owing to the joint motor capability, the upper bound of the joint angular speed is

$$\overline{\mathrm{Speed}} = [20, 20, 20, 20, 20, 20]^{\circ}/\mathrm{s} \tag{56}$$

and the lower bound of the joint angular speed is

$$\underline{\mathrm{Speed}} = [-20, -20, -20, -20, -20, -20]^{\circ}/\mathrm{s}. \tag{57}$$

A simple saturation method is utilized to protect the manipulator in the actual experiments. Owing to the proper feedback law, the control inputs do not exceed the constraints in the experiments. The corresponding modified Denavit–Hartenberg model [33] can subsequently be established; the D-H parameters of the manipulator are listed in Table II.
According to the Jacobian matrix of the robot manipulator [34], the kinematic model of the plant is

$$\dot{s} = J(\theta)\dot{\theta} \tag{58}$$

where $\dot{s} = [\dot{x}, \dot{y}, \dot{z}]^T$ is the end-effector velocity of the manipulator, $\dot{\theta}$ is the angular velocity of the manipulator joints, and $J(\theta)$ is the Jacobian matrix.

By discretizing model (58) according to the Euler forward difference rule [35], the discrete manipulator model is

$$s(k,t+1) = A(t)s(k,t) + B(t)\dot{\theta}(k,t) = s(k,t) + B(t)\dot{\theta}(k,t) \tag{59}$$

where $B(t) = TJ(\theta(t))$, $A(t) = I$, $C = I$, and $T$ is the sampling time of the discrete system. Then, the lifted system can be described as

$$\bar{S}(k) = Ps(k,0) + G\bar{u}(k) = Ps(k,0) + (G_0 + \Delta G)\bar{u}(k) \tag{60}$$

where $\bar{S}(k) = [s(k,1)^T, \ldots, s(k,N)^T]^T$ is the end-effector position sequence in the $k$th trial, $P = [I, \ldots, I]^T$, and $\bar{u}(k) = [\dot{\theta}(k,0)^T, \ldots, \dot{\theta}(k,N-1)^T]^T$ is the joint velocity sequence in the $k$th trial. The manipulator state and the joint positions can be measured during each trial.
In the experiments, the initial joint angles are set as

$$\theta(0) = [14.74, 66.37, 58.92, 0.00, 54.71, 14.74]^{\circ}. \tag{61}$$

Here, the initial joint angle $\theta(0)$ is selected randomly within the joint bounds, which has no effect on the control performance. The discretization time is set as $T = 0.1$ s, and the trial length is set as $N = 50$ samples. The initial position of the manipulator end-effector is

$$s(k,0) = [0.38, 0.1, 0.15]^T. \tag{62}$$
In the experiments, the certain model, which is known in the design phase, is $B(\theta(0))$. Thus, we set $B_0 = B_0(0) = \cdots = B_0(N-1) = B(\theta(0))$ as the certain model, and $\Delta B(t) = B(t) - B_0$ is the uncertain part. The certain matrix $G_0$ is

$$G_0 \triangleq \begin{bmatrix} B_0 & O & \cdots & O \\ B_0 & B_0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & O \\ B_0 & B_0 & \cdots & B_0 \end{bmatrix}. \tag{63}$$

The corresponding system matrix $G$ with uncertainty is

$$G \triangleq \begin{bmatrix} B(0) & O & \cdots & O \\ B(0) & B(1) & \ddots & \vdots \\ \vdots & \vdots & \ddots & O \\ B(0) & B(1) & \cdots & B(N-1) \end{bmatrix}. \tag{64}$$
Since the joint angles in the working process stay within certain bounds, we use the boundary points $\overline{\mathrm{Joint}}$ and $\underline{\mathrm{Joint}}$ to describe the convex hull of the uncertainty. According to $\overline{\mathrm{Joint}}$ and $\underline{\mathrm{Joint}}$, there are $2^6 = 64$ different bound combinations for the six joints, denoted $\theta_j$ ($j = 1, \ldots, 64$). The convex vertices of $B(t)$ are constructed as $B_j = B(\theta_j)$. Thus, the uncertain part $\Delta B(t) = B(t) - B_0$ belongs to the convex hull $\{\Delta B = \sum_{j=1}^{64} a_j \Delta B_j,\ \sum_{j=1}^{64} a_j = 1,\ a_j \ge 0\}$, where $\Delta B_j = B_j - B_0$ is a vertex of $\Delta B(t)$.
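The 64 vertex configurations can be enumerated programmatically, for example as below; jacobian() is a hypothetical stand-in for the robot Jacobian (not spelled out in the paper), so that $B_j = TJ(\theta_j)$, and the bounds are those reconstructed in (54) and (55).

```python
import numpy as np
from itertools import product

# Joint bounds from (54)-(55), in degrees.
joint_lo = np.array([5.0, 20.0, 50.0, -5.0, 20.0, -25.0])
joint_hi = np.array([20.0, 75.0, 100.0, 5.0, 80.0, 25.0])

def vertex_deltas(jacobian, B0, T=0.1):
    """Enumerate the 2^6 = 64 corner joint configurations and return
    the vertices Delta B_j = B_j - B0 of the uncertainty convex hull."""
    deltas = []
    for bits in product((0, 1), repeat=6):          # 64 corner selections
        theta_j = np.where(np.array(bits) == 1, joint_hi, joint_lo)
        deltas.append(T * jacobian(theta_j) - B0)   # B_j = T * J(theta_j)
    return deltas
```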
By using the method in [24], the feedback control law $K$ satisfying condition (14) can be obtained. Since in the experiments $B_0$ and the convex hull of $\Delta B(t)$ are the same at every time instant, the $k_t$ in $K$ are all identical. Based on this feedback law $K$ of the RILC, we use the proposed data-driven method in Algorithm 1 to compensate it.

To reduce the computational burden, the data length $l$ is set to 1, which makes the computation of $\bar{\lambda}_{k-1}$ in (34) a simple algebraic operation; the computational complexity in the iteration process is therefore low.
To evaluate the performance of our method, we compare the proposed ILC-DDC with the RILC [24] and the data-driven optimal ILC algorithm (DDOILC) [31] in the experiments. In the control process, we only need to send the joint positions calculated by the above ILCs to the manipulator in each trial; the manipulator detects the current joint positions and calculates and feeds back the posture of the end effector by the robot forward kinematics [33].

Fig. 4. Error variation curves of the ILCs for the line function.

Fig. 5. Tracking trajectories in the x-axis for the line function.
A. Case 1: Tracking a Straight Line Repeatedly

In this experiment, the manipulator end effector tracks a repeated reference constructed by a straight line from the point $[0.38, 0.1, 0.15]^T$ to the point $[0.30, 0.02, 0.15]^T$, whose length is 0.1131 m.

The Euclidean error norms of the considered algorithms are shown in Fig. 4, which indicates that all algorithms achieve convergence, but ILC-DDC and DDOILC converge faster than RILC. This is because ILC-DDC and DDOILC utilize the past data to enhance either the system model or the control inputs, resulting in faster convergence rates. Comparing ILC-DDC and DDOILC carefully, we find that DDOILC converges more slowly than ILC-DDC, since DDOILC uses a model identified from a large amount of past data (data from one trial) for control, while ILC-DDC directly utilizes the data of length $l$ ($l = 1$) for the design of the control law. In the first trial, ILC-DDC has no past data for control, so its trajectory is the same as those of DDOILC and RILC. But as the number of trials increases, the trajectories of ILC-DDC outperform those of DDOILC and RILC. Figs. 5 and 6 show the actual trajectories of the algorithms in the x- and y-axes. Fig. 7 provides the actual trajectories of the algorithms in the 3-D coordinate system at different trials. We can see that at the 6th trial, ILC-DDC has tracked the path better than RILC and DDOILC. Due to the direct utilization of past data for constructing the control, the convergence rate of ILC-DDC is faster than those of RILC and DDOILC.

Fig. 6. Tracking trajectories in the y-axis for the line function.

Fig. 7. Tracking trajectories in 3-D space at different trials for the line function.
B. Case 2: Tracking a Sine Curve Repeatedly

Without loss of generality, an experiment with a sine-function reference is also designed to verify the adaptability of the proposed algorithm to different tasks. In this experiment, the manipulator end effector tracks a reference constructed by a sine curve from $[0.38, 0.1, 0.15]^T$ to $[0.3035, 0.0745, 0.15]^T$. The sine function is $y = 0.01\sin(255(x - 0.38))$.

The tracking-error curves over the trials for the sine function are shown in Fig. 8. It can be seen that all algorithms converge, and the convergence rate of ILC-DDC is still faster than those of RILC and DDOILC, as in Case 1. Figs. 9 and 10 show the actual trajectories of the algorithms in the x- and y-axes. Fig. 11 gives the actual tracking trajectories in the 3-D coordinate system under the different algorithms at the 6th, 12th, 18th, and 49th trials. It can be clearly seen that ILC-DDC yields a smaller tracking error than RILC and DDOILC from the 6th trial and completes the tracking task faster, while RILC and DDOILC almost eliminate the tracking error only at the 49th trial. Figs. 12 and 13 show that the joint trajectories of ILC-DDC at different trials for both the line and sine functions are smooth and within the constraints. The saturation method is utilized only for protecting the manipulator in the actual experiments; owing to the proper feedback law, the control inputs do not exceed the constraints.

Fig. 8. Error variation curves of the ILCs for the sine function.

Fig. 9. Tracking trajectories in the x-axis for the sine function.

Fig. 10. Tracking trajectories in the y-axis for the sine function.

Fig. 11. Tracking trajectories in 3-D space at different trials for the sine function.

Fig. 12. Joint trajectories of ILC-DDC at different trials for the line function.
In Fig. 14, the manipulator uses the control law of the ILC-DDC at the 49th trial to draw the two trajectories on white paper, which indicates that the proposed method achieves good results in the actual environment.

In conclusion, the above two experiments illustrate that all the ILC algorithms ultimately achieve good tracking performance. Compared with the RILC algorithm, the ILC-DDC and DDOILC, with the aid of past data, have better performance. The DDOILC needs more data from the controlled plant to identify a more exact model, which limits its convergence rate. Unlike DDOILC, the ILC-DDC has no requirement on the amount of data and avoids identifying a model from the past data. It directly uses the optimized combination of the past control updates to compensate the current control law, increasing the proportion of the past control contribution, which makes the tracking error drop faster. Thus, ILC-DDC obtains the better convergence rates shown in Figs. 4 and 8.

Fig. 13. Joint trajectories of ILC-DDC at different trials for the sine function.

Fig. 14. Drawing results in the actual environment.

Fig. 15. Error curves of the ILC-DDC with different data lengths.
C. Case 3: The Effect of the Data Length l in ILC-DDC

To show the influence of the data length $l$ on the performance of ILC-DDC, we plot the error curves for both the line and sine functions with different data lengths in Fig. 15, where the data lengths are set as 1, 2, 3, 4, 5, and 10, respectively. From the results, we can see that ILC-DDC with a larger $l$ tends to have better control performance, but after $l$ rises to 4, a further increase of $l$ has little effect on the performance. This is because with a larger $l$ the ILC-DDC can obtain more information about the system for the compensation of the control, while a too large value of $l$ instead tends to cause information redundancy, reducing the control performance to some extent.

To investigate the computation with different data lengths $l$, we estimate the main computational cost of ILC-DDC, that is, calculating $\bar{\lambda}_{k-1}$ by (34), as $O(l^3 + l^2 \cdot Nh + l \cdot ((Nh)^2 + Nh))$, where $N$ is the number of samples in each trial and $h$ is the system output dimension. The average time for calculating $\bar{\lambda}_{k-1}$ as $l$ goes from 1 to 10 is listed in Table III; the computation is performed in MATLAB R2019a on an Intel Core i5-8300. It can be seen that increasing $l$ increases the computation time of $\bar{\lambda}_{k-1}$; thus, a large data length $l$ brings a large computational burden.

TABLE III
AVERAGE COST TIME WITH DIFFERENT l

Fig. 16. Error variation curves of the ILCs for "α."

Fig. 17. Tracking trajectories in 3-D space at different trials for "α."
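The growth of this cost with $l$ can be reproduced with a rough timing harness such as the one below; the sizes mirror the experiments ($N = 50$, $h = 3$), but the machine-dependent timings are illustrative, not those of Table III.

```python
import time
import numpy as np

# Rough timing of (34) as the window length l grows.
Nh = 150                                   # N = 50 samples, h = 3 outputs
rng = np.random.default_rng(1)
e = rng.standard_normal(Nh)
for l in range(1, 11):
    dE = rng.standard_normal((Nh, l))      # random Delta E(k-1) stand-in
    t0 = time.perf_counter()
    for _ in range(1000):
        lam = -np.linalg.pinv(dE.T @ dE) @ dE.T @ e
    print(l, (time.perf_counter() - t0) / 1000)   # average seconds per call
```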
D. Case 4: Real-World Application of Writing a Character

In this case, the manipulator learns to follow the reference curve and write the character "α" by ILC-DDC, RILC, and DDOILC. The curve describing the handwritten "α" is set as the reference. The error variation curves under the different methods are shown in Fig. 16, and the trajectories for tracking "α" at the 6th, 12th, 18th, and 49th trials are shown in Fig. 17. It is obvious that the system under ILC-DDC achieves the best convergence performance, and the manipulator completes the tracking task faster than with DDOILC and RILC.
VI. CONCLUSION
An ILC-DDC for linear systems with unknown time-varying uncertainty was proposed in this article. The strategy consists of two terms: one is the input from the error-feedback law of the RILC, and the other is the special optimal combination of the past data. This design overcomes the conservatism of the RILC, improves the performance, and accelerates the convergence rate. For the proposed ILC-DDC, it was proved that convergence is guaranteed and the convergence rate is accelerated. Finally, experiments on a six-degree-of-freedom manipulator also showed that the proposed design provides good performance.

The 2-D model can be used to design a real-time feedback ILC controller. Since the trial index and the time index along the trial are used to describe the system, the dimensions of the considered system are much smaller. Thus, we will make efforts to extend the data-driven method to 2-D systems in future work.
REFERENCES
[1] M. Uchiyama, “Formation of high-speed motion pattern of a mechan-
ical arm by trial,” Trans. Soc. Instrum. Control Eng., vol. 14, no. 6,
pp. 706–712, 1978.
[2] S. Arimoto, S. Kawamura, and F. Miyazaki, “Bettering operation of
robots by learning,” J. Robot. Syst., vol. 1, no. 2, pp. 123–140, 1984.
[3] J. H. Lee and K. S. Lee, “Iterative learning control applied to
batch processes: An overview,” Control Eng. Pract., vol. 15, no. 10,
pp. 1306–1318, 2007.
[4] W. He, T. Meng, X. He, and C. Sun, “Iterative learning control for a
flapping wing micro aerial vehicle under distributed disturbances,IEEE
Trans. Cybern., vol. 49, no. 4, pp. 1524–1535, Apr. 2019.
[5] H.-S. Ahn, Y. Chen, and K. L. Moore, “Iterative learning control: Brief
survey and categorization,IEEE Trans. Syst., Man, Cybern. C, Appl.
Rev., vol. 37, no. 6, pp. 1099–1121, Nov. 2007.
[6] X. Jin, “Fault tolerant nonrepetitive trajectory tracking for MIMO output
constrained nonlinear systems using iterative learning control,IEEE
Trans. Cybern., vol. 49, no. 8, pp. 3180–3190, Aug. 2019.
[7] X. Jin, “Nonrepetitive leader–follower formation tracking for multiagent
systems with LOS range and angle constraints using iterative learning
control,” IEEE Trans. Cybern., vol. 49, no. 5, pp. 1748–1758, May 2019.
[8] G. Sebastian, Y. Tan, D. Oetomo, and I. Mareels, “Input and output
constraints in iterative learning control design for robotic manipulators,
Unmanned Syst., vol. 6, no. 3, pp. 197–208, 2018.
[9] Z. Hou and G. Liu, “Cooperative adaptive iterative learning fault-tolerant
control scheme for multiple subway trains,” IEEE Trans. Cybern., early
access, May 4, 2020, doi: 10.1109/TCYB.2020.2986006.
[10] R. Chi, Y. Hui, B. Huang, and Z. Hou, “Adjacent-agent dynamic
linearization-based iterative learning formation control,IEEE Trans.
Cybern., vol. 50, no. 10, pp. 4358–4369, Oct. 2020.
[11] H. Havlicsek and A. Alleyne, “Nonlinear control of an electrohydraulic
injection molding machine via iterative adaptive learning,IEEE/ASME
Trans. Mechatronics, vol. 4, no. 3, pp. 312–323, Sep. 1999.
[12] X. Liu and X. Kong, “Nonlinear fuzzy model predictive iterative learning
control for drum-type boiler–turbine system,” J. Process Control, vol. 23,
no. 8, pp. 1023–1040, 2013.
[13] J. J. M. V. de Wijdeven, M. C. F. Donkers, and O. H. Bosgra, “Iterative
learning control for uncertain systems: Noncausal finite time interval
robust control design,” Int. J. Robust Nonlinear Control, vol. 21, no. 14,
pp. 1645–1666, 2011.
[14] Y. Wang, F. Gao, and F. J. Doyle, III, “Survey on iterative learning
control, repetitive control, and run-to-run control,J. Process Control,
vol. 19, no. 10, pp. 1589–1600, 2009.
[15] S. Hillenbrand and M. Pandit, "A discrete-time iterative learning control law with exponential rate of convergence," in Proc. IEEE Conf. Decis. Control, vol. 2, 1999, pp. 1575–1580.
[16] R. Longman, “Iterative learning control and repetitive control for
engineering practice,” Int. J. Control, vol. 73, no. 10, pp. 930–954, 2000.
[17] R. Tousain, E. Van der Meché, and O. Bosgra, “Design strategy for
iterative learning control based on optimal control,” in Proc. 40th IEEE
Conf. Decis. Control, vol. 5, 2001, pp. 4463–4468.
[18] Z. Xiong, J. Zhang, and D. Jin, “Optimal iterative learning control for
batch processes based on linear time-varying perturbation model,” Chin.
J. Chem. Eng., vol. 16, no. 2, pp. 235–240, 2008.
[19] F. Gao, Y. Yang, and C. Shao, “Robust iterative learning control with
applications to injection molding process,” Chem. Eng. Sci., vol. 56,
no. 24, pp. 7025–7034, 2001.
[20] W. B. J. Hakvoort, R. G. K. M. Aarts, J. V. Dijk, and J. B. Jonker,
“Lifted system iterative learning control applied to an industrial robot,
Control Eng. Pract., vol. 16, no. 4, pp. 377–391, 2008.
[21] A. Tayebi and M. B. Zaremba, “Robust iterative learning control
design is straightforward for uncertain LTI systems satisfying the robust
performance condition,” IEEE Trans. Autom. Control, vol. 48, no. 1,
pp. 101–106, Jan. 2003.
[22] J. Shi, F. Gao, and T.-J. Wu, “Robust iterative learning control design
for batch processes with uncertain perturbations and initialization,”
AIChE J., vol. 52, no. 6, pp. 2171–2187, Jun. 2006.
[23] D. Meng, Y. Jia, J. Du, and F. Yu, “Monotonically convergent iterative
learning control for uncertain time-delay systems: An LMI approach,”
in Proc. Amer. Control Conf., 2009, pp. 1622–1627.
[24] D. Meng, Y. Jia, J. Du, and F. Yu, “Initial shift problem for robust
iterative learning control systems with polytopic-type uncertainty,” Int.
J. Syst. Sci., vol. 41, no. 7, pp. 825–838, 2010.
[25] B. Cichy, Ł. Hładowski, K. Gałkowski, A. Rauh, and H. Aschemann,
“Iterative learning control of an electrostatic microbridge actuator with
polytopic uncertainty models,” IEEE Trans. Control Syst. Technol.,
vol. 23, no. 5, pp. 2035–2043, Sep. 2015.
[26] Y. Wang, D. Zhou, and F. Gao, “Iterative learning model predictive
control for multi-phase batch processes,” J. Process Control, vol. 18,
no. 6, pp. 543–557, Jul. 2008.
[27] L. Hladowski, K. Galkowski, Z. Cai, E. Rogers, C. T. Freeman, and
P. L. Lewin, “Experimentally supported 2D systems based iterative learn-
ing control law design for error convergence and performance,Control
Eng. Pract., vol. 18, no. 4, pp. 339–348, 2010.
[28] T. D. Son, G. Pipeleers, and J. Swevers, “Robust monotonic convergent
iterative learning control,IEEE Trans. Autom. Control, vol. 61, no. 4,
pp. 1063–1068, Apr. 2016.
[29] P. Janssens, G. Pipeleers, and J. Swevers, “A data-driven constrained
norm-optimal iterative learning control framework for LTI systems,”
IEEE Trans. Control Syst. Technol., vol. 21, no. 2, pp. 546–551, Mar.
2013.
[30] R. Chi, Z. Hou, and S. Jin, “A data-driven adaptive ILC for a
class of nonlinear discrete-time systems with random initial states and
iteration-varying target trajectory,” J. Frankl. Inst., vol. 352, no. 6,
pp. 2407–2424, Jun. 2015.
[31] R. Chi, Z. Hou, B. Huang, and S. Jin, “A unified data-driven design
framework of optimality-based generalized iterative learning control,
Comput. Chem. Eng., vol. 77, pp. 10–22, Jun. 2015.
[32] D. Li et al., “The synthesis of ILC–MPC controller with data-driven
approach for constrained batch processes,” IEEE Trans. Ind. Electron.,
vol. 67, no. 4, pp. 3116–3125, Apr. 2020.
[33] J. J. Craig, Introduction to Robotics: Mechanics and Control, vol. 3. Upper Saddle River, NJ, USA: Pearson/Prentice Hall, 2005.
[34] Y. Xiong, Robotics: Model, Control and Vision, 1st ed. Wuhan, China:
Huazhong Univ. Sci. Technol. Press, Apr. 2018, ch. 6, pp. 134–141.
[35] D. Guo and Y. Zhang, “Zhang neural network, Getz-Marsden dynamic
system, and discrete-time algorithms for time-varying matrix inversion
with application to robots’ kinematic control,” Neurocomputing, vol. 97,
pp. 22–32, Nov. 2012.
Shaoying He received the B.Eng. degree from
Central South University, Changsha, China, in 2016,
and the M.Eng. degree from Shanghai Jiao Tong
University, Shanghai, China, in 2019, where he
is currently pursuing the D.Eng. degree with the
Department of Automation.
His research interests include predictive control and robotics.
Wenbo Chen received the B.S. degree from the
East China University of Science and Technology,
Shanghai, China, in 2011. He is currently pursu-
ing the Ph.D. degree with the Automation Institute,
Shanghai Jiao Tong University, Shanghai.
His research interests cover the algorithms and applications of model predictive control.
Dewei Li received the B.S. and Ph.D. degrees in
automation from Shanghai Jiao Tong University,
Shanghai, China, in 1993 and 2009, respectively.
He is a Professor with the Department of
Automation, Shanghai Jiao Tong University, where
he worked as a Postdoctoral Researcher from 2009
to 2010. His research interests include predictive
control, robust control, and the related applications.
Yugeng Xi (Senior Member, IEEE) was born in
Shanghai, China. He received the Dr.-Ing. degree
in electrical engineering from Technical University
Munich, Munich, Germany, in 1984.
Since then, he has been with the Department of Automation, Shanghai Jiao Tong University, Shanghai, where he has been a Professor since 1988. He has
authored or coauthored three books and more than
300 journal papers. His research interests include
model-predictive control, optimization and con-
trol of large-scale network systems, and intelligent
robotic systems.
Prof. Xi is currently an Advisory Committee Member of the Asian Control
Association and an Honorary Council Member of the Chinese Association of
Automation.
Yunwen Xu (Member, IEEE) received the B.S.
degree in automation from the Nanjing University of
Science and Technology, Nanjing, China, in 2012,
and the M.S. and Ph.D. degrees in control sci-
ence and engineering from Shanghai Jiao Tong
University, Shanghai, China, in 2014 and 2019,
respectively.
She is currently a Postdoctoral Researcher with
the Department of Automation, Shanghai Jiao Tong
University. Her research interests include model-
predictive control, urban traffic modeling, and intel-
ligent control of complex systems.
Pengyuan Zheng (Member, IEEE) received the
B.Sc. degree in electrical engineering and automa-
tion from the North University of China, Taiyuan,
China, in 2000, the M.Sc. degree in measurement
technology and instrumentation from the University
of Shanghai for Science and Technology, Shanghai,
China, in 2005, and the Ph.D. degree in control
theory and control engineering from Shanghai Jiao
Tong University, Shanghai, in 2010.
He was a Postdoctoral Research Fellow with
Shanghai Jiao Tong University from 2012 to 2014.
Since 2014, he has been an Associate Professor with the College of
Automation Engineering, Shanghai University of Electric Power, Shanghai.
His research interests include predictive control and optimization for
microgrids.
Authorized licensed use limited to: Shanghai Jiaotong University. Downloaded on January 07,2021 at 00:58:16 UTC from IEEE Xplore. Restrictions apply.
... However, the algorithm was restricted on a need that the parameters were assumed to be constants within each iteration [39], [40]. To increase flexibility of the AILC structure for robotic systems, in [30], [64], new ILC approaches were studied in which data-driven learning laws were adopted to compensate for the systematic uncertainties and to accelerate convergence rates. Unfortunately, since the data-driven ILC methods were designed based on linear time-variant models, they were only applied either to generate referenced joint velocities of general robotic control systems or to control simple mechatronic systems such as gantry robots that have joints in orthogonal configurations and independent control structures. ...
... Variation of the estimation error on the iterative axis is described as in (32) by using the adaptation rule (30): ...
... This result indicates that the estimation error (w i,k ) is bounded, and Lemma 2 has been thus proven. Theorem 2: If the iterative disturbance (25) is bounded and the iterative control signal is updated by the rules (21), (26), (28), (30) and (31), variation of the iterative error (23) is stable. Proof: By using (27)-(28), the variation (24) is simplified in an element-wise form ...
Article
Full-text available
Iterative Learning Control (ILC) is known as a high-accuracy control strategy for repetitive control missions of mechatronic systems. However, applying such learning controllers for robotic manipulators to result in excellent control performances is now a challenge due to unstable behaviors coming from nonlinearities, uncertainties and disturbances in the system dynamics. To tackle this challenge, in this paper, we present a novel proportional-derivative iterative second-order neural-network learning control (PDISN) method for motion-tracking control problems of robotic manipulators. The control framework is structured from time- and iterative-base control layers. First of all, the total systematic dynamics are concretely stabilized by a conventional Proportional-Derivative (PD) control signal in the time domain. The control objective is then accomplished by using an intelligent ILC decision generated in the second layer to compensate for other nonlinear uncertainties and external disturbances in the dynamics. The iterative signal is flexibly composed from various information on the iterative axis. On one hand, the previous iterative control signal is inherently reused in the current iteration but with an appropriate portion based on reliability of the current control performance. On the other hand, the iterative-based modeling deviation remaining is treated by a functional neural network that is specially activated by a second-order learning law and information synthesized from the current and previous iterations. Stabilities of the time-based nonlinear subsystem and overall system are rigorously analyzed using extended Lyapunov theories and high-order regression series criteria. Effectiveness of the proposed controller was intensively verified by the extensive comparative simulation results. Key advantages of the proposed control method are chattering-free, universal, adaptive, and robust.
... Since the matrix Q and the vector c are calculated based on the information of H and x_k(0) before determining u_{k+1}, this ILC law is model-based. However, the matrix H remains unknown in the case we consider, which is the most significant difference from the existing work [12,13,20]. In the following, we seek to develop a data-driven ILC approach to eliminate the requirement for model information. ...
Conference Paper
Full-text available
This paper discusses optimal batch-to-batch (B2B) control problems and presents a gradient descent solution for unknown linear batch process systems. Using historical process data, we design a model-free method for B2B optimization that eliminates the need for model information about the system. Formulating the optimal controller design as a quadratic program (QP), we first present the optimal iterative learning control (ILC) results. Next, using the gradient descent method, we replace the uncertain term with actual measurements and develop a new ILC approach based on convex hull representations of uncertain realizations. Compared to norm-optimal ILC, the proposed ILC guarantees superior performance with reasonably selected parameters. Finally, we demonstrate the design with an illustrative numerical example.
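A minimal sketch of the gradient-descent flavor of such a B2B/ILC scheme is given below (the probing-based identification and the toy lifted plant are assumptions, not the paper's algorithm): the lifted trial model e_k = r - G u_k has unknown G, so an estimate fit from past input-output increments supplies the descent direction on the squared error.

    # Gradient-descent ILC on a lifted linear trial model (toy assumptions).
    import numpy as np

    rng = np.random.default_rng(0)
    T = 30
    G = np.tril(rng.uniform(0.5, 1.0, (T, T)))   # unknown lifted plant (toy)
    r = np.sin(np.linspace(0, np.pi, T))         # repeated reference

    # Phase 1: probing trials identify G_hat from input/output increments,
    # since for a linear trial model delta_y = G @ delta_u.
    U, Y = [], []
    for _ in range(2 * T):
        du = rng.standard_normal(T) * 0.01
        U.append(du)
        Y.append(G @ du)                          # measured output increment
    G_hat = np.linalg.lstsq(np.array(U), np.array(Y), rcond=None)[0].T

    # Phase 2: gradient-descent ILC using only G_hat and measured errors.
    alpha = 0.5 / np.linalg.norm(G_hat, 2) ** 2   # step size from the estimate
    u = np.zeros(T)
    for k in range(50):
        e = r - G @ u                             # measured tracking error
        u = u + alpha * (G_hat.T @ e)             # descent step on ||e||^2
    print("final error norm:", np.linalg.norm(r - G @ u))

Because the error map e_{k+1} = (I - alpha * G @ G_hat.T) @ e_k is a contraction for this step size when G_hat is close to G, the error norm decays monotonically in the sketch.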
Article
This work proposes an economic model-free super twisting control (STWC) algorithm for the FCT of a singularly perturbed multiagent system (MAS). Specifically, the intelligent model-free control framework is designed as the sum of a MISTWC and an iterative learning control (ILC). First, time scales are artificially introduced into the STWC for multiagent formation construction, without overestimating the control gains. Then, the input-output data collected from the iterative experiments are used to learn a model of the unknown repeated uncertainties and drive the whole system toward satisfactory consensus tracking performance. By utilizing the ε-dependent Lyapunov method, the convergence properties of the STWC-type ILC are rigorously analyzed in both the iteration domain and the time domain. A selection method for the design parameters is also provided. Simulation results validate the effectiveness of the proposed controller in terms of formation construction, trajectory tracking, and robustness to system uncertainties.
Article
For unachievable tracking problems, where the system output cannot precisely track a given reference, achieving the best possible approximation of the reference trajectory becomes the objective. This study investigates solutions using the P-type learning control scheme. Initially, we demonstrate the necessity of gradient information for achieving the best approximation. Subsequently, we propose an input-output-driven learning gain design to handle the imprecise gradients of a class of uncertain systems. However, the desired performance may not be attainable when faced with incomplete information. To address this issue, an extended iterative learning control scheme is introduced, in which the tracking errors are modified through output data sampling; this incorporates a low memory footprint and offers flexibility in the learning gain design. The input sequence is shown to converge to the desired input, resulting in the output that is closest to the given reference in the least-squares sense. Numerical simulations are provided to validate the theoretical findings.
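For reference, the P-type scheme the abstract builds on is conventionally written as (standard textbook notation, not necessarily the paper's own):

    u_{k+1}(t) = u_k(t) + \Gamma e_k(t+1), \qquad e_k(t) = y_d(t) - y_k(t).

In the lifted form y_k = G u_k, the best achievable approximation is the least-squares limit u_* = \arg\min_u \| y_d - G u \|_2^2, characterized by G^\top (y_d - G u_*) = 0, which is why gradient information about G matters when exact tracking is unachievable.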
Article
The dielectric elastomer actuator (DEA) is an intelligent device with integrated actuation and sensing capacities, which has prospective applications in the field of soft robotics. Previous studies mainly focus on the actuation capacity of the DEA, while studies on its sensing capacity and the practical realization of its actuation-sensing integration still face great challenges. This article presents a self-sensing motion control scheme that realizes the tracking control objective of the DEA without an external displacement sensor. First, a self-sensing model of the DEA is established based on a nonlinear autoregressive with exogenous inputs (NARX) neural network to predict its own displacement. Then, adopting the dynamic model of the DEA as the control object, a simulation environment based on the iterative learning control architecture is built to obtain the feedforward control sequence for tracking a target trajectory. Next, to enhance the control quality, a proportional-integral feedback controller is combined with the feedforward control sequence to handle the uncertainties in practical experiments, where the feedback signal is the output of the self-sensing model. Finally, several trajectory tracking control experiments are performed. The maximum relative root-mean-square error over all experimental results is less than 3.60%, which illustrates the effectiveness of the presented self-sensing motion control scheme.
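The NARX structure named in the abstract predicts the next output from lagged outputs and lagged inputs, y(t) = f(y(t-1), ..., u(t-1), ...). A minimal sketch follows (the toy plant is an assumption, and f is fit here by linear least squares for self-containment, whereas the paper uses a neural network):

    # Minimal NARX-style one-step predictor (illustrative assumptions).
    import numpy as np

    rng = np.random.default_rng(1)
    T = 500
    u = rng.uniform(-1, 1, T)
    y = np.zeros(T)
    for t in range(2, T):            # toy nonlinear plant (assumption)
        y[t] = 0.6 * y[t-1] - 0.1 * y[t-2] + 0.5 * u[t-1] + 0.2 * np.tanh(u[t-2])

    # NARX regressors phi(t) = [y(t-1), y(t-2), u(t-1), u(t-2), 1].
    Phi = np.column_stack([y[1:-1], y[:-2], u[1:-1], u[:-2], np.ones(T - 2)])
    theta = np.linalg.lstsq(Phi, y[2:], rcond=None)[0]
    y_hat = Phi @ theta
    print("one-step prediction RMSE:", np.sqrt(np.mean((y[2:] - y_hat) ** 2)))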
Article
Iterative learning model predictive control (ILMPC) has been recognized as an excellent batch process control strategy for progressively improving tracking performance along trials. However, as a typical learning-based control method, ILMPC generally requires strictly identical trial lengths to implement 2-D receding-horizon optimization. The randomly varying trial lengths that extensively exist in practice can result in insufficient learning prior information, and even the suspension of the control update. To address this issue, this article embeds a novel prediction-based modification mechanism into ILMPC, adjusting the process data of each trial to the same length by compensating the data of absent running periods with the predictive sequences at the end point. Under this modification scheme, it is proved that the convergence of the classical ILMPC is guaranteed by an inequality condition related to the probability distribution of trial lengths. Considering practical batch processes with complex nonlinearity, a 2-D neural-network predictive model with parameter adaptability along trials is established to generate highly matched compensation data for the prediction-based modification. To best utilize the real process information of multiple past trials while guaranteeing the learning priority of the latest trials, an event-based switching learning structure is proposed in ILMPC to determine different learning orders according to the probability event with respect to the direction of trial length variation. The convergence of the nonlinear event-based switching ILMPC system is analyzed theoretically under the two situations divided by the switching condition. Simulations on a numerical example and the injection molding process verify the superiority of the proposed control methods.
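The prediction-based modification idea can be sketched as follows (a toy one-step model and assumed notation, not the paper's 2-D neural-network predictor): a trial that stopped early at length L < T is extended to the full length T by rolling a predictive model forward from its end point, so the learning update always sees same-length data.

    # Padding a truncated trial with model predictions (toy assumptions).
    import numpy as np

    T = 50
    a, b = 0.9, 0.5                             # toy one-step model: y+ = a*y + b*u

    def pad_trial(y, u, T):
        """Extend measured outputs y (length L) to length T with predictions."""
        y_full = list(y)
        for t in range(len(y), T):
            u_t = u[t] if t < len(u) else u[-1] # hold last input if none planned
            y_full.append(a * y_full[-1] + b * u_t)
        return np.array(y_full)

    L = 37                                       # this trial ended early
    u_plan = np.ones(T) * 0.2                    # planned input profile
    y_meas = np.zeros(L)
    for t in range(1, L):                        # simulate the truncated trial
        y_meas[t] = a * y_meas[t-1] + b * u_plan[t-1]
    y_padded = pad_trial(y_meas, u_plan, T)
    print(len(y_meas), "->", len(y_padded))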
Article
This article proposes a dynamic time warping (DTW)-based iterative learning control (ILC) scheme for discrete-time nonlinear systems to tackle the path learning problem with varying trial lengths and a terminus constraint. By incorporating an improved DTW algorithm, the varying trial lengths are aligned to a desired length. Meanwhile, this algorithm finds the most similar waypoints between the output and desired paths, which can be used to design an updating law and facilitate the convergence of path learning. Different from existing ILC methods based on a probability distribution function for learning a trajectory in the time domain, the method in this article can be applied to learn the spatial path corresponding to the desired trajectory. Furthermore, the learning property in the presence of variable initial states is discussed under the proposed method. Several illustrative examples manifest the validity of the proposed DTW-based ILC algorithm.
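As a point of reference, below is a minimal standard DTW alignment (textbook DTW, not the paper's improved variant) showing how a shorter trial output can be paired waypoint-by-waypoint with a desired path of a different length; an ILC update law could then act on the paired errors.

    # Standard dynamic time warping on 1-D sequences (illustrative sketch).
    import numpy as np

    def dtw_path(a, b):
        """Return the DTW alignment path between 1-D sequences a and b."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        # Backtrack from (n, m) to recover the alignment pairs.
        path, i, j = [], n, m
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return path[::-1]

    desired = np.sin(np.linspace(0, np.pi, 100))      # desired path, length 100
    trial = np.sin(np.linspace(0, np.pi, 83)) + 0.02  # shorter, offset trial
    print("aligned pairs:", len(dtw_path(trial, desired)))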
Article
Full-text available
Injection molding, a polymer processing technique that converts thermoplastics into a variety of plastic products, is a complicated nonlinear dynamic process involving the interaction of different groups of variables, including the machine, the mold, the material, and the process parameters. As the injection molding process operates sequentially in phases, we treat it as a batch process. This review discusses the batch nature of injection molding and identifies the three main objectives for its future development: higher efficiency, greater profitability, and longer sustainability. From the perspective of systems engineering, our discussion centers on the primary challenges for the batch operation of injection molding systems: 1) model development in the face of product changes; 2) control strategies in the face of dynamic changes; 3) data analysis and process monitoring; and 4) safety assurance and quality improvement, together with the current progress made in addressing these challenges. In light of the advancement of new information technologies, this paper identifies several opportunities and encourages further research that may break existing capability limits and develop the next generation of automation solutions to bring about a revolution in this area.
Article
The networked structure has attracted significant attention due to the high demand from industrial systems and the rapid development of network communication. Among various kinds of network randomness, fading is a common phenomenon, which can lead to signal attenuation, distortion, loss, and interference. This study concentrates on the point-to-point tracking problem over fading communications by proposing a reference update strategy. Using this strategy, the tracking performance is continuously improved as the number of iterations increases, even with faded information. A learning control scheme is established and proved convergent in both the mean-square and almost-sure senses under mild conditions. The convergence rate is accelerated by introducing a virtual reference, compared with the traditional update approach. Illustrative simulations verify the theoretical results.
Article
Full-text available
In this paper, we present a novel iterative learning control (ILC) algorithm for the leader-follower formation tracking problem of a class of nonlinear multiagent systems that are subject to actuator faults. Unlike most ILC works, which require identical reference trajectories over the iteration domain, the desired line-of-sight (LOS) range and angle profiles can be iteration dependent, based on the different tasks and environments in each iteration. Furthermore, the LOS range and angle tracking errors are subject to iteration- and time-dependent constraint requirements. Both parametric and nonparametric system unknowns and uncertainties, in particular the control input gain functions that are not fully known, are considered. We show that under the proposed algorithm, the formation tracking errors converge to zero uniformly over the iteration domain beyond a certain time interval in each iteration, while the constraint requirements on the LOS range and angle are not violated during operation. A numerical simulation involving two agents in leader-follower formation is presented at the end to demonstrate the efficacy of the proposed algorithm.
Article
Full-text available
This paper presents an approach to deal with model uncertainty in iterative learning control (ILC). Model uncertainty generally degrades the performance of conventional learning algorithms. To deal with this problem, a robust worst-case norm-optimal ILC design is introduced. The design problem is reformulated as a convex optimization problem, which can be solved efficiently. The paper also shows that the proposed robust ILC is equivalent to conventional norm-optimal ILC with trial-varying parameters; accordingly, the design trade-off between robustness and convergence speed is analyzed.
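A common lifted-trial formulation of such a worst-case design (the notation below is a standard textbook form, assumed rather than taken from the paper) is

    \min_{u_{k+1}} \; \max_{\|\Delta\| \le \delta} \; \| r - (G + \Delta) u_{k+1} \|_Q^2 + \| u_{k+1} - u_k \|_R^2,

where G is the nominal lifted plant, \Delta the norm-bounded model uncertainty, and Q, R weighting matrices. The inner maximization makes the update robust to the worst admissible \Delta, which is exactly the source of the robustness-versus-convergence-speed trade-off the abstract analyzes.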
Article
In this article, a cooperative adaptive iterative learning fault-tolerant control (CAILFTC) algorithm with a radial basis function neural network (RBFNN) is proposed for multiple subway trains subject to time- and iteration-dependent actuator faults, using the multiple-point-mass dynamics model. First, an RBFNN is utilized to cope with the unknown nonlinearity of the subway train system. Next, a composite energy function (CEF) technique is applied to establish the convergence property of the presented CAILFTC, which guarantees that all train speed tracking errors converge asymptotically along the iteration axis; meanwhile, the headway distances of neighboring subway trains are kept within a safe range. Finally, the effectiveness of the theoretical studies is verified through a subway train simulation.
Article
Iterative learning control combined with model predictive control (ILC-MPC) is an effective control method for constrained batch processes. However, in real applications, model uncertainty usually makes the controlled process converge slowly to the reference trajectory. To eliminate the restrictions in previous works, a data-driven approach is proposed, which directly describes the relationship between inputs and outputs according to the past data. Based on this method, a novel data-driven ILC-MPC controller is proposed, where a two-mode framework and an invariant updating strategy are employed to guarantee convergence. Since the outputs caused by model uncertainty are partly known from the past data, better performance can be achieved by the proposed design, which is verified by experimental studies on a manipulator.
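The core idea that "outputs caused by model uncertainty are partly known from the past data" can be sketched as follows (a toy additive-uncertainty setup, assumed for illustration only): with a repeated uncertainty d entering every trial identically, y_k = G_nom u_k + d, the previous trial's data reveal d_hat = y_k - G_nom u_k, and the next input can pre-compensate it instead of relying on a conservative robust bound.

    # Data-driven compensation of a repeated uncertainty (toy assumptions).
    import numpy as np

    T = 40
    G_nom = np.tril(np.ones((T, T))) * 0.1        # nominal lifted model (toy)
    d = 0.3 * np.sin(np.linspace(0, 3, T))        # unknown repeated uncertainty
    r = np.linspace(0, 1, T)                      # repeated reference

    u = np.linalg.lstsq(G_nom, r, rcond=None)[0]  # nominal-model-only input
    for k in range(5):
        y = G_nom @ u + d                          # run trial, measure output
        d_hat = y - G_nom @ u                      # uncertainty seen in the data
        # Re-solve tracking with the estimated uncertainty compensated.
        u = np.linalg.lstsq(G_nom, r - d_hat, rcond=None)[0]
        print(f"trial {k}: error norm = {np.linalg.norm(r - y):.4f}")

In this idealized sketch the uncertainty is purely additive and repeated, so one trial of data suffices; the point is only to show where the past input-output data enter the update.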
Article
The dynamical relationship among multiple agents' behaviors in a networked system is explored and utilized to enhance the control performance of multiagent formation in this paper. An adjacent-agent dynamic linearization is first presented for nonlinear and nonaffine multiagent systems (MASs), and a virtual linear difference model is built between two adjacent agents communicating with each other. Considering causality, the agents are assigned as parent and child, respectively, with communication from parent to child. Taking advantage of the repetitive characteristics of a large class of MASs, an adjacent-agent dynamic linearization-based iterative learning formation control (ADL-ILFC) is proposed for the child agent using 3-D control knowledge from iterations, time instants, and the parent agent. The ADL-ILFC is a data-driven method that depends not on a first-principles physical model but on the virtual linear difference model. The validity of the proposed approach is demonstrated through rigorous analysis and extensive simulations.
Article
Motivated by the safety requirements of rehabilitation robotic systems for post-stroke patients, this paper handles position or output constraints in robotic manipulators when patients repeat the same task with the robot. To handle output constraints when all state information is available, a state feedback controller can ensure that the output constraints are satisfied while iterative learning control (ILC) is used to learn the desired control input through iterations. By carefully incorporating feedback control using a barrier Lyapunov function with feedforward control (ILC), the convergence of the tracking error, the boundedness of the internal state, and the boundedness of the input signals can be guaranteed, along with the satisfaction of the output constraints over iterations. The effectiveness of the proposed controller is demonstrated using simulations based on the model of EMU, a rehabilitation robotic system.
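For context, a standard log-type barrier Lyapunov function (a textbook form; the paper's exact choice may differ) for keeping a tracking error e within a bound k_b is

    V(e) = \frac{1}{2} \ln \frac{k_b^2}{k_b^2 - e^2}, \qquad |e| < k_b,

which grows without bound as |e| \to k_b, so any controller that keeps V bounded automatically keeps the output constraint |e| < k_b satisfied for all iterations.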
Article
Most works on iterative learning control (ILC) assume identical reference trajectories for the system state over the iteration domain. This fundamental assumption may not always hold in practice, where the desired trajectories or control objectives may be iteration dependent. In this paper, we relax this fundamental assumption by introducing a new way of modifying the reference trajectories. The concept of modifier functions is introduced for the first time in the ILC literature. The proposed approach is also a unified framework that can handle other common types of initial conditions in ILC. Multi-input multi-output nonlinear systems are considered, which can be subject to actuator faults. Time- and iteration-dependent constraint requirements on the system output can be effectively handled. Backstepping design and the composite energy function approach are used in the analysis. We show that, in the closed-loop analysis, the proposed control scheme guarantees uniform convergence of the full state tracking error over the iteration domain, beyond a small initial time interval in each iteration, while the constraint requirements on the system output are never violated. In the end, two simulation examples illustrate the efficacy of the proposed ILC algorithm.
Article
This paper addresses a flexible micro aerial vehicle (MAV) under spatiotemporally varying disturbances, which is composed of a rigid body and two flexible wings. Based on Hamilton's principle, a distributed parameter system coupled in bending and twisting is modeled. Two iterative learning control (ILC) schemes are designed to suppress the vibrations in bending and twisting, reject the distributed disturbances, and regulate the displacement of the rigid body to track a prescribed constant trajectory. On the basis of a composite energy function, the boundedness and the learning convergence are proved for the closed-loop MAV system. Simulation results are provided to illustrate the effectiveness of the proposed ILC laws.
Article
High-speed motion of a mechanical arm is necessary to speed up the jobs done by the arm. At high speed, however, the desired trajectory of arm motion cannot be obtained simply by applying the trajectory function to the servo system as the reference function, because the time lag in the servo system is not negligible. One solution to this problem is to apply dynamically compensating computed torques to the servo system. With this method, however, a very large effort would be required to increase the accuracy of the mathematical model of the arm needed to compute the compensating torques. To avoid this difficulty, an alternative method of correcting the reference function by trial is useful. By repeating a proper process of trial and correction, the reference function that realizes the desired trajectory pattern may be obtained. In this paper, the correcting algorithm of the reference function for this method is investigated theoretically from the standpoint of the stability, or convergence, of the trial-and-correction process, and a stable correcting algorithm is obtained. Through experiments using a mechanical arm with six degrees of freedom controlled by a digital computer, it is confirmed that the trial-and-correction process under this algorithm is stable and the response of the servo system converges rapidly to the desired trajectory pattern.
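The trial-and-correction idea described here was later formalized in the ILC literature as update laws of the form (Arimoto's D-type law, written in standard modern notation rather than this paper's own):

    u_{k+1}(t) = u_k(t) + \Gamma \dot{e}_k(t), \qquad e_k(t) = y_d(t) - y_k(t),

where the correction applied to the (k+1)th trial's reference is built entirely from the error recorded during the kth trial, so no accurate dynamic model of the arm is required.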
Article
An advanced control strategy is necessary to ensure high efficiency and high load-following capability in the operation of a modern power plant. Model predictive control (MPC) has been widely used for controlling power plants. Nevertheless, MPC needs to further improve its learning ability, especially as power plants are nonlinear under load-cycling operation. Iterative learning control (ILC) and MPC are both popular approaches in industrial process control and optimization. The integration of model-based ILC with real-time feedback MPC constitutes model predictive iterative learning control (MPILC). Considering the power plant, this paper presents a nonlinear model predictive controller based on iterative learning control (NMPILC). The nonlinear power plant dynamics are described by a fuzzy model that contains local linear models. The resulting NMPILC is constructed based on this fuzzy model. Optimal performance is realized along both the time index and the iteration index. The convergence property has been proven under the fuzzy model. Detailed analysis and simulations on a drum-type boiler-turbine system show the effectiveness of the fuzzy-model-based NMPILC.
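The "fuzzy model containing local linear models" referred to here is commonly written in Takagi-Sugeno form (standard notation, assumed rather than taken from the paper):

    x(t+1) = \sum_{i=1}^{N} \mu_i(z(t)) \left( A_i x(t) + B_i u(t) \right), \qquad \sum_{i=1}^{N} \mu_i(z(t)) = 1,

where each pair (A_i, B_i) is a local linear model valid around one operating point (for example, one load level) and the membership functions \mu_i blend the local models as the scheduling variable z(t) moves across the operating range.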