MATLAB Simulation of Gradient-Based Neural
Network for Online Matrix Inversion
Yunong Zhang, Ke Chen, Weimu Ma, and Xiao-Dong Li
Department of Electronics and Communication Engineering
Sun Yat-Sen University, Guangzhou 510275, China
ynzhang@ieee.org
Abstract. This paper investigates the simulation of a gradient-based
recurrent neural network for online solution of the matrix-inverse prob-
lem. Several important techniques are employed as follows to simulate
such a neural system. 1) Kronecker product of matrices is introduced to
transform a matrix-differential-equation (MDE) to a vector-differential-
equation (VDE); i.e., finally, a standard ordinary-differential-equation
(ODE) is obtained. 2) MATLAB routine “ode45” is introduced to solve
the transformed initial-value ODE problem. 3) In addition to various im-
plementation errors, different kinds of activation functions are simulated
to show the characteristics of such a neural network. Simulation results
substantiate the theoretical analysis and efficacy of the gradient-based
neural network for online constant matrix inversion.
Keywords: Online matrix inversion, Gradient-based neural network,
Kronecker product, MATLAB simulation.
1 Introduction
The problem of matrix inversion is considered to be one of the basic problems
widely encountered in science and engineering. It is usually an essential part of
many solutions; e.g., as preliminary steps for optimization [1], signal-processing
[2], electromagnetic systems [3], and robot inverse kinematics [4]. Since the mid-
1980’s, efforts have been directed towards computational aspects of fast matrix
inversion and many algorithms have thus been proposed [5]-[8]. It is known that
the minimal arithmetic operations are usually proportional to the cube of the
matrix dimension for numerical methods [9], and consequently such algorithms
performed on digital computers are not efficient enough for large-scale online
applications. In view of this, some O(n²)-operation algorithms were proposed
to remedy this computational problem, e.g., in [10][11]. However, they may be
still not fast enough; e.g., in [10], it takes on average around one hour to invert
a 60000-dimensional matrix. As a result, parallel computational schemes have
been investigated for matrix inversion.
The dynamic system approach is one of the important parallel-processing
methods for solving matrix-inversion problems [2][12]-[18]. Recently, due to the
in-depth research in neural networks, numerous dynamic and analog solvers
based on recurrent neural networks (RNNs) have been developed and inves-
tigated [2][13]-[18]. The neural dynamic approach is thus regarded as a powerful
alternative for online computation because of its parallel distributed nature and
convenience of hardware implementation [4][12][15][19][20].
To solve for a matrix inverse, the neural-system design is based on the equation AX − I = 0, with A ∈ R^{n×n}. We can define a scalar-valued energy function such as E(t) = ‖AX(t) − I‖²/2. Then, we use the negative of the gradient ∂E/∂X = A^T(AX(t) − I) as the descent direction. As a result, the classic linear model is as follows:

    Ẋ(t) = −γ ∂E/∂X = −γ A^T(AX(t) − I),   X(0) = X_0,    (1)

where the design parameter γ > 0, being an inductance parameter or the reciprocal of a capacitive parameter, is set as large as the hardware permits, or is selected appropriately for experiments.
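For completeness, the gradient used above follows from a standard matrix-calculus computation (a short derivation stated here for the reader, not spelled out in the original text): writing E(t) = tr((AX(t) − I)^T (AX(t) − I))/2 and noting that dE = tr((AX(t) − I)^T A dX), we obtain

    ∂E/∂X = A^T(AX(t) − I),

which is exactly the descent direction employed in (1).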
As proposed in [21], the following general neural model extends the above design approach by a nonlinear activation-function array F:

    Ẋ(t) = −γ A^T F(AX(t) − I),    (2)

where X(t), starting from an initial condition X(0) = X_0 ∈ R^{n×n}, is the activation state matrix corresponding to the theoretical inverse A^{-1} of matrix A. As in (1), the design parameter γ > 0 is used to scale the convergence rate of the neural network (2), while F(·): R^{n×n} → R^{n×n} denotes a matrix activation-function mapping of the neural network.
2 Main Theoretical Results
In view of equation (2), different choices of F may lead to different performance. In general, any strictly monotonically increasing odd activation function f(·), being an element of the matrix mapping F, may be used for the construction of the neural network. In order to demonstrate the main ideas, four types of activation functions are investigated in our simulation:

– the linear activation function f(u) = u;
– the bipolar sigmoid function f(u) = (1 − exp(−ξu))/(1 + exp(−ξu)) with ξ ≥ 2;
– the power activation function f(u) = u^p with odd integer p ≥ 3; and
– the following power-sigmoid activation function

    f(u) = u^p,                                                            if |u| ≥ 1,
    f(u) = ((1 + exp(−ξ))/(1 − exp(−ξ))) · (1 − exp(−ξu))/(1 + exp(−ξu)),  otherwise,    (3)

with suitable design parameters ξ ≥ 1 and p ≥ 3.
Other types of activation functions can be generated from these four basic types. Following the analysis results of [18][21], the convergence properties obtained with different activation functions are qualitatively presented as follows.
Proposition 1. [15]-[18][21] For a nonsingular matrix A ∈ R^{n×n}, any strictly monotonically increasing odd activation-function array F(·) can be used for constructing the gradient-based neural network (2).

1. If the linear activation function is used, then global exponential convergence is achieved for neural network (2), with convergence rate proportional to the product of γ and the minimum eigenvalue of A^T A.
2. If the bipolar sigmoid activation function is used, then superior convergence can be achieved for the error range [−δ, δ], δ ∈ (0, 1), as compared to the linear-activation-function case. This is because the error signal e_ij = [AX − I]_ij in (2) is amplified by the bipolar sigmoid function over the error range [−δ, δ].
3. If the power activation function is used, then superior convergence can be achieved for the error ranges (−∞, −1] and [1, +∞), as compared to the linear-activation-function case. This is because the error signal e_ij = [AX − I]_ij in (2) is amplified by the power activation function over the error ranges (−∞, −1] and [1, +∞).
4. If the power-sigmoid activation function is used, then superior convergence can be achieved for the whole error range (−∞, +∞), as compared to the linear-activation-function case. This follows from Properties 2 and 3 above.
In the analog implementation or simulation of the gradient-based neural networks (1) and (2), we usually assume that they operate under ideal conditions. In practice, however, there are always some realization errors involved. For example, for the linear activation function, its imprecise implementation may behave more like a sigmoid or piecewise-linear function because of the finite gain and frequency dependence of operational amplifiers and multipliers. For these realization errors possibly appearing in the gradient-based neural network (2), we have the following theoretical results.
Proposition 2. [15]-[18][21] Consider the perturbed gradient-based neural model

    Ẋ(t) = −γ (A + ΔA)^T F((A + ΔA)X(t) − I),

where the additive term ΔA satisfies ‖ΔA‖ ≤ ε_1 with ε_1 ≥ 0. Then the steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the resultant matrix A + ΔA is still nonsingular.
For the model-implementation error due to the imprecise implementation of the system dynamics, the following perturbed dynamics is considered, as compared to the original dynamic equation (2):

    Ẋ(t) = −γ A^T F(AX(t) − I) + ΔB,    (4)

where the additive term ΔB satisfies ‖ΔB‖ ≤ ε_2 with ε_2 ≥ 0.
Proposition 3. [15]-[18][21] Consider the imprecise implementation (4). The steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the design parameter γ is large enough (the so-called design-parameter requirement). Moreover, this steady-state residual error can be made arbitrarily close to zero as γ tends to positive infinity.
In addition to the above propositions, we have the following general observations.

1. For a large entry error (e.g., |e_ij| > 1 with e_ij := [AX − I]_ij), the power activation function amplifies the error signal (|e_ij^p| > ··· > |e_ij^3| > |e_ij| > 1), and is thus able to automatically remove the design-parameter requirement.
2. For a small entry error (e.g., |e_ij| < 1), using sigmoid activation functions yields better convergence and robustness than using linear activation functions, because of the larger slope of the sigmoid function near the origin.

Thus, using the power-sigmoid activation function in (3) is theoretically a better choice than the other activation functions for superior convergence and robustness.
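As a minimal numerical illustration of these two observations (our own sketch using anonymous functions, with ξ = 4 and p = 3 assumed as in the later simulations), the following MATLAB fragment evaluates the three basic activation functions at a small error 0.5 and a large error 2.

% Sketch: compare activation outputs for a small and a large entry error
f_lin=@(u) u;
f_sig=@(u) (1-exp(-4*u))./(1+exp(-4*u));   % bipolar sigmoid with xi=4
f_pow=@(u) u.^3;                           % power function with p=3
e=[0.5 2];
disp([f_lin(e); f_sig(e); f_pow(e)]);      % rows: linear, sigmoid, power

At e = 0.5 the sigmoid output (about 0.76) exceeds the linear output 0.5, whereas at e = 2 the power output 8 dominates, which is precisely the amplification behaviour described above.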
3 Simulation Study
While Section 2 presents the main theoretical results of the gradient-based neural
network, this section will investigate the MATLAB simulation techniques in
order to show the characteristics of such a neural network.
3.1 Coding of Activation Function
To simulate the gradient-based neural network (2), the activation functions are
to be defined firstly in MATLAB. Inside the body of a user-defined function,
the MATLAB routine “nargin” returns the number of input arguments which
are used to call the function. By using “nargin”, different kinds of activation
functions can be generated at least with their default input argument(s).
The linear activation-function mapping F(X) = X ∈ R^{n×n} can be generated simply by using the following MATLAB code.
function output=Linear(X)
output=X;
The sigmoid activation-function mapping F(·) with ξ = 4 as its default input value can be generated by using the following MATLAB code.
function output=Sigmoid(X,xi)
if nargin==1, xi=4; end
output=(1-exp(-xi*X))./(1+exp(-xi*X));
The power activation-function mapping F(·) with p = 3 as its default input value can be generated by using the following MATLAB code.
function output=Power(X,p)
if nargin==1, p=3; end
output=X.^p;
The power-sigmoid activation function defined in (3), with ξ = 4 and p = 3 as its default values, can be generated as below.
function output=Powersigmoid(X,xi,p)
if nargin==1, xi=4; p=3;
elseif nargin==2, p=3;
end
output=(1+exp(-xi))/(1-exp(-xi))*(1-exp(-xi*X))./(1+exp(-xi*X));
i=find(abs(X)>=1);
output(i)=X(i).^p;
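As a quick usage check (not part of the original paper), note that all four mappings operate elementwise on a matrix argument, and that the power-sigmoid code above switches to the power branch exactly where |u| ≥ 1:

% Elementwise check of the activation-function mappings defined above
U=[0.5 -0.5; 2 -2];
disp(Powersigmoid(U));   % entries with |u|>=1 pass through u^3, giving 8 and -8
disp(Sigmoid(U));        % bipolar sigmoid with the default xi=4
disp(Power(U));          % cubes every entry
disp(Linear(U));         % returns U unchanged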
3.2 Kronecker Product and Vectorization
The dynamic equations of the gradient-based neural networks (2) and (4) are described in matrix form, and thus cannot be passed directly to standard ODE solvers, which expect a vector state. To simulate such neural systems, the Kronecker product of matrices and the vectorization technique are introduced in order to transform the matrix-form differential equations into vector-form differential equations.
In the general case, given matrices A = [a_ij] ∈ R^{m×n} and B = [b_ij] ∈ R^{p×q}, the Kronecker product of A and B is denoted by A ⊗ B and is defined to be the following block matrix:

    A ⊗ B := [a_11 B  ···  a_1n B;  ⋮  ⋱  ⋮;  a_m1 B  ···  a_mn B] ∈ R^{mp×nq}.

It is also known as the direct product or tensor product. Note that in general A ⊗ B ≠ B ⊗ A. Specifically, for our case, I ⊗ A = diag(A, ..., A).

Also in the general case, given X = [x_ij] ∈ R^{m×n}, we can vectorize X as a vector vec(X) ∈ R^{mn×1}, which is defined as

    vec(X) := [x_11, ..., x_m1, x_12, ..., x_m2, ..., x_1n, ..., x_mn]^T.

As stated in [22], with X unknown and A ∈ R^{m×n} and B ∈ R^{p×q} given, the matrix equation AX = B is equivalent to the vector equation (I ⊗ A) vec(X) = vec(B).
Based on the above Kronecker product and vectorization technique, for simulation purposes, the matrix differential equation (2) can be transformed into a vector differential equation. We thus obtain the following theorem.

Theorem 1. The matrix-form differential equation (2) can be reformulated as the following vector-form differential equation:

    vec(Ẋ(t)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)),    (5)

where the activation-function mapping F(·) in (5) is defined the same as in (2), except that its dimensions are changed hereafter to F(·): R^{n²×1} → R^{n²×1}.
Proof. For readers’ convenience, we repeat the matrix-form differential equation (2) here as Ẋ(t) = −γ A^T F(AX(t) − I).

By vectorizing equation (2) based on the Kronecker product and the above vec(·) operator, the left-hand side of (2) becomes vec(Ẋ(t)), and the right-hand side of (2) becomes

    vec(−γ A^T F(AX(t) − I)) = −γ vec(A^T F(AX(t) − I)) = −γ (I ⊗ A^T) vec(F(AX(t) − I)).    (6)

Note that, as shown in Subsection 3.1, the definition and coding of the activation-function mapping F(·) are very flexible, and F(·) can equally be viewed as a vectorized (elementwise) mapping from R^{n²×1} to R^{n²×1}. We thus have

    vec(F(AX(t) − I)) = F(vec(AX(t) − I)) = F(vec(AX(t)) − vec(I)) = F((I ⊗ A) vec(X(t)) − vec(I)).    (7)

Combining equations (6) and (7) yields the vectorization of the right-hand side of the matrix-form differential equation (2):

    vec(−γ A^T F(AX(t) − I)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)).

Clearly, the vectorizations of both sides of the matrix-form differential equation (2) must be equal, which generates the vector-form differential equation (5). The proof is thus complete.
Remark 1. The Kronecker product can be generated easily by using the MATLAB routine “kron”; e.g., A ⊗ B can be generated by the MATLAB command kron(A,B). To generate vec(X), we can use the MATLAB routine “reshape”. That is, if the matrix X ∈ R^{m×n} (i.e., with m rows and n columns), then the MATLAB command for vectorizing X is reshape(X,m*n,1), which generates the column vector vec(X) = [x_11, ..., x_m1, x_12, ..., x_m2, ..., x_1n, ..., x_mn]^T.
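The vectorization identity underlying Theorem 1 and Remark 1 can be checked numerically in a few lines (our own sketch, using an arbitrary random test pair rather than the example of Section 4):

% Numerical check that vec(AX - I) = (I kron A)*vec(X) - vec(I)
n=3; A=randn(n); X=randn(n);
lhs=reshape(A*X-eye(n),n^2,1);                               % via "reshape"
rhs=kron(eye(n),A)*reshape(X,n^2,1)-reshape(eye(n),n^2,1);   % via "kron"
disp(norm(lhs-rhs));                                         % zero up to round-off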
Based on the MATLAB routines “kron” and “reshape”, the following code defines a function that returns the evaluation of the right-hand side of the vector-form gradient-based neural network (5), i.e., the vectorization of the right-hand side of the matrix-form network (2). Note that I ⊗ A^T = (I ⊗ A)^T.
function output=GnnRightHandSide(t,x,gamma)
if nargin==2, gamma=1; end
A=MatrixA; n=size(A,1); IA=kron(eye(n),A);
% The following generates the vectorization of identity matrix I
vecI=reshape(eye(n),n^2,1);
% The following calculates the right hand side of equations (2) and (5)
output=-gamma*IA’*Powersigmoid(IA*x-vecI);
Note that we can change “Powersigmoid” in the above MATLAB code to “Sigmoid” (or “Linear”) so as to use a different activation function.
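Alternatively, instead of editing the code for each activation function, the mapping could be passed in as a function handle. The following variant is only a sketch of this design choice; the function name “GnnRightHandSideHandle” and the extra argument “actfun” are our own additions, not part of the original code.

function output=GnnRightHandSideHandle(t,x,gamma,actfun)
% Variant of GnnRightHandSide with a selectable activation mapping
if nargin<3, gamma=1; end
if nargin<4, actfun=@Powersigmoid; end   % default activation mapping
A=MatrixA; n=size(A,1); IA=kron(eye(n),A);
vecI=reshape(eye(n),n^2,1);
output=-gamma*IA'*actfun(IA*x-vecI);

For example, ode45(@(t,x)GnnRightHandSideHandle(t,x,10,@Sigmoid),[0 10],x0) would then simulate (5) with the sigmoid activation function and γ = 10.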
4 Illustrative Example
For illustration, let us consider the following constant matrix A, together with its transpose and theoretical inverse:

    A = [1 0 1; 1 1 0; 1 1 1],   A^T = [1 1 1; 0 1 1; 1 0 1],   A^{-1} = [1 1 −1; −1 0 1; 0 −1 1].
For example, matrix A can be defined by the following MATLAB function.
function A=MatrixA(t)
A=[1 0 1;1 1 0;1 1 1];
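Before simulating, the theoretical inverse listed above can be confirmed directly in MATLAB (a one-off check, not part of the simulation code):

% Verify the theoretical inverse of the example matrix
A=MatrixA;
Ainv=[1 1 -1; -1 0 1; 0 -1 1];    % theoretical inverse stated above
disp(norm(A*Ainv-eye(3)));        % displays 0
disp(inv(A));                     % reproduces Ainv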
The gradient-based neural network (2) thus takes the following specific form:

    [ẋ11 ẋ12 ẋ13; ẋ21 ẋ22 ẋ23; ẋ31 ẋ32 ẋ33] = −γ [1 1 1; 0 1 1; 1 0 1] F([1 0 1; 1 1 0; 1 1 1][x11 x12 x13; x21 x22 x23; x31 x32 x33] − [1 0 0; 0 1 0; 0 0 1]).
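Equivalently, instead of forming the Kronecker product, the state vector can simply be reshaped back into matrix form inside the ODE right-hand side and the matrix dynamics (2) evaluated directly. The function below is our own alternative sketch (the name “GnnRightHandSideMatrixForm” is not from the original paper); it can be passed to “ode45” in the same way as “GnnRightHandSide”.

function output=GnnRightHandSideMatrixForm(t,x,gamma)
if nargin==2, gamma=1; end
A=MatrixA; n=size(A,1);
X=reshape(x,n,n);                          % rebuild the state matrix from the vector
Xdot=-gamma*A'*Powersigmoid(A*X-eye(n));   % matrix-form dynamics (2)
output=reshape(Xdot,n^2,1);                % return a column vector for ode45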
4.1 Simulation of Convergence
To simulate gradient-based neural network (2) starting from eight random initial
states, we firstly define a function “GnnConvergence” as follows.
function GnnConvergence(gamma)
tspan=[0 10]; n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSide,tspan,x0,[],gamma);
for j=1:n^2
% Map the column-major state index j to its row-major subplot position
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end
To show the convergence of the gradient-based neural model (2) using the power-sigmoid activation function with ξ = 4 and p = 3, and using the design parameter γ := 1, the MATLAB command is GnnConvergence(1), which generates Fig. 1(a). Similarly, the MATLAB command GnnConvergence(10) can generate Fig. 1(b).
To monitor the network convergence, we can also use and show the norm of the computational error, ‖X(t) − A^{-1}‖. The MATLAB codes are given below, i.e., the user-defined functions “NormError” and “GnnNormError”. By calling “GnnNormError” three times with different γ values, we can generate Fig. 2. It shows that, starting from any initial state randomly selected in [−2, 2], the state matrices of the presented neural network (2) all converge to the theoretical inverse A^{-1}.
(Figure 1 consists of nine sub-plots, one per state entry x11 through x33, plotted versus time t ∈ [0, 10]: sub-figure (a) corresponds to γ = 1 and sub-figure (b) to γ = 10.)
Fig. 1. Online matrix inversion by gradient-based neural network (2)
Correspondingly, the computational errors ‖X(t) − A^{-1}‖ all converge to zero. Such convergence can be expedited by increasing γ. For example, if γ is increased to 10^3, the convergence time is within 30 milliseconds; and, if γ is increased to 10^6, the convergence time is within 30 microseconds.
function NormError(x0,gamma)
tspan=[0 10]; options=odeset();
[t,x]=ode45(@GnnRightHandSide,tspan,x0,options,gamma);
Ainv=inv(MatrixA);
B=reshape(Ainv,size(Ainv,1)^2,1);
total=length(t); x=x’;
for i=1:total, nerr(i)=norm(x(:,i)-B); end
plot(t,nerr); hold on
function GnnNormError(gamma)
if nargin<1, gamma=1; end
total=8; n=size(MatrixA,1);
for i=1:total
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
NormError(x0,gamma);
end
text(2.4,2.2,[’gamma=’ int2str(gamma)]);
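To attach numbers to the convergence-time behaviour discussed above, one can post-process the “ode45” output and record when the error norm first falls below a tolerance. The helper below is our own sketch (the name “ConvergenceTime” and the tolerance 10^{-3} are assumptions, not from the original paper):

function tc=ConvergenceTime(gamma,tol)
% Return the first time at which ||X(t)-inv(A)|| drops below tol
if nargin<2, tol=1e-3; end
n=size(MatrixA,1);
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSide,[0 10],x0,[],gamma);
B=reshape(inv(MatrixA),n^2,1);
nerr=sqrt(sum((x-ones(length(t),1)*B.').^2,2));   % error norm at each time step
idx=find(nerr<tol,1);
if isempty(idx), tc=inf; else tc=t(idx); end

Calling, e.g., ConvergenceTime(10) and ConvergenceTime(100) gives a rough numerical counterpart to the convergence behaviour shown in Fig. 2.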
4.2 Simulation of Robustness
Similar to the transformation of the matrix-form differential equation (2) to a
vector-form differential equation (5), the perturbed gradient-based neural net-
work (4) can be vectorized as follows:
    vec(Ẋ(t)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)) + vec(ΔB).    (8)
(Figure 2 consists of three panels plotting the error norm versus time t ∈ [0, 10] for γ = 1, γ = 10 and γ = 100, respectively.)
Fig. 2. Convergence of ‖X(t) − A^{-1}‖_F using the power-sigmoid activation function
To show the robustness characteristics of the gradient-based neural networks, the following model-implementation error is added in a sinusoidal form (with ε_2 = 0.5):

    ΔB = ε_2 [cos(3t)  −sin(3t)  0;  0  sin(3t)  cos(3t);  0  0  sin(2t)].
The following MATLAB code is used to define the function “GnnRightHandSideImprecise” for ODE solvers, which returns the evaluation of the right-hand side of the perturbed gradient-based neural network (4), or, in other words, the right-hand side of the vector-form differential equation (8).
function output=GnnRightHandSideImprecise(t,x,gamma)
if nargin==2, gamma=1; end
e2=0.5;
deltaB=e2*[cos(3*t) -sin(3*t) 0; 0 sin(3*t) cos(3*t);0 0 sin(2*t)];
vecB=reshape(deltaB,9,1);
vecI=reshape(eye(3),9,1);
IA=kron(eye(3),MatrixA);
output=-gamma*IA’*Powersigmoid(IA*x-vecI)+vecB;
To use the sigmoid (or linear) activation function, we only need to change “Powersigmoid” to “Sigmoid” (or “Linear”) in the above MATLAB code. Based on the above function “GnnRightHandSideImprecise” and the function below (i.e., “GnnRobust”), the MATLAB commands GnnRobust(1) and GnnRobust(100) can generate Fig. 3.
function GnnRobust(gamma)
tspan=[0 10]; options=odeset(); n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSideImprecise,tspan,x0,options,gamma);
for j=1:n^2
% Map the column-major state index j to its row-major subplot position
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end
(Figure 3 consists of nine sub-plots showing the state entries x11 through x33 versus time t ∈ [0, 10] under the implementation error ΔB: sub-figure (a) corresponds to γ = 1 and sub-figure (b) to γ = 100.)
Fig. 3. Online matrix inversion by GNN (4) with large implementation errors
(Figure 4 consists of three panels plotting the error norm versus time t ∈ [0, 10] for the perturbed model with γ = 1, γ = 10 and γ = 100, respectively.)
Fig. 4. Convergence of the computational error ‖X(t) − A^{-1}‖ by perturbed GNN (4)
Similarly, we can show the computational error ‖X(t) − A^{-1}‖ of the gradient-based neural network (4) with large model-implementation errors. To do so, in the previously defined MATLAB function “NormError”, we only need to change “GnnRightHandSide” to “GnnRightHandSideImprecise”. See Fig. 4. Even with an imprecise implementation, the perturbed neural network still works well, and its computational error ‖X(t) − A^{-1}‖ remains bounded and very small. Moreover, as the design parameter γ increases from 1 to 100, the convergence is expedited and the steady-state computational error is decreased. It is worth mentioning again that using power-sigmoid or sigmoid activation functions yields a smaller steady-state residual error than using linear or power activation functions. It is observed from other simulation data that, when using power-sigmoid activation functions, the maximum steady-state residual error is only 2 × 10^{-2} and 2 × 10^{-3} for γ = 100 and γ = 1000, respectively. Clearly, compared to the case of using linear or pure power activation functions, superior performance can be achieved by using power-sigmoid or sigmoid activation functions under the same design specification. These simulation results substantiate the theoretical results presented in the previous sections and in [21].
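The steady-state figures quoted above can be reproduced approximately by measuring the error norm over the tail of a perturbed run. The fragment below is our own post-processing sketch (the choice of the “steady-state” window t ≥ 8 is an assumption, not from the original paper):

% Estimate the steady-state residual error of the perturbed model (4)/(8)
gamma=100; n=size(MatrixA,1);
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSideImprecise,[0 10],x0,[],gamma);
B=reshape(inv(MatrixA),n^2,1);
nerr=sqrt(sum((x-ones(length(t),1)*B.').^2,2));   % error norm at each time step
disp(max(nerr(t>=8)));                            % maximum residual over the tail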
5 Conclusions
The gradient-based neural networks (1) and (2) have provided an effective online-
computing approach for matrix inversion. By considering different types of acti-
vation functions and implementation errors, such recurrent neural networks have
been simulated in this paper. Several important simulation techniques have been
introduced, i.e., coding of activation-function mappings, Kronecker product of
matrices, and MATLAB routine “ode45”. Simulation results have also demon-
strated the effectiveness and efficiency of gradient-based neural networks for on-
line matrix inversion. In addition, the characteristics of such a negative-gradient
design method of recurrent neural networks could be summarized as follows.
– From the viewpoint of system stability, any monotonically increasing activation function f(·) with f(0) = 0 could be used for the construction of recurrent neural networks. However, for solution effectiveness and design simplicity, a strictly monotonically increasing odd activation function f(·) is preferred.
– The gradient-based neural networks are intrinsically designed for solving time-invariant matrix-inverse problems, but they could also be used to solve time-varying matrix-inverse problems in an approximate way; in this case, the design parameter γ is required to be large enough (a sketch of such a time-varying setup is given after this list).
– Compared to other methods, the gradient-based neural networks have a simpler structure for simulation and hardware implementation. As parallel-processing systems, such neural networks could solve the matrix-inverse problem more efficiently than serial-processing methods.
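As a purely illustrative sketch of the second point (the time-varying matrix below and the name “MatrixAt” are our own choices, not from the paper), the coefficient matrix of Section 4 could be made time-dependent; “GnnRightHandSide” would then call A=MatrixAt(t) instead of A=MatrixA, and γ should be chosen large so that X(t) tracks A^{-1}(t) approximately.

function A=MatrixAt(t)
% Hypothetical time-varying coefficient matrix (illustrative only);
% it remains nonsingular for all t.
A=[1+0.5*sin(t) 0 1; 1 1+0.5*cos(t) 0; 1 1 1];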
Acknowledgements. This work is funded by National Science Foundation of
China under Grant 60643004 and by the Science and Technology Office of Sun
Yat-Sen University. Before joining Sun Yat-Sen University in 2006, the corre-
sponding author, Yunong Zhang, had been with National University of Ireland,
University of Strathclyde, National University of Singapore, Chinese University
of Hong Kong, since 1999. He has continued the line of this research, supported
by various research fellowships/assistantship. His web-page is now available at
http://www.ee.sysu.edu.cn/teacher/detail.asp?sn=129.
References
1. Zhang, Y.: Towards Piecewise-Linear Primal Neural Networks for Optimization
and Redundant Robotics. Proceedings of IEEE International Conference on Net-
working, Sensing and Control (2006) 374-379
2. Steriti, R.J., Fiddy, M.A.: Regularized Image Reconstruction Using SVD and a
Neural Network Method for Matrix Inversion. IEEE Transactions on Signal Pro-
cessing, Vol. 41 (1993) 3074-3077
3. Sarkar, T., Siarkiewicz, K., Stratton, R.: Survey of Numerical Methods for Solution
of Large Systems of Linear Equations for Electromagnetic Field Problems. IEEE
Transactions on Antennas and Propagation, Vol. 29 (1981) 847-856
4. Sturges Jr, R.H.: Analog Matrix Inversion (Robot Kinematics). IEEE Journal of
Robotics and Automation, Vol. 4 (1988) 157-162
5. Yeung, K.S., Kumbi, F.: Symbolic Matrix Inversion with Application to Electronic
Circuits. IEEE Transactions on Circuits and Systems, Vol. 35 (1988) 235-238
6. El-Amawy, A.: A Systolic Architecture for Fast Dense Matrix Inversion. IEEE
Transactions on Computers, Vol. 38 (1989) 449-455
7. Neagoe, V.E.: Inversion of the Van Der Monde Matrix. IEEE Signal Processing
Letters, Vol. 3 (1996) 119-120
8. Wang, Y.Q., Gooi, H.B.: New Ordering Methods for Sparse Matrix Inversion via Diagonalization. IEEE Transactions on Power Systems, Vol. 12 (1997) 1298-1305
9. Koc, C.K., Chen, G.: Inversion of All Principal Submatrices of a Matrix. IEEE
Transactions on Aerospace and Electronic Systems, Vol. 30 (1994) 280-281
10. Zhang, Y., Leithead, W.E., Leith, D.J.: Time-Series Gaussian Process Regression Based on Toeplitz Computation of O(N²) Operations and O(N)-Level Storage. Proceedings of the 44th IEEE Conference on Decision and Control (2005) 3711-3716
11. Leithead, W.E., Zhang, Y.: O(N²)-Operation Approximation of Covariance Matrix Inverse in Gaussian Process Regression Based on Quasi-Newton BFGS Methods. Communications in Statistics - Simulation and Computation, Vol. 36 (2007) 367-380
12. Manherz, R.K., Jordan, B.W., Hakimi, S.L.: Analog Methods for Computation of
the Generalized Inverse. IEEE Transactions on Automatic Control, Vol. 13 (1968)
582-585
13. Jang, J., Lee, S., Shin, S.: An Optimization Network for Matrix Inversion. Neural
Information Processing Systems, American Institute of Physics, NY (1988) 397-401
14. Wang, J.: A Recurrent Neural Network for Real-Time Matrix Inversion. Applied
Mathematics and Computation, Vol. 55 (1993) 89-100
15. Zhang, Y.: Revisit the Analog Computer and Gradient-Based Neural System for
Matrix Inversion. Proceedings of IEEE International Symposium on Intelligent
Control (2005) 1411-1416
16. Zhang, Y., Jiang, D., Wang, J.: A Recurrent Neural Network for Solving Sylvester
Equation with Time-Varying Coefficients. IEEE Transactions on Neural Networks,
Vol. 13 (2002) 1053-1063
17. Zhang, Y., Ge, S.S.: A General Recurrent Neural Network Model for Time-Varying
Matrix Inversion. Proceedings of the 42nd IEEE Conference on Decision and Con-
trol (2003) 6169-6174
18. Zhang, Y., Ge, S.S.: Design and Analysis of a General Recurrent Neural Network
Model for Time-Varying Matrix Inversion. IEEE Transactions on Neural Networks,
Vol. 16 (2005) 1477-1490
19. Carneiro, N.C.F., Caloba, L.P.: A New Algorithm for Analog Matrix Inversion.
Proceedings of the 38th Midwest Symposium on Circuits and Systems, Vol. 1 (1995)
401-404
20. Mead, C.: Analog VLSI and Neural Systems. Addison-Wesley, Reading, MA (1989)
21. Zhang, Y., Li, Z., Fan, Z., Wang, G.: Matrix-Inverse Primal Neural Network with
Application to Robotics. Dynamics of Continuous, Discrete and Impulsive Systems,
Series B, Vol. 14 (2007) 400-407
22. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis, Cambridge University Press,
Cambridge (1991)