MATLAB Simulation of Gradient-Based Neural
Network for Online Matrix Inversion
Yunong Zhang, Ke Chen, Weimu Ma, and Xiao-Dong Li
Department of Electronics and Communication Engineering
Sun Yat-Sen University, Guangzhou 510275, China
ynzhang@ieee.org
Abstract. This paper investigates the simulation of a gradient-based
recurrent neural network for online solution of the matrix-inverse prob-
lem. Several important techniques are employed as follows to simulate
such a neural system. 1) Kronecker product of matrices is introduced to
transform a matrix-differential-equation (MDE) to a vector-differential-
equation (VDE); i.e., finally, a standard ordinary-differential-equation
(ODE) is obtained. 2) MATLAB routine “ode45” is introduced to solve
the transformed initial-value ODE problem. 3) In addition to various im-
plementation errors, different kinds of activation functions are simulated
to show the characteristics of such a neural network. Simulation results
substantiate the theoretical analysis and efficacy of the gradient-based
neural network for online constant matrix inversion.
Keywords: Online matrix inversion, Gradient-based neural network,
Kronecker product, MATLAB simulation.
1 Introduction
The problem of matrix inversion is considered to be one of the basic problems
widely encountered in science and engineering. It is usually an essential part of
many solutions; e.g., as preliminary steps for optimization [1], signal-processing
[2], electromagnetic systems [3], and robot inverse kinematics [4]. Since the mid-
1980’s, efforts have been directed towards computational aspects of fast matrix
inversion and many algorithms have thus been proposed [5]-[8]. It is known that
the minimal arithmetic operations are usually proportional to the cube of the
matrix dimension for numerical methods [9], and consequently such algorithms
performed on digital computers are not efficient enough for large-scale online
applications. In view of this, some O(n²)-operation algorithms were proposed
to remedy this computational problem, e.g., in [10][11]. However, they may be
still not fast enough; e.g., in [10], it takes on average around one hour to invert
a 60000-dimensional matrix. As a result, parallel computational schemes have
been investigated for matrix inversion.
The dynamic system approach is one of the important parallel-processing
methods for solving matrix-inversion problems [2][12]-[18]. Recently, due to the
in-depth research in neural networks, numerous dynamic and analog solvers
based on recurrent neural networks (RNNs) have been developed and inves-
tigated [2][13]-[18]. The neural dynamic approach is thus regarded as a powerful
alternative for online computation because of its parallel distributed nature and
convenience of hardware implementation [4][12][15][19][20].
To solve for a matrix inverse, the neural-system design is based on the equation AX − I = 0, with A ∈ R^{n×n}. We can define a scalar-valued energy function such as E(t) = ‖AX(t) − I‖²/2. Then, we use the negative of the gradient ∂E/∂X = A^T(AX(t) − I) as the descent direction. As a result, the classic linear model is as follows:

    Ẋ(t) = −γ ∂E/∂X = −γ A^T(AX(t) − I),   X(0) = X_0,    (1)

where the design parameter γ > 0, being an inductance parameter or the reciprocal of a capacitive parameter, is set as large as the hardware permits, or is selected appropriately for experiments.
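For completeness, the gradient used above follows from a standard matrix-calculus computation (a short derivation stated here for the reader, not spelled out in the original text): writing E(t) = tr((AX(t) − I)^T (AX(t) − I))/2 and noting that dE = tr((AX(t) − I)^T A dX), we obtain

    ∂E/∂X = A^T(AX(t) − I),

which is exactly the descent direction employed in (1).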
As proposed in [21], the following general neural model extends the above design approach by a nonlinear activation-function array F:

    Ẋ(t) = −γ A^T F(AX(t) − I),    (2)

where X(t), starting from an initial condition X(0) = X_0 ∈ R^{n×n}, is the activation state matrix corresponding to the theoretical inverse A^{-1} of matrix A. As in (1), the design parameter γ > 0 is used to scale the convergence rate of the neural network (2), while F(·): R^{n×n} → R^{n×n} denotes a matrix activation-function mapping of the neural network.
2 Main Theoretical Results
In view of equation (2), different choices of F may lead to different performance. In general, any strictly monotonically increasing odd activation function f(·), being an element of the matrix mapping F, may be used for the construction of the neural network. In order to demonstrate the main ideas, four types of activation functions are investigated in our simulation:

– the linear activation function f(u) = u;
– the bipolar sigmoid function f(u) = (1 − exp(−ξu))/(1 + exp(−ξu)) with ξ ≥ 2;
– the power activation function f(u) = u^p with odd integer p ≥ 3; and
– the following power-sigmoid activation function

    f(u) = u^p,                                                            if |u| ≥ 1,
    f(u) = ((1 + exp(−ξ))/(1 − exp(−ξ))) · (1 − exp(−ξu))/(1 + exp(−ξu)),  otherwise,    (3)

with suitable design parameters ξ ≥ 1 and p ≥ 3.
Other types of activation functions can be generated from these four basic types. Following the analysis results of [18][21], the convergence properties obtained with different activation functions are qualitatively presented as follows.
Proposition 1. [15]-[18][21] For a nonsingular matrix A ∈ R^{n×n}, any strictly monotonically increasing odd activation-function array F(·) can be used for constructing the gradient-based neural network (2).

1. If the linear activation function is used, then global exponential convergence is achieved for neural network (2), with convergence rate proportional to the product of γ and the minimum eigenvalue of A^T A.
2. If the bipolar sigmoid activation function is used, then superior convergence can be achieved for the error range [−δ, δ], δ ∈ (0, 1), as compared to the linear-activation-function case. This is because the error signal e_ij = [AX − I]_ij in (2) is amplified by the bipolar sigmoid function over the error range [−δ, δ].
3. If the power activation function is used, then superior convergence can be achieved for the error ranges (−∞, −1] and [1, +∞), as compared to the linear-activation-function case. This is because the error signal e_ij = [AX − I]_ij in (2) is amplified by the power activation function over the error ranges (−∞, −1] and [1, +∞).
4. If the power-sigmoid activation function is used, then superior convergence can be achieved for the whole error range (−∞, +∞), as compared to the linear-activation-function case. This follows from Properties 2 and 3 above.
In the analog implementation or simulation of the gradient-based neural networks (1) and (2), we usually assume that they operate under ideal conditions. In practice, however, there are always some realization errors involved. For example, for the linear activation function, its imprecise implementation may behave more like a sigmoid or piecewise-linear function because of the finite gain and frequency dependence of operational amplifiers and multipliers. For these realization errors possibly appearing in the gradient-based neural network (2), we have the following theoretical results.
Proposition 2. [15]-[18][21] Consider the perturbed gradient-based neural model

    Ẋ(t) = −γ (A + ΔA)^T F((A + ΔA)X(t) − I),

where the additive term ΔA satisfies ‖ΔA‖ ≤ ε_1 with ε_1 ≥ 0. Then the steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the resultant matrix A + ΔA is still nonsingular.
For the model-implementation error due to the imprecise implementation of the system dynamics, the following perturbed dynamics is considered, as compared to the original dynamic equation (2):

    Ẋ(t) = −γ A^T F(AX(t) − I) + ΔB,    (4)

where the additive term ΔB satisfies ‖ΔB‖ ≤ ε_2 with ε_2 ≥ 0.
Proposition 3. [15]-[18][21] Consider the imprecise implementation (4). The steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the design parameter γ is large enough (the so-called design-parameter requirement). Moreover, this steady-state residual error can be made arbitrarily close to zero as γ tends to positive infinity.
In addition to the above propositions, we have the following general observations.

1. For a large entry error (e.g., |e_ij| > 1 with e_ij := [AX − I]_ij), the power activation function amplifies the error signal (|e_ij^p| > ··· > |e_ij^3| > |e_ij| > 1), and is thus able to automatically remove the design-parameter requirement.
2. For a small entry error (e.g., |e_ij| < 1), using sigmoid activation functions yields better convergence and robustness than using linear activation functions, because of the larger slope of the sigmoid function near the origin.

Thus, using the power-sigmoid activation function in (3) is theoretically a better choice than the other activation functions for superior convergence and robustness.
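As a minimal numerical illustration of these two observations (our own sketch using anonymous functions, with ξ = 4 and p = 3 assumed as in the later simulations), the following MATLAB fragment evaluates the three basic activation functions at a small error 0.5 and a large error 2.

% Sketch: compare activation outputs for a small and a large entry error
f_lin=@(u) u;
f_sig=@(u) (1-exp(-4*u))./(1+exp(-4*u));   % bipolar sigmoid with xi=4
f_pow=@(u) u.^3;                           % power function with p=3
e=[0.5 2];
disp([f_lin(e); f_sig(e); f_pow(e)]);      % rows: linear, sigmoid, power

At e = 0.5 the sigmoid output (about 0.76) exceeds the linear output 0.5, whereas at e = 2 the power output 8 dominates, which is precisely the amplification behaviour described above.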
3 Simulation Study
While Section 2 presents the main theoretical results of the gradient-based neural
network, this section will investigate the MATLAB simulation techniques in
order to show the characteristics of such a neural network.
3.1 Coding of Activation Function
To simulate the gradient-based neural network (2), the activation functions are
to be defined firstly in MATLAB. Inside the body of a user-defined function,
the MATLAB routine “nargin” returns the number of input arguments which
are used to call the function. By using “nargin”, different kinds of activation
functions can be generated at least with their default input argument(s).
The linear activation-function mapping F(X) = X ∈ R^{n×n} can be generated simply by using the following MATLAB code.
function output=Linear(X)
output=X;
The sigmoid activation-function mapping F(·) with ξ = 4 as its default input value can be generated by using the following MATLAB code.
function output=Sigmoid(X,xi)
if nargin==1, xi=4; end
output=(1-exp(-xi*X))./(1+exp(-xi*X));
The power activation-function mapping F(·) with p = 3 as its default input value can be generated by using the following MATLAB code.
function output=Power(X,p)
if nargin==1, p=3; end
output=X.^p;
The power-sigmoid activation function defined in (3), with ξ = 4 and p = 3 as its default values, can be generated as below.
function output=Powersigmoid(X,xi,p)
if nargin==1, xi=4; p=3;
elseif nargin==2, p=3;
end
output=(1+exp(-xi))/(1-exp(-xi))*(1-exp(-xi*X))./(1+exp(-xi*X));
i=find(abs(X)>=1);
output(i)=X(i).^p;
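As a quick usage check (not part of the original paper), note that all four mappings operate elementwise on a matrix argument, and that the power-sigmoid code above switches to the power branch exactly where |u| ≥ 1:

% Elementwise check of the activation-function mappings defined above
U=[0.5 -0.5; 2 -2];
disp(Powersigmoid(U));   % entries with |u|>=1 pass through u^3, giving 8 and -8
disp(Sigmoid(U));        % bipolar sigmoid with the default xi=4
disp(Power(U));          % cubes every entry
disp(Linear(U));         % returns U unchanged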
3.2 Kronecker Product and Vectorization
The dynamic equations of the gradient-based neural networks (2) and (4) are described in matrix form, and thus cannot be passed directly to standard ODE solvers, which expect a vector state. To simulate such neural systems, the Kronecker product of matrices and the vectorization technique are introduced in order to transform the matrix-form differential equations into vector-form differential equations.
In the general case, given matrices A = [a_ij] ∈ R^{m×n} and B = [b_ij] ∈ R^{p×q}, the Kronecker product of A and B is denoted by A ⊗ B and is defined to be the following block matrix:

    A ⊗ B := [a_11 B  ···  a_1n B;  ⋮  ⋱  ⋮;  a_m1 B  ···  a_mn B] ∈ R^{mp×nq}.

It is also known as the direct product or tensor product. Note that in general A ⊗ B ≠ B ⊗ A. Specifically, for our case, I ⊗ A = diag(A, ..., A).

Also in the general case, given X = [x_ij] ∈ R^{m×n}, we can vectorize X as a vector vec(X) ∈ R^{mn×1}, which is defined as

    vec(X) := [x_11, ..., x_m1, x_12, ..., x_m2, ..., x_1n, ..., x_mn]^T.

As stated in [22], with X unknown and A ∈ R^{m×n} and B ∈ R^{p×q} given, the matrix equation AX = B is equivalent to the vector equation (I ⊗ A) vec(X) = vec(B).
Based on the above Kronecker product and vectorization technique, for simulation purposes, the matrix differential equation (2) can be transformed into a vector differential equation. We thus obtain the following theorem.

Theorem 1. The matrix-form differential equation (2) can be reformulated as the following vector-form differential equation:

    vec(Ẋ(t)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)),    (5)

where the activation-function mapping F(·) in (5) is defined the same as in (2), except that its dimensions are changed hereafter to F(·): R^{n²×1} → R^{n²×1}.
Proof. For readers’ convenience, we repeat the matrix-form differential equation (2) here as Ẋ(t) = −γ A^T F(AX(t) − I).

By vectorizing equation (2) based on the Kronecker product and the above vec(·) operator, the left-hand side of (2) becomes vec(Ẋ(t)), and the right-hand side of (2) becomes

    vec(−γ A^T F(AX(t) − I)) = −γ vec(A^T F(AX(t) − I)) = −γ (I ⊗ A^T) vec(F(AX(t) − I)).    (6)

Note that, as shown in Subsection 3.1, the definition and coding of the activation-function mapping F(·) are very flexible, and F(·) can equally be viewed as a vectorized (elementwise) mapping from R^{n²×1} to R^{n²×1}. We thus have

    vec(F(AX(t) − I)) = F(vec(AX(t) − I)) = F(vec(AX(t)) − vec(I)) = F((I ⊗ A) vec(X(t)) − vec(I)).    (7)

Combining equations (6) and (7) yields the vectorization of the right-hand side of the matrix-form differential equation (2):

    vec(−γ A^T F(AX(t) − I)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)).

Clearly, the vectorizations of both sides of the matrix-form differential equation (2) must be equal, which generates the vector-form differential equation (5). The proof is thus complete.
Remark 1. The Kronecker product can be generated easily by using the MATLAB routine “kron”; e.g., A ⊗ B can be generated by the MATLAB command kron(A,B). To generate vec(X), we can use the MATLAB routine “reshape”. That is, if the matrix X ∈ R^{m×n} (i.e., with m rows and n columns), then the MATLAB command for vectorizing X is reshape(X,m*n,1), which generates the column vector vec(X) = [x_11, ..., x_m1, x_12, ..., x_m2, ..., x_1n, ..., x_mn]^T.
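The vectorization identity underlying Theorem 1 and Remark 1 can be checked numerically in a few lines (our own sketch, using an arbitrary random test pair rather than the example of Section 4):

% Numerical check that vec(AX - I) = (I kron A)*vec(X) - vec(I)
n=3; A=randn(n); X=randn(n);
lhs=reshape(A*X-eye(n),n^2,1);                               % via "reshape"
rhs=kron(eye(n),A)*reshape(X,n^2,1)-reshape(eye(n),n^2,1);   % via "kron"
disp(norm(lhs-rhs));                                         % zero up to round-off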
Based on the MATLAB routines “kron” and “reshape”, the following code defines a function that returns the evaluation of the right-hand side of the vector-form gradient-based neural network (5), i.e., the vectorization of the right-hand side of the matrix-form network (2). Note that I ⊗ A^T = (I ⊗ A)^T.
function output=GnnRightHandSide(t,x,gamma)
if nargin==2, gamma=1; end
A=MatrixA; n=size(A,1); IA=kron(eye(n),A);
% The following generates the vectorization of identity matrix I
vecI=reshape(eye(n),n^2,1);
% The following calculates the right hand side of equations (2) and (5)
output=-gamma*IA’*Powersigmoid(IA*x-vecI);
Note that we can change “Powersigmoid” in the above MATLAB code to “Sigmoid” (or “Linear”) so as to use a different activation function.
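Alternatively, instead of editing the code for each activation function, the mapping could be passed in as a function handle. The following variant is only a sketch of this design choice; the function name “GnnRightHandSideHandle” and the extra argument “actfun” are our own additions, not part of the original code.

function output=GnnRightHandSideHandle(t,x,gamma,actfun)
% Variant of GnnRightHandSide with a selectable activation mapping
if nargin<3, gamma=1; end
if nargin<4, actfun=@Powersigmoid; end   % default activation mapping
A=MatrixA; n=size(A,1); IA=kron(eye(n),A);
vecI=reshape(eye(n),n^2,1);
output=-gamma*IA'*actfun(IA*x-vecI);

For example, ode45(@(t,x)GnnRightHandSideHandle(t,x,10,@Sigmoid),[0 10],x0) would then simulate (5) with the sigmoid activation function and γ = 10.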
4 Illustrative Example
For illustration, let us consider the following constant matrix A, together with its transpose and theoretical inverse:

    A = [1 0 1; 1 1 0; 1 1 1],   A^T = [1 1 1; 0 1 1; 1 0 1],   A^{-1} = [1 1 −1; −1 0 1; 0 −1 1].
For example, matrix A can be defined by the following MATLAB function.
function A=MatrixA(t)
A=[1 0 1;1 1 0;1 1 1];
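Before simulating, the theoretical inverse listed above can be confirmed directly in MATLAB (a one-off check, not part of the simulation code):

% Verify the theoretical inverse of the example matrix
A=MatrixA;
Ainv=[1 1 -1; -1 0 1; 0 -1 1];    % theoretical inverse stated above
disp(norm(A*Ainv-eye(3)));        % displays 0
disp(inv(A));                     % reproduces Ainv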
The gradient-based neural network (2) thus takes the following specific form:

    [ẋ11 ẋ12 ẋ13; ẋ21 ẋ22 ẋ23; ẋ31 ẋ32 ẋ33] = −γ [1 1 1; 0 1 1; 1 0 1] F([1 0 1; 1 1 0; 1 1 1][x11 x12 x13; x21 x22 x23; x31 x32 x33] − [1 0 0; 0 1 0; 0 0 1]).
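Equivalently, instead of forming the Kronecker product, the state vector can simply be reshaped back into matrix form inside the ODE right-hand side and the matrix dynamics (2) evaluated directly. The function below is our own alternative sketch (the name “GnnRightHandSideMatrixForm” is not from the original paper); it can be passed to “ode45” in the same way as “GnnRightHandSide”.

function output=GnnRightHandSideMatrixForm(t,x,gamma)
if nargin==2, gamma=1; end
A=MatrixA; n=size(A,1);
X=reshape(x,n,n);                          % rebuild the state matrix from the vector
Xdot=-gamma*A'*Powersigmoid(A*X-eye(n));   % matrix-form dynamics (2)
output=reshape(Xdot,n^2,1);                % return a column vector for ode45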
4.1 Simulation of Convergence
To simulate gradient-based neural network (2) starting from eight random initial
states, we firstly define a function “GnnConvergence” as follows.
function GnnConvergence(gamma)
tspan=[0 10]; n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSide,tspan,x0,[],gamma);
for j=1:n^2
% Map the column-major state index j to its row-major subplot position
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end
To show the convergence of the gradient-based neural model (2) using the power-sigmoid activation function with ξ = 4 and p = 3, and using the design parameter γ := 1, the MATLAB command is GnnConvergence(1), which generates Fig. 1(a). Similarly, the MATLAB command GnnConvergence(10) can generate Fig. 1(b).
To monitor the network convergence, we can also use and show the norm of the computational error, ‖X(t) − A^{-1}‖. The MATLAB codes are given below, i.e., the user-defined functions “NormError” and “GnnNormError”. By calling “GnnNormError” three times with different γ values, we can generate Fig. 2. It shows that, starting from any initial state randomly selected in [−2, 2], the state matrices of the presented neural network (2) all converge to the theoretical inverse A^{-1}.
(Figure 1 consists of nine sub-plots, one per state entry x11 through x33, plotted versus time t ∈ [0, 10]: sub-figure (a) corresponds to γ = 1 and sub-figure (b) to γ = 10.)
Fig. 1. Online matrix inversion by gradient-based neural network (2)
Correspondingly, the computational errors ‖X(t) − A^{-1}‖ all converge to zero. Such convergence can be expedited by increasing γ. For example, if γ is increased to 10^3, the convergence time is within 30 milliseconds; and, if γ is increased to 10^6, the convergence time is within 30 microseconds.
function NormError(x0,gamma)
tspan=[0 10]; options=odeset();
[t,x]=ode45(@GnnRightHandSide,tspan,x0,options,gamma);
Ainv=inv(MatrixA);
B=reshape(Ainv,size(Ainv,1)^2,1);
total=length(t); x=x’;
for i=1:total, nerr(i)=norm(x(:,i)-B); end
plot(t,nerr); hold on
function GnnNormError(gamma)
if nargin<1, gamma=1; end
total=8; n=size(MatrixA,1);
for i=1:total
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
NormError(x0,gamma);
end
text(2.4,2.2,[’gamma=’ int2str(gamma)]);
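To attach numbers to the convergence-time behaviour discussed above, one can post-process the “ode45” output and record when the error norm first falls below a tolerance. The helper below is our own sketch (the name “ConvergenceTime” and the tolerance 10^{-3} are assumptions, not from the original paper):

function tc=ConvergenceTime(gamma,tol)
% Return the first time at which ||X(t)-inv(A)|| drops below tol
if nargin<2, tol=1e-3; end
n=size(MatrixA,1);
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSide,[0 10],x0,[],gamma);
B=reshape(inv(MatrixA),n^2,1);
nerr=sqrt(sum((x-ones(length(t),1)*B.').^2,2));   % error norm at each time step
idx=find(nerr<tol,1);
if isempty(idx), tc=inf; else tc=t(idx); end

Calling, e.g., ConvergenceTime(10) and ConvergenceTime(100) gives a rough numerical counterpart to the convergence behaviour shown in Fig. 2.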
4.2 Simulation of Robustness
Similar to the transformation of the matrix-form differential equation (2) to a
vector-form differential equation (5), the perturbed gradient-based neural net-
work (4) can be vectorized as follows:
    vec(Ẋ(t)) = −γ (I ⊗ A^T) F((I ⊗ A) vec(X(t)) − vec(I)) + vec(ΔB).    (8)
(Figure 2 consists of three panels plotting the error norm versus time t ∈ [0, 10] for γ = 1, γ = 10 and γ = 100, respectively.)
Fig. 2. Convergence of ‖X(t) − A^{-1}‖_F using the power-sigmoid activation function
To show the robustness characteristics of the gradient-based neural networks, the following model-implementation error is added in a sinusoidal form (with ε_2 = 0.5):

    ΔB = ε_2 [cos(3t)  −sin(3t)  0;  0  sin(3t)  cos(3t);  0  0  sin(2t)].
The following MATLAB code is used to define the function “GnnRightHandSideImprecise” for ODE solvers, which returns the evaluation of the right-hand side of the perturbed gradient-based neural network (4), or, in other words, the right-hand side of the vector-form differential equation (8).
function output=GnnRightHandSideImprecise(t,x,gamma)
if nargin==2, gamma=1; end
e2=0.5;
deltaB=e2*[cos(3*t) -sin(3*t) 0; 0 sin(3*t) cos(3*t);0 0 sin(2*t)];
vecB=reshape(deltaB,9,1);
vecI=reshape(eye(3),9,1);
IA=kron(eye(3),MatrixA);
output=-gamma*IA’*Powersigmoid(IA*x-vecI)+vecB;
To use the sigmoid (or linear) activation function, we only need to change “Powersigmoid” to “Sigmoid” (or “Linear”) in the above MATLAB code. Based on the above function “GnnRightHandSideImprecise” and the function below (i.e., “GnnRobust”), the MATLAB commands GnnRobust(1) and GnnRobust(100) can generate Fig. 3.
function GnnRobust(gamma)
tspan=[0 10]; options=odeset(); n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSideImprecise,tspan,x0,options,gamma);
for j=1:n^2
% Map the column-major state index j to its row-major subplot position
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end
(Figure 3 consists of nine sub-plots showing the state entries x11 through x33 versus time t ∈ [0, 10] under the implementation error ΔB: sub-figure (a) corresponds to γ = 1 and sub-figure (b) to γ = 100.)
Fig. 3. Online matrix inversion by GNN (4) with large implementation errors
(Figure 4 consists of three panels plotting the error norm versus time t ∈ [0, 10] for the perturbed model with γ = 1, γ = 10 and γ = 100, respectively.)
Fig. 4. Convergence of the computational error ‖X(t) − A^{-1}‖ by perturbed GNN (4)
Similarly, we can show the computational error ‖X(t) − A^{-1}‖ of the gradient-based neural network (4) with large model-implementation errors. To do so, in the previously defined MATLAB function “NormError”, we only need to change “GnnRightHandSide” to “GnnRightHandSideImprecise”. See Fig. 4. Even with an imprecise implementation, the perturbed neural network still works well, and its computational error ‖X(t) − A^{-1}‖ remains bounded and very small. Moreover, as the design parameter γ increases from 1 to 100, the convergence is expedited and the steady-state computational error is decreased. It is worth mentioning again that using power-sigmoid or sigmoid activation functions yields a smaller steady-state residual error than using linear or power activation functions. It is observed from other simulation data that, when using power-sigmoid activation functions, the maximum steady-state residual error is only 2 × 10^{-2} and 2 × 10^{-3} for γ = 100 and γ = 1000, respectively. Clearly, compared to the case of using linear or pure power activation functions, superior performance can be achieved by using power-sigmoid or sigmoid activation functions under the same design specification. These simulation results substantiate the theoretical results presented in the previous sections and in [21].
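The steady-state figures quoted above can be reproduced approximately by measuring the error norm over the tail of a perturbed run. The fragment below is our own post-processing sketch (the choice of the “steady-state” window t ≥ 8 is an assumption, not from the original paper):

% Estimate the steady-state residual error of the perturbed model (4)/(8)
gamma=100; n=size(MatrixA,1);
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSideImprecise,[0 10],x0,[],gamma);
B=reshape(inv(MatrixA),n^2,1);
nerr=sqrt(sum((x-ones(length(t),1)*B.').^2,2));   % error norm at each time step
disp(max(nerr(t>=8)));                            % maximum residual over the tail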
5 Conclusions
The gradient-based neural networks (1) and (2) have provided an effective online-
computing approach for matrix inversion. By considering different types of acti-
vation functions and implementation errors, such recurrent neural networks have
been simulated in this paper. Several important simulation techniques have been
introduced, i.e., coding of activation-function mappings, Kronecker product of
matrices, and MATLAB routine “ode45”. Simulation results have also demon-
strated the effectiveness and efficiency of gradient-based neural networks for on-
line matrix inversion. In addition, the characteristics of such a negative-gradient
design method of recurrent neural networks could be summarized as follows.
– From the viewpoint of system stability, any monotonically increasing activation function f(·) with f(0) = 0 could be used for the construction of recurrent neural networks. However, for solution effectiveness and design simplicity, a strictly monotonically increasing odd activation function f(·) is preferred.
– The gradient-based neural networks are intrinsically designed for solving time-invariant matrix-inverse problems, but they could also be used to solve time-varying matrix-inverse problems in an approximate way; in this case, the design parameter γ is required to be large enough (a sketch of such a time-varying setup is given after this list).
– Compared to other methods, the gradient-based neural networks have a simpler structure for simulation and hardware implementation. As parallel-processing systems, such neural networks could solve the matrix-inverse problem more efficiently than serial-processing methods.
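As a purely illustrative sketch of the second point (the time-varying matrix below and the name “MatrixAt” are our own choices, not from the paper), the coefficient matrix of Section 4 could be made time-dependent; “GnnRightHandSide” would then call A=MatrixAt(t) instead of A=MatrixA, and γ should be chosen large so that X(t) tracks A^{-1}(t) approximately.

function A=MatrixAt(t)
% Hypothetical time-varying coefficient matrix (illustrative only);
% it remains nonsingular for all t.
A=[1+0.5*sin(t) 0 1; 1 1+0.5*cos(t) 0; 1 1 1];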
Acknowledgements. This work is funded by National Science Foundation of
China under Grant 60643004 and by the Science and Technology Office of Sun
Yat-Sen University. Before joining Sun Yat-Sen University in 2006, the corre-
sponding author, Yunong Zhang, had been with National University of Ireland,
University of Strathclyde, National University of Singapore, Chinese University
of Hong Kong, since 1999. He has continued the line of this research, supported
by various research fellowships/assistantship. His web-page is now available at
http://www.ee.sysu.edu.cn/teacher/detail.asp?sn=129.
References
1. Zhang, Y.: Towards Piecewise-Linear Primal Neural Networks for Optimization
and Redundant Robotics. Proceedings of IEEE International Conference on Net-
working, Sensing and Control (2006) 374-379
2. Steriti, R.J., Fiddy, M.A.: Regularized Image Reconstruction Using SVD and a
Neural Network Method for Matrix Inversion. IEEE Transactions on Signal Pro-
cessing, Vol. 41 (1993) 3074-3077
3. Sarkar, T., Siarkiewicz, K., Stratton, R.: Survey of Numerical Methods for Solution
of Large Systems of Linear Equations for Electromagnetic Field Problems. IEEE
Transactions on Antennas and Propagation, Vol. 29 (1981) 847-856
4. Sturges Jr, R.H.: Analog Matrix Inversion (Robot Kinematics). IEEE Journal of
Robotics and Automation, Vol. 4 (1988) 157-162
5. Yeung, K.S., Kumbi, F.: Symbolic Matrix Inversion with Application to Electronic
Circuits. IEEE Transactions on Circuits and Systems, Vol. 35 (1988) 235-238
6. El-Amawy, A.: A Systolic Architecture for Fast Dense Matrix Inversion. IEEE
Transactions on Computers, Vol. 38 (1989) 449-455
7. Neagoe, V.E.: Inversion of the Van Der Monde Matrix. IEEE Signal Processing
Letters, Vol. 3 (1996) 119-120
8. Wang, Y.Q., Gooi, H.B.: New Ordering Methods for Sparse Matrix Inversion via Diagonalization. IEEE Transactions on Power Systems, Vol. 12 (1997) 1298-1305
9. Koc, C.K., Chen, G.: Inversion of All Principal Submatrices of a Matrix. IEEE
Transactions on Aerospace and Electronic Systems, Vol. 30 (1994) 280-281
10. Zhang, Y., Leithead, W.E., Leith, D.J.: Time-Series Gaussian Process Regression Based on Toeplitz Computation of O(N²) Operations and O(N)-Level Storage. Proceedings of the 44th IEEE Conference on Decision and Control (2005) 3711-3716
11. Leithead, W.E., Zhang, Y.: O(N²)-Operation Approximation of Covariance Matrix Inverse in Gaussian Process Regression Based on Quasi-Newton BFGS Methods. Communications in Statistics - Simulation and Computation, Vol. 36 (2007) 367-380
12. Manherz, R.K., Jordan, B.W., Hakimi, S.L.: Analog Methods for Computation of
the Generalized Inverse. IEEE Transactions on Automatic Control, Vol. 13 (1968)
582-585
13. Jang, J., Lee, S., Shin, S.: An Optimization Network for Matrix Inversion. Neural
Information Processing Systems, American Institute of Physics, NY (1988) 397-401
14. Wang, J.: A Recurrent Neural Network for Real-Time Matrix Inversion. Applied
Mathematics and Computation, Vol. 55 (1993) 89-100
15. Zhang, Y.: Revisit the Analog Computer and Gradient-Based Neural System for
Matrix Inversion. Proceedings of IEEE International Symposium on Intelligent
Control (2005) 1411-1416
16. Zhang, Y., Jiang, D., Wang, J.: A Recurrent Neural Network for Solving Sylvester
Equation with Time-Varying Coefficients. IEEE Transactions on Neural Networks,
Vol. 13 (2002) 1053-1063
17. Zhang, Y., Ge, S.S.: A General Recurrent Neural Network Model for Time-Varying
Matrix Inversion. Proceedings of the 42nd IEEE Conference on Decision and Con-
trol (2003) 6169-6174
18. Zhang, Y., Ge, S.S.: Design and Analysis of a General Recurrent Neural Network
Model for Time-Varying Matrix Inversion. IEEE Transactions on Neural Networks,
Vol. 16 (2005) 1477-1490
19. Carneiro, N.C.F., Caloba, L.P.: A New Algorithm for Analog Matrix Inversion.
Proceedings of the 38th Midwest Symposium on Circuits and Systems, Vol. 1 (1995)
401-404
20. Mead, C.: Analog VLSI and Neural Systems. Addison-Wesley, Reading, MA (1989)
21. Zhang, Y., Li, Z., Fan, Z., Wang, G.: Matrix-Inverse Primal Neural Network with
Application to Robotics. Dynamics of Continuous, Discrete and Impulsive Systems,
Series B, Vol. 14 (2007) 400-407
22. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis, Cambridge University Press,
Cambridge (1991)