SIAM J. MATRIX ANAL. APPL. © 1994 Society for Industrial and Applied Mathematics
Vol. 15, No. 3, pp. 881-902, July 1994 011
NUMERICAL GRADIENT ALGORITHMS FOR EIGENVALUE AND
SINGULAR VALUE CALCULATIONS*
J. B. MOORE†, R. E. MAHONY†, AND U. HELMKE‡
Abstract. Recent work has shown that the algebraic question of determining the eigenvalues,
or singular values, of a matrix can be answered by solving certain continuous-time gradient flows on
matrix manifolds. To obtain computational methods based on this theory, it is reasonable to develop
algorithms that iteratively approximate the continuous-time flows. In this paper the authors propose
two algorithms, based on a double Lie-bracket equation recently studied by Brockett, that appear
to be suitable for implementation in parallel processing environments. The algorithms presented
achieve, respectively, the eigenvalue decomposition of a symmetric matrix and the singular value
decomposition of an arbitrary matrix. The algorithms have the same equilibria as the
continuous-time flows on which they are based and inherit the exponential convergence of the
continuous-time solutions.
Key words. eigenvalue decomposition, singular value decomposition, numerical gradient algorithm
AMS subject classifications. 15A18, 65F10
1. Introduction. A traditional algebraic approach to determining the eigenvalue
and eigenvector structure of an arbitrary matrix is the QR-algorithm. In the
early 1980s it was observed that the QR-algorithm is closely related to a continuous-time
differential equation that has become known through study of the Toda lattice.
Symes [13] and Deift, Nanda, and Tomei [6] showed that for tridiagonal real symmetric
matrices, the QR-algorithm is a discrete-time sampling of the solution to a
continuous-time differential equation. This result was generalised to full complex matrices
by Chu [3], and Watkins and Elsner [14] provided further insight in the late
1980s.
Brockett [2] studied dynamic matrix flows generated by the double Lie-bracket
equation
$\dot{H} = [H, [H, N]], \quad H(0) = H_0,$
for constant symmetric matrices $N$ and $H_0$, where we use the Lie-bracket notation
$[X, Y] = XY - YX$. We call this differential equation the double-bracket equation,
and we call solutions of this equation double-bracket flows. Similar matrix differential
equations in the area of physics were known and studied prior to the references given
above. An example is the Landau–Lifschitz–Gilbert equation of micromagnetics
$\frac{d\vec{m}}{dt} = \frac{\gamma}{1+\alpha^2}\bigl(\vec{m} \times \vec{\mathcal{H}} - \alpha\,\vec{m} \times (\vec{m} \times \vec{\mathcal{H}})\bigr), \quad |\vec{m}|^2 = 1,$
as $\alpha \to \infty$ and $\gamma/\alpha \to k$, a constant. In this equation $\vec{m}, \vec{\mathcal{H}} \in \mathbb{R}^3$ and the
cross-product is equivalent to a Lie-bracket operation. The relevance of such equations
*Received by the editors April 13, 1992; accepted for publication (in revised form) September
25, 1992. The authors acknowledge the funding of the activities of the Cooperative Research Centre
for Robust and Adaptive Systems by the Australian Commonwealth Government under the
Cooperative Research Centres Program. The authors also acknowledge additional support from Boeing
Commercial Aircraft Corporation, Inc.
†Department of Systems Engineering, Research School of Physical Sciences and Systems
Engineering, Australian National University, A.C.T., 0200, Australia (robert. mahony@smu.edu. au).
‡Department of Mathematics, University of Regensburg, 8400 Regensburg, Germany.
to traditional linear algebra problems, however, has only recently been studied and
discretisations of such flows have not been investigated.
The double-bracket equation is not known to be a continuous-time version of any
previously existing linear algebra algorithm; however, it exhibits exponential convergence
to an equilibrium point on the manifold of self-equivalent symmetric matrices
[2], [5], [9]. Brockett [2] was able to show that this flow could be used to diagonalise
real symmetric matrices, and thus to find their eigenvalues, sort lists, and even
solve linear programming problems. Part of the flexibility and theoretical appeal of
the double-bracket equation follows from its dependence on the arbitrary matrix parameter
$N$, which can be varied to control the transient behaviour of the differential
equation.
In independent work by Driessel [7], Chu and Driessel [5], Smith [12] and Helmke
and Moore [8], a similar gradient flow approach is developed for the task of computing
the singular values of a general nonsymmetric, nonsquare matrix. The differential
equation obtained in these approaches is almost identical to the double-bracket equation.
In [8], it is shown that these flows can also be derived as special cases of the
double-bracket equation for a nonsymmetric matrix, suitably augmented to be symmetric.
With the theoretical aspects of these differential equations becoming known, and
with applications in the area of balanced realizations [10], [11] along with the more
traditional matrix eigenvalue problems, there remains the question of efficiently computing
their solutions. No explicit solutions to the differential equations have been
obtained, and a direct numerical estimate of their integral solutions seems unlikely to
be an efficient computational algorithm. Iterative algorithms that approximate the
continuous-time flows, however, seem more likely to yield useful numerical methods.
Furthermore, discretisations of such isospectral matrix flows are of general theoretical
interest in the field of numerical linear algebra. For example, the algorithms proposed
in this paper involve adjustable parameters, such as step-size selection schemes and
a matrix parameter $N$, which are not present in traditional algorithms such as the
QR-algorithm or the Jacobi method.
In this paper, we propose a new algorithm, termed the Lie-bracket algorithm, for
computing the eigenvalues of an arbitrary symmetric matrix:
$H_{k+1} = e^{-\alpha_k [H_k, N]} H_k e^{\alpha_k [H_k, N]}.$
For suitably small $\alpha_k$, termed time-steps, the algorithm is an approximation of the
solution to the continuous-time double-bracket equation. Thus, the algorithm represents
an approach to developing new recursive algorithms based on approximating
suitable continuous-time flows. We show that for suitable choices of time-steps, the
Lie-bracket algorithm inherits the same equilibria as the double-bracket flow. Furthermore,
exponential convergence of the algorithm is shown. This paper presents only
theoretical results on the Lie-bracket algorithm and does not attempt to compare its
performance to that of existing methods for calculating the eigenvalues of a matrix.
Continuous-time gradient flows that compute the singular values of arbitrary
nonsymmetric matrices, such as those covered in [5], [8], [9], [12], have a similar
form to the double-bracket equation on which the Lie-bracket algorithm was based.
We use this similarity to generate a new scheme for computing the singular values
of a general matrix, termed the singular value algorithm. The natural equivalence
between the Lie-bracket algorithm and the singular value algorithm is demonstrated,
and exponential convergence results follow almost directly.
Associated with the main algorithms presented for the computation of the eigen-
values or singular values of matrices are algorithms that compute the full eigenspace
decompositions of given matrices. These algorithms are closely related to the Lie-
bracket algorithm and also display exponential convergence.
The paper is divided into eight sections including the Introduction and an Appendix.
In §2 of this paper, we consider the Lie-bracket algorithm and prove a proposition
that ensures the algorithm converges to a fixed point. Section 3 deals with
choosing step-size selection schemes and proposes two valid deterministic functions
for defining the time-steps. Considering the particular step-size selection schemes
presented in §3, we return to the question of stability in §4 and show that the
Lie-bracket algorithm has a unique exponentially attractive fixed point, though several of
the technical proofs are deferred to the Appendix. This completes the discussion for
the symmetric case, and §5 considers the nonsymmetric case and the singular value
decomposition. Section 6 presents associated algorithms that compute the eigenspace
decompositions of given initial conditions. A number of computational issues are
briefly mentioned in §7, while §8 provides a conclusion.
2. The Lie-bracket algorithm. In this section, we begin by introducing the
least squares potential that underpins the recent gradient flow results and then we
describe the double Lie-bracket equation first derived by Brockett [2]. The Lie-bracket
recursion is introduced and conditions are given that guarantee convergence of the
algorithm.
Let $N$ and $H$ be real symmetric matrices and consider the potential function
(1)  $\psi(H) := \|H - N\|^2 = \|H\|^2 + \|N\|^2 - 2\,\mathrm{tr}(NH),$
where the norm used is the Frobenius norm $\|X\|^2 := \mathrm{tr}(X^T X) = \sum x_{ij}^2$, with $x_{ij}$ the
elements of $X$. Note that $\psi(H)$ measures the least squares difference between the
elements of $H$ and the elements of $N$. Let $M(H_0)$ be the set of orthogonally similar
matrices generated by some symmetric initial condition $H_0 = H_0^T \in \mathbb{R}^{n \times n}$. Then
(2)  $M(H_0) = \{U^T H_0 U \mid U \in O(n)\},$
where $O(n)$ denotes the group of all $n \times n$ real orthogonal matrices. It is shown
in [9, p. 48] that $M(H_0)$ is a smooth compact Riemannian manifold with explicit
forms given for its tangent space and Riemannian metric. Furthermore, in [1], [5] the
gradient of $\psi(H)$, with respect to the normal Riemannian metric on $M(H_0)$ [9,
p. 50], is shown to be $\nabla\psi(H) = -[H, [H, N]]$. Consider the gradient flow given by
the solution of
(3)  $\dot{H} = -\nabla\psi(H) = [H, [H, N]], \quad \text{with } H(0) = H_0,$
which we call the double-bracket flow [2], [5]. Thus, the double-bracket flow is a
gradient flow that acts to decrease or minimise the least squares potential $\psi$ on the
manifold $M(H_0)$. Note that from (1), this is equivalent to increasing or maximizing
$\mathrm{tr}(NH)$. We refer to the matrix $H_0$ as the initial condition and the matrix $N$ as the
target matrix.
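The relation between the potential (1) and the flow (3) can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `bracket` and `potential`, the matrix size, and the random test data are illustrative choices and not part of the paper. It evaluates both forms of (1) and confirms that along the direction $[H,[H,N]]$ the potential initially changes at the rate $-2\|[H,N]\|^2$.

```python
import numpy as np

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def potential(H, N):
    """Least squares potential psi(H) = ||H - N||^2 (Frobenius norm), as in (1)."""
    return np.linalg.norm(H - N) ** 2

rng = np.random.default_rng(0)          # fixed seed, illustrative data only
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2                       # a symmetric test matrix
N = np.diag([4.0, 3.0, 2.0, 1.0])       # diagonal target with distinct entries

# Expanded form of (1): ||H||^2 + ||N||^2 - 2 tr(NH).
expanded = np.linalg.norm(H) ** 2 + np.linalg.norm(N) ** 2 - 2 * np.trace(N @ H)

# Right-hand side of (3) and the resulting initial rate of change of psi.
H_dot = bracket(H, bracket(H, N))
psi_rate = -2 * np.trace(N @ H_dot)     # d/dt psi = -2 tr(N H_dot)

print(potential(H, N), expanded)
print(psi_rate, -2 * np.linalg.norm(bracket(H, N)) ** 2)
```

The two printed pairs should agree to rounding error; the second pair is negative whenever $H$ and $N$ do not commute.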
The Lie-bracket algorithm proposed in this paper is
(4)  $H_{k+1} = e^{-\alpha_k [H_k, N]} H_k e^{\alpha_k [H_k, N]}$
for arbitrary symmetric $n \times n$ matrices $H_0$ and $N$ and some suitably small scalars
$\alpha_k$ termed time-steps. To motivate the Lie-bracket algorithm, consider the curve
$H_{k+1}(t) = e^{-t[H_k, N]} H_k e^{t[H_k, N]}$. Thus, $H_{k+1}(0) = H_k$ and $H_{k+1} = H_{k+1}(\alpha_k)$, the
$(k+1)$th iteration of (4). Observe that
$\frac{d}{dt}\bigl(e^{-t[H_k, N]} H_k e^{t[H_k, N]}\bigr)\Big|_{t=0} = [H_k, [H_k, N]],$
and thus $e^{-t[H_k, N]} H_k e^{t[H_k, N]}$ is a first approximation of the double-bracket flow at
$H_k \in M(H_0)$. It follows that for small $\alpha_k$, the solution to (3) evaluated at time $t = \alpha_k$
with $H(0) = H_k$ is approximately $H_{k+1} = H_{k+1}(\alpha_k)$.
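A single iteration of the recursion (4) might be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it uses NumPy, the names `bracket`, `expm_skew` and `lie_bracket_step` are hypothetical, and the exponential of the skew-symmetric matrix $\alpha[H,N]$ is computed from the eigendecomposition of the Hermitian matrix $i\alpha[H,N]$ rather than by any method discussed in the paper.

```python
import numpy as np

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix A.

    1j*A is Hermitian, so 1j*A = V diag(w) V^H and exp(A) = V diag(exp(-1j*w)) V^H.
    """
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def lie_bracket_step(H, N, alpha):
    """One iteration of (4): H_next = exp(-alpha [H, N]) H exp(alpha [H, N])."""
    U = expm_skew(alpha * bracket(H, N))   # orthogonal, since [H, N] is skew
    return U.T @ H @ U                     # U.T equals exp(-alpha [H, N])

rng = np.random.default_rng(0)             # illustrative data only
A = rng.standard_normal((4, 4))
H0 = (A + A.T) / 2
N = np.diag([4.0, 3.0, 2.0, 1.0])

H1 = lie_bracket_step(H0, N, 0.01)
print(np.linalg.eigvalsh(H0))
print(np.linalg.eigvalsh(H1))              # same spectrum up to rounding
```

Because the update is an orthogonal conjugation, the two printed spectra coincide up to rounding, which is the isospectral property used throughout the analysis.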
It is easily seen from the above that stationary points of (3) are fixed points of (4).
In general, (4) may have more fixed points than just the stationary points of (3);
however, Proposition 2.1 shows that this is not the case for a suitable choice of
time-step $\alpha_k$. We use the term equilibrium point to mean a fixed point of the algorithm
that is also a stationary point of (3).
To implement (4) it is necessary to specify the time-steps $\alpha_k$. We do this by
considering functions $\alpha_N : M(H_0) \to \mathbb{R}^+$ and setting $\alpha_k := \alpha_N(H_k)$. We refer to the
function $\alpha_N$ as the step-size selection scheme. We require that the step-size selection
scheme satisfies the following condition.
CONDITION 2.1. Let $\alpha_N : M(H_0) \to \mathbb{R}^+$ be a step-size selection scheme for the
Lie-bracket algorithm on $M(H_0)$. Then $\alpha_N$ is well defined and continuous on all of
$M(H_0)$, except possibly those points $H \in M(H_0)$ where $HN = NH$. Furthermore,
there exist real numbers $B, \gamma > 0$, such that $B > \alpha_N(H) \ge \gamma$ for all $H \in M(H_0)$
where $\alpha_N$ is well defined.
Remark 2.1. We find that the variable step-size selection scheme proposed in this
paper, which provides the best simulation results, is discontinuous at all the points
$H \in M(H_0)$ such that $[H, N] = 0$.
Remark 2.2. Note that the definition of a step-size selection scheme depends
implicitly on the matrix parameter $N$. Indeed, $\alpha_N$ can be thought of as a function in
two matrix variables $N$ and $H$.
CONDITION 2.2. Let $N$ be a diagonal $n \times n$ matrix with distinct diagonal entries
$\mu_1 > \mu_2 > \cdots > \mu_n$.
Remark 2.3. This condition on N, along with Condition 2.1 on the step-size
selection scheme, is chosen to ensure that the Lie-bracket algorithm converges to a
diagonal matrix from which the eigenvalues of Ho can be directly determined.
Let $\lambda_1 > \lambda_2 > \cdots > \lambda_r$ be the eigenvalues of $H_0$ with associated algebraic
multiplicities $n_1, \ldots, n_r$ satisfying $\sum_{i=1}^{r} n_i = n$. Note that as $H_0$ is symmetric, the
eigenvalues of $H_0$ are all real. Thus, the diagonalisation of $H_0$ is
(5)  $\Lambda = \mathrm{diag}(\lambda_1 I_{n_1}, \ldots, \lambda_r I_{n_r}),$
where $I_{n_i}$ is the $n_i \times n_i$ identity matrix. For generic initial conditions and a target
matrix $N$ that satisfies Condition 2.2, the continuous-time equation (3) converges
exponentially fast to $\Lambda$ [2], [9]. Thus, the eigenvalues of $H_0$ are the diagonal entries
of the limiting value of the infinite time solution to (3). The Lie-bracket algorithm
behaves similarly to (3) for small $\alpha_k$ and, given a suitable step-size selection scheme,
should converge to the same equilibrium as the continuous-time equation.
PROPOSITION 2.1. Let $H_0$ and $N$ be $n \times n$ real symmetric matrices where $N$
satisfies Condition 2.2. Let $\psi(H)$ be given by (1) and let $\alpha_N : M(H_0) \to \mathbb{R}^+$ be a
step-size selection scheme that satisfies Condition 2.1. For $H_k \in M(H_0)$, let $\alpha_k = \alpha_N(H_k)$
and define
(6)  $\Delta\psi(H_k, \alpha_k) := \psi(H_{k+1}) - \psi(H_k),$
where $H_{k+1}$ is given by (4). Suppose
(7)  $\Delta\psi(H_k, \alpha_k) < 0$ when $[H_k, N] \ne 0$.
Then
(a) The iterative equation (4) defines an isospectral (eigenvalue preserving)
recursion on the manifold $M(H_0)$.
(b) The fixed points of (4) are characterised by matrices $H \in M(H_0)$ satisfying
(8)  $[H, N] = 0.$
(c) Every solution $H_k$, for $k = 1, 2, \ldots$, of (4) converges as $k \to \infty$ to some
$H_\infty \in M(H_0)$ where $[H_\infty, N] = 0$.
Proof. To prove part (a), note that the Lie-bracket satisfies $[H, N]^T = -[H, N]$, that is,
it is skew-symmetric. As the exponential of a skew-symmetric matrix is orthogonal, (4) is an
orthogonal conjugation of $H_k$ and hence is isospectral.
For part (b) note that if $[H_k, N] = 0$, then by direct substitution into (4) we see
$H_{k+1} = H_k$ and thus $H_{k+l} = H_k$ for $l \ge 1$, and $H_k$ is a fixed point of (4). Conversely,
if $[H_k, N] \ne 0$, then from (7), $\Delta\psi(H_k, \alpha_k) \ne 0$, and thus $H_{k+1} \ne H_k$. By inspection,
points satisfying (8) are stationary points of (3), and indeed are known to be the only
stationary points of (3) [9, p. 50]. Thus, the fixed points of (4) are equilibrium
points in the sense that they are all stationary points of (3). To prove part (c) we
need the following lemma.
LEMMA 2.2. Let $N$ satisfy Condition 2.2 and $\alpha_N$ satisfy Condition 2.1 such
that the Lie-bracket algorithm satisfies (7). The Lie-bracket algorithm (4) has exactly
$n!/\prod_{i=1}^{r}(n_i!)$ distinct equilibrium points in $M(H_0)$. These equilibrium points are
characterised by the matrices $\pi^T \Lambda \pi$, where $\pi$ is an $n \times n$ permutation matrix, a
rearrangement of the rows of the identity matrix, and $\Lambda$ is given by (5).
Proof. Note that part (b) of Proposition 2.1 characterises equilibrium points of
(4) as $H \in M(H_0)$ such that $[H, N] = 0$. Evaluating this condition componentwise
for $H = \{h_{ij}\}$ gives
$h_{ij}(\mu_j - \mu_i) = 0,$
and hence by Condition 2.2, $h_{ij} = 0$ for $i \ne j$. Using the fact that (4) is isospectral, it
follows that equilibrium points are diagonal matrices that have the same eigenvalues
as $H_0$. Such matrices are distinct and can be written in the form $\pi^T \Lambda \pi$ for $\pi$ an $n \times n$
permutation matrix. A simple counting argument yields the number of matrices that
satisfy this condition to be $n!/\prod_{i=1}^{r}(n_i!)$. □
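The counting argument can be illustrated by brute force for a small example. The sketch below uses only the Python standard library; the function names and the sample eigenvalue list are hypothetical. It compares the formula $n!/\prod_i (n_i!)$ with the number of distinct diagonals obtained by permuting the eigenvalues.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def count_equilibria(eigs):
    """Evaluate n! / prod(n_i!) for a list of eigenvalues given with repeats."""
    multiplicities = Counter(eigs).values()
    return factorial(len(eigs)) // prod(factorial(m) for m in multiplicities)

def enumerate_equilibria(eigs):
    """All distinct diagonals obtained by permuting the eigenvalues."""
    return sorted(set(permutations(eigs)))

eigs = [3, 3, 2, 1]        # n = 4, multiplicities (2, 1, 1)
print(count_equilibria(eigs), len(enumerate_equilibria(eigs)))   # prints "12 12"
```

Each tuple returned by `enumerate_equilibria` is the diagonal of one matrix $\pi^T \Lambda \pi$.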
Consider, for a fixed initial condition $H_0$, the sequence $H_k$ generated by the
Lie-bracket algorithm. Observe that condition (7) implies that $\psi(H_k)$ is strictly monotonically
decreasing for all $k$ where $[H_k, N] \ne 0$. Also, since $\psi$ is a continuous function on the
compact set $M(H_0)$, $\psi$ is bounded from below and $\psi(H_k)$ will converge to some
nonnegative value $\psi_\infty$. As $\psi(H_k) \to \psi_\infty$, then $\Delta\psi(H_k, \alpha_k) \to 0$.
For an arbitrary positive number $\epsilon$ define the open set $D_\epsilon \subset M(H_0)$, consisting
of all points of $M(H_0)$ within an $\epsilon$ neighbourhood of some equilibrium point of (4).
The set $M(H_0) - D_\epsilon$ is a closed, compact subset of $M(H_0)$ on which the matrix
function $H \mapsto [H, N]$ does not vanish. As a consequence, the difference function (6)
is continuous and strictly negative on $M(H_0) - D_\epsilon$, and thus can be over bounded
by some strictly negative number $\delta_1 < 0$. Moreover, as $\Delta\psi(H_k, \alpha_k) \to 0$, there
exists a $K = K(\delta_1)$ such that for all $k > K$, $0 \ge \Delta\psi(H_k, \alpha_k) > \delta_1$. This ensures
that $H_k \in D_\epsilon$ for all $k > K$. In other words, $H_k$ is converging to some subset of
possible equilibrium points.
Imposing the upper bound $B$ on the step-size selection scheme $\alpha_N$ (Condition
2.1), it follows that $\alpha_N(H_k)[H_k, N] \to 0$ as $k \to \infty$. Thus $e^{\alpha_N(H_k)[H_k, N]} \to I$, the
identity matrix, and hence $e^{-\alpha_N(H_k)[H_k, N]} H_k e^{\alpha_N(H_k)[H_k, N]} \to H_k$ as $k \to \infty$. As a
consequence $\|H_{k+1} - H_k\| \to 0$ for $k \to \infty$, and this combined with the distinct nature
of the fixed points, Lemma 2.2, and the partial convergence already shown, completes
the proof. □
Remark 2.4. In Condition 2.2 it was required that $N$ have distinct diagonal
entries. If this condition is not satisfied, the equilibrium condition $[H, N] = 0$ may
no longer force $H$ to be diagonal, and thus, though the algorithm will converge, it is
unlikely to converge to a diagonal matrix.
3. Step-size selection. The Lie-bracket algorithm (4) requires a suitable
step-size selection scheme before it can be implemented. To generate such a scheme, we
use the potential (1) as a measure of the convergence of (4) at each iteration. Thus,
we aim to choose each time-step to maximise the absolute change in potential $|\Delta\psi|$
of (6), such that $\Delta\psi < 0$. Optimal time-steps can be determined at each step of the
iteration by completing a line search to maximise the absolute change in potential as
the time-step is increased. Such an approach, however, involves high computational
overheads, and we aim rather to obtain a step-size selection scheme in the form of a
scalar equation depending on known values.
Using the Taylor expansion, we express $\Delta\psi(H_k, \tau)$ for a general time-step $\tau$ as a
linear term plus a higher order error term. By estimating the error term we obtain a
mathematically simple function $\Delta\psi_U(H_k, \tau)$, which is an upper bound to $\Delta\psi(H_k, \tau)$
for all $\tau$. Then, choosing a suitable time-step $\alpha_k$ based on minimising $\Delta\psi_U$, we
guarantee that the actual change in potential, $\Delta\psi(H_k, \alpha_k) \le \Delta\psi_U(H_k, \alpha_k) < 0$,
satisfies (7). Due to the simple nature of the function $\Delta\psi_U$, there is an explicit form
for the time-step $\alpha_k$ depending only on $H_k$ and $N$. We begin by deriving an expression
for the error term.
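LEMMA 3.1. For the $k$th step of the recursion (4) the change in potential
$\Delta\psi(H_k, \tau)$ of (6), for a time-step $\tau$, is
(9)  $\Delta\psi(H_k, \tau) = -2\tau\|[H_k, N]\|^2 - 2\tau^2\,\mathrm{tr}(N\mathcal{R}_2(\tau))$
with
(10)  $\mathcal{R}_2(\tau) := \int_0^1 (1 - s)\,H''_{k+1}(s\tau)\,ds,$
where $H''_{k+1}(\tau)$ is the second derivative of $H_{k+1}(\tau)$ with respect to $\tau$.
Proof. Let $H_{k+1}(\tau)$ be the $(k+1)$th recursive estimate for an arbitrary time-step
$\tau$. Thus $H_{k+1}(\tau) = e^{-\tau[H_k, N]} H_k e^{\tau[H_k, N]}$. It is easy to verify that the first and second
time derivatives of $H_{k+1}$ are exactly
$H'_{k+1}(\tau) = [H_{k+1}(\tau), [H_k, N]], \qquad H''_{k+1}(\tau) = [[H_{k+1}(\tau), [H_k, N]], [H_k, N]].$
Applying Taylor's theorem, then
(11)  $$\begin{aligned} H_{k+1}(\tau) &= H_{k+1}(0) + \tau H'_{k+1}(0) + \tau^2 \int_0^1 (1 - s)\,H''_{k+1}(s\tau)\,ds \\ &= H_k + \tau[H_k, [H_k, N]] + \tau^2 \mathcal{R}_2(\tau). \end{aligned}$$
Consider the change in the potential $\psi(H)$ between the points $H_k$ and $H_{k+1}(\tau)$,
(12)  $$\begin{aligned} \Delta\psi(H_k, \tau) &= \psi(H_{k+1}(\tau)) - \psi(H_k) \\ &= -2\,\mathrm{tr}\bigl(N(H_{k+1}(\tau) - H_k)\bigr) \\ &= -2\,\mathrm{tr}\bigl(N(\tau[H_k, [H_k, N]] + \tau^2 \mathcal{R}_2(\tau))\bigr) \\ &= -2\tau\|[H_k, N]\|^2 - 2\tau^2\,\mathrm{tr}(N\mathcal{R}_2(\tau)). \end{aligned}$$ □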
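The last step of the proof relies on the identity $\mathrm{tr}(N[H_k,[H_k,N]]) = \|[H_k,N]\|^2$, which is not spelled out. A short derivation, using only the cyclic property of the trace and the skew-symmetry of $A := [H_k, N]$ (the symbol $A$ is introduced here for brevity), could read:

```latex
\begin{aligned}
\operatorname{tr}\bigl(N[H_k,[H_k,N]]\bigr)
  &= \operatorname{tr}(N H_k A) - \operatorname{tr}(N A H_k),
     \qquad A := [H_k, N],\\
  &= \operatorname{tr}(N H_k A) - \operatorname{tr}(H_k N A)
     && \text{(cyclic property of the trace)}\\
  &= -\operatorname{tr}\bigl((H_k N - N H_k)A\bigr) = -\operatorname{tr}(A A)\\
  &= \operatorname{tr}(A^{T} A) = \|[H_k, N]\|^{2}
     && \text{(since } A^{T} = -A\text{)}.
\end{aligned}
```

The same computation shows that the first-order change in potential is strictly negative whenever $[H_k, N] \ne 0$.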
Note that for $\tau = 0$, $\Delta\psi(H_k, 0) = 0$ and also that
$\frac{d}{d\tau}\Delta\psi(H_k, \tau)\Big|_{\tau=0} = -2\|[H_k, N]\|^2.$
Thus, for sufficiently small $\tau$ the error term $\tau^2\,\mathrm{tr}(N\mathcal{R}_2(\tau))$ becomes negligible and
$\Delta\psi(H_k, \tau)$ is strictly negative. Let $\alpha_{\mathrm{opt}} > 0$ be the first time for which
$\frac{d}{d\tau}\Delta\psi(H_k, \tau)\Big|_{\tau=\alpha_{\mathrm{opt}}} = 0,$
then $\Delta\psi(H_k, \alpha_{\mathrm{opt}}) < \Delta\psi(H_k, \tau) < 0$ for all strictly positive $\tau < \alpha_{\mathrm{opt}}$. It is not
possible, however, to estimate $\alpha_{\mathrm{opt}}$ directly from (12) due to the transcendental nature
of the error term $\mathcal{R}_2(\tau)$. By considering two separate estimates of the error term, we
obtain two step-size selection schemes $\alpha_k \le \alpha_{\mathrm{opt}}$. The first, constant step-size
selection scheme follows from a loose bound of the error, whereas the second, variable
step-size selection scheme follows from a more sophisticated argument and results in
faster convergence of (4).
LEMMA 3.2 (Constant step-size selection scheme). The constant time-step
(13)  $\alpha_N^c = \frac{1}{4\|H_0\| \cdot \|N\|}$
satisfies Condition 2.1. Furthermore, the Lie-bracket algorithm, equipped with the
step-size selection scheme $\alpha_N^c$, satisfies (7).
Proof. Recall that for the Frobenius norm $|\mathrm{tr}(XY)| \le \|X\| \cdot \|Y\|$. Then
(14)  $$\begin{aligned} \Delta\psi(H_k, \tau) &\le -2\tau\|[H_k, N]\|^2 + 2\tau^2\,|\mathrm{tr}(N\mathcal{R}_2(\tau))| \\ &\le -2\tau\|[H_k, N]\|^2 + 2\tau^2\,\|N\| \cdot \|\mathcal{R}_2(\tau)\| \\ &\le -2\tau\|[H_k, N]\|^2 + 2\tau^2\,\|N\| \int_0^1 (1 - s)\,\bigl\|[[H_{k+1}(s\tau), [H_k, N]], [H_k, N]]\bigr\|\,ds \\ &\le -2\tau\|[H_k, N]\|^2 + 4\tau^2\,\|N\| \cdot \|H_0\| \cdot \|[H_k, N]\|^2 \\ &=: \Delta\psi_U(H_k, \tau). \end{aligned}$$
Thus $\Delta\psi_U(H_k, \tau)$ is an upper bound for $\Delta\psi(H_k, \tau)$ and has the property that for
sufficiently small $\tau$, it is strictly negative; see Fig. 1. Due to the quadratic form of
$\Delta\psi_U(H_k, \tau)$ in $\tau$, it is immediately clear that $\alpha_k^c = \alpha_N^c(H_k) = 1/(4\|H_0\|\,\|N\|)$ of (13)
is the minimum of (14). □
FIG. 1. The upper bound on $\Delta\psi(H_k, \alpha)$, viz. $\Delta\psi_U(H_k, \alpha)$.
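The behaviour promised by Lemma 3.2 can be observed on a small example. The sketch below is illustrative only and assumes NumPy; the helper names, the $5 \times 5$ test matrix, the iteration count and the eigendecomposition-based matrix exponential are choices made here, not taken from the paper. It runs the recursion (4) with the constant time-step (13) and records the potential at every iteration.

```python
import numpy as np

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def expm_skew(A):
    """exp(A) for real skew-symmetric A, via the Hermitian matrix 1j*A."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def constant_step(H0, N):
    """Constant time-step (13): 1 / (4 ||H0|| ||N||), Frobenius norms."""
    return 1.0 / (4.0 * np.linalg.norm(H0) * np.linalg.norm(N))

def run_constant(H0, N, iters):
    """Iterate (4) with the constant step; return final H and potential history."""
    alpha = constant_step(H0, N)
    H, psi = H0.copy(), []
    for _ in range(iters):
        psi.append(np.linalg.norm(H - N) ** 2)
        U = expm_skew(alpha * bracket(H, N))
        H = U.T @ H @ U
    return H, np.array(psi)

rng = np.random.default_rng(0)              # illustrative data only
A = rng.standard_normal((5, 5))
H0 = (A + A.T) / 2
N = np.diag([5.0, 4.0, 3.0, 2.0, 1.0])

H, psi = run_constant(H0, N, 200)
print(psi[0], psi[-1])                      # the potential has decreased
```

The recorded potentials form a non-increasing sequence, and the spectrum of the final iterate matches that of `H0` up to rounding.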
A direct norm bound of the integral error term is not likely to be a tight estimate
of the error, and the function $\Delta\psi_U$ is a fairly crude bound for $\Delta\psi$. The following
more sophisticated estimate results in a step-size selection scheme that causes the
Lie-bracket algorithm to converge an order of magnitude faster.
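LEMMA 3.3 (An improved bound for $\Delta\psi(H_k, \tau)$). The difference function
$\Delta\psi(H_k, \tau)$ can be over bounded by
(15)  $\Delta\psi(H_k, \tau) \le -2\tau\|[H_k, N]\|^2 + \frac{\|H_0\| \cdot \|[N, [H_k, N]]\|}{\|[H_k, N]\|}\bigl(e^{2\tau\|[H_k, N]\|} - 1 - 2\tau\|[H_k, N]\|\bigr).$
Proof. Consider the Taylor series expansion of the matrix exponential
$e^{A} = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots.$
It is easily verified that
(16)  $e^{A} B e^{-A} = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \cdots = \sum_{i=0}^{\infty} \frac{1}{i!}\,\mathrm{ad}_A^i B.$
Here $\mathrm{ad}_A^i B = \mathrm{ad}_A(\mathrm{ad}_A^{i-1} B)$, $\mathrm{ad}_A^0 B = B$, where $\mathrm{ad}_A : \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}$ is the linear
map $X \mapsto AX - XA$. Substituting $-\tau[H_k, N]$ and $H_k$ for $A$ and $B$ in (16) and
comparing with (11) gives
$\tau^2 \mathcal{R}_2(\tau) = \sum_{j=2}^{\infty} \frac{1}{j!}\,\mathrm{ad}_{-\tau[H_k, N]}^{j} H_k.$
Considering $|\tau^2\,\mathrm{tr}(N\mathcal{R}_2(\tau))|$ and using the readily established identity
$\mathrm{tr}(N\,\mathrm{ad}_{-A}^{i} B) = \mathrm{tr}((\mathrm{ad}_{A}^{i} N) B)$ gives
$$\begin{aligned} |\tau^2\,\mathrm{tr}(N\mathcal{R}_2(\tau))| &= \Bigl|\sum_{j=2}^{\infty} \frac{1}{j!}\,\mathrm{tr}\bigl(\mathrm{ad}_{\tau[H_k, N]}^{j}(N)\,H_k\bigr)\Bigr| \\ &\le \sum_{j=2}^{\infty} \frac{1}{j!}\,\bigl\|\mathrm{ad}_{\tau[H_k, N]}^{j}(N)\bigr\| \cdot \|H_0\| \\ &\le \sum_{j=2}^{\infty} \frac{1}{j!}\,\bigl(2\|\tau[H_k, N]\|\bigr)^{j-1}\,\bigl\|\mathrm{ad}_{\tau[H_k, N]}(N)\bigr\| \cdot \|H_0\| \\ &= \frac{\|H_0\| \cdot \bigl\|\mathrm{ad}_{\tau[H_k, N]}(N)\bigr\|}{2\tau\|[H_k, N]\|} \sum_{j=2}^{\infty} \frac{1}{j!}\,\bigl(2\tau\|[H_k, N]\|\bigr)^{j} \\ &= \frac{\|H_0\| \cdot \|[N, [H_k, N]]\|}{2\|[H_k, N]\|}\bigl(e^{2\tau\|[H_k, N]\|} - 1 - 2\tau\|[H_k, N]\|\bigr). \end{aligned}$$
Thus combining this with the first line of (14) gives (15). □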
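Minimising the bound of Lemma 3.3 over the time-step is a one-line calculus exercise. The following worked derivation is a sketch that introduces the shorthand $a$, $c$, $h$ and $f$ purely for readability; these symbols are not used in the paper.

```latex
% Shorthand (illustrative): a = \|[H_k,N]\|, \; c = \|[N,[H_k,N]]\|, \; h = \|H_0\|.
\begin{aligned}
f(\tau) &:= -2\tau a^{2} + \frac{h c}{a}\bigl(e^{2\tau a} - 1 - 2\tau a\bigr),\\
f'(\tau) &= -2a^{2} + 2 h c\,\bigl(e^{2\tau a} - 1\bigr),
  \qquad f''(\tau) = 4 a h c\, e^{2\tau a} > 0,\\
f'(\tau) = 0
  &\iff e^{2\tau a} = \frac{a^{2}}{h c} + 1
  \iff \tau = \frac{1}{2a}\log\!\Bigl(\frac{a^{2}}{h c} + 1\Bigr).
\end{aligned}
```

Since $f(0) = 0$, $f'(0) = -2a^2 < 0$ and $f$ is convex, the stationary point is the unique minimiser and $f$ is strictly negative there.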
The variable step-size selection scheme is derived from this estimate of the error
term in the same manner the constant step-size selection scheme was derived in Lemma
3.2.
LEMMA 3.4 (Variable step-size selection scheme). The step-size selection scheme
$\alpha_N^* : M(H_0) \to \mathbb{R}^+$,
(17)  $\alpha_N^*(H) = \frac{1}{2\|[H, N]\|} \log\Bigl(\frac{\|[H, N]\|^2}{\|H_0\| \cdot \|[N, [H, N]]\|} + 1\Bigr),$
where all norms are Frobenius norms, satisfies Condition 2.1. Furthermore, the
Lie-bracket algorithm, equipped with the step-size selection scheme $\alpha_N^*$, satisfies (7).
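Proof. We first show that $\alpha_N^*$ satisfies the requirements of Condition 2.1. As the
Frobenius norm is a continuous function, $\alpha_N^*$ is well defined and continuous at all
points $H \in M(H_0)$ such that $[H, N] \ne 0$. Note that when $[H, N] = 0$, then $\alpha_N^*$ is not
well defined. To show that there exists a positive constant $\gamma$ such that $\alpha_N^*(H) > \gamma$,
consider the following lower bound,
(18)  $$\begin{aligned} \alpha_N' &:= \frac{1}{2\|[H_k, N]\|} \log\Bigl(\frac{\|[H_k, N]\|}{2\|H_0\| \cdot \|N\|} + 1\Bigr) \\ &\le \frac{1}{2\|[H_k, N]\|} \log\Bigl(\frac{\|[H_k, N]\|^2}{2\|H_0\| \cdot \|N\| \cdot \|[H_k, N]\|} + 1\Bigr) \\ &\le \frac{1}{2\|[H_k, N]\|} \log\Bigl(\frac{\|[H_k, N]\|^2}{\|H_0\| \cdot \|[N, [H_k, N]]\|} + 1\Bigr), \end{aligned}$$
which is just $\alpha_N^*$. Using L'Hôpital's rule it can be seen that the limit of $\alpha_N'$ at an
equilibrium point, $H \in M(H_0)$ such that $[H, N] = 0$, is $1/(4\|H_0\| \cdot \|N\|)$. Including
these points in the definition of $\alpha_N'$ gives that $\alpha_N'$ is a continuous, strictly positive,
well-defined function for all $H \in M(H_0)$. Thus, as $M(H_0)$ is compact, there exists a
real number $\gamma > 0$ such that
$\alpha_N^* \ge \alpha_N' \ge \gamma > 0$
on $M(H_0) - \{H_\infty \mid [H_\infty, N] = 0\}$.
To show that there exists a real number $B > 0$ such that $\alpha_N^*(H) < B$, $H \in M(H_0)$,
set $[H, N] = X = \{x_{ij}\}$. For $N$ given by Condition 2.2, $\|[N, X]\|^2 =
\sum_{i \ne j} (\mu_i - \mu_j)^2 x_{ij}^2$, where $x_{ii} = 0$ as $[H, N]$ is skew-symmetric. Observe that
$\frac{\|X\|^2}{\|[N, X]\|^2} = \frac{\sum_{i \ne j} x_{ij}^2}{\sum_{i \ne j} (\mu_i - \mu_j)^2 x_{ij}^2} \le \max_{i \ne j}\,(\mu_i - \mu_j)^{-2} =: \delta$
for all choices of $X = -X^T$. It follows that
$\alpha_N^*(H) = \frac{1}{2\|X\|} \log\Bigl(\frac{\|X\|^2}{\|H_0\| \cdot \|[N, X]\|} + 1\Bigr) < \frac{\|X\|}{2\|H_0\| \cdot \|[N, X]\|} \le \frac{\sqrt{\delta}}{2\|H_0\|},$
since $\log(x + 1) < x$ for $x > 0$, and the right-hand side serves as the bound $B$.
Finally, for a matrix $H_k \in M(H_0)$ with $[H_k, N] \ne 0$, the time-step $\alpha_N^*(H_k) = \alpha_k^* > 0$
minimises (15), and from Lemma 3.3 it follows that $0 > \Delta\psi_U^*(H_k, \alpha_k^*) \ge \Delta\psi(H_k, \alpha_k^*)$,
where $\Delta\psi_U^*$ denotes the right-hand side of (15).
Thus, the Lie-bracket algorithm, equipped with the step-size selection scheme $\alpha_N^*$,
satisfies (7) and the proof is complete. □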
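The chain of inequalities in this proof can be spot-checked numerically. The sketch below assumes NumPy; the function names `variable_step`, `bound_15` and `delta_psi`, the random test matrix and the eigendecomposition-based exponential are illustrative assumptions. It evaluates the variable time-step (17) at one point and compares the actual change in potential with the bound (15).

```python
import numpy as np

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def expm_skew(A):
    """exp(A) for real skew-symmetric A, via the Hermitian matrix 1j*A."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def variable_step(H, N, h0):
    """Time-step (17); h0 = ||H_0||. Not defined when [H, N] = 0."""
    A = bracket(H, N)
    a = np.linalg.norm(A)
    c = np.linalg.norm(bracket(N, A))
    return np.log1p(a * a / (h0 * c)) / (2.0 * a)

def bound_15(H, N, h0, tau):
    """Right-hand side of (15) evaluated at time-step tau."""
    A = bracket(H, N)
    a = np.linalg.norm(A)
    c = np.linalg.norm(bracket(N, A))
    return -2 * tau * a**2 + h0 * c / a * (np.expm1(2 * tau * a) - 2 * tau * a)

def delta_psi(H, N, tau):
    """Actual change in potential after one step of (4) with time-step tau."""
    U = expm_skew(tau * bracket(H, N))
    H_next = U.T @ H @ U
    return np.linalg.norm(H_next - N) ** 2 - np.linalg.norm(H - N) ** 2

rng = np.random.default_rng(0)              # illustrative data only
A = rng.standard_normal((5, 5))
H0 = (A + A.T) / 2
N = np.diag([5.0, 4.0, 3.0, 2.0, 1.0])
h0 = np.linalg.norm(H0)

alpha = variable_step(H0, N, h0)
print(alpha, delta_psi(H0, N, alpha), bound_15(H0, N, h0, alpha))
```

The actual change should lie at or below the bound, and the bound itself should be negative. A practical implementation would need a guard for $[H, N] = 0$, where (17) is undefined.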
4. Stability analysis. In this section we study the stability of equilibria of the
Lie-bracket algorithm (4). It is shown that for generic initial conditions and any
step-size selection scheme that satisfies Condition 2.1 and (7), the solution $H_k$ of
the Lie-bracket algorithm converges to the unique equilibrium point $\Lambda$ given by (5).
Furthermore, we derive local exponential bounds on the rate of convergence. To
improve the readability of the paper the proofs of a number of the more technical
results have been deferred to an appendix. We begin by showing that $\Lambda$ is the unique
locally asymptotically stable equilibrium point of (4).
LEMMA 4.1. Let $N$ satisfy Condition 2.2 and $\alpha_N$ be some selection scheme that
satisfies Condition 2.1 and (7). The Lie-bracket algorithm (4) has a unique locally
asymptotically stable equilibrium point $\Lambda$ given by (5). All other equilibrium points of
(4) are unstable.
Proof. It is known that $\Lambda$ is the unique local and global minimum of the potential
function $\psi$ on $M(H_0)$ [9]. By assumptions on $N$ and $\alpha_N$, $\psi(H_k)$ is monotonically
decreasing. Thus the domain of attraction of $\Lambda$ contains an open neighbourhood of
$\Lambda$, and hence $\Lambda$ is a locally asymptotically stable equilibrium point of (4).
All other equilibrium points $H_\infty$ are either saddle points or maxima of $\psi$ [9].
Thus for any neighbourhood $D$ of some equilibrium point $H_\infty \ne \Lambda$, there exists
some $H_0 \in D$ such that $\psi(H_0) < \psi(H_\infty)$. It follows that the solution to the
Lie-bracket algorithm, with initial condition $H_0$, will not converge to $H_\infty$, and thus $H_\infty$ is
unstable. □
Lemma 4.1 is sufficient to conclude that for generic initial conditions the
Lie-bracket algorithm will converge to the unique matrix $\Lambda$. It is difficult to characterise
the set of initial conditions for which the algorithm converges to some unstable
equilibrium point $H_\infty \ne \Lambda$. For the continuous-time double-bracket flow, however, it is
known that the unstable basins of attraction of such points are of zero measure in
$M(H_0)$ [9].
LEMMA 4.2. Let $N$ satisfy Condition 2.2. Let $d \in \mathbb{R}^+$ be a constant such that
$0 < d < 1/(2\|H_0\|_2\,\|N\|_2)$ and consider the constant step-size selection scheme
$\alpha_N^d : M(H_0) \to \mathbb{R}^+$,
$\alpha_N^d(H) = d.$
The Lie-bracket algorithm (4) equipped with the step-size selection scheme $\alpha_N^d$ has a
unique locally exponentially asymptotically stable equilibrium point $\Lambda$ given by (5).
Proof. Since $\alpha_N^d$ is a constant function, the time-step $\alpha_k^d = \alpha_N^d(H_k) = d$ is
constant. Thus, the map
$H_k \mapsto e^{-d[H_k, N]} H_k e^{d[H_k, N]}$
is a differentiable map on all of $M(H_0)$, and we may consider the linearisation of this map
at the equilibrium point $\Lambda$ given by (5). The linearisation of this recursion, expressed
in terms of $\Xi_k \in T_\Lambda M(H_0)$ (the tangent space at the equilibrium point $\Lambda$), is
(19)  $\Xi_{k+1} = \Xi_k - d\,[(\Xi_k N - N \Xi_k)\Lambda - \Lambda(\Xi_k N - N \Xi_k)].$
Thus for the elements of $\Xi_k$, we have
(20)  $(\xi_{ij})_{k+1} = [1 - d(\lambda_i - \lambda_j)(\mu_i - \mu_j)]\,(\xi_{ij})_k$ for $i, j = 1, \ldots, n$,
where $\lambda_i$ here denotes the $i$th diagonal entry of $\Lambda$.
The tangent space $T_\Lambda M(H_0)$ at $\Lambda$ consists of those matrices $\Xi = [\Lambda, \Omega]$ where
$\Omega \in \mathrm{Skew}(n)$, the class of skew-symmetric matrices [9, p. 53]. Thus, the matrices $\Xi$ are
parameterised by their components $\xi_{ij}$, where $i < j$ and $\lambda_i \ne \lambda_j$. This is a linearly
independent parameterisation of $T_\Lambda M(H_0)$, and the eigenvalues of the linearisation
(19) can be read directly from (20) as $1 - d(\lambda_i - \lambda_j)(\mu_i - \mu_j)$, for $i < j$ and
$\lambda_i \ne \lambda_j$. Since $\lambda_i \ge \lambda_j$ when $i < j$, then if $d < 1/(2\|H_0\|_2\,\|N\|_2)$ it follows that
$|1 - d(\lambda_i - \lambda_j)(\mu_i - \mu_j)| < 1$
for all $i < j$ with $\lambda_i \ne \lambda_j$. Classical stability theory gives that $\Lambda$ is a locally
exponentially asymptotically stable equilibrium point of the recursion (4) with an exponential
rate of convergence of $\max_{i<j,\ \lambda_i \ne \lambda_j}\{d(\lambda_i - \lambda_j)(\mu_i - \mu_j)\}$. □
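The multipliers read off from (20) are easy to tabulate for a concrete case. The sketch below assumes NumPy; the function name `linearisation_multipliers` and the sample values of $\lambda$, $\mu$ and $d$ are hypothetical and chosen only for illustration. For a diagonal matrix the spectral norm is the largest absolute diagonal entry, which is what the bound on $d$ uses here.

```python
import numpy as np

def linearisation_multipliers(lam, mu, d):
    """Multipliers 1 - d (lam_i - lam_j)(mu_i - mu_j) for i < j with lam_i != lam_j."""
    lam = np.asarray(lam, dtype=float)
    mu = np.asarray(mu, dtype=float)
    i, j = np.triu_indices(len(lam), k=1)      # all index pairs with i < j
    keep = lam[i] != lam[j]                    # repeated eigenvalues give no tangent direction
    return 1.0 - d * (lam[i] - lam[j])[keep] * (mu[i] - mu[j])[keep]

lam = [7.0, 5.0, 5.0, 2.0]     # diagonal of Lambda, non-increasing (one repeated value)
mu = [4.0, 3.0, 2.0, 1.0]      # diagonal of N, strictly decreasing

norm2_H0 = max(abs(x) for x in lam)            # spectral norm of H0 (equal to that of Lambda)
norm2_N = max(abs(x) for x in mu)              # spectral norm of N
d = 0.9 / (2.0 * norm2_H0 * norm2_N)           # any d below 1 / (2 ||H0||_2 ||N||_2)

multipliers = linearisation_multipliers(lam, mu, d)
print(multipliers)
print(np.max(np.abs(multipliers)))             # strictly less than 1
```

All multipliers have modulus below one, so the linearised recursion contracts in every tangent direction at $\Lambda$.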
Remark 4.1. As $\|N\|_2\,\|H_0\|_2 < 2\|N\|\,\|H_0\|$, the constant step-size selection
scheme $\alpha_N^c$ is an example of such a selection scheme, where $\alpha_N^c = 1/(4\|H_0\| \cdot \|N\|)$.
Remark 4.2. Let $\alpha_N : M(H_0) \to \mathbb{R}^+$ be a step-size selection scheme that satisfies
Condition 2.1 and (7) and is also continuous on all of $M(H_0)$. Let $\Lambda$ be the locally
asymptotically stable equilibrium point given by (5). Set $\alpha_\infty = \alpha_N(\Lambda)$ and observe
that the linearisation of the Lie-bracket algorithm will be of the form (19) with $d$
replaced by $\alpha_\infty$. Recall that the $\alpha_N'$ scheme defined in (18) is continuous with limit
$\alpha_N'(H_\infty) = 1/(4\|H_0\| \cdot \|N\|)$. Thus, $\Lambda$ is an exponentially asymptotically stable
equilibrium point for the Lie-bracket recursion equipped with the step-size selection
scheme $\alpha_N'$.
To show that the Lie-bracket algorithm is exponentially stable at $\Lambda$ for the $\alpha_N^*$
step-size selection scheme is technically difficult due to the discontinuous nature of $\alpha_N^*$
at equilibrium points. The proof of the following proposition is given in the Appendix.
PROPOSITION 4.3. Let $N$ satisfy Condition 2.2 and $\alpha_N^*$ be the step-size selection
scheme given by Lemma 3.4. The iterative algorithm (4) has a unique exponentially
attractive equilibrium point $\Lambda$ given by (5).
FIG. 2. A plot of the diagonal elements $h_{ii}$ of each iteration $H_k$ of the Lie-bracket algorithm
run on a $7 \times 7$ initial condition $H_0$ with eigenvalues $(1, \ldots, 7)$. The target matrix $N$ was chosen to
be $\mathrm{diag}(1, \ldots, 7)$.
FIG. 3. The potential $\psi(H_k) = \|H_k - N\|^2$ for the Lie-bracket recursion.
To give an indication of the behaviour of the Lie-bracket algorithm, two plots
of a simulation have been included as Figs. 2 and 3. The simulation was run on a
random $7 \times 7$ symmetric initial value matrix with eigenvalues $1, \ldots, 7$. The target
matrix $N$ is chosen as $\mathrm{diag}(1, \ldots, 7)$ and as a consequence the minimum potential is
$\psi_\infty = 0$. Figure 2 is a plot of the diagonal entries of the recursive estimate $H_k$. The
off-diagonal entries converge to zero as the diagonal entries converge to the eigenvalues
of $H_k$. Figure 3 is a plot of the potential $\|H_k - N\|^2$ versus the iteration $k$. This
plot clearly shows the monotonic decreasing nature of the potential at each step of
the algorithm.
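A simulation in the spirit of the one described above might be set up as follows. This is a sketch under stated assumptions rather than a reproduction of the authors' experiment: it assumes NumPy, draws the random orthogonal factor from a QR factorisation with a fixed seed, uses the variable time-step (17), and the iteration count of 80 and all function names are choices made here. The arrays `diags` and `psi` hold the kind of data plotted in Figs. 2 and 3.

```python
import numpy as np

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def expm_skew(A):
    """exp(A) for real skew-symmetric A, via the Hermitian matrix 1j*A."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def simulate(H0, N, iters):
    """Iterate (4) with the variable time-step (17); record diagonals and potential."""
    h0 = np.linalg.norm(H0)
    H = H0.copy()
    diags = [np.diag(H).copy()]
    psi = [np.linalg.norm(H - N) ** 2]
    for _ in range(iters):
        A = bracket(H, N)
        a = np.linalg.norm(A)
        if a < 1e-12:                          # (17) is undefined at equilibria
            break
        c = np.linalg.norm(bracket(N, A))
        alpha = np.log1p(a * a / (h0 * c)) / (2.0 * a)
        U = expm_skew(alpha * A)
        H = U.T @ H @ U
        diags.append(np.diag(H).copy())
        psi.append(np.linalg.norm(H - N) ** 2)
    return H, np.array(diags), np.array(psi)

rng = np.random.default_rng(1)                 # illustrative seed
Q, _ = np.linalg.qr(rng.standard_normal((7, 7)))
eigs = np.arange(1.0, 8.0)                     # eigenvalues 1, ..., 7
H0 = Q @ np.diag(eigs) @ Q.T
H0 = (H0 + H0.T) / 2                           # remove rounding asymmetry
N = np.diag(eigs)                              # target diag(1, ..., 7)

H, diags, psi = simulate(H0, N, 80)
print(psi[0], psi[-1])
print(np.round(np.diag(H), 3))
```

Plotting the columns of `diags` against the iteration index gives a picture of the type shown in Fig. 2, and plotting `psi` gives one of the type shown in Fig. 3. How close the final diagonal is to $(1, \ldots, 7)$ after 80 iterations depends on the random initial condition.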
We summarise the results of §§2-4 in Theorem 4.4.
THEOREM 4.4. Let $H_0 = H_0^T$ be a real symmetric $n \times n$ matrix with eigenvalues
$\lambda_1 \ge \cdots \ge \lambda_n$. Let $N \in \mathbb{R}^{n \times n}$ satisfy Condition 2.2 and let $\alpha_N$ be either the
constant step-size selection (13) or the variable step-size selection (17). The Lie-bracket
recursion
$H_{k+1} = e^{-\alpha_k [H_k, N]} H_k e^{\alpha_k [H_k, N]}, \qquad \alpha_k = \alpha_N(H_k),$
with initial condition $H_0$, has the following properties:
(i) The recursion is isospectral.
(ii) If $H_k$ is a solution of the Lie-bracket algorithm, then $\psi(H_k) = \|H_k - N\|^2$
is strictly monotonically decreasing for every $k \in \mathbb{N}$ where $[H_k, N] \ne 0$.
(iii) Fixed points of the recursive equation are characterised by matrices
$H \in M(H_0)$ such that
$[H, N] = 0.$
(iv) Fixed points of the recursion are exactly the stationary points of the
double-bracket equation. These points are termed equilibrium points.
(v) Let $H_k$, $k = 1, 2, \ldots$, be a solution to the Lie-bracket algorithm; then $H_k$
converges to a matrix $H_\infty \in M(H_0)$, $[H_\infty, N] = 0$, an equilibrium point of the
recursion.
(vi) All equilibrium points of the Lie-bracket algorithm are strictly unstable except
$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, which is locally exponentially asymptotically stable.
5. Singular value computations. In this section we consider discretisations
of continuous-time flows to compute the singular values of an arbitrary matrix.
A singular value decomposition of a matrix $H_0 \in \mathbb{R}^{m \times n}$, $m \ge n$, is a matrix
decomposition
(21)  $H_0 = V^T \Sigma U,$
where $V \in O(m)$, $U \in O(n)$ and
(22)  $\Sigma = \begin{pmatrix} \mathrm{diag}(\sigma_1 I_{n_1}, \ldots, \sigma_r I_{n_r}) \\ 0_{(m-n) \times n} \end{pmatrix}.$
Here $\sigma_1 > \sigma_2 > \cdots > \sigma_r \ge 0$ are the distinct singular values of $H_0$ occurring with
multiplicities $n_1, \ldots, n_r$, such that $\sum_{i=1}^{r} n_i = n$. By convention the singular values
of a matrix are chosen to be nonnegative. It should be noted that although such a
decomposition always exists and $\Sigma$ is unique, there is no unique choice of orthogonal
matrices $V$ and $U$. The approach we take is to define an algorithm that converges
to $\Sigma$ and thus computes the singular values of $H_0$ without directly generating the
orthogonal decomposition.
Let $S(H_0)$ be the set of all matrices orthogonally equivalent to $H_0$,
(23) $S(H_0) = \{ V^T H_0 U \in \mathbb{R}^{m \times n} \mid V \in O(m),\ U \in O(n) \}.$
It is shown in [9, p. 89] that $S(H_0)$ is a smooth compact Riemannian manifold with
explicit forms given for its tangent space and Riemannian metric. Following [4], [5],
[8], [9], and [12] we consider the task of calculating the singular values of a matrix $H_0$
by minimizing the least squares cost function $\phi : S(H_0) \to \mathbb{R}^+$, $\phi(H) = \|H - N\|^2$.
It is shown in [8] and [9] that $\phi$ achieves a unique local and global minimum at the
point $\Sigma \in S(H_0)$. Moreover, in [8], [9], and [12] the explicit form for the gradient $\nabla\phi$
is calculated. The gradient flow is
(24) $\dot H = -\nabla\phi(H) = H\{H, N\} - \{H^T, N^T\}H,$
with $H(0) = H_0$ the initial condition. Here we have used a generalisation of the
Lie-bracket, $\{X, Y\} := X^T Y - Y^T X = -\{X, Y\}^T$.
To accomplish the task of computing the singular values of a matrix we require
$N$ to satisfy the following.
CONDITION 5.1. Let $N$ be an $m \times n$ matrix
$N = \begin{pmatrix} \mathrm{diag}(\mu_1, \ldots, \mu_n) \\ 0_{(m-n) \times n} \end{pmatrix},$
where $\mu_1 > \mu_2 > \cdots > \mu_n > 0$ are strictly positive, distinct real numbers.
For generic initial conditions and a target matrix $N$ that satisfies Condition 5.1,
it is known that (24) converges exponentially fast to $\Sigma \in S(H_0)$ [8], [12]. A recursive
version of this flow follows from an analogous argument to that used in the derivation
of the Lie-bracket algorithm. For $H_0$ and $N$ constant $m \times n$ matrices, the singular
value algorithm proposed is
(25) $H_{k+1} = e^{-\alpha_k \{H_k^T, N^T\}} H_k e^{\alpha_k \{H_k, N\}}.$
The singular value algorithm and the Lie-bracket algorithm are closely linked, as
is shown in the following lemma.
LEMMA 5.1. Let $H_0$, $N$ be $m \times n$ matrices. For any $H \in \mathbb{R}^{m \times n}$ define a map
$H \mapsto \hat H \in \mathbb{R}^{(m+n) \times (m+n)}$, where
(26) $\hat H = \begin{pmatrix} 0_{m \times m} & H \\ H^T & 0_{n \times n} \end{pmatrix}.$
For any sequence of real numbers $\alpha_k$, $k = 1, \ldots, \infty$, the iterations
(27) $H_{k+1} = e^{-\alpha_k \{H_k^T, N^T\}} H_k e^{\alpha_k \{H_k, N\}}$
with initial condition $H_0$ and
(28) $\hat H_{k+1} = e^{-\alpha_k [\hat H_k, \hat N]} \hat H_k e^{\alpha_k [\hat H_k, \hat N]}$
with initial condition $\hat H_0$ are equivalent.
Proof. Consider the iterative solution to (28) and evaluate the multiplication in
the block form of (26). This gives two equivalent iterative solutions, one the transpose
of the other, both of which are equivalent to the iterative solution to (27). □
Remark 5.1. Note that $\hat H_0$ and $\hat N$ are symmetric $(m+n) \times (m+n)$ matrices and
that, as a result, the iteration (28) is just the Lie-bracket algorithm.
Remark 5.2. The equivalence given by Lemma 5.1 is complete in every way. In
particular, $H_\infty$ is an equilibrium point of (27) if and only if $\hat H_\infty$ is an equilibrium
point of (28). Similarly, $H_k \to H_\infty$ if and only if $\hat H_k \to \hat H_\infty$ as $k \to \infty$.
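The equivalence in Lemma 5.1 rests on the block identity $[\hat H, \hat N] = \mathrm{diag}(\{H^T, N^T\}, \{H, N\})$, which can be checked numerically. The following Python sketch, assuming NumPy, compares one step of (27) with one step of the Lie-bracket recursion on the embedded matrices. The helper names (`expm_skew`, `gen_bracket`, `embed`, `sv_step`, `lb_step`), the step $0.05$, and the $4 \times 2$ data are hypothetical.

```python
import numpy as np

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix, using that iA is Hermitian."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def gen_bracket(X, Y):
    """Generalised Lie-bracket {X, Y} = X^T Y - Y^T X."""
    return X.T @ Y - Y.T @ X

def embed(H):
    """Map (26): H -> [[0, H], [H^T, 0]], a symmetric (m+n) x (m+n) matrix."""
    m, n = H.shape
    return np.block([[np.zeros((m, m)), H], [H.T, np.zeros((n, n))]])

def sv_step(H, N, a):
    """One step of the singular value iteration (27)."""
    return expm_skew(-a * gen_bracket(H.T, N.T)) @ H @ expm_skew(a * gen_bracket(H, N))

def lb_step(Hh, Nh, a):
    """One step of the Lie-bracket recursion on the embedded matrices."""
    B = Hh @ Nh - Nh @ Hh
    return expm_skew(-a * B) @ Hh @ expm_skew(a * B)

rng = np.random.default_rng(1)
H = rng.standard_normal((4, 2))                          # hypothetical 4 x 2 data
N = np.vstack([np.diag([2.0, 1.0]), np.zeros((2, 2))])   # satisfies Condition 5.1
gap = np.linalg.norm(embed(sv_step(H, N, 0.05)) - lb_step(embed(H), embed(N), 0.05))
print(gap)   # at rounding level if the two iterations agree
```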
This leads us directly to consider step-size selection schemes for the singular value
algorithm induced by selection schemes that we have already considered for the Lie-
bracket algorithm. Indeed, if $\alpha_{\hat N} : M(\hat H_0) \to \mathbb{R}^+$ is a step-size selection scheme for (4)
on $M(\hat H_0)$, and $H_k \in S(H_0)$, then we can define a time-step $\alpha_k$ for the singular value
algorithm by
$\alpha_k = \alpha_{\hat N}(\hat H_k).$
Thus, if (28) equipped with a step-size selection scheme $\alpha_{\hat N}$ satisfies Condition 2.1
and (7), then from Lemma 5.1, (27) will satisfy similar conditions. For simplicity,
we deal only with the step-size selection schemes induced by the constant step-size
selection (13) and the variable step-size selection (17). Thus we may state the main
convergence theorem for the singular value algorithm.
THEOREM 5.2. Let $H_0$, $N$ be $m \times n$ matrices where $m \ge n$ and $N$ satisfies
Condition 5.1. Let $\alpha_{\hat N} : M(\hat H_0) \to \mathbb{R}^+$ be either the constant step-size selection (13),
or the variable step-size selection (17). The singular value algorithm
$H_{k+1} = e^{-\alpha_k \{H_k^T, N^T\}} H_k e^{\alpha_k \{H_k, N\}},$
$\alpha_k = \alpha_{\hat N}(\hat H_k),$
with initial condition $H_0$, has the following properties:
(i) The singular value algorithm is a self-equivalent (singular value preserving)
recursion on the manifold $S(H_0)$.
(ii) If $H_k$ is a solution of the singular value algorithm, then $\phi(H_k) = \|H_k - N\|^2$
is strictly monotonically decreasing for every $k \in \mathbb{N}$ where $\{H_k, N\} \ne 0$ and
$\{H_k^T, N^T\} \ne 0$.
(iii) Fixed points of the recursive equation are characterised by matrices $H \in S(H_0)$
such that
(30) $\{H, N\} = 0$ and $\{H^T, N^T\} = 0.$
Fixed points of the recursion are exactly the stationary points of the singular value
gradient flow (24) and are termed equilibrium points.
(iv) Let $H_k$, $k = 1, 2, \ldots$, be a solution to the singular value algorithm; then $H_k$
converges to a matrix $H_\infty \in S(H_0)$, an equilibrium point of the recursion.
(v) All equilibrium points of the Lie-bracket algorithm are strictly unstable except
$\Sigma$ given by (22), which is locally exponentially asymptotically stable.
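A minimal Python sketch of the singular value algorithm, assuming NumPy, is given here for illustration. It uses the constant step-size induced through the embedding (26): since $\|\hat H\| = \sqrt{2}\,\|H\|$ in the Frobenius norm, the constant $1/(4\|\hat H_0\| \cdot \|\hat N\|)$ quoted in the Appendix becomes $1/(8\|H_0\| \cdot \|N\|)$. The function names and the random $4 \times 3$ data are hypothetical.

```python
import numpy as np

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix, using that iA is Hermitian."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def gen_bracket(X, Y):
    """Generalised Lie-bracket {X, Y} = X^T Y - Y^T X."""
    return X.T @ Y - Y.T @ X

def singular_value_algorithm(H0, N, steps):
    """Iterate (25) with the induced constant step; return H_k and the potentials."""
    # 1/(4 ||H0_hat|| ||N_hat||) = 1/(8 ||H0|| ||N||): the embedding scales norms by sqrt(2).
    a = 1.0 / (8.0 * np.linalg.norm(H0) * np.linalg.norm(N))
    H, potentials = H0.copy(), [np.linalg.norm(H0 - N) ** 2]
    for _ in range(steps):
        left = expm_skew(-a * gen_bracket(H.T, N.T))    # m x m orthogonal factor
        right = expm_skew(a * gen_bracket(H, N))        # n x n orthogonal factor
        H = left @ H @ right
        potentials.append(np.linalg.norm(H - N) ** 2)   # phi(H_k) = ||H_k - N||^2
    return H, potentials

rng = np.random.default_rng(2)
H0 = rng.standard_normal((4, 3))                                # hypothetical data
N = np.vstack([np.diag([3.0, 2.0, 1.0]), np.zeros((1, 3))])     # Condition 5.1
H, potentials = singular_value_algorithm(H0, N, steps=300)
print(np.linalg.svd(H0, compute_uv=False))   # singular values of the initial matrix
print(np.linalg.svd(H, compute_uv=False))    # singular values of the iterate
```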
Proof. To prove part (i), note that the generalised Lie-bracket $\{X, Y\} = -\{X, Y\}^T$
is skew-symmetric and thus (25) is an orthogonal conjugation and preserves the singu-
lar values of $H_k$. Also note that the potential $\phi(H_k) = \tfrac{1}{2}\psi(\hat H_k)$. Moreover, Lemma 5.1
shows that the sequence $\hat H_k$ is a solution to the Lie-bracket algorithm and thus from
Proposition 2.1, $\psi(\hat H_k)$ must be monotonically decreasing for all $k \in \mathbb{N}$ such that
$[\hat H_k, \hat N] \ne 0$, which is equivalent to (30). This proves part (ii), and part (iii) follows by
noting that if $\{H_k^T, N^T\} = 0$ and $\{H_k, N\} = 0$, then $H_{k+l} = H_k$ for $l = 1, 2, \ldots$, and
$H_k$ is a fixed point of (25). Moreover, since $\phi(H_k)$ is strictly monotonically decreasing for
all $\{H_k, N\} \ne 0$ and $\{H_k^T, N^T\} \ne 0$, these points can be the only fixed points.
It is known that these are the only stationary points of (24) [8], [9], [12].
To prove (iv), we need the following characterisation of equilibria of the singular
value algorithm.
LEMMA 5.3. Let $N$ satisfy Condition 5.1 and $\alpha_{\hat N}$ be either the constant step-size
selection (13) or the variable step-size selection (17). The singular value algorithm
(25) equipped with time-steps $\alpha_k = \alpha_{\hat N}(\hat H_k)$ has exactly $2^n n! / \prod_{i=1}^{r}(n_i!)$ distinct equi-
librium points in $S(H_0)$. Furthermore, these equilibrium points are characterised by
the matrices
$\begin{pmatrix} \pi^T & 0_{n \times (m-n)} \\ 0_{(m-n) \times n} & 0_{(m-n) \times (m-n)} \end{pmatrix} \Sigma \pi S,$
where $\pi$ is an $n \times n$ permutation matrix and $S = \mathrm{diag}(\pm 1, \ldots, \pm 1)$ a sign matrix.
Proof. Equilibrium points of (25) are characterised by the two conditions (30).
For $H = (h_{ij})$, $\{H, N\} = 0$ is equivalent to
$\mu_j h_{ji} - \mu_i h_{ij} = 0$ for $i = 1, \ldots, n$, $j = 1, \ldots, n$.
Similarly, the condition $\{H^T, N^T\} = 0$ is equivalent to
$\mu_j h_{ij} - \mu_i h_{ji} = 0$ for $i = 1, \ldots, n$, $j = 1, \ldots, n$,
$h_{ij} \mu_j = 0$ for $i = n+1, \ldots, m$, $j = 1, \ldots, n$.
By manipulating the relationships, and using the distinct, positive nature of the $\mu_i$,
it is easily shown that $h_{ij} = 0$ for $i \ne j$. Using the fact that (25) is self-equivalent,
the only possible matrices of this form that have the same singular values as $H_0$ are
characterised as above. A simple counting argument shows that the number of distinct
equilibrium points is $2^n n! / \prod_{i=1}^{r}(n_i!)$. □
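The counting argument can be checked by brute force for small cases. The Python sketch below, assuming NumPy, enumerates the $n \times n$ blocks $\pi^T \mathrm{diag}(\sigma) \pi S$ over all permutations $\pi$ and sign matrices $S$ and compares the number of distinct matrices with $2^n n! / \prod_i (n_i!)$. It only covers strictly positive singular values, since for a zero singular value different sign choices coincide. The function names are hypothetical.

```python
import itertools
import math
from collections import Counter

import numpy as np

def enumerate_equilibria(sigmas):
    """Count distinct matrices pi^T diag(sigmas) pi S over permutations and signs."""
    n = len(sigmas)
    D = np.diag(sigmas)
    seen = set()
    for perm in itertools.permutations(range(n)):
        P = np.eye(n)[list(perm)]                          # permutation matrix
        for signs in itertools.product((1.0, -1.0), repeat=n):
            M = P.T @ D @ P @ np.diag(signs)
            seen.add(tuple((M + 0.0).ravel().tolist()))    # "+ 0.0" maps -0.0 to 0.0
    return len(seen)

def predicted_count(sigmas):
    """2^n n! / prod(n_i!), with n_i the multiplicities of the distinct values."""
    n = len(sigmas)
    denom = math.prod(math.factorial(c) for c in Counter(sigmas).values())
    return 2 ** n * math.factorial(n) // denom

print(enumerate_equilibria([2.0, 2.0, 1.0]), predicted_count([2.0, 2.0, 1.0]))
```

For singular values $(2, 2, 1)$ there are three distinct arrangements of the diagonal and eight sign patterns, so both counts equal $2^3 \cdot 3! / 2! = 24$.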
The proof of Theorem 5.2 part (iv) is now directly analogous to the proof of Propo-
sition 2.1 part (c). It remains only to prove Theorem 5.2 part (v), which involves the
stability analysis of the equilibrium points characterised by (30). It is not possible to
directly apply the results obtained in §4 to the Lie-bracket recursion $\hat H_k$, since
$\hat N$ does not satisfy Condition 2.2. However, for the constant step-size selection scheme
induced by (13), and using analogous arguments to those used in Lemmas 4.1 and
4.2, it follows that $\Sigma$ is the unique locally exponentially attractive equilibrium point
of the singular value algorithm. Thus, for the constant step-size selection scheme, $\hat\Sigma$ is
the unique exponentially attractive equilibrium point of the Lie-bracket algorithm on
$M(\hat H_0)$, and now the argument from Proposition 4.3 applies directly and $\Sigma$ is expo-
nentially attractive for the variable step-size selection scheme (17). This completes the
proof. □
Remark 5.3. Theorem 5.2 holds true for any time-steps $\alpha_k = \alpha_{\hat N}(\hat H_k)$ induced
by a step-size selection scheme, $\alpha_{\hat N}$, that satisfies Condition 2.1, such that Theorem
4.4 holds.
Remark 5.4. It is possible that for nongeneric initial conditions, the singular value
algorithm may converge to a diagonal matrix with the singular values ordered in a
different manner to $\Sigma$. However, all simulations run have converged exponentially fast
to the unique matrix $\Sigma$, and thus it is likely that the attractive basins of the unstable
equilibrium points have zero measure. Note that for the continuous-time flows, it is
known that the attractive basins of the unstable equilibrium points have zero measure
in $S(H_0)$ [9].
6. Associated orthogonal algorithms. In the previous sections we have pro-
posed the Lie-bracket and the singular value algorithms that calculate the eigenvalues
and singular values, respectively, of given initial conditions. Associated with these re-
cursions are orthogonal recursions that compute the eigenvectors or singular vectors
of given initial conditions and provide a full spectral decomposition. To simplify the
subsequent analysis we impose a genericity condition on the initial condition $H_0$.
CONDITION 6.1. If $H_0 = H_0^T \in \mathbb{R}^{n \times n}$ is a real symmetric matrix then assume
that $H_0$ has distinct eigenvalues $\lambda_1 > \cdots > \lambda_n$. If $H_0 \in \mathbb{R}^{m \times n}$, where $m \ge n$, then
assume that $H_0$ has distinct singular values $\sigma_1 > \cdots > \sigma_n > 0$.
For a sequence of positive real numbers $\alpha_k$, $k = 1, 2, \ldots$, the associated orthog-
onal Lie-bracket algorithm is
(31) $U_{k+1} = U_k e^{\alpha_k [U_k^T H_0 U_k, N]}, \quad U_0 \in O(n),$
where $H_0 = H_0^T \in \mathbb{R}^{n \times n}$ is symmetric. For an arbitrary initial condition $H_0 \in \mathbb{R}^{m \times n}$
the associated orthogonal singular value algorithm is
(32) $V_{k+1} = V_k e^{\alpha_k \{U_k^T H_0^T V_k, N^T\}}, \quad V_0 \in O(m),$
     $U_{k+1} = U_k e^{\alpha_k \{V_k^T H_0 U_k, N\}}, \quad U_0 \in O(n).$
Note that in each case the exponents of the exponential terms are skew-symmetric
and thus the recursions will remain orthogonal.
Let $H_0 = H_0^T \in \mathbb{R}^{n \times n}$ and consider the map $g : O(n) \to M(H_0)$, $U \mapsto U^T H_0 U$,
which is a smooth surjection. If $U_k$ is a solution to (31) it follows that
$g(U_{k+1}) = e^{-\alpha_k [g(U_k), N]} g(U_k) e^{\alpha_k [g(U_k), N]},$
which generates the Lie-bracket algorithm (4). Thus, $g$ maps the associated orthogo-
nal Lie-bracket algorithm with initial condition $U_0$ to the Lie-bracket algorithm with
initial condition $U_0^T H_0 U_0$ on $M(U_0^T H_0 U_0) = M(H_0)$.
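The relation between the two recursions can be observed numerically. The Python sketch below, assuming NumPy, runs (31) from $U_0 = I$ alongside the Lie-bracket recursion started at $H_0$ and compares $g(U_k) = U_k^T H_0 U_k$ with $H_k$. The constant step $1/(4\|H_0\| \cdot \|N\|)$ is taken from the Appendix; the helper names and test data are hypothetical.

```python
import numpy as np

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix, using that iA is Hermitian."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def bracket(X, Y):
    """Lie-bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
H0 = Q @ np.diag([4.0, 3.0, 2.0, 1.0]) @ Q.T      # hypothetical symmetric H0
N = np.diag([4.0, 3.0, 2.0, 1.0])
a = 1.0 / (4.0 * np.linalg.norm(H0) * np.linalg.norm(N))   # assumed constant step

U, H = np.eye(4), H0.copy()
for _ in range(200):
    # Recursion (31): U_{k+1} = U_k e^{a [U_k^T H0 U_k, N]}
    U = U @ expm_skew(a * bracket(U.T @ H0 @ U, N))
    # Lie-bracket recursion on M(H0), started at U_0^T H0 U_0 = H0
    E = expm_skew(a * bracket(H, N))
    H = E.T @ H @ E

print(np.linalg.norm(U.T @ H0 @ U - H))      # g(U_k) versus H_k
print(np.linalg.norm(U.T @ U - np.eye(4)))   # departure of U_k from orthogonality
```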
Remark 6.1. Consider the potential function $\phi : O(n) \to \mathbb{R}^+$, $\phi(U) = \|U^T H_0 U - N\|^2$
on the set of orthogonal $n \times n$ matrices. Using the standard induced Riemannian
metric from $\mathbb{R}^{n \times n}$ on $O(n)$, the associated orthogonal gradient flow is [2], [3], [5], [9]
$\dot U = -\nabla\phi(U) = U[U^T H_0 U, N].$
THEOREM 6.1. Let $H_0 = H_0^T$ be a real symmetric $n \times n$ matrix that satisfies
Condition 6.1. Let $N \in \mathbb{R}^{n \times n}$ satisfy Condition 2.2, and let $\alpha_N$ be either the constant
step-size selection (13) or the variable step-size selection (17). The recursion
$U_{k+1} = U_k e^{\alpha_k [U_k^T H_0 U_k, N]}, \quad U_0 \in O(n),$
$\alpha_k = \alpha_N(H_k),$
referred to as the associated orthogonal Lie-bracket algorithm, has the following prop-
erties:
(i) A solution $U_k$, $k = 1, 2, \ldots$, to the associated orthogonal Lie-bracket algo-
rithm remains orthogonal.
(ii) Let $\phi(U) = \|U^T H_0 U - N\|^2$ be a map from $O(n)$ to the set of nonnegative
reals $\mathbb{R}^+$. Let $U_k$, $k = 1, 2, \ldots$, be a solution to the associated orthogonal Lie-bracket
algorithm. Then $\phi(U_k)$ is strictly monotonically decreasing for every $k \in \mathbb{N}$ where
$[U_k^T H_0 U_k, N] \ne 0$.
(iii) Fixed points of the algorithm are characterised by matrices $U \in O(n)$ such
that
$[U^T H_0 U, N] = 0.$
There are exactly $2^n n!$ distinct fixed points.
(iv) Let $U_k$, $k = 1, 2, \ldots$, be a solution to the associated orthogonal Lie-bracket
algorithm; then $U_k$ converges to an orthogonal matrix $U_\infty$, a fixed point of the algo-
rithm.
(v) All fixed points of the associated orthogonal Lie-bracket algorithm are strictly
unstable except those $2^n$ points $U_* \in O(n)$ such that
$U_*^T H_0 U_* = \Lambda,$
where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Such points $U_*$ are locally exponentially asymptotically
stable and $H_0 = U_* \Lambda U_*^T$ is an eigenspace decomposition of $H_0$.
Proof. Part (i) follows directly from the orthogonal nature of $e^{\alpha_k [U_k^T H_0 U_k, N]}$.
Note that in part (ii) the definition of $\phi$ can be expressed in terms of the map $g(U) =
U^T H_0 U$ from $O(n)$ to $M(H_0)$ and the Lie-bracket potential $\psi(H) = \|H - N\|^2$ of
(1), i.e.,
$\phi(U_k) = \psi(g(U_k)).$
Observe that $g(U_0) = U_0^T H_0 U_0$ and thus $g(U_k)$ is the solution of the Lie-bracket
algorithm with initial condition $U_0^T H_0 U_0$. As the step-size selection scheme $\alpha_N$ is
either (13) or (17), $g(U_k)$ satisfies (7). This ensures that part (ii) holds.
If $U_k$ is a fixed point of the associated orthogonal Lie-bracket algorithm with initial
condition $U_0^T H_0 U_0$, then $g(U_k)$ is a fixed point of the Lie-bracket algorithm. Thus,
from Proposition 2.1, $[g(U_k), N] = [U_k^T H_0 U_k, N] = 0$. Moreover, if $[U_k^T H_0 U_k, N] = 0$
for some given $k \in \mathbb{N}$, then by inspection $U_{k+l} = U_k$ for $l = 1, 2, \ldots$, and $U_k$ is a fixed
point of the associated orthogonal Lie-bracket algorithm. From Lemma 2.2 it follows
that if $U$ is a fixed point of the algorithm then $U^T H_0 U = \pi^T \Lambda \pi$ for some permutation
matrix $\pi$. By inspection any orthogonal matrix $W = S U \pi^T$, where $S$ is a sign matrix
$S = \mathrm{diag}(\pm 1, \ldots, \pm 1)$, is also a fixed point of the recursion, and indeed, any two fixed
points are related in this manner. A simple counting argument shows that there are
exactly $2^n n!$ distinct matrices of this form.
To prove (iv), note that since $g(U_k)$ is a solution to the Lie-bracket algorithm, it
converges to a limit point $H_\infty \in M(H_0)$, $[H_\infty, N] = 0$ (Proposition 2.1). Thus $U_k$
must converge to the preimage set of $H_\infty$ via the map $g$. Condition 6.1 ensures that
the set generated by the preimage of $H_\infty$ is a finite distinct set, any two elements $U_\infty^1$
and $U_\infty^2$ of which are related by $U_\infty^1 = U_\infty^2 S$, $S = \mathrm{diag}(\pm 1, \ldots, \pm 1)$. Convergence
to a particular element of this preimage follows since $\alpha_k [U_k^T H_0 U_k, N] \to 0$ as in
Proposition 2.1.
To prove part (v), observe that the dimension of $O(n)$ is the same as the dimen-
sion of $M(H_0)$ due to genericity Condition 6.1. Thus $g$ is locally a diffeomorphism
on $O(n)$ that forms an exact equivalence between the Lie-bracket algorithm and the
associated orthogonal Lie-bracket algorithm. Restricting $g$ to a local region, the sta-
bility structure of equilibria is preserved under the map $g^{-1}$. Thus, all fixed points
of the associated orthogonal Lie-bracket algorithm are locally unstable except those
that map via $g$ to the unique locally asymptotically stable equilibrium of the Lie-
bracket recursion. Observe that due to the monotonicity of $\phi(U_k)$ a locally unstable
equilibrium is also globally unstable. □
THEOREM 6.2. Let $H_0 \in \mathbb{R}^{m \times n}$, where $m \ge n$, satisfy Condition 6.1. Let
$N \in \mathbb{R}^{m \times n}$ satisfy Condition 5.1. Let the time-step $\alpha_k$ be given by
$\alpha_k = \alpha_{\hat N}(\hat H_k),$
where $\hat H_k$ is the image of $H_k = V_k^T H_0 U_k$ under the map (26) and $\alpha_{\hat N}$ is either the
constant step-size selection (13) or the variable step-size selection scheme (17), on
$M(\hat H_0)$. The recursion
$V_{k+1} = V_k e^{\alpha_k \{U_k^T H_0^T V_k, N^T\}}, \quad V_0 \in O(m),$
$U_{k+1} = U_k e^{\alpha_k \{V_k^T H_0 U_k, N\}}, \quad U_0 \in O(n),$
referred to as the associated orthogonal singular value algorithm, has the following
properties:
(i) Let $(V_k, U_k)$ be a solution to the associated orthogonal singular value algo-
rithm; then both $V_k$ and $U_k$ remain orthogonal.
(ii) Let $\phi(V, U) = \|V^T H_0 U - N\|^2$ be a map from $O(m) \times O(n)$ to the set of
nonnegative reals $\mathbb{R}^+$; then $\phi(V_k, U_k)$ is strictly monotonically decreasing for every
$k \in \mathbb{N}$ where $\{V_k^T H_0 U_k, N\} \ne 0$ and $\{U_k^T H_0^T V_k, N^T\} \ne 0$. Moreover, fixed points of
the algorithm are characterised by matrix pairs $(V, U) \in O(m) \times O(n)$ such that
$\{V^T H_0 U, N\} = 0$ and $\{U^T H_0^T V, N^T\} = 0.$
(iii) Let $(V_k, U_k)$, $k = 1, 2, \ldots$, be a solution to the associated orthogonal singular
value algorithm; then $(V_k, U_k)$ converges to a pair of orthogonal matrices $(V_\infty, U_\infty)$,
a fixed point of the algorithm.
(iv) All fixed points of the associated orthogonal singular value algorithm are
strictly unstable except those points $(V_*, U_*) \in O(m) \times O(n)$ such that
$V_*^T H_0 U_* = \Sigma,$
where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^{m \times n}$. Each such point $(V_*, U_*)$ is locally exponentially
asymptotically stable and $H_0 = V_* \Sigma U_*^T$ is a singular value decomposition of $H_0$.
Proof. The proof of this theorem is analogous to the proof of Theorem 6.1. □
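The associated orthogonal singular value algorithm can be sketched along the same lines. The Python code below, assuming NumPy, iterates (32) from $V_0 = I$, $U_0 = I$ with the induced constant step $1/(8\|H_0\| \cdot \|N\|)$ (the Appendix constant applied to the embedded matrices). Function names and test data are hypothetical, and no claim is made about how many iterations are needed to reach $\Sigma$.

```python
import numpy as np

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix, using that iA is Hermitian."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def gen_bracket(X, Y):
    """Generalised Lie-bracket {X, Y} = X^T Y - Y^T X."""
    return X.T @ Y - Y.T @ X

def orthogonal_sv_algorithm(H0, N, steps):
    """Iterate (32) from V_0 = I, U_0 = I with an assumed constant step."""
    m, n = H0.shape
    V, U = np.eye(m), np.eye(n)
    a = 1.0 / (8.0 * np.linalg.norm(H0) * np.linalg.norm(N))   # induced constant step
    for _ in range(steps):
        H = V.T @ H0 @ U                                   # current estimate H_k
        V, U = (V @ expm_skew(a * gen_bracket(H.T, N.T)),  # H.T = U_k^T H0^T V_k
                U @ expm_skew(a * gen_bracket(H, N)))
    return V, U

rng = np.random.default_rng(4)
H0 = rng.standard_normal((4, 3))                               # hypothetical data
N = np.vstack([np.diag([3.0, 2.0, 1.0]), np.zeros((1, 3))])    # Condition 5.1
V, U = orthogonal_sv_algorithm(H0, N, steps=300)
print(np.linalg.norm(V.T @ V - np.eye(4)), np.linalg.norm(U.T @ U - np.eye(3)))
print(np.linalg.norm(H0 - N) ** 2, np.linalg.norm(V.T @ H0 @ U - N) ** 2)
```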
7. Computational considerations. There are several issues involved in the
implementation of the Lie-bracket algorithm as a numerical tool that have not been
dealt with in the body of this paper. Design and implementation of efficient code
has not been considered and would depend heavily on the nature of the hardware on
which such a recursion would be run. As each iteration requires the calculation of
a time-step, an exponential, and a $k+1$ estimate, it is likely that it would be best
to consider applications in parallel processing environments. Certainly in a standard
computational environment the exponential calculation would limit the possible areas
of useful application of the algorithms proposed.
It is also possible to consider approximations of the Lie-bracket algorithm that
have good computational properties. For example, consider a (1,1) Padé approxima-
tion to the matrix exponential
$e^{-\alpha_k [H_k, N]} \approx (2I - \alpha_k [H_k, N])(2I + \alpha_k [H_k, N])^{-1}.$
Such an approach has the advantage that, as $[H_k, N]$ is skew-symmetric, the Padé
approximation will be orthogonal and will preserve the isospectral nature of the Lie-
bracket algorithm. Similarly, an $(n, n)$ Padé approximation of the exponential for
any $n$ will also be orthogonal. There are difficulties involved in obtaining direct
step-size selection schemes based on the Padé approximate Lie-bracket algorithms.
To guarantee that the potential $\psi$ is monotonically decreasing for such schemes, direct
estimates of the time-step must be chosen prohibitively small. A good heuristic choice
of a step-size selection scheme, however, can be made based on the selection schemes
given in this paper, and simulations indicate that such an approach is viable.
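A minimal sketch of the (1,1) Padé variant in Python, assuming NumPy, is given below. The (1,1) Padé approximant of $e^{A}$ is $(I - A/2)^{-1}(I + A/2)$, which is orthogonal whenever $A$ is skew-symmetric. The step-size here is simply borrowed, as a heuristic, from the constant $1/(4\|H_0\| \cdot \|N\|)$ quoted in the Appendix; it carries no monotonicity guarantee for the approximate recursion. The function names and test data are hypothetical.

```python
import numpy as np

def cayley(A):
    """(1,1) Pade approximant (I - A/2)^{-1} (I + A/2) of exp(A)."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A / 2.0, I + A / 2.0)

def pade_lie_bracket_step(H, N, a):
    """One step H -> Q^T H Q with Q approximating e^{a[H,N]}."""
    Q = cayley(a * (H @ N - N @ H))   # orthogonal because the argument is skew-symmetric
    return Q.T @ H @ Q                # Q.T is the approximant of e^{-a[H,N]}

rng = np.random.default_rng(5)
Q0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
H0 = Q0 @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q0.T     # hypothetical symmetric H0
N = np.diag([4.0, 3.0, 2.0, 1.0])
a = 1.0 / (4.0 * np.linalg.norm(H0) * np.linalg.norm(N))   # heuristic step only

H = H0.copy()
for _ in range(200):
    H = pade_lie_bracket_step(H, N, a)
print(np.linalg.eigvalsh(H))   # spectrum of the iterate, to compare with that of H0
```

Solving a linear system replaces the exponential, which is the computational saving the approximation is meant to provide.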
Another approach is to take just the linear term from the Taylor expansion of
$H_{k+1}(\alpha_k)$,
$H_{k+1} = H_k + \alpha_k [H_k, [H_k, N]],$
as an algorithm on $\mathbb{R}^{n \times n}$. An algorithm such as this is similar in form to approx-
imating the curves generated by the Lie-bracket algorithm by straight lines. The
approximation will not retain the isospectral nature of the Lie-bracket recursion; how-
ever, it is computationally cheap. Furthermore, when the curvature of the manifold
$M(H_0)$ is small, it can be imagined that the linear algorithm would be a good
approximation to the Lie-bracket algorithm.
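The trade-off can be illustrated with a short Python sketch, assuming NumPy. It compares one exact Lie-bracket step with the first-order step $H + \alpha [H, [H, N]]$: the difference shrinks like $\alpha^2$, while the Frobenius norm of the linear iterate grows, since $\mathrm{tr}(H [H, [H, N]]) = 0$ gives $\|H + \alpha X\|^2 = \|H\|^2 + \alpha^2 \|X\|^2$ for $X = [H, [H, N]]$, so the spectrum is not preserved. The helper names, step values, and test data are hypothetical.

```python
import numpy as np

def expm_skew(A):
    """Exponential of a real skew-symmetric matrix, using that iA is Hermitian."""
    w, V = np.linalg.eigh(1j * A)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def bracket(X, Y):
    """Lie-bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def exact_step(H, N, a):
    """Exact step e^{-a[H,N]} H e^{a[H,N]}."""
    E = expm_skew(a * bracket(H, N))
    return E.T @ H @ E

def linear_step(H, N, a):
    """First-order Taylor step H + a [H, [H, N]]."""
    return H + a * bracket(H, bracket(H, N))

rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
H0 = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T      # hypothetical symmetric H0
N = np.diag([4.0, 3.0, 2.0, 1.0])

for a in (1e-3, 5e-4):
    print(a, np.linalg.norm(exact_step(H0, N, a) - linear_step(H0, N, a)))
print(np.linalg.norm(linear_step(H0, N, 1e-3)) - np.linalg.norm(H0))   # norm drift
```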
8. Conclusion. In this paper we have proposed two algorithms which, along
with their associated orthogonal algorithms, calculate, respectively, the eigenvalue de-
composition of a symmetric matrix and the singular value decomposition of a general
matrix. Moreover, we have presented two suitable step-size selection schemes which
ensure that, for generic initial conditions, the algorithms proposed will converge ex-
ponentially fast to an asymptotically attractive fixed point.
In future work we hope to improve the theoretical understanding of the step-size
selection schemes necessary for the Lie-bracket algorithm as well as to investigate a
number of related applications of the double-bracket flow and its discretisation.
9. Appendix. The following discussion is a proof of Proposition 4.3.
Proof. By Lemma 4.1, $\Lambda$ is the unique locally asymptotically stable equilibrium
point and it remains to show that $\Lambda$ is exponentially attractive. Note that direct
linearisation techniques do not apply as the recursion will not necessarily be differ-
entiable at the equilibrium $\Lambda$. To proceed we set $c = 1/(4\|H_0\| \cdot \|N\|)$, the constant
time-step, and show that the Lie-bracket algorithm converges faster using the variable
step-size selection scheme than it does with the constant time-step $c$. The proof is
divided into a number of lemmas.
LEMMA 9.1. Let $0 < \beta < \min(1, c)$, where $c = 1/(4\|H_0\| \cdot \|N\|)$. Then there
exists a real number $\delta_1$ such that for $H_k \in M(H_0)$ and $\|[H_k, N]\| < \delta_1$, then
(33) $0 > \Delta\psi(H_k, \beta) \ge -3\beta \|[H_k, N]\|^2.$
Proof. Consider the error term $2\,\mathrm{tr}(N \mathcal{R}_2(\tau))$ defined in Lemma 3.1 and recall
the estimation argument for Lemma 3.3. Employing a similar argument for $\tau = \beta$
gives
Thus, combining this with (9) it follows that
It is well known that
$2(e^y - 1 - y) \sim y^2$ for $y \to 0^+$,
where "$\sim$" indicates that two functions are asymptotically equal. This is equivalent to
saying that for any $\epsilon > 0$, there exists $\delta(\epsilon) > 0$, such that for all $y$, where $\delta(\epsilon) > y > 0$,
then $1 - \epsilon < 2(e^y - 1 - y)/y^2 < 1 + \epsilon$. Thus, choosing $\epsilon = 1$, it follows that for
$\delta(1) > y > 0$ then $2(e^y - 1 - y) < 2y^2$. Recall that we are restricting $\beta < 1$,
and thus, there exists some real number $\delta_1 > 0$ such that if $\|[H_k, N]\| < \delta_1$, then
$2\beta \|[H_k, N]\| < \delta(1)$, and hence $2(e^{2\beta \|[H_k, N]\|} - 1 - 2\beta \|[H_k, N]\|) < 4\beta^2 \|[H_k, N]\|^2$.
Substituting this into (34) gives
(35) $\Delta\psi(H_k, \beta) > -2\beta \|[H_k, N]\|^2 - 4\beta^2 \|H_0\| \cdot \|N\| \cdot \|[H_k, N]\|^2.$
By additionally requiring that $\beta < c = 1/(4\|H_0\| \cdot \|N\|)$ the lemma is proved. □
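For completeness, the asymptotic relation used in this proof follows directly from the Taylor series of the exponential; the short derivation below is added as an illustration and is not part of the original argument.

```latex
e^{y} - 1 - y = \sum_{j \ge 2} \frac{y^{j}}{j!}
             = \frac{y^{2}}{2} + \frac{y^{3}}{6} + \frac{y^{4}}{24} + \cdots,
\qquad\text{so}\qquad
\frac{2\,(e^{y} - 1 - y)}{y^{2}} = 1 + \frac{y}{3} + \frac{y^{2}}{12} + \cdots
\;\longrightarrow\; 1 \quad (y \to 0^{+}).
```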
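LEMMA 9.2. Let $\alpha_N$ be the step-size selection scheme given by Lemma 3.4, and
let $\gamma \in \mathbb{R}^+$, such that $\alpha_N(H_k) > \gamma > 0$ for all $[H_k, N] \ne 0$. Define $\bar\gamma := \min\{\gamma, c\}$
and choose $\beta \in \mathbb{R}^+$ such that
$0 < \beta < \min\{1,\ c,\ \tfrac{2}{3}(\bar\gamma - 2\|H_0\| \cdot \|N\| \bar\gamma^2)\}.$
Then there exists a real number $\delta_2 > 0$ such that for any $H_k \in M(H_0)$ with
$\|[H_k, N]\| < \delta_2$,
(36) $-3\beta \|[H_k, N]\|^2 > \Delta\psi_N^U(H_k, \alpha_k).$
Proof. Recall that $\alpha_k$ was chosen as the first critical point of the function
$\Delta\psi_N^U(H_k, \tau)$. Thus $\Delta\psi_N^U(H_k, \tau)$ is monotonically decreasing on the interval $(0, \alpha_k)$.
The lower bound $\bar\gamma \le \gamma$, for $\alpha_N$, must be less than $\alpha_k$, and thus $\Delta\psi_N^U(H_k, \bar\gamma) >
\Delta\psi_N^U(H_k, \alpha_k)$. Substituting $\bar\gamma$ into the definition of $\Delta\psi_N^U$ gives
$\Delta\psi_N^U(H_k, \bar\gamma) = -2\bar\gamma \|[H_k, N]\|^2 + \frac{\|H_0\| \cdot \|[N, [H_k, N]]\|}{\|[H_k, N]\|} \left( e^{2\bar\gamma \|[H_k, N]\|} - 1 - 2\bar\gamma \|[H_k, N]\| \right)$
$\le -2\bar\gamma \|[H_k, N]\|^2 + 2\|H_0\| \cdot \|N\| \left( e^{2\bar\gamma \|[H_k, N]\|} - 1 - 2\bar\gamma \|[H_k, N]\| \right).$
As shown in Lemma 9.1, there exists $\delta_2 > 0$, such that for any $H_k \in M(H_0)$, where
$\|[H_k, N]\| < \delta_2$, then $2\left( e^{2\bar\gamma \|[H_k, N]\|} - 1 - 2\bar\gamma \|[H_k, N]\| \right) < 4\bar\gamma^2 \|[H_k, N]\|^2$. Using this
with the above inequality gives
$\Delta\psi_N^U(H_k, \bar\gamma) < 2\|[H_k, N]\|^2 \left( 2\|H_0\| \cdot \|N\| \bar\gamma^2 - \bar\gamma \right).$
Note that since $\bar\gamma \le c$, the right-hand side of the last inequality is strictly
negative. Now as
$0 < \beta < \tfrac{2}{3}(\bar\gamma - 2\|H_0\| \cdot \|N\| \bar\gamma^2),$
then $-3\beta \|[H_k, N]\|^2 > 2\|[H_k, N]\|^2 (2\|H_0\| \cdot \|N\| \bar\gamma^2 - \bar\gamma)$ and the result follows. □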
The proof of Proposition 4.3 now follows by choosing
(37) $\beta = \min\{1,\ c,\ \tfrac{2}{3}(\bar\gamma - 2\|H_0\| \cdot \|N\| \bar\gamma^2)\},$
where $\bar\gamma = \min(\gamma, c)$. Thus, from Lemmas 9.1 and 9.2, choose $\delta_1$ and $\delta_2$ such that the
results hold and set $\delta = \tfrac{1}{2}\min\{\delta_1, \delta_2\}$. Hence, combining the inequalities (33) and
(36) gives
for all $H_k \in M(H_0)$ with $\|[H_k, N]\| < \delta$.
Let $D_\delta$ be some open set around $\Lambda$ such that $\|[H_k, N]\| < \delta$. Note that $\beta < c$,
and thus from Lemma 4.2 the Lie-bracket algorithm equipped with $\alpha_N = \beta$ as a
step-size selection scheme is exponentially stable. Finally, note that within $D_\delta$, and
due to (38), $\psi(H_{k+1}(\alpha_k))$ will always decrease faster than $\psi(H_{k+1}(\beta))$, regardless of
$H_k$. Since $\Lambda$ is exponentially attractive for the Lie-bracket algorithm equipped with
the constant selection scheme $\beta$, it follows that $\Lambda$ must also be exponentially attractive for
the same recursion equipped with the variable selection scheme $\alpha_N$. □
Acknowledgments. The authors would like to thank Kenneth Driessel and Wei-
Yong Yan for many useful comments. The authors also thank an anonymous referee
for mentioning the connection of the double-bracket flow to micromagnetics as well
as a number of useful comments.
REFERENCES
[1] A. M. BLOCH, R. W. BROCKETT, AND T. RATIU, A new formulation of the generalised Toda
lattice equations and their fixed point analysis via the momentum map, Bull. Amer. Math.
Soc., 23 (1990), pp. 477-485.
[2] R. W. BROCKETT, Dynamical systems that sort lists, diagonalise matrices and solve linear
programming problems, Linear Algebra Appl., 146 (1991), pp. 79-91; see also Proc. IEEE
Conf. Decisions and Control, 1988, pp. 799-803.
[3] M. T. CHU, The generalized Toda flow, the QR-algorithm and the center manifold theory,
SIAM J. Discrete Math., 5 (1984), pp. 187-201.
[4] M. T. CHU, A differential equation approach to the singular value decomposition of bidiagonal
matrices, Linear Algebra Appl., 80 (1986), pp. 71-80.
[5] M. T. CHU AND K. R. DRIESSEL, The projected gradient method for least squares matrix ap-
proximations with spectral constraints, SIAM J. Numer. Anal., 27 (1990), pp. 1050-1060.
[6] P. DEIFT, T. NANDA, AND C. TOMEI, Ordinary differential equations for the symmetric eigen-
value problem, SIAM J. Numer. Anal., 20 (1983), pp. 1-22.
[7] K. R. DRIESSEL, On isospectral gradient flows-solving matrix eigenproblems using differential
equations, in Inverse Problems, J. R. Cannon and U. Hornung, eds., Birkhäuser-Verlag,
1986, pp. 69-90.
[8] U. HELMKE AND J. B. MOORE, Singular value decomposition via gradient flows, Systems Con-
trol Lett., 14 (1990), pp. 369-377.
[9] U. HELMKE AND J. B. MOORE, Optimization and dynamical systems, in Communications and
Control Engineering, Springer-Verlag, London, 1994.
[10] J. IMAE, J. PERKINS, AND J. MOORE, Towards time varying balanced realisation via Riccati
equations, Math. Control Signals Systems, 5 (1992), pp. 313-326.
[11] J. E. PERKINS, U. HELMKE, AND J. B. MOORE, Balanced realizations via gradient flow tech-
niques, Systems Control Lett., 14 (1990), pp. 369-380.
[12] S. T. SMITH, Dynamical systems that perform the singular value decomposition, Systems Con-
trol Lett., 16 (1991), pp. 319-327.
[13] W. W. SYMES, The QR algorithm and scattering for the finite nonperiodic Toda lattice, Phys.
D, 4 (1982), pp. 275-280.
[14] D. WATKINS AND L. ELSNER, Self-equivalent flows associated with the singular value decompo-
sition, SIAM J. Matrix Anal. Appl., 10 (1989), pp. 244-258.