ISSN 0249-6399 ISRN INRIA/RR--9477--FR+ENG
RESEARCH REPORT 9477
July 2022
Project-Teams IDEFIX
Convergence analysis of multi-step one-shot methods for linear inverse problems
Marcella Bonazzoli, Houssem Haddar, Tuan Anh Vu
RESEARCH CENTRE
SACLAY ÎLE-DE-FRANCE
1 rue Honoré d’Estienne d’Orves
Bâtiment Alan Turing
Campus de l’École Polytechnique
91120 Palaiseau
Research Report n° 9477, July 2022, 55 pages
Abstract: In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. Then one-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can be applied. We analyze two variants of the so-called multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are provided to illustrate the convergence of these methods in comparison with the classical usual and shifted gradient descent. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
Key-words: inverse problems, one-shot methods, convergence analysis, parameter identification
Inria, UMA, ENSTA Paris, Institut Polytechnique de Paris
Analyse de convergence pour des méthodes d'inversion multi-étapes de type one-shot
Résumé: In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. One-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can then be applied. We consider two variants of the multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are presented to illustrate the convergence of these methods in comparison with the usual and shifted gradient descent methods. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
Mots-clés: inverse problems, one-shot methods, convergence analysis, parameter identification
Contents
1 Introduction
2 Multi-step one-shot inversion methods
3 Convergence of one-step one-shot methods (k = 1)
  3.1 Block iteration matrices and eigenvalue equations
  3.2 Real eigenvalues
  3.3 Complex eigenvalues
  3.4 Final result (k = 1)
4 Convergence of multi-step one-shot methods (k ≥ 2)
  4.1 Block iteration matrices and eigenvalue equations
  4.2 Real eigenvalues
  4.3 Complex eigenvalues
  4.4 Final result (k ≥ 2)
5 Inverse problem with complex forward problem and real parameter
6 Numerical experiments
7 Conclusion
A Some useful lemmas
B Descent step for usual and shifted gradient descent
C Convergence study for the scalar case
  C.1 Notations and preliminary calculation
  C.2 Necessary and sufficient conditions for convergence
    C.2.1 Descent step for the usual gradient descent
    C.2.2 Descent step for the shifted gradient descent
    C.2.3 Descent step for k-step one-shot
    C.2.4 Descent step for shifted k-step one-shot
  C.3 Comparison of the bounds for the descent step
D A proof of Lemma C.1 based on Marden's works
1 Introduction
For large-scale inverse problems, which often arise in real life applications, the solution of the
corresponding forward and adjoint problems is generally computed using an iterative solver, such
as (preconditioned) fixed point or Krylov subspace methods. Indeed, the corresponding linear
systems could be too large to be handled with direct solvers (e.g. LU-type solvers), and iterative
solvers are easier to parallelize on many cores. Naturally this leads to the idea of one-step
one-shot methods, which iterate at the same time on the forward problem solution (the state
variable), the adjoint problem solution (the adjoint state) and on the inverse problem unknown
(the parameter or design variable). If two or more inner iterations are performed on the state
and adjoint state before updating the parameter (by starting from the previous iterates as initial
guess for the state and adjoint state), we speak of multi-step one-shot methods. Our goal is to
rigorously analyze the convergence of such inversion methods. In particular, we are interested
in those schemes where the inner iterations on the direct and adjoint problems are incomplete,
i.e. stopped before achieving convergence. Indeed, solving the forward and adjoint problems
exactly by direct solvers or very accurately by iterative solvers could be very time-consuming
with little improvement in the accuracy of the inverse problem solution.
The concept of one-shot methods was first introduced by Ta’asan [22] for optimal control
problems. Based on this idea, a variety of related methods, such as the all-at-once methods, where
the state equation is included in the misfit functional, were developed for aerodynamic shape
optimization, see for instance [23,21,11,19,18] and the literature review in the introduction
of [19]. All-at-once approaches to inverse problems for parameter identification were studied
in, e.g., [8,2,15]. An alternative method, called Wavefield Reconstruction Inversion (WRI),
was introduced for seismic imaging in [25], as an improvement of the classical Full Waveform
Inversion (FWI) [24]. WRI is a penalty method which combines the advantages of the all-at-once
approach with those of the reduced approach (where the state equation represents a constraint
and is enforced at each iteration, as in FWI), and was extended to more general inverse problems
in [26].
Few convergence proofs, especially for the multi-step one-shot methods, are available in the
literature. In particular, for non-linear design optimization problems, Griewank [6] proposed a version
of one-step one-shot methods where a Hessian-based preconditioner is used in the design variable
iteration. The author proved conditions to ensure that the real eigenvalues of the Jacobian of the
coupled iterations are smaller than 1, but these are just necessary and not sufficient conditions:
they do not exclude real eigenvalues smaller than $-1$. In addition, no condition to also bound complex
eigenvalues below $1$ in modulus was found, and multi-step methods were not investigated. In
[9,10,4] an exact penalty function of doubly augmented Lagrangian type was introduced to coor-
dinate the coupled iterations, and global convergence of the proposed optimization approach was
proved under some assumptions. In [7] this particular one-step one-shot approach was extended
to time-dependent problems.
In this work, we consider two variants of multi-step one-shot methods where the forward
and adjoint problems are solved using fixed point methods and the inverse problem is solved
using gradient descent methods. This is a preparatory work where we focus on (discretized)
linear inverse problems. Note that the present analysis in the linear case implies also local
convergence in the non-linear case. The only basic assumptions we require are the inverse problem
uniqueness and the convergence of the fixed point iteration for the forward problem. To analyze
the convergence of the coupled iterations we study the real and complex eigenvalues of the
block iteration matrices. We prove that if the descent step is small enough then the considered
multi-step one-shot methods converge. Moreover, the upper bounds for the descent step in
these sufficient conditions are explicit in the number of inner iterations and in the norms of
the operators involved in the problem. In the particular scalar case (Appendix C), we establish
sufficient and also necessary convergence conditions on the descent step.
This paper is structured as follows. In Section 2, we introduce the principle of multi-step
one-shot methods and define two variants of these algorithms. Then, in Section 3, respectively
Section 4, we analyze the convergence of one-step one-shot methods, respectively multi-step
one-shot methods: first, we establish eigenvalue equations for the block matrices of the coupled
iterations, then we derive sufficient convergence conditions on the descent step by studying both
real and complex eigenvalues. In Section 5 we show that the previous analysis can be extended
to the case where the state variable is complex. Finally, in Section 6 we test numerically the
performance of the different algorithms on a toy 2D Helmholtz inverse problem.
Throughout this work, $\langle\cdot,\cdot\rangle$ indicates the usual Hermitian scalar product in $\mathbb{C}^n$, that is $\langle x, y\rangle := \bar{y}^T x$, $\forall x, y \in \mathbb{C}^n$, and $\|\cdot\|$ the vector/matrix norms induced by $\langle\cdot,\cdot\rangle$. We denote by $A^* = \bar{A}^T$ the adjoint operator of a matrix $A \in \mathbb{C}^{m\times n}$, and likewise by $z^* = \bar{z}$ the conjugate of a complex number $z$. The identity matrix is always denoted by $I$, whose size is understood from context. Finally, for a matrix $T \in \mathbb{C}^{n\times n}$ with $\rho(T) < 1$, we define
$$s(T) := \sup_{z\in\mathbb{C},\,|z|\ge 1} \left\|(I - T/z)^{-1}\right\|,$$
which is further studied in Appendix A.
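As a small numerical sketch (with a hypothetical $2\times 2$ contraction $T$, not taken from the report), $s(T)$ can be estimated by sampling the unit circle: the map is analytic in $w = 1/z$ on the closed unit disk, so the supremum over $|z| \ge 1$ is attained on $|z| = 1$.

```python
import numpy as np

# Estimate s(T) = sup_{|z| >= 1} ||(I - T/z)^{-1}|| by sampling |z| = 1.
def estimate_s(T, n_samples=720):
    I = np.eye(T.shape[0])
    return max(np.linalg.norm(np.linalg.inv(I - T * np.exp(-1j * t)), 2)
               for t in np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False))

T = np.array([[0.5, 0.2], [0.0, 0.3]])
assert max(abs(np.linalg.eigvals(T))) < 1       # rho(T) < 1, so s(T) is defined
s_T = estimate_s(T)
# (I - T/z)^{-1} -> I as |z| -> infinity, so s(T) >= 1; moreover
# s(T) <= 1/(1 - ||T||) here since ||T|| < 1.
assert 1.0 <= s_T <= 1.0 / (1.0 - np.linalg.norm(T, 2))
```

The upper bound $1/(1-\|B\|)$ used in the estimate is exactly the one that appears later in the norm bounds (27)-(29) when $\|B\| < 1$.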
2 Multi-step one-shot inversion methods
We focus on (discretized) linear inverse problems, which correspond to a direct (or forward) problem of the form: find $u \equiv u(\sigma)$ such that
$$u = Bu + M\sigma + F \qquad (1)$$
where $u \in \mathbb{R}^{n_u}$, $\sigma \in \mathbb{R}^{n_\sigma}$, $B \in \mathbb{R}^{n_u\times n_u}$, $M \in \mathbb{R}^{n_u\times n_\sigma}$ and $F \in \mathbb{R}^{n_u}$. Here $I - B$ is the invertible matrix of the direct problem, obtained after discretization, with parameter $\sigma$. Note that in the non-linear case $B$ would be a function of $\sigma$. Equation (1) is also called state equation and $u$ is called state. Given $\sigma$, we can solve for $u$ by a fixed point iteration
$$u_{\ell+1} = Bu_\ell + M\sigma + F, \quad \ell = 0, 1, \dots, \qquad (2)$$
which converges for any initial guess $u_0$ if and only if the spectral radius $\rho(B)$ is strictly less than $1$ (see e.g. [5, Theorem 2.1.1]). Hence we assume $\rho(B) < 1$. Now, we measure $f = Hu(\sigma)$, where $H \in \mathbb{R}^{n_f\times n_u}$, and we are interested in the linear inverse problem of finding $\sigma$ from $f$. In order to guarantee the uniqueness of the inverse problem, we assume that $H(I-B)^{-1}M$ is injective. In summary, we set
$$\text{direct problem: } u = Bu + M\sigma + F, \qquad \text{inverse problem: measure } f = Hu(\sigma), \text{ find } \sigma, \qquad (3)$$
with the assumptions:
$$\rho(B) < 1, \qquad H(I-B)^{-1}M \text{ is injective}. \qquad (4)$$
To solve the inverse problem we write its least squares formulation: given $\sigma_{ex}$ the exact solution of the inverse problem and $f := Hu(\sigma_{ex})$,
$$\sigma_{ex} = \operatorname{argmin}_{\sigma\in\mathbb{R}^{n_\sigma}} J(\sigma) \quad \text{where} \quad J(\sigma) := \frac12 \|Hu(\sigma) - f\|^2.$$
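As a quick numerical illustration of the fixed point iteration (2), the following Python sketch (with hypothetical random matrices scaled so that $\rho(B) = 0.5$) checks convergence to $u(\sigma) = (I-B)^{-1}(M\sigma + F)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_s = 4, 2
B = rng.standard_normal((n_u, n_u))
B *= 0.5 / max(abs(np.linalg.eigvals(B)))      # enforce rho(B) = 0.5 < 1
M = rng.standard_normal((n_u, n_s))
F = rng.standard_normal(n_u)
sigma = rng.standard_normal(n_s)

u_exact = np.linalg.solve(np.eye(n_u) - B, M @ sigma + F)
u = np.zeros(n_u)
for _ in range(200):                           # iteration (2)
    u = B @ u + M @ sigma + F
assert np.linalg.norm(u - u_exact) < 1e-10
```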
Using the classical Lagrangian technique with real scalar products, we introduce the adjoint state $p \equiv p(\sigma)$, which is the solution of
$$p = B^* p + H^*(Hu - f)$$
and allows us to compute the gradient of the cost functional
$$\nabla J(\sigma) = M^* p(\sigma).$$
The classical gradient descent algorithm then reads
$$\text{usual gradient descent:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^n = Bu^n + M\sigma^n + F, \\ p^n = B^* p^n + H^*(Hu^n - f), \end{cases} \qquad (5)$$
where $\tau > 0$ is the descent step size, and the state and adjoint state equations are solved exactly by a direct solver. Here $\sigma^{n+1} = \sigma^n - \tau\nabla J(\sigma^n)$; if instead we update $\sigma^{n+1} = \sigma^n - \tau\nabla J(\sigma^{n-1})$, we obtain the
$$\text{shifted gradient descent:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^{n+1} + M\sigma^n + F, \\ p^{n+1} = B^* p^{n+1} + H^*(Hu^{n+1} - f). \end{cases} \qquad (6)$$
Both algorithms converge for sufficiently small $\tau$ (see e.g. Appendix B): for any initial guess, (5) converges if
$$\tau < \frac{2}{\|H(I-B)^{-1}M\|^2}, \qquad (7)$$
and (6) converges if
$$\tau < \frac{1}{\|H(I-B)^{-1}M\|^2}. \qquad (8)$$
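The usual gradient descent (5) with exact state and adjoint solves can be sketched as follows (hypothetical small matrices, chosen only for illustration; the step satisfies bound (7)).

```python
import numpy as np

B = np.array([[0.3, 0.1], [0.0, 0.2]])
M = np.array([[1.0], [0.5]])
H = np.eye(2)
F = np.array([1.0, -1.0])
sigma_ex = np.array([2.0])
A = np.eye(2) - B
S = H @ np.linalg.solve(A, M)                  # S = H (I-B)^{-1} M
f = H @ np.linalg.solve(A, M @ sigma_ex + F)
tau = 1.9 / np.linalg.norm(S, 2) ** 2          # within bound (7)

sigma = np.zeros(1)
for _ in range(500):
    u = np.linalg.solve(A, M @ sigma + F)          # state equation, exact
    p = np.linalg.solve(A.T, H.T @ (H @ u - f))    # adjoint equation, exact
    sigma = sigma - tau * (M.T @ p)                # gradient step on sigma
assert np.linalg.norm(sigma - sigma_ex) < 1e-8
```

Since $J$ is quadratic with Hessian $S^* S$, the error in $\sigma$ contracts by the factor $|1 - \tau\lambda|$ for each eigenvalue $\lambda$ of $S^* S$, which is below $1$ exactly when (7) holds.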
Here, we are interested in methods where the direct and adjoint problems are rather solved iteratively as in (2), and where we iterate at the same time on the forward problem solution and the inverse problem unknown: such methods are called one-shot methods. More precisely, we are interested in two variants of multi-step one-shot methods, defined as follows. Let $n$ be the index of the (outer) iteration on $\sigma$, the solution to the inverse problem. We update $\sigma^{n+1} = \sigma^n - \tau M^* p^n$ as in gradient descent methods, but the state and adjoint state equations are now solved by a fixed point iteration method, using just $k$ inner iterations, and coupled:
$$\begin{cases} u^{n+1}_{\ell+1} = Bu^{n+1}_\ell + M\tilde\sigma + F, \\ p^{n+1}_{\ell+1} = B^* p^{n+1}_\ell + H^*(Hu^{n+1}_\ell - f), \end{cases} \quad \ell = 0, 1, \dots, k-1, \qquad \begin{cases} u^{n+1} = u^{n+1}_k, \\ p^{n+1} = p^{n+1}_k, \end{cases}$$
where $\tilde\sigma$ depends on the considered variant ($\tilde\sigma = \sigma^{n+1}$ or, for the shifted methods, $\tilde\sigma = \sigma^n$). As initial guess we naturally choose $u^{n+1}_0 = u^n$ and $p^{n+1}_0 = p^n$, the information from the previous (outer) step. In summary, we have two multi-step one-shot algorithms
$$\text{$k$-step one-shot:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1}_0 = u^n, \quad p^{n+1}_0 = p^n, \\ u^{n+1}_{\ell+1} = Bu^{n+1}_\ell + M\sigma^{n+1} + F, \\ p^{n+1}_{\ell+1} = B^* p^{n+1}_\ell + H^*(Hu^{n+1}_\ell - f), \quad \ell = 0, \dots, k-1, \\ u^{n+1} = u^{n+1}_k, \quad p^{n+1} = p^{n+1}_k \end{cases} \qquad (9)$$
and
$$\text{shifted $k$-step one-shot:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1}_0 = u^n, \quad p^{n+1}_0 = p^n, \\ u^{n+1}_{\ell+1} = Bu^{n+1}_\ell + M\sigma^n + F, \\ p^{n+1}_{\ell+1} = B^* p^{n+1}_\ell + H^*(Hu^{n+1}_\ell - f), \quad \ell = 0, \dots, k-1, \\ u^{n+1} = u^{n+1}_k, \quad p^{n+1} = p^{n+1}_k, \end{cases} \qquad (10)$$
and in particular, when $k = 1$, we obtain the following two algorithms
$$\text{one-step one-shot:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^n + M\sigma^{n+1} + F, \\ p^{n+1} = B^* p^n + H^*(Hu^n - f), \end{cases} \qquad (11)$$
and
$$\text{shifted one-step one-shot:} \quad \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^n + M\sigma^n + F, \\ p^{n+1} = B^* p^n + H^*(Hu^n - f). \end{cases} \qquad (12)$$
The only difference for the shifted versions lies in the fact that $\sigma^n$ is used in (10) and (12), instead of $\sigma^{n+1}$ in (9) and (11), so that in (9) and (11) we need to wait for $\sigma$ before updating $u$ and $p$, while in (10) and (12) we can update $\sigma, u, p$ at the same time. Also note that when $k \to \infty$, the $k$-step one-shot method (9) formally converges to the usual gradient descent (5), while the shifted $k$-step one-shot method (10) formally converges to the shifted gradient descent (6).
We first analyze the one-step one-shot methods ($k = 1$) in Section 3 and then the multi-step one-shot methods ($k \ge 2$) in Section 4.
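The shifted $k$-step one-shot method (10) can be sketched as follows, on a hypothetical small example (the matrices are illustrative, and $\tau$ is taken small so that the coupled iteration is stable).

```python
import numpy as np

B = np.array([[0.3, 0.1], [0.0, 0.2]])
M = np.array([[1.0], [0.5]])
H = np.eye(2)
F = np.array([1.0, -1.0])
sigma_ex = np.array([2.0])
f = H @ np.linalg.solve(np.eye(2) - B, M @ sigma_ex + F)

def shifted_k_step_one_shot(k, tau, n_outer):
    sigma, u, p = np.zeros(1), np.zeros(2), np.zeros(2)
    for _ in range(n_outer):
        sigma_new = sigma - tau * (M.T @ p)    # uses p^n
        for _ in range(k):                     # k coupled inner iterations
            u_new = B @ u + M @ sigma + F      # uses sigma^n (shifted variant)
            p = B.T @ p + H.T @ (H @ u - f)    # uses the inner state u_l
            u = u_new
        sigma = sigma_new
    return sigma

sigma = shifted_k_step_one_shot(k=3, tau=0.01, n_outer=3000)
assert np.linalg.norm(sigma - sigma_ex) < 1e-6
```

Note that the exact triplet $(\sigma_{ex}, u(\sigma_{ex}), p(\sigma_{ex}))$ is a fixed point of the iteration, so when the coupled iteration contracts the method recovers $\sigma_{ex}$.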
3 Convergence of one-step one-shot methods (k= 1)
3.1 Block iteration matrices and eigenvalue equations
To analyze the convergence of these methods, first we express $(\sigma^{n+1}, u^{n+1}, p^{n+1})$ in terms of $(\sigma^n, u^n, p^n)$, by inserting the expression for $\sigma^{n+1}$ into the iteration for $u^{n+1}$ in (11), so that system (11) is rewritten as
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^n + M\sigma^n - \tau MM^* p^n + F, \\ p^{n+1} = B^* p^n + H^* H u^n - H^* f. \end{cases} \qquad (13)$$
System (12) is already in the form we need. In what follows we first study the shifted 1-step one-shot method, then the 1-step one-shot method.
Now, we consider the errors $(\sigma^n - \sigma_{ex}, u^n - u(\sigma_{ex}), p^n - p(\sigma_{ex}))$ with respect to the exact solution at the $n$-th iteration, and, by abuse of notation, we designate them by $(\sigma^n, u^n, p^n)$. We obtain that the errors satisfy: for the shifted algorithm (12)
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^n + M\sigma^n, \\ p^{n+1} = B^* p^n + H^* H u^n, \end{cases} \qquad (14)$$
and for algorithm (13)
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = Bu^n + M\sigma^n - \tau MM^* p^n, \\ p^{n+1} = B^* p^n + H^* H u^n, \end{cases} \qquad (15)$$
or equivalently, by putting in evidence the block iteration matrices,
$$\begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} B^* & H^*H & 0 \\ 0 & B & M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix} \qquad (16)$$
and
$$\begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} B^* & H^*H & 0 \\ -\tau MM^* & B & M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \qquad (17)$$
Now recall that a fixed point iteration converges if and only if the spectral radius of its iteration
matrix is strictly less than 1. Therefore in the following propositions we establish eigenvalue
equations for the iteration matrix of the two methods.
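As a numerical sanity check (hypothetical small matrices), the block matrix of (16) can be assembled explicitly and its spectral radius verified to be below $1$ for a small descent step.

```python
import numpy as np

B = np.array([[0.3, 0.1], [0.0, 0.2]])
M = np.array([[1.0], [0.5]])
H = np.eye(2)
tau = 0.01
n_u, n_s = 2, 1

Z = np.zeros
T16 = np.vstack([
    np.hstack([B.T, H.T @ H, Z((n_u, n_s))]),             # p-row:  B*, H*H, 0
    np.hstack([Z((n_u, n_u)), B, M]),                     # u-row:  0,  B,   M
    np.hstack([-tau * M.T, Z((n_s, n_u)), np.eye(n_s)]),  # sigma-row
])
eigs = np.linalg.eigvals(T16)
assert max(abs(eigs)) < 1.0            # the coupled error iteration contracts
assert min(abs(eigs - 1.0)) > 1e-8     # lambda = 1 is not an eigenvalue
```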
Proposition 3.1 (Eigenvalue equation for the shifted 1-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (16).
(i) If $\lambda \in \mathbb{C}$, $\lambda \notin \operatorname{Spec}(B)$, then $\exists y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
$$(\lambda - 1)\|y\|^2 + \tau \langle M^*(\lambda I - B^*)^{-1} H^* H (\lambda I - B)^{-1} M y, y\rangle = 0. \qquad (18)$$
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.
Remark 3.2. Since $\rho(B)$ is strictly less than $1$, so is $\rho(B^*)$.
Proof. Since $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (16), there exists a non-zero vector $(\tilde p, \tilde u, y) \in \mathbb{C}^{n_u + n_u + n_\sigma}$ such that
$$\begin{cases} \lambda y = y - \tau M^* \tilde p, \\ \lambda \tilde u = B\tilde u + My, \\ \lambda \tilde p = B^* \tilde p + H^* H \tilde u. \end{cases} \qquad (19)$$
By the second equation in (19), $\tilde u = (\lambda I - B)^{-1} M y$, so together with the third equation
$$\tilde p = (\lambda I - B^*)^{-1} H^* H \tilde u = (\lambda I - B^*)^{-1} H^* H (\lambda I - B)^{-1} M y,$$
and by inserting this result into the first equation we obtain
$$(\lambda - 1) y = -\tau M^* (\lambda I - B^*)^{-1} H^* H (\lambda I - B)^{-1} M y, \qquad (20)$$
that gives (18) by taking the scalar product with $y$. We also see that if $y = 0$ then the above formulas for $\tilde u, \tilde p$ immediately give $\tilde u = \tilde p = 0$, that is a contradiction.
(ii) Assume that $\lambda = 1$ is an eigenvalue of the iteration matrix; then (20) gives us
$$M^*(I - B^*)^{-1} H^* H (I - B)^{-1} M y = 0,$$
but this cannot happen for $y \ne 0$ due to the injectivity of $H(I - B)^{-1} M$.
Proposition 3.3 (Eigenvalue equation for the 1-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (17).
(i) If $\lambda \in \mathbb{C}$, $\lambda \notin \operatorname{Spec}(B)$, then $\exists y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
$$(\lambda - 1)\|y\|^2 + \tau\lambda \langle M^*(\lambda I - B^*)^{-1} H^* H (\lambda I - B)^{-1} M y, y\rangle = 0. \qquad (21)$$
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.
Proof. Since $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (17), there exists a non-zero vector $(\tilde p, \tilde u, y) \in \mathbb{C}^{n_u + n_u + n_\sigma}$ such that
$$\begin{cases} \lambda y = y - \tau M^* \tilde p, \\ \lambda \tilde u = B\tilde u + My - \tau MM^* \tilde p, \\ \lambda \tilde p = B^* \tilde p + H^* H \tilde u. \end{cases} \qquad (22)$$
By the third equation in (22), $\tilde p = (\lambda I - B^*)^{-1} H^* H \tilde u$, and inserting this result into the second equation we obtain
$$\lambda \tilde u = B \tilde u + M y - \tau M M^* (\lambda I - B^*)^{-1} H^* H \tilde u,$$
or equivalently,
$$[I + \tau M M^* A](\lambda I - B)\tilde u = M y,$$
where $A = (\lambda I - B^*)^{-1} H^* H (\lambda I - B)^{-1}$. Since $\tau > 0$, $I + \tau M M^* A$ is invertible. Therefore
$$\tilde u = (\lambda I - B)^{-1} [I + \tau M M^* A]^{-1} M y$$
and
$$\tilde p = (\lambda I - B^*)^{-1} H^* H \tilde u = A [I + \tau M M^* A]^{-1} M y.$$
By inserting this result into the first equation in (22) we obtain
$$(\lambda - 1) y = -\tau M^* A [I + \tau M M^* A]^{-1} M y.$$
Thanks to the fact that $[I + \tau M M^* A]^{-1}$ and $M M^* A$ commute, we have
$$(\lambda - 1) M y = -\tau M M^* A [I + \tau M M^* A]^{-1} M y = -\tau [I + \tau M M^* A]^{-1} M M^* A M y,$$
then
$$(\lambda - 1)[I + \tau M M^* A] M y = -\tau M M^* A M y,$$
that leads to
$$(\lambda - 1) M y + \tau \lambda M M^* A M y = 0.$$
Since $H(I - B)^{-1} M$ is injective, so is $M$. Therefore
$$(\lambda - 1) y + \tau \lambda M^* A M y = 0, \qquad (23)$$
that gives (21) by taking the scalar product with $y$. We also see that if $y = 0$ then the above formulas for $\tilde u, \tilde p$ immediately give $\tilde u = \tilde p = 0$, that is a contradiction.
(ii) Assume that $\lambda = 1$ is an eigenvalue of the iteration matrix; then (23) gives us
$$M^*(I - B^*)^{-1} H^* H (I - B)^{-1} M y = 0,$$
but this cannot happen for $y \ne 0$ due to the injectivity of $H(I - B)^{-1} M$.
In the following sections we will show that, for sufficiently small $\tau$, equations (18) and (21) admit no solution $|\lambda| \ge 1$, thus algorithms (12) and (11) converge. When $\lambda \ne 0$, it is convenient to rewrite (18) and (21) respectively as
$$\lambda^2(\lambda - 1)\|y\|^2 + \tau \langle M^*(I - B^*/\lambda)^{-1} H^* H (I - B/\lambda)^{-1} M y, y\rangle = 0 \qquad (24)$$
and
$$\lambda(\lambda - 1)\|y\|^2 + \tau \langle M^*(I - B^*/\lambda)^{-1} H^* H (I - B/\lambda)^{-1} M y, y\rangle = 0. \qquad (25)$$
For the analysis we use auxiliary results proved in Appendix A.
First, we study separately the very particular case where $B = 0$.
Proposition 3.4 (shifted 1-step one-shot method). When $B = 0$, the eigenvalue equation (24) admits no solution $\lambda \in \mathbb{C}$, $|\lambda| \ge 1$, if $\tau < \frac{\sqrt{5} - 1}{2\|H\|^2\|M\|^2}$.
Proof. When $B = 0$, equation (24) becomes $\lambda^2(\lambda - 1)\|y\|^2 + \tau\|HMy\|^2 = 0$, which is equivalent to $\lambda^3 - \lambda^2 + \frac{\|HMy\|^2}{\|y\|^2}\tau = 0$. Then, the conclusion can be obtained by Lemma C.1.
Proposition 3.5 (1-step one-shot method). When $B = 0$, the eigenvalue equation (25) admits no solution $\lambda \in \mathbb{C}$, $|\lambda| \ge 1$, if $\tau < \frac{1}{\|H\|^2\|M\|^2}$.
Proof. When $B = 0$, equation (25) becomes $\lambda(\lambda - 1)\|y\|^2 + \tau\|HMy\|^2 = 0$, which yields $\lambda^3 - \lambda^2 + \frac{\|HMy\|^2}{\|y\|^2}\tau\lambda = 0$. Then, the conclusion can be obtained by Lemma C.1.
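These two root conditions can be checked numerically; in the hypothetical sweep below, $c$ stands for $\tau\|HMy\|^2/\|y\|^2$, and the thresholds are $(\sqrt 5 - 1)/2 \approx 0.618$ for the shifted method and $1$ for the plain method.

```python
import numpy as np

# Shifted 1-step, B = 0: lambda^3 - lambda^2 + c = 0 has all roots strictly
# inside the unit disk for 0 < c < (sqrt(5) - 1)/2.
for c in np.linspace(1e-3, 0.61, 40):
    assert max(abs(np.roots([1.0, -1.0, 0.0, c]))) < 1.0

# 1-step, B = 0: lambda^3 - lambda^2 + c*lambda = 0, i.e.
# lambda (lambda^2 - lambda + c) = 0, has all roots strictly inside
# the unit disk for 0 < c < 1.
for c in np.linspace(1e-3, 0.99, 40):
    assert max(abs(np.roots([1.0, -1.0, c, 0.0]))) < 1.0
```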
3.2 Real eigenvalues
We now find conditions on the descent step $\tau$ such that the real eigenvalues stay inside the unit disk. Recall that we have already proved that $\lambda = 1$ is not an eigenvalue for both methods.
Proposition 3.6 (shifted 1-step one-shot method). Equation (24)
(i) admits no solution $\lambda \in \mathbb{R}$, $\lambda > 1$, for all $\tau > 0$;
(ii) admits no solution $\lambda \in \mathbb{R}$, $\lambda \le -1$, if we take
$$\tau < \frac{2}{\|H\|^2\|M\|^2 s(B)^2},$$
where $s(B)$ is defined in Lemma A.2; moreover if $0 < \|B\| < 1$, we can take
$$\tau < \frac{\chi_0(1, \|B\|)}{\|H\|^2\|M\|^2}, \quad \text{where } \chi_0(1, b) = 2(1 - b)^2 \qquad (26)$$
(here in the notation $\chi_0(1, b)$, $1$ refers to $k = 1$).
Proof. When $\lambda \in \mathbb{R}\setminus\{0\}$, equation (24) becomes
$$\lambda^2(\lambda - 1)\|y\|^2 + \tau \left\|H(I - B/\lambda)^{-1}My\right\|^2 = 0.$$
The left-hand side of the above equation is strictly positive for any $\tau > 0$ if $\lambda > 1$; it is strictly negative for $\tau$ satisfying the inequality in (ii) if $\lambda \le -1$, noting that $\lambda \mapsto \lambda^2(\lambda - 1)$ is increasing for $\lambda \le -1$.
Proposition 3.7 (1-step one-shot method). Equation (25) admits no solution $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$, for all $\tau > 0$.
Proof. When $\lambda \in \mathbb{R}\setminus\{0\}$, equation (25) becomes
$$\lambda(\lambda - 1)\|y\|^2 + \tau \left\|H(I - B/\lambda)^{-1}My\right\|^2 = 0.$$
If $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$, then $\lambda(\lambda - 1) > 0$, thus the left-hand side of the above equation is strictly positive for any $\tau > 0$.
3.3 Complex eigenvalues
We now look for conditions on the descent step $\tau$ such that also the complex eigenvalues stay inside the unit disk. We first deal with the shifted 1-step one-shot method.
Proposition 3.8 (shifted 1-step one-shot method). If $B \ne 0$, $\exists \tau > 0$ sufficiently small such that equation (24) admits no solution $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$. In particular, if $0 < \|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 \le \frac{\pi}{6}$, take
$$\tau < \frac{\min\{\chi_1(1, \|B\|), \chi_2(1, \|B\|), \chi_3(1, \|B\|), \chi_4(1, \|B\|)\}}{\|H\|^2\|M\|^2},$$
where
$$\chi_1(1, b) = \frac{(1-b)^4}{4b^2}, \qquad \chi_2(1, b) = 2\sin\frac{\theta_0}{2}\,\frac{(1-b)^2}{(1+b)^2},$$
$$\chi_3(1, b) = \frac{\delta_0 \cos^2\frac{5\theta_0}{2}}{2\left(1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2\right)}\cdot\frac{(1-b)^4}{b^2}, \qquad \chi_4(1, b) = \left[\sin\left(\frac{\pi}{2} - 3\theta_0\right) + \cos 2\theta_0\right](1-b)^2$$
(here in the notation $\chi_i(1, b)$, $i = 1, \dots, 4$, $1$ refers to $k = 1$).
Proof. Step 1. Rewrite equation (24) so that we can study its real and imaginary parts.
Let $\lambda = R(\cos\theta + \mathrm{i}\sin\theta)$ in polar form, where $R = |\lambda| \ge 1$ and $\theta \in (-\pi, \pi)$. Write $1/\lambda = r(\cos\varphi + \mathrm{i}\sin\varphi)$ in polar form, where $r = 1/|\lambda| = 1/R \le 1$ and $\varphi = -\theta$. By Lemma A.3, we have
$$\left(I - \frac{B}{\lambda}\right)^{-1} = P(\lambda) + \mathrm{i}\,Q(\lambda), \qquad \left(I - \frac{B^*}{\lambda}\right)^{-1} = P(\lambda)^* + \mathrm{i}\,Q(\lambda)^*,$$
where $P(\lambda)$ and $Q(\lambda)$ are $\mathbb{C}^{n_u\times n_u}$-valued functions, and, by omitting the dependence on $\lambda$,
$$\|P\| \le p := \begin{cases} (1 + \|B\|)s(B)^2 & \text{for general } B \ne 0, \\ \dfrac{1}{1 - \|B\|} & \text{when } \|B\| < 1; \end{cases} \qquad (27)$$
$$\|Q\| \le q_1 := \begin{cases} \|B\|s(B)^2 & \text{for general } B \ne 0, \\ \dfrac{\|B\|}{1 - \|B\|} & \text{when } 0 < \|B\| < 1; \end{cases} \qquad (28)$$
$$\|Q\| \le |\sin\theta|\, q_2, \quad q_2 := \begin{cases} \|B\|s(B)^2 & \text{for general } B \ne 0, \\ \dfrac{\|B\|}{(1 - \|B\|)^2} & \text{when } 0 < \|B\| < 1. \end{cases} \qquad (29)$$
Now we rewrite (24) as
$$\lambda^2(\lambda - 1)\|y\|^2 + \tau G(P - \mathrm{i}Q, P + \mathrm{i}Q) = 0, \qquad (30)$$
where
$$G(X, Y) = \langle M^* X^* H^* H Y M y, y\rangle \in \mathbb{C}, \quad X, Y \in \mathbb{C}^{n_u\times n_u}.$$
$G$ satisfies the following properties:
- $\forall X, Y_1, Y_2 \in \mathbb{C}^{n_u\times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(X, z_1 Y_1 + z_2 Y_2) = z_1 G(X, Y_1) + z_2 G(X, Y_2)$.
- $\forall X_1, X_2, Y \in \mathbb{C}^{n_u\times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(z_1 X_1 + z_2 X_2, Y) = \bar z_1 G(X_1, Y) + \bar z_2 G(X_2, Y)$.
- $\forall X \in \mathbb{C}^{n_u\times n_u}$: $0 \le G(X, X) = \|HXMy\|^2 \le (\|H\|\|M\|\|X\|)^2\|y\|^2$.
- $\forall X, Y \in \mathbb{C}^{n_u\times n_u}$: $G(X, Y) + G(Y, X) \in \mathbb{R}$, indeed
$$\overline{G(X, Y)} = \overline{\langle M^* X^* H^* H Y M y, y\rangle} = \langle y, M^* X^* H^* H Y M y\rangle = \langle M^* Y^* H^* H X M y, y\rangle = G(Y, X).$$
With these properties of $G$, we expand (30) and take its real and imaginary parts, so we respectively obtain
$$\Re(\lambda^3 - \lambda^2)\|y\|^2 + \tau[G(P, P) - G(Q, Q)] = 0 \qquad (31)$$
and
$$\Im(\lambda^3 - \lambda^2)\|y\|^2 + \tau[G(P, Q) + G(Q, P)] = 0. \qquad (32)$$
Step 2. Find a suitable combination of equations (31) and (32), and choose $\tau$ so that we obtain a new equation whose left-hand side is strictly positive/negative.
Let $\gamma = \gamma(\lambda) \in \mathbb{R}$ be defined by cases as in Lemma A.4. Multiplying equation (32) by $\gamma$ and summing it with equation (31), we obtain
$$[\Re(\lambda^3 - \lambda^2) + \gamma\Im(\lambda^3 - \lambda^2)]\|y\|^2 + \tau[G(P, P) - G(Q, Q) + \gamma G(P, Q) + \gamma G(Q, P)] = 0,$$
or equivalently,
$$[\Re(\lambda^3 - \lambda^2) + \gamma\Im(\lambda^3 - \lambda^2)]\|y\|^2 + \tau G(P + \gamma Q, P + \gamma Q) - (1 + \gamma^2)\tau G(Q, Q) = 0. \qquad (33)$$
Now we consider four cases of $\lambda$ as in Lemma A.4:
- Case 1. $\Re(\lambda^3 - \lambda^2) \ge 0$;
- Case 2. $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in [\theta_0, \pi - \theta_0] \cup [-\pi + \theta_0, -\theta_0]$ for fixed $0 < \theta_0 \le \frac{\pi}{6}$;
- Case 3. $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in (-\theta_0, \theta_0)$ for fixed $0 < \theta_0 \le \frac{\pi}{6}$;
- Case 4. $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in (\pi - \theta_0, \pi) \cup (-\pi, -\pi + \theta_0)$ for fixed $0 < \theta_0 \le \frac{\pi}{6}$.
The four cases will be treated in the following four lemmas (Lemmas 3.9-3.12), which together give the statement of this proposition.
Lemma 3.9 (Case 1). Equation (24) admits no solutions $\lambda$ in Case 1 if we take
$$\tau < \frac{1}{4\|H\|^2\|M\|^2\|B\|^2 s(B)^4}.$$
Moreover, if $0 < \|B\| < 1$, we can take
$$\tau < \frac{(1 - \|B\|)^4}{4\|H\|^2\|M\|^2\|B\|^2}.$$
Proof. Writing (33) for $\gamma = \gamma_1$ as in Lemma A.4 (i) (in particular $\gamma_1^2 = 1$), we have
$$[\Re(\lambda^3 - \lambda^2) + \gamma_1\Im(\lambda^3 - \lambda^2)]\|y\|^2 + \tau G(P + \gamma_1 Q, P + \gamma_1 Q) - 2\tau G(Q, Q) = 0. \qquad (34)$$
By the properties of $G$ we have
$$G(P + \gamma_1 Q, P + \gamma_1 Q) \ge 0$$
and
$$G(Q, Q) \le (\|H\|\|M\|\|Q\|)^2\|y\|^2 \le (\|H\|\|M\||\sin\theta|\,q_2)^2\|y\|^2,$$
therefore the left-hand side of (34) will be strictly positive if $\tau$ satisfies
$$\tau < \frac{\Re(\lambda^3 - \lambda^2) + \gamma_1\Im(\lambda^3 - \lambda^2)}{2(\|H\|\|M\||\sin\theta|\,q_2)^2}.$$
Since $\Re(\lambda^3 - \lambda^2) + \gamma_1\Im(\lambda^3 - \lambda^2) \ge 2|\sin(\theta/2)|$ by Lemma A.4 (i), it is enough to choose
$$\tau < \frac{1}{4\left|\sin\frac{\theta}{2}\right|\cos^2\frac{\theta}{2}\,\|H\|^2\|M\|^2 q_2^2}.$$
Since $\left|\sin\frac{\theta}{2}\right|\cos^2\frac{\theta}{2} \le 1$, it is sufficient to choose $\tau < \frac{1}{4\|H\|^2\|M\|^2 q_2^2}$, and we use definition (29) of $q_2$.
Lemma 3.10 (Case 2). Equation (24) admits no solutions $\lambda$ in Case 2 if we take
$$\tau < \frac{2\sin\frac{\theta_0}{2}}{\|H\|^2\|M\|^2(1 + 2\|B\|)^2 s(B)^4}.$$
Moreover, if $0 < \|B\| < 1$, we can take
$$\tau < \frac{2\sin\frac{\theta_0}{2}\,(1 - \|B\|)^2}{\|H\|^2\|M\|^2(1 + \|B\|)^2}.$$
Proof. Writing (33) for $\gamma = \gamma_2$ as in Lemma A.4 (ii) (in particular $\gamma_2^2 = 1$), we have
$$[\Re(\lambda^3 - \lambda^2) + \gamma_2\Im(\lambda^3 - \lambda^2)]\|y\|^2 + \tau G(P + \gamma_2 Q, P + \gamma_2 Q) - 2\tau G(Q, Q) = 0. \qquad (35)$$
By the properties of $G$,
$$G(Q, Q) \ge 0, \qquad G(P + \gamma_2 Q, P + \gamma_2 Q) \le (\|H\|\|M\|\|P + \gamma_2 Q\|)^2\|y\|^2,$$
and the estimate $\|P + \gamma_2 Q\| \le \|P\| + |\gamma_2|\|Q\| = \|P\| + \|Q\| \le p + q_1$, the left-hand side of (35) will be strictly negative if $\tau$ satisfies
$$\tau < \frac{-\Re(\lambda^3 - \lambda^2) - \gamma_2\Im(\lambda^3 - \lambda^2)}{[\|H\|\|M\|(p + q_1)]^2}.$$
Thanks to Lemma A.4 (ii), it is sufficient to choose
$$\tau < \frac{2\sin\frac{\theta_0}{2}}{\|H\|^2\|M\|^2(p + q_1)^2},$$
and we use definitions (27) and (28) of $p$ and $q_1$.
Lemma 3.11 (Case 3). Let $\delta_0 > 0$ be fixed. Equation (24) admits no solutions $\lambda$ in Case 3 if we take
$$\tau < \frac{\delta_0\cos^2\frac{5\theta_0}{2}}{2\left(1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2\right)}\cdot\frac{1}{\|H\|^2\|M\|^2\|B\|^2 s(B)^4}.$$
Moreover, if $0 < \|B\| < 1$, we can take
$$\tau < \frac{\delta_0\cos^2\frac{5\theta_0}{2}}{2\left(1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2\right)}\cdot\frac{(1 - \|B\|)^4}{\|H\|^2\|M\|^2\|B\|^2}.$$
Proof. Writing (33) for $\gamma = \gamma_3$ as in Lemma A.4 (iii), we have
$$[\Re(\lambda^3 - \lambda^2) + \gamma_3\Im(\lambda^3 - \lambda^2)]\|y\|^2 + \tau G(P + \gamma_3 Q, P + \gamma_3 Q) - (1 + \gamma_3^2)\tau G(Q, Q) = 0. \qquad (36)$$
By the properties of $G$,
$$G(P + \gamma_3 Q, P + \gamma_3 Q) \ge 0, \qquad G(Q, Q) \le (\|H\|\|M\|\|Q\|)^2\|y\|^2,$$
and by the estimate $\|Q\| \le |\sin\theta|\,q_2$, the left-hand side of (36) will be strictly positive if $\tau$ satisfies
$$\tau < \frac{\Re(\lambda^3 - \lambda^2) + \gamma_3\Im(\lambda^3 - \lambda^2)}{(1 + \gamma_3^2)(\|H\|\|M\||\sin\theta|\,q_2)^2}.$$
Since by Lemma A.4 (iii) $\Re(\lambda^3 - \lambda^2) + \gamma_3\Im(\lambda^3 - \lambda^2) > 2\delta_0\sin^2\frac{\theta}{2}$, it is sufficient to choose
$$\tau < \frac{\delta_0}{2(1 + \gamma_3^2)\|H\|^2\|M\|^2 q_2^2} = \frac{1}{2\|H\|^2\|M\|^2 q_2^2}\cdot\frac{\delta_0\cos^2\frac{5\theta_0}{2}}{1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2},$$
where we have used the definition of $\gamma_3$. To conclude we use definition (29) of $q_2$.
Lemma 3.12 (Case 4). Equation (24) admits no solutions $\lambda$ in Case 4 if we take
$$\tau < \frac{\sin\left(\frac{\pi}{2} - 3\theta_0\right) + \cos 2\theta_0}{\|H\|^2\|M\|^2(1 + \|B\|)^2 s(B)^4}.$$
Moreover, if $0 < \|B\| < 1$, we can take
$$\tau < \frac{\left[\sin\left(\frac{\pi}{2} - 3\theta_0\right) + \cos 2\theta_0\right](1 - \|B\|)^2}{\|H\|^2\|M\|^2}.$$
Proof. Here it is enough to consider (31). By the properties of $G$,
$$G(Q, Q) \ge 0, \qquad G(P, P) \le (\|H\|\|M\|p)^2\|y\|^2,$$
we see that the left-hand side of (31) will be strictly negative if $\tau$ satisfies
$$\tau < \frac{-\Re(\lambda^3 - \lambda^2)}{(\|H\|\|M\|p)^2}.$$
Thanks to Lemma A.4 (iv), it is sufficient to choose
$$\tau < \frac{\sin\left(\frac{\pi}{2} - 3\theta_0\right) + \cos 2\theta_0}{\|H\|^2\|M\|^2 p^2},$$
and definition (27) of $p$ leads to the conclusion.
Similarly, with the help of Lemma A.5, we prove for the 1-step one-shot method the analogue of Proposition 3.8. In particular, note that here just three cases of $\lambda$ need to be considered, because the analogue of the fourth one is excluded by Lemma A.5 (iv).
Proposition 3.13 (1-step one-shot method). If $B \ne 0$, $\exists \tau > 0$ sufficiently small such that equation (25) admits no solution $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$. In particular, if $0 < \|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 \le \frac{\pi}{4}$, take
$$\tau < \frac{\min\{\psi_1(1, \|B\|), \psi_2(1, \|B\|), \psi_3(1, \|B\|)\}}{\|H\|^2\|M\|^2},$$
where
$$\psi_1(1, b) = \frac{(1-b)^4}{4b^2}, \qquad \psi_2(1, b) = 2\sin\frac{\theta_0}{2}\,\frac{(1-b)^2}{(1+b)^2}, \qquad \psi_3(1, b) = \frac{\delta_0\cos^2\frac{3\theta_0}{2}\,(1-b)^4}{2\left(1 + 2\delta_0\sin\frac{3\theta_0}{2} + \delta_0^2\right)b^2}$$
(here in the notation $\psi_i(1, b)$, $i = 1, 2, 3$, $1$ refers to $k = 1$).
3.4 Final result (k = 1)
Considering Proposition 3.4, and taking the minimum between the bound (26) in Proposition 3.6 for real eigenvalues and the bound in Proposition 3.8 for complex eigenvalues, we obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the shifted 1-step one-shot method.
Theorem 3.14 (Convergence of shifted 1-step one-shot). Under assumption (4), the shifted 1-step one-shot method (12) converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
$$\tau < \frac{\chi(1, \|B\|)}{\|H\|^2\|M\|^2},$$
where $\chi(1, \|B\|)$ is an explicit function of $\|B\|$ (in this notation $1$ refers to $k = 1$).
Remark 3.15. Set $b = \|B\|$. For $0 < b < 1$, a practical (but not optimal) bound for $\tau$ is
$$\tau < \frac{1}{\|H\|^2\|M\|^2}\cdot\min\left\{\frac{1}{2}\cdot\frac{(1-b)^2}{(1+b)^2},\ \frac{1 - \sin\frac{5\pi}{12}}{4}\cdot\frac{(1-b)^4}{b^2}\right\}.$$
Indeed, using the notation in Propositions 3.6 and 3.8, it is easy to show that $\chi_2(1, b) \le \chi_0(1, b)$ and $\chi_3(1, b) \le \chi_1(1, b)$. By studying $\chi_3(1, b)$ and noting that $\delta_0^2 + 1 \ge 2\delta_0$, we see that we should take $\delta_0 = 1$. Finally, we can take for instance $\theta_0 = \frac{\pi}{6}$, then compare $\chi_2(1, b)$, $\chi_3(1, b)$ and $\chi_4(1, b)$.
Putting together Propositions 3.5, 3.7 and 3.13, we obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the 1-step one-shot method.
Theorem 3.16 (Convergence of 1-step one-shot). Under assumption (4), the 1-step one-shot method (11) converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
$$\tau < \frac{\psi(1, \|B\|)}{\|H\|^2\|M\|^2},$$
where $\psi(1, \|B\|)$ is an explicit function of $\|B\|$ (in this notation $1$ refers to $k = 1$).
Remark 3.17. Similarly as above, for $0 < b < 1$, a practical (but not optimal) bound for $\tau$ is
$$\tau < \frac{1}{\|H\|^2\|M\|^2}\cdot\min\left\{2\sin\frac{\pi}{8}\cdot\frac{(1-b)^2}{(1+b)^2},\ \frac{1 - \sin\frac{3\pi}{8}}{4}\cdot\frac{(1-b)^4}{b^2}\right\}.$$
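The practical bound of Remark 3.15 can be evaluated directly; in the hypothetical example below, a step just under the bound indeed yields a spectral radius below $1$ for the shifted 1-step block iteration matrix (16).

```python
import numpy as np

B = np.array([[0.3, 0.1], [0.0, 0.2]])
M = np.array([[1.0], [0.5]])
H = np.eye(2)
b = np.linalg.norm(B, 2)
K = (np.linalg.norm(H, 2) * np.linalg.norm(M, 2)) ** 2
# Practical bound of Remark 3.15
bound = min(0.5 * (1 - b) ** 2 / (1 + b) ** 2,
            (1 - np.sin(5 * np.pi / 12)) / 4 * (1 - b) ** 4 / b ** 2) / K
tau = 0.99 * bound

n_u, n_s = 2, 1
Z = np.zeros
T16 = np.vstack([
    np.hstack([B.T, H.T @ H, Z((n_u, n_s))]),
    np.hstack([Z((n_u, n_u)), B, M]),
    np.hstack([-tau * M.T, Z((n_s, n_u)), np.eye(n_s)]),
])
assert max(abs(np.linalg.eigvals(T16))) < 1.0
```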
4 Convergence of multi-step one-shot methods (k ≥ 2)
We now tackle the multi-step case, that is the $k$-step one-shot methods with $k \ge 2$.
4.1 Block iteration matrices and eigenvalue equations
Once again, to analyze the convergence of these methods, first we express $(\sigma^{n+1}, u^{n+1}, p^{n+1})$ in terms of $(\sigma^n, u^n, p^n)$, by rewriting the recursions for $u$ and $p$: systems (9) and (10) are respectively rewritten as
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = B^k u^n + T_k M\sigma^n - \tau T_k MM^* p^n + T_k F, \\ p^{n+1} = [(B^*)^k - \tau X_k MM^*] p^n + U_k u^n + X_k M\sigma^n + X_k F - T_k^* H^* f, \end{cases} \qquad (37)$$
and
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = B^k u^n + T_k M\sigma^n + T_k F, \\ p^{n+1} = (B^*)^k p^n + U_k u^n + X_k M\sigma^n + X_k F - T_k^* H^* f, \end{cases} \qquad (38)$$
where
$$T_k = I + B + \dots + B^{k-1} = (I - B)^{-1}(I - B^k), \quad k \ge 1, \qquad (39)$$
$$U_k = (B^*)^{k-1} H^* H + (B^*)^{k-2} H^* H B + \dots + H^* H B^{k-1}, \quad k \ge 1,$$
$$X_k = \begin{cases} (B^*)^{k-2} H^* H T_1 + (B^*)^{k-3} H^* H T_2 + \dots + H^* H T_{k-1} & \text{if } k \ge 2, \\ 0 & \text{if } k = 1. \end{cases} \qquad (40)$$
Note that (37) ($k$-step one-shot) can be obtained from (38) (shifted $k$-step one-shot) by replacing $\sigma^n$ with $\sigma^{n+1} = \sigma^n - \tau M^* p^n$ in the equations for $u$ and $p$, which yields two extra terms in (37). In what follows we first study the shifted $k$-step one-shot method, then the $k$-step one-shot method.
The following lemma gathers some useful properties of $T_k$, $U_k$ and $X_k$.
Lemma 4.1. (i) The matrices $U_k$ and $X_k$ can be rewritten as
$$U_k = \sum_{i+j=k-1} (B^*)^i H^* H B^j \quad \text{for } k \ge 1, \qquad X_k = \sum_{l=0}^{k-2} \sum_{i+j=l} (B^*)^i H^* H B^j = \sum_{l=1}^{k-1} U_l \quad \text{for } k \ge 2.$$
(ii) The matrices $U_k$ and $X_k$ are self-adjoint: $U_k^* = U_k$, $X_k^* = X_k$.
(iii) We have the relation
$$U_k T_k - X_k B^k + X_k = T_k^* H^* H T_k, \quad \forall k \ge 1. \qquad (41)$$
Proof. (i) is easy to check by the definitions. (ii) follows from (i).
(iii) For $k = 1$, we have $U_1 = H^* H$, $T_1 = I$ and $X_1 = 0$, hence the identity is verified. For $k \ge 2$, note that $X_{k+1} = B^* X_k + H^* H T_k$; then by (ii), $X_{k+1} = X_{k+1}^* = X_k B + T_k^* H^* H$. On the other hand, from (i) we get that $X_{k+1} = X_k + U_k$. Thus,
$$X_k + U_k = X_k B + T_k^* H^* H, \quad \text{or equivalently,} \quad U_k = X_k(B - I) + T_k^* H^* H.$$
Finally, since $(B - I)T_k = B^k - I$,
$$U_k T_k = X_k(B - I)T_k + T_k^* H^* H T_k = X_k(B^k - I) + T_k^* H^* H T_k,$$
which is exactly (41).
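Identity (41) is purely algebraic, so it can be checked numerically on hypothetical random matrices ($B$ need not even be a contraction here).

```python
import numpy as np

rng = np.random.default_rng(3)
n_u, n_f = 4, 3
B = rng.standard_normal((n_u, n_u))
H = rng.standard_normal((n_f, n_u))
Bp = np.linalg.matrix_power     # B is real, so B* = B.T below

for k in range(1, 6):
    T = sum(Bp(B, j) for j in range(k))                                  # T_k
    U = sum(Bp(B.T, i) @ H.T @ H @ Bp(B, k - 1 - i) for i in range(k))   # U_k
    X = sum((Bp(B.T, i) @ H.T @ H @ Bp(B, l - i)
             for l in range(k - 1) for i in range(l + 1)),
            np.zeros((n_u, n_u)))                                        # X_k
    lhs = U @ T - X @ Bp(B, k) + X
    rhs = T.T @ H.T @ H @ T
    assert np.allclose(lhs, rhs)                                         # (41)
```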
Now, we consider the errors $(\sigma^n - \sigma_{ex}, u^n - u(\sigma_{ex}), p^n - p(\sigma_{ex}))$ with respect to the exact solution at the $n$-th iteration, and, by abuse of notation, we designate them by $(\sigma^n, u^n, p^n)$. We obtain that the errors satisfy: for the shifted algorithm (38)
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = B^k u^n + T_k M\sigma^n, \\ p^{n+1} = (B^*)^k p^n + U_k u^n + X_k M\sigma^n, \end{cases} \qquad (42)$$
and for algorithm (37)
$$\begin{cases} \sigma^{n+1} = \sigma^n - \tau M^* p^n, \\ u^{n+1} = B^k u^n + T_k M\sigma^n - \tau T_k MM^* p^n, \\ p^{n+1} = [(B^*)^k - \tau X_k MM^*] p^n + U_k u^n + X_k M\sigma^n, \end{cases} \qquad (43)$$
or equivalently, by putting in evidence the block iteration matrices,
$$\begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} (B^*)^k & U_k & X_k M \\ 0 & B^k & T_k M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix} \qquad (44)$$
and
$$\begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} (B^*)^k - \tau X_k MM^* & U_k & X_k M \\ -\tau T_k MM^* & B^k & T_k M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \qquad (45)$$
Now recall that a fixed point iteration converges if and only if the spectral radius of its iteration
matrix is strictly less than 1. Therefore in the following propositions we establish eigenvalue
equations for the iteration matrix of the two methods.
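This criterion is easy to test in practice. In the following hypothetical sketch (random matrices of our choosing, numpy only, and a descent step $\tau$ deliberately taken very small), we assemble the block iteration matrix of the $k$-step one-shot error recursion (45) and check the spectral-radius condition:

```python
import numpy as np

rng = np.random.default_rng(1)
n_u, n_s, n_f, k, tau = 6, 2, 3, 3, 1e-3

B = rng.standard_normal((n_u, n_u)); B *= 0.3 / np.linalg.norm(B, 2)  # ||B|| = 0.3
M = 0.5 * rng.standard_normal((n_u, n_s))
H = 0.5 * rng.standard_normal((n_f, n_u))

mp = np.linalg.matrix_power
Tk = sum(mp(B, j) for j in range(k))
Uk = sum(mp(B.T, i) @ H.T @ H @ mp(B, k - 1 - i) for i in range(k))
Xk = sum(sum(mp(B.T, i) @ H.T @ H @ mp(B, l - i) for i in range(l + 1))
         for l in range(k - 1))
Bk, BsK = mp(B, k), mp(B.T, k)

# block iteration matrix of the k-step one-shot error recursion (45)
A = np.block([
    [BsK - tau * Xk @ M @ M.T, Uk,                   Xk @ M],
    [-tau * Tk @ M @ M.T,      Bk,                   Tk @ M],
    [-tau * M.T,               np.zeros((n_s, n_u)), np.eye(n_s)],
])
rho = max(abs(np.linalg.eigvals(A)))
print(rho < 1)  # the coupled iteration converges iff this spectral radius is < 1
```

At $\tau = 0$ the matrix has the eigenvalue 1 (identity block on $\sigma$); the analysis below quantifies how small $\tau$ must be so that all eigenvalues move strictly inside the unit disk.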
Proposition 4.2 (Eigenvalue equation for the shifted $k$-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (44).
(i) If $\lambda \notin \operatorname{Spec}(B^k)$, then there exists $y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
\[
(\lambda - 1)\|y\|^2 + \tau \big\langle M^* [\lambda I - (B^*)^k]^{-1} [(\lambda - 1) X_k + T_k^* H^* H T_k] (\lambda I - B^k)^{-1} M y,\, y \big\rangle = 0. \tag{46}
\]
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.

Proposition 4.3 (Eigenvalue equation for the $k$-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (45).
(i) If $\lambda \notin \operatorname{Spec}(B^k)$, then there exists $y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
\[
(\lambda - 1)\|y\|^2 + \tau \lambda \big\langle M^* [\lambda I - (B^*)^k]^{-1} [(\lambda - 1) X_k + T_k^* H^* H T_k] (\lambda I - B^k)^{-1} M y,\, y \big\rangle = 0. \tag{47}
\]
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.

Remark 4.4. Since $\rho(B)$ is strictly less than 1, so are $\rho(B^*)$, $\rho(B^k)$ and $\rho((B^*)^k)$.
The proofs of Propositions 4.2 and 4.3 are respectively similar to those of Propositions 3.1 and 3.3; the slight difference is that in the calculation we use (41) to simplify some terms.

In the following sections we will show that, for sufficiently small $\tau$, equations (46) and (47) admit no solution $|\lambda| \ge 1$, so that algorithms (10) and (9) converge. When $\lambda \ne 0$, it is convenient to rewrite (46) and (47) respectively as
\[
\lambda^2 (\lambda - 1)\|y\|^2 + \tau \big\langle M^* [I - (B^*)^k/\lambda]^{-1} [(\lambda - 1) X_k + T_k^* H^* H T_k] [I - B^k/\lambda]^{-1} M y,\, y \big\rangle = 0 \tag{48}
\]
and
\[
\lambda (\lambda - 1)\|y\|^2 + \tau \big\langle M^* [I - (B^*)^k/\lambda]^{-1} [(\lambda - 1) X_k + T_k^* H^* H T_k] [I - B^k/\lambda]^{-1} M y,\, y \big\rangle = 0. \tag{49}
\]
The scalar case where $n_u = n_\sigma = n_f = 1$ is analyzed in Appendix C.

Remark 4.5. Note that when $B = 0$ and $k \ge 2$, the shifted $k$-step one-shot and $k$-step one-shot methods are respectively equivalent to the shifted and usual gradient descent methods, therefore we retrieve the same bounds (8) and (7) on the descent step $\tau$ as for those methods.

For the analysis we use auxiliary results proved in Appendix A, and the following bounds for $s(B^k)$, $T_k$, $X_k$.

Lemma 4.6. If $\|B\| < 1$,
\[
s(B^k) \le \frac{1}{1 - \|B\|^k}, \qquad \|T_k\| \le \frac{1 - \|B\|^k}{1 - \|B\|}, \qquad \|X_k\| \le \frac{\|H\|^2 \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big)}{(1 - \|B\|)^2}.
\]

Proof. The bound for $s(B^k)$ is proved using Lemma A.2 and $\|B^k\| \le \|B\|^k$. Next, from (39) we have
\[
\|T_k\| \le 1 + \|B\| + \dots + \|B\|^{k-1} = \frac{1 - \|B\|^k}{1 - \|B\|}.
\]
From (40), if $k \ge 2$ we have
\[
\|X_k\| \le \|H\|^2 \big( \|B\|^{k-2} + \|B\|^{k-3}(1 + \|B\|) + \dots + (1 + \|B\| + \dots + \|B\|^{k-2}) \big)
= \|H\|^2 \big( 1 + 2\|B\| + \dots + (k-1)\|B\|^{k-2} \big) = \frac{\|H\|^2 \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big)}{(1 - \|B\|)^2}. \qquad \square
\]
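As a sanity check of these estimates, the following sketch (our illustration, not from the report) approximates the supremum defining $s(B^k)$ by sampling the unit circle — where, by the maximum principle for the subharmonic function $z \mapsto \|(I - B^k/z)^{-1}\|$ of $1/z$, the supremum over $|z| \ge 1$ is attained — and compares the three quantities with the bounds of Lemma 4.6:

```python
import numpy as np

rng = np.random.default_rng(2)
n_u, n_f, k = 5, 3, 4

B = rng.standard_normal((n_u, n_u)); B *= 0.6 / np.linalg.norm(B, 2)  # ||B|| = 0.6 < 1
H = rng.standard_normal((n_f, n_u))
b = np.linalg.norm(B, 2)

mp = np.linalg.matrix_power
Tk = sum(mp(B, j) for j in range(k))
Xk = sum(sum(mp(B.T, i) @ H.T @ H @ mp(B, l - i) for i in range(l + 1))
         for l in range(k - 1))

# sampled (lower) approximation of s(B^k) on the unit circle
I, Bk = np.eye(n_u), mp(B, k)
s = max(np.linalg.norm(np.linalg.inv(I - Bk / z), 2)
        for z in np.exp(1j * np.linspace(0, 2 * np.pi, 400)))

H2 = np.linalg.norm(H, 2) ** 2
print(s <= 1 / (1 - b**k),
      np.linalg.norm(Tk, 2) <= (1 - b**k) / (1 - b),
      np.linalg.norm(Xk, 2) <= H2 * (1 - k * b**(k-1) + (k-1) * b**k) / (1 - b)**2)
```

Since the sampled maximum underestimates the true supremum, the first comparison is conservative in the right direction.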
4.2 Real eigenvalues

We first find conditions on the descent step $\tau$ such that the real eigenvalues stay inside the unit disk. Recall that we have already proved that $\lambda = 1$ is not an eigenvalue for any $k$.

Proposition 4.7 (shifted $k$-step one-shot method). When $k \ge 2$, there exists $\tau > 0$ sufficiently small such that equation (48) admits no solution $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$. More precisely, take
• $\tau < \dfrac{2}{\|M\|^2 \big( \|H\|^2 \|T_k\|^2 + 2\|X_k\| \big)\, s(B^k)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2} \cdot \frac{2 (1 - \|B\|^k)^2}{(1 - \|B\|^k)^2 + 2\big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big)}.
\]
Proof. When $\lambda \in \mathbb{R}$, equation (48) is rewritten as
\[
\lambda^2 (\lambda - 1)\|y\|^2 + \tau \left\| H T_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y \right\|^2 + \tau (\lambda - 1) \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle = 0.
\]
We show that if $\lambda > 1$ (respectively $\lambda \le -1$) we can choose $\tau$ so that the left-hand side of the above equation is strictly positive (respectively negative). Indeed, if $\lambda > 1$, we choose $\tau$ such that
\[
\lambda^2 \|y\|^2 - \tau \Big| \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle \Big| > 0,
\]
and this can be done by taking $\tau$ such that
\[
\big[ \|X_k\| \|M\|^2 s(B^k)^2 \big] \tau < 1.
\]
If $\lambda \le -1$, we choose $\tau$ such that
\[
\lambda^2 (\lambda - 1)\|y\|^2 + \tau \left\| H T_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y \right\|^2 + \tau (1 - \lambda) \Big| \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle \Big| < 0,
\]
and this can be done by taking $\tau$ such that
\[
\left[ \frac{\|H\|^2 \|T_k\|^2 \|M\|^2 s(B^k)^2}{2} + \|X_k\| \|M\|^2 s(B^k)^2 \right] \tau < 1,
\]
so we obtain the first conclusion. Finally, the second conclusion, in the case $\|B\| < 1$, can be obtained by Lemma 4.6. $\square$
Proposition 4.8 ($k$-step one-shot method). When $k \ge 2$, there exists $\tau > 0$ sufficiently small such that equation (49) admits no solution $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$. More precisely, take
• $\tau < \dfrac{1}{\|X_k\| \|M\|^2 s(B^k)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2} \cdot \frac{(1 - \|B\|^k)^2}{1 - k\|B\|^{k-1} + (k-1)\|B\|^k}.
\]

Proof. When $\lambda \in \mathbb{R}$, equation (49) is rewritten as
\[
\lambda (\lambda - 1)\|y\|^2 + \tau \left\| H T_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y \right\|^2 + \tau (\lambda - 1) \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle = 0.
\]
We show that we can choose $\tau$ so that the left-hand side of the above equation is strictly positive. Indeed, if $\lambda > 1$, we choose $\tau$ such that
\[
\lambda \|y\|^2 - \tau \Big| \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle \Big| > 0,
\]
and this can be done by taking $\tau$ such that
\[
\|X_k\| \|M\|^2 s(B^k)^2\, \tau < 1.
\]
If $\lambda \le -1$, we choose $\tau$ such that
\[
\lambda \|y\|^2 + \tau \Big| \Big\langle M^* \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} X_k \Big[ I - \frac{B^k}{\lambda} \Big]^{-1} M y,\, y \Big\rangle \Big| < 0
\]
(so that multiplication by $\lambda - 1 < 0$ yields a strictly positive contribution), and this is also done by taking $\tau$ such that
\[
\|X_k\| \|M\|^2 s(B^k)^2\, \tau < 1,
\]
so we obtain the first conclusion. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6. $\square$
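For $\|B\| < 1$ the bound of Proposition 4.8 is fully explicit, so it can be evaluated directly. A minimal helper (a hypothetical sketch of ours, numpy only) reads:

```python
import numpy as np

def tau_bound_real(k, B, M, H):
    """Explicit bound of Proposition 4.8 (real eigenvalues, ||B|| < 1):
    tau < (1-b)^2 / (||H||^2 ||M||^2) * (1-b^k)^2 / (1 - k b^{k-1} + (k-1) b^k)."""
    b = np.linalg.norm(B, 2)
    assert b < 1, "the explicit formula requires ||B|| < 1"
    num = (1 - b) ** 2 * (1 - b ** k) ** 2
    den = (np.linalg.norm(H, 2) ** 2 * np.linalg.norm(M, 2) ** 2
           * (1 - k * b ** (k - 1) + (k - 1) * b ** k))
    return num / den

rng = np.random.default_rng(3)
B = rng.standard_normal((6, 6)); B *= 0.3 / np.linalg.norm(B, 2)
M = rng.standard_normal((6, 2)); H = rng.standard_normal((3, 6))
print(tau_bound_real(4, B, M, H) > 0)
```

Note that the denominator factor $1 - k b^{k-1} + (k-1) b^k = (1-b)^2 \sum_{m=1}^{k-1} m\, b^{m-1}$ is positive for $0 < b < 1$, so the bound is always a positive number.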
4.3 Complex eigenvalues

We now look for conditions on the descent step $\tau$ such that also the complex eigenvalues stay inside the unit disk. We first deal with the shifted $k$-step one-shot method.

Proposition 4.9 (shifted $k$-step one-shot method). When $k \ge 2$, there exists $\tau > 0$ sufficiently small such that equation (48) admits no solution $\lambda \in \mathbb{C} \setminus \mathbb{R}$, $|\lambda| \ge 1$. In particular, if $\|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 < \frac{\pi}{6}$, take
\[
\tau < \frac{\min\{\chi_1(k, \|B\|),\, \chi_2(k, \|B\|),\, \chi_3(k, \|B\|),\, \chi_4(k, \|B\|)\}}{\|H\|^2 \|M\|^2}
\]
where
\[
\chi_1(k, b) = \frac{(1-b)^2 (1-b^k)^2}{4 b^{2k} + \sqrt{2}\, \big(1 - k b^{k-1} + (k-1) b^k\big)(1 + b^k)^2},
\]
\[
\chi_2(k, b) = \frac{(1-b)^2 (1-b^k)^2}{\left[ \frac{1}{2 \sin(\theta_0/2)} (1-b^k)^2 + \sqrt{2}\, \big(1 - k b^{k-1} + (k-1) b^k\big) \right] (1 + b^k)^2},
\]
\[
\chi_3(k, b) = \frac{(1-b)^2 (1-b^k)^2}{\frac{2 c \sin(\theta_0/2)}{\delta_0}\, b^{2k} + \big(1 - k b^{k-1} + (k-1) b^k\big) \left[ \frac{c}{\delta_0} (1 + b^{2k}) + 2 \max\left\{ \frac{c}{\delta_0}, \frac{c}{\cos 3\theta_0} \right\} b^k \right]},
\]
\[
\chi_4(k, b) = \frac{\left[ \sin\big( \frac{\pi}{2} - 3\theta_0 \big) + \cos 2\theta_0 \right] (1-b)^2 (1-b^k)^2}{(1-b^k)^2 + 2 \big(1 - k b^{k-1} + (k-1) b^k\big)(1 + b^k)^2}
\]
and $c = \dfrac{1 + 2\delta_0 \sin \frac{5\theta_0}{2} + \delta_0^2}{\cos^2 \frac{5\theta_0}{2}}$.
Proof. Step 1. Rewrite equation (48) so that we can study its real and imaginary parts.
Let $\lambda = R(\cos\theta + i \sin\theta)$ in polar form, where $R = |\lambda| \ge 1$ and $\theta \in (-\pi, \pi)$. Write $\frac{1}{\lambda} = r(\cos\varphi + i \sin\varphi)$ in polar form, where $r = 1/|\lambda| = 1/R \le 1$ and $\varphi = -\theta \in (-\pi, \pi)$. By Lemma A.3 applied to $T = B^k$, we have
\[
\Big[ I - \frac{B^k}{\lambda} \Big]^{-1} = P(\lambda) + i Q(\lambda), \qquad \Big[ I - \frac{(B^*)^k}{\lambda} \Big]^{-1} = P(\lambda)^* + i Q(\lambda)^*,
\]
where $P(\lambda)$ and $Q(\lambda)$ are $\mathbb{C}^{n_u \times n_u}$-valued functions and, omitting the dependence on $\lambda$,
\[
\|P\| \le p := \begin{cases} (1 + \|B^k\|)\, s(B^k)^2 & \text{for general } B, \\ \dfrac{1}{1 - \|B\|^k} & \text{when } \|B\| < 1; \end{cases} \tag{50}
\]
\[
\|Q\| \le q_1 := \begin{cases} \|B^k\|\, s(B^k)^2 & \text{for general } B, \\ \dfrac{\|B\|^k}{1 - \|B\|^k} & \text{when } \|B\| < 1; \end{cases} \tag{51}
\]
\[
\|Q\| \le q_2 |\sin\theta|, \qquad q_2 := \begin{cases} \|B^k\|\, s(B^k)^2 & \text{for general } B, \\ \dfrac{\|B\|^k}{(1 - \|B\|^k)^2} & \text{when } \|B\| < 1. \end{cases} \tag{52}
\]
Now we rewrite (48) as
\[
\lambda^2 (\lambda - 1)\|y\|^2 + \tau\, G(P - iQ, P + iQ) + \tau (\lambda - 1)\, L(P - iQ, P + iQ) = 0, \tag{53}
\]
where
\[
G(X, Y) = \langle M^* X^* T_k^* H^* H T_k\, Y M y,\, y \rangle, \qquad L(X, Y) = \langle M^* X^* X_k\, Y M y,\, y \rangle
\]
for $X, Y \in \mathbb{C}^{n_u \times n_u}$. $G$ satisfies the following properties:
• $\forall X, Y_1, Y_2 \in \mathbb{C}^{n_u \times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(X, z_1 Y_1 + z_2 Y_2) = z_1 G(X, Y_1) + z_2 G(X, Y_2)$.
• $\forall X_1, X_2, Y \in \mathbb{C}^{n_u \times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(z_1 X_1 + z_2 X_2, Y) = \bar{z}_1 G(X_1, Y) + \bar{z}_2 G(X_2, Y)$.
• $\forall X \in \mathbb{C}^{n_u \times n_u}$: $G(X, X) \in \mathbb{R}$.
• $\forall X, Y \in \mathbb{C}^{n_u \times n_u}$: $G(X, Y) + G(Y, X) \in \mathbb{R}$; indeed
\[
G(X, Y) = \langle M^* X^* T_k^* H^* H T_k Y M y, y \rangle = \overline{\langle y,\, M^* X^* T_k^* H^* H T_k Y M y \rangle} = \overline{\langle M^* Y^* T_k^* H^* H T_k X M y,\, y \rangle} = \overline{G(Y, X)}.
\]
Similarly, $L$ has the same properties as $G$ (note that $X_k^* = X_k$ by Lemma 4.1). With these properties of $G$ and $L$, we expand (53) and take its real and imaginary parts, so we respectively obtain
\[
\Re(\lambda^3 - \lambda^2)\|y\|^2 + \tau G_1 + \tau \big[ \Re(\lambda - 1) L_1 - \Im(\lambda - 1) L_2 \big] = 0 \tag{54}
\]
and
\[
\Im(\lambda^3 - \lambda^2)\|y\|^2 + \tau G_2 + \tau \big[ \Im(\lambda - 1) L_1 + \Re(\lambda - 1) L_2 \big] = 0, \tag{55}
\]
where
\[
G_1 = G(P, P) - G(Q, Q), \qquad G_2 = G(P, Q) + G(Q, P),
\]
\[
L_1 = L(P, P) - L(Q, Q), \qquad L_2 = L(P, Q) + L(Q, P).
\]
Step 2. Find a suitable combination of equations (54) and (55), and choose $\tau$ so that we obtain a new equation whose left-hand side is strictly positive/negative.
Let $\gamma = \gamma(\lambda) \in \mathbb{R}$ be defined by cases as in Lemma A.4. Multiplying equation (55) by $\gamma$ and summing it with equation (54), we obtain
\[
\big[ \Re(\lambda^3 - \lambda^2) + \gamma \Im(\lambda^3 - \lambda^2) \big]\|y\|^2 + \tau G(P + \gamma Q, P + \gamma Q) - (1 + \gamma^2)\tau G(Q, Q)
+ \tau \big( [\Re(\lambda - 1) + \gamma \Im(\lambda - 1)] L_1 + [\gamma \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big) = 0. \tag{56}
\]
Now we prepare some useful estimates.
• $\forall X \in \mathbb{C}^{n_u \times n_u}$: $0 \le G(X, X) = \|H T_k X M y\|^2 \le (\|H\| \|T_k\| \|M\| \|X\|)^2 \|y\|^2$.
• Since $\|Q\| \le q_1$ and $\|Q\| \le q_2 |\sin\theta|$, we have
\[
G(Q, Q) \le (\|H\| \|T_k\| \|M\| q_1)^2 \|y\|^2 \quad \text{and} \quad G(Q, Q) \le (\|H\| \|T_k\| \|M\| q_2 |\sin\theta|)^2 \|y\|^2.
\]
• By the Cauchy–Schwarz inequality we have
\[
|\Re(\lambda - 1) + \gamma \Im(\lambda - 1)| \le \sqrt{1 + \gamma^2}\, |\lambda - 1|, \qquad |\gamma \Re(\lambda - 1) - \Im(\lambda - 1)| \le \sqrt{1 + \gamma^2}\, |\lambda - 1|.
\]
• $\forall X, Y \in \mathbb{C}^{n_u \times n_u}$: $|L(X, Y)| = |\langle M^* X^* X_k Y M y, y \rangle| \le \|X_k\| \|M\|^2 \|X\| \|Y\| \|y\|^2$. Hence
\[
|L_1| = |L(P, P) - L(Q, Q)| \le |L(P, P)| + |L(Q, Q)| \le \|X_k\| \|M\|^2 (\|P\|^2 + \|Q\|^2) \|y\|^2 \le \|X_k\| \|M\|^2 (p^2 + q_1^2) \|y\|^2,
\]
\[
|L_2| = |L(P, Q) + L(Q, P)| \le |L(P, Q)| + |L(Q, P)| \le 2 \|X_k\| \|M\|^2 \|P\| \|Q\| \|y\|^2 \le 2 \|X_k\| \|M\|^2 p q_1 \|y\|^2,
\]
and then
\[
\big| [\Re(\lambda - 1) + \gamma \Im(\lambda - 1)] L_1 + [\gamma \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big|
\le |\Re(\lambda - 1) + \gamma \Im(\lambda - 1)|\, |L_1| + |\gamma \Re(\lambda - 1) - \Im(\lambda - 1)|\, |L_2|
\]
\[
\le \sqrt{1 + \gamma^2}\, |\lambda - 1|\, \|X_k\| \|M\|^2 (p^2 + q_1^2 + 2 p q_1) \|y\|^2 = \sqrt{1 + \gamma^2}\, |\lambda - 1|\, \|X_k\| \|M\|^2 (p + q_1)^2 \|y\|^2.
\]
Now we consider four cases for $\lambda$, as in Lemma A.4:
• Case 1: $\Re(\lambda^3 - \lambda^2) \ge 0$;
• Case 2: $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in [\theta_0, \pi - \theta_0] \cup [-\pi + \theta_0, -\theta_0]$, for fixed $0 < \theta_0 < \frac{\pi}{6}$;
• Case 3: $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in (-\theta_0, \theta_0)$, for fixed $0 < \theta_0 < \frac{\pi}{6}$;
• Case 4: $\Re(\lambda^3 - \lambda^2) < 0$ and $\theta \in (\pi - \theta_0, \pi) \cup (-\pi, -\pi + \theta_0)$, for fixed $0 < \theta_0 < \frac{\pi}{6}$.
The four cases are treated in the following four lemmas (Lemmas 4.10–4.13), which together give the statement of this proposition.

Lemma 4.10 (Case 1). For $k \ge 2$, equation (48) admits no solution $\lambda$ in Case 1 if we take
• $\tau < \dfrac{1}{\big[ 4 \|H\|^2 \|M\|^2 \|T_k\|^2 \|B^k\|^2 + \sqrt{2}\, \|M\|^2 \|X_k\| (1 + 2\|B^k\|)^2 \big]\, s(B^k)^4}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2} \cdot \frac{(1 - \|B\|^k)^2}{4 \|B\|^{2k} + \sqrt{2}\, \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big)(1 + \|B\|^k)^2}.
\]
Proof. Writing (56) for $\gamma = \gamma_1$ as in Lemma A.4 (i) (in particular $\gamma_1^2 = 1$), we have
\[
\big[ \Re(\lambda^3 - \lambda^2) + \gamma_1 \Im(\lambda^3 - \lambda^2) \big]\|y\|^2 + \tau G(P + \gamma_1 Q, P + \gamma_1 Q) - 2\tau G(Q, Q)
+ \tau \big( [\Re(\lambda - 1) + \gamma_1 \Im(\lambda - 1)] L_1 + [\gamma_1 \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big) = 0. \tag{57}
\]
Since $G(P + \gamma_1 Q, P + \gamma_1 Q) \ge 0$, and by estimating
\[
G(Q, Q) \le (\|H\| \|T_k\| \|M\| q_2 |\sin\theta|)^2 \|y\|^2,
\]
\[
\big| [\Re(\lambda - 1) + \gamma_1 \Im(\lambda - 1)] L_1 + [\gamma_1 \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big| \le \sqrt{2}\, |\lambda - 1|\, \|X_k\| \|M\|^2 (p + q_1)^2 \|y\|^2,
\]
by Lemma A.4 (i) the left-hand side of (57) will be strictly positive if $\tau$ satisfies
\[
\left[ 2 (\|H\| \|T_k\| \|M\| q_2)^2 \frac{|\sin\theta|^2}{|\lambda - 1|} + \sqrt{2}\, \|X_k\| \|M\|^2 (p + q_1)^2 \right] \tau < 1.
\]
Since $\dfrac{|\sin\theta|^2}{|\lambda - 1|} \le \dfrac{|\sin\theta|^2}{2 |\sin(\theta/2)|} = 2 \big| \sin\tfrac{\theta}{2} \big| \cos^2\tfrac{\theta}{2} \le 2$, we obtain the first part of the conclusion using the definitions (50), (51), (52) of $p, q_1, q_2$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6. $\square$
Lemma 4.11 (Case 2). For $k \ge 2$, equation (48) admits no solution $\lambda$ in Case 2 if we take
• $\tau < \dfrac{1}{\left[ \frac{1}{2 \sin(\theta_0/2)} \|H\|^2 \|M\|^2 \|T_k\|^2 + \sqrt{2}\, \|M\|^2 \|X_k\| \right] (1 + 2\|B^k\|)^2\, s(B^k)^4}$ if the denominator of the right-hand side is not 0;
• any $\tau$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2} \cdot \frac{(1 - \|B\|^k)^2}{\left[ \frac{1}{2 \sin(\theta_0/2)} (1 - \|B\|^k)^2 + \sqrt{2}\, \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big) \right] (1 + \|B\|^k)^2}.
\]

Proof. Writing (56) for $\gamma = \gamma_2$ as in Lemma A.4 (ii) (in particular $\gamma_2^2 = 1$), we have
\[
\big[ \Re(\lambda^3 - \lambda^2) + \gamma_2 \Im(\lambda^3 - \lambda^2) \big]\|y\|^2 + \tau G(P + \gamma_2 Q, P + \gamma_2 Q) - 2\tau G(Q, Q)
+ \tau \big( [\Re(\lambda - 1) + \gamma_2 \Im(\lambda - 1)] L_1 + [\gamma_2 \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big) = 0. \tag{58}
\]
Since $G(Q, Q) \ge 0$, and by estimating $\|P + \gamma_2 Q\| \le \|P\| + |\gamma_2| \|Q\| = \|P\| + \|Q\| \le p + q_1$, so that
\[
G(P + \gamma_2 Q, P + \gamma_2 Q) \le \big[ \|H\| \|T_k\| \|M\| (p + q_1) \big]^2 \|y\|^2,
\]
and
\[
\big| [\Re(\lambda - 1) + \gamma_2 \Im(\lambda - 1)] L_1 + [\gamma_2 \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big| \le \sqrt{2}\, |\lambda - 1|\, \|X_k\| \|M\|^2 (p + q_1)^2 \|y\|^2,
\]
by Lemma A.4 (ii) the left-hand side of (58) will be strictly negative if $\tau$ satisfies
\[
\left[ \big[ \|H\| \|T_k\| \|M\| (p + q_1) \big]^2 \frac{1}{|\lambda - 1|} + \sqrt{2}\, \|X_k\| \|M\|^2 (p + q_1)^2 \right] \tau < 1.
\]
Since $\dfrac{1}{|\lambda - 1|} \le \dfrac{1}{2 \sin(\theta_0/2)}$, we obtain the first part of the conclusion using the definitions (50), (51) of $p, q_1$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6. $\square$
Lemma 4.12 (Case 3). Let $\delta_0 > 0$ be fixed and $c := \dfrac{1 + 2\delta_0 \sin\frac{5\theta_0}{2} + \delta_0^2}{\cos^2\frac{5\theta_0}{2}}$. For $k \ge 2$, equation (48) admits no solution $\lambda$ in Case 3 if we take
• if the denominator below is not 0,
\[
\tau < \left[ \left( \frac{2 c \sin\frac{\theta_0}{2}}{\delta_0} \|H\|^2 \|M\|^2 \|T_k\|^2 \|B^k\|^2 + \frac{c}{\delta_0} \|M\|^2 \|X_k\| \big(1 + 2\|B^k\| + 2\|B^k\|^2\big)
+ 2 \max\left\{ \frac{c}{\delta_0}, \frac{c}{\cos 3\theta_0} \right\} \|M\|^2 \|X_k\| \big(\|B^k\| + \|B^k\|^2\big) \right) s(B^k)^4 \right]^{-1};
\]
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2}\, (1 - \|B\|^k)^2 \left[ \frac{2 c \sin\frac{\theta_0}{2}}{\delta_0} \|B\|^{2k} + \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big) \left( \frac{c}{\delta_0} (1 + \|B\|^{2k}) + 2 \max\left\{ \frac{c}{\delta_0}, \frac{c}{\cos 3\theta_0} \right\} \|B\|^k \right) \right]^{-1}.
\]

Proof. Writing (56) for $\gamma = \gamma_3$ as in Lemma A.4 (iii), we have
\[
\big[ \Re(\lambda^3 - \lambda^2) + \gamma_3 \Im(\lambda^3 - \lambda^2) \big]\|y\|^2 + \tau G(P + \gamma_3 Q, P + \gamma_3 Q) - (1 + \gamma_3^2)\tau G(Q, Q)
+ \tau \big( [\Re(\lambda - 1) + \gamma_3 \Im(\lambda - 1)] L_1 + [\gamma_3 \Re(\lambda - 1) - \Im(\lambda - 1)] L_2 \big) = 0. \tag{59}
\]
Since $G(P + \gamma_3 Q, P + \gamma_3 Q) \ge 0$, the left-hand side of (59) will be strictly positive if $\tau$ satisfies
\[
\tau < \frac{1}{\|y\|^2} \left[ \frac{(1 + \gamma_3^2)\, G(Q, Q)}{\Re(\lambda^3 - \lambda^2) + \gamma_3 \Im(\lambda^3 - \lambda^2)} + \frac{|L_1|\, |\Re(\lambda - 1) + \gamma_3 \Im(\lambda - 1)|}{\Re(\lambda^3 - \lambda^2) + \gamma_3 \Im(\lambda^3 - \lambda^2)} + \frac{|L_2|\, |\gamma_3 \Re(\lambda - 1) - \Im(\lambda - 1)|}{\Re(\lambda^3 - \lambda^2) + \gamma_3 \Im(\lambda^3 - \lambda^2)} \right]^{-1}.
\]
By estimating
\[
G(Q, Q) \le (\|H\| \|T_k\| \|M\| q_2 |\sin\theta|)^2 \|y\|^2, \qquad |L_1| \le \|X_k\| \|M\|^2 (p^2 + q_1^2) \|y\|^2, \qquad |L_2| \le 2 \|X_k\| \|M\|^2 p q_1 \|y\|^2,
\]
and using Lemma A.4 (iii), it suffices to choose $\tau$ such that
\[
\left[ (1 + \gamma_3^2) (\|H\| \|T_k\| \|M\| q_2)^2 \frac{2 |\sin\frac{\theta}{2}| \cos^2\frac{\theta}{2}}{\delta_0} + \|X_k\| \|M\|^2 (p^2 + q_1^2) \frac{1 + \gamma_3^2}{\delta_0}
+ 2 \|X_k\| \|M\|^2 p q_1 \max\left\{ \frac{1 + \gamma_3^2}{\delta_0}, \frac{1 + \gamma_3^2}{\cos 3\theta_0} \right\} \right] \tau < 1.
\]
Noting that $c = 1 + \gamma_3^2$, the final result is obtained from the definitions (50), (51), (52) of $p, q_1, q_2$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6. $\square$
Lemma 4.13 (Case 4). For $k \ge 2$, equation (48) admits no solution $\lambda$ in Case 4 if we take
• $\tau < \dfrac{\sin\big( \frac{\pi}{2} - 3\theta_0 \big) + \cos 2\theta_0}{\big[ \|H\|^2 \|M\|^2 \|T_k\|^2 (1 + \|B^k\|)^2 + 2 \|M\|^2 \|X_k\| (1 + 2\|B^k\|)^2 \big]\, s(B^k)^4}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
\[
\tau < \frac{(1 - \|B\|)^2}{\|H\|^2 \|M\|^2} \cdot \frac{\big[ \sin\big( \frac{\pi}{2} - 3\theta_0 \big) + \cos 2\theta_0 \big] (1 - \|B\|^k)^2}{(1 - \|B\|^k)^2 + 2 \big(1 - k\|B\|^{k-1} + (k-1)\|B\|^k\big)(1 + \|B\|^k)^2}.
\]

Proof. Here it is enough to consider (54). By the properties of $G$,
\[
G(Q, Q) \ge 0, \qquad G(P, P) \le (\|H\| \|T_k\| \|M\| p)^2 \|y\|^2,
\]
and by Lemma A.4 (iv), the left-hand side of (54) will be strictly negative if $\tau$ satisfies
\[
\left[ (\|H\| \|T_k\| \|M\| p)^2 \frac{1}{\sin(\frac{\pi}{2} - 3\theta_0) + \cos 2\theta_0} + \|X_k\| \|M\|^2 (p + q_1)^2 \frac{2}{\sin(\frac{\pi}{2} - 3\theta_0) + \cos 2\theta_0} \right] \tau < 1.
\]
The definitions (50), (51) of $p, q_1$ lead to the final result. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6. $\square$
Similarly, with the help of Lemma A.5, we prove the analogue of Proposition 4.9 for the $k$-step one-shot method. In particular, note that here only three cases for $\lambda$ need to be considered, because the analogue of the fourth one is excluded by Lemma A.5 (iv).

Proposition 4.14 ($k$-step one-shot method). There exists $\tau > 0$ sufficiently small such that equation (49) admits no solution $\lambda \in \mathbb{C} \setminus \mathbb{R}$, $|\lambda| \ge 1$. In particular, if $\|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 < \frac{\pi}{4}$, take
\[
\tau < \frac{\min\{\psi_1(k, \|B\|),\, \psi_2(k, \|B\|),\, \psi_3(k, \|B\|)\}}{\|H\|^2 \|M\|^2}
\]
where
\[
\psi_1(k, b) = \frac{(1-b)^2 (1-b^k)^2}{4 b^{2k} + \sqrt{2}\, \big(1 - k b^{k-1} + (k-1) b^k\big)(1 + b^k)^2},
\]
\[
\psi_2(k, b) = \frac{(1-b)^2 (1-b^k)^2}{\left[ \frac{1}{2 \sin(\theta_0/2)} (1-b^k)^2 + \sqrt{2}\, \big(1 - k b^{k-1} + (k-1) b^k\big) \right] (1 + b^k)^2},
\]
\[
\psi_3(k, b) = \frac{(1-b)^2 (1-b^k)^2}{\frac{2 c \sin(\theta_0/2)}{\delta_0}\, b^{2k} + \big(1 - k b^{k-1} + (k-1) b^k\big) \left[ \frac{c}{\delta_0} (1 + b^{2k}) + 2 \max\left\{ \frac{c}{\delta_0}, \frac{c}{\cos 2\theta_0} \right\} b^k \right]}
\]
and $c = \dfrac{1 + 2\delta_0 \sin \frac{3\theta_0}{2} + \delta_0^2}{\cos^2 \frac{3\theta_0}{2}}$.
4.4 Final result ($k \ge 2$)

Considering Remark 4.5, and taking the minimum between the bound in Proposition 4.7 for real eigenvalues and the bound in Proposition 4.9 for complex eigenvalues, we finally obtain a sufficient condition on the descent step $\tau$ ensuring convergence of the shifted multi-step one-shot method.

Theorem 4.15 (Convergence of shifted $k$-step one-shot, $k \ge 2$). Under assumption (4), the shifted $k$-step one-shot method, $k \ge 2$, converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
\[
\tau < \frac{\chi(k, \|B\|)}{\|H\|^2 \|M\|^2},
\]
where $\chi(k, \|B\|)$ is an explicit function of $k$ and $\|B\|$.

Similarly, by combining Remark 4.5 with Propositions 4.8 and 4.14, we obtain a sufficient condition on the descent step $\tau$ ensuring convergence of the multi-step one-shot method.

Theorem 4.16 (Convergence of $k$-step one-shot, $k \ge 2$). Under assumption (4), the $k$-step one-shot method, $k \ge 2$, converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
\[
\tau < \frac{\psi(k, \|B\|)}{\|H\|^2 \|M\|^2},
\]
where $\psi(k, \|B\|)$ is an explicit function of $k$ and $\|B\|$.
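As an illustration of Theorem 4.16, the following self-contained sketch (a tiny toy instance of our choosing, not the experiment of Section 6) runs the $k$-step one-shot iteration — a descent step on $\sigma$ followed by $k$ warm-started fixed-point sweeps on the state and the adjoint — and checks that the parameter error vanishes for a small $\tau$:

```python
import numpy as np

# small deterministic instance: B diagonal, M = H = I, everything well conditioned
n_u, k, tau = 2, 3, 0.02
B = np.diag([0.3, 0.2]); M = np.eye(2); H = np.eye(2); F = np.array([1.0, -1.0])

sig_ex = np.array([0.5, -0.25])
u_ex = np.linalg.solve(np.eye(n_u) - B, M @ sig_ex + F)  # exact forward solution
f = H @ u_ex                                             # noiseless measurement

sig, u, p = np.zeros(2), np.zeros(2), np.zeros(2)
errs = []
for n in range(1000):
    sig = sig - tau * (M.T @ p)          # descent step on the parameter
    for _ in range(k):                   # k inner fixed-point sweeps, warm started
        u, p = B @ u + M @ sig + F, B.T @ p + H.T @ (H @ u - f)
    errs.append(np.linalg.norm(sig - sig_ex))

print(errs[0] > 1e-2, errs[-1] < 1e-6)   # the parameter error has essentially vanished
```

The chosen $\tau$ is far below the explicit bound of the theorem for this instance, so the coupled iteration contracts at each outer step even though the forward problem is never solved exactly.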
5 Inverse problem with complex forward problem and real parameter

In this section we show that a linear inverse problem with an associated complex forward problem and a real parameter can be transformed into a linear inverse problem which matches the real model at the beginning of Section 2, so that the previous theory applies. More precisely, here we study the state equation
\[
u = B u + M \sigma + F,
\]
where $u \in \mathbb{C}^{n_u}$, $\sigma \in \mathbb{R}^{n_\sigma}$, $B \in \mathbb{C}^{n_u \times n_u}$, $M \in \mathbb{C}^{n_u \times n_\sigma}$. We measure $H u(\sigma) = f$, where $H \in \mathbb{C}^{n_f \times n_u}$, and we want to recover $\sigma$ from $f$. Using the method of least squares, we consider the cost functional
\[
J(\sigma) := \frac{1}{2} \|H u(\sigma) - f\|^2;
\]
then by the Lagrangian technique with
\[
\mathcal{L}(u, v, \sigma) = \frac{1}{2} \|H u - f\|^2 + \Re \langle B u + M \sigma + F - u,\, v \rangle,
\]
we can define the adjoint state $p = p(\sigma)$ such that
\[
p = B^* p + H^* (H u(\sigma) - f),
\]
which allows us to compute
\[
\nabla J(\sigma) = \Re(M^* p).
\]
By separating the real and imaginary parts of all vectors and matrices, $u = u_1 + i u_2$, $p = p_1 + i p_2$, $B = B_1 + i B_2$, $M = M_1 + i M_2$, $F = F_1 + i F_2$, $H = H_1 + i H_2$, $f = f_1 + i f_2$, we can transform this inverse problem with complex forward problem into the inverse problem with real forward problem introduced at the beginning of Section 2. Indeed, note that $B^* = B_1^\top - i B_2^\top$, $M^* = M_1^\top - i M_2^\top$, $H^* = H_1^\top - i H_2^\top$, so we have
\[
u_1 + i u_2 = (B_1 + i B_2)(u_1 + i u_2) + (M_1 + i M_2)\sigma + (F_1 + i F_2),
\]
\[
p_1 + i p_2 = (B_1^\top - i B_2^\top)(p_1 + i p_2) + (H_1^\top - i H_2^\top)\big[ (H_1 + i H_2)(u_1 + i u_2) - (f_1 + i f_2) \big],
\]
\[
\nabla J(\sigma) = \Re\big[ (M_1^\top - i M_2^\top)(p_1 + i p_2) \big],
\]
which implies
\[
\begin{cases}
u_1 = B_1 u_1 - B_2 u_2 + M_1 \sigma + F_1 \\
u_2 = B_2 u_1 + B_1 u_2 + M_2 \sigma + F_2 \\
p_1 = B_1^\top p_1 + B_2^\top p_2 + (H_1^\top H_1 + H_2^\top H_2) u_1 + (H_2^\top H_1 - H_1^\top H_2) u_2 - (H_1^\top f_1 + H_2^\top f_2) \\
p_2 = -B_2^\top p_1 + B_1^\top p_2 - (H_2^\top H_1 - H_1^\top H_2) u_1 + (H_1^\top H_1 + H_2^\top H_2) u_2 - (H_1^\top f_2 - H_2^\top f_1) \\
\nabla J(\sigma) = M_1^\top p_1 + M_2^\top p_2.
\end{cases}
\]
By setting
\[
\tilde{u} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \quad \tilde{p} = \begin{pmatrix} p_1 \\ p_2 \end{pmatrix}, \quad \tilde{B} = \begin{pmatrix} B_1 & -B_2 \\ B_2 & B_1 \end{pmatrix}, \quad \tilde{M} = \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}, \quad \tilde{F} = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}, \quad \tilde{H} = \begin{pmatrix} H_1 & -H_2 \\ H_2 & H_1 \end{pmatrix}, \quad \tilde{f} = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix},
\]
we have
\[
\tilde{u} = \tilde{B} \tilde{u} + \tilde{M} \sigma + \tilde{F}, \qquad \tilde{p} = \tilde{B}^\top \tilde{p} + \tilde{H}^\top (\tilde{H} \tilde{u} - \tilde{f}), \qquad \nabla J(\sigma) = \tilde{M}^\top \tilde{p},
\]
which has the same structure as the inverse problem at the beginning of Section 2.
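This transformation can be verified mechanically: the real $2 \times 2$-block problem must reproduce the real and imaginary parts of the complex state, adjoint and gradient. A short sketch (our own illustration, random data, numpy only):

```python
import numpy as np

rng = np.random.default_rng(5)
n_u, n_s, n_f = 4, 2, 3
cplx = lambda m, n: rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

B = cplx(n_u, n_u); B *= 0.5 / max(abs(np.linalg.eigvals(B)))
M, H = cplx(n_u, n_s), cplx(n_f, n_u)
F = rng.standard_normal(n_u) + 1j * rng.standard_normal(n_u)
f = rng.standard_normal(n_f) + 1j * rng.standard_normal(n_f)
sig = rng.standard_normal(n_s)               # the parameter stays real

# complex state, adjoint and gradient
u = np.linalg.solve(np.eye(n_u) - B, M @ sig + F)
p = np.linalg.solve(np.eye(n_u) - B.conj().T, H.conj().T @ (H @ u - f))
grad = (M.conj().T @ p).real

# real 2x2-block reformulation
blk = lambda A: np.block([[A.real, -A.imag], [A.imag, A.real]])
Bt, Ht = blk(B), blk(H)
Mt = np.vstack([M.real, M.imag]); Ft = np.concatenate([F.real, F.imag])
ft = np.concatenate([f.real, f.imag])

ut = np.linalg.solve(np.eye(2 * n_u) - Bt, Mt @ sig + Ft)
pt = np.linalg.solve(np.eye(2 * n_u) - Bt.T, Ht.T @ (Ht @ ut - ft))
gradt = Mt.T @ pt

print(np.allclose(ut, np.concatenate([u.real, u.imag])),
      np.allclose(gradt, grad))  # prints True True
```

The state, adjoint and gradient of the real formulation coincide, component by component, with the real and imaginary parts of their complex counterparts.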
Finally, we close this section with two lemmas that match the assumptions of the inverse problem with complex state variable with those of the transformed inverse problem with real state variable.

Lemma 5.1. $\operatorname{Spec}(\tilde{B}) = \operatorname{Spec}(B) \cup \operatorname{Spec}(\bar{B})$.

Proof. By writing
\[
\tilde{B} = \begin{pmatrix} B_1 & -B_2 \\ B_2 & B_1 \end{pmatrix} = \underbrace{\begin{pmatrix} I & I \\ -iI & iI \end{pmatrix}}_{C^{-1}} \begin{pmatrix} B & 0 \\ 0 & \bar{B} \end{pmatrix} \underbrace{\begin{pmatrix} \frac{1}{2} I & \frac{i}{2} I \\ \frac{1}{2} I & -\frac{i}{2} I \end{pmatrix}}_{C}, \tag{60}
\]
we find that $\det(\tilde{B} - \lambda I) = \det(B - \lambda I) \det(\bar{B} - \lambda I)$. The conclusion is then deduced thanks to the fact that $\operatorname{Spec}(\bar{B}) = \overline{\operatorname{Spec}(B)}$. $\square$
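The spectral identity of Lemma 5.1 can be checked numerically in a few lines (our illustration, with a random complex $B$; eigenvalues are matched up to numerical tolerance):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Bt = np.block([[B.real, -B.imag], [B.imag, B.real]])   # the real 2x2-block matrix

ev_B = np.linalg.eigvals(B)
ev_Bt = np.linalg.eigvals(Bt)

# Spec(B~) = Spec(B) U Spec(conj(B)) = Spec(B) U conj(Spec(B))
expected = np.concatenate([ev_B, ev_B.conj()])
ok = all(np.min(np.abs(ev - expected)) < 1e-8 for ev in ev_Bt)
print(ok)  # prints True
```

Since $\tilde{B}$ is real, its spectrum is automatically closed under conjugation, consistently with the union on the right-hand side.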
Lemma 5.2. Assume that $\rho(B) < 1$ and that $H(I - B)^{-1} M$ is injective. Then $\rho(\tilde{B}) < 1$ and $\tilde{H} (\tilde{I} - \tilde{B})^{-1} \tilde{M}$ is injective, where $\tilde{I} \in \mathbb{R}^{2 n_u \times 2 n_u}$ is the identity matrix.

Proof. The previous lemma implies $\rho(\tilde{B}) = \rho(B) < 1$. Therefore $(\tilde{I} - \tilde{B})^{-1}$ is well-defined and, thanks to (60),
\[
(\tilde{I} - \tilde{B})^{-1} = \underbrace{\begin{pmatrix} I & I \\ -iI & iI \end{pmatrix}}_{C^{-1}} \begin{pmatrix} (I - B)^{-1} & 0 \\ 0 & (I - \bar{B})^{-1} \end{pmatrix} \underbrace{\begin{pmatrix} \frac{1}{2} I & \frac{i}{2} I \\ \frac{1}{2} I & -\frac{i}{2} I \end{pmatrix}}_{C}
= \frac{1}{2} \begin{pmatrix} (I - B)^{-1} + (I - \bar{B})^{-1} & i(I - B)^{-1} - i(I - \bar{B})^{-1} \\ -i(I - B)^{-1} + i(I - \bar{B})^{-1} & (I - B)^{-1} + (I - \bar{B})^{-1} \end{pmatrix}.
\]
Now we have
\[
\tilde{H} (\tilde{I} - \tilde{B})^{-1} \tilde{M} = \frac{1}{2} \begin{pmatrix} H_1 & -H_2 \\ H_2 & H_1 \end{pmatrix} \begin{pmatrix} (I - B)^{-1} + (I - \bar{B})^{-1} & i(I - B)^{-1} - i(I - \bar{B})^{-1} \\ -i(I - B)^{-1} + i(I - \bar{B})^{-1} & (I - B)^{-1} + (I - \bar{B})^{-1} \end{pmatrix} \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}
\]
\[
= \frac{1}{2} \begin{pmatrix} H(I - B)^{-1} + \bar{H}(I - \bar{B})^{-1} & i H(I - B)^{-1} - i \bar{H}(I - \bar{B})^{-1} \\ -i H(I - B)^{-1} + i \bar{H}(I - \bar{B})^{-1} & H(I - B)^{-1} + \bar{H}(I - \bar{B})^{-1} \end{pmatrix} \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} H(I - B)^{-1} M + \bar{H}(I - \bar{B})^{-1} \bar{M} \\ -i H(I - B)^{-1} M + i \bar{H}(I - \bar{B})^{-1} \bar{M} \end{pmatrix}.
\]
Now assume that there exists $x \in \mathbb{C}^{n_\sigma}$ such that $\tilde{H} (\tilde{I} - \tilde{B})^{-1} \tilde{M} x = 0$. Then
\[
\begin{cases}
\big[ H(I - B)^{-1} M + \bar{H}(I - \bar{B})^{-1} \bar{M} \big] x = 0 \\
\big[ -i H(I - B)^{-1} M + i \bar{H}(I - \bar{B})^{-1} \bar{M} \big] x = 0
\end{cases}
\]
or, equivalently,
\[
\begin{cases}
\big[ H(I - B)^{-1} M + \bar{H}(I - \bar{B})^{-1} \bar{M} \big] x = 0 \\
\big[ H(I - B)^{-1} M - \bar{H}(I - \bar{B})^{-1} \bar{M} \big] x = 0.
\end{cases}
\]
By summing these two equations we deduce that $H(I - B)^{-1} M x = 0$, hence $x = 0$ thanks to the injectivity of $H(I - B)^{-1} M$. $\square$
6 Numerical experiments

Let us introduce a toy model to illustrate numerically the performance of the different methods. Given $\Omega \subset \mathbb{R}^n$ an open bounded Lipschitz domain, we consider the direct problem for the linearized scattered field $u \in H^2(\Omega)$ given by the Helmholtz equation
\[
\begin{cases}
\operatorname{div}(\tilde{\sigma}_0 \nabla u) + \tilde{k}^2 u = -\operatorname{div}(\sigma \nabla u_0) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases} \tag{61}
\]
where the incident field $u_0 : \Omega \to \mathbb{R}$ satisfies
\[
\begin{cases}
\operatorname{div}(\tilde{\sigma}_0 \nabla u_0) + \tilde{k}^2 u_0 = 0 & \text{in } \Omega, \\
u_0 = f & \text{on } \partial\Omega,
\end{cases} \tag{62}
\]
with the datum $f : \partial\Omega \to \mathbb{R}$. Here $\sigma : \Omega \to \mathbb{R}$ is such that $\sigma|_{\partial\Omega} = 0$; $\tilde{\sigma}_0 = \sigma_0 + \delta \sigma_r$ is a given function with $\delta \ge 0$ and random $\sigma_r$. More precisely, given $\tilde{\sigma}_0$ and $f$, we solve for $u_0 = u_0(f)$ in (62), then insert $u_0$ into (61) to solve for $u = u(\sigma)$. The variational formulations for $u$ and $u_0$ are respectively
\[
\int_\Omega \tilde{\sigma}_0 \nabla u \cdot \nabla v - \int_\Omega \tilde{k}^2 u v = -\int_\Omega \sigma \nabla u_0 \cdot \nabla v, \quad \forall v \in H^1_0(\Omega), \quad \text{and} \quad u = 0 \text{ on } \partial\Omega, \tag{63}
\]
\[
\int_\Omega \tilde{\sigma}_0 \nabla u_0 \cdot \nabla v - \int_\Omega \tilde{k}^2 u_0 v = 0, \quad \forall v \in H^1_0(\Omega), \quad \text{and} \quad u_0 = f \text{ on } \partial\Omega. \tag{64}
\]
We are interested in the inverse problem of finding $\sigma$ from the measurement $H u(\sigma)$, where $H u := \tilde{\sigma}_0 \frac{\partial u}{\partial \nu}$ on $\partial\Omega$. To solve this inverse problem we use the method of least squares. Denoting by $\sigma_{ex}$ the exact $\sigma$ and by $g = \tilde{\sigma}_0 \frac{\partial u(\sigma_{ex})}{\partial \nu}$ the corresponding measurement, we consider the cost functional
\[
J(\sigma) = \frac{1}{2} \|H u(\sigma) - g\|^2_{L^2(\partial\Omega)} = \frac{1}{2} \int_{\partial\Omega} \Big( \tilde{\sigma}_0 \frac{\partial u(\sigma)}{\partial \nu} - g \Big)^2.
\]
The Lagrangian technique allows us to compute the gradient $\nabla_\sigma J(\sigma) = -\nabla u_0 \cdot \nabla p(\sigma)$, where the adjoint state $p = p(\sigma)$ satisfies
\[
\int_\Omega \tilde{\sigma}_0 \nabla p \cdot \nabla v - \int_\Omega \tilde{k}^2 p v = 0, \quad \forall v \in H^1_0(\Omega), \quad \text{and} \quad p = \tilde{\sigma}_0 \frac{\partial u(\sigma)}{\partial \nu} - g \text{ on } \partial\Omega. \tag{65}
\]
By discretizing $u$ with $P_1$ finite elements on a mesh $\mathcal{T}_h^u$ of $\Omega$, and $\sigma$ with $P_0$ finite elements on a coarser mesh $\mathcal{T}_h^\sigma$ of $\Omega$, the discretization of (63) can be written as the linear system $A_1 \vec{u} = A_2 \vec{\sigma}$, where $\vec{u} \in \mathbb{R}^{n_u}$, $\vec{\sigma} \in \mathbb{R}^{n_\sigma}$. More precisely, $A_1$ and $A_2$ are respectively issued from the discretization of $\int_\Omega \tilde{\sigma}_0 \nabla u \cdot \nabla v - \int_\Omega \tilde{k}^2 u v$ and $-\int_\Omega \sigma \nabla u_0 \cdot \nabla v$, where the Dirichlet boundary conditions are imposed by the penalty method. To rewrite the system in the form (1), we consider the naive splitting $A_1 = A_{11} + \delta A_{12}$, where $A_{11}$ and $A_{12}$ are respectively issued from the discretization of $\int_\Omega \sigma_0 \nabla u \cdot \nabla v - \int_\Omega \tilde{k}^2 u v$ and $\int_\Omega \sigma_r \nabla u \cdot \nabla v$. Then we get
\[
\vec{u} = A_{11}^{-1} (-\delta A_{12} \vec{u} + A_2 \vec{\sigma}) \quad \text{and} \quad \vec{u} = 0 \text{ on } \partial\Omega,
\]
and
\[
\vec{p} = A_{11}^{-1} (-\delta A_{12} \vec{p}) \quad \text{and} \quad \vec{p} = H \vec{u} - \vec{g} \text{ on } \partial\Omega,
\]
where $H \in \mathbb{R}^{n_f \times n_u}$ is, by abuse of notation, the discretization of the operator $H$ above. Choosing $\delta$ such that $\delta \|A_{11}^{-1} A_{12}\|_2 < 1$, we consider (3) with $B = -\delta A_{11}^{-1} A_{12}$, $M = A_{11}^{-1} A_2$, $F = 0$. The application of $A_{11}^{-1}$, which has the same size as the matrix $A_1$, is done by a direct solver; more practical fixed point iterations will be investigated in the future.
Figure 1: Domain with six source points for the numerical experiments. The unknown $\sigma$ is supported on the three squares.
We then perform some numerical experiments in FreeFEM [12] with the following setting:
• Wavenumber $\tilde{k} = 2\pi$, $\sigma_0 = 1$, $\delta = 0.01$; $\sigma_r$ is a random real function with range in the interval $[1, 2]$.
• Wavelength $\lambda = \frac{2\pi \sqrt{\sigma_0}}{\tilde{k}} = 1$, mesh size $h = \frac{\lambda}{20} = 0.05$. The domain is the disk shown in Figure 1, where the squares are the support of the function $\sigma$. Here $n_u = 5853$, $n_\sigma = 6$.
• We test with 6 data $f$ given by the zero-order Bessel function of the second kind centered at the points shown in Figure 1, and the cost functional is the normalized sum of the contributions corresponding to the different data.
• We take $\sigma_{ex} = 10$ in every square and 0 otherwise. The initial guess for the inverse problem is 12 in every square and 0 otherwise.
• For the first iteration, we perform a line search to adapt the descent step $\tau$, using a direct solver for the forward and adjoint problems.
• The stopping rule for the outer iteration is based on the relative value of the cost functional and on the relative norm of the gradient, with a tolerance of $10^{-5}$.
Recall that $k$ is the number of inner iterations on the direct and adjoint problems. We are interested in two experiments.

In the first experiment, we study the dependence on the descent step $\tau$. In Figures 2a and 2b we respectively fix $k = 1$ and $k = 2$ and compare the $k$-step one-shot methods with the usual gradient descent method. On the horizontal axis we indicate the (outer) iteration number $n$ in (5) and (9). We can verify that for sufficiently small $\tau$ both one-shot methods converge. In particular, for $\tau = 2$, while gradient descent and 2-step one-shot converge, 1-step one-shot diverges. Oscillations may appear on the convergence curve for certain values of $\tau$, but they gradually vanish when $\tau$ gets smaller. For sufficiently small $\tau$, the convergence curves of both one-shot methods are comparable to that of gradient descent.

In the second experiment, we study the dependence on the number of inner iterations $k$, for fixed $\tau$. First (Figures 2c–2d), we investigate for which $k$ the convergence curve of $k$-step one-shot is comparable with that of usual gradient descent. As in the previous pictures, on the horizontal axis we indicate the (outer) iteration number $n$ in (5) and (9). For $\tau = 2$ (see Figure 2c), we observe that for $k = 3, 4$ the convergence curves of $k$-step one-shot are close to that of usual gradient descent. Note that with 3 inner iterations the $L^2$ error between $u^n$ and the exact solution of the forward problem ranges between $4.3 \cdot 10^{-6}$ and $0.0136$ for different $n$ in (9); in fact this error is rather significant at the beginning, then it tends to decrease as we get closer to convergence for the parameter $\sigma$. Therefore incomplete inner iterations on the forward problem are enough to obtain good precision on the solution of the inverse problem. In the very particular case $\tau = 2.5$ (see Figure 2d), we observe an interesting phenomenon: for $k = 3, 5, 10$, with $k$-step one-shot the cost functional decreases even faster than with usual gradient descent. For bigger $k$, for example $k = 14$, the convergence curve of one-shot is close to that of usual gradient descent, as expected. Next (Figures 2e–2f), since the overall cost of the $k$-step one-shot method increases with $k$, we indicate on the horizontal axis the accumulated inner iteration number, which sums up $k$ from one outer iteration to the next. More precisely, because at the first outer iteration we perform a step search by a direct solver, we set the first accumulated inner iteration number to 1; for the following outer iterations $n \ge 2$, the accumulated inner iteration number is set to $1 + (n - 1)k$. In Figures 2e–2f we replot the results for the converging $k$-step one-shot methods of Figures 2c–2d with respect to the accumulated inner iteration number. For $\tau = 2$ (see Figure 2e), while $k = 2$ presents some oscillations, quite interestingly it appears that $k = 3$ gives a faster decrease of the cost functional than $k = 4$, at least after the first iterations. For $\tau = 2.5$ (see Figure 2f) we observe that $k = 3$ is enough for the decrease of the cost functional, but with some oscillations, and the considered higher $k$ again appears to give a slower decrease. A similar behavior can be observed for the shifted methods in Figure 3.

Finally, we fix two particular values of $\tau$ and compare all the considered methods in Figure 4. We note that the shifted methods present more oscillations than the non-shifted ones, especially for larger $\tau$.
(a) Convergence curves of usual gradient descent and 1-step one-shot for different descent steps $\tau$.
(b) Convergence curves of usual gradient descent and 2-step one-shot for different descent steps $\tau$.
(c) Convergence curves of usual gradient descent and $k$-step one-shot for different $k$ with $\tau = 2$.
(d) Convergence curves of usual gradient descent and $k$-step one-shot for different $k$ with $\tau = 2.5$.
(e) Convergence curves of $k$-step one-shot for different $k$ with $\tau = 2$.
(f) Convergence curves of $k$-step one-shot for different $k$ with $\tau = 2.5$.
Figure 2: Convergence curves of usual gradient descent and $k$-step one-shot.
(a) Convergence curves of shifted gradient descent and shifted 1-step one-shot for different descent steps $\tau$.
(b) Convergence curves of shifted gradient descent and shifted 2-step one-shot for different descent steps $\tau$.
(c) Convergence curves of shifted gradient descent and shifted $k$-step one-shot for different $k$ with $\tau = 0.25$.
(d) Convergence curves of shifted gradient descent and shifted $k$-step one-shot for different $k$ with $\tau = 0.5$.
(e) Convergence curves of shifted $k$-step one-shot for different $k$ with $\tau = 0.25$.
(f) Convergence curves of shifted $k$-step one-shot for different $k$ with $\tau = 0.5$.
Figure 3: Convergence curves of shifted gradient descent and shifted $k$-step one-shot.
(a) Convergence curves with $\tau = 0.5$. (b) Convergence curves with $\tau = 1.3$.
Figure 4: Comparison of usual gradient descent and $k$-step one-shot with shifted gradient descent and shifted $k$-step one-shot.
7 Conclusion

We have proved sufficient conditions on the descent step for the convergence of two variants of multi-step one-shot methods. Although these bounds on the descent step are not optimal, to our knowledge no other bounds that are explicit in the number of inner iterations are available in the literature for multi-step one-shot methods. Furthermore, we have shown in the numerical experiments that very few inner iterations on the forward and adjoint problems are enough to guarantee good convergence of the inversion algorithm.

These encouraging numerical results are preliminary in the sense that the considered fixed point iteration is not a practical one, since it involves a direct solve of a problem of the same size as the original forward problem. In the future we will investigate iterative solvers based on domain decomposition methods (see e.g. [3]), which are well adapted to large-scale problems. In addition, fixed point iterations could be replaced by more efficient Krylov subspace methods, such as conjugate gradient or GMRES.

Another interesting issue is how to adapt the number of inner iterations in the course of the outer iterations. Moreover, building on this linear inverse problem study, we plan to tackle non-linear and time-dependent inverse problems.
References
[1] S. Barnett. Polynomials and linear control systems, volume 77 of Pure Appl. Math. Marcel
Dekker, Inc., New York, NY, 1983.
[2] M. Burger and W. Mühlhuber. Iterative regularization of parameter identification problems
by sequential quadratic programming methods. Inverse Problems, 18:943–969, 2002.
[3] V. Dolean, P. Jolivet, and F. Nataf. An Introduction to Domain Decomposition Methods: Algorithms, Theory, and Parallel Implementation. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2015.
[4] N. Gauger, A. Griewank, A. Hamdi, C. Kratzenstein, E. Özkaya, and T. Slawig. Automated extension of fixed point PDE solvers for optimal design with bounded retardation. In Constrained Optimization and Optimal Control for Partial Differential Equations, International Series of Numerical Mathematics, pages 99–122. Springer Basel, 2012.
[5] A. Greenbaum. Iterative Methods for Solving Linear Systems. Number 17 in Frontiers in
Applied Mathematics. Soc. for Industrial and Applied Math, Philadelphia, 1997.
[6] A. Griewank. Projected Hessians for Preconditioning in One-Step One-Shot Design Optimization. In Large-Scale Nonlinear Optimization, volume 83, pages 151–171. Springer US, Boston, MA, 2006. Series Title: Nonconvex Optimization and Its Applications.
[7] S. Günther, N. R. Gauger, and Q. Wang. Simultaneous single-step one-shot optimization with unsteady PDEs. Journal of Computational and Applied Mathematics, 294:12–22, 2016.
[8] E. Haber and U. M. Ascher. Preconditioned all-at-once methods for large, sparse parameter
estimation problems. Inverse Problems, 17(6):1847–1864, 2001.
[9] A. Hamdi and A. Griewank. Reduced quasi-Newton method for simultaneous design and
optimization. Computational Optimization and Applications, 49(3):521–548, 2009.
[10] A. Hamdi and A. Griewank. Properties of an augmented Lagrangian for design optimization.
Optimization Methods and Software, 25(4):645–664, 2010.
[11] S.B. Hazra, V. Schulz, J. Brezillon, and N.R. Gauger. Aerodynamic shape optimization
using simultaneous pseudo-timestepping. Journal of Computational Physics, 204(1):46–64,
2005.
[12] F. Hecht. New development in FreeFem++. J. Numer. Math., 20(3-4):251–265, 2012.
[13] E.I. Jury. On the roots of a real polynomial inside the unit circle and a stability criterion for
linear discrete systems. IFAC Proceedings Volumes, 1(2):142–153, 1963. 2nd International
IFAC Congress on Automatic and Remote Control: Theory, Basle, Switzerland, 1963.
[14] E.I. Jury. Theory and Applications of the Z-Transform Method. John Wiley & Sons, New York, 1964.
[15] B. Kaltenbacher, A. Kirchner, and B. Vexler. Goal oriented adaptivity in the IRGNM for
parameter identification in PDEs II: all-at-once formulations. Inverse Problems, 30:045002,
2014.
[16] M. Marden. The geometry of the zeros of a polynomial in a complex variable, volume 3 of
Math. Surv. American Mathematical Society (AMS), Providence, RI, 1949.
[17] M. Marden. Geometry of Polynomials. Number 3 in Mathematical Surveys and Monographs.
American Math. Soc, Providence, RI, 2nd edition, 1966.
[18] E. Özkaya and N. R. Gauger. Single-step One-shot Aerodynamic Shape Optimization. In
Optimal Control of Coupled Systems of Partial Differential Equations, volume 158, pages
191–204. Birkhäuser Basel, Basel, 2009. Series Title: International Series of Numerical
Mathematics.
[19] V. Schulz and I. Gherman. One-Shot Methods for Aerodynamic Shape Optimization. In MEGADESIGN and MegaOpt - German Initiatives for Aerodynamic Simulation and Optimization in Aircraft Design, volume 107, pages 207–220. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009. Series Title: Notes on Numerical Fluid Mechanics and Multidisciplinary Design.
[20] I. Schur. Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. Journal für
die reine und angewandte Mathematik (Crelles Journal), 1917(147):205–232, 1917.
[21] A. Shenoy, M. Heinkenschloss, and E. M. Cliff. Airfoil design by an all-at-once method.
International Journal of Computational Fluid Dynamics, 11(1-2):3–25, 1998.
[22] S. Ta’asan. "One Shot" Methods for Optimal Control of Distributed Parameter Systems I:
Finite Dimensional Control. Technical Report 91-2, ICASE, Hampton, 1991.
[23] S. Ta’asan, G. Kuruvila, and M. Salas. Aerodynamic design and optimization in one shot. In
30th Aerospace Sciences Meeting and Exhibit, Reno, NV, U.S.A., 1992. American Institute
of Aeronautics and Astronautics.
[24] A. Tarantola and B. Valette. Generalized nonlinear inverse problems solved using the least
squares criterion. Reviews of Geophysics, 20(2):219–232, 1982.
[25] T. van Leeuwen and F. J. Herrmann. Mitigating local minima in full-waveform inversion by
expanding the search space. Geophysical Journal International, 195(1):661–667, 2013.
[26] T. van Leeuwen and F. J. Herrmann. A penalty method for PDE-constrained optimization
in inverse problems. Inverse Problems, 32(1):015007, 2015.
RR n°9477
36 M. Bonazzoli & H. Haddar & T. A. Vu
A Some useful lemmas
We state auxiliary results about matrices like those appearing in the eigenvalue equations (24),
(25), (48), (49).
Lemma A.1. Let $(\mathbb{C}^{n\times n}, \|\cdot\|)$ be a normed space and $T \in \mathbb{C}^{n\times n}$. If $\rho(T) < 1$, then $\sum_{k=0}^{\infty} T^k$ converges and
\[ \sum_{k=0}^{\infty} T^k = (I - T)^{-1}. \]
Moreover, if $\|T\| < 1$,
\[ \left\|(I - T)^{-1}\right\| \le \frac{1}{1 - \|T\|}. \]
Lemma A.2. Let $T \in \mathbb{C}^{n\times n}$ such that $\rho(T) < 1$. Set
\[ s(T) := \sup_{z \in \mathbb{C},\ |z| \ge 1} \left\| (I - T/z)^{-1} \right\| \tag{66} \]
then $0 < s(T) < +\infty$. Moreover, if $\|T\| < 1$, $0 < s(T) \le \frac{1}{1 - \|T\|}$.

Proof. The functional $z \mapsto \left\|(I - T/z)^{-1}\right\|$, with $z \in \mathbb{C}$, $|z| \ge 1$, is well-defined and continuous, and we use Lemma A.1.
The following lemma says that, for $T \in \mathbb{C}^{n\times n}$ and $\lambda \in \mathbb{C}$, $|\lambda| \ge 1$, we can decompose
\[ \left(I - \frac{T}{\lambda}\right)^{-1} = P(\lambda) + iQ(\lambda) \quad\text{and}\quad \left(I - \frac{T}{\bar\lambda}\right)^{-1} = P(\lambda) - iQ(\lambda) \]
and gives bounds for $P(\lambda)$ and $Q(\lambda)$.
Lemma A.3. Let $T \in \mathbb{C}^{n\times n}$ such that $\rho(T) < 1$ and $\lambda \in \mathbb{C}$, $|\lambda| \ge 1$. Write $\frac{1}{\lambda} = r(\cos\varphi + i\sin\varphi)$ in polar form, where $0 < r \le 1$ and $\varphi \in [-\pi, \pi]$. Then
\[ \left(I - \frac{T}{\lambda}\right)^{-1} = P(\lambda) + iQ(\lambda) \quad\text{and}\quad \left(I - \frac{T}{\bar\lambda}\right)^{-1} = P(\lambda) - iQ(\lambda) \]
where
\[ P(\lambda) = (I - r\cos\varphi\, T)(I - 2r\cos\varphi\, T + r^2T^2)^{-1}, \quad Q(\lambda) = r\sin\varphi\, T\,(I - 2r\cos\varphi\, T + r^2T^2)^{-1} \]
are $\mathbb{C}^{n\times n}$-valued functions. We also have the following properties:

(i) $\|P(\lambda)\| \le (1 + \|T\|)\,s(T)^2$ and $\|Q(\lambda)\| \le |\sin\varphi|\,\|T\|\,s(T)^2 \le \|T\|\,s(T)^2$.

(ii) Moreover, if $\|T\| < 1$ then
\[ \|P(\lambda)\| \le \frac{1}{1 - \|T\|} \quad\text{and}\quad \|Q(\lambda)\| \le \frac{\|T\|}{1 - \|T\|}. \]
Proof. The first part of the lemma is verified by direct computation, using
\[ \left(I - \frac{T}{\lambda}\right)^{-1} = \left(I - \frac{T}{\bar\lambda}\right)\left[\left(I - \frac{T}{\bar\lambda}\right)\left(I - \frac{T}{\lambda}\right)\right]^{-1}, \qquad \left(I - \frac{T}{\bar\lambda}\right)^{-1} = \left[\left(I - \frac{T}{\bar\lambda}\right)\left(I - \frac{T}{\lambda}\right)\right]^{-1}\left(I - \frac{T}{\lambda}\right) \]
and
\[ \left(I - \frac{T}{\bar\lambda}\right)\left(I - \frac{T}{\lambda}\right) = I - 2r\cos\varphi\, T + r^2T^2. \]
After that, with the help of Lemma A.2, it is not difficult to show the inequalities in (i). To prove (ii), first observe that the two series
\[ \sum_{k=0}^{\infty} r^k\cos(k\varphi)\,T^k \quad\text{and}\quad \sum_{k=1}^{\infty} r^k\sin(k\varphi)\,T^k \]
converge. Then, by expanding and simplifying the left-hand sides, we can show that
\[ \left[\sum_{k=0}^{\infty} r^k\cos(k\varphi)\,T^k\right](I - 2r\cos\varphi\, T + r^2T^2) = I - r\cos\varphi\, T \]
and
\[ \left[\sum_{k=1}^{\infty} r^k\sin(k\varphi)\,T^k\right](I - 2r\cos\varphi\, T + r^2T^2) = r\sin\varphi\, T, \]
so $P(\lambda)$ and $Q(\lambda)$ can be expressed as the series above, and the inequalities in (ii) follow.
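As a quick sanity check (not part of the report), the decomposition of Lemma A.3 can be verified numerically in the scalar case $n = 1$, $T = t$ real with $|t| < 1$, where $P$ and $Q$ are plain real numbers; the helper name `decompose` is ours.

```python
import cmath, math

# Scalar sanity check of Lemma A.3 (n = 1, T = t real, |t| < 1):
# (1 - t/lambda)^{-1} = P + iQ with
#   P = (1 - r cos(phi) t) / (1 - 2 r cos(phi) t + r^2 t^2),
#   Q = r sin(phi) t / (1 - 2 r cos(phi) t + r^2 t^2),
# where 1/lambda = r (cos(phi) + i sin(phi)).
def decompose(t, lam):
    w = 1 / lam                         # 1/lambda, written in polar form below
    r, phi = abs(w), cmath.phase(w)
    den = 1 - 2 * r * math.cos(phi) * t + (r * t) ** 2
    P = (1 - r * math.cos(phi) * t) / den
    Q = r * math.sin(phi) * t / den
    return P, Q

t, lam = 0.5, 1 + 1j
P, Q = decompose(t, lam)
assert abs(1 / (1 - t / lam) - (P + 1j * Q)) < 1e-12
# conjugate version: (1 - t/conj(lambda))^{-1} = P - iQ
assert abs(1 / (1 - t / lam.conjugate()) - (P - 1j * Q)) < 1e-12
```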
In Sections 3.3 and 4.3 we distinguish different cases of $\lambda \in \mathbb{C}$, for which we need the corresponding estimates, given in the two following lemmas. Lemma A.4 is used for the shifted $k$-step one-shot method and Lemma A.5 is used for the $k$-step one-shot method.
Lemma A.4. For $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$, we write $\lambda = R(\cos\theta + i\sin\theta)$ in polar form, where $R \ge 1$, $\theta \in (-\pi, \pi)$, $\theta \ne 0$.

(i) For $\lambda$ satisfying $\Re(\lambda^3 - \lambda^2) \ge 0$, let $\gamma_1 = \gamma_1(\lambda) = \begin{cases} 1 & \text{if } \Im(\lambda^3-\lambda^2) \ge 0, \\ -1 & \text{if } \Im(\lambda^3-\lambda^2) < 0, \end{cases}$ then
\[ \Re(\lambda^3-\lambda^2) + \gamma_1\Im(\lambda^3-\lambda^2) \ge |\lambda - 1| \ge 2|\sin(\theta/2)|. \]

(ii) Let $0 < \theta_0 \le \frac{\pi}{6}$. For $\lambda$ satisfying $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in [\theta_0, \pi-\theta_0] \cup [-\pi+\theta_0, -\theta_0]$, let $\gamma_2 = \begin{cases} -1 & \text{if } \Im(\lambda^3-\lambda^2) \ge 0, \\ 1 & \text{if } \Im(\lambda^3-\lambda^2) < 0, \end{cases}$ then
\[ -\Re(\lambda^3-\lambda^2) - \gamma_2\Im(\lambda^3-\lambda^2) \ge |\lambda - 1| \ge 2\sin(\theta_0/2). \]

(iii) Let $0 < \theta_0 \le \frac{\pi}{6}$ and $\delta_0 > 0$. For $\lambda$ satisfying $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in (-\theta_0, \theta_0)\setminus\{0\}$, let
\[ \gamma_3 = \gamma_3(\operatorname{sign}(\theta)) = \begin{cases} \left(\delta_0 + \sin\frac{5\theta_0}{2}\right)\Big/\cos\frac{5\theta_0}{2} & \text{if } \theta > 0, \\ -\left(\delta_0 + \sin\frac{5\theta_0}{2}\right)\Big/\cos\frac{5\theta_0}{2} & \text{if } \theta < 0, \end{cases} \]
then
\[ \Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2) \ge 2\delta_0|\sin(\theta/2)|. \]
Moreover, if $0 < \theta_0 < \frac{\pi}{6}$, we have
\[ \frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} \le \frac{\sqrt{1+\gamma_3^2}}{\delta_0} \quad\text{and}\quad \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} \le \max\left\{\frac{\sqrt{1+\gamma_3^2}}{\delta_0}, \frac{\sqrt{1+\gamma_3^2}}{\cos 3\theta_0}\right\}. \]

(iv) Let $0 < \theta_0 \le \frac{\pi}{6}$. For $\lambda$ satisfying $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in (\pi-\theta_0, \pi) \cup (-\pi, -\pi+\theta_0)$, we have
\[ -\Re(\lambda^3-\lambda^2) \ge \sin\left(\frac{\pi}{2} - 3\theta_0\right) + \cos 2\theta_0, \]
\[ \frac{|\Re(\lambda-1)|}{-\Re(\lambda^3-\lambda^2)} \le \frac{2}{\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0} \quad\text{and}\quad \frac{|\Im(\lambda-1)|}{-\Re(\lambda^3-\lambda^2)} \le \frac{2}{\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0}. \]
Proof. (i) From the definition of $\gamma_1$ we see that $\gamma_1^2 = 1$, $\gamma_1\Im(\lambda^3-\lambda^2) \ge 0$ and
\begin{align*}
\left(\Re(\lambda^3-\lambda^2) + \gamma_1\Im(\lambda^3-\lambda^2)\right)^2 &= \Re(\lambda^3-\lambda^2)^2 + \Im(\lambda^3-\lambda^2)^2 + 2\gamma_1\Re(\lambda^3-\lambda^2)\Im(\lambda^3-\lambda^2) \\
&\ge \Re(\lambda^3-\lambda^2)^2 + \Im(\lambda^3-\lambda^2)^2 = |\lambda^3-\lambda^2|^2,
\end{align*}
which yields $\Re(\lambda^3-\lambda^2) + \gamma_1\Im(\lambda^3-\lambda^2) \ge R^2|\lambda-1| \ge |\lambda-1|$. Finally,
\[ |\lambda - 1| = |R\cos\theta - 1 + iR\sin\theta| = \sqrt{R^2 + 1 - 2R\cos\theta} \ge \sqrt{2 - 2\cos\theta} = 2|\sin(\theta/2)| \]
since the function $R \mapsto R^2 + 1 - 2R\cos\theta$, for $R \ge 1$, is increasing.

(ii) In this case we have $\frac{\theta}{2} \in \left[\frac{\theta_0}{2}, \frac{\pi}{2}-\frac{\theta_0}{2}\right] \cup \left[-\frac{\pi}{2}+\frac{\theta_0}{2}, -\frac{\theta_0}{2}\right]$, so $\left|\sin\frac{\theta}{2}\right| \ge \sin\frac{\theta_0}{2}$. From the definition of $\gamma_2$ we see that $\gamma_2^2 = 1$ and $\gamma_2\Im(\lambda^3-\lambda^2) \le 0$. Similar to (i), we have $-\Re(\lambda^3-\lambda^2) - \gamma_2\Im(\lambda^3-\lambda^2) \ge |\lambda-1| \ge 2|\sin(\theta/2)|$, which implies the conclusion.

(iii) Note that $\cos 3\theta > 0$ since $-\frac{\pi}{2} < 3\theta < \frac{\pi}{2}$, and $\sin 3\theta$ has the same sign as $\theta$ and $\gamma_3$, so we have
\begin{align*}
\Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2) &= R^2(R\cos 3\theta - \cos 2\theta + \gamma_3 R\sin 3\theta - \gamma_3\sin 2\theta) \\
&\ge \cos 3\theta - \cos 2\theta + \gamma_3\sin 3\theta - \gamma_3\sin 2\theta \\
&= -2\sin\frac{5\theta}{2}\sin\frac{\theta}{2} + 2\gamma_3\cos\frac{5\theta}{2}\sin\frac{\theta}{2} = 2\sin\frac{\theta}{2}\left(\gamma_3\cos\frac{5\theta}{2} - \sin\frac{5\theta}{2}\right).
\end{align*}
Then we consider two cases: if $0 < \theta < \theta_0$ then $\gamma_3 > 0$, $\sin\frac{\theta}{2} = \left|\sin\frac{\theta}{2}\right| > 0$, $0 < \frac{5\theta}{2} < \frac{5\theta_0}{2} < \frac{\pi}{2}$ and $\gamma_3\cos\frac{5\theta}{2} - \sin\frac{5\theta}{2} > \gamma_3\cos\frac{5\theta_0}{2} - \sin\frac{5\theta_0}{2} = \delta_0$; if $-\theta_0 < \theta < 0$ then $-\gamma_3 > 0$, $-\sin\frac{\theta}{2} = \left|\sin\frac{\theta}{2}\right| > 0$, $-\frac{\pi}{2} < -\frac{5\theta_0}{2} < \frac{5\theta}{2} < 0$ and $-\gamma_3\cos\frac{5\theta}{2} + \sin\frac{5\theta}{2} > -\gamma_3\cos\frac{5\theta_0}{2} - \sin\frac{5\theta_0}{2} = \delta_0$.

Next, if $0 < \theta_0 < \frac{\pi}{6}$, we will show that $\frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)}$ and $\frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)}$ are both bounded. First,
\[ \frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} = \frac{|(\cos\theta+\gamma_3\sin\theta)R - 1|}{R^2\left[(\cos 3\theta+\gamma_3\sin 3\theta)R - (\cos 2\theta+\gamma_3\sin 2\theta)\right]} \le \frac{|(\cos\theta+\gamma_3\sin\theta)R - 1|}{(\cos 3\theta+\gamma_3\sin 3\theta)R - (\cos 2\theta+\gamma_3\sin 2\theta)}. \]
Since $\gamma_3$ does not depend on $R$, let us study $f_1(R) = \left(\frac{aR-1}{bR-c}\right)^2$ where $a = \cos\theta+\gamma_3\sin\theta$, $b = \cos 3\theta+\gamma_3\sin 3\theta$ and $c = \cos 2\theta+\gamma_3\sin 2\theta$. We observe that:
- $a, b, c > 0$. Indeed, $\cos\theta, \cos 2\theta, \cos 3\theta > 0$, and $\theta$ and $\gamma_3$ have the same sign.
- $bR - c > 0$ since $\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2) > 0$, thus $R > \frac{c}{b}$.
- $ac > b$ (equivalently $\frac{c}{b} > \frac{1}{a}$), since
\[ ac = \cos\theta\cos 2\theta + \gamma_3^2\sin\theta\sin 2\theta + \gamma_3\sin 3\theta > \cos\theta\cos 2\theta - \sin\theta\sin 2\theta + \gamma_3\sin 3\theta = b. \]
Now, $f_1'(R) = 2\cdot\frac{aR-1}{bR-c}\cdot\frac{b-ac}{(bR-c)^2} < 0$ for $R > \frac{c}{b} > \frac{1}{a}$, and we would like to have $\frac{c}{b} < 1$ so that $f_1(R) \le f_1(1)$, $\forall R \ge 1$. Indeed $\frac{c}{b} < 1$ is equivalent to
\[ \cos 2\theta + \gamma_3\sin 2\theta < \cos 3\theta + \gamma_3\sin 3\theta \iff |\gamma_3| > \frac{\sin\frac{5\theta}{2}}{\cos\frac{5\theta}{2}}, \]
which is true since
\[ |\gamma_3| = \frac{\delta_0 + \sin\frac{5\theta_0}{2}}{\cos\frac{5\theta_0}{2}} > \frac{\sin\frac{5\theta}{2}}{\cos\frac{5\theta}{2}} + \varepsilon_0 \quad\text{where } \varepsilon_0 = \frac{\delta_0}{\cos\frac{5\theta_0}{2}}. \]
Then we study
\[ f_1(1) = \left(\frac{\cos\theta - 1 + \gamma_3\sin\theta}{\cos 3\theta - \cos 2\theta + \gamma_3(\sin 3\theta - \sin 2\theta)}\right)^2 = \left(\frac{-\sin\frac{\theta}{2}+\gamma_3\cos\frac{\theta}{2}}{-\gamma_3\sin\frac{5\theta}{2}+\gamma_3^2\cos\frac{5\theta}{2}}\right)^2\gamma_3^2. \]
We have:
- $\left(-\sin\frac{\theta}{2}+\gamma_3\cos\frac{\theta}{2}\right)^2 \le 1+\gamma_3^2$ by the Cauchy–Schwarz inequality;
- $\gamma_3^2 = |\gamma_3|^2 > \left(\frac{\sin\frac{5\theta}{2}}{\cos\frac{5\theta}{2}} + \varepsilon_0\right)|\gamma_3|$, which leads to $-\gamma_3\sin\frac{5\theta}{2}+\gamma_3^2\cos\frac{5\theta}{2} > \varepsilon_0\cos\frac{5\theta_0}{2}|\gamma_3| = \delta_0|\gamma_3|$;
hence $f_1(1) \le \frac{1+\gamma_3^2}{\delta_0^2}$ and finally $\frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} \le \frac{\sqrt{1+\gamma_3^2}}{\delta_0}$. Next, we have
\[ \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} = \frac{|(\gamma_3\cos\theta-\sin\theta)R - \gamma_3|}{R^2\left[(\cos 3\theta+\gamma_3\sin 3\theta)R - (\cos 2\theta+\gamma_3\sin 2\theta)\right]} \le \frac{|(\gamma_3\cos\theta-\sin\theta)R - \gamma_3|}{(\cos 3\theta+\gamma_3\sin 3\theta)R - (\cos 2\theta+\gamma_3\sin 2\theta)}. \]
Since $\gamma_3$ does not depend on $R$, let us study $f_2(R) = \left(\frac{dR-\gamma_3}{bR-c}\right)^2$ where $d = \gamma_3\cos\theta-\sin\theta$ and $b, c$ as above. We observe that:
- $\gamma_3 b - cd$ and $\theta$ have the same sign. Indeed, $\gamma_3 b - cd = (\gamma_3^2+1)\sin\theta\cos 2\theta$. Consequently, we always have $(\gamma_3 b - cd)\gamma_3 > 0$.
- We always have $\frac{\gamma_3}{d} > 1$. Indeed, if $\theta > 0$ then $d > 0$ since $\gamma_3 = \frac{\delta_0+\sin\frac{5\theta_0}{2}}{\cos\frac{5\theta_0}{2}} > \frac{\sin\theta}{\cos\theta}$, and also $\frac{\gamma_3}{d} = \frac{\gamma_3}{\gamma_3\cos\theta-\sin\theta} > 1$; if $\theta < 0$ then $d < 0$ since $-\gamma_3 = \frac{\delta_0+\sin\frac{5\theta_0}{2}}{\cos\frac{5\theta_0}{2}} > \frac{-\sin\theta}{\cos\theta}$, and also $\frac{\gamma_3}{d} = \frac{-\gamma_3}{-\gamma_3\cos\theta+\sin\theta} > 1$.
Now, $f_2'(R) = 2\cdot\frac{\frac{d}{\gamma_3}R-1}{bR-c}\cdot\frac{(\gamma_3 b - cd)\gamma_3}{(bR-c)^2}$, so, thanks to the above results, $f_2(R)$ decreases for $1 \le R < \frac{\gamma_3}{d}$ and increases for $R > \frac{\gamma_3}{d}$. Moreover, like for $f_1(1)$, we can estimate
\[ f_2(1) = \left(\frac{\cos\frac{\theta}{2}+\gamma_3\sin\frac{\theta}{2}}{-\gamma_3\sin\frac{5\theta}{2}+\gamma_3^2\cos\frac{5\theta}{2}}\right)^2\gamma_3^2 \le \frac{1+\gamma_3^2}{\delta_0^2}, \]
and $\lim_{R\to+\infty} f_2(R) = \left(\frac{\gamma_3\cos\theta-\sin\theta}{\cos 3\theta+\gamma_3\sin 3\theta}\right)^2 \le \frac{1+\gamma_3^2}{\cos^2 3\theta_0}$. Therefore
\[ \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)} \le \max\left\{\frac{\sqrt{1+\gamma_3^2}}{\delta_0}, \frac{\sqrt{1+\gamma_3^2}}{\cos 3\theta_0}\right\}. \]

(iv) Since $\theta \in (\pi-\theta_0, \pi) \cup (-\pi, -\pi+\theta_0)$, we have:
- $2\theta \in (2\pi-2\theta_0, 2\pi) \cup (-2\pi, -2\pi+2\theta_0) \subset \left(2\pi-\frac{\pi}{3}, 2\pi\right) \cup \left(-2\pi, -2\pi+\frac{\pi}{3}\right)$, thus $\cos 2\theta > \cos 2\theta_0 > 0$;
- $3\theta \in (3\pi-3\theta_0, 3\pi) \cup (-3\pi, -3\pi+3\theta_0) \subset \left(3\pi-\frac{\pi}{2}, 3\pi\right) \cup \left(-3\pi, -3\pi+\frac{\pi}{2}\right)$, thus $-\cos 3\theta > -\cos(3\pi-3\theta_0) = \sin\left(\frac{\pi}{2}-3\theta_0\right) \ge 0$.
So we have
\[ -\Re(\lambda^3-\lambda^2) = R^2(-R\cos 3\theta + \cos 2\theta) > \left[\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0\right]R^2 > 0. \]
Finally, $\frac{|\Re(\lambda-1)|}{-\Re(\lambda^3-\lambda^2)} \le \frac{R+1}{\left[\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos 2\theta_0\right]R^2} \le \frac{2}{\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos 2\theta_0}$, and similarly for $\frac{|\Im(\lambda-1)|}{-\Re(\lambda^3-\lambda^2)}$.
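As a numerical spot-check (ours, not part of the report), case (i) of Lemma A.4 can be sampled over many $\lambda$ on circles of radius $R \ge 1$; the helper name `check_A4_i` is an assumption of this sketch.

```python
import cmath, math

# Spot-check of Lemma A.4 (i): for |lambda| >= 1 with Re(lambda^3 - lambda^2) >= 0,
#   Re(w) + gamma1 * Im(w) >= |lambda - 1| >= 2 |sin(theta/2)|,  w = lambda^3 - lambda^2,
# where gamma1 = 1 if Im(w) >= 0 and gamma1 = -1 otherwise.
def check_A4_i(lam):
    w = lam**3 - lam**2
    if w.real < 0:
        return True                      # case (i) does not apply to this lambda
    g1 = 1.0 if w.imag >= 0 else -1.0
    theta = cmath.phase(lam)
    lhs = w.real + g1 * w.imag
    return (lhs >= abs(lam - 1) - 1e-12
            and abs(lam - 1) >= 2 * abs(math.sin(theta / 2)) - 1e-12)

samples = [1.5 * cmath.exp(1j * t / 10) for t in range(-30, 31) if t != 0]
samples += [cmath.exp(1j * t / 10) for t in range(-30, 31) if t != 0]
assert all(check_A4_i(lam) for lam in samples)
```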
Lemma A.5. For $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$, we write $\lambda = R(\cos\theta + i\sin\theta)$ in polar form, where $R \ge 1$, $\theta \in (-\pi, \pi)$, $\theta \ne 0$.

(i) For $\lambda$ satisfying $\Re(\lambda^2-\lambda) \ge 0$, let $\gamma_1 = \gamma_1(\lambda) = \begin{cases} 1 & \text{if } \Im(\lambda^2-\lambda) \ge 0, \\ -1 & \text{if } \Im(\lambda^2-\lambda) < 0, \end{cases}$ then
\[ \Re(\lambda^2-\lambda) + \gamma_1\Im(\lambda^2-\lambda) \ge |\lambda(\lambda-1)| \ge 2|\sin(\theta/2)|. \]

(ii) Let $0 < \theta_0 \le \frac{\pi}{4}$. For $\lambda$ satisfying $\Re(\lambda^2-\lambda) < 0$ and $\theta \in [\theta_0, \pi-\theta_0] \cup [-\pi+\theta_0, -\theta_0]$, let $\gamma_2 = \gamma_2(\lambda) = \begin{cases} -1 & \text{if } \Im(\lambda^2-\lambda) \ge 0, \\ 1 & \text{if } \Im(\lambda^2-\lambda) < 0, \end{cases}$ then
\[ -\Re(\lambda^2-\lambda) - \gamma_2\Im(\lambda^2-\lambda) \ge |\lambda(\lambda-1)| \ge 2\sin(\theta_0/2). \]

(iii) Let $0 < \theta_0 \le \frac{\pi}{4}$ and $\delta_0 > 0$. For $\lambda$ satisfying $\Re(\lambda^2-\lambda) < 0$ and $\theta \in (-\theta_0, \theta_0)\setminus\{0\}$, let
\[ \gamma_3 = \gamma_3(\operatorname{sign}(\theta)) = \begin{cases} \left(\delta_0 + \sin\frac{3\theta_0}{2}\right)\Big/\cos\frac{3\theta_0}{2} & \text{if } \theta > 0, \\ -\left(\delta_0 + \sin\frac{3\theta_0}{2}\right)\Big/\cos\frac{3\theta_0}{2} & \text{if } \theta < 0, \end{cases} \]
then
\[ \Re(\lambda^2-\lambda) + \gamma_3\Im(\lambda^2-\lambda) \ge 2\delta_0|\sin(\theta/2)|. \]
Moreover, if $0 < \theta_0 < \frac{\pi}{4}$ then
\[ \frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} \le \frac{\sqrt{1+\gamma_3^2}}{\delta_0} \quad\text{and}\quad \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} \le \max\left\{\frac{\sqrt{1+\gamma_3^2}}{\delta_0}, \frac{\sqrt{1+\gamma_3^2}}{\cos 2\theta_0}\right\}. \]

(iv) Let $0 < \theta_0 \le \frac{\pi}{4}$. There exists no $\lambda$ satisfying $\Re(\lambda^2-\lambda) < 0$ and $\theta \in (\pi-\theta_0, \pi) \cup (-\pi, -\pi+\theta_0)$.
Proof. The proofs of (i) and (ii) are similar to those in Lemma A.4.

(iii) Note that $\cos 2\theta > 0$ since $-\frac{\pi}{2} < 2\theta < \frac{\pi}{2}$, and $\sin 2\theta$ has the same sign as $\theta$ and $\gamma_3$, so we have
\begin{align*}
\Re(\lambda^2-\lambda) + \gamma_3\Im(\lambda^2-\lambda) &= R(R\cos 2\theta - \cos\theta + \gamma_3 R\sin 2\theta - \gamma_3\sin\theta) \\
&\ge \cos 2\theta - \cos\theta + \gamma_3\sin 2\theta - \gamma_3\sin\theta \\
&= -2\sin\frac{3\theta}{2}\sin\frac{\theta}{2} + 2\gamma_3\cos\frac{3\theta}{2}\sin\frac{\theta}{2} = 2\sin\frac{\theta}{2}\left(\gamma_3\cos\frac{3\theta}{2} - \sin\frac{3\theta}{2}\right).
\end{align*}
Then we consider two cases: if $0 < \theta < \theta_0$ then $\gamma_3 > 0$, $\sin\frac{\theta}{2} = \left|\sin\frac{\theta}{2}\right| > 0$, $0 < \frac{3\theta}{2} < \frac{3\theta_0}{2} < \frac{\pi}{2}$ and $\gamma_3\cos\frac{3\theta}{2} - \sin\frac{3\theta}{2} > \gamma_3\cos\frac{3\theta_0}{2} - \sin\frac{3\theta_0}{2} = \delta_0$; if $-\theta_0 < \theta < 0$ then $-\gamma_3 > 0$, $-\sin\frac{\theta}{2} = \left|\sin\frac{\theta}{2}\right| > 0$, $-\frac{\pi}{2} < -\frac{3\theta_0}{2} < \frac{3\theta}{2} < 0$ and $-\gamma_3\cos\frac{3\theta}{2} + \sin\frac{3\theta}{2} > -\gamma_3\cos\frac{3\theta_0}{2} - \sin\frac{3\theta_0}{2} = \delta_0$.

Next, if $0 < \theta_0 < \frac{\pi}{4}$, we will show that $\frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)}$ and $\frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)}$ are both bounded. First,
\[ \frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} = \frac{|(\cos\theta+\gamma_3\sin\theta)R - 1|}{R\left[(\cos 2\theta+\gamma_3\sin 2\theta)R - (\cos\theta+\gamma_3\sin\theta)\right]} \le \frac{|(\cos\theta+\gamma_3\sin\theta)R - 1|}{(\cos 2\theta+\gamma_3\sin 2\theta)R - (\cos\theta+\gamma_3\sin\theta)}. \]
Since $\gamma_3$ does not depend on $R$, let us study $f_1(R) = \left(\frac{aR-1}{bR-a}\right)^2$ where $a = \cos\theta+\gamma_3\sin\theta$ and $b = \cos 2\theta+\gamma_3\sin 2\theta$. We observe that:
- $a > 0$ and $b > 0$. Indeed, $\cos\theta > 0$, $\cos 2\theta > 0$, and $\theta$ and $\gamma_3$ have the same sign.
- $bR - a > 0$ since $\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda) > 0$, thus $R > \frac{a}{b}$.
- $a^2 > b$ (equivalently $\frac{a}{b} > \frac{1}{a}$), since $a^2 = \cos^2\theta + \gamma_3^2\sin^2\theta + \gamma_3\sin 2\theta > \cos^2\theta - \sin^2\theta + \gamma_3\sin 2\theta = b$.
Now, $f_1'(R) = 2\cdot\frac{aR-1}{bR-a}\cdot\frac{b-a^2}{(bR-a)^2} < 0$ for $R > \frac{a}{b} > \frac{1}{a}$, and we would like to have $\frac{a}{b} < 1$ so that $f_1(R) \le f_1(1)$, $\forall R \ge 1$. Indeed $\frac{a}{b} < 1$ is equivalent to
\[ \cos\theta + \gamma_3\sin\theta < \cos 2\theta + \gamma_3\sin 2\theta \iff |\gamma_3| > \frac{\sin\frac{3\theta}{2}}{\cos\frac{3\theta}{2}}, \]
which is true since
\[ |\gamma_3| = \frac{\delta_0 + \sin\frac{3\theta_0}{2}}{\cos\frac{3\theta_0}{2}} > \frac{\sin\frac{3\theta}{2}}{\cos\frac{3\theta}{2}} + \varepsilon_0 \quad\text{where } \varepsilon_0 = \frac{\delta_0}{\cos\frac{3\theta_0}{2}}. \]
Then we study
\[ f_1(1) = \left(\frac{\cos\theta - 1 + \gamma_3\sin\theta}{\cos 2\theta - \cos\theta + \gamma_3(\sin 2\theta - \sin\theta)}\right)^2 = \left(\frac{-\sin\frac{\theta}{2}+\gamma_3\cos\frac{\theta}{2}}{-\gamma_3\sin\frac{3\theta}{2}+\gamma_3^2\cos\frac{3\theta}{2}}\right)^2\gamma_3^2. \]
We have:
- $\left(-\sin\frac{\theta}{2}+\gamma_3\cos\frac{\theta}{2}\right)^2 \le 1+\gamma_3^2$ by the Cauchy–Schwarz inequality;
- $\gamma_3^2 = |\gamma_3|^2 > \left(\frac{\sin\frac{3\theta}{2}}{\cos\frac{3\theta}{2}} + \varepsilon_0\right)|\gamma_3|$, which leads to $-\gamma_3\sin\frac{3\theta}{2}+\gamma_3^2\cos\frac{3\theta}{2} > \varepsilon_0\cos\frac{3\theta_0}{2}|\gamma_3| = \delta_0|\gamma_3|$;
hence $f_1(1) \le \frac{1+\gamma_3^2}{\delta_0^2}$ and finally $\frac{|\Re(\lambda-1)+\gamma_3\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} \le \frac{\sqrt{1+\gamma_3^2}}{\delta_0}$. Next, we have
\[ \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} = \frac{|(\gamma_3\cos\theta-\sin\theta)R - \gamma_3|}{R\left[(\cos 2\theta+\gamma_3\sin 2\theta)R - (\cos\theta+\gamma_3\sin\theta)\right]} \le \frac{|(\gamma_3\cos\theta-\sin\theta)R - \gamma_3|}{(\cos 2\theta+\gamma_3\sin 2\theta)R - (\cos\theta+\gamma_3\sin\theta)}. \]
Since $\gamma_3$ does not depend on $R$, let us study $f_2(R) = \left(\frac{cR-\gamma_3}{bR-a}\right)^2$ where $c = \gamma_3\cos\theta-\sin\theta$ and $a, b$ as above. We observe that:
- $\gamma_3 b - ca$ and $\theta$ have the same sign. Indeed, $\gamma_3 b - ca = (\gamma_3^2+1)\sin\theta\cos\theta$. Consequently, we always have $(\gamma_3 b - ca)\gamma_3 > 0$.
- We always have $\frac{\gamma_3}{c} > 1$. Indeed, if $\theta > 0$ then $c > 0$ since $\gamma_3 = \frac{\delta_0+\sin\frac{3\theta_0}{2}}{\cos\frac{3\theta_0}{2}} > \frac{\sin\theta}{\cos\theta}$, and also $\frac{\gamma_3}{c} = \frac{\gamma_3}{\gamma_3\cos\theta-\sin\theta} > 1$; if $\theta < 0$ then $c < 0$ since $-\gamma_3 = \frac{\delta_0+\sin\frac{3\theta_0}{2}}{\cos\frac{3\theta_0}{2}} > \frac{-\sin\theta}{\cos\theta}$, and also $\frac{\gamma_3}{c} = \frac{-\gamma_3}{-\gamma_3\cos\theta+\sin\theta} > 1$.
Now, $f_2'(R) = 2\cdot\frac{\frac{c}{\gamma_3}R-1}{bR-a}\cdot\frac{(\gamma_3 b - ca)\gamma_3}{(bR-a)^2}$, so, thanks to the above results, $f_2(R)$ decreases for $1 \le R < \frac{\gamma_3}{c}$ and increases for $R > \frac{\gamma_3}{c}$. Moreover, like for $f_1(1)$, we can estimate
\[ f_2(1) = \left(\frac{\cos\frac{\theta}{2}+\gamma_3\sin\frac{\theta}{2}}{-\gamma_3\sin\frac{3\theta}{2}+\gamma_3^2\cos\frac{3\theta}{2}}\right)^2\gamma_3^2 \le \frac{1+\gamma_3^2}{\delta_0^2} \]
and $\lim_{R\to+\infty} f_2(R) = \left(\frac{\gamma_3\cos\theta-\sin\theta}{\cos 2\theta+\gamma_3\sin 2\theta}\right)^2 \le \frac{1+\gamma_3^2}{\cos^2 2\theta_0}$. Therefore
\[ \frac{|\gamma_3\Re(\lambda-1)-\Im(\lambda-1)|}{\Re(\lambda^2-\lambda)+\gamma_3\Im(\lambda^2-\lambda)} \le \max\left\{\frac{\sqrt{1+\gamma_3^2}}{\delta_0}, \frac{\sqrt{1+\gamma_3^2}}{\cos 2\theta_0}\right\}. \]

(iv) For $\theta \in (\pi-\theta_0, \pi) \cup (-\pi, -\pi+\theta_0)$, we have $\cos 2\theta > 0$ since $2\theta \in \left(\frac{3\pi}{2}, 2\pi\right) \cup \left(-2\pi, -\frac{3\pi}{2}\right)$, while $\cos\theta < 0$. Hence $\Re(\lambda^2-\lambda) = R(R\cos 2\theta - \cos\theta) > 0$.
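As a small numerical illustration (ours, not part of the report) of Lemma A.5 (iv): for $\theta$ within $\theta_0 \le \pi/4$ of $\pm\pi$ and $R \ge 1$, $\Re(\lambda^2-\lambda)$ is always positive, so the excluded case indeed never occurs.

```python
import cmath, math

# Spot-check of Lemma A.5 (iv): with |lambda| >= 1 and theta = arg(lambda) within
# theta0 of +-pi (0 < theta0 <= pi/4), one always has Re(lambda^2 - lambda) > 0,
# so the case Re(lambda^2 - lambda) < 0 cannot occur there.
theta0 = math.pi / 4
for R in [1.0, 1.3, 2.0, 5.0]:
    for k in range(1, 200):
        eps = theta0 * k / 200          # distance of theta from +-pi
        for theta in (math.pi - eps, -math.pi + eps):
            lam = R * cmath.exp(1j * theta)
            assert (lam**2 - lam).real > 0
```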
B Descent step for usual and shifted gradient descent
Proposition B.1 (Descent step for the usual gradient descent). The usual gradient descent algorithm (5) converges if
\[ 0 < \tau < \frac{2}{\|H(I-B)^{-1}M\|^2}. \]

Proof. The error system for (5) can be rewritten as
\[ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} -\tau(I-B^*)^{-1}H^*H(I-B)^{-1}MM^* & 0 & (I-B^*)^{-1}H^*H(I-B)^{-1}M \\ -\tau(I-B)^{-1}MM^* & 0 & (I-B)^{-1}M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \tag{67} \]
Recall that a fixed point iteration converges if and only if the spectral radius of its iteration matrix is strictly less than 1. We can show that:

(i) If $\lambda \in \mathbb{C}\setminus\{0,1\}$ is an eigenvalue of the iteration matrix, then, proceeding as in Proposition 4.3, there exists $y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
\[ \lambda^2(\lambda - 1) + \tau\frac{\left\|H(I-B)^{-1}My\right\|^2}{\|y\|^2}\lambda^2 = 0, \tag{68} \]
hence $\lambda = 1 - \tau\frac{\|H(I-B)^{-1}My\|^2}{\|y\|^2}$. If we take $\tau < \frac{2}{\|H(I-B)^{-1}M\|^2}$ then equation (68) admits no solution $\lambda$ with $|\lambda| \ge 1$.

(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix. To show this, we rewrite iteration (67) as
\[ \begin{pmatrix} \sigma^{n+1} \\ p^{n+1} \\ u^{n+1} \end{pmatrix} = \begin{pmatrix} I & -\tau M^* & 0 \\ (I-B^*)^{-1}H^*H(I-B)^{-1}M & -\tau(I-B^*)^{-1}H^*H(I-B)^{-1}MM^* & 0 \\ (I-B)^{-1}M & -\tau(I-B)^{-1}MM^* & 0 \end{pmatrix} \begin{pmatrix} \sigma^n \\ p^n \\ u^n \end{pmatrix} \]
and proceed as in Proposition 4.3.
Proposition B.2 (Convergence of the shifted gradient descent). The shifted gradient descent algorithm (6) converges if
\[ 0 < \tau < \frac{1}{\|H(I-B)^{-1}M\|^2}. \]

Proof. The error system for (6) can be rewritten as
\[ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} 0 & 0 & (I-B^*)^{-1}H^*H(I-B)^{-1}M \\ 0 & 0 & (I-B)^{-1}M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \tag{69} \]
Recall that a fixed point iteration converges if and only if the spectral radius of its iteration matrix is strictly less than 1. We can show that:

(i) If $\lambda \in \mathbb{C}\setminus\{0,1\}$ is an eigenvalue of the iteration matrix, then, proceeding as in Proposition 4.2, there exists $y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
\[ \lambda^2(\lambda - 1) + \tau\frac{\left\|H(I-B)^{-1}My\right\|^2}{\|y\|^2}\lambda = 0. \tag{70} \]
By applying Lemma C.1 for
\[ a_0 = 0, \quad a_1 = \tau\frac{\left\|H(I-B)^{-1}My\right\|^2}{\|y\|^2}, \quad a_2 = -1, \]
we see that equation (70) admits no solution $\lambda$ with $|\lambda| \ge 1$ if we take $\tau < \frac{\|y\|^2}{\|H(I-B)^{-1}My\|^2}$. Then it is enough to take $\tau < \frac{1}{\|H(I-B)^{-1}M\|^2}$.

(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix. To show this, we rewrite iteration (69) as
\[ \begin{pmatrix} \sigma^{n+1} \\ p^{n+1} \\ u^{n+1} \end{pmatrix} = \begin{pmatrix} I & -\tau M^* & 0 \\ (I-B^*)^{-1}H^*H(I-B)^{-1}M & 0 & 0 \\ (I-B)^{-1}M & 0 & 0 \end{pmatrix} \begin{pmatrix} \sigma^n \\ p^n \\ u^n \end{pmatrix} \]
and proceed as in Proposition 4.2.
C Convergence study for the scalar case

C.1 Notations and preliminary calculation

In the scalar case, that is when $n_u = n_\sigma = n_f = 1$, we change the notation from capital to lower case letters:
\[ B \rightsquigarrow b \in \mathbb{R},\ |b| < 1, \qquad M \rightsquigarrow m \in \mathbb{R},\ m \ne 0, \qquad H \rightsquigarrow h \in \mathbb{R},\ h \ne 0, \]
\[ T_k \rightsquigarrow t_k = 1 + b + \dots + b^{k-1} = \frac{1-b^k}{1-b}, \qquad U_k \rightsquigarrow u_k = kh^2b^{k-1}, \tag{71} \]
\[ X_k \rightsquigarrow x_k = \begin{cases} 0, & k = 1, \\ h^2\left[1 + 2b + 3b^2 + \dots + (k-1)b^{k-2}\right], & k \ge 2. \end{cases} \]
The identity $1 + 2x + 3x^2 + \dots + nx^{n-1} = \left(\frac{1-x^{n+1}}{1-x}\right)' = \frac{1 - (n+1)x^n + nx^{n+1}}{(1-x)^2}$ says that
\[ x_k = h^2\,\frac{1 - kb^{k-1} + (k-1)b^k}{(1-b)^2}, \quad k \ge 1, \tag{72} \]
where we set $b^{k-1} = 1$ when $k = 1$ and $b = 0$. Now for each of the algorithms (5), (6), (10), (9), we write the iterations for the errors in the scalar case and the corresponding iteration matrix $\mathcal{M}$ such that $[p^{n+1}, u^{n+1}, \sigma^{n+1}]^\top = \mathcal{M}\,[p^n, u^n, \sigma^n]^\top$.

Usual gradient descent (usual GD):
\[ \begin{cases} \sigma^{n+1} = \sigma^n - \tau m p^n \\ u^n = b u^n + m\sigma^n \\ p^n = b p^n + h^2 u^n \end{cases} \qquad \mathcal{M} = \begin{pmatrix} -h^2m^2(1-b)^{-2}\tau & 0 & h^2m(1-b)^{-2} \\ -m^2(1-b)^{-1}\tau & 0 & m(1-b)^{-1} \\ -m\tau & 0 & 1 \end{pmatrix} \tag{73} \]

Shifted gradient descent (shifted GD):
\[ \begin{cases} \sigma^{n+1} = \sigma^n - \tau m p^n \\ u^{n+1} = b u^{n+1} + m\sigma^n \\ p^{n+1} = b p^{n+1} + h^2 u^{n+1} \end{cases} \qquad \mathcal{M} = \begin{pmatrix} 0 & 0 & h^2m(1-b)^{-2} \\ 0 & 0 & m(1-b)^{-1} \\ -m\tau & 0 & 1 \end{pmatrix} \tag{74} \]

$k$-step one-shot:
\[ \begin{cases} \sigma^{n+1} = \sigma^n - \tau m p^n \\ p^{n+1} = (b^k - \tau m^2 x_k)p^n + u_k u^n + m x_k \sigma^n \\ u^{n+1} = b^k u^n + m t_k \sigma^n - \tau m^2 t_k p^n \end{cases} \qquad \mathcal{M} = \begin{pmatrix} b^k - m^2x_k\tau & u_k & mx_k \\ -m^2t_k\tau & b^k & mt_k \\ -m\tau & 0 & 1 \end{pmatrix} \tag{75} \]

Shifted $k$-step one-shot:
\[ \begin{cases} \sigma^{n+1} = \sigma^n - \tau m p^n \\ p^{n+1} = b^k p^n + u_k u^n + m x_k \sigma^n \\ u^{n+1} = b^k u^n + m t_k \sigma^n \end{cases} \qquad \mathcal{M} = \begin{pmatrix} b^k & u_k & mx_k \\ 0 & b^k & mt_k \\ -m\tau & 0 & 1 \end{pmatrix} \tag{76} \]
C.2 Necessary and sufficient conditions for convergence

In this simpler scalar case, we will be able to prove sufficient and also necessary conditions on the descent step $\tau$ for convergence. Our strategy to study the spectral radius $\rho(\mathcal{M})$ is as follows:

1. Compute $\det(\mathcal{M} - \lambda I)$ to write the eigenvalue equation $P(\lambda) = 0$. For the considered methods, $P$ turns out to be a polynomial of degree 3, $P(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + \lambda^3$, where $a_0, a_1, a_2 \in \mathbb{R}$ depend on $h, m, b, \tau$. For the computations, the identity $u_kt_k - b^kx_k + x_k = h^2t_k^2$, which is the scalar version of (41), can be helpful.

2. Apply to $P$ Lemma C.1, which states a necessary and sufficient condition for a real-coefficient polynomial of degree 3 to have all its roots inside the unit circle of the complex plane. Then deduce conditions on $\tau$.

Lemma C.1. Let $a_0, a_1, a_2 \in \mathbb{R}$; then all roots of $P(z) = a_0 + a_1z + a_2z^2 + z^3$ stay (strictly) inside the unit circle of the complex plane if and only if
\[ (a_0 - 1)(a_0 + 1) < 0, \tag{77} \]
\[ (a_0^2 - a_2a_0 + a_1 - 1)(a_0^2 + a_2a_0 - a_1 - 1) > 0, \tag{78} \]
\[ (a_0 + a_2 - a_1 - 1)(a_0 + a_2 + a_1 + 1) < 0. \tag{79} \]
The proof of Lemma C.1 is in Appendix D and is mainly based on Marden's works [17].
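As a sketch (ours, not part of the report), Lemma C.1 translates directly into a small numerical root-location test on the coefficients of a monic real cubic; the function name `roots_inside_unit_circle` is an assumption of this sketch.

```python
# Sketch (not from the report): Lemma C.1 as a test on the coefficients of the
# monic real cubic P(z) = a0 + a1 z + a2 z^2 + z^3. The function name is ours.
def roots_inside_unit_circle(a0, a1, a2):
    c77 = (a0 - 1) * (a0 + 1) < 0
    c78 = (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1) > 0
    c79 = (a0 + a2 - a1 - 1) * (a0 + a2 + a1 + 1) < 0
    return c77 and c78 and c79

# P(z) = (z - 1/2)^3: triple root at 1/2, strictly inside the unit circle.
assert roots_inside_unit_circle(-0.125, 0.75, -1.5)
# P(z) = z^3 - 1: roots on the unit circle, so the test must fail.
assert not roots_inside_unit_circle(-1.0, 0.0, 0.0)
# P(z) = (z - 2)(z^2 + 1/4): one root outside the unit circle.
assert not roots_inside_unit_circle(-0.5, 0.25, -2.0)
```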
C.2.1 Descent step for the usual gradient descent

Here, the coefficients of $P$ are
\[ a_0 = 0, \quad a_1 = 0, \quad a_2 = h^2m^2(1-b)^{-2}\tau - 1. \]
Conditions (77) and (78) of Lemma C.1 are automatically satisfied. Condition (79) gives
\[ 0 < \tau < \frac{2(1-b)^2}{h^2m^2}, \]
that is (7) in the scalar case.

C.2.2 Descent step for the shifted gradient descent

Here, the coefficients of $P$ are
\[ a_0 = 0, \quad a_1 = h^2m^2(1-b)^{-2}\tau, \quad a_2 = -1. \]
Condition (77) of Lemma C.1 is automatically satisfied, condition (79) is automatically satisfied for $\tau > 0$, and condition (78) gives us
\[ \tau < \frac{(1-b)^2}{h^2m^2}, \]
that is (8) in the scalar case.
RR n°9477
46 M. Bonazzoli & H. Haddar & T. A. Vu
C.2.3 Descent step for k-step one-shot

Here, the coefficients of $P$ are
\[ a_0 = -s^2, \quad a_1 = m^2(h^2t_k^2 - x_k)\tau + (s^2 + 2s), \quad a_2 = m^2x_k\tau - (2s+1) \]
where $s = b^k$. Condition (77) of Lemma C.1 is obviously satisfied since $|b| < 1$. Next we deal with condition (78). The computation shows that
\[ a_0^2 - a_2a_0 + a_1 - 1 = m^2(h^2t_k^2 - x_k + x_ks^2)\tau + \underbrace{(s-1)^3(s+1)}_{<0}, \tag{80} \]
\[ a_0^2 + a_2a_0 - a_1 - 1 = -m^2(h^2t_k^2 - x_k + x_ks^2)\tau + \underbrace{(s-1)(s+1)^3}_{<0} \tag{81} \]
and
\[ h^2t_k^2 - x_k + x_ks^2 = \frac{h^2b^{k-1}(1-b^k)\left[k - (k+1)b + kb^k - (k-1)b^{k+1}\right]}{(1-b)^2}. \tag{82} \]

Lemma C.2. $k - (k+1)b + kb^k - (k-1)b^{k+1} > 0$, $\forall |b| < 1$, $\forall k \ge 1$.

Proof. We write $k - (k+1)b + kb^k - (k-1)b^{k+1} = (1-b)A$ where $A = k + 1 - \frac{1-b^k}{1-b} + (k-1)b^k$. It suffices to show $A > 0$. If $k = 1$ then $A = 1 > 0$. If either $k$ is even, or $k \ge 3$ is odd and $0 \le b < 1$, then $(k-1)b^k \ge 0$ and $\frac{1-b^k}{1-b} = |b^{k-1} + b^{k-2} + \dots + b + 1| \le |b|^{k-1} + |b|^{k-2} + \dots + |b| + 1 < k$ give us the conclusion. If $k \ge 3$ is odd and $-1 < b < 0$ then $(k-1)(1 + b^k) > 0$ and $\frac{1-b^k}{1-b} < 1$, therefore $A = 1 + \left(1 - \frac{1-b^k}{1-b}\right) + (k-1)(1 + b^k) > 0$.
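The inequality of Lemma C.2 is easy to probe numerically; the following sampling over a grid in $b$ is a sketch of ours, not part of the report.

```python
# Numerical spot-check of Lemma C.2:
#   k - (k+1) b + k b^k - (k-1) b^(k+1) > 0  for all |b| < 1 and k >= 1.
for k in range(1, 12):
    for i in range(-99, 100):
        b = i / 100
        assert k - (k + 1) * b + k * b**k - (k - 1) * b**(k + 1) > 0
```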
Then, condition (78) imposes
\[ \begin{cases} \tau < \dfrac{(1-b)^2(1+b^k)(1-b^k)^2}{h^2m^2b^{k-1}\left[k-(k+1)b+kb^k-(k-1)b^{k+1}\right]} & \text{if } b^{k-1} > 0; \\[2mm] \tau < \dfrac{-(1-b)^2(1+b^k)^3}{h^2m^2b^{k-1}\left[k-(k+1)b+kb^k-(k-1)b^{k+1}\right]} & \text{if } b^{k-1} < 0; \\[2mm] \text{no condition on } \tau & \text{if } k \ge 2 \text{ and } b = 0. \end{cases} \]
Finally we check condition (79). We have $a_0 + a_2 + a_1 + 1 = h^2m^2t_k^2\tau > 0$ and
\[ a_0 + a_2 - a_1 - 1 = \frac{h^2m^2(1 - 2kb^{k-1} + 2kb^k - b^{2k})}{(1-b)^2}\tau - 2(1+s)^2, \]
therefore condition (79) gives
\[ \begin{cases} \tau < \dfrac{2(1-b)^2(1+b^k)^2}{h^2m^2(1-2kb^{k-1}+2kb^k-b^{2k})} & \text{if } 1-2kb^{k-1}+2kb^k-b^{2k} > 0; \\[2mm] \text{no condition on } \tau & \text{if } 1-2kb^{k-1}+2kb^k-b^{2k} \le 0. \end{cases} \]
In the following lemma we study the quantity $1-2kb^{k-1}+2kb^k-b^{2k}$ that appears above.
Lemma C.3. Let $f_k(b) = 1 - 2kb^{k-1} + 2kb^k - b^{2k}$ for $k \in \mathbb{N}$ and $-1 \le b \le 1$.

(i) $f_1(b) = -(1-b)^2 < 0$, $\forall\, -1 < b < 1$.

(ii) $f_2(b) = 1 - 4b + 4b^2 - b^4$ has a unique root $b = -1 + \sqrt{2}$ in $(-1,1)$; and $f_2(b) > 0$ if $-1 < b < -1+\sqrt{2}$, $f_2(b) < 0$ if $-1+\sqrt{2} < b < 1$.

(iii) If $k \ge 3$ is odd then $f_k(b)$ has exactly two roots $b_1(k) < b_2(k)$ in $(-1,1)$; if $k \ge 2$ is even then $f_k(b)$ has a unique root $b_3(k)$ in $(-1,1)$. Moreover, for every odd $k \ge 3$:
- $-1 < b_1(k) < 0 < b_2(k) < 1$;
- $f_k(b) > 0 \iff b_1(k) < b < b_2(k)$;
- $f_k(b) < 0 \iff -1 < b < b_1(k)$ or $b_2(k) < b < 1$;
and for every even $k \ge 2$:
- $0 < b_3(k) < 1$;
- $f_k(b) > 0 \iff -1 < b < b_3(k)$;
- $f_k(b) < 0 \iff b_3(k) < b < 1$.

(iv) $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_1(k) = -1$ and $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_2(k) = 1 = \lim_{\substack{k\ \text{even}\\ k\to\infty}} b_3(k)$.
Proof. (i) and (ii) are easy to verify. (iii) It remains to consider $k \ge 3$. We have
\[ f_k'(b) = b^{k-2}\left[-2k(k-1) + 2k^2b - 2kb^{k+1}\right], \quad -1 < b < 1. \]
Set
\[ g_k(b) = -2k(k-1) + 2k^2b - 2kb^{k+1}, \quad -1 \le b \le 1,\ k \ge 3. \]

Case 1 [$k \ge 3$ odd]. By studying the sign of $g_k'(b)$, we find that:
- $g_k$ has a unique root $v_1(k)$ in $(-1,1)$ and $0 < v_1(k) < \sqrt[k]{\frac{k}{k+1}} < 1$;
- $g_k(b) > 0 \iff v_1(k) < b < 1$;
- $g_k(b) < 0 \iff -1 < b < v_1(k)$.
Next, by studying the sign of $f_k'(b)$, we find that:
- $f_k(b)$ has exactly two roots $b_1(k) < b_2(k)$ in $(-1,1)$ and $-1 < b_1(k) < 0 < b_2(k) < 1$;
- $f_k(b) > 0 \iff b_1(k) < b < b_2(k)$;
- $f_k(b) < 0 \iff -1 < b < b_1(k)$ or $b_2(k) < b < 1$.

Case 2 [$k \ge 4$ even]. By studying the sign of $g_k'(b)$, we find that:
- $g_k$ has a unique root $v_2(k)$ in $(-1,1)$ and $0 < v_2(k) < \sqrt[k]{\frac{k}{k+1}} < 1$;
- $g_k(b) > 0 \iff v_2(k) < b < 1$;
- $g_k(b) < 0 \iff -1 < b < v_2(k)$.
Next, by studying the sign of $f_k'(b)$, we find that:
- $f_k(b)$ has a unique root $b_3(k)$ in $(-1,1)$ and $0 < b_3(k) < 1$;
- $f_k(b) > 0 \iff -1 < b < b_3(k)$;
- $f_k(b) < 0 \iff b_3(k) < b < 1$.

(iv) We have
\[ f_k\left(\tfrac{1}{2}\right) = 1 - \frac{k}{2^{k-1}} - \frac{1}{2^{2k}}, \ \forall k \ge 3, \qquad f_k\left(-\tfrac{1}{2}\right) = 1 - \frac{3k}{2^{k-1}} - \frac{1}{2^{2k}}, \ \forall\text{ odd } k \ge 3, \]
hence for sufficiently large $k$ we have $f_k(\frac{1}{2}) > 0$ and for sufficiently large odd $k$ we have $f_k(-\frac{1}{2}) > 0$. By the table of signs of $f_k$, we conclude that $b_1(k) < -\frac{1}{2}$ for large odd $k$, $b_2(k) > \frac{1}{2}$ for large odd $k$ and $b_3(k) > \frac{1}{2}$ for large even $k$.

Case 1 [$k \ge 3$ odd and sufficiently large]. First we work with $b_1(k)$. We have
\[ 1 - 2kb_1(k)^{k-1} + 2kb_1(k)^k - b_1(k)^{2k} = 0 \]
and $b_1(k) < -\frac{1}{2}$, so
\[ -b_1(k)^{2k} + 2kb_1(k)^k + 1 = 2kb_1(k)^{k-1} = \underbrace{\left[-2kb_1(k)^k\right]}_{>0}\cdot\left(-\frac{1}{b_1(k)}\right) < \left[-2kb_1(k)^k\right]\cdot 2 = -4kb_1(k)^k, \]
which leads to
\[ -b_1(k)^{2k} + 6kb_1(k)^k + 1 < 0 \implies \left[b_1(k)^k - 3k\right]^2 > 1 + 9k^2. \]
Since $-1 < b_1(k) < 0$ and $k$ is odd, this tells us that
\[ -1 < b_1(k) < -\left(3k + \sqrt{1+9k^2}\right)^{-1/k} = -\frac{1}{\left(3k+\sqrt{1+9k^2}\right)^{1/k}} < -\frac{1}{(7k)^{1/k}}, \]
which yields $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_1(k) = -1$. Next, we have
\[ 1 - 2kb_2(k)^{k-1} + 2kb_2(k)^k - b_2(k)^{2k} = 0 \]
and $b_2(k) > \frac{1}{2}$, so
\[ -b_2(k)^{2k} + 2kb_2(k)^k + 1 = 2kb_2(k)^{k-1} = 2kb_2(k)^k\cdot\frac{1}{b_2(k)} < 4kb_2(k)^k, \]
which leads to
\[ b_2(k)^{2k} + 2kb_2(k)^k - 1 > 0 \implies \left[b_2(k)^k + k\right]^2 > 1 + k^2. \]
Since $0 < b_2(k) < 1$, this tells us that
\[ 1 > b_2(k) > \left(-k + \sqrt{1+k^2}\right)^{1/k} = \frac{1}{\left(k+\sqrt{1+k^2}\right)^{1/k}} > \frac{1}{(3k)^{1/k}}, \]
which yields $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_2(k) = 1$.

Case 2 [$k \ge 4$ even and sufficiently large]. We repeat the same arguments as for $b_2(k)$ with $b_3(k)$.
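A few of the claims of Lemma C.3 can be checked numerically; the following sketch (with a helper `f` of ours) verifies the explicit root of $f_2$ and the sign of $f_1$.

```python
import math

# Spot-checks for Lemma C.3 (the helper f is ours):
def f(k, b):
    return 1 - 2 * k * b**(k - 1) + 2 * k * b**k - b**(2 * k)

# (ii) the root of f_2 in (-1, 1) is b = -1 + sqrt(2), with a sign change there
r = -1 + math.sqrt(2)
assert abs(f(2, r)) < 1e-12
assert f(2, r - 0.01) > 0 > f(2, r + 0.01)
# (i) f_1(b) = -(1-b)^2 < 0 on (-1, 1)
assert all(f(1, i / 10) < 0 for i in range(-9, 10))
```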
In summary, we have the following proposition.

Proposition C.4 (Convergence of k-step one-shot). Let $\eta_1(k,b) := +\infty$ and
\[ \eta_{21}(k,b) := \frac{(1-b)^2(1+b^k)(1-b^k)^2}{b^{k-1}\left[k-(k+1)b+kb^k-(k-1)b^{k+1}\right]}; \]
\[ \eta_{22}(k,b) := \frac{-(1-b)^2(1+b^k)^3}{b^{k-1}\left[k-(k+1)b+kb^k-(k-1)b^{k+1}\right]}; \]
\[ \eta_3(k,b) := \frac{2(1-b)^2(1+b^k)^2}{1-2kb^{k-1}+2kb^k-b^{2k}} \]
then the necessary and sufficient condition for the convergence of k-step one-shot in the scalar case is of the form $\tau < \frac{\eta(k,b)}{h^2m^2}$, where $\eta(k,b)$ is defined as follows:

(i) $\eta(1,b) = \eta_{21}(1,b) = (1-b)^3(1+b)$, $\forall\,-1 < b < 1$;

(ii) for odd $k \ge 3$,
\[ \eta(k,b) = \begin{cases} \eta_{21}(k,b), & -1 < b \le b_1(k) \text{ or } b_2(k) \le b < 1, \\ \min\{\eta_{21}(k,b), \eta_3(k,b)\}, & b_1(k) < b < b_2(k),\ b \ne 0, \\ 2, & b = 0, \end{cases} \]
where $-1 < b_1(k) < 0 < b_2(k) < 1$ are the two roots of
\[ 1-2kb^{k-1}+2kb^k-b^{2k} = 0, \quad -1 < b < 1; \]

(iii) for even $k \ge 2$,
\[ \eta(k,b) = \begin{cases} \eta_{21}(k,b), & b_3(k) \le b < 1, \\ \min\{\eta_{21}(k,b), \eta_3(k,b)\}, & 0 < b < b_3(k), \\ 2, & b = 0, \\ \min\{\eta_{22}(k,b), \eta_3(k,b)\}, & -1 < b < 0, \end{cases} \]
where $0 < b_3(k) < 1$ is the unique root of
\[ 1-2kb^{k-1}+2kb^k-b^{2k} = 0, \quad -1 < b < 1. \]

Note that $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_1(k) = -1$ and $\lim_{\substack{k\ \text{odd}\\ k\to\infty}} b_2(k) = 1 = \lim_{\substack{k\ \text{even}\\ k\to\infty}} b_3(k)$, so the behavior of $\tau$ when $k\to\infty$ is consistent with the result $\tau < \frac{2(1-b)^2}{h^2m^2}$, $\forall\,-1 < b < 1$, for the usual gradient descent. For illustrations of the function $\eta(k,b)$ for different $k$ see Section C.3.
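The sharpness of the bound in case (i) can be tested numerically; the sketch below (ours, with $h = m = 1$ and helper names of our choosing) combines the coefficients of Section C.2.3 for $k = 1$ with the criterion of Lemma C.1, just below and just above the threshold $\eta(1,b)$.

```python
# Sketch: sharpness of the 1-step one-shot bound eta(1,b) = (1-b)^3 (1+b)
# (Proposition C.4 with h = m = 1), tested through the criterion of Lemma C.1.
def inside(a0, a1, a2):
    return ((a0 - 1) * (a0 + 1) < 0
            and (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1) > 0
            and (a0 + a2 - a1 - 1) * (a0 + a2 + a1 + 1) < 0)

def one_step_coeffs(b, tau):
    # k = 1, h = m = 1: s = b, t_1 = 1, x_1 = 0, hence
    # a0 = -s^2, a1 = (t_1^2 - x_1) tau + s^2 + 2 s, a2 = x_1 tau - (2 s + 1)
    return -b**2, tau + b**2 + 2 * b, -(2 * b + 1)

for b in [-0.5, 0.0, 0.5, 0.9]:
    eta = (1 - b)**3 * (1 + b)
    assert inside(*one_step_coeffs(b, 0.99 * eta))       # just below: converges
    assert not inside(*one_step_coeffs(b, 1.01 * eta))   # just above: diverges
```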
C.2.4 Descent step for shifted k-step one-shot

Here, the coefficients of the polynomial $P$ of the eigenvalue equation are
\[ a_0 = h^2m^2v_k\tau - s^2, \quad a_1 = h^2m^2y_k\tau + s^2 + 2s, \quad a_2 = -2s - 1, \tag{83} \]
where $s = b^k$, $y_k = \frac{x_k}{h^2} = \frac{1-kb^{k-1}+(k-1)b^k}{(1-b)^2}$ and $v_k = t_k^2 - y_k = \frac{b^{k-1}\left[k-(k+1)b+b^{k+1}\right]}{(1-b)^2}$. Note that $v_k$ and $b^{k-1}$ have the same sign, and $v_k = 0$ if and only if $k \ge 2$ and $b = 0$, since it is easy to show that $k - (k+1)b + b^{k+1} > 0$, $\forall |b| < 1$, $\forall k \ge 1$. Then, condition (77) of Lemma C.1 imposes
\[ \begin{cases} \tau < \dfrac{1+s^2}{h^2m^2v_k} = \dfrac{(1-b)^2(1+b^{2k})}{h^2m^2b^{k-1}\left[k-(k+1)b+b^{k+1}\right]} & \text{if } b^{k-1} > 0; \\[2mm] \tau < -\dfrac{1+s^2}{h^2m^2v_k} = -\dfrac{(1-b)^2(1+b^{2k})}{h^2m^2b^{k-1}\left[k-(k+1)b+b^{k+1}\right]} & \text{if } b^{k-1} < 0; \\[2mm] \text{no condition on } \tau & \text{if } k \ge 2 \text{ and } b = 0. \end{cases} \]
Next we study condition (78). We have
\[ a_0^2 - a_2a_0 + a_1 - 1 = v_k^2(h^2m^2\tau)^2 + \left[(-2s^2+2s+1)v_k + y_k\right]h^2m^2\tau + \underbrace{(s-1)^3(s+1)}_{<0} \]
and
\[ a_0^2 + a_2a_0 - a_1 - 1 = v_k^2(h^2m^2\tau)^2 - \left[(2s^2+2s+1)v_k + y_k\right]h^2m^2\tau + \underbrace{(s-1)(s+1)^3}_{<0}, \]
each of which, considered as a second-order polynomial in $h^2m^2\tau$ if $v_k \ne 0$, has exactly two roots of opposite signs. Therefore if $v_k \ne 0$, condition (78) is equivalent to $(h^2m^2\tau - r_1)(h^2m^2\tau - r_2) > 0$, where
\[ r_1 := \frac{(2s^2-2s-1)v_k - y_k + \sqrt{(5-4s)v_k^2 + y_k^2 + 2(-2s^2+2s+1)v_ky_k}}{2v_k^2} > 0 \]
and
\[ r_2 := \frac{(2s^2+2s+1)v_k + y_k + \sqrt{(8s^2+12s+5)v_k^2 + y_k^2 + 2(2s^2+2s+1)v_ky_k}}{2v_k^2} > 0. \]

Lemma C.5. $r_1$ and $r_2$ cannot be both strictly less than $\frac{1+s^2}{v_k}$. $r_1$ and $r_2$ cannot be both strictly less than $-\frac{1+s^2}{v_k}$.

Proof. Either $r_1 < \frac{1+s^2}{v_k}$ or $r_1 < -\frac{1+s^2}{v_k}$ implies $(s^2+4s+1)v_k^2 + (s^2+1)v_ky_k > 0$. Either $r_2 < \frac{1+s^2}{v_k}$ or $r_2 < -\frac{1+s^2}{v_k}$ implies $(s^2+4s+1)v_k^2 + (s^2+1)v_ky_k < 0$.

Thanks to this lemma we see that condition (78), in combination with condition (77), gives
\[ \begin{cases} \tau < \frac{1}{h^2m^2}\min\{r_1, r_2\} & \text{if } b^{k-1} \ne 0; \\ \tau < \frac{1}{h^2m^2} & \text{if } k \ge 2 \text{ and } b = 0. \end{cases} \]
Finally, we have $a_0 + a_2 + a_1 + 1 = h^2m^2t_k^2\tau > 0$ and
\[ a_0 + a_2 - a_1 - 1 = \frac{h^2m^2}{(1-b)^2}\left[-1 + 2kb^{k-1} - 2kb^k + b^{2k}\right]\tau - 2(1+b^k)^2, \]
thus condition (79) is equivalent to
\[ \begin{cases} \tau < \dfrac{2(1-b)^2(1+b^k)^2}{h^2m^2\left(-1+2kb^{k-1}-2kb^k+b^{2k}\right)} & \text{if } 1-2kb^{k-1}+2kb^k-b^{2k} < 0; \\[2mm] \text{no condition on } \tau & \text{if } 1-2kb^{k-1}+2kb^k-b^{2k} \ge 0. \end{cases} \]
One can look again at Lemma C.3 for the analysis of $1-2kb^{k-1}+2kb^k-b^{2k}$. In summary, we have the following proposition.
Proposition C.6 (Convergence of shifted k-step one-shot). Let
\[ \kappa_{11}(k,b) := \frac{(1-b)^2(1+b^{2k})}{b^{k-1}\left[k-(k+1)b+b^{k+1}\right]}; \qquad \kappa_{12}(k,b) := -\frac{(1-b)^2(1+b^{2k})}{b^{k-1}\left[k-(k+1)b+b^{k+1}\right]}; \]
\[ t_k := \frac{1-b^k}{1-b}, \quad y_k := \frac{1-kb^{k-1}+(k-1)b^k}{(1-b)^2}, \quad s := b^k, \quad v_k := t_k^2 - y_k, \]
\[ \kappa_{21}(k,b) := \frac{(2s^2-2s-1)v_k - y_k + \sqrt{(5-4s)v_k^2 + y_k^2 + 2(-2s^2+2s+1)v_ky_k}}{2v_k^2}, \]
\[ \kappa_{22}(k,b) := \frac{(2s^2+2s+1)v_k + y_k + \sqrt{(8s^2+12s+5)v_k^2 + y_k^2 + 2(2s^2+2s+1)v_ky_k}}{2v_k^2}, \]
\[ \kappa_2(k,b) := \min\{\kappa_{21}(k,b), \kappa_{22}(k,b)\}; \qquad \kappa_3(k,b) := \frac{2(1-b)^2(1+b^k)^2}{-1+2kb^{k-1}-2kb^k+b^{2k}} \]
then the necessary and sufficient condition for the convergence of shifted k-step one-shot in the scalar case is of the form $\tau < \frac{\kappa(k,b)}{h^2m^2}$, where $\kappa(k,b)$ is defined as follows:

(i) $\kappa(1,b) = \min\{\kappa_{11}(1,b), \kappa_2(1,b), \kappa_3(1,b)\}$; note that
\[ \kappa_{11}(1,b) = 1 + b^2, \qquad \kappa_{21}(1,b) = \frac{2b^2 - 2b - 1 + \sqrt{-4b+5}}{2}, \]
\[ \kappa_{22}(1,b) = \frac{2b^2 + 2b + 1 + \sqrt{8b^2+12b+5}}{2}, \qquad \kappa_3(1,b) = 2(1+b)^2; \]

(ii) for odd $k \ge 3$,
\[ \kappa(k,b) = \begin{cases} \min\{\kappa_{11}(k,b), \kappa_2(k,b), \kappa_3(k,b)\}, & -1 < b < b_1(k) \text{ or } b_2(k) < b < 1, \\ \min\{\kappa_{11}(k,b), \kappa_2(k,b)\}, & b_1(k) \le b \le b_2(k),\ b \ne 0, \\ 1, & b = 0, \end{cases} \]
where $-1 < b_1(k) < 0 < b_2(k) < 1$ are the two roots of
\[ 1-2kb^{k-1}+2kb^k-b^{2k} = 0, \quad -1 < b < 1; \]

(iii) for even $k \ge 2$,
\[ \kappa(k,b) = \begin{cases} \min\{\kappa_{11}(k,b), \kappa_2(k,b), \kappa_3(k,b)\}, & b_3(k) < b < 1, \\ \min\{\kappa_{11}(k,b), \kappa_2(k,b)\}, & 0 < b \le b_3(k), \\ 1, & b = 0, \\ \min\{\kappa_{12}(k,b), \kappa_2(k,b)\}, & -1 < b < 0, \end{cases} \]
where $0 < b_3(k) < 1$ is the unique root of
\[ 1-2kb^{k-1}+2kb^k-b^{2k} = 0, \quad -1 < b < 1. \]

Remark C.7. In implementation, to avoid numerical errors (cancellation when $v_k$ is small), we rewrite $\kappa_{21}(k,b)$ in the rationalized form
\[ \kappa_{21}(k,b) = \frac{2(1-s)^3(1+s)}{y_k + (1+2s-2s^2)v_k + \sqrt{(5-4s)v_k^2 + y_k^2 + 2(-2s^2+2s+1)v_ky_k}}. \]
In this formula, we see that $\kappa_{21}(k,b) \xrightarrow{k\to\infty} (1-b)^2$ (note that $y_k = \frac{1-kb^{k-1}+(k-1)b^k}{(1-b)^2} \xrightarrow{k\to\infty} \frac{1}{(1-b)^2}$ and $v_k = t_k^2 - y_k \xrightarrow{k\to\infty} 0$).

For illustrations of the function $\kappa(k,b)$ for different $k$ see Section C.3.
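Again the sharpness of the threshold can be probed numerically; the sketch below (ours, with $h = m = 1$, $b = 0.5$ and helper names of our choosing) evaluates $\kappa(1,b)$ from the closed forms of Proposition C.6 (i) and tests it through the criterion of Lemma C.1.

```python
import math

# Sketch: sharpness of the shifted 1-step one-shot bound kappa(1,b) at b = 0.5
# (Proposition C.6 with h = m = 1), tested through the criterion of Lemma C.1.
def inside(a0, a1, a2):
    return ((a0 - 1) * (a0 + 1) < 0
            and (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1) > 0
            and (a0 + a2 - a1 - 1) * (a0 + a2 + a1 + 1) < 0)

b = 0.5
k11 = 1 + b**2
k21 = (2 * b**2 - 2 * b - 1 + math.sqrt(5 - 4 * b)) / 2
k22 = (2 * b**2 + 2 * b + 1 + math.sqrt(8 * b**2 + 12 * b + 5)) / 2
k3 = 2 * (1 + b)**2
kappa = min(k11, k21, k22, k3)

def shifted_coeffs(tau):
    # k = 1, h = m = 1: s = b, v_1 = 1, y_1 = 0  (equation (83))
    return tau - b**2, b**2 + 2 * b, -2 * b - 1

assert inside(*shifted_coeffs(0.99 * kappa))        # just below: converges
assert not inside(*shifted_coeffs(1.01 * kappa))    # just above: diverges
```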
C.3 Comparison of the bounds for the descent step

In summary, in the scalar case, the necessary and sufficient convergence conditions on the descent step $\tau > 0$ are:
\[ \tau < \frac{2(1-b)^2}{h^2m^2}, \qquad \tau < \frac{(1-b)^2}{h^2m^2}, \qquad \tau < \frac{\eta(k,b)}{h^2m^2}, \qquad \tau < \frac{\kappa(k,b)}{h^2m^2}, \]
respectively for the usual GD, the shifted GD, k-step one-shot (with $\eta(k,b)$ given in Proposition C.4) and shifted k-step one-shot (with $\kappa(k,b)$ given in Proposition C.6). By taking $m = h = 1$, in Figure 5 we plot for different $k$ the functions $b \mapsto 2(1-b)^2$ (usual GD), $b \mapsto (1-b)^2$ (shifted GD), $b \mapsto \eta(k,b)$ (k-step one-shot) and $b \mapsto \kappa(k,b)$ (shifted k-step one-shot).

From these plots we can draw two important conclusions. First, when $k$ increases, the curves for k-step one-shot and shifted k-step one-shot tend to the corresponding curves for the usual and shifted gradient descent, as expected. Second, even in this scalar case, it appears difficult to derive a simplified expression of $\eta(k,b)$ in Proposition C.4 and $\kappa(k,b)$ in Proposition C.6 that would provide a practical upper bound for the descent step $\tau$.

Remark C.8. For $k \ge 2$, we observe that for some $b$ the admissible range of $\tau$ for k-step one-shot is larger than the one for the usual GD, which is not intuitive. This is indeed verified numerically using FreeFEM: when $b = 0.2$ and $\tau = 2.08$, 2-step one-shot converges while the usual GD does not.
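The observation of Remark C.8 can also be reproduced in the scalar setting with the criterion of Lemma C.1; the sketch below (ours, assuming $h = m = 1$) checks both methods at $b = 0.2$, $\tau = 2.08$.

```python
# Sketch: checking the observation of Remark C.8 with h = m = 1, b = 0.2,
# tau = 2.08, through the criterion of Lemma C.1.
def inside(a0, a1, a2):
    return ((a0 - 1) * (a0 + 1) < 0
            and (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1) > 0
            and (a0 + a2 - a1 - 1) * (a0 + a2 + a1 + 1) < 0)

b, tau, k = 0.2, 2.08, 2
# usual GD: a0 = a1 = 0, a2 = (1-b)^{-2} tau - 1   (Section C.2.1)
assert not inside(0.0, 0.0, tau / (1 - b)**2 - 1)           # diverges
# 2-step one-shot: coefficients from Section C.2.3 with s = b^k
s, t, x = b**k, 1 + b, 1.0                                  # t_2, x_2 for h = 1
a0 = -s**2
a1 = (t**2 - x) * tau + s**2 + 2 * s
a2 = x * tau - (2 * s + 1)
assert inside(a0, a1, a2)                                   # converges
```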
D A proof of Lemma C.1 based on Marden's works

Definition D.1. We say that a complex-coefficient polynomial has property P if all its zeros lie (strictly) inside the unit circle $|z| = 1$.

We recall some definitions from Marden's works [17].

Definition D.2. Let $P(z) = a_0 + a_1z + \dots + a_nz^n$ where $a_k \in \mathbb{R}$, $k = 0, \dots, n$ (we do not require $a_n \ne 0$ here). We define
\[ \tilde{P}(z) := a_n + a_{n-1}z + \dots + a_0z^n \]
and call it the reverse polynomial of $P$. One can also see that $\tilde{P}(z) = z^nP(1/z)$.

Definition D.3. Let $P(z) = a_0 + a_1z + \dots + a_nz^n$ where $a_k \in \mathbb{R}$, $k = 0, \dots, n$. We define a polynomial sequence $\{P_k\}_{0 \le k \le n}$, where
\[ P_k(z) = a_0^{(k)} + a_1^{(k)}z + \dots + a_{n-k}^{(k)}z^{n-k}, \]
as follows:
- $P_0 = P$;
- $P_{k+1} = a_0^{(k)}P_k - a_{n-k}^{(k)}\tilde{P}_k$ for $0 \le k \le n-1$.
Then we define
\[ m_k(P) = a_0^{(1)}a_0^{(2)}\cdots a_0^{(k)}, \quad 1 \le k \le n. \]
The coefficients of these polynomials can be gathered in the following table, which we call Marden's table:
Figure 5: Admissible $\tau$ in the scalar case as a function of $b$: (a) shifted 1-step one-shot; (b) 1-step one-shot; (c) shifted k-step one-shot, odd $k \ge 3$; (d) k-step one-shot, odd $k \ge 3$; (e) shifted k-step one-shot, even $k \ge 2$; (f) k-step one-shot, even $k \ge 2$.
              | 1              x              x^2            ...   x^{n-1}         x^n
  P_0         | a_0            a_1            a_2            ...   a_{n-1}         a_n
  ~P_0        | a_n            a_{n-1}        a_{n-2}        ...   a_1             a_0
  P_1         | a_0^{(1)}      a_1^{(1)}      a_2^{(1)}      ...   a_{n-1}^{(1)}
  ~P_1        | a_{n-1}^{(1)}  a_{n-2}^{(1)}  a_{n-3}^{(1)}  ...   a_0^{(1)}
  ...         | ...            ...            ...            ...
  P_{n-1}     | a_0^{(n-1)}    a_1^{(n-1)}
  ~P_{n-1}    | a_1^{(n-1)}    a_0^{(n-1)}
  P_n         | a_0^{(n)}
We have a nice and simple criterion, mainly based on the works of Marden [16, 17] and Jury [13, 14], known as the Jury–Marden criterion:

Theorem D.4 (Jury–Marden criterion). The polynomial $P$ has property P if and only if
\[ a_0^{(1)} < 0; \qquad a_0^{(k)} > 0, \quad 2 \le k \le n. \]

This necessary and sufficient condition is mentioned several times in the literature (see e.g. [1, Theorem 3.10]), but it is not easy to find an explicit proof, so we provide one for the reader's convenience. Before proving this result, we apply the Jury–Marden criterion to a polynomial of degree 3 and obtain precisely Lemma C.1, that is, the following proposition.

Proposition D.5. Let $P(z) = a_0 + a_1z + a_2z^2 + z^3$, $z \in \mathbb{C}$, where $a_0, a_1, a_2 \in \mathbb{R}$. Then $P$ has property P if and only if
\[ (a_0 - 1)(a_0 + 1) < 0, \]
\[ (a_0^2 - a_2a_0 + a_1 - 1)(a_0^2 + a_2a_0 - a_1 - 1) > 0, \]
\[ (a_0 + a_2 - a_1 - 1)(a_0 + a_2 + a_1 + 1) < 0. \]
Proof. By directly applying the Jury–Marden criterion to $P$, we obtain Marden's table as follows:

          | 1                                   x                                   x^2            x^3
  P_0 = P | a_0                                 a_1                                 a_2            1
  ~P_0    | 1                                   a_2                                 a_1            a_0
  P_1     | a_0^2 - 1                           a_1 a_0 - a_2                       a_2 a_0 - a_1
  ~P_1    | a_2 a_0 - a_1                       a_1 a_0 - a_2                       a_0^2 - 1
  P_2     | (a_0^2-1)^2 - (a_2 a_0 - a_1)^2     (a_1 a_0 - a_2)(a_0^2 - a_2 a_0 + a_1 - 1)
  ~P_2    | (a_1 a_0 - a_2)(a_0^2 - a_2 a_0 + a_1 - 1)     (a_0^2-1)^2 - (a_2 a_0 - a_1)^2

and
\[ P_3(x) = \left[(a_0^2-1)^2 - (a_2a_0-a_1)^2\right]^2 - (a_1a_0-a_2)^2(a_0^2-a_2a_0+a_1-1)^2. \]
Hence
\[ a_0^{(1)} = a_0^2 - 1 = (a_0-1)(a_0+1), \]
\[ a_0^{(2)} = (a_0^2-1)^2 - (a_2a_0-a_1)^2 = (a_0^2-a_2a_0+a_1-1)(a_0^2+a_2a_0-a_1-1), \]
\begin{align*}
a_0^{(3)} &= \left[(a_0^2-1)^2 - (a_2a_0-a_1)^2\right]^2 - (a_1a_0-a_2)^2(a_0^2-a_2a_0+a_1-1)^2 \\
&= \left[(a_0^2+a_2a_0-a_1-1)^2 - (a_1a_0-a_2)^2\right](a_0^2-a_2a_0+a_1-1)^2 \\
&= \left[a_0^2 + (a_2-a_1)a_0 + a_2-a_1-1\right]\left[a_0^2 + (a_2+a_1)a_0 - a_2-a_1-1\right](a_0^2-a_2a_0+a_1-1)^2 \\
&= (a_0+1)(a_0+a_2-a_1-1)(a_0-1)(a_0+a_2+a_1+1)(a_0^2-a_2a_0+a_1-1)^2.
\end{align*}
Then the condition $a_0^{(1)} < 0$, $a_0^{(2)} > 0$, $a_0^{(3)} > 0$, after simplification, is equivalent to the three inequalities of the statement.
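The algorithm of Definition D.3 is easy to implement and can be compared against the closed-form expressions just derived; the sketch below is ours (function names are assumptions, not from the report).

```python
# Sketch (function names are ours): Marden's sequence a0^(k) from Definition D.3,
# compared with the closed-form expressions obtained in Proposition D.5 for a cubic.
def marden_a0_sequence(coeffs):
    # coeffs = [a0, a1, ..., an]; returns [a0^(1), ..., a0^(n)]
    seq, c = [], list(coeffs)
    while len(c) > 1:
        rev = c[::-1]                   # coefficients of the reverse polynomial
        # P_{k+1} = a0^(k) P_k - a_{n-k}^(k) ~P_k (top coefficient cancels)
        c = [c[0] * c[j] - c[-1] * rev[j] for j in range(len(c) - 1)]
        seq.append(c[0])
    return seq

a0, a1, a2 = -0.125, 0.75, -1.5         # P(z) = (z - 1/2)^3, property P holds
s1, s2, s3 = marden_a0_sequence([a0, a1, a2, 1.0])
assert abs(s1 - (a0 - 1) * (a0 + 1)) < 1e-12
assert abs(s2 - (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1)) < 1e-12
assert abs(s3 - (a0 + 1) * (a0 + a2 - a1 - 1) * (a0 - 1) * (a0 + a2 + a1 + 1)
               * (a0**2 - a2 * a0 + a1 - 1)**2) < 1e-12
# Jury-Marden criterion (Theorem D.4): property P iff s1 < 0 and s2, s3 > 0
assert s1 < 0 < s2 and s3 > 0
```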
Now, to prove the Jury–Marden criterion, we need the following two results.

Theorem D.6 (Marden, [17], Theorem 42.1). Let $P$ be a real-coefficient polynomial of degree $n$. If the sequence
\[ m_1(P), m_2(P), \dots, m_n(P) \]
has exactly $p$ negative elements and $n - p$ positive elements (hence no null elements), then $P$ has $p$ complex roots (counted with multiplicity) inside the unit circle $|z| = 1$, no roots on this circle, and $n - p$ complex roots (counted with multiplicity) outside this circle.

Lemma D.7 (Schur, [20]). Let $P(z) = a_0 + a_1z + \dots + a_nz^n$ where $a_k \in \mathbb{R}$, $k = 0, \dots, n$. Assume that $|a_0| < |a_n|$. Then $\deg \tilde{P}_1 = n - 1$, and $P$ has property P if and only if $\tilde{P}_1$ has property P.
Proof of the Jury–Marden criterion (Theorem D.4). The sufficiency is a direct consequence of Marden's Theorem D.6. It remains to prove the necessity.

For that, we prove the following statement $M(n)$ by induction: "For every real-coefficient polynomial $P$ of degree $n$ having property P, the sequence $a_0^{(1)}, \dots, a_0^{(n)}$ obtained by Marden's algorithm must satisfy
\[ a_0^{(1)} < 0, \qquad a_0^{(k)} > 0, \quad 2 \le k \le n." \]
To check $M(1)$, let $P(z) = a_0 + a_1z$ where $a_0, a_1 \in \mathbb{R}$, $a_1 \ne 0$. Then $P(z) = 0 \iff z = -a_0/a_1$, and $|-a_0/a_1| < 1 \iff |a_0| < |a_1| \iff a_0^{(1)} = a_0^2 - a_1^2 < 0$.

Now, supposing that $M(n-1)$ is true for some $n \in \mathbb{N}$, $n \ge 2$, we show that $M(n)$ is true. Let $P(z) = a_0 + a_1z + \dots + a_nz^n$ where $a_k \in \mathbb{R}$, $k = 0, \dots, n$, and $a_n \ne 0$. Assume that $P$ has property P. First, $a_0^{(1)} = a_0^2 - a_n^2 < 0$. Indeed, let $z_1, z_2, \dots, z_n$ be the $n$ zeros (counted with multiplicity) of $P$; then by Viète's formulas $z_1z_2\cdots z_n = (-1)^n(a_0/a_n)$. Taking the modulus of both sides of this identity and noting that $P$ has property P, we get $|a_0/a_n| < 1$, thus $a_0^{(1)} = a_0^2 - a_n^2 < 0$. Next, by Lemma D.7, $\tilde{P}_1$ is of degree $n-1$ and it also has property P. Marden's table for $\tilde{P}_1$ can be easily found:

            | 1                x                x^2              ...   x^{n-3}          x^{n-2}          x^{n-1}
  ~P_1      | a_{n-1}^{(1)}    a_{n-2}^{(1)}    a_{n-3}^{(1)}    ...   a_2^{(1)}        a_1^{(1)}        a_0^{(1)}
  P_1       | a_0^{(1)}        a_1^{(1)}        a_2^{(1)}        ...   a_{n-3}^{(1)}    a_{n-2}^{(1)}    a_{n-1}^{(1)}
  -P_2      | -a_0^{(2)}       -a_1^{(2)}       -a_2^{(2)}       ...   -a_{n-3}^{(2)}   -a_{n-2}^{(2)}
  -~P_2     | -a_{n-2}^{(2)}   -a_{n-3}^{(2)}   -a_{n-4}^{(2)}   ...   -a_1^{(2)}       -a_0^{(2)}
  P_3       | a_0^{(3)}        a_1^{(3)}        a_2^{(3)}        ...   a_{n-3}^{(3)}
  ~P_3      | a_{n-3}^{(3)}    a_{n-4}^{(3)}    a_{n-5}^{(3)}    ...   a_0^{(3)}
  ...       | ...              ...              ...              ...
  P_{n-1}   | a_0^{(n-1)}      a_1^{(n-1)}
  ~P_{n-1}  | a_1^{(n-1)}      a_0^{(n-1)}
  P_n       | a_0^{(n)}

By $M(n-1)$, we must then have $-a_0^{(2)} < 0$, i.e. $a_0^{(2)} > 0$, and $a_0^{(k)} > 0$, $3 \le k \le n$, which, together with $a_0^{(1)} < 0$ shown above, proves $M(n)$.