ISSN 0249-6399 ISRN INRIA/RR--9477--FR+ENG
RESEARCH REPORT N° 9477
July 2022
Project-Teams IDEFIX
Convergence analysis of multi-step one-shot methods for linear inverse problems
Marcella Bonazzoli, Houssem Haddar, Tuan Anh Vu
RESEARCH CENTRE
SACLAY – ÎLE-DE-FRANCE
1 rue Honoré d’Estienne d’Orves
Bâtiment Alan Turing
Campus de l’École Polytechnique
91120 Palaiseau
Convergence analysis of multi-step one-shot methods for linear inverse problems
Marcella Bonazzoli∗, Houssem Haddar∗, Tuan Anh Vu∗
Project-Teams IDEFIX
Research Report n°9477 — July 2022 — 55 pages
Abstract: In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. Then one-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can be applied. We analyze two variants of the so-called multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are provided to illustrate the convergence of these methods in comparison with the usual and shifted gradient descent methods. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
Key-words: inverse problems, one-shot methods, convergence analysis, parameter identification
∗Inria, UMA, ENSTA Paris, Institut Polytechnique de Paris
Convergence analysis for multi-step one-shot inversion methods

Résumé: In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. Thus, one-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can be applied. We consider two variants of multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical tests are presented to illustrate the convergence of these methods in comparison with the usual and shifted gradient descent methods. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.

Mots-clés: inverse problems, one-shot methods, convergence analysis, parameter identification
Contents
1 Introduction 4
2 Multi-step one-shot inversion methods 5
3 Convergence of one-step one-shot methods (k = 1) 7
  3.1 Block iteration matrices and eigenvalue equations 7
  3.2 Real eigenvalues 10
  3.3 Complex eigenvalues 11
  3.4 Final result (k = 1) 15
4 Convergence of multi-step one-shot methods (k ≥ 2) 16
  4.1 Block iteration matrices and eigenvalue equations 16
  4.2 Real eigenvalues 18
  4.3 Complex eigenvalues 20
  4.4 Final result (k ≥ 2) 26
5 Inverse problem with complex forward problem and real parameter 26
6 Numerical experiments 28
7 Conclusion 33
A Some useful lemmas 36
B Descent step for usual and shifted gradient descent 42
C Convergence study for the scalar case 44
  C.1 Notations and preliminary calculation 44
  C.2 Necessary and sufficient conditions for convergence 45
    C.2.1 Descent step for the usual gradient descent 45
    C.2.2 Descent step for the shifted gradient descent 45
    C.2.3 Descent step for k-step one-shot 46
    C.2.4 Descent step for shifted k-step one-shot 49
  C.3 Comparison of the bounds for the descent step 52
D A proof of Lemma C.1 based on Marden's works 52
1 Introduction
For large-scale inverse problems, which often arise in real-life applications, the solution of the
corresponding forward and adjoint problems is generally computed using an iterative solver, such
as (preconditioned) fixed point or Krylov subspace methods. Indeed, the corresponding linear
systems could be too large to be handled with direct solvers (e.g. LU-type solvers), and iterative
solvers are easier to parallelize on many cores. Naturally this leads to the idea of one-step
one-shot methods, which iterate at the same time on the forward problem solution (the state
variable), the adjoint problem solution (the adjoint state) and on the inverse problem unknown
(the parameter or design variable). If two or more inner iterations are performed on the state
and adjoint state before updating the parameter (by starting from the previous iterates as initial
guess for the state and adjoint state), we speak of multi-step one-shot methods. Our goal is to
rigorously analyze the convergence of such inversion methods. In particular, we are interested
in those schemes where the inner iterations on the direct and adjoint problems are incomplete,
i.e. stopped before achieving convergence. Indeed, solving the forward and adjoint problems
exactly by direct solvers or very accurately by iterative solvers could be very time-consuming
with little improvement in the accuracy of the inverse problem solution.
The concept of one-shot methods was first introduced by Ta’asan [22] for optimal control
problems. Based on this idea, a variety of related methods, such as the all-at-once methods, where
the state equation is included in the misfit functional, were developed for aerodynamic shape
optimization, see for instance [23,21,11,19,18] and the literature review in the introduction
of [19]. All-at-once approaches to inverse problems for parameter identification were studied
in, e.g., [8,2,15]. An alternative method, called Wavefield Reconstruction Inversion (WRI),
was introduced for seismic imaging in [25], as an improvement of the classical Full Waveform
Inversion (FWI) [24]. WRI is a penalty method which combines the advantages of the all-at-once
approach with those of the reduced approach (where the state equation represents a constraint
and is enforced at each iteration, as in FWI), and was extended to more general inverse problems
in [26].
Few convergence proofs, especially for the multi-step one-shot methods, are available in the literature. In particular, for non-linear design optimization problems, Griewank [6] proposed a version
of one-step one-shot methods where a Hessian-based preconditioner is used in the design variable
iteration. The author proved conditions to ensure that the real eigenvalues of the Jacobian of the
coupled iterations are smaller than 1, but these are just necessary and not sufficient conditions
to exclude real eigenvalues smaller than −1. In addition, no condition to also bound complex eigenvalues below 1 in modulus was found, and multi-step methods were not investigated. In [9,10,4] an exact penalty function of doubly augmented Lagrangian type was introduced to coordinate the coupled iterations, and global convergence of the proposed optimization approach was
proved under some assumptions. In [7] this particular one-step one-shot approach was extended
to time-dependent problems.
In this work, we consider two variants of multi-step one-shot methods where the forward
and adjoint problems are solved using fixed point methods and the inverse problem is solved
using gradient descent methods. This is a preparatory work where we focus on (discretized)
linear inverse problems. Note that the present analysis in the linear case implies also local
convergence in the non-linear case. The only basic assumptions we require are the inverse problem
uniqueness and the convergence of the fixed point iteration for the forward problem. To analyze
the convergence of the coupled iterations we study the real and complex eigenvalues of the
block iteration matrices. We prove that if the descent step is small enough then the considered
multi-step one-shot methods converge. Moreover, the upper bounds for the descent step in
these sufficient conditions are explicit in the number of inner iterations and in the norms of
the operators involved in the problem. In the particular scalar case (Appendix C), we establish
sufficient and also necessary convergence conditions on the descent step.
This paper is structured as follows. In Section 2, we introduce the principle of multi-step
one-shot methods and define two variants of these algorithms. Then, in Section 3, respectively
Section 4, we analyze the convergence of one-step one-shot methods, respectively multi-step
one-shot methods: first, we establish eigenvalue equations for the block matrices of the coupled
iterations, then we derive sufficient convergence conditions on the descent step by studying both
real and complex eigenvalues. In Section 5 we show that the previous analysis can be extended to the case where the state variable is complex. Finally, in Section 6 we test numerically the
performance of the different algorithms on a toy 2D Helmholtz inverse problem.
Throughout this work, $\langle\cdot,\cdot\rangle$ denotes the usual Hermitian scalar product in $\mathbb{C}^n$, that is, $\langle x,y\rangle := \bar{y}^{\top}x$ for all $x,y\in\mathbb{C}^n$, and $\|\cdot\|$ the vector/matrix norms induced by $\langle\cdot,\cdot\rangle$. We denote by $A^* = \bar{A}^{\top}$ the adjoint operator of a matrix $A\in\mathbb{C}^{m\times n}$, and likewise by $z^* = \bar{z}$ the conjugate of a complex number $z$. The identity matrix is always denoted by $I$; its size is understood from context. Finally, for a matrix $T\in\mathbb{C}^{n\times n}$ with $\rho(T)<1$, we define
$$ s(T) := \sup_{z\in\mathbb{C},\,|z|\ge 1}\left\|\left(I-\frac{T}{z}\right)^{-1}\right\|, $$
which is further studied in Appendix A.
2 Multi-step one-shot inversion methods
We focus on (discretized) linear inverse problems, which correspond to a direct (or forward) problem of the form: find $u\equiv u(\sigma)$ such that
$$ u = Bu + M\sigma + F \tag{1} $$
where $u\in\mathbb{R}^{n_u}$, $\sigma\in\mathbb{R}^{n_\sigma}$, $B\in\mathbb{R}^{n_u\times n_u}$, $M\in\mathbb{R}^{n_u\times n_\sigma}$ and $F\in\mathbb{R}^{n_u}$. Here $I-B$ is the invertible matrix of the direct problem, obtained after discretization, with parameter $\sigma$. Note that in the non-linear case $B$ would be a function of $\sigma$. Equation (1) is also called state equation and $u$ is called state. Given $\sigma$, we can solve for $u$ by a fixed point iteration
$$ u^{\ell+1} = Bu^{\ell} + M\sigma + F, \quad \ell = 0,1,\dots, \tag{2} $$
which converges for any initial guess $u^0$ if and only if the spectral radius $\rho(B)$ is strictly less than $1$ (see e.g. [5, Theorem 2.1.1]). Hence we assume $\rho(B)<1$. Now, we measure $f = Hu(\sigma)$, where $H\in\mathbb{R}^{n_f\times n_u}$, and we are interested in the linear inverse problem of finding $\sigma$ from $f$. In order to guarantee the uniqueness of the inverse problem, we assume that $H(I-B)^{-1}M$ is injective. In summary, we set
$$ \text{direct problem: } u = Bu + M\sigma + F, \qquad \text{inverse problem: measure } f = Hu(\sigma),\ \text{find } \sigma, \tag{3} $$
with the assumptions:
$$ \rho(B)<1, \qquad H(I-B)^{-1}M \text{ is injective.} \tag{4} $$
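To fix ideas, the state equation and its fixed point iteration (2) can be sketched in a few lines; the matrices below are randomly generated for illustration only (they are not the operators of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
nu, nsigma = 6, 3

# Random data with ||B||_2 = 0.5, so that rho(B) <= ||B||_2 < 1.
B = rng.standard_normal((nu, nu))
B *= 0.5 / np.linalg.norm(B, 2)
M = rng.standard_normal((nu, nsigma))
F = rng.standard_normal(nu)
sigma = rng.standard_normal(nsigma)

# Fixed point iteration (2): u^{l+1} = B u^l + M sigma + F.
u = np.zeros(nu)
for _ in range(200):
    u = B @ u + M @ sigma + F

# Direct solve of the state equation (1): (I - B) u = M sigma + F.
u_direct = np.linalg.solve(np.eye(nu) - B, M @ sigma + F)
assert np.allclose(u, u_direct)
```

Since $\|B\|_2 = 0.5$, the iteration error contracts at least by a factor $0.5$ per step, so 200 iterations reach machine precision.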
To solve the inverse problem we write its least squares formulation: given $\sigma_{ex}$ the exact solution of the inverse problem and $f := Hu(\sigma_{ex})$,
$$ \sigma_{ex} = \operatorname*{argmin}_{\sigma\in\mathbb{R}^{n_\sigma}} J(\sigma), \quad \text{where } J(\sigma) := \frac{1}{2}\|Hu(\sigma)-f\|^2. $$
Using the classical Lagrangian technique with real scalar products, we introduce the adjoint state $p\equiv p(\sigma)$, which is the solution of
$$ p = B^*p + H^*(Hu - f) $$
and allows us to compute the gradient of the cost functional
$$ \nabla J(\sigma) = M^*p(\sigma). $$
The classical gradient descent algorithm then reads

usual gradient descent:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^n = Bu^n + M\sigma^n + F, \\ p^n = B^*p^n + H^*(Hu^n - f), \end{cases} \tag{5} $$
where $\tau>0$ is the descent step size, and the state and adjoint state equations are solved exactly by a direct solver. Here $\sigma^{n+1} = \sigma^n - \tau\nabla J(\sigma^n)$; if instead we update $\sigma^{n+1} = \sigma^n - \tau\nabla J(\sigma^{n-1})$, we obtain the

shifted gradient descent:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^{n+1} = Bu^{n+1} + M\sigma^n + F, \\ p^{n+1} = B^*p^{n+1} + H^*(Hu^{n+1} - f). \end{cases} \tag{6} $$
Both algorithms converge for sufficiently small $\tau$ (see e.g. Appendix B): for any initial guess, (5) converges if
$$ \tau < \frac{2}{\|H(I-B)^{-1}M\|^2}, \tag{7} $$
and (6) converges if
$$ \tau < \frac{1}{\|H(I-B)^{-1}M\|^2}. \tag{8} $$
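As an illustration, the usual gradient descent (5) with exact state and adjoint solves can be sketched as follows; the small matrices are our own choice, picked only so that assumptions (4) hold, and the step is taken within bound (7):

```python
import numpy as np

# Small illustrative problem (these matrices are ours, not the paper's example).
B = np.array([[0.5, 0.1], [0.0, 0.4]])   # ||B||_2 < 1
M = np.eye(2)
H = np.eye(2)
F = np.array([1.0, -1.0])
I = np.eye(2)

sigma_ex = np.array([0.3, -0.7])
f = H @ np.linalg.solve(I - B, M @ sigma_ex + F)   # synthetic data f = H u(sigma_ex)

S = H @ np.linalg.solve(I - B, M)                  # S = H (I - B)^{-1} M
tau = 1.9 / np.linalg.norm(S, 2) ** 2              # satisfies bound (7): tau < 2 / ||S||^2

sigma = np.zeros(2)
for _ in range(500):
    u = np.linalg.solve(I - B, M @ sigma + F)           # exact state solve
    p = np.linalg.solve(I - B.T, H.T @ (H @ u - f))     # exact adjoint solve
    sigma = sigma - tau * M.T @ p                       # grad J(sigma) = M^* p(sigma)

assert np.allclose(sigma, sigma_ex, atol=1e-8)
```

Since $J$ is quadratic with Hessian $S^*S$, any $\tau < 2/\|S\|^2$ makes the iteration contract toward $\sigma_{ex}$.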
Here, we are interested in methods where the direct and adjoint problems are rather solved iteratively as in (2), and where we iterate at the same time on the forward problem solution and the inverse problem unknown: such methods are called one-shot methods. More precisely, we are interested in two variants of multi-step one-shot methods, defined as follows. Let $n$ be the index of the (outer) iteration on $\sigma$, the solution to the inverse problem. We update $\sigma^{n+1} = \sigma^n - \tau M^*p^n$ as in gradient descent methods, but the state and adjoint state equations are now solved by a fixed point iteration method, using just $k$ inner iterations, and coupled:
$$ \begin{cases} u^{n+1}_{\ell+1} = Bu^{n+1}_{\ell} + M\sigma + F, \\ p^{n+1}_{\ell+1} = B^*p^{n+1}_{\ell} + H^*(Hu^{n+1}_{\ell} - f), \end{cases} \quad \ell = 0,1,\dots,k-1, \qquad \begin{cases} u^{n+1} = u^{n+1}_{k}, \\ p^{n+1} = p^{n+1}_{k}, \end{cases} $$
where $\sigma$ depends on the considered variant ($\sigma = \sigma^{n+1}$ or, for the shifted methods, $\sigma = \sigma^n$). As initial guess we naturally choose $u^{n+1}_{0} = u^n$ and $p^{n+1}_{0} = p^n$, the information from the previous (outer) step. In summary, we have two multi-step one-shot algorithms
k-step one-shot:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^{n+1}_{0} = u^n, \quad p^{n+1}_{0} = p^n, \\ u^{n+1}_{\ell+1} = Bu^{n+1}_{\ell} + M\sigma^{n+1} + F, \\ p^{n+1}_{\ell+1} = B^*p^{n+1}_{\ell} + H^*(Hu^{n+1}_{\ell} - f), \\ u^{n+1} = u^{n+1}_{k}, \quad p^{n+1} = p^{n+1}_{k}, \end{cases} \tag{9} $$
and

shifted k-step one-shot:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^{n+1}_{0} = u^n, \quad p^{n+1}_{0} = p^n, \\ u^{n+1}_{\ell+1} = Bu^{n+1}_{\ell} + M\sigma^{n} + F, \\ p^{n+1}_{\ell+1} = B^*p^{n+1}_{\ell} + H^*(Hu^{n+1}_{\ell} - f), \\ u^{n+1} = u^{n+1}_{k}, \quad p^{n+1} = p^{n+1}_{k}, \end{cases} \tag{10} $$
and in particular, when $k=1$, we obtain the following two algorithms

one-step one-shot:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^{n+1} = Bu^n + M\sigma^{n+1} + F, \\ p^{n+1} = B^*p^n + H^*(Hu^n - f), \end{cases} \tag{11} $$
and

shifted one-step one-shot:
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n, \\ u^{n+1} = Bu^n + M\sigma^{n} + F, \\ p^{n+1} = B^*p^n + H^*(Hu^n - f). \end{cases} \tag{12} $$
The only difference for the shifted versions lies in the fact that $\sigma^n$ is used in (10) and (12), instead of $\sigma^{n+1}$ in (9) and (11), so that in (9) and (11) we need to wait for $\sigma$ before updating $u$ and $p$, while in (10) and (12) we can update $\sigma, u, p$ at the same time. Also note that when $k\to\infty$, the k-step one-shot method (9) formally converges to the usual gradient descent (5), while the shifted k-step one-shot method (10) formally converges to the shifted gradient descent (6).
We first analyze the one-step one-shot methods ($k=1$) in Section 3 and then the multi-step one-shot methods ($k\ge2$) in Section 4.
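For concreteness, here is a minimal sketch of the shifted k-step one-shot method (10) on a small illustrative problem (the matrices and the step value are assumptions of this sketch, not data from the paper); note that σ, u, p are indeed updated simultaneously:

```python
import numpy as np

# Small illustrative problem (our choice, not the paper's Helmholtz example).
B = np.array([[0.5, 0.1], [0.0, 0.4]])   # ||B||_2 < 1
M = np.eye(2)
H = np.eye(2)
F = np.array([1.0, -1.0])
I2 = np.eye(2)

sigma_ex = np.array([0.3, -0.7])
f = H @ np.linalg.solve(I2 - B, M @ sigma_ex + F)   # synthetic data f = H u(sigma_ex)

k = 3         # inner fixed point iterations per outer step
tau = 0.001   # small descent step (sufficient-condition regime)

sigma = np.zeros(2)
u = np.zeros(2)
p = np.zeros(2)
for _ in range(10_000):
    sigma_new = sigma - tau * M.T @ p   # outer update uses p^n
    for _ in range(k):                  # k coupled inner iterations, with sigma^n (shifted variant)
        u, p = B @ u + M @ sigma + F, B.T @ p + H.T @ (H @ u - f)
    sigma = sigma_new

assert np.allclose(sigma, sigma_ex, atol=1e-6)
```

The tuple assignment inside the inner loop uses the old $u^{n+1}_{\ell}$ on the right-hand side of the $p$ update, as required by the coupled iterations.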
3 Convergence of one-step one-shot methods (k = 1)
3.1 Block iteration matrices and eigenvalue equations
To analyze the convergence of these methods, first we express $(\sigma^{n+1}, u^{n+1}, p^{n+1})$ in terms of $(\sigma^n, u^n, p^n)$, by inserting the expression for $\sigma^{n+1}$ into the iteration for $u^{n+1}$ in (11), so that system (11) is rewritten as
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = Bu^n + M\sigma^n - \tau MM^*p^n + F \\ p^{n+1} = B^*p^n + H^*Hu^n - H^*f. \end{cases} \tag{13} $$
System (12) is already in the form we need. In what follows we first study the shifted 1-step one-shot method, then the 1-step one-shot method.
Now, we consider the errors $(\sigma^n - \sigma_{ex},\, u^n - u(\sigma_{ex}),\, p^n - p(\sigma_{ex}))$ with respect to the exact solution at the $n$-th iteration, and, by abuse of notation, we designate them by $(\sigma^n, u^n, p^n)$. We obtain that the errors satisfy: for the shifted algorithm (12)
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = Bu^n + M\sigma^n \\ p^{n+1} = B^*p^n + H^*Hu^n \end{cases} \tag{14} $$
and for algorithm (13)
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = Bu^n + M\sigma^n - \tau MM^*p^n \\ p^{n+1} = B^*p^n + H^*Hu^n, \end{cases} \tag{15} $$
or equivalently, by putting in evidence the block iteration matrices
$$ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} B^* & H^*H & 0 \\ 0 & B & M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix} \tag{16} $$
and
$$ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} B^* & H^*H & 0 \\ -\tau MM^* & B & M \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \tag{17} $$
Now recall that a fixed point iteration converges if and only if the spectral radius of its iteration
matrix is strictly less than 1. Therefore in the following propositions we establish eigenvalue
equations for the iteration matrix of the two methods.
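This criterion is easy to check numerically: the sketch below (with small illustrative matrices of our own choosing) assembles the block iteration matrices of (16) and (17) and verifies that their spectral radius is below 1 for a small step τ:

```python
import numpy as np

# Illustrative matrices (ours): ||B||_2 < 1 and H (I - B)^{-1} M injective.
B = np.array([[0.5, 0.1], [0.0, 0.4]])
M = np.eye(2)
H = np.eye(2)
I = np.eye(2)
Z = np.zeros((2, 2))
tau = 0.001        # small step, within the sufficient bounds of Section 3.4

def rho(A):
    """Spectral radius of A."""
    return np.abs(np.linalg.eigvals(A)).max()

# Block iteration matrix of (16) (shifted 1-step one-shot), acting on (p, u, sigma).
T_shifted = np.block([[B.T,        H.T @ H, Z],
                      [Z,          B,       M],
                      [-tau * M.T, Z,       I]])

# Block iteration matrix of (17) (1-step one-shot).
T_oneshot = np.block([[B.T,            H.T @ H, Z],
                      [-tau * M @ M.T, B,       M],
                      [-tau * M.T,     Z,       I]])

assert rho(T_shifted) < 1.0 and rho(T_oneshot) < 1.0
```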
Proposition 3.1 (Eigenvalue equation for the shifted 1-step one-shot method). Assume that $\lambda\in\mathbb{C}$ is an eigenvalue of the iteration matrix in (16).
(i) If $\lambda\in\mathbb{C}$, $\lambda\notin\operatorname{Spec}(B)$, then there exists $y\in\mathbb{C}^{n_\sigma}$, $y\ne0$, such that
$$ (\lambda-1)\|y\|^2 + \tau\langle M^*(\lambda I-B^*)^{-1}H^*H(\lambda I-B)^{-1}My,\, y\rangle = 0. \tag{18} $$
(ii) $\lambda=1$ is not an eigenvalue of the iteration matrix.

Remark 3.2. Since $\rho(B)$ is strictly less than 1, so is $\rho(B^*)$.

Proof. Since $\lambda\in\mathbb{C}$ is an eigenvalue of the iteration matrix in (16), there exists a non-zero vector $(\tilde p, \tilde u, y)\in\mathbb{C}^{n_u+n_u+n_\sigma}$ such that
$$ \begin{cases} \lambda y = y - \tau M^*\tilde p \\ \lambda\tilde u = B\tilde u + My \\ \lambda\tilde p = B^*\tilde p + H^*H\tilde u. \end{cases} \tag{19} $$
By the second equation in (19), $\tilde u = (\lambda I-B)^{-1}My$, so together with the third equation
$$ \tilde p = (\lambda I-B^*)^{-1}H^*H\tilde u = (\lambda I-B^*)^{-1}H^*H(\lambda I-B)^{-1}My, $$
and by inserting this result into the first equation we obtain
$$ (\lambda-1)y = -\tau M^*(\lambda I-B^*)^{-1}H^*H(\lambda I-B)^{-1}My, \tag{20} $$
which gives (18) by taking the scalar product with $y$. We also see that if $y=0$ then the above formulas for $\tilde u, \tilde p$ immediately give $\tilde u=\tilde p=0$, which is a contradiction.
(ii) Assume that $\lambda=1$ is an eigenvalue of the iteration matrix; then (20) gives us
$$ M^*(I-B^*)^{-1}H^*H(I-B)^{-1}My = 0, $$
but this cannot happen for $y\ne0$ due to the injectivity of $H(I-B)^{-1}M$.
Proposition 3.3 (Eigenvalue equation for the 1-step one-shot method). Assume that $\lambda\in\mathbb{C}$ is an eigenvalue of the iteration matrix in (17).
(i) If $\lambda\in\mathbb{C}$, $\lambda\notin\operatorname{Spec}(B)$, then there exists $y\in\mathbb{C}^{n_\sigma}$, $y\ne0$, such that
$$ (\lambda-1)\|y\|^2 + \tau\lambda\langle M^*(\lambda I-B^*)^{-1}H^*H(\lambda I-B)^{-1}My,\, y\rangle = 0. \tag{21} $$
(ii) $\lambda=1$ is not an eigenvalue of the iteration matrix.

Proof. Since $\lambda\in\mathbb{C}$ is an eigenvalue of the iteration matrix in (17), there exists a non-zero vector $(\tilde p, \tilde u, y)\in\mathbb{C}^{n_u+n_u+n_\sigma}$ such that
$$ \begin{cases} \lambda y = y - \tau M^*\tilde p \\ \lambda\tilde u = B\tilde u + My - \tau MM^*\tilde p \\ \lambda\tilde p = B^*\tilde p + H^*H\tilde u. \end{cases} \tag{22} $$
By the third equation in (22), $\tilde p = (\lambda I-B^*)^{-1}H^*H\tilde u$, and inserting this result into the second equation we obtain
$$ \lambda\tilde u = B\tilde u + My - \tau MM^*(\lambda I-B^*)^{-1}H^*H\tilde u, $$
or equivalently,
$$ [I+\tau MM^*A](\lambda I-B)\tilde u = My $$
where $A = (\lambda I-B^*)^{-1}H^*H(\lambda I-B)^{-1}$. Since $\tau>0$, $I+\tau MM^*A$ is a positive definite matrix. Therefore
$$ \tilde u = (\lambda I-B)^{-1}[I+\tau MM^*A]^{-1}My $$
and
$$ \tilde p = (\lambda I-B^*)^{-1}H^*H\tilde u = A[I+\tau MM^*A]^{-1}My. $$
By inserting this result into the first equation in (22) we obtain
$$ (\lambda-1)y = -\tau M^*A[I+\tau MM^*A]^{-1}My. $$
Thanks to the fact that $[I+\tau MM^*A]^{-1}$ and $MM^*A$ commute, we have
$$ (\lambda-1)My = -\tau MM^*A[I+\tau MM^*A]^{-1}My = -\tau[I+\tau MM^*A]^{-1}MM^*AMy, $$
then
$$ (\lambda-1)[I+\tau MM^*A]My = -\tau MM^*AMy, $$
which leads to
$$ (\lambda-1)My + \tau\lambda MM^*AMy = 0. $$
Since $H(I-B)^{-1}M$ is injective, so is $M$. Therefore
$$ (\lambda-1)y + \tau\lambda M^*AMy = 0, \tag{23} $$
which gives (21) by taking the scalar product with $y$. We also see that if $y=0$ then the above formulas for $\tilde u, \tilde p$ immediately give $\tilde u=\tilde p=0$, which is a contradiction.
(ii) Assume that $\lambda=1$ is an eigenvalue of the iteration matrix; then (23) gives us
$$ M^*(I-B^*)^{-1}H^*H(I-B)^{-1}My = 0, $$
but this cannot happen for $y\ne0$ due to the injectivity of $H(I-B)^{-1}M$.
In the following sections we will show that, for sufficiently small $\tau$, equations (18) and (21) admit no solution $|\lambda|\ge1$, thus algorithms (12) and (11) converge. When $\lambda\ne0$, it is convenient to rewrite (18) and (21) respectively as
$$ \lambda^2(\lambda-1)\|y\|^2 + \tau\langle M^*(I-B^*/\lambda)^{-1}H^*H(I-B/\lambda)^{-1}My,\, y\rangle = 0 \tag{24} $$
and
$$ \lambda(\lambda-1)\|y\|^2 + \tau\langle M^*(I-B^*/\lambda)^{-1}H^*H(I-B/\lambda)^{-1}My,\, y\rangle = 0. \tag{25} $$
For the analysis we use auxiliary results proved in Appendix A.
First, we study separately the very particular case where $B=0$.

Proposition 3.4 (shifted 1-step one-shot method). When $B=0$, the eigenvalue equation (24) admits no solution $\lambda\in\mathbb{C}$, $|\lambda|\ge1$, if $\tau < \frac{-1+\sqrt{5}}{2\|H\|^2\|M\|^2}$.

Proof. When $B=0$, equation (24) becomes $\lambda^2(\lambda-1)\|y\|^2 + \tau\|HMy\|^2 = 0$, which is equivalent to $\lambda^3-\lambda^2+\frac{\|HMy\|^2}{\|y\|^2}\tau = 0$. Then, the conclusion can be obtained by Lemma C.1.

Proposition 3.5 (1-step one-shot method). When $B=0$, the eigenvalue equation (25) admits no solution $\lambda\in\mathbb{C}$, $|\lambda|\ge1$, if $\tau < \frac{1}{\|H\|^2\|M\|^2}$.

Proof. When $B=0$, equation (25) becomes $\lambda(\lambda-1)\|y\|^2 + \tau\|HMy\|^2 = 0$, which yields $\lambda^3-\lambda^2+\frac{\|HMy\|^2}{\|y\|^2}\tau\lambda = 0$. Then, the conclusion can be obtained by Lemma C.1.
3.2 Real eigenvalues

We now find conditions on the descent step $\tau$ such that the real eigenvalues stay inside the unit disk. Recall that we have already proved that $\lambda=1$ is not an eigenvalue for both methods.

Proposition 3.6 (shifted 1-step one-shot method). Equation (24)
(i) admits no solution $\lambda\in\mathbb{R}$, $\lambda>1$, for all $\tau>0$;
(ii) admits no solution $\lambda\in\mathbb{R}$, $\lambda\le-1$, if we take
$$ \tau < \frac{2}{\|H\|^2\|M\|^2 s(B)^2}, $$
where $s(B)$ is defined in Lemma A.2; moreover if $0<\|B\|<1$, we can take
$$ \tau < \frac{\chi_0(1,\|B\|)}{\|H\|^2\|M\|^2}, \quad \text{where } \chi_0(1,b) = 2(1-b)^2 \tag{26} $$
(here in the notation $\chi_0(1,b)$, $1$ refers to $k=1$).

Proof. When $\lambda\in\mathbb{R}\setminus\{0\}$ equation (24) becomes
$$ \lambda^2(\lambda-1)\|y\|^2 + \tau\left\|H(I-B/\lambda)^{-1}My\right\|^2 = 0. $$
The left-hand side of the above equation is strictly positive for any $\tau>0$ if $\lambda>1$; it is strictly negative for $\tau$ satisfying the inequality in (ii) if $\lambda\le-1$, noting that $\lambda\mapsto\lambda^2(\lambda-1)$ is increasing for $\lambda\le-1$.
Proposition 3.7 (1-step one-shot method). Equation (25) admits no solution $\lambda\in\mathbb{R}$, $\lambda\ne1$, $|\lambda|\ge1$, for all $\tau>0$.

Proof. When $\lambda\in\mathbb{R}\setminus\{0\}$ equation (25) becomes
$$ \lambda(\lambda-1)\|y\|^2 + \tau\left\|H(I-B/\lambda)^{-1}My\right\|^2 = 0. $$
If $\lambda\in\mathbb{R}$, $\lambda\ne1$, $|\lambda|\ge1$, then $\lambda(\lambda-1)>0$, thus the left-hand side of the above equation is strictly positive for any $\tau>0$.
3.3 Complex eigenvalues

We now look for conditions on the descent step $\tau$ such that also the complex eigenvalues stay inside the unit disk. We first deal with the shifted 1-step one-shot method.

Proposition 3.8 (shifted 1-step one-shot method). If $B\ne0$, there exists $\tau>0$ sufficiently small such that equation (24) admits no solution $\lambda\in\mathbb{C}\setminus\mathbb{R}$, $|\lambda|\ge1$. In particular, if $0<\|B\|<1$, given any $\delta_0>0$ and $0<\theta_0\le\frac{\pi}{6}$, take
$$ \tau < \frac{\min\{\chi_1(1,\|B\|),\, \chi_2(1,\|B\|),\, \chi_3(1,\|B\|),\, \chi_4(1,\|B\|)\}}{\|H\|^2\|M\|^2}, $$
where
$$ \chi_1(1,b) = \frac{(1-b)^4}{4b^2}, \qquad \chi_2(1,b) = 2\sin\frac{\theta_0}{2}\,\frac{(1-b)^2}{(1+b)^2}, $$
$$ \chi_3(1,b) = \frac{\delta_0\cos^2\frac{5\theta_0}{2}}{2\left(1+2\delta_0\sin\frac{5\theta_0}{2}+\delta_0^2\right)}\cdot\frac{(1-b)^4}{b^2}, \qquad \chi_4(1,b) = \left[\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos2\theta_0\right](1-b)^2 $$
(here in the notation $\chi_i(1,b)$, $i=1,\dots,4$, $1$ refers to $k=1$).
Proof. Step 1. Rewrite equation (24) so that we can study its real and imaginary parts.
Let $\lambda = R(\cos\theta + \mathrm{i}\sin\theta)$ in polar form, where $R=|\lambda|\ge1$ and $\theta\in(-\pi,\pi)$. Write $1/\lambda = r(\cos\varphi + \mathrm{i}\sin\varphi)$ in polar form, where $r=1/|\lambda|=1/R\le1$ and $\varphi=-\theta\in(-\pi,\pi)$. By Lemma A.3, we have
$$ \left(I-\frac{B}{\lambda}\right)^{-1} = P(\lambda) + \mathrm{i}\,Q(\lambda), \qquad \left(I-\frac{B^*}{\lambda}\right)^{-1} = P(\lambda)^* + \mathrm{i}\,Q(\lambda)^*, $$
where $P(\lambda)$ and $Q(\lambda)$ are $\mathbb{C}^{n_u\times n_u}$-valued functions, and, omitting the dependence on $\lambda$,
$$ \|P\| \le p := \begin{cases} (1+\|B\|)\,s(B)^2 & \text{for general } B\ne0, \\ \dfrac{1}{1-\|B\|} & \text{when } \|B\|<1; \end{cases} \tag{27} $$
$$ \|Q\| \le q_1 := \begin{cases} \|B\|\,s(B)^2 & \text{for general } B\ne0, \\ \dfrac{\|B\|}{1-\|B\|} & \text{when } 0<\|B\|<1; \end{cases} \tag{28} $$
$$ \|Q\| \le |\sin\theta|\,q_2, \quad q_2 := \begin{cases} \|B\|\,s(B)^2 & \text{for general } B\ne0, \\ \dfrac{\|B\|}{(1-\|B\|)^2} & \text{when } 0<\|B\|<1. \end{cases} \tag{29} $$
Now we rewrite (24) as
$$ \lambda^2(\lambda-1)\|y\|^2 + \tau\,G(P^*+\mathrm{i}Q^*,\, P+\mathrm{i}Q) = 0 \tag{30} $$
where
$$ G(X,Y) = \langle M^*XH^*HYMy,\, y\rangle\in\mathbb{C}, \qquad X,Y\in\mathbb{C}^{n_u\times n_u}. $$
$G$ satisfies the following properties:
• $\forall X,Y_1,Y_2\in\mathbb{C}^{n_u\times n_u}$, $\forall z_1,z_2\in\mathbb{C}$: $G(X,\, z_1Y_1+z_2Y_2) = z_1G(X,Y_1)+z_2G(X,Y_2)$.
• $\forall X_1,X_2,Y\in\mathbb{C}^{n_u\times n_u}$, $\forall z_1,z_2\in\mathbb{C}$: $G(z_1X_1+z_2X_2,\, Y) = z_1G(X_1,Y)+z_2G(X_2,Y)$.
• $\forall X\in\mathbb{C}^{n_u\times n_u}$: $0\le G(X^*,X) = \|HXMy\|^2 \le (\|H\|\|M\|\|X\|)^2\|y\|^2$.
• $\forall X,Y\in\mathbb{C}^{n_u\times n_u}$: $G(X,Y)+G(Y^*,X^*)\in\mathbb{R}$; indeed
$$ G(X,Y) = \langle M^*XH^*HYMy,\, y\rangle = \langle y,\, M^*Y^*H^*HX^*My\rangle = \langle M^*Y^*H^*HX^*My,\, y\rangle^* = G(Y^*,X^*)^*. $$
With these properties of $G$, we expand (30) and take its real and imaginary parts, so we respectively obtain
$$ \Re(\lambda^3-\lambda^2)\,\|y\|^2 + \tau\,[G(P^*,P)-G(Q^*,Q)] = 0 \tag{31} $$
and
$$ \Im(\lambda^3-\lambda^2)\,\|y\|^2 + \tau\,[G(P^*,Q)+G(Q^*,P)] = 0. \tag{32} $$

Step 2. Find a suitable combination of equations (31) and (32), and choose $\tau$ so that we obtain a new equation whose left-hand side is strictly positive/negative.
Let $\gamma=\gamma(\lambda)\in\mathbb{R}$ be defined by cases as in Lemma A.4. Multiplying equation (32) by $\gamma$ and summing it with equation (31), we obtain
$$ [\Re(\lambda^3-\lambda^2)+\gamma\Im(\lambda^3-\lambda^2)]\,\|y\|^2 + \tau\,[G(P^*,P)-G(Q^*,Q)+\gamma G(P^*,Q)+\gamma G(Q^*,P)] = 0, $$
or equivalently,
$$ [\Re(\lambda^3-\lambda^2)+\gamma\Im(\lambda^3-\lambda^2)]\,\|y\|^2 + \tau\,G(P^*+\gamma Q^*,\, P+\gamma Q) - (1+\gamma^2)\,\tau\,G(Q^*,Q) = 0. \tag{33} $$
Now we consider four cases for $\lambda$ as in Lemma A.4:
• Case 1. $\Re(\lambda^3-\lambda^2)\ge0$;
• Case 2. $\Re(\lambda^3-\lambda^2)<0$ and $\theta\in[\theta_0,\pi-\theta_0]\cup[-\pi+\theta_0,-\theta_0]$ for fixed $0<\theta_0\le\frac{\pi}{6}$;
• Case 3. $\Re(\lambda^3-\lambda^2)<0$ and $\theta\in(-\theta_0,\theta_0)$ for fixed $0<\theta_0\le\frac{\pi}{6}$;
• Case 4. $\Re(\lambda^3-\lambda^2)<0$ and $\theta\in(\pi-\theta_0,\pi)\cup(-\pi,-\pi+\theta_0)$ for fixed $0<\theta_0\le\frac{\pi}{6}$.
The four cases will be treated in the following four lemmas (Lemmas 3.9–3.12), which together give the statement of this proposition.
Lemma 3.9 (Case 1). Equation (24) admits no solutions $\lambda$ in Case 1 if we take
$$ \tau < \frac{1}{4\|H\|^2\|M\|^2\|B\|^2 s(B)^4}. $$
Moreover, if $0<\|B\|<1$, we can take
$$ \tau < \frac{(1-\|B\|)^4}{4\|H\|^2\|M\|^2\|B\|^2}. $$
Proof. Writing (33) for $\gamma=\gamma_1$ as in Lemma A.4 (i) (in particular $\gamma_1^2=1$), we have
$$ [\Re(\lambda^3-\lambda^2)+\gamma_1\Im(\lambda^3-\lambda^2)]\,\|y\|^2 + \tau\,G(P^*+\gamma_1Q^*,\, P+\gamma_1Q) - 2\tau\,G(Q^*,Q) = 0. \tag{34} $$
By the properties of $G$ we have
$$ G(P^*+\gamma_1Q^*,\, P+\gamma_1Q)\ge0 $$
and
$$ G(Q^*,Q) \le (\|H\|\|M\|\|Q\|)^2\|y\|^2 \le (\|H\|\|M\|\,|\sin\theta|\,q_2)^2\|y\|^2, $$
therefore the left-hand side of (34) will be strictly positive if $\tau$ satisfies
$$ \tau < \frac{\Re(\lambda^3-\lambda^2)+\gamma_1\Im(\lambda^3-\lambda^2)}{2\,(\|H\|\|M\|\,|\sin\theta|\,q_2)^2}. $$
Since $\Re(\lambda^3-\lambda^2)+\gamma_1\Im(\lambda^3-\lambda^2) \ge 2|\sin(\theta/2)|$ by Lemma A.4 (i), it is enough to choose
$$ \tau < \frac{1}{4\left|\sin\frac{\theta}{2}\right|\cos^2\frac{\theta}{2}\,\|H\|^2\|M\|^2 q_2^2}. $$
Since $\left|\sin\frac{\theta}{2}\right|\cos^2\frac{\theta}{2}\le1$, it is sufficient to choose $\tau < \frac{1}{4\|H\|^2\|M\|^2 q_2^2}$, and we use definition (29) of $q_2$.

Lemma 3.10 (Case 2). Equation (24) admits no solutions $\lambda$ in Case 2 if we take
$$ \tau < \frac{2\sin\frac{\theta_0}{2}}{\|H\|^2\|M\|^2(1+2\|B\|)^2 s(B)^4}. $$
Moreover, if $0<\|B\|<1$, we can take
$$ \tau < \frac{2\sin\frac{\theta_0}{2}\,(1-\|B\|)^2}{\|H\|^2\|M\|^2(1+\|B\|)^2}. $$

Proof. Writing (33) for $\gamma=\gamma_2$ as in Lemma A.4 (ii) (in particular $\gamma_2^2=1$), we have
$$ [\Re(\lambda^3-\lambda^2)+\gamma_2\Im(\lambda^3-\lambda^2)]\,\|y\|^2 + \tau\,G(P^*+\gamma_2Q^*,\, P+\gamma_2Q) - 2\tau\,G(Q^*,Q) = 0. \tag{35} $$
By the properties of $G$,
$$ G(Q^*,Q)\ge0, \qquad G(P^*+\gamma_2Q^*,\, P+\gamma_2Q) \le (\|H\|\|M\|\|P+\gamma_2Q\|)^2\|y\|^2, $$
and the estimate $\|P+\gamma_2Q\| \le \|P\|+|\gamma_2|\,\|Q\| = \|P\|+\|Q\| \le p+q_1$, the left-hand side of (35) will be strictly negative if $\tau$ satisfies
$$ \tau < \frac{-\Re(\lambda^3-\lambda^2)-\gamma_2\Im(\lambda^3-\lambda^2)}{[\|H\|\|M\|(p+q_1)]^2}. $$
Thanks to Lemma A.4 (ii), it is sufficient to choose
$$ \tau < \frac{2\sin\frac{\theta_0}{2}}{\|H\|^2\|M\|^2(p+q_1)^2} $$
and we use definitions (27) and (28) of $p$ and $q_1$.
Lemma 3.11 (Case 3). Let $\delta_0>0$ be fixed. Equation (24) admits no solutions $\lambda$ in Case 3 if we take
$$ \tau < \frac{\delta_0\cos^2\frac{5\theta_0}{2}}{2\left(1+2\delta_0\sin\frac{5\theta_0}{2}+\delta_0^2\right)}\cdot\frac{1}{\|H\|^2\|M\|^2\|B\|^2 s(B)^4}. $$
Moreover, if $0<\|B\|<1$, we can take
$$ \tau < \frac{\delta_0\cos^2\frac{5\theta_0}{2}}{2\left(1+2\delta_0\sin\frac{5\theta_0}{2}+\delta_0^2\right)}\cdot\frac{(1-\|B\|)^4}{\|H\|^2\|M\|^2\|B\|^2}. $$

Proof. Writing (33) for $\gamma=\gamma_3$ as in Lemma A.4 (iii), we have
$$ [\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)]\,\|y\|^2 + \tau\,G(P^*+\gamma_3Q^*,\, P+\gamma_3Q) - (1+\gamma_3^2)\,\tau\,G(Q^*,Q) = 0. \tag{36} $$
By the properties of $G$,
$$ G(P^*+\gamma_3Q^*,\, P+\gamma_3Q)\ge0, \qquad G(Q^*,Q) \le (\|H\|\|M\|\|Q\|)^2\|y\|^2, $$
and by the estimate $\|Q\| \le |\sin\theta|\,q_2$, the left-hand side of (36) will be strictly positive if $\tau$ satisfies
$$ \tau < \frac{\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2)}{(1+\gamma_3^2)\,(\|H\|\|M\|\,|\sin\theta|\,q_2)^2}. $$
Since by Lemma A.4 (iii) $\Re(\lambda^3-\lambda^2)+\gamma_3\Im(\lambda^3-\lambda^2) > 2\delta_0\sin\frac{\theta}{2}$, it is sufficient to choose
$$ \tau < \frac{\delta_0}{2(1+\gamma_3^2)\|H\|^2\|M\|^2 q_2^2} = \frac{1}{2\|H\|^2\|M\|^2 q_2^2}\cdot\frac{\delta_0\cos^2\frac{5\theta_0}{2}}{1+2\delta_0\sin\frac{5\theta_0}{2}+\delta_0^2}, $$
where we have used the definition of $\gamma_3$. To conclude we use definition (29) of $q_2$.
Lemma 3.12 (Case 4). Equation (24) admits no solutions $\lambda$ in Case 4 if we take
$$ \tau < \frac{\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos2\theta_0}{\|H\|^2\|M\|^2(1+\|B\|)^2 s(B)^4}. $$
Moreover, if $0<\|B\|<1$, we can take
$$ \tau < \frac{\left[\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos2\theta_0\right](1-\|B\|)^2}{\|H\|^2\|M\|^2}. $$

Proof. Here it is enough to consider (31). By the properties of $G$,
$$ G(Q^*,Q)\ge0, \qquad G(P^*,P) \le (\|H\|\|M\|p)^2\|y\|^2, $$
we see that the left-hand side of (31) will be strictly negative if $\tau$ satisfies
$$ \tau < \frac{-\Re(\lambda^3-\lambda^2)}{(\|H\|\|M\|p)^2}. $$
Thanks to Lemma A.4 (iv), it is sufficient to choose
$$ \tau < \frac{\sin\left(\frac{\pi}{2}-3\theta_0\right)+\cos2\theta_0}{\|H\|^2\|M\|^2 p^2}, $$
and definition (27) of $p$ leads to the conclusion.
Similarly, with the help of Lemma A.5, we prove for the 1-step one-shot method the analogue of Proposition 3.8. In particular, note that here just three cases of $\lambda$ need to be considered, because the analogue of the fourth one is excluded by Lemma A.5 (iv).

Proposition 3.13 (1-step one-shot method). If $B\ne0$, there exists $\tau>0$ sufficiently small such that equation (25) admits no solution $\lambda\in\mathbb{C}\setminus\mathbb{R}$, $|\lambda|\ge1$. In particular, if $0<\|B\|<1$, given any $\delta_0>0$ and $0<\theta_0\le\frac{\pi}{4}$, take
$$ \tau < \frac{\min\{\psi_1(1,\|B\|),\, \psi_2(1,\|B\|),\, \psi_3(1,\|B\|)\}}{\|H\|^2\|M\|^2}, $$
where
$$ \psi_1(1,b) = \frac{(1-b)^4}{4b^2}, \qquad \psi_2(1,b) = 2\sin\frac{\theta_0}{2}\,\frac{(1-b)^2}{(1+b)^2}, \qquad \psi_3(1,b) = \frac{\delta_0\cos^2\frac{3\theta_0}{2}\,(1-b)^4}{2\left(1+2\delta_0\sin\frac{3\theta_0}{2}+\delta_0^2\right)b^2} $$
(here in the notation $\psi_i(1,b)$, $i=1,2,3$, $1$ refers to $k=1$).
3.4 Final result (k = 1)

Considering Proposition 3.4, and taking the minimum between the bound (26) in Proposition 3.6 for real eigenvalues and the bound in Proposition 3.8 for complex eigenvalues, we obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the shifted 1-step one-shot method.

Theorem 3.14 (Convergence of shifted 1-step one-shot). Under assumption (4), the shifted 1-step one-shot method (12) converges for sufficiently small $\tau$. In particular, for $\|B\|<1$, it is enough to take
$$ \tau < \frac{\chi(1,\|B\|)}{\|H\|^2\|M\|^2}, $$
where $\chi(1,\|B\|)$ is an explicit function of $\|B\|$ (in this notation $1$ refers to $k=1$).

Remark 3.15. Set $b=\|B\|$. For $0<b<1$, a practical (but not optimal) bound for $\tau$ is
$$ \tau < \frac{1}{\|H\|^2\|M\|^2}\cdot\min\left\{\frac{1}{2}\cdot\frac{(1-b)^2}{(1+b)^2},\ \frac{1-\sin\frac{5\pi}{12}}{4}\cdot\frac{(1-b)^4}{b^2}\right\}. $$
Indeed, using the notation in Propositions 3.6 and 3.8, it is easy to show that $\chi_2(1,b)\le\chi_0(1,b)$ and $\chi_3(1,b)\le\chi_1(1,b)$. By studying $\chi_3(1,b)$ and noting that $\delta_0^2+1\ge2\delta_0$, we see that we should take $\delta_0=1$. Finally, we can take for instance $\theta_0=\frac{\pi}{6}$, then compare $\chi_2(1,b)$, $\chi_3(1,b)$ and $\chi_4(1,b)$.
Putting together Propositions 3.5, 3.7 and 3.13, we obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the 1-step one-shot method.

Theorem 3.16 (Convergence of 1-step one-shot). Under assumption (4), the 1-step one-shot method (11) converges for sufficiently small $\tau$. In particular, for $\|B\|<1$, it is enough to take
$$ \tau < \frac{\psi(1,\|B\|)}{\|H\|^2\|M\|^2}, $$
where $\psi(1,\|B\|)$ is an explicit function of $\|B\|$ (in this notation $1$ refers to $k=1$).

Remark 3.17. Similarly as above, for $0<b<1$, a practical (but not optimal) bound for $\tau$ is
$$ \tau < \frac{1}{\|H\|^2\|M\|^2}\cdot\min\left\{2\sin\frac{\pi}{8}\cdot\frac{(1-b)^2}{(1+b)^2},\ \frac{1-\sin\frac{3\pi}{8}}{4}\cdot\frac{(1-b)^4}{b^2}\right\}. $$
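The practical bounds of Remarks 3.15 and 3.17 are straightforward to evaluate; in the sketch below the function names are ours, and the formulas are exactly those of the two remarks:

```python
import numpy as np

def tau_bound_shifted(b, norm_H, norm_M):
    """Practical step bound of Remark 3.15 (shifted 1-step one-shot), for 0 < b = ||B|| < 1."""
    c = min(0.5 * (1 - b)**2 / (1 + b)**2,
            (1 - np.sin(5 * np.pi / 12)) / 4 * (1 - b)**4 / b**2)
    return c / (norm_H**2 * norm_M**2)

def tau_bound(b, norm_H, norm_M):
    """Practical step bound of Remark 3.17 (1-step one-shot), for 0 < b = ||B|| < 1."""
    c = min(2 * np.sin(np.pi / 8) * (1 - b)**2 / (1 + b)**2,
            (1 - np.sin(3 * np.pi / 8)) / 4 * (1 - b)**4 / b**2)
    return c / (norm_H**2 * norm_M**2)

# The admissible step shrinks rapidly as b -> 1, i.e. as the inner solver gets slower.
for b in (0.1, 0.5, 0.9):
    print(f"b = {b}: shifted {tau_bound_shifted(b, 1.0, 1.0):.2e}, "
          f"non-shifted {tau_bound(b, 1.0, 1.0):.2e}")
```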
4 Convergence of multi-step one-shot methods (k ≥ 2)

We now tackle the multi-step case, that is, the k-step one-shot methods with $k\ge2$.
4.1 Block iteration matrices and eigenvalue equations

Once again, to analyze the convergence of these methods, first we express $(\sigma^{n+1}, u^{n+1}, p^{n+1})$ in terms of $(\sigma^n, u^n, p^n)$, by rewriting the recursions for $u$ and $p$: systems (9) and (10) are respectively rewritten as
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = B^ku^n + T_kM\sigma^n - \tau T_kMM^*p^n + T_kF \\ p^{n+1} = [(B^*)^k - \tau X_kMM^*]p^n + U_ku^n + X_kM\sigma^n + X_kF - T_k^*H^*f \end{cases} \tag{37} $$
and
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = B^ku^n + T_kM\sigma^n + T_kF \\ p^{n+1} = (B^*)^kp^n + U_ku^n + X_kM\sigma^n + X_kF - T_k^*H^*f \end{cases} \tag{38} $$
where
$$ T_k = I + B + \dots + B^{k-1} = (I-B)^{-1}(I-B^k), \quad k\ge1, \tag{39} $$
$$ U_k = (B^*)^{k-1}H^*H + (B^*)^{k-2}H^*HB + \dots + H^*HB^{k-1}, \quad k\ge1, $$
$$ X_k = \begin{cases} (B^*)^{k-2}H^*HT_1 + (B^*)^{k-3}H^*HT_2 + \dots + H^*HT_{k-1} & \text{if } k\ge2, \\ 0 & \text{if } k=1. \end{cases} \tag{40} $$
Note that (37) (k-step one-shot) can be obtained from (38) (shifted k-step one-shot) by replacing $\sigma^n$ with $\sigma^{n+1} = \sigma^n - \tau M^*p^n$ in the equations for $u$ and $p$, which yields two extra terms in (37). In what follows we first study the shifted k-step one-shot method, then the k-step one-shot method.
The following lemma gathers some useful properties of $T_k$, $U_k$ and $X_k$.

Lemma 4.1. (i) The matrices $U_k$ and $X_k$ can be rewritten as
$$ U_k = \sum_{i+j=k-1}(B^*)^iH^*HB^j \quad \text{for } k\ge1, \qquad X_k = \sum_{l=0}^{k-2}\,\sum_{i+j=l}(B^*)^iH^*HB^j = \sum_{l=1}^{k-1}U_l \quad \text{for } k\ge2. $$
(ii) The matrices $U_k$ and $X_k$ are self-adjoint: $U_k^*=U_k$, $X_k^*=X_k$.
(iii) We have the relation
$$ U_kT_k - X_kB^k + X_k = T_k^*H^*HT_k, \quad \forall k\ge1. \tag{41} $$

Proof. (i) is easy to check from the definitions. (ii) follows from (i).
(iii) For $k=1$, we have $U_1=H^*H$, $T_1=I$ and $X_1=0$, hence the identity is verified. For $k\ge2$, note that $X_{k+1}=B^*X_k+H^*HT_k$; then by (ii), $X_{k+1}=X_{k+1}^*=X_kB+T_k^*H^*H$. On the other hand, from (i) we get that $X_{k+1}=X_k+U_k$. Thus,
$$ X_k+U_k = X_kB+T_k^*H^*H, \quad \text{or equivalently,} \quad U_k = X_k(B-I)+T_k^*H^*H. $$
Finally,
$$ U_kT_k = X_k(B-I)T_k + T_k^*H^*HT_k = X_k(B^k-I) + T_k^*H^*HT_k. $$
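Lemma 4.1 lends itself to a direct numerical sanity check; the sketch below (random matrices, illustration only) verifies the closed forms of $T_k$, $U_k$, $X_k$, their self-adjointness, and identity (41):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n))
B *= 0.6 / np.linalg.norm(B, 2)          # any B works for these algebraic identities
H = rng.standard_normal((3, n))
I = np.eye(n)

def U(l):
    # U_l = sum_{i+j=l-1} (B^*)^i H^* H B^j   (real case: B^* = B^T)
    return sum(np.linalg.matrix_power(B.T, i) @ H.T @ H @ np.linalg.matrix_power(B, l - 1 - i)
               for i in range(l))

for k in range(1, 6):
    Tk = sum(np.linalg.matrix_power(B, j) for j in range(k))   # T_k = I + B + ... + B^{k-1}
    Uk = U(k)
    Xk = sum(U(l) for l in range(1, k)) if k >= 2 else np.zeros((n, n))  # X_k = sum_{l=1}^{k-1} U_l
    Bk = np.linalg.matrix_power(B, k)

    assert np.allclose(Tk, np.linalg.solve(I - B, I - Bk))             # (39)
    assert np.allclose(Uk, Uk.T) and np.allclose(Xk, Xk.T)             # Lemma 4.1 (ii)
    assert np.allclose(Uk @ Tk - Xk @ Bk + Xk, Tk.T @ H.T @ H @ Tk)    # identity (41)
```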
Now, we consider the errors $(\sigma^n - \sigma_{ex},\, u^n - u(\sigma_{ex}),\, p^n - p(\sigma_{ex}))$ with respect to the exact solution at the $n$-th iteration, and, by abuse of notation, we designate them by $(\sigma^n, u^n, p^n)$. We obtain that the errors satisfy: for the shifted algorithm (38)
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = B^ku^n + T_kM\sigma^n \\ p^{n+1} = (B^*)^kp^n + U_ku^n + X_kM\sigma^n \end{cases} \tag{42} $$
and for algorithm (37)
$$ \begin{cases} \sigma^{n+1} = \sigma^n - \tau M^*p^n \\ u^{n+1} = B^ku^n + T_kM\sigma^n - \tau T_kMM^*p^n \\ p^{n+1} = [(B^*)^k - \tau X_kMM^*]p^n + U_ku^n + X_kM\sigma^n, \end{cases} \tag{43} $$
or equivalently, by putting in evidence the block iteration matrices
$$ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} (B^*)^k & U_k & X_kM \\ 0 & B^k & T_kM \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix} \tag{44} $$
and
$$ \begin{pmatrix} p^{n+1} \\ u^{n+1} \\ \sigma^{n+1} \end{pmatrix} = \begin{pmatrix} (B^*)^k - \tau X_kMM^* & U_k & X_kM \\ -\tau T_kMM^* & B^k & T_kM \\ -\tau M^* & 0 & I \end{pmatrix} \begin{pmatrix} p^n \\ u^n \\ \sigma^n \end{pmatrix}. \tag{45} $$
Now recall that a fixed point iteration converges if and only if the spectral radius of its iteration
matrix is strictly less than 1. Therefore in the following propositions we establish eigenvalue
equations for the iteration matrix of the two methods.
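These block recursions can be cross-checked directly: $k$ elementary inner steps on the errors ($u \leftarrow Bu + M\sigma$, and $p \leftarrow B^*p + H^*Hu$ with $u$ taken before its update), followed by the gradient step with the old $p$, must coincide with one application of the block matrix in (44). A sketch with hypothetical random real data:

```python
import numpy as np

rng = np.random.default_rng(1)
nu, nf, ns, k, tau = 4, 3, 2, 3, 0.1
B = rng.standard_normal((nu, nu)); B *= 0.5 / np.linalg.norm(B, 2)
H = rng.standard_normal((nf, nu)); M = rng.standard_normal((nu, ns))
Bp = lambda j: np.linalg.matrix_power(B, j)
Tk = sum(Bp(j) for j in range(k))
Uk = sum(Bp(i).T @ H.T @ H @ Bp(k - 1 - i) for i in range(k))
Xk = sum(Bp(k - 1 - m).T @ H.T @ H @ sum(Bp(j) for j in range(m)) for m in range(1, k))

p = rng.standard_normal(nu); u = rng.standard_normal(nu); s = rng.standard_normal(ns)

# one outer step of the shifted method (42): gradient step with the old p,
# then k inner steps where the p-step uses the u iterate *before* its update
s1 = s - tau * M.T @ p
p1, u1 = p.copy(), u.copy()
for _ in range(k):
    p1, u1 = B.T @ p1 + H.T @ (H @ u1), B @ u1 + M @ s

A = np.block([[np.linalg.matrix_power(B.T, k), Uk, Xk @ M],
              [np.zeros((nu, nu)), Bp(k), Tk @ M],
              [-tau * M.T, np.zeros((ns, nu)), np.eye(ns)]])
assert np.allclose(A @ np.concatenate([p, u, s]), np.concatenate([p1, u1, s1]))
```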
Proposition 4.2 (Eigenvalue equation for the shifted k-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (44).
(i) If $\lambda \notin \mathrm{Spec}(B^k)$, then $\exists y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
$$(\lambda-1)\|y\|^2 + \tau\langle M^*[\lambda I-(B^*)^k]^{-1}[(\lambda-1)X_k + T_k^*H^*HT_k](\lambda I-B^k)^{-1}My, y\rangle = 0. \tag{46}$$
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.
Proposition 4.3 (Eigenvalue equation for the k-step one-shot method). Assume that $\lambda \in \mathbb{C}$ is an eigenvalue of the iteration matrix in (45).
(i) If $\lambda \notin \mathrm{Spec}(B^k)$, then $\exists y \in \mathbb{C}^{n_\sigma}$, $y \ne 0$, such that
$$(\lambda-1)\|y\|^2 + \tau\lambda\langle M^*[\lambda I-(B^*)^k]^{-1}[(\lambda-1)X_k + T_k^*H^*HT_k](\lambda I-B^k)^{-1}My, y\rangle = 0. \tag{47}$$
(ii) $\lambda = 1$ is not an eigenvalue of the iteration matrix.
Remark 4.4. Since $\rho(B)$ is strictly less than 1, so are $\rho(B^*)$, $\rho(B^k)$ and $\rho((B^*)^k)$.
RR n°9477
18 M. Bonazzoli & H. Haddar & T. A. Vu
The proofs of Propositions 4.2 and 4.3 are respectively similar to those of Propositions 3.1 and 3.3; the slight difference is that in the calculation we use (41) to simplify some terms.
In the following sections we will show that, for sufficiently small $\tau$, equations (46) and (47) admit no solution $|\lambda| \ge 1$, thus algorithms (10) and (9) converge. When $\lambda \ne 0$, it is convenient to rewrite (46) and (47) respectively as
$$\lambda^2(\lambda-1)\|y\|^2 + \tau\langle M^*[I-(B^*)^k/\lambda]^{-1}[(\lambda-1)X_k + T_k^*H^*HT_k][I-B^k/\lambda]^{-1}My, y\rangle = 0 \tag{48}$$
and
$$\lambda(\lambda-1)\|y\|^2 + \tau\langle M^*[I-(B^*)^k/\lambda]^{-1}[(\lambda-1)X_k + T_k^*H^*HT_k][I-B^k/\lambda]^{-1}My, y\rangle = 0. \tag{49}$$
The scalar case where $n_u = n_\sigma = n_f = 1$ is analyzed in Appendix C.
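Propositions 4.2–4.3 can be probed numerically: for each eigenpair of the matrix in (44), the $\sigma$-component of the eigenvector plays the role of $y$, and the residual of (46) should vanish whenever $\lambda \notin \mathrm{Spec}(B^k)$. A sketch with hypothetical random real data:

```python
import numpy as np

rng = np.random.default_rng(2)
nu, nf, ns, k, tau = 4, 3, 2, 3, 0.05
B = rng.standard_normal((nu, nu)); B *= 0.5 / np.linalg.norm(B, 2)
H = rng.standard_normal((nf, nu)); M = rng.standard_normal((nu, ns))
Bp = lambda j: np.linalg.matrix_power(B, j)
Tk = sum(Bp(j) for j in range(k)); Bk = Bp(k)
Uk = sum(Bp(i).T @ H.T @ H @ Bp(k - 1 - i) for i in range(k))
Xk = sum(Bp(k - 1 - m).T @ H.T @ H @ sum(Bp(j) for j in range(m)) for m in range(1, k))
A = np.block([[Bk.T, Uk, Xk @ M],
              [np.zeros((nu, nu)), Bk, Tk @ M],
              [-tau * M.T, np.zeros((ns, nu)), np.eye(ns)]])

inner = lambda a, b: np.sum(a * b.conj())        # Hermitian inner product <a, b>
K = Tk.T @ H.T @ H @ Tk                          # T_k^* H^* H T_k (B, H real here)
evals, evecs = np.linalg.eig(A)
spec_Bk = np.linalg.eigvals(Bk)
checked = 0
for lam, v in zip(evals, evecs.T):
    y = v[2 * nu:]                               # sigma-component of the eigenvector
    if np.linalg.norm(y) < 1e-6 or np.min(np.abs(lam - spec_Bk)) < 1e-6:
        continue                                 # (46) is stated for lambda outside Spec(B^k)
    R1 = np.linalg.inv(lam * np.eye(nu) - Bk.T)  # [lambda I - (B^*)^k]^{-1}
    R2 = np.linalg.inv(lam * np.eye(nu) - Bk)    # [lambda I - B^k]^{-1}
    res = (lam - 1) * inner(y, y) + tau * inner(M.T @ R1 @ ((lam - 1) * Xk + K) @ R2 @ M @ y, y)
    assert abs(res) < 1e-7                       # equation (46) holds
    checked += 1
assert checked > 0
```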
Remark 4.5. Note that when $B = 0$ and $k \ge 2$, the shifted k-step one-shot and k-step one-shot methods are respectively equivalent to the shifted and usual gradient descent methods, therefore we retrieve the same bounds (8)–(7) for the descent step $\tau$ as for those methods.
For the analysis we use auxiliary results proved in Appendix A, and the following bounds for $s(B^k)$, $T_k$, $X_k$.
Lemma 4.6. If $\|B\| < 1$,
$$s(B^k) \le \frac{1}{1-\|B\|^k}, \qquad \|T_k\| \le \frac{1-\|B\|^k}{1-\|B\|}, \qquad \|X_k\| \le \frac{\|H\|^2(1-k\|B\|^{k-1}+(k-1)\|B\|^k)}{(1-\|B\|)^2}.$$
Proof. The bound for $s(B^k)$ is proved using Lemma A.2 and $\|B^k\| \le \|B\|^k$. Next, from (39) we have
$$\|T_k\| \le 1 + \|B\| + \dots + \|B\|^{k-1} = \frac{1-\|B\|^k}{1-\|B\|}.$$
From (40), if $k \ge 2$ we have
$$\|X_k\| \le \|H\|^2\left[\|B\|^{k-2} + \|B\|^{k-3}(1+\|B\|) + \dots + (1+\|B\|+\dots+\|B\|^{k-2})\right] = \|H\|^2\left(1 + 2\|B\| + \dots + (k-1)\|B\|^{k-2}\right) = \frac{\|H\|^2(1-k\|B\|^{k-1}+(k-1)\|B\|^k)}{(1-\|B\|)^2}.$$
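The bounds on $\|T_k\|$ and $\|X_k\|$ are easy to verify numerically (the bound on $s(B^k)$ involves a supremum over $|z| \ge 1$ and is skipped here). A sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(3)
nu, nf = 6, 4
B = rng.standard_normal((nu, nu)); B *= 0.7 / np.linalg.norm(B, 2)
H = rng.standard_normal((nf, nu))
b = np.linalg.norm(B, 2); h2 = np.linalg.norm(H, 2) ** 2
Bp = lambda j: np.linalg.matrix_power(B, j)
for k in range(2, 7):
    Tk = sum(Bp(j) for j in range(k))
    Xk = sum(Bp(k - 1 - m).T @ H.T @ H @ sum(Bp(j) for j in range(m)) for m in range(1, k))
    # bounds of Lemma 4.6 (small slack for floating point)
    assert np.linalg.norm(Tk, 2) <= (1 - b**k) / (1 - b) + 1e-12
    assert np.linalg.norm(Xk, 2) <= h2 * (1 - k * b**(k - 1) + (k - 1) * b**k) / (1 - b)**2 + 1e-12
```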
4.2 Real eigenvalues
We first find conditions on the descent step $\tau$ such that the real eigenvalues stay inside the unit disk. Recall that we have already proved that $\lambda = 1$ is not an eigenvalue for any $k$.
Proposition 4.7 (shifted k-step one-shot method). When $k \ge 2$, $\exists \tau > 0$ sufficiently small such that equation (48) admits no solution $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$. More precisely, take
• $\tau < \dfrac{2}{\|M\|^2(\|H\|^2\|T_k\|^2 + 2\|X_k\|)s(B^k)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2} \cdot \frac{2(1-\|B\|^k)^2}{(1-\|B\|^k)^2 + 2(1-k\|B\|^{k-1}+(k-1)\|B\|^k)}.$$
Proof. When $\lambda \in \mathbb{R}$, equation (48) is rewritten as
$$\lambda^2(\lambda-1)\|y\|^2 + \tau\left\|HT_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My\right\|^2 + \tau(\lambda-1)\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle = 0.$$
We show that if $\lambda > 1$ (or respectively $\lambda \le -1$) we can choose $\tau$ so that the left-hand side of the above equation is strictly positive (or respectively negative). Indeed, if $\lambda > 1$, we choose $\tau$ such that
$$\lambda^2\|y\|^2 - \tau\left|\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle\right| > 0,$$
and this can be done by taking $\tau$ such that
$$\left[\|X_k\|\|M\|^2 s(B^k)^2\right]\tau < 1.$$
If $\lambda \le -1$, we choose $\tau$ such that
$$\lambda^2(\lambda-1)\|y\|^2 + \tau\left\|HT_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My\right\|^2 + \tau(1-\lambda)\left|\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle\right| < 0,$$
and this can be done by taking $\tau$ such that
$$\left[\frac{\|H\|^2\|T_k\|^2\|M\|^2 s(B^k)^2}{2} + \|X_k\|\|M\|^2 s(B^k)^2\right]\tau < 1,$$
so we obtain the first conclusion. Finally, the second conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
Proposition 4.8 (k-step one-shot method). When $k \ge 2$, $\exists \tau > 0$ sufficiently small such that equation (49) admits no solution $\lambda \in \mathbb{R}$, $\lambda \ne 1$, $|\lambda| \ge 1$. More precisely, take
• $\tau < \dfrac{1}{\|X_k\|\|M\|^2 s(B^k)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2} \cdot \frac{(1-\|B\|^k)^2}{1-k\|B\|^{k-1}+(k-1)\|B\|^k}.$$
Proof. When $\lambda \in \mathbb{R}$, equation (49) is rewritten as
$$\lambda(\lambda-1)\|y\|^2 + \tau\left\|HT_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My\right\|^2 + \tau(\lambda-1)\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle = 0.$$
We show that we can choose $\tau$ so that the left-hand side of the above equation is strictly positive (note that $\lambda(\lambda-1) > 0$ both for $\lambda > 1$ and for $\lambda \le -1$). Indeed, if $\lambda > 1$, we choose $\tau$ such that
$$\lambda\|y\|^2 - \tau\left|\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle\right| > 0,$$
and this can be done by taking $\tau$ such that $\|X_k\|\|M\|^2 s(B^k)^2\,\tau < 1$. If $\lambda \le -1$, we choose $\tau$ such that
$$\lambda\|y\|^2 + \tau\left|\Big\langle M^*\Big[I-\frac{(B^*)^k}{\lambda}\Big]^{-1}X_k\Big(I-\frac{B^k}{\lambda}\Big)^{-1}My, y\Big\rangle\right| < 0,$$
and this is also done by taking $\tau$ such that $\|X_k\|\|M\|^2 s(B^k)^2\,\tau < 1$, so we obtain the first conclusion. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
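The explicit bound of Proposition 4.8 for $\|B\| < 1$ is straightforward to test: with $\tau$ just below it, the iteration matrix in (45) should exhibit no real eigenvalue of modulus $\ge 1$ (complex eigenvalues are the object of the next section). A sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(4)
nu, nf, ns, k = 5, 4, 2, 3
B = rng.standard_normal((nu, nu)); B *= 0.6 / np.linalg.norm(B, 2)
H = rng.standard_normal((nf, nu)); M = rng.standard_normal((nu, ns))
b = np.linalg.norm(B, 2)
bound = ((1 - b)**2 / (np.linalg.norm(H, 2)**2 * np.linalg.norm(M, 2)**2)
         * (1 - b**k)**2 / (1 - k * b**(k - 1) + (k - 1) * b**k))
tau = 0.9 * bound
Bp = lambda j: np.linalg.matrix_power(B, j)
Tk = sum(Bp(j) for j in range(k)); Bk = Bp(k)
Uk = sum(Bp(i).T @ H.T @ H @ Bp(k - 1 - i) for i in range(k))
Xk = sum(Bp(k - 1 - m).T @ H.T @ H @ sum(Bp(j) for j in range(m)) for m in range(1, k))
A = np.block([[Bk.T - tau * Xk @ M @ M.T, Uk, Xk @ M],
              [-tau * Tk @ M @ M.T, Bk, Tk @ M],
              [-tau * M.T, np.zeros((ns, nu)), np.eye(ns)]])
lam = np.linalg.eigvals(A)
real_lam = lam[np.abs(lam.imag) < 1e-10]
assert np.all(np.abs(real_lam) < 1)   # no real eigenvalue leaves the unit disk
```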
4.3 Complex eigenvalues
We now look for conditions on the descent step $\tau$ such that also the complex eigenvalues stay inside the unit disk. We first deal with the shifted k-step one-shot method.
Proposition 4.9 (shifted k-step one-shot method). When $k \ge 2$, $\exists \tau > 0$ sufficiently small such that equation (48) admits no solution $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$. In particular, if $\|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 < \frac{\pi}{6}$, take
$$\tau < \frac{\min\{\chi_1(k,\|B\|), \chi_2(k,\|B\|), \chi_3(k,\|B\|), \chi_4(k,\|B\|)\}}{\|H\|^2\|M\|^2}$$
where
$$\chi_1(k,b) = \frac{(1-b)^2(1-b^k)^2}{4b^{2k} + \sqrt{2}(1-kb^{k-1}+(k-1)b^k)(1+b^k)^2},$$
$$\chi_2(k,b) = \frac{(1-b)^2(1-b^k)^2}{\left[\frac{1}{2\sin(\theta_0/2)}(1-b^k)^2 + \sqrt{2}(1-kb^{k-1}+(k-1)b^k)\right](1+b^k)^2},$$
$$\chi_3(k,b) = \frac{(1-b)^2(1-b^k)^2}{\frac{2c\sin(\theta_0/2)}{\delta_0}b^{2k} + (1-kb^{k-1}+(k-1)b^k)\left[\frac{\sqrt{c}}{\delta_0}(1+b^{2k}) + 2\max\left\{\frac{\sqrt{c}}{\delta_0}, \frac{\sqrt{c}}{\cos 3\theta_0}\right\}b^k\right]},$$
$$\chi_4(k,b) = \frac{\left[\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0\right](1-b)^2(1-b^k)^2}{(1-b^k)^2 + 2(1-kb^{k-1}+(k-1)b^k)(1+b^k)^2},$$
and $c = \dfrac{1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2}{\cos^2\frac{5\theta_0}{2}}$.
Proof. Step 1. Rewrite equation (48) so that we can study its real and imaginary parts.
Let $\lambda = R(\cos\theta + i\sin\theta)$ in polar form where $R = |\lambda| \ge 1$ and $\theta \in (-\pi, \pi)$. Write $1/\lambda = r(\cos\phi + i\sin\phi)$ in polar form where $r = 1/|\lambda| = 1/R \le 1$ and $\phi = -\theta \in (-\pi, \pi)$. By Lemma A.3 applied to $T = B^k$, we have
$$\left(I - \frac{B^k}{\lambda}\right)^{-1} = P(\lambda) + iQ(\lambda), \qquad \left(I - \frac{(B^*)^k}{\lambda}\right)^{-1} = P(\lambda)^* + iQ(\lambda)^*$$
where $P(\lambda)$ and $Q(\lambda)$ are $\mathbb{C}^{n_u \times n_u}$-valued functions, and, omitting the dependence on $\lambda$,
$$\|P\| \le p := \begin{cases} (1+\|B^k\|)s(B^k)^2 & \text{for general } B, \\ \frac{1}{1-\|B\|^k} & \text{when } \|B\| < 1; \end{cases} \tag{50}$$
$$\|Q\| \le q_1 := \begin{cases} \|B^k\|s(B^k)^2 & \text{for general } B, \\ \frac{\|B\|^k}{1-\|B\|^k} & \text{when } \|B\| < 1; \end{cases} \tag{51}$$
$$\|Q\| \le q_2|\sin\theta|, \quad q_2 := \begin{cases} \|B^k\|s(B^k)^2 & \text{for general } B, \\ \frac{\|B\|^k}{(1-\|B\|^k)^2} & \text{when } \|B\| < 1. \end{cases} \tag{52}$$
Now we rewrite (48) as
$$\lambda^2(\lambda-1)\|y\|^2 + \tau G(P^* + iQ^*, P + iQ) + \tau(\lambda-1)L(P^* + iQ^*, P + iQ) = 0, \tag{53}$$
where
$$G(X,Y) = \langle M^*XT_k^*H^*HT_kYMy, y\rangle, \qquad L(X,Y) = \langle M^*XX_kYMy, y\rangle$$
for $X, Y \in \mathbb{C}^{n_u \times n_u}$. $G$ satisfies the following properties:
• $\forall X, Y_1, Y_2 \in \mathbb{C}^{n_u \times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(X, z_1Y_1 + z_2Y_2) = z_1G(X, Y_1) + z_2G(X, Y_2)$.
• $\forall X_1, X_2, Y \in \mathbb{C}^{n_u \times n_u}$, $\forall z_1, z_2 \in \mathbb{C}$: $G(z_1X_1 + z_2X_2, Y) = z_1G(X_1, Y) + z_2G(X_2, Y)$.
• $\forall X \in \mathbb{C}^{n_u \times n_u}$: $G(X^*, X) \in \mathbb{R}$.
• $\forall X, Y \in \mathbb{C}^{n_u \times n_u}$: $G(X, Y) + G(Y^*, X^*) \in \mathbb{R}$; indeed
$$G(X,Y) = \langle M^*XT_k^*H^*HT_kYMy, y\rangle = \langle y, M^*Y^*T_k^*H^*HT_kX^*My\rangle = \langle M^*Y^*T_k^*H^*HT_kX^*My, y\rangle^* = G(Y^*, X^*)^*.$$
Similarly, $L$ has the same properties as $G$ (note that $X_k^* = X_k$ by Lemma 4.1). With these properties of $G$ and $L$, we expand (53) and take its real and imaginary parts, so we respectively obtain:
$$\Re(\lambda^3-\lambda^2)\|y\|^2 + \tau G_1 + \tau[\Re(\lambda-1)L_1 - \Im(\lambda-1)L_2] = 0 \tag{54}$$
and
$$\Im(\lambda^3-\lambda^2)\|y\|^2 + \tau G_2 + \tau[\Im(\lambda-1)L_1 + \Re(\lambda-1)L_2] = 0 \tag{55}$$
where
$$G_1 = G(P^*, P) - G(Q^*, Q), \quad G_2 = G(P^*, Q) + G(Q^*, P), \quad L_1 = L(P^*, P) - L(Q^*, Q), \quad L_2 = L(P^*, Q) + L(Q^*, P).$$
Step 2. Find a suitable combination of equations (54) and (55), and choose $\tau$ so that we obtain a new equation with a left-hand side which is strictly positive/negative.
Let $\gamma = \gamma(\lambda) \in \mathbb{R}$ be defined by cases as in Lemma A.4. Multiplying equation (55) by $\gamma$ and summing it with equation (54), we obtain:
$$[\Re(\lambda^3-\lambda^2) + \gamma\Im(\lambda^3-\lambda^2)]\|y\|^2 + \tau G(P^* + \gamma Q^*, P + \gamma Q) - (1+\gamma^2)\tau G(Q^*, Q) + \tau([\Re(\lambda-1) + \gamma\Im(\lambda-1)]L_1 + [\gamma\Re(\lambda-1) - \Im(\lambda-1)]L_2) = 0. \tag{56}$$
Now we prepare some useful estimates.
• $\forall X \in \mathbb{C}^{n_u \times n_u}$: $0 \le G(X^*, X) = \|HT_kXMy\|^2 \le (\|H\|\|T_k\|\|M\|\|X\|)^2\|y\|^2$. Since $\|Q\| \le q_1$ and $\|Q\| \le q_2|\sin\theta|$, we have
$$G(Q^*, Q) \le (\|H\|\|T_k\|\|M\|q_1)^2\|y\|^2 \quad \text{and} \quad G(Q^*, Q) \le (\|H\|\|T_k\|\|M\|q_2|\sin\theta|)^2\|y\|^2.$$
• By the Cauchy–Schwarz inequality we have
$$|\Re(\lambda-1) + \gamma\Im(\lambda-1)| \le \sqrt{1+\gamma^2}\,|\lambda-1|; \qquad |\gamma\Re(\lambda-1) - \Im(\lambda-1)| \le \sqrt{1+\gamma^2}\,|\lambda-1|.$$
• $\forall X, Y \in \mathbb{C}^{n_u \times n_u}$: $|L(X,Y)| = |\langle M^*XX_kYMy, y\rangle| \le \|X_k\|\|M\|^2\|X\|\|Y\|\|y\|^2$. Hence
$$|L_1| \le |L(P^*, P)| + |L(Q^*, Q)| \le \|X_k\|\|M\|^2(p^2 + q_1^2)\|y\|^2, \qquad |L_2| \le |L(P^*, Q)| + |L(Q^*, P)| \le 2\|X_k\|\|M\|^2 pq_1\|y\|^2,$$
and then
$$|[\Re(\lambda-1) + \gamma\Im(\lambda-1)]L_1 + [\gamma\Re(\lambda-1) - \Im(\lambda-1)]L_2| \le \sqrt{1+\gamma^2}\,|\lambda-1|\,\|X_k\|\|M\|^2(p^2 + q_1^2 + 2pq_1)\|y\|^2 = \sqrt{1+\gamma^2}\,|\lambda-1|\,\|X_k\|\|M\|^2(p + q_1)^2\|y\|^2.$$
Now we consider four cases of $\lambda$ as in Lemma A.4:
• Case 1. $\Re(\lambda^3-\lambda^2) \ge 0$;
• Case 2. $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in [\theta_0, \pi-\theta_0] \cup [-\pi+\theta_0, -\theta_0]$ for fixed $0 < \theta_0 < \frac{\pi}{6}$;
• Case 3. $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in (-\theta_0, \theta_0)$ for fixed $0 < \theta_0 < \frac{\pi}{6}$;
• Case 4. $\Re(\lambda^3-\lambda^2) < 0$ and $\theta \in (\pi-\theta_0, \pi) \cup (-\pi, -\pi+\theta_0)$ for fixed $0 < \theta_0 < \frac{\pi}{6}$.
The four cases will be treated in the following four lemmas (Lemmas 4.10–4.13), which together give the statement of this proposition.
Lemma 4.10 (Case 1). For $k \ge 2$, equation (48) admits no solutions $\lambda$ in Case 1 if we take
• $\tau < \dfrac{s(B^k)^{-4}}{4\|H\|^2\|M\|^2\|T_k\|^2\|B^k\|^2 + \sqrt{2}\|M\|^2\|X_k\|(1+2\|B^k\|)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2} \cdot \frac{(1-\|B\|^k)^2}{4\|B\|^{2k} + \sqrt{2}(1-k\|B\|^{k-1}+(k-1)\|B\|^k)(1+\|B\|^k)^2}.$$
Proof. Writing (56) for $\gamma = \gamma_1$ as in Lemma A.4 (i) (in particular $\gamma_1^2 = 1$), we have
$$[\Re(\lambda^3-\lambda^2) + \gamma_1\Im(\lambda^3-\lambda^2)]\|y\|^2 + \tau G(P^* + \gamma_1Q^*, P + \gamma_1Q) - 2\tau G(Q^*, Q) + \tau([\Re(\lambda-1) + \gamma_1\Im(\lambda-1)]L_1 + [\gamma_1\Re(\lambda-1) - \Im(\lambda-1)]L_2) = 0. \tag{57}$$
Since $G(P^* + \gamma_1Q^*, P + \gamma_1Q) \ge 0$, and by estimating
$$G(Q^*, Q) \le (\|H\|\|T_k\|\|M\|q_2|\sin\theta|)^2\|y\|^2,$$
$$[\Re(\lambda-1) + \gamma_1\Im(\lambda-1)]L_1 + [\gamma_1\Re(\lambda-1) - \Im(\lambda-1)]L_2 \ge -\sqrt{2}\,|\lambda-1|\,\|X_k\|\|M\|^2(p+q_1)^2\|y\|^2,$$
by Lemma A.4 (i) the left-hand side of (57) will be strictly positive if $\tau$ satisfies:
$$\left[2(\|H\|\|T_k\|\|M\|q_2)^2\frac{|\sin\theta|^2}{|\lambda-1|} + \sqrt{2}\,\|X_k\|\|M\|^2(p+q_1)^2\right]\tau < 1.$$
Since $\frac{|\sin\theta|^2}{|\lambda-1|} \le \frac{|\sin\theta|^2}{2|\sin(\theta/2)|} = 2\left|\sin\frac{\theta}{2}\right|\cos^2\frac{\theta}{2} \le 2$, we have the first part of the conclusion using definitions (50), (51), (52) of $p, q_1, q_2$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
Lemma 4.11 (Case 2). For $k \ge 2$, equation (48) admits no solutions $\lambda$ in Case 2 if we take
• $\tau < \dfrac{s(B^k)^{-4}}{\left[\frac{1}{2\sin(\theta_0/2)}\|H\|^2\|M\|^2\|T_k\|^2 + \sqrt{2}\|M\|^2\|X_k\|\right](1+2\|B^k\|)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2} \cdot \frac{(1-\|B\|^k)^2}{\left[\frac{1}{2\sin(\theta_0/2)}(1-\|B\|^k)^2 + \sqrt{2}(1-k\|B\|^{k-1}+(k-1)\|B\|^k)\right](1+\|B\|^k)^2}.$$
Proof. Writing (56) for $\gamma = \gamma_2$ as in Lemma A.4 (ii) (in particular $\gamma_2^2 = 1$), we have
$$[\Re(\lambda^3-\lambda^2) + \gamma_2\Im(\lambda^3-\lambda^2)]\|y\|^2 + \tau G(P^* + \gamma_2Q^*, P + \gamma_2Q) - 2\tau G(Q^*, Q) + \tau([\Re(\lambda-1) + \gamma_2\Im(\lambda-1)]L_1 + [\gamma_2\Re(\lambda-1) - \Im(\lambda-1)]L_2) = 0. \tag{58}$$
Since $G(Q^*, Q) \ge 0$, and by estimating $\|P + \gamma_2Q\| \le \|P\| + |\gamma_2|\|Q\| = \|P\| + \|Q\| \le p + q_1$, so that
$$G(P^* + \gamma_2Q^*, P + \gamma_2Q) \le [\|H\|\|T_k\|\|M\|(p+q_1)]^2\|y\|^2,$$
and
$$[\Re(\lambda-1) + \gamma_2\Im(\lambda-1)]L_1 + [\gamma_2\Re(\lambda-1) - \Im(\lambda-1)]L_2 \le \sqrt{2}\,|\lambda-1|\,\|X_k\|\|M\|^2(p+q_1)^2\|y\|^2,$$
by Lemma A.4 (ii), the left-hand side of (58) will be strictly negative if $\tau$ satisfies:
$$\left[[\|H\|\|T_k\|\|M\|(p+q_1)]^2\frac{1}{|\lambda-1|} + \sqrt{2}\,\|X_k\|\|M\|^2(p+q_1)^2\right]\tau < 1.$$
Since $\frac{1}{|\lambda-1|} \le \frac{1}{2\sin(\theta_0/2)}$, we have the first part of the conclusion using definitions (50), (51) of $p, q_1$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
Lemma 4.12 (Case 3). Let $\delta_0 > 0$ be fixed and $c := \dfrac{1 + 2\delta_0\sin\frac{5\theta_0}{2} + \delta_0^2}{\cos^2\frac{5\theta_0}{2}}$. For $k \ge 2$, equation (48) admits no solutions $\lambda$ in Case 3 if we take
• $\tau < s(B^k)^{-4}\left[\dfrac{2c\sin\frac{\theta_0}{2}}{\delta_0}\|H\|^2\|M\|^2\|T_k\|^2\|B^k\|^2 + \dfrac{\sqrt{c}}{\delta_0}\|M\|^2\|X_k\|(1 + 2\|B^k\| + 2\|B^k\|^2) + 2\max\left\{\dfrac{\sqrt{c}}{\delta_0}, \dfrac{\sqrt{c}}{\cos 3\theta_0}\right\}\|M\|^2\|X_k\|(\|B^k\| + \|B^k\|^2)\right]^{-1}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2}(1-\|B\|^k)^2\left[\frac{2c\sin\frac{\theta_0}{2}}{\delta_0}\|B\|^{2k} + \frac{\sqrt{c}}{\delta_0}(1-k\|B\|^{k-1}+(k-1)\|B\|^k)(1+\|B\|^{2k}) + 2\max\left\{\frac{\sqrt{c}}{\delta_0}, \frac{\sqrt{c}}{\cos 3\theta_0}\right\}(1-k\|B\|^{k-1}+(k-1)\|B\|^k)\|B\|^k\right]^{-1}.$$
Proof. Writing (56) for $\gamma = \gamma_3$ as in Lemma A.4 (iii), we have
$$[\Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2)]\|y\|^2 + \tau G(P^* + \gamma_3Q^*, P + \gamma_3Q) - (1+\gamma_3^2)\tau G(Q^*, Q) + \tau([\Re(\lambda-1) + \gamma_3\Im(\lambda-1)]L_1 + [\gamma_3\Re(\lambda-1) - \Im(\lambda-1)]L_2) = 0. \tag{59}$$
Since $G(P^* + \gamma_3Q^*, P + \gamma_3Q) \ge 0$, the left-hand side of (59) will be strictly positive if $\tau$ satisfies:
$$\tau < \left(\frac{1}{\|y\|^2}\left[\frac{(1+\gamma_3^2)G(Q^*, Q)}{\Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2)} + \frac{|L_1|\,|\Re(\lambda-1) + \gamma_3\Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2)} + \frac{|L_2|\,|\gamma_3\Re(\lambda-1) - \Im(\lambda-1)|}{\Re(\lambda^3-\lambda^2) + \gamma_3\Im(\lambda^3-\lambda^2)}\right]\right)^{-1}.$$
By estimating
• $G(Q^*, Q) \le (\|H\|\|T_k\|\|M\|q_2|\sin\theta|)^2\|y\|^2$;
• $|L_1| \le \|X_k\|\|M\|^2(p^2+q_1^2)\|y\|^2$;
• $|L_2| \le 2\|X_k\|\|M\|^2 pq_1\|y\|^2$;
and using Lemma A.4 (iii), it suffices to choose $\tau$ such that
$$\left[(1+\gamma_3^2)(\|H\|\|T_k\|\|M\|q_2)^2\frac{2|\sin\frac{\theta}{2}|\cos^2\frac{\theta}{2}}{\delta_0} + \|X_k\|\|M\|^2(p^2+q_1^2)\frac{\sqrt{1+\gamma_3^2}}{\delta_0} + 2\|X_k\|\|M\|^2 pq_1\max\left\{\frac{\sqrt{1+\gamma_3^2}}{\delta_0}, \frac{\sqrt{1+\gamma_3^2}}{\cos 3\theta_0}\right\}\right]\tau < 1.$$
Noting that $c = 1+\gamma_3^2$, the final result is obtained by definitions (50), (51), (52) of $p, q_1, q_2$. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
Lemma 4.13 (Case 4). For $k \ge 2$, equation (48) admits no solutions $\lambda$ in Case 4 if we take
• $\tau < \dfrac{\left[\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0\right]s(B^k)^{-4}}{\|H\|^2\|M\|^2\|T_k\|^2(1+\|B^k\|)^2 + 2\|M\|^2\|X_k\|(1+2\|B^k\|)^2}$ if the denominator of the right-hand side is not 0;
• any $\tau > 0$ otherwise.
Moreover, if $\|B\| < 1$, we can take
$$\tau < \frac{(1-\|B\|)^2}{\|H\|^2\|M\|^2} \cdot \frac{\left[\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0\right](1-\|B\|^k)^2}{(1-\|B\|^k)^2 + 2(1-k\|B\|^{k-1}+(k-1)\|B\|^k)(1+\|B\|^k)^2}.$$
Proof. Here it is enough to consider (54). By the properties of $G$,
$$G(Q^*, Q) \ge 0, \qquad G(P^*, P) \le (\|H\|\|T_k\|\|M\|p)^2\|y\|^2,$$
and Lemma A.4 (iv), we see that the left-hand side of (54) will be strictly negative if $\tau$ satisfies:
$$\left[(\|H\|\|T_k\|\|M\|p)^2\frac{1}{\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0} + \|X_k\|\|M\|^2(p+q_1)^2\frac{2}{\sin\left(\frac{\pi}{2}-3\theta_0\right) + \cos 2\theta_0}\right]\tau < 1.$$
Definitions (50), (51) of $p, q_1$ lead to the final result. Finally, the conclusion in the case $\|B\| < 1$ can be obtained by Lemma 4.6.
Similarly, with the help of Lemma A.5, we prove for the k-step one-shot method the analogue of Proposition 4.9. In particular, note that here just three cases of $\lambda$ need to be considered, because the analogue of the fourth one is excluded by Lemma A.5 (iv).
Proposition 4.14 (k-step one-shot method). $\exists \tau > 0$ sufficiently small such that equation (49) admits no solution $\lambda \in \mathbb{C}\setminus\mathbb{R}$, $|\lambda| \ge 1$. In particular, if $\|B\| < 1$, given any $\delta_0 > 0$ and $0 < \theta_0 < \frac{\pi}{4}$, take
$$\tau < \frac{\min\{\psi_1(k,\|B\|), \psi_2(k,\|B\|), \psi_3(k,\|B\|)\}}{\|H\|^2\|M\|^2}$$
where
$$\psi_1(k,b) = \frac{(1-b)^2(1-b^k)^2}{4b^{2k} + \sqrt{2}(1-kb^{k-1}+(k-1)b^k)(1+b^k)^2},$$
$$\psi_2(k,b) = \frac{(1-b)^2(1-b^k)^2}{\left[\frac{1}{2\sin(\theta_0/2)}(1-b^k)^2 + \sqrt{2}(1-kb^{k-1}+(k-1)b^k)\right](1+b^k)^2},$$
$$\psi_3(k,b) = \frac{(1-b)^2(1-b^k)^2}{\frac{2c\sin(\theta_0/2)}{\delta_0}b^{2k} + (1-kb^{k-1}+(k-1)b^k)\left[\frac{\sqrt{c}}{\delta_0}(1+b^{2k}) + 2\max\left\{\frac{\sqrt{c}}{\delta_0}, \frac{\sqrt{c}}{\cos 2\theta_0}\right\}b^k\right]},$$
and $c = \dfrac{1 + 2\delta_0\sin\frac{3\theta_0}{2} + \delta_0^2}{\cos^2\frac{3\theta_0}{2}}$.
4.4 Final result (k≥2)
Considering Remark 4.5, and taking the minimum between the bound in Proposition 4.7 for real eigenvalues and the bound in Proposition 4.9 for complex eigenvalues, we finally obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the shifted multi-step one-shot method.
Theorem 4.15 (Convergence of shifted k-step one-shot, $k \ge 2$). Under assumption (4), the shifted k-step one-shot method, $k \ge 2$, converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
$$\tau < \frac{\chi(k, \|B\|)}{\|H\|^2\|M\|^2},$$
where $\chi(k, \|B\|)$ is an explicit function of $k$ and $\|B\|$.
Similarly, by combining Remark 4.5 and Propositions 4.8 and 4.14, we obtain a sufficient condition on the descent step $\tau$ to ensure convergence of the multi-step one-shot method.
Theorem 4.16 (Convergence of k-step one-shot, $k \ge 2$). Under assumption (4), the k-step one-shot method, $k \ge 2$, converges for sufficiently small $\tau$. In particular, for $\|B\| < 1$, it is enough to take
$$\tau < \frac{\psi(k, \|B\|)}{\|H\|^2\|M\|^2},$$
where $\psi(k, \|B\|)$ is an explicit function of $k$ and $\|B\|$.
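Theorems 4.15–4.16 can be illustrated by monitoring the spectral radius of the block iteration matrix in (45) as a function of $\tau$: for $\tau$ small enough it falls strictly below 1, so the coupled error recursion contracts. A sketch with hypothetical random data (the helper `one_shot_matrix` is ours, not from the report):

```python
import numpy as np

def one_shot_matrix(B, H, M, k, tau):
    """Block iteration matrix (45) of the k-step one-shot error recursion."""
    nu, ns = B.shape[0], M.shape[1]
    Bp = lambda j: np.linalg.matrix_power(B, j)
    Tk = sum(Bp(j) for j in range(k)); Bk = Bp(k)
    Uk = sum(Bp(i).T @ H.T @ H @ Bp(k - 1 - i) for i in range(k))
    Xk = sum(Bp(k - 1 - m).T @ H.T @ H @ sum(Bp(j) for j in range(m)) for m in range(1, k))
    return np.block([[Bk.T - tau * Xk @ M @ M.T, Uk, Xk @ M],
                     [-tau * Tk @ M @ M.T, Bk, Tk @ M],
                     [-tau * M.T, np.zeros((ns, nu)), np.eye(ns)]])

rng = np.random.default_rng(5)
nu, nf, ns, k = 5, 4, 2, 3
B = rng.standard_normal((nu, nu)); B *= 0.5 / np.linalg.norm(B, 2)
H = rng.standard_normal((nf, nu)); M = rng.standard_normal((nu, ns))
rho = lambda tau: np.max(np.abs(np.linalg.eigvals(one_shot_matrix(B, H, M, k, tau))))
assert rho(1e-4) < 1          # convergence regime for small tau
```

Scanning $\tau$ upward on such an instance typically locates an empirical stability threshold far larger than the (pessimistic) theoretical bound.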
5 Inverse problem with complex forward problem and real parameter
In this section we show that a linear inverse problem with associated complex forward problem and real parameter can be transformed into a linear inverse problem which matches the real model at the beginning of Section 2, so that the previous theory applies. More precisely, here we study the state equation
$$u = Bu + M\sigma + F$$
where $u \in \mathbb{C}^{n_u}$, $\sigma \in \mathbb{R}^{n_\sigma}$, $B \in \mathbb{C}^{n_u \times n_u}$, $M \in \mathbb{C}^{n_u \times n_\sigma}$. We measure $Hu(\sigma) = f$ where $H \in \mathbb{C}^{n_f \times n_u}$, and we want to recover $\sigma$ from $f$. Using the method of least squares, we consider the cost functional
$$J(\sigma) := \frac{1}{2}\|Hu(\sigma) - f\|^2,$$
then by the Lagrangian technique with
$$\mathcal{L}(u, v, \sigma) = \frac{1}{2}\|Hu - f\|^2 + \Re\langle Bu + M\sigma + F - u, v\rangle,$$
we can define the adjoint state $p = p(\sigma)$ such that
$$p = B^*p + H^*(Hu(\sigma) - f),$$
which allows us to compute
$$\nabla J(\sigma) = \Re(M^*p).$$
By separating the real and imaginary parts of all vectors and matrices, $u = u_1 + iu_2$, $p = p_1 + ip_2$, $B = B_1 + iB_2$, $M = M_1 + iM_2$, $F = F_1 + iF_2$, $H = H_1 + iH_2$, $f = f_1 + if_2$, we can transform this inverse problem with complex forward problem into the inverse problem with real forward problem introduced at the beginning of Section 2. Indeed, note that $B^* = B_1^* - iB_2^*$, $M^* = M_1^* - iM_2^*$, $H^* = H_1^* - iH_2^*$, so we have
$$u_1 + iu_2 = (B_1 + iB_2)(u_1 + iu_2) + (M_1 + iM_2)\sigma + (F_1 + iF_2),$$
$$p_1 + ip_2 = (B_1^* - iB_2^*)(p_1 + ip_2) + (H_1^* - iH_2^*)[(H_1 + iH_2)(u_1 + iu_2) - (f_1 + if_2)],$$
$$\nabla J(\sigma) = \Re[(M_1^* - iM_2^*)(p_1 + ip_2)],$$
which implies
$$u_1 = B_1u_1 - B_2u_2 + M_1\sigma + F_1,$$
$$u_2 = B_2u_1 + B_1u_2 + M_2\sigma + F_2,$$
$$p_1 = B_1^*p_1 + B_2^*p_2 + (H_1^*H_1 + H_2^*H_2)u_1 - (H_2^*H_1 - H_1^*H_2)u_2 - (H_1^*f_1 + H_2^*f_2),$$
$$p_2 = -B_2^*p_1 + B_1^*p_2 + (H_2^*H_1 - H_1^*H_2)u_1 + (H_1^*H_1 + H_2^*H_2)u_2 - (-H_2^*f_1 + H_1^*f_2),$$
$$\nabla J(\sigma) = M_1^*p_1 + M_2^*p_2.$$
By setting
$$\tilde{u} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \quad \tilde{p} = \begin{pmatrix} p_1 \\ p_2 \end{pmatrix}, \quad \tilde{B} = \begin{pmatrix} B_1 & -B_2 \\ B_2 & B_1 \end{pmatrix}, \quad \tilde{M} = \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}, \quad \tilde{F} = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}, \quad \tilde{H} = \begin{pmatrix} H_1 & -H_2 \\ H_2 & H_1 \end{pmatrix}, \quad \tilde{f} = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix},$$
we have
$$\tilde{u} = \tilde{B}\tilde{u} + \tilde{M}\sigma + \tilde{F}, \qquad \tilde{p} = \tilde{B}^*\tilde{p} + \tilde{H}^*(\tilde{H}\tilde{u} - \tilde{f}), \qquad \nabla J(\sigma) = \tilde{M}^*\tilde{p},$$
which has the same structure as the inverse problem at the beginning of Section 2.
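The equivalence between the complex system and its doubled real form can be verified directly (hypothetical random data; the helper `blk` realizes the real block embedding of a complex matrix):

```python
import numpy as np

rng = np.random.default_rng(6)
nu, ns, nf = 4, 2, 3
cplx = lambda m, n: rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
B = cplx(nu, nu); B *= 0.5 / np.linalg.norm(B, 2)
M, H = cplx(nu, ns), cplx(nf, nu)
F = cplx(nu, 1)[:, 0]; f = cplx(nf, 1)[:, 0]
sigma = rng.standard_normal(ns)                    # the parameter stays real

blk = lambda A: np.block([[A.real, -A.imag], [A.imag, A.real]])
Bt, Ht = blk(B), blk(H)                            # B~, H~
Mt = np.vstack([M.real, M.imag])                   # M~
Ft = np.concatenate([F.real, F.imag]); ft = np.concatenate([f.real, f.imag])
split = lambda z: np.concatenate([z.real, z.imag])

u = np.linalg.solve(np.eye(nu) - B, M @ sigma + F)                 # complex state
p = np.linalg.solve(np.eye(nu) - B.conj().T, H.conj().T @ (H @ u - f))
ut = np.linalg.solve(np.eye(2 * nu) - Bt, Mt @ sigma + Ft)         # doubled real state
pt = np.linalg.solve(np.eye(2 * nu) - Bt.T, Ht.T @ (Ht @ ut - ft))

assert np.allclose(ut, split(u))
assert np.allclose(pt, split(p))
assert np.allclose((M.conj().T @ p).real, Mt.T @ pt)               # same gradient
```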
Finally, we close this section with two lemmas that match the assumptions of the inverse problem with complex state variable with the assumptions of the transformed inverse problem with real state variable.
Lemma 5.1. $\mathrm{Spec}(\tilde{B}) = \mathrm{Spec}(B) \cup \mathrm{Spec}(\overline{B})$.
Proof. By writing
$$\tilde{B} = \begin{pmatrix} B_1 & -B_2 \\ B_2 & B_1 \end{pmatrix} = \underbrace{\begin{pmatrix} I & I \\ iI & -iI \end{pmatrix}}_{C^{-1}} \begin{pmatrix} \overline{B} & 0 \\ 0 & B \end{pmatrix} \underbrace{\begin{pmatrix} \frac{1}{2}I & -\frac{i}{2}I \\ \frac{1}{2}I & \frac{i}{2}I \end{pmatrix}}_{C}, \tag{60}$$
we find that $\det(\tilde{B} - \lambda I) = \det(B - \lambda I)\det(\overline{B} - \lambda I)$. The conclusion is then deduced thanks to the fact that $\mathrm{Spec}(\overline{B}) = \overline{\mathrm{Spec}(B)}$.
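Lemma 5.1 can be checked on a random complex matrix (a minimal sketch, comparing the two spectra as sets):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Bt = np.block([[B.real, -B.imag], [B.imag, B.real]])      # the real matrix B~
ev = np.linalg.eigvals(B)
expected = np.concatenate([ev, ev.conj()])                # Spec(B) union Spec(conj(B))
got = np.linalg.eigvals(Bt)
assert len(got) == len(expected)
for z in expected:
    assert np.min(np.abs(got - z)) < 1e-8                 # every expected eigenvalue appears
```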
Lemma 5.2. Assume that $\rho(B) < 1$ and $H(I-B)^{-1}M$ is injective. Then $\rho(\tilde{B}) < 1$ and $\tilde{H}(\tilde{I}-\tilde{B})^{-1}\tilde{M}$ is injective, where $\tilde{I} \in \mathbb{R}^{2n_u \times 2n_u}$ is the identity matrix.
Proof. The previous lemma says that $\rho(\tilde{B}) = \rho(B) < 1$. Therefore $(\tilde{I}-\tilde{B})^{-1}$ is well-defined and, thanks to (60),
$$(\tilde{I}-\tilde{B})^{-1} = \underbrace{\begin{pmatrix} I & I \\ iI & -iI \end{pmatrix}}_{C^{-1}} \begin{pmatrix} (I-\overline{B})^{-1} & 0 \\ 0 & (I-B)^{-1} \end{pmatrix} \underbrace{\begin{pmatrix} \frac{1}{2}I & -\frac{i}{2}I \\ \frac{1}{2}I & \frac{i}{2}I \end{pmatrix}}_{C} = \frac{1}{2}\begin{pmatrix} (I-\overline{B})^{-1} + (I-B)^{-1} & -i(I-\overline{B})^{-1} + i(I-B)^{-1} \\ i(I-\overline{B})^{-1} - i(I-B)^{-1} & (I-\overline{B})^{-1} + (I-B)^{-1} \end{pmatrix}.$$
Now we have
$$\tilde{H}(\tilde{I}-\tilde{B})^{-1}\tilde{M} = \frac{1}{2}\begin{pmatrix} H_1 & -H_2 \\ H_2 & H_1 \end{pmatrix}\begin{pmatrix} (I-\overline{B})^{-1} + (I-B)^{-1} & -i(I-\overline{B})^{-1} + i(I-B)^{-1} \\ i(I-\overline{B})^{-1} - i(I-B)^{-1} & (I-\overline{B})^{-1} + (I-B)^{-1} \end{pmatrix}\begin{pmatrix} M_1 \\ M_2 \end{pmatrix}$$
$$= \frac{1}{2}\begin{pmatrix} \overline{H}(I-\overline{B})^{-1} + H(I-B)^{-1} & -i\overline{H}(I-\overline{B})^{-1} + iH(I-B)^{-1} \\ i\overline{H}(I-\overline{B})^{-1} - iH(I-B)^{-1} & \overline{H}(I-\overline{B})^{-1} + H(I-B)^{-1} \end{pmatrix}\begin{pmatrix} M_1 \\ M_2 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} \overline{H}(I-\overline{B})^{-1}\overline{M} + H(I-B)^{-1}M \\ i\overline{H}(I-\overline{B})^{-1}\overline{M} - iH(I-B)^{-1}M \end{pmatrix}.$$
Now assume that there exists $x \in \mathbb{C}^{n_\sigma}$ such that $\tilde{H}(\tilde{I}-\tilde{B})^{-1}\tilde{M}x = 0$; then
$$\begin{cases} [\overline{H}(I-\overline{B})^{-1}\overline{M} + H(I-B)^{-1}M]x = 0 \\ [i\overline{H}(I-\overline{B})^{-1}\overline{M} - iH(I-B)^{-1}M]x = 0 \end{cases}$$
or, equivalently,
$$\begin{cases} [\overline{H}(I-\overline{B})^{-1}\overline{M} + H(I-B)^{-1}M]x = 0 \\ [-\overline{H}(I-\overline{B})^{-1}\overline{M} + H(I-B)^{-1}M]x = 0. \end{cases}$$
By summing up these two equations we deduce that $H(I-B)^{-1}Mx = 0$, then $x = 0$ thanks to the injectivity of $H(I-B)^{-1}M$.
6 Numerical experiments
Let us introduce a toy model to illustrate numerically the performance of the different methods. Given $\Omega \subset \mathbb{R}^n$ an open bounded Lipschitz domain, we consider the direct problem for the linearized scattered field $u \in H^2(\Omega)$ given by the Helmholtz equation
$$\begin{cases} \operatorname{div}(\tilde{\sigma}_0\nabla u) + \tilde{k}^2u = \operatorname{div}(\sigma\nabla u_0) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{61}$$
where the incident field $u_0 : \Omega \to \mathbb{R}$ satisfies
$$\begin{cases} \operatorname{div}(\tilde{\sigma}_0\nabla u_0) + \tilde{k}^2u_0 = 0 & \text{in } \Omega, \\ u_0 = f & \text{on } \partial\Omega, \end{cases} \tag{62}$$
with the datum $f : \partial\Omega \to \mathbb{R}$. Here $\sigma : \Omega \to \mathbb{R}$ is such that $\sigma|_{\partial\Omega} = 0$, and $\tilde{\sigma}_0 = \sigma_0 + \delta\sigma_r$ is a given function with $\delta \ge 0$ and random $\sigma_r$. More precisely, given $\tilde{\sigma}_0$ and $f$, we solve for $u_0 = u_0(f)$ in (62), then insert $u_0$ into (61) to solve for $u = u(\sigma)$. The variational formulations for $u$ and $u_0$ are respectively
$$\int_\Omega \tilde{\sigma}_0\nabla u\cdot\nabla v - \int_\Omega \tilde{k}^2uv = \int_\Omega \sigma\nabla u_0\cdot\nabla v, \quad \forall v \in H_0^1(\Omega), \ \text{and } u = 0 \text{ on } \partial\Omega, \tag{63}$$
$$\int_\Omega \tilde{\sigma}_0\nabla u_0\cdot\nabla v - \int_\Omega \tilde{k}^2u_0v = 0, \quad \forall v \in H_0^1(\Omega), \ \text{and } u_0 = f \text{ on } \partial\Omega. \tag{64}$$
We are interested in the inverse problem of finding $\sigma$ from the measurement $Hu(\sigma)$, where $Hu := \tilde{\sigma}_0\frac{\partial u}{\partial\nu}\big|_{\partial\Omega}$. To solve this inverse problem we use the method of least squares. Denoting by $\sigma_{ex}$ the exact $\sigma$ and by $g = \tilde{\sigma}_0\frac{\partial u(\sigma_{ex})}{\partial\nu}\big|_{\partial\Omega}$ the corresponding measurement, we consider the cost functional
$$J(\sigma) = \frac{1}{2}\|Hu(\sigma) - g\|^2_{L^2(\partial\Omega)} = \frac{1}{2}\int_{\partial\Omega}\Big(\tilde{\sigma}_0\frac{\partial u(\sigma)}{\partial\nu} - g\Big)^2.$$
The Lagrangian technique allows us to compute the gradient $\nabla_\sigma J(\sigma) = -\nabla u_0\cdot\nabla p(\sigma)$, where the adjoint state $p = p(\sigma)$ satisfies
$$\int_\Omega \tilde{\sigma}_0\nabla p\cdot\nabla v - \int_\Omega \tilde{k}^2pv = 0, \quad \forall v \in H_0^1(\Omega), \ \text{and } p = \tilde{\sigma}_0\frac{\partial u(\sigma)}{\partial\nu}\Big|_{\partial\Omega} - g \text{ on } \partial\Omega. \tag{65}$$
By discretizing $u$ with $\mathbb{P}_1$ finite elements on a mesh $\mathcal{T}_h^u$ of $\Omega$, and $\sigma$ with $\mathbb{P}_0$ finite elements on a coarser mesh $\mathcal{T}_h^\sigma$ of $\Omega$, the discretization of (63) can be written as the linear system $A_1\vec{u} = A_2\vec{\sigma}$, where $\vec{u} \in \mathbb{R}^{n_u}$, $\vec{\sigma} \in \mathbb{R}^{n_\sigma}$. More precisely, $A_1$ and $A_2$ are respectively issued from the discretization of $\int_\Omega \tilde{\sigma}_0\nabla u\cdot\nabla v - \int_\Omega \tilde{k}^2uv$ and $\int_\Omega \sigma\nabla u_0\cdot\nabla v$, where the Dirichlet boundary conditions are imposed by the penalty method. To rewrite the system in the form (1), we consider the naive splitting $A_1 = A_{11} + \delta A_{12}$, where $A_{11}$ and $A_{12}$ are respectively issued from the discretization of $\int_\Omega \sigma_0\nabla u\cdot\nabla v - \int_\Omega \tilde{k}^2uv$ and $\int_\Omega \sigma_r\nabla u\cdot\nabla v$. Then we get
$$\vec{u} = A_{11}^{-1}(-\delta A_{12}\vec{u} + A_2\vec{\sigma}) \quad \text{and} \quad \vec{u} = 0 \text{ on } \partial\Omega,$$
and
$$\vec{p} = A_{11}^{-1}(-\delta A_{12}\vec{p}) \quad \text{and} \quad \vec{p} = H\vec{u} - \vec{g} \text{ on } \partial\Omega,$$
where $H \in \mathbb{R}^{n_f \times n_u}$ is the discretization of the above operator $H$, by abuse of notation. Choosing $\delta$ such that $\delta\|A_{11}^{-1}A_{12}\|_2 < 1$, we consider (3) with $B = -\delta A_{11}^{-1}A_{12}$, $M = A_{11}^{-1}A_2$, $F = 0$. The application of $A_{11}^{-1}$, which has the same size as the matrix $A_1$, is done by a direct solver; more practical fixed point iterations will be investigated in the future.
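For concreteness, one outer iteration of the k-step one-shot method on a generic dense system $u = Bu + M\sigma + F$ reads: update $\sigma$ with the current adjoint, then take $k$ warm-started fixed-point steps on the forward and adjoint states (the adjoint step using the forward iterate before its update). A sketch on hypothetical random data, in place of the FreeFEM matrices (the step $\tau$ below is a heuristic choice, not the bound from the theorems):

```python
import numpy as np

rng = np.random.default_rng(8)
nu, nf, ns, k = 6, 4, 2, 3
B = rng.standard_normal((nu, nu)); B *= 0.2 / np.linalg.norm(B, 2)
M = rng.standard_normal((nu, ns)); H = rng.standard_normal((nf, nu))
F = rng.standard_normal(nu)
sigma_ex = np.array([10.0, 0.0])
g = H @ np.linalg.solve(np.eye(nu) - B, M @ sigma_ex + F)   # synthetic measurement

S = H @ np.linalg.solve(np.eye(nu) - B, M)                  # reduced forward operator
tau = 0.3 / np.linalg.norm(S, 2) ** 2                       # hypothetical descent step

sigma = np.array([12.0, 1.0])                               # initial guess
u = np.zeros(nu); p = np.zeros(nu)
for n in range(200000):
    sigma = sigma - tau * M.T @ p                           # gradient step
    for _ in range(k):                                      # k inner fixed-point steps
        p, u = B.T @ p + H.T @ (H @ u - g), B @ u + M @ sigma + F
    if np.linalg.norm(M.T @ p) < 1e-12:                     # approximate gradient norm
        break

assert np.linalg.norm(sigma - sigma_ex) < 1e-4
```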
Figure 1: Domain with six source points for the numerical experiments. The unknown σis
supported on the three squares.
We then perform some numerical experiments in FreeFEM [12] with the following setting:
• Wavenumber $\tilde{k} = 2\pi$, $\sigma_0 = 1$, $\delta = 0.01$; $\sigma_r$ is a random real function with range in the interval $[1, 2]$.
• Wavelength $\lambda = \frac{2\pi}{\tilde{k}}\sqrt{\sigma_0} = 1$, mesh size $h = \frac{\lambda}{20} = 0.05$. The domain $\Omega$ is the disk shown in Figure 1, where the squares are the support of the function $\sigma$. Here $n_u = 5853$, $n_\sigma = 6$.
• We test with 6 data $f$ given by the zero-order Bessel function of the second kind centered at the points shown in Figure 1, and the cost functional is the normalized sum of the contributions corresponding to the different data.
• We take $\sigma_{ex} = 10$ in every square and 0 otherwise. The initial guess for the inverse problem is 12 in every square and 0 otherwise.
• For the first iteration, we perform a line search to adapt the descent step $\tau$, using a direct solver for the forward and adjoint problems.
• The stopping rule for the outer iteration is based on the relative value of the cost functional and on the relative norm of the gradient, with a tolerance of $10^{-5}$.
Recall that $k$ is the number of inner iterations on the direct and adjoint problems. We are interested in two experiments.
In the first experiment, we study the dependence on the descent step $\tau$. In Figures 2a and 2b we respectively fix $k = 1$ and $k = 2$ and compare the $k$-step one-shot methods with the usual gradient descent method. On the horizontal axis we indicate the (outer) iteration number $n$ in (5) and (9). We can verify that for sufficiently small $\tau$, both one-shot methods converge. In particular, for $\tau = 2$, while gradient descent and 2-step one-shot converge, 1-step one-shot diverges. Oscillations may appear on the convergence curve for certain values of $\tau$, but they gradually vanish when $\tau$ gets smaller. For sufficiently small $\tau$, the convergence curves of both one-shot methods are comparable to that of gradient descent.
In the second experiment, we study the dependence on the number of inner iterations $k$, for fixed $\tau$. First (Figures 2c–2d), we investigate for which $k$ the convergence curve of $k$-step one-shot is comparable with that of usual gradient descent. As in the previous pictures, on the horizontal axis we indicate the (outer) iteration number $n$ in (5) and (9). For $\tau = 2$ (see Figure 2c), we observe that for $k = 3, 4$ the convergence curves of $k$-step one-shot are close to the one of usual gradient descent. Note that with 3 inner iterations the $L^2$ error between $u_n$ and the exact solution to the forward problem ranges between $4.3\cdot10^{-6}$ and $0.0136$ for different $n$ in (9); in fact this error is rather significant at the beginning, then it tends to reduce as we get closer to convergence for the parameter $\sigma$. Therefore incomplete inner iterations on the forward problem are enough to obtain good precision on the solution of the inverse problem. In the very particular case $\tau = 2.5$ (see Figure 2d), we observe an interesting phenomenon: when $k = 3, 5, 10$, with $k$-step one-shot the cost functional decreases even faster than with usual gradient descent. For bigger $k$, for example $k = 14$, the convergence curve of one-shot is close to the one of usual gradient descent, as expected. Next (Figures 2e–2f), since the overall cost of the $k$-step one-shot method increases with $k$, we indicate on the horizontal axis the accumulated inner iteration number, which sums up $k$ from an outer iteration to the next. More precisely, because at the first outer iteration we perform a step search by a direct solver, we set the first accumulated inner iteration number to 1; for the following outer iterations $n \ge 2$, the accumulated inner iteration number is set to $1 + (n-1)k$. In Figures 2e–2f we replot the results for the converging $k$-step one-shot methods of Figures 2c–2d with respect to the accumulated inner iteration number. For $\tau = 2$ (see Figure 2e), while $k = 2$ presents some oscillations, quite interestingly it appears that $k = 3$ gives a faster decrease of the cost functional than $k = 4$, at least after the first iterations. For $\tau = 2.5$ (see Figure 2f) we observe that $k = 3$ is enough for the decrease of the cost functional, but with some oscillations, and the considered higher $k$ again appears to give a slower decrease.
Finally we fix two particular values of τand compare all considered methods in Figure 4. We
note that shifted methods present more oscillations with respect to non-shifted ones, especially
for larger τ.
Figure 2: Convergence curves of usual gradient descent and k-step one-shot. (a) Usual gradient descent and 1-step one-shot for different descent steps $\tau$. (b) Usual gradient descent and 2-step one-shot for different descent steps $\tau$. (c) Usual gradient descent and $k$-step one-shot for different $k$ with $\tau = 2$. (d) Usual gradient descent and $k$-step one-shot for different $k$ with $\tau = 2.5$. (e) $k$-step one-shot for different $k$ with $\tau = 2$. (f) $k$-step one-shot for different $k$ with $\tau = 2.5$.
Figure 3: Convergence curves of shifted gradient descent and shifted k-step one-shot. (a) Shifted gradient descent and shifted 1-step one-shot for different descent steps $\tau$. (b) Shifted gradient descent and shifted 2-step one-shot for different descent steps $\tau$. (c) Shifted gradient descent and shifted $k$-step one-shot for different $k$ with $\tau = 0.25$. (d) Shifted gradient descent and shifted $k$-step one-shot for different $k$ with $\tau = 0.5$. (e) Shifted $k$-step one-shot for different $k$ with $\tau = 0.25$. (f) Shifted $k$-step one-shot for different $k$ with $\tau = 0.5$.
Figure 4: Comparison of usual gradient descent and k-step one-shot with shifted gradient descent and shifted k-step one-shot. (a) Convergence curves with $\tau = 0.5$. (b) Convergence curves with $\tau = 1.3$.
7 Conclusion
We have proved sufficient conditions on the descent step for the convergence of two variants of multi-step one-shot methods. Although these bounds on the descent step are not optimal, to our knowledge no other bounds that are explicit in the number of inner iterations are available in the literature for multi-step one-shot methods. Furthermore, we have shown in the numerical experiments that very few inner iterations on the forward and adjoint problems are enough to guarantee good convergence of the inversion algorithm.
These encouraging numerical results are preliminary in the sense that the considered fixed
point iteration is not a practical one, since it involves a direct solve of a problem of the same
size as the original forward problem. We will investigate in the future iterative solvers based
on domain decomposition methods (see e.g. [3]), which are well adapted to large-scale problems.
In addition, fixed point iterations could be replaced by more efficient Krylov subspace methods,
such as conjugate gradient or GMRES.
Another interesting issue is how to adapt the number of inner iterations in the course of
the outer iterations. Moreover, based on this linear inverse problem study, we plan to tackle
non-linear and time-dependent inverse problems.
References
[1] S. Barnett. Polynomials and linear control systems, volume 77 of Pure Appl. Math. Marcel
Dekker, Inc., New York, NY, 1983.
[2] M. Burger and W. Mühlhuber. Iterative regularization of parameter identification problems
by sequential quadratic programming methods. Inverse Problems, 18:943–969, 2002.
[3] V. Dolean, P. Jolivet, and F. Nataf. An Introduction to Domain Decomposition Methods:
Algorithms, Theory, and Parallel Implementation. Society for Industrial and Applied Math-
ematics, Philadelphia, PA, 2015.
[4] N. Gauger, A. Griewank, A. Hamdi, C. Kratzenstein, E. Özkaya, and T. Slawig. Automated
extension of fixed point PDE solvers for optimal design with bounded retardation. In Constrained Optimization and Optimal Control for Partial Differential Equations, International
Series of Numerical Mathematics, pages 99–122. Springer Basel, 2012.
[5] A. Greenbaum. Iterative Methods for Solving Linear Systems. Number 17 in Frontiers in
Applied Mathematics. Soc. for Industrial and Applied Math, Philadelphia, 1997.
[6] A. Griewank. Projected Hessians for Preconditioning in One-Step One-Shot Design Opti-
mization. In Large-Scale Nonlinear Optimization, volume 83, pages 151–171. Springer US,
Boston, MA, 2006. Series Title: Nonconvex Optimization and Its Applications.
[7] S. Günther, N. R. Gauger, and Q. Wang. Simultaneous single-step one-shot optimization
with unsteady PDEs. Journal of Computational and Applied Mathematics, 294:12–22, 2016.
[8] E. Haber and U. M. Ascher. Preconditioned all-at-once methods for large, sparse parameter
estimation problems. Inverse Problems, 17(6):1847–1864, 2001.
[9] A. Hamdi and A. Griewank. Reduced quasi-Newton method for simultaneous design and
optimization. Computational Optimization and Applications, 49(3):521–548, 2009.
[10] A. Hamdi and A. Griewank. Properties of an augmented Lagrangian for design optimization.
Optimization Methods and Software, 25(4):645–664, 2010.
[11] S.B. Hazra, V. Schulz, J. Brezillon, and N.R. Gauger. Aerodynamic shape optimization
using simultaneous pseudo-timestepping. Journal of Computational Physics, 204(1):46–64,
2005.
[12] F. Hecht. New development in FreeFem++. J. Numer. Math., 20(3-4):251–265, 2012.
[13] E.I. Jury. On the roots of a real polynomial inside the unit circle and a stability criterion for
linear discrete systems. IFAC Proceedings Volumes, 1(2):142–153, 1963. 2nd International
IFAC Congress on Automatic and Remote Control: Theory, Basle, Switzerland, 1963.
[14] E.I. Jury. Theory and Applications of the Z-Transform Method. New York, 1964.
[15] B. Kaltenbacher, A. Kirchner, and B. Vexler. Goal oriented adaptivity in the IRGNM for
parameter identification in PDEs II: all-at-once formulations. Inverse Problems, 30:045002,
2014.
[16] M. Marden. The geometry of the zeros of a polynomial in a complex variable, volume 3 of
Math. Surv. American Mathematical Society (AMS), Providence, RI, 1949.
[17] M. Marden. Geometry of Polynomials. Number 3 in Mathematical Surveys and Monographs.
American Math. Soc, Providence, RI, 2nd edition, 1966.
[18] E. Özkaya and N. R. Gauger. Single-step One-shot Aerodynamic Shape Optimization. In
Optimal Control of Coupled Systems of Partial Differential Equations, volume 158, pages
191–204. Birkhäuser Basel, Basel, 2009. Series Title: International Series of Numerical
Mathematics.
[19] V. Schulz and I. Gherman. One-Shot Methods for Aerodynamic Shape Optimization. In
MEGADESIGN and MegaOpt - German Initiatives for Aerodynamic Simulation and Opti-
mization in Aircraft Design, volume 107, pages 207–220. Springer Berlin Heidelberg, Berlin,
Heidelberg, 2009. Series Title: Notes on Numerical Fluid Mechanics and Multidisciplinary
Design.
[20] I. Schur. Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. Journal für
die reine und angewandte Mathematik (Crelles Journal), 1917(147):205–232, 1917.
[21] A. Shenoy, M. Heinkenschloss, and E. M. Cliff. Airfoil design by an all-at-once method.
International Journal of Computational Fluid Dynamics, 11(1-2):3–25, 1998.
[22] S. Ta’asan. "One Shot" Methods for Optimal Control of Distributed Parameter Systems I:
Finite Dimensional Control. Technical Report 91-2, ICASE, Hampton, 1991.
[23] S. Ta’asan, G. Kuruvila, and M. Salas. Aerodynamic design and optimization in one shot. In
30th Aerospace Sciences Meeting and Exhibit, Reno, NV, U.S.A., 1992. American Institute
of Aeronautics and Astronautics.
[24] A. Tarantola and B. Valette. Generalized nonlinear inverse problems solved using the least
squares criterion. Reviews of Geophysics, 20(2):219–232, 1982.
[25] T. van Leeuwen and F. J. Herrmann. Mitigating local minima in full-waveform inversion by
expanding the search space. Geophysical Journal International, 195(1):661–667, 2013.
[26] T. van Leeuwen and F. J. Herrmann. A penalty method for PDE-constrained optimization
in inverse problems. Inverse Problems, 32(1):015007, 2015.
A Some useful lemmas
We state auxiliary results about matrices like those appearing in the eigenvalue equations (24),
(25), (48), (49).
Lemma A.1. Let (ℂ^{n×n}, ‖·‖) be a normed space and T ∈ ℂ^{n×n}. If ρ(T) < 1, then Σ_{k=0}^∞ T^k converges and
Σ_{k=0}^∞ T^k = (I − T)^{−1}.
Moreover, if ‖T‖ < 1,
‖(I − T)^{−1}‖ ≤ 1/(1 − ‖T‖).
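As a quick numerical illustration of Lemma A.1, the following Python sketch (with arbitrary matrix size, scaling and truncation order) compares the partial sums of the Neumann series with (I − T)^{−1}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
T = rng.standard_normal((n, n))
T *= 0.9 / np.linalg.norm(T, 2)          # scale so that ||T|| < 1 (hence rho(T) < 1)

# Partial sums of the Neumann series sum_k T^k
S = np.zeros((n, n))
P = np.eye(n)
for _ in range(500):
    S += P
    P = P @ T

inv = np.linalg.inv(np.eye(n) - T)
assert np.allclose(S, inv)               # sum_k T^k = (I - T)^{-1}
assert np.linalg.norm(inv, 2) <= 1.0 / (1.0 - np.linalg.norm(T, 2)) + 1e-12
```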
Lemma A.2. Let T ∈ ℂ^{n×n} be such that ρ(T) < 1. Set
s(T) := sup_{z∈ℂ, |z|≥1} ‖(I − T/z)^{−1}‖;  (66)
then 0 < s(T) < +∞. Moreover, if ‖T‖ < 1, then 0 < s(T) ≤ 1/(1 − ‖T‖).
Proof. The function z ↦ ‖(I − T/z)^{−1}‖, with z ∈ ℂ, |z| ≥ 1, is well-defined and continuous, and we use Lemma A.1.
The following lemma says that, for T ∈ ℂ^{n×n} and λ ∈ ℂ, |λ| ≥ 1, we can decompose
(I − T/λ)^{−1} = P(λ) + iQ(λ) and (I − T*/λ)^{−1} = P(λ)* + iQ(λ)*,
and gives bounds for P(λ) and Q(λ).
Lemma A.3. Let T ∈ ℂ^{n×n} be such that ρ(T) < 1, and let λ ∈ ℂ, |λ| ≥ 1. Write 1/λ = r(cos φ + i sin φ) in polar form, where 0 < r ≤ 1 and φ ∈ [−π, π]. Then
(I − T/λ)^{−1} = P(λ) + iQ(λ) and (I − T*/λ)^{−1} = P(λ)* + iQ(λ)*,
where
P(λ) = (I − r cos φ T)(I − 2r cos φ T + r²T²)^{−1},  Q(λ) = r sin φ T (I − 2r cos φ T + r²T²)^{−1}
are ℂ^{n×n}-valued functions. We also have the following properties:
(i) ‖P(λ)‖ ≤ (1 + ‖T‖)s(T)² and ‖Q(λ)‖ ≤ |sin φ| ‖T‖ s(T)² ≤ ‖T‖ s(T)².
(ii) Moreover, if ‖T‖ < 1, then
‖P(λ)‖ ≤ 1/(1 − ‖T‖) and ‖Q(λ)‖ ≤ ‖T‖/(1 − ‖T‖).
Proof. The first part of the lemma is verified by direct computation, using
(I − T/λ)^{−1} = (I − T/λ*)[(I − T/λ)(I − T/λ*)]^{−1},
(I − T*/λ)^{−1} = [(I − T*/λ*)(I − T*/λ)]^{−1}(I − T*/λ*)
and
(I − T/λ)(I − T/λ*) = I − 2r cos φ T + r²T²,
where λ* denotes the complex conjugate of λ. After that, with the help of Lemma A.2, it is not difficult to show the inequalities in (i). To prove (ii), first observe that the two series
Σ_{k=0}^∞ r^k cos(kφ) T^k and Σ_{k=1}^∞ r^k sin(kφ) T^k
converge. Then, by expanding and simplifying the left-hand sides, we can show that
[Σ_{k=0}^∞ r^k cos(kφ) T^k](I − 2r cos φ T + r²T²) = I − r cos φ T
and
[Σ_{k=1}^∞ r^k sin(kφ) T^k](I − 2r cos φ T + r²T²) = r sin φ T,
so P(λ) and Q(λ) can be expressed as the series above, and the inequalities in (ii) follow.
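The decomposition in the proof above is easy to check numerically; in the following sketch the matrix size, the scaling of T and the sample value of λ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
T = rng.standard_normal((n, n))
T *= 0.8 / max(abs(np.linalg.eigvals(T)))      # enforce rho(T) < 1
lam = 1.3 * np.exp(1j * 0.7)                   # some lambda with |lambda| >= 1

r, phi = 1 / abs(lam), -np.angle(lam)          # 1/lam = r (cos phi + i sin phi)
I = np.eye(n)
D = np.linalg.inv(I - 2 * r * np.cos(phi) * T + r**2 * T @ T)
P = (I - r * np.cos(phi) * T) @ D              # P(lambda)
Q = r * np.sin(phi) * T @ D                    # Q(lambda)

assert np.allclose(np.linalg.inv(I - T / lam), P + 1j * Q)
```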
In Sections 3.3 and 4.3 we distinguish different cases of λ ∈ ℂ, for which we need the corresponding estimates given in the two following lemmas: Lemma A.4 is used for the shifted k-step one-shot method and Lemma A.5 for the k-step one-shot method.
Lemma A.4. For λ ∈ ℂ∖ℝ, |λ| ≥ 1, we write λ = R(cos θ + i sin θ) in polar form, where R ≥ 1, θ ∈ (−π, π), θ ≠ 0.
(i) For λ satisfying Re(λ³−λ²) ≥ 0, let γ₁ = γ₁(λ) = 1 if Im(λ³−λ²) ≥ 0, and γ₁ = −1 if Im(λ³−λ²) < 0. Then
Re(λ³−λ²) + γ₁ Im(λ³−λ²) ≥ |λ−1| ≥ 2|sin(θ/2)|.
(ii) Let 0 < θ₀ ≤ π/6. For λ satisfying Re(λ³−λ²) < 0 and θ ∈ [θ₀, π−θ₀] ∪ [−π+θ₀, −θ₀], let γ₂ = −1 if Im(λ³−λ²) ≥ 0, and γ₂ = 1 if Im(λ³−λ²) < 0. Then
−Re(λ³−λ²) − γ₂ Im(λ³−λ²) ≥ |λ−1| ≥ 2 sin(θ₀/2).
(iii) Let 0 < θ₀ ≤ π/6 and δ₀ > 0. For λ satisfying Re(λ³−λ²) < 0 and θ ∈ (−θ₀, θ₀)∖{0}, let
γ₃ = γ₃(sign(θ)) = [δ₀ + sin(5θ₀/2)]/cos(5θ₀/2) if θ > 0, and γ₃ = −[δ₀ + sin(5θ₀/2)]/cos(5θ₀/2) if θ < 0. Then
Re(λ³−λ²) + γ₃ Im(λ³−λ²) ≥ 2δ₀|sin(θ/2)|.
Moreover, if 0 < θ₀ < π/6, we have
|Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] ≤ √(1+γ₃²)/δ₀
and
|γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] ≤ max{√(1+γ₃²)/δ₀, √(1+γ₃²)/cos 3θ₀}.
(iv) Let 0 < θ₀ ≤ π/6. For λ satisfying Re(λ³−λ²) < 0 and θ ∈ (π−θ₀, π) ∪ (−π, −π+θ₀), we have
−Re(λ³−λ²) ≥ sin(π/2 − 3θ₀) + cos 2θ₀,
|Re(λ−1)| / [−Re(λ³−λ²)] ≤ 2/[sin(π/2 − 3θ₀) + cos 2θ₀] and |Im(λ−1)| / [−Re(λ³−λ²)] ≤ 2/[sin(π/2 − 3θ₀) + cos 2θ₀].
Proof. (i) From the definition of γ₁ we see that γ₁² = 1, γ₁ Im(λ³−λ²) ≥ 0 and
[Re(λ³−λ²) + γ₁ Im(λ³−λ²)]² = Re(λ³−λ²)² + Im(λ³−λ²)² + 2γ₁ Re(λ³−λ²) Im(λ³−λ²)
≥ Re(λ³−λ²)² + Im(λ³−λ²)² = |λ³−λ²|²,
which yields Re(λ³−λ²) + γ₁ Im(λ³−λ²) ≥ R²|λ−1| ≥ |λ−1|. Finally,
|λ−1| = |R cos θ − 1 + iR sin θ| = √(R² + 1 − 2R cos θ) ≥ √(2 − 2 cos θ) = 2|sin(θ/2)|,
since the function R ↦ R² + 1 − 2R cos θ, for R ≥ 1, is increasing.
(ii) In this case θ/2 ∈ [θ₀/2, π/2 − θ₀/2] ∪ [−π/2 + θ₀/2, −θ₀/2], so |sin(θ/2)| ≥ sin(θ₀/2). From the definition of γ₂ we see that γ₂² = 1 and γ₂ Im(λ³−λ²) ≤ 0. Similarly to (i), we obtain −Re(λ³−λ²) − γ₂ Im(λ³−λ²) ≥ |λ−1| ≥ 2|sin(θ/2)|, which implies the conclusion.
(iii) Note that cos 3θ > 0 since −π/2 < 3θ < π/2, and that sin 3θ has the same sign as θ and γ₃, so we have
Re(λ³−λ²) + γ₃ Im(λ³−λ²) = R²(R cos 3θ − cos 2θ + γ₃R sin 3θ − γ₃ sin 2θ)
≥ cos 3θ − cos 2θ + γ₃ sin 3θ − γ₃ sin 2θ
= −2 sin(5θ/2) sin(θ/2) + 2γ₃ cos(5θ/2) sin(θ/2)
= 2 sin(θ/2)[γ₃ cos(5θ/2) − sin(5θ/2)].
Then we consider two cases: if 0 < θ < θ₀, then γ₃ > 0, |sin(θ/2)| = sin(θ/2) > 0, 0 < 5θ/2 < 5θ₀/2 < π/2 and γ₃ cos(5θ/2) − sin(5θ/2) > γ₃ cos(5θ₀/2) − sin(5θ₀/2) = δ₀; if −θ₀ < θ < 0, then −γ₃ > 0, |sin(θ/2)| = −sin(θ/2) > 0, −π/2 < −5θ₀/2 < 5θ/2 < 0 and −γ₃ cos(5θ/2) + sin(5θ/2) > −γ₃ cos(5θ₀/2) − sin(5θ₀/2) = δ₀.
Next, if 0 < θ₀ < π/6, we will show that |Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] and |γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] are both bounded. First,
|Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] = |(cos θ + γ₃ sin θ)R − 1| / {R²[(cos 3θ + γ₃ sin 3θ)R − (cos 2θ + γ₃ sin 2θ)]}
≤ |(cos θ + γ₃ sin θ)R − 1| / [(cos 3θ + γ₃ sin 3θ)R − (cos 2θ + γ₃ sin 2θ)].
Since γ₃ does not depend on R, let us study f₁(R) = [(aR − 1)/(bR − c)]², where a = cos θ + γ₃ sin θ, b = cos 3θ + γ₃ sin 3θ and c = cos 2θ + γ₃ sin 2θ. We observe that:
• a, b, c > 0. Indeed, cos θ, cos 2θ, cos 3θ > 0, and θ and γ₃ have the same sign.
• bR − c > 0 since Re(λ³−λ²) + γ₃ Im(λ³−λ²) > 0, thus R > c/b.
• ac > b (equivalently c/b > 1/a), since
ac = cos θ cos 2θ + γ₃² sin θ sin 2θ + γ₃ sin 3θ > cos θ cos 2θ − sin θ sin 2θ + γ₃ sin 3θ = b.
Now, f₁′(R) = 2 · (aR−1)/(bR−c) · (b−ac)/(bR−c)² < 0 for R > c/b > 1/a, and we would like to have c/b < 1, so that f₁(R) ≤ f₁(1) for all R ≥ 1. Indeed, c/b < 1 is equivalent to
cos 2θ + γ₃ sin 2θ < cos 3θ + γ₃ sin 3θ ⇔ |γ₃| > sin(5θ/2)/cos(5θ/2),
which is true since
|γ₃| = [δ₀ + sin(5θ₀/2)]/cos(5θ₀/2) > sin(5θ/2)/cos(5θ/2) + ε₀, where ε₀ = δ₀/cos(5θ₀/2).
Then we study
f₁(1) = [(cos θ − 1 + γ₃ sin θ)/(cos 3θ − cos 2θ + γ₃(sin 3θ − sin 2θ))]² = [(−sin(θ/2) + γ₃ cos(θ/2))/(−γ₃ sin(5θ/2) + γ₃² cos(5θ/2))]² γ₃².
We have:
• (−sin(θ/2) + γ₃ cos(θ/2))² ≤ 1 + γ₃² by the Cauchy-Schwarz inequality;
• γ₃² = |γ₃|² > [sin(5θ/2)/cos(5θ/2) + ε₀]|γ₃|, which leads to −γ₃ sin(5θ/2) + γ₃² cos(5θ/2) > ε₀ cos(5θ/2)|γ₃| > ε₀ cos(5θ₀/2)|γ₃| = δ₀|γ₃|;
hence f₁(1) ≤ (1 + γ₃²)/δ₀², and finally |Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] ≤ √(1+γ₃²)/δ₀. Next, we have
|γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] = |(γ₃ cos θ − sin θ)R − γ₃| / {R²[(cos 3θ + γ₃ sin 3θ)R − (cos 2θ + γ₃ sin 2θ)]}
≤ |(γ₃ cos θ − sin θ)R − γ₃| / [(cos 3θ + γ₃ sin 3θ)R − (cos 2θ + γ₃ sin 2θ)].
Since γ₃ does not depend on R, let us study f₂(R) = [(dR − γ₃)/(bR − c)]², where d = γ₃ cos θ − sin θ and b, c are as above. We observe that:
• γ₃b − cd and θ have the same sign. Indeed, γ₃b − cd = (γ₃² + 1) sin θ cos 2θ. Consequently, we always have (γ₃b − cd)γ₃ > 0.
• We always have γ₃/d > 1. Indeed, if θ > 0 then d > 0, since γ₃ = [δ₀ + sin(5θ₀/2)]/cos(5θ₀/2) > sin θ/cos θ, and γ₃/d = γ₃/(γ₃ cos θ − sin θ) > 1; if θ < 0 then d < 0, since −γ₃ = [δ₀ + sin(5θ₀/2)]/cos(5θ₀/2) > −sin θ/cos θ, and γ₃/d = −γ₃/(−γ₃ cos θ + sin θ) > 1.
Now, f₂′(R) = 2 · [(d/γ₃)R − 1]/(bR − c) · (γ₃b − cd)γ₃/(bR − c)², so, thanks to the above results, f₂(R) decreases for 1 ≤ R < γ₃/d and increases for R > γ₃/d. Moreover, as for f₁(1), we can estimate
f₂(1) = [(−cos(θ/2) − γ₃ sin(θ/2))/(−γ₃ sin(5θ/2) + γ₃² cos(5θ/2))]² γ₃² ≤ (1 + γ₃²)/δ₀²,
and lim_{R→+∞} f₂(R) = [(γ₃ cos θ − sin θ)/(cos 3θ + γ₃ sin 3θ)]² ≤ (1 + γ₃²)/cos² 3θ₀. Therefore
|γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ³−λ²) + γ₃ Im(λ³−λ²)] ≤ max{√(1+γ₃²)/δ₀, √(1+γ₃²)/cos 3θ₀}.
(iv) Since θ ∈ (π−θ₀, π) ∪ (−π, −π+θ₀), we have:
• 2θ ∈ (2π−2θ₀, 2π) ∪ (−2π, −2π+2θ₀) ⊆ (2π−π/3, 2π) ∪ (−2π, −2π+π/3), thus cos 2θ > cos 2θ₀ > 0;
• 3θ ∈ (3π−3θ₀, 3π) ∪ (−3π, −3π+3θ₀) ⊆ (3π−π/2, 3π) ∪ (−3π, −3π+π/2), thus −cos 3θ > −cos(3π−3θ₀) = sin(π/2 − 3θ₀) ≥ 0.
So we have
−Re(λ³−λ²) = R²(−R cos 3θ + cos 2θ) > [sin(π/2 − 3θ₀) + cos 2θ₀]R² > 0.
Finally, |Re(λ−1)| / [−Re(λ³−λ²)] ≤ (R+1) / {[sin(π/2 − 3θ₀) + cos 2θ₀]R²} ≤ 2/[sin(π/2 − 3θ₀) + cos 2θ₀], and similarly for |Im(λ−1)| / [−Re(λ³−λ²)].
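The inequality in case (i) of Lemma A.4 can be probed by random sampling; the sampling ranges and tolerances in the following sketch are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    R = 1 + rng.random() * 3
    theta = rng.uniform(-np.pi, np.pi)
    if theta == 0.0:
        continue
    lam = R * np.exp(1j * theta)
    w = lam**3 - lam**2
    if w.real < 0:
        continue                      # keep only case (i): Re(lambda^3 - lambda^2) >= 0
    g1 = 1.0 if w.imag >= 0 else -1.0
    lhs = w.real + g1 * w.imag
    assert lhs >= abs(lam - 1) - 1e-9               # first inequality of (i)
    assert abs(lam - 1) >= 2 * abs(np.sin(theta / 2)) - 1e-9   # second inequality of (i)
```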
Lemma A.5. For λ ∈ ℂ∖ℝ, |λ| ≥ 1, we write λ = R(cos θ + i sin θ) in polar form, where R ≥ 1, θ ∈ (−π, π), θ ≠ 0.
(i) For λ satisfying Re(λ²−λ) ≥ 0, let γ₁ = γ₁(λ) = 1 if Im(λ²−λ) ≥ 0, and γ₁ = −1 if Im(λ²−λ) < 0. Then
Re(λ²−λ) + γ₁ Im(λ²−λ) ≥ |λ(λ−1)| ≥ 2|sin(θ/2)|.
(ii) Let 0 < θ₀ ≤ π/4. For λ satisfying Re(λ²−λ) < 0 and θ ∈ [θ₀, π−θ₀] ∪ [−π+θ₀, −θ₀], let γ₂ = γ₂(λ) = −1 if Im(λ²−λ) ≥ 0, and γ₂ = 1 if Im(λ²−λ) < 0. Then
−Re(λ²−λ) − γ₂ Im(λ²−λ) ≥ |λ(λ−1)| ≥ 2 sin(θ₀/2).
(iii) Let 0 < θ₀ ≤ π/4 and δ₀ > 0. For λ satisfying Re(λ²−λ) < 0 and θ ∈ (−θ₀, θ₀)∖{0}, let
γ₃ = γ₃(sign(θ)) = [δ₀ + sin(3θ₀/2)]/cos(3θ₀/2) if θ > 0, and γ₃ = −[δ₀ + sin(3θ₀/2)]/cos(3θ₀/2) if θ < 0. Then
Re(λ²−λ) + γ₃ Im(λ²−λ) ≥ 2δ₀|sin(θ/2)|.
Moreover, if 0 < θ₀ < π/4, then
|Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] ≤ √(1+γ₃²)/δ₀ and |γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] ≤ max{√(1+γ₃²)/δ₀, √(1+γ₃²)/cos 2θ₀}.
(iv) Let 0 < θ₀ ≤ π/4. There exists no λ satisfying Re(λ²−λ) < 0 and θ ∈ (π−θ₀, π) ∪ (−π, −π+θ₀).
Proof. The proofs of (i) and (ii) are similar to those in Lemma A.4.
(iii) Note that cos 2θ > 0 since −π/2 < 2θ < π/2, and that sin 2θ has the same sign as θ and γ₃, so we have
Re(λ²−λ) + γ₃ Im(λ²−λ) = R(R cos 2θ − cos θ + γ₃R sin 2θ − γ₃ sin θ)
≥ cos 2θ − cos θ + γ₃ sin 2θ − γ₃ sin θ
= −2 sin(3θ/2) sin(θ/2) + 2γ₃ cos(3θ/2) sin(θ/2)
= 2 sin(θ/2)[γ₃ cos(3θ/2) − sin(3θ/2)].
Then we consider two cases: if 0 < θ < θ₀, then γ₃ > 0, |sin(θ/2)| = sin(θ/2) > 0, 0 < 3θ/2 < 3θ₀/2 < π/2 and γ₃ cos(3θ/2) − sin(3θ/2) > γ₃ cos(3θ₀/2) − sin(3θ₀/2) = δ₀; if −θ₀ < θ < 0, then −γ₃ > 0, |sin(θ/2)| = −sin(θ/2) > 0, −π/2 < −3θ₀/2 < 3θ/2 < 0 and −γ₃ cos(3θ/2) + sin(3θ/2) > −γ₃ cos(3θ₀/2) − sin(3θ₀/2) = δ₀.
Next, if 0 < θ₀ < π/4, we will show that |Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] and |γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] are both bounded. First,
|Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] = |(cos θ + γ₃ sin θ)R − 1| / {R[(cos 2θ + γ₃ sin 2θ)R − (cos θ + γ₃ sin θ)]}
≤ |(cos θ + γ₃ sin θ)R − 1| / [(cos 2θ + γ₃ sin 2θ)R − (cos θ + γ₃ sin θ)].
Since γ₃ does not depend on R, let us study f₁(R) = [(aR − 1)/(bR − a)]², where a = cos θ + γ₃ sin θ and b = cos 2θ + γ₃ sin 2θ. We observe that:
• a > 0 and b > 0. Indeed, cos θ > 0, cos 2θ > 0, and θ and γ₃ have the same sign.
• bR − a > 0 since Re(λ²−λ) + γ₃ Im(λ²−λ) > 0, thus R > a/b.
• a² > b (equivalently a/b > 1/a), since a² = cos²θ + γ₃² sin²θ + γ₃ sin 2θ > cos²θ − sin²θ + γ₃ sin 2θ = b.
Now, f₁′(R) = 2 · (aR−1)/(bR−a) · (b−a²)/(bR−a)² < 0 for R > a/b > 1/a, and we would like to have a/b < 1, so that f₁(R) ≤ f₁(1) for all R ≥ 1. Indeed, a/b < 1 is equivalent to
cos θ + γ₃ sin θ < cos 2θ + γ₃ sin 2θ ⇔ |γ₃| > sin(3θ/2)/cos(3θ/2),
which is true since
|γ₃| = [δ₀ + sin(3θ₀/2)]/cos(3θ₀/2) > sin(3θ/2)/cos(3θ/2) + ε₀, where ε₀ = δ₀/cos(3θ₀/2).
Then we study
f₁(1) = [(cos θ − 1 + γ₃ sin θ)/(cos 2θ − cos θ + γ₃(sin 2θ − sin θ))]² = [(−sin(θ/2) + γ₃ cos(θ/2))/(−γ₃ sin(3θ/2) + γ₃² cos(3θ/2))]² γ₃².
We have:
• (−sin(θ/2) + γ₃ cos(θ/2))² ≤ 1 + γ₃² by the Cauchy-Schwarz inequality;
• γ₃² = |γ₃|² > [sin(3θ/2)/cos(3θ/2) + ε₀]|γ₃|, which leads to −γ₃ sin(3θ/2) + γ₃² cos(3θ/2) > ε₀ cos(3θ/2)|γ₃| > ε₀ cos(3θ₀/2)|γ₃| = δ₀|γ₃|;
hence f₁(1) ≤ (1 + γ₃²)/δ₀², and finally |Re(λ−1) + γ₃ Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] ≤ √(1+γ₃²)/δ₀. Next, we have
|γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] = |(γ₃ cos θ − sin θ)R − γ₃| / {R[(cos 2θ + γ₃ sin 2θ)R − (cos θ + γ₃ sin θ)]}
≤ |(γ₃ cos θ − sin θ)R − γ₃| / [(cos 2θ + γ₃ sin 2θ)R − (cos θ + γ₃ sin θ)].
Since γ₃ does not depend on R, let us study f₂(R) = [(cR − γ₃)/(bR − a)]², where c = γ₃ cos θ − sin θ and a, b are as above. We observe that:
• γ₃b − ca and θ have the same sign. Indeed, γ₃b − ca = (γ₃² + 1) sin θ cos θ. Consequently, we always have (γ₃b − ca)γ₃ > 0.
• We always have γ₃/c > 1. Indeed, if θ > 0 then c > 0, since γ₃ = [δ₀ + sin(3θ₀/2)]/cos(3θ₀/2) > sin θ/cos θ, and γ₃/c = γ₃/(γ₃ cos θ − sin θ) > 1; if θ < 0 then c < 0, since −γ₃ = [δ₀ + sin(3θ₀/2)]/cos(3θ₀/2) > −sin θ/cos θ, and γ₃/c = −γ₃/(−γ₃ cos θ + sin θ) > 1.
Now, f₂′(R) = 2 · [(c/γ₃)R − 1]/(bR − a) · (γ₃b − ca)γ₃/(bR − a)², so, thanks to the above results, f₂(R) decreases for 1 ≤ R < γ₃/c and increases for R > γ₃/c. Moreover, as for f₁(1), we can estimate
f₂(1) = [(−cos(θ/2) − γ₃ sin(θ/2))/(−γ₃ sin(3θ/2) + γ₃² cos(3θ/2))]² γ₃² ≤ (1 + γ₃²)/δ₀²,
and lim_{R→+∞} f₂(R) = [(γ₃ cos θ − sin θ)/(cos 2θ + γ₃ sin 2θ)]² ≤ (1 + γ₃²)/cos² 2θ₀. Therefore
|γ₃ Re(λ−1) − Im(λ−1)| / [Re(λ²−λ) + γ₃ Im(λ²−λ)] ≤ max{√(1+γ₃²)/δ₀, √(1+γ₃²)/cos 2θ₀}.
(iv) For θ ∈ (π−θ₀, π) ∪ (−π, −π+θ₀), we have cos 2θ > 0 since 2θ ∈ (3π/2, 2π) ∪ (−2π, −3π/2), while cos θ < 0. Hence Re(λ²−λ) = R(R cos 2θ − cos θ) > 0.
B Descent step for usual and shifted gradient descent
Proposition B.1 (Descent step for the usual gradient descent). The usual gradient descent algorithm (5) converges if
0 < τ < 2/‖H(I−B)^{−1}M‖².
Proof. The error system for (5) can be rewritten as
[p^{n+1}; u^{n+1}; σ^{n+1}] =
[ −τ(I−B*)^{−1}H*H(I−B)^{−1}MM* , 0 , (I−B*)^{−1}H*H(I−B)^{−1}M ;
  −τ(I−B)^{−1}MM* , 0 , (I−B)^{−1}M ;
  −τM* , 0 , I ] [p^n; u^n; σ^n].  (67)
Recall that a fixed point iteration converges if and only if the spectral radius of its iteration matrix is strictly less than 1. We can show that:
(i) If λ ∈ ℂ∖{0, 1} is an eigenvalue of the iteration matrix, then, proceeding as in Proposition 4.3, there exists y ∈ ℂ^{n_σ}, y ≠ 0, such that
λ²(λ−1) + τ (‖H(I−B)^{−1}My‖²/‖y‖²) λ² = 0,  (68)
hence λ = 1 − τ‖H(I−B)^{−1}My‖²/‖y‖². If we take τ < 2/‖H(I−B)^{−1}M‖², then equation (68) admits no solution λ with |λ| ≥ 1.
(ii) λ = 1 is not an eigenvalue of the iteration matrix. To show this, we rewrite iteration (67) as
[σ^{n+1}; p^{n+1}; u^{n+1}] =
[ I , −τM* , 0 ;
  (I−B*)^{−1}H*H(I−B)^{−1}M , −τ(I−B*)^{−1}H*H(I−B)^{−1}MM* , 0 ;
  (I−B)^{−1}M , −τ(I−B)^{−1}MM* , 0 ] [σ^n; p^n; u^n]
and proceed as in Proposition 4.3.
Proposition B.2 (Convergence of the shifted gradient descent). The shifted gradient descent algorithm (6) converges if
0 < τ < 1/‖H(I−B)^{−1}M‖².
Proof. The error system for (6) can be rewritten as
[p^{n+1}; u^{n+1}; σ^{n+1}] =
[ 0 , 0 , (I−B*)^{−1}H*H(I−B)^{−1}M ;
  0 , 0 , (I−B)^{−1}M ;
  −τM* , 0 , I ] [p^n; u^n; σ^n].  (69)
Recall that a fixed point iteration converges if and only if the spectral radius of its iteration matrix is strictly less than 1. We can show that:
(i) If λ ∈ ℂ∖{0, 1} is an eigenvalue of the iteration matrix, then, proceeding as in Proposition 4.2, there exists y ∈ ℂ^{n_σ}, y ≠ 0, such that
λ²(λ−1) + τ (‖H(I−B)^{−1}My‖²/‖y‖²) λ = 0.  (70)
By applying Lemma C.1 with
a₀ = 0, a₁ = τ‖H(I−B)^{−1}My‖²/‖y‖², a₂ = −1,
we see that equation (70) admits no solution λ with |λ| ≥ 1 if we take τ < ‖y‖²/‖H(I−B)^{−1}My‖². It is then enough to take τ < 1/‖H(I−B)^{−1}M‖².
(ii) λ = 1 is not an eigenvalue of the iteration matrix. To show this, we rewrite iteration (69) as
[σ^{n+1}; p^{n+1}; u^{n+1}] =
[ I , −τM* , 0 ;
  (I−B*)^{−1}H*H(I−B)^{−1}M , 0 , 0 ;
  (I−B)^{−1}M , 0 , 0 ] [σ^n; p^n; u^n]
and proceed as in Proposition 4.2.
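Both propositions can be probed numerically by assembling the block iteration matrices of (67) and (69) for random data and checking their spectral radius; in the sketch below the dimensions and the safety factor 0.9 on the descent step are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
nu, ns = 5, 3
B = rng.standard_normal((nu, nu)); B *= 0.5 / np.linalg.norm(B, 2)   # ||B|| < 1
M = rng.standard_normal((nu, ns))
H = rng.standard_normal((4, nu))

K = np.linalg.inv(np.eye(nu) - B) @ M                    # (I-B)^{-1} M
G = np.linalg.inv(np.eye(nu) - B.T) @ H.T @ H @ K        # (I-B*)^{-1} H* H (I-B)^{-1} M

def rho(A):
    return max(abs(np.linalg.eigvals(A)))

c = np.linalg.norm(H @ K, 2) ** 2                        # ||H (I-B)^{-1} M||^2
Z = np.zeros

tau = 0.9 * 2 / c                                        # usual GD, Proposition B.1
It = np.block([[-tau * G @ M.T, Z((nu, nu)), G],
               [-tau * K @ M.T, Z((nu, nu)), K],
               [-tau * M.T,     Z((ns, nu)), np.eye(ns)]])
assert rho(It) < 1

tau = 0.9 / c                                            # shifted GD, Proposition B.2
It = np.block([[Z((nu, nu)), Z((nu, nu)), G],
               [Z((nu, nu)), Z((nu, nu)), K],
               [-tau * M.T,  Z((ns, nu)), np.eye(ns)]])
assert rho(It) < 1
```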
C Convergence study for the scalar case
C.1 Notations and preliminary calculation
In the scalar case, that is, when n_u = n_σ = n_f = 1, we change the notation from capital to lower-case letters:
B ← b ∈ ℝ, b < 1;  M ← m ∈ ℝ, m ≠ 0;  H ← h ∈ ℝ, h ≠ 0;
T_k ← t_k = 1 + b + ... + b^{k−1} = (1 − b^k)/(1 − b);  U_k ← u_k = k h² b^{k−1};  (71)
X_k ← x_k = 0 if k = 1, and x_k = h²[1 + 2b + 3b² + ... + (k−1)b^{k−2}] if k ≥ 2.
The identity 1 + 2x + 3x² + ... + nx^{n−1} = [(1 − x^{n+1})/(1 − x)]′ = [1 − (n+1)x^n + nx^{n+1}]/(1 − x)² shows that
x_k = h²[1 − k b^{k−1} + (k−1)b^k]/(1 − b)², k ≥ 1,  (72)
where we set b^{k−1} = 1 when k = 1 and b = 0. Now, for each of the algorithms (5), (6), (10), (9), we write the iterations for the errors in the scalar case and the corresponding iteration matrix M such that [p^{n+1}, u^{n+1}, σ^{n+1}]ᵀ = M[p^n, u^n, σ^n]ᵀ.
• Usual gradient descent (usual GD):
σ^{n+1} = σ^n − τ m p^n,  u^n = b u^n + m σ^n,  p^n = b p^n + h² u^n;
M = [ −h²m²(1−b)^{−2}τ , 0 , h²m(1−b)^{−2} ; −m²(1−b)^{−1}τ , 0 , m(1−b)^{−1} ; −mτ , 0 , 1 ].  (73)
• Shifted gradient descent (shifted GD):
σ^{n+1} = σ^n − τ m p^n,  u^{n+1} = b u^{n+1} + m σ^n,  p^{n+1} = b p^{n+1} + h² u^{n+1};
M = [ 0 , 0 , h²m(1−b)^{−2} ; 0 , 0 , m(1−b)^{−1} ; −mτ , 0 , 1 ].  (74)
• k-step one-shot:
σ^{n+1} = σ^n − τ m p^n,  p^{n+1} = (b^k − τ m² x_k) p^n + u_k u^n + m x_k σ^n,  u^{n+1} = b^k u^n + m t_k σ^n − τ m² t_k p^n;
M = [ b^k − m²x_kτ , u_k , m x_k ; −m²t_kτ , b^k , m t_k ; −mτ , 0 , 1 ].  (75)
• Shifted k-step one-shot:
σ^{n+1} = σ^n − τ m p^n,  p^{n+1} = b^k p^n + u_k u^n + m x_k σ^n,  u^{n+1} = b^k u^n + m t_k σ^n;
M = [ b^k , u_k , m x_k ; 0 , b^k , m t_k ; −mτ , 0 , 1 ].  (76)
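The closed form (72) for x_k and the identity u_k t_k − b^k x_k + x_k = h²t_k² used in Section C.2 can be checked numerically on a grid of b and k; the grid and the sample values of h, m are arbitrary.

```python
import numpy as np

h, m = 1.3, 0.7
for b in np.linspace(-0.95, 0.95, 39):
    for k in range(1, 12):
        t = (1 - b**k) / (1 - b)                               # t_k
        u = k * h**2 * b**(k - 1)                              # u_k
        x = h**2 * sum((j + 1) * b**j for j in range(k - 1))   # x_k from (71); empty sum = 0 for k = 1
        x_closed = h**2 * (1 - k * b**(k - 1) + (k - 1) * b**k) / (1 - b)**2   # (72)
        assert np.isclose(x, x_closed)
        assert np.isclose(u * t - b**k * x + x, h**2 * t**2)   # scalar version of (41)
```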
C.2 Necessary and sufficient conditions for convergence
In this simpler scalar case, we are able to prove not only sufficient but also necessary conditions on the descent step τ for convergence. Our strategy to study the spectral radius ρ(M) is as follows:
1. Compute det(M − λI) to write the eigenvalue equation P(λ) = 0. For the considered methods, P turns out to be a polynomial of degree 3, P(λ) = a₀ + a₁λ + a₂λ² + λ³, where a₀, a₁, a₂ ∈ ℝ depend on h, m, b, τ. For the computations, the identity u_k t_k − b^k x_k + x_k = h² t_k², which is the scalar version of (41), can be helpful.
2. Apply to P Lemma C.1, which states a necessary and sufficient condition for a real-coefficient polynomial of degree 3 to have all its roots inside the unit circle of the complex plane. Then deduce conditions on τ.
Lemma C.1. Let a₀, a₁, a₂ ∈ ℝ. Then all roots of P(z) = a₀ + a₁z + a₂z² + z³ lie (strictly) inside the unit circle of the complex plane if and only if
(a₀ − 1)(a₀ + 1) < 0,  (77)
(a₀² − a₂a₀ + a₁ − 1)(a₀² + a₂a₀ − a₁ − 1) > 0,  (78)
(a₀ + a₂ − a₁ − 1)(a₀ + a₂ + a₁ + 1) < 0.  (79)
The proof of Lemma C.1 is given in Appendix D and is mainly based on Marden's work [17].
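Lemma C.1 can be cross-checked against a direct root computation; the random coefficient range in the following sketch is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(4)
for _ in range(2000):
    a0, a1, a2 = rng.uniform(-2, 2, 3)
    # P(z) = z^3 + a2 z^2 + a1 z + a0 (np.roots takes descending coefficients)
    inside = max(abs(np.roots([1, a2, a1, a0]))) < 1
    cond = ((a0 - 1) * (a0 + 1) < 0
            and (a0**2 - a2 * a0 + a1 - 1) * (a0**2 + a2 * a0 - a1 - 1) > 0
            and (a0 + a2 - a1 - 1) * (a0 + a2 + a1 + 1) < 0)
    assert inside == cond
```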
C.2.1 Descent step for the usual gradient descent
Here, the coefficients of P are
a₀ = 0, a₁ = 0, a₂ = h²m²(1−b)^{−2}τ − 1.
Conditions (77) and (78) of Lemma C.1 are automatically satisfied. Condition (79) gives
0 < τ < 2(1−b)²/(h²m²),
which is (7) in the scalar case.
C.2.2 Descent step for the shifted gradient descent
Here, the coefficients of P are
a₀ = 0, a₁ = h²m²(1−b)^{−2}τ, a₂ = −1.
Condition (77) of Lemma C.1 is automatically satisfied, condition (79) is automatically satisfied for τ > 0, and condition (78) gives
τ < (1−b)²/(h²m²),
which is (8) in the scalar case.
C.2.3 Descent step for k-step one-shot
Here, the coefficients of P are
a₀ = −s², a₁ = m²(h²t_k² − x_k)τ + (s² + 2s), a₂ = m²x_kτ − (2s + 1),
where s = b^k. Condition (77) of Lemma C.1 is obviously satisfied since |b| < 1. Next we deal with condition (78). The computation shows that
a₀² − a₂a₀ + a₁ − 1 = m²(h²t_k² − x_k + x_k s²)τ + (s − 1)³(s + 1), with (s − 1)³(s + 1) < 0,  (80)
a₀² + a₂a₀ − a₁ − 1 = −m²(h²t_k² − x_k + x_k s²)τ + (s − 1)(s + 1)³, with (s − 1)(s + 1)³ < 0,  (81)
and
h²t_k² − x_k + x_k s² = h² b^{k−1}(1 − b^k)[k − (k+1)b + kb^k − (k−1)b^{k+1}]/(1 − b)².  (82)
Lemma C.2. We have k − (k+1)b + kb^k − (k−1)b^{k+1} > 0 for all |b| < 1 and all k ≥ 1.
Proof. We write k − (k+1)b + kb^k − (k−1)b^{k+1} = (1 − b)A, where A = k + 1 − (1 − b^k)/(1 − b) + (k−1)b^k. It suffices to show A > 0. If k = 1, then A = 1 > 0. If either k is even, or k ≥ 3 is odd and 0 ≤ b < 1, then (k−1)b^k ≥ 0 and (1 − b^k)/(1 − b) = |b^{k−1} + b^{k−2} + ... + b + 1| ≤ |b^{k−1}| + |b^{k−2}| + ... + |b| + 1 < k give us the conclusion. If k ≥ 3 is odd and −1 < b < 0, then (k−1)(1 + b^k) > 0 and (1 − b^k)/(1 − b) < 1, therefore A = 1 + [1 − (1 − b^k)/(1 − b)] + (k−1)(1 + b^k) > 0.
Then, condition (78) imposes:
• τ < (1−b)²(1+b^k)(1−b^k)² / {h²m²b^{k−1}[k − (k+1)b + kb^k − (k−1)b^{k+1}]} if b^{k−1} > 0;
• τ < (1−b)²(1+b^k)³ / {h²m²b^{k−1}[−k + (k+1)b − kb^k + (k−1)b^{k+1}]} if b^{k−1} < 0;
• no condition on τ if k ≥ 2 and b = 0.
Finally we check condition (79). We have a₀ + a₂ + a₁ + 1 = h²m²t_k²τ > 0 and
a₀ + a₂ − a₁ − 1 = [h²m²(1 − 2kb^{k−1} + 2kb^k − b^{2k})/(1 − b)²]τ − 2(1 + s)²,
therefore, condition (79) gives:
• τ < 2(1−b)²(1+b^k)² / [h²m²(1 − 2kb^{k−1} + 2kb^k − b^{2k})] if 1 − 2kb^{k−1} + 2kb^k − b^{2k} > 0;
• no condition on τ if 1 − 2kb^{k−1} + 2kb^k − b^{2k} ≤ 0.
In the following lemma we study the quantity 1 − 2kb^{k−1} + 2kb^k − b^{2k} that appears above.
Lemma C.3. Let f_k(b) = 1 − 2kb^{k−1} + 2kb^k − b^{2k} for k ∈ ℕ* and −1 ≤ b ≤ 1.
(i) f₁(b) = −(1 − b)² < 0 for all −1 < b < 1.
(ii) f₂(b) = 1 − 4b + 4b² − b⁴ has a unique root b = −1 + √2 in (−1, 1); moreover, f₂(b) > 0 if −1 < b < −1 + √2 and f₂(b) < 0 if −1 + √2 < b < 1.
(iii) If k ≥ 3 is odd, then f_k has exactly two roots b₁(k) < b₂(k) in (−1, 1); if k ≥ 2 is even, then f_k has a unique root b₃(k) in (−1, 1). Moreover, for every odd k ≥ 3:
• −1 < b₁(k) < 0 < b₂(k) < 1;
• f_k(b) > 0 ⇔ b₁(k) < b < b₂(k);
• f_k(b) < 0 ⇔ −1 < b < b₁(k) or b₂(k) < b < 1;
and for every even k ≥ 2:
• 0 < b₃(k) < 1;
• f_k(b) > 0 ⇔ −1 < b < b₃(k);
• f_k(b) < 0 ⇔ b₃(k) < b < 1.
(iv) lim_{odd k→∞} b₁(k) = −1 and lim_{odd k→∞} b₂(k) = lim_{even k→∞} b₃(k) = 1.
Proof. (i) and (ii) are easy to verify. (iii) It remains to consider k ≥ 3. We have
f_k′(b) = b^{k−2}[−2k(k−1) + 2k²b − 2kb^{k+1}], −1 < b < 1.
Set
g_k(b) = −2k(k−1) + 2k²b − 2kb^{k+1}, −1 ≤ b ≤ 1, k ≥ 3.
Case 1 (k ≥ 3 odd). By studying the sign of g_k′(b), we find that:
• g_k has a unique root v₁(k) in (−1, 1), and 0 < v₁(k) < (k/(k+1))^{1/k} < 1;
• g_k(b) > 0 ⇔ v₁(k) < b < 1;
• g_k(b) < 0 ⇔ −1 < b < v₁(k).
Next, by studying the sign of f_k′(b), we find that:
• f_k has exactly two roots b₁(k) < b₂(k) in (−1, 1), and −1 < b₁(k) < 0 < b₂(k) < 1;
• f_k(b) > 0 ⇔ b₁(k) < b < b₂(k);
• f_k(b) < 0 ⇔ −1 < b < b₁(k) or b₂(k) < b < 1.
Case 2 (k ≥ 4 even). By studying the sign of g_k′(b), we find that:
• g_k has a unique root v₂(k) in (−1, 1), and 0 < v₂(k) < (k/(k+1))^{1/k} < 1;
• g_k(b) > 0 ⇔ v₂(k) < b < 1;
• g_k(b) < 0 ⇔ −1 < b < v₂(k).
Next, by studying the sign of f_k′(b), we find that:
• f_k has a unique root b₃(k) in (−1, 1), and 0 < b₃(k) < 1;
• f_k(b) > 0 ⇔ −1 < b < b₃(k);
• f_k(b) < 0 ⇔ b₃(k) < b < 1.
(iv) We have
f_k(1/2) = 1 − k/2^{k−1} − 1/2^{2k} for all k ≥ 3, and f_k(−1/2) = 1 − 3k/2^{k−1} − 1/2^{2k} for all odd k ≥ 3,
hence for sufficiently large k we have f_k(1/2) > 0, and for sufficiently large odd k we have f_k(−1/2) > 0. By the table of signs of f_k, we conclude that b₁(k) < −1/2 for large odd k, b₂(k) > 1/2 for large odd k, and b₃(k) > 1/2 for large even k.
Case 1 (k ≥ 3 odd and sufficiently large). First we work with b₁(k). We have
1 − 2k b₁(k)^{k−1} + 2k b₁(k)^k − b₁(k)^{2k} = 0
and b₁(k) < −1/2, so
−b₁(k)^{2k} + 2k b₁(k)^k + 1 = 2k b₁(k)^{k−1} = [−2k b₁(k)^k] · 1/(−b₁(k)) < [−2k b₁(k)^k] · 2 = −4k b₁(k)^k,
which leads to
b₁(k)^{2k} − 6k b₁(k)^k − 1 > 0 ⇔ [b₁(k)^k − 3k]² > 1 + 9k².
Since −1 < b₁(k) < 0 and k is odd, this tells us that
−1 < b₁(k) < −(−3k + √(1 + 9k²))^{1/k} = −1/(3k + √(1 + 9k²))^{1/k} < −1/(7k)^{1/k},
which yields lim_{odd k→∞} b₁(k) = −1. Next, we have
1 − 2k b₂(k)^{k−1} + 2k b₂(k)^k − b₂(k)^{2k} = 0
and b₂(k) > 1/2, so
−b₂(k)^{2k} + 2k b₂(k)^k + 1 = 2k b₂(k)^{k−1} = 2k b₂(k)^k · 1/b₂(k) < 4k b₂(k)^k,
which leads to
b₂(k)^{2k} + 2k b₂(k)^k − 1 > 0 ⇔ [b₂(k)^k + k]² > 1 + k².
Since 0 < b₂(k) < 1, this tells us that
1 > b₂(k) > (−k + √(1 + k²))^{1/k} = 1/(k + √(1 + k²))^{1/k} > 1/(3k)^{1/k},
which yields lim_{odd k→∞} b₂(k) = 1.
Case 2 (k ≥ 4 even and sufficiently large). We repeat the same arguments as for b₂(k), now for b₃(k).
In summary, we have the following proposition.
Proposition C.4 (Convergence of k-step one-shot). Let η₁(k, b) := +∞ and
η₂₁(k, b) := (1−b)²(1+b^k)(1−b^k)² / {b^{k−1}[k − (k+1)b + kb^k − (k−1)b^{k+1}]};
η₂₂(k, b) := −(1−b)²(1+b^k)³ / {b^{k−1}[k − (k+1)b + kb^k − (k−1)b^{k+1}]};
η₃(k, b) := 2(1−b)²(1+b^k)² / [1 − 2kb^{k−1} + 2kb^k − b^{2k}].
Then the necessary and sufficient condition for the convergence of k-step one-shot in the scalar case is of the form τ < η(k, b)/(h²m²), where η(k, b) is defined as follows:
(i) η(1, b) = η₂₁(1, b) = (1−b)³(1+b), −1 < b < 1;
(ii) for odd k ≥ 3,
η(k, b) = η₂₁(k, b) if −1 < b ≤ b₁(k) or b₂(k) ≤ b < 1;
η(k, b) = min{η₂₁(k, b), η₃(k, b)} if b₁(k) < b < b₂(k) and b ≠ 0;
η(k, b) = 2 if b = 0;
where −1 < b₁(k) < 0 < b₂(k) < 1 are the two roots of 1 − 2kb^{k−1} + 2kb^k − b^{2k} = 0 in (−1, 1);
(iii) for even k ≥ 2,
η(k, b) = η₂₁(k, b) if b₃(k) ≤ b < 1;
η(k, b) = min{η₂₁(k, b), η₃(k, b)} if 0 < b < b₃(k);
η(k, b) = 2 if b = 0;
η(k, b) = min{η₂₂(k, b), η₃(k, b)} if −1 < b < 0;
where 0 < b₃(k) < 1 is the unique root of 1 − 2kb^{k−1} + 2kb^k − b^{2k} = 0 in (−1, 1).
Note that lim_{odd k→∞} b₁(k) = −1 and lim_{odd k→∞} b₂(k) = lim_{even k→∞} b₃(k) = 1, so the behavior of τ when k → ∞ is consistent with the condition τ < 2(1−b)²/(h²m²), −1 < b < 1, for the usual gradient descent. For illustrations of the function η(k, b) for different k, see Section C.3.
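For k = 1 the threshold η(1, b) = (1−b)³(1+b) can be checked directly against the spectral radius of the iteration matrix (75); the sample values b = 0.5, h = m = 1 and the 10% margins around the threshold are arbitrary choices.

```python
import numpy as np

h, m, b = 1.0, 1.0, 0.5
eta = (1 - b)**3 * (1 + b)                 # eta(1, b) from Proposition C.4 (i)
t1, u1, x1 = 1.0, h**2, 0.0                # t_k, u_k, x_k for k = 1

def rho(tau):
    M1 = np.array([[b - m**2 * x1 * tau, u1, m * x1],
                   [-m**2 * t1 * tau,    b,  m * t1],
                   [-m * tau,            0., 1.]])
    return max(abs(np.linalg.eigvals(M1)))

assert rho(0.9 * eta / (h * m)**2) < 1     # below the threshold: convergence
assert rho(1.1 * eta / (h * m)**2) >= 1    # above the threshold: divergence
```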
C.2.4 Descent step for shifted k-step one-shot
Here, the coefficients of the polynomial P of the eigenvalue equation are
a₀ = h²m²v_kτ − s², a₁ = h²m²y_kτ + s² + 2s, a₂ = −2s − 1,  (83)
where s = b^k, y_k = x_k/h² = [1 − kb^{k−1} + (k−1)b^k]/(1−b)² and v_k = t_k² − y_k = b^{k−1}[k − (k+1)b + b^{k+1}]/(1−b)². Note that v_k and b^{k−1} have the same sign, and that v_k = 0 if and only if k ≥ 2 and b = 0, since it is easy to show that k − (k+1)b + b^{k+1} > 0 for all |b| < 1 and all k ≥ 1. Then, condition (77) of Lemma C.1 imposes:
• τ < (1+s²)/(h²m²v_k) = (1−b)²(1+b^{2k}) / {h²m²b^{k−1}[k − (k+1)b + b^{k+1}]} if b^{k−1} > 0;
• τ < (−1+s²)/(h²m²v_k) = (1−b)²(−1+b^{2k}) / {h²m²b^{k−1}[k − (k+1)b + b^{k+1}]} if b^{k−1} < 0;
• no condition on τ if k ≥ 2 and b = 0.
Next we study condition (78). We have
a₀² − a₂a₀ + a₁ − 1 = v_k²(h²m²τ)² + [(−2s² + 2s + 1)v_k + y_k]h²m²τ + (s − 1)³(s + 1), with (s − 1)³(s + 1) < 0,
and
a₀² + a₂a₀ − a₁ − 1 = v_k²(h²m²τ)² − [(2s² + 2s + 1)v_k + y_k]h²m²τ + (s − 1)(s + 1)³, with (s − 1)(s + 1)³ < 0,
each of which, considered as a second-order polynomial in h²m²τ if v_k ≠ 0, has exactly two roots of opposite signs. Therefore, if v_k ≠ 0, condition (78) is equivalent to (h²m²τ − r₁)(h²m²τ − r₂) > 0, where
r₁ := {(2s² − 2s − 1)v_k − y_k + √[(−4s + 5)v_k² + y_k² + 2(−2s² + 2s + 1)v_ky_k]} / (2v_k²) > 0
and
r₂ := {(2s² + 2s + 1)v_k + y_k + √[(8s² + 12s + 5)v_k² + y_k² + 2(2s² + 2s + 1)v_ky_k]} / (2v_k²) > 0.
Lemma C.5. r₁ and r₂ cannot both be strictly less than (1 + s²)/v_k, and they cannot both be strictly less than (−1 + s²)/v_k.
Proof. Either r₁ < (1 + s²)/v_k or r₁ < (−1 + s²)/v_k implies (s² + 4s + 1)v_k² + (s² + 1)v_ky_k > 0. Either r₂ < (1 + s²)/v_k or r₂ < (−1 + s²)/v_k implies (s² + 4s + 1)v_k² + (s² + 1)v_ky_k < 0.
Thanks to this lemma, we see that condition (78), in combination with condition (77), gives:
• τ < min{r₁, r₂}/(h²m²) if b^{k−1} ≠ 0;
• τ < 1/(h²m²) if k ≥ 2 and b = 0.
Finally, we have a₀ + a₂ + a₁ + 1 = h²m²t_k²τ > 0 and
a₀ + a₂ − a₁ − 1 = [h²m²/(1 − b)²][−1 + 2kb^{k−1} − 2kb^k + b^{2k}]τ − 2(1 − b^k)²,
thus condition (79) is equivalent to:
• τ < 2(1−b)²(1−b^k)² / [h²m²(−1 + 2kb^{k−1} − 2kb^k + b^{2k})] if 1 − 2kb^{k−1} + 2kb^k − b^{2k} < 0;
• no condition on τ if 1 − 2kb^{k−1} + 2kb^k − b^{2k} ≥ 0.
One can look again at Lemma C.3 for the analysis of 1 − 2kb^{k−1} + 2kb^k − b^{2k}. In summary, we have the following proposition.
Proposition C.6 (Convergence of shifted k-step one-shot). Let
κ₁₁(k, b) := (1−b)²(1+b^{2k}) / {b^{k−1}[k − (k+1)b + b^{k+1}]};
κ₁₂(k, b) := (1−b)²(−1+b^{2k}) / {b^{k−1}[k − (k+1)b + b^{k+1}]};
t_k := (1−b^k)/(1−b), y_k := [1 − kb^{k−1} + (k−1)b^k]/(1−b)², s := b^k, v_k := t_k² − y_k;
κ₂₁(k, b) := {(2s² − 2s − 1)v_k − y_k + √[(−4s + 5)v_k² + y_k² + 2(−2s² + 2s + 1)v_ky_k]} / (2v_k²);
κ₂₂(k, b) := {(2s² + 2s + 1)v_k + y_k + √[(8s² + 12s + 5)v_k² + y_k² + 2(2s² + 2s + 1)v_ky_k]} / (2v_k²);
κ₂(k, b) := min{κ₂₁(k, b), κ₂₂(k, b)};
κ₃(k, b) := 2(1−b)²(1−b^k)² / [−1 + 2kb^{k−1} − 2kb^k + b^{2k}].
Then the necessary and sufficient condition for the convergence of shifted k-step one-shot in the scalar case is of the form τ < κ(k, b)/(h²m²), where κ(k, b) is defined as follows:
(i) κ(1, b) = min{κ₁₁(1, b), κ₂(1, b), κ₃(1, b)}; note that
κ₁₁(1, b) = 1 + b², κ₂₁(1, b) = [2b² − 2b − 1 + √(−4b + 5)]/2, κ₂₂(1, b) = [2b² + 2b + 1 + √(8b² + 12b + 5)]/2, κ₃(1, b) = 2(1 − b)²;
(ii) for odd k ≥ 3,
κ(k, b) = min{κ₁₁(k, b), κ₂(k, b), κ₃(k, b)} if −1 < b < b₁(k) or b₂(k) < b < 1;
κ(k, b) = min{κ₁₁(k, b), κ₂(k, b)} if b₁(k) ≤ b ≤ b₂(k) and b ≠ 0;
κ(k, b) = 1 if b = 0;
where −1 < b₁(k) < 0 < b₂(k) < 1 are the two roots of 1 − 2kb^{k−1} + 2kb^k − b^{2k} = 0 in (−1, 1);
(iii) for even k ≥ 2,
κ(k, b) = min{κ₁₁(k, b), κ₂(k, b), κ₃(k, b)} if b₃(k) < b < 1;
κ(k, b) = min{κ₁₁(k, b), κ₂(k, b)} if 0 < b ≤ b₃(k);
κ(k, b) = 1 if b = 0;
κ(k, b) = min{κ₁₂(k, b), κ₂(k, b)} if −1 < b < 0;
where 0 < b₃(k) < 1 is the unique root of 1 − 2kb^{k−1} + 2kb^k − b^{2k} = 0 in (−1, 1).
Remark C.7. In the implementation, to avoid numerical errors we rewrite κ₂₁(k, b) as
κ₂₁(k, b) = b(1−b)²(b^k − 1) / [k − (k+1)b + b^{k+1}] + 2[1 − b^k + b(1−b)²(1−b^k)y_k / (k − (k+1)b + b^{k+1})] / {y_k + v_k + √[(−4s + 5)v_k² + y_k² + 2(−2s² + 2s + 1)v_ky_k]}.
From this formula we also see that κ₂₁(k, b) → (1−b)² as k → ∞ (note that y_k = [1 − kb^{k−1} + (k−1)b^k]/(1−b)² → 1/(1−b)² and v_k = t_k² − y_k → 0 as k → ∞).
For illustrations of the function κ(k, b) for different k, see Section C.3.
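The rewriting of Remark C.7 can be checked against the defining expression of κ₂₁ in Proposition C.6; the sample values of k and b below are arbitrary.

```python
import numpy as np

def kappa21(k, b):
    # original expression from Proposition C.6
    t = (1 - b**k) / (1 - b)
    y = (1 - k * b**(k - 1) + (k - 1) * b**k) / (1 - b)**2
    s = b**k
    v = t**2 - y
    disc = (-4 * s + 5) * v**2 + y**2 + 2 * (-2 * s**2 + 2 * s + 1) * v * y
    return ((2 * s**2 - 2 * s - 1) * v - y + np.sqrt(disc)) / (2 * v**2)

def kappa21_stable(k, b):
    # rewriting of Remark C.7
    t = (1 - b**k) / (1 - b)
    y = (1 - k * b**(k - 1) + (k - 1) * b**k) / (1 - b)**2
    s = b**k
    v = t**2 - y
    disc = (-4 * s + 5) * v**2 + y**2 + 2 * (-2 * s**2 + 2 * s + 1) * v * y
    d = k - (k + 1) * b + b**(k + 1)
    return (b * (1 - b)**2 * (b**k - 1) / d
            + 2 * (1 - b**k + b * (1 - b)**2 * (1 - b**k) * y / d) / (y + v + np.sqrt(disc)))

for k in (1, 2, 3, 5, 8):
    for b in (-0.7, -0.3, 0.2, 0.5, 0.9):
        assert np.isclose(kappa21(k, b), kappa21_stable(k, b))
# stable form reproduces the limit (1 - b)^2 for large k
assert np.isclose(kappa21_stable(60, 0.5), (1 - 0.5)**2, atol=1e-6)
```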
C.3 Comparison of the bounds for the descent step
In summary, in the scalar case, the necessary and sufficient convergence conditions on the descent step τ > 0 are
τ < 2(1−b)²/(h²m²), τ < (1−b)²/(h²m²), τ < η(k, b)/(h²m²), τ < κ(k, b)/(h²m²),
respectively for usual GD, shifted GD, k-step one-shot (with η(k, b) given in Proposition C.4) and shifted k-step one-shot (with κ(k, b) given in Proposition C.6). By taking m = h = 1, in Figure 5 we plot, for different k, the functions b ↦ 2(1−b)² (usual GD), b ↦ (1−b)² (shifted GD), b ↦ η(k, b) (k-step one-shot) and b ↦ κ(k, b) (shifted k-step one-shot).
From these plots we can draw two important conclusions. First, when k increases, the curves for k-step one-shot and shifted k-step one-shot tend to the corresponding curves for usual and shifted gradient descent, as expected. Second, even in this scalar case, it appears difficult to derive from Propositions C.4 and C.6 a simplified expression for η(k, b) and κ(k, b) that would provide a practical upper bound for the descent step τ.
Remark C.8. For k ≥ 2, we observe that for some b the admissible range of τ for k-step one-shot is larger than that of usual GD, which is not intuitive. This is indeed confirmed numerically using FreeFEM: when b = 0.2 and τ = 2.08, 2-step one-shot converges while usual GD does not.
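The observation of Remark C.8 can also be reproduced directly with the scalar iteration matrices (73) and (75), taking h = m = 1:

```python
import numpy as np

h, m, b, tau, k = 1.0, 1.0, 0.2, 2.08, 2

def rho(A):
    return max(abs(np.linalg.eigvals(A)))

# usual gradient descent, iteration matrix (73)
GD = np.array([[-h**2 * m**2 * (1 - b)**-2 * tau, 0., h**2 * m * (1 - b)**-2],
               [-m**2 * (1 - b)**-1 * tau,        0., m * (1 - b)**-1],
               [-m * tau,                         0., 1.]])

# 2-step one-shot, iteration matrix (75) with t_2 = 1 + b, u_2 = 2 h^2 b, x_2 = h^2
t2, u2, x2 = 1 + b, 2 * h**2 * b, h**2
OS = np.array([[b**k - m**2 * x2 * tau, u2,   m * x2],
               [-m**2 * t2 * tau,       b**k, m * t2],
               [-m * tau,               0.,   1.]])

assert rho(GD) >= 1    # usual GD diverges for tau = 2.08
assert rho(OS) < 1     # 2-step one-shot converges
```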
D A proof of Lemma C.1 based on Marden’s works
Definition D.1. We say that a complex-coefficient polynomial has property P if all its zeros lie (strictly) inside the unit circle |z| = 1.
We recall some definitions from Marden’s works [17].
Definition D.2. Let P(z) = a₀ + a₁z + ... + a_nz^n, where a_k ∈ ℝ, k = 0, ..., n (we do not require a_n ≠ 0 here). We define
P̃(z) := a_n + a_{n−1}z + ... + a₀z^n
and call it the reverse polynomial of P. One can also see that P̃(z) = z^nP(1/z).
Definition D.3. Let P(z) = a₀ + a₁z + ... + a_nz^n, where a_k ∈ ℝ, k = 0, ..., n. We define a sequence of polynomials {P_k}_{0≤k≤n}, where
P_k(z) = a_0^(k) + a_1^(k)z + ... + a_{n−k}^(k)z^{n−k},
as follows:
• P₀ = P;
• P_{k+1} = a_0^(k)P_k − a_{n−k}^(k)P̃_k for 0 ≤ k ≤ n − 1.
Then we define
m_k(P) = a_0^(1)a_0^(2)···a_0^(k), 1 ≤ k ≤ n.
The coefficients of these polynomials can be gathered in the following table, which we call Marden's table:
[Figure 5: Admissible τ in the scalar case as a function of b. Panels: (a) shifted 1-step one-shot; (b) 1-step one-shot; (c) shifted k-step one-shot, odd k ≥ 3; (d) k-step one-shot, odd k ≥ 3; (e) shifted k-step one-shot, even k ≥ 2; (f) k-step one-shot, even k ≥ 2.]
         1            x            x²           ...   x^{n−1}      x^n
P₀       a₀           a₁           a₂           ...   a_{n−1}      a_n
P̃₀       a_n          a_{n−1}      a_{n−2}      ...   a₁           a₀
P₁       a_0^(1)      a_1^(1)      a_2^(1)      ...   a_{n−1}^(1)
P̃₁       a_{n−1}^(1)  a_{n−2}^(1)  a_{n−3}^(1)  ...   a_0^(1)
...
P_{n−1}  a_0^(n−1)    a_1^(n−1)
P̃_{n−1}  a_1^(n−1)    a_0^(n−1)
P_n      a_0^(n)
We have a simple criterion, mainly based on the works of Marden [16, 17] and Jury [13, 14], known as the Jury-Marden criterion:
Theorem D.4 (Jury-Marden criterion). The polynomial P has property P if and only if
a_0^(1) < 0 and a_0^(k) > 0 for all 2 ≤ k ≤ n.
This necessary and sufficient condition is mentioned several times in the literature (see e.g. [1, Theorem 3.10]), but it is not easy to find an explicit proof, so we provide one for the reader's convenience. Before proving this result, we apply the Jury-Marden criterion to a polynomial of degree 3 and obtain precisely Lemma C.1, that is, the following proposition.
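Marden's algorithm of Definition D.3 and the Jury-Marden criterion are straightforward to implement; the following sketch cross-checks the criterion against a direct root computation on random polynomials (coefficient ranges and the guard on the leading coefficient are arbitrary choices).

```python
import numpy as np

def marden_first_coeffs(a):
    """a = [a_0, ..., a_n]; return [a_0^(1), ..., a_0^(n)] of Definition D.3."""
    p = list(a)
    out = []
    while len(p) > 1:
        rev = p[::-1]                                    # coefficients of the reverse polynomial
        # P_{k+1} = a_0^(k) P_k - a_{n-k}^(k) reverse(P_k); the leading coefficient cancels
        p = [p[0] * p[j] - p[-1] * rev[j] for j in range(len(p) - 1)]
        out.append(p[0])
    return out

def jury_marden(a):
    c = marden_first_coeffs(a)
    return c[0] < 0 and all(x > 0 for x in c[1:])

rng = np.random.default_rng(5)
for _ in range(1000):
    n = int(rng.integers(2, 5))
    a = rng.uniform(-2, 2, n + 1)
    if abs(a[-1]) <= 0.1:
        a[-1] = 1.0                                      # keep the leading coefficient away from 0
    inside = max(abs(np.roots(a[::-1]))) < 1             # np.roots expects descending coefficients
    assert jury_marden(list(a)) == inside
```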
Proposition D.5. Let P(z) = a₀ + a₁z + a₂z² + z³, z ∈ ℂ, where a₀, a₁, a₂ ∈ ℝ. Then P has property P if and only if
(a₀ − 1)(a₀ + 1) < 0,
(a₀² − a₂a₀ + a₁ − 1)(a₀² + a₂a₀ − a₁ − 1) > 0,
(a₀ + a₂ − a₁ − 1)(a₀ + a₂ + a₁ + 1) < 0.
Proof. By directly applying Marden's algorithm to P, we obtain Marden's table as follows:

       1                           x                          x²         x³
P₀=P   a₀                          a₁                         a₂         1
P̃₀     1                           a₂                         a₁         a₀
P₁     a₀² − 1                     a₁a₀ − a₂                  a₂a₀ − a₁
P̃₁     a₂a₀ − a₁                   a₁a₀ − a₂                  a₀² − 1
P₂     (a₀²−1)² − (a₂a₀−a₁)²       (a₁a₀−a₂)(a₀²−a₂a₀+a₁−1)
P̃₂     (a₁a₀−a₂)(a₀²−a₂a₀+a₁−1)    (a₀²−1)² − (a₂a₀−a₁)²

and
P₃(x) = [(a₀²−1)² − (a₂a₀−a₁)²]² − (a₁a₀−a₂)²(a₀²−a₂a₀+a₁−1)².
Hence
a_0^(1) = a₀² − 1 = (a₀ − 1)(a₀ + 1),
a_0^(2) = (a₀²−1)² − (a₂a₀−a₁)² = (a₀² − a₂a₀ + a₁ − 1)(a₀² + a₂a₀ − a₁ − 1),
a_0^(3) = [(a₀²−1)² − (a₂a₀−a₁)²]² − (a₁a₀−a₂)²(a₀²−a₂a₀+a₁−1)²
= [(a₀² + a₂a₀ − a₁ − 1)² − (a₁a₀−a₂)²](a₀² − a₂a₀ + a₁ − 1)²
= [a₀² + (a₂−a₁)a₀ + a₂ − a₁ − 1][a₀² + (a₂+a₁)a₀ − a₂ − a₁ − 1](a₀² − 1 − a₂a₀ + a₁)²
= (a₀ + 1)(a₀ + a₂ − a₁ − 1)(a₀ − 1)(a₀ + a₂ + a₁ + 1)(a₀² − 1 − a₂a₀ + a₁)².
Then the condition a_0^(1) < 0, a_0^(2) > 0, a_0^(3) > 0, after simplification, is equivalent to the three inequalities of the statement.
Now, to prove Jury-Marden Criterion, we need the following two results.
Theorem D.6 (Marden, [17], Theorem 42.1). Let $P$ be a real-coefficient polynomial of degree $n$. If the sequence
$$m_1(P),\, m_2(P),\, \dots,\, m_n(P)$$
has exactly $p$ negative elements and $n - p$ positive elements (hence no null elements), then $P$ has $p$ complex roots (counted with multiplicity) inside the unit circle $|z| = 1$, no roots on this circle, and $n - p$ complex roots (counted with multiplicity) outside this circle.
Lemma D.7 (Schur, [20]). Let $P(z) = a_0 + a_1 z + \dots + a_n z^n$ where $a_k \in \mathbb{R}$ for all $0 \le k \le n$. Assume that $|a_0| < |a_n|$. Then $\deg \tilde{P}_1 = n - 1$, and $P$ has property P if and only if $\tilde{P}_1$ has property P.
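Schur's lemma can also be illustrated numerically (our sketch, not the report's; `schur_reduce` is a hypothetical helper building $\tilde{P}_1$ as the reversal of $P_1$, with the Marden step used in this appendix). Random quartics are drawn with $|a_0| < |a_4|$ so that the lemma applies, and samples too close to the unit circle are skipped:

```python
import numpy as np

def schur_reduce(coeffs):
    # One Schur reduction: compute P_1 from P (coefficients lowest degree first),
    # then reverse it to obtain tilde-P_1.
    b = coeffs
    m = len(b) - 1
    p1 = [b[0] * b[j] - b[m] * b[m - j] for j in range(m)]
    return p1[::-1]

rng = np.random.default_rng(1)
for _ in range(300):
    # Random quartic with |a0| < |a4|, so deg tilde-P_1 = 3 and the lemma applies.
    a = list(rng.uniform(-1, 1, size=4)) + [2.0]
    red = schur_reduce(a)
    r_P = np.max(np.abs(np.roots(a[::-1])))      # highest degree first
    r_red = np.max(np.abs(np.roots(red[::-1])))
    if min(abs(r_P - 1), abs(r_red - 1)) > 1e-3:
        assert (r_P < 1) == (r_red < 1)
```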
Proof of the Jury–Marden criterion (Theorem D.4). The sufficiency of the condition for $P$ to have property P is a direct consequence of Marden's Theorem D.6. It remains to prove its necessity.

For that, we prove the following statement $M(n)$ by induction: "For every real-coefficient polynomial $P$ of degree $n$ having property P, the sequence $a^{(1)}_0, \dots, a^{(n)}_0$ obtained by Marden's algorithm must satisfy
$$a^{(1)}_0 < 0; \qquad a^{(k)}_0 > 0, \quad \forall\, 2 \le k \le n.\text{''}$$
To check $M(1)$, let $P(z) = a_0 + a_1 z$ where $a_0, a_1 \in \mathbb{R}$, $a_1 \neq 0$. Then $P(z) = 0 \Leftrightarrow z = -a_0/a_1$, and $|-a_0/a_1| < 1 \Leftrightarrow |a_0| < |a_1| \Leftrightarrow a^{(1)}_0 = a_0^2 - a_1^2 < 0$.
Now, supposing that $M(n-1)$ is true for some $n \in \mathbb{N}$, $n \ge 2$, we show that $M(n)$ is true. Let $P(z) = a_0 + a_1 z + \dots + a_n z^n$ where $a_k \in \mathbb{R}$, $k = 0, \dots, n$, and $a_n \neq 0$. Assume that $P$ has property P. First, $a^{(1)}_0 = a_0^2 - a_n^2 < 0$. Indeed, let $z_1, z_2, \dots, z_n$ be the $n$ zeros (counted with multiplicity) of $P$; then by Viète's formulas $z_1 z_2 \cdots z_n = (-1)^n (a_0/a_n)$. Taking the modulus of both sides of this identity and noting that $P$ has property P, we have $|a_0/a_n| < 1$, thus $a^{(1)}_0 = a_0^2 - a_n^2 < 0$. Next, by Lemma D.7, $\tilde{P}_1$ is of degree $n - 1$ and it also has property P. Marden's table for $\tilde{P}_1$ can be easily found:
$$
\begin{array}{l|ccccccc}
 & 1 & x & x^2 & \cdots & x^{n-3} & x^{n-2} & x^{n-1}\\\hline
\tilde{P}_1 & a^{(1)}_{n-1} & a^{(1)}_{n-2} & a^{(1)}_{n-3} & \cdots & a^{(1)}_2 & a^{(1)}_1 & a^{(1)}_0\\
P_1 & a^{(1)}_0 & a^{(1)}_1 & a^{(1)}_2 & \cdots & a^{(1)}_{n-3} & a^{(1)}_{n-2} & a^{(1)}_{n-1}\\
-P_2 & -a^{(2)}_0 & -a^{(2)}_1 & -a^{(2)}_2 & \cdots & -a^{(2)}_{n-3} & -a^{(2)}_{n-2} & \\
-\tilde{P}_2 & -a^{(2)}_{n-2} & -a^{(2)}_{n-3} & -a^{(2)}_{n-4} & \cdots & -a^{(2)}_1 & -a^{(2)}_0 & \\
P_3 & a^{(3)}_0 & a^{(3)}_1 & a^{(3)}_2 & \cdots & a^{(3)}_{n-3} & & \\
\tilde{P}_3 & a^{(3)}_{n-3} & a^{(3)}_{n-4} & a^{(3)}_{n-5} & \cdots & a^{(3)}_0 & & \\
\vdots & & & & \vdots & & & \\
P_{n-1} & a^{(n-1)}_0 & a^{(n-1)}_1 & & & & & \\
\tilde{P}_{n-1} & a^{(n-1)}_1 & a^{(n-1)}_0 & & & & & \\
P_n & a^{(n)}_0 & & & & & &
\end{array}
$$
By $M(n-1)$, we must then have $-a^{(2)}_0 < 0$ and $a^{(k)}_0 > 0$ for all $3 \le k \le n$, that is, $a^{(k)}_0 > 0$ for all $2 \le k \le n$. Together with $a^{(1)}_0 < 0$, this proves $M(n)$.
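The structure of this table can be observed on a concrete (hypothetical) integer example: the Marden sequence of $\tilde{P}_1$ is exactly $-a^{(2)}_0, a^{(3)}_0, \dots, a^{(n)}_0$. A small self-contained Python sketch (our helper names, using the Marden step of this appendix):

```python
def marden_step(b):
    # Marden step on coefficients b = [b0, ..., bm], lowest degree first.
    m = len(b) - 1
    return [b[0] * b[j] - b[m] * b[m - j] for j in range(m)]

def marden_sequence(b):
    # Constant terms of the successive polynomials P_1, P_2, ...
    seq = []
    while len(b) > 1:
        b = marden_step(b)
        seq.append(b[0])
    return seq

# Hypothetical cubic with coefficients a0..a3 = 1, 2, 3, 5:
P = [1, 2, 3, 5]
P1 = marden_step(P)          # [-24, -13, -7]
tP1 = P1[::-1]               # tilde-P_1 is the reversal of P_1
print(marden_sequence(P))    # [-24, 527, 228888]
print(marden_sequence(tP1))  # [-527, 228888] = [-a0^(2), a0^(3)]
```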
Publisher
Inria
Domaine de Voluceau - Rocquencourt
BP 105 - 78153 Le Chesnay Cedex
inria.fr