REGULARIZED SEQUENTIAL QUADRATIC PROGRAMMING METHODS
Philip E. Gill and Daniel P. Robinson
UCSD Department of Mathematics
Technical Report NA-11-02
October 2011
Abstract
We present the formulation and analysis of a new sequential quadratic programming (SQP) method for general nonlinearly constrained optimization. The method pairs a primal-dual generalized augmented Lagrangian merit function with a flexible line search to obtain a sequence of improving estimates of the solution. This function is a primal-dual variant of the augmented Lagrangian proposed by Hestenes and Powell in the early 1970s. A crucial feature of the method is that the QP subproblems are convex, but formed from the exact second derivatives of the original problem. This is in contrast to methods that use a less accurate quasi-Newton approximation. Additional benefits of this approach include the following: (i) each QP subproblem is regularized; (ii) the QP subproblem always has a known feasible point; and (iii) a projected gradient method may be used to identify the QP active set when far from the solution.

Key words. Nonlinear programming, nonlinear constraints, augmented Lagrangian, sequential quadratic programming, SQP methods, regularized methods, primal-dual methods.

AMS subject classifications. 49J20, 49J15, 49M37, 49D37, 65F05, 65K05, 90C30
Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112 (pgill@ucsd.edu). Research supported in part by National Science Foundation grants DMS-0511766 and DMS-0915220, and by Department of Energy grant DE-SC0002349.

Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD 21218-2682 (daniel.p.robinson@jhu.edu).
1. Introduction
We present a sequential quadratic programming (SQP) method for optimization problems involving general linear and nonlinear constraints. The method is described in terms of the problem format
\[
  \text{(NP)} \qquad \min_{x \in \mathbb{R}^n} \; f(x) \quad \text{subject to} \quad c(x) = 0, \quad x \ge 0,
\]
where $c : \mathbb{R}^n \to \mathbb{R}^m$ and $f : \mathbb{R}^n \to \mathbb{R}$ are twice-continuously differentiable. This problem format assumes that all general inequality constraints have been converted to equalities by the use of slack variables. Methods for solving problem (NP) easily carry over to the more general setting with $l \le x \le u$. The vector pair $(x, y)$ is called a first-order solution to problem (NP) if it satisfies
\[
  c(x) = 0 \quad \text{and} \quad \min(x, z) = 0, \tag{1.1}
\]
where $y$ is the vector of Lagrange multipliers associated with the constraints $c(x) = 0$, and $z$ is the vector of reduced costs at $(x, y)$, i.e., $z = g(x) - J(x)^T y$.
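As a concrete illustration, the conditions (1.1) can be checked numerically by stacking the feasibility and complementarity residuals; the same residual reappears later as $r_{\rm opt}$ in (2.19). The following sketch is our illustration (not code from the paper); the callables `c`, `g`, and `J` are placeholders for problem-specific routines.

```python
import numpy as np

def first_order_residual(c, g, J, x, y):
    """Residual of the first-order conditions (1.1) at (x, y)."""
    z = g(x) - J(x).T @ y                      # reduced costs z = g - J^T y
    return np.concatenate([c(x),               # feasibility: c(x) = 0
                           np.minimum(x, z)])  # complementarity: min(x, z) = 0

# Example: minimize x1^2 + x2^2 subject to x1 + x2 = 1, x >= 0.
c = lambda x: np.array([x[0] + x[1] - 1.0])
g = lambda x: 2.0 * x
J = lambda x: np.array([[1.0, 1.0]])
x_star, y_star = np.array([0.5, 0.5]), np.array([1.0])
print(first_order_residual(c, g, J, x_star, y_star))  # approximately zero
```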
Sequential quadratic programming methods and interior methods are two alter-
native approaches to handling the inequality constraints in problem (NP). Sequential
quadratic programming (SQP) methods find an approximate solution of a sequence
of quadratic programming (QP) subproblems in which a quadratic model of the ob-
jective function is minimized subject to the linearized constraints. Interior methods
approximate a continuous path that passes through a solution of (NP). In the simplest case, the path is parameterized by a positive scalar parameter $\mu$ that may be interpreted as a perturbation of the optimality conditions for the problem (NP).
Both interior methods and SQP methods have an inner/outer iteration structure,
with the work for an inner iteration being dominated by the cost of solving a large
sparse system of symmetric indefinite linear equations. In the case of SQP meth-
ods, these equations involve a subset of the variables and constraints; for interior
methods, the equations involve all the constraints and variables.
SQP methods provide a relatively reliable “certificate of infeasibility” and they are able to capitalize on a good initial starting point. Sophisticated matrix factorization updating techniques are used to exploit the fact that
the linear equations change by only a single row and column at each inner iteration.
These updating techniques are often customized for the particular QP method being
used and have the benefit of providing a uniform treatment of ill-conditioning and
singularity.
On the negative side, it is difficult to implement SQP methods so that exact sec-
ond derivatives can be used efficiently and reliably. Some of these difficulties stem
from the theoretical properties of the quadratic programming subproblem, which
can be nonconvex when second derivatives are used. Nonconvex quadratic pro-
gramming is NP-hard—even for the calculation of a local minimizer [11,25]. The
complexity of the QP subproblem has been a major impediment to the formula-
tion of second-derivative SQP methods (although methods based on indefinite QP
have been proposed [19,20]). Over the years, algorithm developers have avoided
this difficulty by eschewing second derivatives and by solving a convex QP subprob-
lem defined with a positive semidefinite quasi-Newton approximate Hessian (see,
e.g., [28]); some authors enhance these basic methods with an additional subspace
phase that incorporates exact second derivatives [33,34,40]. A difficulty with active-
set methods is that they may require a substantial number of QP iterations when the
outer iterates are far from the solution. The use of a QP subproblem is motivated by
the assumption that the QP objective and constraints provide good “models” of the
objective and constraints of problem (NP). This should make it unnecessary (and
inefficient) to solve the QP to high accuracy during the preliminary iterations. Un-
fortunately, the simple expedient of limiting the number of inner iterations may have
a detrimental effect upon reliability. An approximate QP solution may not predict
a sufficient improvement in a merit function. Moreover, some of the QP multipliers
will have the wrong sign if an active-set method is terminated before a solution is
found. This may cause difficulties if the QP multipliers are used to estimate the
multipliers for the nonlinear problem. These issues would largely disappear if a
primal-dual interior method were to be used to solve the QP subproblem. These
methods have the benefit of providing a sequence of feasible (i.e., correctly signed)
dual iterates. Nevertheless, QP solvers based on conventional interior methods have
had limited success within SQP methods because they are difficult to “warm start”
from a near-optimal point (see the discussion below). This makes it difficult to
capitalize on the property that, as the outer iterates converge, the solution of one
QP subproblem is a very good estimate of the solution of the next.
Broadly speaking, the advantages and disadvantages of SQP methods and in-
terior methods complement each other. Interior methods are most efficient when
implemented with exact second derivatives. Moreover, they can converge in few
inner iterations—even for very large problems. The inner iterates are the iterates of
Newton’s method for finding an approximate solution of the perturbed optimality conditions for a given $\mu$. As the dimension and zero/nonzero structure of the Newton equations remain fixed, these Newton equations may be solved efficiently using
either iterative or direct methods available in the form of advanced “off-the-shelf”
linear algebra software. In particular, any new software for multicore and parallel
architectures is immediately applicable. Moreover, the perturbation parameter µ
plays an auxiliary role as an implicit regularization parameter of the linear equa-
tions. This implicit regularization plays a crucial role in the robustness of interior
methods on ill-conditioned and ill-posed problems.
On the negative side, although interior methods are very effective for solving
“one-off” problems, they are difficult to adapt to solving a sequence of related non-
linear problems. This difficulty may be explained in terms of the “path-following”
interpretation of interior methods. In the neighborhood of an optimal solution, a
step along the path x(µ) of perturbed solutions is well-defined, whereas a step onto
the path from a neighboring point will be extremely sensitive to perturbations in
the problem functions (and hence difficult to compute). Another difficulty with con-
ventional interior methods is that a substantial number of iterations may be needed
when the constraints are infeasible.
The idea of replacing a constrained optimization problem by a sequence of unconstrained problems parameterized by a scalar $\mu$ has played a fundamental role in the formulation of algorithms since the early 1960s (for a seminal reference, see Fiacco and McCormick [16,17]). One of the best-known methods for solving the equality-constrained problem (NEP) (i.e., problem (NP) without the bounds $x \ge 0$) uses an unconstrained function based on the quadratic penalty function, which combines $f$ with a term of order $1/\mu$ that “penalizes” the sum of the squares of the constraint violations. Under certain conditions (see, e.g., [17,26,49,51]), the minimizers of the penalty function define a differentiable trajectory or central path that approaches the solution as $\mu \to 0$. Penalty methods approximate this path by minimizing the penalty function for a finite sequence of decreasing values of $\mu$. In this form, the methods have a two-level structure of inner and outer iterations: the inner iterations are those of the method used to minimize the penalty function, and the outer iterations test for convergence and adjust the value of $\mu$. As $\mu \to 0$, the Newton equations for minimizing the penalty function become increasingly ill-conditioned, and this ill-conditioning was perceived to be the reason for the poor numerical performance on some problems. In separate papers, Hestenes [36] and Powell [42] proposed the augmented Lagrangian function for (NEP), which is an unconstrained function based on augmenting the Lagrangian function with a quadratic penalty term that does not require $\mu$ to go to zero for convergence. The price that must be paid for keeping $1/\mu$ finite is the need to update estimates of the Lagrange multipliers in each outer iteration.
Since the first appearance of the Hestenes-Powell function, many algorithms have
been proposed based on using the augmented Lagrangian as an objective function for
sequential unconstrained minimization. Augmented Lagrangian functions have also
been proposed that treat the multiplier vector as a continuous function of x; some
of these ensure global convergence and permit local superlinear convergence (see,
e.g., Fletcher [18]; DiPillo and Grippo [13]; Bertsekas [1,2]; Boggs and Tolle [4]).
As methods for treating linear inequality constraints and bounds became more
sophisticated, the emphasis of algorithms shifted from sequential unconstrained min-
imization to sequential linearly constrained minimization. In this context, the aug-
mented Lagrangian has been used successfully within a number of different algo-
rithmic frameworks for problem (NP). The method used in the software package
LANCELOT [9] finds the approximate solution of a sequence of bound constrained
problems with an augmented Lagrangian objective function. Similarly, the software
package MINOS of Murtagh and Saunders [41] employs a variant of Robinson’s lin-
early constrained Lagrangian (LCL) method [44] in which an augmented Lagrangian
is minimized subject to the linearized nonlinear constraints. Friedlander and Saun-
ders [27] define a globally convergent version of the LCL method that can treat
infeasible constraints and infeasible subproblems. Augmented Lagrangian functions
have also been used extensively as a merit function for sequential quadratic pro-
gramming (SQP) methods (see, e.g., [3,5,7,21,28,30,45–48]).
The development of path-following interior methods for linear programming in
the mid-1980s stimulated renewed interest in the treatment of constraints by sequen-
tial unconstrained optimization. This new attention not only resulted in a new un-
derstanding of the computational complexity of existing methods but also provided
the impetus for the development of new approaches. A notable development was the
derivation of efficient path-following methods for linear programming based on ap-
plying Newton’s method with respect to both the primal and dual variables. These
new approaches also refocused attention on two computational aspects of penalty-
and barrier-function methods for nonlinear optimization. First, the recognition of
the formal equivalence between some primal-dual methods and conventional penalty
methods indicated that the inherent ill-conditioning of penalty and barrier functions
is not necessarily the reason for poor numerical performance. Second, the crucial
role of penalty and barrier functions in problem regularization was recognized and
better understood.
In this paper we formulate and analyze a new sequential quadratic programming
(SQP) method for nonlinearly constrained optimization. The method pairs a primal-
dual generalized augmented Lagrangian merit function with a flexible line search
to obtain a sequence of improving estimates of the solution. This function is a
primal-dual variant of the augmented Lagrangian proposed by Hestenes and Powell
in the early 1970s. A crucial feature of the method is that the QP subproblems
are convex, but formed from the exact second derivatives of the original problem.
This is in contrast to methods that use a less accurate quasi-Newton approximation.
Additional benefits of this approach include the following: (i) each QP subproblem
is regularized; (ii) the QP subproblem always has a known feasible point; and (iii) a
projected gradient method may be used to identify the QP active set when far from
the solution. Preliminary numerical experiments on a subset of problems from the
CUTEr test collection indicate that the proposed SQP method is significantly more
efficient than our current SQP package SNOPT.
The paper is organized in five sections. Section 1 is a review of some of the basic properties of SQP methods. In Section 2, the steps of the primal-dual SQP method are defined. Similarities with the conventional Hestenes-Powell augmented Lagrangian method are also discussed. In Section 3, we consider methods for the solution of the QP subproblem and show that, in the neighborhood of a solution, the method is equivalent to the stabilized SQP method [15,35,38,50]. A rather general global convergence result that does not make any constraint qualification or non-degeneracy assumption is established in Section 4.
Notation and Terminology
Unless explicitly indicated otherwise, $\|\cdot\|$ denotes the vector two-norm or its induced matrix norm. The inertia of a real symmetric matrix $A$, denoted by $\mathrm{In}(A)$, is the integer triple $(a_+, a_-, a_0)$ giving the number of positive, negative and zero eigenvalues of $A$. Given vectors $a$ and $b$ with the same dimension, the vector with $i$th component $a_i b_i$ is denoted by $a \cdot b$. The vectors $e$ and $e_j$ denote, respectively, the column vector of ones and the $j$th column of the identity matrix $I$. The dimensions of $e$, $e_j$ and $I$ are defined by the context. Given vectors $x$ and $y$, the long vector consisting of the elements of $x$ augmented by the elements of $y$ is denoted by $(x, y)$. The $i$th component of a vector labeled with a subscript will be denoted by $[\,\cdot\,]_i$, e.g., $[\,v\,]_i$ is the $i$th component of the vector $v$. The subvector of components with indices in the index set $\mathcal{S}$ is denoted by $[\,\cdot\,]_{\mathcal{S}}$, e.g., $[\,v\,]_{\mathcal{S}}$ is the vector with components $[\,v\,]_i$ for $i \in \mathcal{S}$. Similarly, if $M$ is a symmetric matrix, then $[\,M\,]_{\mathcal{S}}$ denotes the symmetric matrix with elements $m_{ij}$ for $i, j \in \mathcal{S}$. A local solution of an optimization problem is denoted by $x^*$. The vector $g(x)$ is used to denote $\nabla f(x)$, the gradient of $f(x)$, and $H(x)$ denotes the (symmetric) Hessian matrix $\nabla^2 f(x)$. The matrix $J(x)$ denotes the $m \times n$ constraint Jacobian, which has $i$th row $\nabla c_i(x)^T$, the gradient of the $i$th constraint function $c_i(x)$. The matrix $H_i(x)$ denotes the Hessian of $c_i(x)$. The Lagrangian function associated with (NP) is $L(x, y, z) = f(x) - c(x)^T y - z^T x$, where $y$ and $z$ are $m$- and $n$-vectors of dual variables associated with the equality constraints and bounds, respectively. The Hessian of the Lagrangian with respect to $x$ is denoted by $H(x, y) = H(x) - \sum_{i=1}^m y_i H_i(x)$.
Background
Some of the most efficient algorithms for nonlinear optimization are sequential
quadratic programming (SQP) methods. Conventional SQP methods find an ap-
proximate solution of a sequence of quadratic programming (QP) subproblems in
which a quadratic model of the objective function is minimized subject to the lin-
earized constraints. Given a current estimate $(x_k, y_k)$ of a primal-dual solution of (NP), a line-search SQP method computes a search direction $p_k$ such that $x_k + p_k$ is the solution (when it exists) of the convex quadratic program
\[
  \begin{array}{ll}
  \displaystyle\min_{x} & g_k^T (x - x_k) + \tfrac{1}{2} (x - x_k)^T \bar H_k (x - x_k) \\[4pt]
  \text{subject to} & c_k + J_k (x - x_k) = 0, \quad x \ge 0,
  \end{array} \tag{1.2}
\]
where $c_k$, $g_k$ and $J_k$ denote the quantities $c(x)$, $g(x)$ and $J(x)$ evaluated at $x_k$, and $\bar H_k$ is some positive-definite approximation to $H(x_k, y_k)$. If the Lagrange multiplier vector associated with the constraint $c_k + J_k(x - x_k) = 0$ is written in the form $y_k + q_k$, then a solution $(x_k + p_k, y_k + q_k)$ of the QP subproblem (1.2) satisfies
\[
  c_k + J_k p_k = 0 \quad \text{and} \quad \min\bigl(x_k + p_k,\; g_k + \bar H_k p_k - J_k^T (y_k + q_k)\bigr) = 0.
\]
Given any $x \ge 0$, let $\mathcal{A}_0$ and $\mathcal{F}_0$ denote the index sets
\[
  \mathcal{A}_0(x) = \{\, i : x_i = 0 \,\} \quad \text{and} \quad \mathcal{F}_0(x) = \{1, 2, \dots, n\} \setminus \mathcal{A}_0(x). \tag{1.3}
\]
If $x$ is feasible for the constraints $c_k + J_k(x - x_k) = 0$, then $\mathcal{A}_0(x)$ is the active set at $x$. If the set $\mathcal{A}_0$ associated with a solution of the subproblem (1.2) is known, then $x_k + p_k$ may be found by solving linear equations that represent the optimality conditions for an equality-constrained QP with the inequalities $x \ge 0$ replaced by $x_i = 0$ for $i \in \mathcal{A}_0$. In general, the optimal $\mathcal{A}_0$ is not known in advance, and active-set methods generate a sequence of estimates $(\hat p_j, \hat q_j) \to (p_k, q_k)$ such that $(\hat p_{j+1}, \hat q_{j+1}) = (\hat p_j, \hat q_j) + \alpha_j(\Delta p_j, \Delta q_j)$, with $(\Delta p_j, \Delta q_j)$ a solution of
\[
  \begin{pmatrix} \bar H_F & J_F^T \\ J_F & 0 \end{pmatrix}
  \begin{pmatrix} \Delta p_F \\ -\Delta q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g_k + \bar H_k \hat p_j - J_k^T (y_k + \hat q_j) \,]_F \\ c_k + J_k \hat p_j \end{pmatrix}, \tag{1.4}
\]
where $\bar H_F$ is the matrix of free rows and columns of $\bar H_k$, $J_F$ is the matrix of free columns of $J_k$, and the step length $\alpha_j$ is chosen to ensure the feasibility of all variables, not just those in the set $\mathcal{A}_0$.
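For illustration, the sketch below (ours, not code from the paper) assembles and solves the KKT system (1.4) for one inner iteration, given a guess of the free set. The argument names are placeholders; `free` is an integer index array for $\mathcal{F}_0$.

```python
import numpy as np

def active_set_step(Hbar, J, g, c, p_hat, q_hat, y, free):
    """Solve (1.4) for the increments (dp, dq) with the current free set."""
    HF = Hbar[np.ix_(free, free)]        # free rows and columns of Hbar
    JF = J[:, free]                      # free columns of J
    m = J.shape[0]
    K = np.block([[HF, JF.T],
                  [JF, np.zeros((m, m))]])
    r1 = (g + Hbar @ p_hat - J.T @ (y + q_hat))[free]
    r2 = c + J @ p_hat
    sol = np.linalg.solve(K, -np.concatenate([r1, r2]))
    dp = np.zeros_like(p_hat)
    dp[free] = sol[:len(free)]           # dp vanishes on the active set
    dq = -sol[len(free):]                # second block of unknowns is -dq
    return dp, dq
```

Note that `np.linalg.solve` fails when $J_F$ is rank deficient, which is precisely the difficulty discussed next.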
If the equations (1.4) are to be used to define $\Delta p_F$ and $\Delta q_j$, then it is necessary that $J_F$ have full rank, which is probably the greatest outstanding issue associated with systems of the form (1.4). Two remedies are available.

Rank-enforcing active-set methods maintain a set of indices $\mathcal{B}$ associated with a matrix of columns $J_B$ with rank $m$, i.e., the rows of $J_B$ are linearly independent. The set $\mathcal{B}$ is the complement in $\{1, 2, \dots, n\}$ of a “working set” $\mathcal{N}$ of indices that estimates the set $\mathcal{A}_0$ at a solution of (1.2). If $\mathcal{N}$ is a subset of $\mathcal{A}_0$, then the system analogous to (1.4) is given by
\[
  \begin{pmatrix} \bar H_B & J_B^T \\ J_B & 0 \end{pmatrix}
  \begin{pmatrix} \Delta p_B \\ -\Delta q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g_k + \bar H_k \hat p_j - J_k^T (y_k + \hat q_j) \,]_B \\ c_k + J_k \hat p_j \end{pmatrix}, \tag{1.5}
\]
which is nonsingular because of the linear independence of the rows of $J_B$.

Regularized active-set methods add a positive-definite regularization term in the (2,2) block of (1.4). The magnitude of the regularization is generally based on heuristic arguments, which gives mixed results in practice.
2. A Regularized Primal-Dual Line-Search SQP Algorithm
In this section, we define a regularized SQP line-search method based on the primal-dual augmented Lagrangian merit function
\[
  M^{\nu}(x, y \,; y^E\!, \mu) = f(x) - c(x)^T y^E + \frac{1}{2\mu}\|c(x)\|^2 + \frac{\nu}{2\mu}\|c(x) + \mu(y - y^E)\|^2, \tag{2.1}
\]
where $\nu$ is a scalar, $\mu$ is the so-called penalty parameter, and $y^E$ is an estimate of an optimal Lagrange multiplier vector $y^*$. This function, proposed by Robinson [43] and Gill and Robinson [31], may be derived by applying the primal-dual penalty function of Forsgren and Gill [23] to a problem in which the constraints are shifted by a constant vector (see Powell [42]). With the notation $c = c(x)$, $g = g(x)$, and $J = J(x)$, the gradient of $M^{\nu}(x, y \,; y^E\!, \mu)$ may be written as
\[
  \nabla M^{\nu}(x, y \,; y^E\!, \mu)
  = \begin{pmatrix} g - J^T\bigl((1+\nu)(y^E - \tfrac{1}{\mu} c) - \nu y\bigr) \\ \nu\bigl(c + \mu(y - y^E)\bigr) \end{pmatrix} \tag{2.2a}
\]
\[
  \qquad\qquad
  = \begin{pmatrix} g - J^T\bigl(\pi + \nu(\pi - y)\bigr) \\ \nu\mu(y - \pi) \end{pmatrix}, \tag{2.2b}
\]
where $\pi = \pi(x \,; y^E\!, \mu)$ denotes the vector-valued function
\[
  \pi(x \,; y^E\!, \mu) = y^E - \frac{1}{\mu} c(x). \tag{2.3}
\]
Similarly, the Hessian of $M^{\nu}(x, y \,; y^E\!, \mu)$ may be written as
\[
  \nabla^2 M^{\nu}(x, y \,; y^E\!, \mu)
  = \begin{pmatrix} H\bigl(x, \pi + \nu(\pi - y)\bigr) + \tfrac{1}{\mu}(1+\nu) J^T J & \nu J^T \\ \nu J & \nu\mu I \end{pmatrix}. \tag{2.4}
\]
We use $M^{\nu}(x, y)$, $\nabla M^{\nu}(x, y)$, and $\nabla^2 M^{\nu}(x, y)$ to denote $M^{\nu}$, $\nabla M^{\nu}$, and $\nabla^2 M^{\nu}$ evaluated with parameters $y^E$ and $\mu$. (We note that a trust-region based method could also be given, but we leave the statement and analysis to a future paper.)
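The next sketch (our illustration, written in our own notation) evaluates the merit function (2.1), the multiplier estimate $\pi$ of (2.3), and the gradient in the form (2.2b); the callables `f`, `c`, `g`, and `J` are assumed to be supplied by the user.

```python
import numpy as np

def merit_and_gradient(f, c, g, J, x, y, yE, mu, nu):
    """Evaluate M^nu(x, y; yE, mu) of (2.1) and its gradient (2.2b)."""
    cx, gx, Jx = c(x), g(x), J(x)
    pi = yE - cx / mu                                          # (2.3)
    M = (f(x) - cx @ yE
         + np.dot(cx, cx) / (2.0 * mu)
         + nu / (2.0 * mu) * np.sum((cx + mu * (y - yE))**2))  # (2.1)
    grad_x = gx - Jx.T @ (pi + nu * (pi - y))                  # x block of (2.2b)
    grad_y = nu * mu * (y - pi)                                # y block of (2.2b)
    return M, np.concatenate([grad_x, grad_y])
```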
Our approach is motivated by the following theorem, which shows that, under certain assumptions, minimizers of problem (NP) are also minimizers of the bound-constrained problem
\[
  \min_{x, y} \; M^{\nu}(x, y \,; y^*\!, \mu) \quad \text{subject to} \quad x \ge 0, \tag{2.5}
\]
where $y^*$ is a Lagrange multiplier vector for the equality constraints $c(x) = 0$.

Theorem 2.1. If $(x^*\!, y^*)$ satisfies the second-order sufficient conditions for a solution of problem (NP), then there exists a positive $\bar\mu$ such that, for all $0 < \mu < \bar\mu$, the point $(x^*\!, y^*)$ is a minimizer of the bound-constrained problem (2.5) for all $\nu > 0$.
2.1. Definition of the search direction
To motivate the computation of the step, we consider a quadratic approximation to $M^{\nu}$. Given $(x, y)$ and fixed $\nu \ge 0$, we define
\[
  H^{\nu}_M(x, y \,; \mu) = \begin{pmatrix} \bar H(x, y) + \tfrac{1}{\mu}(1+\nu) J(x)^T J(x) & \nu J(x)^T \\ \nu J(x) & \nu\mu I \end{pmatrix}, \tag{2.6}
\]
where $\bar H(x, y)$ is a symmetric approximation to $H\bigl(x, \pi + \nu(\pi - y)\bigr) \approx H(x, y)$ such that $\bar H(x, y) + \tfrac{1}{\mu} J(x)^T J(x)$ is positive definite. The approximation $\pi + \nu(\pi - y) \approx y$ is valid provided $\pi \approx y$. The restriction on the inertia of $\bar H$ implies that $H^{\nu}_M(x, y \,; \mu)$ is positive definite for $\nu > 0$ and positive semidefinite for $\nu = 0$ (see the inertia identity in Section 3.2.2).
Using this definition of $H^{\nu}_M$ at the $k$th primal-dual iterate $v_k = (x_k, y_k)$, consider the convex QP subproblem
\[
  \min_{\Delta v = (p, q)} \; \nabla M^{\nu}(v_k)^T \Delta v + \tfrac{1}{2} \Delta v^T H^{\nu}_M(v_k) \Delta v \quad \text{subject to} \quad x_k + p \ge 0, \tag{2.7}
\]
where $M^{\nu}(v)$ denotes the merit function evaluated at $v$. For any primal-dual QP solution $\Delta v_k = (p_k, q_k)$, it is shown in Theorem 3.3 of Section 3.2.2 that the first-order conditions associated with the variables in $\mathcal{F}_0(x_k + p_k)$ may be written in matrix form as
\[
  \begin{pmatrix} \bar H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_k \end{pmatrix}
  = - \begin{pmatrix} [\, g_k - J_k^T y_k - \bar H_k s \,]_F \\ c_k + \mu(y_k - y^E) - J_k s \end{pmatrix}, \tag{2.8}
\]
where $c_k$, $g_k$ and $J_k$ denote the quantities $c(x)$, $g(x)$ and $J(x)$ evaluated at $x_k$, and $s$ is a nonnegative vector such that
\[
  s_i = \begin{cases} [\, x_k \,]_i & \text{if } i \in \mathcal{A}_0(x_k + p_k); \\ 0 & \text{if } i \in \mathcal{F}_0(x_k + p_k). \end{cases}
\]
(The assumption of positive definiteness of $\bar H_k + \tfrac{1}{\mu} J_k^T J_k$ implies that the matrix associated with the equations (2.8) is nonsingular.) It follows that if $\mathcal{A}_0(x_k + p_k) = \mathcal{A}_0(x_k)$, then $(p_k, q_k)$ satisfies the perturbed Newton equations
\[
  \begin{pmatrix} \bar H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_k \end{pmatrix}
  = - \begin{pmatrix} [\, g_k - J_k^T y_k \,]_F \\ c_k + \mu(y_k - y^E) \end{pmatrix}.
\]
A key property is that if $\mu = 0$ and $J_F$ has full rank, then this equation is identical to the equation for the conventional SQP step given by (1.4). This provides the motivation to use different penalty parameters for the step computation and the merit function.
Given an iterate $v_k = (x_k, y_k)$ and Lagrange multiplier estimate $y^E_k$, the primal-dual search direction $\Delta v_k = (p_k, q_k)$ is defined such that $v_k + \Delta v_k = (x_k + p_k, y_k + q_k)$ is a solution of the convex QP problem
\[
  \begin{array}{ll}
  \displaystyle\min_{v = (x, y)} & (v - v_k)^T \nabla M^{\nu}(v_k \,; y^E_k, \mu^R_k) + \tfrac{1}{2} (v - v_k)^T H^{\nu}_M(v_k \,; \mu^R_k)(v - v_k) \\[4pt]
  \text{subject to} & x \ge 0,
  \end{array} \tag{2.9}
\]
where $\mu^R_k$ is a small parameter, and $H^{\nu}_M(v_k \,; \mu^R_k)$ is the matrix (2.6) written in terms of the composite variables $v_k = (x_k, y_k)$. In this context, $\mu^R_k$ plays the role of a regularization parameter rather than a penalty parameter, thereby providing an $O(\mu^R_k)$ estimate of the conventional SQP direction. This approach is nonstandard because a small “penalty parameter” $\mu^R_k$ is used by design, whereas other augmented Lagrangian-based methods attempt to keep $\mu$ as large as possible [8,28].

Finally, we note that if $v = v_k$ is a solution of the QP (2.9), then $v_k$ is a first-order solution of
\[
  \min_{v = (x, y)} \; M^{\nu}(v \,; y^E_k, \mu^R_k) \quad \text{subject to} \quad x \ge 0. \tag{2.10}
\]
In Section 3 it is shown that, under certain conditions, the primal-dual vector $v_k + \Delta v_k = (x_k + p_k, y_k + q_k)$ is a solution of problem (2.9) if and only if it solves
\[
  \begin{array}{ll}
  \displaystyle\min_{x, y} & g_k^T (x - x_k) + \tfrac{1}{2} (x - x_k)^T \bar H(x_k, y_k)(x - x_k) + \tfrac{1}{2} \mu^R_k \|y\|^2 \\[4pt]
  \text{subject to} & c_k + J_k(x - x_k) + \mu^R_k(y - y^E_k) = 0, \quad x \ge 0,
  \end{array} \tag{2.11}
\]
which is often referred to as the “stabilized” SQP subproblem because of its calming effect on multiplier estimates for degenerate problems (see, e.g., [35,50]). Therefore, the proposed method provides a natural link between stabilized SQP methods (which employ a subproblem appropriate for degenerate problems), conventional SQP methods (which are highly efficient in practice), and augmented Lagrangian methods (which have desirable global convergence properties).
2.2. Definition of the new iterate
Once the search direction $\Delta v_k$ has been determined, a “flexible” backtracking line search is performed on the primal-dual augmented Lagrangian. A conventional backtracking line search defines $v_{k+1} = v_k + \alpha_k \Delta v_k$, where $\alpha_k = 2^{-j}$ and $j$ is the smallest nonnegative integer such that
\[
  M^{\nu}(v_k + \alpha_k \Delta v_k \,; y^E_k, \mu_k) \le M^{\nu}(v_k \,; y^E_k, \mu_k) + \alpha_k \eta_S \, \Delta v_k^T \nabla M^{\nu}(v_k \,; y^E_k, \mu_k)
\]
for a given scalar $\eta_S \in (0, 1)$. However, this approach would suffer from the Maratos effect [39] simply because the penalty parameter $\mu_k$ and the regularization parameter $\mu^R_k$ generally have different values. Thus, we use a “flexible penalty function” based on the work of Curtis and Nocedal [12] and define $\alpha_k = 2^{-j}$, where $j$ is the smallest nonnegative integer such that
\[
  M^{\nu}(v_k + \alpha_k \Delta v_k \,; y^E_k, \mu^F_k) \le M^{\nu}(v_k \,; y^E_k, \mu^F_k) + \alpha_k \eta_S N_k \tag{2.12}
\]
for some value $\mu^F_k \in [\mu^R_k, \mu_k]$, and where
\[
  N_k \triangleq \max\bigl( \Delta v_k^T \nabla M^{\nu}(v_k \,; y^E_k, \mu^R_k), \, -10^{-3} \|\Delta v_k\|^2 \bigr) \le 0 \tag{2.13}
\]
is a sufficiently negative real number that will allow us to prove global convergence of our proposed method. Once an appropriate value for $\alpha_k$ is found, the new primal-dual solution estimate is given by
\[
  x_{k+1} = x_k + \alpha_k p_k \quad \text{and} \quad y_{k+1} = y_k + \alpha_k q_k.
\]
We note that the step acceptance is well defined since the weakened Armijo condition (2.12) will be satisfied for $\mu^F_k = \mu^R_k$ and all $\alpha$ sufficiently small.
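A minimal sketch of this flexible search is given below (our illustration, assuming a callable `merit(v, mu)` for $M^{\nu}(v \,; y^E_k, \mu)$ and the gradient `grad_R` evaluated with $\mu^R_k$). Condition (2.12) may hold for any $\mu^F_k \in [\mu^R_k, \mu_k]$; the sketch simply tries the two endpoints at each backtracking step.

```python
import numpy as np

def flexible_linesearch(merit, grad_R, v, dv, mu_R, mu, eta_S=1e-4, max_back=30):
    """Backtracking search satisfying (2.12)-(2.13); alpha_k = 2**(-j)."""
    Nk = max(grad_R @ dv, -1e-3 * np.dot(dv, dv))   # N_k of (2.13)
    alpha = 1.0
    for _ in range(max_back):
        for mu_F in (mu, mu_R):                     # endpoints of [mu_R, mu]
            if merit(v + alpha * dv, mu_F) <= merit(v, mu_F) + alpha * eta_S * Nk:
                return alpha
        alpha *= 0.5
    return alpha   # sketch only: give up after max_back halvings
```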
2.3. Updating the multiplier estimate
The preliminary numerical results presented in [31] indicate that the method outlined thus far is robust with respect to updating $y^E_k$. In particular, the numerical results generated in that paper updated $y^E_k$ at every iteration. Consequently, we seek a strategy that allows for frequent updates to $y^E_k$. To this end, we use the (merit) functions
\[
  \phi_S(v) = \eta(x) + 10^{-5} \omega(v) \quad \text{and} \quad \phi_L(v) = 10^{-5} \eta(x) + \omega(v), \tag{2.14}
\]
where
\[
  \eta(x) = \|c(x)\| \quad \text{and} \quad \omega(x, y) = \bigl\| \min\bigl( x, \, g(x) - J(x)^T y \bigr) \bigr\| \tag{2.15}
\]
are feasibility and stationarity measures at the point $(x, y)$, respectively. These optimality measures are based on the optimality conditions for problem (NP) rather than for minimizing the merit function $M^{\nu}$. Both measures are bounded below by zero, and both are equal to zero if $v$ is a first-order solution to problem (NP). Such conditions are appropriate because the trial steps are regularized SQP steps that should converge rapidly to a solution of problem (NP).
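The measures (2.15) and the two complementary weightings (2.14) are straightforward to compute; the sketch below is our illustration, with `c`, `g`, and `J` placeholders as before.

```python
import numpy as np

def eta(c, x):
    return np.linalg.norm(c(x))                               # feasibility, (2.15)

def omega(g, J, x, y):
    return np.linalg.norm(np.minimum(x, g(x) - J(x).T @ y))   # stationarity, (2.15)

def phi_S(c, g, J, x, y):
    return eta(c, x) + 1e-5 * omega(g, J, x, y)    # (2.14), feasibility-weighted

def phi_L(c, g, J, x, y):
    return 1e-5 * eta(c, x) + omega(g, J, x, y)    # (2.14), stationarity-weighted
```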
The estimate $y^E_k$ is updated when any iterate $v_k$ satisfies either $\phi_S(v_k) \le \tfrac{1}{2} \phi^{\max}_S$ or $\phi_L(v_k) \le \tfrac{1}{2} \phi^{\max}_L$, where $\phi^{\max}_S$ and $\phi^{\max}_L$ are bounds that are updated throughout the solution process. To ensure global convergence, the update to $y^E_k$ is accompanied by a decrease in either $\phi^{\max}_S$ or $\phi^{\max}_L$.
Finally, $y^E_k$ is also updated if an approximate first-order solution of the problem
\[
  \min_{x, y} \; M^{\nu}(x, y \,; y^E_k, \mu^R_k) \quad \text{subject to} \quad x \ge 0 \tag{2.16}
\]
has been found. The test for optimality is
\[
  \| \nabla_y M^{\nu}(v_{k+1} \,; y^E_k, \mu^R_k) \| \le \tau_k
  \quad \text{and} \quad
  \bigl\| \min\bigl( x_{k+1}, \, \nabla_x M^{\nu}(v_{k+1} \,; y^E_k, \mu^R_k) \bigr) \bigr\| \le \tau_k \tag{2.17}
\]
for some small tolerance $\tau_k > 0$. This condition is rarely satisfied in practice, but the test is required for the proof of convergence. Nonetheless, if the condition is satisfied, $y^E_k$ is updated with the safeguarded estimate
\[
  y^E_{k+1} = \operatorname{mid}\bigl( -10^6, \, y_{k+1}, \, 10^6 \bigr).
\]
2.4. Updating the penalty parameters
As we only want to decrease $\mu^R_k$ when “close” to optimality (ignoring locally infeasible problems), we use the definition
\[
  \mu^R_{k+1} = \begin{cases} \min\bigl( \tfrac{1}{2} \mu^R_k, \, \|r_{k+1}\|^{3/2} \bigr) & \text{if (2.17) is satisfied;} \\[2pt] \min\bigl( \mu^R_k, \, \|r_{k+1}\|^{3/2} \bigr) & \text{otherwise,} \end{cases} \tag{2.18}
\]
where
\[
  r_{k+1} \triangleq r_{\rm opt}(v_{k+1}) = \begin{pmatrix} c(x_{k+1}) \\ \min\bigl( x_{k+1}, \, g(x_{k+1}) - J(x_{k+1})^T y_{k+1} \bigr) \end{pmatrix}. \tag{2.19}
\]
The update to $\mu_k$ is motivated by a different goal. Namely, we wish to decrease $\mu_k$ only when the trial step indicates that the merit function with penalty parameter $\mu_k$ increases. Thus, we use the definition
\[
  \mu_{k+1} = \begin{cases} \mu_k & \text{if } M^{\nu}(v_{k+1} \,; y^E_k, \mu_k) \le M^{\nu}(v_k \,; y^E_k, \mu_k) + \min(\alpha_{\min}, \alpha_k) \, \eta_S N_k; \\[2pt] \max\bigl( \tfrac{1}{2} \mu_k, \, \mu^R_{k+1} \bigr) & \text{otherwise,} \end{cases} \tag{2.20}
\]
for some positive $\alpha_{\min}$. The use of the scalar $\alpha_{\min}$ increases the likelihood that $\mu_k$ will not be decreased.
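The two updates translate directly into code; the sketch below is our paraphrase of (2.18) and (2.20), where `r_norm` is $\|r_{k+1}\|$ from (2.19), `opt_test` records whether (2.17) held, and `merit_decrease` records whether the decrease condition in (2.20) held.

```python
def update_mu_R(mu_R, r_norm, opt_test):
    """Regularization update (2.18): mu_R never increases."""
    if opt_test:                          # (2.17) satisfied
        return min(0.5 * mu_R, r_norm**1.5)
    return min(mu_R, r_norm**1.5)

def update_mu(mu, mu_R_next, merit_decrease):
    """Penalty update (2.20): keep mu unless the merit function increased."""
    if merit_decrease:
        return mu
    return max(0.5 * mu, mu_R_next)       # maintain mu >= mu_R
```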
2.5. Formal statement of the algorithm
In this section we formally state the proposed method as Algorithm 2.1 and include some additional details. During each iteration, the trial step is computed as described in Section 2.1, the solution estimate is updated as in Section 2.2, $y^E_k$ is updated as in Section 2.3, and the penalty parameters are updated as in Section 2.4. The value of $y^E_k$ is crucial for both global and local convergence. To this end, there are three possibilities. First, $y^E_k$ is set to $y_{k+1}$ if $(x_{k+1}, y_{k+1})$ is acceptable to either of the merit functions $\phi_S$ or $\phi_L$ given by (2.14). These iterates are labeled S- and L-iterates, respectively. It is to be expected that $y^E_k$ will be updated in this way most of the time. Second, if $(x_{k+1}, y_{k+1})$ is not acceptable to either of the merit functions $\phi_S$ or $\phi_L$, we check whether we have computed an approximate first-order solution to problem (2.16) by verifying conditions (2.17) for the current value of $\tau_k$. If these conditions are satisfied, the iterate is called an M-iterate. In this case, the regularization parameter $\mu^R_k$ and subproblem tolerance $\tau_k$ are decreased, and $y^E_k$ is updated with the safeguarded estimate of Section 2.3. Finally, an iterate at which neither of the first two cases occurs is called an F-iterate. The multiplier estimate $y^E_k$ is not changed in an F-iterate.
Algorithm 2.1. Regularized primal-dual SQP algorithm (pdSQP)

  Input $(x_0, y_0)$;
  Set algorithm parameters $\alpha_{\min} > 0$, $\eta_S \in (0, 1)$, $\tau_{\rm stop} > 0$, and $\nu > 0$;
  Initialize $y^E_0 = y_0$, $\tau_0 > 0$, $\mu^R_0 > 0$, $\mu_0 \in [\mu^R_0, \infty)$, and $k = 0$;
  Compute $f(x_0)$, $c(x_0)$, $g(x_0)$, $J(x_0)$, and $H(x_0, y_0)$;
  for $k = 0, 1, 2, \dots$ do
    Define $\bar H_k \approx H(x_k, y_k)$ such that $\bar H_k + (1/\mu^R_k) J_k^T J_k$ is positive definite;
    Solve the QP (2.9) for the search direction $\Delta v_k = (p_k, q_k)$;
    Find an $\alpha_k$ satisfying (2.12) and (2.13);
    Update the primal-dual estimate $x_{k+1} = x_k + \alpha_k p_k$, $y_{k+1} = y_k + \alpha_k q_k$;
    Compute $f(x_{k+1})$, $c(x_{k+1})$, $g(x_{k+1})$, $J(x_{k+1})$, and $H(x_{k+1}, y_{k+1})$;
    if $\phi_S(x_{k+1}, y_{k+1}) \le \frac{1}{2} \phi^{\max}_S$ then   [S-iterate]
      $\phi^{\max}_S = \frac{1}{2} \phi^{\max}_S$;  $y^E_{k+1} = y_{k+1}$;  $\tau_{k+1} = \tau_k$;
    else if $\phi_L(x_{k+1}, y_{k+1}) \le \frac{1}{2} \phi^{\max}_L$ then   [L-iterate]
      $\phi^{\max}_L = \frac{1}{2} \phi^{\max}_L$;  $y^E_{k+1} = y_{k+1}$;  $\tau_{k+1} = \tau_k$;
    else if $v_{k+1} = (x_{k+1}, y_{k+1})$ satisfies (2.17) then   [M-iterate]
      $y^E_{k+1} = \operatorname{mid}(-10^6, y_{k+1}, 10^6)$;  $\tau_{k+1} = \frac{1}{2} \tau_k$;
    else   [F-iterate]
      $y^E_{k+1} = y^E_k$;  $\tau_{k+1} = \tau_k$;
    end if
    Update $\mu^R_{k+1}$ and $\mu_{k+1}$ according to (2.18) and (2.20), respectively;
    if $\|r_{k+1}\| \le \tau_{\rm stop}$ then exit;
  end (for)
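A schematic Python outline of this loop is given below; it is our sketch, not the authors' implementation. The callables `solve_qp`, `linesearch`, `phi_S`, `phi_L`, `opt_test`, and `r_norm` stand for the computations of Sections 2.1-2.4, the initial bounds $\phi^{\max}_S$ and $\phi^{\max}_L$ are chosen arbitrarily, and the parameter updates are simplified to halving on M-iterates rather than the full rules (2.18) and (2.20).

```python
import numpy as np

def pdSQP_outline(x, y, solve_qp, linesearch, phi_S, phi_L, opt_test, r_norm,
                  tau=1.0, mu_R=1e-2, mu=1.0, tau_stop=1e-6, max_iter=100):
    yE = y.copy()
    phi_S_max = phi_S(x, y) + 1.0                  # assumed initial bounds
    phi_L_max = phi_L(x, y) + 1.0
    for _ in range(max_iter):
        p, q = solve_qp(x, y, yE, mu_R)               # subproblem (2.9)
        alpha = linesearch(x, y, p, q, yE, mu_R, mu)  # flexible search (2.12)
        x, y = x + alpha * p, y + alpha * q
        if phi_S(x, y) <= 0.5 * phi_S_max:         # S-iterate
            phi_S_max *= 0.5; yE = y.copy()
        elif phi_L(x, y) <= 0.5 * phi_L_max:       # L-iterate
            phi_L_max *= 0.5; yE = y.copy()
        elif opt_test(x, y, yE, mu_R, tau):        # M-iterate: (2.17) holds
            yE = np.clip(y, -1e6, 1e6)             # mid(-1e6, y, 1e6)
            tau *= 0.5; mu_R *= 0.5                # simplified stand-in for (2.18)
        # otherwise an F-iterate: yE and tau are unchanged
        if r_norm(x, y) <= tau_stop:
            break
    return x, y
```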
3. Solution of the QP Subproblem
In this section we consider various theoretical and computational issues associated with the QP subproblem (2.9). In particular, it is shown that the search direction computed from subproblem (2.9) is the unique solution of the “stabilized” SQP subproblem (2.11) and is independent of the value of $\nu$. Moreover, an active-set method applied to problems (2.9) and (2.11) generates identical iterates, provided a common (feasible) starting point is used.
3.1. Equivalence with Stabilized SQP
In this section it is shown that, under certain conditions, the regularized QP subproblem (2.9) is equivalent to the stabilized SQP subproblem (2.11). Equivalent problems are considered in which the unknowns are written in terms of the steps $(p, q)$ for given variables $(x, y)$.
Theorem 3.1. Consider the bound-constrained QP
\[
  \min_{\Delta v = (p, q)} \; g_M^T \Delta v + \tfrac{1}{2} \Delta v^T H_M \Delta v \quad \text{subject to} \quad x + p \ge 0, \tag{3.1}
\]
where $x$ and $y$ are constant,
\[
  g_M = \begin{pmatrix} g - J^T\bigl(\pi + \nu(\pi - y)\bigr) \\ \nu\bigl(c + \mu(y - y^E)\bigr) \end{pmatrix},
  \quad \text{and} \quad
  H_M = \begin{pmatrix} H + \tfrac{1}{\mu}(1+\nu) J^T J & \nu J^T \\ \nu J & \nu\mu I \end{pmatrix},
\]
with $H + \tfrac{1}{\mu} J^T J$ positive definite and $\nu \ge 0$. For the same quantities $c$, $g$, $J$ and $H$, consider the stabilized QP problem
\[
  \begin{array}{ll}
  \displaystyle\min_{p, q} & g^T p + \tfrac{1}{2} p^T H p + \tfrac{1}{2} \mu \|y + q\|^2 \\[4pt]
  \text{subject to} & c + J p + \mu(y + q - y^E) = 0, \quad x + p \ge 0.
  \end{array} \tag{3.2}
\]
The following results hold.

(a) The stabilized QP (3.2) has a bounded unique primal-dual solution $(p, q)$.

(b) The unique solution $\Delta v = (p, q)$ of the stabilized QP (3.2) is a solution of the bound-constrained QP (3.1) for all $\nu \ge 0$. If $\nu > 0$, then the stabilized solution $\Delta v = (p, q)$ is the unique solution of (3.1).
Proof. For part (a), let $\Delta v = (p, q)$ denote an arbitrary feasible point for the constraints of the stabilized QP (3.2). Given the particular feasible point $\Delta v_0 = (0, \pi - y)$, consider an $n$-vector of variables $w$ defined by the linear transformation
\[
  \Delta v = \Delta v_0 + M w, \quad \text{where} \quad M = \begin{pmatrix} \mu I \\ -J \end{pmatrix}.
\]
The matrix $M$ is $(n+m) \times n$ with rank $n$, and its columns form a basis for the null-space of the constraint matrix $\begin{pmatrix} J & \mu I \end{pmatrix}$. Using this transformation gives rise to the equivalent problem
\[
  \min_{w \in \mathbb{R}^n} \; \frac{\mu}{2} w^T \Bigl( H + \frac{1}{\mu} J^T J \Bigr) w + w^T \bigl( g - J^T \pi \bigr)
  \quad \text{subject to} \quad x + \mu w \ge 0.
\]
The matrix $H + \tfrac{1}{\mu} J^T J$ is positive definite by assumption, and it follows that the stabilized QP (3.2) is equivalent to a convex program with a strictly convex objective. The existence of a bounded unique solution follows directly.
For part (b), we begin by stating the first-order conditions for $(p, q)$ to be a solution of the stabilized QP (3.2):
\[
  \begin{aligned}
  & c + J p + \mu(y + q - y^E) = 0, \qquad \mu(y + q) = \mu w, \\
  & g + H p - J^T w - z = 0, \qquad z \ge 0, \\
  & z \cdot (x + p) = 0, \qquad x + p \ge 0,
  \end{aligned}
\]
where $w$ and $z$ denote the dual variables for the equality and inequality constraints of problem (3.2), respectively. Eliminating $w$ using the equation $w = y + q$ gives
\[
  c + J p + \mu(y + q - y^E) = 0, \tag{3.3a}
\]
\[
  g + H p - J^T(y + q) - z = 0, \quad z \ge 0, \tag{3.3b}
\]
\[
  z \cdot (x + p) = 0, \quad x + p \ge 0. \tag{3.3c}
\]
First, we prove part (b) for the case $\nu > 0$. The optimality conditions for (3.1) are
\[
  g_M + H_M \Delta v = \begin{pmatrix} z \\ 0 \end{pmatrix}, \quad z \ge 0, \qquad
  z \cdot (x + p) = 0, \quad x + p \ge 0. \tag{3.4}
\]
Pre-multiplying the equality of (3.4) by the nonsingular matrix $T$ such that
\[
  T = \begin{pmatrix} I & -\dfrac{1+\nu}{\nu\mu} J^T \\[8pt] 0 & \dfrac{1}{\nu} I \end{pmatrix},
\]
and using the definition (2.2a), yields the equivalent conditions
\[
  g + H p - J^T(y + q) - z = 0 \quad \text{and} \quad c + J p + \mu(y + q - y^E) = 0,
\]
which are identical to the relevant equalities in (3.3). Thus, the solutions of (3.2) and (3.1) are identical in the case $\nu > 0$.

It remains to consider the case $\nu = 0$. In this situation, the objective function of the QP (3.1) includes only the primal variables $p$, which implies that the problem may be written as
\[
  \min_{p} \; (g - J^T \pi)^T p + \tfrac{1}{2} p^T \Bigl( H + \tfrac{1}{\mu} J^T J \Bigr) p \quad \text{subject to} \quad x + p \ge 0, \tag{3.5}
\]
with $q$ an arbitrary vector. Although there are infinitely many solutions of (3.1), the vector $p$ associated with a particular solution $(p, q)$ is unique because it is the solution of problem (3.5) for a positive-definite matrix $H + \tfrac{1}{\mu} J^T J$. The optimality conditions for (3.5) are
\[
  g - J^T \pi + \Bigl( H + \tfrac{1}{\mu} J^T J \Bigr) p = z, \quad z \ge 0, \qquad
  z \cdot (x + p) = 0, \quad x + p \ge 0. \tag{3.6}
\]
For the given $y$ and optimal $p$, define the $m$-vector $q$ such that
\[
  q = -\frac{1}{\mu}\bigl( J p + c + \mu(y - y^E) \bigr) = -\frac{1}{\mu} J p - (y - \pi). \tag{3.7}
\]
Equation (3.7) and the equality of (3.6) may be combined to give the matrix equation
\[
  \begin{pmatrix} g - J^T y + 2 J^T(y - \pi) \\ \mu(y - \pi) \end{pmatrix}
  + \begin{pmatrix} H + \tfrac{2}{\mu} J^T J & J^T \\ J & \mu I \end{pmatrix}
  \begin{pmatrix} p \\ q \end{pmatrix}
  = \begin{pmatrix} z \\ 0 \end{pmatrix}.
\]
Applying the nonsingular matrix $\begin{pmatrix} I & -\tfrac{2}{\mu} J^T \\ 0 & I \end{pmatrix}$ to both sides of this equation yields
\[
  \begin{pmatrix} g - J^T y \\ c + \mu(y - y^E) \end{pmatrix}
  + \begin{pmatrix} H & -J^T \\ J & \mu I \end{pmatrix}
  \begin{pmatrix} p \\ q \end{pmatrix}
  = \begin{pmatrix} z \\ 0 \end{pmatrix}.
\]
It follows that if $\nu = 0$, then the unique solution of (3.2) is a solution of (3.1), which is what we wanted to show.

When $\nu > 0$, the uniqueness of the solution $\Delta v = (p, q)$ follows from the observation that the QP (3.1) is then convex with a strictly convex objective.
Theorem 3.1 shows that the direction defined by the bound-constrained QP is independent of the parameter $\nu$. Moreover, this direction may be defined as the solution of an equivalent stabilized SQP subproblem (2.11) that does not include $\nu$ at all. However, the parameter $\nu$ does appear explicitly in the definition of the merit function $M^{\nu}$ (2.1), and therefore plays an important role in influencing the length of the step during the flexible line search. The value of $\nu$ determines the proximity of the primal-dual iterates to the so-called “primal-dual trajectory”, which is the one-parameter family of points $\bigl( x(\mu), y(\mu) \bigr)$ such that $x(\mu)$ is a minimizer of the conventional augmented Lagrangian for fixed $y^E$. The definition of $M^{\nu}$ implies that larger values of $\nu$ tend to force the iterates to be close to the primal-dual trajectory. If $\nu = 0$, then the method reverts to a regularized SQP method based on the (primal) conventional augmented Lagrangian (for which no emphasis is placed on staying close to the primal-dual trajectory). The algorithm may be modified to allow for the choice $\nu = 0$ by always setting $y^E_{k+1}$ to be $\pi(x_{k+1})$; this does emphasize the primal-dual trajectory, but only after the major iteration has been completed. The use of the primal-dual augmented Lagrangian function allows emphasis to be placed on the dual variables during the line search.
3.2. Equivalent iterates of an active-set method
In Section 3.1 it is shown that, if $\nu > 0$, then the unique solutions of subproblems (2.9) and (2.11) are identical, and if $\nu = 0$, then the solution of (2.9) is no longer unique, but there is a particular solution that is identical to the unique solution of (2.11). In this section we continue our study of these subproblems by considering the iterates that result when they are solved with an active-set method.
3.2.1. An active-set method
For the remainder of this section, the indices associated with the SQP iteration are omitted, and it will be assumed that the constraints of the QP involve the constraints linearized at the point $\bar x$. In all cases, the suffix $j$ is reserved for the iteration index of the QP algorithm.
We start by defining a “conventional” active-set method on a generic convex QP with constraints written in standard form. The problem format is
\[
  \begin{array}{ll}
  \displaystyle\min_{x} & Q(x) = g^T(x - \bar x) + \tfrac{1}{2}(x - \bar x)^T H (x - \bar x) \\[4pt]
  \text{subject to} & c + A(x - \bar x) = 0, \quad x \ge 0,
  \end{array} \tag{3.8}
\]
where $\bar x$, $c$, $A$, $g$ and $H$ are constant, with $H$ positive definite. Throughout, we assume that the constraints are feasible, i.e., there exists at least one nonnegative $x$ such that $c + A(x - \bar x) = 0$.

Given a feasible $x_0$, active-set methods generate a feasible sequence $\{x_j\}$ such that $Q(x_{j+1}) \le Q(x_j)$ with $x_{j+1} = x_j + \alpha_j p_j$. Let the index sets $\mathcal{A}_0$ and $\mathcal{F}_0$ be defined as in (1.3). At the start of the $j$th QP iteration, given primal-dual iterates $(x_j, w_j)$, new estimates $(x_j + p_j, w_j + q_j)$ are defined by solving a QP formed by fixing the variables with indices in $\mathcal{A}_0(x_j)$ and defining $p_j$ such that $x_j + p_j$ minimizes $Q(x)$ with respect to the free variables, subject to the equality constraints. With this definition, the quantities $w_j + q_j$ are the Lagrange multipliers at the minimizer $x_j + p_j$. The components of $p_j$ with indices in $\mathcal{A}_0(x_j)$ are zero, and the free components $p_F = [\, p_j \,]_F$ are determined from the equations
\[
  \begin{pmatrix} H_F & A_F^T \\ A_F & 0 \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g + H(x_j - \bar x) - A^T w_j \,]_F \\ c + A(x_j - \bar x) \end{pmatrix}, \tag{3.9}
\]
where $[\,\cdot\,]_F$ denotes the subvector of components with indices in $\mathcal{F}_0(x_j)$. The choice of the step length $\alpha_j$ is based on remaining feasible with respect to the satisfied bounds. If $x_j + p_j$ is feasible, i.e., $x_j + p_j \ge 0$, then $\alpha_j$ is taken as unity. Otherwise, $\alpha_j$ is set to $\alpha_M$, the largest feasible step along $p_j$. Finally, the iteration index $j$ is incremented by one and the iteration is repeated.

It must be emphasized that this active-set method is not well defined unless the equations (3.9) have a solution at every $(x_j, w_j)$.
3.2.2. Solution of the bound-constrained subproblem
In this section we apply the active-set method to a QP of the form
\[
  \min_{v = (x, y)} \; g_M^T (v - \bar v) + \tfrac{1}{2} (v - \bar v)^T H_M (v - \bar v) \quad \text{subject to} \quad x \ge 0, \tag{3.10}
\]
where $\bar v = (\bar x, \bar y)$, and
\[
  g_M = \begin{pmatrix} g - J^T\bigl(\pi + \nu(\pi - \bar y)\bigr) \\ \nu\bigl(c + \mu(\bar y - y^E)\bigr) \end{pmatrix},
  \qquad
  H_M = \begin{pmatrix} H + \tfrac{1}{\mu}(1+\nu) J^T J & \nu J^T \\ \nu J & \nu\mu I \end{pmatrix},
\]
with $H + \tfrac{1}{\mu} J^T J$ positive definite, $\nu \ge 0$, and $\pi = y^E - c/\mu$ (see (2.3)). The matrix $H_M$ is positive semidefinite under the given assumptions. This follows from the identity
\[
  L^T H_M L = \begin{pmatrix} H + \tfrac{1}{\mu} J^T J & 0 \\ 0 & \nu\mu I_m \end{pmatrix},
  \quad \text{where} \quad
  L = \begin{pmatrix} I_n & 0 \\ -\tfrac{1}{\mu} J & I_m \end{pmatrix}.
\]
The matrix $L$ is nonsingular, and Sylvester's law of inertia gives
\[
  \mathrm{In}(H_M) = \mathrm{In}(L^T H_M L) = \mathrm{In}\Bigl( H + \tfrac{1}{\mu} J^T J \Bigr) + (m, 0, 0) = (n + m, 0, 0) \quad \text{for } \nu > 0,
\]
and
\[
  \mathrm{In}(H_M) = \mathrm{In}\Bigl( H + \tfrac{1}{\mu} J^T J \Bigr) + (0, 0, m) = (n, 0, m) \quad \text{for } \nu = 0.
\]
It follows that problem (3.10) is a convex QP, and we may apply the active-set method of Section 3.2.1.
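The inertia identity is easy to verify numerically; the following sketch (ours) builds $H_M$ from random data satisfying the positive-definiteness assumption and counts eigenvalue signs.

```python
import numpy as np

def inertia(A, tol=1e-10):
    """Return (n_+, n_-, n_0): numbers of positive, negative, zero eigenvalues."""
    w = np.linalg.eigvalsh(A)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)), int(np.sum(np.abs(w) <= tol)))

rng = np.random.default_rng(0)
n, m, mu, nu = 5, 3, 0.1, 0.5
J = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))
H = B @ B.T + np.eye(n)                 # H positive definite, so the assumption
                                        # on H + (1/mu) J^T J certainly holds
HM = np.block([[H + (1 + nu) / mu * J.T @ J, nu * J.T],
               [nu * J,                      nu * mu * np.eye(m)]])
print(inertia(HM))                      # (n + m, 0, 0) since nu > 0
```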
Given the $j$th QP iterate $v_j = (x_j, y_j)$, the generic active-set method applied to (3.10) defines the next iterate as $v_{j+1} = v_j + \alpha_j \Delta v_j$, where the free components of the vector $\Delta v_j = (p_j, q_j)$ satisfy the equations
\[
  [\, H_M \,]_F \Delta v_F = -[\, g_M + H_M(v_j - \bar v) \,]_F, \tag{3.11}
\]
where $\Delta v_F = (p_F, q_j)$ and the index set $\mathcal{F}_0(x_j)$ is defined as in (1.3). The equations (3.11) appear to be ill-conditioned for small $\mu$ because of the $O(1/\mu)$ term in the (1,1) block of the matrix $H_M$. However, this ill-conditioning is superficial. The next result shows that $\Delta v_F$ may be determined by solving an equivalent nonsingular primal-dual system whose conditioning depends on that of the original problem.
Theorem 3.2. Consider the application of the active-set method to the QP (3.10). Then, for every $\nu \ge 0$, there exists a positive $\bar\mu$ such that, for all $0 < \mu < \bar\mu$, the free components of the QP search direction $(p_j, q_j)$ satisfy the nonsingular primal-dual system
\[
  \begin{pmatrix} H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g + H(x_j - \bar x) - J^T y_j \,]_F \\ c + \mu(y_j - y^E) + J(x_j - \bar x) \end{pmatrix}. \tag{3.12}
\]
Proof. First, we consider the definition of the search direction when $\nu > 0$. In this case it suffices to show that the linear systems (3.11) and (3.12) are equivalent. For any positive $\nu$, we may define the matrix
\[
  T = \begin{pmatrix} I & -\dfrac{1+\nu}{\nu\mu} J_F^T \\[8pt] 0 & \dfrac{1}{\nu} I_m \end{pmatrix},
\]
where the identity matrix $I$ has dimension $n_F$, the column dimension of $J_F$. The matrix $T$ is nonsingular with $n_F + m$ rows and columns. It follows that the equations
\[
  T [\, H_M \,]_F \Delta v_F = -T [\, g_M + H_M(v_j - \bar v) \,]_F
\]
have the same solution as those of (3.11). The primal-dual equations (3.12) follow by direct multiplication. The nonsingularity of the equations (3.12) follows from the nonsingularity of $T$, and the fact that $[\, H_M \,]_F$ (and every symmetric submatrix formed from the rows and columns of $H_M$) is nonsingular.

The resulting equations (3.12) are independent of $\nu$, but the simple proof above is not applicable when $\nu = 0$ because $T$ is undefined in this case. For $\nu = 0$, the QP objective includes only the primal variables $x$, which implies that problem (3.10) may be written as
\[
  \min_{x \ge 0} \; \bigl( g - J^T \pi \bigr)^T (x - \bar x) + \tfrac{1}{2} (x - \bar x)^T \Bigl( H + \tfrac{1}{\mu} J^T J \Bigr) (x - \bar x),
\]
with $y$ arbitrary. The active-set equations analogous to (3.11) are then
\[
  \Bigl( H_F + \tfrac{1}{\mu} J_F^T J_F \Bigr) p_F = - \Bigl[ \, g + \Bigl( H + \tfrac{1}{\mu} J^T J \Bigr)(x_j - \bar x) - J^T \pi \, \Bigr]_F. \tag{3.13}
\]
For any choice of $y_j$, define the $m$-vector $q_j$ such that
\[
  q_j = -\frac{1}{\mu} \bigl( J_F p_F + \mu(y_j - \pi) + J(x_j - \bar x) \bigr), \tag{3.14}
\]
where $\pi = y^E - c/\mu$ (see (2.3)). Equations (3.13) and (3.14) may be combined to give the equations $K \Delta v_F = r$, where $\Delta v_F = (p_F, q_j)$,
\[
  K = \begin{pmatrix} H_F + \tfrac{2}{\mu} J_F^T J_F & J_F^T \\ J_F & \mu I \end{pmatrix}
\]
and right-hand side
\[
  r = - \begin{pmatrix} [\, g + H(x_j - \bar x) \,]_F + \tfrac{2}{\mu} J_F^T J(x_j - \bar x) - J_F^T y_j + 2 J_F^T(y_j - \pi) \\ \mu(y_j - \pi) + J(x_j - \bar x) \end{pmatrix}.
\]
Forming the equations $T K \Delta v_F = T r$, where $T$ is the nonsingular matrix
\[
  T = \begin{pmatrix} I & -\tfrac{2}{\mu} J_F^T \\ 0 & I_m \end{pmatrix},
\]
gives the equivalent system
\[
  \begin{pmatrix} H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g + H(x_j - \bar x) - J^T y_j \,]_F \\ c + \mu(y_j - y^E) + J(x_j - \bar x) \end{pmatrix},
\]
which is identical to the system (3.12).
Theorem 3.3. Let $(p_k, q_k)$ be the solution of the QP subproblem (2.7). If $p_F$ denotes the vector of components of $p_k$ with indices in $\mathcal{F}_0(x_k + p_k)$, then $(p_F, q_k)$ satisfies the equations
\[
  \begin{pmatrix} \bar H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_k \end{pmatrix}
  = - \begin{pmatrix} [\, g_k - J_k^T y_k - \bar H_k s \,]_F \\ c_k + \mu(y_k - y^E) - J_k s \end{pmatrix},
\]
where $F$ is defined in terms of the set $\mathcal{F}_0(x_k + p_k)$ and $s$ is a nonnegative vector such that
\[
  s_i = \begin{cases} [\, x_k \,]_i & \text{if } i \in \mathcal{A}_0(x_k + p_k); \\ 0 & \text{if } i \in \mathcal{F}_0(x_k + p_k). \end{cases}
\]
Proof. The proof is analogous to that of Theorem 3.2.
3.2.3. Solution of the stabilized SQP subproblem
In this section we show that, under certain conditions, the conventional active-set method applied to the stabilized SQP subproblem (3.2) and the bound-constrained QP (3.1) will generate identical iterates.

Consider the application of the “generic” active-set method of Section 3.2.1 to the stabilized QP:
\[
  \begin{array}{ll}
  \displaystyle\min_{x, y} & g^T(x - \bar x) + \tfrac{1}{2}(x - \bar x)^T H (x - \bar x) + \tfrac{1}{2} \mu \|y\|^2 \\[4pt]
  \text{subject to} & c + J(x - \bar x) + \mu(y - y^E) = 0, \quad x \ge 0.
  \end{array} \tag{3.15}
\]
In terms of the data “$(x, \bar x, H, g, A, c)$” for the generic QP (3.8), we have variables “$x$” $= (x, y)$, with “$\bar x$” $= (\bar x, \bar y)$,
\[
  \text{“}H\text{”} = \begin{pmatrix} H & 0 \\ 0 & \mu I \end{pmatrix}, \quad
  \text{“}g\text{”} = \begin{pmatrix} g \\ \mu \bar y \end{pmatrix}, \quad
  \text{“}A\text{”} = \begin{pmatrix} J & \mu I \end{pmatrix}, \quad \text{and} \quad
  \text{“}c\text{”} = c + \mu(\bar y - y^E).
\]
(The discussion of the properties of the stabilized QP relative to the generic form (3.8) is not affected by the nonnegativity constraints being applied to only a subset of the variables in (3.15).) After some simplification, the equations analogous to (3.9) may be written as
\[
  \begin{pmatrix} H_F & 0 & J_F^T \\ 0 & \mu I & \mu I \\ J_F & \mu I & 0 \end{pmatrix}
  \begin{pmatrix} p_F \\ \bar p_F \\ -q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g + H(x_j - \bar x) - J^T w_j \,]_F \\ \mu y_j - \mu w_j \\ c + \mu(y_j - y^E) + J(x_j - \bar x) \end{pmatrix}, \tag{3.16}
\]
where $p_F$ and $\bar p_F$ denote the free components of the search directions for the $x$ and $y$ variables, respectively. (Observe that the right-hand side of (3.16) is independent of $\bar y$.) The second block of equations gives $\bar p_F = q_j - y_j + w_j$, which implies that
\[
  y_{j+1} = y_j + \bar p_F = y_j + q_j - y_j + w_j = w_j + q_j = w_{j+1},
\]
so that the primal $y$-variables and dual variables of the stabilized QP are identical.
Similarly, substituting for $\bar p_F$ in the third block of equations in (3.16), and using the primal-dual equivalence $w_j = y_j$, gives
\[
  \begin{pmatrix} H_F & J_F^T \\ J_F & -\mu I \end{pmatrix}
  \begin{pmatrix} p_F \\ -q_j \end{pmatrix}
  = - \begin{pmatrix} [\, g + H(x_j - \bar x) - J^T y_j \,]_F \\ c + \mu(y_j - y^E) + J(x_j - \bar x) \end{pmatrix}, \tag{3.17}
\]
which is identical to the system associated with the QP subproblem (3.10).
The preceding discussion constitutes a proof of the following result.
Theorem 3.4. Consider the application of the active-set method to the bound-constrained QP (3.10) and the stabilized QP (3.15) defined with the same quantities $c$, $g$, $J$ and $H$. Consider any $x_0$ and $y_0$ such that $(x_0, y_0)$ is feasible for the stabilized QP (3.15). Then, for every $\nu \ge 0$, there exists a positive $\bar\mu$ such that, for all $0 < \mu < \bar\mu$, the active-set method generates identical primal-dual iterates $\{(x_j, y_j)\}_{j \ge 0}$.
4. Convergence
The convergence of Algorithm 2.1 is discussed under the following assumptions.

Assumption 4.1. Each $\bar H(x_k, y_k)$ is chosen so that the sequence $\{\bar H(x_k, y_k)\}_{k \ge 0}$ is bounded, with $\{\bar H(x_k, y_k) + (1/\mu^R_k) J(x_k)^T J(x_k)\}_{k \ge 0}$ uniformly positive definite.

Assumption 4.2. The functions $f$ and $c$ are twice continuously differentiable.

Assumption 4.3. The sequence $\{x_k\}_{k \ge 0}$ is contained in a compact set.

In the “worst” case, i.e., when all iterates are eventually M-iterates or F-iterates, Algorithm 2.1 emulates a primal-dual augmented Lagrangian method [9,10,43]. Consequently, it is possible that $y^E_k$ and $\mu^R_k$ will remain fixed over a sequence of iterations, although this is rare in practice. Nonetheless, our convergence result must consider this situation, which we now investigate.
Theorem 4.1. Let Assumptions 4.1–4.3 hold. If there exists an integer $\hat k$ such that $\mu^R_k \equiv \mu^R > 0$ and $k$ is an F-iterate for all $k \ge \hat k$, then the following hold:

(i) the solutions $\{\Delta v_k\}_{k \ge \hat k}$ of subproblem (2.9) are bounded above;

(ii) the solutions $\{\Delta v_k\}_{k \ge \hat k}$ of subproblem (2.9) are bounded away from zero; and

(iii) there exists a constant $\epsilon > 0$ such that
\[
  \nabla M^{\nu}(v_k \,; y^E_k, \mu^R_k)^T \Delta v_k \le -\epsilon \quad \text{for all } k \ge \hat k.
\]

Proof. The assumptions of this theorem guarantee that
\[
  \tau_k \equiv \tau > 0, \quad \mu^R_k \equiv \mu^R, \quad \text{and} \quad y^E_k \equiv y^E \quad \text{for all } k \ge \hat k. \tag{4.1}
\]
We first prove part (i). As in the proof of Theorem 3.1, we know that the solution of (2.9) satisfies
\[
  \Delta v_k = \begin{pmatrix} 0 \\ \pi_k - y_k \end{pmatrix} + M_k w,
  \quad \text{where} \quad
  M_k = \begin{pmatrix} \mu^R I \\ -J_k \end{pmatrix}
\]
and $w$ is the unique solution of
\[
  \min_{w \in \mathbb{R}^n} \; \frac{\mu^R}{2} w^T \Bigl( \bar H_k + \frac{1}{\mu^R} J_k^T J_k \Bigr) w + w^T \bigl( g_k - J_k^T \pi_k \bigr)
  \quad \text{subject to} \quad x_k + \mu^R w \ge 0,
\]
for all $k \ge \hat k$. It follows from Assumption 4.1 that $\{\Delta v_k\}_{k \ge \hat k}$ is uniformly bounded provided that the quantities $g_k - J_k^T \pi_k$, $M_k$, $\pi_k$, and $y_k$ are all uniformly bounded for $k \ge \hat k$. The boundedness of $g_k - J_k^T \pi_k$, $\pi_k$ and $M_k$ follows from Assumption 4.2, Assumption 4.3, (4.1), and (2.3). Thus, it remains to prove that $\{y_k\}_{k \ge \hat k}$ is bounded. To this end, we first note that since $\mu^R_k = \mu^R$ for all $k \ge \hat k$, the update to $\mu_k$ given by (2.20) implies that $\mu_k \equiv \mu \ge \mu^R$ for some $\mu$ and all $k$ sufficiently large. From this point onwards the primal-dual merit function is monotonically decreasing, i.e., $M^{\nu}(x_{k+1}, y_{k+1} \,; y^E\!, \mu) \le M^{\nu}(x_k, y_k \,; y^E\!, \mu)$. Thus $\{y_k\}_{k \ge \hat k}$ must be bounded since, if there existed a subsequence along which $\|y_k\|$ converged to infinity, then along that same subsequence $M^{\nu}$ would also converge to infinity, because both $\{ f_k - c_k^T y^E + \frac{1}{2\mu}\|c_k\|^2 \}_{k \ge \hat k}$ and $\{c_k\}_{k \ge \hat k}$ are bounded as a consequence of Assumptions 4.2 and 4.3. This completes the proof of part (i).
Part (ii) is established by showing that $\{\|\Delta v_k\|\}_{k \ge \hat k}$ is bounded away from zero. If this were not the case, there would exist a subsequence $\mathcal{S}_1 \subseteq \{k : k \ge \hat k\}$ such that $\lim_{k \in \mathcal{S}_1} \Delta v_k = 0$. It follows that the solution $\Delta v_k$ of problem (2.9) satisfies
\[
  \begin{pmatrix} z_k \\ 0 \end{pmatrix} = H^{\nu}_M(v_k \,; \mu^R) \Delta v_k + \nabla M^{\nu}(v_k \,; y^E\!, \mu^R)
  \quad \text{and} \quad
  0 = \min(x_k + p_k, z_k)
\]
for all $k \in \mathcal{S}_1$. We may then conclude from the definition of $H^{\nu}_M$, Assumptions 4.1–4.3, and (4.1) that, for $k \in \mathcal{S}_1$ sufficiently large, the iterate $v_k$ would satisfy condition (2.17) and be an M-iterate, so that $\mu^R_k$ would be decreased. This contradicts the assumption that $\mu^R_k \equiv \mu^R$ for all $k \ge \hat k$. It follows that $\{\|\Delta v_k\|\}_{k \ge \hat k}$ is bounded away from zero and part (ii) holds.

The proof of part (iii) is also by contradiction. Assume that there exists a subsequence $\mathcal{S}_2$ of $\{k : k \ge \hat k\}$ such that
\[
  \lim_{k \in \mathcal{S}_2} \nabla M^{\nu}(v_k \,; y^E\!, \mu^R)^T \Delta v_k = 0, \tag{4.2}
\]
where we have used (4.1). Using the matrix
\[
  L_k = \begin{pmatrix} I & 0 \\ -\tfrac{1}{\mu^R} J_k & I \end{pmatrix},
\]
the fact that $\Delta v = 0$ is feasible for the convex problem (2.9), that $\Delta v_k$ is the solution of problem (2.9) for the $\nu > 0$ chosen in Algorithm 2.1, (4.1), and Assumption 4.1, it
follows that
\[
  \begin{aligned}
  -\nabla M^{\nu}(v_k \,; y^E\!, \mu^R)^T \Delta v_k
  &\ge \tfrac{1}{2} \Delta v_k^T H^{\nu}_M(v_k \,; \mu^R) \Delta v_k \\
  &= \tfrac{1}{2} \bigl( L_k^{-1} \Delta v_k \bigr)^T \bigl( L_k^T H^{\nu}_M(v_k \,; \mu^R) L_k \bigr) \bigl( L_k^{-1} \Delta v_k \bigr) \\
  &= \tfrac{1}{2} \begin{pmatrix} p_k \\ q_k + \tfrac{1}{\mu^R} J_k p_k \end{pmatrix}^{\!T}
     \begin{pmatrix} \bar H_k + \tfrac{1}{\mu^R} J_k^T J_k & 0 \\ 0 & \nu\mu^R I \end{pmatrix}
     \begin{pmatrix} p_k \\ q_k + \tfrac{1}{\mu^R} J_k p_k \end{pmatrix} \\
  &\ge \lambda_{\min} \|p_k\|^2 + \tfrac{1}{2} \nu\mu^R \bigl\| q_k + \tfrac{1}{\mu^R} J_k p_k \bigr\|^2
  \end{aligned}
\]
for some $\lambda_{\min} > 0$. Combining this with (4.2), we deduce that
\[
  \lim_{k \in \mathcal{S}_2} p_k = \lim_{k \in \mathcal{S}_2} \Bigl( q_k + \tfrac{1}{\mu^R} J_k p_k \Bigr) = 0,
\]
from which $\lim_{k \in \mathcal{S}_2} q_k = 0$ follows from Assumptions 4.2 and 4.3. Thus $\lim_{k \in \mathcal{S}_2} \Delta v_k = 0$, which contradicts part (ii). It follows that part (iii) must hold.
We may now state our convergence result for Algorithm 2.1.
Theorem 4.2. Let Assumptions 4.1–4.3 hold. If $v_k$ denotes the $k$th iterate generated by Algorithm 2.1, then either

(i) Algorithm 2.1 terminates with an approximate primal-dual first-order solution $v_K$ satisfying
\[
  \| r_{\rm opt}(v_K) \| \le \tau_{\rm stop},
\]
where $r_{\rm opt}$ is defined by (2.19); or

(ii) there exists a subsequence $\mathcal{S}$ such that $\lim_{k \in \mathcal{S}} \mu^R_k = 0$, $\{y^E_k\}_{k \in \mathcal{S}}$ is bounded, $\lim_{k \in \mathcal{S}} \tau_k = 0$, and for each $k \in \mathcal{S}$ the vector $v_{k+1}$ is an approximate minimizer of the primal-dual augmented Lagrangian function (2.1) satisfying (2.17).
Proof. There are two cases to consider.

Case 1: a subsequence of $\{\|r_{\rm opt}(v_k)\|\}_{k \ge 0}$ converges to zero. In this case it is clear from the definition of S-iterates, M-iterates, $\phi_S$, and $\phi_L$, and the fact that $\tau_{\rm stop} > 0$, that part (i) will be satisfied for some $K$ sufficiently large.

Case 2: the sequence $\{\|r_{\rm opt}(v_k)\|\}_{k \ge 0}$ is bounded away from zero. Using the definitions of S-iterates, M-iterates, and the functions $\phi_S$ and $\phi_L$, we may conclude that the number of S-iterates and L-iterates must be finite. We now claim that there must be an infinite number of M-iterates. To prove this, we assume to the contrary that the number of M-iterates is finite, so that all iterates are F-iterates for $k$ sufficiently large. It then follows from the update to $\mu^R_k$ given by (2.18) and the assumption of this case that $\mu^R_k$ is eventually never decreased any further, and that
the update to $\mu_k$ given by (2.20) then implies that $\mu_k$ is also eventually fixed. Gathering these facts gives the existence of an integer $\hat k$ such that
\[
  \mu^R_k \equiv \mu^R \le \mu \equiv \mu_k, \quad y^E_k \equiv y^E\!, \quad \tau_k \equiv \tau > 0, \quad \text{and $k$ is an F-iterate for all } k \ge \hat k.
\]
It then follows from (2.20) that
\[
  M^{\nu}(v_{k+1} \,; y^E\!, \mu) \le M^{\nu}(v_k \,; y^E\!, \mu) + \min(\alpha_{\min}, \alpha_k) \, \eta_S N_k \quad \text{for all } k \ge \hat k, \tag{4.3}
\]
where $N_k$ is defined by (2.13). Moreover, parts (ii) and (iii) of Theorem 4.1 ensure that $\{N_k\}_{k \ge \hat k}$ is a negative sequence bounded away from zero. We also claim that $\{\alpha_k\}_{k \ge \hat k}$ is bounded away from zero. To see this, we first note that parts (i) and (iii) of Theorem 4.1 and Assumption 4.2 would ensure that $\{\alpha_k\}_{k \ge \hat k}$ is bounded away from zero if a standard Armijo line search were used, i.e., if $\mu^F_k = \mu^R$ and $N_k = \Delta v_k^T \nabla M^{\nu}(v_k \,; y^E\!, \mu^R)$ in (2.12). However, the $\alpha_k$ that we actually compute can be no smaller, since the actual definition of $N_k$ is less restrictive and we use a flexible line search that makes step acceptance more likely. Combining these facts with (4.3), we conclude that
\[
  M^{\nu}(v_{k+1} \,; y^E\!, \mu) \le M^{\nu}(v_k \,; y^E\!, \mu) - \kappa \quad \text{for all } k \ge \hat k \text{ and some } \kappa > 0,
\]
so that
\[
  \lim_{k \to \infty} M^{\nu}(v_k \,; y^E\!, \mu) = -\infty.
\]
However, Assumptions 4.2 and 4.3 ensure that this is not possible. A contradiction has been reached, so there exist infinitely many M-iterates, and all iterates are M-iterates or F-iterates for $k$ sufficiently large. Part (ii) now follows from (2.18) and the properties of the updates to $\tau_k$ and $y^E_k$ used for M-iterates and F-iterates in Algorithm 2.1.
The “ideal” scenario is that Algorithm 2.1 generates many S-iterates and L-iterates that rapidly converge to an approximate solution of (NP); this corresponds to part (i) of Theorem 4.2. Part (ii) of Theorem 4.2, i.e., generating infinitely many M-iterates, is the fall-back position of Algorithm 2.1. We believe this result is the best that can be expected, since we have not assumed any constraint qualification. In fact, the assumptions we have made do not preclude the possibility that problem (NP) is infeasible. Also, it has recently been proved [14,15,37] that iterates generated from the stabilized SQP subproblem exhibit superlinear convergence under rather mild conditions; in particular, strict complementarity is not assumed and no constraint qualification is required.
5. Conclusions
In this paper we developed and analyzed an SQP method for solving general non-
linear optimization problems. The algorithm is based on the natural pairing of a
generalized primal-dual augmented Lagrangian function with a flexible line search.
... Hence, employing a proper local search method can improve evolution efficiency. The sequential quadratic programming (SQP) method was proposed by Wilson [28] due to its outstanding local search capability, and it is widely used in real-parameter optimization problems [29][30][31][32][33]. The SQP method can be used to seek the local minimum satisfying the constraints. ...
... The SQP method can be used to seek the local minimum satisfying the constraints. Theories related to the SQP method can be found in [28,[34][35][36]. ...
Article
Full-text available
(a) Please download code from:【https://github.com/microhard1999/CODES】 Over the last decade, particle swarm optimization has become increasingly sophisticated because well-balanced exploration and exploitation mechanisms have been proposed. The sequential quadratic programming method, which is widely used for real-parameter optimization problems, demonstrates its outstanding local search capability. In this study, two mechanisms are proposed and integrated into particle swarm optimization for single-objective numerical optimization. A novel ratio adaptation scheme is utilized for calculating the proportion of subpopulations and intermittently invoking the sequential quadratic programming for local search start from the best particle to seek a better solution. The novel particle swarm optimization variant was validated on CEC2013, CEC2014, and CEC2017 benchmark functions. The experimental results demonstrate impressive performance compared with the state-of-the-art particle swarm optimization-based algorithms. Furthermore, the results also illustrate the effectiveness of the two mechanisms when cooperating to achieve significant improvement.
... according to SQP [43], where ( ) is the objective function of (8) and * T is the transpose operation. Then, (7) can be rewritten as ...
Article
Full-text available
Balancing adaptability, reliability, and accuracy in vision technology has always been a major bottleneck limiting its application in appearance assurance for complex objects in high-end equipment production. Data-driven deep learning shows robustness to feature diversity but is limited by interpretability and accuracy. The traditional vision scheme is reliable and can achieve high accuracy, but its adaptability is insufficient. The deeper reason is the lack of appropriate architecture and integration strategies between the learning paradigm and empirical design. To this end, a learnable viewpoint evolution algorithm for high-accuracy pose estimation of complex assembled products under free view is proposed. To alleviate the balance problem of exploration and optimization in estimation, shape-constrained virtual–real matching, evolvable feasible region, and specialized population migration and reproduction strategies are designed. Furthermore, a learnable evolution control mechanism is proposed, which integrates a guided model based on experience and is cyclic-trained with automatically generated effective trajectories to improve the evolution process. Compared to the m of the state-of-the-art data-driven method and the m of the classic strategy combination, the pose estimation error of complex assembled product in this study is m, which proves the effectiveness of the proposed method. Meanwhile, through in-depth exploration, the robustness, parameter sensitivity, and adaptability to the virtual–real appearance variations are sequentially verified.
... The cost function is not convex with respect to $c_r$, so we adopted a sequential quadratic programming (SQP) algorithm to solve it [35]. After solving (11), we obtain an intermediate optimal value of $c_r$ and use it for the next step. ...
Article
Full-text available
This study introduces a novel communication topology, RadRCom, integrating radar and relay-assisted communication systems for a single-antenna configuration as a proof of concept. While simultaneous radar and communication operation within the same spectrum gains momentum, our work advances this concept by incorporating relay assistance, which is particularly crucial in applications like vehicle-to-anything (V2X) communication. The inclusion of relays significantly enhances communication system performance but raises challenges such as interference management between the radar, relay, and communication nodes. This topology presents three design challenges: the optimal radar waveform, the relay parameters, and the communication system parameters. The key bottlenecks are the interference from the radar to the relay and communication receiver and, conversely, the interference from the communication transmitter and relay node to the radar. The work therefore addresses these challenges simultaneously while meeting the quality-of-service requirements. Our proposed RadRCom system optimizes the radar waveform and relay parameters to improve the signal-to-interference-plus-noise ratio (SINR) at the radar and the mean square error (MSE) of data transmission. We introduce two frameworks for parameter design, a radar-centric one and a relay-centric one, and take a sub-optimal iterative approach to manage the computational complexity. Numerical simulations are performed to evaluate the performance of the proposed RadRCom system with the proposed algorithms.
... This kind of problem formulation can be solved with sequential quadratic programming. This method has been first proposed by Wilson to solve constrained nonlinear optimization problems [40]. Afterwards, this method has gained interest in solving similar problems to overcome the difficulty of complex mathematical formulation [41]. ...
Article
Full-text available
Electricity demand in residential areas is generally met by the local low-voltage grid or, alternatively, the national grid, which produces electricity using thermal power stations based on conventional sources. These generators, unable to cope with climatic constraints, are holding back the transition to a green planet. In the residential context, the use of alternative sources offers a way to ensure a smooth transition to an ecological green city. These alternatives must be renewable and naturally available on the planet, with generation that is highly responsive to the constraints of the 21st century. However, such sources are intermittent, which calls for a hybrid solution known as a Hybrid Renewable Energy System (HRES). To this end, we have designed a hybrid system based on PV, wind turbine, and grid-supported battery storage and an electric vehicle connected to a residential building. We propose an energy management system based on nonlinear programming, with the optimization solved using sequential quadratic programming. The data were then processed using a long short-term memory (LSTM) model to predict, with the contribution and cooperation of each source, how to meet the energy needs of each home. The prediction achieved an accuracy of around 95%. These prediction results were fed into K-nearest neighbors (KNN), random forest (RF), and gradient boost (GRU) regressors to predict the storage collaboration rates handled by the local battery and the electric vehicle, yielding R² scores of 0.6953, 0.8381, and 0.739, respectively. This combination permitted an efficient prediction of the potential consumption from the grid, with an R² score of around 0.9834 using LSTM. This methodology makes it possible to know in advance the amount of energy from each source, the storage, and the excess grid injection, and to propose switching control of the hybrid architecture.
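As a toy illustration of one energy-management step solved this way, the sketch below balances hypothetical PV, wind, battery, and grid powers with SciPy's SLSQP. The numbers, variable names, and quadratic grid-cost model are invented for illustration and are not the authors' formulation.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical one-step dispatch data (kW); not from the paper.
    pv, wind, load = 2.0, 1.0, 4.5
    p_bat_max = 1.5                           # battery charge/discharge limit (kW)

    def grid_cost(p):
        p_bat = p[0]                          # battery power: + discharge, - charge
        p_grid = load - pv - wind - p_bat     # grid import closes the power balance
        return p_grid**2                      # illustrative quadratic grid penalty

    res = minimize(grid_cost, x0=np.array([0.0]), method="SLSQP",
                   bounds=[(-p_bat_max, p_bat_max)])
    print(res.x[0])   # battery setpoint minimizing reliance on the grid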
... Since $\mu[\hat{G}_i(u, \mu_X, d)]$ and $\sigma[\hat{G}_i(u, \mu_X, d)]$ $(i = 1, 2, \ldots, n_g)$ are analytical functions of the design parameter vector $[\mu_X, d]$, the above expression represents a traditional deterministic optimization problem. Conventional optimization algorithms, such as the genetic algorithm [47] and sequential quadratic programming [48], can be used to solve this problem. In this work, sequential quadratic programming is employed to perform the design optimization. ...
Article
Robust design optimization (RDO) is a valuable technique in the design of engineering structures, as it can provide an optimum design that is relatively insensitive to input uncertainties. However, the nested double-loop estimation process required in RDO often results in significant computational costs. To address this issue, we propose an adaptive decoupled RDO method based on the Kriging surrogate model. This method transforms the nested double-loop estimation process into a traditional deterministic optimization procedure, thus reducing computational costs. Furthermore, a novel estimation expression for the performance standard deviation is established that simultaneously reflects the uncertainties in both the prediction and the performance mean. Closed-form expressions for the performance mean and performance standard deviation under different design parameters are deduced and then applied to the uncertainty propagation during the design optimization. Moreover, an adaptive framework is introduced to improve the computational accuracy of the uncertainty propagation as well as the optimization procedure, guaranteeing the estimation accuracy for RDO problems. Several numerical examples along with engineering cases illustrate the effectiveness of the adaptive decoupled RDO method, and the results demonstrate that the proposed method can effectively optimize the design of structures while reducing computational costs.
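A minimal sketch of the decoupled idea in a generic setting: a Gaussian-process (Kriging) surrogate supplies a predicted mean and standard deviation of the performance, and a single deterministic solve then minimizes a mean-plus-k-sigma objective. The test function, sample plan, and the k-sigma robustness measure are illustrative assumptions, not the paper's estimator.

    import numpy as np
    from scipy.optimize import minimize
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    # Fit a Kriging surrogate to a few samples of a stand-in performance
    # function g; both g and the sample plan are placeholders.
    def g(x):
        return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 0]**2

    X = np.linspace(-2.0, 2.0, 12).reshape(-1, 1)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, g(X))

    def robust_objective(x, k=3.0):
        # Decoupled deterministic objective: predicted mean plus k standard
        # deviations, steering the design away from uncertain regions.
        mu, sigma = gp.predict(np.atleast_2d(x), return_std=True)
        return float(mu[0] + k * sigma[0])

    res = minimize(robust_objective, x0=np.array([0.5]), method="SLSQP",
                   bounds=[(-2.0, 2.0)])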
Chapter
Global energy demand has reached new peaks, and even in 2023 nearly 90% of the world's energy was still produced from fossil fuels. Considering the consequences of using such conventional energy sources, including CO2 and other greenhouse gas emissions, and questioning the sustainability of the energy transition, there is a dire need to move to cleaner fuels. To supplement conventional energy sources, many renewable sources such as biogas, biofuels, hydrogen, and solar energy have been researched. The success of biogas production is primarily due to the affordability of the available feedstock, the ease of access to biofuels, low production costs, and the applications of biogas, which include heating, electricity, fuel, refrigeration, and power generation. The criticalities and challenges discussed in the study include the large gap between biotechnology research and development and commercialization, and an analysis of the future of biogas in the circular economy. Many lignocellulosic sources, such as manure and fruit and vegetable wastes, can be used to generate biowaste, and anaerobic digestion can be applied on a local or large scale. In this study, the MATLAB/Simulink environment is used to carry out a range of models and simulations that take into account speculative objectives and potential energy futures (calculating the number of functioning plants in 2030, 2050, etc.), including input-output models, the Anaerobic Digestion Model 1 (ADM1), and other models. The predictions and conclusions discussed suggest that the outcomes of such simulations can drive significant changes in economic systems, as the use of biogas and biofuel would lead to the recovery of large amounts of fossil phosphorus, to the tune of 100–150 billion euros, further demonstrating the effectiveness and applications of biogas and biofuels. Graphical Abstract: an overview of the various models and simulations for biogas production from biowaste with the help of MATLAB/Simulink.
Conference Paper
Full-text available
A framework for solving a class of nonlinear programming problems via the filter method is presented. The proposed technique first solves a sequence of quadratic programming subproblems via a line search strategy; to induce global convergence, trial points are accepted provided there is a sufficient decrease in the objective function or in the constraint violation function. In the event that the step size reaches a minimum threshold and the trial iterate is rejected by the filter, the algorithm temporarily switches to a trust-region-based algorithm to generate iterates that approach the feasible region and are also acceptable to the filter. Computational results on selected large-scale CUTE problems with the prototype code filLS are very encouraging, and numerical comparisons with LOQO and SNOPT show that the algorithm is efficient and reliable.
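The filter acceptance test at the heart of such a method is easy to state in code. The sketch below implements the standard dominance check with a small envelope margin; the margin rule and its constant are illustrative, since implementations differ in these details.

    def acceptable_to_filter(f_new, h_new, filter_pairs, gamma=1e-5):
        # A trial point with objective value f_new and constraint violation
        # h_new is acceptable if no stored pair (f_i, h_i) dominates it,
        # up to a small envelope proportional to h_i.
        return all(f_new <= f_i - gamma * h_i or h_new <= (1.0 - gamma) * h_i
                   for f_i, h_i in filter_pairs)

    # Example: the trial point improves the violation of every entry.
    print(acceptable_to_filter(2.9, 0.4, [(3.2, 0.5), (2.8, 1.0)]))   # True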
Article
Full-text available
We describe the results of a series of tests for a class of new methods of trust region type for solving the simple bound constrained minimization problem. The results are encouraging and lead us to believe that the methods will prove useful in solving large-scale problems.
Chapter
This volume comprises a set of research papers that together will provide an up-to-date survey of the current state of the art in numerical analysis. The contributions are based on talks given at a conference in honour of Jim Wilkinson, one of the foremost pioneers in numerical analysis. The contributors were all his colleagues and collaborators and are leading figures in their respective fields. The breadth of Jim Wilkinson's research is reflected in the main themes covered: linear algebra, error analysis and computer arithmetic, algorithms, and mathematical software. Particular topics covered include analysis of the Lanczos algorithm, determining the nearest defective matrix to a given one, QR-factorizations, error propagation models, parameter estimation problems, sparse systems, and shape-preserving splines. As a whole the volume reflects the current vitality of numerical analysis and will prove an invaluable reference for all numerical analysts.
Chapter
In this paper, we introduce a simple new set of techniques for deriving symmetric and positive definite secant updates. We use these techniques to present a simple new derivation of the BFGS update using neither matrix inverses nor weighting matrices. A related derivation is shown to generate a large class of symmetric rank-two update formulas, together with the condition for each to preserve positive definiteness. We apply our techniques to generate a new projected BFGS update, and indicate applications to the efficient implementation of secant algorithms via the Cholesky factorization.
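For reference, the symmetric rank-two update at the center of this kind of derivation is the standard BFGS formula

\[
  B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k},
  \qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k),
\]

which preserves positive definiteness of $B_{k+1}$ exactly when the curvature condition $y_k^T s_k > 0$ holds.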
Article
A proof, based on the duality theorem of linear programming, is given for a duality theorem for a class of quadratic programs. An illustrative application is made in the theory of elastic structures.
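In the convex case, the duality in question pairs the following quadratic programs (stated here in one common form; the original notation differs slightly):

\[
  \text{(P)}\quad \min_x\ \tfrac12 x^T Q x + c^T x \quad \text{subject to} \quad Ax \ge b,
\]
\[
  \text{(D)}\quad \max_{x,\,y}\ -\tfrac12 x^T Q x + b^T y \quad \text{subject to} \quad Qx - A^T y + c = 0,\ \ y \ge 0,
\]

with equal optimal values when $Q$ is symmetric positive semidefinite and the primal problem is solvable.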
Article
Many algorithms for the solution of quadratic programming problems generate a sequence of simpler auxiliary problems whose solutions approximate the solution of the given problem. When these auxiliary problems are solved iteratively, which may be advantageous for large problems, it is necessary to define the precision of their solutions so that the whole procedure is effective. In this paper, we review our recent results on the implementation of algorithms with precision control that exploits the norm of the violation of the Karush-Kuhn-Tucker conditions.
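One way to realize such precision control, sketched below for a bound-constrained QP solved by gradient projection, is to stop the inner iteration once the norm of the projected gradient (a natural KKT-violation measure for this problem class) falls below a fraction of the outer solver's current error. The stopping constant and the inner solver are illustrative assumptions, not the authors' exact algorithm.

    import numpy as np

    def solve_qp_inexact(H, g, x0, outer_viol, c_prec=0.1, max_iter=500):
        # Gradient projection for  min 0.5 x'Hx + g'x  subject to  x >= 0,
        # stopped when the KKT violation is small relative to the outer error.
        x = np.maximum(x0, 0.0)
        step = 1.0 / np.linalg.norm(H, 2)      # safe steplength for convex H
        for _ in range(max_iter):
            grad = H @ x + g
            # Projected gradient = KKT-violation measure for this bound-QP.
            kkt = np.linalg.norm(x - np.maximum(x - grad, 0.0))
            if kkt <= c_prec * outer_viol:
                break
            x = np.maximum(x - step * grad, 0.0)
        return x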
Article
We present numerically reliable methods for the calculation of a search direction for use in sequential methods for solving nonlinear programming problems. The methods presented are easy to adapt to such problems as locating directions of negative curvature and linear infinite descent. Encouraging numerical results are included.
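As a pointer to what such a computation involves, the sketch below forms the search direction from the dense KKT system of an equality-constrained QP subproblem. This is only a schematic: a production code would instead use the numerically stable factorizations the paper is concerned with, and would handle negative curvature explicitly.

    import numpy as np

    def sqp_search_direction(H, J, g, c):
        # Solve the KKT system  [H  J'; J  0] [p; y] = [-g; -c]  for the
        # primal step p and multiplier estimate y of the QP subproblem
        #   minimize  g'p + 0.5 p'Hp   subject to  Jp + c = 0.
        n, m = H.shape[0], J.shape[0]
        K = np.block([[H, J.T], [J, np.zeros((m, m))]])
        rhs = np.concatenate([-g, -c])
        sol = np.linalg.solve(K, rhs)
        return sol[:n], sol[n:]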