
An active-set algorithmic framework for non-convex optimization problems over the simplex

Authors: Andrea Cristofari, Marianna De Santis, Stefano Lucidi, Francesco Rinaldi

Abstract

In this paper, we describe a new active-set algorithmic framework for minimizing a non-convex function over the unit simplex. At each iteration, the method makes use of a rule for identifying active variables (i.e., variables that are zero at a stationary point) and specific directions (that we name active-set gradient related directions) satisfying a new “nonorthogonality” type of condition. We prove global convergence to stationary points when using an Armijo line search in the given framework. We further describe three different examples of active-set gradient related directions that guarantee linear convergence rate (under suitable assumptions). Finally, we report numerical experiments showing the effectiveness of the approach.
Computational Optimization and Applications (2020) 77:57–89
https://doi.org/10.1007/s10589-020-00195-x
AndreaCristofari1 · MariannaDeSantis2· StefanoLucidi2· FrancescoRinaldi1
Received: 16 February 2019 / Published online: 16 May 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Keywords: Active-set methods · Unit simplex · Non-convex optimization · Large-scale optimization
Mathematics Subject Classification: 65K05 · 90C06 · 90C30
Corresponding author: Andrea Cristofari, andrea.cristofari@unipd.it
Marianna De Santis, mdesantis@diag.uniroma1.it
Stefano Lucidi, lucidi@diag.uniroma1.it
Francesco Rinaldi, rinaldi@math.unipd.it
1 Dipartimento di Matematica “Tullio Levi-Civita”, Università di Padova, Padua, Italy
2 Dipartimento di Ingegneria Informatica, Automatica e Gestionale, Sapienza Università di Roma, Rome, Italy
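To make the framework described in the abstract more concrete, here is a minimal Python sketch of an iteration of this kind over the unit simplex: estimate the variables that look active (zero), set them to zero, and take an Armijo step along a gradient-related feasible direction. This is only an illustration under simplifying assumptions, not the authors' algorithm: the multiplier-based active-set test, the use of a plain projected-gradient direction, and all constants are choices made here for the sketch.

import numpy as np

def proj_simplex(v):
    # Euclidean projection onto the unit simplex {x >= 0, sum(x) = 1} (sort-based method).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    j = np.arange(1, v.size + 1)
    rho = np.nonzero(u - (css - 1.0) / j > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def guess_active(x, g, eps=1e-6):
    # Illustrative multiplier-based guess of the variables that are zero at a stationary
    # point: g @ x estimates the multiplier of the equality constraint. The exact rule
    # and constants used in the paper may differ.
    lam = g @ x
    return x <= eps * (g - lam)

def armijo(f, x, d, slope, gamma=1e-4, delta=0.5):
    # Backtracking Armijo line search along a feasible descent direction d (slope = g @ d < 0).
    alpha, fx = 1.0, f(x)
    while f(x + alpha * d) > fx + gamma * alpha * slope:
        alpha *= delta
    return alpha

def active_set_sketch(f, grad, x0, max_iter=500, tol=1e-8):
    x = proj_simplex(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        g = grad(x)
        x[guess_active(x, g)] = 0.0      # zero the estimated-active variables ...
        x = proj_simplex(x)              # ... and restore feasibility
        g = grad(x)
        d = proj_simplex(x - g) - x      # a simple gradient-related feasible direction
        slope = g @ d
        if abs(slope) < tol:             # (near-)stationary point
            break
        x = x + armijo(f, x, d, slope) * d
    return x

if __name__ == "__main__":
    # Toy non-convex instance: minimize x' Q x over the simplex with Q symmetric indefinite.
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((20, 20)); Q = (Q + Q.T) / 2
    f = lambda x: x @ Q @ x
    grad = lambda x: 2 * Q @ x
    x = active_set_sketch(f, grad, np.full(20, 1 / 20))
    print("f(x) =", f(x), " support size =", int((x > 1e-8).sum()))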
... This would indeed guarantee a significant speed-up of the optimization process. A number of active-set strategies for structured feasible sets are available in the literature (see, e.g., [3,4,7,9,10,13,18,19,22–24,28] and references therein), but none of those directly handles the ℓ1-ball. ...
... In this paper, inspired by the work carried out in [10], we propose a tailored active-set strategy for problem (1) and embed it into a first-order projection-based algorithm. At each iteration, the method first sets to zero the variables that are guessed to be zero at the final solution. ...
... (i) For any feasible point x of problem (1), by (7) we can compute a feasible point y of problem (6) such that … (ii) According to (8), for every feasible point x of problem (1) we have that … Thus, it is natural to estimate a variable x_i as active at x* if both y_i and y_{n+i} are estimated to be zero at the point corresponding to x* in the y space. To estimate the zero variables among y_1, …, y_{2n+1} we use the active-set estimate described in [10], specifically devised for minimization problems over the unit simplex. (iii) Then, we are able to go back to the original x space and obtain an active-set estimate for problem (1) without explicitly considering the variables y_1, …, y_{2n+1} of the reformulated problem. ...
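The excerpts above refer to equations (6)–(8) of the citing paper, which are not shown in this preview. For orientation, the standard lifting behind this kind of reformulation (presumably what is being referenced; the exact statement, numbering and scaling are assumptions here) maps the ℓ1-ball of radius τ onto a scaled unit simplex in R^{2n+1} by splitting each variable into positive and negative parts and adding one slack variable:

x_i = y_i − y_{n+i} for i = 1, …, n,    y ≥ 0,    y_1 + … + y_{2n+1} = τ,

where the slack y_{2n+1} absorbs τ − ‖x‖_1. Under this lifting, x_i can only be nonzero if y_i or y_{n+i} is nonzero, which is exactly why point (ii) above estimates x_i as active when both y_i and y_{n+i} are estimated to be zero.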
Article
Full-text available
The ℓ1-ball is a nicely structured feasible set that is widely used in many fields (e.g., machine learning, statistics and signal analysis) to enforce some sparsity in the model solutions. In this paper, we devise an active-set strategy for efficiently dealing with minimization problems over the ℓ1-ball and embed it into a tailored algorithmic scheme that makes use of a non-monotone first-order approach to explore the given subspace at each iteration. We prove global convergence to stationary points. Finally, we report numerical experiments, on two different classes of instances, showing the effectiveness of the algorithm.
... In the literature, much effort has been devoted to proving identification properties of some algorithms for smooth optimization [3,5,6,7,8,9,10,11,19,21,24,46,48], non-smooth optimization [16,23,25,29,31,35,41,42,47,49], stochastic optimization [18,28,45] and derivative-free optimization [30]. Moreover, a wide class of methods, known as active-set methods, has been the object of extensive study for decades (see, e.g., [4,13,14,17,20,22] and the references therein), making use of specific techniques to identify the so-called active set, which is the set of constraints or variables that parametrizes a surface containing a solution. ...
... where τ ∈ (0, 1] is the parameter used to choose j(k), satisfying (14). Then, for all k ≥ k_j we have that j(k) ∉ Z(x*). ...
... contradicting (14). ...
Preprint
In this paper, finite active-set identification is established for an almost cyclic 2-coordinate descent method applied to problems with one linear coupling constraint and simple bounds. First, general active-set identification results are stated for non-convex objective functions. Then, under strong convexity, complexity results are given on the number of iterations required to identify the active set. In our analysis, a simple Armijo line search is used to compute the stepsize, thus not requiring exact minimizations or additional information.
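As a rough illustration of the kind of method analyzed in this preprint, the sketch below performs one 2-coordinate move for a problem with a single coupling constraint sum(x) = const and bounds l <= x <= u: the two selected coordinates are changed by opposite amounts so the constraint stays satisfied, and the stepsize comes from a simple Armijo backtracking. The index selection, the Armijo constants and the helper names are assumptions for illustration, not the paper's exact almost-cyclic rule.

def two_coordinate_step(f, g, x, i, j, l, u, gamma=1e-4, delta=0.5):
    # One 2-coordinate move preserving the coupling constraint sum(x) = const:
    # x_i is increased and x_j is decreased by the same amount t, so the sum is unchanged.
    # g is the gradient of f at x; x, l, u are NumPy-style arrays supporting .copy().
    slope = g[i] - g[j]                      # directional derivative along e_i - e_j
    t_max = min(u[i] - x[i], x[j] - l[j])    # largest step keeping the bound constraints
    if slope >= 0 or t_max <= 0:
        return x                             # no feasible descent along this pair
    t, fx = t_max, f(x)
    while t > 1e-16:                         # Armijo backtracking on the stepsize t
        x_new = x.copy()
        x_new[i] += t
        x_new[j] -= t
        if f(x_new) <= fx + gamma * t * slope:
            return x_new
        t *= delta
    return x

Choosing j among the coordinates with a large partial derivative and room to decrease, and i among those with a small partial derivative and room to increase, yields a descent pair whenever the current point is not stationary.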
... This guarantee is indeed looser than for the other variants, because there is no satisfactory bound on the number of such problematic steps (the best known bound is 3N! bad steps for each good step); • it eliminates the dependence of the convergence rates on the support of the starting point (see, e.g., [40] and [10]). This dependence can significantly affect the performance of FW variants on smooth non-convex optimization problems [41]. ...
... Finally, while beyond the scope of this paper, we mention that bad steps lead to a slow active set identification for the AFW, when compared to the "one shot" identification property characterizing proximal gradient methods and active set strategies (see [41,42] and references therein). More precisely, analyses in recent works ( [20,43] and [44]) show that a number of bad steps equal to the number of "wrong" atoms is performed by the method in a sufficiently small neighborhood of a solution to identify its support. ...
Article
Full-text available
The study of Frank-Wolfe (FW) variants is often complicated by the presence of different kinds of “good” and “bad” steps. In this article, we aim to simplify the convergence analysis of specific variants by getting rid of such a distinction between steps, and to improve existing rates by ensuring a non-trivial bound at each iteration. In order to do this, we define the Short Step Chain (SSC) procedure, which skips gradient computations in consecutive short steps until proper conditions are satisfied. This algorithmic tool allows us to give a unified analysis and convergence rates in the general smooth non-convex setting, as well as a linear convergence rate under a Kurdyka-Łojasiewicz (KL) property. While the KL setting has been widely studied for proximal gradient type methods, to our knowledge, it has never been analyzed before for the Frank-Wolfe variants considered in the paper. An angle condition, ensuring that the directions selected by the methods have the steepest slope possible up to a constant, is used to carry out our analysis. We prove that such a condition is satisfied, when considering minimization problems over a polytope, by the away step Frank-Wolfe (AFW), the pairwise Frank-Wolfe (PFW), and the Frank-Wolfe method with in-face directions (FDFW).
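To make the away-step machinery mentioned in the abstract concrete, the fragment below computes, over the unit simplex, the classical Frank-Wolfe direction and the away direction, and picks the one with the steeper slope, as in AFW. This is a textbook-style sketch under the stated assumptions, not the SSC procedure itself.

import numpy as np

def afw_direction(x, g, tol=1e-12):
    # Away-step Frank-Wolfe direction selection on the unit simplex, whose vertices are
    # the coordinate vectors e_i. Returns a direction and the largest feasible stepsize.
    s = int(np.argmin(g))                        # FW vertex: smallest gradient entry
    support = np.nonzero(x > tol)[0]
    v = int(support[np.argmax(g[support])])      # away vertex: worst vertex in the support
    d_fw = -x.copy(); d_fw[s] += 1.0             # d_FW = e_s - x
    d_aw = x.copy();  d_aw[v] -= 1.0             # d_A  = x - e_v
    if g @ d_fw <= g @ d_aw:                     # pick the steeper (more negative) slope
        return d_fw, 1.0
    alpha_max = x[v] / (1.0 - x[v]) if x[v] < 1.0 else np.inf
    return d_aw, alpha_max

The pairwise direction would instead be d = e_s − e_v with maximum stepsize x_v, i.e. mass is moved directly from the away vertex to the FW vertex.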
... This follows from (10), (11) and (12). In this section we propose a decomposition algorithm, named Active-Set Zero-Sum-Lasso (AS-ZSL), to efficiently solve problem (1). ...
... In the field of constrained optimization, several active-set techniques were proposed to identify the active (or binding) constraints, see, e.g., [4,6,10,11,13,16,17,22,23,34,36]. Active-set strategies have also been successfully used to identify the zero variables in ℓ1-regularized problems [14,26,38,43,44] and in ℓ1-constrained problems [12]. ...
Preprint
In this paper, we consider lasso problems with a zero-sum constraint, commonly required for the analysis of compositional data in high-dimensional spaces. A novel algorithm is proposed to solve these problems, combining a tailored active-set technique, to identify the zero variables in the optimal solution, with a 2-coordinate descent scheme. At every iteration, the algorithm chooses between two different strategies: the first one requires computing the whole gradient of the smooth term of the objective function and is more accurate in the active-set estimate, while the second one only uses partial derivatives and is computationally more efficient. Global convergence to optimal solutions is proved and numerical results are provided on synthetic and real datasets, showing the effectiveness of the proposed method. The software is publicly available.
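For reference, a common textbook formulation of this problem class (the exact objective and notation used in the paper are not shown in this preview, so take this as an assumed standard form) is

minimize over β:  (1/2) ‖y − Xβ‖² + λ ‖β‖_1    subject to    β_1 + … + β_p = 0,

where the zero-sum constraint makes the fitted log-contrast model invariant to a constant shift of the log-transformed compositional covariates. A 2-coordinate move β_i ← β_i + t, β_j ← β_j − t keeps the constraint satisfied, which is why a 2-coordinate descent scheme pairs naturally with it.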
... Convergence and finite time identification for the PFW and the AFW are proved in Bomze et al. (2019) for a specific class of non-convex minimization problems over the standard simplex, under the additional assumption that the sequence generated has a finite set of limit points. In another line of work, active set identification strategies combined with FW variants have been proposed in Cristofari et al. (2020) and Sun (2020). ...
Article
Full-text available
Invented some 65 years ago in a seminal paper by Marguerite Straus-Frank and Philip Wolfe, the Frank–Wolfe method has recently enjoyed a remarkable revival, fuelled by the need for fast and reliable first-order optimization methods in Data Science and other relevant application areas. This review tries to explain the success of this approach by illustrating its versatility and applicability in a wide range of contexts, combined with an account of recent progress in variants improving on both the speed and efficiency of this surprisingly simple principle of first-order optimization.
Article
Full-text available
We study splitting methods for solving the Eigenvalue Complementarity Problem (EiCP). We introduce four variants, which depend on whether the matrices included in the definition of the EiCP are symmetric or nonsymmetric and positive definite, negative definite, or indefinite. Convergence analyses for each of these versions of the splitting method are discussed. Special choices for the splitting matrices associated with these versions are recommended and tested on the solution of small and large symmetric and nonsymmetric EiCPs. These experiments show that the four versions of the splitting method work well at least for some choices of the splitting matrices. Furthermore, these versions of the splitting method seem to be competitive with the most efficient state-of-the-art algorithms for the solution of EiCP.
Article
Full-text available
We propose a gradient-based method for quadratic programming problems with a single linear constraint and bounds on the variables. Inspired by the GPCG algorithm for bound-constrained convex quadratic programming [J.J. Moré and G. Toraldo, SIAM J. Optim. 1, 1991], our approach alternates between two phases until convergence: an identification phase, which performs gradient projection iterations until either a candidate active set is identified or no reasonable progress is made, and an unconstrained minimization phase, which reduces the objective function in a suitable space defined by the identification phase, by applying either the conjugate gradient method or a recently proposed spectral gradient method. However, the algorithm differs from GPCG not only because it deals with a more general class of problems, but mainly in the way it stops the minimization phase. This is based on a comparison between a measure of optimality in the reduced space and a measure of bindingness of the variables that are on the bounds, defined by extending the concept of proportioning, which was proposed by some authors for box-constrained problems. If the objective function is bounded, the algorithm converges to a stationary point thanks to a suitable application of the gradient projection method in the identification phase. For strictly convex problems, the algorithm converges to the optimal solution in a finite number of steps even in the case of degeneracy. Extensive numerical experiments show the effectiveness of the proposed approach.
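The key primitive in gradient projection iterations for this problem class is the projection onto the feasible set {x : a^T x = b, l <= x <= u}. One standard way to compute it, shown below as an illustrative sketch rather than the scheme used in the paper, is bisection on the multiplier mu of the equality constraint, exploiting the fact that a @ clip(v + mu*a, l, u) is nondecreasing in mu.

import numpy as np

def project_single_constraint_box(v, a, b, l, u, tol=1e-10, max_iter=200):
    # Projection of v onto {x : a @ x = b, l <= x <= u} via bisection on the multiplier mu.
    # Assumes the feasible set is nonempty; the projection has the form clip(v + mu*a, l, u).
    phi = lambda mu: a @ np.clip(v + mu * a, l, u)
    mu_lo, mu_hi = -1.0, 1.0
    while phi(mu_lo) > b:            # widen the bracket until it contains the root
        mu_lo *= 2.0
    while phi(mu_hi) < b:
        mu_hi *= 2.0
    for _ in range(max_iter):        # plain bisection on the monotone function phi
        mu = 0.5 * (mu_lo + mu_hi)
        if phi(mu) < b:
            mu_lo = mu
        else:
            mu_hi = mu
        if mu_hi - mu_lo < tol:
            break
    return np.clip(v + 0.5 * (mu_lo + mu_hi) * a, l, u)

With this projection available, each gradient projection iteration of the identification phase is simply x ← project(x − α ∇f(x)).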
Article
Full-text available
In this paper, we describe a two-stage method for solving optimization problems with bound constraints. It combines the active-set estimate described in [Facchinei and Lucidi, 1995] with a modification of the nonmonotone line search framework recently proposed in [De Santis et al., 2012]. In the first stage, the algorithm exploits a property of the active-set estimate that ensures a significant reduction of the objective function when setting to the bounds all those variables estimated active. In the second stage, a truncated-Newton strategy is used in the subspace of the variables estimated non-active. In order to properly combine the two phases, a proximity check is included in the scheme. This new tool, together with the other theoretical features of the two stages, enables us to prove global convergence. Furthermore, under additional standard assumptions, we can show that the algorithm converges at a superlinear rate. We report results of numerical experiments on bound-constrained problems from the CUTEst collection, showing the efficiency of the proposed approach.
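The nonmonotone line search framework mentioned above accepts a step when the new objective value improves on the maximum of the last few values rather than on the current one. Below is a minimal Grippo-Lampariello-Lucidi-style sketch of that acceptance rule; the memory length, constants and function names are illustrative assumptions, not those of the cited framework.

def nonmonotone_armijo(f, x, d, slope, history, gamma=1e-4, delta=0.5, memory=10):
    # Backtracking line search with a nonmonotone (GLL-type) acceptance test: compare against
    # the max of the last `memory` objective values instead of f(x). Here d must be a descent
    # direction (slope = grad_f(x) @ d < 0) and x, d support the expression x + alpha * d.
    f_ref = max(history[-memory:])      # reference value for the acceptance test
    alpha = 1.0
    while f(x + alpha * d) > f_ref + gamma * alpha * slope:
        alpha *= delta
    x_new = x + alpha * d
    history.append(f(x_new))
    return x_new

The list history is seeded with f(x0); with memory = 1 the rule reduces to the classical monotone Armijo condition.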
Article
Full-text available
We propose a feasible active set method for convex quadratic programming problems with non-negativity constraints. This method is specifically designed to be embedded into a branch-and-bound algorithm for convex quadratic mixed integer programming problems. The branch-and-bound algorithm generalizes the approach for unconstrained convex quadratic integer programming proposed by Buchheim, Caprara and Lodi to the presence of linear constraints. The main feature of the latter approach consists in a sophisticated preprocessing phase, leading to a fast enumeration of the branch-and-bound nodes. Moreover, the feasible active set method takes advantage of this preprocessing phase and is well suited for reoptimization. Experimental results for randomly generated instances show that the new approach significantly outperforms the MIQP solver of CPLEX 12.6 for instances with a small number of constraints.
Article
Full-text available
The Frank-Wolfe (FW) optimization algorithm has lately regained popularity thanks in particular to its ability to nicely handle the structured constraints appearing in machine learning applications. However, its convergence rate is known to be slow (sublinear) when the solution lies at the boundary. A simple, less-known fix is to add the possibility of taking 'away steps' during optimization, an operation that importantly does not require a feasibility oracle. In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective. The constant in the convergence rate has an elegant interpretation as the product of the (classical) condition number of the function with a novel geometric quantity that plays the role of a 'condition number' of the constraint set. We provide pointers to where these algorithms have made a difference in practice, in particular with the flow polytope, the marginal polytope and the base polytope for submodular optimization.
Article
Full-text available
A new algorithm for large-scale nonlinear programs with box constraints is introduced. The algorithm is based on an efficient identification technique of the active set at the solution and on a nonmonotone stabilization technique. It possesses global and superlinear convergence properties under standard assumptions. A new technique for generating test problems with known characteristics is also introduced. The implementation of the method is described along with computational results for large-scale problems.
Article
An algorithm is developed for projecting a point onto a polyhedron. The algorithm solves a dual version of the projection problem and then uses the relationship between the primal and dual to recover the projection. The techniques in the paper exploit sparsity. Sparse reconstruction by separable approximation (SpaRSA) is used to approximately identify active constraints in the polyhedron, and the dual active set algorithm (DASA) is used to compute a high precision solution. A linear convergence result is established for SpaRSA that does not require the strong concavity of the dual to the projection problem, and an earlier R-linear convergence rate is strengthened to a Q-linear convergence property. An algorithmic framework is developed for combining SpaRSA with an asymptotically preferred algorithm such as DASA. It is shown that only the preferred algorithm is executed asymptotically. Numerical results are given using the polyhedra associated with the Netlib LP test set. A comparison is made to the interior point method contained in the general purpose open source software package IPOPT for nonlinear optimization, and to the commercial package CPLEX, which contains an implementation of the barrier method that is targeted to problems with the structure of the polyhedral projection problem.
Article
In this paper, we address the solution of the symmetric eigenvalue complementarity problem (EiCP) by treating an equivalent reformulation of finding a stationary point of a fractional quadratic program on the unit simplex. The spectral projected-gradient (SPG) method has been recommended for this optimization problem when the dimension of the symmetric EiCP is large and the accuracy of the solution is not a very important issue. We suggest a new algorithm which combines elements from the SPG method and the block active set method, where the latter was originally designed for box-constrained quadratic programs. In the new algorithm the projection onto the unit simplex in the SPG method is replaced by the much cheaper projection onto a box. This can be of particular advantage for large and sparse symmetric EiCPs. Global convergence to a solution of the symmetric EiCP is established. Computational experience with medium and large symmetric EiCPs is reported to illustrate the efficacy and efficiency of the new algorithm.
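The computational point made in the abstract is that projecting onto a box is a componentwise clip (O(n)), whereas projecting onto the unit simplex requires a sort-based procedure. The sketch below shows how simple a projected-gradient step with the box projection is, together with the Barzilai-Borwein ("spectral") steplength typically used by SPG methods; it is purely illustrative, and the block active-set bookkeeping that lets the actual method respect the simplex constraint is not shown.

import numpy as np

def bb_steplength(x, x_prev, g, g_prev, alpha_min=1e-10, alpha_max=1e10):
    # Barzilai-Borwein ("spectral") steplength used by SPG-type methods.
    s, y = x - x_prev, g - g_prev
    sy = s @ y
    return np.clip((s @ s) / sy, alpha_min, alpha_max) if sy > 0 else alpha_max

def spg_box_step(x, g, alpha, lo=0.0, hi=1.0):
    # One projected-gradient step with a box projection: a componentwise clip,
    # much cheaper than the sort-based projection onto the unit simplex.
    return np.clip(x - alpha * g, lo, hi)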
Article
A polyhedral active set algorithm PASA is developed for solving a nonlinear optimization problem whose feasible set is a polyhedron. Phase one of the algorithm is the gradient projection method, while phase two is any algorithm for solving a linearly constrained optimization problem. Rules are provided for branching between the two phases. Global convergence to a stationary point is established, while asymptotically PASA performs only phase two when either a nondegeneracy assumption holds, or the active constraints are linearly independent and a strong second-order sufficient optimality condition holds.