SIAM Journal on Optimization
Journal Prestige (SJR): 2.195 Citation Impact (citeScore): 3 Number of Followers: 12 Hybrid journal (It can contain Open Access articles) ISSN (Print) 10526234  ISSN (Online) 10957189 Published by Society for Industrial and Applied Mathematics [17 journals] 
 Finding Sparse Solutions for Packing and Covering Semidefinite Programs

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Khaled Elbassioni, Kazuhisa Makino, Waleed Najy
Pages: 321  353
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 321353, June 2022.
Packing and covering semidefinite programs (SDPs) appear in natural relaxations of many combinatorial optimization problems as well as a number of other applications. Recently, several techniques were proposed, which utilize the particular structure of this class of problems, to obtain more efficient algorithms than those offered by general SDP solvers. For certain applications, such as those described in this paper, it may be desirable to obtain sparse dual solutions, i.e., those with support size (almost) independent of the number of primal constraints. In this paper, we give an algorithm that finds such solutions, which is an extension of a logarithmicpotential based algorithm of Grigoriadis et al. [SIAM J. Optim. 11 (2001), pp. 10811091] for packing/covering linear programs.
Citation: SIAM Journal on Optimization
PubDate: 20220404T07:00:00Z
DOI: 10.1137/20M137570X
Issue No: Vol. 32, No. 2 (2022)

 Distributed Optimization Based on Gradient Tracking Revisited: Enhancing
Convergence Rate via Surrogation
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Ying Sun, Gesualdo Scutari, Amir Daneshmand
Pages: 354  385
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 354385, June 2022.
We study distributed multiagent optimization over graphs. We consider the minimization of $F+G$ subject to convex constraints, where $F$ is the smooth strongly convex sum of the agent's losses and $G$ is a nonsmooth convex function. We build on the SONATA algorithm: the algorithm employs the use of surrogate objective functions in the agents' subproblems (thus going beyond linearization, such as proximalgradient) coupled with a perturbed consensus mechanism that aims to locally track the gradient of $F$. SONATA achieves precision $\epsilon>0$ on the objective value in $\mathcal{O}(\kappa_g \log(1/\epsilon))$ gradient computations at each node and $\tilde{\mathcal{O}}\big(\kappa_g (1\rho)^{1/2} \log(1/\epsilon)\big)$ communication steps, where $\kappa_g$ is the condition number of $F$ and $\rho$ characterizes the connectivity of the network. This is the first linear rate result for distributed composite optimization; it also improves on existing (nonaccelerated) schemes just minimizing $F$, whose rate depends on much larger quantities than $\kappa_g$. When the loss functions of the agents are similar, due to statistical data similarity or otherwise, SONATA employing highorder surrogates achieves precision $\epsilon>0$ in $\mathcal{O}\big((\beta/\mu) \log(1/\epsilon)\big)$ iterations and $\tilde{\mathcal{O}}\big((\beta/\mu) (1\rho)^{1/2} \log(1/\epsilon)\big)$ communication steps, where $\beta$ measures the degree of similarity of agents' losses and $\mu$ is the strong convexity constant of $F$. Therefore, when $\beta/\mu < \kappa_g$, the use of highorder surrogates yields provably faster rates than those achievable by firstorder models; this is without exchanging any Hessian matrix over the network.
Citation: SIAM Journal on Optimization
PubDate: 20220405T07:00:00Z
DOI: 10.1137/19M1259973
Issue No: Vol. 32, No. 2 (2022)

 ZerothOrder Stochastic Compositional Algorithms for RiskAware Learning

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Dionysios S. Kalogerias, Warren B. Powell
Pages: 386  416
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 386416, June 2022.
We present ${\it FreeMESSAGE}^{,p}$, the first zerothorder algorithm for (weakly) convex meansemideviationbased riskaware learning, which is also the firstever threelevel zerothorder compositional stochastic optimization algorithm. Using a nontrivial extension of Nesterov's classical results on Gaussian smoothing, we develop the ${\it FreeMESSAGE}^{,p}$ algorithm from first principles and show that it essentially solves a smoothed surrogate to the original problem, the former being a uniform approximation of the latter, in a useful, convenient sense. We then present a complete analysis of the ${\it FreeMESSAGE}^{,p}$ algorithm, which establishes convergence in a usertunable neighborhood of the optimal solutions of the original problem for convex costs, as well as explicit convergence rates for convex, weakly convex, and strongly convex costs, in a unified way. Orderwise, and for fixed problem parameters, our results demonstrate no sacrifice in convergence speed as compared to existing firstorder methods, while striking a certain balance among the condition of the problem, its dimensionality, and the accuracy of the obtained results, naturally extending previous results in zerothorder riskneutral learning.
Citation: SIAM Journal on Optimization
PubDate: 20220405T07:00:00Z
DOI: 10.1137/20M1315403
Issue No: Vol. 32, No. 2 (2022)

 Bayesian Optimization with Expensive Integrands

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Saul ToscanoPalmerin, Peter I. Frazier
Pages: 417  444
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 417444, June 2022.
Nonconvex derivativefree timeconsuming objectives are often optimized using “blackbox” optimization. These approaches assume very little about the objective. While broadly applicable, they typically require more evaluations than methods exploiting more problem structure. Often, such timeconsuming objectives are actually the sum or integral of a larger number of functions, each of which consumes significant time when evaluated individually. This arises in designing aircraft, choosing parameters in ridesharing dispatch systems, and tuning hyperparameters in deep neural networks. We develop a novel Bayesian optimization algorithm that leverages this structure to improve performance. Our algorithm is averagecase optimal by construction when a single evaluation of the integrand remains within our evaluation budget. Achieving this onestep optimality requires solving a challenging value of information optimization problem, for which we provide a novel efficient discretizationfree computational method. We also prove consistency for our method in both continuum and discrete finite domains for objective functions that are sums. In numerical experiments comparing against previous stateoftheart methods, including those that also leverage sum or integral structure, our method performs as well or better across a wide range of problems and offers significant improvements when evaluations are noisy or the integrand varies smoothly in the integrated variables.
Citation: SIAM Journal on Optimization
PubDate: 20220418T07:00:00Z
DOI: 10.1137/19M1303125
Issue No: Vol. 32, No. 2 (2022)

 Tracking and Regret Bounds for Online ZerothOrder Euclidean and
Riemannian Optimization
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Alejandro I. Maass, Chris Manzie, Dragan Nešić, Jonathan H. Manton, Iman Shames
Pages: 445  469
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 445469, June 2022.
We study numerical optimization algorithms that use zerothorder information to minimize timevarying geodesically convex cost functions on Riemannian manifolds. In the Euclidean setting, zerothorder algorithms have received a lot of attention in both the timevarying and timeinvariant cases. However, the extension to Riemannian manifolds is much less developed. We focus on Hadamard manifolds, which are a special class of Riemannian manifolds with global nonpositive curvature that offer convenient grounds for the generalization of convexity notions. Specifically, we derive bounds on the expected instantaneous tracking error, and we provide algorithm parameter values that minimize the algorithm's performance. Our results illustrate how the manifold geometry in terms of the sectional curvature affects these bounds. Additionally, we provide dynamic regret bounds for this online optimization setting. To the best of our knowledge, these are the first regret bounds even for the Euclidean version of the problem. Lastly, via numerical simulations, we demonstrate the applicability of our algorithm on an online Karcher mean problem.
Citation: SIAM Journal on Optimization
PubDate: 20220421T07:00:00Z
DOI: 10.1137/21M1405551
Issue No: Vol. 32, No. 2 (2022)

 Hyperbolic Relaxation of $k$Locally Positive Semidefinite Matrices

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Grigoriy Blekherman, Santanu S. Dey, Kevin Shu, Shengding Sun
Pages: 470  490
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 470490, June 2022.
A successful computational approach for solving largescale positive semidefinite (PSD) programs is to enforce PSDness on only a collection of submatrices. For our study, we let $\mathcal{S}^{n,k}$ be the convex cone of $n\times n$ symmetric matrices where all $k\times k$ principal submatrices are PSD. We call a matrix in this $k$locally PSD. In order to compare $\mathcal{S}^{n,k}$ to the cone of PSD matrices, we study eigenvalues of $k$locally PSD matrices. The key insight in this paper is that there is a convex cone $H(e_k^n)$ so that if $X \in \mathcal{S}^{n,k}$, then the vector of eigenvalues of $X$ is contained in $H(e_k^n)$. The cone $H(e_k^n)$ is the hyperbolicity cone of the elementary symmetric polynomial $e_k^n$ (where $e_k^n(x) = \sum_{S \subseteq [n] : S = k} \prod_{i \in S} x_i$) with respect to the all ones vector. Using this insight, we are able to improve previously known upper bounds on the Frobenius distance between matrices in $\mathcal{S}^{n,k}$ and PSD matrices. We also study the quality of the convex relaxation $H(e^n_k)$. We first show that this relaxation is tight for the case of $k = n 1$, that is, for every vector in $H(e^n_{n 1})$ there exists a matrix in $\mathcal{S}^{n, n 1}$ whose eigenvalues are equal to the components of the vector. We then prove a structure theorem on nonsingular matrices in $\mathcal{S}^{n,k}$ all of whose $k\times k$ principal minors are zero, which we believe is of independent interest. This result shows shows that for $1< k < n 1$ “large parts” of the boundary of $H(e_k^n)$ do not intersect with the eigenvalues of matrices in $\mathcal{S}^{n,k}$.
Citation: SIAM Journal on Optimization
PubDate: 20220425T07:00:00Z
DOI: 10.1137/20M1387407
Issue No: Vol. 32, No. 2 (2022)

 Finite Convergence of SumofSquares Hierarchies for the Stability Number
of a Graph
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Monique Laurent, Luis Felipe Vargas
Pages: 491  518
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 491518, June 2022.
We investigate a hierarchy of semidefinite bounds $\vartheta^{(r)}(G)$ for the stability number $\alpha(G)$ of a graph $G$, based on its copositive programming formulation and introduced by de Klerk and Pasechnik [SIAM J. Optim., 12 (2002), pp. 875892], who conjectured convergence to $\alpha(G)$ in $r=\alpha(G)1$ steps. Even the weaker conjecture claiming finite convergence is still open. We establish links between this hierarchy and sumofsquares hierarchies based on the MotzkinStraus formulation of $\alpha(G)$, which we use to show finite convergence when $G$ is acritical, i.e., when $\alpha(G\setminus e)=\alpha(G)$ for all edges $e$ of $G$. This relies, in particular, on understanding the structure of the minimizers of the MotzkinStraus formulation and showing that their number is finite precisely when $G$ is acritical. Moreover we show that these results hold in the general setting of the weighted stable set problem for graphs equipped with positive node weights. In addition, as a byproduct we show that deciding whether a standard quadratic program has finitely many minimizers does not admit a polynomialtime algorithm unless P=NP.
Citation: SIAM Journal on Optimization
PubDate: 20220425T07:00:00Z
DOI: 10.1137/21M140345X
Issue No: Vol. 32, No. 2 (2022)

 Stochastic Multilevel Composition Optimization Algorithms with
LevelIndependent Convergence Rates
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Krishnakumar Balasubramanian, Saeed Ghadimi, Anthony Nguyen
Pages: 519  544
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 519544, June 2022.
In this paper, we study smooth stochastic multilevel composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients, through a stochastic firstorder oracle. For solving this class of problems, we propose two algorithms using movingaverage stochastic estimates, and analyze their convergence to an $\epsilon$stationary point of the problem. We show that the first algorithm, which is a generalization of [S. Ghadimi, A. Ruszczynski, and M. Wang, SIAM J. Optim., 30 (2020), pp. 960979] to the $T$ level case, can achieve a sample complexity of $\mathcal{O}_T(1/\epsilon^6)$ by using minibatches of samples in each iteration, where $\mathcal{O}_T$ hides constants that depend on $T$. By modifying this algorithm using linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}_T(1/\epsilon^4)$. This modification not only removes the requirement of having a minibatch of samples in each iteration, but also makes the algorithm parameterfree and easy to implement. To the best of our knowledge, this is the first time that such an online algorithm designed for the (un)constrained multilevel setting obtains the same sample complexity of the smooth singlelevel setting, under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic firstorder oracle.
Citation: SIAM Journal on Optimization
PubDate: 20220425T07:00:00Z
DOI: 10.1137/21M1406222
Issue No: Vol. 32, No. 2 (2022)

 A Subspace Acceleration Method for Minimization Involving a Group
SparsityInducing Regularizer
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Frank E. Curtis, Yutong Dai, Daniel P. Robinson
Pages: 545  572
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 545572, June 2022.
We consider the problem of minimizing an objective function that is the sum of a convex function and a group sparsityinducing regularizer. Problems that integrate such regularizers arise in modern machine learning applications, often for the purpose of obtaining models that are easier to interpret and that have higher predictive accuracy. We present a new method for solving such problems that utilizes subspace acceleration, domain decomposition, and support identification. Our analysis provides the global iteration complexity of obtaining an $\epsilon$accurate solution and shows that, under common assumptions, the iterates locally converge superlinearly. Numerical results on regularized logistic and linear regression problems show that our approach is efficient and reliable and outperforms stateoftheart methods on interesting classes of problems, especially when the number of data points is larger than the number of features. For solving problems when the number of data points is smaller than the number of features, algorithms that focus on solving a dual problem may be more efficient than our approach, which solves the primal problem.
Citation: SIAM Journal on Optimization
PubDate: 20220427T07:00:00Z
DOI: 10.1137/21M1411111
Issue No: Vol. 32, No. 2 (2022)

 Distributed Variable SampleSize GradientResponse and BestResponse
Schemes for Stochastic Nash Equilibrium Problems
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Jinlong Lei, Uday V. Shanbhag
Pages: 573  603
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 573603, June 2022.
This paper considers an $n$player stochastic Nash equilibrium problem (NEP) in which the $i$th player minimizes a composite objective $f_i( \bullet ,x_{i}) + r_i( \bullet )$, where $f_i$ is an expectationvalued smooth function, $x_{i}$ is a tuple of rival decisions, and $r_i$ is a nonsmooth convex function with an efficient proxevaluation. In this context, we make the following contributions. (I) Under suitable monotonicity assumptions on the \redpseudogradient map, we derive optimal rate statements and oracle complexity bounds for the proposed variable samplesize proximal stochastic gradientresponse (VSPGR) scheme when the samplesize increases at a geometric rate. If the samplesize increases at a polynomial rate with degree $v> 0$, the meansquared error of the iterates decays at a corresponding polynomial rate; in particular, we prove that the iteration and oracle complexities to obtain an $\epsilon$Nash equilibrium ($\epsilon$NE) are $\mathcal{O}(1/\epsilon^{1/v})$ and $\mathcal{O}(1/\epsilon^{1+1/v})$, respectively. \redWhen the samplesize is held constant, the iterates converge geometrically to a neighborhood of the Nash equilibrium in an expectedvalue sense. (II) We then overlay \bf VSPGR with a consensus phase with a view towards developing distributed protocols for aggregative stochastic NEPs. In the resulting \bf dVSPGR scheme, when the samplesize at each iteration grows at a geometric rate while the communication rounds per iteration grow at the rate of $ k+1 $, computing an $\epsilon$NE requires similar iteration and oracle complexities to \bf VSPGR with a communication complexity of $\mathcal{O}(\ln^2(1/\epsilon))$. Notably, (I) and (II) rely on weaker oracle assumptions in that the conditionally unbiasedness assumption is relaxed while the bound on the conditional second moment may be statedependent. (III) Under a suitable contractive property associated with the proximal bestresponse (BR) map, we design a variable samplesize proximal BR (VSPBR) scheme, where each player solves a sampleaverage BR problem. When the samplesize increases at a suitable geometric rate, the resulting iterates converge at a geometric rate while the iteration and oracle complexity are, respectively, $\mathcal{O}(\ln(1/\epsilon))$ and $\mathcal{O}(1/\epsilon)$. If the samplesize increases at a polynomial rate with degree $v$, the meansquared error decays at a corresponding polynomial rate while the iteration and oracle complexities are $\mathcal{O}(1/\epsilon^{1/v})$ and $\mathcal{O}(1/\epsilon^{1+1/v})$, respectively. (IV) Akin to (II), the distributed variant \bf dVSPBR achieves similar iteration and oracle complexities to the centralized VSPBR with a communication complexity of $\mathcal{O}(\ln^2(1/\epsilon))$ when the communication rounds per iteration increase at the rate of $ k+1 $. Finally, we present preliminary numerics to provide empirical support for the rate and complexity statements.
Citation: SIAM Journal on Optimization
PubDate: 20220427T07:00:00Z
DOI: 10.1137/20M1340071
Issue No: Vol. 32, No. 2 (2022)

 Generic Property of the Partial Calmness Condition for Bilevel Programming
Problems
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Rongzhu Ke, Wei Yao, Jane J. Ye, Jin Zhang
Pages: 604  634
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 604634, June 2022.
The partial calmness for the bilevel programming problem (BLPP) is an important condition which ensures that a local optimal solution of BLPP is a local optimal solution of a partially penalized problem where the lowerlevel optimality constraint is moved to the objective function and hence a weaker constraint qualification can be applied. In this paper, we propose a sufficient condition in the form of a partial error bound condition which guarantees the partial calmness condition. We analyze the partial calmness for the combined program based on the Bouligand (B) and the Fritz John (FJ) stationary conditions from a generic point of view. Our main result states that the partial error bound condition for the combined programs based on B and FJ conditions is generic for an important setting with applications in economics, and hence the partial calmness for the combined program is not a particularly stringent assumption. Moreover, we derive optimality conditions for the combined program for the generic case without any extra constraint qualifications and show the exact equivalence between our optimality condition and the one by Jongen and Shikhman [Math. Program., 136 (2012), pp. 6589] given in implicit form. Our arguments are based on Jongen, Jonker, and Twilt's [Math. Program., 34 (1986), pp. 333353] generic (five type) classification of the socalled generalized critical points for onedimensional parametric optimization problems and Jongen and Shikhman's generic local reductions of BLPPs.
Citation: SIAM Journal on Optimization
PubDate: 20220428T07:00:00Z
DOI: 10.1137/20M1371403
Issue No: Vol. 32, No. 2 (2022)

 Condition Number Minimization in Euclidean Jordan Algebras

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Alberto Seeger
Pages: 635  658
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 635658, June 2022.
Let $b$ be a nonnegative vector in a Euclidean Jordan algebra $\mathbb{E}$. Its condition number $\kappa(b)$ is assumed to be high, possibly infinite. The condition number of a nonnegative vector is defined as the ratio between the largest and the smallest eigenvalue. We wish to perturb $b$ in such a way as to diminish its condition number as much as possible. The problem at hand is that of minimizing $\kappa(b+x)$ with respect to a perturbation vector $x$ taken from a given subset of $\mathbb{E}$. In spite of being nonsmooth and nonconvex, such an optimization problem can be handled with the machinery of Euclidean Jordan algebras and, in particular, with the theory of spectral functions.
Citation: SIAM Journal on Optimization
PubDate: 20220428T07:00:00Z
DOI: 10.1137/21M1400705
Issue No: Vol. 32, No. 2 (2022)

 On Obtaining the Convex Hull of Quadratic Inequalities via Aggregations

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Santanu S. Dey, Gonzalo Mun͂oz, Felipe Serrano
Pages: 659  686
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 659686, June 2022.
A classical approach for obtaining valid inequalities for a set involves the analysis of relaxations constructed using aggregations of the inequalities that describe such a set. When the set is described by linear inequalities, thanks to the Farkas lemma, we know that every valid inequality can be obtained using aggregations. When the inequalities describing the set are two quadratics, Yildiran [IMA J. Math. Control Inform., 26 (2009), pp. 417450] showed that the convex hull of the set is given by at most two aggregated inequalities. In this work, we study the case of a set described by three or more quadratic inequalities. We show that, under technical assumptions, the convex hull of a set described by three quadratic inequalities can be obtained via (potentially infinitely many) aggregated inequalities. We also show, through counterexamples, that such as a result does not hold either if the technical conditions are relaxed or if we consider four or more inequalities.
Citation: SIAM Journal on Optimization
PubDate: 20220428T07:00:00Z
DOI: 10.1137/21M1428583
Issue No: Vol. 32, No. 2 (2022)

 ZerothOrder Regularized Optimization (ZORO): Approximately Sparse
Gradients and Adaptive Sampling
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: HanQin Cai, Daniel McKenzie, Wotao Yin, Zhenliang Zhang
Pages: 687  714
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 687714, June 2022.
We consider the problem of minimizing a highdimensional objective function, which may include a regularization term, using only (possibly noisy) evaluations of the function. Such optimization is also called derivativefree, zerothorder, or blackbox optimization. We propose a new zerothorder regularized optimization method, dubbed ZORO. When the underlying gradient is approximately sparse at an iterate, ZORO needs very few objective function evaluations to obtain a new iterate that decreases the objective function. We achieve this with an adaptive, randomized gradient estimator, followed by an inexact proximalgradient scheme. Under a novel approximately sparse gradient assumption and various different convex settings, we show that the (theoretical and empirical) convergence rate of ZORO is only logarithmically dependent on the problem dimension. Numerical experiments show that ZORO outperforms existing methods with similar assumptions, on both synthetic and real datasets.
Citation: SIAM Journal on Optimization
PubDate: 20220428T07:00:00Z
DOI: 10.1137/21M1392966
Issue No: Vol. 32, No. 2 (2022)

 Distributionally Robust SecondOrder Stochastic Dominance Constrained
Optimization with Wasserstein Ball
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Yu Mei, Jia Liu, Zhiping Chen
Pages: 715  738
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 715738, June 2022.
We consider a distributionally robust secondorder stochastic dominance constrained optimization problem. We require the dominance constraints to hold with respect to all probability distributions in a Wasserstein ball centered at the empirical distribution. We adopt the sample approximation approach to develop a linear programming formulation that provides a lower bound. We propose a novel splitanddual decomposition framework which provides an upper bound. We establish quantitative convergence for both lower and upper approximations given some constraint qualification conditions. To efficiently solve the nonconvex upper bound problem, we use a sequential convex approximation algorithm. Numerical evidence on a portfolio selection problem validates the convergence and effectiveness of the proposed two approximation methods.
Citation: SIAM Journal on Optimization
PubDate: 20220502T07:00:00Z
DOI: 10.1137/21M1394412
Issue No: Vol. 32, No. 2 (2022)

 ActiveSet Identification with Complexity Guarantees of an Almost Cyclic
2Coordinate Descent Method with Armijo Line Search
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Andrea Cristofari
Pages: 739  764
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 739764, June 2022.
This paper establishes finite activeset identification of an almost cyclic 2coordinate descent method for problems with one linear coupling constraint and simple bounds. First, general activeset identification results are stated for nonconvex objective functions. Then, under convexity and a quadratic growth condition (satisfied by any strongly convex function), complexity results on the number of iterations required to identify the active set are given. In our analysis, a simple Armijo line search is used to compute the stepsize, thus not requiring exact minimizations or additional information.
Citation: SIAM Journal on Optimization
PubDate: 20220502T07:00:00Z
DOI: 10.1137/20M1328014
Issue No: Vol. 32, No. 2 (2022)

 Symmetry Reduction in AM/GMBased Optimization

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Philippe Moustrou, Helen Naumann, Cordian Riener, Thorsten Theobald, Hugues Verdure
Pages: 765  785
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 765785, June 2022.
The arithmetic mean/geometric mean inequality (AM/GM inequality) facilitates classes of nonnegativity certificates and of relaxation techniques for polynomials and, more generally, for exponential sums. Here, we present a first systematic study of the AM/GMbased techniques in the presence of symmetries under the linear action of a finite group. We prove a symmetryadapted representation theorem and develop techniques to reduce the size of the resulting relative entropy programs. We study in more detail the complexity gain in the case of the symmetric group. In this setup, we can show in particular certain stabilization results. We exhibit several sequences of examples in growing dimensions where the size of the reduced problem stabilizes. Finally, we provide some numerical results, emphasizing the computational speedup.
Citation: SIAM Journal on Optimization
PubDate: 20220502T07:00:00Z
DOI: 10.1137/21M1405691
Issue No: Vol. 32, No. 2 (2022)

 On the Linear Convergence of the Multimarginal Sinkhorn Algorithm

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Guillaume Carlier
Pages: 786  794
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 786794, June 2022.
The aim of this note is to give an elementary proof of linear convergence of the Sinkhorn algorithm for the entropic regularization of multimarginal optimal transport in the setting of general probability spaces. The proof simply relies on (i) the fact that Sinkhorn iterates are bounded, (ii) the strong convexity of the exponential on bounded intervals, and (iii) the convergence analysis of the coordinate descent (GaussSeidel) method of Beck and Tetruashvili [SIAM J. Optim, 23 (2013), pp. 20372060].
Citation: SIAM Journal on Optimization
PubDate: 20220509T07:00:00Z
DOI: 10.1137/21M1410634
Issue No: Vol. 32, No. 2 (2022)

 Differentially Private Accelerated Optimization Algorithms

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Nurdan Kuru, Ş. İlker Birbil, Mert Gürbüzbalaban, Sinan Yildirim
Pages: 795  821
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 795821, June 2022.
We present two classes of differentially private optimization algorithms derived from the wellknown accelerated firstorder methods. The first algorithm is inspired by Polyak's heavy ball method and employs a smoothing approach to decrease the accumulated noise on the gradient steps required for differential privacy. The second class of algorithms are based on Nesterov's accelerated gradient method and its recent multistage variant. We propose a noise dividing mechanism for the iterations of Nesterov's method in order to improve the error behavior of the algorithm. The convergence rate analyses are provided for both the heavy ball and the Nesterov's accelerated gradient method with the help of the dynamical system analysis techniques. Finally, we conclude with our numerical experiments showing that the presented algorithms have advantages over the wellknown differentially private algorithms.
Citation: SIAM Journal on Optimization
PubDate: 20220509T07:00:00Z
DOI: 10.1137/20M1355847
Issue No: Vol. 32, No. 2 (2022)

 Sequential Quadratic Optimization for Nonlinear Optimization Problems on
Riemannian Manifolds
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Mitsuaki Obara, Takayuki Okuno, Akiko Takeda
Pages: 822  853
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 822853, June 2022.
We consider optimization problems on Riemannian manifolds with equality and inequality constraints, which we call Riemannian nonlinear optimization (RNLO) problems. Although they have numerous applications, the existing studies on them are limited especially in terms of algorithms. In this paper, we propose Riemannian sequential quadratic optimization (RSQO) that uses a linesearch technique with an $\ell_{1}$ penalty function as an extension of the standard SQO algorithm for constrained nonlinear optimization problems in Euclidean spaces to Riemannian manifolds. We prove its global convergence to a KarushKuhnTucker point of the RNLO problem by means of parallel transport and the exponential mapping. Furthermore, we establish its local quadratic convergence by analyzing the relationship between sequences generated by RSQO and the Riemannian Newton method. Ours is the first algorithm that has both global and local convergence properties for constrained nonlinear optimization on Riemannian manifolds. Empirical results show that RSQO finds solutions more stably and with higher accuracy compared with the existing Riemannian penalty and augmented Lagrangian methods.
Citation: SIAM Journal on Optimization
PubDate: 20220510T07:00:00Z
DOI: 10.1137/20M1370173
Issue No: Vol. 32, No. 2 (2022)

 Fenchel Duality and a Separation Theorem on Hadamard Manifolds

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Maurício Silva Louzeiro, Ronny Bergmann, Roland Herzog
Pages: 854  873
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 854873, June 2022.
In this paper, we introduce a definition of Fenchel conjugate and Fenchel biconjugate on Hadamard manifolds based on the tangent bundle. Our definition overcomes the inconvenience that the conjugate depends on the choice of a certain point on the manifold, as previous definitions required. On the other hand, this new definition still possesses properties known to hold in the Euclidean case. It even yields a broader interpretation of the Fenchel conjugate in the Euclidean case itself. Most prominently, our definition of the Fenchel conjugate provides a FenchelMoreau theorem for geodesically convex, proper, lower semicontinuous functions. In addition, this framework allows us to develop a theory of separation of convex sets on Hadamard manifolds, and a strict separation theorem is obtained.
Citation: SIAM Journal on Optimization
PubDate: 20220510T07:00:00Z
DOI: 10.1137/21M1400699
Issue No: Vol. 32, No. 2 (2022)

 Convexification with Bounded Gap for Randomly Projected Quadratic
Optimization
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Terunari Fuji, PierreLouis Poirion, Akiko Takeda
Pages: 874  899
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 874899, June 2022.
Random projection techniques based on the JohnsonLindenstrauss lemma are used for randomly aggregating the constraints or variables of optimization problems while approximately preserving their optimal values, which leads to smallerscale optimization problems. D'Ambrosio et al. [Math. Program., 183 (2020), pp. 619647] have applied random projection to a quadratic optimization problem so as to decrease the number of decision variables. Although the problem size becomes smaller, the projected problem will also almost surely be nonconvex if the original problem is nonconvex and hence will be hard to solve. In this paper, by focusing on the fact that the level of the nonconvexity of a nonconvex quadratic optimization problem can be alleviated by random projection, we find an approximate global optimal value of the problem by attributing it to a convex problem with smaller size. To the best of our knowledge, our paper is the first to use random projection for convexification of nonconvex optimization problems. We evaluate the approximation error between optimum values of a nonconvex optimization problem and its convexified randomly projected problem.
Citation: SIAM Journal on Optimization
PubDate: 20220512T07:00:00Z
DOI: 10.1137/21M1433678
Issue No: Vol. 32, No. 2 (2022)

 Fast Decentralized Nonconvex FiniteSum Optimization with Recursive
Variance Reduction
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Ran Xin, Usman A. Khan, Soummya Kar
Pages: 1  28
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 128, January 2022.
This paper considers decentralized minimization of $N:=nm$ smooth nonconvex cost functions equally divided over a directed network of $n$ nodes. Specifically, we describe a stochastic firstorder gradient method, called GTSARAH, that employs a SARAHtype variance reduction technique and gradient tracking (GT) to address the stochastic and decentralized nature of the problem. We show that GTSARAH, with appropriate algorithmic parameters, finds an $\epsilon$accurate firstorder stationary point with $\mathcal{O}\big(\max\big\{N^{{1}/{2}},n(1\lambda)^{2},n^{{2}/{3}}m^{{1}/{3}}(1\lambda)^{1}\big\}L\epsilon^{2}\big)$ gradient complexity, where ${(1\lambda)\in(0,1]}$ is the spectral gap of the network weight matrix and $L$ is the smoothness parameter of the cost functions. This gradient complexity outperforms that of the existing decentralized stochastic gradient methods. In particular, in a bigdata regime such that ${n = \mathcal{O}(N^{{1}/{2}}(1\lambda)^{3})}$, this gradient complexity furthers reduces to ${\mathcal{O}(N^{{1}/{2}}L\epsilon^{2})}$, independent of the network topology, and matches that of the centralized nearoptimal variancereduced methods. Moreover, in this regime GTSARAH achieves a nonasymptotic linear speedup in that the total number of gradient computations at each node is reduced by a factor of $1/n$ compared to the centralized nearoptimal algorithms that perform all gradient computations at a single node. To the best of our knowledge, GTSARAH is the first algorithm that achieves this property. In addition, we show that appropriate choices of local minibatch size balance the tradeoffs between the gradient and communication complexity of GTSARAH. Over infinite time horizon, we establish that all nodes in GTSARAH asymptotically achieve consensus and converge to a firstorder stationary point in the almost sure and meansquared sense.
Citation: SIAM Journal on Optimization
PubDate: 20220105T08:00:00Z
DOI: 10.1137/20M1361158
Issue No: Vol. 32, No. 1 (2022)

 A NoiseTolerant QuasiNewton Algorithm for Unconstrained Optimization

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: HaoJun M. Shi, Yuchen Xie, Richard Byrd, Jorge Nocedal
Pages: 29  55
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 2955, January 2022.
This paper describes an extension of the BFGS and LBFGS methods for the minimization of a nonlinear function subject to errors. This work is motivated by applications that contain computational noise, employ lowprecision arithmetic, or are subject to statistical noise. The classical BFGS and LBFGS methods can fail in such circumstances because the updating procedure can be corrupted and the line search can behave erratically. The proposed method addresses these difficulties and ensures that the BFGS update is stable by employing a lengthening procedure that spaces out the points at which gradient differences are collected. A new line search, designed to tolerate errors, guarantees that the ArmijoWolfe conditions are satisfied under most reasonable conditions, and works in conjunction with the lengthening procedure. The proposed methods are shown to enjoy convergence guarantees for strongly convex functions. Detailed implementations of the methods are presented, together with encouraging numerical results.
Citation: SIAM Journal on Optimization
PubDate: 20220105T08:00:00Z
DOI: 10.1137/20M1373190
Issue No: Vol. 32, No. 1 (2022)

 Openness, Hölder Metric Regularity, and Hölder Continuity Properties of
Semialgebraic SetValued Maps
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Jae Hyoung Lee, TienSon Pham
Pages: 56  74
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 5674, January 2022.
Given a semialgebraic setvalued map $F \colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ with closed graph, we show that the map $F$ is Hölder metrically subregular and that the following conditions are equivalent: (i) $F$ is an open map from its domain into its range, and the range of $F$ is locally closed; (ii) the map $F$ is Hölder metrically regular; (iii) the inverse map $F^{1}$ is pseudoHölder continuous; (iv) the inverse map $F^{1}$ is lower pseudoHölder continuous. An application, via Robinson's normal map formulation, leads to the following result in the context of semialgebraic variational inequalities: if the solution map (as a map of the parameter vector) is lower semicontinuous, then the solution map is finite and pseudoHölder continuous. In particular, we obtain a negative answer to a question mentioned in the paper of Dontchev and Rockafellar SIAM J. Optim., 4 (1996), pp. 10871105. As a byproduct, we show that for a (not necessarily semialgebraic) continuous singlevalued map from $\mathbb{R}^n$ to $\mathbb{R},$ the openness and the nonextremality are equivalent. This fact improves the main result of Pühl J. Math. Anal. Appl., 227 (1998), pp. 382395, which requires the convexity of the map in question.
Citation: SIAM Journal on Optimization
PubDate: 20220105T08:00:00Z
DOI: 10.1137/20M1331901
Issue No: Vol. 32, No. 1 (2022)

 Sequential Linearization Method for BoundConstrained Mathematical
Programs with Complementarity Constraints
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Christian Kirches, Jeffrey Larson, Sven Leyffer, Paul Manns
Pages: 75  99
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 7599, January 2022.
We propose an algorithm for solving boundconstrained mathematical programs with complementarity constraints on the variables. Each iteration of the algorithm involves solving a linear program with complementarity constraints in order to obtain an estimate of the active set. The algorithm enforces descent on the objective function to promote global convergence to Bstationary points. We provide a convergence analysis and preliminary numerical results on a range of test problems. We also study the effect of fixing the active constraints in a boundconstrained quadratic program that can be solved on each iteration in order to obtain fast convergence.
Citation: SIAM Journal on Optimization
PubDate: 20220105T08:00:00Z
DOI: 10.1137/20M1370501
Issue No: Vol. 32, No. 1 (2022)

 FirstOrder Algorithms for a Class of Fractional Optimization Problems

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Na Zhang, Qia Li
Pages: 100  129
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 100129, January 2022.
In this paper, we consider a class of singleratio fractional minimization problems, in which the numerator of the objective is the sum of a nonsmooth nonconvex function and a smooth nonconvex function while the denominator is a nonsmooth convex function. In this work, we first derive its firstorder necessary optimality condition, by using the firstorder operators of the three functions involved. Then we develop firstorder algorithms, namely, the proximitygradientsubgradient algorithm (PGSA), PGSA with monotone line search (PGSA_ML), and PGSA with nonmonotone line search (PGSA_NL). It is shown that any accumulation point of the sequence generated by them is a critical point of the problem under mild assumptions. Moreover, we establish global convergence of the sequence generated by PGSA or PGSA_ML and analyze its convergence rate, by further assuming the local Lipschitz continuity of the nonsmooth function in the numerator, the smoothness of the denominator, and the KurdykaŁojasiewicz (KL) property of the objective. The proposed algorithms are applied to the sparse generalized eigenvalue problem associated with a pair of symmetric positive semidefinite matrices, and the corresponding convergence results are obtained according to their general convergence theorems. We perform some preliminary numerical experiments to demonstrate the efficiency of the proposed algorithms.
Citation: SIAM Journal on Optimization
PubDate: 20220110T08:00:00Z
DOI: 10.1137/20M1325381
Issue No: Vol. 32, No. 1 (2022)

 Bias Reduction in SampleBased Optimization

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Darinka Dentcheva, Yang Lin
Pages: 130  151
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 130151, January 2022.
We consider the stochastic optimization problems which use observed data to estimate essential characteristics of the random quantities involved. Sample average approximation (SAA) or empirical (plugin) estimation are very popular ways to use data in optimization. It is well known that SAA suffers from downward bias. Our proposal is to use smooth estimators rather than empirical ones in the optimization problems. We establish consistency results for the optimal value and the set of optimal solutions of the new problem formulation. The performance of the proposed approach is compared to SAA theoretically and numerically. We analyze the bias of the new problems and identify sufficient conditions for ensuring less biased estimation of the optimal value of the true problem. At the same time, the error of the new estimator remains controlled. Those conditions are satisfied for many popular statistical problems such as regression models, classification problems, and optimization problems with average (conditional) value at risk. We have proved that smoothing the leastsquares objective in a regression problem by a normal kernel leads to a ridge regression. Our numerical experience shows that the new estimators also frequently exhibit smaller variance and smaller meansquare error than those of SAA.
Citation: SIAM Journal on Optimization
PubDate: 20220124T08:00:00Z
DOI: 10.1137/20M1326428
Issue No: Vol. 32, No. 1 (2022)

 Affinely Adjustable Robust Linear Complementarity Problems

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Christian Biefel, Frauke Liers, Jan Rolfes, Martin Schmidt
Pages: 152  172
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 152172, March 2022.
Linear complementarity problems (LCPs) are a powerful tool for modeling many practically relevant situations such as market equilibria. They also connect many subareas of mathematics like game theory, optimization, and matrix theory. Despite their close relation to optimization, the protection of LCPs against uncertaintiesespecially in the sense of robust optimizationis still in its infancy. During the last years, robust LCPs have only been studied using the notions of strict and $\Gamma$robustness. Unfortunately, both concepts lead to the problem that the existence of robust solutions cannot be guaranteed. In this paper, we consider affinely adjustable robust LCPs. In the latter, a part of the LCP solution is allowed to adjust via a function that is affine in the uncertainty. We show that this notion of robustness allows us to establish strong characterizations of solutions for the cases of uncertain matrix and vector, separately, from which existence results can be derived. Our main results are valid for the case of an uncertain LCP vector. Here, we additionally provide sufficient conditions on the LCP matrix for the uniqueness of a solution. Moreover, based on characterizations of the affinely adjustable robust solutions, we derive a mixedinteger programming formulation that allows us to solve the corresponding robust counterpart. If, in addition, the certain LCP matrix is positive semidefinite, we prove polynomialtime solvability and uniqueness of robust solutions. If the LCP matrix is uncertain, characterizations of solutions are developed for every nominal matrix; i.e., these characterizations are, in particular, independent of the definiteness of the nominal matrix. Robust solutions are also shown to be unique for a positive definite LCP matrix, but both uniqueness and mixedinteger programming formulations still remain open problems if the nominal LCP matrix is not positive definite.
Citation: SIAM Journal on Optimization
PubDate: 20220210T08:00:00Z
DOI: 10.1137/20M1359778
Issue No: Vol. 32, No. 1 (2022)

 The RatioCut Polytope and KMeans Clustering

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Antonio De Rosa, Aida Khajavirad
Pages: 173  203
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 173203, March 2022.
We introduce the ratiocut polytope defined as the convex hull of ratiocut vectors corresponding to all partitions of $n$ points in $\mathbb{R}^m$ into at most $K$ clusters. This polytope is closely related to the convex hull of the feasible region of a number of clustering problems such as Kmeans clustering and spectral clustering. We study the facial structure of the ratiocut polytope and derive several types of facetdefining inequalities. We then consider the problem of Kmeans clustering and introduce a novel linear programming (LP) relaxation for it. Subsequently, we focus on the case of two clusters and derive sufficient condition under which the proposed LP relaxation recovers the underlying clusters exactly. Namely, we consider the stochastic ball model, a popular generative model for Kmeans clustering, and we show that if the separation distance between cluster centers satisfies $\Delta> 1+\sqrt 3$, then the LP relaxation recovers the planted clusters with high probability. This is a major improvement over the only existing recovery guarantee for an LP relaxation of Kmeans clustering stating that recovery is possible with high probability if and only if $\Delta> 4$. Our numerical experiments indicate that the proposed LP relaxation significantly outperforms a popular semidefinite programming relaxation in recovering the planted clusters.
Citation: SIAM Journal on Optimization
PubDate: 20220224T08:00:00Z
DOI: 10.1137/20M1348601
Issue No: Vol. 32, No. 1 (2022)

 Faster LagrangianBased Methods in Convex Optimization

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Shoham Sabach, Marc Teboulle
Pages: 204  227
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 204227, March 2022.
In this paper, we aim at unifying, simplifying, and improving the convergence rate analysis of Lagrangianbased methods for convex optimization problems. We first introduce the notion of nice primal algorithmic map, which plays a central role in the unification and in the simplification of the analysis of most Lagrangianbased methods. Equipped with a nice primal algorithmic map, we then introduce a versatile generic scheme, which allows for the design and analysis of faster Lagrangian (FLAG) methods with new provably sublinear rate of convergence expressed in terms of function values and feasibility violation of the original (nonergodic) generated sequence. To demonstrate the power and versatility of our approach and results, we show that most wellknown iconic Lagrangianbased schemes admit a nice primal algorithmic map and hence share the new faster rate of convergence results within their corresponding FLAG.
Citation: SIAM Journal on Optimization
PubDate: 20220228T08:00:00Z
DOI: 10.1137/20M1375358
Issue No: Vol. 32, No. 1 (2022)

 QNG: A QuasiNatural Gradient Method for LargeScale Statistical Learning

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Xiaoyu He, Zibin Zheng, Yuren Zhou, Chuan Chen
Pages: 228  255
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 228255, March 2022.
Natural gradient method provides a powerful paradigm for training statistical models and offers several appealing theoretic benefits. It constructs the Fisher information matrix to correct the ordinary gradients, and thus, the cost may become prohibitively expensive in the largescale setting. This paper proposes a quasinatural gradient method to address this computational issue. It employs a rankone model to capture the secondorder information from the underlying statistical manifold and to iteratively update the Fisher approximations. A threeloop procedure is designed to implement the updating formulas. This procedure resembles the classical twoloop procedure in the limitedmemory BFGS method but saves half of the memory usage while it can be made faster. The resulting method retains the convergence rate advantages of existing stochastic optimization methods and has inherent ability in handling nonconvexity. Numerical studies conducted on several machine learning tasks demonstrate the reduction in convergence time and the robustness in tackling nonconvexity relative to stochastic gradient descent and the online limitedmemory BFGS method.
Citation: SIAM Journal on Optimization
PubDate: 20220314T07:00:00Z
DOI: 10.1137/20M1376753
Issue No: Vol. 32, No. 1 (2022)

 Interior Point Methods Can Exploit Structure of Convex Piecewise Linear
Functions with Application in Radiation Therapy
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Bram L. Gorissen
Pages: 256  275
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 256275, March 2022.
Auxiliary variables are often used to model a convex piecewise linear function in the framework of linear optimization. This work shows that such variables yield a block diagonal plus low rank structure in the reduced KKT system of the dual problem. We show how the structure can be detected efficiently and derive the linear algebra formulas for an interior point method which exploits such a structure. The structure is detected in 36% of the cases in Netlib. Numerical results on the inverse planning problem in radiation therapy show an order of magnitude speedup compared to the stateoftheart interior point solver CPLEX and considerable improvements in dose distribution compared to current algorithms.
Citation: SIAM Journal on Optimization
PubDate: 20220315T07:00:00Z
DOI: 10.1137/21M1402364
Issue No: Vol. 32, No. 1 (2022)

 Partial Smoothness and Constant Rank

Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: A. S. Lewis, Jingwei Liang, Tonghua Tian
Pages: 276  291
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 276291, March 2022.
In optimization, the notion of a partly smooth objective function is powerful for applications in algorithmic convergence and postoptimality analysis, and yet is complex to define. A shift in focus to the firstorder optimality conditions reduces the concept to a simple constantrank condition. In this view, partial smoothness extends to more general variational systems, encompassing in particular the saddlepoint operators underlying popular primaldual splitting algorithms. For a broad class of semialgebraic generalized equations, partial smoothness holds generically.
Citation: SIAM Journal on Optimization
PubDate: 20220317T07:00:00Z
DOI: 10.1137/19M1237909
Issue No: Vol. 32, No. 1 (2022)

 Uncertainty Preferences in Robust MixedInteger Linear Optimization with
Endogenous Uncertainty
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Immanuel Bomze, Markus Gabl
Pages: 292  318
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 292318, March 2022.
In robust optimization one seeks to make a decision under uncertainty, where the goal is to find the solution with the best worstcase performance. The set of possible realizations of the uncertain data is described by a socalled uncertainty set. In many scenarios, a decision maker may influence the uncertainty regime they are facing, for example, by investing in market research or in machines which work with higher precision. Recently, this situation was addressed in the literature by introducing decision dependent uncertainty sets (endogenous uncertainty), i.e., uncertainty sets whose structure depends on (typically discrete) decision variables. In this way, one can model the tradeoff between reducing the cost of robustness versus the cost of the investment necessary for influencing the uncertainty. However, there is another tradeoff to be made here. With different uncertainty regimes, not only do the worstcase optimal solutions vary but also other aspects of those solutions such as maxregret, bestcase performance, or predictability of the performance. A decision maker may still be interested in having a performance guarantee but at the same time be willing to forgo superior worstcase performance if those other aspects can be enhanced by switching to a suitable uncertainty regime. We introduce the notion of uncertainty preference in order to capture such stances. We introduce a multiobjective optimization based and a bilevel optimization based model that integrate these preferences in a meaningful way. Further, we present three ways to formalize uncertainty preferences and study the resulting mathematical models. The goal is to have reformulations/approximations of these models which can be solved with standard methods. The workhorse is mixedinteger linear and conic optimization. We apply our framework to the uncertain shortest path problem and conduct numerical experiments for the resulting models. We can demonstrate that our models can be handled very well by standard mixedinteger linear solvers.
Citation: SIAM Journal on Optimization
PubDate: 20220324T07:00:00Z
DOI: 10.1137/20M1355422
Issue No: Vol. 32, No. 1 (2022)

 Corrigendum: Critical Cones for Sufficient Second Order Conditions in PDE
Constrained Optimization
Free preprint version: Loading...Rate this result: What is this?Please help us test our new preprint finding feature by giving the preprint link a rating.
A 5 star rating indicates the linked preprint has the exact same content as the published article.
Authors: Eduardo Casas, Mariano Mateos
Pages: 319  320
Abstract: SIAM Journal on Optimization, Volume 32, Issue 1, Page 319320, March 2022.
We correct an error in the proof of Theorem 3.1 in [E. Casas and M. Mateos, Critical cones for sufficient second order conditions in PDE constrained optimization, SIAM J. Optim., 30 (2020), pp. 585603]. With this correction, all results in that paper remain true.
Citation: SIAM Journal on Optimization
PubDate: 20220331T07:00:00Z
DOI: 10.1137/21M1466839
Issue No: Vol. 32, No. 1 (2022)
