SIAM Journal on Optimization
Journal Prestige (SJR): 2.195 | Citation Impact (CiteScore): 3 | Followers: 12
Hybrid journal (may contain Open Access articles)
ISSN (Print): 1052-6234 | ISSN (Online): 1095-7189
Published by the Society for Industrial and Applied Mathematics
• Distributionally Robust Two-Stage Stochastic Programming

Authors: Daniel Duque, Sanjay Mehrotra, David P. Morton
Pages: 1499 - 1522
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1499-1522, September 2022.
Distributionally robust optimization is a popular modeling paradigm in which the underlying distribution of the random parameters in a stochastic optimization model is unknown. Therefore, hedging against a range of distributions, properly characterized in an ambiguity set, is of interest. We study two-stage stochastic programs with linear recourse in the context of distributional ambiguity, and formulate several distributionally robust models that vary in how the ambiguity set is built. We focus on the Wasserstein distance under a $p$-norm, and an extension, an optimal quadratic transport distance, as mechanisms to construct the set of probability distributions, allowing the support of the random variables to be a continuous space. We study both unbounded and bounded support sets, and provide guidance regarding which models are meaningful in the sense of yielding robust first-stage decisions. We develop cutting-plane algorithms to solve two classes of problems, and test them on a supply-allocation problem. Our numerical experiments provide further evidence as to what type of problems benefit the most from a distributionally robust solution.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-13T07:00:00Z
DOI: 10.1137/20M1370227
Issue No: Vol. 32, No. 3 (2022)

• Bregman Proximal Point Algorithm Revisited: A New Inexact Version and Its
Inertial Variant

Authors: Lei Yang, Kim-Chuan Toh
Pages: 1523 - 1554
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1523-1554, September 2022.
We study a general convex optimization problem, which covers various classic problems in different areas and particularly includes many optimal transport related problems arising in recent years. To solve this problem, we revisit the classic Bregman proximal point algorithm (BPPA) and introduce a new inexact stopping condition for solving the subproblems, which can circumvent the underlying feasibility difficulty often appearing in existing inexact conditions when the problem has a complex feasible set. Our inexact condition also covers several existing inexact conditions as special cases and hence makes our inexact BPPA (iBPPA) more flexible to fit different scenarios in practice. As an application to the standard optimal transport (OT) problem, our iBPPA with the entropic proximal term can bypass some numerical instability issues that usually plague the popular Sinkhorn's algorithm in the OT community, since our iBPPA does not require the proximal parameter to be very small for obtaining an accurate approximate solution. The iteration complexity of $O(1/k)$ and the convergence of the sequence are also established for our iBPPA under some mild conditions. Moreover, inspired by Nesterov's acceleration technique, we develop an inertial variant of our iBPPA, denoted by V-iBPPA, and establish the iteration complexity of $O(1/k^{\lambda})$, where $\lambda\geq1$ is a quadrangle scaling exponent of the kernel function. In particular, when the proximal parameter is a constant and the kernel function is strongly convex with Lipschitz continuous gradient (hence $\lambda=2$), our V-iBPPA achieves a faster rate of $O(1/k^2)$ just as existing accelerated inexact proximal point algorithms. Some preliminary numerical experiments for solving the standard OT problem are conducted to show the convergence behaviors of our iBPPA and V-iBPPA under different inexactness settings. The experiments also empirically verify the potential of our V-iBPPA for improving the convergence speed.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-13T07:00:00Z
DOI: 10.1137/20M1360748
Issue No: Vol. 32, No. 3 (2022)
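
For context, the Sinkhorn iteration whose numerical instability iBPPA is designed to bypass can be sketched as follows. This is the standard entropic-OT scheme, not the paper's iBPPA; the cost matrix, marginals, and regularization value are illustrative assumptions.

```python
import math

def sinkhorn(a, b, C, eps=0.1, iters=200):
    """Standard Sinkhorn iterations for entropic optimal transport.

    a, b : source/target marginals (lists summing to 1)
    C    : cost matrix; eps : entropic regularization strength.
    Returns the transport plan P with P[i][j] = u_i * K_ij * v_j.
    """
    n, m = len(a), len(b)
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternate marginal corrections; each half-step matches one marginal exactly.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Illustrative 2x2 problem: mass prefers the zero-cost diagonal.
P = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
row_sums = [sum(row) for row in P]
```

As the abstract notes, sharpening the entropic approximation by driving eps toward zero makes exp(-C/eps) underflow; iBPPA sidesteps this because it does not require a very small proximal parameter.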

• Sketched Newton--Raphson

Authors: Rui Yuan, Alessandro Lazaric, Robert M. Gower
Pages: 1555 - 1583
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1555-1583, September 2022.
We propose a new globally convergent stochastic second-order method. Our starting point is the development of a new sketched Newton--Raphson (SNR) method for solving large-scale nonlinear equations of the form $F(x)=0$ with $F:\mathbb{R}^p \rightarrow \mathbb{R}^m$. We then show how to design several stochastic second-order optimization methods by rewriting the optimization problem of interest as a system of nonlinear equations and applying SNR. For instance, by applying SNR to find a stationary point of a generalized linear model, we derive completely new and scalable stochastic second-order methods. We show that the resulting method is very competitive compared to state-of-the-art variance reduced methods. Furthermore, using a variable splitting trick, we also show that the stochastic Newton method (SNM) is a special case of SNR and use this connection to establish the first global convergence theory of SNM. We establish the global convergence of SNR by showing that it is a variant of the online stochastic gradient descent (SGD) method, and then leveraging proof techniques of SGD. As a special case, our theory also provides a new global convergence theory for the original Newton--Raphson method under strictly weaker assumptions as compared to the classic monotone convergence theory.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-13T07:00:00Z
DOI: 10.1137/21M139788X
Issue No: Vol. 32, No. 3 (2022)
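
A minimal sketch of the sketch-and-project idea behind SNR, assuming single-coordinate (row) sketches, in which case SNR reduces to a nonlinear Kaczmarz method: at each step, project onto the solution set of one linearized equation. The test system below is an illustrative assumption, not from the paper.

```python
def nonlinear_kaczmarz(F, J, x, sweeps=300):
    """SNR with coordinate sketches: cycle through the equations and
    project onto one linearized equation F_i(x) + <grad F_i(x), d> = 0
    at a time."""
    m = len(F(x))
    for _ in range(sweeps):
        for i in range(m):
            fi = F(x)[i]
            gi = J(x)[i]                       # gradient of the i-th residual
            step = fi / sum(g * g for g in gi)
            x = [xj - step * gj for xj, gj in zip(x, gi)]
    return x

# Illustrative system: F1 = x1^2 + x2 - 2, F2 = x1 + x2^2 - 2; root at (1, 1).
F = lambda x: [x[0] ** 2 + x[1] - 2, x[0] + x[1] ** 2 - 2]
J = lambda x: [[2 * x[0], 1.0], [1.0, 2 * x[1]]]
sol = nonlinear_kaczmarz(F, J, [1.5, 1.5])
```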

• Golden Ratio Primal-Dual Algorithm with Linesearch

Authors: Xiao-Kai Chang, Junfeng Yang, Hongchao Zhang
Pages: 1584 - 1613
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1584-1613, September 2022.
The golden ratio primal-dual algorithm (GRPDA) is a new variant of the classical Arrow--Hurwicz method for solving structured convex optimization problems, in which the objective function consists of the sum of two closed proper convex functions, one of which involves a composition with a linear transform. Like the Arrow--Hurwicz method and the popular primal-dual algorithm (PDA) of Chambolle and Pock, GRPDA is full-splitting in the sense that it does not rely on solving any subproblems or linear system of equations iteratively. Compared with PDA, an important feature of GRPDA is that it permits larger primal and dual stepsizes. However, the stepsize condition of GRPDA requires that the spectral norm of the linear transform is known, which can be difficult to obtain in some applications. Furthermore, constant stepsizes are usually overconservative in practice. In this paper, we propose a linesearch strategy for GRPDA, which not only does not require the spectral norm of the linear transform but also allows adaptive and potentially much larger stepsizes. Within each linesearch step, only the dual variable needs to be updated, and it is thus quite cheap and does not require any extra matrix-vector multiplications for many special yet important applications such as a regularized least-squares problem. Global iterate convergence and ${\cal O}(1/N)$ ergodic convergence rate results, measured by the function value gap and constraint violations of an equivalent optimization problem, are established, where $N$ denotes the iteration counter. When one of the component functions is strongly convex, faster ${\cal O}(1/N^2)$ ergodic convergence rate results, quantified by the same measures, are established by adaptively choosing some algorithmic parameters.
Moreover, when the subdifferential operators of the component functions are strongly metric subregular, a condition that is much weaker than strong convexity, we show that the iterates converge R-linearly to the unique solution. Numerical experiments on matrix game and LASSO problems illustrate the effectiveness of the proposed linesearch strategy.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-18T07:00:00Z
DOI: 10.1137/21M1420319
Issue No: Vol. 32, No. 3 (2022)

• Difference-of-Convex Algorithms for a Class of Sparse Group $\ell_0$
Regularized Optimization Problems

Authors: Wenjing Li, Wei Bian, Kim-Chuan Toh
Pages: 1614 - 1641
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1614-1641, September 2022.
In this paper, we consider a class of sparse group $\ell_0$ regularized optimization problems. First, we give a continuous relaxation model of the considered problem and define a class of stationary points of the relaxation problem. Then, we establish the equivalence of these two problems in the sense of global minimizers, and prove that the defined stationary point is equivalent to the local minimizer of the considered sparse group $\ell_0$ regularized problem with a desirable bound from its global minimizers. Further, based on the difference-of-convex (DC) structure of the relaxation problem, we design two DC algorithms to solve the relaxation problem. We prove that any accumulation point of the iterates generated by them is a local minimizer with a desirable bound for the considered sparse group $\ell_0$ problem. In particular, all accumulation points have a common support set and their zero entries can be attained within finite iterations. Moreover, we give the global convergence analysis of the proposed algorithms. Finally, we perform some numerical experiments to show the efficiency of the proposed algorithms.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-18T07:00:00Z
DOI: 10.1137/21M1443455
Issue No: Vol. 32, No. 3 (2022)
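
The generic DCA template underlying such methods: linearize the concave part at the current iterate and minimize the remaining convex model exactly. The toy objective below is an illustrative assumption, not the paper's group-$\ell_0$ relaxation.

```python
def dca(x0, subgrad_h, solve_convex, iters=50):
    """Generic difference-of-convex algorithm for min g(x) - h(x):
    take a subgradient s of h at the current point, then solve the
    convex surrogate min_x g(x) - s*x."""
    x = x0
    for _ in range(iters):
        s = subgrad_h(x)       # subgradient of the subtracted convex part h
        x = solve_convex(s)    # argmin_x g(x) - s*x
    return x

# Toy DC program: g(x) = x^2, h(x) = 2|x|; f = g - h has global minimizers at +/-1.
subgrad_h = lambda x: 2.0 if x >= 0 else -2.0
solve_convex = lambda s: s / 2.0       # argmin_x x^2 - s*x = s/2
x_star = dca(0.5, subgrad_h, solve_convex)
```

From x0 = 0.5 the iteration jumps to the stationary point x = 1 in one step and stays there, which here is a global minimizer of f.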

• Sublinear Convergence of a Tamed Stochastic Gradient Descent Method in
Hilbert Space

Authors: Monika Eisenmann, Tony Stillfjord
Pages: 1642 - 1667
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1642-1667, September 2022.
In this paper, we introduce the tamed stochastic gradient descent method (TSGD) for optimization problems. Inspired by the tamed Euler scheme, which is a commonly used method within the context of stochastic differential equations, TSGD is an explicit scheme that exhibits stability properties similar to those of implicit schemes. As its computational cost is essentially equivalent to that of the well-known stochastic gradient descent method (SGD), it constitutes a very competitive alternative to such methods. We rigorously prove (optimal) sublinear convergence of the scheme for strongly convex objective functions on an abstract Hilbert space. The analysis only requires very mild step size restrictions, which illustrates the good stability properties. The analysis is based on a priori estimates more frequently encountered in a time integration context than in optimization, and this alternative approach provides a different perspective also on the convergence of SGD. Finally, we demonstrate the usability of the scheme on a problem arising in a context of supervised learning.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-21T07:00:00Z
DOI: 10.1137/21M1427450
Issue No: Vol. 32, No. 3 (2022)
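
A sketch of the taming idea, assuming the update $x_{k+1} = x_k - \gamma \nabla f(x_k)/(1 + \gamma\|\nabla f(x_k)\|)$ in analogy with the tamed Euler scheme; the quartic test function and step size are illustrative assumptions, not the paper's setting.

```python
def gd(grad, x, step, iters):
    """Plain explicit gradient descent."""
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def tamed_gd(grad, x, step, iters):
    """Tamed gradient descent: the denominator caps the effective step
    at 1, giving explicit-scheme cost with implicit-like stability."""
    for _ in range(iters):
        g = grad(x)
        x = x - step * g / (1.0 + step * abs(g))
    return x

grad = lambda x: x ** 3          # f(x) = x^4 / 4; gradient not globally Lipschitz
x_plain = gd(grad, 3.0, 0.5, 2)        # plain explicit scheme blows up
x_tamed = tamed_gd(grad, 3.0, 0.5, 200)  # tamed iterates decay toward 0
```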

• Potential Function-Based Framework for Minimizing Gradients in Convex and
Min-Max Optimization

Authors: Jelena Diakonikolas, Puqian Wang
Pages: 1668 - 1697
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1668-1697, September 2022.
Making the gradients small is a fundamental optimization problem that has eluded unifying and simple convergence arguments in first-order optimization, so far primarily reserved for other convergence criteria, such as reducing the optimality gap. In particular, while many different potential function-based frameworks covering broad classes of algorithms exist for optimality gap-based convergence guarantees, we are not aware of such general frameworks addressing the gradient norm guarantees. To fill this gap, we introduce a novel potential function-based framework to study the convergence of standard methods for making the gradients small in smooth convex optimization and convex-concave min-max optimization. Our framework is intuitive and provides a lens for viewing algorithms that make the gradients small as being driven by a trade-off between reducing either the gradient norm or a certain notion of an optimality gap. On the lower bounds side, we discuss tightness of the obtained convergence results for the convex setup and provide a new lower bound for minimizing the norm of cocoercive operators that allows us to argue about optimality of methods in the min-max setup.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-21T07:00:00Z
DOI: 10.1137/21M1395302
Issue No: Vol. 32, No. 3 (2022)

• Effective Scenarios in Multistage Distributionally Robust Optimization
with a Focus on Total Variation Distance

Authors: Hamed Rahimian, Güzin Bayraksan, Tito Homem-de-Mello
Pages: 1698 - 1727
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1698-1727, September 2022.
We study multistage distributionally robust optimization (DRO) to hedge against ambiguity in quantifying the underlying uncertainty of a problem. Recognizing that not all the realizations and scenario paths might have an “effect” on the optimal value, we investigate the question of how to define and identify critical scenarios for nested multistage DRO problems. Our analysis extends the work of Rahimian, Bayraksan, and Homem-de-Mello [Math. Program., 173 (2019), pp. 393--430], which was in the context of a static/two-stage setting, to the multistage setting. To this end, we define the notions of effectiveness of scenario paths and the conditional effectiveness of realizations along a scenario path for a general class of multistage DRO problems. We then propose easy-to-check conditions to identify the effectiveness of scenario paths in the multistage setting when the distributional ambiguity is modeled via the total variation distance. Numerical results show that these notions provide useful insight on the underlying uncertainty of the problem.
Citation: SIAM Journal on Optimization
PubDate: 2022-07-21T07:00:00Z
DOI: 10.1137/21M1446484
Issue No: Vol. 32, No. 3 (2022)
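
As a static, finite-support special case of distributional ambiguity under the total variation distance, the worst-case expectation over a TV ball admits a simple greedy solution: move up to $\gamma$ of probability mass from the lowest-outcome scenarios onto the worst one. The nominal distribution and outcomes below are illustrative assumptions; the paper treats the much richer nested multistage setting.

```python
def worst_case_expectation_tv(p, h, gamma):
    """Worst-case expectation of outcomes h over the total-variation ball
    { q : (1/2) * sum_i |q_i - p_i| <= gamma } around a nominal p.
    The optimal q moves up to gamma of mass from the lowest-h scenarios
    onto the highest-h scenario."""
    q = list(p)
    top = max(range(len(h)), key=lambda i: h[i])
    budget = gamma
    for i in sorted(range(len(h)), key=lambda i: h[i]):
        if i == top or budget <= 0:
            continue
        moved = min(q[i], budget)   # drain mass from cheap scenarios first
        q[i] -= moved
        q[top] += moved
        budget -= moved
    return sum(qi * hi for qi, hi in zip(q, h)), q

# Uniform nominal distribution over three scenario outcomes; radius 0.2.
val, q = worst_case_expectation_tv([1/3, 1/3, 1/3], [1.0, 2.0, 6.0], 0.2)
```

In the language of the abstract, the middle scenario keeps its nominal mass here: removing it would not change the worst-case value, which is the intuition behind (in)effective scenarios.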

• On the Benefit of Width for Neural Networks: Disappearance of Basins

Authors: Dawei Li, Tian Ding, Ruoyu Sun
Pages: 1728 - 1758
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1728-1758, September 2022.
Wide networks are often believed to have a nice optimization landscape, but what rigorous results can we prove? To understand the benefit of width, it is important to identify the difference between wide and narrow networks. In this work, we prove that from narrow to wide networks, there is a phase transition from having suboptimal basins to no suboptimal basins. Specifically, we prove two results: on the positive side, for any continuous activation functions, the loss surface of a class of wide networks has no suboptimal basin, where “basin” is defined as the setwise strict local minimum; on the negative side, for a large class of networks with width below a threshold, we construct strict local minima that are not global. These two results together show the phase transition from narrow to wide networks.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-01T07:00:00Z
DOI: 10.1137/21M1394205
Issue No: Vol. 32, No. 3 (2022)

• First-Order Methods for Problems with $O(1)$ Functional Constraints Can
Have Almost the Same Convergence Rate as for Unconstrained Problems

Authors: Yangyang Xu
Pages: 1759 - 1790
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1759-1790, September 2022.
First-order methods (FOMs) have recently been applied and analyzed for solving problems with complicated functional constraints. Existing works show that FOMs for functional constrained problems have lower-order convergence rates than those for unconstrained problems. In particular, an FOM for a smooth strongly convex problem can have linear convergence, while it can only converge sublinearly for a constrained problem if the projection onto the constraint set is prohibited. In this paper, we point out that the slower convergence is caused by the large number of functional constraints but not the constraints themselves. When there are only $m=O(1)$ functional constraints, we show that an FOM can have almost the same convergence rate as that for solving an unconstrained problem, even without the projection onto the feasible set. In addition, given an $\varepsilon>0$, we show that a complexity result that is better than a lower bound can be obtained if there are only $m=o(\varepsilon^{-\frac{1}{2}})$ functional constraints. Our result is surprising but does not contradict the existing lower complexity bound because we focus on a specific subclass of problems. Experimental results on quadratically constrained quadratic programs demonstrate our theory.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-01T07:00:00Z
DOI: 10.1137/20M1371579
Issue No: Vol. 32, No. 3 (2022)

• Approximating Min-Mean-Cycle for Low-Diameter Graphs in Near-Optimal Time
and Memory

Authors: Jason M. Altschuler, Pablo A. Parrilo
Pages: 1791 - 1816
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1791-1816, September 2022.
We revisit Min-Mean-Cycle, the classical problem of finding a cycle in a weighted directed graph with minimum mean weight. Despite an extensive algorithmic literature, previous work failed to achieve a near-linear runtime in the number of edges $m$. We propose an algorithm with near-linear runtime $\tilde{O}(m (W_{\max}/\epsilon)^2)$ for computing an $\epsilon$ additive approximation on graphs with polylogarithmic diameter and weights of magnitude at most $W_{\max}$. In particular, this is the first algorithm whose runtime scales in the number of vertices $n$ as $\tilde{O}(n^2)$ for the complete graph. Moreover---unconditionally on the diameter---the algorithm uses only $O(n)$ memory beyond reading the input, making it “memory-optimal”. Our approach is based on solving a linear programming (LP) relaxation using entropic regularization, which reduces the LP to a Matrix Balancing problem---à la the popular reduction of Optimal Transport to Matrix Scaling. We then round the fractional LP solution using a variant of the classical Cycle-Canceling algorithm that is sped up to near-linear runtime at the expense of being approximate, and implemented in a memory-optimal manner. The algorithm is simple to implement and is competitive with the state-of-the-art methods in practice.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-01T07:00:00Z
DOI: 10.1137/21M1439390
Issue No: Vol. 32, No. 3 (2022)
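
For contrast with the approximation scheme above, the classical exact baseline is Karp's algorithm, which runs in $O(mn)$ time and $O(n^2)$ memory via $\mu^* = \min_v \max_{0 \le k < n} (D_n(v) - D_k(v))/(n - k)$, where $D_k(v)$ is the minimum weight of a $k$-edge walk ending at $v$. The small digraph in the test is an illustrative assumption.

```python
def min_mean_cycle(n, edges):
    """Karp's algorithm for the minimum cycle mean.
    edges: list of (u, v, w) arcs on vertices 0..n-1.
    D[k][v] = min weight of a walk with exactly k edges ending at v,
    starting from any vertex (equivalent to a zero-weight super-source)."""
    INF = float("inf")
    D = [[INF] * n for _ in range(n + 1)]
    for v in range(n):
        D[0][v] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] + w < D[k][v]:
                D[k][v] = D[k - 1][u] + w
    best = INF
    for v in range(n):
        if D[n][v] == INF:
            continue
        worst = max((D[n][v] - D[k][v]) / (n - k)
                    for k in range(n) if D[k][v] < INF)
        best = min(best, worst)
    return best

# Two cycles: 0->1->2->0 with mean (1+2+3)/3 = 2, and 0->1->0 with mean 3.
mmc = min_mean_cycle(3, [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 3.0), (1, 0, 5.0)])
```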

• Convergence Rates of the Heavy Ball Method for Quasi-strongly Convex
Optimization

Authors: J.-F. Aujol, Ch. Dossal, A. Rondepierre
Pages: 1817 - 1842
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1817-1842, September 2022.
In this paper, we study the behavior of solutions of the ODE associated to the heavy ball method. Since the pioneering work of B. T. Polyak in 1964, it has been well known that such a scheme is very efficient for $C^2$ strongly convex functions with Lipschitz gradient. But much less is known when the $C^2$ assumption is dropped. Depending on the geometry of the function to minimize, we obtain optimal convergence rates for the class of convex functions with some additional regularity such as quasi-strong convexity or strong convexity. We perform this analysis in continuous time for the ODE, and then we transpose these results for discrete optimization schemes. In particular, we propose a variant of the heavy ball algorithm which has the best state-of-the-art convergence rate for first-order methods to minimize composite nonsmooth strongly convex functions.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-01T07:00:00Z
DOI: 10.1137/21M1403990
Issue No: Vol. 32, No. 3 (2022)
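
The discrete heavy ball scheme that the ODE models, with Polyak's classical parameter choices for an $L$-smooth, $\mu$-strongly convex objective; the quadratic test function is an illustrative assumption.

```python
import math

def heavy_ball(grad, x0, L, mu, iters=300):
    """Polyak's heavy ball: x+ = x - a*grad(x) + b*(x - x_prev), with the
    classical parameters a = 4/(sqrt(L)+sqrt(mu))^2 and
    b = ((sqrt(L)-sqrt(mu))/(sqrt(L)+sqrt(mu)))^2."""
    a = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
    b = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - a * gi + b * (xi - pi)
                  for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x

# Ill-conditioned quadratic f(x, y) = 0.5*(mu*x^2 + L*y^2); minimizer at 0.
mu, L = 1.0, 100.0
grad = lambda p: [mu * p[0], L * p[1]]
sol = heavy_ball(grad, [10.0, 10.0], L, mu)
```

On quadratics this converges linearly at rate roughly $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$ with $\kappa = L/\mu$, which is the $C^2$ strongly convex regime the abstract refers to.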

• Escaping Unknown Discontinuous Regions in Blackbox Optimization

Authors: Charles Audet, Alain Batailly, Solène Kojtych
Pages: 1843 - 1870
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1843-1870, September 2022.
The design of key nonlinear systems often requires the use of expensive blackbox simulations presenting inherent discontinuities whose positions in the variable space cannot be analytically predicted. Without further precautions, the solution of related optimization problems leads to design configurations which may be close to discontinuities of the blackbox output functions. These discontinuities may betray unsafe regions of the design space, such as nonlinear resonance regions. To account for possible changes of operating conditions, an acceptable solution must be away from unsafe regions of the space of variables. The objective of this work is to solve a constrained blackbox optimization problem with the additional constraint that the solution should be outside unknown zones of discontinuities or strong variations of the objective function or the constraints. The proposed approach is an extension of the mesh adaptive direct search (MADS) algorithm and aims at building a series of inner approximations of these zones. The algorithm, called DiscoMADS, relies on two main mechanisms: revealing discontinuities and progressively escaping the surrounding zones. A convergence analysis supports the algorithm and preserves the optimality conditions of MADS. Numerical tests are conducted on analytical problems and on three engineering problems illustrating the following possible applications of the algorithm: the design of a simplified truss, the synthesis of a chemical component, and the design of a turbomachine blade. The DiscoMADS algorithm successfully solves these problems by providing a feasible solution away from discontinuous regions.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-04T07:00:00Z
DOI: 10.1137/21M1420915
Issue No: Vol. 32, No. 3 (2022)

• The Stochastic Auxiliary Problem Principle in Banach Spaces: Measurability
and Convergence

Authors: Thomas Bittar, Pierre Carpentier, Jean-Philippe Chancelier, Jérôme Lonchampt
Pages: 1871 - 1900
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1871-1900, September 2022.
The stochastic auxiliary problem principle (APP) algorithm is a general stochastic approximation (SA) scheme that turns the resolution of an original convex optimization problem into the iterative resolution of a sequence of auxiliary problems. This framework has been introduced to design decomposition-coordination schemes but also encompasses many well-known SA algorithms such as stochastic gradient descent or stochastic mirror descent. We study the stochastic APP in the case where the iterates lie in a Banach space and we consider an additive error on the computation of the subgradient of the objective. In order to derive convergence results or efficiency estimates for an SA scheme, the iterates must be random variables. This is why we prove the measurability of the iterates of the stochastic APP algorithm. Then, we extend convergence results from the Hilbert space case to the reflexive separable Banach space case. Finally, we derive efficiency estimates for the function values taken at the averaged sequence of iterates or at the last iterate, the latter being obtained by adapting the concept of modified Fejér monotonicity to our framework.
Citation: SIAM Journal on Optimization
PubDate: 2022-08-04T07:00:00Z
DOI: 10.1137/21M1402467
Issue No: Vol. 32, No. 3 (2022)
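
One concrete member of the family the stochastic APP encompasses is entropic mirror descent on the probability simplex, shown here in its deterministic form for simplicity; the linear objective below is an illustrative assumption.

```python
import math

def mirror_descent_simplex(grad, x, step=0.5, iters=200):
    """Entropic mirror descent on the probability simplex:
    multiplicative update x_i <- x_i * exp(-step * g_i), renormalized.
    A special case of the auxiliary-problem-principle family."""
    for _ in range(iters):
        g = grad(x)
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        s = sum(w)
        x = [wi / s for wi in w]
    return x

# Linear objective <c, x> over the simplex; minimized at the vertex e_1.
c = [1.0, 2.0, 3.0]
x = mirror_descent_simplex(lambda x: c, [1/3, 1/3, 1/3])
```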

• Stochastic Decomposition Method for Two-Stage Distributionally Robust
Linear Optimization

Authors: Harsha Gangammanavar, Manish Bansal
Pages: 1901 - 1930
Abstract: SIAM Journal on Optimization, Volume 32, Issue 3, Page 1901-1930, September 2022.
In this paper, we present a sequential sampling-based algorithm for the two-stage distributionally robust linear programming (2-DRLP) models. The 2-DRLP models are defined over a general class of ambiguity sets with discrete or continuous probability distributions. The algorithm is a distributionally robust version of the well-known stochastic decomposition algorithm of Higle and Sen [Math. Oper. Res., 16 (1991), pp. 650--669] for a two-stage stochastic linear program. We refer to the algorithm as the distributionally robust stochastic decomposition (DRSD) method. The key features of the algorithm include (1) it works with data-driven approximations of ambiguity sets that are constructed using samples of increasing size and (2) efficient construction of approximations of the worst-case expectation function that solves only two second-stage subproblems in every iteration. We identify conditions under which the ambiguity set approximations converge to the true ambiguity sets and show that the DRSD method asymptotically identifies an optimal solution, with probability one. We also computationally evaluate the performance of the DRSD method for solving distributionally robust versions of instances considered in stochastic programming literature. The numerical results corroborate the analytical behavior of the DRSD method and illustrate the computational advantage over an external sampling-based decomposition approach (distributionally robust L-shaped method).
Citation: SIAM Journal on Optimization
PubDate: 2022-08-08T07:00:00Z
DOI: 10.1137/20M1378600
Issue No: Vol. 32, No. 3 (2022)

• Finding Sparse Solutions for Packing and Covering Semidefinite Programs

Authors: Khaled Elbassioni, Kazuhisa Makino, Waleed Najy
Pages: 321 - 353
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 321-353, June 2022.
Packing and covering semidefinite programs (SDPs) appear in natural relaxations of many combinatorial optimization problems as well as a number of other applications. Recently, several techniques were proposed, which utilize the particular structure of this class of problems, to obtain more efficient algorithms than those offered by general SDP solvers. For certain applications, such as those described in this paper, it may be desirable to obtain sparse dual solutions, i.e., those with support size (almost) independent of the number of primal constraints. In this paper, we give an algorithm that finds such solutions, which is an extension of a logarithmic-potential based algorithm of Grigoriadis et al. [SIAM J. Optim. 11 (2001), pp. 1081--1091] for packing/covering linear programs.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-04T07:00:00Z
DOI: 10.1137/20M137570X
Issue No: Vol. 32, No. 2 (2022)

• Distributed Optimization Based on Gradient Tracking Revisited: Enhancing
Convergence Rate via Surrogation


Authors: Ying Sun, Gesualdo Scutari, Amir Daneshmand
Pages: 354 - 385
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 354-385, June 2022.
We study distributed multiagent optimization over graphs. We consider the minimization of $F+G$ subject to convex constraints, where $F$ is the smooth, strongly convex sum of the agents' losses and $G$ is a nonsmooth convex function. We build on the SONATA algorithm: it employs surrogate objective functions in the agents' subproblems (thus going beyond linearization, such as proximal-gradient) coupled with a perturbed consensus mechanism that aims to locally track the gradient of $F$. SONATA achieves precision $\epsilon>0$ on the objective value in $\mathcal{O}(\kappa_g \log(1/\epsilon))$ gradient computations at each node and $\tilde{\mathcal{O}}\big(\kappa_g (1-\rho)^{-1/2} \log(1/\epsilon)\big)$ communication steps, where $\kappa_g$ is the condition number of $F$ and $\rho$ characterizes the connectivity of the network. This is the first linear rate result for distributed composite optimization; it also improves on existing (nonaccelerated) schemes just minimizing $F$, whose rate depends on much larger quantities than $\kappa_g$. When the loss functions of the agents are similar, due to statistical data similarity or otherwise, SONATA employing high-order surrogates achieves precision $\epsilon>0$ in $\mathcal{O}\big((\beta/\mu) \log(1/\epsilon)\big)$ iterations and $\tilde{\mathcal{O}}\big((\beta/\mu) (1-\rho)^{-1/2} \log(1/\epsilon)\big)$ communication steps, where $\beta$ measures the degree of similarity of agents' losses and $\mu$ is the strong convexity constant of $F$. Therefore, when $\beta/\mu < \kappa_g$, the use of high-order surrogates yields provably faster rates than those achievable by first-order models; this is without exchanging any Hessian matrix over the network.
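SONATA's perturbed consensus builds on gradient tracking; as a point of reference, a basic linearized gradient-tracking iteration (in the DIGing style, not the SONATA surrogate scheme) can be sketched as below. This is a hypothetical minimal demo assuming numpy; the mixing matrix, step size, and quadratic losses are our own illustrative choices.

```python
import numpy as np

def gradient_tracking(grads, W, x0, alpha=0.05, iters=2000):
    # Each agent mixes its iterate with neighbors (via the doubly
    # stochastic matrix W) and steps along y, a running estimate of
    # the average gradient maintained by the tracking recursion.
    x = x0.copy()
    y = np.array([g(xi) for g, xi in zip(grads, x)])
    for _ in range(iters):
        x_new = W @ x - alpha * y
        g_new = np.array([g(xn) for g, xn in zip(grads, x_new)])
        g_old = np.array([g(xi) for g, xi in zip(grads, x)])
        y = W @ y + g_new - g_old
        x = x_new
    return x

# two agents with f_i(x) = (x - a_i)^2 / 2; the consensus optimum is mean(a)
a = [1.0, 3.0]
grads = [lambda x, ai=ai: x - ai for ai in a]
W = np.full((2, 2), 0.5)          # doubly stochastic mixing matrix
x = gradient_tracking(grads, W, x0=np.zeros(2))
```

Both agents converge to the common minimizer of the sum, here the mean of the $a_i$.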
Citation: SIAM Journal on Optimization
PubDate: 2022-04-05T07:00:00Z
DOI: 10.1137/19M1259973
Issue No: Vol. 32, No. 2 (2022)

• Zeroth-Order Stochastic Compositional Algorithms for Risk-Aware Learning


Authors: Dionysios S. Kalogerias, Warren B. Powell
Pages: 386 - 416
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 386-416, June 2022.
We present ${\it Free-MESSAGE}^{p}$, the first zeroth-order algorithm for (weakly) convex mean-semideviation-based risk-aware learning, which is also the first-ever three-level zeroth-order compositional stochastic optimization algorithm. Using a nontrivial extension of Nesterov's classical results on Gaussian smoothing, we develop the ${\it Free-MESSAGE}^{p}$ algorithm from first principles and show that it essentially solves a smoothed surrogate to the original problem, the former being a uniform approximation of the latter, in a useful, convenient sense. We then present a complete analysis of the ${\it Free-MESSAGE}^{p}$ algorithm, which establishes convergence in a user-tunable neighborhood of the optimal solutions of the original problem for convex costs, as well as explicit convergence rates for convex, weakly convex, and strongly convex costs, in a unified way. Orderwise, and for fixed problem parameters, our results demonstrate no sacrifice in convergence speed as compared to existing first-order methods, while striking a certain balance among the condition of the problem, its dimensionality, and the accuracy of the obtained results, naturally extending previous results in zeroth-order risk-neutral learning.
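The Gaussian smoothing underlying such zeroth-order methods replaces the gradient with an expectation of scaled finite differences along Gaussian directions. A minimal risk-neutral, single-level sketch (numpy assumed; the sample count and smoothing radius are illustrative, and this is not the authors' algorithm):

```python
import numpy as np

def gaussian_smoothing_grad(f, x, mu=1e-3, n_samples=100_000, seed=0):
    # Zeroth-order estimate of grad f via Gaussian smoothing:
    # E_u[ (f(x + mu*u) - f(x)) / mu * u ],  u ~ N(0, I).
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n_samples, x.size))
    fd = (np.array([f(x + mu * u) for u in U]) - f(x)) / mu
    return (fd[:, None] * U).mean(axis=0)

f = lambda x: float(x @ x)          # f(x) = ||x||^2, so grad f(x) = 2x
x = np.array([1.0, 0.5, -0.3])
g = gaussian_smoothing_grad(f, x)
```

With a small smoothing radius the estimator's mean approaches the true gradient, at the cost of variance controlled by the sample count.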
Citation: SIAM Journal on Optimization
PubDate: 2022-04-05T07:00:00Z
DOI: 10.1137/20M1315403
Issue No: Vol. 32, No. 2 (2022)

• Bayesian Optimization with Expensive Integrands


Authors: Saul Toscano-Palmerin, Peter I. Frazier
Pages: 417 - 444
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 417-444, June 2022.
Nonconvex, derivative-free, time-consuming objectives are often optimized using “black-box” optimization. These approaches assume very little about the objective. While broadly applicable, they typically require more evaluations than methods exploiting more problem structure. Often, such time-consuming objectives are actually the sum or integral of a large number of functions, each of which consumes significant time when evaluated individually. This arises in designing aircraft, choosing parameters in ride-sharing dispatch systems, and tuning hyperparameters in deep neural networks. We develop a novel Bayesian optimization algorithm that leverages this structure to improve performance. Our algorithm is average-case optimal by construction when a single evaluation of the integrand remains within our evaluation budget. Achieving this one-step optimality requires solving a challenging value-of-information optimization problem, for which we provide a novel, efficient, discretization-free computational method. We also prove consistency for our method in both continuum and discrete finite domains for objective functions that are sums. In numerical experiments comparing against previous state-of-the-art methods, including those that also leverage sum or integral structure, our method performs as well or better across a wide range of problems and offers significant improvements when evaluations are noisy or the integrand varies smoothly in the integrated variables.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-18T07:00:00Z
DOI: 10.1137/19M1303125
Issue No: Vol. 32, No. 2 (2022)

• Tracking and Regret Bounds for Online Zeroth-Order Euclidean and
Riemannian Optimization


Authors: Alejandro I. Maass, Chris Manzie, Dragan Nešić, Jonathan H. Manton, Iman Shames
Pages: 445 - 469
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 445-469, June 2022.
We study numerical optimization algorithms that use zeroth-order information to minimize time-varying geodesically convex cost functions on Riemannian manifolds. In the Euclidean setting, zeroth-order algorithms have received a lot of attention in both the time-varying and time-invariant cases. However, the extension to Riemannian manifolds is much less developed. We focus on Hadamard manifolds, which are a special class of Riemannian manifolds with global nonpositive curvature that offer convenient grounds for the generalization of convexity notions. Specifically, we derive bounds on the expected instantaneous tracking error, and we provide algorithm parameter values that minimize these bounds. Our results illustrate how the manifold geometry, in terms of the sectional curvature, affects these bounds. Additionally, we provide dynamic regret bounds for this online optimization setting. To the best of our knowledge, these are the first regret bounds even for the Euclidean version of the problem. Lastly, via numerical simulations, we demonstrate the applicability of our algorithm on an online Karcher mean problem.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-21T07:00:00Z
DOI: 10.1137/21M1405551
Issue No: Vol. 32, No. 2 (2022)

• Hyperbolic Relaxation of $k$-Locally Positive Semidefinite Matrices


Authors: Grigoriy Blekherman, Santanu S. Dey, Kevin Shu, Shengding Sun
Pages: 470 - 490
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 470-490, June 2022.
A successful computational approach for solving large-scale positive semidefinite (PSD) programs is to enforce PSD-ness on only a collection of submatrices. For our study, we let $\mathcal{S}^{n,k}$ be the convex cone of $n\times n$ symmetric matrices whose $k\times k$ principal submatrices are all PSD, and we call such a matrix $k$-locally PSD. In order to compare $\mathcal{S}^{n,k}$ to the cone of PSD matrices, we study eigenvalues of $k$-locally PSD matrices. The key insight in this paper is that there is a convex cone $H(e_k^n)$ such that if $X \in \mathcal{S}^{n,k}$, then the vector of eigenvalues of $X$ is contained in $H(e_k^n)$. The cone $H(e_k^n)$ is the hyperbolicity cone of the elementary symmetric polynomial $e_k^n$ (where $e_k^n(x) = \sum_{S \subseteq [n] : |S| = k} \prod_{i \in S} x_i$) with respect to the all ones vector. Using this insight, we are able to improve previously known upper bounds on the Frobenius distance between matrices in $\mathcal{S}^{n,k}$ and PSD matrices. We also study the quality of the convex relaxation $H(e^n_k)$. We first show that this relaxation is tight for the case $k = n-1$: for every vector in $H(e^n_{n-1})$ there exists a matrix in $\mathcal{S}^{n, n-1}$ whose eigenvalues are equal to the components of the vector. We then prove a structure theorem on nonsingular matrices in $\mathcal{S}^{n,k}$ all of whose $k\times k$ principal minors are zero, which we believe is of independent interest. This result shows that for $1< k < n-1$ “large parts” of the boundary of $H(e_k^n)$ do not intersect with the eigenvalues of matrices in $\mathcal{S}^{n,k}$.
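The two central objects, $k$-local PSD-ness and the elementary symmetric polynomial of the eigenvalues, are easy to probe numerically. A minimal sketch (numpy assumed; the example matrix is our own, chosen to be 2-locally PSD but not PSD):

```python
import itertools

import numpy as np

def elementary_symmetric(x, k):
    # e_k(x) = sum over all index sets S with |S| = k of prod_{i in S} x_i
    return sum(np.prod(x[list(S)])
               for S in itertools.combinations(range(len(x)), k))

def is_k_locally_psd(A, k, tol=1e-9):
    # True iff every k x k principal submatrix of symmetric A is PSD
    n = A.shape[0]
    return all(np.linalg.eigvalsh(A[np.ix_(S, S)]).min() >= -tol
               for S in itertools.combinations(range(n), k))

# 3 x 3 matrix with unit diagonal and off-diagonal entries -0.6:
# every 2 x 2 principal submatrix is PSD, but the full matrix is not
# (its eigenvalues are -0.2, 1.6, 1.6).
A = np.full((3, 3), -0.6) + 1.6 * np.eye(3)
eigs = np.linalg.eigvalsh(A)
```

Here $e_2$ of the eigenvalue vector is $1.92 \ge 0$, consistent with the eigenvalues of a 2-locally PSD matrix lying in the hyperbolicity-cone relaxation.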
Citation: SIAM Journal on Optimization
PubDate: 2022-04-25T07:00:00Z
DOI: 10.1137/20M1387407
Issue No: Vol. 32, No. 2 (2022)

• Finite Convergence of Sum-of-Squares Hierarchies for the Stability Number
of a Graph


Authors: Monique Laurent, Luis Felipe Vargas
Pages: 491 - 518
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 491-518, June 2022.
We investigate a hierarchy of semidefinite bounds $\vartheta^{(r)}(G)$ for the stability number $\alpha(G)$ of a graph $G$, based on its copositive programming formulation and introduced by de Klerk and Pasechnik [SIAM J. Optim., 12 (2002), pp. 875--892], who conjectured convergence to $\alpha(G)$ in $r=\alpha(G)-1$ steps. Even the weaker conjecture claiming finite convergence is still open. We establish links between this hierarchy and sum-of-squares hierarchies based on the Motzkin--Straus formulation of $\alpha(G)$, which we use to show finite convergence when $G$ is acritical, i.e., when $\alpha(G\setminus e)=\alpha(G)$ for all edges $e$ of $G$. This relies, in particular, on understanding the structure of the minimizers of the Motzkin--Straus formulation and showing that their number is finite precisely when $G$ is acritical. Moreover we show that these results hold in the general setting of the weighted stable set problem for graphs equipped with positive node weights. In addition, as a byproduct we show that deciding whether a standard quadratic program has finitely many minimizers does not admit a polynomial-time algorithm unless P=NP.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-25T07:00:00Z
DOI: 10.1137/21M140345X
Issue No: Vol. 32, No. 2 (2022)

• Stochastic Multilevel Composition Optimization Algorithms with
Level-Independent Convergence Rates


Authors: Krishnakumar Balasubramanian, Saeed Ghadimi, Anthony Nguyen
Pages: 519 - 544
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 519-544, June 2022.
In this paper, we study smooth stochastic multilevel composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients, through a stochastic first-order oracle. For solving this class of problems, we propose two algorithms using moving-average stochastic estimates, and analyze their convergence to an $\epsilon$-stationary point of the problem. We show that the first algorithm, which is a generalization of [S. Ghadimi, A. Ruszczynski, and M. Wang, SIAM J. Optim., 30 (2020), pp. 960--979] to the $T$ level case, can achieve a sample complexity of $\mathcal{O}_T(1/\epsilon^6)$ by using minibatches of samples in each iteration, where $\mathcal{O}_T$ hides constants that depend on $T$. By modifying this algorithm using linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}_T(1/\epsilon^4)$. This modification not only removes the requirement of having a minibatch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement. To the best of our knowledge, this is the first time that such an online algorithm designed for the (un)constrained multilevel setting obtains the same sample complexity as in the smooth single-level setting, under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-25T07:00:00Z
DOI: 10.1137/21M1406222
Issue No: Vol. 32, No. 2 (2022)

• A Subspace Acceleration Method for Minimization Involving a Group
Sparsity-Inducing Regularizer


Authors: Frank E. Curtis, Yutong Dai, Daniel P. Robinson
Pages: 545 - 572
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 545-572, June 2022.
We consider the problem of minimizing an objective function that is the sum of a convex function and a group sparsity-inducing regularizer. Problems that integrate such regularizers arise in modern machine learning applications, often for the purpose of obtaining models that are easier to interpret and that have higher predictive accuracy. We present a new method for solving such problems that utilizes subspace acceleration, domain decomposition, and support identification. Our analysis provides the global iteration complexity of obtaining an $\epsilon$-accurate solution and shows that, under common assumptions, the iterates locally converge superlinearly. Numerical results on regularized logistic and linear regression problems show that our approach is efficient and reliable and outperforms state-of-the-art methods on interesting classes of problems, especially when the number of data points is larger than the number of features. For solving problems when the number of data points is smaller than the number of features, algorithms that focus on solving a dual problem may be more efficient than our approach, which solves the primal problem.
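A group sparsity-inducing regularizer of the kind described here is typically $\lambda \sum_g \|x_g\|_2$, whose proximal operator is blockwise soft-thresholding and zeroes entire groups, which is what makes support identification possible. A minimal sketch (numpy assumed; the grouping and data are illustrative, not the paper's method):

```python
import numpy as np

def prox_group_l2(x, groups, lam):
    # prox of lam * sum_g ||x_g||_2: shrink each block toward zero,
    # zeroing any block whose norm is at most lam (support identification).
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * x[g]
    return out

x = np.array([3.0, 4.0, 0.3, -0.4])
groups = [[0, 1], [2, 3]]
z = prox_group_l2(x, groups, lam=1.0)   # second group is zeroed out
```

The first group (norm 5) is shrunk by the factor $1 - 1/5 = 0.8$; the second group (norm 0.5) falls below the threshold and is set exactly to zero.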
Citation: SIAM Journal on Optimization
PubDate: 2022-04-27T07:00:00Z
DOI: 10.1137/21M1411111
Issue No: Vol. 32, No. 2 (2022)

• Distributed Variable Sample-Size Gradient-Response and Best-Response
Schemes for Stochastic Nash Equilibrium Problems


Authors: Jinlong Lei, Uday V. Shanbhag
Pages: 573 - 603
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 573-603, June 2022.
This paper considers an $n$-player stochastic Nash equilibrium problem (NEP) in which the $i$th player minimizes a composite objective $f_i( \bullet ,x_{-i}) + r_i( \bullet )$, where $f_i$ is an expectation-valued smooth function, $x_{-i}$ is a tuple of rival decisions, and $r_i$ is a nonsmooth convex function with an efficient prox-evaluation. In this context, we make the following contributions. (I) Under suitable monotonicity assumptions on the pseudogradient map, we derive optimal rate statements and oracle complexity bounds for the proposed variable sample-size proximal stochastic gradient-response (VS-PGR) scheme when the sample-size increases at a geometric rate. If the sample-size increases at a polynomial rate with degree $v > 0$, the mean-squared error of the iterates decays at a corresponding polynomial rate; in particular, we prove that the iteration and oracle complexities to obtain an $\epsilon$-Nash equilibrium ($\epsilon$-NE) are $\mathcal{O}(1/\epsilon^{1/v})$ and $\mathcal{O}(1/\epsilon^{1+1/v})$, respectively. When the sample-size is held constant, the iterates converge geometrically to a neighborhood of the Nash equilibrium in an expected-value sense. (II) We then overlay VS-PGR with a consensus phase with a view towards developing distributed protocols for aggregative stochastic NEPs. In the resulting d-VS-PGR scheme, when the sample-size at each iteration grows at a geometric rate while the communication rounds per iteration grow at the rate of $k+1$, computing an $\epsilon$-NE requires similar iteration and oracle complexities to VS-PGR with a communication complexity of $\mathcal{O}(\ln^2(1/\epsilon))$. Notably, (I) and (II) rely on weaker oracle assumptions in that the conditional unbiasedness assumption is relaxed while the bound on the conditional second moment may be state-dependent.
(III) Under a suitable contractive property associated with the proximal best-response (BR) map, we design a variable sample-size proximal BR (VS-PBR) scheme, where each player solves a sample-average BR problem. When the sample-size increases at a suitable geometric rate, the resulting iterates converge at a geometric rate while the iteration and oracle complexity are, respectively, $\mathcal{O}(\ln(1/\epsilon))$ and $\mathcal{O}(1/\epsilon)$. If the sample-size increases at a polynomial rate with degree $v$, the mean-squared error decays at a corresponding polynomial rate while the iteration and oracle complexities are $\mathcal{O}(1/\epsilon^{1/v})$ and $\mathcal{O}(1/\epsilon^{1+1/v})$, respectively. (IV) Akin to (II), the distributed variant d-VS-PBR achieves similar iteration and oracle complexities to the centralized VS-PBR with a communication complexity of $\mathcal{O}(\ln^2(1/\epsilon))$ when the communication rounds per iteration increase at the rate of $k+1$. Finally, we present preliminary numerics to provide empirical support for the rate and complexity statements.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-27T07:00:00Z
DOI: 10.1137/20M1340071
Issue No: Vol. 32, No. 2 (2022)

• Generic Property of the Partial Calmness Condition for Bilevel Programming
Problems


Authors: Rongzhu Ke, Wei Yao, Jane J. Ye, Jin Zhang
Pages: 604 - 634
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 604-634, June 2022.
The partial calmness for the bilevel programming problem (BLPP) is an important condition which ensures that a local optimal solution of BLPP is a local optimal solution of a partially penalized problem where the lower-level optimality constraint is moved to the objective function and hence a weaker constraint qualification can be applied. In this paper, we propose a sufficient condition in the form of a partial error bound condition which guarantees the partial calmness condition. We analyze the partial calmness for the combined program based on the Bouligand (B) and the Fritz John (FJ) stationary conditions from a generic point of view. Our main result states that the partial error bound condition for the combined programs based on B and FJ conditions is generic for an important setting with applications in economics, and hence the partial calmness for the combined program is not a particularly stringent assumption. Moreover, we derive optimality conditions for the combined program for the generic case without any extra constraint qualifications and show the exact equivalence between our optimality condition and the one by Jongen and Shikhman [Math. Program., 136 (2012), pp. 65--89] given in implicit form. Our arguments are based on Jongen, Jonker, and Twilt's [Math. Program., 34 (1986), pp. 333--353] generic (five type) classification of the so-called generalized critical points for one-dimensional parametric optimization problems and Jongen and Shikhman's generic local reductions of BLPPs.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-28T07:00:00Z
DOI: 10.1137/20M1371403
Issue No: Vol. 32, No. 2 (2022)

• Condition Number Minimization in Euclidean Jordan Algebras


Authors: Alberto Seeger
Pages: 635 - 658
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 635-658, June 2022.
Let $b$ be a nonnegative vector in a Euclidean Jordan algebra $\mathbb{E}$. The condition number $\kappa(b)$ of a nonnegative vector is defined as the ratio between its largest and smallest eigenvalues; we assume $\kappa(b)$ is high, possibly infinite. We wish to perturb $b$ in such a way as to diminish its condition number as much as possible. The problem at hand is that of minimizing $\kappa(b+x)$ with respect to a perturbation vector $x$ taken from a given subset of $\mathbb{E}$. In spite of being nonsmooth and nonconvex, such an optimization problem can be handled with the machinery of Euclidean Jordan algebras and, in particular, with the theory of spectral functions.
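In the familiar Jordan algebra of real symmetric matrices, the eigenvalues are the usual matrix eigenvalues, so the objects above are easy to compute. A minimal sketch (numpy assumed; the matrices and perturbation are our own toy example, not the paper's method):

```python
import numpy as np

def condition_number(B):
    # kappa(b) = largest eigenvalue / smallest eigenvalue for a
    # nonnegative (PSD) element; infinite if the smallest is zero.
    lam = np.linalg.eigvalsh(B)          # ascending eigenvalues
    return np.inf if lam[0] <= 0 else lam[-1] / lam[0]

B = np.diag([4.0, 1.0])
kappa = condition_number(B)              # ratio 4/1
# a small diagonal perturbation raises the smallest eigenvalue
# and thereby lowers the condition number
kappa_pert = condition_number(B + np.diag([0.0, 1.0]))   # ratio 4/2
```

Choosing the perturbation that best reduces this ratio, over a constrained set of perturbations, is exactly the (nonsmooth, nonconvex) problem the paper studies.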
Citation: SIAM Journal on Optimization
PubDate: 2022-04-28T07:00:00Z
DOI: 10.1137/21M1400705
Issue No: Vol. 32, No. 2 (2022)

• On Obtaining the Convex Hull of Quadratic Inequalities via Aggregations


Authors: Santanu S. Dey, Gonzalo Muñoz, Felipe Serrano
Pages: 659 - 686
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 659-686, June 2022.
A classical approach for obtaining valid inequalities for a set involves the analysis of relaxations constructed using aggregations of the inequalities that describe such a set. When the set is described by linear inequalities, thanks to the Farkas lemma, we know that every valid inequality can be obtained using aggregations. When the inequalities describing the set are two quadratics, Yildiran [IMA J. Math. Control Inform., 26 (2009), pp. 417--450] showed that the convex hull of the set is given by at most two aggregated inequalities. In this work, we study the case of a set described by three or more quadratic inequalities. We show that, under technical assumptions, the convex hull of a set described by three quadratic inequalities can be obtained via (potentially infinitely many) aggregated inequalities. We also show, through counterexamples, that such a result does not hold if the technical conditions are relaxed or if we consider four or more inequalities.
Citation: SIAM Journal on Optimization
PubDate: 2022-04-28T07:00:00Z
DOI: 10.1137/21M1428583
Issue No: Vol. 32, No. 2 (2022)

• Zeroth-Order Regularized Optimization (ZORO): Approximately Sparse
Gradients and Adaptive Sampling


Authors: HanQin Cai, Daniel McKenzie, Wotao Yin, Zhenliang Zhang
Pages: 687 - 714
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 687-714, June 2022.
We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using only (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or black-box optimization. We propose a new zeroth-order regularized optimization method, dubbed ZORO. When the underlying gradient is approximately sparse at an iterate, ZORO needs very few objective function evaluations to obtain a new iterate that decreases the objective function. We achieve this with an adaptive, randomized gradient estimator, followed by an inexact proximal-gradient scheme. Under a novel approximately sparse gradient assumption and various different convex settings, we show that the (theoretical and empirical) convergence rate of ZORO is only logarithmically dependent on the problem dimension. Numerical experiments show that ZORO outperforms existing methods with similar assumptions, on both synthetic and real datasets.
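The sparse-gradient idea can be illustrated with a simplified sketch: estimate the gradient from Rademacher finite differences and keep only the largest entries. Note this is our own simplification, assuming numpy, with plain least squares plus hard thresholding standing in for ZORO's compressed-sensing decoder (which needs far fewer queries than the dimension):

```python
import numpy as np

def sparse_zeroth_order_grad(f, x, s, m=40, delta=1e-5, seed=0):
    # Estimate grad f from m Rademacher finite differences, then keep
    # the s largest entries. (Simplified: ZORO itself uses m << d and
    # a compressed-sensing decoder such as CoSaMP.)
    rng = np.random.default_rng(seed)
    Z = rng.choice([-1.0, 1.0], size=(m, x.size))
    y = np.array([(f(x + delta * z) - f(x)) / delta for z in Z])
    g, *_ = np.linalg.lstsq(Z, y, rcond=None)     # fit z . g = y
    small = np.argsort(np.abs(g))[:-s]            # all but the s largest
    g[small] = 0.0
    return g

f = lambda x: x[0] ** 2 + x[1] ** 2   # gradient is 2-sparse in 10 dims
x = np.ones(10)
g = sparse_zeroth_order_grad(f, x, s=2)
```

On this toy objective the estimator recovers the 2-sparse gradient $(2, 2, 0, \dots, 0)$ from only function evaluations.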
Citation: SIAM Journal on Optimization
PubDate: 2022-04-28T07:00:00Z
DOI: 10.1137/21M1392966
Issue No: Vol. 32, No. 2 (2022)

• Distributionally Robust Second-Order Stochastic Dominance Constrained
Optimization with Wasserstein Ball


Authors: Yu Mei, Jia Liu, Zhiping Chen
Pages: 715 - 738
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 715-738, June 2022.
We consider a distributionally robust second-order stochastic dominance constrained optimization problem. We require the dominance constraints to hold with respect to all probability distributions in a Wasserstein ball centered at the empirical distribution. We adopt the sample approximation approach to develop a linear programming formulation that provides a lower bound. We propose a novel split-and-dual decomposition framework which provides an upper bound. We establish quantitative convergence for both lower and upper approximations given some constraint qualification conditions. To efficiently solve the nonconvex upper bound problem, we use a sequential convex approximation algorithm. Numerical evidence on a portfolio selection problem validates the convergence and effectiveness of the proposed two approximation methods.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-02T07:00:00Z
DOI: 10.1137/21M1394412
Issue No: Vol. 32, No. 2 (2022)

• Active-Set Identification with Complexity Guarantees of an Almost Cyclic
2-Coordinate Descent Method with Armijo Line Search


Authors: Andrea Cristofari
Pages: 739 - 764
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 739-764, June 2022.
This paper establishes finite active-set identification of an almost cyclic 2-coordinate descent method for problems with one linear coupling constraint and simple bounds. First, general active-set identification results are stated for nonconvex objective functions. Then, under convexity and a quadratic growth condition (satisfied by any strongly convex function), complexity results on the number of iterations required to identify the active set are given. In our analysis, a simple Armijo line search is used to compute the stepsize, thus not requiring exact minimizations or additional information.
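A plain Armijo backtracking line search of the kind referenced here halves the stepsize until a sufficient-decrease condition holds, requiring only function values along the ray. A minimal one-dimensional sketch (pure Python; constants and the quadratic are illustrative, not the paper's method):

```python
def armijo_step(f, x, g, d, c=1e-4, beta=0.5, t0=1.0):
    # Backtrack until the sufficient-decrease (Armijo) condition
    #   f(x + t*d) <= f(x) + c * t * <g, d>
    # holds, where d is a descent direction and g the gradient at x.
    t = t0
    gd = sum(gi * di for gi, di in zip(g, d))
    while f([xi + t * di for xi, di in zip(x, d)]) > f(x) + c * t * gd:
        t *= beta
    return t

f = lambda x: (x[0] - 1.0) ** 2
x, g = [3.0], [4.0]                  # gradient at x is 2*(3 - 1) = 4
t = armijo_step(f, x, g, d=[-4.0])   # one backtrack: t = 1 fails, t = 0.5 passes
```

No exact minimization along the direction is needed, which is the point emphasized in the abstract.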
Citation: SIAM Journal on Optimization
PubDate: 2022-05-02T07:00:00Z
DOI: 10.1137/20M1328014
Issue No: Vol. 32, No. 2 (2022)

• Symmetry Reduction in AM/GM-Based Optimization


Authors: Philippe Moustrou, Helen Naumann, Cordian Riener, Thorsten Theobald, Hugues Verdure
Pages: 765 - 785
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 765-785, June 2022.
The arithmetic mean/geometric mean inequality (AM/GM inequality) facilitates classes of nonnegativity certificates and of relaxation techniques for polynomials and, more generally, for exponential sums. Here, we present a first systematic study of the AM/GM-based techniques in the presence of symmetries under the linear action of a finite group. We prove a symmetry-adapted representation theorem and develop techniques to reduce the size of the resulting relative entropy programs. We study in more detail the complexity gain in the case of the symmetric group. In this setup, we can show in particular certain stabilization results. We exhibit several sequences of examples in growing dimensions where the size of the reduced problem stabilizes. Finally, we provide some numerical results, emphasizing the computational speedup.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-02T07:00:00Z
DOI: 10.1137/21M1405691
Issue No: Vol. 32, No. 2 (2022)

• On the Linear Convergence of the Multimarginal Sinkhorn Algorithm


Authors: Guillaume Carlier
Pages: 786 - 794
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 786-794, June 2022.
The aim of this note is to give an elementary proof of linear convergence of the Sinkhorn algorithm for the entropic regularization of multimarginal optimal transport in the setting of general probability spaces. The proof simply relies on (i) the fact that Sinkhorn iterates are bounded, (ii) the strong convexity of the exponential on bounded intervals, and (iii) the convergence analysis of the coordinate descent (Gauss--Seidel) method of Beck and Tetruashvili [SIAM J. Optim., 23 (2013), pp. 2037--2060].
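In the standard two-marginal special case, the Sinkhorn iteration alternately rescales the rows and columns of the Gibbs kernel until the transport plan matches both marginals. A minimal discrete sketch (numpy assumed; the cost matrix, regularization, and iteration count are our own toy choices, not from the note):

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.5, iters=500):
    # Entropic OT: alternately rescale rows and columns of K = exp(-C/eps)
    # so the plan P = diag(u) K diag(v) matches the marginals a and b.
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)   # fit column marginals
        u = a / (K @ v)     # fit row marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
C = rng.random((4, 5))          # cost matrix
a = np.full(4, 0.25)            # uniform source marginal
b = np.full(5, 0.2)             # uniform target marginal
P = sinkhorn(C, a, b)
```

The linear convergence proved in the note (in the multimarginal setting) is what makes such modest iteration counts suffice in practice.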
Citation: SIAM Journal on Optimization
PubDate: 2022-05-09T07:00:00Z
DOI: 10.1137/21M1410634
Issue No: Vol. 32, No. 2 (2022)

• Differentially Private Accelerated Optimization Algorithms


Authors: Nurdan Kuru, Ş. İlker Birbil, Mert Gürbüzbalaban, Sinan Yildirim
Pages: 795 - 821
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 795-821, June 2022.
We present two classes of differentially private optimization algorithms derived from well-known accelerated first-order methods. The first algorithm is inspired by Polyak's heavy ball method and employs a smoothing approach to decrease the accumulated noise on the gradient steps required for differential privacy. The second class of algorithms is based on Nesterov's accelerated gradient method and its recent multistage variant. We propose a noise-dividing mechanism for the iterations of Nesterov's method in order to improve the error behavior of the algorithm. Convergence rate analyses are provided for both the heavy ball method and Nesterov's accelerated gradient method with the help of dynamical systems analysis techniques. Finally, we conclude with numerical experiments showing that the presented algorithms have advantages over well-known differentially private algorithms.
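For context, the basic building block of differentially private first-order methods is a clipped gradient step perturbed by Gaussian noise; the paper's smoothing and noise-dividing mechanisms refine how that noise accumulates across accelerated iterations. A generic single-step sketch (numpy assumed; this is the standard Gaussian-mechanism step, not the authors' algorithms):

```python
import numpy as np

def dp_gradient_step(x, grad, lr, clip, sigma, rng):
    # One differentially private gradient step: clip the gradient norm
    # to `clip`, then add Gaussian noise with scale sigma * clip.
    g = np.asarray(grad, dtype=float)
    g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
    noise = sigma * clip * rng.standard_normal(g.shape)
    return x - lr * (g + noise)

rng = np.random.default_rng(0)
x = np.array([2.0, -1.0])
# with sigma = 0 the update reduces to an ordinary clipped gradient step
x_next = dp_gradient_step(x, grad=[4.0, 0.0], lr=0.1, clip=1.0,
                          sigma=0.0, rng=rng)
```

The gradient of norm 4 is clipped to unit norm, so the noiseless step moves only the first coordinate by the learning rate.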
Citation: SIAM Journal on Optimization
PubDate: 2022-05-09T07:00:00Z
DOI: 10.1137/20M1355847
Issue No: Vol. 32, No. 2 (2022)

• Sequential Quadratic Optimization for Nonlinear Optimization Problems on
Riemannian Manifolds


Authors: Mitsuaki Obara, Takayuki Okuno, Akiko Takeda
Pages: 822 - 853
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 822-853, June 2022.
We consider optimization problems on Riemannian manifolds with equality and inequality constraints, which we call Riemannian nonlinear optimization (RNLO) problems. Although they have numerous applications, existing studies on them are limited, especially in terms of algorithms. In this paper, we propose Riemannian sequential quadratic optimization (RSQO), which uses a line-search technique with an $\ell_{1}$ penalty function, as an extension of the standard SQO algorithm for constrained nonlinear optimization problems in Euclidean spaces to Riemannian manifolds. We prove its global convergence to a Karush--Kuhn--Tucker point of the RNLO problem by means of parallel transport and the exponential mapping. Furthermore, we establish its local quadratic convergence by analyzing the relationship between sequences generated by RSQO and the Riemannian Newton method. Ours is the first algorithm to have both global and local convergence properties for constrained nonlinear optimization on Riemannian manifolds. Empirical results show that RSQO finds solutions more stably and with higher accuracy than the existing Riemannian penalty and augmented Lagrangian methods.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-10T07:00:00Z
DOI: 10.1137/20M1370173
Issue No: Vol. 32, No. 2 (2022)

• Fenchel Duality and a Separation Theorem on Hadamard Manifolds


Authors: Maurício Silva Louzeiro, Ronny Bergmann, Roland Herzog
Pages: 854 - 873
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 854-873, June 2022.
In this paper, we introduce a definition of Fenchel conjugate and Fenchel biconjugate on Hadamard manifolds based on the tangent bundle. Our definition overcomes the inconvenience that the conjugate depends on the choice of a certain point on the manifold, as previous definitions required. On the other hand, this new definition still possesses properties known to hold in the Euclidean case. It even yields a broader interpretation of the Fenchel conjugate in the Euclidean case itself. Most prominently, our definition of the Fenchel conjugate provides a Fenchel--Moreau theorem for geodesically convex, proper, lower semicontinuous functions. In addition, this framework allows us to develop a theory of separation of convex sets on Hadamard manifolds, and a strict separation theorem is obtained.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-10T07:00:00Z
DOI: 10.1137/21M1400699
Issue No: Vol. 32, No. 2 (2022)

• Convexification with Bounded Gap for Randomly Projected Quadratic
Optimization

Authors: Terunari Fuji, Pierre-Louis Poirion, Akiko Takeda
Pages: 874 - 899
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 874-899, June 2022.
Random projection techniques based on the Johnson--Lindenstrauss lemma are used to randomly aggregate the constraints or variables of optimization problems while approximately preserving their optimal values, which leads to smaller-scale optimization problems. D'Ambrosio et al. [Math. Program., 183 (2020), pp. 619--647] applied random projection to a quadratic optimization problem so as to decrease the number of decision variables. Although the problem size becomes smaller, the projected problem will almost surely be nonconvex if the original problem is nonconvex, and hence will be hard to solve. In this paper, exploiting the fact that random projection can alleviate the level of nonconvexity of a nonconvex quadratic optimization problem, we find an approximate global optimal value of the problem by reducing it to a smaller convex problem. To the best of our knowledge, our paper is the first to use random projection for convexification of nonconvex optimization problems. We evaluate the approximation error between the optimal values of a nonconvex optimization problem and its convexified, randomly projected counterpart.
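For orientation, a minimal numerical sketch of the Johnson--Lindenstrauss-type Gaussian sketching that such methods build on; the dimensions, scaling, and variable names below are illustrative assumptions, not the authors' construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 200  # original and projected dimension (illustrative)

# Gaussian sketch scaled so that E[ ||P x||^2 ] = ||x||^2
P = rng.normal(scale=1.0 / np.sqrt(k), size=(k, n))
x = rng.normal(size=n)

ratio = np.linalg.norm(P @ x) / np.linalg.norm(x)  # close to 1 w.h.p.
```

With $k = 200$ the norm of a fixed vector is typically preserved to within a few percent, which is the property random projection methods exploit to shrink a problem while approximately preserving optimal values.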
Citation: SIAM Journal on Optimization
PubDate: 2022-05-12T07:00:00Z
DOI: 10.1137/21M1433678
Issue No: Vol. 32, No. 2 (2022)

• A Proximal Quasi-Newton Trust-Region Method for Nonsmooth Regularized
Optimization

Authors: Aleksandr Y. Aravkin, Robert Baraldi, Dominique Orban
Pages: 900 - 929
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 900-929, June 2022.
We develop a trust-region method for minimizing the sum of a smooth term $f$ and a nonsmooth term $h$, both of which can be nonconvex. Each iteration of our method minimizes a possibly nonconvex model of $f + h$ in a trust region. The model coincides with $f + h$ in value and subdifferential at the center. We establish global convergence to a first-order stationary point when $f$ satisfies a smoothness condition that holds, in particular, when it has a Lipschitz-continuous gradient, and $h$ is proper and lower semicontinuous. The model of $h$ is required to be proper, lower semicontinuous, and prox-bounded. Under these weak assumptions, we establish a worst-case $O(1/\epsilon^2)$ iteration complexity bound that matches the best known complexity bound of standard trust-region methods for smooth optimization. We detail a special instance, named TR-PG, in which we use a limited-memory quasi-Newton model of $f$ and compute a step with the proximal gradient method, resulting in a practical proximal quasi-Newton method. We establish similar convergence properties and a complexity bound for a quadratic regularization variant, named R2, and provide an interpretation as a proximal gradient method with adaptive step size for nonconvex problems. R2 may also be used to compute steps inside the trust-region method, resulting in an implementation named TR-R2. We describe our Julia implementations and report numerical results on inverse problems from sparse optimization and signal processing. Both TR-PG and TR-R2 exhibit promising performance and compare favorably with two line-search proximal quasi-Newton methods based on convex models.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-19T07:00:00Z
DOI: 10.1137/21M1409536
Issue No: Vol. 32, No. 2 (2022)

• Optimization under Rare Chance Constraints

Authors: Shanyin Tong, Anirudh Subramanyam, Vishwas Rao
Pages: 930 - 958
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 930-958, June 2022.
Chance constraints provide a principled framework to mitigate the risk of high-impact extreme events by modifying the controllable properties of a system. The low probability and rare occurrence of such events, however, impose severe sampling and computational requirements on classical solution methods that render them impractical. This work proposes a novel sampling-free method for solving rare chance constrained optimization problems affected by uncertainties that follow general Gaussian mixture distributions. By integrating modern developments in large deviation theory with tools from convex analysis and bilevel optimization, we propose tractable formulations that can be solved by off-the-shelf solvers. Our formulations enjoy several advantages compared to classical methods: their size and complexity are independent of event rarity, they do not require linearity or convexity assumptions on system constraints, and, under easily verifiable conditions, they serve as safe conservative approximations or asymptotically exact reformulations of the true problem. Computational experiments on linear, nonlinear, and PDE-constrained problems from applications in portfolio management, structural engineering, and fluid dynamics illustrate the broad applicability of our method and its advantages over classical sampling-based approaches in terms of both accuracy and efficiency.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-19T07:00:00Z
DOI: 10.1137/20M1382490
Issue No: Vol. 32, No. 2 (2022)

• Column $\ell_{2,0}$-Norm Regularized Factorization Model of Low-Rank
Matrix Recovery and Its Computation

Authors: Ting Tao, Yitian Qian, Shaohua Pan
Pages: 959 - 988
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 959-988, June 2022.
This paper is concerned with the column $\ell_{2,0}$-regularized factorization model of low-rank matrix recovery problems and its computation. The column $\ell_{2,0}$-norm of factor matrices is introduced to promote column sparsity of factors and low-rank solutions. For this nonconvex discontinuous optimization problem, we develop an alternating majorization-minimization (AMM) method with extrapolation and a hybrid AMM in which a majorized alternating proximal method is proposed to seek an initial factor pair with fewer nonzero columns and the AMM with extrapolation is then employed to minimize a smooth nonconvex loss. We provide the global convergence analysis for the proposed AMM methods and apply them to the matrix completion problem with nonuniform sampling schemes. Numerical experiments are conducted with synthetic and real data examples, and comparison results with the nuclear-norm regularized factorization model and the max-norm regularized convex model show that the column $\ell_{2,0}$-regularized factorization model has an advantage in offering solutions of lower error and rank within less time.
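As a reminder of the regularizer involved, a one-line sketch with an illustrative matrix: the column $\ell_{2,0}$-"norm" of a factor counts its nonzero columns, which upper bounds the rank of the resulting factorization.

```python
import numpy as np

U = np.array([[1.0, 0.0, 2.0],
              [3.0, 0.0, 4.0]])   # second column is identically zero

# column l_{2,0}-"norm": the number of columns with nonzero Euclidean norm
col_l20 = int(np.count_nonzero(np.linalg.norm(U, axis=0)))
```

Here `col_l20` is 2, and for a factorization $UV^\top$ the rank is at most the number of nonzero columns shared by the factors, which is how penalizing this count promotes low-rank solutions.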
Citation: SIAM Journal on Optimization
PubDate: 2022-05-19T07:00:00Z
DOI: 10.1137/20M136205X
Issue No: Vol. 32, No. 2 (2022)

• Robust Markov Decision Processes with Data-Driven, Distance-Based
Ambiguity Sets

Authors: Sivaramakrishnan Ramani, Archis Ghate
Pages: 989 - 1017
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 989-1017, June 2022.
We consider finite- and infinite-horizon Markov decision processes (MDPs) with unknown state-transition probabilities. They are assumed to belong to certain ambiguity sets, and the goal is to maximize the worst-case expected total discounted reward over all probabilities from these sets. Specifically, the ambiguity set for any state-action-stage triplet is a ball---it includes all probability mass functions (pmfs) within a certain distance from the empirical pmf constructed using historical, independent observations of state transitions. We prove that optimal values in the resulting robust MDPs (RMDPs) converge to the optimal value of the true MDP if the radii of the ambiguity balls approach zero as the sample size diverges to infinity. In addition, robust optimal policies for sufficiently large sample sizes are optimal for the true MDP. These results rely on a sufficient condition that links convergence of pmfs with respect to the distance function with their componentwise convergence in an appropriate space. Further, for finite sample sizes, the optimal value of the RMDP provides a lower bound on the value of the robust optimal policy in the true MDP with high probability. A certain concentration inequality is sufficient for this out-of-sample performance guarantee. Several well-known distances satisfy these conditions. Numerical experiments suggest that one can choose from several distance functions to build computationally tractable RMDPs that exhibit good out-of-sample performance, and balance conservativeness with probabilistic guarantees.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-19T07:00:00Z
DOI: 10.1137/21M1423841
Issue No: Vol. 32, No. 2 (2022)

• Improved Performance Guarantees for Orthogonal Group Synchronization via
Generalized Power Method

Authors: Shuyang Ling
Pages: 1018 - 1048
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1018-1048, June 2022.
Given the noisy pairwise measurements among a set of unknown group elements, how does one recover them efficiently and robustly? This problem, known as group synchronization, has drawn tremendous attention in the scientific community. In this work, we focus on orthogonal group synchronization, which has found many applications, including computer vision, robotics, and cryo-electron microscopy. One commonly used approach is the least squares estimation that requires solving a highly nonconvex optimization program. The past few years have witnessed considerable advances in tackling this challenging problem by convex relaxation and efficient first order methods. However, one fundamental theoretical question remains to be answered: how does the recovery performance depend on the noise strength? To answer this question, we study a benchmark model: recovering orthogonal group elements from their pairwise measurements corrupted by additive Gaussian noise. We investigate the performance of convex relaxation and the generalized power method (GPM). By applying the novel leave-one-out technique, we prove that the GPM with spectral initialization enjoys linear convergence to the global optimum of the convex relaxation, which also matches the maximum likelihood estimator. Our result achieves a near-optimal performance bound on the convergence of the GPM and improves the state-of-the-art theoretical guarantees on the tightness of convex relaxation by a large margin.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-23T07:00:00Z
DOI: 10.1137/20M1389571
Issue No: Vol. 32, No. 2 (2022)

• Minimal Pairs of Convex Sets Which Share a Recession Cone

Authors: Jerzy Grzybowski, Ryszard Urbański
Pages: 1049 - 1068
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1049-1068, June 2022.
Robinson introduced a quotient space of pairs of unbounded convex sets which share their recession cone. In this paper, minimal pairs of unbounded convex sets, i.e., minimal representations of elements of Robinson's quotient spaces, are investigated. We prove that a minimal pair possessing a certain translation property is reduced. For pairs of two-dimensional sets, a formula for finding an equivalent minimal pair is given, a criterion for minimality of a pair of sets is presented, reducibility of all minimal pairs is proved, and nontrivial examples are shown. The Shephard--Weil--Schneider criterion for a polytopal summand of a compact convex set is generalized to unbounded convex sets. Finally, minimal pairs of unbounded convex sets are applied to finding Hartman's minimal representation of differences of convex functions.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-23T07:00:00Z
DOI: 10.1137/21M1410695
Issue No: Vol. 32, No. 2 (2022)

• Continuation Methods for Riemannian Optimization

Authors: Axel Séguin, Daniel Kressner
Pages: 1069 - 1093
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1069-1093, June 2022.
Numerical continuation in the context of optimization can be used to mitigate convergence issues due to a poor initial guess. In this work, we extend this idea to Riemannian optimization problems, that is, the minimization of a target function on a Riemannian manifold. For this purpose, a suitable homotopy is constructed between the original problem and a problem that admits an easy solution. We develop and analyze a path-following numerical continuation algorithm on manifolds for solving the resulting parameter-dependent equation. To illustrate our developments, we consider two typical classical applications of Riemannian optimization: the computation of the Karcher mean and low-rank matrix completion. We demonstrate that numerical continuation can yield improvements for challenging instances of both problems.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-23T07:00:00Z
DOI: 10.1137/21M1428650
Issue No: Vol. 32, No. 2 (2022)

• Local Linear Convergence of Alternating Projections in Metric Spaces with
Bounded Curvature

Authors: Adrian S. Lewis, Genaro Lopez-Acedo, Adriana Nicolae
Pages: 1094 - 1119
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1094-1119, June 2022.
We consider the popular and classical method of alternating projections for finding a point in the intersection of two closed sets. By situating the algorithm in a metric space, equipped only with well-behaved geodesics and angles (in the sense of Alexandrov), we are able to highlight the two key geometric ingredients in a standard intuitive analysis of local linear convergence. The first is a transversality-like condition on the intersection; the second is a convexity-like condition on one set: “uniform approximation by geodesics.”
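In the Euclidean special case, the method reduces to the familiar scheme of alternating the two metric projections; a minimal sketch with a disk and a line, where both sets, the starting point, and the iteration count are illustrative choices:

```python
import numpy as np

def proj_disk(p):
    # metric projection onto the closed unit disk
    r = np.linalg.norm(p)
    return p / r if r > 1 else p

def proj_line(p):
    # metric projection onto the line y = 0.5
    return np.array([p[0], 0.5])

p = np.array([3.0, 3.0])
for _ in range(200):
    p = proj_disk(proj_line(p))   # alternate the two projections
# p approaches a point of the intersection, here (sqrt(3)/2, 0.5)
```

The two sets here meet transversally, which is what drives the local linear convergence the paper analyzes in the far more general setting of spaces with bounded curvature.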
Citation: SIAM Journal on Optimization
PubDate: 2022-05-26T07:00:00Z
DOI: 10.1137/21M1431576
Issue No: Vol. 32, No. 2 (2022)

• Simple and Optimal Methods for Stochastic Variational Inequalities, II:
Markovian Noise and Policy Evaluation in Reinforcement Learning

Authors: Georgios Kotsalis, Guanghui Lan, Tianjiao Li
Pages: 1120 - 1155
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1120-1155, June 2022.
The focus of this paper is on stochastic variational inequalities (VIs) under Markovian noise. A prominent application of our algorithmic developments is the stochastic policy evaluation problem in reinforcement learning. Prior investigations in the literature focused on temporal difference (TD) learning by employing nonsmooth finite-time analysis motivated by stochastic subgradient descent, leading to certain limitations: the need to analyze a modified TD algorithm that involves projection onto an a priori defined Euclidean ball, a nonoptimal convergence rate, and no clear way of deriving the beneficial effects of parallel implementation. Our approach remedies these shortcomings in the broader context of stochastic VIs and, in particular, for stochastic policy evaluation. We develop a variety of simple TD-learning-type algorithms, motivated by the original TD algorithm, that maintain its simplicity while offering distinct advantages from a nonasymptotic analysis point of view. We first provide an improved analysis of the standard TD algorithm that can benefit from parallel implementation. Then we present versions of a conditional TD algorithm (CTD), involving periodic updates of the stochastic iterates, which reduce the bias and therefore exhibit improved iteration complexity. This brings us to the fast TD (FTD) algorithm, which combines elements of CTD and the stochastic operator extrapolation method of the companion paper. Under a novel index-resetting step size policy, FTD exhibits the best known convergence rate. We also devise a robust version of the algorithm that is particularly suitable for discount factors close to 1.
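For readers who want the baseline in code, a minimal tabular TD(0) policy-evaluation sketch on a two-state Markov reward process; the chain, rewards, and constant step size below are illustrative assumptions, not the authors' algorithms:

```python
import numpy as np

gamma = 0.9
P = np.array([[0.5, 0.5], [0.5, 0.5]])   # transition matrix (illustrative)
r = np.array([1.0, 0.0])                 # per-state rewards

rng = np.random.default_rng(0)
V = np.zeros(2)
s = 0
alpha = 0.005                            # constant step size
for _ in range(100_000):
    s_next = rng.choice(2, p=P[s])
    # TD(0) update toward the bootstrapped target r[s] + gamma * V[s_next]
    V[s] += alpha * (r[s] + gamma * V[s_next] - V[s])
    s = s_next

V_true = np.linalg.solve(np.eye(2) - gamma * P, r)   # analytic value function
```

The samples here are Markovian, exactly the noise structure whose nonasymptotic effects the paper analyzes; the improved CTD/FTD variants modify how and when such updates are applied.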
Citation: SIAM Journal on Optimization
PubDate: 2022-05-26T07:00:00Z
DOI: 10.1137/20M1381691
Issue No: Vol. 32, No. 2 (2022)

• Exponential Decay of Sensitivity in Graph-Structured Nonlinear Programs

Authors: Sungho Shin, Mihai Anitescu, Victor M. Zavala
Pages: 1156 - 1183
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1156-1183, June 2022.
We study solution sensitivity for nonlinear programs (NLPs) whose structures are induced by graphs. These NLPs arise in many applications such as dynamic optimization, stochastic optimization, optimization with partial differential equations, and network optimization. We show that for a given pair of nodes, the sensitivity of the primal-dual solution at one node against a data perturbation at the other node decays exponentially with respect to the distance between these two nodes on the graph. In other words, the solution sensitivity decays as one moves away from the perturbation point. This result, which we call exponential decay of sensitivity, holds under the strong second-order sufficiency condition and the linear independence constraint qualification. We also present conditions under which the decay rate remains uniformly bounded; this allows us to characterize the sensitivity behavior of NLPs defined over subgraphs of infinite graphs. The theoretical developments are illustrated with numerical examples.
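The phenomenon is easy to see on a toy chain-structured quadratic program $\min_x \frac{1}{2} x^\top A x - b^\top x$ with a tridiagonal $A$: the minimizer's response to perturbing the data at node 0 decays geometrically with graph distance. The matrix below is an illustrative stand-in for the graph-structured NLPs of the paper:

```python
import numpy as np

n = 30
# positive definite tridiagonal "chain graph" Hessian
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

e0 = np.zeros(n)
e0[0] = 1.0                        # perturb the data only at node 0
dx = np.linalg.solve(A, e0)        # response of the minimizer x* = A^{-1} b

decay = float(np.abs(dx[10] / dx[0]))   # response 10 hops away vs. at the source
```

For this matrix the response 10 hops from the perturbation is orders of magnitude smaller than at the source, mirroring the exponential decay of sensitivity established in the paper under SSOSC and LICQ.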
Citation: SIAM Journal on Optimization
PubDate: 2022-05-31T07:00:00Z
DOI: 10.1137/21M1391079
Issue No: Vol. 32, No. 2 (2022)

• A Vectorization Scheme for Nonconvex Set Optimization Problems

Authors: Gabriele Eichfelder, Ernest Quintana, Stefan Rocktäschel
Pages: 1184 - 1209
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1184-1209, June 2022.
In this paper, we study a solution approach for set optimization problems with respect to the lower set less relation. This approach can serve as a base for numerically solving set optimization problems by using established solvers from multiobjective optimization. Our strategy consists of deriving a parametric family of multiobjective optimization problems whose optimal solution sets approximate, in a specific sense, that of the set-valued problem with arbitrary accuracy. We also examine particular classes of set-valued mappings for which the corresponding set optimization problem is equivalent to a multiobjective optimization problem in the generated family. Surprisingly, this includes set-valued mappings with a convex graph.
Citation: SIAM Journal on Optimization
PubDate: 2022-05-31T07:00:00Z
DOI: 10.1137/21M143683X
Issue No: Vol. 32, No. 2 (2022)

• An Accelerated Method for Derivative-Free Smooth Stochastic Convex
Optimization

Authors: Eduard Gorbunov, Pavel Dvurechensky, Alexander Gasnikov
Pages: 1210 - 1238
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1210-1238, June 2022.
We consider an unconstrained problem of minimizing a smooth convex function which is only available through noisy observations of its values, the noise consisting of two parts. Similar to stochastic optimization problems, the first part is of stochastic nature. The second part is additive noise of unknown nature but bounded in absolute value. In the two-point feedback setting, i.e., when pairs of function values are available, we propose an accelerated derivative-free algorithm together with its complexity analysis. The complexity bound of our derivative-free algorithm is only by a factor of $\sqrt{n}$ larger than the bound for accelerated gradient-based algorithms, where $n$ is the dimension of the decision variable. We also propose a nonaccelerated derivative-free algorithm with a complexity bound similar to the stochastic gradient--based algorithm; that is, our bound does not have any dimension-dependent factor except a logarithmic one. Notably, if the difference between the starting point and the solution is a sparse vector, for both our algorithms, we obtain a better complexity bound if the algorithm uses a 1-norm proximal setup rather than the Euclidean proximal setup, which is a standard choice for unconstrained problems.
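A minimal, nonaccelerated sketch of the two-point feedback idea (not the authors' accelerated method): estimate a gradient from a pair of function values along a random direction and descend. The objective, step sizes, and smoothing radius are illustrative assumptions:

```python
import numpy as np

def f(x):
    return float(x @ x)           # smooth convex test objective

rng = np.random.default_rng(0)
d = 5
x = np.ones(d)
h, lr = 1e-4, 0.05                # smoothing radius and step size
for _ in range(500):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)        # random direction on the unit sphere
    # two-point estimator: unbiased for the gradient of this smoothed scheme
    g = d * (f(x + h * u) - f(x - h * u)) / (2 * h) * u
    x = x - lr * g

f_final = f(x)
```

Each iteration queries exactly one pair of function values, which is the two-point oracle model analyzed in the paper; the accelerated variant and the 1-norm proximal setup refine this basic template.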
Citation: SIAM Journal on Optimization
PubDate: 2022-06-01T07:00:00Z
DOI: 10.1137/19M1259225
Issue No: Vol. 32, No. 2 (2022)

• Constrained, Global Optimization of Unknown Functions with Lipschitz
Continuous Gradients

Authors: Abraham P. Vinod, Arie Israel, Ufuk Topcu
Pages: 1239 - 1264
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1239-1264, June 2022.
We present two first-order, sequential optimization algorithms to solve constrained optimization problems. We consider a black-box setting with a priori unknown, nonconvex objective and constraint functions that have Lipschitz continuous gradients. The proposed algorithms balance the exploration of the a priori unknown feasible space with the pursuit of global optimality within a prespecified finite number of first-order oracle calls. The first algorithm accommodates an infeasible start and provides either a near-optimal global solution or establishes infeasibility. However, the algorithm may produce infeasible iterates during the search. For a strongly convex constraint function and a feasible initial solution guess, the second algorithm returns a near-optimal global solution without any constraint violation. At each iteration, the algorithms identify the next query point by solving a nonconvex, quadratically constrained, quadratic program, characterized by the Lipschitz continuous gradient property of the functions and the oracle responses. In contrast to existing methods, the algorithms compute global suboptimality bounds at every iteration. They can also satisfy user-specified tolerances in the computed solution with near-optimal complexity in oracle calls for a large class of optimization problems. We implement the proposed algorithms using GUROBI, a commercial off-the-shelf solver.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-02T07:00:00Z
DOI: 10.1137/20M1380879
Issue No: Vol. 32, No. 2 (2022)

• Newton Differentiability of Convex Functions in Normed Spaces and of a
Class of Operators

Authors: Martin Brokate, Michael Ulbrich
Pages: 1265 - 1287
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1265-1287, June 2022.
Newton differentiability is an important concept for analyzing generalized Newton methods for nonsmooth equations. In this work, for a convex function defined on an infinite-dimensional space, we discuss the relation between Newton and Bouligand differentiability and upper semicontinuity of its subdifferential. We also construct a Newton derivative of an operator of the form $(Fx)(p) = f(x,p)$ for general nonlinear operators $f$ that possess a Newton derivative with respect to $x$ and also for the case where $f$ is convex in $x$.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-02T07:00:00Z
DOI: 10.1137/21M1449531
Issue No: Vol. 32, No. 2 (2022)

• On the Convergence of Stochastic Primal-Dual Hybrid Gradient

Authors: Ahmet Alacaoglu, Olivier Fercoq, Volkan Cevher
Pages: 1288 - 1318
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1288-1318, June 2022.
In this paper, we analyze the recently proposed stochastic primal-dual hybrid gradient (SPDHG) algorithm and provide new theoretical results. In particular, we prove almost sure convergence of the iterates to a solution under convexity, and linear convergence under additional structure, using standard step sizes independent of strong convexity or other regularity constants. In the general convex case, we also prove the $\mathcal{O}(1/k)$ convergence rate for the ergodic sequence with respect to the expected primal-dual gap function. Our assumption for linear convergence is metric subregularity, which is satisfied for strongly convex-strongly concave problems in addition to many nonsmooth and/or nonstrongly convex problems, such as linear programs, Lasso, and support vector machines. We also provide numerical evidence showing that SPDHG with standard step sizes shows a competitive practical performance against its specialized strongly convex variant SPDHG-$\mu$ and other state-of-the-art algorithms, including variance reduction methods.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-06T07:00:00Z
DOI: 10.1137/19M1296252
Issue No: Vol. 32, No. 2 (2022)

• Cutting Plane Generation through Sparse Principal Component Analysis

Authors: Santanu S. Dey, Aleksandr Kazachkov, Andrea Lodi, Gonzalo Munoz
Pages: 1319 - 1343
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1319-1343, June 2022.
Quadratically constrained quadratic programs (QCQPs) are optimization models whose remarkable expressiveness has made them a cornerstone of methodological research for nonconvex optimization problems. However, modern methods to solve a general QCQP fail to scale, encountering computational challenges even with just a few hundred variables. Specifically, a semidefinite programming (SDP) relaxation is typically employed, which provides strong dual bounds for QCQPs but relies on memory-intensive algorithms. An appealing alternative is to replace the SDP with an easier-to-solve linear programming relaxation while still achieving strong bounds. In this work, we make advances toward achieving this goal by developing a computationally efficient linear cutting plane algorithm that emulates the SDP-based approximations of nonconvex QCQPs. The cutting planes are required to be sparse, in order to ensure a numerically attractive approximation, and efficiently computable. We present a novel connection between such sparse cut generation and the sparse principal component analysis problem in statistics, which allows us to achieve these two goals. We show extensive computational results advocating for the use of our approach.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-13T07:00:00Z
DOI: 10.1137/21M1399956
Issue No: Vol. 32, No. 2 (2022)

• On the Generalized $\vartheta$-Number and Related Problems for Highly
Symmetric Graphs

Authors: Lennart Sinjorgo, Renata Sotirov
Pages: 1344 - 1378
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1344-1378, June 2022.
This paper is an in-depth analysis of the generalized $\vartheta$-number of a graph. The generalized $\vartheta$-number, $\vartheta_k(G)$, serves as a bound for both the $k$-multichromatic number of a graph and the maximum $k$-colorable subgraph problem. We present various properties of $\vartheta_k(G)$, such as that the sequence $(\vartheta_k(G))_k$ is increasing and bounded from above by the order of the graph $G$. We study $\vartheta_k(G)$ when $G$ is the strong, disjunction, or Cartesian product of two graphs. We provide closed form expressions for the generalized $\vartheta$-number on several classes of graphs including the Kneser graphs, cycle graphs, strongly regular graphs, and orthogonality graphs. Our paper provides bounds on the product and sum of the $k$-multichromatic number of a graph and its complement graph, as well as lower bounds for the $k$-multichromatic number on several graph classes including the Hamming and Johnson graphs.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-14T07:00:00Z
DOI: 10.1137/21M1414620
Issue No: Vol. 32, No. 2 (2022)

• The Bregman Proximal Average

Authors: Xianfu Wang, Heinz H. Bauschke
Pages: 1379 - 1401
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1379-1401, June 2022.
We provide a proximal average with respect to a 1-coercive Legendre function. In the sense of Bregman distance, the Bregman envelope of the proximal average is a convex combination of Bregman envelopes of individual functions. The Bregman proximal mapping of the average is a convex combination of convexified proximal mappings of individual functions. Techniques from variational analysis provide the keys for the Bregman proximal average.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-14T07:00:00Z
DOI: 10.1137/21M1442474
Issue No: Vol. 32, No. 2 (2022)

• Global Linear Convergence of Evolution Strategies on More than Smooth
Strongly Convex Functions

Authors: Youhei Akimoto, Anne Auger, Tobias Glasmachers, Daiki Morinaga
Pages: 1402 - 1429
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1402-1429, June 2022.
Evolution strategies (ESs) are zeroth-order stochastic black-box optimization heuristics invariant to monotonic transformations of the objective function. They evolve a multivariate normal distribution, from which candidate solutions are generated. Among different variants, CMA-ES is nowadays recognized as one of the state-of-the-art zeroth-order optimizers for difficult problems. Despite ample empirical evidence that ESs with a step-size control mechanism converge linearly, theoretical guarantees of linear convergence of ESs have been established only on limited classes of functions. In particular, theoretical results on convex functions are missing, where zeroth-order and also first-order optimization methods are often analyzed. In this paper, we establish almost sure linear convergence and a bound on the expected hitting time of an ES family, namely, the $(1+1)_\kappa$-ES, which includes the (1+1)-ES with (generalized) one-fifth success rule and an abstract covariance matrix adaptation with bounded condition number, on a broad class of functions. The analysis holds for monotonic transformations of positively homogeneous functions and of quadratically bounded functions, the latter of which particularly includes monotonic transformations of strongly convex functions with Lipschitz continuous gradients. As far as the authors know, this is the first work that proves linear convergence of ESs on such a broad class of functions.
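A minimal (1+1)-ES with a one-fifth success rule on the sphere function, the simplest member of the family analyzed here; the starting point, iteration count, and the success/failure step-size multipliers are illustrative choices:

```python
import numpy as np

def f(x):
    return float(x @ x)                  # sphere test function

rng = np.random.default_rng(0)
d = 5
x = np.ones(d)
sigma = 0.5
for _ in range(3000):
    y = x + sigma * rng.normal(size=d)   # Gaussian mutation
    if f(y) <= f(x):
        x = y
        sigma *= 1.5                     # success: enlarge the step size
    else:
        sigma *= 1.5 ** -0.25            # failure: shrink it (1/5 rule)

f_final = f(x)
```

The multipliers are balanced so that the step size is stationary at a success probability of 1/5; the adapted step size then tracks the distance to the optimum, which is the mechanism behind the linear convergence the paper proves for the generalized family.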
Citation: SIAM Journal on Optimization
PubDate: 2022-06-14T07:00:00Z
DOI: 10.1137/20M1373815
Issue No: Vol. 32, No. 2 (2022)

• Optimization on the Euclidean Unit Sphere

Authors: Jean B. Lasserre
Pages: 1430 - 1445
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1430-1445, June 2022.
We consider the problem of minimizing a continuously differentiable function $f$ of $m$ linear forms in $n$ variables on the Euclidean unit sphere. We show that this problem is equivalent to minimizing the same function of related $m$ linear forms (but now in $m$ variables) on the Euclidean unit ball. When the linear forms are known, this results in a drastic reduction in problem size whenever $m\ll n$ and allows us to solve such nonconvex problems at potentially large scale. We also provide a test to detect when a polynomial is a polynomial in a fixed number of forms. Finally, we identify two classes of functions with no spurious local minima on the sphere: (i) quasi-convex polynomials of odd degree and (ii) nonnegative and homogeneous functions.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-21T07:00:00Z
DOI: 10.1137/21M1433150
Issue No: Vol. 32, No. 2 (2022)

• Preference Robust Optimization for Choice Functions on the Space of CDFs

Authors: William B. Haskell, Huifu Xu, Wenjie Huang
Pages: 1446 - 1470
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1446-1470, June 2022.
In this paper, we consider decision-making problems where the decision maker's (DM's) utility/risk preferences are ambiguous but can be described by a general class of choice functions defined over the space of cumulative distribution functions (CDFs) of random prospects. These choice functions are assumed to satisfy two basic properties: (i) monotonicity w.r.t. the order on CDFs and (ii) quasiconcavity. We propose a maximin preference robust optimization (PRO) model where the optimal decision is based on the robust choice function from a set of choice functions elicited from available information on the DM's preferences. The current univariate utility PRO models are fundamentally based on von Neumann--Morgenstern's (VNM's) expected utility theory. Our new robust choice function model effectively generalizes them to one which captures common features of VNM's theory and Yaari's dual theory of choice. To evaluate our robust choice functions, we characterize the quasiconcave envelope of $L$-Lipschitz functions of a set of points. Subsequently, we propose two numerical methods for the DM's PRO problem: a projected level function method and a level search method. We apply our PRO model and numerical methods to a portfolio optimization problem and report test results.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-23T07:00:00Z
DOI: 10.1137/20M1316524
Issue No: Vol. 32, No. 2 (2022)

• General Feasibility Bounds for Sample Average Approximation via
Vapnik--Chervonenkis Dimension

Authors: Henry Lam, Fengpei Li
Pages: 1471 - 1497
Abstract: SIAM Journal on Optimization, Volume 32, Issue 2, Page 1471-1497, June 2022.
We investigate the feasibility of sample average approximation (SAA) for general stochastic optimization problems, including two-stage stochastic programs without relatively complete recourse. We utilize results from the Vapnik--Chervonenkis (VC) dimension and probably approximately correct learning to provide a general framework to construct feasibility bounds for SAA solutions under minimal structural or distributional assumption. We show that, as long as the hypothesis class formed by the feasible region has a finite VC dimension, the infeasibility of SAA solutions decreases exponentially with computable rates and explicitly identifiable accompanying constants. We demonstrate how our bounds apply more generally and competitively compared to existing results.
Citation: SIAM Journal on Optimization
PubDate: 2022-06-27T07:00:00Z
DOI: 10.1137/21M140211X
Issue No: Vol. 32, No. 2 (2022)

JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762

JournalTOCs © 2009-