Abstract: Quasi-convex optimization acts a pivotal part in many fields including economics and finance; the subgradient method is an effective iterative algorithm for solving large-scale quasi-convex optimization problems. In this paper, we investigate the quantitative convergence theory, including the iteration complexity and convergence rates, of various subgradient methods for solving quasi-convex optimization problems in a unified framework. In particular, we consider a sequence satisfying a general (inexact) basic inequality, and investigate the global convergence theorem and the iteration complexity when using the constant, diminishing or dynamic stepsize rules. More importantly, we establish the linear (or sublinear) convergence rates of the sequence under an additional assumption of weak sharp minima of Hölderian order and upper bounded noise. These convergence theorems are applied to establish the iteration complexity and convergence rates of several subgradient methods, including the standard/inexact/conditional subgradient methods, for solving quasi-convex optimization problems under the assumptions of the Hölder condition and/or the weak sharp minima of Hölderian order. PubDate: 2020-09-01

Abstract: The alternating direction method of multipliers (ADMM) is widely used to solve separable convex programming problems. At each iteration, the classical ADMM solves the subproblems exactly. For many problems arising from practical applications, it is usually impossible or too expensive to obtain the exact solution of a subproblem. To overcome this, a special proximal term is added to ease the solvability of a subproblem. In the literature, the proximal term can be relaxed to be indefinite while still with a convergence guarantee; this relaxation permits the adoption of larger step sizes to solve the subproblem, which particularly accelerates its performance. A large value of the relaxation factor introduced in the dual step of ADMM also plays a vital role in accelerating its performance. However, it is still not clear whether these two acceleration strategies can be used simultaneously with no restriction on the penalty parameter. In this paper, we answer this question affirmatively and conduct a rigorous convergence analysis for indefinite proximal ADMM with a relaxation factor (IP-ADMM \(_{r}\) ), reveal the relationships between the parameter in the indefinite proximal term and the relaxation factor to ensure its global convergence, and establish the worst-case convergence rate in the ergodic sense. Finally, some numerical results on basis pursuit and total variation-based denoising with box constraint problems are presented to verify the efficiency of IP-ADMM \(_{r}\) . PubDate: 2020-09-01

Abstract: In this paper we consider the classical unconstrained nonlinear multiobjective optimization problem. For such a problem, it is particularly interesting to compute as many points as possible in an effort to approximate the so-called Pareto front. Consequently, to solve the problem we define an “a posteriori” algorithm whose generic iterate is represented by a set of points rather than by a single one. The proposed algorithm takes advantage of a linesearch with extrapolation along steepest descent directions with respect to (possibly not all of) the objective functions. The sequence of sets of points produced by the algorithm defines a set of “linked” sequences of points. We show that each linked sequence admits at least one limit point (not necessarily distinct from those obtained by other sequences) and that every limit point is Pareto-stationary. We also report numerical results on a collection of multiobjective problems that show efficiency of the proposed approach over more classical ones. PubDate: 2020-09-01

Abstract: In this paper, we propose an adaptation of the classical augmented Lagrangian method for dealing with multi-objective optimization problems. Specifically, after a brief review of the literature, we give a suitable definition of Augmented Lagrangian for equality and inequality constrained multi-objective problems. We exploit this object in a general computational scheme that is proved to converge, under mild assumptions, to weak Pareto points of such problems. We then provide a modified version of the algorithm which is more suited for practical implementations, proving again convergence properties under reasonable hypotheses. Finally, computational experiments show that the proposed methods not only do work in practice, but are also competitive with respect to state-of-the-art methods. PubDate: 2020-09-01

Abstract: In this paper, we describe a new active-set algorithmic framework for minimizing a non-convex function over the unit simplex. At each iteration, the method makes use of a rule for identifying active variables (i.e., variables that are zero at a stationary point) and specific directions (that we name active-set gradient related directions) satisfying a new “nonorthogonality” type of condition. We prove global convergence to stationary points when using an Armijo line search in the given framework. We further describe three different examples of active-set gradient related directions that guarantee linear convergence rate (under suitable assumptions). Finally, we report numerical experiments showing the effectiveness of the approach. PubDate: 2020-09-01

Abstract: We consider a misspecified optimization problem, requiring the minimization of a function \(f(\cdot;\theta ^*)\) over a closed and convex set X where \(\theta ^*\) is an unknown vector of parameters that may be learnt by a parallel learning process. Here, we develop coupled schemes that generate iterates \((x_k,\theta _k)\) as \(k \rightarrow \infty\) , then \(x_k \rightarrow x^*\) , a minimizer of \(f(\cdot;\theta ^*)\) over X and \(\theta _k \rightarrow \theta ^*\) . In the first part of the paper, we consider the solution of problems where f is either smooth or nonsmooth. In smooth strongly convex regimes, we demonstrate that such schemes still display a linear rate of convergence, albeit with larger constants. When strong convexity assumptions are weakened, it can be shown that the convergence in function value sees a modification in the canonical convergence rate of \({{{\mathcal {O}}}}(1/K)\) by an additive factor proportional to \(\Vert \theta _0-\theta ^*\Vert\) where \(\Vert \theta _0-\theta ^*\Vert\) represents the initial misspecification in \(\theta ^*\) . In addition, when the learning problem is assumed to be merely convex but admits a suitable weak-sharpness property, then the convergence rate deteriorates to \({\mathcal {O}}(1/\sqrt{K})\) . In both convex and strongly convex regimes, diminishing steplength schemes are also provided and are less reliant on the knowledge of problem parameters. Finally, we present an averaging-based subgradient scheme that displays a rate of \({\mathcal {O}}(1/\sqrt{K})+ \mathcal{O}(\ \theta_0-\theta^*\ (1/K))\) , implying no effect on the canonical rate of \({{{\mathcal {O}}}}(1/\sqrt{K})\) . In the second part of the paper, we consider the solution of misspecified monotone variational inequality problems, motivated by the need to contend with more general equilibrium problems as well as the possibility of misspecification in the constraints. In this context, we first present a constant steplength misspecified extragradient scheme and prove its asymptotic convergence. This scheme is reliant on problem parameters (such as Lipschitz constants) and leads to a misspecified variant of iterative Tikhonov regularization, an avenue that does not necessitate the knowledge of such constants. PubDate: 2020-09-01

Abstract: This paper presents an approximation scheme for optimal control problems using finite-dimensional linear programs and interval analysis. This is done in two parts. Following Vinter approach (SIAM J Control Optim 31(2):518–538, 1993) and using occupation measures, the optimal control problem is written into a linear programming problem of infinite-dimension (weak formulation). Thanks to Interval arithmetic, we provide a relaxation of this infinite-dimensional linear programming problem by a finite dimensional linear programming problem. A proof that the optimal value of the finite dimensional linear programming problem is a lower bound to the optimal value of the control problem is given. Moreover, according to the fineness of the discretization and the size of the chosen test function family, obtained optimal values of each finite dimensional linear programming problem form a sequence of lower bounds which converges to the optimal value of the initial optimal control problem. Examples will illustrate the principle of the methodology. PubDate: 2020-09-01

Abstract: In this work, we introduce self-adaptive methods for solving variational inequalities with Lipschitz continuous and quasimonotone mapping(or Lipschitz continuous mapping without monotonicity) in real Hilbert space. Under suitable assumptions, the convergence of algorithms are established without the knowledge of the Lipschitz constant of the mapping. The results obtained in this paper extend some recent results in the literature. Some preliminary numerical experiments and comparisons are reported. PubDate: 2020-08-07

Abstract: In this paper, we study the interplay between acceleration and structure identification for the proximal gradient algorithm. While acceleration is generally beneficial in terms of functional decrease, we report and analyze several cases where its interplay with identification has negative effects on the algorithm behavior (iterates oscillation, loss of structure, etc.). Then, we present a generic method that tames acceleration when structure identification may be at stake; it benefits from a convergence rate that matches the one of the accelerated proximal gradient under some qualifying condition. We show empirically that the proposed method is much more stable in terms of subspace identification compared to the accelerated proximal gradient method while keeping a similar functional decrease. PubDate: 2020-08-06

Abstract: Algorithmic differentiation (AD) is a widely-used approach to compute derivatives of numerical models. Many numerical models include an iterative process to solve non-linear systems of equations. To improve efficiency and numerical stability, AD is typically not applied to the linear solvers. Instead, the differentiated linear solver call is replaced with hand-produced derivative code that exploits the linearity of the original call. In practice, the iterative linear solvers are often stopped prematurely to recompute the linearisation of the non-linear outer loop. We show that in the reverse-mode of AD, the derivatives obtained with partial convergence become inconsistent with the original and the tangent-linear models, resulting in inaccurate adjoints. We present a correction term that restores consistency between adjoint and tangent-linear gradients if linear systems are only partially converged. We prove the consistency of this correction term and show in numerical experiments that the accuracy of adjoint gradients of an incompressible flow solver applied to an industrial test case is restored when the correction term is used. PubDate: 2020-08-03

Abstract: Global optimization problems whose objective function is expensive to evaluate can be solved effectively by recursively fitting a surrogate function to function samples and minimizing an acquisition function to generate new samples. The acquisition step trades off between seeking for a new optimization vector where the surrogate is minimum (exploitation of the surrogate) and looking for regions of the feasible space that have not yet been visited and that may potentially contain better values of the objective function (exploration of the feasible space). This paper proposes a new global optimization algorithm that uses inverse distance weighting (IDW) and radial basis functions (RBF) to construct the acquisition function. Rather arbitrary constraints that are simple to evaluate can be easily taken into account. Compared to Bayesian optimization, the proposed algorithm, that we call GLIS (GLobal minimum using Inverse distance weighting and Surrogate radial basis functions), is competitive and computationally lighter, as we show in a set of benchmark global optimization and hyperparameter tuning problems. MATLAB and Python implementations of GLIS are available at http://cse.lab.imtlucca.it/~bemporad/glis. PubDate: 2020-07-27

Abstract: This paper is devoted to a numerical method for the approximation of a class of free boundary problems of Bernoulli’s type, reformulated as optimal shape design problems with appropriate shape functionals. We show the existence of the shape derivative of the cost functional on a class of admissible domains and compute its shape derivative by using the formula proposed in Boulkhemair (SIAM J Control Optim 55(1):156–171, 2017) and Boulkhemair and Chakib (J Convex Anal 21(1):67–87, 2014), that is, by means of support functions. On the numerical level, this allows us to avoid the tedious computations of the method based on vector fields. A gradient method combined with a boundary element method is performed for the approximation of this problem, in order to overcome the re-meshing task required by the finite element method. Finally, we present some numerical results and simulations concerning practical applications, showing the effectiveness of the proposed approach. PubDate: 2020-07-21

Abstract: In this paper, we combine the positive aspects of the gradient sampling (GS) and bundle methods, as the most efficient methods in nonsmooth optimization, to develop a robust method for solving unconstrained nonsmooth convex optimization problems. The main aim of the proposed method is to take advantage of both GS and bundle methods, meanwhile avoiding their drawbacks. At each iteration of this method, to find an efficient descent direction, the GS technique is utilized for constructing a local polyhedral model for the objective function. If necessary, via an iterative improvement process, this initial polyhedral model is improved by some techniques inspired by the bundle and GS methods. The convergence of the method is studied, which reveals that the global convergence property of our method is independent of the number of gradient evaluations required to establish and improve the initial polyhedral models. Thus, the presented method needs much fewer gradient evaluations in comparison to the original GS method. Furthermore, by means of numerical simulations, we show that the presented method provides promising results in comparison with GS methods, especially for large scale problems. Moreover, in contrast with some bundle methods, our method is not very sensitive to the accuracy of supplied gradients. PubDate: 2020-07-20

Abstract: In this work we introduce and study novel Quasi Newton minimization methods based on a Hessian approximation Broyden Class-type updating scheme, where a suitable matrix \(\tilde{B}_k\) is updated instead of the current Hessian approximation \(B_k\) . We identify conditions which imply the convergence of the algorithm and, if exact line search is chosen, its quadratic termination. By a remarkable connection between the projection operation and Krylov spaces, such conditions can be ensured using low complexity matrices \(\tilde{B}_k\) obtained projecting \(B_k\) onto algebras of matrices diagonalized by products of two or three Householder matrices adaptively chosen step by step. Experimental tests show that the introduction of the adaptive criterion, which theoretically guarantees the convergence, considerably improves the robustness of the minimization schemes when compared with a non-adaptive choice; moreover, they show that the proposed methods could be particularly suitable to solve large scale problems where L-BFGS is not able to deliver satisfactory performance. PubDate: 2020-07-14

Abstract: The distance between sets is a long-standing computational geometry problem. In robotics, the distance between convex sets with Minkowski sum structure plays a fundamental role in collision detection. However, it is typically nontrivial to be computed, even if the projection onto each component set admits explicit formula. In this paper, we explore the problem of calculating the distance between convex sets arising from robotics. Upon the recent progress in convex optimization community, the proposed model can be efficiently solved by the recent hot-investigated first-order methods, e.g., alternating direction method of multipliers or primal-dual hybrid gradient method. Preliminary numerical results demonstrate that those first-order methods are fairly efficient in solving distance problems in robotics. PubDate: 2020-07-13

Abstract: We consider nested variational inequalities consisting in a (upper-level) variational inequality whose feasible set is given by the solution set of another (lower-level) variational inequality. Purely hierarchical convex bilevel optimization problems and certain multi-follower games are particular instances of nested variational inequalities. We present an explicit and ready-to-implement Tikhonov-type solution method for such problems. We give conditions that guarantee the convergence of the proposed method. Moreover, inspired by recent works in the literature, we provide a convergence rate analysis. In particular, for the simple bilevel instance, we are able to obtain enhanced convergence results. PubDate: 2020-07-06

Abstract: We develop a unified level-bundle method, called accelerated constrained level-bundle (ACLB) algorithm, for solving constrained convex optimization problems. where the objective and constraint functions can be nonsmooth, weakly smooth, and/or smooth. ACLB employs Nesterov’s accelerated gradient technique, and hence retains the iteration complexity as that of existing bundle-type methods if the objective or one of the constraint functions is nonsmooth. More importantly, ACLB can significantly reduce iteration complexity when the objective and all constraints are (weakly) smooth. In addition, if the objective contains a nonsmooth component which can be written as a specific form of maximum, we show that the iteration complexity of this component can be much lower than that for general nonsmooth objective function. Numerical results demonstrate the effectiveness of the proposed algorithm. PubDate: 2020-07-06

Abstract: Compressed Sensing using \(\ell _1\) regularization is among the most powerful and popular sparsification technique in many applications, but why has it not been used to obtain sparse deep learning model such as convolutional neural network (CNN)' This paper is aimed to provide an answer to this question and to show how to make it work. Following Xiao (J Mach Learn Res 11(Oct):2543–2596, 2010), We first demonstrate that the commonly used stochastic gradient decent and variants training algorithm is not an appropriate match with \(\ell _1\) regularization and then replace it with a different training algorithm based on a regularized dual averaging (RDA) method. The RDA method of Xiao (J Mach Learn Res 11(Oct):2543–2596, 2010) was originally designed specifically for convex problem, but with new theoretical insight and algorithmic modifications (using proper initialization and adaptivity), we have made it an effective match with \(\ell _1\) regularization to achieve a state-of-the-art sparsity for the highly non-convex CNN compared to other weight pruning methods without compromising accuracy (achieving 95% sparsity for ResNet-18 on CIFAR-10, for example). PubDate: 2020-07-04

Abstract: The problem of recovering acoustic sources, more specifically monopoles, from point-wise measurements of the corresponding acoustic pressure at a limited number of frequencies is addressed. To this purpose, a family of sparse optimization problems in measure space in combination with the Helmholtz equation on a bounded domain is considered. A weighted norm with unbounded weight near the observation points is incorporated into the formulation. Optimality conditions and conditions for recovery in the small noise case are discussed, which motivates concrete choices of the weight. The numerical realization is based on an accelerated conditional gradient method in measure space and a finite element discretization. PubDate: 2020-06-29

Abstract: The exterior Bernoulli problem is rephrased into a shape optimization problem using a new type of objective function called the Dirichlet-data-gap cost function which measures the \(L^2\)-distance between the Dirichlet data of two state functions. The first-order shape derivative of the cost function is explicitly determined via the chain rule approach. Using the same technique, the second-order shape derivative of the cost function at the solution of the free boundary problem is also computed. The gradient and Hessian informations are then used to formulate an efficient second-order gradient-based descent algorithm to numerically solve the minimization problem. The feasibility of the proposed method is illustrated through various numerical examples. PubDate: 2020-06-25