Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract In this paper, we study the constrained group sparse regularization optimization problem, where the loss function is convex but nonsmooth, and the penalty term is the group sparsity which is then proposed to be relaxed by the group Capped- \(\ell _1\) for the convenience of computation. Firstly, we introduce three kinds of stationary points for the continuous relaxation problem and describe the relationship of the three kinds of stationary points. We give the optimality conditions for the group Capped- \(\ell _1\) problem and the original group sparse regularization problem, and investigate the link between the relaxation problem and the original problem in terms of global and local optimal solutions. The results reveal the equivalence of the original problem and the relaxation problem in some sense. Secondly, we propose a group smoothing proximal gradient (GSPG) algorithm for the constrained group sparse optimization, and prove that the proposed GSPG algorithm globally converges to a lifted stationary point of the relaxation problem. Finally, we present some numerical results on recovery of the simulated group sparse signals and the real group sparse images to illustrate the efficiency of the continuous relaxation model and the proposed algorithm. PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract In this paper, we propose a new sequential quadratic semidefinite programming (SQSDP) method for solving degenerate nonlinear semidefinite programs (NSDPs), in which we produce iteration points by solving a sequence of stabilized quadratic semidefinite programming (QSDP) subproblems, which we derive from the minimax problem associated with the NSDP. Unlike the existing SQSDP methods, the proposed one allows us to solve those QSDP subproblems inexactly, and each QSDP is feasible. One more remarkable point of the proposed method is that constraint qualifications or boundedness of Lagrange multiplier sequences are not required in the global convergence analysis. Specifically, without assuming such conditions, we prove the global convergence to a point satisfying any of the following: the stationary conditions for the feasibility problem, the approximate-Karush–Kuhn–Tucker (AKKT) conditions, and the trace-AKKT conditions. Finally, we conduct some numerical experiments to examine the efficiency of the proposed method. PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract Copositive optimization is a special case of convex conic programming, and it consists of optimizing a linear function over the cone of all completely positive matrices under linear constraints. Copositive optimization provides powerful relaxations of NP-hard quadratic problems or combinatorial problems, but there are still many open problems regarding copositive or completely positive matrices. In this paper, we focus on one such problem; finding a completely positive (CP) factorization for a given completely positive matrix. We treat it as a nonsmooth Riemannian optimization problem, i.e., a minimization problem of a nonsmooth function over a Riemannian manifold. To solve this problem, we present a general smoothing framework for solving nonsmooth Riemannian optimization problems and show convergence to a stationary point of the original problem. An advantage is that we can implement it quickly with minimal effort by directly using the existing standard smooth Riemannian solvers, such as Manopt. Numerical experiments show the efficiency of our method especially for large-scale CP factorizations. PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract For many problems, some of which are reviewed in the paper, popular algorithms like Douglas–Rachford (DR), ADMM, and FISTA produce approximating sequences that show signs of spiraling toward the solution. We present a meta-algorithm that exploits such dynamics to potentially enhance performance. The strategy of this meta-algorithm is to iteratively build and minimize surrogates for the Lyapunov function that captures those dynamics. As a first motivating application, we show that for prototypical feasibility problems the circumcentered-reflection method, subgradient projections, and Newton–Raphson are all describable as gradient-based methods for minimizing Lyapunov functions constructed for DR operators, with the former returning the minimizers of spherical surrogates for the Lyapunov function. As a second motivating application, we introduce a new method that shares these properties but with the added advantages that it: (1) does not rely on subproblems (e.g. reflections) and so may be applied for any operator whose iterates have the spiraling property; (2) provably has the aforementioned Lyapunov properties with few structural assumptions and so is generically suitable for primal/dual implementation; and (3) maps spaces of reduced dimension into themselves whenever the original operator does. This makes possible the first primal/dual implementation of a method that seeks the center of spiraling iterates. We describe this method, and provide a computed example (basis pursuit). PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract Difference of Convex (DC) optimization problems have objective functions that are differences between two convex functions. Representative ways of solving these problems are the proximal DC algorithms, which require that the convex part of the objective function have L-smoothness. In this article, we propose the Bregman Proximal DC Algorithm (BPDCA) for solving large-scale DC optimization problems that do not possess L-smoothness. Instead, it requires that the convex part of the objective function has the L-smooth adaptable property that is exploited in Bregman proximal gradient algorithms. In addition, we propose an accelerated version, the Bregman Proximal DC Algorithm with extrapolation (BPDCAe), with a new restart scheme. We show the global convergence of the iterates generated by BPDCA(e) to a limiting critical point under the assumption of the Kurdyka-Łojasiewicz property or subanalyticity of the objective function and other weaker conditions than those of the existing methods. We applied our algorithms to phase retrieval, which can be described both as a nonconvex optimization problem and as a DC optimization problem. Numerical experiments showed that BPDCAe outperformed existing Bregman proximal-type algorithms because the DC formulation allows for larger admissible step sizes. PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract The iteratively reweighted \(\ell _1\) algorithm is a widely used method for solving various regularization problems, which generally minimize a differentiable loss function combined with a convex/nonconvex regularizer to induce sparsity in the solution. However, the convergence and the complexity of iteratively reweighted \(\ell _1\) algorithms is generally difficult to analyze, especially for non-Lipschitz differentiable regularizers such as \(\ell _p\) norm regularization with \(0<p<1\) . In this paper, we propose, analyze and test a reweighted \(\ell _1\) algorithm combined with the extrapolation technique under the assumption of Kurdyka-Łojasiewicz (KL) property on the proximal function of the perturbed objective. Our method does not require the Lipschitz differentiability on the regularizers nor the smoothing parameters in the weights bounded away from 0. We show the proposed algorithm converges uniquely to a stationary point of the regularization problem and has local linear convergence for KL exponent at most 1/2 and local sublinear convergence for KL exponent greater than 1/2. We also provide results on calculating the KL exponents and discuss the cases when the KL exponent is at most 1/2. Numerical experiments show the efficiency of our proposed method. PubDate: 2022-12-01

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: This paper aims to develop distributed algorithms for nonconvex optimization problems with complicated constraints associated with a network. The network can be a physical one, such as an electric power network, where the constraints are nonlinear power flow equations, or an abstract one that represents constraint couplings between decision variables of different agents. Despite the recent development of distributed algorithms for nonconvex programs, highly complicated constraints still pose a significant challenge in theory and practice. We first identify some difficulties with the existing algorithms based on the alternating direction method of multipliers (ADMM) for dealing with such problems. We then propose a reformulation that enables us to design a two-level algorithm, which embeds a specially structured three-block ADMM at the inner level in an augmented Lagrangian method framework. Furthermore, we prove the global and local convergence as well as iteration complexity of this new scheme for general nonconvex constrained programs, and show that our analysis can be extended to handle more complicated multi-block inner-level problems. Finally, we demonstrate with computation that the new scheme provides convergent and parallelizable algorithms for various nonconvex applications, and is able to complement the performance of the state-of-the-art distributed algorithms in practice by achieving either faster convergence in optimality gap or in feasibility or both. PubDate: 2022-11-23

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract We consider a distributed non-convex optimization problem of minimizing the sum of all local cost functions over a network of agents. This problem often appears in large-scale distributed machine learning, known as non-convex empirical risk minimization. In this paper, we propose two accelerated algorithms, named DSGT-HB and DSGT-NAG, which combine the distributed stochastic gradient tracking (DSGT) method with momentum accelerated techniques. Under appropriate assumptions, we prove that both algorithms sublinearly converge to a neighborhood of a first-order stationary point of the distributed non-convex optimization. Moreover, we derive the conditions under which DSGT-HB and DSGT-NAG achieve a network-independent linear speedup. Numerical experiments for a distributed non-convex logistic regression problem on real data sets and a deep neural network on the MNIST database show the superiorities of DSGT-HB and DSGT-NAG compared with DSGT. PubDate: 2022-11-23

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract In this paper we present general-purpose preconditioners for regularized augmented systems, and their corresponding normal equations, arising from optimization problems. We discuss positive definite preconditioners, suitable for CG and MINRES. We consider “sparsifications" which avoid situations in which eigenvalues of the preconditioned matrix may become complex. Special attention is given to systems arising from the application of regularized interior point methods to linear or nonlinear convex programming problems. PubDate: 2022-11-14

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract We consider nonlinear optimization problems that involve surrogate models represented by neural networks. We demonstrate first how to directly embed neural network evaluation into optimization models, highlight a difficulty with this approach that can prevent convergence, and then characterize stationarity of such models. We then present two alternative formulations of these problems in the specific case of feedforward neural networks with ReLU activation: as a mixed-integer optimization problem and as a mathematical program with complementarity constraints. For the latter formulation we prove that stationarity at a point for this problem corresponds to stationarity of the embedded formulation. Each of these formulations may be solved with state-of-the-art optimization methods, and we show how to obtain good initial feasible solutions for these methods. We compare our formulations on three practical applications arising in the design and control of combustion engines, in the generation of adversarial attacks on classifier networks, and in the determination of optimal flows in an oil well network. PubDate: 2022-11-14

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract In the paper, a variant of the semismooth \(^{*}\) Newton method is developed for the numerical solution of generalized equations, in which the multi-valued part is a so-called SCD (subspace containing derivative) mapping. Under a rather mild regularity requirement, the method exhibits (locally) superlinear convergence behavior. From the main conceptual algorithm, two implementable variants are derived whose efficiency is tested via a generalized equation modeling a discretized static contact problem with Coulomb friction. PubDate: 2022-11-07

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract This paper considers a rank-constrained optimization problem where the objective function is continuously differentiable on a closed convex set. After replacing the rank constraint by an equality of the truncated difference of L1 and L2 norm, and adding the equality constraint into the objective to get a penalty problem, we prove that the penalty problem is exact in the sense that the set of its global (local) optimal solutions coincides with that of the original problem when the penalty parameter is over a certain threshold. This establishes the theoretical guarantee for the truncated difference of L1 and L2 norm regularization optimization including the work of Ma et al. (SIAM J Imaging Sci 10(3):1346–1380, 2017). Besides, for the penalty problem, we propose an extrapolation proximal difference of convex algorithm (epDCA) and prove the sequence generated by epDCA converges to a stationary point of the penalty problem. Further, an adaptive penalty method based on epDCA is constructed for the original rank-constrained problem. The efficiency of the algorithms is verified via numerical experiments for the nearest low-rank correlation matrix problem and the matrix completion problem. PubDate: 2022-11-05

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract The fast iterative shrinkage/thresholding algorithm (FISTA) is one of the most popular first-order iterations for minimizing the sum of two convex functions. FISTA is known to improve the complexity of the classical proximal gradient method (PGM) from \(O(k^{-1})\) to the optimal complexity \(O(k^{-2})\) in terms of the sequence of the functional values. When the evaluation of the proximal operator is hard, inexact versions of FISTA might be used to solve the problem. In this paper, we proposed an inexact version of FISTA by solving the proximal subproblem inexactly using a relative error criterion instead of exogenous and diminishing error rules. The introduced relative error rule in the FISTA iteration is related to the progress of the algorithm at each step and does not increase the computational burden per iteration. Moreover, the proposed algorithm recovers the same optimal convergence rate as FISTA. Some numerical experiments are also reported to illustrate the numerical behavior of the relative inexact method when compared with FISTA under an absolute error criterion. PubDate: 2022-11-04

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract We propose a stochastic first-order trust-region method with inexact function and gradient evaluations for solving finite-sum minimization problems. Using a suitable reformulation of the given problem, our method combines the inexact restoration approach for constrained optimization with the trust-region procedure and random models. Differently from other recent stochastic trust-region schemes, our proposed algorithm improves feasibility and optimality in a modular way. We provide the expected number of iterations for reaching a near-stationary point by imposing some probability accuracy requirements on random functions and gradients which are, in general, less stringent than the corresponding ones in literature. We validate the proposed algorithm on some nonconvex optimization problems arising in binary classification and regression, showing that it performs well in terms of cost and accuracy, and allows to reduce the burdensome tuning of the hyper-parameters involved. PubDate: 2022-11-03

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract FISTA is a popular convex optimisation algorithm which is known to converge at an optimal rate whenever a minimiser is contained in a suitable Hilbert space. We propose a modified algorithm where each iteration is performed in a subset which is allowed to change at every iteration. Sufficient conditions are provided for guaranteed convergence, although at a reduced rate depending on the conditioning of the specific problem. These conditions have a natural interpretation when a minimiser exists in an underlying Banach space. Typical examples are L1-penalised reconstructions where we provide detailed theoretical and numerical analysis. PubDate: 2022-11-02

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract This work deals with the efficient numerical characterization of Pareto stationary fronts for multiobjective optimal control problems with a moderate number of cost functionals and a mildly nonsmooth, elliptic, semilinear PDE-constraint. When “ample” controls are considered, strong stationarity conditions that can be used to numerically characterize the Pareto stationary fronts are known for our problem. We show that for finite dimensional controls, a sufficient adjoint-based stationarity system remains obtainable. It turns out that these stationarity conditions remain useful when numerically characterizing the fronts, because they correspond to strong stationarity systems for problems obtained by application of weighted-sum and reference point techniques to the multiobjective problem. We compare the performance of both scalarization techniques using quantifiable measures for the approximation quality. The subproblems of either method are solved with a line-search globalized pseudo-semismooth Newton method that appears to remove the degenerate behavior of the local version of the method employed previously. We apply a matrix-free, iterative approach to deal with the memory and complexity requirements when solving the subproblems of the reference point method and compare several preconditioning approaches. PubDate: 2022-08-28 DOI: 10.1007/s10589-022-00390-y

Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.

Abstract: Abstract The \(\ell _1\) -ball is a nicely structured feasible set that is widely used in many fields (e.g., machine learning, statistics and signal analysis) to enforce some sparsity in the model solutions. In this paper, we devise an active-set strategy for efficiently dealing with minimization problems over the \(\ell _1\) -ball and embed it into a tailored algorithmic scheme that makes use of a non-monotone first-order approach to explore the given subspace at each iteration. We prove global convergence to stationary points. Finally, we report numerical experiments, on two different classes of instances, showing the effectiveness of the algorithm. PubDate: 2022-08-27 DOI: 10.1007/s10589-022-00407-6