Mathematical Programming
Journal Prestige (SJR): 2.49 · Citation Impact (CiteScore): 3 · Number of Followers: 14 · Hybrid journal (may contain Open Access articles) · ISSN (Print): 0025-5610 · ISSN (Online): 1436-4646 · Published by Springer-Verlag
 Accelerating inexact successive quadratic approximation for regularized
optimization through manifold identification
Abstract: For regularized optimization that minimizes the sum of a smooth term and a regularizer that promotes structured solutions, inexact proximal-Newton-type methods, or successive quadratic approximation (SQA) methods, are widely used for their superlinear convergence in terms of iterations. However, unlike their counterparts in smooth optimization, they suffer from lengthy running time in solving regularized subproblems because even approximate solutions cannot be computed easily, so their empirical time cost is not as impressive. In this work, we first show that for partly smooth regularizers, although general inexact solutions cannot identify the active manifold that makes the objective function smooth, approximate solutions generated by commonly used subproblem solvers will identify this manifold, even with arbitrarily low solution precision. We then utilize this property to propose an improved SQA method, ISQA\(^{+}\), that switches to efficient smooth optimization methods after this manifold is identified. We show that for a wide class of degenerate solutions, ISQA\(^{+}\) possesses superlinear convergence not only in iterations, but also in running time, because the cost per iteration is bounded. In particular, our superlinear convergence result holds on problems satisfying a sharpness condition that is more general than that in the existing literature. We also prove iterate convergence under a sharpness condition for inexact SQA, which is novel for this family of methods, which can easily violate the classical relative-error condition frequently used in proving convergence under similar conditions. Experiments on real-world problems confirm that ISQA\(^{+}\) improves running time over some modern solvers for regularized optimization.
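As an illustrative aside (not code from the article): for the \(\ell _1\) regularizer, the active manifold is a sparsity pattern, and its identification is easy to observe with plain proximal gradient (ISTA) on a small lasso problem. The sketch below is a minimal demo under invented problem data and step sizes; it is not the ISQA\(^{+}\) method itself.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, b, lam, steps=500):
    """Proximal gradient on 0.5*||Ax - b||^2 + lam*||x||_1, recording supports."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    supports = []
    for _ in range(steps):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
        supports.append(tuple(np.flatnonzero(x)))  # the current "active manifold"
    return x, supports

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10); x_true[[1, 4]] = [2.0, -3.0]
b = A @ x_true
x, supports = ista(A, b, lam=1.0)
# The sparsity pattern stabilizes long before the iterates converge,
# which is the point at which one could switch to a smooth solver.
```

Once the support is fixed, the problem restricted to it is smooth, which is the switch the article exploits.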
PubDate: 2023-01-12

 Strong valid inequalities for a class of concave submodular minimization
problems under cardinality constraints
Abstract: We study the polyhedral convex hull structure of a mixed-integer set that arises in a class of cardinality-constrained concave submodular minimization problems. This class of problems has an objective function of the form \(f(a^\top x)\), where f is a univariate concave function, a is a nonnegative vector, and x is a binary vector of appropriate dimension. Such minimization problems frequently appear in applications involving risk aversion or economies of scale. We propose three classes of strong valid linear inequalities for this convex hull and specify their facet conditions when a has two distinct values. We show how to use these inequalities to obtain valid inequalities for general a containing multiple values. We further provide a complete linear convex hull description for this mixed-integer set when a contains two distinct values and the cardinality constraint upper bound is two. Our computational experiments on the mean-risk optimization problem demonstrate the effectiveness of the proposed inequalities in a branch-and-cut framework.
PubDate: 2023-01-09

 A unified single-loop alternating gradient projection algorithm for
nonconvex–concave and convex–nonconcave minimax problems
Abstract: Much recent research effort has been directed to the development of efficient algorithms for solving minimax problems with theoretical convergence guarantees, owing to the relevance of these problems to several emerging applications. In this paper, we propose a unified single-loop alternating gradient projection (AGP) algorithm for solving smooth nonconvex–(strongly) concave and (strongly) convex–nonconcave minimax problems. AGP employs simple gradient projection steps for updating the primal and dual variables alternately at each iteration. We show that it can find an \(\varepsilon \)-stationary point of the objective function in \({\mathcal {O}}\left( \varepsilon ^{-2} \right) \) (resp. \({\mathcal {O}}\left( \varepsilon ^{-4} \right) \)) iterations under the nonconvex–strongly concave (resp. nonconvex–concave) setting. Moreover, its gradient complexity to obtain an \(\varepsilon \)-stationary point of the objective function is bounded by \({\mathcal {O}}\left( \varepsilon ^{-2} \right) \) (resp., \({\mathcal {O}}\left( \varepsilon ^{-4} \right) \)) under the strongly convex–nonconcave (resp., convex–nonconcave) setting. To the best of our knowledge, this is the first time that a simple and unified single-loop algorithm has been developed for solving both nonconvex–(strongly) concave and (strongly) convex–nonconcave minimax problems. Moreover, complexity results for solving the latter (strongly) convex–nonconcave minimax problems have not previously been obtained in the literature. Numerical results show the efficiency of the proposed AGP algorithm. Furthermore, we extend the AGP algorithm by presenting a block alternating proximal gradient (BAPG) algorithm for solving more general multi-block nonsmooth nonconvex–(strongly) concave and (strongly) convex–nonconcave minimax problems. We can similarly establish the gradient complexity of the proposed algorithm under these four different settings.
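As an illustrative aside (not code from the article): the alternating gradient projection template is very short. The sketch below applies it to a toy strongly convex–strongly concave saddle problem over a box; the step sizes, box bounds, and problem are invented for the demo and carry none of the paper's guarantees for the nonconvex settings.

```python
import numpy as np

def project_box(z, lo=-1.0, hi=1.0):
    # Euclidean projection onto the box [lo, hi]
    return np.clip(z, lo, hi)

def agp(grad_x, grad_y, x, y, alpha=0.1, beta=0.1, steps=2000):
    """Alternating gradient projection for min_x max_y f(x, y) over boxes."""
    for _ in range(steps):
        x = project_box(x - alpha * grad_x(x, y))  # primal projected descent
        y = project_box(y + beta * grad_y(x, y))   # dual projected ascent (uses new x)
    return x, y

# Toy saddle: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, unique saddle point at (0, 0)
gx = lambda x, y: x + y   # df/dx
gy = lambda x, y: x - y   # df/dy
x, y = agp(gx, gy, x=np.array(0.9), y=np.array(-0.8))
```

With these small step sizes the alternating iteration contracts toward the saddle point.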
PubDate: 2023-01-02

 Sum-of-squares chordal decomposition of polynomial matrix inequalities

Abstract: We prove decomposition theorems for sparse positive (semi)definite polynomial matrices that can be viewed as sparsity-exploiting versions of the Hilbert–Artin, Reznick, Putinar, and Putinar–Vasilescu Positivstellensätze. First, we establish that a polynomial matrix P(x) with chordal sparsity is positive semidefinite for all \(x\in \mathbb {R}^n\) if and only if there exists a sum-of-squares (SOS) polynomial \(\sigma (x)\) such that \(\sigma P\) is a sum of sparse SOS matrices. Second, we show that setting \(\sigma (x)=(x_1^2 + \cdots + x_n^2)^\nu \) for some integer \(\nu \) suffices if P is homogeneous and positive definite globally. Third, we prove that if P is positive definite on a compact semialgebraic set \(\mathcal {K}=\{x:g_1(x)\ge 0,\ldots ,g_m(x)\ge 0\}\) satisfying the Archimedean condition, then \(P(x) = S_0(x) + g_1(x)S_1(x) + \cdots + g_m(x)S_m(x)\) for matrices \(S_i(x)\) that are sums of sparse SOS matrices. Finally, if \(\mathcal {K}\) is not compact or does not satisfy the Archimedean condition, we obtain a similar decomposition for \((x_1^2 + \cdots + x_n^2)^\nu P(x)\) with some integer \(\nu \ge 0\) when P and \(g_1,\ldots ,g_m\) are homogeneous of even degree. Using these results, we find sparse SOS representation theorems for polynomials that are quadratic and correlatively sparse in a subset of variables, and we construct new convergent hierarchies of sparsity-exploiting SOS reformulations for convex optimization problems with large and sparse polynomial matrix inequalities. Numerical examples demonstrate that these hierarchies can have significantly lower computational complexity than traditional ones.
PubDate: 2023-01-01

 Characterizing the universal rigidity of generic tensegrities

Abstract: A tensegrity is a structure made from cables, struts, and stiff bars. A d-dimensional tensegrity is universally rigid if it is rigid in any dimension \(d'\) with \(d'\ge d\). The celebrated super stability condition due to Connelly gives a sufficient condition for a tensegrity to be universally rigid. Gortler and Thurston showed that super stability characterizes universal rigidity when the point configuration is generic and every member is a stiff bar. We extend this result in two directions. We first show that a generic universally rigid tensegrity is super stable. We then extend this result to tensegrities with point group symmetry, and show that the characterization still holds as long as a tensegrity is generic modulo symmetry. Our strategy is based on the block-diagonalization technique for symmetric semidefinite programming problems, and our proof relies on the theory of real irreducible representations of finite groups.
PubDate: 2023-01-01

 Revisiting the approximate Carathéodory problem via the Frank–Wolfe
algorithm
Abstract: The approximate Carathéodory theorem states that given a compact convex set \({\mathcal {C}}\subset {\mathbb {R}}^n\) and \(p\in [2,+\infty [\), each point \(x^*\in {\mathcal {C}}\) can be approximated to \(\epsilon \)-accuracy in the \(\ell _p\)-norm as the convex combination of \({\mathcal {O}}(pD_p^2/\epsilon ^2)\) vertices of \({\mathcal {C}}\), where \(D_p\) is the diameter of \({\mathcal {C}}\) in the \(\ell _p\)-norm. A solution satisfying these properties can be built using probabilistic arguments or by applying mirror descent to the dual problem. We revisit the approximate Carathéodory problem by solving the primal problem via the Frank–Wolfe algorithm, providing a simplified analysis and leading to an efficient practical method. Furthermore, improved cardinality bounds are derived naturally using existing convergence rates of the Frank–Wolfe algorithm in different scenarios: when \(x^*\) is in the interior of \({\mathcal {C}}\), when \(x^*\) is the convex combination of a subset of vertices with small diameter, or when \({\mathcal {C}}\) is uniformly convex. We also propose cardinality bounds when \(p\in [1,2[\cup \{+\infty \}\) via a nonsmooth variant of the algorithm. Lastly, we address the problem of finding sparse approximate projections onto \({\mathcal {C}}\) in the \(\ell _p\)-norm, \(p\in [1,+\infty ]\).
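As an illustrative aside (not code from the article): the primal viewpoint is simply Frank–Wolfe applied to \(\tfrac{1}{2}\Vert x-x^*\Vert _2^2\); each linear-minimization step picks one vertex, so after k iterations the iterate is a convex combination of at most k+1 vertices. The sketch below uses the standard step size and an invented toy instance (vertices of a simplex).

```python
import numpy as np

def fw_caratheodory(V, x_star, iters=200):
    """Approximate x_star in conv(rows of V) as a sparse convex combination
    by running Frank-Wolfe on f(x) = 0.5*||x - x_star||_2^2."""
    weights = np.zeros(len(V))
    weights[0] = 1.0                      # start at an arbitrary vertex
    x = V[0].astype(float).copy()
    for k in range(1, iters + 1):
        grad = x - x_star
        i = int(np.argmin(V @ grad))      # linear minimization oracle over vertices
        gamma = 2.0 / (k + 2)             # standard open-loop step size
        weights *= (1 - gamma)
        weights[i] += gamma
        x = (1 - gamma) * x + gamma * V[i]
    return x, weights

# Vertices of the standard simplex in R^5; target is an interior point
V = np.eye(5)
x_star = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
x, w = fw_caratheodory(V, x_star)
# w records the convex combination; its support size bounds the cardinality
```

The recorded weights are exactly the sparse convex combination the theorem asks for.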
PubDate: 2023-01-01

 Stochastic first-order methods for convex and nonconvex functional
constrained optimization
Abstract: Functional constrained optimization is becoming more and more important in machine learning and operations research. Such problems have potential applications in risk-averse machine learning, semi-supervised learning, and robust optimization, among others. In this paper, we first present a novel Constraint Extrapolation (ConEx) method for solving convex functional constrained problems, which utilizes linear approximations of the constraint functions to define the extrapolation (or acceleration) step. We show that this method is a unified algorithm that achieves the best-known rate of convergence for solving different functional constrained convex composite problems, including convex or strongly convex, and smooth or nonsmooth problems with stochastic objective and/or stochastic constraints. Many of these rates of convergence were in fact obtained for the first time in the literature. In addition, ConEx is a single-loop algorithm that does not involve any penalty subproblems. Contrary to existing primal-dual methods, it does not require the projection of Lagrangian multipliers onto a (possibly unknown) bounded set. Second, for nonconvex functional constrained problems, we introduce a new proximal point method which transforms the initial nonconvex problem into a sequence of convex problems by adding quadratic terms to both the objective and constraints. Under a certain MFCQ-type assumption, we establish the convergence and rate of convergence of this method to KKT points when the convex subproblems are solved exactly or inexactly. For large-scale and stochastic problems, we present a more practical proximal point method in which the approximate solutions of the subproblems are computed by the aforementioned ConEx method. Under a strong feasibility assumption, we establish the total iteration complexity of ConEx required by this inexact proximal point method for a variety of problem settings, including nonconvex smooth or nonsmooth problems with stochastic objective and/or stochastic constraints. To the best of our knowledge, most of these convergence and complexity results of the proximal point method for nonconvex problems also appear to be new in the literature.
PubDate: 2023-01-01

 A reformulation-linearization technique for optimization over simplices

Abstract: We study nonconvex optimization problems over simplices. We show that for a large class of objective functions, the convex approximation obtained from the Reformulation-Linearization Technique (RLT) admits optimal solutions that exhibit a sparsity pattern. This characteristic of the optimal solutions allows us to conclude that (i) a linear matrix inequality constraint, which is often added to tighten the relaxation, is vacuously satisfied and can thus be omitted, and (ii) the number of decision variables in the RLT relaxation can be reduced from \({\mathcal {O}} (n^2)\) to \({\mathcal {O}} (n)\). Taken together, both observations allow us to reduce computation times by up to several orders of magnitude. Our results can be specialized to indefinite quadratic optimization problems over simplices and extended to nonconvex optimization problems over the Cartesian product of two simplices, as well as to specific classes of polyhedral and nonconvex feasible regions. Our numerical experiments illustrate the promising performance of the proposed framework.
PubDate: 2023-01-01

 Binary optimal control by trust-region steepest descent

Abstract: We present a trust-region steepest descent method for dynamic optimal control problems with binary-valued integrable control functions. Our method interprets the control function as an indicator function of a measurable set and makes set-valued adjustments derived from the sublevel sets of a topological gradient function. By combining this type of update with a trust-region framework, we show by theoretical argument that our method achieves asymptotic stationarity despite possible discretization errors and truncation errors during step determination. To demonstrate the practical applicability of our method, we solve two optimal control problems constrained by ordinary and partial differential equations, respectively, and one topological optimization problem.
PubDate: 2023-01-01

 Two-halfspace closure

Abstract: We define a new cutting plane closure for pure integer programs called the two-halfspace closure. It is a natural generalization of the well-known Chvátal–Gomory closure. We prove that the two-halfspace closure is polyhedral. We also study the corresponding two-halfspace rank of any valid inequality and show that it is at most the split rank of the inequality. Moreover, while the split rank can be strictly larger than the two-halfspace rank, the split rank is at most twice the two-halfspace rank. A key step of our analysis shows that the split closure of a rational polyhedron can be obtained by considering the split closures of all k-dimensional (rational) projections of the polyhedron, for any fixed \(k \ge 2\). This result may be of independent interest.
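As an illustrative aside (not code from the article): the Chvátal–Gomory closure that this work generalizes is built from cuts obtained by taking a nonnegative combination of valid inequalities and rounding down. The toy helper below derives one such cut; the function name and the tiny two-variable instance are invented for the demo.

```python
from fractions import Fraction
from math import floor

def cg_cut(rows, rhs, multipliers):
    """Derive a Chvatal-Gomory cut from the system Ax <= b, x >= 0 integer:
    form u^T A x <= u^T b with u >= 0, then round every coefficient and the
    right-hand side down (valid for the nonnegative integer points)."""
    n = len(rows[0])
    comb = [sum(Fraction(u) * Fraction(r[j]) for u, r in zip(multipliers, rows))
            for j in range(n)]
    comb_rhs = sum(Fraction(u) * Fraction(b) for u, b in zip(multipliers, rhs))
    return [floor(c) for c in comb], floor(comb_rhs)

# From 2*x1 + 2*x2 <= 3 with u = 1/2 we get x1 + x2 <= 3/2,
# and rounding down yields the CG cut x1 + x2 <= 1.
coeffs, b = cg_cut(rows=[[2, 2]], rhs=[3], multipliers=[Fraction(1, 2)])
```

The two-halfspace closure plays the same game, but with cuts valid for the integer points of the intersection of two halfspaces rather than one.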
PubDate: 2023-01-01

 Box-total dual integrality and edge-connectivity

Abstract: Given a graph \(G=(V,E)\) and an integer \(k\ge 1\), the graph \(H=(V,F)\), where F is a family of elements (with repetitions allowed) of E, is a k-edge-connected spanning subgraph of G if H cannot be disconnected by deleting any \(k-1\) elements of F. The convex hull of the incidence vectors of the k-edge-connected subgraphs of a graph G forms the k-edge-connected subgraph polyhedron of G. We prove that this polyhedron is box-totally dual integral if and only if G is series–parallel. In this case, we also provide an integer box-totally dual integral system describing this polyhedron.
PubDate: 2023-01-01

 Inexact accelerated high-order proximal-point methods

Abstract: In this paper, we present a new framework of bilevel unconstrained minimization for the development of accelerated methods in Convex Programming. These methods use approximations of high-order proximal points, which are solutions of some auxiliary parametric optimization problems. For computing these points, we can use different methods, and, in particular, lower-order schemes. This opens the possibility for the latter methods to surpass the traditional limits of Complexity Theory. As an example, we obtain a new second-order method with the convergence rate \(O\left( k^{-4}\right) \), where k is the iteration counter. This rate is better than the maximal possible rate of convergence for this type of method, as applied to functions with Lipschitz continuous Hessian. We also present new methods with an exact auxiliary search procedure, which have the rate of convergence \(O\left( k^{-(3p+1)/2}\right) \), where \(p \ge 1\) is the order of the proximal operator. The auxiliary problem at each iteration of these schemes is convex.
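As an illustrative aside (not code from the article): the object being accelerated here is the proximal-point iteration, in which each step solves a regularized subproblem, possibly inexactly with a cheaper inner method. The sketch below shows the plain first-order, inexact version on an invented quadratic; it is not the accelerated high-order scheme of the paper.

```python
import numpy as np

def proximal_point(grad, x0, lam=1.0, steps=50, inner_steps=100, inner_lr=0.05):
    """Basic inexact proximal-point method:
        x_{k+1} ~ argmin_x f(x) + ||x - x_k||^2 / (2*lam),
    where each subproblem is solved approximately by gradient descent."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        z = x.copy()
        for _ in range(inner_steps):            # inexact inner solve
            z -= inner_lr * (grad(z) + (z - x) / lam)
        x = z
    return x

# f(x) = 0.5 * x^T A x with A positive definite; the unique minimizer is 0
A = np.array([[2.0, 0.5], [0.5, 1.0]])
x = proximal_point(lambda v: A @ v, x0=[3.0, -2.0])
```

The inner regularized subproblem is much better conditioned than f itself, which is what lets a lower-order inner scheme do the heavy lifting.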
PubDate: 2023-01-01

 Variance reduction for root-finding problems

Abstract: Minimizing finite sums of smooth and strongly convex functions is an important task in machine learning. Recent work has developed stochastic gradient methods that optimize these sums with less computation than methods that do not exploit the finite-sum structure. This speedup results from using efficiently constructed stochastic gradient estimators, whose variance diminishes as the algorithm progresses. In this work, we ask whether the benefits of variance reduction extend to fixed-point and root-finding problems involving sums of nonlinear operators. Our main result shows that variance reduction offers a similar speedup when applied to a broad class of root-finding problems. We illustrate the result on three tasks involving sums of n nonlinear operators: averaged fixed point, monotone inclusion, and nonsmooth common minimizer problems. In certain “poorly conditioned regimes,” the proposed method offers an n-fold speedup over standard methods.
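As an illustrative aside (not code from the article): an SVRG-style variance-reduced estimator carries over from gradients to operator values almost verbatim. The sketch below finds a root of an average of affine operators; the function name, step size, and epoch structure are invented for the demo.

```python
import numpy as np

def svrg_root(ops, x0, eta=0.1, epochs=30, inner=100, seed=0):
    """SVRG-style variance-reduced iteration for a root of
    G(x) = (1/n) * sum_i ops[i](x). Each epoch computes the full operator once
    at a snapshot, then takes cheap corrected stochastic steps."""
    rng = np.random.default_rng(seed)
    n = len(ops)
    x = float(x0)
    for _ in range(epochs):
        snap = x
        full = sum(op(snap) for op in ops) / n      # full operator at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            g = ops[i](x) - ops[i](snap) + full     # variance-reduced estimate of G(x)
            x -= eta * g
    return x

# Root of the average of the operators x -> x - a_i is the mean of the a_i
a = np.array([1.0, 2.0, 6.0, 7.0])
ops = [lambda x, ai=ai: x - ai for ai in a]
x = svrg_root(ops, x0=0.0)
```

The correction term makes the stochastic estimate unbiased with variance that vanishes as x approaches the snapshot, which is the mechanism behind the claimed speedup.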
PubDate: 2023-01-01

 Multicriteria cuts and size-constrained k-cuts in hypergraphs

Abstract: We address counting and optimization variants of multicriteria global min-cut and size-constrained min-k-cut in hypergraphs. For an r-rank n-vertex hypergraph endowed with t hyperedge-cost functions, we show that the number of multiobjective min-cuts is \(O(r2^{tr}n^{3t-1})\). In particular, this shows that the number of parametric min-cuts in constant-rank hypergraphs for a constant number of criteria is strongly polynomial, thus resolving an open question by Aissi et al. (Math Program 154(1–2):3–28, 2015). In addition, we give randomized algorithms to enumerate all multiobjective min-cuts and all Pareto-optimal cuts in strongly polynomial time. We also address node-budgeted multiobjective min-cuts: for an n-vertex hypergraph endowed with t vertex-weight functions, we show that the number of node-budgeted multiobjective min-cuts is \(O(r2^{r}n^{t+2})\), where r is the rank of the hypergraph, and the number of node-budgeted b-multiobjective min-cuts for a fixed budget vector \(b\in {\mathbb {R}}^t_{\ge 0}\) is \(O(n^2)\). We show that min-k-cut in hypergraphs subject to constant lower bounds on part sizes is solvable in polynomial time for constant k, thus resolving an open problem posed by Guinez and Queyranne (Unpublished manuscript, 2012). Our technique also shows that the number of optimal solutions is polynomial. All of our results build on the random contraction approach of Karger (Proceedings of the 4th annual ACM-SIAM symposium on discrete algorithms, SODA, pp 21–30, 1993). Our techniques illustrate the versatility of the random contraction approach to address counting and algorithmic problems concerning multiobjective min-cuts and size-constrained k-cuts in hypergraphs.
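As an illustrative aside (not code from the article): the random contraction approach the authors build on is Karger's algorithm, shown below for ordinary graphs rather than hypergraphs. The helper names, trial count, and toy graph are invented for the demo; with a fixed seed and enough trials the minimum cut is found with overwhelming probability.

```python
import random

def contract_min_cut(edges, n_vertices, trials=300, seed=1):
    """Karger's random-contraction algorithm on a multigraph: repeatedly
    contract a uniformly random surviving edge until two super-vertices
    remain; the edges still crossing form a cut. Repeat and keep the best."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        parent = list(range(n_vertices))

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]   # path halving
                v = parent[v]
            return v

        groups = n_vertices
        pool = list(edges)
        while groups > 2:
            u, v = pool[rng.randrange(len(pool))]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                groups -= 1
            # drop self-loops; parallel edges are kept, as contraction requires
            pool = [(p, q) for (p, q) in pool if find(p) != find(q)]
        best = min(best, len(pool))
    return best

# Two triangles joined by a single bridge: the minimum cut has size 1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cut = contract_min_cut(edges, n_vertices=6)
```

Each trial succeeds with probability at least \(2/(n(n-1))\), so independent repetitions drive the failure probability down exponentially.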
PubDate: 2023-01-01

 On the robustness of potential-based flow networks

Abstract: Potential-based flows provide a simple yet realistic mathematical model of transport in many real-world infrastructure networks, such as gas or water networks, where the flow along each edge depends on the difference of the potentials at its end nodes. We call a network topology robust if the maximal node potential needed to satisfy a set of demands never increases when demands are decreased. This notion of robustness is motivated by infrastructure networks where users first make reservations for certain demands that may be larger than the actual flows sent later on. In these networks, node potentials correspond to physical quantities such as pressures or hydraulic heads and must be guaranteed to lie within a fixed range, even if the actual amounts are smaller than the previously reserved demands. Our main results are a precise characterization of robust network topologies for the case of point-to-point demands via forbidden node-labeled graph minors, as well as an efficient algorithm for testing robustness.
PubDate: 2023-01-01

 Strengthened SDP relaxation for an extended trust region subproblem with
an application to optimal power flow
Abstract: We study an extended trust region subproblem minimizing a nonconvex function over the hollow ball \(r \le \Vert x\Vert \le R\) intersected with a full-dimensional second-order cone (SOC) constraint of the form \(\Vert x - c\Vert \le b^T x - a\). In particular, we present a class of valid cuts that improve existing semidefinite programming (SDP) relaxations and are separable in polynomial time. We connect our cuts to the literature on the optimal power flow (OPF) problem by demonstrating that previously derived cuts capturing a convex hull important for OPF are actually just special cases of our cuts. In addition, we apply our methodology to derive a new class of closed-form, locally valid SOC cuts for nonconvex quadratic programs over the mixed polyhedral-conic set \(\{x \ge 0 : \Vert x \Vert \le 1 \}\). Finally, we show computationally on randomly generated instances that our cuts are effective in further closing the gap of the strongest SDP relaxations in the literature, especially in low dimensions.
PubDate: 2023-01-01

 Apportionment with parity constraints

Abstract: In the classic apportionment problem, the goal is to decide how many seats of a parliament should be allocated to each party as a result of an election. The divisor methods solve this problem by defining a notion of proportionality guided by some rounding rule. Motivated by recent challenges in the context of electoral apportionment, we consider the question of how to allocate the seats of a parliament under parity constraints between candidate types (e.g., an equal number of men and women elected) while at the same time satisfying party proportionality. We study two different approaches to this question. We first provide a theoretical analysis of a recently devised mechanism based on a greedy approach. We then propose and analyze a mechanism that follows the idea of biproportionality introduced by Balinski and Demange. In contrast with the classic biproportional method of Balinski and Demange, this mechanism is governed by two levels of proportionality: proportionality is satisfied at the level of parties by means of a divisor method, and then biproportionality is used to decide the number of candidates allocated to each type and party. A typical benchmark used in the context of two-dimensional apportionment is the fair share (a.k.a. matrix scaling), which corresponds to an ideal fractional biproportional solution. We provide lower bounds on the distance between these two types of solutions, and we explore their consequences in the context of two-dimensional apportionment.
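As an illustrative aside (not code from the article): a divisor method, the first level of proportionality above, is a few lines of code. The sketch below implements the D'Hondt rule on an invented five-party example; it does not handle the parity constraints that are the subject of the paper.

```python
import heapq

def dhondt(votes, seats):
    """D'Hondt divisor method: award each seat to the party with the largest
    current quotient votes/(s+1), where s is the party's seats so far."""
    alloc = [0] * len(votes)
    # max-heap via negated quotients
    heap = [(-v, i) for i, v in enumerate(votes)]
    heapq.heapify(heap)
    for _ in range(seats):
        q, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-votes[i] / (alloc[i] + 1), i))
    return alloc

votes = [340_000, 280_000, 160_000, 60_000, 15_000]
seats = dhondt(votes, 7)  # -> [3, 3, 1, 0, 0]
```

Different rounding rules (Sainte-Laguë, Adams, etc.) change only the divisor sequence in the pushed quotient.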
PubDate: 2022-12-28

 Consistent approximations in composite optimization

Abstract: Approximations of optimization problems arise in computational procedures and sensitivity analysis. The resulting effect on solutions can be significant, with even small approximations of components of a problem translating into large errors in the solutions. We specify conditions under which approximations are well behaved in the sense of minimizers, stationary points, and level-sets, which leads to a framework of consistent approximations. The framework is developed for a broad class of composite problems, which are neither convex nor smooth. We demonstrate the framework using examples from stochastic optimization, neural-network-based machine learning, distributionally robust optimization, penalty and augmented Lagrangian methods, interior-point methods, homotopy methods, smoothing methods, extended nonlinear programming, difference-of-convex programming, and multiobjective optimization. An enhanced proximal method illustrates the algorithmic possibilities. A quantitative analysis supplements the development by furnishing rates of convergence.
PubDate: 2022-12-28

 Resolvent splitting for sums of monotone operators with minimal lifting

Abstract: In this work, we study fixed point algorithms for finding a zero in the sum of \(n\ge 2\) maximally monotone operators by using their resolvents. More precisely, we consider the class of such algorithms where each resolvent is evaluated only once per iteration. For any algorithm from this class, we show that the underlying fixed point operator is necessarily defined on a d-fold Cartesian product space with \(d\ge n-1\). Further, we show that this bound is unimprovable by providing a family of examples for which \(d=n-1\) is attained. This family includes the Douglas–Rachford algorithm as the special case when \(n=2\). Applications of the new family of algorithms in distributed decentralised optimisation and multi-block extensions of the alternating direction method of multipliers (ADMM) are discussed.
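As an illustrative aside (not code from the article): the \(n=2\) member of this class, Douglas–Rachford, is compact enough to state in full. The sketch below runs it on an invented pair of quadratics whose proximal operators are available in closed form; it is the classical algorithm, not the paper's new family for general n.

```python
import numpy as np

def douglas_rachford(prox_f, prox_g, z0, steps=200):
    """Douglas-Rachford iteration for a zero of the sum of two maximally
    monotone operators, accessed through their resolvents (here: prox operators):
        x = prox_f(z);  z <- z + prox_g(2x - z) - x."""
    z = np.asarray(z0, dtype=float)
    for _ in range(steps):
        x = prox_f(z)
        z = z + prox_g(2 * x - z) - x
    return prox_f(z)       # the shadow sequence converges to a solution

# f = 0.5*||x - a||^2 and g = 0.5*||x - b||^2, so prox_f(z) = (z + a)/2 (unit step)
a, b = np.array([1.0, 0.0]), np.array([0.0, 2.0])
prox_f = lambda z: (z + a) / 2
prox_g = lambda z: (z + b) / 2
x = douglas_rachford(prox_f, prox_g, z0=np.zeros(2))
# Minimizer of f + g is the midpoint (a + b)/2
```

Each iteration evaluates each resolvent exactly once on a single copy of the space, matching the \(d=n-1=1\) lifting discussed in the abstract.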
PubDate: 2022-12-27

 Methodology and first-order algorithms for solving nonsmooth and
non-strongly convex bilevel optimization problems
Abstract: Simple bilevel problems are optimization problems in which we want to find an optimal solution to an inner problem that minimizes an outer objective function. Such problems appear in many machine learning and signal processing applications as a way to eliminate undesirable solutions. In our work, we suggest a new approach that is designed for bilevel problems with simple outer functions, such as the \(l_1\) norm, which are not required to be either smooth or strongly convex. In our new ITerative Approximation and Level-set EXpansion (ITALEX) approach, we alternate between expanding the level-set of the outer function and approximately optimizing the inner problem over this level-set. We show that optimizing the inner function through first-order methods such as proximal gradient and generalized conditional gradient results in a feasibility convergence rate of O(1/k), which up to now was a rate achieved only by bilevel algorithms for smooth and strongly convex outer functions. Moreover, we prove an \(O(1/\sqrt{k})\) rate of convergence for the outer function, in contrast to existing methods, which provide only asymptotic guarantees. We demonstrate this performance through numerical experiments.
PubDate: 2022-12-27
