Subjects -> MATHEMATICS (Total: 1013 journals)
    - APPLIED MATHEMATICS (92 journals)
    - GEOMETRY AND TOPOLOGY (23 journals)
    - MATHEMATICS (714 journals)
    - MATHEMATICS (GENERAL) (45 journals)
    - NUMERICAL ANALYSIS (26 journals)
    - PROBABILITIES AND MATH STATISTICS (113 journals)

MATHEMATICS (714 journals)

Journals sorted alphabetically
Results in Control and Optimization     Open Access  
Results in Mathematics     Hybrid Journal  
Results in Nonlinear Analysis     Open Access  
Review of Symbolic Logic     Full-text available via subscription   (Followers: 2)
Reviews in Mathematical Physics     Hybrid Journal   (Followers: 1)
Revista Baiana de Educação Matemática     Open Access  
Revista Bases de la Ciencia     Open Access  
Revista BoEM - Boletim online de Educação Matemática     Open Access  
Revista Colombiana de Matemáticas     Open Access   (Followers: 1)
Revista de Ciencias     Open Access  
Revista de Educación Matemática     Open Access  
Revista de la Escuela de Perfeccionamiento en Investigación Operativa     Open Access  
Revista de la Real Academia de Ciencias Exactas, Fisicas y Naturales. Serie A. Matematicas     Partially Free  
Revista de Matemática : Teoría y Aplicaciones     Open Access   (Followers: 1)
Revista Digital: Matemática, Educación e Internet     Open Access  
Revista Electrónica de Conocimientos, Saberes y Prácticas     Open Access  
Revista Integración : Temas de Matemáticas     Open Access  
Revista Internacional de Sistemas     Open Access  
Revista Latinoamericana de Etnomatemática     Open Access  
Revista Latinoamericana de Investigación en Matemática Educativa     Open Access  
Revista Matemática Complutense     Hybrid Journal  
Revista REAMEC : Rede Amazônica de Educação em Ciências e Matemática     Open Access  
Revista SIGMA     Open Access  
Ricerche di Matematica     Hybrid Journal  
RMS : Research in Mathematics & Statistics     Open Access  
Royal Society Open Science     Open Access   (Followers: 7)
Russian Journal of Mathematical Physics     Full-text available via subscription  
Russian Mathematics     Hybrid Journal  
Sahand Communications in Mathematical Analysis     Open Access  
Sampling Theory, Signal Processing, and Data Analysis     Hybrid Journal  
São Paulo Journal of Mathematical Sciences     Hybrid Journal  
Science China Mathematics     Hybrid Journal   (Followers: 1)
Science Progress     Full-text available via subscription   (Followers: 1)
Sciences & Technologie A : sciences exactes     Open Access  
Selecta Mathematica     Hybrid Journal   (Followers: 1)
SeMA Journal     Hybrid Journal  
Semigroup Forum     Hybrid Journal   (Followers: 1)
Set-Valued and Variational Analysis     Hybrid Journal  
SIAM Journal on Applied Mathematics     Hybrid Journal   (Followers: 11)
SIAM Journal on Computing     Hybrid Journal   (Followers: 11)
SIAM Journal on Control and Optimization     Hybrid Journal   (Followers: 18)
SIAM Journal on Discrete Mathematics     Hybrid Journal   (Followers: 8)
SIAM Journal on Financial Mathematics     Hybrid Journal   (Followers: 3)
SIAM Journal on Mathematics of Data Science     Hybrid Journal   (Followers: 1)
SIAM Journal on Matrix Analysis and Applications     Hybrid Journal   (Followers: 3)
SIAM Journal on Optimization     Hybrid Journal   (Followers: 12)
Siberian Advances in Mathematics     Hybrid Journal  
Siberian Mathematical Journal     Hybrid Journal  
Sigmae     Open Access  
SILICON     Hybrid Journal  
SN Partial Differential Equations and Applications     Hybrid Journal  
Soft Computing     Hybrid Journal   (Followers: 7)
Statistics and Computing     Hybrid Journal   (Followers: 14)
Stochastic Analysis and Applications     Hybrid Journal   (Followers: 3)
Stochastic Partial Differential Equations : Analysis and Computations     Hybrid Journal   (Followers: 2)
Stochastic Processes and their Applications     Hybrid Journal   (Followers: 6)
Stochastics and Dynamics     Hybrid Journal   (Followers: 2)
Studia Scientiarum Mathematicarum Hungarica     Full-text available via subscription   (Followers: 1)
Studia Universitatis Babeș-Bolyai Informatica     Open Access  
Studies In Applied Mathematics     Hybrid Journal   (Followers: 1)
Studies in Mathematical Sciences     Open Access   (Followers: 1)
Superficies y vacio     Open Access  
Suska Journal of Mathematics Education     Open Access   (Followers: 1)
Swiss Journal of Geosciences     Hybrid Journal   (Followers: 1)
Synthesis Lectures on Algorithms and Software in Engineering     Full-text available via subscription   (Followers: 2)
Synthesis Lectures on Mathematics and Statistics     Full-text available via subscription   (Followers: 1)
Tamkang Journal of Mathematics     Open Access  
Tatra Mountains Mathematical Publications     Open Access  
Teaching Mathematics     Full-text available via subscription   (Followers: 10)
Teaching Mathematics and its Applications: An International Journal of the IMA     Hybrid Journal   (Followers: 4)
Teaching Statistics     Hybrid Journal   (Followers: 8)
Technometrics     Full-text available via subscription   (Followers: 8)
The Journal of Supercomputing     Hybrid Journal   (Followers: 1)
The Mathematica Journal     Open Access  
The Mathematical Gazette     Full-text available via subscription   (Followers: 1)
The Mathematical Intelligencer     Hybrid Journal  
The Ramanujan Journal     Hybrid Journal  
The VLDB Journal     Hybrid Journal   (Followers: 2)
Theoretical and Mathematical Physics     Hybrid Journal   (Followers: 7)
Theory and Applications of Graphs     Open Access  
Topological Methods in Nonlinear Analysis     Full-text available via subscription  
Transactions of the London Mathematical Society     Open Access   (Followers: 1)
Transformation Groups     Hybrid Journal  
Turkish Journal of Mathematics     Open Access  
Ukrainian Mathematical Journal     Hybrid Journal  
Uniciencia     Open Access  
Uniform Distribution Theory     Open Access  
Unisda Journal of Mathematics and Computer Science     Open Access  
Unnes Journal of Mathematics     Open Access   (Followers: 1)
Unnes Journal of Mathematics Education     Open Access   (Followers: 2)
Unnes Journal of Mathematics Education Research     Open Access   (Followers: 1)
Ural Mathematical Journal     Open Access  
Vestnik Samarskogo Gosudarstvennogo Tekhnicheskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki     Open Access  
Vestnik St. Petersburg University: Mathematics     Hybrid Journal  
VFAST Transactions on Mathematics     Open Access   (Followers: 1)
Vietnam Journal of Mathematics     Hybrid Journal  
Vinculum     Full-text available via subscription  
Visnyk of V. N. Karazin Kharkiv National University. Ser. Mathematics, Applied Mathematics and Mechanics     Open Access   (Followers: 2)
Water SA     Open Access   (Followers: 1)
Water Waves     Hybrid Journal  
Zamm-Zeitschrift Fuer Angewandte Mathematik Und Mechanik     Hybrid Journal   (Followers: 1)
ZDM     Hybrid Journal   (Followers: 2)
Zeitschrift für angewandte Mathematik und Physik     Hybrid Journal   (Followers: 2)
Zeitschrift fur Energiewirtschaft     Hybrid Journal  
Zetetike     Open Access  

SIAM Journal on Mathematics of Data Science
Number of Followers: 1  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Online) 2577-0187
Published by the Society for Industrial and Applied Mathematics  [17 journals]
  • $k$-Variance: A Clustered Notion of Variance

      Authors: Justin Solomon, Kristjan Greenewald, Haikady Nagaraja
      Pages: 957 - 978
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 3, Page 957-978, September 2022.
      We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $k$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining $k$-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of ${\mathbb R}^n$. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-07-07T07:00:00Z
      DOI: 10.1137/20M1385895
      Issue No: Vol. 4, No. 3 (2022)
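
      The matching computation behind $k$-variance can be prototyped in a few lines. The sketch below is not the authors' implementation: it assumes a squared Euclidean ground cost, ignores the paper's exact normalization, and estimates the expected matching cost by Monte Carlo over pairs of $k$-sample batches, using SciPy's exact bipartite matcher in place of a general linear program.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def k_variance_mc(sample, k, n_trials=200, rng=None):
    """Monte Carlo estimate of a k-variance-style quantity: the average cost of
    an optimal bipartite matching between two independent batches of k points
    drawn without replacement from `sample`."""
    rng = np.random.default_rng(rng)
    sample = np.asarray(sample)
    costs = []
    for _ in range(n_trials):
        idx = rng.choice(len(sample), size=2 * k, replace=False)
        a, b = sample[idx[:k]], sample[idx[k:]]
        cost = cdist(a, b, metric="sqeuclidean")     # pairwise squared distances
        rows, cols = linear_sum_assignment(cost)     # optimal matching of the two batches
        costs.append(cost[rows, cols].mean())        # average matched cost per point
    return float(np.mean(costs))

points = np.random.default_rng(0).normal(size=(2000, 2))
print(k_variance_mc(points, k=10))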
       
  • Convergence of a Constrained Vector Extrapolation Scheme

      Authors: Mathieu Barré, Adrien Taylor, Alexandre d'Aspremont
      Pages: 979 - 1002
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 3, Page 979-1002, September 2022.
      Among extrapolation methods, Anderson acceleration (AA) is a popular technique for speeding up convergence of iterative processes toward their limit points. AA proceeds by extrapolating a better approximation of the limit using a weighted combination of previous iterates. Whereas AA was originally developed in the context of nonlinear integral equations, or to accelerate the convergence of iterative methods for solving linear systems, it is also used to extrapolate the solution of nonlinear systems. Simple additional stabilization strategies can be used in this context to control conditioning issues. In this work, we study a constrained vector extrapolation scheme based on an offline version of AA with fixed window size, for solving nonlinear systems arising in optimization problems, where the stabilization strategy consists in bounding the magnitude of the extrapolation weights. We provide explicit convergence bounds for this method and, as a by-product, upper bounds on a constrained version of the Chebyshev problem on polynomials.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-07-11T07:00:00Z
      DOI: 10.1137/21M1428030
      Issue No: Vol. 4, No. 3 (2022)
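
      A minimal offline Anderson-style extrapolation over a fixed window can be written as a small least-squares solve. The sketch below is illustrative only: it computes affine weights that minimize the combined residual norm and enforces the weight bound by naive clipping and renormalization, a crude stand-in for the constrained scheme analyzed in the paper.

import numpy as np

def offline_aa_extrapolate(iterates, bound=None, eps=1e-10):
    """Offline Anderson-style extrapolation over a fixed window of iterates
    x_0, ..., x_m of a fixed-point map: find affine weights c (summing to 1)
    that minimize the norm of the combined residual, then return the weighted
    combination of the mapped iterates x_1, ..., x_m."""
    X = np.asarray(iterates, dtype=float)      # shape (m+1, d)
    R = np.diff(X, axis=0)                     # residuals r_i = x_{i+1} - x_i
    G = R @ R.T + eps * np.eye(len(R))         # regularized Gram matrix of residuals
    y = np.linalg.solve(G, np.ones(len(R)))
    c = y / y.sum()                            # affine weights, sum to 1
    if bound is not None:                      # crude surrogate for the magnitude constraint
        c = np.clip(c, -bound, bound)
        c = c / c.sum()
    return c @ X[1:]

# Accelerate the scalar fixed-point iteration x <- cos(x).
xs = [np.array([1.0])]
for _ in range(5):
    xs.append(np.cos(xs[-1]))
print(offline_aa_extrapolate(xs, bound=5.0))   # extrapolated estimate of the fixed point (~0.739)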
       
  • Convergence of a Piggyback-Style Method for the Differentiation of
           Solutions of Standard Saddle-Point Problems

      Authors: Lea Bogensperger, Antonin Chambolle, Thomas Pock
      Pages: 1003 - 1030
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 3, Page 1003-1030, September 2022.
We analyze a “piggyback”-style method for computing the derivative of a loss which depends on the solution of a convex-concave saddle-point problem, with respect to the bilinear term. We attempt to derive guarantees for the algorithm under minimal regularity assumptions on the functions. Our final convergence results include possibly nonsmooth objectives. We illustrate the versatility of the proposed piggyback algorithm by learning optimized shearlet transforms, which are a class of popular sparsifying transforms in the field of imaging.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-07-14T07:00:00Z
      DOI: 10.1137/21M1455887
      Issue No: Vol. 4, No. 3 (2022)
       
  • DESTRESS: Computation-Optimal and Communication-Efficient Decentralized
           Nonconvex Finite-Sum Optimization

      Authors: Boyue Li, Zhize Li, Yuejie Chi
      Pages: 1031 - 1051
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 3, Page 1031-1051, September 2022.
Emerging applications in multiagent environments such as internet-of-things, networked sensing, autonomous systems, and federated learning call for decentralized algorithms for finite-sum optimization that are resource efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS) for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiencies of DESTRESS improve upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas including stochastic recursive gradient updates with minibatches for local computation, gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyperparameters and new analysis frameworks to provably achieve a desirable computation-communication trade-off.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-08-04T07:00:00Z
      DOI: 10.1137/21M1450677
      Issue No: Vol. 4, No. 3 (2022)
       
  • A Nonlinear Matrix Decomposition for Mining the Zeros of Sparse Data

      Authors: Lawrence K. Saul
      Pages: 431 - 463
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 431-463, June 2022.
We describe a simple iterative solution to a widely recurring problem in multivariate data analysis: given a sparse nonnegative matrix ${\mathbf{X}}$, how to estimate a low-rank matrix ${{\Theta}}$ such that ${\mathbf{X}} \approx f({{\Theta}})$, where $f$ is an elementwise nonlinearity? We develop a latent variable model for this problem and consider those sparsifying nonlinearities, popular in neural networks, that map all negative values to zero. The model seeks to explain the variability of sparse high-dimensional data in terms of a smaller number of degrees of freedom. We show that exact inference in this model is tractable and derive an expectation-maximization (EM) algorithm to estimate the low-rank matrix ${{\Theta}}$. Notably, we do not parameterize ${{\Theta}}$ as a product of smaller matrices to be alternately optimized; instead, we estimate ${{\Theta}}$ directly via the singular value decomposition of matrices that are repeatedly inferred (at each iteration of the EM algorithm) from the model's posterior distribution. We use the model to analyze large sparse matrices that arise from data sets of binary, grayscale, and color images. In all of these cases, we find that the model discovers much lower-rank decompositions than purely linear approaches.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-04-07T07:00:00Z
      DOI: 10.1137/21M1405769
      Issue No: Vol. 4, No. 2 (2022)
       
  • What Kinds of Functions Do Deep Neural Networks Learn? Insights from
           Variational Spline Theory

      Authors: Rahul Parhi, Robert D. Nowak
      Pages: 464 - 489
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 464-489, June 2022.
      We develop a variational framework to understand the properties of functions learned by fitting deep neural networks with rectified linear unit (ReLU) activations to data. We propose a new function space, which is related to classical bounded variation-type spaces, that captures the compositional structure associated with deep neural networks. We derive a representer theorem showing that deep ReLU networks are solutions to regularized data-fitting problems over functions from this space. The function space consists of compositions of functions from the Banach space of second-order bounded variation in the Radon domain. This Banach space has a sparsity-promoting norm, giving insight into the role of sparsity in deep neural networks. The neural network solutions have skip connections and rank-bounded weight matrices, providing new theoretical support for these common architectural choices. The variational problem we study can be recast as a finite-dimensional neural network training problem with regularization schemes related to the notions of weight decay and path-norm regularization. Finally, our analysis builds on techniques from variational spline theory, providing new connections between deep neural networks and splines.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-04-13T07:00:00Z
      DOI: 10.1137/21M1418642
      Issue No: Vol. 4, No. 2 (2022)
       
  • Statistical Methods for Minimax Estimation in Linear Models with Unknown
           Design Over Finite Alphabets

      Authors: Merle Behr, Axel Munk
      Pages: 490 - 513
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 490-513, June 2022.
We provide a minimax optimal estimation procedure for ${F}$ and ${\Omega}$ in matrix-valued linear models $Y = {{F}} {{\Omega}} + Z$, where the parameter matrix ${\Omega}$ and the design matrix ${F}$ are unknown but the latter takes values in a known finite set. The proposed finite alphabet linear model is justified in a variety of applications, ranging from signal processing to cancer genetics. We show that this allows one to separate ${F}$ and ${\Omega}$ uniquely under weak identifiability conditions, a task which is not possible in general. To this end we quantify, in the noiseless case, that is, $Z = 0$, the perturbation range of $Y$ in order to obtain stable recovery of ${F}$ and ${\Omega}$. Based on this, we derive an iterative Lloyd's-type estimation procedure that attains minimax estimation rates for ${\Omega}$ and ${F}$ for a Gaussian error matrix $Z$. In contrast to the least squares solution, the estimation procedure can be computed efficiently and scales linearly with the total number of observations. We confirm our theoretical results in a simulation study and illustrate them with a genetic sequencing data example.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-04-25T07:00:00Z
      DOI: 10.1137/21M1398860
      Issue No: Vol. 4, No. 2 (2022)
       
  • Approximation Bounds for Sparse Programs

      Authors: Armin Askari, Alexandre d'Aspremont, Laurent El Ghaoui
      Pages: 514 - 530
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 514-530, June 2022.
      We show that sparsity-constrained optimization problems over low-dimensional spaces tend to have a small duality gap. We use the Shapley--Folkman theorem to derive both data-driven bounds on the duality gap and an efficient primalization procedure to recover feasible points satisfying these bounds. These error bounds are proportional to the rate of growth of the objective with the target cardinality, which means in particular that the relaxation is nearly tight as soon as the target cardinality is large enough so that only uninformative features are added.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-04-27T07:00:00Z
      DOI: 10.1137/21M1398677
      Issue No: Vol. 4, No. 2 (2022)
       
  • Stochastic Geometry to Generalize the Mondrian Process

      Authors: Eliza O'Reilly, Ngoc Mai Tran
      Pages: 531 - 552
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 531-552, June 2022.
      The stable under iteration (STIT) tessellation process is a stochastic process that produces a recursive partition of space with cut directions drawn independently from a distribution over the sphere. The case of random axis-aligned cuts is known as the Mondrian process. Random forests and Laplace kernel approximations built from the Mondrian process have led to efficient online learning methods and Bayesian optimization. In this work, we utilize tools from stochastic geometry to resolve some fundamental questions concerning STIT processes in machine learning. First, we show that STIT processes can be efficiently simulated by lifting to a higher-dimensional axis-aligned Mondrian process. Second, we characterize all possible kernels that STIT processes and their mixtures can approximate. We also give a uniform convergence rate for the approximation error of the STIT kernels to the targeted kernels, completely generalizing the work of Balog et al. [The Mondrian kernel, 2016] from the Mondrian case. Third, we obtain consistency results for STIT forests in density estimation and regression. Finally, we give a precise formula for the density estimator arising from a STIT forest. This allows for precise comparisons between the STIT forest, the STIT kernel, and the targeted kernel in density estimation. Our paper calls for further developments at the novel intersection of stochastic geometry and machine learning.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-05-02T07:00:00Z
      DOI: 10.1137/20M1354490
      Issue No: Vol. 4, No. 2 (2022)
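
      The axis-aligned Mondrian special case mentioned in the abstract is easy to simulate. The sketch below samples a Mondrian partition of a box with a given lifetime budget; it is the standard generative construction, not the lifted STIT simulation proposed by the authors.

import numpy as np

def sample_mondrian(box, budget, t=0.0, rng=None):
    """Sample an axis-aligned Mondrian partition of `box` (a list of (low, high)
    intervals) with lifetime `budget`; returns the list of leaf boxes."""
    if not isinstance(rng, np.random.Generator):
        rng = np.random.default_rng(rng)
    lengths = np.array([hi - lo for lo, hi in box], dtype=float)
    t_cut = t + rng.exponential(1.0 / lengths.sum())        # cut time: rate = linear dimension
    if t_cut > budget:
        return [box]                                        # budget exhausted: leaf cell
    axis = rng.choice(len(box), p=lengths / lengths.sum())  # axis chosen by side length
    lo, hi = box[axis]
    cut = rng.uniform(lo, hi)                               # uniform cut position
    left, right = list(box), list(box)
    left[axis], right[axis] = (lo, cut), (cut, hi)
    return (sample_mondrian(left, budget, t_cut, rng)
            + sample_mondrian(right, budget, t_cut, rng))

leaves = sample_mondrian([(0.0, 1.0), (0.0, 1.0)], budget=3.0, rng=0)
print(len(leaves), "cells")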
       
  • Quantitative Approximation Results for Complex-Valued Neural Networks

      Authors: Andrei Caragea, Dae Gwan Lee, Johannes Maly, Götz Pfander, Felix Voigtlaender
      Pages: 553 - 580
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 553-580, June 2022.
Until recently, applications of neural networks in machine learning have almost exclusively relied on real-valued networks. It was recently observed, however, that complex-valued neural networks (CVNNs) exhibit superior performance in applications in which the input is naturally complex-valued, such as MRI fingerprinting. While the mathematical theory of real-valued networks has, by now, reached some level of maturity, this is far from true for complex-valued networks. In this paper, we analyze the expressivity of complex-valued networks by providing explicit quantitative error bounds for approximating $C^n$ functions on compact subsets of $\mathbb{C}^d$ by CVNNs that employ the modReLU activation function, given by $\sigma(z) = \mathrm{ReLU}(|z| - 1) \cdot \mathrm{sgn}(z)$, which is one of the most popular complex activation functions used in practice. We show that the derived approximation rates are optimal (up to log factors) in the class of modReLU networks with weights of moderate growth.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-05-02T07:00:00Z
      DOI: 10.1137/21M1429540
      Issue No: Vol. 4, No. 2 (2022)
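
      The modReLU activation itself is simple to implement. The following sketch (with the bias fixed to $-1$, matching the formula quoted in the abstract) scales the modulus by $\mathrm{ReLU}(|z| - 1)$ while preserving the phase, and maps the origin to zero.

import numpy as np

def modrelu(z, b=-1.0):
    """modReLU on complex inputs: shrink the modulus to ReLU(|z| + b) while
    keeping the phase z/|z|; the origin is mapped to 0.  With b = -1 this is
    sigma(z) = ReLU(|z| - 1) * sgn(z)."""
    mag = np.abs(z)
    scaled = np.maximum(mag + b, 0.0)                              # new modulus
    phase = np.where(mag > 0, z / np.where(mag > 0, mag, 1.0), 0.0)
    return scaled * phase

z = np.array([0.5 + 0.5j, 2.0 - 1.0j, -3.0j])
print(modrelu(z))   # the first entry is annihilated, the others keep their phase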
       
  • Wasserstein-Based Projections with Applications to Inverse Problems

      Authors: Howard Heaton, Samy Wu Fung, Alex Tong Lin, Stanley Osher, Wotao Yin
      Pages: 581 - 603
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 581-603, June 2022.
      Inverse problems consist of recovering a signal from a collection of noisy measurements. These are typically cast as optimization problems, with classic approaches using a data fidelity term and an analytic regularizer that stabilizes recovery. Recent plug-and-play (PnP) works propose replacing the operator for analytic regularization in optimization methods by a data-driven denoiser. These schemes obtain state-of-the-art results, but at the cost of limited theoretical guarantees. To bridge this gap, we present a new algorithm that takes samples from the manifold of true data as input and outputs an approximation of the projection operator onto this manifold. Under standard assumptions, we prove this algorithm generates a learned operator, called Wasserstein-based projection (WP), that approximates the true projection with high probability. Thus, WPs can be inserted into optimization methods in the same manner as PnP, but now with theoretical guarantees. Provided numerical examples show WPs obtain state-of-the-art results for unsupervised PnP signal recovery. All codes for this work can be found at https://github.com/swufung/WassersteinBasedProjections.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-05-05T07:00:00Z
      DOI: 10.1137/20M1376790
      Issue No: Vol. 4, No. 2 (2022)
       
  • Tukey Depths and Hamilton--Jacobi Differential Equations

      Authors: Martin Molina-Fructuoso, Ryan Murray
      Pages: 604 - 633
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 604-633, June 2022.
      Widespread application of modern machine learning has increased the need for robust statistical algorithms. This work studies one such fundamental statistical concept known as the Tukey depth. We study the problem in the continuum (population) limit. In particular, we formally derive the associated necessary conditions, which take the form of a first-order partial differential equation which is necessarily satisfied at points where the Tukey depth is smooth. We discuss the interpretation of this formal necessary condition in terms of the viscosity solution of a Hamilton--Jacobi equation, but with a nonclassical Hamiltonian with discontinuous dependence on the gradient at zero. We prove that this equation possesses a unique viscosity solution and that this solution always bounds the Tukey depth from below. In certain cases we prove that the Tukey depth is equal to the viscosity solution, and we give some illustrations of standard numerical methods from the optimal control community which deal directly with the partial differential equation. We conclude by outlining several promising research directions both in terms of new numerical algorithms and theoretical challenges.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-05-10T07:00:00Z
      DOI: 10.1137/21M1411998
      Issue No: Vol. 4, No. 2 (2022)
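
      The population quantity studied above has a simple empirical counterpart: the sample Tukey depth of a point is the smallest fraction of data points in any closed halfspace whose boundary passes through that point. The sketch below approximates it by minimizing over random directions, which gives an upper bound on the exact minimum; it is unrelated to the Hamilton--Jacobi solvers discussed in the paper.

import numpy as np

def tukey_depth_mc(x, data, n_dirs=2000, rng=None):
    """Approximate the empirical Tukey depth of x: the minimum, over directions u,
    of the fraction of data points in the closed halfspace {z : <z - x, u> >= 0}."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data)
    u = rng.normal(size=(n_dirs, data.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # random unit directions
    proj = (data - x) @ u.T                         # projections, shape (n, n_dirs)
    return float((proj >= 0).mean(axis=0).min())    # smallest halfspace mass found

pts = np.random.default_rng(1).normal(size=(500, 2))
print(tukey_depth_mc(np.zeros(2), pts))   # close to 0.5 near the center of a symmetric cloud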
       
  • Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization

      Authors: Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan
      Pages: 634 - 648
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 634-648, June 2022.
      Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and the current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the “geometrization” technique introduced by [L. Lei and M. I. Jordan, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017, pp. 148--156] and the SARAH algorithm of [L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáč, Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 2613--2621], we propose the geometrized SARAH algorithm for nonconvex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak--Łojasiewicz (PL) constant, if present. In addition, it achieves the best-available convergence rate for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-05-12T07:00:00Z
      DOI: 10.1137/21M1394308
      Issue No: Vol. 4, No. 2 (2022)
       
  • A Variational Formulation of Accelerated Optimization on Riemannian
           Manifolds

      Authors: Valentin Duruisseaux, Melvin Leok
      Pages: 649 - 674
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 649-674, June 2022.
      It was shown recently by [W. Su, S. Boyd, and E. Candes, J. Mach. Learn. Res., 17 (2016), pp. 1--43] that Nesterov's accelerated gradient method for minimizing a smooth convex function $f$ can be thought of as the time discretization of a second-order ODE and that $f(x(t))$ converges to its optimal value at a rate of $\mathcal{O}(1/t^2)$ along any trajectory $x(t)$ of this ODE. A variational formulation was introduced in [A. Wibisono, A. Wilson, and M. Jordan, Proc Natl. Acad. Sci. USA, 113 (2016), pp. E7351--E7358] which allowed for accelerated convergence at a rate of $\mathcal{O}(1/t^p)$, for arbitrary $p>0$, in normed vector spaces. This framework was exploited in [V. Duruisseaux, J. Schmitt, and M. Leok, SIAM J. Sci. Comput., 43 (2021), pp. A2949--A2980] using time-adaptive geometric integrators to design efficient explicit algorithms for symplectic accelerated optimization. In [F. Alimisis, A. Orvieto, G. Bécigneul, and A. Lucchi, Proceedings of the 23rd International AISTATS Conference, 2020, pp. 1297--1307], a second-order ODE was proposed as the continuous-time limit of a Riemannian accelerated algorithm, and it was shown that the objective function $f(x(t))$ converges to its optimal value at a rate of $\mathcal{O}(1/t^2)$ along solutions of this ODE, thereby generalizing the earlier Euclidean result to the Riemannian manifold setting. In this paper, we show that on Riemannian manifolds, the convergence rate of $f(x(t))$ to its optimal value can also be accelerated to an arbitrary convergence rate $\mathcal{O}(1/t^p)$, by considering a family of time-dependent Bregman Lagrangian and Hamiltonian systems on Riemannian manifolds. This generalizes the results of Wibisono, Wilson, and Jordan to Riemannian manifolds and also provides a variational framework for accelerated optimization on Riemannian manifolds. In particular, we will establish results for objective functions on Riemannian manifolds that are geodesically convex, weakly quasi-convex, and strongly convex. An approach based on the time-invariance property of the family of Bregman Lagrangians and Hamiltonians was used to construct very efficient optimization algorithms by Duruisseaux, Schmitt, and Leok, and we establish a similar time-invariance property in the Riemannian setting. This lays the foundation for constructing similarly efficient optimization algorithms on Riemannian manifolds, once the Riemannian analogues of time-adaptive Hamiltonian variational integrators have been developed. The experience with the numerical discretization of variational accelerated optimization flows on vector spaces suggests that the combination of time-adaptivity and symplecticity is important for the efficient, robust, and stable discretization of these variational flows describing accelerated optimization. One expects that a geometric numerical integrator that is time-adaptive, symplectic, and Riemannian manifold preserving will yield a class of similarly promising optimization algorithms on manifolds.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-02T07:00:00Z
      DOI: 10.1137/21M1395648
      Issue No: Vol. 4, No. 2 (2022)
       
  • Speedy Categorical Distributional Reinforcement Learning and Complexity
           Analysis

      Authors: Markus Böck, Clemens Heitzinger
      Pages: 675 - 693
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 675-693, June 2022.
      In distributional reinforcement learning, the entire distribution of the return instead of just the expected return is modeled. The approach with categorical distributions as the approximation method is well-known in Q-learning, and convergence results have been established in the tabular case. In this work, speedy Q-learning is extended to categorical distributions, a finite-time analysis is performed, and probably approximately correct bounds in terms of the Cramér distance are established. It is shown that also in the distributional case the new update rule yields faster policy evaluation in comparison to the standard Q-learning one and that the sample complexity is essentially the same as the one of the value-based algorithmic counterpart. Without the need for more state-action-reward samples, one gains significantly more information about the return with categorical distributions. Even though the results do not easily extend to the case of policy control, a slight modification to the update rule yields promising numerical results.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-06T07:00:00Z
      DOI: 10.1137/20M1364436
      Issue No: Vol. 4, No. 2 (2022)
       
  • Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for
           Neural Networks

      Authors: Yuqing Li, Tao Luo, Chao Ma
      Pages: 694 - 720
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 694-720, June 2022.
In an attempt to better understand the structural benefits and generalization power of deep neural networks, we first present a novel graph theoretical formulation of neural network models, including fully connected networks, residual networks (ResNet), and densely connected networks (DenseNet). Second, we extend the error analysis of the population risk for a two-layer network [W. E., C. Ma, and L. Wu, Commun. Math. Sci., 17 (2019), pp. 1407--1425] and ResNet [W. E., C. Ma, and Q. Wang, Commun. Math. Sci., 18 (2020), pp. 1755--1774] to DenseNet, and show further that for neural networks satisfying certain mild conditions, similar estimates can be obtained. These estimates are a priori in nature since they depend solely on the information prior to the training process; in particular, the bounds for the estimation errors do not suffer from the curse of dimensionality.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-13T07:00:00Z
      DOI: 10.1137/21M140955X
      Issue No: Vol. 4, No. 2 (2022)
       
  • Intrinsic Dimension Adaptive Partitioning for Kernel Methods

      Authors: Thomas Hamm, Ingo Steinwart
      Pages: 721 - 749
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 721-749, June 2022.
We prove minimax optimal learning rates for kernel ridge regression and, respectively, support vector machines based on a data-dependent partition of the input space, where the dependence on the dimension of the input space is replaced by the fractal dimension of the support of the data-generating distribution. We further show that these optimal rates can be achieved by a training-validation procedure without any prior knowledge of this intrinsic dimension of the data. Finally, we conduct extensive experiments which demonstrate that our considered learning methods are actually able to generalize from a dataset that is nontrivially embedded in a much higher-dimensional space just as well as from the original dataset.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-23T07:00:00Z
      DOI: 10.1137/21M1435690
      Issue No: Vol. 4, No. 2 (2022)
       
  • Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method

      Authors: Axel Böhm, Michael Sedlmayer, Ernö Robert Csetnek, Radu Ioan Boţ
      Pages: 750 - 771
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 750-771, June 2022.
      Motivated by the training of generative adversarial networks (GANs), we study methods for solving minimax problems with additional nonsmooth regularizers. We do so by employing monotone operator theory, in particular the forward-backward-forward method, which avoids the known issue of limit cycling by correcting each update by a second gradient evaluation and does so requiring fewer projection steps compared to the extragradient method in the presence of constraints. Furthermore, we propose a seemingly new scheme which recycles old gradients to mitigate the additional computational cost. In doing so we rediscover a known method, related to optimistic gradient descent ascent. For both schemes we prove novel convergence rates for convex-concave minimax problems via a unifying approach. The derived error bounds are in terms of the gap function for the ergodic iterates. For the deterministic and the stochastic problem we show a convergence rate of $\mathcal{O}({1}/{k})$ and $\mathcal{O}({1}/{\sqrt{k}})$, respectively. We complement our theoretical results with empirical improvements in the training of Wasserstein GANs on the CIFAR10 dataset.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-23T07:00:00Z
      DOI: 10.1137/21M1420939
      Issue No: Vol. 4, No. 2 (2022)
       
  • Sequential Construction and Dimension Reduction of Gaussian Processes
           Under Inequality Constraints

      Authors: François Bachoc, Andrés F. López-Lopera, Olivier Roustant
      Pages: 772 - 800
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 772-800, June 2022.
Accounting for inequality constraints, such as boundedness, monotonicity, or convexity, is challenging when modeling costly-to-evaluate black-box functions. In this regard, finite-dimensional Gaussian process (GP) regression models bring a valuable solution, as they guarantee that the inequality constraints are satisfied everywhere. Nevertheless, these models are currently restricted to low-dimensional situations (up to dimension 5). Addressing this issue, we introduce the MaxMod algorithm, which sequentially inserts one-dimensional knots or adds active variables, thereby performing dimension reduction and efficient knot allocation at the same time. We prove the convergence of this algorithm. In intermediary steps of the proof, we propose the notion of multiaffine extension and study its properties. We also prove the convergence of finite-dimensional GPs when the knots are not dense in the input space, extending the recent literature. With simulated and real data, we demonstrate that the MaxMod algorithm remains efficient in higher dimensions (at least up to dimension 20) and needs fewer knots than other state-of-the-art constrained GP models to reach a given approximation error.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-23T07:00:00Z
      DOI: 10.1137/21M1407513
      Issue No: Vol. 4, No. 2 (2022)
       
  • Autodifferentiable Ensemble Kalman Filters

      Authors: Yuming Chen, Daniel Sanz-Alonso, Rebecca Willett
      Pages: 801 - 833
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 801-833, June 2022.
      Data assimilation is concerned with sequentially estimating a temporally evolving state. This task, which arises in a wide range of scientific and engineering applications, is particularly challenging when the state is high-dimensional and the state-space dynamics are unknown. This paper introduces a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters (AD-EnKFs) blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, AD-EnKFs leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Numerical results using the Lorenz-96 model show that AD-EnKFs outperform existing methods that use expectation-maximization or particle filters to merge data assimilation and machine learning. In addition, AD-EnKFs are easy to implement and require minimal tuning.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-23T07:00:00Z
      DOI: 10.1137/21M1434477
      Issue No: Vol. 4, No. 2 (2022)
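
      For reference, the classical building block that AD-EnKFs differentiate through is the stochastic (perturbed-observation) ensemble Kalman analysis step. The sketch below implements only that step for a linear observation operator; the learning framework of the paper is not reproduced here.

import numpy as np

def enkf_analysis(ensemble, y, H, R, rng=None):
    """One stochastic (perturbed-observation) EnKF analysis step for a linear
    observation operator.  ensemble: (N, d) forecast members; y: (m,) observation;
    H: (m, d) observation operator; R: (m, m) observation covariance."""
    rng = np.random.default_rng(rng)
    N = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)             # centered members
    P = X.T @ X / (N - 1)                            # sample forecast covariance
    S = H @ P @ H.T + R                              # innovation covariance
    K = np.linalg.solve(S, H @ P).T                  # Kalman gain P H^T S^{-1}
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return ensemble + (y_pert - ensemble @ H.T) @ K.T

rng = np.random.default_rng(0)
ens = 1.0 + rng.normal(size=(100, 3))                # forecast ensemble centered near 1
H = np.array([[1.0, 0.0, 0.0]])                      # observe the first coordinate only
post = enkf_analysis(ens, y=np.array([0.2]), H=H, R=0.1 * np.eye(1), rng=1)
print(post.mean(axis=0))                             # first coordinate pulled toward 0.2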
       
  • Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement
           Learning

      Authors: Tong Zhang
      Pages: 834 - 857
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 834-857, June 2022.
      Thompson sampling has been widely used for contextual bandit problems due to the flexibility of its modeling power. However, a general theory for this class of methods in the frequentist setting is still lacking. In this paper, we present a theoretical analysis of Thompson sampling, with a focus on frequentist regret bounds. In this setting, we show that the standard Thompson sampling is not aggressive enough in exploring new actions, leading to suboptimality in some pessimistic situations. A simple modification called Feel-Good Thompson sampling, which favors high reward models more aggressively than the standard Thompson sampling, is proposed to remedy this problem. We show that the theoretical framework can be used to derive Bayesian regret bounds for standard Thompson sampling and frequentist regret bounds for Feel-Good Thompson sampling. It is shown that in both cases, we can reduce the bandit regret problem to online least squares regression estimation. For the frequentist analysis, the online least squares regression bound can be directly obtained using online aggregation techniques which have been well studied. The resulting bandit regret bound matches the minimax lower bound in the finite action case. Moreover, the analysis can be generalized to handle a class of linearly embeddable contextual bandit problems (which generalizes the popular linear contextual bandit model). The obtained result again matches the minimax lower bound. Finally we illustrate that the analysis can be extended to handle some MDP problems.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-23T07:00:00Z
      DOI: 10.1137/21M140924X
      Issue No: Vol. 4, No. 2 (2022)
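
      As background, the standard Beta-Bernoulli Thompson sampling baseline discussed in the abstract can be written in a few lines. The sketch below implements only this baseline; the Feel-Good modification, which additionally favors optimistic high-reward models, is not implemented here.

import numpy as np

def thompson_bernoulli(true_probs, horizon=5000, rng=None):
    """Standard Beta-Bernoulli Thompson sampling: sample one success probability
    per arm from its Beta posterior, play the arm with the largest sample, and
    update the posterior with the observed reward."""
    rng = np.random.default_rng(rng)
    k = len(true_probs)
    wins, losses = np.ones(k), np.ones(k)            # Beta(1, 1) priors
    total = 0.0
    for _ in range(horizon):
        theta = rng.beta(wins, losses)               # one posterior draw per arm
        a = int(np.argmax(theta))
        r = float(rng.random() < true_probs[a])      # Bernoulli reward
        wins[a] += r
        losses[a] += 1.0 - r
        total += r
    return total / horizon

print(thompson_bernoulli([0.3, 0.5, 0.7], rng=0))    # average reward approaches the best rate 0.7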
       
  • Persistent Laplacians: Properties, Algorithms and Implications

      Authors: Facundo Mémoli, Zhengchao Wan, Yusu Wang
      Pages: 858 - 884
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 858-884, June 2022.
      We present a thorough study of the theoretical properties and devise efficient algorithms for the persistent Laplacian, an extension of the standard combinatorial Laplacian to the setting of pairs (or, in more generality, sequences) of simplicial complexes $K \hookrightarrow L$, which was recently introduced by Wang, Nguyen, and Wei. In particular, in analogy with the nonpersistent case, we first prove that the nullity of the $q$th persistent Laplacian $\Delta_q^{K,L}$ equals the $q$th persistent Betti number of the inclusion $(K \hookrightarrow L)$. We then present an initial algorithm for finding a matrix representation of $\Delta_q^{K,L}$, which itself helps interpret the persistent Laplacian. We exhibit a novel relationship between the persistent Laplacian and the notion of Schur complement of a matrix which has several important implications. In the graph case, it both uncovers a link with the notion of effective resistance and leads to a persistent version of the Cheeger inequality. This relationship also yields an additional, very simple algorithm for finding (a matrix representation of) the $q$th persistent Laplacian which in turn leads to a novel and fundamentally different algorithm for computing the $q$th persistent Betti number for a pair $K\hookrightarrow L$ which can be significantly more efficient than standard algorithms. Finally, we study persistent Laplacians for simplicial filtrations and establish novel functoriality properties and stability results for their eigenvalues. Our work brings methods from spectral graph theory, circuit theory, and persistent homology together with a topological view of the combinatorial Laplacian on simplicial complexes.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-24T07:00:00Z
      DOI: 10.1137/21M1435471
      Issue No: Vol. 4, No. 2 (2022)
       
  • Overparameterization and Generalization Error: Weighted Trigonometric
           Interpolation

      Authors: Yuege Xie, Hung-Hsu Chou, Holger Rauhut, Rachel Ward
      Pages: 885 - 908
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 885-908, June 2022.
      Motivated by surprisingly good generalization properties of learned deep neural networks in overparameterized scenarios and by the related double descent phenomenon, this paper analyzes the relation between smoothness and low generalization error in an overparameterized linear learning problem. We study a random Fourier series model, where the task is to estimate the unknown Fourier coefficients from equidistant samples. We derive exact expressions for the generalization error of both plain and weighted least squares estimators. We show precisely how a bias toward smooth interpolants, in the form of weighted trigonometric interpolation, can lead to smaller generalization error in the overparameterized regime compared to the underparameterized regime. This provides insight into the power of overparameterization, which is common in modern machine learning.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-27T07:00:00Z
      DOI: 10.1137/21M1390955
      Issue No: Vol. 4, No. 2 (2022)
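
      The weighting mechanism discussed above can be illustrated with a minimum-weighted-norm interpolant: among all overparameterized coefficient vectors that fit the equidistant samples exactly, pick the one with the smallest weighted norm, with weights that decay in frequency so the fit is biased toward smooth functions. The sketch below is illustrative and does not reproduce the paper's exact estimator or error analysis.

import numpy as np

def weighted_trig_interpolant(y, n_freq, weights):
    """Minimum-weighted-norm trigonometric interpolant of equidistant samples y:
    among all coefficient vectors c with A c = y (overparameterized when
    2 * n_freq + 1 > len(y)), return the one minimizing sum(c**2 / weights)."""
    n = len(y)
    t = np.arange(n) / n
    k = np.arange(1, n_freq + 1)
    A = np.hstack([np.ones((n, 1)),
                   np.cos(2 * np.pi * t[:, None] * k),
                   np.sin(2 * np.pi * t[:, None] * k)])   # (n, 2*n_freq+1) design matrix
    W = np.diag(weights)                                   # per-coefficient weights
    c = W @ A.T @ np.linalg.solve(A @ W @ A.T, y)          # weighted minimum-norm solution
    return A, c

rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * np.arange(16) / 16) + 0.1 * rng.normal(size=16)
k = np.arange(1, 41)
w = np.concatenate(([1.0], 1.0 / k**2, 1.0 / k**2))        # decaying weights favor smoothness
A, c = weighted_trig_interpolant(y, n_freq=40, weights=w)
print(np.max(np.abs(A @ c - y)))                           # ~0: the samples are interpolated exactly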
       
  • GNMR: A Provable One-Line Algorithm for Low Rank Matrix Recovery

      Authors: Pini Zilber, Boaz Nadler
      Pages: 909 - 934
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 909-934, June 2022.
      Low rank matrix recovery problems appear in a broad range of applications. In this work we present GNMR---an extremely simple iterative algorithm for low rank matrix recovery, based on a Gauss--Newton linearization. On the theoretical front, we derive recovery guarantees for GNMR in both matrix sensing and matrix completion settings. Some of these results improve upon the best currently known for other methods. A key property of GNMR is that it implicitly keeps the factor matrices approximately balanced throughout its iterations. On the empirical front, we show that for matrix completion with uniform sampling, GNMR performs better than several popular methods, especially when given very few observations close to the information limit.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-28T07:00:00Z
      DOI: 10.1137/21M1433812
      Issue No: Vol. 4, No. 2 (2022)
       
  • Benefit of Interpolation in Nearest Neighbor Algorithms

      Authors: Yue Xing, Qifan Song, Guang Cheng
      Pages: 935 - 956
      Abstract: SIAM Journal on Mathematics of Data Science, Volume 4, Issue 2, Page 935-956, June 2022.
      In some studies (e.g., [C. Zhang et al. in Proceedings of the 5th International Conference on Learning Representations, OpenReview.net, 2017]) of deep learning, it is observed that overparametrized deep neural networks achieve a small testing error even when the training error is almost zero. Despite numerous works toward understanding this so-called double-descent phenomenon (e.g., [M. Belkin et al., Proc. Natl. Acad. Sci. USA, 116 (2019), pp. 15849--15854; M. Belkin, D. Hsu, and J. Xu, SIAM J. Math. Data Sci., 2 (2020), pp. 1167--1180]), in this paper, we turn to another way to enforce zero training error (without overparametrization) through a data interpolation mechanism. Specifically, we consider a class of interpolated weighting schemes in the nearest neighbors (NN) algorithms. By carefully characterizing the multiplicative constant in the statistical risk, we reveal a U-shaped performance curve for the level of data interpolation in both classification and regression setups. This sharpens the existing result [M. Belkin, A. Rakhlin, and A. B. Tsybakov, in Proceedings of Machine Learning Research 89, PMLR, 2019, pp. 1611--1619] that zero training error does not necessarily jeopardize predictive performances and claims a counterintuitive result that a mild degree of data interpolation actually strictly improves the prediction performance and statistical stability over those of the (uninterpolated) $k$-NN algorithm. In the end, the universality of our results, such as change of distance measure and corrupted testing data, will also be discussed.
      Citation: SIAM Journal on Mathematics of Data Science
      PubDate: 2022-06-30T07:00:00Z
      DOI: 10.1137/21M1437457
      Issue No: Vol. 4, No. 2 (2022)
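
      A concrete instance of an interpolating nearest neighbor scheme is distance-weighted $k$-NN with weights proportional to $d^{-\gamma}$: the weight diverges as the distance to a training point goes to zero, so the predictor reproduces the training labels exactly. The sketch below illustrates this mechanism; the precise weighting schemes and risk analysis in the paper may differ.

import numpy as np

def interpolated_knn_predict(x, X_train, y_train, k=5, gamma=2.0):
    """Distance-weighted k-NN regression with weights proportional to d**(-gamma).
    The weight diverges as d -> 0, so the predictor interpolates the training data
    (zero training error) while still averaging over k neighbors elsewhere."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]                  # k nearest neighbors
    dk = d[idx]
    if dk[0] == 0.0:                         # query coincides with a training point
        return float(y_train[idx[0]])
    w = dk ** (-gamma)
    return float(w @ y_train[idx] / w.sum())

rng = np.random.default_rng(0)
X_train = rng.uniform(size=(200, 1))
y_train = np.sin(4 * X_train[:, 0]) + 0.1 * rng.normal(size=200)
print(interpolated_knn_predict(X_train[0], X_train, y_train))        # exactly y_train[0]
print(interpolated_knn_predict(np.array([0.5]), X_train, y_train, k=10, gamma=3.0))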
       
 