Abstract: Semidefinite programming is an important tool for tackling several problems in data science and signal processing, including clustering and community detection. However, semidefinite programs are often slow in practice, so speed-up techniques such as sketching are often considered. In the context of community detection in the stochastic block model, Mixon and Xie (IEEE Trans Inform Theory 67(10):6832–6840, 2021) have recently proposed a sketching framework in which a semidefinite program is solved only on a subsampled subgraph of the network, giving rise to significant computational savings. In this short paper, we provide a positive answer to a conjecture of Mixon and Xie about the statistical limits of this technique for the stochastic block model with two balanced communities. PubDate: 2022-05-06
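The subsampling idea can be illustrated with a minimal numpy sketch: draw a two-community stochastic block model, keep a random subset of the vertices, and recover communities on the induced subgraph only. Here a cheap spectral relaxation stands in for the semidefinite program, and all sizes and edge probabilities are illustrative choices, not those of the paper.

```python
import numpy as np

def sample_sbm(n, p, q, rng):
    """Adjacency matrix of a two-community SBM with n nodes per community."""
    labels = np.repeat([1.0, -1.0], n)                    # ground-truth communities
    probs = np.where(np.outer(labels, labels) > 0, p, q)  # within- vs between-community
    upper = np.triu(rng.random((2 * n, 2 * n)) < probs, 1).astype(float)
    return upper + upper.T, labels

rng = np.random.default_rng(0)
A, labels = sample_sbm(100, 0.5, 0.05, rng)

# Sketching step in the spirit of Mixon and Xie: subsample the vertex set and
# run the (expensive) recovery procedure on the induced subgraph only.
keep = rng.choice(200, size=60, replace=False)
A_sub = A[np.ix_(keep, keep)]

# Cheap spectral stand-in for the SDP: the top eigenvector of the centered
# adjacency matrix estimates the community labels of the retained vertices.
vals, vecs = np.linalg.eigh(A_sub - A_sub.mean())
guess = np.sign(vecs[:, -1])
accuracy = max(np.mean(guess == labels[keep]),
               np.mean(guess == -labels[keep]))
```

The accuracy on the subsample is high in this well-separated regime; the paper's conjecture concerns exactly how far p and q may approach each other before such subsampled recovery must fail.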

Abstract: The notion of generalization in classical Statistical Learning is often attached to the postulate that data points are independent and identically distributed (IID) random variables. While relevant in many applications, this postulate may not hold in general, encouraging the development of learning frameworks that are robust to non-IID data. In this work, we consider the regression problem from an Optimal Recovery perspective. Relying on a model assumption comparable to choosing a hypothesis class, a learner aims at minimizing the worst-case error, without recourse to any probabilistic assumption on the data. We first develop a semidefinite program for calculating the worst-case error of any recovery map in finite-dimensional Hilbert spaces. Then, for any Hilbert space, we show that Optimal Recovery provides a formula which is user-friendly from an algorithmic point of view, as long as the hypothesis class is linear. Interestingly, this formula coincides with kernel ridgeless regression in some cases, proving that minimizing the average error and the worst-case error can yield the same solution. We provide numerical experiments in support of our theoretical findings. PubDate: 2022-04-26
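Kernel ridgeless regression, which the abstract relates to the Optimal Recovery formula, is simply kernel interpolation: fit the data exactly in a reproducing kernel Hilbert space. A minimal sketch, where the Gaussian kernel, its width, the data, and the tiny numerical jitter are all illustrative choices rather than the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(6)
X = np.sort(rng.uniform(-1.0, 1.0, 12))   # training inputs
y = np.sin(3 * X)                         # training outputs

def rbf(a, b, width=0.2):
    """Gaussian (RBF) kernel matrix between point sets a and b."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * width ** 2))

K = rbf(X, X)
# "Ridgeless": no regularization penalty; the 1e-8 jitter is purely for
# numerical stability of the linear solve.
alpha = np.linalg.solve(K + 1e-8 * np.eye(len(X)), y)

X_test = np.linspace(-1.0, 1.0, 200)
y_hat = rbf(X_test, X) @ alpha            # the interpolant evaluated off-sample
```

The fitted function passes (essentially) through every data point; the paper's point is that this minimum-norm interpolant can simultaneously be the worst-case-optimal recovery map.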

Abstract: Inertial algorithms for minimizing nonsmooth and nonconvex functions, such as the inertial proximal alternating linearized minimization algorithm (iPALM), have demonstrated their superiority with respect to computation time over their non-inertial variants. In many problems in imaging and machine learning, the objective functions have a special form involving huge amounts of data, which encourages the application of stochastic algorithms. While algorithms based on stochastic gradient descent are still used in the majority of applications, stochastic algorithms for minimizing nonsmooth and nonconvex functions have also been proposed recently. In this paper, we derive an inertial variant of a stochastic PALM algorithm with a variance-reduced gradient estimator, called iSPALM, and prove linear convergence of the algorithm under certain assumptions. Our inertial approach can be seen as a generalization of momentum methods, widely used to speed up and stabilize optimization algorithms, in particular in machine learning, to nonsmooth problems. Numerical experiments for learning the weights of a so-called proximal neural network and the parameters of Student-t mixture models show that our new algorithm outperforms both stochastic PALM and its deterministic counterparts. PubDate: 2022-04-22
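The basic inertial mechanism can be sketched on a toy nonsmooth convex problem. The snippet below is a full-gradient inertial proximal gradient step (extrapolate, then take a proximal gradient step from the extrapolated point), not the stochastic variance-reduced iSPALM itself; the problem, step size, and momentum weight are illustrative choices.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Toy nonsmooth problem: min_x 0.5*||Ax - b||^2 + lam*||x||_1.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true
lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L with L the smooth part's Lipschitz constant

x, x_prev = np.zeros(20), np.zeros(20)
for _ in range(300):
    y = x + 0.5 * (x - x_prev)            # inertial extrapolation, fixed weight 0.5
    x_prev = x
    grad = A.T @ (A @ y - b)              # gradient of the smooth part at the extrapolated point
    x = soft_threshold(y - step * grad, step * lam)

err = np.linalg.norm(x - x_true)
```

The extrapolation `y = x + 0.5*(x - x_prev)` is exactly the momentum term that iPALM-type methods add to each block update; iSPALM additionally replaces `grad` with a variance-reduced stochastic estimator.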

Abstract: Zuowei Shen, Haizhao Yang, and Shijun Zhang recently proved a sharp estimate, in terms of a modulus of continuity, for the error of approximating continuous functions with ReLU-activated deep neural networks in their paper “Optimal approximation rate of ReLU networks in terms of width and depth”, Journal de Mathématiques Pures et Appliquées (2021). The sharpness was established via a Vapnik–Chervonenkis dimension estimate, by showing that for each choice of certain fixed width and depth parameters, no general smaller error bound applies. This leads to counterexamples that rule out smaller error bounds. In principle, the obtained counterexamples can be different for different parameters. However, for a given convergence rate (e.g., determined by a Lipschitz class), the counterexamples can be condensed to a single one using a quantitative variant of the uniform boundedness principle. Such theorems were developed at the institute of Paul Butzer at RWTH Aachen. When network width or depth tends to infinity in certain cases, the condensed counterexamples show that the approximation order is not in the little-o class of the modulus of continuity used in the error estimate, i.e., the convergence cannot be faster than stated. PubDate: 2022-03-23
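To make the object of study concrete: a one-hidden-layer ReLU network of width n realizes the piecewise linear interpolant of a continuous function on [0, 1], and its error is governed by the modulus of continuity that appears in the sharp bounds above. A small sketch, with an illustrative target function and width:

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

f = lambda t: np.abs(np.sin(3 * t))       # a Lipschitz target on [0, 1]
n = 32
knots = np.linspace(0.0, 1.0, n + 1)
vals = f(knots)

# Each ReLU unit switches on at one knot and contributes the change of slope
# there, so the sum reproduces the piecewise linear interpolant exactly.
slopes = np.diff(vals) / np.diff(knots)
coeffs = np.concatenate([[slopes[0]], np.diff(slopes)])

def net(t):
    """Width-n ReLU network: affine start plus one kink per interior knot."""
    return vals[0] + sum(c * relu(t - k) for c, k in zip(coeffs, knots[:-1]))

grid = np.linspace(0.0, 1.0, 1000)
err = np.max(np.abs(net(grid) - f(grid)))
```

The sharpness results discussed in the abstract say, roughly, that no ReLU architecture of comparable width and depth can beat the modulus-of-continuity rate that this construction already attains.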

Abstract: Polynomial graph filters and their inverses play important roles in graph signal processing. In this paper, we introduce the concept of multiple commutative graph shifts and polynomial graph filters, which could play roles in graph signal processing similar to those of the one-order delay and finite impulse response filters in classical multi-dimensional signal processing. We implement the filtering procedure associated with a polynomial graph filter of multiple shifts at the vertex level in a distributed network, in which each vertex is equipped with a data processing subsystem with limited computation power and data storage, and a communication subsystem for direct data exchange only with its adjacent vertices. We also consider the implementation of the inverse filtering procedure associated with a polynomial graph filter of multiple shifts, and we propose two iterative approximation algorithms applicable in a distributed network and in a central facility. We demonstrate the effectiveness of the proposed algorithms for implementing the inverse filtering procedure on denoising time-varying graph signals and a dataset of US hourly temperatures at 218 locations. PubDate: 2022-01-29 DOI: 10.1007/s43670-021-00019-x
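A polynomial filter in two commuting graph shifts can be sketched directly in numpy. Below the two shifts come from the Cartesian product of two cycle graphs, which commute by construction; the particular filter polynomial is an illustrative choice, not one from the paper.

```python
import numpy as np

n1, n2 = 4, 5
C1 = np.roll(np.eye(n1), 1, axis=1)   # cyclic shift on n1 vertices
C2 = np.roll(np.eye(n2), 1, axis=1)   # cyclic shift on n2 vertices
S1 = np.kron(C1, np.eye(n2))          # shift along the first product factor
S2 = np.kron(np.eye(n1), C2)          # shift along the second product factor

# Polynomial filter h(S1, S2) = I + 0.5*S1 + 0.25*S1*S2 in the two shifts.
H = np.eye(n1 * n2) + 0.5 * S1 + 0.25 * (S1 @ S2)

rng = np.random.default_rng(2)
x = rng.standard_normal(n1 * n2)      # a graph signal
y = H @ x                             # filtered signal

# The shifts commute, so the polynomial is well defined regardless of ordering.
commute = np.allclose(S1 @ S2, S2 @ S1)
```

Because each shift only moves data between adjacent vertices, each monomial of the filter can be evaluated by a fixed number of neighbor-to-neighbor exchanges, which is what makes the vertex-level distributed implementation described in the abstract possible.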

Abstract: This paper considers fast and provably accurate algorithms for approximating smooth functions on the d-dimensional torus, \(f: \mathbb {T}^d \rightarrow \mathbb {C}\) , that are sparse (or compressible) in the multidimensional Fourier basis. In particular, suppose that the Fourier series coefficients of f, \(\{c_\mathbf{k} (f) \}_{\mathbf{k} \in \mathbb {Z}^d}\) , are concentrated in a given arbitrary finite set \(\mathscr {I} \subset \mathbb {Z}^d\) so that $$\begin{aligned} \min _{\Omega \subset \mathscr {I}\ \mathrm {s.t.}\ |\Omega | = s} \left\| f - \sum _{\mathbf{k} \in \Omega } c_\mathbf{k} (f) \, \mathrm {e}^{ -2 \pi \mathrm {i} {{\mathbf {k}}}\cdot \circ } \right\| _2 < \epsilon \Vert f \Vert _2 \end{aligned}$$ holds for \(s \ll |\mathscr {I}|\) and \(\epsilon \in (0,1)\) small. In such cases we aim both to identify a near-minimizing subset \(\Omega \subset \mathscr {I}\) and to accurately approximate its associated Fourier coefficients \(\{ c_\mathbf{k} (f) \}_{\mathbf{k} \in \Omega }\) as rapidly as possible. In this paper we present both deterministic, explicit algorithms and randomized algorithms for solving this problem using \(\mathscr {O}(s^2 d \log ^c |\mathscr {I}|)\) time/memory and \(\mathscr {O}(s d \log ^c |\mathscr {I}|)\) time/memory, respectively. Most crucially, all of the methods proposed herein achieve these runtimes while simultaneously satisfying theoretical best s-term approximation guarantees that ensure their numerical accuracy and robustness to noise for general functions. These results are achieved by modifying several different one-dimensional Sparse Fourier Transform (SFT) methods to subsample a function along a reconstructing rank-1 lattice for the given frequency set \(\mathscr {I} \subset \mathbb {Z}^d\) in order to rapidly identify a near-minimizing subset \(\Omega \subset \mathscr {I}\) as above, without using anything about the lattice beyond its generating vector. This requires the development of new fast and low-memory frequency identification techniques capable of rapidly recovering vector-valued frequencies in \(\mathbb {Z}^d\) , as opposed to the simple integer frequencies recovered in the univariate setting. Two different multivariate frequency identification strategies are proposed, analyzed, and shown to lead to their own best s-term approximation methods herein, each with a different accuracy versus computational speed and memory tradeoff. PubDate: 2021-12-13 DOI: 10.1007/s43670-021-00018-y
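The rank-1 lattice mechanism underlying these algorithms can be sketched in a few lines: sampling a trigonometric polynomial along a lattice with generating vector z collapses every multivariate coefficient \(c_\mathbf{k}\) onto bin \((\mathbf{k}\cdot z) \bmod M\) of a single one-dimensional FFT. The frequency set, z, and M below are toy choices for which the map \(\mathbf{k} \mapsto \mathbf{k}\cdot z \bmod M\) is injective (i.e., the lattice is reconstructing for this set); the sign convention in the exponential is also just an illustrative choice.

```python
import numpy as np

d = 2
freqs = np.array([[0, 0], [1, 0], [0, 1], [2, 1]])   # frequency set I
coefs = np.array([1.0, 0.5, -0.25, 0.125])           # true coefficients c_k
z = np.array([1, 3])                                 # lattice generating vector
M = 23                                               # lattice size; k.z mod M is injective on I

def f(pts):
    """Trigonometric polynomial with the coefficients above; pts has shape (..., d)."""
    return sum(c * np.exp(2j * np.pi * (pts @ k)) for c, k in zip(coefs, freqs))

nodes = (np.outer(np.arange(M), z) / M) % 1.0        # lattice points j*z/M mod 1
samples = f(nodes)
fft = np.fft.fft(samples) / M                        # one 1-D FFT of length M

# Coefficient c_k appears in bin (k . z) mod M of the length-M FFT.
recovered = np.array([fft[(k @ z) % M] for k in freqs])
```

All d-dimensional information is recovered from M one-dimensional samples and a single FFT, using nothing about the lattice beyond its generating vector, exactly the property the abstract exploits.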

Abstract: In this paper, we study the error behavior of the nonequispaced fast Fourier transform (NFFT). This approximate algorithm is mainly based on the convenient choice of a compactly supported window function. So far, various window functions have been used, and new window functions have recently been proposed. We present novel error estimates for the NFFT with compactly supported, continuous window functions and derive rules for the convenient choice of the parameters involved in the NFFT. The error constant of a window function depends mainly on the oversampling factor and the truncation parameter. PubDate: 2021-11-03 DOI: 10.1007/s43670-021-00017-z
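For reference, the exact transform that the NFFT approximates is the nonequispaced discrete Fourier transform below, an O(MN) matrix-vector product; the NFFT replaces it by an FFT on an oversampled grid combined with a compactly supported window, and the paper's estimates bound the resulting approximation error. Sizes and node distribution here are illustrative.

```python
import numpy as np

N = 16                                    # number of Fourier coefficients
M = 40                                    # number of nonequispaced nodes
rng = np.random.default_rng(3)
x = rng.uniform(-0.5, 0.5, M)             # nonequispaced nodes in [-1/2, 1/2)
fhat = rng.standard_normal(N) + 1j * rng.standard_normal(N)
k = np.arange(-N // 2, N // 2)

# Exact NDFT: f(x_j) = sum_k fhat_k * exp(-2*pi*i*k*x_j).
F = np.exp(-2j * np.pi * np.outer(x, k))  # nonequispaced Fourier matrix
f = F @ fhat
```

The error constants studied in the abstract quantify how closely the windowed-FFT approximation matches this exact `f`, as a function of the oversampling factor and the window truncation parameter.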

Abstract: In this work we consider the problem of identification and reconstruction of doubly-dispersive channel operators which are given by finite linear combinations of time-frequency shifts. Such operators arise as time-varying linear systems, for example in radar and wireless communications. In particular, for information transmission in highly non-stationary environments the channel needs to be estimated quickly with identification signals of short duration, and for vehicular applications simultaneous high-resolution radar is desired as well. We consider the time-continuous setting and prove an exact resampling reformulation of the involved channel operator, when applied to a trigonometric polynomial as identifier, in terms of sparse linear combinations of real-valued atoms. Motivated by recent works of Heckel et al., we present an exact approach for off-the-grid super-resolution which makes it possible to perform the identification with realizable signals having compact support. Then we show how an alternating descent conditional gradient algorithm can be adapted to solve the reformulated problem. Numerical examples demonstrate the performance of this algorithm, in particular in comparison with a simple adaptive grid refinement strategy and an orthogonal matching pursuit algorithm. PubDate: 2021-10-13 DOI: 10.1007/s43670-021-00016-0

Abstract: The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results. PubDate: 2021-10-11 DOI: 10.1007/s43670-021-00015-1
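A core primitive in quantized training is unbiased stochastic rounding of weights to a low-bit grid. The sketch below illustrates that primitive only, with an illustrative 2-bit grid; it is not the SMGD algorithm itself.

```python
import numpy as np

def stochastic_round(w, grid):
    """Round each weight to one of its two nearest grid points at random,
    with probabilities chosen so that E[round(w)] = clip(w) (unbiasedness)."""
    w = np.clip(w, grid[0], grid[-1])
    idx = np.searchsorted(grid, w, side="right") - 1
    idx = np.clip(idx, 0, len(grid) - 2)
    lo, hi = grid[idx], grid[idx + 1]
    p_up = (w - lo) / (hi - lo)                       # probability of rounding up
    up = np.random.default_rng(4).random(w.shape) < p_up
    return np.where(up, hi, lo)

# 2-bit weight grid on [-1, 1]: four representable levels.
grid = np.linspace(-1.0, 1.0, 4)
w = np.array([0.1, -0.7, 0.95, 0.0])
q = stochastic_round(w, grid)
```

Unbiasedness is what lets a stochastic-rounding-based descent method make progress in expectation even though every stored weight lives on the coarse grid.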

Abstract: Discrete tori are \({\mathbb {Z}}_m^N\) thought of as vertices of graphs \({\mathcal {C}}_m^N\) whose adjacencies encode the Cartesian product structure. Space-limiting refers to truncation to a symmetric path neighborhood of the zero element, and spectrum-limiting in this case refers to the corresponding truncation in the isomorphic Fourier domain. Compositions of these truncations, the spatio-spectral limiting (SSL) operators, are analogues of classical time- and band-limiting operators. Certain adjacency-invariant spaces of vectors defined on \({\mathbb {Z}}_m^N\) are shown to have bases consisting of Fourier transforms of eigenvectors of SSL operators. We show that when \(m=3\) or \(m=4\) , all eigenvectors of SSL operators arise in this way. We study the structure of the corresponding invariant spaces when \(m\ge 5\) and give an example indicating that the relationship between eigenvectors of SSL operators and the corresponding adjacency-invariant spaces should extend to \(m\ge 5\) . PubDate: 2021-08-03 DOI: 10.1007/s43670-021-00014-2

Abstract: In this paper we study \(L_2\) -norm sampling discretization and sampling recovery of complex-valued functions in RKHS on \(D \subset \mathbb {R}^d\) based on random function samples. We assume only that the kernel has finite trace (Hilbert–Schmidt embedding into \(L_2\) ) and provide several concrete estimates, with precise constants, for the corresponding worst-case errors. In general, our analysis does not need any additional assumptions and also covers the case of non-Mercer kernels and non-separable RKHS. The failure probability is controlled and decays polynomially in n, the number of samples. Under the mild additional assumption of separability, we observe improved rates of convergence related to the decay of the singular values. Our main tool is a spectral norm concentration inequality for infinite complex random matrices with independent rows, complementing earlier results by Rudelson, Mendelson, Pajor, Oliveira and Rauhut. PubDate: 2021-07-20 DOI: 10.1007/s43670-021-00013-3
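A toy instance of the phenomenon being quantified: draw n i.i.d. sample points, observe a function lying in a small finite-dimensional space, and recover it by least squares; the empirical \(L_2\) norm computed from the samples concentrates around the true norm. The dimension-7 trigonometric space, sample size, and target below are illustrative choices, far simpler than the RKHS setting of the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x = rng.uniform(0.0, 1.0, n)                  # random sample points in D = [0, 1]

def features(t):
    """Trigonometric basis of dimension 7 evaluated at points t."""
    cols = [np.ones_like(t)]
    for k in (1, 2, 3):
        cols += [np.cos(2 * np.pi * k * t), np.sin(2 * np.pi * k * t)]
    return np.column_stack(cols)

c_true = np.array([1.0, 0.5, -0.25, 0.0, 0.3, 0.0, 0.1])
samples = features(x) @ c_true                # noiseless samples of a function in the space

# Sampling recovery: least squares from the random samples.
c_hat = np.linalg.lstsq(features(x), samples, rcond=None)[0]

# Sampling discretization: the empirical squared L2 norm approximates the
# true one, here ||f||^2 = 1 + 0.5*(0.25 + 0.0625 + 0.09 + 0.01) = 1.20625.
emp_norm_sq = np.mean(samples ** 2)
```

The paper's results give worst-case guarantees, with explicit constants and controlled failure probability, for exactly this kind of random-sample discretization and recovery in infinite-dimensional RKHS.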

Abstract: Memoryless scalar quantization (MSQ) is a common technique to quantize generalized linear samples of signals. The non-linear nature of quantization makes the analysis of the corresponding approximation error challenging, often resulting in the use of a simplifying assumption, called the “white noise hypothesis” (WNH), that is useful yet known not to be rigorous and, at least in certain cases, not valid. We obtain rigorous reconstruction error estimates without relying on the WNH in the setting where generalized samples of a fixed deterministic signal are obtained using (the analysis matrix of) a random frame with independent isotropic sub-Gaussian rows; quantized using MSQ; and reconstructed linearly. We establish non-asymptotic error bounds that explain the observed error decay rate as the number of measurements grows, which in the special case of Gaussian random frames show that the error approaches a (small) non-zero constant lower bound. We also extend our methodology to dithered and noisy settings, as well as to the compressed sensing setting, where we obtain rigorous error bounds that agree with empirical observations, again without resorting to the WNH. PubDate: 2021-06-15 DOI: 10.1007/s43670-021-00012-4
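The measurement pipeline analyzed in the abstract is easy to simulate: quantize generalized samples y = Ax with a uniform scalar quantizer and reconstruct linearly with the pseudo-inverse. The Gaussian frame, step size, and dimensions below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)
n, step = 10, 0.1
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                       # fixed deterministic unit-norm signal

def msq_error(m):
    """Reconstruction error with m quantized Gaussian measurements."""
    A = rng.standard_normal((m, n))          # random frame (analysis matrix)
    q = step * np.round(A @ x / step)        # memoryless scalar quantization
    x_hat = np.linalg.pinv(A) @ q            # linear reconstruction
    return np.linalg.norm(x_hat - x)

errs = [msq_error(m) for m in (50, 200, 800)]
```

Running this for growing m shows small errors throughout; the paper's non-asymptotic bounds explain the observed decay and, for Gaussian frames, the small non-zero floor the error approaches, all without invoking the white noise hypothesis.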

Abstract: Many practical sampling patterns for function approximation on the rotation group utilize regular samples on the parameter axes. In this paper, we analyze the mutual coherence of sensing matrices that correspond to a class of such regular patterns and provide simple lower bounds for it. The products of Wigner d-functions, which appear in the coherence analysis, arise in angular momentum analysis in quantum mechanics. We first represent such a product as a linear combination of a single Wigner d-function and angular momentum coefficients, otherwise known as Wigner 3j symbols. Using combinatorial identities, we show that under certain conditions on the bandwidth and the number of samples, the inner product of the columns of the sensing matrix at zero orders, which equals the inner product of two Legendre polynomials, dominates the mutual coherence term and fixes a lower bound for it. In other words, for a class of regular sampling patterns, we provide a lower bound for the inner product of the columns of the sensing matrix that can be computed analytically. We verify our theoretical results numerically and show that the lower bound for the mutual coherence is larger than the Welch bound. In addition, we provide algorithms that can achieve the lower bound for spherical harmonics. PubDate: 2021-06-08 DOI: 10.1007/s43670-021-00006-2
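The quantities involved can be illustrated with a toy sensing matrix whose columns are Legendre polynomials sampled at regular points, loosely mimicking the zero-order columns that dominate the coherence in the abstract; the sizes (12 samples, 20 columns) and the sampling pattern are illustrative choices, not the paper's Wigner d-function setting.

```python
import numpy as np
from numpy.polynomial import legendre

m, N = 12, 20
theta = np.pi * (np.arange(m) + 0.5) / m          # regular angular samples
x = np.cos(theta)

# Column k samples the degree-k Legendre polynomial P_k(cos theta).
cols = np.column_stack([legendre.Legendre.basis(k)(x) for k in range(N)])
cols /= np.linalg.norm(cols, axis=0)              # normalize columns to unit norm

G = np.abs(cols.T @ cols)
np.fill_diagonal(G, 0.0)
coherence = G.max()                               # mutual coherence: largest off-diagonal entry

# Welch bound for an m x N matrix with unit-norm columns (N > m).
welch = np.sqrt((N - m) / (m * (N - 1)))
```

The Welch bound holds universally for any unit-norm frame with more columns than rows; the paper's contribution is a sharper, analytically computable lower bound for this class of regular patterns that exceeds it.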

Abstract: The concept of translation of an operator allows one to consider the analogue of shift-invariant subspaces in the class of Hilbert–Schmidt operators. Thus, we extend the concept of average sampling to this new setting, and we obtain the corresponding sampling formulas. The key point here is the use of the Weyl transform, a unitary mapping between the space of square integrable functions in the phase space \(\mathbb {R}^d\times \widehat{\mathbb {R}}^d\) and the Hilbert space of Hilbert–Schmidt operators on \(L^2(\mathbb {R}^d)\) , which permits us to take advantage of some well-established sampling results. PubDate: 2021-05-27 DOI: 10.1007/s43670-021-00011-5

Abstract: We study the concentration problem on compact two-point homogeneous spaces for finite expansions of eigenfunctions of the Laplace–Beltrami operator using large sieve methods. We derive upper bounds for concentration in terms of the maximum Nyquist density. Our proof uses estimates of the spherical harmonics basis coefficients of certain zonal filters and an ordering result for Jacobi polynomials for arguments close to one. PubDate: 2021-05-25 DOI: 10.1007/s43670-021-00008-0

Abstract: A correction to this paper has been published: https://doi.org/10.1007/s43670-021-00010-6 PubDate: 2021-05-21 DOI: 10.1007/s43670-021-00010-6

Abstract: In this paper, we derive a new reconstruction method for real non-harmonic Fourier sums, i.e., real signals which can be represented as sparse exponential sums of the form \(f(t) = \sum _{j=1}^{K} \gamma _{j} \, \cos (2\pi a_{j} t + b_{j})\) , where the frequency parameters \(a_{j} \in {\mathbb {R}}\) (or \(a_{j} \in {\mathrm i} {\mathbb {R}}\) ) are pairwise different. Our method is based on the recently proposed numerically stable iterative rational approximation algorithm of Nakatsukasa et al. (SIAM J Sci Comput 40(3):A1494–A1522, 2018). For signal reconstruction we use a set of classical Fourier coefficients of f with respect to a fixed interval (0, P) with \(P>0\) . Even though all terms of f may be non-P-periodic, our reconstruction method requires at most \(2K+2\) Fourier coefficients \(c_{n}(f)\) to recover all parameters of f. We show that in the case of exact data, the proposed iterative algorithm terminates after at most \(K+1\) steps. The algorithm can also detect the number K of terms of f, if K is a priori unknown and \(L \ge 2K+2\) Fourier coefficients are available. Our method therefore provides a new alternative to the known numerical approaches for the recovery of exponential sums that are based on Prony’s method. PubDate: 2021-05-12 DOI: 10.1007/s43670-021-00007-1

Abstract: A relatively new set of transport-based transforms (CDT, R-CDT, LOT) has shown its strength and great potential in various image and data processing tasks, such as parametric signal estimation, classification, and cancer detection, among many others. It is hence worthwhile to elucidate some of the mathematical properties that explain the successes of these transforms when they are used as tools in data analysis, signal processing or data classification. In particular, we give conditions under which classes of signals that are created by algebraic generative models are transformed into convex sets by the transport transforms. Such convexification of the classes simplifies classification and other data analysis and processing problems when viewed in the transform domain. More specifically, we study the extent and limitations of the convexification ability of these transforms under an algebraic generative modeling framework. We hope that this paper will serve as an introduction to these transforms and will encourage mathematicians and other researchers to further explore the theoretical underpinnings and algorithmic tools that will help understand their successes and lay the groundwork for further successful applications. PubDate: 2021-05-11 DOI: 10.1007/s43670-021-00009-z
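A minimal sketch of the one-dimensional cumulative distribution transform (CDT) with respect to a uniform reference: evaluate the inverse CDF of a density at uniformly spaced quantiles. The grid, density, and normalization below are illustrative choices.

```python
import numpy as np

t = np.linspace(-4.0, 4.0, 2001)
dt = t[1] - t[0]
density = np.exp(-0.5 * (t - 1.0) ** 2)          # a Gaussian bump centered at 1
density /= density.sum() * dt                    # normalize to a probability density

cdf = np.cumsum(density) * dt
cdf /= cdf[-1]
quantiles = np.linspace(0.0, 1.0, 501)[1:-1]     # stay strictly inside the CDF's range
cdt = np.interp(quantiles, cdf, t)               # inverse CDF at uniform quantiles

# Translating the density by c shifts its CDT additively by c, one concrete
# mechanism behind the convexification of signal classes described above.
```

Because nonlinear deformations of a template act nearly linearly on the transform values, classes generated by such deformations become (under the conditions studied in the paper) convex sets in the transform domain.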

Abstract: In this paper, we prove the existence of a particular diagonalization for normal bounded operators defined on subspaces of \(L^2({\mathfrak {S}})\) where \({\mathfrak {S}}\) is a second countable LCA group. The subspaces where the operators act are invariant under the action of a group \(\Gamma \) which is a semi-direct product of a uniform lattice of \({\mathfrak {S}}\) with a discrete group of automorphisms. This class includes the crystal groups, which are important in applications as models for images. The operators are assumed to be \(\Gamma \) -preserving, i.e., they commute with the action of \(\Gamma \) . In particular, we obtain a spectral decomposition for these operators. This generalizes recent results on shift-preserving operators acting on lattice-invariant subspaces, where \({\mathfrak {S}}\) is the Euclidean space. PubDate: 2021-04-27 DOI: 10.1007/s43670-021-00005-3

Abstract: Locally compact abelian (LCA) groups form a natural setting for frame theory in signal processing: while classical frame theory was developed in the setting of \( L^2({\mathbb {R}}),\) concrete applications have to be implemented on \({\mathbb {R}}^d,\) two scenarios that are covered under the setup of LCA groups. Due to rapid progress over the past few years, it is known within the mathematical community that large parts of the standard frame theory for \( L^2({\mathbb {R}})\) can be carried over to the setting of LCA groups. We show that this also holds on the level of explicit constructions, which is a crucial issue for applications within signal processing; in fact we provide explicit constructions of frames and pairs of dual frames on LCA groups. Special attention will be given to the so-called elementary LCA groups, i.e., groups that are tensor products of \({\mathbb {R}},\) \({\mathbb {Z}},\) the torus group \({{\mathbb {T}}},\) and the finite group \({\mathbb {Z}}_N.\) The frames will have the form of a Fourier-like system, i.e., the Fourier transform of a generalized shift-invariant system. The constructions are based on a partition of unity; in particular we provide a general and flexible approach for constructing a partition of unity that makes it easy to control key features of the underlying functions, e.g., support size and continuity. The results apply to arbitrary LCA groups with a supply of “sufficiently fine” lattices, a condition that is satisfied for all elementary LCA groups. Concrete constructions of pairs of dual frames for several elementary LCA groups are provided as illustration. In the case of a system of functions generated by a single lattice, such as a Gabor system, the general results become particularly transparent. PubDate: 2021-03-29 DOI: 10.1007/s43670-021-00004-4
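The partition-of-unity building block can be sketched on the real line: integer translates of a single compactly supported bump summing identically to one. The triangular hat below is the simplest such profile, used purely as an illustration; the paper's constructions allow smoother profiles with controlled support size and continuity.

```python
import numpy as np

def hat(x):
    """Triangular bump supported on [-1, 1]."""
    return np.maximum(1.0 - np.abs(x), 0.0)

x = np.linspace(-3.0, 3.0, 601)
# Enough integer translates to cover [-3, 3] entirely.
translates = np.array([hat(x - k) for k in range(-5, 6)])
total = translates.sum(axis=0)   # integer translates of the hat sum to 1 everywhere
```

Starting from such a partition of unity, the construction in the paper produces explicit pairs of dual frames of Fourier-like systems, with the support and smoothness of the bump translating directly into properties of the frame generators.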