Abstract: Nonparametric mixture models based on the Pitman–Yor process represent a flexible tool for density estimation and clustering. A natural generalization of the popular class of Dirichlet process mixture models, they allow for more robust inference on the number of components characterizing the distribution of the data. We propose a new sampling strategy for such models, named importance conditional sampling (ICS), which combines appealing properties of existing methods, including easy interpretability and a within-iteration parallelizable structure. An extensive simulation study highlights the efficiency of the proposed method which, unlike other conditional samplers, shows stable performance across different specifications of the parameters characterizing the Pitman–Yor process. We further show that the ICS approach can be naturally extended to other classes of computationally demanding models, such as nonparametric mixture models for partially exchangeable data. PubDate: 2022-05-17
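
The abstract does not detail the sampler's internals; as a minimal hedged sketch of the underlying prior only (not of the ICS algorithm), the Pitman–Yor process admits a truncated stick-breaking construction for its mixture weights, with discount `d` and strength `theta` (names mine):

```python
import numpy as np

def pitman_yor_weights(n, d, theta, rng):
    """Truncated stick-breaking weights of a PY(d, theta) process:
    V_k ~ Beta(1 - d, theta + k*d), w_k = V_k * prod_{j<k} (1 - V_j)."""
    v = rng.beta(1.0 - d, theta + d * np.arange(1, n + 1))
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return w

rng = np.random.default_rng(0)
w = pitman_yor_weights(1000, d=0.25, theta=1.0, rng=rng)
```

Setting d = 0 recovers the Dirichlet process, which is one way to see the claimed generalization.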

Abstract: Prediction models often fail if train and test data do not stem from the same distribution. Out-of-distribution (OOD) generalization to unseen, perturbed test data is a desirable but difficult-to-achieve property for prediction models and in general requires strong assumptions on the data generating process (DGP). In a causally inspired perspective on OOD generalization, the test data arise from a specific class of interventions on exogenous random variables of the DGP, called anchors. Anchor regression models, introduced by Rothenhäusler et al. (J R Stat Soc Ser B 83(2):215–246, 2021. https://doi.org/10.1111/rssb.12398), protect against distributional shifts in the test data by employing causal regularization. However, so far anchor regression has only been used with a squared-error loss, which is inapplicable to common responses such as censored continuous or ordinal data. Here, we propose a distributional version of anchor regression which generalizes the method to potentially censored responses with at least an ordered sample space. To this end, we combine a flexible class of parametric transformation models for distributional regression with an appropriate causal regularizer under a more general notion of residuals. In an exemplary application and several simulation scenarios we demonstrate the extent to which OOD generalization is possible. PubDate: 2022-05-13
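
For orientation, a minimal numpy sketch of the original squared-error anchor regression that this paper generalizes: the data are transformed with W = I − (1 − √γ)·P_A, where P_A projects onto the anchor columns, followed by ordinary least squares (variable names are my own):

```python
import numpy as np

def anchor_regression(X, Y, A, gamma):
    """Squared-error anchor regression: transform data with
    W = I - (1 - sqrt(gamma)) * P_A, then run ordinary least squares.
    gamma = 1 recovers plain OLS; gamma > 1 adds causal regularization."""
    P = A @ np.linalg.pinv(A)                        # orthogonal projector onto col(A)
    W = np.eye(len(Y)) - (1.0 - np.sqrt(gamma)) * P
    beta, *_ = np.linalg.lstsq(W @ X, W @ Y, rcond=None)
    return beta

rng = np.random.default_rng(1)
X, A = rng.normal(size=(50, 3)), rng.normal(size=(50, 2))
Y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
```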

Abstract: In this article we consider the application of multilevel Monte Carlo to the estimation of normalizing constants. In particular, we make use of the ensemble Kalman–Bucy filter (EnKBF), an N-particle representation of the Kalman–Bucy filter (KBF). The EnKBF is of interest as it coincides with the optimal filter in the continuous linear setting, i.e. the KBF, which motivates our setup in the linear setting. The resulting methodology is the multilevel ensemble Kalman–Bucy filter (MLEnKBF). We provide an analysis based on deriving \({\mathbb {L}}_q\)-bounds for the normalizing constants using both the single-level and multilevel algorithms, building largely on the previous work of Chada et al. (2022) deriving the MLEnKBF. Our results are highlighted through numerical experiments, where we first demonstrate the error-to-cost rates of the MLEnKBF compared with the EnKBF on a linear Gaussian model. Our analysis is specific to one variant of the MLEnKBF, whereas the numerics are tested on different variants. We also exploit this methodology for parameter estimation, which we test on models arising in the atmospheric sciences, such as the stochastic Lorenz 63 and 96 models. PubDate: 2022-05-04
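
To illustrate the multilevel idea in isolation (a generic toy, not the EnKBF methodology of the paper), the telescoping estimator below couples fine and coarse Euler discretizations of a geometric Brownian motion through shared Brownian increments:

```python
import numpy as np

def euler_gbm(dW):
    """Euler scheme for dX = X dW with X_0 = 1 over the given increments."""
    x = 1.0
    for dw in dW:
        x *= 1.0 + dw
    return x

def mlmc_gbm(L, N, rng):
    """Multilevel Monte Carlo estimate of E[X_1] (true value 1):
    level l uses 2**l Euler steps; each correction term averages the
    difference between a fine path and the coarse path obtained by
    summing the fine Brownian increments pairwise."""
    est = np.mean([euler_gbm(rng.normal(0.0, 1.0, 1)) for _ in range(N)])
    for l in range(1, L + 1):
        m = 2 ** l
        diffs = []
        for _ in range(N):
            dW = rng.normal(0.0, np.sqrt(1.0 / m), m)   # fine increments
            dWc = dW.reshape(-1, 2).sum(axis=1)         # coupled coarse increments
            diffs.append(euler_gbm(dW) - euler_gbm(dWc))
        est += np.mean(diffs)
    return est

est = mlmc_gbm(L=4, N=2000, rng=np.random.default_rng(7))
```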

Abstract: Biclustering is a machine learning problem that deals with the simultaneous clustering of rows and columns of a data matrix. Complex structures of the data matrix such as overlapping biclusters have challenged existing methods. In this paper, we first provide a unified formulation of biclustering that uses structured regularized matrix decomposition, which synthesizes various existing methods, and then develop a new biclustering method called BCEL based on this formulation. The biclustering problem is formulated as a penalized least-squares problem that approximates the data matrix \(\mathbf {X}\) by a multiplicative matrix decomposition \(\mathbf {U}\mathbf {V}^T\) with sparse columns in both \(\mathbf {U}\) and \(\mathbf {V}\). The squared \(\ell _{1,2}\)-norm penalty, also called the exclusive Lasso penalty, is applied to both \(\mathbf {U}\) and \(\mathbf {V}\) to assist identification of rows and columns included in the biclusters. The penalized least-squares problem is solved by a novel computational algorithm that combines alternating minimization and the proximal gradient method. A subsampling-based procedure called stability selection is developed to select the tuning parameters and determine the bicluster membership. BCEL is shown to be competitive with existing methods in simulation studies and in an application to a real-world single-cell RNA sequencing dataset. PubDate: 2022-04-29
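
The penalty and objective are easy to state; a small sketch (the alternating-minimization/proximal-gradient solver itself is not reproduced here):

```python
import numpy as np

def exclusive_lasso_penalty(U):
    """Squared l_{1,2} (exclusive Lasso) penalty: the sum over columns of the
    squared l1 norm, which encourages sparsity within each column."""
    return float(np.sum(np.abs(U).sum(axis=0) ** 2))

def biclustering_objective(X, U, V, lam):
    """Penalized least squares ||X - U V^T||_F^2 + lam * (pen(U) + pen(V))."""
    fit = np.linalg.norm(X - U @ V.T) ** 2
    return fit + lam * (exclusive_lasso_penalty(U) + exclusive_lasso_penalty(V))

U = np.array([[1.0, -2.0], [0.0, 3.0]])
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```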

Abstract: Markov chain Monte Carlo (MCMC) is a powerful methodology for the approximation of posterior distributions. However, the iterative nature of MCMC does not naturally facilitate its use with modern highly parallel computation on HPC and cloud environments. Another concern is the identification of the bias and Monte Carlo error of produced averages. These concerns have prompted the recent development of fully (‘embarrassingly’) parallel unbiased Monte Carlo methodology based on couplings of MCMC algorithms. A caveat is that the formulation of an effective coupling is typically not trivial and requires model-specific technical effort. We propose coupling of MCMC chains deriving from sequential Monte Carlo (SMC) by considering adaptive SMC methods in combination with recent advances in unbiased estimation for state-space models. Coupling is then achieved at the SMC level and is, in principle, not problem-specific. The resulting methodology enjoys desirable theoretical properties. A central motivation is to extend unbiased MCMC to more challenging targets compared to the ones typically considered in the relevant literature. We illustrate the effectiveness of the algorithm via application to two complex statistical models: (i) horseshoe regression; (ii) Gaussian graphical models. PubDate: 2022-04-23

Abstract: This paper presents tests to formally choose between regression models using different derivatives of a functional covariate in scalar-on-function regression. We demonstrate that for linear regression, models using different derivatives can be nested within a model that includes point-impact effects at the end-points of the observed functions. Contrasts can then be employed to test the specification of different derivatives. When nonlinear regression models are employed, we apply a C test to determine the statistical significance of the nonlinear structure between a functional covariate and a scalar response. The finite-sample performance of these methods is verified in simulation, and their practical application is demonstrated using both chemometric and environmental data sets. PubDate: 2022-04-23

Abstract: We propose a novel method for drift estimation of multiscale diffusion processes when a sequence of discrete observations is given. For the Langevin dynamics in a two-scale potential, our approach relies on the eigenvalues and the eigenfunctions of the homogenized dynamics. Our first estimator is derived from a martingale estimating function of the generator of the homogenized diffusion process. However, the unbiasedness of the estimator depends on the rate with which the observations are sampled. We therefore introduce a second estimator which also relies on filtering the data, and we prove that it is asymptotically unbiased independently of the sampling rate. A series of numerical experiments illustrates the reliability and efficiency of our different estimators. PubDate: 2022-04-11
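
While the paper's setting is multiscale dynamics, the basic task of drift estimation from discrete observations can be sketched on a plain Ornstein–Uhlenbeck process via its exact AR(1) representation (a toy of my own, not the authors' estimating-function or filtered estimators):

```python
import numpy as np

def ou_drift_estimate(x, dt):
    """Drift estimate for dX = -a*X dt + dW observed at spacing dt, using the
    exact AR(1) representation X_{n+1} = exp(-a*dt) * X_n + noise."""
    rho_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
    return -np.log(rho_hat) / dt

# exact simulation of a stationary OU path with a = 1, dt = 0.1
rng = np.random.default_rng(2)
a, dt, n = 1.0, 0.1, 20000
rho = np.exp(-a * dt)
x = np.empty(n)
x[0] = rng.normal(0.0, np.sqrt(0.5 / a))            # stationary sd = sqrt(1/(2a))
for i in range(1, n):
    x[i] = rho * x[i - 1] + rng.normal(0.0, np.sqrt((1.0 - rho ** 2) / (2.0 * a)))
a_hat = ou_drift_estimate(x, dt)
```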

Abstract: The use of Cauchy Markov random field priors in statistical inverse problems can potentially lead to posterior distributions which are non-Gaussian, high-dimensional, multimodal and heavy-tailed. In order to use such priors successfully, sophisticated optimization and Markov chain Monte Carlo methods are usually required. In this paper, our focus is largely on reviewing recently developed Cauchy difference priors, while introducing interesting new variants and providing a comparison. We first propose a one-dimensional second-order Cauchy difference prior, and construct new first- and second-order two-dimensional isotropic Cauchy difference priors. Another new Cauchy prior is based on the stochastic partial differential equation approach, derived from a Matérn-type Gaussian presentation. The comparison also includes Cauchy sheets. Our numerical computations are based on both maximum a posteriori and conditional mean estimation. We exploit state-of-the-art MCMC methodologies such as Metropolis-within-Gibbs, Repelling–Attracting Metropolis, and the No-U-Turn sampler variant of Hamiltonian Monte Carlo. We demonstrate the constructed models and methods on one-dimensional and two-dimensional deconvolution problems. Thorough MCMC statistics are provided for all test cases, including potential scale reduction factors. PubDate: 2022-03-25
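
A first-order Cauchy difference prior is simple to write down; a minimal sketch (the paper's second-order, two-dimensional and SPDE variants are not reproduced):

```python
import numpy as np

def cauchy_difference_logprior(x, gamma):
    """First-order Cauchy difference prior (log density): increments
    x[i+1] - x[i] are iid Cauchy(0, gamma), giving a heavy-tailed,
    edge-preserving prior."""
    d = np.diff(x)
    return float(np.sum(np.log(gamma / (np.pi * (gamma ** 2 + d ** 2)))))

flat = cauchy_difference_logprior(np.zeros(5), 1.0)
jump = cauchy_difference_logprior(np.array([0.0, 0.0, 10.0, 10.0, 10.0]), 1.0)
```

The heavy Cauchy tails penalize a single large jump far less than a Gaussian difference prior would, which is the edge-preserving property exploited in deconvolution.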

Abstract: Assessing goodness of fit to a given distribution plays an important role in computational statistics. The probability integral transformation (PIT) can be used to convert the question of whether a given sample originates from a reference distribution into a problem of testing for uniformity. We present new simulation- and optimization-based methods to obtain simultaneous confidence bands for the whole empirical cumulative distribution function (ECDF) of the PIT values under the assumption of uniformity. Simultaneous confidence bands correspond to such confidence intervals at each point that jointly satisfy a desired coverage. These methods can also be applied in cases where the reference distribution is represented only by a finite sample, which is useful, for example, for simulation-based calibration. The confidence bands provide an intuitive ECDF-based graphical test for uniformity, which also provides useful information on the quality of the discrepancy. We further extend the simulation and optimization methods to determine simultaneous confidence bands for testing whether multiple samples come from the same underlying distribution. This multiple sample comparison test is useful, for example, as a complementary diagnostic in multi-chain Markov chain Monte Carlo (MCMC) convergence diagnostics, where most currently used convergence diagnostics provide a single diagnostic value, but do not usually offer insight into the nature of the deviation. We provide numerical experiments to assess the properties of the tests using both simulated and real-world data and give recommendations on their practical application in computational statistics workflows. PubDate: 2022-03-24
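
A simulation-based simultaneous band can be sketched with a KS-style maximal-deviation calibration; this is simpler (and more conservative) than the adjusted pointwise construction the paper proposes, and all names are mine:

```python
import numpy as np

def ecdf_simultaneous_band(n, S=2000, coverage=0.95, seed=0):
    """Simultaneous confidence band for the ECDF of n uniform PIT values:
    simulate S uniform ECDFs and calibrate the band half-width as the
    `coverage` quantile of the maximal deviation from the diagonal."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.01, 0.99, 99)
    ecdfs = (rng.uniform(size=(S, n, 1)) <= grid).mean(axis=1)   # (S, 99)
    dev = np.abs(ecdfs - grid).max(axis=1)                       # max deviation per sim
    half_width = np.quantile(dev, coverage)
    lo = np.clip(grid - half_width, 0.0, 1.0)
    hi = np.clip(grid + half_width, 0.0, 1.0)
    return grid, lo, hi

grid, lo, hi = ecdf_simultaneous_band(n=200)
```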

Abstract: We investigate the use of learning approaches to handle Bayesian inverse problems in a computationally efficient way when the signals to be inverted present a moderately high number of dimensions and are large in number. We propose a tractable inverse regression approach which has the advantage of producing full probability distributions as approximations of the target posterior distributions. In addition to providing confidence indices on the predictions, these distributions allow a better exploration of inverse problems when multiple equivalent solutions exist. We then show how these distributions can be used for further refined predictions using importance sampling, while also providing a way to carry out uncertainty-level estimation if necessary. The relevance of the proposed approach is illustrated on both simulated and real data in the context of a physical model inversion in planetary remote sensing. The approach shows interesting capabilities both in terms of computational efficiency and multimodal inference. PubDate: 2022-03-22

Abstract: We introduce a framework for online changepoint detection and simultaneous model learning which is applicable to highly parametrized models, such as deep neural networks. It is based on detecting changepoints across time by sequentially performing generalized likelihood ratio tests that require only evaluations of simple prediction score functions. This procedure makes use of checkpoints, consisting of early versions of the actual model parameters, that allow us to detect distributional changes by performing predictions on future data. We define an algorithm that bounds the Type I error in the sequential testing procedure. We demonstrate the efficiency of our method in challenging continual learning applications with unknown task changepoints, and show improved performance compared to online Bayesian changepoint detection. PubDate: 2022-03-13
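
A generalized likelihood ratio scan for a single mean change in a unit-variance Gaussian sequence gives the flavor of such tests (a textbook sketch, not the paper's checkpoint-based score procedure):

```python
import numpy as np

def glr_changepoint(x):
    """GLR scan for one mean change in a unit-variance Gaussian sequence;
    returns (max statistic, split index). The statistic at split k is
    2*log LR = k*(m1 - m)^2 + (n - k)*(m2 - m)^2, with segment means m1, m2
    and overall mean m."""
    n = len(x)
    m = x.mean()
    stats = [k * (x[:k].mean() - m) ** 2 + (n - k) * (x[k:].mean() - m) ** 2
             for k in range(1, n)]
    k_hat = int(np.argmax(stats)) + 1
    return stats[k_hat - 1], k_hat

x = np.concatenate([np.zeros(50), np.full(50, 5.0)])   # clean mean shift at index 50
stat, k_hat = glr_changepoint(x)
```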

Abstract: This paper presents a novel method for the accurate functional approximation of possibly highly concentrated probability densities. It is based on the combination of several modern techniques such as transport maps and low-rank approximations via a nonintrusive tensor train reconstruction. The central idea is to carry out computations for statistical quantities of interest such as moments based on a convenient representation of a reference density for which accurate numerical methods can be employed. Since the transport from target to reference can usually not be determined exactly, one has to cope with a perturbed reference density due to a numerically approximated transport map. By the introduction of a layered approximation and appropriate coordinate transformations, the problem is split into a set of independent approximations in separately chosen orthonormal basis functions, combining the notions of h- and p-refinement (i.e. “mesh size” and polynomial degree). An efficient low-rank representation of the perturbed reference density is achieved via the Variational Monte Carlo method. This nonintrusive regression technique reconstructs the map in the tensor train format. An a priori convergence analysis with respect to the error terms introduced by the different (deterministic and statistical) approximations in the Hellinger distance and the Kullback–Leibler divergence is derived. Important applications are presented, and in particular the context of Bayesian inverse problems is illuminated, which is a main motivation for the developed approach. Several numerical examples illustrate the efficacy with densities of different complexity and degrees of perturbation of the transport to the reference density. The (superior) convergence is demonstrated in comparison to Monte Carlo and Markov chain Monte Carlo methods. PubDate: 2022-03-12

Abstract: The partial least squares approach has been particularly successful in spectrometric prediction in chemometrics. By treating the spectral data as realizations of a stochastic process, the functional partial least squares can be applied. Motivated by the spectral data collected from oriented strand board furnish, we propose a sparse version of the functional partial least squares regression. The proposed method aims at achieving locally sparse (i.e., zero on certain sub-regions) estimates for the functional partial least squares bases, and more importantly, the locally sparse estimate for the slope function. The new approach applies a functional regularization technique to each iteration step of the functional partial least squares and implements a computational method that identifies nonzero sub-regions on which the slope function is estimated. We illustrate the proposed method with simulation studies and two applications on the oriented strand board furnish data and the particulate matter emissions data. PubDate: 2022-03-09
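
One iteration of the plain (non-functional, non-sparse) partial least squares at the core of the approach can be sketched as follows; the functional regularization step is omitted:

```python
import numpy as np

def pls_component(X, y):
    """One NIPALS-style PLS step: weight w proportional to X^T y (the direction
    maximizing covariance with y), score t = X w, loading p = X^T t / (t^T t).
    Subsequent components would be extracted after deflating X by t p^T."""
    w = X.T @ y
    w = w / np.linalg.norm(w)
    t = X @ w
    p = X.T @ t / (t @ t)
    return w, t, p

rng = np.random.default_rng(6)
X, y = rng.normal(size=(30, 4)), rng.normal(size=30)
w, t, p = pls_component(X, y)
```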

Abstract: The spatio-temporal epidemic type aftershock sequence (ETAS) model is widely used to describe the self-exciting nature of earthquake occurrences. While traditional inference methods provide only point estimates of the model parameters, we aim at a fully Bayesian treatment of model inference, which naturally allows the incorporation of prior knowledge and uncertainty quantification of the resulting estimates. Therefore, we introduce a highly flexible, non-parametric representation for the spatially varying ETAS background intensity through a Gaussian process (GP) prior. Combined with classical triggering functions, this results in a new model formulation, namely the GP-ETAS model. We enable tractable and efficient Gibbs sampling by deriving an augmented form of the GP-ETAS inference problem. This novel sampling approach allows us to assess the posterior model variables conditioned on observed earthquake catalogues, i.e., the spatial background intensity and the parameters of the triggering function. Empirical results on two synthetic data sets indicate that GP-ETAS outperforms standard models and thus demonstrate the predictive power for observed earthquake catalogues including uncertainty quantification for the estimated parameters. Finally, a case study for the l’Aquila region, Italy, with the devastating event on 6 April 2009, is presented. PubDate: 2022-03-08
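
The temporal part of the ETAS conditional intensity (background rate plus Omori-law triggering) is compact to state; the spatial component and the GP background of the paper are omitted, and the parameter values below are arbitrary:

```python
import numpy as np

def etas_intensity(t, events, mu, K, alpha, c, p, M0):
    """Temporal ETAS conditional intensity at time t: background rate mu plus
    Omori-law aftershock triggering K * exp(alpha*(m_i - M0)) * (t - t_i + c)^-p
    summed over all past events (t_i, m_i)."""
    ti, mi = events[:, 0], events[:, 1]
    past = ti < t
    trig = K * np.exp(alpha * (mi[past] - M0)) * (t - ti[past] + c) ** (-p)
    return mu + trig.sum()

events = np.array([[0.0, 5.0], [2.0, 4.0]])   # (time, magnitude) pairs
lam = etas_intensity(3.0, events, mu=0.1, K=0.05, alpha=1.0, c=0.01, p=1.1, M0=3.0)
```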

Abstract: A well-known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens the inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. We introduce a post-processing scheme in order to deal with rotation, sign and permutation invariance of the MCMC sample. The exact version of the contributed algorithm requires solving \(2^q\) assignment problems per (retained) MCMC iteration, where q denotes the number of factors of the fitted model. For large numbers of factors, two approximate schemes based on simulated annealing are also discussed. We demonstrate that the proposed method leads to interpretable posterior distributions using synthetic and publicly available data from typical factor analytic models as well as mixtures of factor analyzers. An R package is available online on CRAN. PubDate: 2022-02-27
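
The rotation part of the invariance can be illustrated with an orthogonal Procrustes alignment of a loading matrix to a template; the paper's scheme additionally resolves sign and permutation invariance via assignment problems:

```python
import numpy as np

def procrustes_align(L, target):
    """Orthogonal Procrustes alignment: the orthogonal Q minimizing
    ||L Q - target||_F is Q = U V^T from the SVD of L^T target."""
    U, _, Vt = np.linalg.svd(L.T @ target)
    return L @ (U @ Vt)

rng = np.random.default_rng(5)
target = rng.normal(size=(10, 3))                # template loading matrix
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))     # a random orthogonal transformation
L_rotated = target @ Q                           # a "posterior draw" rotated away
```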

Abstract: Performing optimal Bayesian design for discriminating between competing models is computationally intensive as it involves estimating posterior model probabilities for thousands of simulated data sets. This issue is compounded further when the likelihood functions for the rival models are computationally expensive. A new approach using supervised classification methods is developed to perform Bayesian optimal model discrimination design. This approach requires considerably fewer simulations from the candidate models than previous approaches using approximate Bayesian computation. Further, it is easy to assess the performance of the optimal design through the misclassification error rate. The approach is particularly useful in the presence of models with intractable likelihoods but can also provide computational advantages when the likelihoods are manageable. PubDate: 2022-02-22
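
A toy version of the classification-based criterion: simulate from both rival models at each candidate design, train a trivial nearest-mean classifier, and score designs by misclassification error (the models and classifier here are stand-ins, not those of the paper):

```python
import numpy as np

def discrimination_design(designs, simulate, n_sim, rng):
    """Score each candidate design by the held-out misclassification error of a
    nearest-mean classifier trained on data simulated from the two rival
    models; the best design minimizes this error."""
    errors = []
    for d in designs:
        y0, y1 = simulate(0, d, n_sim, rng), simulate(1, d, n_sim, rng)
        m0, m1 = y0[: n_sim // 2].mean(), y1[: n_sim // 2].mean()   # "training" halves
        closer_to_1 = lambda y: np.abs(y - m1) < np.abs(y - m0)
        err = 0.5 * (closer_to_1(y0[n_sim // 2:]).mean()
                     + (~closer_to_1(y1[n_sim // 2:])).mean())
        errors.append(float(err))
    return designs[int(np.argmin(errors))], errors

def simulate(k, d, n, rng):
    """Toy rival models: y ~ N(theta_k * d, 1) with theta_0 = 0, theta_1 = 1."""
    return rng.normal(k * d, 1.0, n)

best, errs = discrimination_design(np.array([0.2, 1.0, 3.0]), simulate, 4000,
                                   np.random.default_rng(3))
```

Designs that separate the two models most are easiest to classify, so the criterion selects the largest design here.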

Abstract: High-dimensional limit theorems have been shown to be useful for deriving tuning rules for finding the optimal scaling in random walk Metropolis algorithms. The assumptions under which weak convergence results are proved are, however, restrictive: the target density is typically assumed to be of a product form. Users may thus doubt the validity of such tuning rules in practical applications. In this paper, we shed some light on optimal scaling problems from a different perspective, namely a large-sample one. This allows us to prove weak convergence results under realistic assumptions and to propose novel parameter-dimension-dependent tuning guidelines. The proposed guidelines are consistent with the previous ones when the target density is close to having a product form. When it is not, the results highlight that the correlation structure has to be accounted for to avoid performance deterioration, and they justify the use of a natural (asymptotically exact) approximation to the correlation matrix that can be employed for the very first algorithm run. PubDate: 2022-02-18
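
The classical product-form guideline is easy to check numerically: random walk Metropolis on a standard normal target in dimension 50, with proposal scale 2.38/√d, accepts roughly 23% of proposals (a sketch of the baseline theory, not of the paper's new guidelines):

```python
import numpy as np

def rwm_acceptance(log_target, dim, scale, n_iter, rng):
    """Random walk Metropolis on R^dim with isotropic Gaussian proposals;
    returns the empirical acceptance rate."""
    x = np.zeros(dim)
    lp = log_target(x)
    accepted = 0
    for _ in range(n_iter):
        y = x + scale * rng.normal(size=dim)
        lq = log_target(y)
        if np.log(rng.uniform()) < lq - lp:
            x, lp = y, lq
            accepted += 1
    return accepted / n_iter

log_target = lambda z: -0.5 * np.sum(z ** 2)   # standard normal target (product form)
dim = 50
rate = rwm_acceptance(log_target, dim, 2.38 / np.sqrt(dim), 20000,
                      np.random.default_rng(4))
```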

Abstract: When statistical analyses consider multiple data sources, Markov melding provides a method for combining the source-specific Bayesian models. Markov melding joins together submodels that have a common quantity. One challenge is that the prior for this quantity can be implicit, and its prior density must be estimated. We show that error in this density estimate makes the two-stage Markov chain Monte Carlo sampler employed by Markov melding unstable and unreliable. We propose a robust two-stage algorithm that estimates the required prior marginal self-density ratios using weighted samples, dramatically improving accuracy in the tails of the distribution. The stabilised version of the algorithm is pragmatic and provides reliable inference. We demonstrate our approach using an evidence synthesis for inferring HIV prevalence, and an evidence synthesis of A/H1N1 influenza. PubDate: 2022-02-18

Abstract: Spectral embedding of network adjacency matrices often produces node representations living approximately around low-dimensional submanifold structures. In particular, hidden substructure is expected to arise when the graph is generated from a latent position model. Furthermore, the presence of communities within the network might generate community-specific submanifold structures in the embedding, but this is not explicitly accounted for in most statistical models for networks. In this article, a class of models called latent structure block models (LSBM) is proposed to address such scenarios, allowing for graph clustering when community-specific one-dimensional manifold structure is present. LSBMs focus on a specific class of latent space model, the random dot product graph (RDPG), and assign a latent submanifold to the latent positions of each community. A Bayesian model for the embeddings arising from LSBMs is discussed, and shown to have a good performance on simulated and real-world network data. The model is able to correctly recover the underlying communities living in a one-dimensional manifold, even when the parametric form of the underlying curves is unknown, achieving remarkable results on a variety of real data. PubDate: 2022-02-16
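
The adjacency spectral embedding that underlies RDPG-based models is compact to state (a standard sketch; the LSBM itself adds community-specific submanifolds on top of such embeddings):

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """d-dimensional adjacency spectral embedding of a symmetric matrix:
    the d largest-magnitude eigenvectors scaled by sqrt(|eigenvalue|),
    the standard estimator of RDPG latent positions (up to rotation)."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

X = np.array([[0.8, 0.1], [0.1, 0.8], [0.5, 0.5]])   # latent positions
P = X @ X.T                                          # RDPG edge-probability matrix
X_hat = adjacency_spectral_embedding(P, 2)
```

Applied to the noiseless probability matrix, the embedding recovers latent positions exactly up to an orthogonal transformation, so the inner-product matrix is reproduced.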

Abstract: Many real-world problems require one to estimate parameters of interest, in a Bayesian framework, from data that are collected sequentially in time. Conventional methods for sampling from posterior distributions, such as Markov chain Monte Carlo, cannot efficiently address such problems as they do not take advantage of the data’s sequential structure. To this end, sequential methods which seek to update the posterior distribution whenever a new collection of data becomes available are often used to solve these types of problems. Two popular choices of sequential method are the ensemble Kalman filter (EnKF) and the sequential Monte Carlo sampler (SMCS). While the EnKF only computes a Gaussian approximation of the posterior distribution, the SMCS can draw samples directly from the posterior. Its performance, however, depends critically upon the kernels that are used. In this work, we present a method that constructs the kernels of the SMCS using an EnKF formulation, and we demonstrate the performance of the method with numerical examples. PubDate: 2022-02-15 DOI: 10.1007/s11222-021-10075-x