Abstract: We develop a flexible parametric framework for the estimation of quantile functions. This involves the specification of an analytical quantile-distribution function, which is shown to adapt well to a wide range of distributions under reasonable assumptions. We derive a least-squares-type estimator, leading to computationally efficient inference. By-products include a test for comparing two distributions, a variable selection method, and an innovative naïve Bayes classifier. Properties of the estimator, of the asymptotic test and of the classifier are investigated through theoretical results and simulation studies, and illustrated through a real data example. PubDate: 2023-03-24
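A minimal sketch of the least-squares idea described above, under assumptions not taken from the paper: a hypothetical two-parameter logistic quantile function Q(p) = mu + s * logit(p) stands in for the paper's parametric family, fitted by minimizing squared distances between sample order statistics and Q evaluated at plotting positions.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.sort(rng.logistic(loc=2.0, scale=0.5, size=500))  # order statistics
p = np.arange(1, x.size + 1) / (x.size + 1)              # plotting positions

def q_model(theta, p):
    # Hypothetical 2-parameter quantile function: location + scale * logit(p).
    mu, log_s = theta
    return mu + np.exp(log_s) * np.log(p / (1 - p))

res = least_squares(lambda th: x - q_model(th, p), x0=[0.0, 0.0])
mu_hat, s_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, s_hat)  # should land near the true (2.0, 0.5)
```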
Abstract: This paper presents the development of a spatial block-Nearest Neighbor Gaussian process (blockNNGP) for location-referenced large spatial data. The key idea behind this approach is to divide the spatial domain into several blocks which are dependent under some constraints. The cross-blocks capture the large-scale spatial dependence, while each block captures the small-scale spatial dependence. The resulting blockNNGP enjoys Markov properties reflected in its sparse precision matrix. It is embedded as a prior within the class of latent Gaussian models, so that fast Bayesian inference is obtained using the integrated nested Laplace approximation. The performance of the blockNNGP is illustrated on simulated examples, in a comparison of our approach with other methods for analyzing large spatial data, and in applications with Gaussian and non-Gaussian real data. PubDate: 2023-03-23
Abstract: Maximum likelihood estimation in logistic regression with mixed effects is known to often result in estimates on the boundary of the parameter space. Such estimates, which include infinite values for fixed effects and singular or infinite variance components, can wreak havoc on numerical estimation procedures and inference. We introduce an appropriately scaled additive penalty to the log-likelihood function, or an approximation thereof, which penalizes the fixed effects by Jeffreys' invariant prior for the model with no random effects, and the variance components by a composition of negative Huber loss functions. The resulting maximum penalized likelihood estimates are shown to lie in the interior of the parameter space. Appropriate scaling of the penalty guarantees that the penalization is soft enough to preserve the optimal asymptotic properties expected of the maximum likelihood estimator, namely consistency, asymptotic normality, and Cramér-Rao efficiency. Our choice of penalties and scaling factor preserves equivariance of the fixed effects estimates under linear transformations of the model parameters, such as contrasts. Maximum softly-penalized likelihood is compared to competing approaches on two real-data examples, and through comprehensive simulation studies that illustrate its superior finite sample performance. PubDate: 2023-03-16
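For the random-effects-free special case the abstract mentions, the Jeffreys-prior penalty reduces to Firth's adjustment 0.5 * log|X'WX|. A minimal sketch of that special case only (the variance-component Huber penalty and the mixed-model structure are not reproduced), on hypothetical separated data where unpenalized ML would diverge:

```python
import numpy as np
from scipy.optimize import minimize

def neg_penalized_loglik(beta, X, y):
    # Logistic log-likelihood plus Jeffreys (Firth) penalty 0.5 * log|X'WX|,
    # with W = diag(p * (1 - p)); the penalty keeps estimates finite even
    # under complete separation.
    eta = X @ beta
    p = 1.0 / (1.0 + np.exp(-eta))
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    W = p * (1.0 - p)
    _, logdet = np.linalg.slogdet(X.T @ (W[:, None] * X))
    return -(loglik + 0.5 * logdet)

# Hypothetical separated data: plain ML diverges, the penalized fit stays finite.
X = np.column_stack([np.ones(6), [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
beta_hat = minimize(neg_penalized_loglik, np.zeros(2), args=(X, y)).x
print(beta_hat)
```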
Abstract: This paper introduces the subgraph nomination inference task, in which example subgraphs of interest are used to query a network for similarly interesting subgraphs. This type of task appears time and again in real-world problems connected to, for example, user recommendation systems and structural retrieval tasks in social and biological/connectomic networks. We formally define the subgraph nomination framework with an emphasis on the notion of a user-in-the-loop in the subgraph nomination pipeline. In this setting, a user can provide additional post-nomination light supervision that can be incorporated into the retrieval task. After introducing and formalizing the retrieval task, we examine the nuanced effect that user supervision can have on performance, both analytically and across real and simulated data examples. PubDate: 2023-03-06
Abstract: In many situations we are interested in modeling data where there is no clear relationship between the response and the covariates. In the literature there are few related proposals based on additive partially linear models and the normal distribution. It is also common to find situations where the response distribution, even conditionally on the covariates, presents asymmetry and/or heavy tails. In these situations it is more suitable to consider models based on the general class of scale mixtures of skew-normal distributions, mainly under the respective centered reparameterization, due to some inferential issues. In this paper, we develop a class of additive partially linear models based on scale mixtures of skew-normal distributions under the centered parameterization. We explore a hierarchical representation and set up an algorithm for maximum likelihood estimation based on the stochastic-approximation-expectation-maximization and expectation-conditional-maximization-either algorithms. A Monte Carlo experiment is conducted to evaluate the performance of these estimators in small and moderate samples. Furthermore, we develop residual and influence diagnostic tools. The methodology is illustrated with the analysis of a real data set. PubDate: 2023-03-06
Abstract: We propose a Bayesian hypothesis testing procedure for comparing the multivariate distributions of several treatment groups against a control group. This test is derived from a flexible model for the group distributions based on a random binary vector such that, if its jth element equals one, then the jth treatment group is merged with the control group. The group distributions' flexibility comes from a dependent Dirichlet process, while the latent vector prior distribution ensures a multiplicity correction to the testing procedure. We explore the posterior consistency of the Bayes factor and provide a Monte Carlo simulation study comparing the performance of our procedure with state-of-the-art alternatives. Our results show that the presented method performs better than competing approaches. Finally, we apply our proposal to two classical experiments. The first one studies the effects of tuberculosis vaccines on multiple health outcomes for rabbits, and the second one analyzes the effects of two drugs on weight gain for rats. In both applications, we find relevant differences between the control group and at least one treatment group. PubDate: 2023-03-06
Abstract: Stochastic Gradient Algorithms (SGAs) are ubiquitous in computational statistics, machine learning and optimisation. Recent years have brought an influx of interest in SGAs, and the non-asymptotic analysis of their bias is by now well developed. However, relatively little is known about the optimal choice of the random approximation (e.g., mini-batching) of the gradient in SGAs, as this relies on the analysis of the variance and is problem specific. While there have been numerous attempts to reduce the variance of SGAs, these typically exploit a particular structure of the sampled distribution by requiring a priori knowledge of its density's mode. In this paper, we construct a Multi-index Antithetic Stochastic Gradient Algorithm (MASGA) whose implementation is independent of the structure of the target measure. Our rigorous theoretical analysis demonstrates that for log-concave targets, MASGA achieves performance on par with Monte Carlo estimators that have access to unbiased samples from the distribution of interest. In other words, MASGA is an optimal estimator from the mean square error-computational cost perspective within the class of Monte Carlo estimators. To illustrate the robustness of our approach, we also implement MASGA in some simple non-log-concave numerical examples, albeit without providing theoretical guarantees on our algorithm's performance in such settings. PubDate: 2023-03-03
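MASGA's multi-index construction is well beyond a few lines; the sketch below illustrates only the antithetic building block the name refers to, on a hypothetical smooth functional of a Gaussian input: pairing each draw with its reflection and averaging reduces the variance of the plain Monte Carlo estimator.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(z):
    return np.exp(0.5 * z)  # some smooth functional of a Gaussian input

n = 10_000
z = rng.standard_normal(n)
plain = f(z)                     # standard Monte Carlo draws
anti = 0.5 * (f(z) + f(-z))      # antithetic pairs (z, -z) averaged
print(plain.var() / anti.var())  # per-draw variance ratio; > 1 means reduction
```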
Abstract: The Matérn covariance function is ubiquitous in the application of Gaussian processes to spatial statistics and beyond. Perhaps the most important reason for this is that the smoothness parameter \(\nu\) gives complete control over the mean-square differentiability of the process, which has significant implications for the behavior of estimated quantities such as interpolants and forecasts. Unfortunately, derivatives of the Matérn covariance function with respect to \(\nu\) require derivatives of the modified Bessel function of the second kind \({\mathcal {K}}_{\nu }\) with respect to \(\nu\). While closed-form expressions of these derivatives do exist, they are prohibitively difficult and expensive to compute. For this reason, many software packages require fixing \(\nu\) as opposed to estimating it, and all existing software packages that attempt to offer the functionality of estimating \(\nu\) use finite difference estimates for \(\partial _\nu {\mathcal {K}}_{\nu }\). In this work, we introduce a new implementation of \({\mathcal {K}}_{\nu }\) that has been designed to provide derivatives via automatic differentiation (AD), and whose resulting derivatives are significantly faster and more accurate than those computed using finite differences. We provide comprehensive testing for both speed and accuracy and show that our AD solution can be used to build accurate Hessian matrices for second-order maximum likelihood estimation in settings where Hessians built with finite difference approximations completely fail. PubDate: 2023-02-28
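For concreteness, a minimal sketch of the finite-difference baseline the abstract says existing packages rely on (a central difference in the order of scipy's modified Bessel function of the second kind); this is the approach the paper's AD implementation is designed to replace, not the paper's method itself.

```python
from scipy.special import kv

def dkv_dnu_fd(nu, x, h=1e-6):
    # Central finite difference of K_nu(x) in the order nu -- the baseline
    # the paper improves on; accuracy degrades as the two evaluations cancel.
    return (kv(nu + h, x) - kv(nu - h, x)) / (2.0 * h)

print(dkv_dnu_fd(1.5, 2.0))
```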
Abstract: This paper studies the estimation of Gaussian graphical models in the unbalanced distributed framework. It provides an effective approach when the available machines are of different computational power, or when the existing dataset comes from different sources with different sizes and cannot be aggregated on one single machine. In this paper, we propose a new aggregated estimator of the precision matrix and justify such an approach by both theoretical and practical arguments. The limit distribution and convergence rate for this estimator are provided under sparsity conditions on the true precision matrix and control of the number of machines. Furthermore, a procedure for performing statistical inference is proposed. On the practical side, using a simulation study and a real data example, we show that the performance of the distributed estimator is similar to that of the non-distributed estimator that uses the full data. PubDate: 2023-02-25
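The paper's aggregation and debiasing scheme is not reproduced here; as a rough illustration of the distributed setting, the sketch below fits a graphical lasso on each unbalanced shard and combines the shard-level precision estimates with a simple sample-size-weighted average (a hypothetical stand-in for the proposed aggregated estimator).

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(2)
true_prec = np.array([[2.0, 0.6, 0.0], [0.6, 2.0, 0.6], [0.0, 0.6, 2.0]])
cov = np.linalg.inv(true_prec)

# Unbalanced shards: machines hold different amounts of data.
shards = [rng.multivariate_normal(np.zeros(3), cov, size=n) for n in (200, 500, 1200)]
ests = [GraphicalLassoCV().fit(s).precision_ for s in shards]

# Size-weighted average of the shard-level precision estimates.
weights = np.array([s.shape[0] for s in shards], dtype=float)
agg = sum(w * e for w, e in zip(weights / weights.sum(), ests))
print(np.round(agg, 2))
```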
Abstract: The concept of depth has proved very important for multivariate and functional data analysis, as it essentially acts as a surrogate for the notion of ranking of observations, which is absent in more than one dimension. Motivated by the rapid development of technology, in particular the advent of ‘Big Data’, we extend here that concept to general metric spaces, propose a natural depth measure and explore its properties as a statistical depth function. Working in a general metric space allows the depth to be tailored to the data at hand and to the ultimate goal of the analysis, a very desirable property given the polymorphic nature of modern data sets. This flexibility is thoroughly illustrated by several real data analyses. PubDate: 2023-02-23
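The paper's own depth measure is not reproduced here; as an illustration of how a depth can be computed in an arbitrary metric space from pairwise distances alone, the sketch below uses the classical lens depth: the fraction of sample pairs (X_i, X_j) whose "lens" contains the query point, i.e. max(d(q, X_i), d(q, X_j)) <= d(X_i, X_j).

```python
import numpy as np
from itertools import combinations

def lens_depth(query, sample, dist):
    # Fraction of pairs (i, j) with max(d(q, x_i), d(q, x_j)) <= d(x_i, x_j).
    # Only pairwise distances are needed, so any metric space works.
    pairs = list(combinations(range(len(sample)), 2))
    inside = sum(
        max(dist(query, sample[i]), dist(query, sample[j])) <= dist(sample[i], sample[j])
        for i, j in pairs
    )
    return inside / len(pairs)

rng = np.random.default_rng(3)
pts = list(rng.standard_normal((50, 2)))
euclid = lambda a, b: float(np.linalg.norm(a - b))
print(lens_depth(np.zeros(2), pts, euclid))      # central point: high depth
print(lens_depth(np.full(2, 4.0), pts, euclid))  # outlying point: low depth
```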
Abstract: The importance of the step size in stochastic optimization has been confirmed both theoretically and empirically during the past few decades, and has been reconsidered in recent years, especially for large-scale learning. Different rules for selecting the step size have been discussed since the advent of stochastic approximation methods. The first part of this work reviews the studies on several representative techniques for setting the step size, covering heuristic rules, meta-learning procedures, adaptive step size techniques and line search techniques. The second part of this work proposes a novel class of accelerated stochastic optimization methods by resorting to the Barzilai–Borwein (BB) technique with a diagonal selection rule for the metric, termed DBB. We first explore the theoretical and empirical properties of variance-reduced stochastic optimization algorithms with DBB. In particular, we study the theoretical and numerical properties of the resulting method in the strongly convex and non-convex cases respectively. To further demonstrate the efficacy of the DBB step size schedule, we extend it to more general stochastic optimization methods. The theoretical and empirical properties of this extension are also developed under different settings. Extensive numerical results in machine learning are offered, suggesting that the proposed algorithms show much promise. PubDate: 2023-02-20
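The classical scalar Barzilai–Borwein step, which the paper's diagonal variant (DBB) generalizes, in a minimal deterministic sketch: with s = x_k - x_{k-1} and y = grad(x_k) - grad(x_{k-1}), the step size is alpha = (s's)/(s'y). The diagonal metric and the stochastic, variance-reduced setting of the paper are not reproduced.

```python
import numpy as np

# Ill-conditioned quadratic f(x) = 0.5 x'Ax with gradient Ax.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

x_prev = np.array([3.0, 1.0])
x = x_prev - 0.05 * grad(x_prev)   # one small fixed step to initialise
for _ in range(30):
    s, y = x - x_prev, grad(x) - grad(x_prev)
    if s @ s < 1e-20:              # already at machine precision
        break
    alpha = (s @ s) / (s @ y)      # scalar BB1 step; DBB would use a
    x_prev, x = x, x - alpha * grad(x)  # diagonal metric here instead
print(x)  # close to the minimiser at the origin
```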
Abstract: A natural extension to standard Gaussian process (GP) regression is the use of non-stationary Gaussian processes, an approach where the parameters of the covariance kernel are allowed to vary in time or space. The non-stationary GP is a flexible model that relaxes the strong prior assumption of standard GP regression that the covariance properties of the inferred functions are constant across the input space. Non-stationary GPs typically model varying covariance kernel parameters as further lower-level GPs, thereby enabling sampling-based inference. However, due to the high computational costs and inherently sequential nature of MCMC sampling, these methods do not scale to large datasets. Here we develop a variational inference approach to fitting non-stationary GPs that combines sparse GP regression methods with a trajectory segmentation technique. Our method is scalable to large datasets containing potentially millions of data points. We demonstrate the effectiveness of our approach on both synthetic and real-world datasets. PubDate: 2023-02-17
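A minimal illustration of what an input-dependent covariance looks like, using the standard Gibbs kernel with a hypothetical fixed lengthscale profile l(x) (the paper instead places lower-level GPs on such parameters and fits them variationally, which is not reproduced here).

```python
import numpy as np

def gibbs_kernel(x1, x2, ell):
    # Non-stationary squared exponential with input-dependent lengthscale l(x):
    # k(x, x') = sqrt(2 l l' / (l^2 + l'^2)) * exp(-(x - x')^2 / (l^2 + l'^2)).
    l1, l2 = ell(x1)[:, None], ell(x2)[None, :]
    sq = (x1[:, None] - x2[None, :]) ** 2
    denom = l1**2 + l2**2
    return np.sqrt(2 * l1 * l2 / denom) * np.exp(-sq / denom)

ell = lambda x: 0.1 + 0.5 * np.abs(x)   # hypothetical lengthscale profile
x = np.linspace(-2, 2, 100)
K = gibbs_kernel(x, x, ell)
# A valid covariance: all eigenvalues positive (up to jitter).
print(np.all(np.linalg.eigvalsh(K + 1e-9 * np.eye(100)) > 0))
```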
Abstract: Models of social interaction have mostly focused on the dyad, the smallest possible social structure, as the unit of network analysis. In the context of friendship networks, we argue that the triad could also be seen as a building block to ensure cohesion and stability of larger group structures. When the mechanism behind network formation is modeled explicitly, individual attributes (such as gender and ethnicity) are often dissociated from purely structural network effects (such as popularity), acknowledging the presence of more complex configurations. Allowing structural configurations to emerge when nodes share similar attribute values describes real-world networks more adequately. We present a comprehensive set of network statistics that allow for continuous attributes to be accounted for. We also draw on the important literature on endogenous social effects to further explore the role of network structures on individual outcomes. A series of Monte Carlo experiments and an empirical example analyzing students' friendship networks illustrate the importance of properly modeling attribute-based structural effects. In addition, we model unobserved nodal heterogeneity in the network formation process to control for possible friendship selection bias on educational outcomes. A critical issue discussed is whether friendships are driven by homogeneity across several attributes or by a balance between homophily on some, such as gender and race, and heterophily on others, such as socio-economic factors. PubDate: 2023-02-16
Abstract: Approximate Bayesian computation (ABC) is commonly used for parameter estimation and model comparison for intractable simulator-based statistical models whose likelihood function cannot be evaluated. In this paper we instead investigate the feasibility of ABC as a generic approximate method for predictive inference, in particular, for computing the posterior predictive distribution of future observations or missing data of interest. We consider three complementary ABC approaches for this goal, each based on different assumptions regarding which predictive density of the intractable model can be sampled from. The case where only simulation from the joint density of the observed and future data given the model parameters can be used for inference is given particular attention and it is shown that the ideal summary statistic in this setting is minimal predictive sufficient instead of merely minimal sufficient (in the ordinary sense). An ABC prediction approach that takes advantage of a certain latent variable representation is also investigated. We additionally show how common ABC sampling algorithms can be used in the predictive settings considered. Our main results are first illustrated by using simple time-series models that facilitate analytical treatment, and later by using two common intractable dynamic models. PubDate: 2023-02-09
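A minimal sketch of the joint-simulation setting the abstract highlights: draw parameters from the prior, simulate observed and future data jointly, accept when a summary of the simulated observed part is close to the real data, and keep the attached future segment as a draw from an approximate posterior predictive. The AR(1) model, the prior and the summaries here are all hypothetical illustrations, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate(phi, n_obs, n_future):
    # Joint simulation of observed and future segments of a toy AR(1).
    y = np.zeros(n_obs + n_future)
    for t in range(1, len(y)):
        y[t] = phi * y[t - 1] + rng.normal(scale=0.5)
    return y[:n_obs], y[n_obs:]

def summary(y):
    # Hypothetical summaries; the abstract's point is that for prediction they
    # should be predictive sufficient, not merely sufficient for the parameter.
    return np.array([y.std(), np.corrcoef(y[:-1], y[1:])[0, 1]])

y_obs, _ = simulate(0.7, 200, 0)   # pretend-observed data
s_obs = summary(y_obs)

predictive_draws = []
while len(predictive_draws) < 200:
    phi = rng.uniform(-0.99, 0.99)                    # prior draw
    y_sim, y_future = simulate(phi, len(y_obs), 10)   # joint simulation
    if np.linalg.norm(summary(y_sim) - s_obs) < 0.1:  # ABC accept/reject
        predictive_draws.append(y_future)             # future comes for free
print(np.mean(predictive_draws, axis=0))  # approx. posterior predictive mean
```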
Abstract: Clustering techniques for multivariate data are useful tools in Statistics that have been extensively studied in the literature. However, there is limited literature on clustering methodologies for functional data. Our proposal consists of a clustering procedure for functional data using techniques for clustering multivariate data. The idea is to reduce a functional data problem to a multivariate one by applying the epigraph and hypograph indexes to the original curves and to their first and/or second derivatives. All the information given by the functional data is therefore transformed to the multivariate context, being informative enough for the usual multivariate clustering techniques to be efficient. The performance of this new methodology is evaluated through a simulation study and is also illustrated through real data sets. The results are compared to those of some other clustering procedures for functional data. PubDate: 2023-02-04
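A minimal sketch of the reduction described above, under one common definition of the indexes (whether it matches the paper's exact definition is not verified here): on a shared grid, a curve's epigraph index is the proportion of sample curves lying entirely on or above it, the hypograph index the proportion lying entirely on or below it, and the resulting scalars feed an ordinary multivariate clusterer.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 50)
# Two hypothetical groups of curves on a common grid (rows = curves).
curves = np.vstack([
    np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((20, t.size)),
    np.sin(2 * np.pi * t) + 1.0 + 0.1 * rng.standard_normal((20, t.size)),
])

def epigraph_index(X):
    # Proportion of curves lying entirely on/above each curve.
    return np.array([np.mean(np.all(X >= x, axis=1)) for x in X])

def hypograph_index(X):
    # Proportion of curves lying entirely on/below each curve.
    return np.array([np.mean(np.all(X <= x, axis=1)) for x in X])

features = np.column_stack([epigraph_index(curves), hypograph_index(curves)])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)
print(labels)  # recovers the two groups
```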
Abstract: Boosting is one of the most powerful statistical learning methods, combining multiple weak learners into a strong learner. The main idea of boosting is to sequentially apply the algorithm to enhance its performance. Recently, boosting methods have been implemented to handle variable selection. However, little work is available on dealing with complex data features such as measurement error in covariates. In this paper, we adopt the boosting method to do variable selection, especially in the presence of measurement error. We develop two different approximated correction approaches to deal with different types of responses and, meanwhile, eliminate measurement error effects. In addition, the proposed algorithms are easy to implement and yield precise estimators. Throughout numerical studies under various settings, the proposed method outperforms other competitive approaches. PubDate: 2023-02-04
Abstract: Estimating the expectations of functionals applied to sums of random variables (RVs) is a well-known problem encountered in many challenging applications. Generally, closed-form expressions of these quantities are out of reach. A naive Monte Carlo simulation is an alternative approach. However, this method requires numerous samples for rare event problems. Therefore, it is paramount to use variance reduction techniques to develop fast and efficient estimation methods. In this work, we use importance sampling (IS), known for requiring fewer computations to achieve a given accuracy. We propose a state-dependent IS scheme based on a stochastic optimal control formulation, where the control depends on state and time. We aim to calculate rare event quantities that can be written as an expectation of a functional of the sums of independent RVs. The proposed algorithm is generic and can be applied without restrictions on the univariate distributions of the RVs or the functional applied to the sum. We apply this approach to the log-normal distribution to compute the left tail and cumulative distribution of the ratio of independent RVs. For each case, we numerically demonstrate that the proposed state-dependent IS algorithm compares favorably to most well-known estimators dealing with similar problems. PubDate: 2023-02-04
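For contrast with the paper's state-dependent scheme (not reproduced here), a minimal state-independent IS baseline for the left tail of a sum of i.i.d. standard log-normals: shift the underlying Gaussian inputs by a hypothetical fixed amount theta and correct with the likelihood ratio exp(-theta * sum(z) + n * theta^2 / 2).

```python
import numpy as np

rng = np.random.default_rng(6)
n, gamma, theta = 10, 2.0, -1.5   # dimension, threshold, hypothetical shift
N = 100_000

# Naive Monte Carlo: essentially no samples land in the rare left-tail event.
z = rng.standard_normal((N, n))
naive = np.mean(np.exp(z).sum(axis=1) <= gamma)

# Importance sampling under a mean shift: draw Z + theta and weight by
# prod_i phi(z_i) / phi(z_i - theta) = exp(-theta * sum(z_i) + n theta^2 / 2),
# evaluated at the shifted draws.
z_shift = z + theta
w = np.exp(-theta * z_shift.sum(axis=1) + 0.5 * n * theta**2)
is_est = np.mean(w * (np.exp(z_shift).sum(axis=1) <= gamma))
print(naive, is_est)  # naive is typically 0.0 here; IS returns a usable estimate
```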
Abstract: We introduce a novel methodology for robust Bayesian estimation with robust divergences (e.g., the density power divergence or \(\gamma\)-divergence), indexed by tuning parameters. It is well known that the posterior density induced by a robust divergence gives highly robust estimators against outliers if the tuning parameter is appropriately and carefully chosen. In a Bayesian framework, one way to find the optimal tuning parameter would be to use the evidence (marginal likelihood). However, we illustrate theoretically and numerically that the evidence induced by the density power divergence does not work for selecting the optimal tuning parameter, since a robust divergence cannot be regarded as a statistical model. To overcome this problem, we treat the exponential of the robust divergence as an unnormalisable statistical model, and we estimate the tuning parameter by minimising the Hyvärinen score. We also provide adaptive computational methods based on sequential Monte Carlo samplers, enabling us to obtain the optimal tuning parameter and samples from posterior distributions simultaneously. The empirical performance of the proposed method is assessed through simulations and an application to real data. PubDate: 2023-02-04
Abstract: Probabilistic forecasting of time series is an important matter in many applications and research fields. In order to draw conclusions from a probabilistic forecast, we must ensure that the model class used to approximate the true forecasting distribution is expressive enough. Yet, characteristics of the model itself, such as its uncertainty or its feature-outcome relationship, are no less important. This paper proposes Autoregressive Transformation Models (ATMs), a model class inspired by various research directions that unites expressive distributional forecasts, based on a semi-parametric distribution assumption, with an interpretable model specification. We demonstrate the properties of ATMs both theoretically and through empirical evaluation on several simulated and real-world forecasting datasets. PubDate: 2023-02-04
Abstract: Many model inversion problems occur in industry. These problems consist in finding the set of parameter values such that a certain quantity of interest respects a constraint, for example remains below a threshold. In general, the quantity of interest is the output of a simulator that is costly to evaluate. An effective way to solve this problem is to replace the simulator by a Gaussian process regression, with an experimental design enriched sequentially by a well-chosen acquisition criterion. Different inversion-adapted criteria exist, such as the Bichon criterion (also known as the expected feasibility function) and the deviation number. There also exists a class of enrichment strategies (stepwise uncertainty reduction, SUR) which select the next point by measuring the expected uncertainty reduction induced by its selection. In this paper we propose a SUR version of the Bichon criterion. An explicit formulation of the criterion is given, and test comparisons show good performance on classical test functions. PubDate: 2023-02-04
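For reference, a sketch of the original (non-SUR) Bichon expected feasibility function in its commonly stated form, for a GP with posterior mean m(x), standard deviation s(x), threshold T and band half-width eps = 2 s(x); the paper's SUR version is not reproduced here. The criterion is largest where the posterior is both close to the threshold and uncertain.

```python
import numpy as np
from scipy.stats import norm

def expected_feasibility(m, s, T):
    # Bichon et al. expected feasibility function: E[max(eps - |G - T|, 0)]
    # for G ~ N(m, s^2), in its commonly stated closed form with eps = 2 s.
    eps = 2.0 * s
    t0 = (T - m) / s
    tm = (T - eps - m) / s
    tp = (T + eps - m) / s
    return ((m - T) * (2 * norm.cdf(t0) - norm.cdf(tm) - norm.cdf(tp))
            - s * (2 * norm.pdf(t0) - norm.pdf(tm) - norm.pdf(tp))
            + eps * (norm.cdf(tp) - norm.cdf(tm)))

# Near the threshold with high uncertainty scores highest.
print(expected_feasibility(m=np.array([0.0, 1.0, 1.0]),
                           s=np.array([0.5, 0.5, 0.05]), T=1.0))
```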