  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 151 of 151 Journals sorted by number of followers
Review of Economics and Statistics     Hybrid Journal   (Followers: 190)
Statistics in Medicine     Hybrid Journal   (Followers: 140)
Journal of Econometrics     Hybrid Journal   (Followers: 83)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 76, SJR: 3.746, CiteScore: 2)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 52)
Biometrics     Hybrid Journal   (Followers: 48)
Sociological Methods & Research     Hybrid Journal   (Followers: 47)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 42)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 41, SJR: 3.664, CiteScore: 2)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 36)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 35)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 34)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 29)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 28)
The American Statistician     Full-text available via subscription   (Followers: 25)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 23)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 21)
Journal of Forecasting     Hybrid Journal   (Followers: 21)
Journal of Applied Statistics     Hybrid Journal   (Followers: 20)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 19)
Statistical Modelling     Hybrid Journal   (Followers: 18)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 18)
Journal of Statistical Software     Open Access   (Followers: 18, SJR: 13.802, CiteScore: 16)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 17)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Computational Statistics     Hybrid Journal   (Followers: 16)
Risk Management     Hybrid Journal   (Followers: 16)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 15)
Statistics and Computing     Hybrid Journal   (Followers: 14)
Demographic Research     Open Access   (Followers: 14)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Journal of Statistical Physics     Hybrid Journal   (Followers: 12)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 11)
International Statistical Review     Hybrid Journal   (Followers: 10)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 10)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 10)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Advances in Complex Systems     Hybrid Journal   (Followers: 10)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 9)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Stata Journal     Full-text available via subscription   (Followers: 9)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 8)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 8)
Teaching Statistics     Hybrid Journal   (Followers: 8)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 8)
Current Research in Biostatistics     Open Access   (Followers: 8)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 7)
Queueing Systems     Hybrid Journal   (Followers: 7)
Argumentation et analyse du discours     Open Access   (Followers: 7)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 7)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 7)
Biometrical Journal     Hybrid Journal   (Followers: 6)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Lifetime Data Analysis     Hybrid Journal   (Followers: 6)
Significance     Hybrid Journal   (Followers: 6)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
Journal of Mathematics and Statistics     Open Access   (Followers: 6)
Applied Categorical Structures     Hybrid Journal   (Followers: 5)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Optimization Methods and Software     Hybrid Journal   (Followers: 5)
Statistical Methods and Applications     Hybrid Journal   (Followers: 5)
CHANCE     Hybrid Journal   (Followers: 5)
ESAIM: Probability and Statistics     Open Access   (Followers: 4)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Metrika     Hybrid Journal   (Followers: 4)
Statistical Papers     Hybrid Journal   (Followers: 4)
TEST     Hybrid Journal   (Followers: 3)
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 3)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 3)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 3)
Sankhya A     Hybrid Journal   (Followers: 3)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 3)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 2)
Extremes     Hybrid Journal   (Followers: 2)
Optimization Letters     Hybrid Journal   (Followers: 2)
Stochastic Models     Hybrid Journal   (Followers: 2)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
Building Simulation     Hybrid Journal   (Followers: 2)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
International Journal of Stochastic Analysis     Open Access   (Followers: 2)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Sequential Analysis: Design Methods and Applications     Hybrid Journal   (Followers: 1)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)
Statistics and Economics     Open Access  
Review of Socionetwork Strategies     Hybrid Journal  
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
Journal of the Korean Statistical Society     Hybrid Journal  

Statistics and Computing
Journal Prestige (SJR): 2.545
Citation Impact (citeScore): 2
Number of Followers: 14  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 0960-3174 - ISSN (Online) 1573-1375
Published by Springer-Verlag  [2468 journals]
  • A generalized expectation model selection algorithm for latent variable selection in multidimensional item response theory models

      Abstract: In this paper, we propose a generalized expectation model selection (GEMS) algorithm for latent variable selection in multidimensional item response theory models, which are commonly used for identifying the relationships between the latent traits and test items. Under some mild assumptions, we prove the numerical convergence of GEMS for model selection by minimizing the generalized information criteria of observed data in the presence of missing data. For latent variable selection in the multidimensional two-parameter logistic (M2PL) models, we present an efficient implementation of GEMS to minimize the Bayesian information criterion. To ensure parameter identifiability, the variances of all latent traits are assumed to be unity and each latent trait is required to have an item exclusively associated with it. The convergence of GEMS for the M2PL models is verified. Simulation studies show that GEMS is computationally more efficient than the expectation model selection (EMS) algorithm and the expectation maximization based \(L_{1}\)-penalized method (EML1), and it yields a better correct rate of latent variable selection and mean squared error of parameter estimates than EMS and EML1. The GEMS algorithm is illustrated by analyzing a real dataset related to the Eysenck Personality Questionnaire.
      PubDate: 2023-11-25
       
  • Randomized time Riemannian Manifold Hamiltonian Monte Carlo

      Abstract: Hamiltonian Monte Carlo (HMC) algorithms, which combine numerical approximation of Hamiltonian dynamics on finite intervals with stochastic refreshment and Metropolis correction, are popular sampling schemes, but it is known that they may suffer from slow convergence in the continuous time limit. A recent paper of Bou-Rabee and Sanz-Serna (Ann Appl Prob, 27:2159-2194, 2017) demonstrated that this issue can be addressed by simply randomizing the duration parameter of the Hamiltonian paths. In this article, we use the same idea to enhance the sampling efficiency of a constrained version of HMC, with potential benefits in a variety of application settings. We demonstrate both the conservation of the stationary distribution and the ergodicity of the method. We also compare the performance of various schemes in numerical studies of model problems, including an application to high-dimensional covariance estimation.
      PubDate: 2023-11-24
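
The duration-randomization idea summarized in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' constrained Riemannian-manifold method: it runs plain one-dimensional HMC with a leapfrog integrator and draws each trajectory's duration from an exponential distribution. The function names, the step size, and the mean duration are all assumptions chosen for illustration.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamiltonian dynamics for n_steps leapfrog steps."""
    p -= 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q += step * p                    # full step for position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step for momentum
    p -= 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def randomized_hmc(u, grad_u, q0, n_samples, step=0.2, mean_duration=2.0, seed=0):
    """1-D HMC where each trajectory length is drawn at random.

    u is the potential (negative log density up to a constant) and
    grad_u its derivative. Durations are exponential with the given
    mean (an illustrative choice), rounded up to whole leapfrog steps.
    """
    rng = random.Random(seed)
    q = float(q0)
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                          # momentum refreshment
        duration = rng.expovariate(1.0 / mean_duration)  # randomized duration
        n_steps = max(1, math.ceil(duration / step))
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p * p
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the discretization error
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


# Usage: sample from a standard normal target, u(q) = q^2 / 2.
draws = randomized_hmc(lambda q: 0.5 * q * q, lambda q: q, 0.0, 5000)
```

Because the number of steps is drawn independently of the current state, each draw still uses a reversible Metropolis-corrected kernel, so the target stays invariant; the random length is what avoids the periodic trajectories a fixed duration can produce on this target.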
       
  • Randomized self-updating process for clustering large-scale data

      Abstract: This paper introduces the randomized self-updating process (rSUP) algorithm for clustering large-scale data. rSUP is an extension of the self-updating process (SUP) algorithm, which has shown effectiveness in clustering data with characteristics such as noise, varying cluster shapes and sizes, and numerous clusters. However, SUP’s reliance on pairwise dissimilarities between data points makes it computationally inefficient for large-scale data. To address this challenge, rSUP performs location updates within randomly generated data subsets at each iteration. The Law of Large Numbers guarantees that the clustering results of rSUP converge to those of the original SUP as the partition size grows. This paper demonstrates the effectiveness and computational efficiency of rSUP in large-scale data clustering through simulations and real datasets.
      PubDate: 2023-11-24
       
  • Clusterwise multivariate regression of mixed-type panel data

      Abstract: Multivariate panel data of mixed type are routinely collected in many different areas of application, often jointly with additional covariates which complicate the statistical analysis. Moreover, it is often of interest to identify unknown groups of subjects in a study population using such a data structure, i.e., to perform clustering. In the Bayesian framework, we propose a finite mixture of multivariate generalised linear mixed effects regression models to cluster numeric, binary, ordinal and categorical panel outcomes jointly. The specification of suitable priors on the model parameters allows for convenient posterior inference based on Markov chain Monte Carlo (MCMC) sampling with data augmentation. This approach makes it possible to classify subjects in the data and new subjects as well as to characterise the cluster-specific models. Model estimation and selection of the number of data clusters are simultaneously performed when approximating the posterior for a single model using MCMC sampling without resorting to multiple model estimations. The performance of the proposed methodology is evaluated in a simulation study. Its application is illustrated on two data sets, one from a longitudinal patient study to infer prognosis groups, and a second one from the Czech part of the EU-SILC survey where households are annually interviewed to obtain insights into changes in their financial capability.
      PubDate: 2023-11-22
       
  • Bootstrapping multiple systems estimates to account for model selection

      Abstract: Multiple systems estimation using a Poisson loglinear model is a standard approach to quantifying hidden populations where data sources are based on lists of known cases. Information criteria are often used for selecting between the large number of possible models. Confidence intervals are often reported conditional on the model selected, providing an over-optimistic impression of estimation accuracy. A bootstrap approach is a natural way to account for the model selection. However, because the model selection step has to be carried out for every bootstrap replication, there may be a high or even prohibitive computational burden. We explore the merit of modifying the model selection procedure in the bootstrap to look only among a subset of models, chosen on the basis of their information criterion score on the original data. This provides large computational gains with little apparent effect on inference. We also incorporate rigorous and economical ways of approaching issues of the existence of estimators when applying the method to sparse data tables.
      PubDate: 2023-11-21
       
  • Stochastic variational inference for GARCH models

      Abstract: Stochastic variational inference algorithms are derived for fitting various heteroskedastic time series models. We examine Gaussian, t, and skewed t response GARCH models and fit these using Gaussian variational approximating densities. We implement efficient stochastic gradient ascent procedures based on the use of control variates or the reparameterization trick and demonstrate that the proposed implementations provide a fast and accurate alternative to Markov chain Monte Carlo sampling. Additionally, we present sequential updating versions of our variational algorithms, which are suitable for efficient portfolio construction and dynamic asset allocation.
      PubDate: 2023-11-21
       
  • Renewable composite quantile method and algorithm for nonparametric models with streaming data

      Abstract: We are interested in renewable estimations and algorithms for nonparametric models with streaming data. In our method, the nonparametric function of interest is expressed through a functional depending on a weight function and a conditional distribution function (CDF). The CDF is estimated by renewable kernel estimations together with function interpolations, based on which we propose the method of renewable weighted composite quantile regression (WCQR). Then, by fully utilizing the model structure, we obtain new selectors for the weight function, such that the WCQR can achieve asymptotic unbiasedness when estimating specific functions in the model. We also propose practical bandwidth selectors for streaming data and find the optimal weight function by minimizing the asymptotic variance. The asymptotic results show that our estimator is almost equivalent to the oracle estimator obtained from the entire data together. Besides, our method also enjoys adaptiveness to error distributions, robustness to outliers, and efficiency in both estimation and computation. Simulation studies and real data analyses further confirm our theoretical findings.
      PubDate: 2023-11-17
       
  • Biclustering multivariate discrete longitudinal data

      Abstract: A model-based biclustering method for multivariate discrete longitudinal data is proposed. We consider a finite mixture of generalized linear models to cluster units and, within each mixture component, we adopt a flexible and parsimonious parameterization of the component-specific canonical parameter to define subsets of variables (segments) sharing common dynamics over time. We develop an Expectation-Maximization-type algorithm for maximum likelihood estimation of model parameters. The performance of the proposed model is evaluated on a large-scale simulation study, where we consider different choices for the sample size, the number of measurement occasions, the number of components and segments. The proposal is applied to Italian crime data (source: ISTAT) with the aim to detect areas sharing common longitudinal trajectories for specific subsets of crime types. The identification of such biclusters may potentially be helpful for policymakers to make decisions on safety.
      PubDate: 2023-11-17
       
  • Off-policy evaluation for tabular reinforcement learning with synthetic trajectories

      Abstract: This paper addresses the problem of offline evaluation in tabular reinforcement learning (RL). We propose a novel method that leverages synthetic trajectories constructed from the available data using a “sampling with replacement” basis, combining the advantages of model-based and Monte Carlo policy evaluation. The method is accompanied by theoretically derived finite sample upper error bounds, offering performance guarantees and allowing for a trade-off between statistical efficiency and computational cost. The results from computational experiments demonstrate that our method consistently achieves lower upper error bounds and relative mean square errors compared to Importance Sampling, Doubly Robust methods, and other existing approaches. Furthermore, this method achieves these superior results in significantly shorter running times compared to traditional model-based approaches. These findings highlight the effectiveness and efficiency of this synthetic trajectory method for accurate offline policy evaluation in RL.
      PubDate: 2023-11-17
       
  • Heterogeneous analysis for clustered data using grouped finite mixture models

      Abstract: It is common to observe significant heterogeneity in clustered data across scientific fields. Cluster-wise conditional distributions are widely used to explore variations and relationships within and among clusters. This paper aims to capture such heterogeneity by employing cluster-wise finite mixture models. To address the heterogeneity among clusters, we introduce latent group structure and incorporate heterogeneous mixing proportions across different groups, accommodating the diverse characteristics observed in the data. The specific number of groups and their membership are unknown. To identify the latent group structure, we employ concave penalty functions to the pairwise differences of the preliminary consistent estimators for the mixing proportions. This approach enables the automatic division of clusters into finite subgroups. Theoretical results demonstrate that as the number of clusters and cluster sizes tend to infinity, the true latent group structure can be recovered with probability close to one, and the post-classification estimators exhibit oracle efficiency. We support our proposed approach’s performance and applicability through extensive simulations and analysis of basic consumption expenditure among urban households in China.
      PubDate: 2023-11-15
       
  • Doubly robust estimation and robust empirical likelihood in generalized linear models with missing responses

      Abstract: In this paper, we study doubly robust estimation and robust empirical likelihood of the regression parameter for generalized linear models with missing responses. A doubly robust estimating equation is proposed to estimate the regression parameter, and the resulting estimator has consistency and asymptotic normality, regardless of whether the assumed model contains the true model. A robust empirical log-likelihood ratio statistic for the regression parameter is constructed, showing that the statistic weakly converges to the standard \(\chi^2\) distribution. The result can be directly used to construct the confidence region of the regression parameter. A method for selecting the tuning parameters of the \(\psi\)-function is also given. Simulation studies show the robustness of the estimator of the regression parameter and evaluate the performance of the robust empirical likelihood method. A real data example shows that the proposed method is feasible.
      PubDate: 2023-11-14
       
  • Pareto-efficient designs for multi- and mixed-level supersaturated designs

      Abstract: Supersaturated designs are used in science and engineering to efficiently explore a large number of factors with a limited number of runs. It is not uncommon in engineering to consider a few, if not all, factors at more than two levels. Multi- and mixed-level supersaturated designs may, therefore, be handy. While the two-level supersaturated designs are widely studied, the literature on multi- and mixed-level designs is still scarce. A recent paper establishes that the group LASSO should be preferred as an analysis method because it can retain the natural group structure of multi- and mixed-level designs. A few optimality criteria for such designs also exist in the literature. These criteria typically aim to find designs that maximize average pairwise orthogonality. However, the literature lacks guidance on the better or ‘right’ optimality criteria from a screening perspective. In addition, the existing optimal designs are often balanced and are rarely available. We propose two new optimality criteria based on the large-sample properties of group LASSO. Our criteria fill the gap in the literature by providing design selection criteria that are directly related to the preferred analysis method. We then construct Pareto-efficient designs on the two new criteria and demonstrate that (a) our optimality criteria can be used to order existing optimal designs on their screening performance, (b) the Pareto-efficient designs are often better than or as good as the existing optimal designs, and (c) the Pareto-efficient designs can be constructed using a coordinate exchange algorithm and are, therefore, available for any choice of the number of runs, factors, and levels. A repository of three- and four-level designs with the number of runs between 8 and 16 is also provided.
      PubDate: 2023-11-11
       
  • Lévy Langevin Monte Carlo

      Abstract: Analogously to the well-known Langevin Monte Carlo method, in this article we provide a method to sample from a target distribution \(\boldsymbol{\pi}\) by simulating a solution of a stochastic differential equation. Hereby, the stochastic differential equation is driven by a general Lévy process which, unlike the case of Langevin Monte Carlo, allows for non-smooth targets. Our method will be fully explored in the particular setting of target distributions supported on the half-line \((0,\infty)\) and a compound Poisson driving noise. Several illustrative examples conclude the article.
      PubDate: 2023-11-10
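
For orientation, the classical Langevin Monte Carlo baseline that this abstract contrasts itself with can be sketched as an Euler discretization of a Brownian-driven stochastic differential equation. The sketch below is that standard unadjusted scheme only; it is not the Lévy-driven method of the article, and the function name, step size, and burn-in are illustrative assumptions.

```python
import math
import random


def unadjusted_langevin(grad_log_pi, x0, n_steps, step=0.05, seed=0):
    """Euler scheme for dX = grad log pi(X) dt + sqrt(2) dB in one dimension.

    grad_log_pi must be the derivative of the log target density, which
    is why this classical scheme needs a smooth target.
    """
    rng = random.Random(seed)
    x = float(x0)
    path = []
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        x = x + step * grad_log_pi(x) + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


# Usage: standard normal target, for which grad log pi(x) = -x.
path = unadjusted_langevin(lambda x: -x, 0.0, 20000)
kept = path[1000:]  # discard an initial burn-in segment
```

Without a Metropolis correction the chain carries a small discretization bias: for this Gaussian target its stationary variance is 1 / (1 - step / 2) rather than exactly 1.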
       
  • A data-adaptive dimension reduction for functional data via penalized low-rank approximation

      Abstract: We introduce a data-adaptive nonparametric dimension reduction tool to obtain a low-dimensional approximation of functional data contaminated by erratic measurement errors following symmetric or asymmetric distributions. We propose to apply robust submatrix completion techniques to matrices consisting of coefficients of basis functions calculated by projecting the observed trajectories onto a given orthogonal basis set. In this process, we use a composite asymmetric Huber loss function to accommodate domain-specific erratic behaviors in a data-adaptive manner. We further incorporate the \(L_1\) penalty to regularize the smoothness of latent factor curves. The proposed method can also be applied to partially observed functional data, where each trajectory contains individual-specific missing segments. Moreover, since our method does not require estimating the covariance operator, the extension to any dimensional functional data observed over a continuum is straightforward. We demonstrate the empirical performance in estimating lower-dimensional space and reconstruction of trajectories of the proposed method through simulation studies. We then apply the proposed method to two real datasets, one-dimensional Advanced Metering Infrastructure (AMI) data in South Korea and two-dimensional max precipitation spatial data collected in North America and South America.
      PubDate: 2023-11-08
       
  • Online estimation and community detection of network point processes for event streams

      Abstract: A common goal in network modeling is to uncover the latent community structure present among nodes. For many real-world networks, the true connections consist of events arriving as streams, which are then aggregated to form edges, ignoring the dynamic temporal component. A natural way to take account of these temporal dynamics of interactions is to use point processes as the foundation of network models for community detection. Computational complexity hampers the scalability of such approaches to large sparse networks. To circumvent this challenge, we propose a fast online variational inference algorithm for estimating the latent structure underlying dynamic event arrivals on a network, using continuous-time point process latent network models. We describe this procedure for network models capturing community structure. This structure can be learned as new events are observed on the network, updating the inferred community assignments. We investigate the theoretical properties of such an inference scheme, and provide regret bounds on the loss function of this procedure. The proposed inference procedure is then thoroughly compared, using both simulation studies and real data, to non-online variants. We demonstrate that online inference can obtain comparable performance, in terms of community recovery, to non-online variants, while realising computational gains. Our proposed inference framework can also be readily modified to incorporate other popular network structures.
      PubDate: 2023-11-08
       
  • Topology-driven goodness-of-fit tests in arbitrary dimensions

      Abstract: This paper adopts a tool from computational topology, the Euler characteristic curve (ECC) of a sample, to perform one- and two-sample goodness-of-fit tests. We call our procedure TopoTests. The presented tests work for samples of arbitrary dimension, having comparable power to the state-of-the-art tests in the one-dimensional case. It is demonstrated that the type I error of TopoTests can be controlled and their type II error vanishes exponentially with increasing sample size. Extensive numerical simulations of TopoTests are conducted to demonstrate their power for samples of various sizes.
      PubDate: 2023-11-08
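
To make the Euler characteristic curve concrete, here is a minimal one-dimensional sketch. It assumes the curve is built from the union of closed intervals of radius r around the sample points, in which case the Euler characteristic is simply the number of connected components. The article works in arbitrary dimensions and its exact construction and test statistic may differ; the function names and the sup-distance comparison below are illustrative assumptions.

```python
def euler_characteristic_curve_1d(sample, radii):
    """Euler characteristic of the union of closed radius-r intervals.

    On the real line this equals the number of connected components:
    two neighbouring points are joined once the gap between them is
    at most 2 * r.
    """
    pts = sorted(sample)
    if not pts:
        return [0 for _ in radii]
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    return [1 + sum(g > 2 * r for g in gaps) for r in radii]


def sup_distance(curve_a, curve_b):
    """Largest absolute difference between two curves on a shared grid."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Usage: three points with gaps 1 and 2, evaluated at three radii.
radii = [0.0, 0.5, 1.0]
ecc = euler_characteristic_curve_1d([0.0, 1.0, 3.0], radii)  # → [3, 2, 1]
```

A two-sample comparison could then evaluate both curves on the same grid of radii and compare them with `sup_distance`, calibrating the threshold by simulation or permutation.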
       
  • Model-based clustering of multiple networks with a hierarchical algorithm

      Abstract: The paper tackles the problem of clustering multiple networks, directed or not, that do not share the same set of vertices, into groups of networks with similar topology. A statistical model-based approach based on a finite mixture of stochastic block models is proposed. A clustering is obtained by maximizing the integrated classification likelihood criterion. This is done by a hierarchical agglomerative algorithm that starts from singleton clusters and successively merges clusters of networks. As such, a sequence of nested clusterings is computed that can be represented by a dendrogram providing valuable insights on the collection of networks. Using a Bayesian framework, model selection is performed in an automated way since the algorithm stops when the best number of clusters is attained. The algorithm is computationally efficient when carefully implemented. The aggregation of clusters requires a means to overcome the label-switching problem of the stochastic block model and to match the block labels of the networks. To address this problem, a new tool is proposed based on a comparison of the graphons of the associated stochastic block models. The clustering approach is assessed on synthetic data. An application to a set of ecological networks illustrates the interpretability of the obtained results.
      PubDate: 2023-11-07
       
  • SURE-tuned bridge regression

      Abstract: Consider the \(\ell_{\alpha}\)-regularized linear regression, also termed Bridge regression. For \(\alpha \in (0,1)\), Bridge regression enjoys several statistical properties of interest such as sparsity and near-unbiasedness of the estimates (Fan and Li in J Am Stat Assoc 96(456): 1348–1360, 2001). However, the main difficulty lies in the non-convex nature of the penalty for these values of \(\alpha\), which makes an optimization procedure challenging, and usually it is only possible to find a local optimum. To address this issue, Polson et al. (J R Stat Soc B 76(4):713–733, 2013) took a sampling based fully Bayesian approach to this problem, using the correspondence between the Bridge penalty and a power exponential prior on the regression coefficients. However, their sampling procedure relies on Markov chain Monte Carlo (MCMC) techniques, which are inherently sequential and not scalable to large problem dimensions. Cross validation approaches are similarly computation-intensive. To this end, our contribution is a novel non-iterative method to fit a Bridge regression model. The main contribution lies in an explicit formula for Stein’s unbiased risk estimate for the out of sample prediction risk of Bridge regression, which can then be optimized to select the desired tuning parameters, allowing us to completely bypass MCMC as well as computation-intensive cross validation approaches. Our procedure yields results in a fraction of computational times compared to iterative schemes, without any appreciable loss in statistical performance. An R implementation is publicly available online at: https://github.com/loriaJ/Sure-tuned_BridgeRegression.
      PubDate: 2023-11-07
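
As background for the two objects named in this abstract, the textbook forms can be written out as follows. This is a generic sketch, not the article's explicit formula: the response \(y\), design matrix \(X\), penalty level \(\lambda\), noise variance \(\sigma^2\), and the factor \(\tfrac{1}{2}\) are notation assumed here for illustration. The Bridge estimator with exponent \(\alpha\) solves

\[
\hat{\beta} = \arg\min_{\beta} \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 + \lambda \sum_{j} \lvert \beta_j \rvert^{\alpha}, \qquad \alpha \in (0,1).
\]

For \(y \sim N_n(\mu, \sigma^2 I_n)\) and a weakly differentiable estimator \(\hat{\mu}(y)\), the classical Stein unbiased risk estimate is

\[
\mathrm{SURE} = \lVert y - \hat{\mu}(y) \rVert_2^2 - n\sigma^2 + 2\sigma^2 \sum_{i=1}^{n} \frac{\partial \hat{\mu}_i(y)}{\partial y_i},
\qquad
\mathbb{E}[\mathrm{SURE}] = \mathbb{E}\,\lVert \hat{\mu}(y) - \mu \rVert_2^2 .
\]

This classical form estimates in-sample estimation risk; the abstract targets out-of-sample prediction risk, for which the article derives its own expression, not reproduced here.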
       
  • Point process simulation of generalised hyperbolic Lévy processes

      Abstract: Generalised hyperbolic (GH) processes are a class of stochastic processes that are used to model the dynamics of a wide range of complex systems that exhibit heavy-tailed behavior, including systems in finance, economics, biology, and physics. In this paper, we present novel simulation methods based on subordination with a generalised inverse Gaussian (GIG) process and using a generalised shot-noise representation that involves random thinning of infinite series of decreasing jump sizes. Compared with our previous work on GIG processes, we provide tighter bounds for the construction of rejection sampling ratios, leading to improved acceptance probabilities in simulation. Furthermore, we derive methods for the adaptive determination of the number of points required in the associated random series using concentration inequalities. Residual small jumps are then approximated using an appropriately scaled Brownian motion term with drift. Finally, the rejection sampling steps are made significantly more computationally efficient through the use of squeezing functions based on lower and upper bounds on the Lévy density. Experimental results are presented illustrating the strong performance under various parameter settings and comparing the marginal distribution of the GH paths with exact simulations of GH random variates. The new simulation methodology is made available to researchers through the publication of a Python code repository.
      PubDate: 2023-11-07
       
  • Privacy-preserving and lossless distributed estimation of high-dimensional generalized additive mixed models

      Abstract: Various privacy-preserving frameworks that respect the individual’s privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMM) using component-wise gradient boosting (CWB). Making use of CWB allows us to reframe the GAMM estimation as a distributed fitting of base learners using the \(L_2\)-loss. In order to account for the heterogeneity of different data location sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaption of CWB preserves all the important properties of the original algorithm, such as an unbiased feature selection and the feasibility to fit models in high-dimensional feature spaces, and yields equivalent model estimates as CWB on pooled data. Next to a derivation of the equivalence of both algorithms, we also showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
      PubDate: 2023-11-07
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 



JournalTOCs © 2009-