Statistics and Computing
Journal Prestige (SJR): 2.545
Citation Impact (CiteScore): 2
Number of Followers: 14
Hybrid journal (may contain Open Access articles)
ISSN (Print) 0960-3174 - ISSN (Online) 1573-1375
Published by Springer-Verlag
• Prediction scoring of data-driven discoveries for reproducible research

Abstract: Predictive modeling uncovers knowledge and insights regarding a hypothesized data generating mechanism (DGM). Results from different studies on a complex DGM, derived from different data sets, and using complicated models and algorithms, are hard to quantitatively compare due to random noise and statistical uncertainty in model results. This has been one of the main contributors to the replication crisis in the behavioral sciences. The contribution of this paper is to apply prediction scoring to the problem of comparing two studies, such as can arise when evaluating replications or competing evidence. We examine the role of predictive models in quantitatively assessing agreement between two datasets that are assumed to come from two distinct DGMs. We formalize a distance between the DGMs that is estimated using cross validation. We argue that the resulting prediction scores depend on the predictive models created by cross validation. In this sense, the prediction scores measure the distance between DGMs, along the dimension of the particular predictive model. Using human behavior data from experimental economics, we demonstrate that prediction scores can be used to evaluate preregistered hypotheses and provide insights comparing data from different populations and settings. We examine the asymptotic behavior of the prediction scores using simulated experimental data and demonstrate that leveraging competing predictive models can reveal important differences between underlying DGMs. Our proposed cross-validated prediction scores are capable of quantifying differences between unobserved data generating mechanisms and allow for the validation and assessment of results from complex models.
PubDate: 2022-12-02

• Model-free global likelihood subsampling for massive data

Abstract: Most existing subsampling methods depend heavily on a specified model. If the assumed model is not correct, the performance of the subsample may be poor. This paper focuses on a model-free subsampling method, called global likelihood subsampling, such that the subsample is robust to different model choices. It leverages the idea of the global likelihood sampler, which is an effective and robust method for sampling from a given continuous distribution. Furthermore, we accelerate the algorithm for large-scale datasets and extend it to deal with high-dimensional data with relatively low computational complexity. Simulations and real data studies are conducted to apply the proposed method to regression and classification problems. The results illustrate that this method is robust against different modeling methods and has promising performance compared with some existing model-free subsampling methods for data compression.
PubDate: 2022-12-01

• Bayesian A-optimal two-phase designs with a single blocking factor in each phase

Abstract: Two-phase experiments are widely used in many areas of science (e.g., agriculture, industrial engineering, and food processing). For example, consider a two-phase experiment in plant breeding. Often, the first phase of this experiment is run in a field involving several blocks. The samples obtained from the first phase are then analyzed in several machines (or on several days, etc.) in a laboratory in the second phase. There might be field-block-to-field-block and machine-to-machine (or day-to-day, etc.) variation. Thus, it is practical to consider these sources of variation as blocking factors. Clearly, there are two possible strategies to analyze this kind of two-phase experiment, i.e., blocks are treated as fixed or random. While there are a few studies regarding fixed block effects, there are still only a limited number of studies with random block effects or in which information about the block effects is uncertain. Hence, it is beneficial to consider a Bayesian approach to design for such an experiment, which is the main goal of this work. In this paper, we construct a design for a two-phase experiment that has a single treatment factor, a single blocking factor in each phase, and a response that can only be observed in the second phase.
PubDate: 2022-12-01

• Sticky PDMP samplers for sparse and local inference problems

Abstract: We construct a new class of efficient Monte Carlo methods based on continuous-time piecewise deterministic Markov processes (PDMPs) suitable for inference in high dimensional sparse models, i.e. models for which there is prior knowledge that many coordinates are likely to be exactly 0. This is achieved with the fairly simple idea of endowing existing PDMP samplers with “sticky” coordinate axes, coordinate planes etc. Upon hitting those subspaces, an event is triggered during which the process sticks to the subspace, this way spending some time in a sub-model. This results in non-reversible jumps between different (sub-)models. While we show that PDMP samplers in general can be made sticky, we mainly focus on the Zig-Zag sampler. Compared to the Gibbs sampler for variable selection, we heuristically derive favourable dependence of the Sticky Zig-Zag sampler on dimension and data size. The computational efficiency of the Sticky Zig-Zag sampler is further established through numerical experiments where both the sample size and the dimension of the parameter space are large.
PubDate: 2022-11-28

• Constructing two-level $$Q_B$$-optimal screening designs using mixed-integer programming and heuristic algorithms

Abstract: Two-level screening designs are widely applied in the manufacturing industry to identify influential factors of a system. These designs have each factor at two levels and are traditionally constructed using standard algorithms, which rely on a pre-specified linear model. Since the assumed model may depart from the truth, two-level $$Q_B$$-optimal designs have been developed to provide efficient parameter estimates for several potential models. These designs also have an overarching goal that models that are more likely to be the best for explaining the data are estimated more efficiently than the rest. However, there is no effective algorithm for constructing them. This article proposes two methods: a mixed-integer programming algorithm that guarantees convergence to the two-level $$Q_B$$-optimal designs, and a heuristic algorithm that employs a novel formula to find good designs in short computing times. Using numerical experiments, we show that our mixed-integer programming algorithm is attractive for finding small optimal designs, and our heuristic algorithm is the most computationally effective approach for constructing both small and large designs, when compared to benchmark heuristic algorithms.
PubDate: 2022-11-25

• Variance reduction for Metropolis–Hastings samplers

Abstract: We introduce a general framework that constructs estimators with reduced variance for random walk Metropolis and Metropolis-adjusted Langevin algorithms. The resulting estimators require negligible computational cost and are derived in a post-process manner utilising all proposal values of the Metropolis algorithms. Variance reduction is achieved by producing control variates through the approximate solution of the Poisson equation associated with the target density of the Markov chain. The proposed method is based on approximating the target density with a Gaussian and then utilising accurate solutions of the Poisson equation for the Gaussian case. This leads to an estimator that uses two key elements: (1) a control variate from the Poisson equation that contains an intractable expectation under the proposal distribution, (2) a second control variate to reduce the variance of a Monte Carlo estimate of this latter intractable expectation. Simulated data examples are used to illustrate the impressive variance reduction achieved in the Gaussian target case and the corresponding effect when the target Gaussianity assumption is violated. Real data examples on Bayesian logistic regression and stochastic volatility models verify that considerable variance reduction is achieved with negligible extra computational cost.
PubDate: 2022-11-25
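
As a reminder of the basic mechanism the abstract above builds on, here is a minimal sketch of a textbook control variate for plain Monte Carlo. It is not the paper's Poisson-equation construction for Metropolis chains: the integrand exp(U), the control U with known mean 0.5, and the function name `control_variate_estimate` are all illustrative assumptions.

```python
import math
import random


def control_variate_estimate(n, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    The control variate is U itself, whose mean 0.5 is known exactly.
    Returns (plain_mean, adjusted_mean, plain_variance, adjusted_variance).
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    f = [math.exp(x) for x in u]

    mean_f = sum(f) / n
    mean_u = sum(u) / n

    # Estimated optimal coefficient: beta = Cov(f, U) / Var(U).
    cov_fu = sum((a - mean_f) * (b - mean_u) for a, b in zip(f, u)) / (n - 1)
    var_u = sum((b - mean_u) ** 2 for b in u) / (n - 1)
    beta = cov_fu / var_u

    # Subtract the centred control; the expectation is (asymptotically) unchanged.
    adjusted = [a - beta * (b - 0.5) for a, b in zip(f, u)]
    mean_adj = sum(adjusted) / n

    var_f = sum((a - mean_f) ** 2 for a in f) / (n - 1)
    var_adj = sum((a - mean_adj) ** 2 for a in adjusted) / (n - 1)
    return mean_f, mean_adj, var_f, var_adj


if __name__ == "__main__":
    plain, adjusted, var_plain, var_adjusted = control_variate_estimate(10_000, seed=1)
    print("plain estimate:   ", plain)
    print("adjusted estimate:", adjusted)
    print("variance ratio:   ", var_adjusted / var_plain)
```

Because exp(U) and U are strongly correlated, the per-sample variance of the adjusted values is a small fraction of the original; the true value being estimated is e - 1.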

• Efficient simulation of p-tempered $$\alpha$$-stable OU processes

Abstract: We develop efficient methods for simulating processes of Ornstein–Uhlenbeck type related to the class of p-tempered $$\alpha$$-stable ($$\textrm{TS}^p_\alpha$$) distributions. Our results hold for both the univariate and multivariate cases and we consider both the case where the $$\textrm{TS}^p_\alpha$$ distribution is the stationary law and where it is the distribution of the background driving Lévy process. In the latter case, we also derive an explicit representation for the transition law, as this was previously known only in certain special cases and only for $$p=1$$ and $$\alpha \in [0,1)$$. Simulation results suggest that our methods work well in practice.
PubDate: 2022-11-24

• LASSO for streaming data with adaptative filtering

Abstract: Streaming data is ubiquitous in modern machine learning, and so the development of scalable algorithms to analyze this sort of information is a topic of current interest. On the other hand, $$l_1$$-penalized least-squares regression, commonly referred to as LASSO, is a popular data mining technique that is commonly used for feature selection. In this work, we develop a homotopy-based solver for LASSO in a streaming data context that massively speeds up its convergence by extracting the most information out of the solution obtained prior to receiving the latest batch of data. Since these batches may show non-stationary behavior, our solver also includes an adaptive filter that improves the predictability of our method in this scenario. Besides establishing different theoretical properties, we empirically compare our solver to the state of the art: LARS, coordinate descent, and Garrigues and Ghaoui’s data streaming homotopy. The results show that our approach massively reduces the computational time required for convergence relative to the previous approaches, reducing running time by up to 3, 4 and 5 orders of magnitude with respect to LARS, coordinate descent and Garrigues and Ghaoui’s homotopy, respectively.
PubDate: 2022-11-24
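
For orientation, the sketch below shows cyclic coordinate descent, one of the baseline solvers named in the abstract above, applied to the objective (1/2)||y - Xb||^2 + lam * ||b||_1. It is not the paper's homotopy solver or its adaptive filter; the function names and the fixed number of sweeps are illustrative assumptions.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z towards zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses. Returns the coefficient list b.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Inner product of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_value = soft_threshold(rho, lam) / col_sq[j]
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_value
    return beta


if __name__ == "__main__":
    # With an identity design, the solution is soft-thresholding applied to y.
    print(lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0))
```

The second coefficient is set exactly to zero, which is the feature-selection behaviour the abstract refers to.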

• Bayesian learning via neural Schrödinger–Föllmer flows

Abstract: In this work we explore a new framework for approximate Bayesian inference in large datasets based on stochastic control. We advocate stochastic control as a finite time and low variance alternative to popular steady-state methods such as stochastic gradient Langevin dynamics. Furthermore, we discuss and adapt the existing theoretical guarantees of this framework and establish connections to already existing VI routines in SDE-based models.
PubDate: 2022-11-23

• Robust discrete choice models with t-distributed kernel errors

Abstract: Outliers in discrete choice response data may result from misclassification and misreporting of the response variable and from choice behaviour that is inconsistent with modelling assumptions (e.g. random utility maximisation). In the presence of outliers, standard discrete choice models produce biased estimates and suffer from compromised predictive accuracy. Robust statistical models are less sensitive to outliers than standard non-robust models. This paper analyses two robust alternatives to the multinomial probit (MNP) model. The two models are robit models whose kernel error distributions are heavy-tailed t-distributions to moderate the influence of outliers. The first model is the multinomial robit (MNR) model, in which a generic degrees of freedom parameter controls the heavy-tailedness of the kernel error distribution. The second model, the generalised multinomial robit (Gen-MNR) model, is more flexible than MNR, as it allows for distinct heavy-tailedness in each dimension of the kernel error distribution. For both models, we derive Gibbs samplers for posterior inference. In a simulation study, we illustrate the finite sample properties of the proposed Bayes estimators and show that MNR and Gen-MNR produce more accurate estimates if the choice data contain outliers through the lens of the non-robust MNP model. In a case study on transport mode choice behaviour, MNR and Gen-MNR outperform MNP by substantial margins in terms of in-sample fit and out-of-sample predictive accuracy. The case study also highlights differences in elasticity estimates across models.
PubDate: 2022-11-18

• Automatic search intervals for the smoothing parameter in penalized splines

Abstract: The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) and restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome this difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that, for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps (version $$\ge$$ 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines.
PubDate: 2022-11-18

• Changepoint detection in non-exchangeable data

Abstract: Changepoint models typically assume the data within each segment are independent and identically distributed conditional on some parameters that change across segments. This construction may be inadequate when data are subject to local correlation patterns, often resulting in many more changepoints fitted than preferable. This article proposes a Bayesian changepoint model that relaxes the assumption of exchangeability within segments. The proposed model supposes data within a segment are m-dependent for some unknown $$m \geqslant 0$$ that may vary between segments, resulting in a model suitable for detecting clear discontinuities in data that are subject to different local temporal correlations. The approach is suited to both continuous and discrete data. A novel reversible jump Markov chain Monte Carlo algorithm is proposed to sample from the model; in particular, a detailed analysis of the parameter space is exploited to build proposals for the orders of dependence. Two applications demonstrate the benefits of the proposed model: computer network monitoring via change detection in count data, and segmentation of financial time series.
PubDate: 2022-11-16

• Interpolating log-determinant and trace of the powers of matrix $$\textbf{A} + t\textbf{B}$$

Abstract: We develop heuristic interpolation methods for the functions $$t\mapsto \log \det \left( \textbf{A} + t\textbf{B} \right)$$ and $$t\mapsto \operatorname{trace}\left( (\textbf{A} + t\textbf{B})^{p} \right)$$ where the matrices $$\textbf{A}$$ and $$\textbf{B}$$ are Hermitian and positive (semi) definite and $$p$$ and $$t$$ are real variables. These functions are featured in many applications in statistics, machine learning, and computational physics. The presented interpolation functions are based on the modification of sharp bounds for these functions. We demonstrate the accuracy and performance of the proposed method with numerical examples, namely, the marginal maximum likelihood estimation for Gaussian process regression and the estimation of the regularization parameter of ridge regression with the generalized cross-validation method.
PubDate: 2022-11-10

• Systematic enumeration of definitive screening designs

Abstract: Conference designs are $$n \times k$$ matrices, $$k \le n$$, with orthogonal columns, one zero in each column, at most one zero in each row, and $$-1$$ and $$+1$$ entries elsewhere. Conference designs with $$k=n$$ are called conference matrices. Definitive screening designs (DSDs) are constructed by folding over a conference design and adding a row vector of zeros. We propose methodology for the systematic enumeration of conference designs with a specified number of rows and columns, and thereby for the systematic enumeration of the corresponding DSDs. We demonstrate its potential by enumerating all conference designs with up to 24 rows and columns, and thus all DSDs with up to 49 runs. A large fraction of these DSDs cannot be obtained from conference matrices and is therefore new to the literature. We identify DSDs that minimize the correlation among contrast vectors of second-order effects and provide them in supplementary files.
PubDate: 2022-11-10
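
The fold-over construction described in the abstract above can be sketched in a few lines. The example builds one 6 x 6 conference matrix by the classical Paley construction (an assumption made here purely to obtain an input; it is not the paper's enumeration method, and it is only valid for primes q with q mod 4 = 1), then stacks C, -C and a row of zeros to obtain a 13-run DSD. All function names are illustrative.

```python
def legendre_symbol(a, q):
    """Return 0 if a is divisible by q, 1 if a is a nonzero square mod q, else -1 (q prime)."""
    a %= q
    if a == 0:
        return 0
    return 1 if any((x * x) % q == a for x in range(1, q)) else -1


def paley_conference_matrix(q):
    """Symmetric conference matrix of order q + 1 for a prime q with q % 4 == 1."""
    n = q + 1
    C = [[0] * n for _ in range(n)]
    for j in range(1, n):
        C[0][j] = 1
        C[j][0] = 1
    for i in range(q):
        for j in range(q):
            C[i + 1][j + 1] = legendre_symbol(i - j, q)
    return C


def definitive_screening_design(C):
    """Fold over a conference design C and append a row of zeros."""
    k = len(C[0])
    folded = [[-x for x in row] for row in C]
    return [row[:] for row in C] + folded + [[0] * k]


def gram(D):
    """Return the k x k matrix of column inner products of D."""
    k = len(D[0])
    return [[sum(row[i] * row[j] for row in D) for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    C = paley_conference_matrix(5)
    D = definitive_screening_design(C)
    for row in D:
        print(" ".join(f"{x:+d}" if x else " 0" for x in row))
    print(gram(D))  # diagonal matrix: main-effect columns are mutually orthogonal
```

For a conference design with n rows the resulting DSD has 2n + 1 runs, which matches the abstract's count of 49 runs for designs with up to 24 rows.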

• Spline estimation of functional principal components via manifold

Abstract: Functional principal component analysis has become the most important dimension reduction technique in functional data analysis. Based on B-spline approximation, functional principal components (FPCs) can be efficiently estimated by the expectation-maximization (EM) and the geometric restricted maximum likelihood (REML) algorithms under the strong assumption of Gaussianity on the principal component scores and observational errors. When computing the solution, the EM algorithm does not exploit the underlying geometric manifold structure, while the performance of REML is known to be unstable. In this article, we propose a conjugate gradient algorithm over the product manifold to estimate FPCs. This algorithm exploits the manifold geometry structure of the overall parameter space, thus improving its search efficiency and estimation accuracy. In addition, a distribution-free interpretation of the loss function is provided from the viewpoint of matrix Bregman divergence, which explains why the proposed method works well under general distribution settings. We also show that a roughness penalization can be easily incorporated into our algorithm with a potentially better fit. The appealing numerical performance of the proposed method is demonstrated by simulation studies and the analysis of a Type Ia supernova light curve dataset.
PubDate: 2022-11-09

• Automatic Zig-Zag sampling in practice

Abstract: Novel Monte Carlo methods to generate samples from a target distribution, such as a posterior from a Bayesian analysis, have rapidly expanded in the past decade. Algorithms based on Piecewise Deterministic Markov Processes (PDMPs), non-reversible continuous-time processes, are developing into their own research branch, thanks to their important properties (e.g., super-efficiency). Nevertheless, practice has not caught up with the theory in this field, and the use of PDMPs to solve applied problems is not widespread. This might be due, firstly, to several implementational challenges that PDMP-based samplers present and, secondly, to the lack of papers that showcase the methods and implementations in applied settings. Here, we address both these issues using one of the most promising PDMPs, the Zig-Zag sampler, as an archetypal example. After an explanation of the key elements of the Zig-Zag sampler, its implementation challenges are exposed and addressed. Specifically, the formulation of an algorithm that draws samples from a target distribution of interest is provided. Notably, the only requirement of the algorithm is a closed-form differentiable function to evaluate the log-target density of interest, and, unlike previous implementations, no further information on the target is needed. The performance of the algorithm is evaluated against canonical Hamiltonian Monte Carlo, and it is shown to be competitive in simulation and real-data settings. Lastly, we demonstrate that the super-efficiency property, i.e. the ability to draw one independent sample at a lesser cost than evaluating the likelihood of all the data, can be obtained in practice.
PubDate: 2022-11-09
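
To make the Zig-Zag mechanics concrete, the following is a minimal sketch for a one-dimensional standard normal target, where the switching rate max(0, theta * x) can be integrated in closed form and event times drawn exactly. This deliberately sidesteps the difficulty the paper addresses (targets for which only the log-density is available); the function name and the choice of target are illustrative assumptions.

```python
import math
import random


def zigzag_standard_normal(n_events, seed=0):
    """Run a 1D Zig-Zag process targeting N(0, 1).

    Returns time averages of x and x^2 along the piecewise-linear path,
    which estimate the target mean (0) and second moment (1).
    """
    rng = random.Random(seed)
    x, theta = 0.0, 1.0  # position and velocity (+1 or -1)
    total_time = 0.0
    integral_x = 0.0
    integral_x2 = 0.0

    for _ in range(n_events):
        # Along x + theta * t the rate is max(0, a + t) with a = theta * x.
        # Setting the integrated rate equal to an Exp(1) draw gives the event time.
        a = theta * x
        e = rng.expovariate(1.0)
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)

        # Exact integrals of x(t) and x(t)^2 over the linear segment (theta^2 = 1).
        integral_x += x * tau + theta * tau ** 2 / 2.0
        integral_x2 += x * x * tau + x * theta * tau ** 2 + tau ** 3 / 3.0
        total_time += tau

        x += theta * tau  # move to the event location
        theta = -theta    # flip the velocity

    return integral_x / total_time, integral_x2 / total_time


if __name__ == "__main__":
    mean, second_moment = zigzag_standard_normal(20_000, seed=1)
    print("estimated mean:         ", mean)
    print("estimated second moment:", second_moment)
```

Expectations are computed as integrals along the continuous trajectory rather than as averages over the event locations, since the event locations alone are not distributed according to the target.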

• Dynamic and robust Bayesian graphical models

Abstract: Gaussian graphical models are widely popular for studying the conditional dependence among random variables. By encoding conditional dependence as an undirected graph, Gaussian graphical models provide interpretable representations and insightful visualizations of the relationships among variables. However, time series data present additional challenges: the graphs can evolve over time, with changes occurring at unknown time points, and the data often exhibit heavy-tailed characteristics. To address these challenges, we propose dynamic and robust Bayesian graphical models that employ state-of-the-art hidden Markov models (HMMs) to introduce dynamics in the graph and heavy-tailed multivariate t-distributions for model robustness. The HMM latent states are linked both temporally and hierarchically for greater information sharing across time and between states. The proposed methods are computationally efficient and demonstrate excellent graph estimation on simulated data with substantial improvements over non-robust graphical models. We demonstrate our proposed approach on human hand gesture tracking data, and discover edges and dynamics with well explained practical meanings.
PubDate: 2022-11-09

• Limit theory and robust evaluation methods for the extremal properties of GARCH(p, q) processes

Abstract: Generalized autoregressive conditionally heteroskedastic (GARCH) processes are widely used for modelling financial returns, with their extremal properties being of interest for market risk management. For GARCH($$p,q$$) processes with $$\max (p,q) = 1$$ all extremal features have been fully characterised, but when $$\max (p,q)\ge 2$$ much remains to be found. Previous research has identified that both marginal and dependence extremal features of strictly stationary GARCH($$p,q$$) processes are determined by a multivariate regular variation property and tail processes. Currently there are no reliable methods for evaluating these characterisations, or even assessing the stationarity, for the classes of GARCH($$p,q$$) processes that are used in practice, i.e., with unbounded and asymmetric innovations. By developing a mixture of new limit theory and particle filtering algorithms for fixed point distributions we produce novel and robust evaluation methods for all extremal features for all GARCH($$p,q$$) processes, including ARCH and IGARCH processes. We investigate our methods’ performance when evaluating the marginal tail index, the extremogram and the extremal index, the latter two being measures of temporal dependence.
PubDate: 2022-11-01

• Graph-based algorithms for phase-type distributions

Abstract: Phase-type distributions model the time until absorption in continuous or discrete-time Markov chains on a finite state space. The multivariate phase-type distributions have diverse and important applications by modeling rewards accumulated at visited states. However, even moderately sized state spaces make the traditional matrix-based equations computationally infeasible. State spaces of phase-type distributions are often large but sparse, with only a few transitions from a state. This sparseness makes a graph-based representation of the phase-type distribution more natural and efficient than the traditional matrix-based representation. In this paper, we develop graph-based algorithms for analyzing phase-type distributions. In addition to algorithms for state space construction, reward transformation, and moments calculation, we give algorithms for the marginal distribution functions of multivariate phase-type distributions and for the state probability vector of the underlying Markov chains of both time-homogeneous and time-inhomogeneous phase-type distributions. The algorithms are available as a numerically stable and memory-efficient open source software package written in C named ptdalgorithms. This library exposes all methods in the programming languages C and R. We compare the running time of ptdalgorithms to the fastest tools using a traditional matrix-based formulation. This comparison includes the computation of the probability distribution, which is usually computed by exponentiation of the sub-intensity or sub-transition matrix. We also compare time spent calculating the moments of (multivariate) phase-type distributions usually defined by inversion of the same matrices. The numerical results of our graph-based and traditional matrix-based methods are identical, and our graph-based algorithms are often orders of magnitude faster. Finally, we demonstrate with a classic problem from population genetics how ptdalgorithms serves as a much faster, simpler, and completely general modeling alternative.
PubDate: 2022-11-01
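
The idea of working on a sparse graph instead of inverting a matrix can be illustrated with the first moment of a continuous phase-type distribution. The sketch below is not the ptdalgorithms API: the dictionary-based graph format and the function name are hypothetical, and the recursion assumes an acyclic state graph (a cycle would make it recurse without end). It uses the identity E[T_s] = 1/r_s + sum over edges of (rate / r_s) * E[T_target], where r_s is the total outgoing rate of state s.

```python
def expected_absorption_time(graph, start):
    """Expected time to absorption in a continuous-time Markov chain on an acyclic graph.

    graph maps each state to a list of (target_state, rate) edges.
    States with no outgoing edges (or missing from graph) are absorbing.
    """
    memo = {}

    def visit(state):
        if state in memo:
            return memo[state]
        edges = graph.get(state, [])
        if not edges:
            memo[state] = 0.0  # absorbing state: no further waiting
            return 0.0
        total_rate = sum(rate for _, rate in edges)
        # Mean holding time plus the jump-probability-weighted times of successors.
        value = 1.0 / total_rate + sum(
            (rate / total_rate) * visit(target) for target, rate in edges
        )
        memo[state] = value
        return value

    return visit(start)


if __name__ == "__main__":
    graph = {
        "a": [("b", 1.0), ("c", 3.0)],
        "b": [("done", 2.0)],
        "c": [("done", 4.0)],
    }
    # 1/4 + (1/4)(1/2) + (3/4)(1/4)
    print(expected_absorption_time(graph, "a"))
```

Each state is visited once and each edge touched once, so the cost scales with the number of transitions rather than with the cube of the number of states, as a dense matrix inversion would.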

• Uniform calibration tests for forecasting systems with small lead time

Abstract: A long noted difficulty when assessing calibration (or reliability) of forecasting systems is that calibration, in general, is a hypothesis not about a finite dimensional parameter but about an entire functional relationship. A calibrated probability forecast for binary events, for instance, should equal the conditional probability of the event given the forecast, whatever the value of the forecast. A new class of tests is presented that are based on estimating the cumulative deviations from calibration. The supremum of those deviations is taken as a test statistic, and the asymptotic distribution of the test statistic is established rigorously. It turns out to be universal, provided the forecasts “look one step ahead” only, or in other words, verify at the next time step in the future. The new tests apply to various different forecasting problems and are compared with established approaches which work in a regression based framework. In comparison to those approaches, the new tests develop power against a wider class of alternatives. Numerical experiments for both artificial data and operational weather forecasting systems are presented, and possible extensions to longer lead times are discussed.
PubDate: 2022-10-29
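
A rough sketch of a supremum-of-cumulative-deviations statistic for binary probability forecasts follows. The abstract does not give the paper's exact statistic, normalisation, or limiting distribution, so the ordering by forecast value, the division by sqrt(n), and the function name are all assumptions made for illustration; no critical values are implied.

```python
import math


def sup_cumulative_deviation(forecasts, outcomes):
    """Largest absolute partial sum of (outcome - forecast), ordered by forecast value.

    forecasts are probabilities in [0, 1]; outcomes are 0 or 1.
    The result is scaled by 1 / sqrt(n).
    """
    pairs = sorted(zip(forecasts, outcomes))
    n = len(pairs)
    running = 0.0
    worst = 0.0
    for probability, outcome in pairs:
        running += outcome - probability
        worst = max(worst, abs(running))
    return worst / math.sqrt(n)


if __name__ == "__main__":
    forecasts = [0.1, 0.5, 0.9]
    outcomes = [0, 1, 1]
    print(sup_cumulative_deviation(forecasts, outcomes))
```

Under calibration the increments have conditional mean zero, so the partial sums fluctuate around zero; a systematic over- or under-forecast in some range of forecast values shows up as a drift in the running sum.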
