A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  

              [Sort by number of followers]   [Restore default list]

  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 151 of 151 Journals sorted alphabetically
Advances in Complex Systems     Hybrid Journal   (Followers: 10)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 61)
Annals of Applied Statistics     Full-text available via subscription   (Followers: 39)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Argumentation et analyse du discours     Open Access   (Followers: 10)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 4)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Bernoulli     Full-text available via subscription   (Followers: 9)
Biometrical Journal     Hybrid Journal   (Followers: 10)
Biometrics     Hybrid Journal   (Followers: 51)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 1)
Bulletin of Statistics     Full-text available via subscription   (Followers: 4)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 11)
Demographic Research     Open Access   (Followers: 16)
Electronic Journal of Statistics     Open Access   (Followers: 8)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Full-text available via subscription   (Followers: 5)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 8)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 5)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 17)
International Journal of Stochastic Analysis     Open Access   (Followers: 3)
International Statistical Review     Hybrid Journal   (Followers: 12)
International Trade by Commodity Statistics - Statistiques du commerce international par produit     Full-text available via subscription  
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 4)
Journal of Applied Statistics     Hybrid Journal   (Followers: 20)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 20)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 39, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 82)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 6)
Journal of Forecasting     Hybrid Journal   (Followers: 17)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Interactive Marketing     Hybrid Journal   (Followers: 10)
Journal of Mathematics and Statistics     Open Access   (Followers: 8)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 32)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 5)
Journal of Statistical Physics     Hybrid Journal   (Followers: 13)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 8)
Journal of Statistical Software     Open Access   (Followers: 20, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 72, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal   (Followers: 1)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 31)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 26)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 43)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 30)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Modelling of Mechanical Systems     Full-text available via subscription   (Followers: 1)
Monte Carlo Methods and Applications     Hybrid Journal   (Followers: 6)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 2)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 5)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 8)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Probability Surveys     Open Access   (Followers: 4)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 7)
Review of Economics and Statistics     Hybrid Journal   (Followers: 124)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 15)
Sankhya A     Hybrid Journal   (Followers: 2)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 37)
SourceOCDE Comptes nationaux et Statistiques retrospectives     Full-text available via subscription  
SourceOCDE Statistiques : Sources et methodes     Full-text available via subscription  
SourceOECD Bank Profitability Statistics - SourceOCDE Rentabilite des banques     Full-text available via subscription   (Followers: 1)
SourceOECD Insurance Statistics - SourceOCDE Statistiques d'assurance     Full-text available via subscription   (Followers: 2)
SourceOECD Main Economic Indicators - SourceOCDE Principaux indicateurs economiques     Full-text available via subscription   (Followers: 1)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
SourceOECD Monthly Statistics of International Trade     Full-text available via subscription   (Followers: 1)
SourceOECD National Accounts & Historical Statistics     Full-text available via subscription  
SourceOECD OECD Economic Outlook Database - SourceOCDE Statistiques des Perspectives economiques de l'OCDE     Full-text available via subscription   (Followers: 2)
SourceOECD Science and Technology Statistics - SourceOCDE Base de donnees des sciences et de la technologie     Full-text available via subscription  
SourceOECD Statistics Sources & Methods     Full-text available via subscription   (Followers: 1)
SourceOECD Taxing Wages Statistics - SourceOCDE Statistiques des impots sur les salaires     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Applications in Genetics and Molecular Biology     Hybrid Journal   (Followers: 5)
Statistical Communications in Infectious Diseases     Hybrid Journal  
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methodology     Hybrid Journal   (Followers: 7)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 27)
Statistical Modelling     Hybrid Journal   (Followers: 19)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistical Science     Full-text available via subscription   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics & Risk Modeling     Hybrid Journal   (Followers: 2)
Statistics and Computing     Hybrid Journal   (Followers: 13)
Statistics and Economics     Open Access   (Followers: 1)
Statistics in Medicine     Hybrid Journal   (Followers: 191)
Statistics, Politics and Policy     Hybrid Journal   (Followers: 6)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 14)
Stochastic Models     Hybrid Journal   (Followers: 3)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 7)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 3)
The American Statistician     Full-text available via subscription   (Followers: 24)
The Annals of Applied Probability     Full-text available via subscription   (Followers: 8)
The Annals of Probability     Full-text available via subscription   (Followers: 10)
The Annals of Statistics     Full-text available via subscription   (Followers: 34)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 11)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)

              [Sort by number of followers]   [Restore default list]

Similar Journals
Journal Cover
Statistics and Computing
Journal Prestige (SJR): 2.545
Citation Impact (citeScore): 2
Number of Followers: 13  
  Hybrid Journal Hybrid journal (It can contain Open Access articles)
ISSN (Print) 1573-1375 - ISSN (Online) 0960-3174
Published by Springer-Verlag Homepage  [2626 journals]
  • A parallel evolutionary multiple-try metropolis Markov chain Monte Carlo
           algorithm for sampling spatial partitions
    • Abstract: Abstract We develop an Evolutionary Markov Chain Monte Carlo (EMCMC) algorithm for sampling spatial partitions that lie within a large, complex, and constrained spatial state space. Our algorithm combines the advantages of evolutionary algorithms (EAs) as optimization heuristics for state space traversal and the theoretical convergence properties of Markov Chain Monte Carlo algorithms for sampling from unknown distributions. Local optimality information that is identified via a directed search by our optimization heuristic is used to adaptively update a Markov chain in a promising direction within the framework of a Multiple-Try Metropolis Markov Chain model that incorporates a generalized Metropolis-Hastings ratio. We further expand the reach of our EMCMC algorithm by harnessing the computational power afforded by massively parallel computing architecture through the integration of a parallel EA framework that guides Markov chains running in parallel.
      PubDate: 2021-01-12
  • Exploiting low-rank covariance structures for computing high-dimensional
           normal and Student- t probabilities
    • Abstract: Abstract We present a preconditioned Monte Carlo method for computing high-dimensional multivariate normal and Student-t probabilities arising in spatial statistics. The approach combines a tile-low-rank representation of covariance matrices with a block-reordering scheme for efficient quasi-Monte Carlo simulation. The tile-low-rank representation decomposes the high-dimensional problem into many diagonal-block-size problems and low-rank connections. The block-reordering scheme reorders between and within the diagonal blocks to reduce the impact of integration variables from right to left, thus improving the Monte Carlo convergence rate. Simulations up to dimension 65,536 suggest that the new method can improve the run time by an order of magnitude compared with the hierarchical quasi-Monte Carlo method and two orders of magnitude compared with the dense quasi-Monte Carlo method. Our method also forms a strong substitute for the approximate conditioning methods as a more robust estimation with error guarantees. An application study to wind stochastic generators is provided to illustrate that the new computational method makes the maximum likelihood estimation feasible for high-dimensional skew-normal random fields.
      PubDate: 2021-01-12
  • Scalable Bayesian inference for self-excitatory stochastic processes
           applied to big American gunfire data
    • Abstract: Abstract The Hawkes process and its extensions effectively model self-excitatory phenomena including earthquakes, viral pandemics, financial transactions, neural spike trains and the spread of memes through social networks. The usefulness of these stochastic process models within a host of economic sectors and scientific disciplines is undercut by the processes’ computational burden: complexity of likelihood evaluations grows quadratically in the number of observations for both the temporal and spatiotemporal Hawkes processes. We show that, with care, one may parallelize these calculations using both central and graphics processing unit implementations to achieve over 100-fold speedups over single-core processing. Using a simple adaptive Metropolis–Hastings scheme, we apply our high-performance computing framework to a Bayesian analysis of big gunshot data generated in Washington D.C. between the years of 2006 and 2019, thereby extending a past analysis of the same data from under 10,000 to over 85,000 observations. To encourage widespread use, we provide hpHawkes, an open-source R package, and discuss high-level implementation and program design for leveraging aspects of computational hardware that become necessary in a big data setting.
      PubDate: 2021-01-12
  • Modal Clustering Using Semiparametric Mixtures and Mode Flattening
    • Abstract: Abstract Modal clustering has a clear population goal, where density estimation plays a critical role. In this paper, we study how to provide better density estimation so as to serve the objective of modal clustering. In particular, we use semiparametric mixtures for density estimation, aided with a novel mode-flattening technique. The use of semiparametric mixtures helps to produce better density estimates, especially in the multivariate situation, and the mode-flattening technique is intended to identify and smooth out spurious and minor modes. With mode flattening, the number of clusters can be sequentially reduced until there is only one mode left. In addition, we adopt the likelihood function in a coherent manner to measure the relative importance of a mode and let the current least important mode disappear in each step. For both simulated and real-world data sets, the proposed method performs very well, as compared with some well-known clustering methods in the literature, and can successfully solve some fairly difficult clustering problems.
      PubDate: 2021-01-12
  • Ensemble Kalman inversion: mean-field limit and convergence analysis
    • Abstract: Abstract Ensemble Kalman inversion (EKI) has been a very popular algorithm used in Bayesian inverse problems (Iglesias et al. in Inverse Probl 29: 045001, 2013). It samples particles from a prior distribution and introduces a motion to move the particles around in pseudo-time. As the pseudo-time goes to infinity, the method finds the minimizer of the objective function, and when the pseudo-time stops at 1, the ensemble distribution of the particles resembles, in some sense, the posterior distribution in the linear setting. The ideas trace back further to ensemble Kalman filter and the associated analysis  (Evensen in J Geophys Res: Oceans 99: 10143–10162, 1994; Reich in BIT Numer Math 51: 235–249, 2011), but to today, when viewed as a sampling method, why EKI works, and in what sense with what rate the method converges is still largely unknown. In this paper, we analyze the continuous version of EKI, a coupled SDE system, and prove the mean-field limit of this SDE system. In particular, we will show that 1. as the number of particles goes to infinity, the empirical measure of particles following SDE converges to the solution to a Fokker–Planck equation in Wasserstein 2-distance with an optimal rate, for both linear and weakly nonlinear case; 2. the solution to the Fokker–Planck equation reconstructs the target distribution in finite time in the linear case, as suggested in Iglesias et al. (Inverse Probl 29: 045001, 2013).
      PubDate: 2021-01-12
  • Non-local spatially varying finite mixture models for image segmentation
    • Abstract: Abstract In this work, we propose a new Bayesian model for unsupervised image segmentation based on a combination of the spatially varying finite mixture models (SVFMMs) and the non-local means (NLM) framework. The probabilistic NLM weighting function is successfully integrated into a varying Gauss–Markov random field, yielding a prior density that adaptively imposes a local regularization to simultaneously preserve edges and enforce smooth constraints in homogeneous regions of the image. Two versions of our model are proposed: a pixel-based model and a patch-based model, depending on the design of the probabilistic NLM weighting function. Contrary to previous methods proposed in the literature, our approximation does not introduce new parameters to be estimated into the model, because the NLM weighting function is completely known once the neighborhood of a pixel is fixed. The proposed model can be estimated in closed-form solution via a maximum a posteriori (MAP) estimation in an expectation–maximization scheme. We have compared our model with previously proposed SVFMMs using two public datasets: the Berkeley Segmentation dataset and the BRATS 2013 dataset. The proposed model performs favorably to previous approaches in the literature, achieving better results in terms of Rand Index and Dice metrics in our experiments.
      PubDate: 2021-01-12
  • Rank-one multi-reference factor analysis
    • Abstract: Abstract In recent years, there is a growing need for processing methods aimed at extracting useful information from large datasets. In many cases, the challenge is to discover a low-dimensional structure in the data, often concealed by the existence of nuisance parameters and noise. Motivated by such challenges, we consider the problem of estimating a signal from its scaled, cyclically shifted and noisy observations. We focus on the particularly challenging regime of low signal-to-noise ratio (SNR), where different observations cannot be shift-aligned. We show that an accurate estimation of the signal from its noisy observations is possible, and derive a procedure which is proved to consistently estimate the signal. The asymptotic sample complexity (the number of observations required to recover the signal) of the procedure is \(1{/}{\text {SNR}}^4\) . Additionally, we propose a procedure which is experimentally shown to improve the sample complexity by a factor equal to the signal’s length. Finally, we present numerical experiments which demonstrate the performance of our algorithms and corroborate our theoretical findings.
      PubDate: 2021-01-12
  • An adaptive MCMC method for Bayesian variable selection in logistic and
           accelerated failure time regression models
    • Abstract: Abstract Bayesian variable selection is an important method for discovering variables which are most useful for explaining the variation in a response. The widespread use of this method has been restricted by the challenging computational problem of sampling from the corresponding posterior distribution. Recently, the use of adaptive Monte Carlo methods has been shown to lead to performance improvement over traditionally used algorithms in linear regression models. This paper looks at applying one of these algorithms (the adaptively scaled independence sampler) to logistic regression and accelerated failure time models. We investigate the use of this algorithm with data augmentation, Laplace approximation and the correlated pseudo-marginal method. The performance of the algorithms is compared on several genomic data sets.
      PubDate: 2021-01-12
  • Robust fitting for generalized additive models for location, scale and
    • Abstract: Abstract The validity of estimation and smoothing parameter selection for the wide class of generalized additive models for location, scale and shape (GAMLSS) relies on the correct specification of a likelihood function. Deviations from such assumption are known to mislead any likelihood-based inference and can hinder penalization schemes meant to ensure some degree of smoothness for nonlinear effects. We propose a general approach to achieve robustness in fitting GAMLSSs by limiting the contribution of observations with low log-likelihood values. Robust selection of the smoothing parameters can be carried out either by minimizing information criteria that naturally arise from the robustified likelihood or via an extended Fellner–Schall method. The latter allows for automatic smoothing parameter selection and is particularly advantageous in applications with multiple smoothing parameters. We also address the challenge of tuning robust estimators for models with nonlinear effects by proposing a novel median downweighting proportion criterion. This enables a fair comparison with existing robust estimators for the special case of generalized additive models, where our estimator competes favorably. The overall good performance of our proposal is illustrated by further simulations in the GAMLSS setting and by an application to functional magnetic resonance brain imaging using bivariate smoothing splines.
      PubDate: 2021-01-12
  • Fast sampling from $$\beta $$ β -ensembles
    • Abstract: Abstract We investigate sampling \(\beta \) -ensembles with time complexity less than cubic in the cardinality of the ensemble. Following Dumitriu and Edelman (J Math Phys 43(11):5830–5847, 2002), we see the ensemble as the eigenvalues of a random tridiagonal matrix, namely a random Jacobi matrix. First, we provide a unifying and elementary treatment of the tridiagonal models associated with the three classical Hermite, Laguerre, and Jacobi ensembles. For this purpose, we use simple changes of variables between successive reparametrizations of the coefficients defining the tridiagonal matrix. Second, we derive an approximate sampler for the simulation of more general \(\beta \) -ensembles and illustrate how fast it can be for polynomial potentials. This method combines a Gibbs sampler on Jacobi matrices and the diagonalization of these matrices. In practice, even for large ensembles, only a few Gibbs passes suffice for the marginal distribution of the eigenvalues to fit the expected theoretical distribution. When the conditionals in the Gibbs sampler can be simulated exactly, the same fast empirical convergence is observed for the fluctuations of the largest eigenvalue. Our experimental results support a conjecture by Krishnapur et al. (Commun Pure Appl Math 69(1): 145–199, 2016), that the Gibbs chain on Jacobi matrices of size N mixes in \(\mathcal {O}(\log N)\) .
      PubDate: 2021-01-12
  • Toeplitz Monte Carlo
    • Abstract: Abstract Motivated mainly by applications to partial differential equations with random coefficients, we introduce a new class of Monte Carlo estimators, called Toeplitz Monte Carlo (TMC) estimator, for approximating the integral of a multivariate function with respect to the direct product of an identical univariate probability measure. The TMC estimator generates a sequence \(x_1,x_2,\ldots \) of i.i.d. samples for one random variable and then uses \((x_{n+s-1},x_{n+s-2}\ldots ,x_n)\) with \(n=1,2,\ldots \) as quadrature points, where s denotes the dimension. Although consecutive points have some dependency, the concatenation of all quadrature nodes is represented by a Toeplitz matrix, which allows for a fast matrix–vector multiplication. In this paper, we study the variance of the TMC estimator and its dependence on the dimension s. Numerical experiments confirm the considerable efficiency improvement over the standard Monte Carlo estimator for applications to partial differential equations with random coefficients, particularly when the dimension s is large.
      PubDate: 2021-01-06
  • Inference for cluster point processes with over- or under-dispersed
           cluster sizes
    • Abstract: Abstract Cluster point processes comprise a class of models that have been used for a wide range of applications. While several models have been studied for the probability density function of the offspring displacements and the parent point process, there are few examples of non-Poisson distributed cluster sizes. In this paper, we introduce a generalization of the Thomas process, which allows for the cluster sizes to have a variance that is greater or less than the expected value. We refer to this as the cluster sizes being over- and under-dispersed, respectively. To fit the model, we introduce minimum contrast methods and a Bayesian MCMC algorithm. These are evaluated in a simulation study. It is found that using the Bayesian MCMC method, we are in most cases able to detect over- and under-dispersion in the cluster sizes. We use the MCMC method to fit the model to nerve fiber data, and contrast the results to those of a fitted Thomas process.
      PubDate: 2020-11-01
  • Imputation and low-rank estimation with Missing Not At Random data
    • Abstract: Abstract Missing values challenge data analysis because many supervised and unsupervised learning methods cannot be applied directly to incomplete data. Matrix completion based on low-rank assumptions are very powerful solution for dealing with missing values. However, existing methods do not consider the case of informative missing values which are widely encountered in practice. This paper proposes matrix completion methods to recover Missing Not At Random (MNAR) data. Our first contribution is to suggest a model-based estimation strategy by modelling the missing mechanism distribution. An EM algorithm is then implemented, involving a Fast Iterative Soft-Thresholding Algorithm (FISTA). Our second contribution is to suggest a computationally efficient surrogate estimation by implicitly taking into account the joint distribution of the data and the missing mechanism: the data matrix is concatenated with the mask coding for the missing values; a low-rank structure for exponential family is assumed on this new matrix, in order to encode links between variables and missing mechanisms. The methodology that has the great advantage of handling different missing value mechanisms is robust to model specification errors. The performances of our methods are assessed on the real data collected from a trauma registry (TraumaBase \(^{\textregistered }\) ) containing clinical information about over twenty thousand severely traumatized patients in France. The aim is then to predict if the doctors should administrate tranexomic acid to patients with traumatic brain injury, that would limit excessive bleeding.
      PubDate: 2020-11-01
  • BayesProject: Fast computation of a projection direction for multivariate
           changepoint detection
    • Abstract: Abstract This article focuses on the challenging problem of efficiently detecting changes in mean within multivariate data sequences. Multivariate changepoints can be detected by projecting a multivariate series to a univariate one using a suitable projection direction that preserves a maximal proportion of signal information. However, for some existing approaches the computation of such a projection direction can scale unfavourably with the number of series and might rely on additional assumptions on the data sequences, thus limiting their generality. We introduce BayesProject, a computationally inexpensive Bayesian approach to compute a projection direction in such a setting. The proposed approach allows the incorporation of prior knowledge of the changepoint scenario, when such information is available, which can help to increase the accuracy of the method. A simulation study shows that BayesProject is robust, yields projections close to the oracle projection direction and, moreover, that its accuracy in detecting changepoints is comparable to, or better than, existing algorithms while scaling linearly with the number of series.
      PubDate: 2020-11-01
  • Sampling hierarchies of discrete random structures
    • Abstract: Abstract Hierarchical normalized discrete random measures identify a general class of priors that is suited to flexibly learn how the distribution of a response variable changes across groups of observations. A special case widely used in practice is the hierarchical Dirichlet process. Although current theory on hierarchies of nonparametric priors yields all relevant tools for drawing posterior inference, their implementation comes at a high computational cost. We fill this gap by proposing an approximation for a general class of hierarchical processes, which leads to an efficient conditional Gibbs sampling algorithm. The key idea consists of a deterministic truncation of the underlying random probability measures leading to a finite dimensional approximation of the original prior law. We provide both empirical and theoretical support for such a procedure.
      PubDate: 2020-11-01
  • Parallel sequential Monte Carlo for stochastic gradient-free nonconvex
    • Abstract: Abstract We introduce and analyze a parallel sequential Monte Carlo methodology for the numerical solution of optimization problems that involve the minimization of a cost function that consists of the sum of many individual components. The proposed scheme is a stochastic zeroth-order optimization algorithm which demands only the capability to evaluate small subsets of components of the cost function. It can be depicted as a bank of samplers that generate particle approximations of several sequences of probability measures. These measures are constructed in such a way that they have associated probability density functions whose global maxima coincide with the global minima of the original cost function. The algorithm selects the best performing sampler and uses it to approximate a global minimum of the cost function. We prove analytically that the resulting estimator converges to a global minimum of the cost function almost surely and provide explicit convergence rates in terms of the number of generated Monte Carlo samples and the dimension of the search space. We show, by way of numerical examples, that the algorithm can tackle cost functions with multiple minima or with broad “flat” regions which are hard to minimize using gradient-based techniques.
      PubDate: 2020-11-01
  • Subsampling sequential Monte Carlo for static Bayesian models
    • Abstract: Abstract We show how to speed up sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
      PubDate: 2020-09-09
  • Properties of the stochastic approximation EM algorithm with mini-batch
    • Abstract: Abstract To deal with very large datasets a mini-batch version of the Monte Carlo Markov Chain Stochastic Approximation Expectation–Maximization algorithm for general latent variable models is proposed. For exponential models the algorithm is shown to be convergent under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling results in an important speed-up of the convergence of the sequence of estimators generated by the algorithm. Moreover, insights on the effect of the mini-batch size on the limit distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when a constraint on the computing time is given.
      PubDate: 2020-09-05
  • An apparent paradox: a classifier based on a partially classified sample
           may have smaller expected error rate than that if the sample were
           completely classified
    • Abstract: Abstract There has been increasing interest in using semi-supervised learning to form a classifier. As is well known, the (Fisher) information in an unclassified feature with unknown class label is less (considerably less for weakly separated classes) than that of a classified feature which has known class label. Hence in the case where the absence of class labels does not depend on the data, the expected error rate of a classifier formed from the classified and unclassified features in a partially classified sample is greater than that if the sample were completely classified. We propose to treat the labels of the unclassified features as missing data and to introduce a framework for their missingness as in the pioneering work of Rubin (Biometrika 63:581–592, 1976) for missingness in incomplete data analysis. An examination of several partially classified data sets in the literature suggests that the unclassified features are not occurring at random in the feature space, but rather tend to be concentrated in regions of relatively high entropy. It suggests that the missingness of the labels of the features can be modelled by representing the conditional probability of a missing label for a feature via the logistic model with covariate depending on the entropy of the feature or an appropriate proxy for it. We consider here the case of two normal classes with a common covariance matrix where for computational convenience the square of the discriminant function is used as the covariate in the logistic model in place of the negative log entropy. Rather paradoxically, we show that the classifier so formed from the partially classified sample may have smaller expected error rate than that if the sample were completely classified.
      PubDate: 2020-09-05
  • Semi-automated simultaneous predictor selection for regression-SARIMA
    • Abstract: Abstract Deciding which predictors to use plays an integral role in deriving statistical models in a wide range of applications. Motivated by the challenges of predicting events across a telecommunications network, we propose a semi-automated, joint model-fitting and predictor selection procedure for linear regression models. Our approach can model and account for serial correlation in the regression residuals, produces sparse and interpretable models and can be used to jointly select models for a group of related responses. This is achieved through fitting linear models under constraints on the number of nonzero coefficients using a generalisation of a recently developed mixed integer quadratic optimisation approach. The resultant models from our approach achieve better predictive performance on the motivating telecommunications data than methods currently used by industry.
      PubDate: 2020-09-04
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762

Your IP address:
Home (Search)
About JournalTOCs
News (blog, publications)
JournalTOCs on Twitter   JournalTOCs on Facebook

JournalTOCs © 2009-