A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  

              [Sort by number of followers]   [Restore default list]

  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 151 of 151 Journals sorted alphabetically
Advances in Complex Systems     Hybrid Journal   (Followers: 11)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 62)
Annals of Applied Statistics     Full-text available via subscription   (Followers: 39)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Argumentation et analyse du discours     Open Access   (Followers: 11)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 4)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Bernoulli     Full-text available via subscription   (Followers: 9)
Biometrical Journal     Hybrid Journal   (Followers: 11)
Biometrics     Hybrid Journal   (Followers: 52)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 2)
Bulletin of Statistics     Full-text available via subscription   (Followers: 4)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 11)
Demographic Research     Open Access   (Followers: 15)
Electronic Journal of Statistics     Open Access   (Followers: 8)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Full-text available via subscription   (Followers: 5)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 9)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 5)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 17)
International Journal of Stochastic Analysis     Open Access   (Followers: 3)
International Statistical Review     Hybrid Journal   (Followers: 13)
International Trade by Commodity Statistics - Statistiques du commerce international par produit     Full-text available via subscription  
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 4)
Journal of Applied Statistics     Hybrid Journal   (Followers: 21)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 21)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 39, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 84)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 6)
Journal of Forecasting     Hybrid Journal   (Followers: 17)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Interactive Marketing     Hybrid Journal   (Followers: 10)
Journal of Mathematics and Statistics     Open Access   (Followers: 8)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 33)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 5)
Journal of Statistical Physics     Hybrid Journal   (Followers: 13)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 8)
Journal of Statistical Software     Open Access   (Followers: 21, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 72, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal   (Followers: 1)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 33)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 27)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 43)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 30)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Modelling of Mechanical Systems     Full-text available via subscription   (Followers: 1)
Monte Carlo Methods and Applications     Hybrid Journal   (Followers: 6)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 2)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 5)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 8)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Probability Surveys     Open Access   (Followers: 4)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 8)
Review of Economics and Statistics     Hybrid Journal   (Followers: 128)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 15)
Sankhya A     Hybrid Journal   (Followers: 2)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 38)
SourceOCDE Comptes nationaux et Statistiques retrospectives     Full-text available via subscription  
SourceOCDE Statistiques : Sources et methodes     Full-text available via subscription  
SourceOECD Bank Profitability Statistics - SourceOCDE Rentabilite des banques     Full-text available via subscription   (Followers: 1)
SourceOECD Insurance Statistics - SourceOCDE Statistiques d'assurance     Full-text available via subscription   (Followers: 2)
SourceOECD Main Economic Indicators - SourceOCDE Principaux indicateurs economiques     Full-text available via subscription   (Followers: 1)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
SourceOECD Monthly Statistics of International Trade     Full-text available via subscription   (Followers: 1)
SourceOECD National Accounts & Historical Statistics     Full-text available via subscription  
SourceOECD OECD Economic Outlook Database - SourceOCDE Statistiques des Perspectives economiques de l'OCDE     Full-text available via subscription   (Followers: 2)
SourceOECD Science and Technology Statistics - SourceOCDE Base de donnees des sciences et de la technologie     Full-text available via subscription  
SourceOECD Statistics Sources & Methods     Full-text available via subscription   (Followers: 1)
SourceOECD Taxing Wages Statistics - SourceOCDE Statistiques des impots sur les salaires     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Applications in Genetics and Molecular Biology     Hybrid Journal   (Followers: 5)
Statistical Communications in Infectious Diseases     Hybrid Journal  
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methodology     Hybrid Journal   (Followers: 7)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 27)
Statistical Modelling     Hybrid Journal   (Followers: 19)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistical Science     Full-text available via subscription   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics & Risk Modeling     Hybrid Journal   (Followers: 3)
Statistics and Computing     Hybrid Journal   (Followers: 13)
Statistics and Economics     Open Access   (Followers: 1)
Statistics in Medicine     Hybrid Journal   (Followers: 196)
Statistics, Politics and Policy     Hybrid Journal   (Followers: 6)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 15)
Stochastic Models     Hybrid Journal   (Followers: 3)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 7)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 3)
The American Statistician     Full-text available via subscription   (Followers: 23)
The Annals of Applied Probability     Full-text available via subscription   (Followers: 8)
The Annals of Probability     Full-text available via subscription   (Followers: 10)
The Annals of Statistics     Full-text available via subscription   (Followers: 34)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 11)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)

              [Sort by number of followers]   [Restore default list]

Similar Journals
Journal Cover
Statistics and Computing
Journal Prestige (SJR): 2.545
Citation Impact (citeScore): 2
Number of Followers: 13  
  Hybrid Journal Hybrid journal (It can contain Open Access articles)
ISSN (Print) 1573-1375 - ISSN (Online) 0960-3174
Published by Springer-Verlag Homepage  [2656 journals]
  • Estimation of time-varying autoregressive stochastic volatility models
           with stable innovations
    • Abstract: A new time-varying autoregressive stochastic volatility model with \(\alpha \) -stable innovations (TVAR \(\alpha \) SV) is proposed. This new model for time series data combines a time-varying autoregressive component and a stochastic scaling as known from stochastic volatility models with \(\alpha \) -stable distributed noise. Hence, the model can cover extreme events better than classical stochastic volatility models. Furthermore, we develop a Gibbs sampling procedure for the estimation of the model parameters. The procedure is based on the estimation strategy by Kim et al. (Rev Econ Stud 65(3): 361–393, 1998) for classical stochastic volatility models, however, the estimation procedure requires a deliberate approximation of \(\alpha \) -stable distributions by finite mixtures of normal distributions and the application of a simulation smoother for linear Gaussian state space models. A simulation study for the new estimation procedure illustrates the appealing accuracy. Finally, we apply the model to electricity spot price data.
      PubDate: 2021-04-20
  • A piecewise deterministic Monte Carlo method for diffusion bridges
    • Abstract: We introduce the use of the Zig-Zag sampler to the problem of sampling conditional diffusion processes (diffusion bridges). The Zig-Zag sampler is a rejection-free sampling scheme based on a non-reversible continuous piecewise deterministic Markov process. Similar to the Lévy–Ciesielski construction of a Brownian motion, we expand the diffusion path in a truncated Faber–Schauder basis. The coefficients within the basis are sampled using a Zig-Zag sampler. A key innovation is the use of the fully local algorithm for the Zig-Zag sampler that allows to exploit the sparsity structure implied by the dependency graph of the coefficients and by the subsampling technique to reduce the complexity of the algorithm. We illustrate the performance of the proposed methods in a number of examples.
      PubDate: 2021-04-20
  • Particle-based energetic variational inference
    • Abstract: We introduce a new variational inference (VI) framework, called energetic variational inference (EVI). It minimizes the VI objective function based on a prescribed energy-dissipation law. Using the EVI framework, we can derive many existing particle-based variational inference (ParVI) methods, including the popular Stein variational gradient descent (SVGD). More importantly, many new ParVI schemes can be created under this framework. For illustration, we propose a new particle-based EVI scheme, which performs the particle-based approximation of the density first and then uses the approximated density in the variational procedure, or “Approximation-then-Variation” for short. Thanks to this order of approximation and variation, the new scheme can maintain the variational structure at the particle level, and can significantly decrease the KL-divergence in each iteration. Numerical experiments show the proposed method outperforms some existing ParVI methods in terms of fidelity to the target distribution.
      PubDate: 2021-04-17
  • Locally induced Gaussian processes for large-scale simulation experiments
    • Abstract: Gaussian processes (GPs) serve as flexible surrogates for complex surfaces, but buckle under the cubic cost of matrix decompositions with big training data sizes. Geospatial and machine learning communities suggest pseudo-inputs, or inducing points, as one strategy to obtain an approximation easing that computational burden. However, we show how placement of inducing points and their multitude can be thwarted by pathologies, especially in large-scale dynamic response surface modeling tasks. As remedy, we suggest porting the inducing point idea, which is usually applied globally, over to a more local context where selection is both easier and faster. In this way, our proposed methodology hybridizes global inducing point and data subset-based local GP approximation. A cascade of strategies for planning the selection of local inducing points is provided, and comparisons are drawn to related methodology with emphasis on computer surrogate modeling applications. We show that local inducing points extend their global and data subset component parts on the accuracy–computational efficiency frontier. Illustrative examples are provided on benchmark data and a large-scale real-simulation satellite drag interpolation problem.
      PubDate: 2021-04-17
  • A fast and efficient smoothing approach to Lasso regression and an
           application in statistical genetics: polygenic risk scores for chronic
           obstructive pulmonary disease (COPD)
    • Abstract: High dimensional linear regression problems are often fitted using Lasso approaches. Although the Lasso objective function is convex, it is not differentiable everywhere, making the use of gradient descent methods for minimization not straightforward. To avoid this technical issue, we apply Nesterov smoothing to the original (unsmoothed) Lasso objective function. We introduce a closed-form smoothed Lasso which preserves the convexity of the Lasso function, is uniformly close to the unsmoothed Lasso, and allows us to obtain closed-form derivatives everywhere for efficient and fast minimization via gradient descent. Our simulation studies are focused on polygenic risk scores using genetic data from a genome-wide association study (GWAS) for chronic obstructive pulmonary disease (COPD). We compare accuracy and runtime of our approach to the current gold standard in the literature, the FISTA algorithm. Our results suggest that the proposed methodology provides estimates with equal or higher accuracy than the FISTA algorithm while having the same asymptotic runtime scaling. The proposed methodology is implemented in the R-package smoothedLasso, available on the Comprehensive R Archive Network (CRAN).
      PubDate: 2021-04-17
  • Regularized bi-directional co-clustering
    • Abstract: The simultaneous clustering of documents and words, known as co-clustering, has proved to be more effective than one-sided clustering in dealing with sparse high-dimensional datasets. By their nature, text data are also generally unbalanced and directional. Recently, the von Mises–Fisher (vMF) mixture model was proposed to handle unbalanced data while harnessing the directional nature of text. In this paper, we propose a general co-clustering framework based on a matrix formulation of vMF model-based co-clustering. This formulation leads to a flexible framework for text co-clustering that can easily incorporate both word–word semantic relationships and document–document similarities. By contrast with existing methods, which generally use an additive incorporation of similarities, we propose a bi-directional multiplicative regularization that better encapsulates the underlying text data structure. Extensive evaluations on various real-world text datasets demonstrate the superior performance of our proposed approach over baseline and competitive methods, both in terms of clustering results and co-cluster topic coherence.
      PubDate: 2021-04-10
  • A mixed model approach to measurement error in semiparametric regression
    • Abstract: An essential assumption in traditional regression techniques is that predictors are measured without errors. Failing to take into account measurement error in predictors may result in severely biased inferences. Correcting measurement-error bias is an extremely difficult problem when estimating a regression function nonparametrically. We propose an approach to deal with measurement errors in predictors when modelling flexible regression functions. This approach depends on directly modelling the mean and the variance of the response variable after integrating out the true unobserved predictors in a penalized splines model. We demonstrate through simulation studies that our approach provides satisfactory prediction accuracy largely outperforming previously suggested local polynomial estimators even when the model is incorrectly specified and is competitive with the Bayesian estimator.
      PubDate: 2021-03-30
  • A simple method for rejection sampling efficiency improvement on SIMT
    • Abstract: We derive a probability distribution for the possible number of iterations required for a SIMT (single instruction multiple thread) program using rejection sampling to finish creating a sample across all threads. This distribution is found to match a recently proposed distribution from Chakraborty and Gupta (in: Communications in statistics: theory and methods, 2015) that was shown as a good approximation of certain datasets. This work demonstrates an exact application of this distribution. The distribution can be used to evaluate the relative merit of some sampling methods on the GPU without resort to numerical tests. The distribution reduces to the expected geometric distribution in the single thread per warp limit. A simplified formula to approximate the expected number of iterations required to obtain rejection iteration samples is provided. With this new result, a simple, efficient layout for assigning sampling tasks to threads on a GPU is found as a function of the rejection probability without recourse to more complicated rejection sampling methods.
      PubDate: 2021-03-30
  • Efficient stochastic optimisation by unadjusted Langevin Monte Carlo
    • Abstract: Stochastic approximation methods play a central role in maximum likelihood estimation problems involving intractable likelihood functions, such as marginal likelihoods arising in problems with missing or incomplete data, and in parametric empirical Bayesian estimation. Combined with Markov chain Monte Carlo algorithms, these stochastic optimisation methods have been successfully applied to a wide range of problems in science and industry. However, this strategy scales poorly to large problems because of methodological and theoretical difficulties related to using high-dimensional Markov chain Monte Carlo algorithms within a stochastic approximation scheme. This paper proposes to address these difficulties by using unadjusted Langevin algorithms to construct the stochastic approximation. This leads to a highly efficient stochastic optimisation methodology with favourable convergence properties that can be quantified explicitly and easily checked. The proposed methodology is demonstrated with three experiments, including a challenging application to statistical audio analysis and a sparse Bayesian logistic regression with random effects problem.
      PubDate: 2021-03-19
  • Ensemble sampler for infinite-dimensional inverse problems
    • Abstract: We introduce a new Markov chain Monte Carlo (MCMC) sampler for infinite-dimensional inverse problems. Our new sampler is based on the affine invariant ensemble sampler, which uses interacting walkers to adapt to the covariance structure of the target distribution. We extend this ensemble sampler for the first time to infinite-dimensional function spaces, yielding a highly efficient gradient-free MCMC algorithm. Because our new ensemble sampler does not require gradients or posterior covariance estimates, it is simple to implement and broadly applicable.
      PubDate: 2021-03-15
  • Sequential Bayesian optimal experimental design for structural reliability
    • Abstract: Structural reliability analysis is concerned with estimation of the probability of a critical event taking place, described by \(P(g(\mathbf{X} ) \le 0)\) for some n-dimensional random variable \(\mathbf{X} \) and some real-valued function g. In many applications the function g is practically unknown, as function evaluation involves time consuming numerical simulation or some other form of experiment that is expensive to perform. The problem we address in this paper is how to optimally design experiments, in a Bayesian decision theoretic fashion, when the goal is to estimate the probability \(P(g(\mathbf{X} ) \le 0)\) using a minimal amount of resources. As opposed to existing methods that have been proposed for this purpose, we consider a general structural reliability model given in hierarchical form. We therefore introduce a general formulation of the experimental design problem, where we distinguish between the uncertainty related to the random variable \(\mathbf{X} \) and any additional epistemic uncertainty that we want to reduce through experimentation. The effectiveness of a design strategy is evaluated through a measure of residual uncertainty, and efficient approximation of this quantity is crucial if we want to apply algorithms that search for an optimal strategy. The method we propose is based on importance sampling combined with the unscented transform for epistemic uncertainty propagation. We implement this for the myopic (one-step look ahead) alternative, and demonstrate the effectiveness through a series of numerical experiments.
      PubDate: 2021-03-09
  • Accelerating the estimation of renewal Hawkes self-exciting point
    • Abstract: The renewal Hawkes process is a nascent point process model that generalizes the Hawkes process. Although it has shown strong application potential, fitting the renewal Hawkes process to data remains a challenging task, especially on larger datasets. This article tackles this challenge by providing two approaches that significantly reduce the time required to fit renewal Hawkes processes. Since derivative-based methods for optimization, in general, converge faster than derivative-free methods, our first approach is to derive algorithms for evaluating the gradient and Hessian of the log-likelihood function and then use a derivative-based method, such as the Newton–Raphson method, in maximizing the likelihood, instead of the derivative-free method currently being used. Our second approach is to seek linear time algorithms that produce accurate approximations to the likelihood function, and then directly optimize the approximation to the log-likelihood function. Our simulation experiments show that the Newton–Raphson method reduces the computational time by about 30%. Furthermore, the approximate likelihood methods produce equally accurate estimates compared to the methods based on the exact likelihood and are about 20–40 times faster on datasets with about 10,000 events. We conclude with an analysis of price changes of several currencies relative to the US Dollar.
      PubDate: 2021-03-04
  • Unbiased estimation of the gradient of the log-likelihood in inverse
    • Abstract: We consider the problem of estimating a parameter \(\theta \in \Theta \subseteq {\mathbb {R}}^{d_{\theta }}\) associated with a Bayesian inverse problem. Typically one must resort to a numerical approximation of gradient of the log-likelihood and also adopt a discretization of the problem in space and/or time. We develop a new methodology to unbiasedly estimate the gradient of the log-likelihood with respect to the unknown parameter, i.e. the expectation of the estimate has no discretization bias. Such a property is not only useful for estimation in terms of the original stochastic model of interest, but can be used in stochastic gradient algorithms which benefit from unbiased estimates. Under appropriate assumptions, we prove that our estimator is not only unbiased but of finite variance. In addition, when implemented on a single processor, we show that the cost to achieve a given level of error is comparable to multilevel Monte Carlo methods, both practically and theoretically. However, the new algorithm is highly amenable to parallel computation.
      PubDate: 2021-03-03
  • Conditional particle filters with diffuse initial distributions
    • Abstract: Conditional particle filters (CPFs) are powerful smoothing algorithms for general nonlinear/non-Gaussian hidden Markov models. However, CPFs can be inefficient or difficult to apply with diffuse initial distributions, which are common in statistical applications. We propose a simple but generally applicable auxiliary variable method, which can be used together with the CPF in order to perform efficient inference with diffuse initial distributions. The method only requires simulatable Markov transitions that are reversible with respect to the initial distribution, which can be improper. We focus in particular on random walk type transitions which are reversible with respect to a uniform initial distribution (on some domain), and autoregressive kernels for Gaussian initial distributions. We propose to use online adaptations within the methods. In the case of random walk transition, our adaptations use the estimated covariance and acceptance rate adaptation, and we detail their theoretical validity. We tested our methods with a linear Gaussian random walk model, a stochastic volatility model, and a stochastic epidemic compartment model with time-varying transmission rate. The experimental findings demonstrate that our method works reliably with little user specification and can be substantially better mixing than a direct particle Gibbs algorithm that treats initial states as parameters.
      PubDate: 2021-03-03
  • Deep mixtures of unigrams for uncovering topics in textual data
    • Abstract: Mixtures of unigrams are one of the simplest and most efficient tools for clustering textual data, as they assume that documents related to the same topic have similar distributions of terms, naturally described by multinomials. When the classification task is particularly challenging, such as when the document-term matrix is high-dimensional and extremely sparse, a more composite representation can provide better insight into the grouping structure. In this work, we developed a deep version of mixtures of unigrams for the unsupervised classification of very short documents with a large number of terms, by allowing for models with further deeper latent layers; the proposal is derived in a Bayesian framework. The behavior of the deep mixtures of unigrams is empirically compared with that of other traditional and state-of-the-art methods, namely k-means with cosine distance, k-means with Euclidean distance on data transformed according to semantic analysis, partition around medoids, mixture of Gaussians on semantic-based transformed data, hierarchical clustering according to Ward’s method with cosine dissimilarity, latent Dirichlet allocation, mixtures of unigrams estimated via the EM algorithm, spectral clustering and affinity propagation clustering. The performance is evaluated in terms of both correct classification rate and Adjusted Rand Index. Simulation studies and real data analysis prove that going deep in clustering such data highly improves the classification accuracy.
      PubDate: 2021-03-03
  • Bayesian additive regression trees with model trees
    • Abstract: Bayesian additive regression trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART assumes regularisation priors on a set of trees that work as weak learners and is very flexible for predicting in the presence of nonlinearity and high-order interactions. In this paper, we introduce an extension of BART, called model trees BART (MOTR-BART), that considers piecewise linear functions at node levels instead of piecewise constants. In MOTR-BART, rather than having a unique value at node level for the prediction, a linear predictor is estimated considering the covariates that have been used as the split variables in the corresponding tree. In our approach, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART. Via simulation studies and real data applications, we compare MOTR-BART to its main competitors. R code for MOTR-BART implementation is available at https://github.com/ebprado/MOTR-BART.
      PubDate: 2021-03-03
  • Bayesian ODE solvers: the maximum a posteriori estimate
    • Abstract: There is a growing interest in probabilistic numerical solutions to ordinary differential equations. In this paper, the maximum a posteriori estimate is studied under the class of \(\nu \) times differentiable linear time-invariant Gauss–Markov priors, which can be computed with an iterated extended Kalman smoother. The maximum a posteriori estimate corresponds to an optimal interpolant in the reproducing kernel Hilbert space associated with the prior, which in the present case is equivalent to a Sobolev space of smoothness \(\nu +1\) . Subject to mild conditions on the vector field, convergence rates of the maximum a posteriori estimate are then obtained via methods from nonlinear analysis and scattered data approximation. These results closely resemble classical convergence results in the sense that a \(\nu \) times differentiable prior process obtains a global order of \(\nu \) , which is demonstrated in numerical examples.
      PubDate: 2021-03-03
  • Gaussian process learning via Fisher scoring of Vecchia’s
    • Abstract: We derive a single-pass algorithm for computing the gradient and Fisher information of Vecchia’s Gaussian process loglikelihood approximation, which provides a computationally efficient means for applying the Fisher scoring algorithm for maximizing the loglikelihood. The advantages of the optimization techniques are demonstrated in numerical examples and in an application to Argo ocean temperature data. The new methods find the maximum likelihood estimates much faster and more reliably than an optimization method that uses only function evaluations, especially when the covariance function has many parameters. This allows practitioners to fit nonstationary models to large spatial and spatial–temporal datasets.
      PubDate: 2021-03-03
  • Optimal representative sample weighting
    • Abstract: We consider the problem of assigning weights to a set of samples or data records, with the goal of achieving a representative weighting, which happens when certain sample averages of the data are close to prescribed values. We frame the problem of finding representative sample weights as an optimization problem, which in many cases is convex and can be efficiently solved. Our formulation includes as a special case the selection of a fixed number of the samples, with equal weights, i.e., the problem of selecting a smaller representative subset of the samples. While this problem is combinatorial and not convex, heuristic methods based on convex optimization seem to perform very well. We describe our open-source implementation rsw and apply it to a skewed sample of the CDC BRFSS dataset.
      PubDate: 2021-02-28
  • A wavelet-based approach for imputation in nonstationary multivariate time
    • Abstract: Many multivariate time series observed in practice are second order nonstationary, i.e. their covariance properties vary over time. In addition, missing observations in such data are encountered in many applications of interest, due to recording failures or sensor dropout, hindering successful analysis. This article introduces a novel method for data imputation in multivariate nonstationary time series, based on the so-called locally stationary wavelet modelling paradigm. Our methodology is shown to perform well across a range of simulation scenarios, with a variety of missingness structures, as well as being competitive in the stationary time series setting. We also demonstrate our technique on data arising in a health monitoring application.
      PubDate: 2021-02-17
      DOI: 10.1007/s11222-021-09998-2
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762

Your IP address:
Home (Search)
About JournalTOCs
News (blog, publications)
JournalTOCs on Twitter   JournalTOCs on Facebook

JournalTOCs © 2009-