Stat
   Hybrid journal (it can contain Open Access articles)
   ISSN (Online) 2049-1573
   Published by John Wiley and Sons
  • Issue Information
    • Pages: i - iii
      Abstract: No abstract is available for this article.
      PubDate: 2014-12-22T21:54:24.933098-05:
      DOI: 10.1002/sta4.72
  • Empirical Bayes estimation for the conditional extreme value model
    • Authors: Linyin Cheng; Eric Gilleland, Matthew J. Heaton, Amir AghaKouchak
      Pages: n/a - n/a
      Abstract: A new estimation strategy for estimating the parameters of the Heffernan and Tawn conditional extreme value model is proposed. The technique makes use of empirical Bayes estimation for the conditional likelihood that otherwise does not have a simple closed‐form expression. The approach is tested on simulations from different types of extreme dependence (and independence) structures, as well as for two real data cases consisting of precipitation analysis conditional on extreme temperature in Boulder, Colorado, and Los Angeles, California, USA. The strategy generally has good coverage when informative priors are used for one of the parameters, except for the independence case where the coverage is low until the sample size reaches about 50. Results for the precipitation and temperature data are found to be consistent with the semi‐non‐parametric strategy. The presented model can be potentially applied in a wide variety of science fields, especially in earth, environment and climate sciences. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-12-11T05:56:22.732917-05:
      DOI: 10.1002/sta4.71
  • A model‐free machine learning method for risk classification and survival probability prediction
    • Authors: Yuan Geng; Wenbin Lu, Hao Helen Zhang
      Pages: n/a - n/a
      Abstract: Risk classification and survival probability prediction are two major goals in survival data analysis because they play an important role in patients' risk stratification, long‐term diagnosis, and treatment selection. In this article, we propose a new model‐free machine learning framework for risk classification and survival probability prediction based on weighted support vector machines. The new procedure does not require any specific parametric or semiparametric model assumption on data and is therefore capable of capturing non‐linear covariate effects. We use numerous simulation examples to demonstrate finite sample performance of the proposed method under various settings. Applications to a glioma tumour data and a breast cancer gene‐expression survival data are shown to illustrate the new methodology in real data analysis. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-11-25T01:53:42.864484-05:
      DOI: 10.1002/sta4.67
  • Estimating the proportions in a mixed sample using transcriptomics
    • Authors: Bertrand Clarke; Jennifer Clarke
      Pages: n/a - n/a
      Abstract: Often, biomedical researchers take a tissue sample from a tumour that unintentionally contains tumour cells and stromal cells so that gene expression from the sample represents two distinct kinds of cells. This is tolerated because techniques for separating tumour and stromal cells are expensive and time‐consuming and the act of separation can alter gene expression. So, it is desirable to have a technique for estimating the proportion of tumour cells in a mixed sample to improve detection of differential expression in cancer cells. © 2014 The Authors. Stat published by John Wiley & Sons Ltd.
      PubDate: 2014-10-24T01:35:50.324081-05:
      DOI: 10.1002/sta4.65
  • Additive models for conditional copulas
    • Authors: Avideh Sabeti; Mian Wei, Radu V. Craiu
      Pages: n/a - n/a
      Abstract: Conditional copulas are flexible statistical tools that couple joint conditional and marginal conditional distributions. In a linear regression setting with more than one covariate and two dependent outcomes, we consider additive models for studying the dependence between covariates and the copula parameter. We examine the computation and model selection tools needed for Bayesian inference. The method is illustrated using simulations and a real example. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-10-06T23:38:11.384833-05:
      DOI: 10.1002/sta4.64
  • A note on approximating ABC-MCMC using flexible classifiers
    • Authors: Kim Cuc Pham; David J. Nott, Sanjay Chaudhuri
      Pages: n/a - n/a
      Abstract: A method for approximating Markov chain Monte Carlo algorithms is considered in the setting where the likelihood is intractable. The approach is based on interpreting the likelihood ratio in the Metropolis–Hastings acceptance probability as the odds in the Bayes classification rule for distinguishing whether the observed data were generated using the proposal parameter value or the current one. Approximating the Bayes rule using simulated data from the model and modern flexible classifiers capable of dealing with high-dimensional feature vectors results in new approximate Bayesian computation procedures that are able to perform well with high-dimensional summary statistics. In problems of small to moderate size, it may even be possible to dispense with summary statistics altogether. The synthetic likelihood of Wood corresponds to classification by quadratic discriminant analysis in this framework. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-06-25T00:39:02.926769-05:
      DOI: 10.1002/sta4.56
  • Surface boxplots
    • Authors: Marc G. Genton; Christopher Johnson, Kristin Potter, Georgiy Stenchikov, Ying Sun
      Pages: n/a - n/a
      Abstract: In this paper, we introduce a surface boxplot as a tool for visualization and exploratory analysis of samples of images. First, we use the notion of volume depth to order the images viewed as surfaces. In particular, we define the median image. We use an exact and fast algorithm for the ranking of the images. This allows us to detect potential outlying images that often contain interesting features not present in most of the images. Second, we build a graphical tool to visualize the surface boxplot and its various characteristics. A graph and histogram of the volume depth values allow us to identify images of interest. The code is available in the supporting information of this paper. We apply our surface boxplot to a sample of brain images and to a sample of climate model outputs. Copyright © 2014 John Wiley & Sons Ltd.
      PubDate: 2014-01-22T21:06:22.569842-05:
      DOI: 10.1002/sta4.39
  • A Gaussian compound decision bakeoff
    • Authors: Roger Koenker
      First page: 12
      Abstract: A non-parametric mixture model approach to empirical Bayes compound decisions for the Gaussian location model is compared with a parametric empirical Bayes approach recently suggested by Martin and Walker and several recent more formal Bayes procedures. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-01-26T22:05:27.774259-05:
      DOI: 10.1002/sta4.38
  • When are first-order asymptotics adequate? A diagnostic
    • Authors: Karim Anaya-Izquierdo; Frank Critchley, Paul Marriott
      First page: 17
      Abstract: This paper looks at boundary effects on inference in an important class of models including, notably, logistic regression. Asymptotic results are not uniform across such models. Accordingly, whatever their order, methods asymptotic in sample size will ultimately “break down” as the boundary is approached, in the sense that effects such as infinite skewness, discreteness and collinearity will dominate. In this paper, a highly interpretable diagnostic tool is proposed, allowing the analyst to check if the boundary is going to have an appreciable effect on standard inferential techniques. Copyright © 2014 John Wiley & Sons Ltd.
      PubDate: 2014-02-21T05:03:14.760846-05:
      DOI: 10.1002/sta4.40
  • A note on the tails of the GO-GARCH process
    • Authors: Nelson Muriel
      First page: 23
      Abstract: The possibility of modeling heavy tails using generalized autoregressive conditional heteroskedasticity (GARCH) models has been rigorously established in the univariate case, and the consequences that this heavy tailedness has on the distributional limits of the sample autocovariance function are well known. In the multivariate case, however, the results have not as yet been provided, and the asymptotic properties of the autocovariance function are not fully understood. In this note, we focus on the generalized orthogonal GARCH (GO-GARCH) model, a multivariate specification that has recently received some attention in the literature. We first show that marginal heavy tailedness is a simple consequence of the definition of the process and argue that all tail indexes should be equal. Next, we show that the finite-dimensional distributions of the GO-GARCH model possess some of the properties known to hold for univariate GARCH and comment on some implications of this fact, which are of practical import. Specifically, we show that the sample autocovariance function may either have a random limit or converge rather slowly to its population counterpart depending on how heavy the tails of the process are. Copyright © 2014 John Wiley & Sons Ltd.
      PubDate: 2014-02-24T11:52:48.803031-05:
      DOI: 10.1002/sta4.41
  • Quantile regression analysis of length-biased survival data
    • Authors: Huixia Judy Wang; Lan Wang
      First page: 31
      Abstract: Length-biased time-to-event data commonly arise in epidemiological cohort studies and cross-sectional surveys. Ignoring length-biased sampling often leads to severe bias in estimating the survival time in the general population. We propose a flexible quantile regression framework for analysing the covariate effects on the population survival time under both length-biased sampling and random censoring. This framework allows for easy interpretation of the statistical model. Furthermore, it allows the covariates to have different impacts at different tails of the survival distribution and thus is able to capture important population heterogeneity. Using an unbiased estimating equation approach, we develop a new estimator that allows the censoring variable to depend on covariates in a non-parametric way. We establish the consistency and asymptotic normality for the proposed estimator. A lack-of-fit test is proposed for diagnosing the adequacy of the population quantile regression model. The finite sample performance of the proposed methods is assessed through a simulation study. We demonstrate that the proposed method is effective in discovering interesting covariate effects by analysing the Canadian Study of Health and Aging dementia data. Copyright © 2014 John Wiley & Sons Ltd
      PubDate: 2014-03-05T02:06:48.16971-05:0
      DOI: 10.1002/sta4.42
  • Beyond axial symmetry: An improved class of models for global data
    • Authors: Stefano Castruccio; Marc G. Genton
      First page: 48
      Abstract: An important class of models for data on a spherical domain, called axially symmetric, assumes stationarity across longitudes but not across latitudes. The main aim of this work is to introduce a new and more flexible class of models by relaxing the assumption of longitudinal stationarity in the context of regularly gridded climate model output. In this investigation, two other related topics are discussed: the lack of fit of an axially symmetric parametric model compared with a non-parametric model and to longitudinally reversible processes, an important subclass of axially symmetric models. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-03-13T00:35:32.513891-05:
      DOI: 10.1002/sta4.44
  • Modulation of symmetry for discrete variables and some extensions
    • Authors: Adelchi Azzalini; Giuliana Regoli
      First page: 56
      Abstract: Substantial work has been dedicated in recent years to the construction of families of continuous distributions obtained by applying a modulation factor to a base symmetric density so as to obtain non-symmetric variant forms, often denoted skew-symmetric distributions. All this development has dealt with the case of continuous variables, while here we extend the formulation to the discrete case; moreover, some of the statements are of general validity. The results are illustrated with an application to the distribution of the score difference in sport matches. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-03-20T05:08:58.68592-05:0
      DOI: 10.1002/sta4.45
  • A mixture of common skew-t factor analysers
    • Authors: Paula M. Murray; Paul D. McNicholas, Ryan P. Browne
      First page: 68
      Abstract: A mixture of common skew-t factor analysers model is introduced for model-based clustering of high-dimensional data. By assuming common factors, this model allows clustering to be performed in the presence of a large number of mixture components or when the number of dimensions is too large to be well modelled by the mixture of factor analysers model or a variant thereof. Furthermore, assuming that the component densities follow a skew-t distribution allows robust clustering of data with asymmetric clusters. This paper is the first time that skewed common factors have been used, and it marks an important step in robust clustering and classification of high-dimensional data. The alternating expectation–conditional maximization algorithm is employed for parameter estimation. We demonstrate excellent clustering performance when our mixture of common skew-t factor analysers model is applied to real and simulated data. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-03-27T01:48:18.491063-05:
      DOI: 10.1002/sta4.43
  • Propensity score estimation in the presence of length-biased sampling: a non-parametric adjustment approach
    • Authors: Ashkan Ertefaie; Masoud Asgharian, David Stephens
      First page: 83
      Abstract: The pervasive use of prevalent cohort studies on disease duration increasingly calls for an appropriate methodology to account for the biases that invariably accompany samples formed by such data. It is well known, for example, that subjects with shorter lifetime are less likely to be present in such studies. Moreover, certain covariate values could be preferentially selected into the sample, being linked to the long-term survivors. The existing methodology for estimating the propensity score using data collected on prevalent cases requires the correct conditional survival/hazard function given the treatment and covariates. This requirement can be alleviated if the disease under study has stationary incidence, the so-called stationarity assumption. We propose a non-parametric adjustment technique based on a weighted estimating equation for estimating the propensity score, which does not require modeling the conditional survival/hazard function when the stationarity assumption holds. The estimator's large-sample properties are established, and its small-sample behavior is studied via simulation. The estimated propensity score is utilized to estimate the survival curves. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-03-27T01:40:37.893836-05:
      DOI: 10.1002/sta4.46
  • Right and left kurtosis measures: large sample estimation and an application to financial returns
    • Authors: Anna Maria Fiori; Davide Beltrami
      First page: 95
      Abstract: Although the standard fourth moment coefficient is routinely computed as “the kurtosis” of a distribution, the measure is not easily interpreted and has been a subject of considerable debate in statistical literature. The financial community has recently joined in the debate, calling for more robust estimators of kurtosis in distributions of stock market returns. For these reasons, we here consider alternative measures of right and left kurtosis, which arise from a recent characterization of kurtosis as inequality at either side of the median. Based on Gini's coefficient of concentration, the new measures apply to both symmetric and asymmetric distributions, their interpretation is clear and they are consistent with common risk perceptions of investors and risk managers. In this contribution, we show that the theory of L-statistics provides a natural framework for the construction of empirical estimators of the proposed measures and the derivation of their asymptotic properties under mild moment requirements. A real data example illustrates the potential of these estimators in financial contexts, in which the existence of higher moments is still an open question. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-04-02T01:10:08.144316-05:
      DOI: 10.1002/sta4.48
  • Bayesian sparse graphical models and their mixtures
    • Authors: Rajesh Talluri; Veerabhadran Baladandayuthapani, Bani K. Mallick
      First page: 109
      Abstract: We propose Bayesian methods for Gaussian graphical models that lead to sparse and adaptively shrunk estimators of the precision (inverse covariance) matrix. Our methods are based on lasso-type regularization priors leading to parsimonious parameterization of the precision matrix, which is essential in several applications involving learning relationships among the variables. In this context, we introduce a novel type of selection prior that develops a sparse structure on the precision matrix by making most of the elements exactly zero, in addition to ensuring positive definiteness—thus conducting model selection and estimation simultaneously. More importantly, we extend these methods to analyse clustered data using finite mixtures of Gaussian graphical models and infinite mixtures of Gaussian graphical models. We discuss appropriate posterior simulation schemes to implement posterior inference in the proposed models, including the evaluation of normalizing constants that are functions of parameters of interest, which result from the restriction of positive definiteness on the correlation matrix. We evaluate the operating characteristics of our method via several simulations and demonstrate the application to real-data examples in genomics. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-04-24T00:28:40.253451-05:
      DOI: 10.1002/sta4.49
  • Multilevel sparse functional principal component analysis
    • Authors: Chongzhi Di; Ciprian M. Crainiceanu, Wolfgang S. Jank
      First page: 126
      Abstract: We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis was proposed recently for such data when functions are densely recorded. Here, we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between-subject and within-subject levels. We address inherent methodological differences in the sparse sampling context to: (i) estimate the covariance operators; (ii) estimate the functional principal component scores; and (iii) predict the underlying curves. Through simulations, the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions. Copyright © 2014 John Wiley & Sons, Ltd
      PubDate: 2014-04-24T00:27:51.75543-05:0
      DOI: 10.1002/sta4.50
  • Classification of non-stationary time series
    • Authors: Karolina Krzemieniewska; Idris A. Eckley, Paul Fearnhead
      Pages: 144 - 157
      Abstract: In this paper we consider the problem of classifying non-stationary time series. The method that we introduce is based on the locally stationary wavelet paradigm and seeks to take account of the fact that there may be within-class variation in the signals being analysed. Specifically, we seek to identify the most stable spectral coefficients within each training group and use these to classify a new, previously unseen, time series. In both simulated examples and an aerosol spray example provided by an industrial collaborator, our approach is found to yield superior classification performance when compared against the current state of the art. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-05-18T21:47:12.854853-05:
      DOI: 10.1002/sta4.51
  • Probability of success: estimation framework, properties and applications
    • Authors: Peter Hu
      Pages: 158 - 171
      Abstract: Probability of success (PoS), defined as Bayesian expected power, has drawn more and more attention as an alternative metric complementary to the conventional conditional power for study planning. This paper describes an estimation framework for PoS on the basis of a proposed joint posterior distribution for the location and scale parameters of an effect of interest, followed by illustrations of how to estimate this quantity efficiently. Some features of this PoS framework are disclosed via the affiliated settings with non-informative prior. The upper limit of PoS, obtained when sample size approaches infinity, is derived in closed form. Three applications of this framework are given to demonstrate the benefits of using the concept of PoS in strategic planning of a confirmatory study and in interim monitoring of drug effectiveness and, as lessons learnt, how a non-inferiority study could be powered appropriately and how change of trend to achieving non-inferiority could be tracked. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-06-02T21:10:25.02844-05:0
      DOI: 10.1002/sta4.52
  • Voxelwise single-subject analysis of imaging metabolic response to therapy in neuro-oncology
    • Authors: Mengye Guo; Jeffrey T. Yap, Annick D. Van den Abbeele, Nancy U. Lin, Armin Schwartzman
      Pages: 172 - 186
      Abstract: F-18-Fluorodeoxyglucose positron emission tomography (FDG-PET) has been used to evaluate the metabolic response of metastatic brain tumours to treatment by comparing their tumour glucose metabolism before and after treatment. The standard analysis based on regions-of-interest has the advantage of simplicity. However, it is by definition restricted to those regions and is subject to observer variability. In addition, the observed changes in tumour metabolism are often confounded by normal changes in the tissue background, which can be heterogenous. We propose an analysis pipeline for automatically detecting the change at each voxel in the entire brain of a single subject, while adjusting for changes in the background. The complete analysis includes image registration, segmentation, a hierarchical model for background adjustment and voxelwise statistical comparisons. We demonstrate the method's ability to identify areas of tumour response and/or progression in two subjects enrolled in a clinical trial using FDG-PET to evaluate lapatinib for the treatment of brain metastases in breast cancer patients. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-06-10T20:49:22.686867-05:
      DOI: 10.1002/sta4.53
  • Adaptation in some linear inverse problems
    • Authors: Iain M. Johnstone; Debashis Paul
      Pages: 187 - 199
      Abstract: We consider the linear inverse problem of estimating an unknown signal f from noisy measurements on Kf where the linear operator K admits a wavelet–vaguelette decomposition. We formulate the problem in the Gaussian sequence model and propose estimation based on complexity penalized regression on a level-by-level basis. We adopt squared error loss and show that the estimator achieves exact rate-adaptive optimality as f varies over a wide range of the Besov function classes. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-06-17T02:14:04.67599-05:0
      DOI: 10.1002/sta4.54
  • Spline approximations to conditional Archimedean copula
    • Authors: Philippe Lambert
      Pages: 200 - 217
      Abstract: We propose a flexible copula model to describe changes with a covariate in the dependence structure of (conditionally exchangeable) random variables. The starting point is a spline approximation to the generator of an Archimedean copula. Changes in the dependence structure with a covariate x are modelled by flexible regression of the spline coefficients on x. The performances and properties of the spline estimate of the reference generator and the abilities of these conditional models to approximate conditional copulas are studied through extensive simulations. Inference is made using Bayesian arguments with posterior distributions explored using importance sampling or adaptive Markov chain Monte Carlo algorithms. The modelling strategy is illustrated with the analysis of bivariate growth curve data. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-06-18T21:55:20.870959-05:
      DOI: 10.1002/sta4.55
  • A comparison of a traditional geostatistical regression approach and a general Gaussian process approach for spatial prediction
    • Authors: Stanley Leung; Daniel Cooley
      Pages: 228 - 239
      Abstract: A typical goal of a geostatistical analysis is to perform spatial prediction. Geostatistical models traditionally take the form of regression models with spatially correlated errors, employing covariate information in the mean function. The random process in such models is indexed in terms of the locations of the study region. The machine learning community employs Gaussian processes for many tasks, and these processes may be indexed on information that statisticians would term as covariates. We compare the predictive ability of a traditional geostatistical model with that of a non‐traditional Gaussian process model, which has a constant mean and whose random process is indexed on orographic covariates. The models we compare are both simply parametrized, fit with straightforward inference methods, and use classical kriging predictors. The non‐traditional model achieves non‐stationarity in the geographic space. We apply the two models to soil moisture data sets, which have extensive spatial sampling, and we apply the models to six separate days of data. In terms of quantitative measures, we find that the models' predictive abilities are comparable, with the non‐traditional outperforming the traditional model perhaps slightly. Qualitatively, the non‐traditional model's predictions may be more closely linked to the orographic covariates. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-07-07T21:50:30.828242-05:
      DOI: 10.1002/sta4.57
  • Joint density of eigenvalues in spiked multivariate models
    • Authors: Prathapasinghe Dharmawansa; Iain M. Johnstone
      Pages: 240 - 249
      Abstract: The classical methods of multivariate analysis are based on the eigenvalues of one or two sample covariance matrices. In many applications of these methods, for example, to high‐dimensional data, it is natural to consider alternative hypotheses that are a low‐rank departure from the null hypothesis. For rank 1 alternatives, this note provides a representation for the joint eigenvalue density in terms of a single contour integral. This will be of use for deriving approximate distributions for likelihood ratios and “linear” statistics used in testing. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-07-24T01:50:44.689391-05:
      DOI: 10.1002/sta4.58
  • Data smoothing and end correction using entropic kernel compression
    • Authors: Thomas Bläsche; Roger J. Bowden, Peter N. Posch
      Pages: 250 - 257
      Abstract: Kernel smoothing and filtering techniques are undemanding in their data generation assumptions but have limitations where special interest attaches to more recent observations. A methodology is developed that addresses contingencies such as end correction and the kernel term structure within the same technology, namely scale invariant kernel compression. The framework is built around an entropic transformation of the standard uniform moving average, augmented with kernel compressions utilizing entropic weight redistribution. The techniques are illustrated with data drawn from climate change. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-08-06T01:59:44.209303-05:
      DOI: 10.1002/sta4.59
  • Bayesian feature selection in high‐dimensional regression in presence of correlated noise
    • Authors: Guy Feldman; Anindya Bhadra, Sergey Kirshner
      Pages: 258 - 272
      Abstract: We consider the problem of feature selection in a high‐dimensional multiple predictors, multiple responses regression setting. Assuming that regression errors are i.i.d. when they are in fact dependent leads to inconsistent and inefficient feature estimates. We relax the i.i.d. assumption by allowing the errors to exhibit a tree‐structured dependence. This allows a Bayesian problem formulation with the error dependence structure treated as an auxiliary variable that can be integrated out analytically with the help of the matrix‐tree theorem. Mixing over trees results in a flexible technique for modelling the graphical structure for the regression errors. Furthermore, the analytic integration results in a collapsed Gibbs sampler for feature selection that is computationally efficient. Our approach offers significant performance gains over the competing methods in simulations, especially when the features themselves are correlated. In addition to comprehensive simulation studies, we apply our method to a high‐dimensional breast cancer data set to identify markers significantly associated with the disease. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-08-15T00:47:03.560521-05:
      DOI: 10.1002/sta4.60
  • Retrospective sampling in MCMC with an application to COM‐Poisson
    • Authors: Charalampos Chanialidis; Ludger Evers, Tereza Neocleous, Agostino Nobile
      Pages: 273 - 290
      Abstract: The normalization constant in the distribution of a discrete random variable may not be available in closed form; in such cases, the calculation of the likelihood can be computationally expensive. Approximations of the likelihood or approximate Bayesian computation methods can be used; but the resulting Markov chain Monte Carlo (MCMC) algorithm may not sample from the target of interest. In certain situations, one can efficiently compute lower and upper bounds on the likelihood. As a result, the target density and the acceptance probability of the Metropolis–Hastings algorithm can be bounded. We propose an efficient and exact MCMC algorithm based on the idea of retrospective sampling. This procedure can be applied to a number of discrete distributions, one of which is the Conway–Maxwell–Poisson distribution. In practice, the bounds on the acceptance probability do not need to be particularly tight in order to accept or reject a move. We demonstrate this method using data on the emergency hospital admissions in Scotland in 2010, where the main interest lies in the estimation of the variability of admissions, as it is considered as a proxy for health inequalities. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-08-18T22:01:36.249069-05:
      DOI: 10.1002/sta4.61
  • Weighted Gibbs sampling for mixture modelling of massive datasets via
    • Authors: Clare Anne McGrory; Daniel C. Ahfock, Joshua A. Horsley, Clair L. Alston
      Pages: 291 - 299
      Abstract: Massive datasets are increasingly encountered in modern research applications, and this presents tremendous new challenges for statisticians. In settings where the aim is to classify or cluster data via finite mixture modelling, such as in satellite image analysis, the large number of data points to be analysed can make fitting such models either infeasible, or simply too time‐consuming to be of practical use. It has been shown that using a representative weighted subsample of the complete dataset to estimate mixture model parameters can lead to much more time‐efficient and yet still reasonable inference. These representative subsamples are called coresets. Naturally, these coresets have to be constructed carefully as the naive approach of performing simple uniform sampling from the dataset could lead to smaller clusters of points within the dataset being severely undersampled, and this would in turn result in very unreliable inference. It has previously been shown that an adaptive sampling approach can be used to obtain a representative coreset of data points together with a corresponding set of coreset weights. In this article, we explore how this idea can be incorporated into a Gibbs sampling algorithm for mixture modelling of image data via coresets within a Bayesian framework. We call the resulting algorithm a Weighted Gibbs Sampler. We will illustrate this proposed approach through an application to remote sensing of land use from satellite imagery. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-08-22T04:07:41.375515-05:
      DOI: 10.1002/sta4.62
  • Efficient sampling of Gaussian graphical models using conditional Bayes
    • Authors: Max Hinne; Alex Lenkoski, Tom Heskes, Marcel Gerven
      Pages: 326 - 336
      Abstract: Bayesian estimation of Gaussian graphical models has proven to be challenging because the conjugate prior distribution on the Gaussian precision matrix, the G‐Wishart distribution, has a doubly intractable partition function. Recent developments provide a direct way to sample from the G‐Wishart distribution, which allows for more efficient algorithms for model selection than previously possible. Still, estimating Gaussian graphical models with more than a handful of variables remains a nearly infeasible task. Here, we propose two novel algorithms that use the direct sampler to more efficiently approximate the posterior distribution of the Gaussian graphical model. The first algorithm uses conditional Bayes factors to compare models in a Metropolis–Hastings framework. The second algorithm is based on a continuous time Markov process. We show that both algorithms are substantially faster than state‐of‐the‐art alternatives. Finally, we show how the algorithms may be used to simultaneously estimate both structural and functional connectivity between subcortical brain regions using resting‐state functional magnetic resonance imaging. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-11-17T01:08:47.629168-05:
      DOI: 10.1002/sta4.66
  • White noise testing using wavelets
    • Authors: Guy P. Nason; Delyan Savchev
      Pages: 351 - 362
      Abstract: Testing whether a time series is consistent with white noise is an important task within time series analysis and for model fitting and criticism via residual diagnostics. We introduce three fast and efficient white noise tests that assess spectral constancy via the wavelet coefficients of a periodogram. The Haar wavelet white noise test derives the exact distribution of the Haar wavelet coefficients of the asymptotic periodogram under mild conditions. The single‐coefficient white noise test uses a single Haar wavelet coefficient obtaining a test statistic as a linear combination of odd‐indexed autocorrelations. The general wavelet white noise test uses compactly supported Daubechies wavelets, shows that its coefficients are asymptotically normal and derives its theoretical power for an arbitrary spectrum. All our tests are available in the freely available hwwntest package for the R system. We present a comprehensive simulation study that shows the good performance of our new tests against alternatives commonly found in available software and show an example applied to a wind power time series. © 2014 The Authors. Stat published by John Wiley & Sons Ltd.
      PubDate: 2014-12-04T01:20:51.202094-05:
      DOI: 10.1002/sta4.69
  • Bayesian variable selection in generalized additive partial linear models
    • Authors: Sayantan Banerjee; Subhashis Ghosal
      Pages: 363 - 378
      Abstract: Variable selection in regression models has been well studied in the literature, with many non‐Bayesian and Bayesian methods available in this regard. An important class of regression models is generalized linear models, which involve situations where the response variable is discrete. To add more flexibility, generalized additive partial linear models can be considered, where some predictors can have a non‐linear effect while some predictors have a strictly linear effect. We consider Bayesian variable selection in these models. The functions in the non‐parametric additive part of the model are expanded in a B‐spline basis and multivariate Laplace prior put on the coefficients with point mass at zero. The coefficients corresponding to the strictly linear components are assigned a univariate Laplace prior with point mass at zero. The prior times the likelihood is mathematically intractable, but we find an approximation by expansion around the posterior mode, which is the group lasso solution in generalized linear model setting for the choice of prior. We thus completely avoid Markov chain Monte Carlo methods, which are extremely slow and unreliable in high‐dimensional models. We evaluate the performance of the Bayesian method by conducting simulation studies and real data analyses. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-12-07T23:53:32.346085-05:
      DOI: 10.1002/sta4.70
  • Correlation bias correction in two‐way fixed‐effects linear
    • Authors: Simen Gaure
      Pages: 379 - 390
      Abstract: When doing two‐way fixed‐effects ordinary least squares estimations, both the variances and covariance of the fixed effects are biased. A formula for a bias correction is known, but in large datasets, it involves inverses of impractically large matrices. We detail how to compute the bias correction in this case. Copyright © 2014 John Wiley & Sons, Ltd.
      PubDate: 2014-12-09T00:10:35.190254-05:
      DOI: 10.1002/sta4.68
  • Wiley‐Blackwell Announces Launch of Stat – The ISI's Journal for the Rapid Dissemination of Statistics Research
    • Pages: n/a - n/a
      PubDate: 2012-04-17T04:34:14.600281-05:
      DOI: 10.1002/sta4.1