Subjects -> MATHEMATICS (Total: 1013 journals)
    - APPLIED MATHEMATICS (92 journals)
    - GEOMETRY AND TOPOLOGY (23 journals)
    - MATHEMATICS (714 journals)
    - MATHEMATICS (GENERAL) (45 journals)
    - NUMERICAL ANALYSIS (26 journals)

PROBABILITIES AND MATH STATISTICS (113 journals)                     

Showing 1 - 98 of 98 Journals sorted alphabetically
Advances in Statistics     Open Access   (Followers: 10)
Afrika Statistika     Open Access   (Followers: 1)
American Journal of Applied Mathematics and Statistics     Open Access   (Followers: 11)
American Journal of Mathematics and Statistics     Open Access   (Followers: 9)
Annals of Data Science     Hybrid Journal   (Followers: 14)
Annual Review of Statistics and Its Application     Full-text available via subscription   (Followers: 7)
Applied Medical Informatics     Open Access   (Followers: 11)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
Asian Journal of Probability and Statistics     Open Access  
Austrian Journal of Statistics     Open Access   (Followers: 4)
Biostatistics & Epidemiology     Hybrid Journal   (Followers: 4)
Cadernos do IME : Série Estatística     Open Access  
Calcutta Statistical Association Bulletin     Hybrid Journal  
Communications in Mathematics and Statistics     Hybrid Journal   (Followers: 4)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics: Case Studies, Data Analysis and Applications     Hybrid Journal  
Comunicaciones en Estadística     Open Access  
Econometrics and Statistics     Hybrid Journal   (Followers: 1)
Forecasting     Open Access   (Followers: 1)
Foundations and Trends® in Optimization     Full-text available via subscription   (Followers: 3)
Frontiers in Applied Mathematics and Statistics     Open Access   (Followers: 1)
Game Theory     Open Access   (Followers: 2)
Geoinformatics & Geostatistics     Hybrid Journal   (Followers: 14)
Geomatics, Natural Hazards and Risk     Open Access   (Followers: 13)
Indonesian Journal of Applied Statistics     Open Access  
International Game Theory Review     Hybrid Journal   (Followers: 1)
International Journal of Advanced Statistics and IT&C for Economics and Life Sciences     Open Access  
International Journal of Advanced Statistics and Probability     Open Access   (Followers: 7)
International Journal of Algebra and Statistics     Open Access   (Followers: 3)
International Journal of Applied Mathematics and Statistics     Full-text available via subscription   (Followers: 3)
International Journal of Ecological Economics and Statistics     Full-text available via subscription   (Followers: 5)
International Journal of Energy and Statistics     Hybrid Journal   (Followers: 3)
International Journal of Game Theory     Hybrid Journal   (Followers: 3)
International Journal of Mathematics and Statistics     Full-text available via subscription   (Followers: 2)
International Journal of Multivariate Data Analysis     Hybrid Journal  
International Journal of Probability and Statistics     Open Access   (Followers: 4)
International Journal of Statistics & Economics     Full-text available via subscription   (Followers: 6)
International Journal of Statistics and Applications     Open Access   (Followers: 2)
International Journal of Statistics and Probability     Open Access   (Followers: 3)
International Journal of Statistics in Medical Research     Hybrid Journal   (Followers: 5)
International Journal of Testing     Hybrid Journal   (Followers: 1)
Iraqi Journal of Statistical Sciences     Open Access  
Japanese Journal of Statistics and Data Science     Hybrid Journal  
Journal of Biometrics & Biostatistics     Open Access   (Followers: 4)
Journal of Cost Analysis and Parametrics     Hybrid Journal   (Followers: 5)
Journal of Environmental Statistics     Open Access   (Followers: 4)
Journal of Game Theory     Open Access   (Followers: 1)
Journal of Mathematical Economics and Finance     Full-text available via subscription  
Journal of Mathematics and Statistics Studies     Open Access  
Journal of Modern Applied Statistical Methods     Open Access   (Followers: 1)
Journal of Official Statistics     Open Access   (Followers: 2)
Journal of Quantitative Economics     Hybrid Journal  
Journal of Social and Economic Statistics     Open Access  
Journal of Statistical Theory and Practice     Hybrid Journal   (Followers: 2)
Journal of Statistics and Data Science Education     Open Access   (Followers: 2)
Journal of Survey Statistics and Methodology     Hybrid Journal   (Followers: 4)
Journal of the Indian Society for Probability and Statistics     Full-text available via subscription  
Jurnal Biometrika dan Kependudukan     Open Access  
Jurnal Ekonomi Kuantitatif Terapan     Open Access  
Jurnal Sains Matematika dan Statistika     Open Access  
Lietuvos Statistikos Darbai     Open Access  
Mathematics and Statistics     Open Access   (Followers: 2)
Methods, Data, Analyses     Open Access   (Followers: 1)
METRON     Hybrid Journal   (Followers: 1)
Nepalese Journal of Statistics     Open Access  
North American Actuarial Journal     Hybrid Journal   (Followers: 1)
Open Journal of Statistics     Open Access   (Followers: 3)
Open Mathematics, Statistics and Probability Journal     Open Access  
Pakistan Journal of Statistics and Operation Research     Open Access   (Followers: 1)
Physica A: Statistical Mechanics and its Applications     Hybrid Journal   (Followers: 6)
Probability, Uncertainty and Quantitative Risk     Open Access   (Followers: 2)
Ratio Mathematica     Open Access  
Research & Reviews : Journal of Statistics     Open Access   (Followers: 3)
Revista Brasileira de Biometria     Open Access  
Revista Colombiana de Estadística     Open Access  
RMS : Research in Mathematics & Statistics     Open Access  
Romanian Statistical Review     Open Access  
Sankhya B - Applied and Interdisciplinary Statistics     Hybrid Journal  
SIAM Journal on Mathematics of Data Science     Hybrid Journal   (Followers: 1)
SIAM/ASA Journal on Uncertainty Quantification     Hybrid Journal   (Followers: 2)
Spatial Statistics     Hybrid Journal   (Followers: 2)
Sri Lankan Journal of Applied Statistics     Open Access  
Stat     Hybrid Journal   (Followers: 1)
Stata Journal     Full-text available via subscription   (Followers: 8)
Statistica     Open Access   (Followers: 6)
Statistical Analysis and Data Mining     Hybrid Journal   (Followers: 23)
Statistical Theory and Related Fields     Hybrid Journal  
Statistics and Public Policy     Open Access   (Followers: 4)
Statistics in Transition New Series : An International Journal of the Polish Statistical Association     Open Access  
Statistics Research Letters     Open Access   (Followers: 1)
Statistics, Optimization & Information Computing     Open Access   (Followers: 3)
Stats     Open Access  
Synthesis Lectures on Mathematics and Statistics     Full-text available via subscription   (Followers: 1)
Theory of Probability and its Applications     Hybrid Journal   (Followers: 2)
Theory of Probability and Mathematical Statistics     Full-text available via subscription   (Followers: 2)
Turkish Journal of Forecasting     Open Access   (Followers: 1)
VARIANSI : Journal of Statistics and Its application on Teaching and Research     Open Access  
Zeitschrift für die gesamte Versicherungswissenschaft     Hybrid Journal  



  This is an Open Access journal
ISSN (Online) 2571-905X
Published by MDPI  [84 journals]
  • Stats, Vol. 5, Pages 583-605: Quantile Regression Approach for Analyzing
           Similarity of Gene Expressions under Multiple Biological Conditions

    • Authors: Dianliang Deng, Mashfiqul Huq Chowdhury
      First page: 583
      Abstract: Temporal gene expression data contain ample information to characterize gene function and are now widely used in biomedical research. A dense temporal gene expression usually shows various patterns in expression levels under different biological conditions. The existing literature investigates the gene trajectory using the mean function. However, temporal gene expression curves usually show a strong degree of heterogeneity under multiple conditions. As a result, the rates of change in gene expression may differ at non-central locations, and a mean-function model may not capture the non-central locations of the gene expression distribution. Further, the mean regression model depends on normality assumptions for its error terms, which may be impractical when analyzing gene expression data. In this research, a linear quantile mixed model is used to find the trajectory of gene expression data. This method enables the changes in gene expression over time to be studied by estimating a family of quantile functions. A statistical test is proposed to test the similarity between two different gene expressions based on the parameters estimated using the quantile model. The performance of the proposed test statistic is then examined using extensive simulation studies, which demonstrate its good statistical performance and show that the method is robust against violations of the normal error assumption. As an illustration, the proposed method is applied to analyze a dataset of 18 genes in P. aeruginosa, expressed in 24 biological conditions. Furthermore, a minimum Mahalanobis distance is used to find the clustering tree for gene expressions.
      Citation: Stats
      PubDate: 2022-07-02
      DOI: 10.3390/stats5030036
      Issue No: Vol. 5, No. 3 (2022)
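
As a brief aside on the method this abstract describes: quantile estimation rests on the check (pinball) loss, whose minimiser over candidates is the sample τ-quantile. The few lines below sketch only that building block; the data and helper names are hypothetical illustrations, not material from the paper.

```python
# Sketch of the check (pinball) loss underlying quantile estimation.
# The sample tau-quantile minimises the average check loss over candidates.
def check_loss(residual, tau):
    """rho_tau(u) = u * (tau - 1{u < 0})."""
    return residual * (tau - (1 if residual < 0 else 0))

def sample_quantile(values, tau):
    """Return the observed value minimising the total check loss."""
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))

expressions = [1.0, 2.0, 3.0, 4.0, 10.0]    # toy expression levels (hypothetical)
median = sample_quantile(expressions, 0.5)  # the median is robust to the outlier 10.0
upper = sample_quantile(expressions, 0.9)   # upper quantiles track the tail instead
```

Unlike a mean fit, varying τ traces out a family of such quantile fits, which is what lets the model describe non-central locations of the expression distribution.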
  • Stats, Vol. 5, Pages 339-357: A Bootstrap Variance Estimation Method for
           Multistage Sampling and Two-Phase Sampling When Poisson Sampling Is Used
           at the Second Phase

    • Authors: Jean-François Beaumont, Nelson Émond
      First page: 339
      Abstract: The bootstrap method is often used for variance estimation in sample surveys with a stratified multistage sampling design. It is typically implemented by producing a set of bootstrap weights that is made available to users and that accounts for the complexity of the sampling design. The Rao–Wu–Yue method is often used to produce the required bootstrap weights. It is valid under stratified with-replacement sampling at the first stage or fixed-size without-replacement sampling provided the first-stage sampling fractions are negligible. Some surveys use designs that do not satisfy these conditions. We propose a simple and unified bootstrap method that addresses this limitation of the Rao–Wu–Yue bootstrap weights. This method is applicable to any multistage sampling design as long as valid bootstrap weights can be produced for each distinct stage of sampling. Our method is also applicable to two-phase sampling designs provided that Poisson sampling is used at the second phase. We use this design to model survey nonresponse and derive bootstrap weights that account for nonresponse weighting. The properties of our bootstrap method are evaluated in three limited simulation studies.
      Citation: Stats
      PubDate: 2022-03-22
      DOI: 10.3390/stats5020019
      Issue No: Vol. 5, No. 2 (2022)
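
The Rao–Wu–Yue bootstrap weights mentioned above can be sketched for a single stratum sampled with replacement at the first stage: resample n−1 of the n PSUs with replacement and rescale each design weight by n/(n−1) times its selection multiplicity. The PSU weights and totals below are hypothetical toy values, not survey data.

```python
import random

def rao_wu_yue_weights(psu_weights, rng):
    """One set of bootstrap weights for a stratum of n PSUs:
    draw n-1 PSUs with replacement, rescale by n/(n-1) * multiplicity."""
    n = len(psu_weights)
    counts = [0] * n
    for _ in range(n - 1):
        counts[rng.randrange(n)] += 1
    return [w * (n / (n - 1)) * m for w, m in zip(psu_weights, counts)]

rng = random.Random(12345)
weights = [10.0, 10.0, 10.0, 10.0]   # hypothetical PSU design weights
totals = [5.0, 7.0, 6.0, 8.0]        # hypothetical PSU totals
point = sum(w * t for w, t in zip(weights, totals))  # point estimate: 260

boots = []
for _ in range(2000):
    bw = rao_wu_yue_weights(weights, rng)
    boots.append(sum(w * t for w, t in zip(bw, totals)))
mean_boot = sum(boots) / len(boots)
var_boot = sum((b - mean_boot) ** 2 for b in boots) / (len(boots) - 1)
```

With equal design weights the total bootstrap weight is preserved exactly in every replicate, and the bootstrap mean of the estimated total centres on the point estimate; the paper's contribution is precisely the designs where this simple recipe is no longer valid.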
  • Stats, Vol. 5, Pages 358-370: Multiple Imputation of Composite Covariates
           in Survival Studies

    • Authors: Lily Clements, Alan C. Kimber, Stefanie Biedermann
      First page: 358
      Abstract: Missing covariate values are a common problem in survival studies, and the method of choice when handling such incomplete data is often multiple imputation. However, it is not obvious how this can be used most effectively when an incomplete covariate is a function of other covariates. For example, body mass index (BMI) is the ratio of weight and height-squared. In this situation, the following question arises: Should a composite covariate such as BMI be imputed directly, or is it advantageous to impute its constituents, weight and height, first and to construct BMI afterwards? We address this question through a carefully designed simulation study that compares various approaches to multiple imputation of composite covariates in a survival context. We discuss advantages and limitations of these approaches for various types of missingness and imputation models. Our results are a first step towards providing much needed guidance to practitioners for analysing their incomplete survival data effectively.
      Citation: Stats
      PubDate: 2022-03-29
      DOI: 10.3390/stats5020020
      Issue No: Vol. 5, No. 2 (2022)
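
The question this abstract poses (impute the composite directly, or its constituents first?) can be made concrete with a deliberately crude single-imputation sketch: even simple mean imputation gives different answers down the two routes, because BMI is a nonlinear function of its constituents. The data are hypothetical, and the paper's actual study uses multiple imputation, not this toy.

```python
# Toy data: (weight kg, height m) for subjects with both components observed.
complete = [(70.0, 1.75), (80.0, 1.80), (60.0, 1.60)]
mean_w = sum(w for w, _ in complete) / len(complete)
mean_h = sum(h for _, h in complete) / len(complete)
mean_bmi = sum(w / h ** 2 for w, h in complete) / len(complete)

# Route 1: impute weight and height, then construct BMI = weight / height^2.
bmi_from_components = mean_w / mean_h ** 2
# Route 2: impute the composite BMI directly.
bmi_direct = mean_bmi
```

The two imputed values disagree (roughly 23.75 vs. 23.66 here), which is exactly why the choice between the strategies needs the kind of systematic comparison the paper carries out.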
  • Stats, Vol. 5, Pages 371-384: ordinalbayes: Fitting Ordinal Bayesian
           Regression Models to High-Dimensional Data Using R

    • Authors: Kellie J. Archer, Anna Eames Seffernick, Shuai Sun, Yiran Zhang
      First page: 371
      Abstract: The stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection for over-parameterized datasets, such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which enhances the runjags R package by enabling users to easily fit cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size, P>N, such as for TCGA and other high-throughput genomic data. We demonstrate the use of this package by applying it to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets, and it effectively performs variable selection.
      Citation: Stats
      PubDate: 2022-04-15
      DOI: 10.3390/stats5020021
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 385-400: Some Empirical Results on Nearest-Neighbour
           Pseudo-populations for Resampling from Spatial Populations

    • Authors: Sara Franceschi, Rosa Maria Di Biase, Agnese Marcelli, Lorenzo Fattorini
      First page: 385
      Abstract: In finite populations, pseudo-population bootstrap is the sole method preserving the spirit of the original bootstrap performed from iid observations. In spatial sampling, theoretical results about the convergence of bootstrap distributions to the actual distributions of estimators are lacking, owing to the failure of spatially balanced sampling designs to converge to the maximum entropy design. In addition, the issue of creating pseudo-populations able to mimic the characteristics of real populations is challenging in spatial frameworks, where spatial trends, relationships, and similarities among neighbouring locations are invariably present. In this paper, we propose the use of nearest-neighbour interpolation of spatial populations for constructing pseudo-populations that converge to the real populations under mild conditions. The effectiveness of these proposals with respect to traditional pseudo-populations is empirically checked by a simulation study.
      Citation: Stats
      PubDate: 2022-04-15
      DOI: 10.3390/stats5020022
      Issue No: Vol. 5, No. 2 (2022)
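
The core construction above, nearest-neighbour interpolation of a sampled spatial population, can be sketched for a hypothetical one-dimensional population (the paper's setting is more general): every unsampled location inherits the value observed at its nearest sampled location.

```python
# Hypothetical 1-D spatial population: build a pseudo-population by
# nearest-neighbour interpolation from the sampled locations.
locations = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # all population locations
sampled = {1.0: 10.0, 4.0: 40.0}            # sampled location -> observed value

def nn_value(x, sample):
    """Value at the sampled location nearest to x."""
    nearest = min(sample, key=lambda s: abs(s - x))
    return sample[nearest]

pseudo_population = {x: nn_value(x, sampled) for x in locations}
```

The resulting pseudo-population is piecewise constant around the sampled sites, so it inherits local spatial similarity from the sample, which is the property the paper exploits.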
  • Stats, Vol. 5, Pages 401-407: Has the Market Started to Collapse or Will
           It Resist?

    • Authors: Yao Kuang, Raphael Douady
      First page: 401
      Abstract: Many people are concerned about the stock market in 2022 as it faces several threats, from rising inflation rates to geopolitical events. The S&P 500 Index had already dropped about 10% from its peak in early January 2022 by the end of February 2022. This paper updates the crisis indicator we developed in Crisis Risk Prediction with Concavity from Polymodel (2022), which predicts when the market may experience a significant drawdown. This indicator uses regime switching and Polymodel theory to calculate market concavity. We found that concavity had not increased in the past 6 months. We conclude that, at present, the market does not bear inherent dynamic instability. This does not exclude a possible collapse due to external events unrelated to financial markets.
      Citation: Stats
      PubDate: 2022-04-23
      DOI: 10.3390/stats5020023
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 408-421: Omnibus Tests for Multiple Binomial
           Proportions via Doubly Sampled Framework with Under-Reported Data

    • Authors: Dewi Rahardja
      First page: 408
      Abstract: Rahardja (2020) developed a (pairwise) multiple comparison procedure (MCP) to determine which pairs of multiple binomial proportions (with under-reported data) significant differences come from. Such an MCP test is generally the second part of a two-stage sequential test. In this paper, we derive two omnibus tests (i.e., tests of the overall equality of multiple proportions) to serve as the first part of this two-stage sequential test with under-reported data. Using two likelihood-based approaches, we obtain two Wald-type (omnibus) tests to compare multiple binomial proportions in the presence of under-reported data. Our closed-form algorithm is easy to implement and not computationally burdensome. We apply our algorithm to a vehicle-accident data example.
      Citation: Stats
      PubDate: 2022-04-23
      DOI: 10.3390/stats5020024
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 422-439: Bootstrap Assessment of Crop Area Estimates
           Using Satellite Pixels Counting

    • Authors: Cristiano Ferraz, Jacques Delincé, André Leite, Raydonal Ospina
      First page: 422
      Abstract: Crop area estimates based on counting pixels over classified satellite images are a promising application of remote sensing to agriculture. However, such area estimates are biased, and their variance is a function of the error rates of the classification rule. To redress the bias, estimators (direct and inverse) relying on the so-called confusion matrix have been proposed, but analytic estimators for variances can be tricky to derive. This article proposes a bootstrap method for assessing statistical properties of such estimators based on information from a sample confusion matrix. The proposed method can be applied to any other type of estimator that is built upon confusion matrix information. The resampling procedure is illustrated in a small study to assess the biases and variances of estimates using purely pixel counting and estimates provided by both direct and inverse estimators. The method has the advantage of being simple to implement even when the sample confusion matrix is generated under unequal probability sample design. The results show the limitations of estimates based solely on pixel counting as well as respective advantages and drawbacks of the direct and inverse estimators with respect to their feasibility, unbiasedness, and variance.
      Citation: Stats
      PubDate: 2022-04-25
      DOI: 10.3390/stats5020025
      Issue No: Vol. 5, No. 2 (2022)
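
The bias of pure pixel counting and its confusion-matrix correction can be illustrated with a hypothetical two-class example. The error rates below are invented; in practice the confusion matrix is itself estimated from a sample, which is what motivates the paper's bootstrap assessment of the resulting estimators.

```python
# p[i][j] = P(classified as class j | true class i)  (hypothetical error rates)
p = [[0.9, 0.1],
     [0.2, 0.8]]
true_area = [600.0, 400.0]

# Expected classified pixel counts: the biased "pixel counting" estimate.
classified = [true_area[0] * p[0][0] + true_area[1] * p[1][0],
              true_area[0] * p[0][1] + true_area[1] * p[1][1]]

# Inverse estimator: solve classified = P^T a for a (2x2 closed form).
det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
a0 = (p[1][1] * classified[0] - p[1][0] * classified[1]) / det
a1 = (-p[0][1] * classified[0] + p[0][0] * classified[1]) / det
```

Pixel counting reports 620 pixels for a class whose true area is 600; inverting the confusion matrix recovers the true areas exactly here only because the matrix is known without error, and the estimator's behaviour under a sampled confusion matrix is precisely what the bootstrap is used to assess.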
  • Stats, Vol. 5, Pages 440-457: Opening the Black Box: Bootstrapping
           Sensitivity Measures in Neural Networks for Interpretable Machine Learning

    • Authors: Michele La Rocca, Cira Perna
      First page: 440
      Abstract: Artificial neural networks are powerful tools for data analysis, particularly in the context of highly nonlinear regression models. However, their utility is critically limited due to the lack of interpretation of the model given its black-box nature. To partially address the problem, the paper focuses on the important problem of feature selection. It proposes and discusses a statistical test procedure for selecting a set of input variables that are relevant to the model while taking into account the multiple testing nature of the problem. The approach is within the general framework of sensitivity analysis and uses the conditional expectation of functions of the partial derivatives of the output with respect to the inputs as a sensitivity measure. The proposed procedure extensively uses the bootstrap to approximate the test statistic distribution under the null while controlling the familywise error rate to correct for data snooping arising from multiple testing. In particular, a pair bootstrap scheme was implemented in order to obtain consistent results when using misspecified statistical models, a typical characteristic of neural networks. Numerical examples and a Monte Carlo simulation were carried out to verify the ability of the proposed test procedure to correctly identify the set of relevant features.
      Citation: Stats
      PubDate: 2022-04-25
      DOI: 10.3390/stats5020026
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 458-476: Repeated-Measures Analysis in the Context of
           Heteroscedastic Error Terms with Factors Having Both Fixed and Random
           Levels

    • Authors: Lyson Chaka, Peter Njuho
      First page: 458
      Abstract: The design and analysis of experiments which involve factors each consisting of both fixed and random levels fit into linear mixed models. The assumed linear mixed-model design matrix takes either a full-rank or less-than-full-rank form. The complexity of the data structures of such experiments falls in the model-selection and parameter-estimation process. The fundamental consideration in the estimation process of linear models is the special case in which elements of the error vector are assumed equal and uncorrelated. However, different assumptions on the structure of the variance–covariance matrix of error vector in the estimation of parameters of a linear mixed model may be considered. We conceptualise a repeated-measures design with multiple between-subjects factors, in which each of these factors has both fixed and random levels. We focus on the construction of linear mixed-effects models, the estimation of variance components, and hypothesis testing in which the default covariance structure of homoscedastic error terms is not appropriate. We illustrate the proposed approach using longitudinal data fitted to a three-factor linear mixed-effects model. The novelty of this approach lies in the exploration of the fixed and random levels of the same factor and in the subsequent interaction effects of the fixed levels. In addition, we assess the differences between levels of the same factor and determine the proportion of the total variation accounted for by the random levels of the same factor.
      Citation: Stats
      PubDate: 2022-05-06
      DOI: 10.3390/stats5020027
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 477-493: Bayesian Semiparametric Regression Analysis
           of Multivariate Panel Count Data

    • Authors: Chunling Wang, Xiaoyan Lin
      First page: 477
      Abstract: Panel count data often occur in a long-term recurrent event study, where the exact occurrence time of the recurrent events is unknown, but only the occurrence count between any two adjacent observation time points is recorded. Most traditional methods only handle panel count data for a single type of event. In this paper, we propose a Bayesian semiparametric approach to analyze panel count data for multiple types of events. For each type of recurrent event, the proportional mean model is adopted to model the mean count of the event, where its baseline mean function is approximated by monotone I-splines. The correlation between multiple types of events is modeled by common frailty terms and scale parameters. Unlike many frequentist estimating equation methods, our approach is based on the observed likelihood and makes no assumption on the relationship between the recurrent process and the observation process. Under the Poisson counting process assumption, we develop an efficient Gibbs sampler based on novel data augmentation for the Markov chain Monte Carlo sampling. Simulation studies show good estimation performance of the baseline mean functions and the regression coefficients; meanwhile, the importance of including the scale parameter to flexibly accommodate the correlation between events is also demonstrated. Finally, a skin cancer data example is fully analyzed to illustrate the proposed methods.
      Citation: Stats
      PubDate: 2022-05-10
      DOI: 10.3390/stats5020028
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 494-506: The Missing Indicator Approach for
           Accelerated Failure Time Model with Covariates Subject to Limits of
           Detection

    • Authors: Norah Alyabs, Sy Han Chiou
      First page: 494
      Abstract: The limit of detection (LOD) is commonly encountered in observational studies when one or more covariate values fall outside the measuring range. Although the complete-case (CC) approach is widely employed in the presence of missing values, it can result in biased estimation or even become inapplicable in small-sample studies. On the other hand, approaches such as the missing indicator (MDI) approach are attractive alternatives, as they preserve sample size. This paper compares the effectiveness of different alternatives to the CC approach under different LOD settings with a survival outcome, including substitution methods, multiple imputation (MI) methods, MDI approaches, and MDI-embedded MI approaches. Through extensive simulation, we found that the MDI approach outperformed its competitors in terms of bias and mean squared error in small sample sizes.
      Citation: Stats
      PubDate: 2022-05-10
      DOI: 10.3390/stats5020029
      Issue No: Vol. 5, No. 2 (2022)
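
The missing-indicator (MDI) coding compared in this abstract can be sketched as follows: keep every subject by adding a below-LOD indicator alongside a filled-in value. The LOD/√2 fill used here is one common substitution choice, the data are hypothetical, and nothing below is the paper's exact implementation.

```python
import math

LOD = 0.5
raw = [1.2, 0.8, None, 2.0, None]  # None marks a value below the limit of detection

def mdi_features(x, lod):
    """Return (filled value, below-LOD indicator) for one covariate value."""
    below = x is None
    filled = lod / math.sqrt(2) if below else x  # LOD/sqrt(2): a common fill-in
    return filled, int(below)

design = [mdi_features(x, LOD) for x in raw]  # all 5 subjects retained
```

A complete-case analysis would drop two of the five subjects here; the MDI coding keeps the full sample, which is the small-sample advantage the abstract highlights.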
  • Stats, Vol. 5, Pages 507-520: Goodness-of-Fit and Generalized Estimating
           Equation Methods for Ordinal Responses Based on the Stereotype Model

    • Authors: Daniel Fernández, Louise McMillan, Richard Arnold, Martin Spiess, Ivy Liu
      First page: 507
      Abstract: Background: Data with ordinal categories occur in many diverse areas, but methodologies for modeling ordinal data lag severely behind equivalent methodologies for continuous data. There are advantages to using a model specifically developed for ordinal data, such as making fewer assumptions and having greater power for inference. Methods: The ordered stereotype model (OSM) is an ordinal regression model that is more flexible than the popular proportional odds ordinal model. The primary benefit of the OSM is that it uses numeric encoding of the ordinal response categories without assuming the categories are equally-spaced. Results: This article summarizes two recent advances in the OSM: (1) three novel tests to assess goodness-of-fit; (2) a new Generalized Estimating Equations approach to estimate the model for longitudinal studies. These methods use the new spacing of the ordinal categories indicated by the estimated score parameters of the OSM. Conclusions: The recent advances presented can be applied to several fields. We illustrate their use with the well-known arthritis clinical trial dataset. These advances fill a gap in methodologies available for ordinal responses and may be useful for practitioners in many applied fields.
      Citation: Stats
      PubDate: 2022-06-01
      DOI: 10.3390/stats5020030
      Issue No: Vol. 5, No. 2 (2022)
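
The ordered stereotype model's defining feature, estimated score parameters that give the ordinal categories a data-driven spacing, can be sketched with hypothetical parameter values (these numbers are illustrative, not estimates from the arthritis data):

```python
import math

# Ordered stereotype model: P(Y = k | x) proportional to exp(alpha_k + phi_k * beta * x),
# with monotone scores 0 = phi_1 <= ... <= phi_K = 1 spacing the categories.
alpha = [0.0, 0.5, -0.2]  # category intercepts (first fixed at 0 for identifiability)
phi = [0.0, 0.4, 1.0]     # ordered score parameters (hypothetical)
beta = 1.5

def osm_probs(x):
    """Category probabilities for a single covariate value x."""
    scores = [a + p * beta * x for a, p in zip(alpha, phi)]
    denom = sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores]

probs = osm_probs(2.0)  # probabilities over the 3 ordinal categories
```

Because the phi scores are estimated rather than fixed at 1, 2, 3, the model does not force equally spaced categories, which is the flexibility the abstract contrasts with the proportional odds model.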
  • Stats, Vol. 5, Pages 521-537: A Comparison of Existing Bootstrap
           Algorithms for Multi-Stage Sampling Designs

    • Authors: Sixia Chen, David Haziza, Zeinab Mashreghi
      First page: 521
      Abstract: Multi-stage sampling designs are often used in household surveys because a sampling frame of elements may not be available or for cost considerations when data collection involves face-to-face interviews. In this context, variance estimation is a complex task as it relies on the availability of second-order inclusion probabilities at each stage. To cope with this issue, several bootstrap algorithms have been proposed in the literature in the context of a two-stage sampling design. In this paper, we describe some of these algorithms and compare them empirically in terms of bias, stability, and coverage probability.
      Citation: Stats
      PubDate: 2022-06-06
      DOI: 10.3390/stats5020031
      Issue No: Vol. 5, No. 2 (2022)
  • Stats, Vol. 5, Pages 538-545: Evaluation of the Gauss Integral

    • Authors: Dmitri Martila, Stefan Groote
      First page: 538
      Abstract: The normal or Gaussian distribution plays a prominent role in almost all fields of science. However, it is well known that the Gauss (or Euler–Poisson) integral over a finite boundary, as is necessary, for instance, for the error function or the cumulative distribution of the normal distribution, cannot be expressed by analytic functions. This is proven by the Risch algorithm. Regardless, there are proposals for approximate solutions. In this paper, we give a new solution in terms of normal distributions by applying a geometric procedure iteratively to the problem.
      Citation: Stats
      PubDate: 2022-06-10
      DOI: 10.3390/stats5020032
      Issue No: Vol. 5, No. 2 (2022)
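
The starting point above, that the Gaussian integral over a finite interval has no elementary antiderivative, is why in practice it is evaluated numerically or through the special function erf. A minimal check (not the paper's geometric construction, just the standard baseline it competes with):

```python
import math

def phi_density(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def midpoint_integral(f, a, b, n=10000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = midpoint_integral(phi_density, 0.0, 1.0)
exact = 0.5 * math.erf(1 / math.sqrt(2))  # Phi(1) - Phi(0) via erf
```

Both routes give Phi(1) − Phi(0) ≈ 0.34134; approximate closed forms such as the one proposed in the paper aim to avoid the quadrature while staying close to this value.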
  • Stats, Vol. 5, Pages 546-560: Quantitative Trading through Random
           Perturbation Q-Network with Nonlinear Transaction Costs

    • Authors: Tian Zhu, Wei Zhu
      First page: 546
      Abstract: In recent years, reinforcement learning (RL) has seen increasing applications in the financial industry, especially in quantitative trading and portfolio optimization, where the focus is on the long-term reward rather than short-term profit. Sequential decision making and Markov decision processes are well suited to this type of application. Through trial and error based on historical data, an agent can learn the characteristics of the market and evolve an algorithm to maximize cumulative returns. In this work, we propose a novel RL trading algorithm that utilizes random perturbation of the Q-network and accounts for more realistic nonlinear transaction costs. In summary, we first design a new near-quadratic transaction cost function that considers slippage. Next, we develop a convolutional deep Q-learning network (CDQN) with multiple price inputs based on this cost function. We further propose a random perturbation (rp) method that modifies the learning network to solve the instability issue intrinsic to deep Q-learning networks. Finally, we use this newly developed CDQN-rp algorithm to make trading decisions based on the daily stock prices of Apple (AAPL), Meta (FB), and Bitcoin (BTC) and demonstrate its strengths over other quantitative trading methods.
      Citation: Stats
      PubDate: 2022-06-10
      DOI: 10.3390/stats5020033
      Issue No: Vol. 5, No. 2 (2022)
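
A near-quadratic transaction cost with a slippage term, in the spirit described above, might look like the following; the functional form and coefficients are purely illustrative assumptions, not the paper's calibration.

```python
def transaction_cost(trade_value, c_lin=0.001, c_quad=1e-8):
    """Linear commission plus a quadratic slippage penalty in trade size.
    Coefficients are hypothetical; real slippage depends on market depth."""
    return c_lin * abs(trade_value) + c_quad * trade_value ** 2

small = transaction_cost(1_000.0)      # slippage term is negligible
large = transaction_cost(1_000_000.0)  # slippage dominates the commission
```

The key consequence for the learning problem is convexity in trade size: doubling a large trade more than doubles its cost, so an optimal policy under such costs prefers smaller, smoother position changes than one trained with linear costs.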
  • Stats, Vol. 5, Pages 561-571: Bayesian Bootstrap in Multiple Frames

    • Authors: Daniela Cocchi, Lorenzo Marchi, Riccardo Ievoli
      First page: 561
      Abstract: Multiple frames are becoming increasingly relevant due to the spread of surveys conducted via registers. In this regard, estimators of population quantities have been proposed, including the multiplicity estimator. In all cases, variance estimation still remains a matter of debate. This paper explores the potential of Bayesian bootstrap techniques for computing such estimators. The suitability of the method, which is compared to the existing frequentist bootstrap, is shown by conducting a small-scale simulation study and a case study.
      Citation: Stats
      PubDate: 2022-06-15
      DOI: 10.3390/stats5020034
      Issue No: Vol. 5, No. 2 (2022)
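The Bayesian bootstrap at the core of the paper can be sketched, in its basic Rubin form, by drawing Dirichlet(1,...,1) weights as normalized Exp(1) variates for each replicate; the multiple-frame multiplicity estimator itself is not reproduced here:

```python
import random

def bayesian_bootstrap_means(data, replicates=2000, rng=random.Random(42)):
    """Rubin-style Bayesian bootstrap: each replicate draws
    Dirichlet(1,...,1) weights (normalized Exp(1) draws) over the
    observed units and returns the weighted mean. A generic sketch,
    not the paper's multiplicity estimator for multiple frames."""
    out = []
    for _ in range(replicates):
        w = [rng.expovariate(1.0) for _ in data]
        s = sum(w)
        out.append(sum(wi * xi for wi, xi in zip(w, data)) / s)
    return out

draws = bayesian_bootstrap_means([2.0, 4.0, 6.0, 8.0])
```

The spread of `draws` is the Bayesian-bootstrap measure of uncertainty that the paper compares against frequentist bootstrap variance estimates.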
  • Stats, Vol. 5, Pages 572-582: A Multi-Aspect Permutation Test for
           Goodness-of-Fit Problems

    • Authors: Rosa Arboretti, Elena Barzizza, Nicolò Biasetton, Riccardo Ceccato, Livio Corain, Luigi Salmaso
      First page: 572
      Abstract: Parametric techniques commonly rely on specific distributional assumptions. It is therefore fundamental to preliminarily identify possible violations of such assumptions, and appropriate testing procedures are required for this purpose to deal with the goodness-of-fit (GoF) problem. This task can be quite challenging, especially with small sample sizes and multivariate data. Previous studies showed how a GoF problem can be easily represented through a traditional two-sample system of hypotheses. Following this idea, in this paper, we propose a multi-aspect permutation-based test to deal with the multivariate goodness-of-fit problem, taking advantage of the nonparametric combination (NPC) methodology. A simulation study is then conducted to evaluate the performance of our proposal and to identify possible critical scenarios. Finally, a real data application is considered.
      Citation: Stats
      PubDate: 2022-06-17
      DOI: 10.3390/stats5020035
      Issue No: Vol. 5, No. 2 (2022)
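Representing a GoF problem as a two-sample comparison, as the abstract describes, can be sketched with a plain permutation test against a Monte Carlo sample drawn from the hypothesized distribution. This single-aspect (difference-in-means) version is only a stand-in; the paper's contribution is combining several such aspects via NPC, which is not reproduced here:

```python
import random, statistics

def permutation_gof_pvalue(sample, reference, n_perm=500, rng=random.Random(0)):
    """Two-sample permutation test used as a GoF check: `reference`
    is a Monte Carlo sample from the hypothesized distribution.
    Single-aspect (difference in means) sketch only."""
    obs = abs(statistics.fmean(sample) - statistics.fmean(reference))
    pooled = list(sample) + list(reference)
    n = len(sample)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(statistics.fmean(pooled[:n]) - statistics.fmean(pooled[n:]))
        if stat >= obs:
            count += 1
    return (count + 1) / (n_perm + 1)

gen = random.Random(1)
null_sample = [gen.gauss(0, 1) for _ in range(40)]       # matches reference
shifted_sample = [gen.gauss(2, 1) for _ in range(40)]    # violates it
reference = [gen.gauss(0, 1) for _ in range(200)]        # hypothesized N(0,1)
p_null = permutation_gof_pvalue(null_sample, reference)
p_shift = permutation_gof_pvalue(shifted_sample, reference)
```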
  • Stats, Vol. 5, Pages 52-69: A Flexible Mixed Model for Clustered Count

    • Authors: Darcy Steeg Morris, Kimberly F. Sellers
      First page: 52
      Abstract: Clustered count data are commonly modeled using Poisson regression with random effects to account for the correlation induced by clustering. The Poisson mixed model allows for overdispersion via the nature of the within-cluster correlation; however, departures from equi-dispersion may also exist due to the underlying count process mechanism. We study the cross-sectional COM-Poisson regression model—a generalized regression model for count data in light of data dispersion—together with random effects for analysis of clustered count data. We demonstrate model flexibility of the COM-Poisson random intercept model, including choice of the random effect distribution, via simulated and real data examples. We find that COM-Poisson mixed models provide comparable model fit to well-known mixed models for associated special cases of clustered discrete data, and result in improved model fit for data with intermediate levels of over- or underdispersion in the count mechanism. Accordingly, the proposed models are useful for capturing dispersion not consistent with commonly used statistical models, and also serve as a practical diagnostic tool.
      Citation: Stats
      PubDate: 2022-01-07
      DOI: 10.3390/stats5010004
      Issue No: Vol. 5, No. 1 (2022)
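The COM-Poisson distribution central to the paper has pmf P(Y=y) proportional to λ^y/(y!)^ν, where ν = 1 recovers the Poisson and ν < 1 (ν > 1) gives over- (under-) dispersion. A cross-sectional sketch without the paper's random effects, with the normalizing constant truncated and computed in log space for stability:

```python
import math

def com_poisson_pmf(y, lam, nu, terms=150):
    """Conway-Maxwell-Poisson pmf with the normalizing constant
    Z(lam, nu) truncated at `terms` summands; log-space evaluation
    avoids factorial overflow. nu=1 recovers the Poisson."""
    log_lam = math.log(lam)
    z = sum(math.exp(j * log_lam - nu * math.lgamma(j + 1))
            for j in range(terms))
    return math.exp(y * log_lam - nu * math.lgamma(y + 1)) / z

p_poisson = com_poisson_pmf(2, lam=3.0, nu=1.0)                 # equi-dispersed
total = sum(com_poisson_pmf(k, 3.0, 0.7) for k in range(120))   # overdispersed
```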
  • Stats, Vol. 5, Pages 70-88: A Noncentral Lindley Construction Illustrated
           in an INAR(1) Environment

    • Authors: Johannes Ferreira, Ané van der Merwe
      First page: 70
      Abstract: This paper proposes a previously unconsidered generalization of the Lindley distribution by allowing for a measure of noncentrality. Essential structural characteristics are investigated and derived in explicit and tractable forms, and the estimability of the model is illustrated via the fit of this developed model to real data. Subsequently, this model is used as a candidate for the parameter of a Poisson model, which allows for departure from the usual equidispersion restriction that the Poisson offers when modelling count data. This Poisson-noncentral Lindley is also systematically investigated and characteristics are derived. The value of this count model is illustrated and implemented as the count error distribution in an integer autoregressive environment, and juxtaposed against other popular models. The effect of the systematically-induced noncentrality parameter is illustrated and paves the way for future flexible modelling not only as a standalone contender in continuous Lindley-type scenarios but also in discrete and discrete time series scenarios when the often-encountered equidispersed assumption is not adhered to in practical data environments.
      Citation: Stats
      PubDate: 2022-01-10
      DOI: 10.3390/stats5010005
      Issue No: Vol. 5, No. 1 (2022)
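For reference, the central (one-parameter) Lindley density and the classical discrete Poisson-Lindley pmf that results from mixing a Poisson rate over it can be sketched as below; the noncentral generalization is the paper's contribution and is not reproduced here:

```python
import math

def lindley_pdf(x, theta):
    """Density of the one-parameter (central) Lindley distribution."""
    return theta ** 2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

def poisson_lindley_pmf(k, theta):
    """Poisson mixed over a Lindley-distributed rate (the classical
    discrete Poisson-Lindley). Central case only; the paper adds a
    noncentrality parameter to relax equidispersion further."""
    return theta ** 2 * (k + theta + 2) / (theta + 1) ** (k + 3)

theta = 1.5
total = sum(poisson_lindley_pmf(k, theta) for k in range(200))
mean = sum(k * poisson_lindley_pmf(k, theta) for k in range(200))
```

The mean of the mixed count equals the mean of the mixing Lindley distribution, (θ+2)/(θ(θ+1)), which the truncated sums below reproduce numerically.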
  • Stats, Vol. 5, Pages 89-107: A Bayesian Approach for Imputation of
           Censored Survival Data

    • Authors: Shirin Moghaddam, John Newell, John Hinde
      First page: 89
      Abstract: A common feature of much survival data is censoring due to incompletely observed lifetimes. Survival analysis methods and models have been designed to take account of this and provide appropriate relevant summaries, such as the Kaplan–Meier plot and the commonly quoted median survival time of the group under consideration. However, a single summary is not really a relevant quantity for communication to an individual patient, as it conveys no notion of variability and uncertainty, and the Kaplan–Meier plot can be difficult for the patient to understand and is also often misinterpreted, even by some physicians. This paper considers an alternative approach of treating the censored data as a form of missing, incomplete data and proposes an imputation scheme to construct a completed dataset. This allows the use of standard descriptive statistics and graphical displays to convey both typical outcomes and the associated variability. We propose a Bayesian approach to impute any censored observations, making use of other information in the dataset, and provide a completed dataset. This can then be used for standard displays, summaries, and even, in theory, analysis and model fitting. We particularly focus on the data visualisation advantages of the completed data, allowing displays such as density plots, boxplots, etc., to complement the usual Kaplan–Meier display of the original dataset. We study the performance of this approach through a simulation study and consider its application to two clinical examples.
      Citation: Stats
      PubDate: 2022-01-26
      DOI: 10.3390/stats5010006
      Issue No: Vol. 5, No. 1 (2022)
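The imputation idea above can be sketched in a deliberately simple toy setting: assume exponential lifetimes with a conjugate Gamma prior on the rate, draw the rate from its posterior, and extend each censored time by an exponential residual (memorylessness). This is an assumed stand-in for the paper's full Bayesian imputation scheme, not its actual model:

```python
import random

def impute_censored(times, censored, a0=1.0, b0=1.0, rng=random.Random(7)):
    """Single Bayesian imputation under an exponential lifetime model:
    draw the rate from its Gamma(a0 + events, b0 + total time) posterior,
    then extend each censored observation by an Exp(rate) residual.
    A toy sketch, not the paper's imputation scheme."""
    events = sum(1 for c in censored if not c)
    total_time = sum(times)
    rate = rng.gammavariate(a0 + events, 1.0 / (b0 + total_time))
    completed = []
    for t, c in zip(times, censored):
        completed.append(t + rng.expovariate(rate) if c else t)
    return completed

times = [2.0, 5.0, 3.5, 7.0, 1.2]
censored = [False, True, False, True, False]
completed = impute_censored(times, censored)
```

Every imputed value exceeds its censoring time, so the completed dataset can feed ordinary boxplots or density plots directly.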
  • Stats, Vol. 5, Pages 108-110: Acknowledgment to Reviewers of Stats in 2021

    • Authors: Stats Editorial Office
      First page: 108
      Abstract: Rigorous peer-reviews are the basis of high-quality academic publishing [...]
      Citation: Stats
      PubDate: 2022-01-28
      DOI: 10.3390/stats5010007
      Issue No: Vol. 5, No. 1 (2022)
  • Stats, Vol. 5, Pages 111-127: A General Description of Growth Trends

    • Authors: Moshe Elitzur
      First page: 111
      Abstract: Time series that display periodicity can be described with a Fourier expansion. In a similar vein, a recently developed formalism enables the description of growth patterns with the optimal number of parameters. The method has been applied to the growth of national GDP, population and the COVID-19 pandemic; in all cases, the deviations of long-term growth patterns from purely exponential required no more than two additional parameters, mostly only one. Here, I utilize the new framework to develop a unified formulation for all functions that describe growth deceleration, wherein the growth rate decreases with time. The result offers the prospect of a new general tool for trend removal in time-series analysis.
      Citation: Stats
      PubDate: 2022-02-03
      DOI: 10.3390/stats5010008
      Issue No: Vol. 5, No. 1 (2022)
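The distinction the abstract draws between purely exponential growth and growth deceleration can be made concrete with a simple diagnostic: the per-step logarithmic growth rate, which is constant for an exponential and strictly decreasing for a decelerating (here, logistic) series. This is a generic check, not the paper's specific parametrization:

```python
import math

def growth_rate(series, dt=1.0):
    """Discrete per-step growth rate r_t = ln(y_{t+1}/y_t)/dt.
    Pure exponential growth gives a constant sequence; growth
    deceleration shows up as a decreasing one."""
    return [math.log(series[i + 1] / series[i]) / dt
            for i in range(len(series) - 1)]

exponential = [math.exp(0.3 * t) for t in range(10)]
logistic = [100 / (1 + 99 * math.exp(-0.5 * t)) for t in range(10)]
r_exp = growth_rate(exponential)
r_log = growth_rate(logistic)
```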
  • Stats, Vol. 5, Pages 128-138: Selection of Auxiliary Variables for
           Three-Fold Linking Models in Small Area Estimation: A Simple and Effective

    • Authors: Song Cai, J.N.K. Rao
      First page: 128
      Abstract: Model-based estimation of small area means can lead to reliable estimates when the area sample sizes are small. This is accomplished by borrowing strength across related areas using models linking area means to related covariates and random area effects. The effective selection of variables to be included in the linking model is important in small area estimation. The main purpose of this paper is to extend the earlier work on variable selection for area level and two-fold subarea level models to three-fold sub-subarea models linking sub-subarea means to related covariates and random effects at the area, sub-area, and sub-subarea levels. The proposed variable selection method transforms the sub-subarea means to reduce the linking model to a standard regression model and applies commonly used criteria for variable selection, such as AIC and BIC, to the reduced model. The resulting criteria depend on the unknown sub-subarea means, which are then estimated using the sample sub-subarea means. Then, the estimated selection criteria are used for variable selection. Simulation results on the performance of the proposed variable selection method relative to methods based on area level and two-fold subarea level models are also presented.
      Citation: Stats
      PubDate: 2022-02-05
      DOI: 10.3390/stats5010009
      Issue No: Vol. 5, No. 1 (2022)
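The final selection step the abstract describes, applying standard criteria such as AIC or BIC to a reduced regression model, can be sketched with BIC under a Gaussian profile likelihood. The one-covariate setup and all data below are illustrative; the paper's transformation of the three-fold linking model is not reproduced:

```python
import math, random

def bic_simple_regression(y, x):
    """BIC of a one-covariate OLS fit using the Gaussian profile
    likelihood: n*log(RSS/n) + k*log(n), with k = 3 parameters
    (intercept, slope, error variance). Generic sketch only."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    rss = sum((b - ybar - slope * (a - xbar)) ** 2 for a, b in zip(x, y))
    return n * math.log(rss / n) + 3 * math.log(n)

gen = random.Random(3)
x_good = [gen.gauss(0, 1) for _ in range(100)]    # truly related covariate
x_noise = [gen.gauss(0, 1) for _ in range(100)]   # irrelevant covariate
y = [2.0 * a + gen.gauss(0, 0.5) for a in x_good]
bic_good = bic_simple_regression(y, x_good)
bic_noise = bic_simple_regression(y, x_noise)
```

The informative covariate attains the lower BIC and would be selected.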
  • Stats, Vol. 5, Pages 139-153: Analysis of Household Pulse Survey
           Public-Use Microdata via Unit-Level Models for Informative Sampling

    • Authors: Alexander Sun, Paul A. Parker, Scott H. Holan
      First page: 139
      Abstract: The Household Pulse Survey, recently released by the U.S. Census Bureau, gathers information about the respondents’ experiences regarding employment status, food security, housing, physical and mental health, access to health care, and education disruption. Design-based estimates are produced for all 50 states and the District of Columbia (DC), as well as 15 Metropolitan Statistical Areas (MSAs). Using public-use microdata, this paper explores the effectiveness of using unit-level model-based estimators that incorporate spatial dependence for the Household Pulse Survey. In particular, we consider Bayesian hierarchical model-based spatial estimates for both a binomial and a multinomial response under informative sampling. Importantly, we demonstrate that these models can be easily estimated using Hamiltonian Monte Carlo through the Stan software package. In doing so, these models can readily be implemented in a production environment. For both the binomial and multinomial responses, an empirical simulation study is conducted, which compares spatial and non-spatial models. Finally, using public-use Household Pulse Survey microdata, we provide an analysis that compares both design-based and model-based estimators and demonstrates a reduction in standard errors for the model-based approaches.
      Citation: Stats
      PubDate: 2022-02-07
      DOI: 10.3390/stats5010010
      Issue No: Vol. 5, No. 1 (2022)
  • Stats, Vol. 5, Pages 154-171: All-NBA Teams’ Selection Based on
           Unsupervised Learning

    • Authors: João Vítor Rocha da Silva, Paulo Canas Rodrigues
      First page: 154
      Abstract: All-NBA Teams’ selections have great implications for the players’ and teams’ futures. Since contract extensions are highly related to awards, which can be seen as indexes that measure a player’s production in a year, team selection is of mutual interest for athletes and franchises. In this paper, we are interested in studying the current selection format. In particular, this study aims to: (i) identify the factors that are taken into consideration by voters when choosing the three All-NBA Teams; and (ii) suggest a new selection format to evaluate players’ performances. Average game-related statistics of all active NBA players in regular seasons from 2013-14 to 2018-19 were analyzed using LASSO (Logistic) Regression and Principal Component Analysis (PCA). It was possible: (i) to determine an All-NBA player profile; (ii) to determine that this profile can cause a misrepresentation of players’ modern and versatile gameplay styles; and (iii) to suggest a new way to evaluate and select players through PCA. As a result, this paper presents a model that may help not only the NBA but any basketball league to better evaluate players; it may also be a resource for researchers who aim to investigate player performance, development, and impact over many seasons.
      Citation: Stats
      PubDate: 2022-02-09
      DOI: 10.3390/stats5010011
      Issue No: Vol. 5, No. 1 (2022)
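Scoring players along the first principal component, as the abstract proposes, can be sketched for two standardized features using the closed-form eigendecomposition of a 2x2 covariance matrix. The player statistics below are hypothetical, and the paper's actual feature set is much richer:

```python
import math

def leading_pc_scores(data):
    """First-principal-component scores for two standardized features,
    via the closed-form eigenvector of a 2x2 covariance matrix.
    Toy version of ranking players by their leading PC."""
    n = len(data)
    means = [sum(r[j] for r in data) / n for j in (0, 1)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in data) / n)
           for j in (0, 1)]
    z = [[(r[j] - means[j]) / sds[j] for j in (0, 1)] for r in data]
    a = sum(r[0] * r[0] for r in z) / n
    b = sum(r[0] * r[1] for r in z) / n
    c = sum(r[1] * r[1] for r in z) / n
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    v = (b, lam - a) if abs(b) > 1e-12 else (1.0, 0.0)
    norm = math.hypot(*v)
    v = (v[0] / norm, v[1] / norm)
    return [r[0] * v[0] + r[1] * v[1] for r in z]

# (points, rebounds) per game for five hypothetical players
stats_pr = [(25.0, 5.0), (18.0, 4.0), (30.0, 8.0), (10.0, 2.0), (22.0, 6.0)]
scores = leading_pc_scores(stats_pr)
```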
  • Stats, Vol. 5, Pages 172-189: Multivariate Threshold Regression Models
           with Cure Rates: Identification and Estimation in the Presence of the
           Esscher Property

    • Authors: Mei-Ling Ting Lee, George A. Whitmore
      First page: 172
      Abstract: The first hitting time of a boundary or threshold by the sample path of a stochastic process is the central concept of threshold regression models for survival data analysis. Regression functions for the process and threshold parameters in these models are multivariate combinations of explanatory variates. The stochastic process under investigation may be a univariate or a multivariate stochastic process. The stochastic processes of interest to us in this report are those that possess stationary independent increments (i.e., Lévy processes) as well as the Esscher property. The Esscher transform is a transformation of probability density functions that has applications in actuarial science, financial engineering, and other fields. Lévy processes with this property are often encountered in practical applications. Frequently, these applications also involve a ‘cure rate’ fraction because some individuals are susceptible to failure and others are not. Cure rates may arise endogenously from the model alone or exogenously from mixing of distinct statistical populations in the data set. We show, using both theoretical analysis and case demonstrations, that model estimates derived from typical survival data may not be able to distinguish between individuals in the cure rate fraction who are not susceptible to failure and those who may be susceptible to failure but escape that fate by chance. The ambiguity is aggravated by right censoring of survival times and by minor misspecifications of the model. Slightly incorrect specifications for regression functions or for the stochastic process can lead to problems with model identification and estimation. In this situation, additional guidance for estimating the fraction of non-susceptibles must come from subject matter expertise or from data types other than survival times, censored or otherwise. The identifiability issue is confronted directly in threshold regression but is also present when applying other kinds of models commonly used for survival data analysis. Other methods, however, usually do not provide a framework for recognizing or dealing with the issue, and so the issue is often unintentionally ignored. This paper sets out the theoretical foundations of this work, presenting new and somewhat surprising results for the first hitting time distributions of Lévy processes that have the Esscher property.
      Citation: Stats
      PubDate: 2022-02-11
      DOI: 10.3390/stats5010012
      Issue No: Vol. 5, No. 1 (2022)
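The endogenous cure rate the abstract mentions can be illustrated with the simplest first-hitting-time model: a Wiener process with drift away from the boundary has positive probability of never hitting it, and that probability is the cure fraction. This is the Wiener special case only; the paper treats general Lévy processes with the Esscher property:

```python
import math

def hit_probability(x0, mu, sigma):
    """P(a Wiener process with drift mu and volatility sigma, started
    at x0 > 0, ever reaches the boundary at 0). For mu > 0 (drift away
    from the boundary) this is exp(-2*mu*x0/sigma**2) < 1; for mu <= 0
    the boundary is hit almost surely. Wiener special case only."""
    if mu <= 0:
        return 1.0
    return math.exp(-2.0 * mu * x0 / sigma ** 2)

# the share of sample paths that never fail: an endogenous cure fraction
cure_fraction = 1.0 - hit_probability(x0=1.0, mu=0.5, sigma=1.0)
```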
  • Stats, Vol. 5, Pages 190-202: Bootstrap Prediction Intervals of Temporal

    • Authors: Bu Hyoung Lee
      First page: 190
      Abstract: In this article, we propose an interval estimation method to trace an unknown disaggregate series within certain bandwidths. First, we consider two model-based disaggregation methods called the GLS disaggregation and the ARIMA disaggregation. Then, we develop iterative steps to construct AR-sieve bootstrap prediction intervals for model-based temporal disaggregation. As an illustration, we analyze the quarterly total balances of U.S. international trade in goods and services between the first quarter of 1992 and the fourth quarter of 2020.
      Citation: Stats
      PubDate: 2022-02-18
      DOI: 10.3390/stats5010013
      Issue No: Vol. 5, No. 1 (2022)
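The AR-sieve bootstrap step the abstract describes can be sketched with the sieve truncated at p = 1: fit an AR(1) by least squares, resample the centred residuals, and simulate forward to obtain a percentile prediction interval. The series below is simulated; the paper couples this machinery with GLS/ARIMA disaggregation, which is not reproduced:

```python
import random

def ar1_sieve_interval(series, h=4, n_boot=500, level=0.9,
                       rng=random.Random(11)):
    """AR-sieve bootstrap percentile prediction interval for the
    value h steps ahead, with the sieve truncated at order 1."""
    n = len(series)
    # least-squares AR(1) coefficient (no intercept; series ~ zero-mean)
    phi = (sum(series[t] * series[t - 1] for t in range(1, n))
           / sum(series[t - 1] ** 2 for t in range(1, n)))
    resid = [series[t] - phi * series[t - 1] for t in range(1, n)]
    m = sum(resid) / len(resid)
    resid = [e - m for e in resid]          # centre the residuals
    finals = []
    for _ in range(n_boot):
        cur = series[-1]
        for _ in range(h):                  # simulate h steps ahead
            cur = phi * cur + rng.choice(resid)
        finals.append(cur)
    finals.sort()
    lo = finals[int((1 - level) / 2 * n_boot)]
    hi = finals[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# simulate an AR(1) series with phi = 0.6 as stand-in data
gen = random.Random(5)
series, prev = [], 0.0
for _ in range(200):
    prev = 0.6 * prev + gen.gauss(0.0, 1.0)
    series.append(prev)
lo, hi = ar1_sieve_interval(series)
```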
  • Stats, Vol. 5, Pages 203-214: Modeling Secondary Phenotypes Conditional on
           Genotypes in Case–Control Studies

    • Authors: Naomi C. Brownstein, Jianwen Cai, Shad Smith, Luda Diatchenko, Gary D. Slade, Eric Bair
      First page: 203
      Abstract: Traditional case–control genetic association studies examine relationships between case–control status and one or more covariates. It is becoming increasingly common to study secondary phenotypes and their association with the original covariates. The Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) project, a study of temporomandibular disorders (TMD), motivates this work. Numerous measures of interest are collected at enrollment, such as the number of comorbid pain conditions from which a participant suffers. Examining the potential genetic basis of these measures is of secondary interest. Assessing these associations is statistically challenging, as participants do not form a random sample from the population of interest. Standard methods may be biased and lack coverage and power. We propose a general method for the analysis of arbitrary phenotypes utilizing inverse probability weighting and bootstrapping for standard error estimation. The method may be applied to the complicated association tests used in next-generation sequencing studies, such as analyses of haplotypes with ambiguous phase. Simulation studies show that our method performs as well as competing methods when they are applicable and yields promising results for outcome types, such as time-to-event, to which other methods may not apply. The method is applied to the OPPERA baseline case–control genetic study.
      Citation: Stats
      PubDate: 2022-02-22
      DOI: 10.3390/stats5010014
      Issue No: Vol. 5, No. 1 (2022)
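The two ingredients named in the abstract, inverse probability weighting and a bootstrap standard error, can be sketched for a secondary-phenotype mean under case oversampling. The sampling probabilities and phenotype values below are invented for illustration:

```python
import random

def ipw_mean(values, weights):
    """Inverse-probability-weighted (Hajek-normalized) mean;
    weights are 1 / P(selection into the study)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def bootstrap_se(values, weights, n_boot=1000, rng=random.Random(2)):
    """Nonparametric bootstrap SE of the IPW mean, resampling
    (value, weight) pairs together. A sketch of the general recipe."""
    pairs = list(zip(values, weights))
    est = []
    for _ in range(n_boot):
        s = [rng.choice(pairs) for _ in pairs]
        est.append(ipw_mean([v for v, _ in s], [w for _, w in s]))
    mu = sum(est) / n_boot
    return (sum((e - mu) ** 2 for e in est) / (n_boot - 1)) ** 0.5

# cases oversampled: phenotype is higher among the 20 cases
values = [5.0] * 20 + [2.0] * 20     # 20 cases, 20 controls
weights = [1.25] * 20 + [5.0] * 20   # sampled w.p. 0.8 and 0.2 respectively
est = ipw_mean(values, weights)      # 2.6, vs. a naive mean of 3.5
se = bootstrap_se(values, weights)
```

Reweighting pulls the estimate back toward the population mean that the biased case-control sample would otherwise overstate.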
  • Stats, Vol. 5, Pages 215-257: The Stacy-G Class: A New Family of
           Distributions with Regression Modeling and Applications to Survival Real

    • Authors: Lucas D. Ribeiro Reis, Gauss M. Cordeiro, Maria do Carmo S. Lima
      First page: 215
      Abstract: We study the Stacy-G family, which extends the gamma-G class and provides four of the most well-known forms of the hazard rate function: increasing, decreasing, bathtub, and inverted bathtub. We provide some of its structural properties. We estimate the parameters by maximum likelihood, and perform a simulation study to verify the asymptotic properties of the estimators for the Burr-XII baseline. We construct the log-Stacy-Burr XII regression for censored data. The usefulness of the new models is shown through applications to uncensored and censored real data.
      Citation: Stats
      PubDate: 2022-03-04
      DOI: 10.3390/stats5010015
      Issue No: Vol. 5, No. 1 (2022)
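The hazard-shape flexibility the abstract claims can be illustrated with the Stacy (generalized gamma) distribution itself, whose hazard is computed here numerically. The parameter values are illustrative, and only the increasing and decreasing shapes are demonstrated; the Stacy-G construction that plugs in a baseline cdf is not reproduced:

```python
import math

def stacy_pdf(x, a, d, p):
    """Stacy (generalized gamma) density
    f(x) = p * x**(d-1) * exp(-(x/a)**p) / (a**d * Gamma(d/p))."""
    return p * x ** (d - 1) * math.exp(-(x / a) ** p) / (a ** d * math.gamma(d / p))

def stacy_hazard(x, a, d, p, upper=80.0, steps=8000):
    """Hazard f(x)/S(x), with the survival S(x) computed by the
    trapezoid rule on [x, upper] (tail beyond `upper` is negligible
    for the parameter values used here)."""
    h = (upper - x) / steps
    grid = [x + i * h for i in range(steps + 1)]
    vals = [stacy_pdf(g, a, d, p) for g in grid]
    surv = sum((vals[i] + vals[i + 1]) * h / 2 for i in range(steps))
    return stacy_pdf(x, a, d, p) / surv

# d = p reduces Stacy to Weibull: shape > 1 increasing, < 1 decreasing
increasing = [stacy_hazard(x, a=1.0, d=2.0, p=2.0) for x in (0.5, 1.0, 2.0)]
decreasing = [stacy_hazard(x, a=1.0, d=0.5, p=0.5) for x in (0.5, 1.0, 2.0)]
```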
  • Stats, Vol. 5, Pages 258-269: Resampling under Complex Sampling Designs:
           Roots, Development and the Way Forward

    • Authors: Pier Luigi Conti, Fulvia Mecatti
      First page: 258
      Abstract: In the present paper, resampling for finite populations under an iid sampling design is reviewed. Our attention is mainly focused on pseudo-population-based resampling due to its properties. A principled appraisal of the main theoretical foundations and results is given and discussed, together with important computational aspects. Finally, a discussion on open problems and research perspectives is provided.
      Citation: Stats
      PubDate: 2022-03-08
      DOI: 10.3390/stats5010016
      Issue No: Vol. 5, No. 1 (2022)
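The pseudo-population idea the abstract highlights can be sketched for an SRS-style design: replicate each sampled unit roughly 1/π times to rebuild a pseudo-population, then redraw samples of the original size without replacement. A deliberately simplified version of the approach the paper reviews, with invented data:

```python
import random

def pseudo_population_bootstrap(sample, inclusion_prob, n_boot=500,
                                rng=random.Random(9)):
    """Pseudo-population resampling: replicate each unit round(1/pi)
    times, then resample without replacement at the original size.
    Returns bootstrap replicates of the sample mean."""
    pseudo = []
    for y in sample:
        pseudo.extend([y] * round(1 / inclusion_prob))
    n = len(sample)
    reps = []
    for _ in range(n_boot):
        resample = rng.sample(pseudo, n)   # without-replacement draw
        reps.append(sum(resample) / n)
    return reps

sample = [3.0, 7.0, 4.0, 9.0, 5.0, 6.0, 8.0, 2.0, 5.5, 6.5]
reps = pseudo_population_bootstrap(sample, inclusion_prob=0.1)
```

Because resampling is done without replacement from the pseudo-population, the replicates automatically reflect the finite-population correction that an iid bootstrap would miss.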
  • Stats, Vol. 5, Pages 270-311: Properties and Limiting Forms of the
           Multivariate Extended Skew-Normal and Skew-Student Distributions

    • Authors: Christopher J. Adcock
      First page: 270
      Abstract: This paper is concerned with the multivariate extended skew-normal [MESN] and multivariate extended skew-Student [MEST] distributions, that is, distributions in which the location parameters of the underlying truncated distributions are not zero. The extra parameter leads to greater variability in the moments and critical values, thus providing greater flexibility for empirical work. This paper reports various theoretical properties of the extended distributions, notably their limiting forms as the magnitude of the extension parameter, denoted τ, increases without limit. In particular, it is shown that as τ→−∞, the limiting forms of the MESN and MEST distributions are different. The effect of the difference is exemplified by a study of stock market crashes. A second example is a short study of the extent to which the extended skew-normal distribution can be approximated by the skew-Student.
      Citation: Stats
      PubDate: 2022-03-09
      DOI: 10.3390/stats5010017
      Issue No: Vol. 5, No. 1 (2022)
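The univariate analogue of the MESN family has density f(x) = φ(x)Φ(αx + τ√(1+α²))/Φ(τ), which makes the role of the extension parameter τ concrete; a crude quadrature check confirms it is a proper density. The multivariate theory and limiting forms studied in the paper are not reproduced here:

```python
import math

def std_normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def esn_pdf(x, alpha, tau):
    """Univariate extended skew-normal density
    f(x) = phi(x) * Phi(alpha*x + tau*sqrt(1+alpha^2)) / Phi(tau);
    tau = 0 recovers the ordinary skew-normal."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    shift = tau * math.sqrt(1.0 + alpha * alpha)
    return phi * std_normal_cdf(alpha * x + shift) / std_normal_cdf(tau)

# trapezoidal check that the density integrates to ~1
alpha, tau = 2.0, -1.0
grid = [-10 + 20 * i / 4000 for i in range(4001)]
vals = [esn_pdf(x, alpha, tau) for x in grid]
integral = sum((vals[i] + vals[i + 1]) * (grid[i + 1] - grid[i]) / 2
               for i in range(4000))
```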
  • Stats, Vol. 5, Pages 312-338: Importance of Weather Conditions in a Flight

    • Authors: Gong Chen, Hartmut Fricke, Ostap Okhrin, Judith Rosenow
      First page: 312
      Abstract: Current research initiatives, such as the Single European Sky Air Traffic Management Research Program, call for an air traffic system with improved safety and efficiency records and environmental compatibility. The resulting multi-criteria system optimization and individual flight trajectories require, in particular, reliable three-dimensional meteorological information. The Global (Weather) Forecast System only provides data at a resolution of around 100 km. We postulate a reliable interpolation at high resolution to compute these trajectories accurately and in due time to comply with operational requirements. We investigate different interpolation methods for aerodynamically crucial weather variables such as temperature, wind speed, and wind direction. These methods, including Ordinary Kriging, the radial basis function method, neural networks, and decision trees, are compared concerning cross-validation interpolation errors. We show that using the interpolated data in a flight performance model emphasizes the effect of weather data accuracy on trajectory optimization. Considering a trajectory from Prague to Tunis, a Monte Carlo simulation is applied to examine the effect of errors on input (GFS data) and output (i.e., Ordinary Kriging) on the optimized trajectory.
      Citation: Stats
      PubDate: 2022-03-09
      DOI: 10.3390/stats5010018
      Issue No: Vol. 5, No. 1 (2022)
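Of the interpolators the abstract compares, the radial basis function method is easy to sketch: solve a small Gaussian-kernel system at the observation sites and evaluate the resulting surface anywhere. The sites, temperatures, and shape parameter below are invented, and no claim is made about the paper's tuning:

```python
import math

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for small systems."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rbf_interpolator(points, values, eps=1.0):
    """Gaussian radial-basis-function interpolation on scattered 2-D
    sites: fit kernel coefficients so the surface passes exactly
    through every observation. Shape parameter eps is illustrative."""
    def kernel(p, q):
        return math.exp(-eps * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2))
    gram = [[kernel(p, q) for q in points] for p in points]
    coef = solve_linear(gram, values)
    return lambda p: sum(c * kernel(p, q) for c, q in zip(coef, points))

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
temps = [15.0, 14.0, 16.0, 15.5, 15.2]   # hypothetical temperatures
interp = rbf_interpolator(sites, temps)
```

Exact reproduction at the sites is what distinguishes interpolation from smoothing, and is the property cross-validated in the paper's comparison.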
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Tel: +00 44 (0)131 4513762


JournalTOCs © 2009-