  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 151 of 151 Journals sorted alphabetically
Advances in Complex Systems     Hybrid Journal   (Followers: 10)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 53)
Applied Categorical Structures     Hybrid Journal   (Followers: 5)
Argumentation et analyse du discours     Open Access   (Followers: 7)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 7)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 2)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Biometrical Journal     Hybrid Journal   (Followers: 9)
Biometrics     Hybrid Journal   (Followers: 54)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 2)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 15)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 36)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 15)
Demographic Research     Open Access   (Followers: 14)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Open Access   (Followers: 4)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 8)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 4)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 18)
International Journal of Stochastic Analysis     Open Access   (Followers: 2)
International Statistical Review     Hybrid Journal   (Followers: 12)
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 3)
Journal of Applied Statistics     Hybrid Journal   (Followers: 20)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 24)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 40, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 83)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 7)
Journal of Forecasting     Hybrid Journal   (Followers: 20)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Mathematics and Statistics     Open Access   (Followers: 6)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 33)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 3)
Journal of Statistical Physics     Hybrid Journal   (Followers: 12)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 7)
Journal of Statistical Software     Open Access   (Followers: 17, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 74, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal  
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 38)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 28)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 40)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 27)
Law, Probability and Risk     Hybrid Journal   (Followers: 6)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 3)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 8)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 5)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 15)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 7)
Review of Economics and Statistics     Hybrid Journal   (Followers: 174)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 16)
Sankhya A     Hybrid Journal   (Followers: 3)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 45)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 30)
Statistical Modelling     Hybrid Journal   (Followers: 18)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics and Computing     Hybrid Journal   (Followers: 14)
Statistics and Economics     Open Access  
Statistics in Medicine     Hybrid Journal   (Followers: 152)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 12)
Stochastic Models     Hybrid Journal   (Followers: 2)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 8)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 2)
The American Statistician     Full-text available via subscription   (Followers: 26)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 10)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)
Similar Journals
Computational Statistics
Journal Prestige (SJR): 0.803
Citation Impact (citeScore): 1
Number of Followers: 15  
Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 0943-4062 - ISSN (Online) 1613-9658
Published by Springer-Verlag [2467 journals]
  • Controlling the false discovery rate by a Latent Gaussian Copula Knockoff procedure

      Abstract: The penalized Lasso Cox proportional hazards model has been widely used to identify prognostic biomarkers in high-dimensional settings. However, this method tends to select many false positives, affecting its interpretability. To improve reproducibility, we develop a knockoff procedure that consists of wrapping the Lasso Cox model with the model-X knockoff, resulting in a powerful tool for variable selection that allows for the control of the false discovery rate with finite-sample guarantees. In this paper, we propose a novel approach to sample valid knockoffs for ordinal and continuous variables whose distributions can be skewed or heavy-tailed, which employs a Latent Mixed Gaussian Copula model to account for the dependence structure between the variables, leading to what we call the Latent Gaussian Copula Knockoff (LGCK) procedure. We then combine the LGCK method with the Lasso coefficient difference (LCD) statistic as the importance metric. To our knowledge, our proposal is the first knockoff framework for jointly considering ordinal and continuous data in a non-Gaussian setting and a survival context. We illustrate the proposed methodology’s effectiveness by applying it to a real lung cancer gene expression dataset.
      PubDate: 2023-03-25
       
  • Unsupervised learning on U.S. weather forecast performance

      Abstract: Nowadays, climate events and weather predictions have a huge impact on human activities. To understand the accuracy of weather prediction, we applied the functional principal component analysis (FPCA) method to investigate the main pattern of variance within the U.S. weather prediction error over a period of 3 years. We further grouped the states in the U.S. based on their similarity in weather forecast performance using two types of functional clustering approaches: the filtering method and the model-based method. The strengths and weaknesses of each clustering method were assessed through simulation studies. Then, the clustering approaches were applied to U.S. weather data from 2014 to 2017. Through clustering, cluster-specific patterns were visually detected, and the cluster-to-cluster differences were quantified in order to identify the most and least predictable U.S. states.
      PubDate: 2023-03-20
       
  • Predictive stability criteria for penalty selection in linear models

      Abstract: Choosing a shrinkage method can be done by selecting a penalty from a list of pre-specified penalties or by constructing a penalty based on the data. If a list of penalties for a class of linear models is given, we introduce a predictive stability criterion based on data perturbation to select a shrinkage method from the list. Simulation studies show that our predictive method identifies shrinkage methods that usually agree with existing literature and helps explain heuristically when a given shrinkage method can be expected to perform well. If the preference is to construct a penalty customized for a given problem, then we propose a technique based on genetic algorithms, again using a predictive criterion. We find that, in general, a custom penalty never performs worse than any commonly used penalty, and there are cases in which the custom penalty reduces to a recognizable penalty. Since penalty selection is mathematically equivalent to prior selection, our method also constructs priors. Our methodology allows us to observe that the oracle property typically holds for penalties that satisfy basic regularity conditions and therefore is not restrictive enough to play a direct role in penalty selection. In addition, our methodology can be immediately applied to real data problems and permits us to take model mis-specification into account.
      PubDate: 2023-03-16
       
  • Generative models and Bayesian inversion using Laplace approximation

      Abstract: The Bayesian approach to solving inverse problems relies on the choice of a prior. This critical ingredient allows expert knowledge or physical constraints to be formulated in a probabilistic fashion and plays an important role in the success of the inference. Recently, Bayesian inverse problems were solved using generative models as highly informative priors. Generative models are a popular tool in machine learning to generate data whose properties closely resemble those of a given database. Typically, the generated distribution of data is embedded in a low-dimensional manifold. For the inverse problem, a generative model is trained on a database that reflects the properties of the sought solution, such as typical structures of the tissue in the human brain in magnetic resonance imaging. The inference is carried out in the low-dimensional manifold determined by the generative model that strongly reduces the dimensionality of the inverse problem. However, this procedure produces a posterior that does not admit a Lebesgue density in the actual variables and the accuracy attained can strongly depend on the quality of the generative model. For linear Gaussian models, we explore an alternative Bayesian inference based on probabilistic generative models; this inference is carried out in the original high-dimensional space. A Laplace approximation is employed to analytically derive the prior probability density function required, which is induced by the generative model. Properties of the resulting inference are investigated. Specifically, we show that derived Bayes estimates are consistent, in contrast to the approach in which the low-dimensional manifold of the generative model is employed. The MNIST data set is used to design numerical experiments that confirm our theoretical findings. It is shown that the approach proposed can be advantageous when the information contained in the data is high and a simple heuristic is considered for the detection of this case. Finally, the pros and cons of both approaches are discussed.
      PubDate: 2023-03-16
       
  • A model-based ultrametric composite indicator for studying waste management in Italian municipalities

      Abstract: A Composite Indicator (CI) is a useful tool to synthesize information on a multidimensional phenomenon and make policy decisions. Multidimensional phenomena are often modeled by hierarchical latent structures that reconstruct relationships between variables. In this paper, we propose an exploratory, simultaneous model for building a hierarchical CI system to synthesize a multidimensional phenomenon and analyze its several facets. The proposal, called the Ultrametric Composite Indicator (UCI) model, reconstructs the hierarchical relationships among manifest variables detected by the correlation matrix via an extended ultrametric correlation matrix. The latter has the feature of being one-to-one associated with a hierarchy of latent concepts. Furthermore, the proposal introduces a test to unravel relevant dimensions in the hierarchy and retain statistically significant higher-level CIs. A simulation study is presented to compare the proposal with other existing methodologies. Finally, the UCI model is applied to study Italian municipalities’ behavior toward waste management and to provide a tool to guide their councils in policy decisions.
      PubDate: 2023-03-16
       
  • Spatial correlation in weather forecast accuracy: a functional time series approach

      Abstract: A functional time series approach is proposed for investigating spatial correlation in daily maximum temperature forecast errors for 111 cities spread across the U.S. The modelling of spatial correlation is most fruitful for longer forecast horizons, and becomes less relevant as the forecast horizon shrinks towards zero. For 6-day-ahead forecasts, the functional approach uncovers interpretable regional spatial effects, and captures the higher variance observed in inland cities versus coastal cities, as well as the higher variance observed in mountain and midwest states. The functional approach also naturally handles missing data through modelling a continuum, and can be implemented efficiently by exploiting the sparsity induced by a B-spline basis. The temporal dependence in the data is modeled through temporal dependence in functional basis coefficients. Independent first order autoregressions with generalized autoregressive conditional heteroskedasticity [AR(1)+GARCH(1,1)] and Student-t innovations work well to capture the persistence of basis coefficients over time and the seasonal heteroskedasticity reflecting higher variance in winter. Through exploiting autocorrelation in the basis coefficients, the functional time series approach also yields a method for improving weather forecasts and uncertainty quantification. The resulting method corrects for bias in the weather forecasts, while reducing the error variance.
      PubDate: 2023-03-14
       
  • Threshold effect in varying coefficient models with unknown heteroskedasticity

      Abstract: This paper extends threshold regression to threshold effects in varying coefficient models. We allow for either cross-section or time series observations. Estimation of the regression parameters is considered. An asymptotic distribution theory for the regression estimates (the threshold and the regression slopes) is developed. The distribution of threshold estimates is found to be non-standard. Under some sufficient conditions, we show that the proposed estimator for regression slopes is root-n consistent and asymptotically normally distributed, and that the proposed estimator for the varying coefficient is consistent and also asymptotically normally distributed but at a rate slower than root-n. Consistent estimators for the asymptotic variances of the proposed estimators are provided. Monte Carlo simulations are presented to assess the performance of the asymptotic approximations. The empirical relevance of the theory is illustrated through an application to the relationship between environmental regulation and regional technological innovation.
      PubDate: 2023-03-13
       
  • Inference and expected total test time for step-stress life test in the presence of complementary risks and incomplete data

      Abstract: Complementary risks are common and important in the engineering field. However, there is little research on them because their derivation is more complex than that of the competing risk model. In this paper, we concentrate on inference for the step-stress partially accelerated life test in the presence of complementary risks under a progressive type-II censoring scheme. The Weibull distribution is chosen as the baseline lifetime of the model. The tampered random variable model is adopted as the statistical acceleration model in the accelerated test. We apply both classical and Bayesian methods to estimate the lifetime parameters and acceleration factors. The reliability and reversed hazard rate are estimated based on the parametric estimates. The computational formulae of expected total test time are derived under the step-stress and censored setting. The theoretical calculations are compared with simulated values to verify the derivation. Also, numerical studies, including a simulation study and a real-data analysis with an engineering background, are conducted to compare and illustrate the performance of the approaches proposed in the paper. Some conclusions and suggestions for actual production are given at the end of the paper.
      PubDate: 2023-03-12
       
  • Reducing the overfitting in the gROC curve estimation

      Abstract: The generalized receiver-operating characteristic (gROC) curve considers the classification ability of diagnostic tests when both higher and lower values of the marker are associated with higher probabilities of being positive. Its empirical estimation requires selecting the best classification subsets among those satisfying a particular condition. Both strong and weak consistency have already been proved. However, using the same data both to select the classification subsets and to calculate the gROC curve leads to an over-optimistic estimate of the real performance of the diagnostic criteria on future samples. In this work, the bias of the empirical gROC curve estimator is explored through Monte Carlo simulations. In addition, two cross-validation based algorithms are proposed for reducing the overfitting. The practical application of the proposed algorithms is illustrated through the analysis of a real-world dataset. Simulation results suggest that the empirical gROC curve estimator returns optimistic approximations, especially in situations in which the diagnostic capacity of the marker is poor and the sample size is small. The newly proposed algorithms improve the estimation of the actual diagnostic test accuracy, and yield almost unbiased gAUCs in most of the considered scenarios. However, the cross-validation based algorithms reported larger \(L_1\)-errors than the standard empirical estimators, and increase the computational cost of the procedures. As online supplementary material, this manuscript includes an R function which wraps up the implemented routines.
      PubDate: 2023-03-10
       
  • Correction: Sparse reduced-rank regression for simultaneous rank and variable selection via manifold optimization

      PubDate: 2023-03-01
       
  • Predictors with measurement error in mixtures of polynomial regressions

      Abstract: A substantial body of research on mixtures-of-regressions models has developed over the past 20 years. While much of the recent literature has focused on flexible mixtures-of-regressions models, there is still considerable utility in imposing structure on the mixture components through fully parametric models. One feature of the data that is scarcely addressed in mixtures of regressions is the presence of measurement error in the predictors. The limited existing research on this topic concerns the case where classical measurement error is added to the classic mixtures-of-linear-regressions model. In this paper, we consider the setting of mixtures of polynomial regressions where the predictors are subject to classical measurement error. Moreover, each component is allowed to have a different degree for the polynomial structure. We utilize a generalized expectation-maximization algorithm for performing maximum likelihood estimation. For estimating standard errors, we extend a semiparametric bootstrap routine that has been employed for mixtures of linear regressions without measurement error in the predictors. Numeric work, for practical reasons identified, is limited to estimating two-component models. We consider a likelihood ratio test for determining if there is a higher-degree polynomial term in one of the components. Model selection criteria are also highlighted as a way of determining an appropriate model. A simulation study and an application to the classic nitric oxide emissions data are provided.
      PubDate: 2023-03-01
       
  • Semi-supervised adapted HMMs for P2P credit scoring systems with reject inference

      Abstract: The majority of current credit-scoring models used for loan approval processing are built on the basis of information from accepted credit applicants whose ability to repay the loan is known. This situation generates what is called selection bias: the sample is not representative of the population of applicants, since rejected applications are excluded. This affects the eligibility of those models from a statistical and economic point of view, especially for the models used in peer-to-peer lending platforms, since their rejection rate is extremely high. The method of inferring rejected applicants’ information in the process of constructing credit scoring models is known as reject inference. This study proposes a semi-supervised learning framework based on hidden Markov models (SSHMM) as a novel method of reject inference. Real data from the Lending Club platform, the most used online lending marketplace in the United States as well as the rest of the world, is used to test the effectiveness of our method against existing approaches. The results of this study clearly illustrate the proposed method’s superiority, stability, and adaptability.
      PubDate: 2023-03-01
       
  • Model selection using PRESS statistic

      Abstract: The most widely used statistic, \(R^2\), has a fundamental weakness in model building: it favors adding more predictors to the model because \(R^2\) can only increase. In effect, additional predictors start fitting noise in the data. Other measures used in selecting a regression model, such as \(R^2_{adj}\), AIC, SBC, and Mallows’ \(C_p\), do not guarantee that the model selected will also make better predictions of future values. To avoid this, data scientists withhold a percentage of the data for validation purposes. The PRESS statistic does something similar by withholding each observation in calculating its own predicted value. In this paper, we investigated the behavior of \(R^2_{PRESS}\) and how it performs compared to other criteria in model selection in the presence of unnecessary predictors. Using simulated data, we found that \(R^2_{PRESS}\) generally performed best in selecting the true model as the best model for prediction among the model selection measures considered.
      PubDate: 2023-03-01
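
      Illustration (not part of the article): the leave-one-out idea behind PRESS, described in the abstract above, can be sketched for ordinary least squares. This sketch assumes the common definition \(R^2_{PRESS} = 1 - PRESS/SST\) and uses the standard shortcut that a leave-one-out residual equals the ordinary residual divided by one minus its leverage. The function name press_statistics and the simulated data are hypothetical, and the paper's own definitions may differ in detail.

```python
import numpy as np

def press_statistics(X, y):
    """Return (R^2, R^2_PRESS) for an OLS fit with an intercept.

    Assumes R^2_PRESS = 1 - PRESS / SST (an assumption of this sketch).
    """
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept column
    Q, _ = np.linalg.qr(X1)                         # thin QR of the design matrix
    leverage = np.sum(Q**2, axis=1)                 # diagonal of the hat matrix
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # OLS coefficients
    resid = y - X1 @ beta
    loo_resid = resid / (1.0 - leverage)            # leave-one-out residuals
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid**2) / sst
    r2_press = 1.0 - np.sum(loo_resid**2) / sst     # PRESS = sum of squared LOO residuals
    return r2, r2_press

# Simulated data: two real predictors plus five pure-noise predictors.
rng = np.random.default_rng(42)
n = 40
X_true = rng.normal(size=(n, 2))
y = 1.0 + X_true @ np.array([2.0, -1.0]) + rng.normal(size=n)
X_noisy = np.column_stack([X_true, rng.normal(size=(n, 5))])

print("true model    :", press_statistics(X_true, y))
print("with noise    :", press_statistics(X_noisy, y))
```

      Because the nested model's \(R^2\) cannot decrease when predictors are added, the first number in the second tuple is never smaller than in the first, whereas \(R^2_{PRESS}\) is free to drop.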
       
  • Sparse reduced-rank regression for simultaneous rank and variable selection via manifold optimization

      Abstract: We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimation of the coefficient parameter even if the true rank of the coefficient parameter is high. Using sparse regularization, we can also select an optimal value of the rank. We conduct Monte Carlo experiments and a real data analysis to illustrate the effectiveness of our proposed method.
      PubDate: 2023-03-01
       
  • Adaptive smoothing spline estimator for the function-on-function linear regression model

      Abstract: In this paper, we propose an adaptive smoothing spline (AdaSS) estimator for the function-on-function linear regression model where each value of the response, at any domain point, depends on the full trajectory of the predictor. The AdaSS estimator is obtained by the optimization of an objective function with two spatially adaptive penalties, based on initial estimates of the partial derivatives of the regression coefficient function. This allows the proposed estimator to adapt more easily to the true coefficient function over regions of large curvature and not to be undersmoothed over the remaining part of the domain. A novel evolutionary algorithm is developed ad hoc to obtain the optimization tuning parameters. Extensive Monte Carlo simulations have been carried out to compare the AdaSS estimator with competitors that have already appeared in the literature. The results show that our proposal mostly outperforms the competitors in terms of estimation and prediction accuracy. Lastly, those advantages are also illustrated in two real-data benchmark examples. The AdaSS estimator is implemented in the R package adass, openly available online on CRAN.
      PubDate: 2023-03-01
       
  • Model aggregation for doubly divided data with large size and large dimension

      Abstract: Massive data are often featured with high dimensionality as well as large sample size, which typically cannot be stored in a single machine and thus make both analysis and prediction challenging. We propose a distributed gridding model aggregation (DGMA) approach to predicting the conditional mean of a response variable, which overcomes the storage limitation of a single machine and the curse of high dimensionality. Specifically, on each local machine that stores partial data of relatively moderate sample size, we develop the model aggregation approach by splitting predictors wherein a greedy algorithm is developed. To obtain the optimal weights across all local machines, we further design a distributed and communication-efficient algorithm. Our procedure effectively distributes the workload and dramatically reduces the communication cost. Extensive numerical experiments are carried out on both simulated and real datasets to demonstrate the feasibility of the DGMA method.
      PubDate: 2023-03-01
       
  • Efficient computation of tight approximations to Chernoff bounds

      Abstract: Chernoff bounds are a powerful application of the Markov inequality to produce strong bounds on the tails of probability distributions. They are often used to bound the tail probabilities of sums of Poisson trials, or in regression to produce conservative confidence intervals for the parameters of such trials. The bounds provide expressions for the tail probabilities that can be inverted for a given probability/confidence to provide tail intervals. The inversions involve the solution of transcendental equations and it is often convenient to substitute approximations that can be exactly solved, e.g. by the quadratic equation. In this paper we introduce approximations for the Chernoff bounds whose inversion can be exactly solved with a quadratic equation, but which are closer approximations than those adopted previously.
      PubDate: 2023-03-01
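
      Illustration (not part of the article): the inversion problem described in the abstract above can be sketched with the textbook upper-tail bound for a sum of Poisson trials with mean \(\mu\), namely \(P(X \ge (1+\delta)\mu) \le \exp(-\mu[(1+\delta)\ln(1+\delta) - \delta])\), and its familiar relaxation \(\exp(-\delta^2\mu/(2+\delta))\). The relaxation is an example of a previously used approximation; it is not the new approximation proposed in the paper. Setting the relaxed bound equal to a target probability \(p\) gives the quadratic \(\mu\delta^2 - L\delta - 2L = 0\) with \(L = \ln(1/p)\), while the exact bound has to be inverted numerically. All function names below are hypothetical.

```python
import math

def exact_exponent(delta, mu):
    """Exponent of the exact multiplicative Chernoff upper-tail bound."""
    return mu * ((1.0 + delta) * math.log1p(delta) - delta)

def delta_quadratic(mu, p):
    """Closed-form inverse of the relaxed bound exp(-delta^2 * mu / (2 + delta))."""
    L = math.log(1.0 / p)
    # Positive root of mu * delta^2 - L * delta - 2 * L = 0.
    return (L + math.sqrt(L * L + 8.0 * mu * L)) / (2.0 * mu)

def delta_exact(mu, p, tol=1e-12):
    """Numerical inverse of the exact bound (transcendental), by bisection."""
    L = math.log(1.0 / p)
    lo, hi = 0.0, 1.0
    while exact_exponent(hi, mu) < L:       # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if exact_exponent(mid, mu) < L:
            lo = mid
        else:
            hi = mid
    return hi

mu, p = 50.0, 0.05
print("delta from quadratic relaxation:", delta_quadratic(mu, p))
print("delta from exact bound         :", delta_exact(mu, p))
```

      The relaxed bound is looser than the exact one, so its inverted \(\delta\) is never smaller; the gap between the two printed values is the slack that a tighter, still quadratically invertible approximation would aim to reduce.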
       
  • Adjusting the adjusted Rand Index

      Abstract: The Adjusted Rand Index (ARI) is arguably one of the most popular measures for cluster comparison. The adjustment of the ARI is based on a hypergeometric distribution assumption which is not satisfactory from a modeling point of view because (i) it is not appropriate when the two clusterings are dependent, (ii) it forces the size of the clusters, and (iii) it ignores the randomness of the sampling. In this work, we present a new "modified" version of the Rand Index. First, as in Russell et al. (J Malar Inst India 3(1), 1940), we consider only the pairs consistent by similarity and ignore the pairs consistent by difference to define the MRI. Second, we base the adjusted version, called MARI, on a multinomial distribution instead of a hypergeometric distribution. The multinomial model is advantageous because it does not force the size of the clusters, correctly models randomness and is easily extended to the dependent case. We show that ARI is biased under the multinomial model and that the difference between ARI and MARI can be significant for small n but essentially vanishes for large n, where n is the number of individuals. Finally, we provide an efficient algorithm to compute all these quantities ((A)RI and M(A)RI) based on a sparse representation of the contingency table in our aricode package. The space and time complexity is linear with respect to the number of samples and, more importantly, does not depend on the number of clusters as we do not explicitly compute the contingency table.
      PubDate: 2023-03-01
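
      Illustration (not part of the article): a minimal sketch of the standard, hypergeometric-adjusted ARI computed from a sparse representation of the contingency table, in the spirit of the abstract above. Only the nonzero cells are stored, so the dense clusters-by-clusters table is never built. This is not the aricode implementation and it does not compute the paper's MARI; the function name adjusted_rand_index and the convention of returning 1.0 in degenerate cases are assumptions of this sketch.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Standard (hypergeometric) ARI from sparse co-occurrence counts."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both clusterings must label the same individuals")

    # Nonzero cells of the contingency table and its two margins.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Numbers of pairs grouped together in both, in A only, in B only.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0                      # fewer than two individuals (sketch convention)

    expected = sum_a * sum_b / total_pairs
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0                      # degenerate clusterings (sketch convention)
    return (sum_cells - expected) / (max_index - expected)

print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))   # same partition, relabeled
print(adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2]))
```

      Counting with hash maps keeps both time and memory linear in the number of individuals, independent of the number of clusters, which is the complexity property the abstract highlights.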
       
  • Controlling the familywise error rate when performing multiple comparisons in a linear latent variable model

      Abstract: In latent variable models (LVMs) it is possible to analyze multiple outcomes and to relate them to several explanatory variables. In this context many parameters are estimated and it is common to perform multiple tests, e.g. to investigate outcome-specific effects using Wald tests or to check the correct specification of the modeled mean and variance using a forward stepwise selection (FSS) procedure based on Score tests. Controlling the family-wise error rate (FWER) at its nominal level involves adjustment of the p-values for multiple testing. Because of the correlation between test statistics, the Bonferroni procedure is often too conservative. In this article, we extend the max-test procedure to the LVM framework for Wald and Score tests. Depending on the correlation between the test statistics, the max-test procedure is equivalent to or more powerful than the Bonferroni procedure while also providing, asymptotically, a strong control of the FWER for non-iterative procedures. Using simulation studies, we assess the finite sample behavior of the max-test procedure for Wald and Score tests in LVMs. We apply our procedure to quantify the neuroinflammatory response to mild traumatic brain injury in nine brain regions.
      PubDate: 2023-03-01
       
  • A survival regression with cure fraction applied to cervical cancer

      Abstract: A new survival model is proposed in the presence of surviving fractions and unobserved dispersion. It is obtained by considering several latent factors (or risks) that generated the observed lifetime, which follows a generalized Poisson distribution, and it includes the promotion time cure model as a special case. We explore maximum likelihood tools for inference with the aid of the expectation maximization algorithm for estimating the parameters, while the model discrimination problem is treated with the aid of the likelihood ratio test. The new regression is applied to cervical cancer data to evaluate covariate effects in the cured fraction and non-cured group.
      PubDate: 2023-03-01
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762