Statistical Papers
Journal Prestige (SJR): 1.004
Citation Impact (citeScore): 1
Hybrid journal (It can contain Open Access articles)
ISSN (Print) 1613-9798 - ISSN (Online) 0932-5026
Published by Springer-Verlag
• Variable selection in Propensity Score Adjustment to mitigate selection
bias in online surveys

Abstract: The development of new survey data collection methods such as online surveys has been particularly advantageous for social studies in terms of reduced costs, immediacy and enhanced questionnaire possibilities. However, many such methods are strongly affected by selection bias, leading to unreliable estimates. Calibration and Propensity Score Adjustment (PSA) have been proposed as methods to remove selection bias in online nonprobability surveys. Calibration requires population totals to be known for the auxiliary variables used in the procedure, while PSA estimates the volunteering propensity of an individual using predictive modelling. The variables included in these models must be carefully selected in order to maximise the accuracy of the final estimates. This study presents an application, using synthetic and real data, of variable selection techniques developed for knowledge discovery in data to choose the best subset of variables for propensity estimation. We also compare the performance of PSA using different classification algorithms, after which calibration is applied. We also present an application of this methodology in a real-world situation, using it to obtain estimates of population parameters. The results obtained show that variable selection using appropriate methods can provide less biased and more efficient estimates than using all available covariates.
PubDate: 2022-12-01

• Optimality of circular equineighbored block designs under correlated
observations

Abstract: In some experiments each observation is correlated with the observations in its neighborhood. Circulant correlation is a structure that describes this situation for circular block designs. The main aim of this paper is to study optimal properties of some circular block designs under the model with circulant correlation. Also, we introduce circular equineighbored designs (CEDs) and show that, under circulant correlation, some CEDs are universally optimal over the class of generalized binary block designs. Some methods of constructing these optimal designs with various numbers of treatments and block sizes are presented.
PubDate: 2022-12-01

• Sequential support points

Abstract: By minimizing the energy distance, the support points (SP) method can efficiently compact a big training sample into a small representative point set. However, when the training sample is deficient, the quality of SP will be greatly reduced. In this paper, a sequential version of SP, called sequential support points (SSP), is proposed. The new method has two appealing features. First, the construction algorithm of SSP can adaptively update the proposal density in the importance sampling process based on the existing information. Second, a hyperparameter is introduced to balance the representativeness of sequentially added points with the representativeness of overall points, so that some special purpose experimental designs, such as augmented designs and sliced designs, can be efficiently constructed by setting the hyperparameter.
PubDate: 2022-12-01
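
As a rough illustration of the energy distance that the support points method minimizes, the sketch below computes the standard sample (V-statistic) form, 2 E|X - Y| - E|X - X'| - E|Y - Y'|, for one-dimensional data. The function name `energy_distance` and the restriction to scalar observations are assumptions made here for illustration; this is not the SP or SSP construction algorithm from the paper.

```python
def energy_distance(xs, ys):
    """Sample energy distance between two 1-D samples (V-statistic form).

    Hypothetical helper for illustration only: it evaluates
    2*mean|x - y| - mean|x - x'| - mean|y - y'| over all pairs.
    """
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # Average absolute difference across the two samples.
    cross = sum(abs(x - y) for x in xs for y in ys) / (n * m)
    # Average absolute difference within each sample (self-pairs included).
    within_x = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    within_y = sum(abs(a - b) for a in ys for b in ys) / (m * m)
    return 2 * cross - within_x - within_y
```

A small value indicates that the candidate point set `ys` represents the sample `xs` well; the value is zero when both samples have the same empirical distribution.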

• A measure of evidence based on the likelihood-ratio statistics

Abstract: In this paper, we show that the likelihood-ratio measure (a) is invariant with respect to dominating sigma-finite measures, (b) satisfies logical consequences which are not satisfied by standard p values, (c) respects frequentist properties, i.e., the type I error can be properly controlled, and, under mild regularity conditions, (d) can be used as an upper bound for posterior probabilities. We also discuss a generic application to test whether the genotype frequencies of a given population are under the Hardy–Weinberg equilibrium, under inbreeding restrictions or under outbreeding restrictions.
PubDate: 2022-12-01

• New insights on goodness-of-fit tests for ranked set samples

Abstract: Ranked set sampling (RSS) utilizes auxiliary information on the variable of interest so as to assist the experimenter in acquiring an informative sample from the population. The resulting sample has a stratified structure, and often improves statistical inference with respect to the simple random sample of comparable size. In the RSS literature, there are some goodness-of-fit tests based on the empirical estimators of the in-stratum cumulative distribution functions (CDFs). Motivated by the fact that the in-stratum CDFs in RSS can be expressed as functions of the population CDF, some new tests are developed and their asymptotic properties are explored. An extensive simulation study is performed to evaluate properties of different testing procedures when the parent distribution is normal. It turns out that the proposed tests can be considerably more powerful than their contenders in many situations. An application in the context of fishery is also provided.
PubDate: 2022-12-01

• Subdata selection algorithm for linear model discrimination

Abstract: A statistical method is likely to be sub-optimal if the assumed model does not reflect the structure of the data at hand. For this reason, it is important to perform model selection before statistical analysis. However, selecting an appropriate model from a large candidate pool is usually computationally infeasible when faced with a massive data set, and little work has been done to study data selection for model selection. In this work, we propose a subdata selection method based on leverage scores which enables us to conduct the selection task on a small subdata set. Compared with existing subsampling methods, our method not only improves the probability of selecting the best model but also enhances the estimation efficiency. We justify this both theoretically and numerically. Several examples are presented to illustrate the proposed method.
PubDate: 2022-12-01

• Copula-based measures of asymmetry between the lower and upper tail
probabilities

Abstract: We propose a copula-based measure of asymmetry between the lower and upper tail probabilities of bivariate distributions. The proposed measure has a simple form and possesses some desirable properties as a measure of asymmetry. The limit of the proposed measure as the index goes to the boundary of its domain can be expressed in a simple form under certain conditions on copulas. A sample analogue of the proposed measure for a sample from a copula is presented and its weak convergence to a Gaussian process is shown. Another sample analogue of the presented measure, which is based on a sample from a distribution on $\mathbb{R}^2$, is given. Simple methods for interval and region estimation are presented. A simulation study is carried out to investigate the performance of the proposed sample analogues and methods for interval estimation. As an example, the presented measure is applied to daily returns of S&P500 and Nikkei225. A trivariate extension of the proposed measure and its sample analogue are briefly discussed.
PubDate: 2022-12-01

• Iterative restricted OK estimator in generalized linear models and the
selection of tuning parameters via MSE and genetic algorithm

Abstract: This article introduces an iterative restricted OK estimator in generalized linear models to address the problem of multicollinearity by imposing exact linear restrictions on the parameters. It is a versatile estimator, which contains maximum likelihood (ML), restricted ML, Liu, restricted Liu, ridge and restricted ridge estimators in generalized linear models. To assess the performance of the restricted OK estimator relative to its counterparts, various comparisons are given where the performance evaluation criterion is the scalar mean square error (SMSE). In addition to the theoretical comparisons, illustrations and simulation studies for Gamma and Poisson responses are conducted to examine the performance of the estimators in terms of estimated and predicted MSE. Optimization techniques are also applied to find the values of the tuning parameters by minimizing the SMSE and by using a genetic algorithm.
PubDate: 2022-12-01

• Estimation methods for stationary Gegenbauer processes

Abstract: This paper reviews alternative estimation methods for stationary Gegenbauer processes specifically, as distinct from the more general long memory models. A short set of Monte Carlo simulations is used to compare the accuracy of these methods. The conclusion is that a Bayesian technique results in the highest accuracy. The paper concludes with an examination of the SILSO Sunspot Number series as collated by the Royal Observatory of Belgium.
PubDate: 2022-12-01

• Exact prediction intervals for future exponential and Pareto lifetimes
based on ordered ranked set sampling of non-random and random size

Abstract: In the present paper, two pivotal statistics are suggested to construct prediction intervals of future observations from the exponential and Pareto distributions in the context of ordered ranked set sampling. Our study encompasses two cases. In the first case, the sample size is assumed to be fixed; in the second case, the sample size is assumed to be a positive integer-valued random variable. In addition to deriving explicit forms for the distribution functions of the two pivotal statistics, we consider some special cases for the random size of the sample. Moreover, a simulation study is carried out to assess the efficiency of the suggested methods. Finally, an example representing lifetime data is analyzed.
PubDate: 2022-12-01

• Estimation for partially linear additive regression with spatial data

Abstract: This paper studies a partially linear additive regression with spatial data. A new estimation procedure is developed for estimating the unknown parameters and additive components in regression. The proposed method is suitable for high dimensional data: there is no need to solve the restricted minimization problem, and no iterative algorithms are needed. Under mild regularity assumptions, the asymptotic distribution of the estimator of the unknown parameter vector is established, and the asymptotic distributions of the estimators of the unknown functions are also derived. Finite sample properties of our procedures are studied through Monte Carlo simulations. A real data example about spatial soil data is used to illustrate our proposed methodology.
PubDate: 2022-12-01

• Generalized log-gamma additive partial linear models with P-spline
smoothing

Abstract: In this paper additive partial linear models with generalized log-gamma errors and P-spline smoothing are proposed for uncensored data. This class derived from the generalized gamma distribution contains various continuous asymmetric distributions to the right and to the left with domain on the real line and has the normal distribution as a particular case. The location parameter is modeled in a semiparametric way so that one has a generalized gamma accelerated failure time additive partial linear model. A joint iterative process is derived that combines the penalized Fisher scoring algorithm for estimating the parametric and nonparametric regression coefficients and a quasi-Newton procedure for obtaining the scale and shape estimates. Discussions on the inferential aspects of the former estimators as well as on the derivation of the effective degrees of freedom are given. Diagnostic procedures are also proposed, such as residual analysis and sensitivity studies based on the local influence approach. Simulation studies are performed to assess the empirical distributions of the parametric and nonparametric estimators, and a real data set on personal injury insurance claims made in Australia from January 1998 to June 1999 is analyzed by the methodology developed in the paper. Technical results, tables, graphs, R code and the data set used in the application are presented as Supplementary Materials.
PubDate: 2022-12-01

• The shortest confidence interval for Poisson mean

Abstract: The existence of the shortest confidence interval for the Poisson mean is shown. A method of obtaining such an interval is presented as well.
PubDate: 2022-12-01

• A class of estimators based on overlapping sample spacings

Abstract: In parametric models, minimising different estimators of the Kullback–Leibler divergence between the empirical distribution function and the true distribution function yields the maximum likelihood estimator (MLE) and the maximum spacings product estimator. This approach has been extended in the literature to minimise some estimators of Csiszár divergence between the empirical distribution function and the true distribution function. Such estimators based on disjoint spacings have recently been studied in the literature. This paper considers analogues of these estimators based on overlapping sample spacings. The estimators have been found to be consistent and asymptotically normally distributed under a broad set of regularity conditions. Asymptotically and for any fixed order of spacings, such estimators are at least as good as the corresponding estimators based on non-overlapping spacings. Simulation studies show that some of these estimators perform better than the MLE for contaminated models. An application to real data reveals that the considered estimators can perform better than the MLE for parsimonious models.
PubDate: 2022-11-27

• New closed-form efficient estimators for the negative binomial
distribution

Abstract: The negative binomial (NB) distribution is of interest in various application studies. New closed-form efficient estimators are proposed for the two NB parameters, based on closed-form $\sqrt{n}$-consistent estimators. The asymptotic efficiency and normality of the new closed-form efficient estimators are guaranteed by the theorem applied to derive the new estimators. Since the new closed-form efficient estimators have the same asymptotic distribution as the maximum likelihood estimators (MLEs), these are denoted as MLE-CEs. Simulation studies suggest that the MLE-CE of the dispersion parameter r performs better than its MLE and the method of moments estimator (MME) for some parameter ranges. The MLE-CE of the probability parameter p exhibits the best performance for relatively large p values, where the positive-definite expected Fisher information matrix exists. The MLE performs better than the MME in this parameter space. The MLE-CE is over 200 times faster than the MLE, especially for large sample sizes, which is good for the big data era. Considering the estimated accuracy and computing time, the MLE-CE is recommended for small r values and large p values, whereas the MME is recommended for other conditions.
PubDate: 2022-11-19
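
For orientation, the method of moments estimator (MME) that the abstract uses as a baseline has a simple closed form. The sketch below assumes the parameterization in which the NB mean is r(1 - p)/p and the variance is r(1 - p)/p^2, so that p = mean/variance and r = mean^2/(variance - mean); the paper may use a different parameterization. The function name `nb_method_of_moments` and the use of population (divide-by-n) moments are choices made here for illustration, and the code does not implement the MLE-CE proposed in the paper.

```python
from statistics import mean, pvariance


def nb_method_of_moments(counts):
    """Method of moments estimates (r, p) for negative binomial counts.

    Assumed parameterization (an assumption of this sketch):
    E[X] = r(1 - p)/p and Var[X] = r(1 - p)/p**2.
    """
    m = mean(counts)
    v = pvariance(counts)  # population variance (divides by n)
    if v <= m:
        # The moment equations have no valid solution without overdispersion.
        raise ValueError("sample variance must exceed sample mean")
    p_hat = m / v
    r_hat = m * m / (v - m)
    return r_hat, p_hat
```

The explicit check on overdispersion reflects a known limitation of the MME: when the sample variance does not exceed the sample mean, the moment equations yield no admissible estimate.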

• Pretest and shrinkage estimation of the regression parameter vector of the
marginal model with multinomial responses

Abstract: The Generalized Estimating Equations (GEE) approach has become a popular method for correlated categorical multinomial response data in clinical trials and other biomedical experiments. GEE estimates of the marginal regression parameter vector are consistent. In this article, we propose the pretest, shrinkage, and positive shrinkage estimators for the regression vector of the marginal model with multinomial responses. The estimators are compared analytically via their asymptotic quadratic risks, and numerically via their simulated relative efficiencies. We apply the proposed estimation technique to two real data examples and employ a bootstrap approach to compute the bootstrap mean squared error of the estimators.
PubDate: 2022-11-17

• Bounds for Gini’s mean difference based on first four moments, with
some applications

Abstract: In this paper, we obtain lower and upper bounds for the Gini mean difference for the case of independent and identically distributed random variables based on the information about mean, variance, skewness, and kurtosis of the distribution. We also obtain some relationships between the three dispersion measures in the general case. The established results improve some well-known bounds and inequalities. These results are then used to sharpen some inequalities concerning Gini’s index, order statistics and premium principles. Examples demonstrate that the proposed bounds perform much better than the existing ones.
PubDate: 2022-11-16
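
Gini's mean difference is E|X1 - X2| for independent, identically distributed X1 and X2. As a minimal sketch, the code below computes the usual unbiased sample estimate, the average of |x_i - x_j| over all pairs i < j, using the equivalent sorted-data formula. The function name `gini_mean_difference` is hypothetical, and the code illustrates only the quantity being bounded, not the moment-based bounds derived in the paper.

```python
def gini_mean_difference(xs):
    """Unbiased sample estimate of Gini's mean difference E|X1 - X2|.

    Uses the identity sum_{i<j} |x_i - x_j| = sum_i (2i - n - 1) * x_(i),
    where x_(1) <= ... <= x_(n) are the sorted observations (1-based i).
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(xs)
    # Weighted sum over order statistics equals the sum of pairwise gaps.
    pair_sum = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    # Divide by the number of pairs, n(n - 1)/2.
    return 2 * pair_sum / (n * (n - 1))
```

Sorting gives an O(n log n) computation instead of the O(n^2) double loop over pairs.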

• Sequential design of multi-fidelity computer experiments with effect
sparsity

Abstract: A growing area of focus is using multi-fidelity (MF) simulations to predict the behavior of complex physical systems. In order to adequately utilize the popular sequential designs to improve the effectiveness of the MF method, two challenges remain to be addressed: good projection properties in the presence of effect sparsity, and the sample allocation between the high-fidelity (HF) and low-fidelity (LF) codes. Unfortunately, no systematic study has hitherto been done to deal with these two key issues simultaneously. This article develops a sequential nested design for MF experiments that pays attention to both the space-filling properties in all subsets of factors and the best combination between the two levels of accuracy. For the first issue, we propose a weighted maximum projection criterion combining the uniformity metrics of the HF and LF experiments to select HF points, where the weights are totally data-driven. Note that the selected HF points are also run in the LF code to form a nested structure. On the other hand, those samples that only appear in the LF simulation are obtained by the original maximum projection design. The second issue is directly connected with deciding which code to run in the next iteration. We use entropy theory to score each fidelity level, such that the one with the greater potential to improve the model accuracy will be selected. The performance of the proposed approach is illustrated through several numerical examples. The results demonstrate that the proposed approach outperforms the other three methods in terms of both the prediction accuracy of the final surrogate model and the uniformity in all subspaces of the two codes.
PubDate: 2022-11-14

• Robustness of a truncated estimator for the smaller of two ordered means

Abstract: In this note, we consider the problem of estimating the smaller of two ordered means. Such problems frequently arise in applications where, for example, aggregated data are observed. In order to combine information from direct and indirect observations, we use the Stein-type truncated estimator. We show that it dominates the direct estimator for distributions with log-concave or log-convex densities.
PubDate: 2022-11-10

• Multivariate copulas with given values at two arbitrary points

Abstract: Copulas are functions that link an n-dimensional distribution function with its one-dimensional margins. In this contribution we show how n-variate copulas with given values at two arbitrary points can be constructed. Thereby, we also answer a hitherto open question of whether lower and upper bounds for n-variate copulas with a given value at a single arbitrary point are achieved. We also introduce and discuss the concept of an $\mathbf{F}$-copula, which is needed for proving our results.
PubDate: 2022-10-30
