Abstract: We study approximate K-optimal designs for various regression models by minimizing the condition number of the information matrix. This minimizes the error sensitivity in the computation of the least squares estimator of the regression parameters and also avoids multicollinearity in regression. Using matrix and optimization theory, we derive several theoretical results for K-optimal designs, including convexity of the K-optimality criterion, lower bounds on the condition number, and symmetry properties of K-optimal designs. A general numerical method is developed to find K-optimal designs for any regression model on a discrete design space. In addition, specific results are obtained for polynomial, trigonometric and second-order response models. PubDate: 2023-02-01
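As an illustration of the criterion involved (not the authors' numerical method), the sketch below evaluates the condition number of the information matrix for a weighted design under a simple quadratic regression model; the design points, weights and model are arbitrary choices.

```python
# Illustrative sketch: evaluating the K-optimality criterion -- the condition
# number of the information matrix -- for a weighted design on a discrete
# design space, here for the quadratic regression model f(x) = (1, x, x^2).
import numpy as np

def information_matrix(points, weights):
    """M(w) = sum_i w_i f(x_i) f(x_i)^T for the quadratic regression model."""
    F = np.column_stack([np.ones_like(points), points, points**2])
    return F.T @ (weights[:, None] * F)

def condition_number(points, weights):
    """K-optimality criterion: ratio of largest to smallest eigenvalue of M(w)."""
    eigvals = np.linalg.eigvalsh(information_matrix(points, weights))
    return eigvals[-1] / eigvals[0]

# Two candidate designs on the discrete design space {-1, 0, 1}:
points = np.array([-1.0, 0.0, 1.0])
print(condition_number(points, np.array([1/3, 1/3, 1/3])))    # uniform weights
print(condition_number(points, np.array([0.25, 0.5, 0.25])))  # more weight at 0
```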
Abstract: In this paper we consider a new approach to regularizing the maximum likelihood estimator of a discrete probability distribution and its application to variable selection. The method relies on choosing the parameter of its convex combination with a low-dimensional target distribution by minimising the squared error (SE) rather than the mean squared error (MSE). Choosing an optimal parameter for every sample yields an MSE no larger than that of the James–Stein shrinkage estimator of a discrete probability distribution. The introduced parameter is estimated by cross-validation and is shown to perform promisingly for synthetic dependence models. The method is applied to introduce regularized versions of information-based variable selection criteria, which are investigated in numerical experiments and turn out to work better than commonly used plug-in estimators under several scenarios. PubDate: 2023-02-01
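The following sketch illustrates the general idea under stated assumptions: it shrinks the empirical distribution of a discrete variable towards a uniform target and picks the mixing parameter by cross-validated squared error. The uniform target, the grid of candidate parameters and the fold scheme are illustrative choices, not the paper's exact procedure.

```python
# Minimal sketch of shrinking an empirical discrete distribution towards a
# low-dimensional target (assumed here to be uniform), with the mixing
# parameter lambda chosen by cross-validation on a squared-error loss.
import numpy as np

def empirical_pmf(sample, n_levels):
    counts = np.bincount(sample, minlength=n_levels)
    return counts / counts.sum()

def cv_lambda(sample, n_levels, lambdas, n_folds=5, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(sample)), n_folds)
    target = np.full(n_levels, 1.0 / n_levels)        # assumed uniform target
    scores = np.zeros(len(lambdas))
    for fold in folds:
        p_train = empirical_pmf(np.delete(sample, fold), n_levels)
        p_valid = empirical_pmf(sample[fold], n_levels)
        for j, lam in enumerate(lambdas):
            p_shrunk = (1 - lam) * p_train + lam * target
            scores[j] += np.sum((p_shrunk - p_valid) ** 2)
    return lambdas[np.argmin(scores)]

sample = np.random.default_rng(1).integers(0, 4, size=60)  # toy data, 4 levels
print(cv_lambda(sample, n_levels=4, lambdas=np.linspace(0, 1, 21)))
```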
Abstract: Design isomorphism is a fundamental problem in orthogonal design theory, and the statistical literature involves two classes of methods: one identifies isomorphic designs at high computational cost, while the other only detects non-isomorphic designs as a feasible alternative. In this paper we exploit the design structure to propose the degree of isomorphism, a novel criterion measuring the similarity between orthogonal designs. A column-wise framework is proposed to accommodate different aspects of design isomorphism, including the detection of non-isomorphism, the identification of isomorphism, and the determination of subclasses for symmetric orthogonal designs. Our framework is highly efficient: the average time for identifying isomorphism between two designs in the selected classes is reduced to about one second. By applying hierarchical clustering with average linkage, a novel classification of non-isomorphic orthogonal designs is also presented from a combinatorial point of view. PubDate: 2023-02-01
Abstract: A common practice in statistics is to take the log transformation of highly skewed data and construct confidence intervals for the population average on the basis of the transformed data. However, when computed from log-transformed data, the confidence interval is for the geometric rather than the arithmetic average, and neglecting this can lead to misleading conclusions. In this paper, we consider an approach based on a regression of the two sample averages to convert the confidence interval for the geometric average into a confidence interval for the arithmetic average of the original untransformed data. The proposed approach is substantially simpler to implement than the existing methods, and an extensive Monte Carlo and bootstrap simulation study suggests that it outperforms them in terms of coverage probability even at very small sample sizes. Several real data examples are analyzed, supporting the simulation findings of the paper. PubDate: 2023-02-01
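The sketch below illustrates the issue the paper addresses, using a lognormal toy sample: exponentiating a t-interval computed on the logs yields an interval for the geometric mean, which typically sits below the arithmetic mean of skewed data. The paper's regression-based conversion itself is not reproduced here.

```python
# Sketch of the back-transformation pitfall: a t-interval on log-transformed
# data, once exponentiated, covers the geometric mean rather than the
# arithmetic mean of the original skewed data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=50)      # skewed positive toy data

logs = np.log(y)
lo, hi = stats.t.interval(0.95, len(y) - 1,
                          loc=logs.mean(), scale=stats.sem(logs))
print("CI for geometric mean :", np.exp(lo), np.exp(hi))
print("sample geometric mean :", np.exp(logs.mean()))
print("sample arithmetic mean:", y.mean())           # typically larger
```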
Abstract: Researchers across many fields routinely analyze trial data using null hypothesis significance tests with a zero null and p < 0.05. To promote thoughtful statistical testing, we propose a visualization tool that highlights practically meaningful effects when calculating sample sizes. The tool re-purposes and adapts funnel plots, originally developed for meta-analyses, after generalizing them to cater for meaningful effects. As with traditional sample size calculators, researchers must nominate anticipated effect sizes and variability alongside the desired power. The advantage of our tool is that it simultaneously presents the sample sizes needed to adequately power tests for equivalence, non-inferiority and superiority, each considered at up to three alpha levels and in positive and negative directions. The tool thus encourages researchers at the design stage to think about the type and level of test in terms of their research goals, costs of errors, meaningful effect sizes and feasible sample sizes. An R implementation of the tool is available online. PubDate: 2023-02-01
Abstract: Beta regression models are widely used for modeling continuous data limited to the unit interval, such as proportions, fractions, and rates. Inference for the parameters of beta regression models is commonly based on maximum likelihood estimation, which is known to be sensitive to discrepant observations. In some cases, a single atypical data point can lead to severe bias and erroneous conclusions about the features of interest. In this work, we develop a robust estimation procedure for beta regression models based on the maximization of a reparameterized \(L_q\)-likelihood. The new estimator offers a trade-off between robustness and efficiency through a tuning constant. To select the optimal value of the tuning constant, we propose a data-driven method which ensures full efficiency in the absence of outliers. We also improve on an alternative robust estimator by applying our data-driven method to select its optimal tuning constant. Monte Carlo simulations suggest marked robustness of the two robust estimators with little loss of efficiency when the proposed selection scheme for the tuning constant is employed. Applications to three datasets are presented and discussed. As a by-product of the proposed methodology, residual diagnostic plots based on robust fits highlight outliers that would be masked under maximum likelihood estimation. PubDate: 2023-02-01
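A minimal, generic sketch of maximum \(L_q\)-likelihood estimation for a beta regression with a logit link is given below; it is not the authors' reparameterized estimator or their tuning-constant selection, and the simulated data, link and value q = 0.9 are illustrative assumptions.

```python
# Generic sketch: each log-density term of the likelihood is replaced by the
# Lq transform L_q(u) = (u^(1-q) - 1) / (1 - q), which down-weights outliers;
# the model is a beta regression with logit link and precision phi.
import numpy as np
from scipy import optimize, special, stats

def neg_lq_likelihood(params, X, y, q):
    beta, log_phi = params[:-1], params[-1]
    mu = special.expit(X @ beta)
    phi = np.exp(log_phi)
    dens = stats.beta.pdf(y, mu * phi, (1 - mu) * phi)
    if np.isclose(q, 1.0):
        terms = np.log(dens)              # q -> 1 recovers the usual log-likelihood
    else:
        terms = (dens ** (1 - q) - 1) / (1 - q)
    return -np.sum(terms)

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
mu_true = special.expit(X @ np.array([0.5, 1.0]))
y = rng.beta(mu_true * 30, (1 - mu_true) * 30)
y[:5] = 0.95                              # a few atypical observations

start = np.zeros(X.shape[1] + 1)
fit = optimize.minimize(neg_lq_likelihood, start, args=(X, y, 0.9),
                        method="Nelder-Mead")
print(fit.x[:-1])                         # robust coefficient estimates
```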
Abstract: In this paper, we study a replacement model under which replacements take place not at a fixed time but at a random time T or upon failure. We study the properties of the mean time to failure of the proposed model. The lifetimes of two replacement models with different random replacement times are compared using several stochastic orderings. A non-parametric test based on U-statistics is proposed for testing constancy of the mean time to failure against the NBUE alternative. The finite-sample performance of the proposed test is evaluated through a Monte Carlo simulation study. Finally, the proposed test procedure is illustrated using lifetime data on air conditioning equipment. PubDate: 2023-02-01
Abstract: This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit; therefore, unlike the Sparse Group Lasso, it does not require prior specification of clusters of variables. To determine the clusters, we solve a particular case of sparse singular value decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated on both biological and simulated data, and details of the implementation in R and of the hyperparameter search are discussed. PubDate: 2023-02-01
Abstract: This paper proposes a new class of permutation-invariant tests for diagonal symmetry around a known point based on the center-outward depth ranking. The asymptotic behavior of the proposed tests under the null distribution is derived, and their performance is assessed through a Monte Carlo study. The results show that the tests perform well compared with other procedures in terms of empirical size and power. We demonstrate that the proposed class includes the celebrated Wilcoxon signed-rank test as a special case in the univariate setting. Finally, we apply the tests to a well-known data set to illustrate the method developed in this paper. PubDate: 2023-02-01
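The univariate special case mentioned above can be illustrated directly; the sketch below tests symmetry about a known centre with the Wilcoxon signed-rank test (the multivariate depth-based tests of the paper are not reproduced), on toy data with an arbitrary hypothesised centre.

```python
# Sketch of the univariate special case: testing symmetry about a known centre
# theta0 by applying the Wilcoxon signed-rank test to the centred observations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=40)   # toy data, true centre shifted
theta0 = 0.0                                  # hypothesised centre of symmetry

stat, pvalue = stats.wilcoxon(x - theta0)     # signed-rank test on centred data
print(stat, pvalue)
```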
Abstract: For a parametric model of distributions, the closest distribution in the model to the true distribution located outside the model is considered. Measuring the closeness between two distributions with the Kullback–Leibler divergence, the closest distribution is called the "information projection." The estimation risk of the maximum likelihood estimator is defined as the expectation of the Kullback–Leibler divergence between the information projection and the maximum likelihood estimative density (the predictive distribution with the plugged-in maximum likelihood estimator). Here, the asymptotic expansion of the risk is derived up to the second order in the sample size, and a sufficient condition on the risk for the Bayes error rate between the predictive distribution and the information projection to fall below a specified value is investigated. Combining these results, the "p/n criterion" is proposed, which determines whether the estimative density is sufficiently close to the information projection for the given model and sample. This criterion can serve as a solution to the sample size or model selection problem. The use of the p/n criterion is demonstrated on two practical datasets. PubDate: 2023-02-01
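As a small illustration of the risk quantity involved (not the p/n criterion itself), the sketch below estimates a Kullback–Leibler divergence by Monte Carlo and checks it against the closed form for two normal densities; the two normals are merely stand-ins for the information projection and a plug-in estimative density.

```python
# Monte Carlo estimation of KL(p || q) = E_p[log p(X) - log q(X)], the
# divergence whose expectation defines the estimation risk discussed above.
import numpy as np
from scipy import stats

def kl_monte_carlo(p_sampler, p_logpdf, q_logpdf, n=100_000, seed=0):
    x = p_sampler(n, np.random.default_rng(seed))
    return np.mean(p_logpdf(x) - q_logpdf(x))

# Example: p = N(0, 1) (stand-in for the information projection),
#          q = N(0.2, 1.1^2) (stand-in for a plug-in estimative density).
p, q = stats.norm(0, 1), stats.norm(0.2, 1.1)
est = kl_monte_carlo(lambda n, rng: rng.normal(0, 1, n), p.logpdf, q.logpdf)
closed_form = np.log(1.1) + (1 + 0.2**2) / (2 * 1.1**2) - 0.5
print(est, closed_form)
```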
Abstract: Many time series problems feature epidemic changes: segments where a parameter deviates from a background baseline. Detection of such changepoints can be improved by accounting for the epidemic structure, but this is currently difficult if the background level is unknown. Furthermore, in practical data the background often undergoes nuisance changes, which interfere with standard estimation techniques and appear as false alarms. To solve these issues, we develop a new, efficient approach, based on a penalised cost, to simultaneously detect epidemic changes and estimate the unknown, but fixed, background level. Using it, we build a two-level detector that models and separates nuisance and signal changes. The analytic and computational properties of the proposed methods are established, including consistency and convergence. We demonstrate via simulations that our two-level detector provides accurate estimation of changepoints under a nuisance process, while other state-of-the-art detectors fail. In real-world genomic and demographic datasets, the proposed method identified and localised target events while separating out seasonal variations and experimental artefacts. PubDate: 2023-02-01
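For readers unfamiliar with penalised-cost changepoint detection, the sketch below implements standard optimal partitioning for changes in mean; it is a generic illustration of the penalised-cost idea, not the authors' two-level epidemic/nuisance detector, and the penalty value and simulated signal are illustrative assumptions.

```python
# Generic penalised-cost changepoint detection (optimal partitioning for
# changes in mean): minimise the sum of per-segment squared-error costs plus
# a penalty beta per changepoint.
import numpy as np

def segment_cost(cs, cs2, s, t):
    """Squared-error cost of fitting a constant mean to y[s:t] (t exclusive)."""
    n = t - s
    total, total2 = cs[t] - cs[s], cs2[t] - cs2[s]
    return total2 - total**2 / n

def optimal_partitioning(y, beta):
    n = len(y)
    cs = np.concatenate([[0.0], np.cumsum(y)])
    cs2 = np.concatenate([[0.0], np.cumsum(y**2)])
    F = np.full(n + 1, np.inf)
    F[0] = -beta
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        costs = [F[s] + segment_cost(cs, cs2, s, t) + beta for s in range(t)]
        last[t] = int(np.argmin(costs))
        F[t] = costs[last[t]]
    cps, t = [], n                         # backtrack the changepoint locations
    while t > 0:
        if last[t] > 0:
            cps.append(last[t])
        t = last[t]
    return sorted(cps)

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 50), rng.normal(0, 1, 100)])
print(optimal_partitioning(y, beta=2 * np.log(len(y))))   # changes near 100 and 150
```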
Abstract: The mean past lifetime (MPL) is an important tool in reliability and survival analysis for measuring the average time elapsed since the occurrence of an event, given that the event has occurred before a specific time \(t>0\) . This article develops a nonparametric estimator of the MPL based on observations collected under a ranked set sampling (RSS) design. It is shown that the proposed estimator is a strongly uniformly consistent estimator of the MPL, and that it converges to a Gaussian process under some mild conditions. A Monte Carlo simulation study is employed to compare the performance of the proposed estimator with its competitor under simple random sampling (SRS). Our findings show that the introduced estimator is more efficient than its SRS counterpart as long as the quality of ranking is better than random. Finally, an illustrative example describes a potential application of the developed estimator to assessing the average time between infection and diagnosis in HIV patients. PubDate: 2023-02-01
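The quantity being estimated can be illustrated with a plain simple-random-sample plug-in (not the ranked-set-sampling estimator developed in the paper): the empirical mean past lifetime at time t is the average of t minus the observed values among observations occurring before t. The toy lifetime data below are an illustrative assumption.

```python
# Empirical mean past lifetime at time t from a simple random sample:
# MPL(t) = average of (t - x) over observations x with x <= t.
import numpy as np

def empirical_mpl(sample, t):
    past = sample[sample <= t]
    if past.size == 0:
        return np.nan                       # MPL undefined if no event before t
    return np.mean(t - past)

rng = np.random.default_rng(0)
lifetimes = rng.weibull(1.5, size=200) * 10.0   # toy lifetime data
for t in (2.0, 5.0, 10.0):
    print(t, empirical_mpl(lifetimes, t))
```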
Abstract: Long memory models can be generalised by the Fractional equal-root Autoregressive Moving Average (FerARMA) process, which displays short memory for a suitable set of parameters. Consequently, its spectrum is bounded, ensuring stationarity also for values of the memory parameter d larger than 0.5. The FerARMA generalization is proposed here to forecast highly persistent time series, such as tree-ring climate records and paleo-temperature reconstructions. The main advantage of a bounded spectrum is that it allows for more accurate predictions than standard long memory models, especially when a long prediction horizon is considered. PubDate: 2023-02-01
Abstract: In this paper, some predictive results for dual generalized order statistics (DGOSs) from the inverse Weibull distribution are obtained. To this end, different predictive and reconstructive pivotal quantities are proposed, and several predictive and reconstructive intervals concerning DGOSs based on the inverse Weibull distribution are constructed. Furthermore, the maximum likelihood predictor as well as the predictive maximum likelihood estimates based on DGOSs are studied. Finally, simulation studies are carried out to assess the efficiency of the obtained results. PubDate: 2023-02-01
Abstract: This paper proposes a new generalized extended balanced loss function (GEBLF). Admissibility of linear estimators under GEBLF is characterized in the general Gauss–Markov model. Necessary and sufficient conditions are obtained for a linear estimator to be admissible within the set of linear estimators when the dispersion matrix is possibly singular. Under special conditions, the results reduce to those known in the literature. PubDate: 2023-02-01
Abstract: The general problem of constructing regions that have a guaranteed coverage probability for an arbitrary parameter of interest \(\psi \in \Psi \) is considered. The regions developed are Bayesian in nature, and the coverage probabilities can be regarded as Bayesian confidences with respect to the model obtained by integrating out the nuisance parameters using the conditional prior given \(\psi .\) Both the prior coverage probability and the prior probability of covering a false value (the accuracy) can be controlled by setting the sample size. These coverage probabilities are regarded as a priori figures of merit concerning the reliability of a study, while the inferences quoted are Bayesian. Several problems are considered where confidence regions with desirable properties have proven difficult to obtain. For example, it is shown that the approach discussed never leads to improper regions, which has proven to be an issue for some confidence regions. PubDate: 2023-01-24
Abstract: In this paper, we propose a family of correlation structures for crossover designs with repeated measures, for both Gaussian and non-Gaussian responses, using generalized estimating equations (GEE). The structure considers two matrices: one that models the between-period correlation and another that models the within-period correlation. The overall correlation matrix, which is used to build the GEE, corresponds to the Kronecker product of these matrices. A procedure to estimate the parameters of the correlation matrix is proposed, its statistical properties are studied, and a comparison with standard models using a single correlation matrix is carried out. A simulation study showed superior performance of the proposed structure in terms of the quasi-likelihood criterion, efficiency, and the capacity to explain complex correlation patterns in longitudinal data from crossover designs. PubDate: 2023-01-10
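The structure described above can be sketched directly; the component matrices below (exchangeable between periods, AR(1) within periods) and their correlation values are illustrative assumptions, and the paper's estimation procedure is not reproduced.

```python
# Building an overall working correlation matrix as the Kronecker product of
# a between-period correlation matrix and a within-period correlation matrix.
import numpy as np

def exchangeable(dim, rho):
    """Between-period correlation: common correlation rho between periods."""
    return (1 - rho) * np.eye(dim) + rho * np.ones((dim, dim))

def ar1(dim, rho):
    """Within-period correlation: AR(1) decay across repeated measures."""
    idx = np.arange(dim)
    return rho ** np.abs(idx[:, None] - idx[None, :])

n_periods, n_repeats = 3, 4
R_between = exchangeable(n_periods, rho=0.4)
R_within = ar1(n_repeats, rho=0.7)
R_overall = np.kron(R_between, R_within)   # (3*4) x (3*4) working correlation
print(R_overall.shape)
```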
Abstract: Using different extropies of k-record values, various characterizations of continuous symmetric distributions are provided, complementing the results of Ahmadi (Stat Pap 62:2603–2626, 2021). These include the cumulative residual (past) extropy, the generalised cumulative residual (past) extropy, and some common Kerridge inaccuracy measures. Using inaccuracy extropy measures, it is demonstrated that continuous symmetric distributions are characterised by an equality of information in upper and lower k-records. The applicability of the suggested test is then demonstrated on three real data sets by examining the p-values of our test. PubDate: 2023-01-10
Abstract: We study full Bayesian procedures for high-dimensional linear regression. We adopt the data-dependent empirical priors introduced in Martin et al. (Bernoulli 23(3):1822–1847, 2017), which have nice posterior contraction properties and are easy to compute. Our paper extends their theoretical results to the case of unknown error variance. Under appropriate sparsity assumptions, we establish model selection consistency, posterior contraction rates, and a Bernstein–von Mises theorem by analyzing the multivariate t-distribution. PubDate: 2023-01-10