Please help us test our new pre-print finding feature by giving the pre-print link a rating. A 5 star rating indicates the linked pre-print has the exact same content as the published article.
Abstract: This paper proposes a new class of permutation-invariant tests for diagonal symmetry around a known point, based on the center-outward depth ranking. The asymptotic behavior of the proposed tests under the null distribution is derived. The performance of the proposed tests is assessed through a Monte Carlo study. The results show that the tests perform well compared with other procedures in terms of empirical size and empirical power. We demonstrate that the proposed class includes the celebrated Wilcoxon signed-rank test as a special case in the univariate setting. Finally, we apply the tests to a well-known data set to illustrate the method developed in this paper. PubDate: 2022-05-12
Abstract: This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, our idea does not require prior specification of clusters between variables. To determine the clusters, we solve a particular case of sparse Singular Value Decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated using both biological and simulated data, and details of the implementation in R and hyperparameter search are discussed. PubDate: 2022-05-06
Abstract: We study approximate K-optimal designs for various regression models by minimizing the condition number of the information matrix. This minimizes the error sensitivity in the computation of the least squares estimator of the regression parameters and also avoids multicollinearity in regression. Using matrix and optimization theory, we derive several theoretical results on K-optimal designs, including convexity of the K-optimality criterion, lower bounds on the condition number, and symmetry properties of K-optimal designs. A general numerical method is developed to find K-optimal designs for any regression model on a discrete design space. In addition, specific results are obtained for polynomial, trigonometric and second-order response models. PubDate: 2022-05-04
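To make the K-optimality criterion concrete, the following minimal sketch computes the condition number of the information matrix for the simplest case, a straight-line model with regressors f(x) = (1, x). The function names, the example design points and the weights are illustrative assumptions, not taken from the article, and the closed-form 2x2 eigenvalue step stands in for the general numerical method the abstract mentions.

```python
import math


def information_matrix(points, weights):
    """Entries (m00, m01, m11) of M = sum_i w_i f(x_i) f(x_i)^T for f(x) = (1, x).

    Hypothetical helper: restricted to the straight-line model so that M is 2x2.
    """
    m00 = sum(weights)
    m01 = sum(w * x for x, w in zip(points, weights))
    m11 = sum(w * x * x for x, w in zip(points, weights))
    return m00, m01, m11


def condition_number(points, weights):
    """Ratio of the largest to the smallest eigenvalue of the information matrix."""
    m00, m01, m11 = information_matrix(points, weights)
    # Eigenvalues of a symmetric 2x2 matrix: centre +/- radius.
    centre = (m00 + m11) / 2.0
    radius = math.hypot((m00 - m11) / 2.0, m01)
    lam_min = centre - radius
    lam_max = centre + radius
    if lam_min <= 0.0:
        return math.inf  # singular design: parameters are not all estimable
    return lam_max / lam_min


if __name__ == "__main__":
    # Equal weight on the two endpoints of [-1, 1] (assumed example design).
    print(condition_number([-1.0, 1.0], [0.5, 0.5]))
    # A design shifted away from the origin (assumed example design).
    print(condition_number([0.0, 1.0], [0.5, 0.5]))
```

In this toy case the symmetric endpoint design yields the identity matrix, so its condition number equals 1, the smallest value a condition number can take, while the shifted design is noticeably worse conditioned.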
Abstract: A common practice in statistics is to take the log transformation of highly skewed data and construct confidence intervals for the population average on the basis of the transformed data. However, when computed from log-transformed data, the confidence interval is for the geometric rather than the arithmetic average, and neglecting this can lead to misleading conclusions. In this paper, we consider an approach based on a regression of the two sample averages to convert the confidence interval for the geometric average into a confidence interval for the arithmetic average of the original untransformed data. The proposed approach is substantially simpler to implement than the existing methods, and an extensive Monte Carlo and bootstrap simulation study suggests that it outperforms them in terms of coverage probability, even at very small sample sizes. Some real data examples are analyzed, and they support the simulation findings of the paper. PubDate: 2022-04-30
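The distinction between the two averages can be illustrated with a short sketch. It only shows that back-transforming the mean of the logs gives the geometric average, which for skewed data lies below the arithmetic average; it does not implement the regression-based conversion proposed in the article. The function names and the data values are assumptions made for illustration.

```python
import math


def arithmetic_mean(values):
    """Ordinary sample average of the untransformed data."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Back-transformed average of the logs: exp(mean(log x)). Requires x > 0."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Made-up, right-skewed sample used only for illustration.
    sample = [1.0, 2.0, 4.0, 8.0, 64.0]
    print("arithmetic mean:", arithmetic_mean(sample))
    print("geometric mean: ", geometric_mean(sample))
```

Any interval built around the mean of the logs and then exponentiated therefore targets the smaller of these two quantities, which is the gap the article sets out to bridge.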
Abstract: The mean past lifetime (MPL) is an important tool in reliability and survival analysis for measuring the average time elapsed since the occurrence of an event, under the condition that the event has occurred before a specific time \(t>0\). This article develops a nonparametric estimator of the MPL based on observations collected according to a ranked set sampling (RSS) design. It is shown that the proposed estimator is a strongly uniformly consistent estimator of the MPL. It is also proved that the introduced estimator tends to a Gaussian process under some mild conditions. A Monte Carlo simulation study is employed to compare the performance of the proposed estimator with that of its competitor under simple random sampling (SRS). Our findings show that the introduced estimator is more efficient than its SRS counterpart as long as the quality of ranking is better than random. Finally, an illustrative example is provided to describe the potential application of the developed estimator in assessing the average time between infection and diagnosis in HIV patients. PubDate: 2022-04-30
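As a rough illustration of the quantity being estimated, the sketch below computes the plain empirical version of the MPL, m(t) = E[t - X | X <= t], from a simple random sample. This is the SRS-style baseline, not the ranked-set-sampling estimator developed in the article; the function name and the data are hypothetical.

```python
def empirical_mpl(sample, t):
    """Average of (t - x) over the observations with x <= t.

    Hypothetical helper: a naive simple-random-sample estimate of the mean
    past lifetime at time t, assuming all observations are event times.
    """
    elapsed = [t - x for x in sample if x <= t]
    if not elapsed:
        raise ValueError("no events observed at or before time t")
    return sum(elapsed) / len(elapsed)


if __name__ == "__main__":
    # Made-up event times, for illustration only.
    event_times = [1.0, 2.0, 3.0, 10.0]
    print(empirical_mpl(event_times, 4.0))
```

Only the observations at or before t contribute, so the estimate at t = 4 averages the elapsed times 3, 2 and 1 and ignores the event at time 10.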
Abstract: In this paper, some predictive results of dual generalized order statistics (DGOSs) from the inverse Weibull distribution are obtained. For this goal, different predictive and reconstructive pivotal quantities are proposed. Moreover, several predictive and reconstructive intervals concerning DGOSs based on the inverse Weibull distribution are constructed. Furthermore, the maximum likelihood predictor as well as the predictive maximum likelihood estimates based on DGOSs are studied. Finally, simulation studies are carried out to assess the efficiency of the obtained results. PubDate: 2022-04-30
Abstract: For a parametric model of distributions, the closest distribution in the model to the true distribution located outside the model is considered. Measuring the closeness between two distributions with the Kullback–Leibler divergence, the closest distribution is called the “information projection.” The estimation risk of the maximum likelihood estimator is defined as the expectation of the Kullback–Leibler divergence between the information projection and the maximum likelihood estimative density (the predictive distribution with the plugged-in maximum likelihood estimator). Here, the asymptotic expansion of the risk is derived up to the second order in the sample size, and a sufficient condition on the risk for the Bayes error rate between the predictive distribution and the information projection to be lower than a specified value is investigated. Combining these results, the “p/n criterion” is proposed, which determines whether the estimative density is sufficiently close to the information projection for the given model and sample. This criterion can constitute a solution to the sample size or model selection problem. The use of the p/n criterion is demonstrated for two practical datasets. PubDate: 2022-04-25
Abstract: The fundamental problem in orthogonal design theory is design isomorphism, for which the statistical literature offers two classes of methods. One identifies isomorphic designs through costly computation; the other only detects non-isomorphic designs, as a feasible alternative. In this paper we explore the design structure to propose the degree of isomorphism, a novel criterion that measures the similarity between orthogonal designs. A column-wise framework is proposed to accommodate different issues of design isomorphism, including the detection of non-isomorphism, the identification of isomorphism and the determination of subclasses for symmetric orthogonal designs. Our framework shows surprisingly high efficiency: the average time to identify the isomorphism between two designs in the selected classes is about one second. By applying hierarchical clustering with average linkage, a novel classification of non-isomorphic orthogonal designs is also presented from a combinatorial view. PubDate: 2022-04-24
Abstract: A Correction to this paper has been published: 10.1007/s00362-021-01273-w PubDate: 2022-04-20
Abstract: This paper proposes a new generalized extended balanced loss function (GEBLF). Admissibility of linear estimators is characterized in the general Gauss–Markov model with respect to the GEBLF. Necessary and sufficient conditions are obtained for linear estimators to be admissible among the set of linear estimators when the dispersion matrix is possibly singular. It is noted that, under special conditions, the results obtained lead to results known in the literature. PubDate: 2022-04-08
Abstract: In this paper we consider a new approach to regularizing the maximum likelihood estimator of a discrete probability distribution and its application in variable selection. The method relies on choosing the parameter of its convex combination with a low-dimensional target distribution by minimising the squared error (SE) instead of the mean SE (MSE). Choosing an optimal parameter for every sample results in an MSE no larger than that of the James–Stein shrinkage estimator of a discrete probability distribution. The introduced parameter is estimated by cross-validation and is shown to perform promisingly for synthetic dependence models. The method is applied to introduce regularized versions of information-based variable selection criteria, which are investigated in numerical experiments and turn out to work better than commonly used plug-in estimators under several scenarios. PubDate: 2022-04-07
Abstract: Many time series problems feature epidemic changes, that is, segments where a parameter deviates from a background baseline. Detection of such changepoints can be improved by accounting for the epidemic structure, but this is currently difficult if the background level is unknown. Furthermore, in practical data the background often undergoes nuisance changes, which interfere with standard estimation techniques and appear as false alarms. To solve these issues, we develop a new, efficient approach, based on a penalised cost, to simultaneously detect epidemic changes and estimate an unknown but fixed background level. Using it, we build a two-level detector that models and separates nuisance and signal changes. The analytic and computational properties of the proposed methods are established, including consistency and convergence. We demonstrate via simulations that our two-level detector provides accurate estimation of changepoints under a nuisance process, while other state-of-the-art detectors fail. In real-world genomic and demographic datasets, the proposed method identified and localised target events while separating out seasonal variations and experimental artefacts. PubDate: 2022-04-04
Abstract: Sample selection arises when the outcome of interest is partially observed in a study. A common challenge is the requirement for exclusion restrictions, that is, that some of the covariates affecting the missingness mechanism do not affect the outcome. The drive to establish this requirement often leads to the inclusion of irrelevant variables in the model. A suboptimal solution is the use of classical variable selection criteria such as AIC and BIC, and traditional variable selection procedures such as stepwise selection. These methods are unstable when there is limited expert knowledge about the variables to include in the model. To address this, we propose the use of the adaptive Lasso for variable selection and parameter estimation in both the selection and outcome submodels simultaneously, in the absence of exclusion restrictions. Using the maximum likelihood estimator of the sample selection model, we construct a loss function similar, up to a constant, to that of the least squares regression problem, and minimize its penalized version using an efficient algorithm. We show that the estimator, with a proper choice of regularization parameter, is consistent and possesses the oracle properties. The method is compared to the Lasso and to an adaptively weighted \(L_{1}\) penalized two-step method. We apply the methods to the well-known Ambulatory Expenditure Data. PubDate: 2022-04-01
Abstract: In this paper, we investigate the complete moment convergence for randomly weighted sums of extended negatively dependent (END) random variables. The results obtained in this paper extend the corresponding one of Li et al. (J Inequalities Appl 2017:16, 2017). As an application, we study the complete consistency of the estimator in semiparametric regression models based on END random variables, using the complete convergence that we establish. Finally, we conduct comprehensive simulation studies to demonstrate the validity of the theoretical results obtained. PubDate: 2022-04-01
Abstract: Strong orthogonal arrays were recently introduced as a new class of space-filling designs for computer experiments because they achieve better stratification than orthogonal arrays. To further improve the space-filling properties in low dimensions while retaining column orthogonality, we propose column-orthogonal strong orthogonal arrays of strength two star and three. Construction methods and characterizations of such designs are provided. The resulting strong orthogonal arrays have an increased number of levels, which strengthens their space-filling properties in one and two dimensions. They can accommodate comparable or even larger numbers of factors than those in the existing literature, enjoy flexible run sizes, and possess column orthogonality. The construction methods are convenient and flexible, and the resulting designs are good choices for computer experiments. PubDate: 2022-04-01
Abstract: In this paper, we first investigate the estimation of the functional single index regression model with responses missing at random for strong mixing time series data. More precisely, the uniform almost complete convergence rate and the asymptotic normality of the estimator are obtained under some general conditions. Then, some simulation studies are carried out to show the finite sample performance of the estimator. Finally, a real data analysis of sea surface temperature is used to illustrate the effectiveness of our methodology. PubDate: 2022-04-01
Abstract: For the family of multivariate probability distributions variously denoted as unified skew-normal, closed skew-normal and other names, a number of properties are already known, but many others are not, even some basic ones. The present contribution aims at filling some of these gaps. Specifically, the moments up to the fourth order are obtained, and from these the expressions for Mardia’s measures of multivariate skewness and kurtosis. Other results concern the log-concavity of the distribution, closure with respect to conditioning on intervals, and a possible alternative parameterization. PubDate: 2022-04-01
Abstract: Most existing augmented designs add runs in the follow-up stages. In many cases, however, the levels of the factors should be augmented; such designs are called level-augmented designs. According to whether the experimental domain is extended or not, they can be divided into range-extended and range-fixed level-augmented designs. For different types of initial designs, the symmetrical and asymmetrical level-augmented designs are discussed, respectively. Because of its robustness, a uniformity criterion is a suitable choice for obtaining an optimal level-augmented design when the model is unknown. In this paper, the wrap-around \(L_2\)-discrepancy (WD) is chosen as the uniformity measure. We give the expressions and the tight lower bounds of the WD of level-augmented designs under some special parameters. A method to construct a special case of symmetrical level-augmented designs is given. Some examples and level-augmented uniform designs are also provided. PubDate: 2022-04-01
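For readers unfamiliar with the uniformity measure, the sketch below evaluates the squared wrap-around \(L_2\)-discrepancy using the standard closed-form expression for a design of n points in the unit cube. It is a generic illustration under the assumption that the design points have already been scaled to [0, 1); it does not reproduce the level-augmented expressions or lower bounds derived in the article, and the function name and example designs are hypothetical.

```python
def wd_squared(design):
    """Squared wrap-around L2-discrepancy of a design with points in [0, 1)^s.

    Uses WD^2 = -(4/3)^s + (1/n^2) * sum_{i,j} prod_k [3/2 - d(1 - d)],
    where d = |x_ik - x_jk|.
    """
    n = len(design)
    s = len(design[0])
    total = 0.0
    for xi in design:
        for xj in design:
            product = 1.0
            for a, b in zip(xi, xj):
                d = abs(a - b)
                product *= 1.5 - d * (1.0 - d)
            total += product
    return -((4.0 / 3.0) ** s) + total / (n * n)


if __name__ == "__main__":
    # Two made-up one-factor designs with two runs each.
    spread = [[0.25], [0.75]]
    clustered = [[0.25], [0.30]]
    print(wd_squared(spread), wd_squared(clustered))
```

The evenly spread design gives the smaller value, which matches the intended reading of the criterion: lower discrepancy means the points cover the experimental domain more uniformly.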
Abstract: This paper considers multistage experiments which involve, in each stage, two-level factors whose levels are hard to change. Because of such factors, in each stage, there are restrictions in the randomization of runs leading to two types of experimental units, called whole plots and split plots. As a result, there are two types of random errors, in each stage, that need to be taken into account when modeling the response variable. It is assumed that a linear mixed effects model is appropriate for analyzing observations, and that the method of generalized least squares estimation is used to obtain estimators for the fixed effects in the model. It is also assumed that the model matrix of the fixed effects is based on a general two-level fractional factorial design. The goal of this paper is to provide an analytic form of the covariance matrix of the generalized least squares estimators of the fixed factorial effects in the model, that is useful for evaluating designs. This form shows how any confounding of (fixed) effects with the whole plots, associated with the different stages, affects the variances of their generalized least squares estimators. Some special cases of this form, which correspond to a model matrix based on either a two-level regular factorial or a two-level full factorial design, are also discussed. Results can be extended to multistage experiments with randomization restrictions of the runs, in each stage, with a model matrix based on a general multilevel fractional factorial design. PubDate: 2022-04-01