Authors: Moosup Kim; Sangyeol Lee Pages: 945 - 968 Abstract: In this study, we consider the problem of estimating the tail exponent of multivariate regular variation. Under certain conditions, any convex combination of a random vector with a multivariate regularly varying tail has a univariate regularly varying tail with the same exponent. To estimate the tail exponent of the multivariate regular variation of a given random vector, we therefore employ a weighted average of Hill’s estimators obtained over its convex combinations, with weights designed to reduce the variability of estimation. We investigate the asymptotic properties and evaluate the finite sample performance of the weighted average of Hill’s estimators. A simulation study and real data analysis are provided for illustration. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0574-9 Issue No: Vol. 69, No. 5 (2017)
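As a rough sketch of the ingredients (not the paper's variance-reducing weighting scheme), Hill's estimator and a weighted average of it over convex combinations of a multivariate sample might look like:

```python
import numpy as np

def hill_estimator(x, k):
    # Hill's estimator of the tail index from the k largest order statistics:
    # reciprocal of the mean log-excess over the (k+1)-th largest observation.
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    return 1.0 / np.mean(np.log(x[:k]) - np.log(x[k]))

def weighted_hill(sample, directions, weights, k):
    # Weighted average of Hill's estimators over convex combinations d'X of
    # the rows of `sample`; the weights here are placeholders, not the
    # paper's optimized choice.
    return float(np.dot(weights, [hill_estimator(sample @ d, k) for d in directions]))
```

For a standard Pareto sample with tail index 2, `hill_estimator` returns a value close to 2 for moderate `k`.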

Authors: L. Baringhaus; B. Ebner; N. Henze Pages: 969 - 995 Abstract: We present a general result on the limit distribution of weighted one- and two-sample \(L^2\)-goodness-of-fit test statistics of some hypothesis \(H_0\) under fixed alternatives. Applications include an approximation of the power function of such tests, asymptotic confidence intervals for the distance of an underlying distribution from the distributions under \(H_0\), and an asymptotic equivalence test that is able to validate certain neighborhoods of \(H_0\). PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0567-8 Issue No: Vol. 69, No. 5 (2017)

Authors: Sunghoon Kwon; Jeongyoun Ahn; Woncheol Jang; Sangin Lee; Yongdai Kim Pages: 997 - 1025 Abstract: We propose a new penalty called the doubly sparse (DS) penalty for variable selection in high-dimensional linear regression models when the covariates are naturally grouped. An advantage of the DS penalty over other penalties is that it provides a clear way of controlling sparsity between and within groups separately. We prove that there exists a unique global minimizer of the DS penalized sum of squared residuals and show how the DS penalty selects groups and variables within selected groups, even when the number of groups exceeds the sample size. An efficient optimization algorithm is also introduced. Results from simulation studies and real data analysis show that the DS penalty outperforms other existing penalties with finite samples. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0571-z Issue No: Vol. 69, No. 5 (2017)
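To illustrate the between-group versus within-group sparsity idea (using a sparse-group-lasso-type penalty as a stand-in, not the paper's exact DS penalty), one can write the two-level penalty as:

```python
import numpy as np

def between_within_penalty(beta, groups, lam_between, lam_within):
    # Illustration only, not the paper's exact DS penalty: lam_between drives
    # group-level sparsity through group-wise L2 norms; lam_within drives
    # sparsity inside selected groups through an L1 term.
    beta = np.asarray(beta, dtype=float)
    between = sum(np.sqrt(len(g)) * np.linalg.norm(beta[list(g)]) for g in groups)
    return lam_between * between + lam_within * np.abs(beta).sum()
```

Setting one tuning parameter to zero recovers a purely group-level or purely coordinate-level penalty, which is the sense in which the two kinds of sparsity are controlled separately.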

Authors: Omer Ozturk Pages: 1029 - 1057 Abstract: This article develops estimators for certain population characteristics using a judgment post stratified (JPS) sample. The paper first constructs a conditional JPS sample with a reduced set size K by conditioning on the ranks of the measured observations of the original JPS sample of set size \(H \ge K\). The paper shows that the estimators of the population mean, median and distribution function based on this conditional JPS sample are consistent and have limiting normal distributions. Unlike the ratio and regression estimators, which require a strong linearity assumption, the proposed estimators need only a monotonic relationship between the response and the auxiliary variable. For moderate sample sizes, the paper provides a bootstrap distribution to draw statistical inference. A small-scale simulation study shows that the proposed estimators based on a reduced set JPS sample perform better than the corresponding estimators based on a regular JPS sample. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0572-y Issue No: Vol. 69, No. 5 (2017)

Authors: Hanfang Yang; Yichuan Zhao Pages: 1059 - 1073 Abstract: In this paper, we propose a smoothed estimating equation for the difference of quantiles with two samples. Using the jackknife pseudo-sample technique for the estimating equation, we propose the jackknife empirical likelihood (JEL) ratio and establish Wilks’ theorem. Because it avoids estimating link variables, the JEL method is computationally more efficient than the traditional normal approximation method, as the simulation studies demonstrate. We carry out a simulation study in terms of coverage probability and average length of the proposed confidence intervals. A real data set is used to illustrate the JEL procedure. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0576-7 Issue No: Vol. 69, No. 5 (2017)

Authors: Nicolas Grosjean; Thierry Huillet Pages: 1075 - 1097 Abstract: We derive some additional results on the Bienaymé–Galton–Watson branching process with \(\theta \)-linear fractional branching mechanism, as studied by Sagitov and Lindo (Branching Processes and Their Applications. Lecture Notes in Statistics—Proceedings, 2016). This includes explicit expressions for the limit laws in both the subcritical cases and the supercritical cases with finite mean; the long-run behavior of the population size in the critical case; limit laws in the supercritical cases with infinite mean when the \(\theta \) process is either regular or explosive; results regarding the time to absorption; an expression of the probability law of the \(\theta \)-branching mechanism involving Bell polynomials; and the explicit computation of the stochastic transition matrix of the \(\theta \) process, together with its powers. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0573-x Issue No: Vol. 69, No. 5 (2017)

Authors: Minggen Lu Pages: 1099 - 1127 Abstract: We consider a simple yet flexible spline estimation method for quasi-likelihood models. We approximate the unknown function by B-splines and apply the Fisher scoring algorithm to compute the estimates. The spline estimate of the nonparametric component achieves the optimal rate of convergence under a smoothness condition, and the estimate of the parametric part is shown to be asymptotically normal even if the variance function is misspecified. The semiparametric efficiency of the model can be established if the variance function is correctly specified. A direct and consistent variance estimation method based on least-squares estimation is proposed. A simulation study is performed to evaluate the numerical performance of the spline estimate. The methodology is illustrated on a crab study. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0575-8 Issue No: Vol. 69, No. 5 (2017)

Authors: James C. Fu; Wan-Chen Lee Pages: 1129 - 1139 Abstract: Suppose an urn contains m distinct coupons, labeled from 1 to m. A random sample of k coupons is drawn without replacement from the urn, the numbers are recorded, and the coupons are then returned to the urn. This procedure is repeated, with the sample sizes independent and identically distributed. Let W be the total number of random samples needed to see all coupons at least l times \((l \ge 1)\). Recently, for \(l=1\), the approximation of the first moment of the random variable W was studied under the random sample size sampling scheme by Sellke (Ann Appl Probab, 5:294–309, 1995). In this manuscript, we focus on the exact distributions of the waiting times W for both fixed and random sample size sampling schemes given \(l \ge 1\). The results are further extended to a combination of fixed and random sample size sampling procedures. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0578-5 Issue No: Vol. 69, No. 5 (2017)
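The sampling scheme itself is easy to simulate; the sketch below (a Monte Carlo illustration, not the paper's exact-distribution computation) draws samples without replacement until every coupon has been seen at least l times:

```python
import random

def waiting_time(m, l, sample_size, rng=random):
    # Number of draws until every coupon 0..m-1 has been recorded at least
    # l times; each draw takes sample_size() coupons without replacement
    # and returns them to the urn afterwards.
    counts = [0] * m
    draws = 0
    while min(counts) < l:
        for c in rng.sample(range(m), sample_size()):
            counts[c] += 1
        draws += 1
    return draws
```

With a fixed sample size the call is `waiting_time(m, l, lambda: k)`; passing a random-size callable such as `lambda: rng.randint(1, m)` gives the random sample size scheme.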

Authors: Yusuke Shimizu Pages: 1141 - 1154 Abstract: In this paper, we study the uniform tail-probability estimates of a regularized least-squares estimator for the linear regression model. We make use of the polynomial type large deviation inequality for the associated statistical random fields, which may not be locally asymptotically quadratic. Our results enable us to verify various arguments requiring convergence of moments of estimator-dependent statistics, such as the mean squared prediction error and the bias correction for AIC-type information criteria. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0577-6 Issue No: Vol. 69, No. 5 (2017)

Authors: P. Vellaisamy Pages: 1155 - 1176 Abstract: Collapsibility deals with the conditions under which a conditional (on a covariate W) measure of association between two random variables Y and X equals the marginal measure of association. In this paper, we discuss average collapsibility for certain well-known measures of association, as well as for a new measure of association. The concept of average collapsibility is more general than collapsibility, and requires that the conditional average of an association measure equal the corresponding marginal measure. Sufficient conditions for the average collapsibility of the association measures are obtained. Some interesting counterexamples are constructed, and applications to linear, Poisson, logistic and negative binomial regression models are discussed. An extension to the case of a multivariate covariate W is also analyzed. Finally, we discuss the collapsibility conditions of some dependence measures for survival models and illustrate them for the case of linear transformation models. PubDate: 2017-10-01 DOI: 10.1007/s10463-016-0580-y Issue No: Vol. 69, No. 5 (2017)

Authors: Cuizhen Niu; Lixing Zhu Abstract: This paper is devoted to testing the parametric single-index structure of the underlying model when there are outliers in the observations. First, a test that is robust against outliers is suggested. Hampel’s second-order influence function of the test statistic is proved to be bounded. Second, the test fully uses the dimension reduction structure of the hypothetical model and automatically adapts to alternative models when the null hypothesis is false. Thus, the test can greatly overcome the dimensionality problem and is still omnibus against general alternative models. The performance of the test is demonstrated by both Monte Carlo simulation studies and an application to a real dataset. PubDate: 2017-11-02 DOI: 10.1007/s10463-017-0626-9

Authors: Benedikt Bauer; Felix Heimrich; Michael Kohler; Adam Krzyżak Abstract: Estimation of surrogate models for computer experiments leads to nonparametric regression estimation problems without noise in the dependent variable. In this paper, we propose an empirical maximal deviation minimization principle to construct estimates in this context and analyze the rate of convergence of the corresponding quantile estimates. As an application, we consider estimation of computer experiments with moderately high dimension by neural networks and show that here we can circumvent the so-called curse of dimensionality by imposing rather general assumptions on the structure of the regression function. The estimates are illustrated by applying them to simulated data and to a simulation model in mechanical engineering. PubDate: 2017-11-02 DOI: 10.1007/s10463-017-0627-8

Authors: Kun-Lin Kuo; Yuchung J. Wang Abstract: Conditionally specified models offer a higher level of flexibility than the joint approach. Regression switching in multiple imputation is a typical example. However, reasonable-seeming conditional models are generally not coherent with one another. A Gibbs sampler based on incompatible conditionals is called a pseudo-Gibbs sampler, whose properties are mostly unknown. This article investigates the richness of and commonalities among their stationary distributions. We show that the Gibbs sampler replaces the conditional distributions iteratively but keeps the marginal distributions invariant. In the process, it minimizes the Kullback–Leibler divergence. Next, we prove that systematic pseudo-Gibbs projections converge for every scan order, and that the stationary distributions share marginal distributions in a circular fashion. Therefore, regardless of compatibility, univariate consistency is guaranteed when the orders of imputation are circularly related. Moreover, a conditional model and its pseudo-Gibbs distributions have an equal number of parameters. The study of the pseudo-Gibbs sampler provides a fresh perspective for understanding the original Gibbs sampler. PubDate: 2017-10-24 DOI: 10.1007/s10463-017-0625-x
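A minimal sketch of a systematic-scan pseudo-Gibbs sampler for two binary variables (the conditional tables may be incompatible; the empirical joint frequencies approximate a stationary distribution of the scan):

```python
import numpy as np

def pseudo_gibbs(p_x_given_y, p_y_given_x, n_iter=20000, seed=0):
    # Systematic-scan (pseudo-)Gibbs sampler for two binary variables; the
    # two conditional tables need not be compatible with any joint law.
    # p_x_given_y[y] and p_y_given_x[x] are rows of probabilities.
    rng = np.random.default_rng(seed)
    x = y = 0
    counts = np.zeros((2, 2))
    for _ in range(n_iter):
        x = rng.choice(2, p=p_x_given_y[y])
        y = rng.choice(2, p=p_y_given_x[x])
        counts[x, y] += 1
    return counts / n_iter
```

When the conditionals are compatible (e.g. two independent fair coins), the output recovers the true joint; with incompatible tables, the stationary distribution depends on the scan order, which is the phenomenon the paper studies.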

Authors: Omer Ozturk Abstract: This paper draws statistical inference for population characteristics using two-stage cluster samples. Cluster samples in each stage are constructed using ranked set sample (RSS), probability-proportional-to-size sample, or simple random sample (SRS) designs. Each RSS sampling design is implemented with and without replacement policies. The paper constructs design-unbiased estimators for the population mean and total and their variances. Efficiency improvement of all sampling designs over the SRS design is investigated. It is shown that the efficiency of the estimators depends on the intra-cluster correlation coefficient and the choice of sampling designs in stage I and stage II sampling. The paper also constructs an approximate confidence interval for the population mean (total). For a fixed cost, the optimal sample sizes for stage I and stage II samples are determined by maximizing the information content of the sample. The proposed sampling designs and estimators are applied to the California School District Study and the Ohio Corn Production Data. PubDate: 2017-10-24 DOI: 10.1007/s10463-017-0623-z
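The RSS building block can be sketched as follows (a balanced single-stage ranked set sample with perfect ranking, not the paper's two-stage cluster designs):

```python
import numpy as np

def ranked_set_sample(population, set_size, n_cycles, rng):
    # Balanced ranked set sample with perfect ranking: in each cycle and for
    # each rank r, draw set_size units without replacement, sort them, and
    # measure only the r-th smallest; the rest are ranked but not measured.
    measured = []
    for _ in range(n_cycles):
        for r in range(set_size):
            s = rng.choice(population, size=set_size, replace=False)
            measured.append(np.sort(s)[r])
    return np.asarray(measured)
```

The sample mean of the measured units is a design-unbiased estimator of the population mean, and balancing over all ranks is what makes this work.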

Authors: Denis Belomestny; Fabienne Comte; Valentine Genon-Catalot Abstract: In this paper, our aim is to revisit the nonparametric estimation of a square integrable density f on \({\mathbb {R}}\), using projection estimators on a Hermite basis. These estimators are studied from the point of view of their mean integrated squared error on \({\mathbb {R}}\). A model selection method is described and proved to perform an automatic bias-variance compromise. Then, we present another collection of estimators, of deconvolution type, for which we define another model selection strategy. Although the minimax asymptotic rates of these two types of estimators are essentially equivalent, the complexity of the Hermite estimators is usually much lower than that of their deconvolution (or kernel) counterparts. These results are illustrated through a small simulation study. PubDate: 2017-10-23 DOI: 10.1007/s10463-017-0624-y
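A minimal sketch of a Hermite projection density estimator (fixed truncation level J, without the paper's model selection step): estimate each coefficient by the empirical mean of the corresponding orthonormal Hermite function over the sample.

```python
import numpy as np
from math import factorial, pi, sqrt

def hermite_function(j, x):
    # Orthonormal Hermite function h_j on the real line, built from the
    # physicists' Hermite polynomial H_j via numpy's hermval.
    coef = np.zeros(j + 1)
    coef[j] = 1.0
    Hj = np.polynomial.hermite.hermval(x, coef)
    return Hj * np.exp(-np.asarray(x) ** 2 / 2) / sqrt(2.0 ** j * factorial(j) * sqrt(pi))

def hermite_projection_density(sample, J):
    # Projection estimator on span{h_0, ..., h_J}: empirical coefficients
    # a_j = mean of h_j over the sample; returns a callable x -> fhat(x).
    coeffs = [np.mean(hermite_function(j, sample)) for j in range(J + 1)]
    return lambda x: sum(c * hermite_function(j, x) for j, c in enumerate(coeffs))
```

For a standard normal sample this is especially favorable, since the standard normal density is proportional to h_0, so the truncation bias vanishes.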

Authors: Mijeong Kim; Yanyuan Ma Abstract: In the mean regression context, this study considers several frequently encountered heteroscedastic error models where the regression mean and variance functions are specified up to certain parameters. An important point we note through a series of analyses is that different assumptions on standardized regression errors yield quite different efficiency bounds for the corresponding estimators. Consequently, all aspects of the assumptions need to be specifically taken into account in constructing the corresponding efficient estimators. This study clarifies the relation between the regression error assumptions and their respective efficiency bounds under the general regression framework with heteroscedastic errors. Our simulation results support our findings; we carry out a real data analysis using the proposed methods where the Cobb–Douglas cost model is the regression mean. PubDate: 2017-10-13 DOI: 10.1007/s10463-017-0622-0

Authors: Nitis Mukhopadhyay; Sudeep R. Bapat Abstract: A negative binomial (NB) distribution is useful for modeling over-dispersed count data arising from agriculture, health, and pest control. We design purely sequential bounded-risk methodologies to estimate an unknown NB mean \(\mu (>0)\) under different forms of loss functions, including customary and modified Linex loss as well as squared error loss. We handle situations when the thatch parameter \(\tau (>0)\) may be assumed known or unknown. Our proposed methodologies are shown to satisfy properties including first-order asymptotic efficiency and first-order asymptotic risk efficiency. Summaries are provided from extensive sets of simulations showing encouraging performances of the proposed methodologies for small and moderate sample sizes. We follow with illustrations obtained by implementing the estimation strategies using real data from statistical ecology: (1) weed count data of different species from a field in the Netherlands and (2) count data of migrating woodlarks at the Hanko bird sanctuary in Finland. PubDate: 2017-10-13 DOI: 10.1007/s10463-017-0620-2

Authors: Yixin Fang; Heng Lian; Hua Liang Abstract: When modeling the heteroscedasticity in a broad class of partially linear models, we allow the variance function to be a partially linear model as well, with parameters that may differ from those in the mean function. We develop a two-step estimation procedure: in the first step, initial estimates of the parameters in both the mean and variance functions are obtained, and in the second step the estimates are updated using weights calculated from the initial estimates. The resulting weighted estimators of the linear coefficients in both the mean and variance functions are shown to be asymptotically normal, more efficient than the initial un-weighted estimators, and most efficient in the sense of semiparametric efficiency for some special cases. Simulation experiments are conducted to examine the numerical performance of the proposed procedure, which is also applied to data from an air pollution study in Mexico City. PubDate: 2017-10-04 DOI: 10.1007/s10463-017-0619-8
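The two-step idea can be sketched for the purely linear case (a simplified stand-in for the paper's partially linear setting): fit by ordinary least squares, model the variance from the residuals, then refit with inverse-variance weights.

```python
import numpy as np

def two_step_wls(X, y):
    # Two-step sketch for a linear mean with heteroscedastic errors.
    # Step 1: ordinary least squares gives initial coefficients.
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta0
    # Model the log variance by regressing log squared residuals on X
    # (a log-linear variance model, assumed here for illustration).
    gamma, *_ = np.linalg.lstsq(X, np.log(resid ** 2 + 1e-12), rcond=None)
    w = np.exp(-X @ gamma)  # inverse fitted variances as weights
    # Step 2: weighted least squares with the estimated weights.
    beta1 = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta0, beta1
```

The constant offset in the log-squared-residual regression (from the mean of log chi-squared errors) only rescales all weights by a common factor, which leaves the weighted estimator unchanged.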

Authors: Huybrechts F. Bindele; Ash Abebe; Karlene N. Meyer Abstract: This study considers rank estimation of the regression coefficients of the single index regression model. Conditions needed for the consistency and asymptotic normality of the proposed estimator are established. Monte Carlo simulation experiments demonstrate the robustness and efficiency of the proposed estimator compared to the semiparametric least squares estimator. A real-life example illustrates that the rank regression procedure effectively corrects model nonlinearity even in the presence of outliers in the response space. PubDate: 2017-09-20 DOI: 10.1007/s10463-017-0618-9