Authors: Yong Kong Pages: 497 - 512 Abstract: The distributions of the mth longest runs of multivariate random sequences are considered. For random sequences made up of k kinds of letters, the lengths of the runs are sorted in two ways to give two definitions of run length ordering. In one definition, the lengths of the runs are sorted separately for each letter type. In the second definition, the lengths of all the runs are sorted together. Exact formulas are developed for the distributions of the mth longest runs for both definitions. The derivations are based on a two-step method that is applicable to various other runs-related distributions, such as joint distributions of several letter types and multiple run lengths of a single letter type. PubDate: 2017-06-01 DOI: 10.1007/s10463-015-0551-8 Issue No: Vol. 69, No. 3 (2017)
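The two run-length orderings described in this abstract can be illustrated on a concrete sequence. The sketch below is not the paper's method (which derives exact distributions); it only computes the mth longest run of one observed sequence under each definition. The function names and the convention of returning 0 when fewer than m runs exist are assumptions made for illustration.

```python
from itertools import groupby


def run_lengths(seq):
    """Return (letter, length) for each maximal run of identical letters."""
    return [(letter, len(list(group))) for letter, group in groupby(seq)]


def mth_longest_per_letter(seq, m):
    """Definition 1: sort run lengths separately for each letter type."""
    by_letter = {}
    for letter, length in run_lengths(seq):
        by_letter.setdefault(letter, []).append(length)
    # Returning 0 for a letter with fewer than m runs is a convention
    # chosen for this sketch, not taken from the paper.
    return {
        letter: sorted(lengths, reverse=True)[m - 1] if len(lengths) >= m else 0
        for letter, lengths in by_letter.items()
    }


def mth_longest_overall(seq, m):
    """Definition 2: sort the lengths of all runs together."""
    lengths = sorted((length for _, length in run_lengths(seq)), reverse=True)
    return lengths[m - 1] if len(lengths) >= m else 0


if __name__ == "__main__":
    seq = "aaabbacccab"  # runs: aaa, bb, a, ccc, a, b
    print(mth_longest_per_letter(seq, 2))
    print(mth_longest_overall(seq, 3))
```

For the sequence `aaabbacccab`, the runs of `a` have lengths 3, 1, 1, so the second longest `a` run has length 1, while the third longest run over all letters has length 2.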

Authors: Michaela Prokešová; Jiří Dvořák; Eva B. Vedel Jensen Pages: 513 - 542 Abstract: In the present paper, we discuss and compare several two-step estimation procedures for inhomogeneous shot-noise Cox processes. The intensity function is parametrized by the inhomogeneity parameters while the pair-correlation function is parametrized by the interaction parameters. The suggested procedures are based on a combination of Poisson likelihood estimation of the inhomogeneity parameters in the first step and an adaptation of a method from the homogeneous case for estimation of the interaction parameters in the second step. The adapted methods, based on minimum contrast estimation, composite likelihood and Palm likelihood, are compared both theoretically and by means of a simulation study. The general conclusion from the simulation study is that the three estimation methods have similar performance. Two-step estimation with Palm likelihood has not been considered before and is motivated by the superior performance of the Palm likelihood in the stationary case for estimation of certain parameters of interest. Asymptotic normality of the two-step estimator with Palm likelihood is proved. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0556-y Issue No: Vol. 69, No. 3 (2017)

Authors: Dominique Fourdrinier; Fatiha Mezoued; William E. Strawderman Pages: 543 - 570 Abstract: We consider Bayesian estimation of the location parameter \(\theta\) of a random vector X having a unimodal spherically symmetric density \(f(\Vert x - \theta \Vert^2)\) for a spherically symmetric prior density \(\pi(\Vert \theta \Vert^2)\). In particular, we consider minimaxity of the Bayes estimator \(\delta_\pi(X)\) under quadratic loss. When the distribution belongs to the Berger class, we show that minimaxity of \(\delta_\pi(X)\) is linked to the superharmonicity of a power of a marginal associated to a primitive of f. This leads to proper Bayes minimax estimators for certain densities \(f(\Vert x - \theta \Vert^2)\). PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0553-1 Issue No: Vol. 69, No. 3 (2017)

Authors: Ke-Hai Yuan; Peter M. Bentler Pages: 571 - 597 Abstract: In structural equation modeling (SEM), parameter estimates are typically computed by the Fisher-scoring algorithm, which often has difficulty in obtaining converged solutions. Even for simulated data with a correctly specified model, non-converged replications have been repeatedly reported in the literature. In particular, in Monte Carlo studies it has been found that larger factor loadings or smaller error variances in a confirmatory factor model correspond to a higher rate of convergence. However, studies of a ridge method in SEM indicate that adding a diagonal matrix to the sample covariance matrix also increases the rate of convergence for the Fisher-scoring algorithm. This article addresses these two seemingly contradictory phenomena. Using statistical and numerical analyses, the article clarifies why both approaches increase the rate of convergence in SEM. Monte Carlo results confirm the analytical results. Recommendations are provided on how to increase both the speed and rate of convergence in parameter estimation. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0552-2 Issue No: Vol. 69, No. 3 (2017)

Authors: Shan Luo; Gengsheng Qin Pages: 599 - 626 Abstract: Low-income proportion is an important index in describing the inequality of an income distribution. It has been widely used by governments around the world in measuring social stability. Established inferential methods for this index are based on the empirical estimator of the index, which may perform poorly in finite samples when the income data are skewed or contain outliers. In this paper, based on a smooth estimator for the low-income proportion, we propose a smoothed jackknife empirical likelihood approach for inference on the low-income proportion. Wilks' theorem is obtained for the proposed jackknife empirical likelihood ratio statistic. Various confidence intervals based on the smooth estimator are constructed. Extensive simulation studies are conducted to compare the finite-sample performance of the proposed intervals with that of some existing intervals. Finally, the proposed methods are illustrated with a public dataset of professors' incomes in the University System of Georgia. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0554-0 Issue No: Vol. 69, No. 3 (2017)

Authors: Ping Wu; Xinchao Luo; Peirong Xu; Lixing Zhu Pages: 627 - 646 Abstract: In this paper, we consider how to select both the fixed effects and the random effects in linear mixed models. To make variable selection more efficient for models in which the covariates associated with fixed and random effects are highly correlated, a novel approach is proposed that orthogonalizes fixed and random effects so that the two sets of effects can be selected separately with less influence on one another. Also, unlike most existing methods, which rely on parametric assumptions, the new method requires only fourth-order moments of the random variables involved. The oracle property is proved. The performance of our method is examined by a simulation study. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0555-z Issue No: Vol. 69, No. 3 (2017)

Authors: Jing Xi; Jin Xie; Ruriko Yoshida Pages: 647 - 671 Abstract: In order to conduct a statistical analysis on a given set of phylogenetic gene trees, we often use a distance measure between two trees. In a statistical distance-based method for analyzing discordance between gene trees, it is key to choose a “biologically meaningful” and “statistically well-distributed” distance between trees. Thus, in this paper, we study the distributions of three tree distance metrics between two trees: the edge difference, the path difference, and the precise K interval cospeciation distance. First, we focus on the distributions of the three tree distances between two random unrooted trees with n leaves (\(n \ge 4\)); then we focus on the distributions of the three tree distances between a fixed rooted species tree with n leaves and a random gene tree with n leaves generated under the coalescent process with the given species tree. We present some theoretical results as well as a simulation study on these distributions. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0557-x Issue No: Vol. 69, No. 3 (2017)

Authors: Elizabeth Gross; Sonja Petrović; Despina Stasi Pages: 673 - 704 Abstract: Social networks and other sparse data sets pose significant challenges for statistical inference, since many standard statistical methods for testing model/data fit are not applicable in such settings. Algebraic statistics offers a theoretically justified approach to goodness-of-fit testing that relies on the theory of Markov bases. Most current practices require the computation of the entire basis, which is infeasible in many practical settings. We present a dynamic approach to explore the fiber of a model, which bypasses this issue, and is based on the combinatorics of hypergraphs arising from the toric algebra structure of log-linear models. We demonstrate the approach on the Holland–Leinhardt \(p_1\) model for random directed graphs that allows for reciprocation effects. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0560-2 Issue No: Vol. 69, No. 3 (2017)

Authors: Chang Xuan Mao; Cuiying Yang; Yitong Yang; Wei Zhuang Pages: 705 - 716 Abstract: The Rasch model has been used to estimate the unknown size of a population from multi-list data. It can take both the list effectiveness and individual heterogeneity into account. Estimating the population size is shown to be equivalent to estimating the odds that an individual is unseen. The odds parameter is nonidentifiable. We propose a sequence of estimable lower bounds, including the greatest one, for the odds parameter. We show that a lower bound can be calculated by linear programming. Estimating a lower bound of the odds leads to an estimator for a lower bound of the population size. A simulation experiment is performed and three real examples are studied. PubDate: 2017-06-01 DOI: 10.1007/s10463-016-0561-1 Issue No: Vol. 69, No. 3 (2017)

Authors: Xiaohui Liu; Qihua Wang; Yi Liu Pages: 249 - 269 Abstract: In this paper, a jackknife empirical likelihood based approach is developed to test whether the underlying distribution is equal to a specified one. The limiting distribution of the proposed test statistic is derived under some mild conditions. It turns out that the proposed test is consistent and easy to implement. Some simulation studies are conducted to evaluate the finite-sample behavior by comparing the proposed method with the existing one. A real data example is also analyzed to illustrate the proposed test. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0550-9 Issue No: Vol. 69, No. 2 (2017)

Authors: Ao Yuan; Mihai Giurcanu; George Luta; Ming T. Tan Pages: 271 - 302 Abstract: For incomplete data models, the classical U-statistic estimator of a functional parameter of the underlying distribution cannot be computed directly since the data are not fully observed. To estimate such a functional parameter, we propose a U-statistic using a substitution estimator of the conditional kernel given the observed data. This kernel estimator is obtained by substituting the non-parametric maximum likelihood estimator for the underlying distribution function in the expression of the conditional kernel. We study the asymptotic properties of the proposed U-statistic for several incomplete data models, and in a simulation study, we assess the finite sample performance of the Mann–Whitney U-statistic with conditional kernel in the current status model. The analysis of a real-world data set illustrates the application of the proposed methods in practice. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0537-6 Issue No: Vol. 69, No. 2 (2017)

Authors: Jean-François Coeurjolly Pages: 303 - 331 Abstract: This paper is concerned with a robust estimator of the intensity of a stationary spatial point process. The estimator corresponds to the median of a jittered sample of the number of points, computed from a tessellation of the observation domain. We show that this median-based estimator satisfies a Bahadur representation from which we deduce its consistency and asymptotic normality under mild assumptions on the spatial point process. Through a simulation study, we compare the new estimator, in particular, with the standard one counting the mean number of points per unit volume. The empirical study confirms the asymptotic properties established in the theoretical part and shows that the median-based estimator is more robust to outliers than standard procedures. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0536-7 Issue No: Vol. 69, No. 2 (2017)

Authors: Andrew M. Raim; Nagaraj K. Neerchal; Jorge G. Morel Pages: 333 - 364 Abstract: A simple closed form of the Fisher information matrix (FIM) usually cannot be obtained under a finite mixture. Several authors have considered a block-diagonal FIM approximation for binomial and multinomial finite mixtures, used in scoring and in demonstrating relative efficiency of proposed estimators. Raim et al. (Stat Methodol 18:115–130, 2014a) noted that this approximation coincides with the complete data FIM of the observed data and latent mixing process jointly. It can, therefore, be formulated for a wide variety of missing data problems. Multinomial mixtures feature a number of trials, which, when taken to infinity, result in the FIM and approximation becoming arbitrarily close. This work considers a clustered sampling scheme which allows the convergence result to be extended significantly to the class of exponential family finite mixtures. A series of examples demonstrate the convergence result and suggest that it can be further generalized. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0542-9 Issue No: Vol. 69, No. 2 (2017)

Authors: Jiang Hu; Zhidong Bai; Chen Wang; Wei Wang Pages: 365 - 387 Abstract: In this article, we focus on the problem of testing the equality of several high dimensional mean vectors with unequal covariance matrices. This is one of the most important problems in multivariate statistical analysis and there have been various tests proposed in the literature. Motivated by Bai and Saranadasa (Stat Sin 6:311–329, 1996) and Chen and Qin (Ann Stat 38:808–835, 2010), we introduce a test statistic and derive its asymptotic distributions under the null and the alternative hypotheses. In addition, it is compared with a test statistic recently proposed by Srivastava and Kubokawa (J Multivar Anal 115:204–216, 2013). It is shown that our test statistic performs better, especially in the large dimensional case. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0543-8 Issue No: Vol. 69, No. 2 (2017)

Authors: Giacomo Aletti; Matteo Ruffini Pages: 389 - 416 Abstract: In this paper, we study periodical stochastic processes, and we define the conditions that are needed by a model to be a good noise model on the circumference. The classes of processes that fit the required conditions are studied together with their expansion in random Fourier series to provide results about their path regularity. Finally, we discuss a simple and flexible parametric model with prescribed regularity that is used in applications, and we prove the asymptotic properties of the maximum likelihood estimates of model parameters. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0546-5 Issue No: Vol. 69, No. 2 (2017)

Authors: Paweł Marcin Kozyra; Tomasz Rychlik Pages: 417 - 428 Abstract: For classic i.i.d. samples with an arbitrary nondegenerate and finite variance distribution, Papadatos (1995, Annals of the Institute of Statistical Mathematics, 47, 185–193) presented sharp lower and upper bounds on the variances of order statistics, expressed in population variance units. We provide here analogous results for spacings. Also, we describe the parent distributions which attain the bounds. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0545-6 Issue No: Vol. 69, No. 2 (2017)

Authors: Christopher Partlett; Prakash Patil Pages: 429 - 460 Abstract: In this paper, we show that some of the most commonly used tests of symmetry do not have power which is reflective of the size of asymmetry. This is because the primary rationale for the test statistics that are proposed in the literature to test for symmetry is to detect the departure from symmetry, rather than the quantification of the asymmetry. As a result, tests of symmetry based upon these statistics do not necessarily generate power that is representative of the departure from the null hypothesis of symmetry. Recent research has produced new measures of asymmetry, which have been shown to do an admirable job of quantifying the amount of asymmetry. We propose several new tests based upon one such measure. We derive the asymptotic distribution of the test statistics and analyse the performance of these proposed tests through the use of a simulation study. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0547-4 Issue No: Vol. 69, No. 2 (2017)

Authors: Vygantas Paulauskas; Marijus Vaičiulis Pages: 461 - 487 Abstract: In this paper, we propose a new class of functions which is used to construct tail index estimators. Functions from this new class are non-monotone in general, but they are the product of two monotone functions: the power function and the logarithmic function, which play an essential role in the classical Hill estimator. The newly introduced generalized moment ratio estimator and generalized Hill estimator have better asymptotic performance than the corresponding classical estimators over the whole range of the parameters that appear in the second-order regular variation condition. Asymptotic normality of the introduced estimators is proved, and a comparison (using asymptotic mean square error) with other estimators of the tail index is provided. Some preliminary simulation results are presented. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0548-3 Issue No: Vol. 69, No. 2 (2017)
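For context, the classical Hill estimator that this abstract takes as its baseline averages the log-excesses of the k largest observations over the (k+1)th largest. The sketch below implements only that classical estimator, in its standard textbook form; it does not reproduce the generalized estimators proposed in the paper, and the function name `hill_estimator` is chosen here for illustration.

```python
import math


def hill_estimator(sample, k):
    """Classical Hill estimate of the extreme value index (1 / tail index).

    Uses the k largest observations, with the (k+1)th largest as threshold.
    """
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    # Mean of log(X_(i) / threshold) over the k largest observations.
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


if __name__ == "__main__":
    print(hill_estimator([1.0, 2.0, 4.0, 8.0], k=2))
```

With the sample 1, 2, 4, 8 and k = 2, the threshold is 2 and the estimate is (log 4 + log 2) / 2. In practice the result depends strongly on the choice of k.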

Authors: Yong Kong Pages: 489 - 495 Abstract: Distributions of runs of length at least k (Type II runs) and overlapping runs of length k (Type III runs) are derived in a unified way using a new generating function approach. A new and more compact formula is obtained for the probability mass function of the Type III runs. PubDate: 2017-04-01 DOI: 10.1007/s10463-015-0549-2 Issue No: Vol. 69, No. 2 (2017)
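The two counting schemes named in this abstract can be made concrete with a small example. The sketch below counts Type II and Type III runs in one observed binary sequence, following the descriptions given in the abstract; it does not implement the paper's generating function approach, and the function names and the encoding of a success as 1 are assumptions made for illustration.

```python
from itertools import groupby


def success_run_lengths(seq, success=1):
    """Lengths of the maximal runs of `success` in seq."""
    return [len(list(group)) for value, group in groupby(seq) if value == success]


def count_type_ii(seq, k, success=1):
    """Type II: number of success runs of length at least k."""
    return sum(1 for length in success_run_lengths(seq, success) if length >= k)


def count_type_iii(seq, k, success=1):
    """Type III: number of overlapping success runs of length exactly k.

    A maximal run of length L >= k contains L - k + 1 overlapping windows.
    """
    return sum(max(length - k + 1, 0) for length in success_run_lengths(seq, success))


if __name__ == "__main__":
    seq = [1, 1, 1, 0, 1, 1, 0, 1]  # success runs of lengths 3, 2, 1
    print(count_type_ii(seq, 2), count_type_iii(seq, 2))
```

For this sequence and k = 2, two runs have length at least 2, while the run of length 3 contributes two overlapping windows and the run of length 2 contributes one, giving three Type III runs.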

Authors: Matthew Thorpe; Adam M. Johansen Abstract: Establishing the convergence of splines can be cast as a variational problem which is amenable to a \(\Gamma\)-convergence approach. We consider the case in which the regularization coefficient scales with the number of observations, n, as \(\lambda_n = n^{-p}\). Using standard theorems from the \(\Gamma\)-convergence literature, we prove that the general spline model is consistent in that estimators converge in a sense slightly weaker than weak convergence in probability for \(p \le \frac{1}{2}\). Without further assumptions, we show this rate is sharp. This differs from rates for strong convergence using Hilbert scales where one can often choose \(p > \frac{1}{2}\). PubDate: 2017-04-04 DOI: 10.1007/s10463-017-0609-x