Abstract: Let G be the graph corresponding to the graphical model of nearest neighbor interaction in a Gaussian character. We study Natural Exponential Families (NEF) of Wishart distributions on convex cones \(Q_G\) and \(P_G\), where \(P_G\) is the cone of tridiagonal positive definite real symmetric matrices, and \(Q_G\) is the dual cone of \(P_G\). The Wishart NEF that we construct include Wishart distributions considered earlier for models based on decomposable (chordal) graphs. Our approach is, however, different and allows us to study the basic objects of Wishart NEF on the cones \(Q_G\) and \(P_G\). We determine Riesz measures generating Wishart exponential families on \(Q_G\) and \(P_G\), and we give the quadratic construction of these Riesz measures and exponential families. The mean, inverse-mean, covariance and variance functions, as well as moments of higher order, are studied and their explicit formulas are given. PubDate: 2019-04-01

Abstract: Abstract This paper concerns the study of the entire conditional distribution of a response given predictors in a heterogeneous regression setting. A common approach to address heterogeneous data is quantile regression, which utilizes the minimization of the \(L_1\) norm. As an alternative to quantile regression, we consider expectile regression, which relies on the minimization of the asymmetric \(L_2\) norm and detects heteroscedasticity effectively. We assume that only a small set of predictors is relevant to the response and develop penalized expectile regression with SCAD and adaptive LASSO penalties. With properly chosen tuning parameters, we show that the proposed estimators display oracle properties. A numerical study using simulated and real examples demonstrates the competitive performance of the proposed penalized expectile regression, and its combined use with penalized quantile regression would be helpful and recommended for practitioners. PubDate: 2019-04-01
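
To make the asymmetric \(L_2\) criterion concrete, here is a minimal sketch of the expectile loss and of a sample expectile computed by iteratively reweighted means. The function names `expectile_loss` and `sample_expectile` are illustrative assumptions, and the sketch covers only the unpenalized, intercept-only case, not the SCAD or adaptive LASSO estimators studied in the paper.

```python
import numpy as np


def expectile_loss(residuals, tau):
    """Asymmetric squared loss: weight tau where r >= 0, weight 1 - tau where r < 0."""
    r = np.asarray(residuals, dtype=float)
    weights = np.where(r >= 0, tau, 1.0 - tau)
    return float(np.sum(weights * r ** 2))


def sample_expectile(y, tau, tol=1e-10, max_iter=1000):
    """tau-expectile of a sample, found as a fixed point of a weighted mean.

    Hypothetical helper for illustration: minimizes expectile_loss(y - m, tau) over m.
    """
    y = np.asarray(y, dtype=float)
    m = y.mean()
    for _ in range(max_iter):
        w = np.where(y >= m, tau, 1.0 - tau)
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            return float(m_new)
        m = m_new
    return float(m)
```

With `tau = 0.5` the weights are symmetric and the expectile reduces to the ordinary mean, which parallels how the median arises from the symmetric \(L_1\) loss in quantile regression.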

Abstract: This paper develops a frequentist model averaging approach for threshold model specifications. The resulting estimator is proved to be asymptotically optimal in the sense of achieving the lowest possible squared errors. In particular, when combining estimators from threshold autoregressive models, this approach is also proved to be asymptotically optimal. Simulation results show that in the situation where the existing model averaging approach is not applicable, our proposed model averaging approach performs well; in the other situations, it performs marginally better than other commonly used model selection and model averaging methods. An empirical application of our approach to the US unemployment data is given. PubDate: 2019-04-01

Abstract: This paper presents simple weighted and fully augmented weighted estimators for the additive hazards model with missing covariates when they are missing at random. The additive hazards model estimates the difference in hazards and has an intuitive biological interpretation. The proposed weighted estimators for the additive hazards model use incomplete data nonparametrically and have closed-form expressions. We show that they are consistent and asymptotically normal, and are more efficient than the simple weighted estimator which only uses the complete data. We illustrate their finite-sample performance through simulation studies and an application to study the progression from mild cognitive impairment to dementia using data from the Alzheimer’s Disease Neuroimaging Initiative as well as an application to the mouse leukemia study. PubDate: 2019-04-01

Abstract: In the spirit of Bross (Biometrics 14:18–38, 1958), this paper considers ridit reliability functionals to develop test procedures for the equality of \(K(>2)\) treatment effects in the nonparametric analysis of covariance (ANCOVA) model with d covariates based on two different methods. The procedures are asymptotically distribution free and are not based on the assumption that the distribution functions (d.f.’s) of the response variable and the associated covariates are continuous. By means of a simulation study, the proposed methods are compared with the methods provided by Tsangari and Akritas (J Multivar Anal 88:298–319, 2004) and Bathke and Brunner (Recent advances and trends in nonparametric statistics, Elsevier, Amsterdam, 2003) under ANCOVA in terms of type I error rate and power. PubDate: 2019-04-01

Abstract: Let k be a positive integer. Some exact distributions of the waiting time random variables for k consecutive repetitions of a pattern are derived in a sequence of independent identically distributed trials. It is proved that the number of equations of conditional probability generating functions for deriving the distribution can be reduced to less than or equal to the length of the basic pattern to be repeated consecutively. By using the result, various properties of the distributions of usual runs are extended to those of consecutive repetitions of a pattern. These results include some properties of the geometric distribution of order k and those of the waiting time distributions of the \((k_1,k_2)\)-events. Further, the probability generating function of the number of non-overlapping occurrences of k consecutive repetitions of a pattern can be written in an explicit form with k as a parameter. Some recurrence relations, which are useful for evaluating the probability mass functions, are also given. PubDate: 2019-04-01
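
As a small illustration of the classical special case mentioned above (a pattern of length one, i.e. usual success runs), the following sketch evaluates the probability mass function of the waiting time for k consecutive successes, the geometric distribution of order k, by a simple forward recursion over the current run length. The function name `waiting_time_pmf` is an assumption made for this example, and the recursion is a generic Markov chain computation, not the conditional probability generating function method of the paper.

```python
def waiting_time_pmf(k, p, n_max):
    """P(T = n) for n = 0..n_max, where T is the first trial completing k successes in a row.

    state[j] holds the probability that no run of length k has occurred yet
    and the current success run has length j (0 <= j < k).
    """
    q = 1.0 - p
    state = [0.0] * k
    state[0] = 1.0
    pmf = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        new_state = [0.0] * k
        new_state[0] = q * sum(state)          # a failure resets the run
        for j in range(1, k):
            new_state[j] = p * state[j - 1]    # a success extends the run
        pmf[n] = p * state[k - 1]              # a success completes the k-th in a row
        state = new_state
    return pmf
```

For a fair coin and k = 2 the recursion gives P(T = 2) = 1/4 and P(T = 3) = 1/8, and the mean agrees with the known closed form \((1-p^k)/(q\,p^k) = 6\).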

Abstract: Abstract Non-concave penalized maximum likelihood methods are widely used because they are more efficient than the Lasso. They include a tuning parameter which controls a penalty level, and several information criteria have been developed for selecting it. While these criteria assure the model selection consistency, they have a problem in that there are no appropriate rules for choosing one from the class of information criteria satisfying such a preferred asymptotic property. In this paper, we derive an information criterion based on the original definition of the AIC by considering minimization of the prediction error rather than model selection consistency. Concretely speaking, we derive a function of the score statistic that is asymptotically equivalent to the non-concave penalized maximum likelihood estimator and then provide an estimator of the Kullback–Leibler divergence between the true distribution and the estimated distribution based on the function, whose bias converges in mean to zero. PubDate: 2019-04-01

Abstract: In this paper, we consider an unbalanced urn model with multiple drawing. At each discrete time step n, we draw m balls at random from an urn containing white and blue balls. The replacement of the balls follows either an opposite or a self-reinforcement rule. Under the opposite reinforcement rule, we use the stochastic approximation algorithm to obtain a strong law of large numbers and a central limit theorem for \(W_n\), the number of white balls after n draws. Under the self-reinforcement rule, we prove that, after suitable normalization, the number of white balls \(W_n\) converges almost surely to a random variable \(W_\infty\), which has an absolutely continuous distribution. PubDate: 2019-04-01

Abstract: This paper develops a robust profile estimation method for the parametric and nonparametric components of a single-index model when the errors have a strongly unimodal density with an unknown nuisance parameter. We derive consistency results for the link function estimators as well as consistency and asymptotic distribution results for the single-index parameter estimators. Under a log-Gamma model, the sensitivity to anomalous observations is studied using the empirical influence curve. We also discuss a robust K-fold cross-validation procedure to select the smoothing parameters. A numerical study carried out with errors following a log-Gamma model and for contaminated schemes shows the good robustness properties of the proposed estimators and the advantages of considering a robust approach instead of the classical one. A real data set illustrates the use of our proposal. PubDate: 2019-03-21

Abstract: Issues regarding missing data are critical in observational and experimental research. Recently, for datasets with mixed continuous–discrete variables, multiple imputation by chained equations (MICE) has been widely used, although MICE may yield severely biased estimates. We propose a new semiparametric Bayes multiple imputation approach that can deal with continuous and discrete variables. This enables us to overcome a shortcoming of MICE, whose conditional models must satisfy strong conditions (known as compatibility) to guarantee that the obtained estimators are consistent. Our simulation studies show that the coverage probability of the 95% interval calculated using MICE can be less than 1%, while the MSE of the proposed method can be less than one-fiftieth of that of MICE. We applied our method to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, and the results are consistent with those of previous works that used panel data other than the ADNI database, whereas the existing methods, such as MICE, yielded inconsistent results. PubDate: 2019-03-11

Abstract: Abstract The cross ratio function (CRF) is a commonly used tool to describe local dependence between two correlated variables. Being a ratio of conditional hazards, the CRF can be rewritten in terms of (first and second derivatives of) the survival copula of these variables. Bernstein estimators for (the derivatives of) this survival copula are used to define a nonparametric estimator of the cross ratio, and asymptotic normality thereof is established. We consider simulations to study the finite sample performance of our estimator for copulas with different types of local dependency. A real dataset is used to investigate the dependence between food expenditure and net income. The estimated CRF reveals that families with a low net income relative to the mean net income will spend less money to buy food compared to families with larger net incomes. This dependence, however, disappears when the net income is large compared to the mean income. PubDate: 2019-02-26
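
For orientation, the standard identities behind the statement that the CRF can be written through the survival copula are sketched below. The notation is an assumption made for this note: \(S\) is the joint survival function of \((T_1,T_2)\) with margins \(S_1, S_2\), \(C\) is the survival copula with \(S(t_1,t_2)=C(S_1(t_1),S_2(t_2))\), and subscripts on \(C\) denote partial derivatives.

```latex
\theta(t_1,t_2)
  = \frac{\lambda(t_1 \mid T_2 = t_2)}{\lambda(t_1 \mid T_2 > t_2)}
  = \frac{S(t_1,t_2)\,\partial_1\partial_2 S(t_1,t_2)}
         {\partial_1 S(t_1,t_2)\,\partial_2 S(t_1,t_2)}
  = \frac{C(u,v)\,C_{12}(u,v)}{C_1(u,v)\,C_2(u,v)},
\qquad u = S_1(t_1),\; v = S_2(t_2).
```

The last equality follows from the chain rule, since the marginal density factors \(S_1'(t_1)\,S_2'(t_2)\) appear in both numerator and denominator and cancel. A value of 1 corresponds to local independence.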

Abstract: Scoring rules serve to quantify predictive performance. A scoring rule is proper if truth telling is an optimal strategy in expectation. Subject to customary regularity conditions, every scoring rule can be made proper, by applying a special case of the Bayes act construction studied by Grünwald and Dawid (Ann Stat 32:1367–1433, 2004) and Dawid (Ann Inst Stat Math 59:77–93, 2007), to which we refer as properization. We discuss examples from the recent literature and apply the construction to create new types, and reinterpret existing forms, of proper scoring rules and consistent scoring functions. In an abstract setting, we formulate sufficient conditions under which Bayes acts exist and scoring rules can be made proper. PubDate: 2019-02-22
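
The construction can be sketched in symbols as follows. The notation is an assumption made for this note: scoring rules are taken to be negatively oriented (smaller is better), \(S(P,y)\) is the score of forecast \(P\) when \(y\) materializes, and existence of the minimizer is taken for granted.

```latex
% Expected score of forecast P when Y has distribution Q
S(P,Q) = \mathbb{E}_{Y\sim Q}\, S(P,Y)

% Propriety: truth telling is optimal in expectation
S(Q,Q) \le S(P,Q) \quad \text{for all } P, Q

% Bayes act under P and the properized score
P^{*} \in \operatorname*{arg\,min}_{R}\, S(R,P),
\qquad S^{*}(P,y) = S(P^{*},y)
```

Under these assumptions \(S^{*}\) is proper because \(S^{*}(Q,Q) = S(Q^{*},Q) \le S(P^{*},Q) = S^{*}(P,Q)\) by the definition of \(Q^{*}\).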

Abstract: In this article, I explore in a unified manner the structure of uniform slash and \(\alpha\)-slash distributions which, in the continuous case, are defined to be the distributions of \(Y/U\) and \(Y_\alpha /U^{1/\alpha }\), where Y and \(Y_\alpha\) follow any distribution on \(\mathbb {R}^+\) and, independently, U is uniform on (0, 1). The parallels with the monotone and \(\alpha\)-monotone distributions of \(Y \times U\) and \(Y_\alpha \times U^{1/\alpha }\), respectively, are striking. I also introduce discrete uniform slash and \(\alpha\)-slash distributions which arise from a notion of negative binomial thinning/fattening. Their specification, although apparently rather different from the continuous case, seems to be a good one because of the close way in which their properties mimic those of the continuous case. PubDate: 2019-02-15
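
The continuous definitions translate directly into a sampler. The sketch below is illustrative only: the function name `slash_and_monotone` is an assumption, and the exponential choice for Y in the usage note is arbitrary, since the abstract allows any distribution on \(\mathbb {R}^+\).

```python
import numpy as np


def slash_and_monotone(y, alpha=1.0, rng=None):
    """Return (Y / U**(1/alpha), Y * U**(1/alpha)) for one shared uniform draw U.

    y     : array of draws from any distribution on the positive half-line
    alpha : positive shape parameter; alpha = 1 gives the uniform slash case
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    # 1 - random() lies in (0, 1], which avoids division by zero.
    u = 1.0 - rng.random(y.shape)
    scale = u ** (1.0 / alpha)
    return y / scale, y * scale
```

For example, `slash_and_monotone(np.random.default_rng(0).exponential(size=1000), alpha=2.0)` returns a heavier-tailed slash sample and a lighter monotone sample built from the same Y, so that every slash draw is at least as large as its Y and every monotone draw is at most as large.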

Abstract: Estimation and hypothesis testing for partial linear single-index multiplicative models are considered in this paper. To estimate the unknown single-index parameter, we propose a profile least product relative error estimator coupled with a leave-one-component-out method. To test a hypothesis on the parametric components, a Wald-type test statistic is proposed. We employ the smoothly clipped absolute deviation penalty to select relevant variables. To study the model checking problem, we propose a variant of the integrated conditional moment test statistic by using a linear projection weighting function, and we also suggest a bootstrap procedure for calculating critical values. Simulation studies are conducted to demonstrate the performance of the proposed procedure and a real example is analyzed for illustration. PubDate: 2019-02-12

Abstract: We present a detection problem where several spatially distributed sensors observe Poisson signals emitted from a single radioactive source of unknown position. The measurements at each sensor are modeled by independent inhomogeneous Poisson processes. A method based on Bayesian change-point estimation is proposed to identify the coordinates of the source. The asymptotic behavior of the Bayesian estimator is studied. In particular, the consistency and the asymptotic efficiency of the estimator are analyzed. The limit distribution and the convergence of the moments are also described. A similar statistical model could be used in GPS localization problems. PubDate: 2019-02-08

Abstract: We revisit the problem of testing for multivariate reflected symmetry about an unspecified point. Although this testing problem is invariant with respect to full-rank affine transformations, among the few hitherto proposed tests only a class of tests studied in Henze et al. (J Multivar Anal 87:275–297, 2003) that depends on a positive parameter a respects this property. We identify a measure of deviation \(\varDelta _a\) (say) from symmetry associated with the test statistic \(T_{n,a}\) (say), and we obtain the limit normal distribution of \(T_{n,a}\) as \(n \rightarrow \infty \) under a fixed alternative to symmetry. Since a consistent estimator of the variance of this limit normal distribution is available, we obtain an asymptotic confidence interval for \(\varDelta _a\). The test, when applied to a classical data set, strongly rejects the hypothesis of reflected symmetry, although other tests do not even reject the much stronger hypothesis of elliptical symmetry. PubDate: 2019-02-08

Abstract: Abstract This article is concerned with proving the consistency of Efron’s bootstrap for the Kaplan–Meier estimator on the whole support of a survival function. While previous works address the asymptotic Gaussianity of the Kaplan–Meier estimator without restricting time, we enable the construction of bootstrap-based time-simultaneous confidence bands for the whole survival function. Other practical applications include bootstrap-based confidence bands for the mean residual lifetime function or the Lorenz curve as well as confidence intervals for the Gini index. Theoretical results are complemented with a simulation study and a real data example which result in statistical recommendations. PubDate: 2019-02-01
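
To fix ideas, here is a minimal sketch of the two ingredients named above: the Kaplan–Meier estimator and Efron's bootstrap, which resamples the (time, event indicator) pairs with replacement. All function names are assumptions made for this example. The sketch yields only a pointwise bootstrap distribution at a single time point, not the time-simultaneous bands developed in the article.

```python
import numpy as np


def kaplan_meier(times, events):
    """Distinct event times and the Kaplan-Meier survival estimate at each of them."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & events)
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def km_at(t, times, events):
    """Kaplan-Meier survival estimate evaluated at time t (right-continuous step function)."""
    event_times, surv = kaplan_meier(times, events)
    idx = np.searchsorted(event_times, t, side="right")
    return 1.0 if idx == 0 else float(surv[idx - 1])


def bootstrap_km(t, times, events, n_boot=500, rng=None):
    """Efron's bootstrap: resample (time, event) pairs and recompute the estimate at t."""
    rng = np.random.default_rng() if rng is None else rng
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = times.size
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        out[b] = km_at(t, times[idx], events[idx])
    return out
```

Empirical quantiles of the returned array, for instance via `np.quantile`, give a simple pointwise percentile interval for the survival probability at `t`.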

Abstract: In this paper, our aim is to revisit the nonparametric estimation of a square integrable density f on \({\mathbb {R}}\), by using projection estimators on a Hermite basis. These estimators are studied from the point of view of their mean integrated squared error on \({\mathbb {R}}\). A model selection method is described and proved to perform an automatic bias-variance compromise. Then, we present another collection of estimators, of deconvolution type, for which we define another model selection strategy. Although the minimax asymptotic rates of these two types of estimators are mainly equivalent, the complexity of the Hermite estimators is usually much lower than the complexity of their deconvolution (or kernel) counterparts. These results are illustrated through a small simulation study. PubDate: 2019-02-01
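
A projection estimator on the Hermite basis can be sketched in a few lines. The names `hermite_functions` and `hermite_density_estimate` are assumptions made for this example, and the dimension `m` is fixed by hand here, whereas the paper selects it by a data-driven model selection method. The orthonormal Hermite functions are computed with the usual three-term recurrence, and each coefficient is estimated by the empirical mean of the corresponding basis function over the sample.

```python
import math

import numpy as np


def hermite_functions(x, m):
    """Return an (m, len(x)) array with the first m orthonormal Hermite functions (m >= 1)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    h = np.empty((m, x.size))
    h[0] = math.pi ** -0.25 * np.exp(-0.5 * x ** 2)
    if m > 1:
        h[1] = math.sqrt(2.0) * x * h[0]
    for j in range(1, m - 1):
        # Stable three-term recurrence for the normalized functions.
        h[j + 1] = (math.sqrt(2.0 / (j + 1)) * x * h[j]
                    - math.sqrt(j / (j + 1)) * h[j - 1])
    return h


def hermite_density_estimate(sample, grid, m):
    """Projection estimate: sum over j < m of a_j * h_j(grid), with a_j = mean of h_j(sample)."""
    coeffs = hermite_functions(sample, m).mean(axis=1)
    return coeffs @ hermite_functions(grid, m)
```

Evaluating the estimate costs one pass over the sample to obtain `m` coefficients, which is one way to see the low complexity of Hermite estimators noted in the abstract.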

Abstract: In the mean regression context, this study considers several frequently encountered heteroscedastic error models where the regression mean and variance functions are specified up to certain parameters. An important point we note through a series of analyses is that different assumptions on standardized regression errors yield quite different efficiency bounds for the corresponding estimators. Consequently, all aspects of the assumptions need to be specifically taken into account in constructing their corresponding efficient estimators. This study clarifies the relation between the regression error assumptions and their respective efficiency bounds under the general regression framework with heteroscedastic errors. Our simulation results support our findings; we carry out a real data analysis using the proposed methods where the Cobb–Douglas cost model is the regression mean. PubDate: 2019-02-01

Abstract: Conditionally specified models offer a higher level of flexibility than the joint approach. Regression switching in multiple imputation is a typical example. However, reasonable-seeming conditional models are generally not coherent with one another. A Gibbs sampler based on incompatible conditionals is called a pseudo-Gibbs sampler, whose properties are mostly unknown. This article investigates the richness and commonalities among their stationary distributions. We show that the Gibbs sampler replaces the conditional distributions iteratively but keeps the marginal distributions invariant. In the process, it minimizes the Kullback–Leibler divergence. Next, we prove that systematic pseudo-Gibbs projections converge for every scan order, and the stationary distributions share marginal distributions in a circular fashion. Therefore, regardless of compatibility, univariate consistency is guaranteed when the orders of imputation are circularly related. Moreover, a conditional model and its pseudo-Gibbs distributions have an equal number of parameters. The study of the pseudo-Gibbs sampler provides a fresh perspective for understanding the original Gibbs sampler. PubDate: 2019-02-01