Abstract: A long-standing problem in the construction of asymptotically correct confidence bands for a regression function \(m(x)=E[Y\mid X=x]\) , where Y is the response variable influenced by the covariate X, arises when Y values may be missing at random and the selection probability, the density function f(x) of X, and the conditional variance of Y given X are all completely unknown. The problem is particularly challenging in nonparametric settings. In this paper, we propose a new kernel-type regression estimator and study the limiting distribution of properly normalized versions of the maximal deviation of the proposed estimator from the true regression curve. The resulting limiting distribution is used to construct uniform confidence bands for the underlying regression curve with asymptotically correct coverage. The focus of the current paper is on the case where \(X\in \mathbb {R}\) . We also perform numerical studies to assess the finite-sample performance of the proposed method, and discuss both the mechanics and the theoretical validity of our methods. PubDate: 2019-02-07 DOI: 10.1007/s10182-019-00351-7
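The object being estimated above, \(m(x)=E[Y\mid X=x]\), can be illustrated with a few lines of code. The snippet below is a generic complete-case Nadaraya–Watson estimator under missingness at random, not the estimator proposed in the paper; the Gaussian kernel and the bandwidth h are illustrative choices.

```python
import numpy as np

def nw_estimate(x0, x, y, observed, h):
    """Nadaraya-Watson estimate of m(x0) = E[Y | X = x0] using only the
    complete cases (a generic baseline, not the paper's estimator)."""
    xo, yo = x[observed], y[observed]
    w = np.exp(-0.5 * ((x0 - xo) / h) ** 2)   # Gaussian kernel weights
    return np.sum(w * yo) / np.sum(w)

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, n)
observed = rng.random(n) < 0.8                # ~20% of responses missing at random
m_hat = nw_estimate(0.5, x, y, observed, h=0.1)
print(round(m_hat, 2))                        # close to sin(pi/2) = 1
```

A uniform band would then be obtained from the limiting distribution of the normalized maximal deviation of such an estimate over x, which is the paper's contribution.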

Abstract: The problem of how to determine portfolio weights so that the variance of portfolio returns is minimized has received considerable attention in the literature, and several methods have been proposed. Some properties of these estimators, however, remain unknown, and many of their relative strengths and weaknesses are therefore difficult for users to assess. This paper contributes to the field by comparing and contrasting the risk functions used to derive efficient portfolio weight estimators. It is argued that the risk functions commonly used to derive and evaluate estimators may be inadequate and that alternative quality criteria should be considered instead. The theoretical discussion is supported by a Monte Carlo simulation and two empirical applications, with particular focus on cases where the number of assets (p) is close to the number of observations (n). PubDate: 2019-02-04 DOI: 10.1007/s10182-018-00349-7
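The baseline estimator in this literature is the plug-in global minimum-variance rule, obtained by minimizing \(w'\Sigma w\) subject to the weights summing to one, which gives \(w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})\). A minimal sketch using the sample covariance (which, as the abstract notes, becomes unreliable when p approaches n):

```python
import numpy as np

def gmv_weights(returns):
    """Global minimum-variance weights w = S^{-1}1 / (1'S^{-1}1) based on
    the sample covariance S; known to be unreliable when p approaches n."""
    S = np.cov(returns, rowvar=False)
    w = np.linalg.solve(S, np.ones(S.shape[0]))
    return w / w.sum()

rng = np.random.default_rng(1)
R = rng.normal(0.0, [0.01, 0.02, 0.03], size=(250, 3))  # 3 assets, 250 days
w = gmv_weights(R)
print(w)   # weights sum to 1; the lowest-variance asset gets the largest weight
```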

Abstract: We introduce a new class of regression models based on the geometric Tweedie models (GTMs) for analyzing both continuous and semicontinuous data, in the spirit of the recent and standard Tweedie regression models. We also present a phenomenon of variation relative to the equi-varied exponential distribution, whose variance is equal to the squared mean. The corresponding power v-functions, which characterize the GTMs and are obtained in turn by an exponential-Tweedie mixture, are transformed into variance functions so that conventional generalized linear models can be used. The real power parameter of GTMs acts as an automatic distribution selector, covering the asymmetric Laplace, geometric-compound-Poisson-gamma and geometric-Mittag-Leffler distributions. The classification of all power v-functions reveals only two border count distributions, namely the geometric and the geometric-Poisson. We establish practical properties of the GTMs concerning the zero-mass and variation phenomena, also in connection with some reliability measures. Simulation studies show that the proposed model yields asymptotically unbiased and consistent estimators, despite the general over-variation. We illustrate two applications, one under-varied and one over-varied, on real datasets of time to failure and time to repair in reliability; one of the datasets has positive values with many observations at zero. We finally make concluding remarks, including future directions. PubDate: 2019-01-30 DOI: 10.1007/s10182-019-00350-8

Abstract: Inference on impulse response functions from vector autoregressive models is commonly done using bootstrap methods. These methods can be inaccurate in small samples and for persistent processes. This article investigates the construction of skewness-adjusted confidence intervals and joint confidence bands for impulse responses with improved small-sample performance. We suggest adjusting the skewness of the bootstrap distribution of the autoregressive coefficients before the impulse response functions are computed. Using extensive Monte Carlo simulations, the approach is shown to improve coverage accuracy in small and medium-sized samples and for unit-root processes. PubDate: 2019-01-11 DOI: 10.1007/s10182-018-00347-9
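The baseline that the article improves on can be sketched for the AR(1) case, where the horizon-h impulse response is simply \(\phi^h\). The code below is a plain residual bootstrap without the proposed skewness adjustment; the percentile levels and grid of horizons are illustrative choices.

```python
import numpy as np

def ar1_irf_bootstrap(y, horizons=10, B=500, seed=0):
    """Percentile bootstrap intervals for the AR(1) impulse responses phi^h.
    Plain residual bootstrap for illustration; the article's proposal
    additionally adjusts the skewness of the bootstrap distribution of the
    autoregressive coefficient before the impulse responses are computed."""
    rng = np.random.default_rng(seed)
    phi = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)   # OLS without intercept
    resid = y[1:] - phi * y[:-1]
    irfs = np.empty((B, horizons + 1))
    for b in range(B):
        e = rng.choice(resid, size=len(y) - 1, replace=True)
        yb = np.empty_like(y)
        yb[0] = y[0]
        for t in range(1, len(y)):
            yb[t] = phi * yb[t - 1] + e[t - 1]
        phi_b = np.sum(yb[1:] * yb[:-1]) / np.sum(yb[:-1] ** 2)
        irfs[b] = phi_b ** np.arange(horizons + 1)
    return np.percentile(irfs, [5, 95], axis=0)

rng = np.random.default_rng(2)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + rng.normal()
lo, hi = ar1_irf_bootstrap(y)
print(lo[1], hi[1])   # 90% interval for the horizon-1 response
```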

Abstract: Longitudinal studies often involve multiple mixed response variables measured repeatedly over time. Although separate models for these multiple mixed responses can easily be fitted, they may lead to inefficient estimates and, consequently, misleading inferences; obtaining correct inference requires modeling the multiple mixed responses jointly. In this paper, we use copula models to define a multivariate distribution for the multiple mixed outcomes at each time point, and then use a transition model to account for the association between longitudinal measurements. Two simulation studies illustrate the proposed approach; their results show that using separate models instead of joint modeling leads to inefficient parameter estimates. The proposed approach is also used to analyze two real data sets. The first is part of the British Household Panel Survey, in which annual income and life satisfaction are treated as the continuous and the ordinal correlated longitudinal responses, respectively. The second is a longitudinal data set on heart failure patients from a treatment–control study, in which the effect of a treatment is simultaneously investigated on readmission and referral to a doctor as two binary associated longitudinal responses. PubDate: 2019-01-05 DOI: 10.1007/s10182-018-00346-w

Abstract: In this paper, repeated measures analysis for functional data is considered. The known testing procedures for this problem are based on a test statistic given by the integral of the difference between sample mean functions, which takes into account only “between-group variability”. We modify this test statistic to also use information about “within-group variability”. More precisely, we construct new test statistics as the integral and the supremum of a pointwise test statistic obtained by adapting the classical paired t-test statistic to the functional data framework. The testing procedures are based on different methods of approximating the null distribution of the test statistics, namely a Box-type approximation, nonparametric and parametric bootstrap, and permutation approaches. These approximations do not perform equally well in finite samples, as established in simulation experiments that indicate the best of the new tests. The simulations and an application to mortality data suggest that some of the new procedures outperform the known tests in terms of size control and power. PubDate: 2019-01-04 DOI: 10.1007/s10182-018-00348-8
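The construction of the integral- and supremum-type statistics from the pointwise paired t-statistic can be sketched as below. The grid on [0, 1] and the simulated paired samples are illustrative; in practice the null distribution would be approximated by the Box-type, bootstrap or permutation methods mentioned above.

```python
import numpy as np

def functional_paired_stats(X1, X2):
    """Integral- and supremum-type statistics built from the pointwise
    paired t-statistic; rows are subjects, columns a grid on [0, 1]."""
    D = X1 - X2                                  # paired differences
    n = D.shape[0]
    t_point = np.sqrt(n) * D.mean(axis=0) / D.std(axis=0, ddof=1)
    dt = 1.0 / (D.shape[1] - 1)                  # grid spacing
    return dt * np.sum(t_point ** 2), np.max(np.abs(t_point))

rng = np.random.default_rng(3)
n, grid = 30, 101
X1 = rng.normal(loc=1.0, size=(n, grid))         # shifted condition
X2 = rng.normal(loc=0.0, size=(n, grid))
int_stat, sup_stat = functional_paired_stats(X1, X2)
print(int_stat, sup_stat)                        # both large under this shift
```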

Abstract: We propose a new procedure for estimating the unknown parameters and function in partial functional linear regression. The asymptotic distribution of the estimator of the vector of slope parameters is derived, and the global convergence rate of the estimator of the unknown slope function is established under a suitable norm. The convergence rate of the mean squared prediction error for the proposed estimators is also established. Based on the proposed estimation procedure, we further construct penalized regression estimators and establish their variable selection consistency and oracle properties. Finite-sample properties of our procedures are studied through Monte Carlo simulations, and a real estate data set is used to illustrate the proposed methodology. PubDate: 2018-12-14 DOI: 10.1007/s10182-018-00342-0

Abstract: Standard Poisson and negative binomial truncated regression models for count data include the regressors in the mean of the non-truncated distribution. In this paper, a new approach is proposed in which the explanatory variables determine the truncated mean directly. The main advantage is that the regression coefficients in the new models have a straightforward interpretation as the effect of a change in a covariate on the mean of the response variable. A simulation study comparing the proposed truncated regression models with the standard ones shows that the new coefficient estimates are more accurate, in the sense that their standard errors are always lower, and that the estimates obtained with the standard models are biased. An application to real data illustrates the utility of the introduced truncated models in a hurdle model. Although the two approaches give only slightly different results in the example, the proposed one provides a clear interpretation of the coefficient estimates. PubDate: 2018-12-10 DOI: 10.1007/s10182-018-00345-x
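For a zero-truncated Poisson, the truncated mean is \(\mu = \lambda/(1-e^{-\lambda})\), so parameterizing the model directly in terms of \(\mu\) (the direction of the proposed approach) amounts to inverting this map. A minimal numerical inversion by bisection, using only the standard library:

```python
import math

def truncated_mean(lam):
    """Mean of a zero-truncated Poisson: E[Y | Y > 0] = lam / (1 - exp(-lam))."""
    return lam / (1.0 - math.exp(-lam))

def lam_from_truncated_mean(mu):
    """Invert the map above by bisection (mu must exceed 1), so that a model
    stated directly in terms of the truncated mean can recover lam."""
    lo, hi = 1e-9, mu          # the truncated mean always exceeds lam
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam = lam_from_truncated_mean(2.5)
print(round(lam, 4))           # the lam whose truncated mean is 2.5
```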

Abstract: A unified testing framework is presented for large-dimensional mean vectors of one or several populations, which may be non-normal with unequal covariance matrices. Beginning with the one-sample case, the construction of the tests, the underlying assumptions and the asymptotic theory are systematically extended to the multi-sample case. The tests are defined in terms of U-statistics-based consistent estimators, and their limits are derived under a few mild assumptions. The accuracy of the tests is shown through simulations. Real data applications, including a five-sample unbalanced MANOVA analysis on count data, are also given. PubDate: 2018-12-10 DOI: 10.1007/s10182-018-00343-z

Abstract: A scalar-response functional model describes the association between a scalar response and a set of functional covariates. An important problem in the functional data literature is testing the nullity or linearity of the effect of the functional covariate in the context of scalar-on-function regression. This article provides an overview of the existing methods for testing both the null hypothesis that there is no relationship and the null hypothesis that there is a linear relationship between the functional covariate and the scalar response, together with a comprehensive numerical comparison of their performance. The methods are compared across a variety of realistic scenarios: the functional covariate observed on dense or sparse grids, with or without measurement noise. Finally, the methods are illustrated on the Tecator data set. PubDate: 2018-10-17 DOI: 10.1007/s10182-018-00337-x

Abstract: In this paper, a test for model selection is proposed that extends the usual goodness-of-fit test in several ways. First, it is assumed that the underlying distribution H depends on a covariate value in a fixed-design setting. Second, instead of one parametric class, we consider two competing classes, one of which may contain the underlying distribution. The test allows one to select whichever of the two equally treated model classes fits the underlying distribution better. Various measures are available to define the distance between distributions; here, the Cramér–von Mises distance has been chosen. The null hypothesis that both parametric classes are equally distant from the underlying distribution H can be checked by means of a test statistic whose asymptotic properties are established under a set of suitable conditions. The performance of the test is demonstrated by Monte Carlo simulations. Finally, the procedure is applied to a data set from an endurance test on electric motors. PubDate: 2018-10-01 DOI: 10.1007/s10182-017-0317-0

Authors: Seçil Yalaz Abstract: In this paper, a multivariate partially linear model with measurement error in the explanatory variable of the nonparametric part, where the response variable is m-dimensional, is considered. By modifying the local-likelihood method, an estimator of the parametric part is derived. Moreover, the asymptotic normality of the generalized least squares estimator of the parametric component is investigated when the error distribution function is either ordinarily smooth or super smooth. Applications to Engel curves are discussed, and the performance of \(\hat{\beta }_{n}\) is investigated through Monte Carlo experiments. PubDate: 2018-05-17 DOI: 10.1007/s10182-018-0326-7

Authors: Yiping Yang; Tiejun Tong; Gaorong Li Abstract: In this paper, we consider the single-index measurement error model with mismeasured covariates in the nonparametric part. To solve the problem, we develop a simulation-extrapolation (SIMEX) algorithm based on the local linear smoother and an estimating equation. The proposed SIMEX estimation does not require assuming a distribution for the unobserved covariate. Using the constraint \(\Vert \beta \Vert =1\) , we transform the boundary of the unit ball in \({\mathbb {R}}^p\) to the interior of the unit ball in \({\mathbb {R}}^{p-1}\) . The proposed SIMEX estimator of the index parameter is shown to be asymptotically normal under some regularity conditions. We also derive the asymptotic bias and variance of the estimator of the unknown link function. Finally, the performance of the proposed method is examined through simulation studies and illustrated with a real data example. PubDate: 2018-05-03 DOI: 10.1007/s10182-018-0327-6
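The SIMEX idea, shown here in its simplest linear-regression form rather than the single-index setting of the paper, is to add extra measurement error at several levels \(\lambda\), track how the naive estimate degrades, and extrapolate the trend back to \(\lambda = -1\) (no measurement error). The quadratic extrapolant and the \(\lambda\) grid below are common but illustrative choices.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), B=200, seed=0):
    """SIMEX for a simple linear slope under additive measurement error:
    add extra noise at each level lam, average the resulting naive slopes,
    fit a quadratic in lam and extrapolate to lam = -1."""
    rng = np.random.default_rng(seed)
    lambdas = np.asarray(lambdas)
    slopes = [np.mean([np.polyfit(w + rng.normal(0.0, np.sqrt(lam) * sigma_u, len(w)),
                                  y, 1)[0] for _ in range(B)])
              for lam in lambdas]
    coef = np.polyfit(lambdas, slopes, 2)        # quadratic extrapolant
    return np.polyval(coef, -1.0)

rng = np.random.default_rng(4)
n, beta, sigma_u = 5000, 2.0, 0.5
x = rng.normal(size=n)
w = x + rng.normal(0.0, sigma_u, n)              # covariate observed with error
y = beta * x + rng.normal(0.0, 0.1, n)
naive = np.polyfit(w, y, 1)[0]                   # attenuated towards zero
simex = simex_slope(w, y, sigma_u)
print(round(naive, 2), round(simex, 2))          # simex is closer to beta = 2
```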

Authors: Jean-Marie Dufour; Joachim Wilde Abstract: Weak identification is a well-known issue in the context of linear structural models. For probit models with endogenous explanatory variables, however, this problem has been little explored. In this paper, we study by simulation the behavior of the usual z-test and the LR test in the presence of weak identification. We find that the usual asymptotic z-test exhibits large level distortions (over-rejections under the null hypothesis), whose magnitude depends heavily on the parameter value tested. In contrast, asymptotic LR tests do not over-reject and appear to be robust to weak identification. PubDate: 2018-04-21 DOI: 10.1007/s10182-018-0325-8

Authors: Meisam Moghimbeygi; Mousa Golalizadeh Abstract: It is known that the shapes of planar triangles can be represented by a set of points on the surface of the unit sphere. On the other hand, most objects can easily be triangulated, so each triangle can accordingly be treated in the context of shape analysis. There is growing interest in fitting a smooth path through a cloud of shape data observed at several time instances. To tackle this problem, we propose a longitudinal model for shape data via a triangulation procedure. In fact, our strategy initially relies on a spherical regression model for triangles and is then extended to shape data via triangulation. To model the directional data, we use the bivariate von Mises–Fisher distribution for the density of the errors. Various forms of the composite likelihood function, constructed by altering the assumptions on the angles defined for each triangle, are invoked. The proposed regression model is applied to rat skull data, and simulation results are presented alongside the real data results. PubDate: 2018-04-05 DOI: 10.1007/s10182-018-0324-9

Authors: Yandan Yang; Hon Keung Tony Ng; Narayanaswamy Balakrishnan Abstract: In science and engineering, we are often interested in learning about the lifetime characteristics of a system as well as those of the components that make up the system. In many cases, however, the system lifetimes can be observed but not the component lifetimes, and we may also lack knowledge of the structure of the system. Statistical procedures for estimating the parameters of the component lifetime distribution and for identifying the system structure from system-level lifetime data are developed here using the expectation–maximization (EM) algorithm. Different implementations of the EM algorithm, based on system-level or component-level likelihood functions, are proposed. The special case in which the system is known to be coherent but of unknown structure is considered. The methodologies are illustrated by taking the component lifetimes to follow a two-parameter Weibull distribution. A numerical example and a Monte Carlo simulation study are used to evaluate the performance and relative merits of the proposed implementations of the EM algorithm. Lognormally distributed component lifetimes and a real data example are used to illustrate how the methodologies can be applied to lifetime models other than the Weibull. Finally, some recommendations and concluding remarks are provided. PubDate: 2018-03-09 DOI: 10.1007/s10182-018-0323-x
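One concrete system/component link behind such procedures: a series system of k i.i.d. Weibull components fails at the first component failure, so its lifetime is again Weibull, with the scale shrunk by \(k^{-1/\text{shape}}\). The sketch below exploits this closed form when the structure and shape are known; it only illustrates the system-to-component link, whereas the paper's EM algorithm handles the harder case of unknown structure and parameters.

```python
import numpy as np

def component_scale_from_series(system_life, shape, k):
    """A series system of k i.i.d. Weibull(shape, scale) components fails at
    the first component failure, so its lifetime is Weibull with the same
    shape and scale * k**(-1/shape).  With shape and structure known, the
    component scale follows from the closed-form MLE of the system scale."""
    sys_scale = np.mean(system_life ** shape) ** (1.0 / shape)
    return sys_scale * k ** (1.0 / shape)

rng = np.random.default_rng(5)
k, shape, scale = 3, 2.0, 10.0
comp = scale * rng.weibull(shape, size=(20000, k))  # component lifetimes
system_life = comp.min(axis=1)                      # series-system lifetimes
est = component_scale_from_series(system_life, shape, k)
print(round(est, 2))                                # close to the true scale 10
```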

Authors: Alexander Begun; Anatoli Yashin Abstract: Frailty models allow us to take into account the non-observable inhomogeneity of individual hazard functions. Although models with time-independent frailty have been intensively studied over the last decades and have found a wide range of applications in survival analysis, studies based on models with time-dependent frailty are relatively rare. In this paper, we formulate and prove two propositions on the identifiability of bivariate survival models with frailty given by a nonnegative bivariate Lévy process. We discuss parametric and semiparametric procedures for estimating the unknown parameters and baseline hazard functions, and illustrate these procedures in numerical experiments with simulated and real data. The statements of the propositions can easily be extended to the multivariate case. PubDate: 2018-02-28 DOI: 10.1007/s10182-018-0322-y

Authors: J. A. Mayor-Gallego; J. L. Moreno-Rebollo; M. D. Jiménez-Gamero Abstract: Auxiliary information \({\varvec{x}}\) is commonly used in survey sampling at the estimation stage. We propose an estimator of the finite population distribution function \(F_{y}(t)\) when \({\varvec{x}}\) is available for all units in the population and is related to the study variable y by a superpopulation model. The new estimator integrates ideas from model calibration and penalized calibration. Calibration estimators of \(F_{y}(t)\) with weights satisfying benchmark constraints on the fitted-values distribution function, \(\hat{F}_{\hat{y}}=F_{\hat{y}}\) , at a set of fixed values of t can be found in the literature. Alternatively, our proposal \(\hat{F}_{y\omega }\) takes into account a global distance \(D(\hat{F}_{\hat{y}\omega },F_{\hat{y}})\) between \(\hat{F}_{\hat{y}\omega }\) and \({F}_{\hat{y}},\) together with a penalty parameter \(\alpha \) that controls the importance of this term in the objective function. The weights are obtained explicitly for the \(L^2\) distance, and conditions are given under which \(\hat{F}_{y\omega }\) is a distribution function; in this case \(\hat{F}_{y\omega }\) can also be used to estimate the population quantiles. Moreover, results on the asymptotic unbiasedness and the asymptotic variance of \(\hat{F}_{y\omega }\) , for fixed \(\alpha \) , are obtained. A simulation study designed to compare the proposed estimator with existing ones reveals that its performance is quite competitive. PubDate: 2018-02-23 DOI: 10.1007/s10182-018-0321-z

Authors: Hosik Choi; Eunjung Song; Seung-sik Hwang; Woojoo Lee Abstract: Detecting local spatial clusters in count data is an important task in spatial epidemiology. Two broad approaches, moving window and disease mapping methods, have been suggested in the literature for finding clusters. However, the existing methods employ somewhat arbitrarily chosen tuning parameters, and the local clustering results are sensitive to these choices. In this paper, we propose a penalized likelihood method that overcomes the limitations of existing local spatial clustering approaches for count data. We start with a Poisson regression model to accommodate any type of covariates, and formulate the clustering problem as a penalized likelihood estimation problem that finds change points of the intercepts in two-dimensional space. The cost of developing a new algorithm is minimized by modifying an existing least absolute shrinkage and selection operator (lasso) algorithm. The computational details of the modifications are shown, and the proposed method is illustrated with Seoul tuberculosis data. PubDate: 2018-01-17 DOI: 10.1007/s10182-018-0318-7