 AStA Advances in Statistical Analysis
• A penalized spline estimator for fixed effects panel data models
• Authors: Peter Pütz; Thomas Kneib
Pages: 145 - 166
Abstract: Estimating nonlinear effects of continuous covariates by penalized splines is well established for regressions with cross-sectional data as well as for panel data regressions with random effects. Penalized splines are particularly advantageous since they enable both the estimation of unknown nonlinear covariate effects and inferential statements about these effects. The latter are based, for example, on simultaneous confidence bands that provide a simultaneous uncertainty assessment for the whole estimated functions. In this paper, we consider fixed effects panel data models instead of random effects specifications and develop a first-difference approach for the inclusion of penalized splines in this case. We take the resulting dependence structure into account and adapt the construction of simultaneous confidence bands accordingly. In addition, the penalized spline estimates as well as the confidence bands are also made available for derivatives of the estimated effects which are of considerable interest in many application areas. As an empirical illustration, we analyze the dynamics of life satisfaction over the life span based on data from the German Socio-Economic Panel. An open-source software implementation of our methods is available in the R package pamfe.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0296-1
Issue No: Vol. 102, No. 2 (2018)
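
Note: the first-difference idea mentioned in the abstract can be sketched as follows. The notation ($$y_{it}$$, $$\alpha_i$$, spline basis $$B$$, coefficients $$\gamma$$) and the assumption of i.i.d. errors with variance $$\sigma^2$$ are illustrative choices made here, not taken from the paper.

```latex
% Fixed effects model with an unknown smooth function f (assumed notation)
y_{it} = f(x_{it}) + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\; t = 1,\dots,T .

% First differencing removes the individual effect \alpha_i
y_{it} - y_{i,t-1} = f(x_{it}) - f(x_{i,t-1}) + \varepsilon_{it} - \varepsilon_{i,t-1} .

% With a spline representation f(x) = B(x)^\top \gamma the differenced model is linear in \gamma
\Delta y_{it} = \bigl( B(x_{it}) - B(x_{i,t-1}) \bigr)^\top \gamma + \Delta\varepsilon_{it} .

% Under i.i.d. errors, differencing induces dependence between adjacent periods
\operatorname{Var}(\Delta\varepsilon_{it}) = 2\sigma^2, \qquad
\operatorname{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{i,t-1}) = -\sigma^2 .
```

The last line shows one form the "resulting dependence structure" can take under these assumptions; how the paper handles it in the penalized fit and the confidence bands is not described in the abstract.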

• Closure properties of classes of multiple testing procedures
• Authors: Georg Hahn
Pages: 167 - 178
Abstract: Statistical discoveries are often obtained through multiple hypothesis testing. A variety of procedures exists to evaluate multiple hypotheses, for instance the ones of Benjamini–Hochberg, Bonferroni, Holm or Sidak. We are particularly interested in multiple testing procedures with two desired properties: (solely) monotonic and well-behaved procedures. This article investigates to what extent the classes of (monotonic or well-behaved) multiple testing procedures, in particular the subclasses of so-called step-up and step-down procedures, are closed under basic set operations, specifically the union, intersection, difference and the complement of sets of rejected or non-rejected hypotheses. The present article proves two main results: First, taking the union or intersection of arbitrary (monotonic or well-behaved) multiple testing procedures results in new procedures which are monotonic but not well-behaved, whereas the complement or difference generally preserves neither property. Second, the two classes of (solely monotonic or well-behaved) step-up and step-down procedures are closed under taking the union or intersection, but not the complement or difference.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0297-0
Issue No: Vol. 102, No. 2 (2018)
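
Note: a minimal sketch of what "set operations on rejected hypotheses" means in practice. It uses the textbook definitions of the Bonferroni, Holm (step-down) and Benjamini–Hochberg (step-up) procedures; the p-values and function names are made up for illustration, and the sketch does not check the paper's monotonicity or well-behavedness properties.

```python
import numpy as np


def bonferroni(p, alpha=0.05):
    """Reject H_i if p_i <= alpha / m."""
    p = np.asarray(p, dtype=float)
    return set(np.flatnonzero(p <= alpha / p.size).tolist())


def holm(p, alpha=0.05):
    """Step-down: walk from the smallest p-value up, stop at the first failure."""
    p = np.asarray(p, dtype=float)
    m = p.size
    rejected = set()
    for rank, idx in enumerate(np.argsort(p)):
        if p[idx] <= alpha / (m - rank):
            rejected.add(int(idx))
        else:
            break
    return rejected


def benjamini_hochberg(p, alpha=0.05):
    """Step-up: reject all hypotheses up to the largest k with p_(k) <= k * alpha / m."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.flatnonzero(p[order] <= thresholds)
    if below.size == 0:
        return set()
    return set(order[: below.max() + 1].tolist())


# Hypothetical p-values for five hypotheses (indices 0..4).
pvals = [0.001, 0.012, 0.02, 0.035, 0.5]

r_holm = holm(pvals)
r_bh = benjamini_hochberg(pvals)

# New "procedures" obtained by combining rejection sets.
union = r_holm | r_bh
intersection = r_holm & r_bh
complement_bh = set(range(len(pvals))) - r_bh

print("Holm:", sorted(r_holm))
print("BH:", sorted(r_bh))
print("union:", sorted(union))
print("intersection:", sorted(intersection))
print("complement of BH:", sorted(complement_bh))
```

Each combined set can be read as the output of a new testing procedure applied to the same p-values, which is the object whose properties the article studies.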

• Non-concave penalization in linear mixed-effect models and regularized
selection of fixed effects
• Authors: Abhik Ghosh; Magne Thoresen
Pages: 179 - 210
Abstract: Mixed-effect models are very popular for analyzing data with a hierarchical structure. In medical applications, typical examples include repeated observations within subjects in a longitudinal design and patients nested within centers in a multicenter design. Recently, however, due to medical advances, the number of fixed-effect covariates collected from each patient can be quite large, e.g., data on gene expressions of each patient, and not all of these variables are necessarily important for the outcome. So, it is very important to choose the relevant covariates correctly to obtain optimal inference for the overall study. On the other hand, the relevant random effects will often be low-dimensional and pre-specified. In this paper, we consider regularized selection of important fixed-effect variables in linear mixed-effect models along with maximum penalized likelihood estimation of both fixed and random-effect parameters based on general non-concave penalties. Asymptotic and variable selection consistency with oracle properties are proved for low-dimensional cases as well as for high dimensionality of non-polynomial order of sample size (number of parameters is much larger than sample size). We also provide a suitable computationally efficient algorithm for implementation. Additionally, all the theoretical results are proved for a general non-convex optimization problem that applies to several important situations well beyond the mixed model setup (like finite mixture of regressions), illustrating the huge range of applicability of our proposal.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0298-z
Issue No: Vol. 102, No. 2 (2018)

• On composite likelihood in bivariate meta-analysis of diagnostic test
accuracy studies
• Authors: Aristidis K. Nikoloulopoulos
Pages: 211 - 227
Abstract: The composite likelihood is amongst the computational methods used for estimation of the generalized linear mixed model (GLMM) in the context of bivariate meta-analysis of diagnostic test accuracy studies. Its advantage is that the likelihood can be derived conveniently under the assumption of independence between the random effects, but there has not been a clear analysis of the merit or necessity of this method. For synthesis of diagnostic test accuracy studies, a copula mixed model has been proposed in the biostatistics literature. This general model includes the GLMM as a special case and can also allow for flexible dependence modelling, different from assuming simple linear correlation structures, normality and tail independence in the joint tails. A maximum likelihood (ML) method, which is based on evaluating the bi-dimensional integrals of the likelihood with quadrature methods, has been proposed, and in fact it eases any computational difficulty that might be caused by the double integral in the likelihood function. Both methods are thoroughly examined with extensive simulations and illustrated with data of a published meta-analysis. It is shown that the ML method has no non-convergence issues or computational difficulties and at the same time allows estimation of the dependence between study-specific sensitivity and specificity and thus prediction via summary receiver operating curves.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0299-y
Issue No: Vol. 102, No. 2 (2018)

• Estimation of structural impulse responses: short-run versus long-run
identifying restrictions
• Authors: Helmut Lütkepohl; Anna Staszewska-Bystrova; Peter Winker
Pages: 229 - 244
Abstract: There is evidence that estimates of long-run impulse responses of structural vector autoregressive (VAR) models based on long-run identifying restrictions may not be very accurate. This finding suggests that using short-run identifying restrictions may be preferable. We compare structural VAR impulse response estimates based on long-run and short-run identifying restrictions and find that long-run identifying restrictions can result in much more precise estimates for the structural impulse responses than restrictions on the impact effects of the shocks.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0300-9
Issue No: Vol. 102, No. 2 (2018)

• Bayesian conditional inference for Rasch models
• Authors: Clemens Draxler
Pages: 245 - 262
Abstract: This paper is concerned with Bayesian inference in psychometric modeling. It treats conditional likelihood functions obtained from discrete conditional probability distributions which are generalizations of the hypergeometric distribution. The influence of nuisance parameters is eliminated by conditioning on observed values of their sufficient statistics, and Bayesian considerations refer only to the parameters of interest. Since such a combination of techniques to deal with both types of parameters is less common in psychometrics, a wider scope in future research may be gained. The focus is on the evaluation of the empirical appropriateness of assumptions of the Rasch model, thereby pointing to an alternative to the frequentist approach that dominates in this context. A number of examples are discussed. Some are very straightforward to apply. Others are computationally intensive and may be impractical. The suggested procedure is illustrated using real data from a study on vocational education.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0303-6
Issue No: Vol. 102, No. 2 (2018)

• A distance-based model for spatial prediction using radial basis functions
• Authors: Carlos E. Melo; Oscar O. Melo; Jorge Mateu
Pages: 263 - 288
Abstract: In the context of local interpolators, radial basis functions (RBFs) are known to reduce the computational time by using a subset of the data for prediction purposes. In this paper, we propose a new distance-based spatial RBF method which allows modeling spatial continuous random variables. The trend is incorporated into an RBF according to a detrending procedure with mixed variables, among which we may have categorical variables. In order to evaluate the efficiency of the proposed method, a simulation study is carried out for a variety of practical scenarios for five distinct RBFs, incorporating principal coordinates. Finally, the proposed method is illustrated with an application of prediction of calcium concentration measured at a depth of 0–20 cm in Brazil, selecting the smoothing parameter by cross-validation.
PubDate: 2018-04-01
DOI: 10.1007/s10182-017-0305-4
Issue No: Vol. 102, No. 2 (2018)

• A formal framework for hedonic elementary price indices
• Authors: Hans Wolfgang Brachinger; Michael Beer; Olivier Schöni
Pages: 67 - 93
Abstract: Hedonic methods are considered state of the art for handling quality changes when compiling consumer price indices. The present article proposes first a mathematical description of characteristics and of elementary aggregates. In a following step, a hedonic econometric model is formulated and hedonic elementary population indices are defined. We emphasise that population indices are unobservable economic parameters that need to be estimated by suitable sample indices. It is shown that within the framework developed here, many of the hedonic index formulae used in practice are identified as sample versions corresponding to particular hedonic elementary population indices. The article closes with an empirical part on quarterly housing data where the considered hedonic indices are estimated along with their bootstrapped confidence intervals. It is shown that the computed confidence intervals together with the results from theory suggest a particular answer to the price index problem.
PubDate: 2018-01-01
DOI: 10.1007/s10182-017-0293-4
Issue No: Vol. 102, No. 1 (2018)

• Change-in-mean tests in long-memory time series: a review of recent
developments
• Authors: Kai Wenger; Christian Leschinski; Philipp Sibbertsen
Abstract: It is well known that standard tests for a mean shift are invalid in long-range dependent time series. Therefore, several long-memory robust extensions of standard testing principles for a change-in-mean have been proposed in the literature. These can be divided into two groups: those that utilize consistent estimates of the long-run variance and self-normalized test statistics. Here, we review this literature and complement it by deriving a new long-memory robust version of the sup-Wald test. Apart from giving a systematic review, we conduct an extensive Monte Carlo study to compare the relative performance of these methods. Special attention is paid to the interaction of the test results with the estimation of the long-memory parameter. Furthermore, we show that the power of self-normalized test statistics can be improved considerably by using an estimator that is robust to mean shifts.
PubDate: 2018-05-26
DOI: 10.1007/s10182-018-0328-5

• Multivariate partially linear regression in the presence of measurement
error
• Authors: Seçil Yalaz
Abstract: In this paper, a multivariate partially linear model with error in the explanatory variable of the nonparametric part, where the response variable is m-dimensional, is considered. By modifying the local-likelihood method, an estimator of the parametric part is derived. Moreover, the asymptotic normality of the generalized least squares estimator of the parametric component is investigated when the error distribution function is either ordinarily smooth or super smooth. Applications to Engel curves are discussed, and the performance of $$\hat{\beta }_{n}$$ is investigated through Monte Carlo experiments.
PubDate: 2018-05-17
DOI: 10.1007/s10182-018-0326-7

• SIMEX estimation for single-index model with covariate measurement error
• Authors: Yiping Yang; Tiejun Tong; Gaorong Li
Abstract: In this paper, we consider the single-index measurement error model with mismeasured covariates in the nonparametric part. To solve the problem, we develop a simulation-extrapolation (SIMEX) algorithm based on the local linear smoother and the estimating equation. For the proposed SIMEX estimation, there is no need to assume a distribution for the unobserved covariate. We transform the boundary of a unit ball in $${\mathbb {R}}^p$$ to the interior of a unit ball in $${\mathbb {R}}^{p-1}$$ by using the constraint $$\Vert \beta \Vert =1$$. The proposed SIMEX estimator of the index parameter is shown to be asymptotically normal under some regularity conditions. We also derive the asymptotic bias and variance of the estimator of the unknown link function. Finally, the performance of the proposed method is examined by simulation studies and is illustrated by a real data example.
PubDate: 2018-05-03
DOI: 10.1007/s10182-018-0327-6
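
Note: one common way to map the unit sphere to the interior of a lower-dimensional unit ball is the "delete-one-component" parameterization sketched below. The abstract does not say which component is removed or how the sign is fixed, so dropping the first component and assuming $$\beta_1 > 0$$ are assumptions made here for illustration.

```latex
% Index parameter on the unit sphere, sign fixed for identifiability (assumption)
\beta = (\beta_1, \beta_2, \dots, \beta_p)^\top, \qquad \Vert \beta \Vert = 1, \qquad \beta_1 > 0 .

% Drop the first component
\beta^{(1)} = (\beta_2, \dots, \beta_p)^\top \in {\mathbb R}^{p-1}, \qquad
\Vert \beta^{(1)} \Vert^2 = 1 - \beta_1^2 < 1 .

% The removed component is recovered from the constraint
\beta_1 = \sqrt{1 - \Vert \beta^{(1)} \Vert^2}, \qquad
\beta = \Bigl( \sqrt{1 - \Vert \beta^{(1)} \Vert^2},\; \beta^{(1)\top} \Bigr)^\top .
```

Because $$\beta^{(1)}$$ lies in the open unit ball, it can be treated as an unconstrained interior point, which is what makes standard asymptotic arguments applicable.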

• Weak identification in probit models with endogenous covariates
• Authors: Jean-Marie Dufour; Joachim Wilde
Abstract: Weak identification is a well-known issue in the context of linear structural models. However, for probit models with endogenous explanatory variables, this problem has been little explored. In this paper, we study by simulation the behavior of the usual z-test and the LR test in the presence of weak identification. We find that the usual asymptotic z-test exhibits large level distortions (over-rejections under the null hypothesis). The magnitude of the level distortions depends heavily on the parameter value tested. In contrast, asymptotic LR tests do not over-reject and appear to be robust to weak identification.
PubDate: 2018-04-21
DOI: 10.1007/s10182-018-0325-8

• A longitudinal model for shapes through triangulation
• Authors: Meisam Moghimbeygi; Mousa Golalizadeh
Abstract: It is known that the shapes of planar triangles can be represented by a set of points on the surface of the unit sphere. On the other hand, most objects can easily be triangulated and so each triangle can accordingly be treated in the context of shape analysis. There is a growing interest in fitting a smooth path going through the cloud of shape data available at some time instances. To tackle this problem, we propose a longitudinal model through a triangulation procedure for the shape data. In fact, our strategy initially relies on a spherical regression model for triangles, but is extended to shape data via triangulation. Regarding modeling of directional data, we use the bivariate von Mises–Fisher distribution for the density of the errors. Various forms of the composite likelihood functions, constructed by altering the assumptions considered for the angles defined for each triangle, are invoked. The proposed regression model is applied to rat skull data. Also, some simulation results are presented along with the real data results.
PubDate: 2018-04-05
DOI: 10.1007/s10182-018-0324-9

• Expectation–maximization algorithm for system-based lifetime data with
unknown system structure
• Authors: Yandan Yang; Hon Keung Tony Ng; Narayanaswamy Balakrishnan
PubDate: 2018-03-09
DOI: 10.1007/s10182-018-0323-x

• Study of the bivariate survival data using frailty models based on
Lévy processes
• Authors: Alexander Begun; Anatoli Yashin
Abstract: Frailty models allow us to take into account the non-observable inhomogeneity of individual hazard functions. Although models with time-independent frailty have been intensively studied over the last decades and a wide range of applications in survival analysis have been found, the studies based on the models with time-dependent frailty are relatively rare. In this paper, we formulate and prove two propositions related to the identifiability of the bivariate survival models with frailty given by a nonnegative bivariate Lévy process. We discuss parametric and semiparametric procedures for estimating unknown parameters and baseline hazard functions. Numerical experiments with simulated and real data illustrate these procedures. The statements of the propositions can be easily extended to the multivariate case.
PubDate: 2018-02-28
DOI: 10.1007/s10182-018-0322-y

• Estimation of the finite population distribution function using a global
penalized calibration method
• Authors: J. A. Mayor-Gallego; J. L. Moreno-Rebollo; M. D. Jiménez-Gamero
Abstract: Auxiliary information $${\varvec{x}}$$ is commonly used in survey sampling at the estimation stage. We propose an estimator of the finite population distribution function $$F_{y}(t)$$ when $${\varvec{x}}$$ is available for all units in the population and related to the study variable y by a superpopulation model. The new estimator integrates ideas from model calibration and penalized calibration. Calibration estimates of $$F_{y}(t)$$ with the weights satisfying benchmark constraints on the fitted values distribution function $$\hat{F}_{\hat{y}}=F_{\hat{y}}$$ on a set of fixed values of t can be found in the literature. Alternatively, our proposal $$\hat{F}_{y\omega }$$ seeks an estimator taking into account a global distance $$D(\hat{F}_{\hat{y}\omega },F_{\hat{y}})$$ between $$\hat{F}_{\hat{y}\omega }$$ and $${F}_{\hat{y}}$$, and a penalty parameter $$\alpha$$ that assesses the importance of this term in the objective function. The weights are explicitly obtained for the $$L^2$$ distance, and conditions are given under which $$\hat{F}_{y\omega }$$ is a distribution function. In this case $$\hat{F}_{y\omega }$$ can also be used to estimate the population quantiles. Moreover, results on the asymptotic unbiasedness and the asymptotic variance of $$\hat{F}_{y\omega }$$, for a fixed $$\alpha$$, are obtained. The results of a simulation study, designed to compare the proposed estimator to other existing ones, reveal that its performance is quite competitive.
PubDate: 2018-02-23
DOI: 10.1007/s10182-018-0321-z

• Varying levels of anomie in Europe: a multilevel analysis based on
multidimensional IRT models
• Authors: Lara Fontanella; Annalina Sarra; Pasquale Valentini; Simone Di Zio; Sara Fontanella
Abstract: Recent years have seen increased attention paid to monitoring social anomie and its dependency on micro- and macro-factors. In this paper, we endorse the theorisation of social anomie as a complex, multidimensional and multilevel phenomenon. To ensure a rigorous measurement of the varying levels of social anomie in the European countries, the current study relies on a multilevel multidimensional item response theory (MIRT) model which explicitly accounts for the presence of a non-ignorable missing data mechanism. This unified approach makes it possible to specify an analytical model of links between anomie features and their determinants and to explore how the latent traits of interest are influenced by individual-level factors, as well as by country-level indicators. Additionally, to avoid misleading inferential conclusions, the proposed model takes into account the respondent’s omitting behaviour, assuming that the missingness mechanism is driven by a latent propensity to respond. Data used in this study were collected in the 2010 wave of the European Social Survey. To reduce the computational complexities, a Bayesian specification of the MIRT model is provided and the model parameter estimates are obtained through MCMC algorithms.
PubDate: 2018-02-10
DOI: 10.1007/s10182-018-0320-0

• Weighted quantile regression and testing for varying-coefficient models
with randomly truncated data
• Authors: Hong-Xia Xu; Guo-Liang Fan; Zhen-Long Chen; Jiang-Feng Wang
Abstract: This paper develops a varying-coefficient approach to the estimation and testing of regression quantiles under randomly truncated data. In order to handle the truncated data, random weights are introduced and weighted quantile regression (WQR) estimators for nonparametric functions are proposed. To achieve good efficiency properties, we further develop a weighted composite quantile regression (WCQR) estimation method for nonparametric functions in varying-coefficient models. The asymptotic properties of both the proposed WQR and WCQR estimators are established. In addition, we propose a novel bootstrap-based test procedure to test whether the nonparametric functions in varying-coefficient quantile models can be specified by some functional forms. The performance of the proposed estimators and test procedure is investigated through simulation studies and a real data example.
PubDate: 2018-02-07
DOI: 10.1007/s10182-018-0319-6

• A modified generalized lasso algorithm to detect local spatial clusters
for count data
• Authors: Hosik Choi; Eunjung Song; Seung-sik Hwang; Woojoo Lee
Abstract: Detecting local spatial clusters for count data is an important task in spatial epidemiology. Two broad approaches—moving window and disease mapping methods—have been suggested in some of the literature to find clusters. However, the existing methods employ somewhat arbitrarily chosen tuning parameters, and the local clustering results are sensitive to the choices. In this paper, we propose a penalized likelihood method to overcome the limitations of existing local spatial clustering approaches for count data. We start with a Poisson regression model to accommodate any type of covariates, and formulate the clustering problem as a penalized likelihood estimation problem to find change points of intercepts in two-dimensional space. The cost of developing a new algorithm is minimized by modifying an existing least absolute shrinkage and selection operator algorithm. The computational details on the modifications are shown, and the proposed method is illustrated with Seoul tuberculosis data.
PubDate: 2018-01-17
DOI: 10.1007/s10182-018-0318-7

• Estimating the hazard functions of two alternating recurrent events in the
presence of covariates
• Authors: Moumita Chatterjee; Sugata Sen Roy
Abstract: The motivation for this paper is a cystic fibrosis data set that records a patient’s times to relapse and times to cure over several recurrences of the disease. The idea is to study the impact of covariates on the hazard rates of two alternately occurring events. The dependence between the times to the two events over the different cycles is modeled through an autoregressive-type setup. The partial likelihood function is then derived and the estimators obtained. The estimators are shown to be consistent and asymptotically normal. The technique is applied to study the motivating data. A simulation study is also conducted to corroborate the results.
PubDate: 2017-12-21
DOI: 10.1007/s10182-017-0316-1

