Authors:Peter Pütz; Thomas Kneib Pages: 145 - 166 Abstract: Estimating nonlinear effects of continuous covariates by penalized splines is well established for regressions with cross-sectional data as well as for panel data regressions with random effects. Penalized splines are particularly advantageous since they enable both the estimation of unknown nonlinear covariate effects and inferential statements about these effects. The latter are based, for example, on simultaneous confidence bands that provide a simultaneous uncertainty assessment for the whole estimated functions. In this paper, we consider fixed effects panel data models instead of random effects specifications and develop a first-difference approach for the inclusion of penalized splines in this case. We take the resulting dependence structure into account and adapt the construction of simultaneous confidence bands accordingly. In addition, the penalized spline estimates as well as the confidence bands are also made available for derivatives of the estimated effects which are of considerable interest in many application areas. As an empirical illustration, we analyze the dynamics of life satisfaction over the life span based on data from the German Socio-Economic Panel. An open-source software implementation of our methods is available in the R package pamfe. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0296-1 Issue No:Vol. 102, No. 2 (2018)
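As a brief illustration of the first-difference idea, consider a panel model with individual effects \(\alpha_i\) and a smooth function \(f\). The notation below is assumed for this sketch and is not taken from the paper.

\[ y_{it} = f(x_{it}) + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\; t = 1,\dots,T. \]

Differencing consecutive observations of the same individual removes \(\alpha_i\):

\[ \Delta y_{it} = y_{it} - y_{i,t-1} = f(x_{it}) - f(x_{i,t-1}) + \Delta\varepsilon_{it}, \qquad t = 2,\dots,T. \]

If one additionally assumes that the \(\varepsilon_{it}\) are independent with common variance \(\sigma^2\), the differenced errors are correlated within individuals:

\[ \operatorname{Var}(\Delta\varepsilon_{it}) = 2\sigma^2, \qquad \operatorname{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{i,t-1}) = -\sigma^2. \]

This is one simple example of the kind of dependence structure that differencing induces and that confidence bands would need to account for.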

Authors:Georg Hahn Pages: 167 - 178 Abstract: Statistical discoveries are often obtained through multiple hypothesis testing. A variety of procedures exists to evaluate multiple hypotheses, for instance those of Benjamini–Hochberg, Bonferroni, Holm or Sidak. We are particularly interested in multiple testing procedures with two desired properties: (solely) monotonic and well-behaved procedures. This article investigates to what extent the classes of (monotonic or well-behaved) multiple testing procedures, in particular the subclasses of so-called step-up and step-down procedures, are closed under basic set operations, specifically the union, intersection, difference and the complement of sets of rejected or non-rejected hypotheses. The present article proves two main results: First, taking the union or intersection of arbitrary (monotonic or well-behaved) multiple testing procedures results in new procedures which are monotonic but not well-behaved, whereas the complement or difference generally preserves neither property. Second, the two classes of (solely monotonic or well-behaved) step-up and step-down procedures are closed under taking the union or intersection, but not the complement or difference. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0297-0 Issue No:Vol. 102, No. 2 (2018)
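To make the set operations concrete, the following minimal Python sketch implements textbook versions of three of the named procedures (Bonferroni, the Holm step-down procedure and the Benjamini–Hochberg step-up procedure), each returning the set of rejected hypothesis indices, and then combines the sets. The function names and the example p-values are assumptions made for this illustration and are not taken from the article.

```python
from typing import List, Set


def bonferroni(p: List[float], alpha: float = 0.05) -> Set[int]:
    """Reject H_i whenever p_i <= alpha / m."""
    m = len(p)
    return {i for i, pi in enumerate(p) if pi <= alpha / m}


def holm(p: List[float], alpha: float = 0.05) -> Set[int]:
    """Step-down: walk up the sorted p-values and stop at the first failure."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    rejected: Set[int] = set()
    for k, i in enumerate(order):  # k = 0, ..., m - 1
        if p[i] <= alpha / (m - k):
            rejected.add(i)
        else:
            break
    return rejected


def benjamini_hochberg(p: List[float], alpha: float = 0.05) -> Set[int]:
    """Step-up: find the largest rank k with p_(k) <= k * alpha / m."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    cutoff = 0
    for k, i in enumerate(order, start=1):
        if p[i] <= k * alpha / m:
            cutoff = k
    return set(order[:cutoff])


# Hypothetical p-values, chosen only for illustration.
p_values = [0.001, 0.012, 0.02, 0.035, 0.3]

rej_bonf = bonferroni(p_values)
rej_holm = holm(p_values)
rej_bh = benjamini_hochberg(p_values)

# Set operations on the rejection sets define new procedures.
union_rej = rej_bonf | rej_holm
intersection_rej = rej_holm & rej_bh
```

Each combined set can be read as the rejection set of a new procedure evaluated on the same p-values. Whether such a procedure stays monotonic or well-behaved is the question the article addresses.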

Authors:Abhik Ghosh; Magne Thoresen Pages: 179 - 210 Abstract: Mixed-effect models are very popular for analyzing data with a hierarchical structure. In medical applications, typical examples include repeated observations within subjects in a longitudinal design and patients nested within centers in a multicenter design. However, recently, due to medical advances, the number of fixed-effect covariates collected from each patient can be quite large, e.g., data on gene expressions of each patient, and not all of these variables are necessarily important for the outcome. So, it is very important to choose the relevant covariates correctly for obtaining the optimal inference for the overall study. On the other hand, the relevant random effects will often be low-dimensional and pre-specified. In this paper, we consider regularized selection of important fixed-effect variables in linear mixed-effect models along with maximum penalized likelihood estimation of both fixed and random-effect parameters based on general non-concave penalties. Asymptotic and variable selection consistency with oracle properties are proved for low-dimensional cases as well as for high dimensionality of non-polynomial order of sample size (number of parameters is much larger than sample size). We also provide a suitable computationally efficient algorithm for implementation. Additionally, all the theoretical results are proved for a general non-convex optimization problem that applies to several important situations well beyond the mixed model setup (like finite mixture of regressions), illustrating the huge range of applicability of our proposal. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0298-z Issue No:Vol. 102, No. 2 (2018)

Authors:Aristidis K. Nikoloulopoulos Pages: 211 - 227 Abstract: The composite likelihood is amongst the computational methods used for estimation of the generalized linear mixed model (GLMM) in the context of bivariate meta-analysis of diagnostic test accuracy studies. Its advantage is that the likelihood can be derived conveniently under the assumption of independence between the random effects, but there has not been a clear analysis of the merit or necessity of this method. For synthesis of diagnostic test accuracy studies, a copula mixed model has been proposed in the biostatistics literature. This general model includes the GLMM as a special case and can also allow for flexible dependence modelling, different from assuming simple linear correlation structures, normality and tail independence in the joint tails. A maximum likelihood (ML) method, which is based on evaluating the bi-dimensional integrals of the likelihood with quadrature methods, has been proposed, and in fact it eases any computational difficulty that might be caused by the double integral in the likelihood function. Both methods are thoroughly examined with extensive simulations and illustrated with data of a published meta-analysis. It is shown that the ML method has no non-convergence issues or computational difficulties and at the same time allows estimation of the dependence between study-specific sensitivity and specificity and thus prediction via summary receiver operating curves. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0299-y Issue No:Vol. 102, No. 2 (2018)

Authors:Helmut Lütkepohl; Anna Staszewska-Bystrova; Peter Winker Pages: 229 - 244 Abstract: There is evidence that estimates of long-run impulse responses of structural vector autoregressive (VAR) models based on long-run identifying restrictions may not be very accurate. This finding suggests that using short-run identifying restrictions may be preferable. We compare structural VAR impulse response estimates based on long-run and short-run identifying restrictions and find that long-run identifying restrictions can result in much more precise estimates for the structural impulse responses than restrictions on the impact effects of the shocks. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0300-9 Issue No:Vol. 102, No. 2 (2018)

Authors:Clemens Draxler Pages: 245 - 262 Abstract: This paper is concerned with Bayesian inference in psychometric modeling. It treats conditional likelihood functions obtained from discrete conditional probability distributions which are generalizations of the hypergeometric distribution. The influence of nuisance parameters is eliminated by conditioning on observed values of their sufficient statistics, and Bayesian considerations are only referred to parameters of interest. Since such a combination of techniques to deal with both types of parameters is less common in psychometrics, a wider scope in future research may be gained. The focus is on the evaluation of the empirical appropriateness of assumptions of the Rasch model, thereby pointing to an alternative to the frequentists’ approach which is dominating in this context. A number of examples are discussed. Some are very straightforward to apply. Others are computationally intensive and may be impractical. The suggested procedure is illustrated using real data from a study on vocational education. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0303-6 Issue No:Vol. 102, No. 2 (2018)

Authors:Carlos E. Melo; Oscar O. Melo; Jorge Mateu Pages: 263 - 288 Abstract: In the context of local interpolators, radial basis functions (RBFs) are known to reduce the computational time by using a subset of the data for prediction purposes. In this paper, we propose a new distance-based spatial RBF method which allows modeling spatial continuous random variables. The trend is incorporated into an RBF according to a detrending procedure with mixed variables, among which we may have categorical variables. In order to evaluate the efficiency of the proposed method, a simulation study is carried out for a variety of practical scenarios for five distinct RBFs, incorporating principal coordinates. Finally, the proposed method is illustrated with an application of prediction of calcium concentration measured at a depth of 0–20 cm in Brazil, selecting the smoothing parameter by cross-validation. PubDate: 2018-04-01 DOI: 10.1007/s10182-017-0305-4 Issue No:Vol. 102, No. 2 (2018)

Authors:Olha Bodnar; Clemens Elster Pages: 1 - 20 Abstract: Random-effects meta-analysis has become a well-established tool applied in many areas, for example, when combining the results of several clinical studies on a treatment effect. Typically, the inference aims at the common mean and the amount of heterogeneity. In some applications, the laboratory effects are of interest, for example, when assessing uncertainties quoted by laboratories participating in an interlaboratory comparison in metrology. We consider the Bayesian estimation of the realized random effects in random-effects meta-analysis. Several vague and noninformative priors are examined as well as a proposed novel one. Conditions are established that ensure propriety of the posteriors for the realized random effects. We present extensive simulation results that assess the inference in dependence on the choice of prior as well as mis-specifications in the statistical model. Overall good performance is observed for all priors with the novel prior showing the most promising results. Finally, the uncertainties reported by eleven national metrology institutes and universities for their measurements on the Newtonian constant of gravitation are assessed. PubDate: 2018-01-01 DOI: 10.1007/s10182-016-0279-7 Issue No:Vol. 102, No. 1 (2018)

Authors:Ramón Giraldo; William Caballero; Jesús Camacho-Tamayo Pages: 21 - 39 Abstract: Statistics for spatial functional data is an emerging field in statistics which combines methods of spatial statistics and functional data analysis to model spatially correlated functional data. Checking for spatial autocorrelation is an important step in the statistical analysis of spatial data. Several statistics to achieve this goal have been proposed. The test based on the Mantel statistic is widely known and used in this context. This paper proposes an application of this test to the case of spatial functional data. Although we focus particularly on geostatistical functional data, that is, functional data observed in a region with spatial continuity, the proposed test can also be applied to functional data measured on a discrete set of areas of a region (areal functional data) by properly defining the distance between the areas. Based on two simulation studies, we show that the proposed test performs well. We illustrate the methodology by applying it to an agronomic data set. PubDate: 2018-01-01 DOI: 10.1007/s10182-016-0280-1 Issue No:Vol. 102, No. 1 (2018)
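The general idea of a Mantel-type permutation test can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the authors' implementation: curves are assumed to be observed on a common grid, the distance between two curves is taken as the Euclidean distance between their grid values (proportional to a Riemann approximation of the L2 distance on an equally spaced grid), and all function names and the toy data are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]


def euclid(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(items: Sequence[Sequence[float]]) -> Matrix:
    """All pairwise Euclidean distances."""
    n = len(items)
    return [[euclid(items[i], items[j]) for j in range(n)] for i in range(n)]


def mantel_statistic(d_space: Matrix, d_curve: Matrix,
                     perm: Optional[List[int]] = None) -> float:
    """Sum over pairs i < j of spatial distance times (permuted) curve distance."""
    n = len(d_space)
    if perm is None:
        perm = list(range(n))
    return sum(d_space[i][j] * d_curve[perm[i]][perm[j]]
               for i in range(n) for j in range(i + 1, n))


def mantel_test(sites: Sequence[Sequence[float]],
                curves: Sequence[Sequence[float]],
                n_perm: int = 999, seed: int = 0) -> Tuple[float, float]:
    """One-sided permutation p-value: large statistics indicate positive association."""
    rng = random.Random(seed)
    d_space = distance_matrix(sites)
    d_curve = distance_matrix(curves)
    observed = mantel_statistic(d_space, d_curve)
    idx = list(range(len(sites)))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # randomly reassign curves to sites
        if mantel_statistic(d_space, d_curve, idx) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: eight sites on a line; each curve is a common shape shifted by
# the site's coordinate, so nearby sites have similar curves.
grid = [k / 4 for k in range(5)]
sites = [(float(i), 0.0) for i in range(8)]
curves = [[i + math.sin(t) for t in grid] for i in range(8)]

observed, p_value = mantel_test(sites, curves)
```

In this toy setting the curve distances grow with the spatial distances, so the observed statistic lies in the upper tail of the permutation distribution and the p-value is small.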

Authors:Xuejun Wang; Yi Wu; Shuhe Hu Pages: 41 - 65 Abstract: In this paper, the strong laws of large numbers for partial sums and weighted sums of negatively superadditive-dependent (NSD, in short) random variables are presented, especially the Marcinkiewicz–Zygmund type strong law of large numbers. Using these strong laws of large numbers, we further investigate the strong consistency and weak consistency of the LS estimators in the EV regression model with NSD errors, which generalize and improve the corresponding ones for negatively associated random variables. Finally, a simulation is carried out to study the numerical performance of the strong consistency result that we established. PubDate: 2018-01-01 DOI: 10.1007/s10182-016-0286-8 Issue No:Vol. 102, No. 1 (2018)
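For orientation, the classical Marcinkiewicz–Zygmund strong law for independent and identically distributed random variables can be stated as follows. This is the textbook i.i.d. version, given here only as background, and not the NSD result established in the paper. Let \(X_1, X_2, \dots\) be i.i.d. with \(E|X_1|^p < \infty\) for some \(1 \le p < 2\). Then

\[ n^{-1/p} \sum_{i=1}^{n} \left( X_i - E X_1 \right) \longrightarrow 0 \quad \text{almost surely as } n \to \infty. \]

The case \(p = 1\) is Kolmogorov's strong law of large numbers. Larger \(p\) gives a faster rate of convergence under a stronger moment condition.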

Authors:Hans Wolfgang Brachinger; Michael Beer; Olivier Schöni Pages: 67 - 93 Abstract: Hedonic methods are considered state of the art for handling quality changes when compiling consumer price indices. The present article proposes first a mathematical description of characteristics and of elementary aggregates. In a following step, a hedonic econometric model is formulated and hedonic elementary population indices are defined. We emphasise that population indices are unobservable economic parameters that need to be estimated by suitable sample indices. It is shown that within the framework developed here, many of the hedonic index formulae used in practice are identified as sample versions corresponding to particular hedonic elementary population indices. The article closes with an empirical part on quarterly housing data where the considered hedonic indices are estimated along with their bootstrapped confidence intervals. It is shown that the computed confidence intervals together with the results from theory suggest a particular answer to the price index problem. PubDate: 2018-01-01 DOI: 10.1007/s10182-017-0293-4 Issue No:Vol. 102, No. 1 (2018)

Authors:Jean-Marie Dufour; Joachim Wilde Abstract: Weak identification is a well-known issue in the context of linear structural models. However, for probit models with endogenous explanatory variables, this problem has been little explored. In this paper, we study by simulation the behavior of the usual z-test and the LR test in the presence of weak identification. We find that the usual asymptotic z-test exhibits large level distortions (over-rejections under the null hypothesis). The magnitude of the level distortions depends heavily on the parameter value tested. In contrast, asymptotic LR tests do not over-reject and appear to be robust to weak identification. PubDate: 2018-04-21 DOI: 10.1007/s10182-018-0325-8

Authors:Meisam Moghimbeygi; Mousa Golalizadeh Abstract: It is known that the shapes of planar triangles can be represented by a set of points on the surface of the unit sphere. On the other hand, most objects can easily be triangulated, and so each triangle can accordingly be treated in the context of shape analysis. There is a growing interest in fitting a smooth path going through the cloud of shape data available at some time instances. To tackle this problem, we propose a longitudinal model through a triangulation procedure for the shape data. In fact, our strategy initially relies on a spherical regression model for triangles, but is extended to shape data via triangulation. Regarding modeling of directional data, we use the bivariate von Mises–Fisher distribution for the density of the errors. Various forms of the composite likelihood functions, constructed by altering the assumptions considered for the angles defined for each triangle, are invoked. The proposed regression model is applied to rat skull data. Also, some simulation results are presented along with the real data results. PubDate: 2018-04-05 DOI: 10.1007/s10182-018-0324-9

Authors:Yandan Yang; Hon Keung Tony Ng; Narayanaswamy Balakrishnan Abstract: In science and engineering, we are often interested in learning about the lifetime characteristics of the system as well as those of the components that make up the system. However, in many cases, the system lifetimes can be observed but not the component lifetimes, and we may also not have any knowledge of the structure of the system. Statistical procedures for estimating the parameters of the component lifetime distribution and for identifying the system structure based on system-level lifetime data are developed here using the expectation–maximization (EM) algorithm. Different implementations of the EM algorithm based on system-level or component-level likelihood functions are proposed. A special case in which the system is known to be a coherent system with unknown structure is considered. The methodologies are then illustrated by considering the component lifetimes to follow a two-parameter Weibull distribution. A numerical example and a Monte Carlo simulation study are used to evaluate the performance and relative merits of the proposed implementations of the EM algorithm. Lognormally distributed component lifetimes and a real data example are used to illustrate how the methodologies can be applied to other lifetime models in addition to the Weibull model. Finally, some recommendations along with concluding remarks are provided. PubDate: 2018-03-09 DOI: 10.1007/s10182-018-0323-x

Authors:Alexander Begun; Anatoli Yashin Abstract: Frailty models allow us to take into account the non-observable inhomogeneity of individual hazard functions. Although models with time-independent frailty have been intensively studied over the last decades and a wide range of applications in survival analysis have been found, the studies based on the models with time-dependent frailty are relatively rare. In this paper, we formulate and prove two propositions related to the identifiability of the bivariate survival models with frailty given by a nonnegative bivariate Lévy process. We discuss parametric and semiparametric procedures for estimating unknown parameters and baseline hazard functions. Numerical experiments with simulated and real data illustrate these procedures. The statements of the propositions can be easily extended to the multivariate case. PubDate: 2018-02-28 DOI: 10.1007/s10182-018-0322-y

Authors:J. A. Mayor-Gallego; J. L. Moreno-Rebollo; M. D. Jiménez-Gamero Abstract: Auxiliary information \(\boldsymbol{x}\) is commonly used in survey sampling at the estimation stage. We propose an estimator of the finite population distribution function \(F_{y}(t)\) when \(\boldsymbol{x}\) is available for all units in the population and related to the study variable \(y\) by a superpopulation model. The new estimator integrates ideas from model calibration and penalized calibration. Calibration estimates of \(F_{y}(t)\) with the weights satisfying benchmark constraints on the fitted values distribution function \(\hat{F}_{\hat{y}}=F_{\hat{y}}\) on a set of fixed values of \(t\) can be found in the literature. Alternatively, our proposal \(\hat{F}_{y\omega}\) seeks an estimator taking into account a global distance \(D(\hat{F}_{\hat{y}\omega},F_{\hat{y}})\) between \(\hat{F}_{\hat{y}\omega}\) and \(F_{\hat{y}}\), and a penalty parameter \(\alpha\) that assesses the importance of this term in the objective function. The weights are explicitly obtained for the \(L^2\) distance and conditions are given so that \(\hat{F}_{y\omega}\) is a distribution function. In this case \(\hat{F}_{y\omega}\) can also be used to estimate the population quantiles. Moreover, results on the asymptotic unbiasedness and the asymptotic variance of \(\hat{F}_{y\omega}\), for a fixed \(\alpha\), are obtained. The results of a simulation study, designed to compare the proposed estimator to other existing ones, reveal that its performance is quite competitive. PubDate: 2018-02-23 DOI: 10.1007/s10182-018-0321-z

Authors:Lara Fontanella; Annalina Sarra; Pasquale Valentini; Simone Di Zio; Sara Fontanella Abstract: Recent years have seen increased attention paid to monitoring social anomie and its dependency on micro- and macro-factors. In this paper, we endorse the theorisation of social anomie as a complex, multidimensional and multilevel phenomenon. To ensure a rigorous measurement of the varying levels of social anomie in the European countries, the current study relies on a multilevel multidimensional item response theory model which explicitly accounts for the presence of a non-ignorable missing data mechanism. This unified approach makes it possible to specify an analytical model of links between anomie features and their determinants and to explore how the latent traits of interest are influenced by individual-level factors, as well as by country-level indicators. Additionally, to avoid misleading inferential conclusions, the proposed model takes into account the respondent’s omitting behaviour, assuming that the missingness mechanism is driven by a latent propensity to respond. Data used in this study have been collected in the 2010 wave of the European Social Survey. To reduce the computational complexities, a Bayesian specification of the MIRT model is provided and the parameter model estimates are obtained through MCMC algorithms. PubDate: 2018-02-10 DOI: 10.1007/s10182-018-0320-0

Authors:Hong-Xia Xu; Guo-Liang Fan; Zhen-Long Chen; Jiang-Feng Wang Abstract: This paper develops a varying-coefficient approach to the estimation and testing of regression quantiles under randomly truncated data. In order to handle the truncated data, random weights are introduced and weighted quantile regression (WQR) estimators for nonparametric functions are proposed. To achieve good efficiency properties, we further develop a weighted composite quantile regression (WCQR) estimation method for nonparametric functions in varying-coefficient models. The asymptotic properties of both the proposed WQR and WCQR estimators are established. In addition, we propose a novel bootstrap-based test procedure to test whether the nonparametric functions in varying-coefficient quantile models can be specified by some functional forms. The performance of the proposed estimators and test procedure is investigated through simulation studies and a real data example. PubDate: 2018-02-07 DOI: 10.1007/s10182-018-0319-6

Authors:Hosik Choi; Eunjung Song; Seung-sik Hwang; Woojoo Lee Abstract: Detecting local spatial clusters for count data is an important task in spatial epidemiology. Two broad approaches—moving window and disease mapping methods—have been suggested in some of the literature to find clusters. However, the existing methods employ somewhat arbitrarily chosen tuning parameters, and the local clustering results are sensitive to the choices. In this paper, we propose a penalized likelihood method to overcome the limitations of existing local spatial clustering approaches for count data. We start with a Poisson regression model to accommodate any type of covariates, and formulate the clustering problem as a penalized likelihood estimation problem to find change points of intercepts in two-dimensional space. The cost of developing a new algorithm is minimized by modifying an existing least absolute shrinkage and selection operator algorithm. The computational details on the modifications are shown, and the proposed method is illustrated with Seoul tuberculosis data. PubDate: 2018-01-17 DOI: 10.1007/s10182-018-0318-7

Authors:Moumita Chatterjee; Sugata Sen Roy Abstract: The motivation for this paper is a cystic fibrosis data set which records a patient’s times to relapse and times to cure under several recurrences of the disease. The idea is to study the impact of covariates on the hazard rates of two alternately occurring events. The dependence between the times to the two events over the different cycles is modeled through an autoregressive-type setup. The partial likelihood function is then derived and the estimators obtained. The estimators are shown to be consistent and asymptotically normal. The technique is applied to study the motivating data. A simulation study is also conducted to corroborate the results. PubDate: 2017-12-21 DOI: 10.1007/s10182-017-0316-1