Abstract: This paper develops a statistical inference procedure for multivariate failure time data when the primary covariate can be measured only on a subset of the full cohort but auxiliary information is available. To improve the efficiency of statistical inference, we use a quadratic inference function approach to incorporate the intra-cluster correlation and a kernel smoothing technique to further exploit the auxiliary information. The proposed method is shown to be more efficient than methods that ignore the intra-cluster correlation and auxiliary information, and it is easy to implement. In addition, we develop a chi-squared test for hypothesis testing of hazard ratio parameters. We evaluate the finite-sample performance of the proposed procedure via extensive simulation studies. The proposed approach is illustrated by the analysis of a real data set from a study of left ventricular dysfunction. PubDate: 2021-01-08

Abstract: Time-to-event data often violate the proportional hazards assumption inherent in the popular Cox regression model. Such violations are especially common in biological and medical data, where latent heterogeneity due to unmeasured covariates or time-varying effects is common. A variety of parametric survival models have been proposed in the literature which make more appropriate assumptions on the hazard function, at least for certain applications. One such model is derived from the First Hitting Time (FHT) paradigm, which assumes that a subject’s event time is determined by a latent stochastic process reaching a threshold value. Several random effects specifications of the FHT model have also been proposed which allow for better modeling of data with unmeasured covariates. While often appropriate, these methods can display limited flexibility due to their inability to model a wide range of heterogeneities. To address this issue, we propose a Bayesian model which loosens the assumptions on the mixing distribution inherent in the random effects FHT models currently in use. We demonstrate via simulation study that the proposed model greatly improves both survival and parameter estimation in the presence of latent heterogeneity. We also apply the proposed methodology to data from a toxicology/carcinogenicity study which exhibits nonproportional hazards and contrast the results with both the Cox model and two popular FHT models. PubDate: 2021-01-08

Abstract: This paper considers optimal design for the frailty model with discrete-time survival endpoints in longitudinal studies. We introduce random effects into the discrete hazard models to account for heterogeneity between experimental subjects, which causes the observations of the same subject at sequential time points to be correlated. We propose a general design method to collect the survival endpoints as inexpensively and efficiently as possible. A cost-based generalized D ( \(D_s\) )-optimal design criterion is proposed to derive optimal designs for estimating the fixed effects under a cost constraint. Different computation strategies based on grid search or the particle swarm optimization (PSO) algorithm are provided to obtain generalized D ( \(D_s\) )-optimal designs. The equivalence theorem for the cost-based D ( \(D_s\) )-optimal design criterion is given to verify the optimality of the designs. Our numerical results indicate that the presence of the random effects has a great influence on the optimal designs. Some useful suggestions are also put forward for designing future longitudinal studies. PubDate: 2021-01-08

Abstract: In this paper, we propose an innovative method for jointly analyzing survival data and longitudinally measured continuous and ordinal data. We use a random effects accelerated failure time model for survival outcomes, a linear mixed model for continuous longitudinal outcomes, and a proportional odds mixed model for ordinal longitudinal outcomes, where these outcome processes are linked through a set of association parameters. A primary objective of this study is to examine the effects of association parameters on the estimators of joint models. The model parameters are estimated by the method of maximum likelihood. The finite-sample properties of the estimators are studied using Monte Carlo simulations. The empirical study suggests that the degree of association among the outcome processes influences the bias, efficiency, and coverage probability of the estimators. Our proposed joint model estimators are approximately unbiased and produce smaller mean squared errors as compared to the estimators obtained from separate models. This work is motivated by a large multicenter study, referred to as the Genetic and Inflammatory Markers of Sepsis (GenIMS) study. We apply our proposed method to the GenIMS data analysis. PubDate: 2020-11-24

Abstract: Models for situations where some individuals are long-term survivors, immune or non-susceptible to the event of interest, are extensively studied in biomedical research. Fitting a regression model can be problematic in situations involving small sample sizes with a high censoring rate, since the maximum likelihood estimates of some coefficients may be infinite. This phenomenon is called monotone likelihood, and it occurs in the presence of many categorical covariates, especially when one covariate level is not associated with any failure (in survival analysis) or when a categorical covariate perfectly predicts a binary response (in logistic regression). A well-known solution is an adaptation of the Firth method, originally created to reduce estimation bias. The method provides a finite estimate by penalizing the likelihood function. Bias correction in the mixture cure model is a topic rarely discussed in the literature and constitutes a central contribution of this work. To handle this point in such a context, we propose to derive the adjusted score function based on the Firth method. An extensive Monte Carlo simulation study indicates good inference performance for the penalized maximum likelihood estimates. The analysis is illustrated through a real application involving patients with melanoma treated at the Hospital das Clínicas/UFMG in Brazil. This is a relatively novel data set affected by the monotone likelihood issue and containing cured individuals. PubDate: 2020-11-13
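Monotone likelihood and the effect of a Firth-type penalty can be illustrated numerically. The sketch below is a toy one-parameter logistic example, not the authors' adjusted score for the mixture cure model: on perfectly separated data the unpenalized log-likelihood increases without bound in the coefficient, while adding the Jeffreys-prior penalty 0.5·log I(β) yields a finite maximizer.

```python
import math

# Toy one-parameter logistic example of monotone likelihood (an illustrative
# sketch, not the paper's adjusted score for the mixture cure model).
# Perfect separation: y = 1 exactly when x > 0.
x = [1.0, 2.0, 3.0, -1.0, -2.0, -3.0]
y = [1, 1, 1, 0, 0, 0]

def loglik(beta):
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll

def fisher_info(beta):
    # Fisher information for the one-parameter logistic model.
    total = 0.0
    for xi in x:
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        total += xi * xi * p * (1.0 - p)
    return total

grid = [i * 0.01 for i in range(1001)]  # beta in [0, 10]
beta_mle = max(grid, key=loglik)        # monotone: runs to the grid boundary
beta_firth = max(grid, key=lambda b: loglik(b) + 0.5 * math.log(fisher_info(b)))
```

On this data the unpenalized maximizer hits the edge of the search grid (it would diverge on any larger grid), while the penalized maximizer settles at a finite interior value.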

Abstract: Calibration is an important measure of the predictive accuracy of a prognostic risk model. A widely used measure of calibration when the outcome is survival time is the expected Brier score. In this paper, methodology is developed to accurately estimate the difference in expected Brier scores derived from nested survival models and to compute an accompanying variance estimate of this difference. The methodology is applicable to time-invariant and time-varying coefficient Cox survival models. The nested survival model approach is often applied to the scenario where the full model consists of conventional and new covariates and the subset model contains the conventional covariates alone. A complicating factor in the methodologic development is that the Cox model specification cannot, in general, be simultaneously satisfied for nested models. The problem is resolved by projecting the properly specified full survival model onto the lower-dimensional space of the conventional markers alone. Simulations are performed to examine the method’s finite-sample properties, and a prostate cancer data set is used to illustrate its application. PubDate: 2020-10-22
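As a reference point for what the expected Brier score measures, here is a minimal sketch of the empirical Brier score at a fixed time t for fully observed (uncensored) data; the paper's methodology additionally handles censoring and provides a variance estimate for the difference between nested models, neither of which is attempted here. All numbers are invented for illustration.

```python
def brier_score(event_times, surv_probs, t):
    # Empirical Brier score at time t for uncensored data: the mean squared
    # distance between the survival indicator 1{T > t} and the predicted
    # survival probability S_hat(t | x).
    n = len(event_times)
    return sum(((T > t) - s) ** 2 for T, s in zip(event_times, surv_probs)) / n

# Four subjects; hypothetical predictions S(5 | x) from a "full" and a
# "subset" model (illustrative values only).
times = [2.0, 4.0, 6.0, 8.0]
full_model = [0.1, 0.3, 0.7, 0.9]    # well calibrated at t = 5
subset_model = [0.5, 0.5, 0.5, 0.5]  # uninformative

delta = brier_score(times, full_model, 5.0) - brier_score(times, subset_model, 5.0)
```

A negative `delta` indicates the full model is better calibrated than the subset model at t = 5; the paper's contribution is estimating this difference, with its variance, under censoring.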

Abstract: Outcome-dependent sampling (ODS) designs such as the case–control or case–cohort design are widely used in epidemiological studies for their outstanding cost-effectiveness. In this article, we propose and develop a smoothed weighted Gehan estimating equation approach for inference in an accelerated failure time model under a general failure-time outcome-dependent sampling scheme. The proposed estimating equation is continuously differentiable and can be solved by standard numerical methods. In addition to developing the asymptotic properties of the proposed estimator, we also propose and investigate a new optimal power-based subsample allocation criterion in the proposed design, obtained by maximizing the power function of a significance test. Simulation results show that the proposed estimator is more efficient than other existing competing estimators and that the optimal power-based subsample allocation provides an ODS design that yields improved power for the test of the exposure effect. We illustrate the proposed method with a data set from the Norwegian Mother and Child Cohort Study to evaluate the relationship between exposure to perfluoroalkyl substances and women’s subfecundity. PubDate: 2020-10-12

Abstract: In this paper, we first propose a dependent Dirichlet process (DDP) model using a mixture of Weibull models with each mixture component resembling a Cox model for survival data. We then build a Dirichlet process mixture model for competing risks data without regression covariates. Next we extend this model to a DDP model for competing risks regression data by using a multiplicative covariate effect on subdistribution hazards in the mixture components. Though built on proportional hazards (or subdistribution hazards) models, the proposed nonparametric Bayesian regression models do not require the assumption of constant hazard (or subdistribution hazard) ratio. An external time-dependent covariate is also considered in the survival model. After describing the model, we discuss how both cause-specific and subdistribution hazard ratios can be estimated from the same nonparametric Bayesian model for competing risks regression. For use with the regression models proposed, we introduce an omnibus prior that is suitable when little external information is available about covariate effects. Finally we compare the models’ performance with existing methods through simulations. We also illustrate the proposed competing risks regression model with data from a breast cancer study. An R package “DPWeibull” implementing all of the proposed methods is available at CRAN. PubDate: 2020-10-12

Abstract: Interval-censored failure time data arise in a number of fields, and many authors have recently paid increasing attention to their analysis. However, regression analysis of interval-censored data under the additive risk model can be challenging because of the difficulty of maximizing the complex likelihood, especially when there exists a non-ignorable cure fraction in the population. For this problem, we develop a sieve maximum likelihood estimation approach based on Bernstein polynomials. To relieve the computational burden, an expectation–maximization algorithm exploiting a Poisson data augmentation is proposed. Under some mild conditions, the asymptotic properties of the proposed estimator are established. The finite-sample performance of the proposed method is evaluated by extensive simulations and is further illustrated through a real data set from a smoking cessation study. PubDate: 2020-10-01

Abstract: Interval-censored data often arise naturally in medical, biological, and demographic studies. As a matter of routine, Cox proportional hazards regression is employed to fit such censored data. The related work in the framework of additive hazards regression, which is often considered a promising alternative, remains to be investigated. We propose a sieve maximum likelihood method for estimating regression parameters in the additive hazards regression with case II interval-censored data, which consist of right-, left- and interval-censored observations. We establish the consistency and the asymptotic normality of the proposed estimator and show that it attains the semiparametric efficiency bound. The finite-sample performance of the proposed method is assessed via comprehensive simulation studies and is further illustrated by a real clinical example for patients with hemophilia. PubDate: 2020-10-01

Abstract: The cause of failure in cohort studies that involve competing risks is frequently incompletely observed. To address this, several methods have been proposed for the semiparametric proportional cause-specific hazards model under a missing at random assumption. However, these proposals provide inference for the regression coefficients only, and do not consider the infinite dimensional parameters, such as the covariate-specific cumulative incidence function. Nevertheless, the latter quantity is essential for risk prediction in modern medicine. In this paper we propose a unified framework for inference about both the regression coefficients of the proportional cause-specific hazards model and the covariate-specific cumulative incidence functions under missing at random cause of failure. Our approach is based on a novel computationally efficient maximum pseudo-partial-likelihood estimation method for the semiparametric proportional cause-specific hazards model. Using modern empirical process theory we derive the asymptotic properties of the proposed estimators for the regression coefficients and the covariate-specific cumulative incidence functions, and provide methodology for constructing simultaneous confidence bands for the latter. Simulation studies show that our estimators perform well even in the presence of a large fraction of missing causes of failure, and that the regression coefficient estimator can be substantially more efficient compared to the previously proposed augmented inverse probability weighting estimator. The method is applied using data from an HIV cohort study and a bladder cancer clinical trial. PubDate: 2020-10-01

Abstract: Case–cohort studies are useful when information on certain risk factors is difficult or costly to ascertain. In particular, a case–cohort study may be well suited in situations where several case series are of interest, e.g. in studies with competing risks, because the same sub-cohort may serve as a comparison group for all case series. Previous analyses of this kind of sampled cohort data most often involved estimation of rate ratios based on a Cox regression model. However, with competing risks this method will not provide parameters that directly describe the association between covariates and cumulative risks. In this paper, we study regression analysis of cause-specific cumulative risks in case–cohort studies using pseudo-observations. We focus mainly on the situation with competing risks. However, as a by-product, we also develop a method by which absolute mortality risks may be analyzed directly from case–cohort survival data. We adjust for the case–cohort sampling by inverse sampling probabilities applied to a generalized estimating equation. The large-sample properties of the proposed estimator are developed and small-sample properties are evaluated in a simulation study. We apply the methodology to study the effect of a specific diet component and a specific gene on the absolute risk of atrial fibrillation. PubDate: 2020-10-01
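The pseudo-observation idea can be sketched as a leave-one-out (jackknife) construction. The toy example below covers only the no-censoring case, where the Kaplan–Meier estimate of S(t) reduces to an empirical proportion and each pseudo-observation collapses to the indicator 1{T_i > t}; the paper's setting additionally involves censoring, competing risks, and inverse sampling-probability weights, none of which are handled here.

```python
def surv_prob(times, t):
    # Empirical survival probability; this equals the Kaplan-Meier estimate
    # when there is no censoring.
    return sum(1 for ti in times if ti > t) / len(times)

def pseudo_observations(times, t):
    # Jackknife pseudo-observation for subject i:
    #   theta_i = n * theta_hat - (n - 1) * theta_hat_(-i)
    n = len(times)
    full = surv_prob(times, t)
    return [n * full - (n - 1) * surv_prob(times[:i] + times[i + 1:], t)
            for i in range(n)]
```

With uncensored data the pseudo-observations recover the subject-level indicators, which is what makes them usable as responses in a generalized estimating equation.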

Abstract: Treatment switching frequently occurs in clinical trials due to ethical reasons. Intent-to-treat analysis without adjusting for switching yields biased and inefficient estimates of the treatment effects. In this paper, we propose a class of semiparametric semi-competing risks transition survival models to accommodate two-way time-varying switching. Theoretical properties of the proposed method are examined. An efficient expectation–maximization algorithm is derived to obtain maximum likelihood estimates and model diagnostic tools. Existing software is used to implement the algorithm. Simulation studies are conducted to demonstrate the validity of the model. The proposed method is further applied to data from a clinical trial with patients having recurrent or metastatic squamous-cell carcinoma of the head and neck. PubDate: 2020-10-01

Abstract: In the time-to-event setting, the concordance probability assesses the relative level of agreement between a model-based risk score and the survival time of a patient. While it provides a measure of discrimination over the entire follow-up period of a study, the probability does not provide information on the longitudinal durability of a baseline risk score. It is possible that a baseline risk model is able to segregate short-term from long-term survivors but unable to maintain its discriminatory strength later in the follow-up period. As a consequence, this would motivate clinicians to re-evaluate the risk score longitudinally. This longitudinal re-evaluation may not, however, be feasible in many scenarios since a single baseline evaluation may be the only data collectible due to treatment or other clinical or ethical reasons. In these scenarios, an attenuation of the discriminatory power of the patient risk score over time would indicate decreased clinical utility and call into question whether this score should remain a prognostic tool at later time points. Working within the concordance probability paradigm, we propose a method to address this clinical scenario and evaluate the discriminatory power of a baseline-derived risk score over time. The methodology is illustrated with two examples: a baseline risk score in colorectal cancer defined at the time of tumor resection, and for circulating tumor cells in metastatic prostate cancer. PubDate: 2020-07-24
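For reference, the concordance probability being evaluated over time is, in its simplest uncensored form, the probability that of two subjects the one with the higher risk score fails earlier. A minimal Harrell-type C-index sketch (no censoring, and not the paper's time-dependent extension):

```python
def concordance(times, risks):
    # Harrell-type C-index for uncensored data: among comparable pairs,
    # the fraction where the earlier failure has the higher risk score
    # (ties in risk score get half credit; tied times are skipped).
    conc = ties = total = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue
            total += 1
            lo, hi = (i, j) if times[i] < times[j] else (j, i)
            if risks[lo] > risks[hi]:
                conc += 1
            elif risks[lo] == risks[hi]:
                ties += 1
    return (conc + 0.5 * ties) / total
```

A score of 1.0 means perfect discrimination, 0.5 means no better than chance; the paper studies how this quantity for a fixed baseline score can attenuate as follow-up progresses.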

Abstract: The absolute standardized hazard ratio (ASHR) is a scale-invariant scalar measure of the strength of association of a vector of covariates with the risk of an event. It is derived from proportional hazards regression. The ASHR is useful for making comparisons among different sets of covariates. Extensions of the ASHR concept and practical considerations regarding its computation are discussed. These include a new method to conduct preliminary checks for collinearity among covariates, a partial ASHR to evaluate the association of a subset of the covariates with event risk conditional on the others, and the ASHR for interactions. To put the ASHR in context, its relationship to measures of explained variation and other measures of separation of risk is discussed. A new measure of the contribution of each covariate to the risk score variance is proposed. This measure, which is derived from the ASHR calculations, is interpretable as variable importance within the context of the multivariable model. PubDate: 2020-07-23

Abstract: The hazard ratio is one of the most commonly reported measures of treatment effect in randomised trials, yet the source of much misinterpretation. This point was made clear by Hernán (Epidemiology (Cambridge, Mass) 21(1):13–15, 2010) in a commentary, which emphasised that the hazard ratio contrasts populations of treated and untreated individuals who survived a given period of time, populations that will typically fail to be comparable—even in a randomised trial—as a result of different pressures or intensities acting on different populations. The commentary has been very influential, but also a source of surprise and confusion. In this note, we aim to provide more insight into the subtle interpretation of hazard ratios and differences, by investigating in particular what can be learned about a treatment effect from the hazard ratio becoming 1 (or the hazard difference 0) after a certain period of time. We further define a hazard ratio that has a causal interpretation and study its relationship to the Cox hazard ratio, and we also define a causal hazard difference. These quantities are of theoretical interest only, however, since they rely on assumptions that cannot be empirically evaluated. Throughout, we will focus on the analysis of randomised experiments. PubDate: 2020-07-11
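The selection effect Hernán describes can be made concrete with a closed-form frailty calculation. The sketch below is an illustrative computation, not a method from the paper: assume every individual has a constant conditional hazard (λ for control, 2λ for treated) multiplied by a gamma frailty with mean 1 and variance θ. The population-averaged hazard is then λ/(1 + θλt), so the marginal hazard ratio starts at 2 and decays toward 1 over time even though each individual's hazard ratio is exactly 2: high-frailty subjects are depleted faster from the treated arm's risk set.

```python
# Conditional hazards: lam0 for control, r * lam0 for treated; a gamma
# frailty with mean 1 and variance theta acts multiplicatively on both.
lam0, r, theta = 1.0, 2.0, 1.0

def marginal_hazard(lam, t):
    # Population-averaged hazard under gamma(mean 1, var theta) frailty
    # when the conditional hazard is the constant lam.
    return lam / (1.0 + theta * lam * t)

def marginal_hr(t):
    # Marginal (population-level) hazard ratio of treated vs control at t.
    return marginal_hazard(r * lam0, t) / marginal_hazard(lam0, t)
```

At t = 0 the marginal hazard ratio equals the individual-level ratio r = 2; by t = 10 it has decayed to about 1.05, despite no individual's treatment effect changing.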

Abstract: Mean residual life (MRL) is the remaining life expectancy of a subject who has survived to a certain time point and can be used as an alternative to the hazard function for characterizing the distribution of a time-to-event variable. Inference and application of MRL models have primarily focused on full-cohort studies. In practice, case-cohort and nested case-control designs have been commonly used within large cohorts that have long follow-up and study rare diseases, particularly when studying costly molecular biomarkers. They enable the same prospective inference as the full-cohort design, with significant cost-saving benefits. In this paper, we study the modeling and inference of a family of generalized MRL models under case-cohort and nested case-control designs. Built upon the idea of inverse selection probability, weighted estimating equations are constructed to estimate regression parameters and the baseline MRL function. Asymptotic properties of the proposed estimators are established and finite-sample performance is evaluated by extensive numerical simulations. An application to the New York University Women’s Health Study is presented to illustrate the proposed models and demonstrate a model diagnostic method to guide practical implementation. PubDate: 2020-06-11
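The quantity being modeled has a simple empirical analogue in the uncensored full-cohort case: the average remaining time among subjects still at risk at t. A minimal sketch (the paper's estimators instead use inverse-selection-probability weighted estimating equations under case-cohort and nested case-control sampling, which this does not attempt):

```python
def mean_residual_life(times, t):
    # Empirical MRL(t): average of (T_i - t) over subjects with T_i > t,
    # for uncensored data; returns 0.0 if no one survives past t.
    tail = [ti - t for ti in times if ti > t]
    return sum(tail) / len(tail) if tail else 0.0

times = [1.0, 2.0, 4.0, 8.0]  # illustrative event times
```

At t = 0 this is just the mean survival time; unlike the hazard, it answers "how much longer?" directly on the time scale.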

Abstract: The restricted mean survival time is often of direct interest in epidemiologic studies involving censored survival times. In this article, we propose nonparametric and semiparametric estimators of the mean restricted to a preassigned interval with censored length-biased data. Based on the peculiarity of length-biased data, the auxiliary information that the truncation time and the residual time have the same distribution is taken into account to improve estimation efficiency. For two-sample comparison, we construct two tests which are easy to implement. We also derive the asymptotic properties of the proposed estimators and test statistics. Simulation studies are conducted to compare the performance of several approaches to estimating the restricted mean and to assess the test statistics. In addition, our methods are applied to a real data example and some interesting results are presented. PubDate: 2020-04-16
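The restricted mean itself is just the area under the survival curve up to the cutoff τ. A minimal sketch for uncensored data (the paper's contribution is handling censored, length-biased data, which this does not attempt): integrate the empirical survival step function, which for uncensored data equals the mean of min(T_i, τ).

```python
def rmst(times, tau):
    # Area under the empirical survival curve S(t) = #(T_i > t)/n on [0, tau];
    # with no censoring this equals the mean of min(T_i, tau).
    n = len(times)
    area, prev, at_risk = 0.0, 0.0, n
    for t in sorted(ti for ti in times if ti < tau):
        area += (at_risk / n) * (t - prev)
        prev, at_risk = t, at_risk - 1
    return area + (at_risk / n) * (tau - prev)
```

Unlike the hazard ratio, the restricted mean stays interpretable without a proportional hazards assumption, which is why it is often of direct interest.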

Abstract: This paper studies the Cox model with time-varying coefficients for cause-specific hazard functions when the causes of failure are subject to missingness. Inverse probability weighted and augmented inverse probability weighted estimators are investigated. The latter is considered as a two-stage estimator by directly utilizing the inverse probability weighted estimator and through modeling available auxiliary variables to improve efficiency. The asymptotic properties of the two estimators are investigated. Hypothesis testing procedures are developed to test the null hypotheses that the covariate effects are zero and that the covariate effects are constant. We conduct simulation studies to examine the finite sample properties of the proposed estimation and hypothesis testing procedures under various settings of the auxiliary variables and the percentages of the failure causes that are missing. These simulation results demonstrate that the augmented inverse probability weighted estimators are more efficient than the inverse probability weighted estimators and that the proposed testing procedures have the expected satisfactory results in sizes and powers. The proposed methods are illustrated using the Mashi clinical trial data for investigating the effect of randomization to formula-feeding versus breastfeeding plus extended infant zidovudine prophylaxis on death due to mother-to-child HIV transmission in Botswana. PubDate: 2020-04-09
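The core of the inverse probability weighting idea can be shown with a toy calculation (illustrative numbers only, not the Mashi data or the paper's regression setting): when the probability of observing the cause of failure depends on a covariate, the complete-case estimate of a cause-1 fraction is biased, while weighting each complete case by the inverse of its observation probability removes the bias.

```python
# Toy cohort with missing cause of failure: cause in {1, 2}, covariate Z,
# and R = 1 if the cause was observed. Missingness depends only on Z
# (missing at random): pi = P(R = 1 | Z) is 0.5 when Z = 0, 1.0 when Z = 1.
causes = [1, 1, 2, 2, 1, 2, 2, 2]
Z      = [0, 0, 0, 0, 1, 1, 1, 1]
R      = [1, 0, 1, 0, 1, 1, 1, 1]  # observed pattern consistent with pi
pi     = [0.5 if z == 0 else 1.0 for z in Z]

truth = sum(c == 1 for c in causes) / len(causes)  # true cause-1 fraction

# Complete-case estimate: biased, because Z = 0 subjects (who fail more
# often from cause 1) are under-represented among the complete cases.
cc = sum(1 for c, r in zip(causes, R) if r and c == 1) / sum(R)

# Inverse probability weighted estimate: each complete case gets weight 1/pi.
num = sum((c == 1) / p for c, r, p in zip(causes, R, pi) if r)
den = sum(1.0 / p for r, p in zip(R, pi) if r)
ipw = num / den
```

Here the IPW estimate recovers the true fraction exactly while the complete-case estimate does not; the augmented estimator the paper studies additionally exploits auxiliary variables to regain efficiency.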