  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 130 of 130 Journals sorted alphabetically
Advances in Complex Systems     Hybrid Journal   (Followers: 10)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 61)
Annals of Applied Statistics     Full-text available via subscription   (Followers: 39)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Argumentation et analyse du discours     Open Access   (Followers: 10)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 4)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Bernoulli     Full-text available via subscription   (Followers: 9)
Biometrical Journal     Hybrid Journal   (Followers: 10)
Biometrics     Hybrid Journal   (Followers: 51)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 1)
Bulletin of Statistics     Full-text available via subscription   (Followers: 4)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 11)
Demographic Research     Open Access   (Followers: 16)
Electronic Journal of Statistics     Open Access   (Followers: 8)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Full-text available via subscription   (Followers: 5)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 8)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 5)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 17)
International Journal of Stochastic Analysis     Open Access   (Followers: 3)
International Statistical Review     Hybrid Journal   (Followers: 12)
International Trade by Commodity Statistics - Statistiques du commerce international par produit     Full-text available via subscription  
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 4)
Journal of Applied Statistics     Hybrid Journal   (Followers: 20)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 20)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 39, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 82)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 6)
Journal of Forecasting     Hybrid Journal   (Followers: 17)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Interactive Marketing     Hybrid Journal   (Followers: 10)
Journal of Mathematics and Statistics     Open Access   (Followers: 8)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 32)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 5)
Journal of Statistical Physics     Hybrid Journal   (Followers: 13)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 8)
Journal of Statistical Software     Open Access   (Followers: 20, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 72, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal   (Followers: 1)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 31)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 26)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 43)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 30)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Modelling of Mechanical Systems     Full-text available via subscription   (Followers: 1)
Monte Carlo Methods and Applications     Hybrid Journal   (Followers: 6)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 2)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 5)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 8)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Probability Surveys     Open Access   (Followers: 4)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 7)
Review of Economics and Statistics     Hybrid Journal   (Followers: 124)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 15)
Sankhya A     Hybrid Journal   (Followers: 2)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 37)
SourceOCDE Comptes nationaux et Statistiques retrospectives     Full-text available via subscription  
SourceOCDE Statistiques : Sources et methodes     Full-text available via subscription  
SourceOECD Bank Profitability Statistics - SourceOCDE Rentabilite des banques     Full-text available via subscription   (Followers: 1)
SourceOECD Insurance Statistics - SourceOCDE Statistiques d'assurance     Full-text available via subscription   (Followers: 2)
SourceOECD Main Economic Indicators - SourceOCDE Principaux indicateurs economiques     Full-text available via subscription   (Followers: 1)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
SourceOECD Monthly Statistics of International Trade     Full-text available via subscription   (Followers: 1)
SourceOECD National Accounts & Historical Statistics     Full-text available via subscription  
SourceOECD OECD Economic Outlook Database - SourceOCDE Statistiques des Perspectives economiques de l'OCDE     Full-text available via subscription   (Followers: 2)
SourceOECD Science and Technology Statistics - SourceOCDE Base de donnees des sciences et de la technologie     Full-text available via subscription  
SourceOECD Statistics Sources & Methods     Full-text available via subscription   (Followers: 1)
SourceOECD Taxing Wages Statistics - SourceOCDE Statistiques des impots sur les salaires     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Applications in Genetics and Molecular Biology     Hybrid Journal   (Followers: 5)
Statistical Communications in Infectious Diseases     Hybrid Journal  
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methodology     Hybrid Journal   (Followers: 7)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 27)
Statistical Modelling     Hybrid Journal   (Followers: 19)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistical Science     Full-text available via subscription   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics & Risk Modeling     Hybrid Journal   (Followers: 2)
Statistics and Computing     Hybrid Journal   (Followers: 13)
Statistics and Economics     Open Access   (Followers: 1)
Statistics in Medicine     Hybrid Journal   (Followers: 191)
Statistics, Politics and Policy     Hybrid Journal   (Followers: 6)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 14)
Stochastic Models     Hybrid Journal   (Followers: 3)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 7)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 3)
The American Statistician     Full-text available via subscription   (Followers: 24)
The Annals of Applied Probability     Full-text available via subscription   (Followers: 8)
The Annals of Probability     Full-text available via subscription   (Followers: 10)
The Annals of Statistics     Full-text available via subscription   (Followers: 34)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 11)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)

Statistical Methods in Medical Research
Journal Prestige (SJR): 1.402
Citation Impact (citeScore): 2
Number of Followers: 27  
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 0962-2802 - ISSN (Online) 1477-0334
Published by Sage Publications  [1097 journals]
  • Improving convergence in growth mixture models without covariance structure constraints
    • Authors: Daniel McNeish, Jeffrey R. Harring
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Growth mixture models are a popular method to uncover heterogeneity in growth trajectories. Harnessing the power of growth mixture models in applications is difficult given the prevalence of nonconvergence when fitting growth mixture models to empirical data. Growth mixture models are rooted in the random effect tradition, and nonconvergence often leads researchers to modify their intended model with constraints in the random effect covariance structure to facilitate estimation. While practical, doing so has been shown to adversely affect parameter estimates, class assignment, and class enumeration. Instead, we advocate specifying the models with a marginal approach to prevent the widespread practice of sacrificing class-specific covariance structures to appease nonconvergence. A simulation is provided to show the importance of modeling class-specific covariance structures and builds off existing literature showing that applying constraints to the covariance leads to poor performance. These results suggest that retaining class-specific covariance structures should be a top priority and that marginal models like covariance pattern growth mixture models that model the covariance structure without random effects are well-suited for such a purpose, particularly with modest sample sizes and attrition commonly found in applications. An application to PTSD data with such characteristics is provided to demonstrate (a) convergence difficulties with random effect models, (b) how covariance structure constraints improve convergence but to the detriment of performance, and (c) how covariance pattern growth mixture models may provide a path forward that improves convergence without forfeiting class-specific covariance structures.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-13T03:33:21Z
      DOI: 10.1177/0962280220981747
  • Online control of the familywise error rate
    • Authors: Jinjin Tian, Aaditya Ramdas
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Biological research often involves testing a growing number of null hypotheses as new data are accumulated over time. We study the problem of online control of the familywise error rate, that is testing an a priori unbounded sequence of hypotheses (p-values) one by one over time without knowing the future, such that with high probability there are no false discoveries in the entire sequence. This paper unifies algorithmic concepts developed for offline (single batch) familywise error rate control and online false discovery rate control to develop novel online familywise error rate control methods. Though many offline familywise error rate methods (e.g., Bonferroni, fallback procedures and Sidak’s method) can trivially be extended to the online setting, our main contribution is the design of new, powerful, adaptive online algorithms that control the familywise error rate when the p-values are independent or locally dependent in time. Our numerical experiments demonstrate substantial gains in power, that are also formally proved in an idealized Gaussian sequence model. A promising application to the International Mouse Phenotyping Consortium is described.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-08T07:32:13Z
      DOI: 10.1177/0962280220983381
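
The abstract above notes that offline methods such as Bonferroni extend trivially to the online setting. A minimal sketch of that baseline follows, not the adaptive algorithms the paper proposes. It assumes the per-test levels alpha_i = alpha * 6 / (pi^2 * i^2), which sum to alpha over an unbounded sequence; the function name `online_bonferroni` and this particular spending sequence are illustrative choices, not taken from the paper.

```python
import math


def online_bonferroni(p_values, alpha=0.05):
    """Test hypotheses one at a time, spending alpha along an infinite sequence.

    The i-th test uses level alpha * 6 / (pi^2 * i^2). These levels sum to
    alpha over i = 1, 2, ..., so the union bound keeps the familywise error
    rate at or below alpha however long the sequence becomes.
    (Assumed spending sequence; any nonnegative sequence summing to alpha works.)
    """
    decisions = []
    for i, p in enumerate(p_values, start=1):
        alpha_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        decisions.append(p <= alpha_i)  # decided without seeing later p-values
    return decisions


if __name__ == "__main__":
    print(online_bonferroni([0.001, 0.5, 0.0001]))
```

Because the levels shrink like 1/i^2, late hypotheses are tested very stringently, which is the loss of power that adaptive online procedures aim to reduce.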
  • Continuous(ly) missing outcome data in network meta-analysis: A one-stage pattern-mixture model approach
    • Authors: Loukia M Spineli, Chrysostomos Kalyvas, Katerina Papadimitropoulou
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Appropriate handling of aggregate missing outcome data is necessary to minimise bias in the conclusions of systematic reviews. The two-stage pattern-mixture model has already been proposed to address aggregate missing continuous outcome data. While this approach is more appropriate than the exclusion of missing continuous outcome data and simple imputation methods, it does not offer flexible modelling of missing continuous outcome data to investigate their implications for the conclusions thoroughly. Therefore, we propose a one-stage pattern-mixture model approach under the Bayesian framework to address missing continuous outcome data in a network of interventions and gain knowledge about the missingness process in different trials and interventions. We extend the hierarchical network meta-analysis model for one aggregate continuous outcome to incorporate a missingness parameter that measures the departure from the missing at random assumption. We consider various effect size estimates for continuous data, and two informative missingness parameters, the informative missingness difference of means and the informative missingness ratio of means. We incorporate our prior belief about the missingness parameters while allowing for several possibilities of prior structures to account for the fact that the missingness process may differ in the network. The method is exemplified in two networks from published reviews comprising different amounts of missing continuous outcome data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-07T05:50:47Z
      DOI: 10.1177/0962280220983544
  • Prediction of cancer survival for cohorts of patients most recently diagnosed using multi-model inference
    • Authors: Camille Maringe, Aurélien Belot, Bernard Rachet
      Pages: 3605 - 3622
      Abstract: Statistical Methods in Medical Research, Volume 29, Issue 12, Page 3605-3622, December 2020.
      Despite a large choice of models, functional forms and types of effects, the selection of excess hazard models for prediction of population cancer survival is not widespread in the literature. We propose multi-model inference based on excess hazard model(s) selected using Akaike information criteria or Bayesian information criteria for prediction and projection of cancer survival. We evaluate the properties of this approach using empirical data of patients diagnosed with breast, colon or lung cancer in 1990–2011. We artificially censor the data on 31 December 2010 and predict five-year survival for the 2010 and 2011 cohorts. We compare these predictions to the observed five-year cohort estimates of cancer survival and contrast them to predictions from an a priori selected simple model, and from the period approach. We illustrate the approach by replicating it for cohorts of patients for which stage at diagnosis and other important prognosis factors are available. We find that model-averaged predictions and projections of survival have close to minimal differences with the Pohar-Perme estimation of survival in many instances, particularly in subgroups of the population. Advantages of information-criterion based model selection include (i) transparent model-building strategy, (ii) accounting for model selection uncertainty, (iii) no a priori assumption for effects, and (iv) projections for patients outside of the sample.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-06T07:18:20Z
      DOI: 10.1177/0962280220934501
      Issue No: Vol. 29, No. 12 (2020)
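
The abstract above describes model-averaged predictions based on information criteria but does not state the weighting scheme. As a hedged illustration, the sketch below assumes the standard Akaike weights, w_i proportional to exp(-0.5 * (AIC_i - AIC_min)). The function names and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into normalized model weights (assumed scheme)."""
    best = min(aic_values)
    # Subtracting the best AIC first keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_average(predictions, aic_values):
    """Weighted average of per-model predictions using Akaike weights."""
    weights = akaike_weights(aic_values)
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    # Hypothetical five-year survival predictions from three candidate models.
    print(model_average([0.62, 0.58, 0.65], [1002.3, 1004.1, 1010.7]))
```

Replacing AIC with BIC values in the same formula gives the analogous BIC-based weights.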
  • Class imbalance in gradient boosting classification algorithms: Application to experimental stroke data
    • Authors: Olga Lyashevska, Fiona Malone, Eugene MacCarthy, Jens Fiehler, Jan-Hendrik Buhk, Liam Morris
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Imbalance between positive and negative outcomes, a so-called class imbalance, is a problem commonly found in medical data. Imbalanced data hinder the performance of conventional classification methods, which aim to improve the overall accuracy of the model without accounting for the uneven distribution of the classes. To rectify this, the data can be resampled by oversampling the positive (minority) class until the classes are approximately equally represented. After that, a prediction model such as a gradient boosting algorithm can be fitted with greater confidence. This classification method allows for non-linear relationships and deep interactive effects while focusing on difficult areas by iteratively shifting towards problematic observations. In this study, we demonstrate the application of these methods to medical data and develop a practical framework for evaluation of features contributing to the probability of stroke.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-28T08:12:12Z
      DOI: 10.1177/0962280220980484
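
The resampling step described in the abstract above can be sketched as follows. The abstract does not say which oversampling method the authors use, so this example assumes plain random oversampling with replacement; `oversample_minority` and its parameters are hypothetical names. In practice, resampling is usually applied to the training split only, so duplicated records do not leak into the evaluation data.

```python
import random


def oversample_minority(rows, labels, minority_label=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Assumes simple random oversampling with replacement (an illustrative choice).
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority:
        raise ValueError("no minority-class rows to resample")
    n_extra = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    # Keep every original row and append the resampled minority rows.
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


if __name__ == "__main__":
    X = [[0.1], [0.4], [0.3], [0.8], [0.9]]
    y = [0, 0, 0, 0, 1]
    X_bal, y_bal = oversample_minority(X, y)
    print(y_bal.count(0), y_bal.count(1))
```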
  • A two-stage Generalized Method of Moments model for feedback with time-dependent covariates
    • Authors: Elsa Vazquez-Arreola
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Correlated observations in longitudinal studies are often due to repeated measures on the subjects. Additionally, correlation may be realized due to the association between responses at a particular time and the predictors at earlier times. There are also feedback effects (relation between responses in the present and the covariates at a later time), though these are not always relevant and are often ignored. All these cases of correlation must be accounted for as they can have different effects on the regression coefficients. Several authors have provided models that reflect the direct and delayed impact of covariates on the response, utilizing valid moment conditions to estimate the relevant regression coefficients. However, there are applications when one cannot ignore the effect of the responses on future covariates. A two-stage model to account for the feedback, modeling the direct as well as the delayed effects of the covariates on future responses and vice versa is presented. The use of the two-stage model is demonstrated by revisiting child morbidity and its impact on future values of body mass index using Philippines health data. Also, obesity status and its feedback effects on physical activity and depression levels using the Add Health dataset are analyzed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-28T03:53:56Z
      DOI: 10.1177/0962280220981402
  • A group sequential design and sample size estimation for an immunotherapy trial with a delayed treatment effect
    • Authors: Bosheng Li, Liwen Su, Jun Gao, Liyun Jiang, Fangrong Yan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      A delayed treatment effect is often observed in the confirmatory trials for immunotherapies and is reflected by a delayed separation of the survival curves of the immunotherapy groups versus the control groups. This phenomenon makes the design based on the log-rank test not applicable because this design would violate the proportional hazard assumption and cause loss of power. Thus, we propose a group sequential design allowing early termination on the basis of efficacy based on a more powerful piecewise weighted log-rank test for an immunotherapy trial with a delayed treatment effect. We present an approach on the group sequential monitoring, in which the information time is defined based on the number of events occurring after the delay time. Furthermore, we developed a one-dimensional search algorithm to determine the required maximum sample size for the proposed design, which uses an analytical estimation obtained by the inflation factor as an initial value and an empirical power function calculated by a simulation-based procedure as an objective function. In the simulation, we tested the unstable accuracy of the analytical estimation, the consistent accuracy of the maximum sample size determined by the search algorithm and the advantages of the proposed design on saving sample size.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-24T02:53:05Z
      DOI: 10.1177/0962280220980780
  • CWL: A conditional weighted likelihood method to account for the delayed joint toxicity–efficacy outcomes for phase I/II clinical trials
    • Authors: Yifei Zhang, Yong Zang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The delayed outcome issue is common in early phase dose-finding clinical trials. This problem becomes more intractable in phase I/II clinical trials because both toxicity and efficacy responses are subject to the delayed outcome issue. Existing methods for phase I trials cannot be used directly in phase I/II trials because they cannot model the joint toxicity–efficacy distribution. In this paper, we propose a conditional weighted likelihood (CWL) method to circumvent this issue. The key idea of the CWL method is to decompose the joint probability into the product of marginal and conditional probabilities and then weight each probability based on each patient’s actual follow-up time. The CWL method makes no parametric model assumption on either the dose–response curve or the toxicity–efficacy correlation and therefore can be applied to any existing phase I/II trial design. Numerical trial applications show that the proposed CWL method yields desirable operating characteristics.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-22T04:02:22Z
      DOI: 10.1177/0962280220979328
  • Inferring median survival differences in general factorial designs via permutation tests
    • Authors: Marc Ditzhaus, Dennis Dobler, Markus Pauly
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Factorial survival designs with right-censored observations are commonly analyzed by Cox regression and explained by means of hazard ratios. However, in the case of non-proportional hazards, their interpretation can become cumbersome, especially for clinicians. We therefore offer an alternative: median survival times are used to estimate treatment and interaction effects and null hypotheses are formulated in contrasts of their population versions. Permutation-based tests and confidence regions are proposed and shown to be asymptotically valid. Their type-1 error control and power behavior are investigated in extensive simulations, showing the new methods’ wide applicability. The latter is complemented by an illustrative data analysis.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-22T03:58:03Z
      DOI: 10.1177/0962280220980784
  • Two-phase analysis and study design for survival models with error-prone
    • Authors: Kyunghee Han, Thomas Lumley, Bryan E Shepherd, Pamela A Shaw
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Increasingly, medical research is dependent on data collected for non-research purposes, such as electronic health records data. Health records data and other large databases can be prone to measurement error in key exposures, and unadjusted analyses of error-prone data can bias study results. Validating a subset of records is a cost-effective way of gaining information on the error structure, which in turn can be used to adjust analyses for this error and improve inference. We extend the mean score method for the two-phase analysis of discrete-time survival models, which uses the unvalidated covariates as auxiliary variables that act as surrogates for the unobserved true exposures. This method relies on a two-phase sampling design and an estimation approach that preserves the consistency of complete case regression parameter estimates in the validated subset, with increased precision leveraged from the auxiliary data. Furthermore, we develop optimal sampling strategies which minimize the variance of the mean score estimator for a target exposure under a fixed cost constraint. We consider the setting where an internal pilot is necessary for the optimal design so that the phase two sample is split into a pilot and an adaptive optimal sample. Through simulations and a data example, we evaluate efficiency gains of the mean score estimator using the derived optimal validation design compared to balanced and simple random sampling for the phase two sample. We also empirically explore efficiency gains that the proposed discrete optimal design can provide for the Cox proportional hazards model in the setting of a continuous-time survival outcome.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-17T05:18:04Z
      DOI: 10.1177/0962280220978500
  • Probability intervals of toxicity and efficacy design for dose-finding clinical trials in oncology
    • Authors: Xiaolei Lin, Yuan Ji
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Immunotherapy, gene therapy or adoptive cell therapies, such as the chimeric antigen receptor+ T-cell therapies, have demonstrated promising therapeutic effects in oncology patients. We consider statistical designs for dose-finding adoptive cell therapy trials, in which the monotonic dose–response relationship assumed in traditional oncology trials may not hold. Building upon a previous design called “TEPI”, we propose a new dose-finding method, Probability Intervals of Toxicity and Efficacy (PRINTE), which utilizes toxicity and efficacy jointly in making dosing decisions, does not require a pre-elicited decision table and at the same time can handle Ockham’s razor properly in the statistical inference. We show that optimizing the joint posterior expected utility of toxicity and efficacy under a 0–1 loss is equivalent to maximizing the marginal model posterior probability in the two-dimensional probability space. An extensive simulation study under various scenarios is conducted, and the results show that PRINTE outperforms existing designs in the literature since it assigns more patients to optimal doses and fewer to toxic ones, and selects optimal doses with higher percentages. The simple and transparent features together with good operating characteristics make PRINTE an improved design for dose-finding trials in oncology.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-17T04:33:02Z
      DOI: 10.1177/0962280220977009
  • Bayesian variable selection in logistic regression with application to whole-brain functional connectivity analysis for Parkinson’s disease
    • Authors: Xuan Cao, Kyoungjae Lee, Qingling Huang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Parkinson’s disease is a progressive, chronic, and neurodegenerative disorder that is primarily diagnosed by clinical examinations and magnetic resonance imaging (MRI). In this paper, we propose a Bayesian model to predict Parkinson’s disease employing a functional MRI (fMRI) based radiomics approach. We consider a spike and slab prior for variable selection in high-dimensional logistic regression models, and present an approximate Gibbs sampler by replacing a logistic distribution with a t-distribution. Under mild conditions, we establish model selection consistency of the induced posterior and illustrate through simulation studies that the proposed method outperforms existing state-of-the-art methods. In fMRI analysis, 6216 whole-brain functional connectivity features are extracted for 50 healthy controls along with 70 Parkinson’s disease patients. We apply our method to the resulting dataset and further show its benefits with a higher average prediction accuracy of 0.83 compared to other contenders based on 10 random splits. The model fitting procedure also reveals the most discriminative brain regions for Parkinson’s disease. These findings demonstrate that the proposed Bayesian variable selection method has the potential to support radiological diagnosis for patients with Parkinson’s disease.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-14T03:12:52Z
      DOI: 10.1177/0962280220978990
  • Concordance probability as a meaningful contrast across disparate survival
    • Authors: Sean M Devlin, Glenn Heller
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The performance of time-to-event models is frequently assessed in part by estimating the concordance probability, which evaluates the probabilistic pairwise ordering of the model-based risk scores and survival times. The standard definition of this probability conditions on any survival time pair ordering, irrespective of whether the times are meaningfully separated. Inclusion of survival times that would be deemed clinically similar attenuates the concordance and moves the estimate away from the contrast-of-interest: comparing the risk scores between individuals with disparate survival times. In this manuscript, we propose a concordance definition and corresponding method to estimate the probability conditional on survival times being separated by at least a minimum difference. The proposed estimate requires direct input from the analyst to identify a separable survival region and, in doing so, is analogous to the clinically defined subgroups used for binary outcome area under the curve estimates. The method is illustrated in two cancer examples: a prognostic score in clear cell renal cell carcinoma and two biomarkers in metastatic prostate cancer.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-10T04:32:55Z
      DOI: 10.1177/0962280220973694
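The idea of conditioning on separated survival times can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes every survival time is fully observed (no censoring), and the function name `separated_concordance` and the threshold argument `delta` are hypothetical names chosen for this example.

```python
from itertools import combinations


def separated_concordance(risk, time, delta=0.0):
    """Share of usable pairs in which the higher risk score has the shorter time.

    A pair is usable only if its two survival times are unequal and differ
    by at least `delta`. Tied risk scores count as half-concordant.
    Assumes every time is an observed event (no censoring).
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(time)), 2):
        gap = abs(time[i] - time[j])
        if gap == 0 or gap < delta:
            continue  # times tied or too close to be meaningfully separated
        usable += 1
        early, late = (i, j) if time[i] < time[j] else (j, i)
        if risk[early] > risk[late]:
            concordant += 1.0
        elif risk[early] == risk[late]:
            concordant += 0.5
    if usable == 0:
        raise ValueError("no pair of survival times is separated by delta")
    return concordant / usable


# Toy data: subjects 0 and 1 have nearly identical survival times.
risk = [2.0, 3.0, 1.0]
time = [1.0, 1.1, 3.0]
print(separated_concordance(risk, time, delta=0.0))  # all three pairs are used
print(separated_concordance(risk, time, delta=1.0))  # the close pair is dropped
```

In this toy data the only discordant pair is the one whose times differ by 0.1, so requiring a minimum separation removes the attenuation that the abstract describes.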
  • Efficient and flexible simulation-based sample size determination for
           clinical trials with multiple design parameters
    • Authors: Duncan T Wilson, Richard Hooper, Julia Brown, Amanda J Farrin, Rebecca EA Walwyn
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Simulation offers a simple and flexible way to estimate the power of a clinical trial when analytic formulae are not available. The computational burden of using simulation has, however, restricted its application to only the simplest of sample size determination problems, often minimising a single parameter (the overall sample size) subject to power being above a target level. We describe a general framework for solving simulation-based sample size determination problems with several design parameters over which to optimise and several conflicting criteria to be minimised. The method is based on an established global optimisation algorithm widely used in the design and analysis of computer experiments, using a non-parametric regression model as an approximation of the true underlying power function. The method is flexible, can be used for almost any problem for which power can be estimated using simulation, and can be implemented using existing statistical software packages. We illustrate its application to a sample size determination problem involving complex clustering structures, two primary endpoints and small sample considerations.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-03T06:37:06Z
      DOI: 10.1177/0962280220975790
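As a point of reference, the simplest form of simulation-based sample size determination (a single design parameter and a single power target) can be sketched as follows. This is a naive grid search, not the surrogate-model optimisation the abstract describes. The two-arm normal outcome, the known-variance z-test, and the names `estimate_power` and `smallest_n` are all assumptions made for this example.

```python
import random
from statistics import NormalDist, fmean


def estimate_power(n_per_arm, effect, sd=1.0, alpha=0.05, n_sims=1000, seed=0):
    """Monte Carlo power of a two-sided z-test comparing two arm means."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * (2.0 / n_per_arm) ** 0.5  # standard error of the mean difference
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


def smallest_n(candidates, effect, target=0.8, **kwargs):
    """Return the first candidate sample size whose estimated power meets the target."""
    for n in sorted(candidates):
        if estimate_power(n, effect, **kwargs) >= target:
            return n
    return None  # no candidate reached the target power


print(smallest_n([20, 50, 100], effect=0.5, target=0.8))
```

Each power evaluation costs `n_sims` simulated trials, which is why an exhaustive search of this kind becomes impractical once several design parameters and criteria are involved.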
  • Statistical design considerations for trials that study multiple
    • Authors: Alexander M Kaizer, Joseph S Koopmeiners, Nan Chen, Brian P Hobbs
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Breakthroughs in cancer biology have defined new research programs emphasizing the development of therapies that target specific pathways in tumor cells. Innovations in clinical trial design have followed with master protocols defined by inclusive eligibility criteria and evaluations of multiple therapies and/or histologies. Consequently, characterization of subpopulation heterogeneity has become central to the formulation and selection of a study design. However, this transition to master protocols has led to challenges in identifying the optimal trial design and proper calibration of hyperparameters. We often evaluate a range of null and alternative scenarios; however, there has been little guidance on how to synthesize the potentially disparate recommendations for what may be optimal. This may lead to the selection of suboptimal designs and statistical methods that do not fully accommodate the subpopulation heterogeneity. This article proposes novel optimization criteria for calibrating and evaluating candidate statistical designs of master protocols in the presence of the potential for treatment effect heterogeneity among enrolled patient subpopulations. The framework is applied to demonstrate the statistical properties of conventional study designs when treatments offer heterogeneous benefit as well as identify optimal designs devised to monitor the potential for heterogeneity among patients with differing clinical indications using Bayesian modeling.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-03T06:33:27Z
      DOI: 10.1177/0962280220975187
  • Joint analysis of multivariate interval-censored survival data and a
           time-dependent covariate
    • Authors: Di Wu, Chenxi Li
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We develop a joint modeling method for multivariate interval-censored survival data and a time-dependent covariate that is intermittently measured with error. The joint model is estimated using nonparametric maximum likelihood estimation, which is carried out via an expectation–maximization algorithm, and the inference for finite-dimensional parameters is performed using the bootstrap. We also develop a similar joint modeling method for univariate interval-censored survival data and a time-dependent covariate, which surpasses existing methods in terms of model flexibility and interpretation. Simulation studies show that the model fitting and inference approaches perform very well under realistic sample sizes. We apply the method to a longitudinal study of dental caries in African-American children from low-income families in the city of Detroit, Michigan.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:55:48Z
      DOI: 10.1177/0962280220975064
  • Unbiasedness and efficiency of non-parametric and UMVUE estimators of the
           probabilistic index and related statistics
    • Authors: Johan Verbeeck, Vaiva Deltuvaite-Thomas, Ben Berckmoes, Tomasz Burzykowski, Marc Aerts, Olivier Thas, Marc Buyse, Geert Molenberghs
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In reliability theory, diagnostic accuracy, and clinical trials, the quantity [math], also known as the Probabilistic Index (PI), is a common treatment effect measure when comparing two groups of observations. The quantity [math], a linear transformation of PI known as the net benefit, has also been advocated as an intuitively appealing treatment effect measure. Parametric estimation of PI has received a lot of attention in the past 40 years, with the formulation of the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) for many distributions. However, the non-parametric Mann–Whitney estimator of the PI is also known to be UMVUE in some situations. To understand this seeming contradiction, in this paper a systematic comparison is performed between the non-parametric estimator for the PI and parametric UMVUE estimators in various settings. We show that the Mann–Whitney estimator is always an unbiased estimator of the PI with univariate, completely observed data, while the parametric UMVUE is not when the distribution is misspecified. Additionally, the Mann–Whitney estimator is the UMVUE when observations belong to an unrestricted family. When observations come from a more restrictive family of distributions, the loss in efficiency for the non-parametric estimator is limited in realistic clinical scenarios. In conclusion, the Mann–Whitney estimator is simple to use and is a reliable estimator for the PI and net benefit in realistic clinical scenarios.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:52:11Z
      DOI: 10.1177/0962280220966629
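A minimal sketch of the non-parametric estimator for completely observed univariate data follows. It assumes the common definition of the probabilistic index as P(X < Y) + 0.5 P(X = Y) and the net benefit as 2 PI - 1; the abstract's own formulas are not reproduced in this listing, so both definitions and the function names are assumptions.

```python
def probabilistic_index(x, y):
    """Mann-Whitney estimate of P(X < Y) + 0.5 * P(X = Y) over all (x, y) pairs."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                wins += 1.0
            elif xi == yj:
                wins += 0.5  # ties are split evenly
    return wins / (len(x) * len(y))


def net_benefit(x, y):
    """Linear transformation of the probabilistic index onto the [-1, 1] scale."""
    return 2.0 * probabilistic_index(x, y) - 1.0


control = [1, 2, 3]
treated = [2, 3, 4]
print(probabilistic_index(control, treated))
print(net_benefit(control, treated))
```

Because the estimate is an average over all pairs, it uses no distributional assumption, which is the property the abstract contrasts with parametric UMVUE estimators.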
  • Monte Carlo approaches to frequentist multiplicity-adjusted benefiting
           subgroup identification
    • Authors: Patrick M Schnell
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      One common goal of subgroup analyses is to determine the subgroup of the population for which a given treatment is effective. Like most problems in subgroup analyses, this benefiting subgroup identification requires careful attention to multiple testing considerations, especially Type I error inflation. To partially address these concerns, the credible subgroups approach provides a pair of bounding subgroups for the benefiting subgroup, constructed so that with high posterior probability one is contained by the benefiting subgroup while the other contains the benefiting subgroup. To date, this approach has been presented within the Bayesian paradigm only, and requires sampling from the posterior of a Bayesian model. Additionally, in many cases, such as regulatory submission, guarantees of frequentist operating characteristics are helpful or necessary. We present Monte Carlo approaches to constructing confidence subgroups, frequentist analogues to credible subgroups that replace the posterior distribution with an estimate of the joint distribution of personalized treatment effect estimates, and yield frequentist interpretations and coverage guarantees. The estimated joint distribution is produced using either draws from asymptotic sampling distributions of estimated model parameters, or bootstrap resampling schemes. The approach is applied to a publicly available dataset from randomized trials of Alzheimer’s disease treatments.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:48:29Z
      DOI: 10.1177/0962280220973705
  • Bayesian mixture cure rate frailty models with an application to gastric
           cancer data
    • Authors: Ali Karamoozian, Mohammad Reza Baneshi, Abbas Bahrampour
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Mixture cure rate models are commonly used to analyze lifetime data with long-term survivors. Frailty models, in turn, improve the estimation of coefficients by accounting for heterogeneity in survival data. The gamma distribution is the most common choice for the frailty random variable. However, for survival data from populations with a cure rate, a discrete distribution for the frailty random variable may be preferable to a continuous one. We therefore propose two models in this study. In the first model, a continuous gamma distribution is used for the frailty random variable, and in the second model, a discrete hyper-Poisson distribution is used. Bayesian inference was carried out with the Weibull distribution and the generalized modified Weibull distribution as the baseline distribution in the two proposed models, respectively. We used data from patients with gastric cancer to show the application of these models in real data analysis. The parameters and regression coefficients were estimated using the Metropolis with Gibbs sampling algorithm, a standard Markov chain Monte Carlo technique. A simulation study was also used to evaluate the performance of the Bayesian estimates and to confirm the proposed models. Based on the results of the Bayesian inference, the model with generalized modified Weibull and hyper-Poisson distributions is suitable for practical use and fits better than the model with Weibull and gamma distributions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-27T05:53:48Z
      DOI: 10.1177/0962280220974699
  • Bayesian adaptive decision-theoretic designs for multi-arm multi-stage
           clinical trials
    • Authors: Andrea Bassi, Johannes Berkhof, Daphne de Jong, Peter M van de Ven
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Multi-arm multi-stage clinical trials in which more than two drugs are simultaneously investigated provide gains over separate single- or two-arm trials. In this paper we propose a generic Bayesian adaptive decision-theoretic design for multi-arm multi-stage clinical trials with K ([math]) arms. The basic idea is that after each stage a decision about continuation of the trial and accrual of patients for an additional stage is made on the basis of the expected reduction in loss. For this purpose, we define a loss function that incorporates the patient accrual costs as well as costs associated with an incorrect decision at the end of the trial. An attractive feature of our loss function is that its estimation is computationally undemanding, also when K > 2. We evaluate the frequentist operating characteristics for settings with a binary outcome and multiple experimental arms. We consider both the situation with and without a control arm. In a simulation study, we show that our design increases the probability of making a correct decision at the end of the trial as compared to nonadaptive designs and adaptive two-stage designs.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-27T05:48:27Z
      DOI: 10.1177/0962280220973697
  • Employing a latent variable framework to improve efficiency in composite
           endpoint analysis
    • Authors: Martina McMenamin, Jessica K Barrett, Anna Berglind, James MS Wason
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Composite endpoints that combine multiple outcomes on different scales are common in clinical trials, particularly in chronic conditions. In many of these cases, patients will have to cross a predefined responder threshold in each of the outcomes to be classed as a responder overall. One instance of this occurs in systemic lupus erythematosus, where the responder endpoint combines two continuous, one ordinal and one binary measure. The overall binary responder endpoint is typically analysed using logistic regression, resulting in a substantial loss of information. We propose a latent variable model for the systemic lupus erythematosus endpoint, which assumes that the discrete outcomes are manifestations of latent continuous measures and can proceed to jointly model the components of the composite. We perform a simulation study and find that the method offers large efficiency gains over the standard analysis, the magnitude of which is highly dependent on the components driving response. Bias is introduced when joint normality assumptions are not satisfied, which we correct for using a bootstrap procedure. The method is applied to the Phase IIb MUSE trial in patients with moderate to severe systemic lupus erythematosus. We show that it estimates the treatment effect 2.5 times more precisely, offering a 60% reduction in required sample size.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-25T01:57:32Z
      DOI: 10.1177/0962280220970986
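The two figures in the last sentence of the abstract are consistent under a simple reading, which is an assumption here rather than something the abstract states: "precision" is taken to mean inverse variance, and the variance of the treatment effect estimate is taken to scale as 1/n. A 2.5-fold gain in precision then corresponds to the following ratio of required sample sizes:

```latex
\[
\frac{n_{\text{latent}}}{n_{\text{standard}}} = \frac{1}{2.5} = 0.4,
\qquad
1 - 0.4 = 0.6 \quad (\text{a } 60\% \text{ reduction}).
\]
```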
  • Small sample sizes: A big data problem in high-dimensional data analysis
    • Authors: Frank Konietschke, Karima Schwab, Markus Pauly
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In many experiments and especially in translational and preclinical research, sample sizes are (very) small. In addition, data designs are often high dimensional, i.e. more dependent than independent replications of the trial are observed. The present paper discusses the applicability of max t-test-type statistics (multiple contrast tests) in high-dimensional designs (repeated measures or multivariate) with small sample sizes. A randomization-based approach is developed to approximate the distribution of the maximum statistic. Extensive simulation studies confirm that the new method is particularly suitable for analyzing data sets with small sample sizes. A real data set illustrates the application of the methods.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-24T09:53:35Z
      DOI: 10.1177/0962280220970228
  • Unifying instrumental variable and inverse probability weighting
           approaches for inference of causal treatment effect and unmeasured
           confounding in observational studies
    • Authors: Tao Liu, Joseph W Hogan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Confounding is a major concern when using data from observational studies to infer the causal effect of a treatment. Instrumental variables, when available, have been used to construct bound estimates on population average treatment effects when outcomes are binary and unmeasured confounding exists. With continuous outcomes, meaningful bounds are more challenging to obtain because the domain of the outcome is unrestricted. In this paper, we propose to unify the instrumental variable and inverse probability weighting methods, together with suitable assumptions in the context of an observational study, to construct meaningful bounds on causal treatment effects. The contextual assumptions are imposed in terms of the potential outcomes that are partially identified by data. The inverse probability weighting component incorporates a sensitivity parameter to encode the effect of unmeasured confounding. The instrumental variable and inverse probability weighting methods are unified using the principal stratification. By solving the resulting system of estimating equations, we are able to quantify both the causal treatment effect and the sensitivity parameter (i.e. the degree of the unmeasured confounding). We demonstrate our method by analyzing data from the HIV Epidemiology Research Study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-20T07:08:09Z
      DOI: 10.1177/0962280220971835
  • Functional clustering methods for longitudinal data with application to
           electronic health records
    • Authors: Bret Zeldow, James Flory, Alisa Stephens-Shields, Marsha Raebel, Jason A Roy
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We develop a method to estimate subject-level trajectory functions from longitudinal data. The approach can be used for patient phenotyping, feature extraction, or, as in our motivating example, outcome identification, which refers to the process of identifying disease status through patient laboratory tests rather than through diagnosis codes or prescription information. We model the joint distribution of a continuous longitudinal outcome and baseline covariates using an enriched Dirichlet process prior. This joint model decomposes into (local) semiparametric linear mixed models for the outcome given the covariates and simple (local) marginals for the covariates. The nonparametric enriched Dirichlet process prior is placed on the regression and spline coefficients, the error variance, and the parameters governing the predictor space. This leads to clustering of patients based on their outcomes and covariates. We predict the outcome at unobserved time points for subjects with data at other time points as well as for new subjects with only baseline covariates. We find improved prediction over mixed models with Dirichlet process priors when there are a large number of covariates. Our method is demonstrated with electronic health records consisting of initiators of second-generation antipsychotic medications, which are known to increase the risk of diabetes. We use our model to predict laboratory values indicative of diabetes for each individual and assess incidence of suspected diabetes from the predicted dataset.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-12T04:23:58Z
      DOI: 10.1177/0962280220965630
  • Selecting the number of categories of the lymph node ratio in cancer
           research: A bootstrap-based hypothesis test
    • Authors: Irantzu Barrio, Javier Roca-Pardiñas, Inmaculada Arostegui
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The high impact of the lymph node ratio as a prognostic factor is widely established in colorectal cancer, and is being used as a categorized predictor variable in several studies. However, the cut-off points as well as the number of categories considered differ considerably in the literature. Motivated by the need to obtain the best categorization of the lymph node ratio as a predictor of mortality in colorectal cancer patients, we propose a method to select the best number of categories for a continuous variable in a logistic regression framework. To this end, we propose a bootstrap-based hypothesis test, together with a new estimation algorithm for the optimal location of the cut-off points called BackAddFor, which is an updated version of the previously proposed AddFor algorithm. The performance of the hypothesis test was evaluated by means of a simulation study, under different scenarios, yielding type I errors close to the nominal errors and good power values whenever a meaningful difference in terms of prediction ability existed. Finally, the methodology proposed was applied to the CCR-CARESS study where the lymph node ratio was included as a predictor of five-year mortality, resulting in the selection of three categories.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-10T03:19:39Z
      DOI: 10.1177/0962280220965631
  • Flexible derivative time-varying model in matched case-crossover studies
           for a small number of geographical locations among the participants
    • Authors: Ana M Ortega-Villa, Inyoung Kim
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In matched case-crossover studies, any stratum effect is removed by conditioning on the fixed number of case–control sets in the stratum, and hence, the conditional logistic regression model is not able to detect any effects associated with matching covariates. However, some matching covariates such as time and location often modify the effect of covariates, making the estimations obtained by conditional logistic regression incorrect. Therefore, in this paper, we propose a flexible derivative time-varying coefficient model to evaluate effect modification by time and location, in order to make correct statistical inference, when the number of locations is small. Our proposed model is developed under the Bayesian hierarchical model framework and allows us to simultaneously detect relationships between the predictor and binary outcome and between the predictor and time. Inference is proposed based on the derivative function of the estimated function to determine whether there is an effect modification due to time and/or location, for a small number of locations among the participants. We demonstrate the accuracy of the estimation using a simulation study and an epidemiological example of a 1–4 bidirectional case-crossover study of childhood aseptic meningitis with drinking water turbidity.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-04T02:42:26Z
      DOI: 10.1177/0962280220968178
  • Random changepoint segmented regression with smooth transition
    • Authors: Julio M Singer, Francisco MM Rocha, Antonio Carlos Pedroso-de-Lima, Giovani L Silva, Giuliana C Coatti, Mayana Zatz
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We consider random changepoint segmented regression models to analyse data from a study conducted to verify whether treatment with stem cells may delay the onset of a symptom of amyotrophic lateral sclerosis in genetically modified mice. The proposed models capture the biological aspects of the data, accommodating a smooth transition between the periods with and without symptoms. An additional changepoint is considered to avoid negative predicted responses. Given the nonlinear nature of the model, we propose an algorithm to estimate the fixed parameters and to predict the random effects by fitting linear mixed models iteratively via standard software. We compare the variances obtained in the final step with bootstrapped and robust ones.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-04T02:36:06Z
      DOI: 10.1177/0962280220964953
  • Development of a mixture model allowing for smoothing functions of
           longitudinal trajectories
    • Authors: Ming Ding, Jorge E. Chavarro, Garrett M. Fitzmaurice
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In the health and social sciences, two types of mixture models have been widely used by researchers to identify participants within a population with heterogeneous longitudinal trajectories: latent class growth analysis and the growth mixture model. Both methods parametrically model trajectories of individuals, and capture latent trajectory classes, using an expectation–maximization algorithm. However, parametric modeling of trajectories using polynomial functions or monotonic spline functions results in limited flexibility for modeling trajectories; as a result, group membership may not be classified accurately due to model misspecification. In this paper, we propose a smoothing mixture model allowing for smoothing functions of trajectories using a modified algorithm in the M step. Specifically, participants are reassigned to only one group for which the estimated trajectory is the most similar to the observed one; trajectories are fitted using generalized additive mixed models with smoothing functions of time within each of the resulting subsamples. The smoothing mixture model is straightforward to implement using the recently released “gamm4” package (version 0.2–6) in R 3.5.0. It can incorporate time-varying covariates and be applied to longitudinal data with any exponential family distribution, e.g., normal, Bernoulli, and Poisson. Simulation results show favorable performance of the smoothing mixture model, when compared to latent class growth analysis and growth mixture model, in recovering highly flexible trajectories. The proposed method is illustrated by its application to body mass index data on individuals followed from adolescence to young adulthood and its relationship with incidence of cardiometabolic disease.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-27T08:47:34Z
      DOI: 10.1177/0962280220966019
  • Inference about age-standardized rates with sampling errors in the
    • Authors: Jiming Jiang, Eric J Feuer, Yuanyuan Li, Thuan Nguyen, Mandi Yu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Cancer incidence and mortality are typically presented as age-standardized rates. Inference about these rates becomes complicated when denominators involve sampling errors. We propose a bias-corrected rate estimator as well as its corresponding variance estimator that take into account sampling errors in the denominators. Confidence intervals are derived based on the proposed estimators as well. Performance of the proposed methods is evaluated empirically based on simulation studies. More importantly, the advantage of the proposed method is demonstrated and verified in a real-life study of cancer mortality disparity. A web-based, user-friendly computational tool is also being developed at the National Cancer Institute to accompany the new methods, with the first application being the calculation of cancer mortality rates by US-born and foreign-born status. Finally, the promise of the proposed estimators to account for errors introduced by differential privacy procedures to the 2020 decennial census products is discussed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-16T03:43:57Z
      DOI: 10.1177/0962280220962516
  • Reference range: Which statistical intervals to use?
    • Authors: Wei Liu, Frank Bretz, Mario Cortina-Borja
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Reference ranges, which are data-based intervals aiming to contain a pre-specified large proportion of the population values, are powerful tools to analyse observations in clinical laboratories. Their main point is to classify any future observations from the population which fall outside them as atypical and thus may warrant further investigation. As a reference range is constructed from a random sample from the population, the event ‘a reference range contains [math] of the population’ is also random. Hence, all we can hope for is that such event has a large occurrence probability. In this paper we argue that some intervals, including the P prediction interval, are not suitable as reference ranges since there is a substantial probability that these intervals contain less than [math] of the population, especially when the sample size is large. In contrast, a [math] tolerance interval is designed to contain [math] of the population with a pre-specified large confidence γ so it is eminently adequate as a reference range. An example based on real data illustrates the paper’s key points.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-15T03:27:06Z
      DOI: 10.1177/0962280220961793
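The randomness of the event "the interval contains the stated proportion of the population" can be explored by simulation. The sketch below is not taken from the paper. It assumes a standard normal population, intervals of the form mean ± k × standard deviation, and a large-sample approximation to the prediction-interval factor that uses a normal quantile in place of a t quantile. The function name `prob_undercoverage` is hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev


def prob_undercoverage(n, k, content=0.95, n_sims=2000, seed=0):
    """Monte Carlo probability that mean +/- k*sd covers less than `content`
    of a standard normal population, for samples of size n."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    below = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        centre, spread = fmean(sample), stdev(sample)
        # True population proportion captured by this particular interval.
        covered = std_normal.cdf(centre + k * spread) - std_normal.cdf(centre - k * spread)
        if covered < content:
            below += 1
    return below / n_sims


n = 200
z = NormalDist().inv_cdf(0.975)
k_pred = z * (1 + 1 / n) ** 0.5  # approximate 95% prediction-interval factor
print(prob_undercoverage(n, k_pred))        # prediction-type interval
print(prob_undercoverage(n, 1.2 * k_pred))  # a deliberately wider interval
```

A tolerance interval chooses k so that this undercoverage probability is at most 1 - γ; computing that factor exactly requires a chi-square quantile, which this standard-library sketch leaves out.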
  • Joint analysis of recurrence and termination: A Bayesian latent class
    • Authors: Zhixing Xu, Debajyoti Sinha, Jonathan R Bradley
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      As in many other clinical and economic studies, each subject in our motivating transplant study is at risk of recurrent events of non-fatal tissue rejections as well as the terminating event of death due to total graft rejection. For such studies, our model and associated Bayesian analysis aim for some practical advantages over competing methods. Our semiparametric latent-class-based joint model has a coherent interpretation of the covariate (including race and gender) effects on all functions and model quantities that are relevant for understanding the effects of covariates on future event trajectories. Our fully Bayesian method for estimation and prediction uses a complete specification of the prior process of the baseline functions. We also derive a practical and theoretically justifiable partial likelihood-based semiparametric Bayesian approach to deal with the analysis when there is a lack of prior information about baseline functions. Our model and method can accommodate fixed as well as time-varying covariates. Our Markov Chain Monte Carlo tools for both Bayesian methods are implementable via publicly available software. Our Bayesian analysis of the transplant study and our simulation study demonstrate practical advantages and improved performance of our approach.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-14T04:02:57Z
      DOI: 10.1177/0962280220962522
  • Sample size and sample composition for constructing growth reference
    • Authors: TJ Cole
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Growth reference centile charts are widely used in child health to assess weight, height and other age-varying measurements. The centiles are easy to construct from reference data, using the LMS method or GAMLSS (Generalised Additive Models for Location Scale and Shape). However, there is as yet no clear guidance on how to design such studies, and in particular how many reference data to collect, and this has led to study sizes varying widely. The paper aims to provide a theoretical framework for optimally designing growth reference studies based on cross-sectional data. Centiles for weight, height, body mass index and head circumference, in 6878 boys aged 0–21 years from the Fourth Dutch Growth Study, were fitted using GAMLSS. The effect on precision of varying the sample size and the distribution of measurement ages (sample composition) was explored by fitting a series of GAMLSS models to simulated data. Sample composition was defined as uniform on the age^λ scale, where λ was chosen to give constant precision across the age range. Precision was measured on the z-score scale, and was the same for all four measurements, with a standard error of 0.041 z-score units for the median and 0.066 for the 2nd and 98th centiles. Compared to a naïve calculation, the process of smoothing the centiles increased the notional sample size two- to threefold by ‘borrowing strength’. The sample composition for estimating the median curve was optimal for λ=0.4, reflecting considerable over-sampling of infants compared to children. However, for the 2nd and 98th centiles, λ=0.75 was optimal, with less infant over-sampling. The conclusion is that both sample size and sample composition need to be optimised. The paper provides practical advice on design, and concludes that optimally designed studies need 7000–25,000 subjects per sex.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-12T12:08:55Z
      DOI: 10.1177/0962280220958438
  • A weighting method for simultaneous adjustment for confounding and joint
           exposure-outcome misclassifications
    • Authors: Bas BL Penning de Vries, Maarten van Smeden, Rolf HH Groenwold
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Joint misclassification of exposure and outcome variables can lead to considerable bias in epidemiological studies of causal exposure-outcome effects. In this paper, we present a new maximum likelihood based estimator for marginal causal effects that simultaneously adjusts for confounding and several forms of joint misclassification of the exposure and outcome variables. The proposed method relies on validation data for the construction of weights that account for both sources of bias. The weighting estimator, which is an extension of the outcome misclassification weighting estimator proposed by Gravel and Platt (Weighted estimation for confounded binary outcomes subject to misclassification. Stat Med 2018; 37: 425–436), is applied to reinfarction data. Simulation studies were carried out to study its finite sample properties and compare it with methods that do not account for confounding or misclassification. The new estimator showed favourable large sample properties in the simulations. Further research is needed to study the sensitivity of the proposed method and that of alternatives to violations of their assumptions. The implementation of the estimator is facilitated by a new R function (ipwm) in an existing R package (mecor).
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-01T03:09:16Z
      DOI: 10.1177/0962280220960172
  • Pattern discovery of health curves using an ordered probit model with
           Bayesian smoothing and functional principal component analysis
    • Authors: Shijia Wang, Yunlong Nie, Jason M Sutherland, Liangliang Wang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      This article is motivated by the need for discovering patterns of patients’ health based on their daily settings of care to aid the health policy-makers to improve the effectiveness of distributing funding for health services. The hidden process of one’s health status is assumed to be a continuous smooth function, called the health curve, ranging from perfectly healthy to dead. The health curves are linked to the categorical setting of care using an ordered probit model and are inferred through Bayesian smoothing. The challenges include the nontrivial constraints on the lower bound of the health status (death) and on the model parameters to ensure model identifiability. We use the Markov chain Monte Carlo method to estimate the parameters and health curves. The functional principal component analysis is applied to the patients’ estimated health curves to discover common health patterns. The proposed method is demonstrated through an application to patients hospitalized from strokes in Ontario. Whilst this paper focuses on the method’s application to a health care problem, the proposed model and its implementation have the potential to be applied to many application domains in which the response variable is ordinal and there is a hidden process. Our implementation is available at https://github.com/liangliangwangsfu/healthCurveCode.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-25T05:55:13Z
      DOI: 10.1177/0962280220951834
  • Change point detection in Cox proportional hazards mixture cure model
    • Authors: Bing Wang, Jialiang Li, Xiaoguang Wang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The mixture cure model has been widely applied to survival data in which a fraction of the observations never experience the event of interest, despite long-term follow-up. In this paper, we study the Cox proportional hazards mixture cure model where the covariate effects on the distribution of uncured subjects’ failure time may jump when a covariate exceeds a change point. The nonparametric maximum likelihood estimation is used to obtain the semiparametric estimates. We employ a two-step computational procedure involving the Expectation-Maximization algorithm to implement the estimation. The consistency, convergence rate and asymptotic distributions of the estimators are carefully established under technical conditions and we show that the change point estimator is n-consistent. The m out of n bootstrap and the Louis algorithm are used to obtain the standard errors of the estimated change point and other regression parameter estimates, respectively. We also contribute a test procedure to check the existence of the change point. The finite sample performance of the proposed method is demonstrated via simulation studies and real data examples.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-24T05:37:55Z
      DOI: 10.1177/0962280220959118
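      The structure of a mixture cure model with a change point, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the exponential baseline hazard, the single covariate, the fixed cure probability and all parameter names and values are hypothetical, whereas the paper estimates the baseline nonparametrically.

```python
import math

def uncured_survival(t, x, z, beta, jump, change_point, baseline_rate=0.1):
    """Proportional hazards survival for uncured subjects.

    The effect of covariate x is beta, and it jumps by `jump` when the
    covariate z exceeds `change_point`. The constant baseline hazard is an
    assumption made purely for illustration.
    """
    effect = beta + (jump if z > change_point else 0.0)
    cumulative_baseline = baseline_rate * t
    return math.exp(-cumulative_baseline * math.exp(effect * x))

def population_survival(t, x, z, cure_prob, beta, jump, change_point):
    """Mixture of cured subjects (never fail) and uncured subjects."""
    s_uncured = uncured_survival(t, x, z, beta, jump, change_point)
    return cure_prob + (1.0 - cure_prob) * s_uncured

# Hypothetical parameter values.
params = dict(cure_prob=0.3, beta=0.5, jump=0.8, change_point=50.0)
below = population_survival(5.0, x=1.0, z=40.0, **params)
above = population_survival(5.0, x=1.0, z=60.0, **params)
print(below, above)
```

      The population survival curve levels off at the cure probability rather than at zero, which is the defining feature of a cure model.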
  • Comparison of small-sample standard-error corrections for generalised
           estimating equations in stepped wedge cluster randomised trials with a
           binary outcome: A simulation study
    • Authors: JA Thompson, K Hemming, A Forbes, K Fielding, R Hayes
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Generalised estimating equations with the sandwich standard-error estimator provide a promising method of analysis for stepped wedge cluster randomised trials. However, they have inflated type-one error when used with a small number of clusters, which is common for stepped wedge cluster randomised trials. We present a large simulation study of binary outcomes comparing bias-corrected standard errors from Fay and Graubard; Mancl and DeRouen; Kauermann and Carroll; Morel, Bokossa, and Neerchal; and Mackinnon and White with an independent and exchangeable working correlation matrix. We constructed 95% confidence intervals using a t-distribution with degrees of freedom including clusters minus parameters (DFC-P), cluster periods minus parameters, and estimators from Fay and Graubard (DFFG), and Pan and Wall. Fay and Graubard and an approximation to Kauermann and Carroll (with simpler matrix inversion) were unbiased in a wide range of scenarios with an independent working correlation matrix and more than 12 clusters. They gave confidence intervals with close to 95% coverage with DFFG with 12 or more clusters, and DFC-P with 18 or more clusters. Both standard errors were conservative with fewer clusters. With an exchangeable working correlation matrix, approximated Kauermann and Carroll and Fay and Graubard had a small degree of under-coverage.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-24T05:16:53Z
      DOI: 10.1177/0962280220958735
  • A proportional risk model for time-to-event analysis in randomized
           controlled trials
    • Authors: Oliver Kuss, Annika Hoyer
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Regression models for continuous, binary, nominal, and ordinal outcomes almost completely rely on parametric models, whereas time-to-event outcomes are mainly analyzed by Cox’s Proportional Hazards model, an essentially non-parametric method. This is done despite a long list of disadvantages that have been reported for the hazard ratio, and also for the odds ratio, another effect measure sometimes used for time-to-event modelling. In this paper, we propose a parametric proportional risk model for time-to-event outcomes in a two-group situation. Modelling explicitly a risk instead of a hazard or an odds solves the current interpretational and technical problems of the latter two effect measures. The model further allows for computing absolute effect measures like risk differences or numbers needed to treat. As an additional benefit, results from the model can also be communicated on the original time scale, as an accelerated or a prolonged failure time, thus facilitating interpretation for a non-technical audience. Parameter estimation by maximum likelihood, while properly accounting for censoring, is straightforward and can be implemented in any statistical package that allows coding and maximizing a univariate likelihood function. We illustrate the model with an example from a randomized controlled trial on efficacy of a new glucose-lowering drug for the treatment of type 2 diabetes mellitus and give the results of a small simulation study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-22T04:22:24Z
      DOI: 10.1177/0962280220953599
  • Efficient two-stage sequential arrays of proof of concept studies for
           pharmaceutical portfolios
    • Authors: Linchen He, Linqiu Du, Zoran Antonijevic, Martin Posch, Valeriy R Korostyshevskiy, Robert A Beckman
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Previous work has shown that individual randomized “proof-of-concept” (PoC) studies may be designed to maximize cost-effectiveness, subject to an overall PoC budget constraint. Maximizing cost-effectiveness has also been considered for arrays of simultaneously executed PoC studies. Defining Type III error as the opportunity cost of not performing a PoC study, we evaluate the common pharmaceutical practice of allocating PoC study funds in two stages. Stage 1, or the first wave of PoC studies, screens drugs to identify those to be permitted additional PoC studies in Stage 2. We investigate if this strategy significantly improves efficiency, despite slowing development. We quantify the benefit, cost, benefit-cost ratio, and Type III error given the number of Stage 1 PoC studies. Relative to a single stage PoC strategy, significant cost-effective gains are seen when at least one of the drugs has a low probability of success (10%) and especially when there are either few drugs (2) with a large number of indications allowed per drug (10) or a large portfolio of drugs (4). In these cases, the recommended number of Stage 1 PoC studies ranges from 2 to 4, tracking approximately with an inflection point in the minimization curve of Type III error.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-21T02:58:19Z
      DOI: 10.1177/0962280220958177
  • Extending the I-squared statistic to describe treatment effect
           heterogeneity in cluster, multi-centre randomized trials and individual
           patient data meta-analysis
    • Authors: Karla Hemming, James P Hughes, Joanne E McKenzie, Andrew B Forbes
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Treatment effect heterogeneity is commonly investigated in meta-analyses to identify if treatment effects vary across studies. When conducting an aggregate level data meta-analysis it is common to describe the magnitude of any treatment effect heterogeneity using the I-squared statistic, which is an intuitive and easily understood concept. The effect of a treatment might also vary across clusters in a cluster randomized trial, or across centres in a multi-centre randomized trial, and it can be of interest to explore this at the analysis stage. In cross-over trials and other randomized designs, in which clusters or centres are exposed to both treatment and control conditions, this treatment effect heterogeneity can be identified. Here we derive and evaluate a comparable I-squared measure to describe the magnitude of heterogeneity in treatment effects across clusters or centres in randomized trials. We further show how this methodology can be used to estimate treatment effect heterogeneity in an individual patient data meta-analysis.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-21T02:05:10Z
      DOI: 10.1177/0962280220948550
  • Analysing body composition as compositional data: An exploration of the
           relationship between body composition, body mass and bone strength
    • Authors: D Dumuid, JA Martín-Fernández, S Ellul, RS Kenett, M Wake, P Simm, L Baur, T Olds
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Human body composition is made up of mutually exclusive and exhaustive parts (e.g. %truncal fat, %non-truncal fat and %fat-free mass) which are constrained to sum to the same total (100%). In statistical analyses, individual parts of body composition (e.g. %truncal fat or %fat-free mass) have traditionally been used as proxies for body composition, and have been linked with a range of health outcomes. But analysis of individual parts omits information about the other parts, which are intrinsically co-dependent because of the constant sum constraint of 100%. Further, body mass may be associated with health outcomes. We describe a statistical approach for body composition based on compositional data analysis. The body composition data are expressed as logratios to allow relative information about all the compositional parts to be explored simultaneously in relation to health outcomes. We describe a recent extension to the logratio approach to compositional data analysis which allows absolute information about the total of the compositional parts (body mass) to be considered alongside relative information about body composition. The statistical approach is illustrated by an example that explores the relationships between adults’ body composition, body mass and bone strength.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-17T12:21:33Z
      DOI: 10.1177/0962280220955221
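      The logratio idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say which logratio coordinates the authors use; the centred logratio (clr) is assumed here purely as an example, and the three-part composition is hypothetical.

```python
import math

def clr(parts):
    """Centred logratio transform: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical composition: %truncal fat, %non-truncal fat, %fat-free mass (sums to 100).
composition = [20.0, 15.0, 65.0]
coordinates = clr(composition)
print(coordinates)
```

      Because only ratios between parts enter the transform, the result is the same whether the parts are given as percentages or as proportions, so it carries relative information only. Absolute information such as total body mass has to be supplied separately, which is the extension the abstract refers to.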
  • Optimal two-stage sampling for mean estimation in multilevel populations
           when cluster size is informative
    • Authors: Francesco Innocenti, Math JJM Candel, Frans ES Tan, Gerard JP van Breukelen
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability proportional to size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-17T12:18:33Z
      DOI: 10.1177/0962280220952833
  • A Bayesian hierarchical change point model with parameter constraints
    • Authors: Hong Li, Andreana Benitez, Brian Neelon
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Alzheimer’s disease is the leading cause of dementia among adults aged 65 or above. Alzheimer’s disease is characterized by a change point signaling a sudden and prolonged acceleration in cognitive decline. The timing of this change point is of clinical interest because it can be used to establish optimal treatment regimens and schedules. Here, we present a Bayesian hierarchical change point model with a parameter constraint to characterize the rate and timing of cognitive decline among Alzheimer’s disease patients. We allow each patient to have a unique random intercept, random slope before the change point, random change point time, and random slope after the change point. The difference in slope before and after a change point is constrained to be nonpositive, and its parameter space is partitioned into a null region (representing normal aging) and a rejection region (representing accelerated decline). Using the change point time, the estimated slope difference, and the threshold of the null region, we are able to (1) distinguish normal aging patients from those with accelerated cognitive decline, (2) characterize the rate and timing for patients experiencing cognitive decline, and (3) predict personalized risk of progression to dementia due to Alzheimer’s disease. We apply the approach to data from the Religious Orders Study, a national cohort study of aging Catholic nuns, priests, and lay brothers.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-14T03:04:34Z
      DOI: 10.1177/0962280220948097
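      The patient-level trajectory described in the abstract above (intercept, slope before the change point, change point time, and a nonpositive slope difference) can be sketched as a piecewise-linear mean function. The broken-stick form, the function name and the numeric values are assumptions for illustration; the abstract does not give the exact parameterisation.

```python
def expected_score(t, intercept, slope_before, slope_diff, change_time):
    """Piecewise-linear expected cognitive score at time t.

    slope_diff is the slope after the change point minus the slope before it,
    and is constrained to be nonpositive (no acceleration of improvement).
    """
    if slope_diff > 0:
        raise ValueError("slope_diff must be nonpositive")
    return intercept + slope_before * t + slope_diff * max(t - change_time, 0.0)

# Hypothetical patient: slow decline until year 5, faster decline afterwards.
before = expected_score(2.0, intercept=30.0, slope_before=-0.5, slope_diff=-1.5, change_time=5.0)
after = expected_score(7.0, intercept=30.0, slope_before=-0.5, slope_diff=-1.5, change_time=5.0)
print(before, after)
```

      A slope difference of zero reduces the function to a single straight line, which corresponds to the case with no acceleration in decline.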
  • Efficient orthogonal functional magnetic resonance imaging designs in the
           presence of drift
    • Authors: Rakhi Singh, John Stufken
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      To study brain activity, by measuring changes associated with the blood flow in the brain, functional magnetic resonance imaging techniques are employed. The design problem in event-related functional magnetic resonance imaging studies is to find the best sequence of stimuli to be shown to subjects for precise estimation of the brain activity. Previous analytical studies concerning optimal functional magnetic resonance imaging designs often assume a simplified model with independent errors over time. Optimal designs under this model are called g-lag orthogonal designs. Recently, it has been observed that g-lag orthogonal designs also perform well under simplified models with auto-regressive error structures. However, these models do not include drift. We investigate the performance of g-lag orthogonal designs for models that incorporate drift parameters. Identifying g-lag orthogonal designs that perform best in the presence of a drift is important because a drift is typically assumed for the analysis of event-related functional magnetic resonance imaging data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:36:09Z
      DOI: 10.1177/0962280220953870
  • Understanding between-cluster variation in prevalence and limits for how
           much variation is plausible
    • Authors: Mark D Chatfield, Daniel M Farewell
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In clinical trials and observational studies of clustered binary data, understanding between-cluster variation is essential: in sample size and power calculations of cluster randomised trials, for example, the intra-cluster correlation coefficient is often specified. However, quantifications of between-cluster variation can be unintuitive, and an intra-cluster correlation coefficient as low as 0.04 may correspond to surprisingly large between-cluster differences. We suggest that understanding is improved through visualising the implied distribution of true cluster prevalences – possibly by assuming they follow a beta distribution – or by calculating their standard deviation, which is more readily interpretable than the intra-cluster correlation coefficient. Even so, the bounded nature of binary data complicates the interpretation of variances as primary measures of uncertainty, and entropy offers an attractive alternative. Appealing to maximum entropy theory, we propose the following rule of thumb: that plausible intra-cluster correlation coefficients and standard deviations of true cluster prevalences are both bounded above by the overall prevalence, its complement, and one third. We also provide corresponding bounds for the coefficient of variation, and for a different standard deviation and intra-cluster correlation defined on the log odds scale. Using previously published data, we observe the quantities defined on the log odds scale to be more transportable between studies with different outcomes with different prevalences than the intra-cluster correlation and coefficient of variation. The latter increase and decrease, respectively, as prevalence increases from 0% to 50%, and the same is true for our bounds. Our work will help clinical trialists better understand between-cluster variation and avoid specifying implausibly high values for the intra-cluster correlation in sample size and power calculations.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:33:51Z
      DOI: 10.1177/0962280220951831
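      The quantities discussed in the abstract above can be computed with a short sketch. It uses the standard relationship for clustered binary data, in which the variance of the true cluster prevalences equals the intra-cluster correlation coefficient times prevalence × (1 − prevalence), together with the beta distribution and the rule of thumb mentioned in the abstract. The function names and the 10% overall prevalence are assumptions chosen for illustration; the ICC of 0.04 is the value quoted in the abstract.

```python
import math

def cluster_prevalence_sd(prevalence, icc):
    """Standard deviation of the true cluster prevalences implied by an ICC."""
    return math.sqrt(icc * prevalence * (1.0 - prevalence))

def beta_parameters(prevalence, icc):
    """Beta(a, b) parameters with the given mean and implied ICC."""
    total = (1.0 - icc) / icc  # a + b
    return prevalence * total, (1.0 - prevalence) * total

def rule_of_thumb_bound(prevalence):
    """Upper bound from the abstract: overall prevalence, its complement, and one third."""
    return min(prevalence, 1.0 - prevalence, 1.0 / 3.0)

prevalence, icc = 0.10, 0.04  # the 10% prevalence is a hypothetical example
sd = cluster_prevalence_sd(prevalence, icc)
a, b = beta_parameters(prevalence, icc)
print(sd, a, b, rule_of_thumb_bound(prevalence))
```

      With these inputs the implied standard deviation is 0.06, so true cluster prevalences roughly two standard deviations above the mean would exceed 20%, which illustrates how a seemingly small ICC can correspond to large between-cluster differences.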
  • Estimating model-based nonnegative population marginal means in
           application to medical expenditures covered by different health care
           policies – A study on Medical Expenditure Panel Survey
    • Authors: Mingmei Tian, Jihnhee Yu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The medical care expenditure is historically an important public health issue, which greatly impacts the government’s health policies as well as patients’ financial and medical decisions. In population health research, we commonly discretize a numeric attribute to a few ordinal groups to examine population characteristics. Oftentimes, the population marginal mean estimation by the ANOVA approach is inflexible since it uses pre-defined grouping of the covariate. In this paper, we propose a method to estimate the population marginal mean using the B-spline-based regression in the manner of a generalized additive model as an alternative to the ANOVA. Since the medical expenditure is always nonnegative, a Bayesian approach is also implemented for the nonnegative constraint on the marginal mean estimates. The proposed method is flexible to estimate marginal means for user-specified grouping after model fitting in a post-hoc manner, a clear advantage over the ANOVA approach. We show that this method is inferentially superior to the ANOVA through theoretical investigations and an extensive Monte Carlo study. The real data analysis using Medical Expenditure Panel Survey data assisted by some visualization tools demonstrates the applicability of the proposed approach and leads us to some interesting observations that may be relevant to public health discussions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:31:50Z
      DOI: 10.1177/0962280220954241
  • A tail-based test to detect differential expression in RNA-sequencing data
    • Authors: Jiong Chen, Xinlei Mi, Jing Ning, Xuming He, Jianhua Hu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavy-tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-01T05:32:34Z
      DOI: 10.1177/0962280220951907
  • Risk factor identification in cystic fibrosis by flexible hierarchical
           joint models
    • Authors: Weiji Su, Xia Wang, Rhonda D Szczesniak
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Cystic fibrosis (CF) is a lethal autosomal disease hallmarked by respiratory failure. Maintaining lung function and minimizing frequency of acute respiratory events known as pulmonary exacerbations are essential to survival. Jointly modeling longitudinal lung function and exacerbation occurrences may provide better inference. We propose a shared-parameter joint hierarchical Gaussian process model with flexible link function to investigate the impacts of both demographic and time-varying clinical risk factors on lung function decline and to examine the associations between lung function and occurrence of pulmonary exacerbation. A two-level Gaussian process is used to capture the nonlinear longitudinal trajectory, and a flexible link function is introduced to the joint model in order to analyze the binary process. Bayesian model assessment criteria are provided for examining the overall performance of joint models and the marginal fit of each submodel. We conduct simulation studies and apply the proposed model in a local CF center cohort. In the CF application, a nonlinear structure is supported in modeling both the longitudinal continuous and binary processes. A negative association is detected between lung function and pulmonary exacerbation by the joint model. The importance of risk factors, including gender, diagnostic status, insurance status, and BMI, is examined in joint models.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-26T05:21:45Z
      DOI: 10.1177/0962280220950369
  • Efficient algorithms for covariate analysis with dynamic data using
           nonlinear mixed-effects model
    • Authors: Min Yuan, Zhi Zhu, Yaning Yang, Minghua Zhao, Kate Sasser, Hisham Hamadeh, Jose Pinheiro, Xu Steven Xu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Nonlinear mixed-effects modeling is one of the most popular tools for analyzing repeated measurement data, particularly for applications in the biomedical fields. Multiple integration and nonlinear optimization are the two major challenges for likelihood-based methods in nonlinear mixed-effects modeling. To solve these problems, approaches based on empirical Bayesian estimates have been proposed by breaking the problem into a nonlinear mixed-effects model with no covariates and a linear regression model without random effect. This approach is time-efficient as it involves no covariates in the nonlinear optimization. However, covariate effects based on empirical Bayesian estimates are underestimated and the bias depends on the extent of shrinkage. A marginal correction method has been proposed to correct the bias caused by shrinkage to some extent. However, the marginal approach appears to be suboptimal when testing covariate effects on multiple model parameters, a situation that is often encountered in real-world data analysis. In addition, the marginal approach cannot correct the inaccuracy in the associated p-values. In this paper, we proposed a simultaneous correction method (nSCEBE), which can handle the situation where covariate analysis is performed on multiple model parameters. Simulation studies and real data analysis showed that nSCEBE is accurate and efficient for both effect-size estimation and p-value calculation compared with the existing methods. Importantly, nSCEBE can be >2000 times faster than the standard mixed-effects models, potentially allowing utilization for high-dimension covariate analysis for longitudinal or repeated measured outcomes.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-25T04:57:06Z
      DOI: 10.1177/0962280220949898
  • Model-robust designs for nonlinear quantile regression
    • Authors: Selvakkadunko Selvaratnam, Linglong Kong, Douglas P Wiens
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We construct robust designs for nonlinear quantile regression, in the presence of both a possibly misspecified nonlinear quantile function and heteroscedasticity of an unknown form. The asymptotic mean-squared error of the quantile estimate is evaluated and maximized over a neighbourhood of the fitted quantile regression model. This maximum depends on the scale function and on the design. We entertain two methods to find designs that minimize the maximum loss. The first is local – we minimize for given values of the parameters and the scale function, using a sequential approach, whereby each new design point minimizes the subsequent loss, given the current design. The second is adaptive – at each stage, the maximized loss is evaluated at quantile estimates of the parameters, and a kernel estimate of scale, and then the next design point is obtained as in the sequential method. In the context of a Michaelis–Menten response model for an estrogen/hormone study, and a variety of scale functions, we demonstrate that the adaptive approach performs as well, in large study sizes, as if the parameter values and scale function were known beforehand and the sequential method applied. When the sequential method uses an incorrectly specified scale function, the adaptive method yields an, often substantial, improvement. The performance of the adaptive designs for smaller study sizes is assessed and seen to still be very favourable, especially so since the prior information required to design sequentially is rarely available.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-19T11:06:30Z
      DOI: 10.1177/0962280220948159
  • Dynamic predictions of kidney graft survival in the presence of
           longitudinal outliers
    • Authors: Özgür Asar, Marie-Cécile Fournier, Etienne Dantan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In kidney transplantation, dynamic predictions of graft survival may be obtained from joint modelling of longitudinal and survival data for which a common assumption is that random-effects and error terms in the longitudinal sub-model are Gaussian. However, this assumption may be too restrictive, e.g. in the presence of outliers, and more flexible distributions would be required. In this study, we relax the Gaussian assumption by defining a robust joint modelling framework with t-distributed random-effects and error terms to obtain dynamic predictions of graft survival for kidney transplant patients. We take a Bayesian paradigm for inference and dynamic predictions and sample from the joint posterior densities. While previous research reported improved performances of robust joint models compared to the Gaussian version in terms of parameter estimation, dynamic prediction accuracy obtained from such an approach has not yet been evaluated. Our results based on a training sample from the French DIVAT kidney transplantation cohort illustrate that estimates for the slope parameters in the longitudinal and survival sub-models are sensitive to the distributional assumptions. From both an internal validation sample from the DIVAT cohort and an external validation sample from the Lille (France) and Leuven (Belgium) transplantation centers, calibration and discrimination performances appeared to be better under the robust joint models compared to the Gaussian version, illustrating the need to accommodate outliers in the dynamic prediction context. Simulation results support the findings of the validation studies.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-13T03:37:38Z
      DOI: 10.1177/0962280220945352
  • A pseudo-likelihood approach for multivariate meta-analysis of test
           accuracy studies with multiple thresholds
    • Authors: Annamaria Guolo, Duc-Khanh To
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Multivariate meta-analysis of test accuracy studies when tests are evaluated in terms of sensitivity and specificity at more than one threshold represents an effective way to synthesize results by fully exploiting the data, if compared to univariate meta-analyses performed at each threshold independently. The approximation of logit transformations of sensitivities and specificities at different thresholds through a normal multivariate random-effects model is a recent proposal that straightforwardly extends the bivariate models well recommended for the one threshold case. However, drawbacks of the approach, such as poor estimation of the within-study correlations between sensitivities and between specificities, and severe computational issues can make it unappealing. We propose an alternative method for inference on common diagnostic measures using a pseudo-likelihood constructed under a working independence assumption between sensitivities and between specificities at different thresholds in the same study. The method does not require within-study correlations, overcomes the convergence issues and can be effortlessly implemented. Simulation studies highlight a satisfactory performance of the method, remarkably improving the results from the multivariate normal counterpart under different scenarios. The pseudo-likelihood approach is illustrated in the evaluation of a test used for diagnosis of preeclampsia as a cause of maternal and perinatal morbidity and mortality.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-13T03:37:09Z
      DOI: 10.1177/0962280220948085
  • Random forests for high-dimensional longitudinal data
    • Authors: Louis Capitaine, Robin Genuer, Rodolphe Thiébaut
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Random forests are one of the state-of-the-art supervised machine learning methods and achieve good performance in high-dimensional settings where p, the number of predictors, is much larger than n, the number of observations. Repeated measurements provide, in general, additional information, hence they are worth accounting for, especially when analyzing high-dimensional data. Tree-based methods have already been adapted to clustered and longitudinal data by using a semi-parametric mixed effects model, in which the non-parametric part is estimated using regression trees or random forests. We propose a general approach of random forests for high-dimensional longitudinal data. It includes a flexible stochastic model which allows the covariance structure to vary over time. Furthermore, we introduce a new method which takes intra-individual covariance into consideration to build random forests. Through simulation experiments, we then study the behavior of different estimation methods, especially in the context of high-dimensional data. Finally, the proposed method has been applied to an HIV vaccine trial including 17 HIV-infected patients with 10 repeated measurements of 20,000 gene transcripts and blood concentration of human immunodeficiency virus RNA. The approach selected 21 gene transcripts for which the association with HIV viral load was fully relevant and consistent with results observed during primary infection.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-10T03:40:07Z
      DOI: 10.1177/0962280220946080
  • A variance shrinkage method improves arm-based Bayesian network
    • Authors: Zhenxun Wang, Lifeng Lin, James S Hodges, Richard MacLehose, Haitao Chu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Network meta-analysis is a commonly used tool to combine direct and indirect evidence in systematic reviews of multiple treatments to improve estimation compared to traditional pairwise meta-analysis. Unlike the contrast-based network meta-analysis approach, which focuses on estimating relative effects such as odds ratios, the arm-based network meta-analysis approach can estimate absolute risks and other effects, which are arguably more informative in medicine and public health. However, the number of clinical studies involving each treatment is often small in a network meta-analysis, leading to unstable treatment-specific variance estimates in the arm-based network meta-analysis approach when using non- or weakly informative priors under an unequal variance assumption. Additional assumptions, such as equal (i.e. homogeneous) variances for all treatments, may be used to remedy this problem, but such assumptions may be inappropriately strong. This article introduces a variance shrinkage method for an arm-based network meta-analysis. Specifically, we assume different treatment variances share a common prior with unknown hyperparameters. This assumption is weaker than the homogeneous variance assumption and improves estimation by shrinking the variances in a data-dependent way. We illustrate the advantages of the variance shrinkage method by reanalyzing a network meta-analysis of organized inpatient care interventions for stroke. Finally, comprehensive simulations investigate the impact of different variance assumptions on statistical inference, and simulation results show that the variance shrinkage method provides better estimation for log odds ratios and absolute risks.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-06T02:42:03Z
      DOI: 10.1177/0962280220945731
  • Variable selection for ultra-high dimensional quantile regression with
           missing data and measurement error
    • Authors: Yongxin Bai, Maozai Tian, Man-Lai Tang, Wing-Yan Lee
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In this paper, we consider variable selection for ultra-high dimensional quantile regression model with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach and remove the bias caused by missing data using the inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With the proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimate enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure with a breast cancer data set.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-04T04:01:04Z
      DOI: 10.1177/0962280220941533
  • Bayesian quantile nonhomogeneous hidden Markov models
    • Authors: Hefei Liu, Xinyuan Song, Yanlin Tang, Baoxue Zhang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Hidden Markov models are useful in simultaneously analyzing a longitudinal observation process and its dynamic transition. Existing hidden Markov models focus on mean regression for the longitudinal response. However, the tails of the response distribution are as important as the center in many substantive studies. We propose a quantile hidden Markov model to provide a systematic method to examine the entire conditional distribution of the response given the hidden state and potential covariates. Instead of considering homogeneous hidden Markov models, which assume that the probabilities of between-state transitions are independent of subject- and time-specific characteristics, we allow the transition probabilities to depend on exogenous covariates, thereby yielding nonhomogeneous Markov chains and making the proposed model more flexible than its homogeneous counterpart. We develop a Bayesian approach coupled with efficient Markov chain Monte Carlo methods for statistical inference. Simulations are conducted to assess the empirical performance of the proposed method. The proposed methodology is applied to a cocaine use study to provide new insights into the prevention of cocaine use.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:44:02Z
      DOI: 10.1177/0962280220942802
  • Adjusted win ratio with stratification: Calculation methods and
    • Authors: Samvel B Gasparyan, Folke Folkvaljon, Olof Bengtsson, Joan Buenconsejo, Gary G Koch
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The win ratio is a general method of comparing locations of distributions of two independent, ordinal random variables, and it can be estimated without distributional assumptions. In this paper we provide a unified theory of win ratio estimation in the presence of stratification and adjustment by a numeric variable. Building step by step on the estimate of the crude win ratio we compare corresponding tests with well known non-parametric tests of group difference (Wilcoxon rank-sum test, Fligner–Policello test, van Elteren test, test based on the regression on ranks, and the rank analysis of covariance test). We show that the win ratio gives an interpretable treatment effect measure with corresponding test to detect treatment effect difference under minimal assumptions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:43:40Z
      DOI: 10.1177/0962280220942558
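      As a rough illustration of the crude (unstratified, unadjusted) win ratio mentioned in the abstract above, the sketch below compares every treatment value with every control value and divides the number of wins by the number of losses. It assumes that larger scores are better and that ties are simply discarded; the paper's own estimator may treat ties differently. The function name and the sample scores are hypothetical and not taken from the article.

```python
from itertools import product


def crude_win_ratio(treatment, control):
    """Crude win ratio for two independent samples of ordinal scores.

    Assumptions (not from the article): larger values are better, and
    tied pairs count as neither a win nor a loss.
    """
    wins = losses = 0
    # Compare every treatment observation with every control observation.
    for t, c in product(treatment, control):
        if t > c:
            wins += 1
        elif t < c:
            losses += 1
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses


# Hypothetical ordinal scores, for illustration only.
treatment_scores = [3, 2, 3, 1]
control_scores = [1, 2, 2, 1]
ratio = crude_win_ratio(treatment_scores, control_scores)
print(ratio)
```

      A value above 1 indicates that treatment "wins" more pairwise comparisons than it loses; no distributional assumption is needed to compute it.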
  • Functional survival forests for multivariate longitudinal outcomes:
           Dynamic prediction of Alzheimer’s disease progression
    • Authors: Jeffrey Lin, Kan Li, Sheng Luo
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The random survival forest (RSF) is a non-parametric alternative to the Cox proportional hazards model in modeling time-to-event data. In this article, we developed a modeling framework to incorporate multivariate longitudinal data in the model building process to enhance the predictive performance of RSF. To extract the essential features of the multivariate longitudinal outcomes, two methods were adopted and compared: multivariate functional principal component analysis and multivariate fast covariance estimation for sparse functional data. These resulting features, which capture the trajectories of the multiple longitudinal outcomes, are then included as time-independent predictors in the subsequent RSF model. This non-parametric modeling framework, denoted as functional survival forests, is better at capturing the various trends in both the longitudinal outcomes and the survival model which may be difficult to model using only parametric approaches. These advantages are demonstrated through simulations and applications to the Alzheimer’s Disease Neuroimaging Initiative.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:43:32Z
      DOI: 10.1177/0962280220941532
  • Issues and solutions in biomarker evaluation when subclasses are involved
           under binary classification
    • Authors: Yingdong Feng, Lili Tian
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In practice, it is common to evaluate biomarkers in binary classification settings (e.g. non-cancer vs. cancer) where one or both main classes involve multiple subclasses. For example, non-cancer class might consist of healthy subjects and benign cases, while cancer class might consist of subjects at early and late stages. The standard practice is pooling within each main class, i.e. all non-cancer subclasses are pooled together to create a control group, and all cancer subclasses are pooled together to create a case group. Based on the pooled data, the area under ROC curve (AUC) and other characteristics are estimated under binary classification for the purpose of biomarker evaluation. Despite the popularity of this pooling strategy in practice, its validity and implication in biomarker evaluation have never been carefully inspected. This paper aims to demonstrate that pooling strategy can be seriously misleading in biomarker evaluation. Furthermore, we present a new diagnostic framework as well as new accuracy measures appropriate for biomarker evaluation under such settings. In the end, an ovarian cancer data set is analyzed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:30:55Z
      DOI: 10.1177/0962280220938077
  • Mixed-effects models for the design and analysis of stepped wedge cluster
           randomized trials: An overview
    • Authors: Fan Li, James P Hughes, Karla Hemming, Monica Taljaard, Edward R. Melnick, Patrick J Heagerty
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The stepped wedge cluster randomized design has received increasing attention in pragmatic clinical trials and implementation science research. The key feature of the design is the unidirectional crossover of clusters from the control to intervention conditions on a staggered schedule, which induces confounding of the intervention effect by time. The stepped wedge design first appeared in the Gambia hepatitis study in the 1980s. However, the statistical model used for the design and analysis was not formally introduced until 2007 in an article by Hussey and Hughes. Since then, a variety of mixed-effects model extensions have been proposed for the design and analysis of these trials. In this article, we explore these extensions under a unified perspective. We provide a general model representation and regard various model extensions as alternative ways to characterize the secular trend, intervention effect, as well as sources of heterogeneity. We review the key model ingredients and clarify their implications for the design and analysis. The article serves as an entry point to the evolving statistical literatures on stepped wedge designs.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-07T04:51:24Z
      DOI: 10.1177/0962280220932962
  • Transformation based on likelihood ratio
    • Authors: Jianping Yang, Pei-Fen Kuan, Jialiang Li
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We respond here to a recent letter in this journal on the transformation based on likelihood ratio.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-05-26T04:43:00Z
      DOI: 10.1177/0962280220925509
  • Bayesian and influence function-based empirical likelihoods for inference
           of sensitivity in diagnostic tests
    • Authors: Yan Hai, Xiaoyi Min, Gengsheng Qin
      First page: 3457
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In medical diagnostic studies, a diagnostic test can be evaluated based on its sensitivity under a desired specificity. Existing methods for inference on sensitivity include normal approximation-based approaches and empirical likelihood (EL)-based approaches. These methods generally have poor performance when the specificity is high, and some require choosing smoothing parameters. We propose a new influence function-based empirical likelihood method and Bayesian empirical likelihood methods to overcome such problems. Numerical studies are performed to compare the finite sample performance of the proposed approaches with existing methods. The proposed methods are shown to perform better in terms of both coverage probability and interval length. A real data set from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) is analyzed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-06-18T06:15:54Z
      DOI: 10.1177/0962280220929042
  • Criteria for evaluating risk prediction of multiple outcomes
    • Authors: Frank Dudbridge
      First page: 3492
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Risk prediction models have been developed in many contexts to classify individuals according to a single outcome, such as risk of a disease. Emerging “-omic” biomarkers provide panels of features that can simultaneously predict multiple outcomes from a single biological sample, creating issues of multiplicity reminiscent of exploratory hypothesis testing. Here I propose definitions of some basic criteria for evaluating prediction models of multiple outcomes. I define calibration in the multivariate setting and then distinguish between outcome-wise and individual-wise prediction, and within the latter between joint and panel-wise prediction. I give examples such as screening and early detection in which different senses of prediction may be more appropriate. In each case I propose definitions of sensitivity, specificity, concordance, positive and negative predictive value and relative utility. I link the definitions through a multivariate probit model, showing that the accuracy of a multivariate prediction model can be summarised by its covariance with a liability vector. I illustrate the concepts on a biomarker panel for early detection of eight cancers, and on polygenic risk scores for six common diseases.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-06-29T08:22:19Z
      DOI: 10.1177/0962280220929039
  • Robustness of risk-based allocation of resources for disease prevention
    • Authors: Mitchell H Gail, David Pee
      First page: 3511
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Risk models for disease incidence can be useful for allocating resources for disease prevention if risk assessment is not too expensive. Assume there is a preventive intervention that should be given to everyone, but preventive resources are limited. We optimize risk-based prevention strategies and investigate robustness to modeling assumptions. The optimal strategy defines the proportion of the population to be given risk assessment and who should be offered intervention. The optimal strategy depends on the ratio of available resources to resources needed to intervene on everyone, and on the ratio of the costs of risk assessment to intervention. Risk assessment is not recommended if it is too expensive. Preventive efficiency decreases with decreasing compliance to risk assessment or intervention. Risk measurement error has little effect, nor does misspecification of the risk distribution. Ignoring population substructure has small effects on optimal prevention strategy but can lead to modest over- or under-spending. We give conditions under which ignoring population substructure has no effect on optimal strategy. Thus, a simple one-population model offers robust guidance on prevention strategy but requires data on available resources, costs of risk assessment and intervention, population risk distribution, and probabilities of acceptance of risk assessment and intervention.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-06-18T06:18:14Z
      DOI: 10.1177/0962280220930055
  • Group sequential monitoring based on the maximum of weighted log-rank
           statistics with the Fleming–Harrington class of weights in oncology
           clinical trials
    • Authors: Thomas J Prior
      First page: 3525
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Clinical trials in oncology often involve the statistical analysis of time-to-event data such as progression-free survival or overall survival to determine the benefit of a treatment or therapy. The log-rank test is commonly used to compare time-to-event data from two groups. The log-rank test is especially powerful when the two groups have proportional hazards. However, survival curves encountered in oncology studies that differ from one another do not always differ by having proportional hazards; in such instances, the log-rank test loses power, and the survival curves are said to have “non-proportional hazards”. This non-proportional hazards situation occurs for immunotherapies in oncology; immunotherapies often have a delayed treatment effect when compared to chemotherapy or radiation therapy. To correctly identify and deliver efficacious treatments to patients, it is important in oncology studies to have available a statistical test that can detect the difference in survival curves even in a non-proportional hazards situation such as one caused by delayed treatment effect. An attempt to address this need was the “max-combo” test, which was originally described only for a single analysis timepoint; this article generalizes that test to preserve type I error when there are one or more interim analyses, enabling efficacious treatments to be identified and made available to patients more rapidly.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-06-11T04:47:50Z
      DOI: 10.1177/0962280220931560
  • Bootstrap inference for multiple imputation under uncongeniality and
    • Authors: Jonathan W Bartlett, Rachael A Hughes
      First page: 3533
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin’s simple combination rules. These give frequentist valid inferences when the imputation and analysis procedures are so-called congenial and the embedding model is correctly specified, but otherwise may not. Roughly speaking, congeniality corresponds to whether the imputation and analysis models make different assumptions about the data. In practice, imputation models and analysis procedures are often not congenial, such that tests may not have the correct size, and confidence interval coverage deviates from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain bootstrap followed by imputation methods do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-01T06:34:33Z
      DOI: 10.1177/0962280220932189
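      For readers unfamiliar with the combination rules referred to in the abstract above, the following is a minimal sketch of Rubin's rules for a scalar parameter: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the numbers are hypothetical; the bootstrap procedures studied in the article are not implemented here.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    Returns (pooled_estimate, total_variance) for a scalar parameter.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance of the estimates (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 imputed data sets, for illustration only.
pooled, total = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

      As the abstract notes, the variance returned by these rules is only guaranteed to give valid frequentist inference under congeniality and correct specification, which is the motivation for the bootstrap alternatives the authors examine.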
  • Homogeneity testing for binomial proportions under stratified
           double-sampling scheme with two fallible classifiers
    • Authors: Shi-Fang Qiu, Qi-Xiang Fu
      First page: 3547
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      This article investigates the homogeneity testing problem of binomial proportions for stratified partially validated data obtained by a double-sampling method with two fallible classifiers. Several test procedures, including the weighted-least-squares test with/without log-transformation, logit-transformation and double log-transformation, and likelihood ratio test and score test, are developed to test the homogeneity under two models, distinguished by the conditional independence assumption of the two classifiers. Simulation results show that the score test performs better than other tests in the sense that the empirical size is generally controlled around the nominal level, and hence it is recommended for practical applications. Other tests also perform well when both binomial proportions and sample sizes are not small. Approximate sample sizes based on the score test, likelihood ratio test and the weighted-least-squares test with double log-transformation are generally accurate in terms of the empirical power and type I error rate with the estimated sample sizes, and hence are recommended. The proposed methodologies are illustrated with an example from a malaria study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-09T04:59:04Z
      DOI: 10.1177/0962280220932601
  • Nonparametric hyperrectangular tolerance and prediction regions for
           setting multivariate reference regions in laboratory medicine
    • Authors: Derek S Young, Thomas Mathew
      First page: 3569
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Reference regions are widely used in clinical chemistry and laboratory medicine to interpret the results of biochemical or physiological tests of patients. There are well-established methods in the literature for reference limits for univariate measurements; however, limited methods are available for the construction of multivariate reference regions, since traditional multivariate statistical regions (e.g. confidence, prediction, and tolerance regions) are not constructed based on a hyperrectangular geometry. The present work addresses this problem by developing multivariate hyperrectangular nonparametric tolerance regions for setting the reference regions. The approach utilizes statistical data depth to determine which points to trim and then the extremes of the trimmed dataset are used as the faces of the hyperrectangular region. Also presented is a strategy for determining the number of points to trim based on previously established asymptotic results. An extensive coverage study shows the favorable performance of the proposed procedure for moderate to large sample sizes. The procedure is applied to obtain reference regions for addressing two important clinical problems: (1) assessing kidney function in adolescents and (2) characterizing insulin-like growth factor concentrations in the serum of adults.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-06-29T08:14:19Z
      DOI: 10.1177/0962280220933910
  • Inference for variance components in linear mixed-effect models with
           flexible random effect and error distributions
    • Authors: Tom Chen, Rui Wang
      First page: 3586
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In many biomedical investigations, parameters of interest, such as the intraclass correlation coefficient, are functions of higher-order moments reflecting finer distributional characteristics. One popular method to make inference for such parameters is through postulating a parametric random effects model. We relax the standard normality assumptions for both the random effects and errors through the use of the Fleishman distribution, a flexible four-parameter distribution which accounts for the third and fourth cumulants. We propose a Fleishman bootstrap method to construct confidence intervals for correlated data and develop a normality test for the random effect and error distributions. Recognizing that the intraclass correlation coefficient may be heavily influenced by a few extreme observations, we propose a modified, quantile-normalized intraclass correlation coefficient. We evaluate our methods in simulation studies and apply these methods to the Childhood Adenotonsillectomy Trial sleep electroencephalogram data in quantifying wave-frequency correlation among different channels.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-16T03:41:33Z
      DOI: 10.1177/0962280220933909
  • Propensity score specification for optimal estimation of average treatment
           effect with binary response
    • Authors: John A Craycroft, Jiapeng Huang, Maiying Kong
      First page: 3623
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Propensity score methods are commonly used in statistical analyses of observational data to reduce the impact of confounding bias in estimations of average treatment effect. While the propensity score is defined as the conditional probability of a subject being in the treatment group given that subject’s covariates, the most precise estimation of average treatment effect results from specifying the propensity score as a function of true confounders and predictors only. This property has been demonstrated via simulation in multiple prior research articles. However, we have seen no theoretical explanation as to why this should be so. This paper provides that theoretical proof. Furthermore, this paper presents a method for performing the necessary variable selection by means of elastic net regression, and then estimating the propensity scores so as to obtain optimal estimates of average treatment effect. The proposed method is compared against two other recently introduced methods, outcome-adaptive lasso and covariate balancing propensity score. Extensive simulation analyses are employed to determine the circumstances under which each method appears most effective. We applied the proposed methods to examine the effect of pre-cardiac surgery coagulation indicator on mortality based on a linked dataset from a retrospective review of 1390 patient medical records at Jewish Hospital (Louisville, KY) with the Society of Thoracic Surgeons database.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-09T05:05:04Z
      DOI: 10.1177/0962280220934847
  • A working likelihood approach for robust regression
    • Authors: Liya Fu, You-Gan Wang, Fengjing Cai
      First page: 3641
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      A robust approach is often desirable in the presence of outliers for more efficient parameter estimation. However, the choice of the regularization parameter value impacts the efficiency of the parameter estimators. To maximize the estimation efficiency, we construct a likelihood function for simultaneously estimating the regression parameters and the tuning parameter. The “working” likelihood function is deemed as a vehicle for efficient regression parameter estimation, because we do not assume the data are generated from this likelihood function. The proposed method can effectively find a value of the regularization parameter based on the extent of contamination in the data. We carry out extensive simulation studies in a variety of cases to investigate the performance of the proposed method. The simulation results show that the efficiency can be enhanced as much as 40% when the data follow a heavy-tailed distribution, and reaches as high as 468% for the heteroscedastic variance cases compared to the traditional Huber’s method with a fixed regularization parameter. For illustration, we also analyzed two datasets: one from a diabetics study and the other from a mortality study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-14T11:14:59Z
      DOI: 10.1177/0962280220936310
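      To make the comparison baseline in the abstract above concrete, the sketch below shows the classical Huber loss with a fixed tuning constant c (what the abstract calls the regularization or tuning parameter). The default c = 1.345 is a widely used conventional choice, not a value taken from the article, and the working-likelihood method that estimates this constant from the data is not implemented here.

```python
def huber_loss(residual, c=1.345):
    """Huber loss for a single (scaled) residual with fixed tuning constant c.

    Quadratic for |residual| <= c, linear beyond c, so large residuals
    (potential outliers) are penalised less than under squared error.
    """
    if c <= 0:
        raise ValueError("tuning constant c must be positive")
    a = abs(residual)
    if a <= c:
        return 0.5 * residual ** 2
    return c * (a - 0.5 * c)


# A small residual is penalised quadratically, a large one only linearly.
print(huber_loss(1.0), huber_loss(10.0))
```

      A small c behaves more like absolute-error loss and a large c more like squared-error loss, which is why the abstract stresses that the choice of this parameter affects estimation efficiency.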
  • A robust score test of homogeneity for zero-inflated count data
    • Authors: Wei-Wen Hsu, David Todem, Nadeesha R Mawella, KyungMann Kim, Richard R Rosenkranz
      First page: 3653
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In many applications of zero-inflated models, score tests are often used to evaluate whether the population heterogeneity as implied by these models is consistent with the data. The most frequently cited justification for using score tests is that they only require estimation under the null hypothesis. Because this estimation involves specifying a plausible model consistent with the null hypothesis, the testing procedure could lead to unreliable inferences under model misspecification. In this paper, we propose a score test of homogeneity for zero-inflated models that is robust against certain model misspecifications. Due to the true model being unknown in practical settings, our proposal is developed under a general framework of mixture models for which a layer of randomness is imposed on the model to account for uncertainty in the model specification. We exemplify this approach on the class of zero-inflated Poisson models, where a random term is imposed on the Poisson mean to adjust for relevant covariates missing from the mean model or a misspecified functional form. For this example, we show through simulations that the resulting score test of zero inflation maintains its empirical size at all levels, albeit with a loss of power for the well-specified non-random mean model under the null. Frequencies of health promotion activities among young Girl Scouts and dental caries indices among inner-city children are used to illustrate the robustness of the proposed testing procedure.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-10T08:12:49Z
      DOI: 10.1177/0962280220937324
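
For readers unfamiliar with the model class named in the abstract above, the standard zero-inflated Poisson probability mass function can be sketched as follows. This is only the textbook mixture, not the authors' robust score test. The names `zip_pmf`, `lam`, and `pi` are illustrative assumptions.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Under the homogeneity null (pi = 0) the model reduces to a plain Poisson.
p0_null = zip_pmf(0, lam=2.0, pi=0.0)
# With zero inflation, the probability of a zero count is larger.
p0_inflated = zip_pmf(0, lam=2.0, pi=0.3)
```

A test of homogeneity in this setting asks whether `pi` is zero, which is why estimation under the null needs only the ordinary Poisson fit.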
  • A global test for competing risks survival analysis
    • Authors: Dominic Edelmann, Maral Saadati, Hein Putter, Jelle Goeman
      First page: 3666
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Standard tests for the Cox model, such as the likelihood ratio test or the Wald test, do not perform well in situations where the number of covariates is substantially higher than the number of observed events. This issue is perpetuated in competing risks settings, where the number of observed occurrences for each event type is usually rather small. Yet, appropriate testing methodology for competing risks survival analysis with few events per variable is missing. In this article, we show how to extend the global test for survival by Goeman et al. to competing risks and multistate models. Conducting detailed simulation studies, we show that both for type I error control and for power, the novel test outperforms the likelihood ratio test and the Wald test based on the cause-specific hazards model in settings where the number of events is small compared to the number of covariates. The benefit of the global tests for competing risks survival analysis and multistate models is further demonstrated in real data examples of cancer patients from the European Society for Blood and Marrow Transplantation.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-07T04:54:39Z
      DOI: 10.1177/0962280220938402
  • Using gradient boosting with stability selection on health insurance
           claims data to identify disease trajectories in chronic obstructive
           pulmonary disease
    • Authors: Tina Ploner, Steffen Heß, Marcus Grum, Philipp Drewe-Boss, Jochen Walker
      First page: 3684
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Objective: We propose a data-driven method to detect temporal patterns of disease progression in high-dimensional claims data based on gradient boosting with stability selection. Materials and methods: We identified patients with chronic obstructive pulmonary disease in a German health insurance claims database with 6.5 million individuals and divided them into a group of patients with the highest disease severity and a group of control patients with lower severity. We then used gradient boosting with stability selection to determine variables correlating with a chronic obstructive pulmonary disease diagnosis of highest severity and subsequently model the temporal progression of the disease using the selected variables. Results: We identified a network of 20 diagnoses (e.g. respiratory failure), medications (e.g. anticholinergic drugs) and procedures associated with a subsequent chronic obstructive pulmonary disease diagnosis of highest severity. Furthermore, the network successfully captured temporal patterns, such as disease progressions from lower to higher severity grades. Discussion: The temporal trajectories identified by our data-driven approach are compatible with existing knowledge about chronic obstructive pulmonary disease showing that the method can reliably select relevant variables in a high-dimensional context. Conclusion: We provide a generalizable approach for the automatic detection of disease trajectories in claims data. This could help to diagnose diseases early, identify unknown risk factors and optimize treatment plans.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-10T08:16:29Z
      DOI: 10.1177/0962280220938088
  • Random effects models for complex designs
    • Authors: RG Jarrett, VT Farewell, AM Herzberg
      First page: 3695
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Plaid designs are characterised by having one set of treatments applied to rows and another set of treatments applied to columns. In a 2003 publication, Farewell and Herzberg presented an analysis of variance structure for such designs. They presented an example of a study in which medical practitioners, trained in different ways, evaluated a series of videos of patients obtained under a variety of conditions. However, their analysis did not take full account of all error terms. In this paper, a more comprehensive analysis of this study is presented, informed by the recognition that the study can also be regarded as a two-phase design. The development of random effects models is outlined and the potential importance of block-treatment interactions is highlighted. The use of a variety of techniques is shown to lead to a better understanding of the study. Examination of the variance components involved in the expected mean squares is demonstrated to have particular value in identifying appropriate error terms for F-tests derived from an analysis of variance table. A package such as ASReml can also be used provided an appropriate error structure is specified. The methods presented can be applied to the design and analysis of other complex studies in which participants supply multiple measurements under a variety of conditions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-17T05:39:25Z
      DOI: 10.1177/0962280220938418
  • Semiparametric regression of the illness-death model with interval
           censored disease incidence time: An application to the ACLS data
    • Authors: Jie Zhou, Jiajia Zhang, Alexander C McLain, Wenbin Lu, Xuemei Sui, James W Hardin
      First page: 3707
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      To investigate the effect of fitness on cardiovascular disease and all-cause mortality using the Aerobics Center Longitudinal Study, we develop a semiparametric illness-death model that accounts for intermittent observations of the cardiovascular disease incidence time and the right-censored data of all-cause mortality. The main challenge in estimation is to handle the intermittent observations (interval censoring) of cardiovascular disease incidence time, and we develop a semiparametric estimation method based on the expectation-maximization algorithm for a Markov illness-death regression model. The variance of the parameters is estimated using profile likelihood methods. The proposed method is evaluated using extensive simulation studies and illustrated with an application to the Aerobics Center Longitudinal Study data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-09T05:17:44Z
      DOI: 10.1177/0962280220939123
  • Propensity score weighting under limited overlap and model
    • Authors: Yunji Zhou, Roland A Matsouaka, Laine Thomas
      First page: 3721
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Propensity score weighting methods are often used in non-randomized studies to adjust for confounding and assess treatment effects. The most popular among them, the inverse probability weighting, assigns weights that are proportional to the inverse of the conditional probability of a specific treatment assignment, given observed covariates. A key requirement for inverse probability weighting estimation is the positivity assumption, i.e. the propensity score must be bounded away from 0 and 1. In practice, violations of the positivity assumption often manifest by the presence of limited overlap in the propensity score distributions between treatment groups. When these practical violations occur, a small number of highly influential inverse probability weights may lead to unstable inverse probability weighting estimators, with biased estimates and large variances. To mitigate these issues, a number of alternative methods have been proposed, including inverse probability weighting trimming, overlap weights, matching weights, and entropy weights. Because overlap weights, matching weights, and entropy weights target the population for whom there is equipoise (and with adequate overlap) and their estimands depend on the true propensity score, a common criticism is that these estimators may be more sensitive to misspecifications of the propensity score model. In this paper, we conduct extensive simulation studies to compare the performances of inverse probability weighting and inverse probability weighting trimming against those of overlap weights, matching weights, and entropy weights under limited overlap and misspecified propensity score models. Across the wide range of scenarios we considered, overlap weights, matching weights, and entropy weights consistently outperform inverse probability weighting in terms of bias, root mean squared error, and coverage probability.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-22T04:31:17Z
      DOI: 10.1177/0962280220940334
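
The weighting schemes compared in the abstract above are commonly written in the balancing-weights form, where each scheme is defined by a tilting function h(e) of the propensity score e, and a unit's weight is h(e)/e if treated and h(e)/(1 - e) if control. The sketch below assumes that standard formulation and may not match the paper's exact definitions. The function names and scheme labels are illustrative.

```python
import math


def tilting(e, scheme):
    """Tilting function h(e) for a propensity score e strictly between 0 and 1."""
    if scheme == "ipw":
        return 1.0
    if scheme == "overlap":
        return e * (1.0 - e)
    if scheme == "matching":
        return min(e, 1.0 - e)
    if scheme == "entropy":
        return -(e * math.log(e) + (1.0 - e) * math.log(1.0 - e))
    raise ValueError(f"unknown scheme: {scheme}")


def balancing_weight(e, treated, scheme):
    """Weight for one unit: h(e)/e if treated, h(e)/(1 - e) if control."""
    h = tilting(e, scheme)
    return h / e if treated else h / (1.0 - e)


# A treated unit with a propensity score near 0 (limited overlap).
e_extreme = 0.01
weights = {s: balancing_weight(e_extreme, True, s)
           for s in ("ipw", "overlap", "matching", "entropy")}
```

Under this formulation the inverse probability weight grows without bound as e approaches 0, while the overlap, matching, and entropy weights stay moderate, which is the instability the abstract describes.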
  • Nonlinear parametric quantile models
    • Authors: Matteo Bottai, Giovanna Cilluffo
      First page: 3757
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Quantile regression is widely used to estimate conditional quantiles of an outcome variable of interest given covariates. This method can estimate one quantile at a time without imposing any constraints on the quantile process other than the linear combination of covariates and parameters specified by the regression model. While this is a flexible modeling tool, it generally yields erratic estimates of conditional quantiles and regression coefficients. Recently, parametric models for the regression coefficients have been proposed that can help balance bias and sampling variability. So far, however, only models that are linear in the parameters and covariates have been explored. This paper presents the general case of nonlinear parametric quantile models. These can be nonlinear with respect to the parameters, the covariates, or both. Some important features and asymptotic properties of the proposed estimator are described, and its finite-sample behavior is assessed in a simulation study. Nonlinear parametric quantile models are applied to estimate extreme quantiles of longitudinal measures of respiratory mechanics in asthmatic children from an epidemiological study and to evaluate a dose–response relationship in a toxicological laboratory experiment.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-20T04:32:14Z
      DOI: 10.1177/0962280220941159
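
Quantile regression, as referenced in the abstract above, estimates a quantile by minimizing the check (pinball) loss. The sketch below shows only that basic building block for a sample quantile with no covariates, not the nonlinear parametric models the paper proposes. The function names and data are illustrative assumptions.

```python
def check_loss(u, tau):
    """Check (pinball) loss for residual u at quantile level tau in (0, 1)."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile_by_check_loss(data, tau):
    """Return the data point that minimizes the total check loss.

    A minimizer of the check loss always exists among the observed values,
    so searching over the data points is sufficient.
    """
    return min(data, key=lambda q: sum(check_loss(y - q, tau) for y in data))


# Hypothetical sample with one extreme value.
values = [1.0, 2.0, 3.0, 4.0, 100.0]
median = sample_quantile_by_check_loss(values, tau=0.5)
upper = sample_quantile_by_check_loss(values, tau=0.9)
```

Each quantile level tau is fit separately here, which mirrors the one-quantile-at-a-time estimation that the abstract says can yield erratic estimates.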
  • Reference-based pattern-mixture models for analysis of longitudinal binary
    • Authors: Kaifeng Lu
      First page: 3770
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Pattern-mixture model (PMM)-based controlled imputations have become a popular tool to assess the sensitivity of primary analysis inference to different post-dropout assumptions or to estimate treatment effectiveness. The methodology is well established for continuous responses but less well established for binary responses. In this study, we formulate the copy-reference and jump-to-reference PMMs for longitudinal binary data using a multivariate probit model with latent variables. We discuss the maximum likelihood, Bayesian, and multiple imputation methods for estimating the treatment effect under the specified PMM. Simulation studies are conducted to evaluate the performance of these methods. These methods are also illustrated using data from a bipolar mania study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-23T05:22:51Z
      DOI: 10.1177/0962280220941880
  • Relative efficiencies of alternative preference-based designs for
           randomised trials
    • Authors: SD Walter, M Bian
      First page: 3783
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Recent work has shown that outcomes in clinical trials can be affected by which treatment the trial participants would select if they were allowed to do so, and if they do or do not actually receive that treatment. These influences are known as selection and preference effects, respectively. Unfortunately, they cannot be evaluated in conventional, parallel group trials because patient preferences remain unknown. However, several alternative designs have been proposed, to measure and take account of patient preferences. In this paper, we discuss three preference-based designs (the two-stage, fully randomised, and partially randomised designs). In conventional trials, only the treatment effect is estimable, while the preference-based designs have the potential to estimate some or all of the selection and preference effects. The relative efficiency of these designs is affected by several factors, including the proportion of participants who are undecided about treatments, or who are unable or unwilling to state a preference; the relative preference rate between the treatments being compared, among patients who do have a preference; and the ratio of patients randomised to each treatment. We also discuss the advantages and disadvantages of these designs under different scenarios.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-24T03:59:43Z
      DOI: 10.1177/0962280220941874
  • On hazard-based penalized likelihood estimation of accelerated failure
           time model with partly interval censoring
    • Authors: Jinqing Li, Jun Ma
      First page: 3804
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In survival analysis, the semiparametric accelerated failure time model is an important alternative to the widely used Cox proportional hazard model. The existing methods for accelerated failure time models include least-squares, log rank-based estimating equations and approximations to the nonparametric error distribution. In this paper, we propose another fitting method for the accelerated failure time model, formulated from the hazard function of the exponential error term. Our method can handle partly interval-censored data which contains event time, as well as left, right and interval censoring time. We adopt the maximum penalized likelihood method to estimate all the parameters in the model, including the nonparametric component. The penalty function is used to regularize the nonparametric component of the accelerated failure time model. Asymptotic properties of the penalized likelihood estimate are developed. A simulation study is conducted to investigate the performance of the proposed method and an application of this method to an AIDS study is presented as an example.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-21T04:51:16Z
      DOI: 10.1177/0962280220942555
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762

JournalTOCs © 2009-