  Subjects -> STATISTICS (Total: 130 journals)
Advances in Complex Systems     Hybrid Journal   (Followers: 11)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 62)
Annals of Applied Statistics     Full-text available via subscription   (Followers: 39)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Argumentation et analyse du discours     Open Access   (Followers: 11)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 4)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Bernoulli     Full-text available via subscription   (Followers: 9)
Biometrical Journal     Hybrid Journal   (Followers: 11)
Biometrics     Hybrid Journal   (Followers: 52)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 2)
Bulletin of Statistics     Full-text available via subscription   (Followers: 4)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 11)
Demographic Research     Open Access   (Followers: 15)
Electronic Journal of Statistics     Open Access   (Followers: 8)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Full-text available via subscription   (Followers: 5)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 9)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 5)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances -     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 17)
International Journal of Stochastic Analysis     Open Access   (Followers: 3)
International Statistical Review     Hybrid Journal   (Followers: 13)
International Trade by Commodity Statistics - Statistiques du commerce international par produit     Full-text available via subscription  
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 4)
Journal of Applied Statistics     Hybrid Journal   (Followers: 21)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 21)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 39, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 84)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 6)
Journal of Forecasting     Hybrid Journal   (Followers: 17)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Interactive Marketing     Hybrid Journal   (Followers: 10)
Journal of Mathematics and Statistics     Open Access   (Followers: 8)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 33)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 5)
Journal of Statistical Physics     Hybrid Journal   (Followers: 13)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 8)
Journal of Statistical Software     Open Access   (Followers: 21, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 72, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal   (Followers: 1)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 33)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 27)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 43)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 30)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Modelling of Mechanical Systems     Full-text available via subscription   (Followers: 1)
Monte Carlo Methods and Applications     Hybrid Journal   (Followers: 6)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 2)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 5)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 8)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Probability Surveys     Open Access   (Followers: 4)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 8)
Review of Economics and Statistics     Hybrid Journal   (Followers: 128)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 15)
Sankhya A     Hybrid Journal   (Followers: 2)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 38)
SourceOCDE Comptes nationaux et Statistiques retrospectives     Full-text available via subscription  
SourceOCDE Statistiques : Sources et methodes     Full-text available via subscription  
SourceOECD Bank Profitability Statistics - SourceOCDE Rentabilite des banques     Full-text available via subscription   (Followers: 1)
SourceOECD Insurance Statistics - SourceOCDE Statistiques d'assurance     Full-text available via subscription   (Followers: 2)
SourceOECD Main Economic Indicators - SourceOCDE Principaux indicateurs economiques     Full-text available via subscription   (Followers: 1)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
SourceOECD Monthly Statistics of International Trade     Full-text available via subscription   (Followers: 1)
SourceOECD National Accounts & Historical Statistics     Full-text available via subscription  
SourceOECD OECD Economic Outlook Database - SourceOCDE Statistiques des Perspectives economiques de l'OCDE     Full-text available via subscription   (Followers: 2)
SourceOECD Science and Technology Statistics - SourceOCDE Base de donnees des sciences et de la technologie     Full-text available via subscription  
SourceOECD Statistics Sources & Methods     Full-text available via subscription   (Followers: 1)
SourceOECD Taxing Wages Statistics - SourceOCDE Statistiques des impots sur les salaires     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Applications in Genetics and Molecular Biology     Hybrid Journal   (Followers: 5)
Statistical Communications in Infectious Diseases     Hybrid Journal  
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methodology     Hybrid Journal   (Followers: 7)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 27)
Statistical Modelling     Hybrid Journal   (Followers: 19)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistical Science     Full-text available via subscription   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics & Risk Modeling     Hybrid Journal   (Followers: 3)
Statistics and Computing     Hybrid Journal   (Followers: 13)
Statistics and Economics     Open Access   (Followers: 1)
Statistics in Medicine     Hybrid Journal   (Followers: 195)
Statistics, Politics and Policy     Hybrid Journal   (Followers: 6)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 14)
Stochastic Models     Hybrid Journal   (Followers: 3)
Stochastics An International Journal of Probability and Stochastic Processes: formerly Stochastics and Stochastics Reports     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 7)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 3)
The American Statistician     Full-text available via subscription   (Followers: 23)
The Annals of Applied Probability     Full-text available via subscription   (Followers: 8)
The Annals of Probability     Full-text available via subscription   (Followers: 10)
The Annals of Statistics     Full-text available via subscription   (Followers: 34)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 11)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)

Statistical Methods in Medical Research
Journal Prestige (SJR): 1.402
Citation Impact (citeScore): 2
Number of Followers: 27  
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 0962-2802 - ISSN (Online) 1477-0334
Published by Sage Publications  [1138 journals]
  • A goodbye
    • Authors: Brian Everitt
      Pages: 3 - 3
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 3-3, January 2021.

      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:23:52Z
      DOI: 10.1177/0962280220986690
      Issue No: Vol. 30, No. 1 (2021)
  • Introductions
    • Pages: 4 - 4
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 4-4, January 2021.

      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:23:52Z
      DOI: 10.1177/0962280220986691
      Issue No: Vol. 30, No. 1 (2021)
  • GEOMED 2019 Editorial
    • Authors: Andrew B Lawson, Marcos Prates, Craig Anderson
      Pages: 5 - 5
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 5-5, January 2021.

      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:23:22Z
      DOI: 10.1177/0962280220930177
      Issue No: Vol. 30, No. 1 (2021)
  • Dealing with risk discontinuities to estimate cancer mortality risks when
           the number of small areas is large
    • Authors: Guzman Santafé, Aritz Adin, Duncan Lee, María Dolores Ugarte
      Pages: 6 - 21
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 6-21, January 2021.
      Many statistical models have been developed in recent years to smooth risks in disease mapping. However, most of these modeling approaches do not take possible local discontinuities into consideration or, if they do, they are computationally prohibitive or simply do not work when the number of small areas is large. In this paper, we propose a two-step method to deal with discontinuities and to smooth noisy risks in small areas. In the first stage, a novel density-based clustering algorithm is used. In contrast to previous proposals, this algorithm is able to automatically detect the number of spatial clusters, thus providing a single cluster structure. In the second stage, a Bayesian hierarchical spatial model that takes the cluster configuration into account is fitted, which accounts for the discontinuities in disease risk. To evaluate the performance of this new procedure in comparison to previous proposals, a simulation study has been conducted. Results show competitive risk estimates at a much lower computational cost. The new methodology is used to analyze stomach cancer mortality data in Spanish municipalities.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:21:13Z
      DOI: 10.1177/0962280220946502
      Issue No: Vol. 30, No. 1 (2021)
  • Spatiotemporal distributed lag modelling of multiple Plasmodium species in
           a malaria elimination setting
    • Authors: Chawarat Rotejanaprasert, Duncan Lee, Nattwut Ekapirat, Prayuth Sudathip, Richard J Maude
      Pages: 22 - 34
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 22-34, January 2021.
      In much of the Greater Mekong Sub-region, malaria is now confined to patches and small foci of transmission. Malaria transmission is seasonal with the spatiotemporal patterns being associated with variation in environmental and climatic factors. However, the possible effect at different lag periods between meteorological variables and clinical malaria has not been well studied in the region. Thus, in this study we developed distributed lagged modelling accounting for spatiotemporal excessive zero cases in a malaria elimination setting. A multivariate framework was also extended to incorporate multiple data streams and investigate the spatiotemporal patterns from multiple parasite species via their lagged association with climatic variables. A simulation study was conducted to examine robustness of the methodology and a case study is provided of weekly data of clinical malaria cases at sub-district level in Thailand.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:22:48Z
      DOI: 10.1177/0962280220938977
      Issue No: Vol. 30, No. 1 (2021)
  • Joint space–time Bayesian disease mapping via quantification of
           disease risk association
    • Authors: Daniel R Baer, Andrew B Lawson, Jane E Joseph
      Pages: 35 - 61
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 35-61, January 2021.
      Alzheimer’s disease is an increasingly prevalent neurological disorder with no effective therapies. Thus, there is a need to characterize the progression of Alzheimer’s disease risk in order to preclude its inception in patients. Characterizing Alzheimer’s disease risk can be accomplished at the population-level by the space–time modeling of Alzheimer’s disease incidence data. In this paper, we develop flexible Bayesian hierarchical models which can borrow risk information from conditions antecedent to Alzheimer’s disease, such as mild cognitive impairment, in an effort to better characterize Alzheimer’s disease risk over space and time. From an application of these models to real-world Alzheimer’s disease and mild cognitive impairment spatiotemporal incidence data, we found that our novel models provided improved model goodness of fit, and via a simulation study, we demonstrated the importance of diagnosing the label-switching problem for our models as well as the importance of model specification in order to best capture the contribution of time in modeling Alzheimer’s disease risk.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:22:49Z
      DOI: 10.1177/0962280220938975
      Issue No: Vol. 30, No. 1 (2021)
  • Estimating hidden populations by transferring knowledge from
           geographically misaligned levels
    • Authors: Douglas R. M. Azevedo, Marcos O. Prates, Renato M. Assunção
      Pages: 62 - 74
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 62-74, January 2021.
      The estimation of hidden sub-populations is a hard task that appears in many fields. For example, public health planning in Brazil depends crucially on the number of people who hold a private health insurance plan and hence rarely use the public services. Different sources of information about these sub-populations may be available at different geographical levels. The available information can be transferred between these different geographic levels to improve the estimation of the hidden population size. In this study, we propose a model that uses individual-level information to learn about the dependence between the response variable and explanatory variables by proposing a family of link functions with asymptotes that are flexible enough to represent the real aspects of the data and robust to departures from the model. We use the fitted model to estimate the size of the sub-population at any desired level. We illustrate our methodology by estimating the sub-population that uses the public health system in each neighborhood of large cities in Brazil.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:20:12Z
      DOI: 10.1177/0962280220930560
      Issue No: Vol. 30, No. 1 (2021)
  • Spatial scan statistics can be dangerous
    • Authors: Toshiro Tango
      Pages: 75 - 86
      Abstract: Statistical Methods in Medical Research, Volume 30, Issue 1, Page 75-86, January 2021.
      Spatial scan statistics are widely used tools for the detection of disease clusters. In particular, the circular spatial scan statistic proposed by Kulldorff, along with the SaTScan software, has been used in a wide variety of epidemiological studies and disease surveillance. However, as it cannot detect non-circular, irregularly shaped clusters, many authors have proposed non-circular spatial scan statistics. Among these, the flexible spatial scan statistic proposed by Tango and Takahashi, along with the FleXScan software, has also been used. However, it does not seem to be well recognized that these spatial scan statistics, especially SaTScan, tend to detect a most likely cluster that is much larger than the true cluster by absorbing neighboring regions with nonelevated risk of disease occurrence. Therefore, if researchers report the detected most likely cluster as it is, they may be criticized because some regions with nonelevated risk are included in it. In this paper, to avoid detecting such undesirable and misleading clusters, which might cause social concern, we propose the use of the restricted likelihood ratio proposed by Tango and illustrate the procedure with two kinds of mortality data in Japan.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-17T04:20:12Z
      DOI: 10.1177/0962280220930562
      Issue No: Vol. 30, No. 1 (2021)
  • Evaluating Bayesian adaptive randomization procedures with adaptive clip
           methods for multi-arm trials
    • Authors: Kim May Lee, J Jack Lee
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Bayesian adaptive randomization is a heuristic approach that aims to randomize more patients to the putatively superior arms based on the trend of the accrued data in a trial. Many statistical aspects of this approach have been explored and compared with other approaches; yet only a limited number of works has focused on improving its performance and providing guidance on its application to real trials. An undesirable property of this approach is that the procedure would randomize patients to an inferior arm in some circumstances, which has raised concerns in its application. Here, we propose an adaptive clip method to rectify the problem by incorporating a data-driven function to be used in conjunction with Bayesian adaptive randomization procedure. This function aims to minimize the chance of assigning patients to inferior arms during the early time of the trial. Moreover, we propose a utility approach to facilitate the selection of a randomization procedure. A cost that reflects the penalty of assigning patients to the inferior arm(s) in the trial is incorporated into our utility function along with all patients benefited from the trial, both within and beyond the trial. We illustrate the selection strategy for a wide range of scenarios.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-03-10T06:36:13Z
      DOI: 10.1177/0962280221995961
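
      As a rough illustration of the idea described in the abstract above, the sketch below computes Bayesian adaptive randomization probabilities for binary outcomes and then clips them. It is not the authors' adaptive clip method, which uses a data-driven function: the Beta(1, 1) priors, the fixed bounds `lower` and `upper`, the example counts, and the function names are all assumptions made for illustration.

```python
import numpy as np

def prob_best(successes, failures, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(arm k has the highest response rate).

    Assumes independent Beta(1, 1) priors on each arm's response rate
    (an illustrative choice, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(successes, dtype=float)
    f = np.asarray(failures, dtype=float)
    # Posterior for each arm is Beta(1 + successes, 1 + failures).
    draws = rng.beta(1.0 + s, 1.0 + f, size=(n_draws, s.size))
    best = np.argmax(draws, axis=1)
    return np.bincount(best, minlength=s.size) / n_draws

def clipped_allocation(p_best, lower=0.1, upper=0.9):
    """Clip allocation probabilities to [lower, upper], then renormalize.

    The fixed bounds are hypothetical. After renormalization the values
    sum to 1 but may sit slightly outside the original bounds.
    """
    clipped = np.clip(np.asarray(p_best, dtype=float), lower, upper)
    return clipped / clipped.sum()

# Hypothetical interim data for three arms: (successes, failures).
p = prob_best([12, 5, 4], [8, 15, 16])
alloc = clipped_allocation(p)
print("P(best):   ", np.round(p, 3))
print("allocation:", np.round(alloc, 3))
```

      Without clipping, the allocation to the two weaker-looking arms would be close to zero at this interim look; the clip keeps some randomization to every arm while the data are still sparse.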
  • Marginal analysis of bivariate mixed responses with measurement error and
    • Authors: Qihuang Zhang, Grace Y Yi
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Bivariate responses with mixed continuous and binary variables arise commonly in applications such as clinical trials and genetic studies. Statistical methods based on jointly modeling continuous and binary variables have been available. However, such methods ignore the effects of response mismeasurement, a ubiquitous feature in applications. It has been well studied that in many settings, ignorance of mismeasurement in variables usually results in biased estimation. In this paper, we consider the setting with a bivariate outcome vector which contains a continuous component and a binary component both subject to mismeasurement. We propose estimating equation approaches to handle measurement error in the continuous response and misclassification in the binary response simultaneously. The proposed estimators are consistent and robust to certain model misspecification, provided regularity conditions. Extensive simulation studies confirm that the proposed methods successfully correct the biases resulting from the error-in-variables under various settings. The proposed methods are applied to analyze the outbred Carworth Farms White mice data arising from a genome-wide association study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-26T06:10:57Z
      DOI: 10.1177/0962280220983587
  • Calibrating validation samples when accounting for measurement error in
           intervention studies
    • Authors: Benjamin Ackerman, Juned Siddique, Elizabeth A Stuart
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Many lifestyle intervention trials depend on collecting self-reported outcomes, such as dietary intake, to assess the intervention’s effectiveness. Self-reported outcomes are subject to measurement error, which impacts treatment effect estimation. External validation studies measure both self-reported outcomes and accompanying biomarkers, and can be used to account for measurement error. However, in order to account for measurement error using an external validation sample, an assumption must be made that the inferences are transportable from the validation sample to the intervention trial of interest. This assumption does not always hold. In this paper, we propose an approach that adjusts the validation sample to better resemble the trial sample, and we also formally investigate when bias due to poor transportability may arise. Lastly, we examine the performance of the methods using simulation, and illustrate them using PREMIER, a lifestyle intervention trial measuring self-reported sodium intake as an outcome, and OPEN, a validation study measuring both self-reported diet and urinary biomarkers.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-23T11:43:39Z
      DOI: 10.1177/0962280220988574
  • An adaptive seamless Phase 2-3 design with multiple endpoints
    • Authors: Man Jin, Pingye Zhang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Adaptive seamless Phase 2-3 design has been considered as one possible way to expedite development time for a drug program by allowing the expansion from an ongoing Phase 2 trial into a Phase 3 trial. Multiple endpoints are often tested when a regulatory approval is pursued. Here we propose an adaptive seamless Phase 2-3 design with multiple endpoints which can expand an ongoing Phase 2 trial into a Phase 3 trial based on an intermediate endpoint for adaptive decision and test the endpoints with a powerful multiple test procedure. It is proved that the proposed design can preserve the familywise Type I error under a mild assumption that is expected to hold in practical considerations. We illustrate our proposed design with an example trial design for oncology. Simulations are conducted to confirm the control of the familywise Type I error and the adaptive seamless Phase 2-3 design is illustrated with an example.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-16T04:58:54Z
      DOI: 10.1177/0962280220986935
  • Harmonizing child mortality data at disparate geographic levels
    • Authors: Neal Marquez, Jon Wakefield
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      There is an increasing focus on reducing inequalities in health outcomes in developing countries. Subnational variation is of particular interest, with geographically-indexed data being used to understand the spatial risk of detrimental outcomes and to identify who is at greatest risk. While some health surveys provide observations with associated geographic coordinates (point data), many others provide data that have their locations masked and instead only report the strata (polygon information) within which the data reside (masked data). How to harmonize these data sources for spatial analysis has been considered previously, though only ad hoc methods have been proposed, and comparison of methods is lacking. In this paper, we present a new method for analyzing masked survey data, using a method that is consistent with the data-generating process. In addition, we critique two previously proposed approaches to analyzing masked data and illustrate that they are fundamentally flawed methodologically. To validate our method, we compare our approach with previously formulated solutions in several realistic simulation environments in which the underlying structure of the risk field is known. We simulate samples from spatiotemporal fields in a way that mimics the sampling frame implemented in the most common health surveys in low- and middle-income countries, the Demographic and Health Surveys and Multiple Indicator Cluster Surveys. In simulations, the newly proposed approach outperforms previously proposed approaches in terms of minimizing error while increasing the precision of estimates. The approaches are subsequently compared using child mortality data from the Dominican Republic where our findings are reinforced. The ability to accurately increase precision of child mortality estimates, and health outcomes in general, by leveraging various types of data, improves our ability to implement precision public health initiatives and better understand the landscape of geographic health inequalities.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-02T03:15:32Z
      DOI: 10.1177/0962280220988742
  • Propensity score analysis methods with balancing constraints: A Monte
           Carlo study
    • Authors: Yan Li, Liang Li
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Inverse probability weighting is an important propensity score weighting method to estimate the average treatment effect. Recent literature shows that it can be easily combined with covariate balancing constraints to reduce the detrimental effects of excessively large weights and improve balance. Other methods are available to derive weights that balance covariate distributions between the treatment groups without the involvement of propensity scores. We conducted comprehensive Monte Carlo experiments to study whether the use of covariate balancing constraints circumvents the need for correct propensity score model specification, and whether the use of a propensity score model further improves the estimation performance among methods that use similar covariate balancing constraints. We compared simple inverse probability weighting, two propensity score weighting methods with balancing constraints (covariate balancing propensity score, covariate balancing scoring rule), and two weighting methods with balancing constraints but without using the propensity scores (entropy balancing and kernel balancing). We observed that correct specification of the propensity score model remains important even when the constraints effectively balance the covariates. We also observed evidence suggesting that, with similar covariate balance constraints, the use of a propensity score model improves the estimation performance when the dimension of covariates is large. These findings suggest that it is important to develop flexible data-driven propensity score models that satisfy covariate balancing conditions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-02T03:11:12Z
      DOI: 10.1177/0962280220983512
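
      For readers unfamiliar with the simple inverse probability weighting estimator that the abstract above uses as a baseline, a minimal sketch follows. It uses the normalized (Hajek) form and a simulated data set in which the true propensity score is known; the data-generating model, the effect size of 2.0, and the function names are assumptions for illustration. In practice the propensity score would be estimated, and none of the balancing-constraint methods compared in the paper are implemented here.

```python
import numpy as np

def ipw_ate(y, treated, propensity):
    """Normalized (Hajek) inverse probability weighting estimate of the ATE."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(treated, dtype=float)
    e = np.asarray(propensity, dtype=float)
    w1 = t / e                   # weights for treated units
    w0 = (1.0 - t) / (1.0 - e)   # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def simulate(n=20000, effect=2.0, seed=1):
    """Simulate one confounder x that drives both treatment and outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-x))   # true propensity score (known here by construction)
    t = rng.binomial(1, e)
    y = effect * t + 1.5 * x + rng.normal(size=n)
    return y, t, e

y, t, e = simulate()
naive = y[t == 1].mean() - y[t == 0].mean()   # confounded comparison of group means
print("naive difference:", round(float(naive), 2))
print("IPW estimate:    ", round(float(ipw_ate(y, t, e)), 2))
```

      The naive difference in means is biased upward because treated units tend to have larger x; weighting by the inverse of the propensity score removes that imbalance when the score is correct.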
  • Combining multiple biomarkers to linearly maximize the diagnostic accuracy
           under ordered multi-class setting
    • Authors: Jia Hua, Lili Tian
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In both clinical studies and biomedical research, it is common practice to combine multiple biomarkers to improve the overall diagnostic performance. Although a large number of statistical methods exist for biomarker combination under binary classification, research on this topic under the multi-class setting is sparse. The overall diagnostic accuracy, i.e. the sum of correct classification rates, directly measures the classification accuracy of the combined biomarkers. Hence the overall accuracy can serve as an important objective function for biomarker combination, especially when the combined biomarkers are used for the purpose of making a medical diagnosis. In this paper, we address the problem of combining multiple biomarkers to directly maximize the overall diagnostic accuracy by presenting several grid search methods and derivation-based methods. A comprehensive simulation study was conducted to compare the performances of these methods. An ovarian cancer data set is analyzed at the end.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-02-01T10:38:03Z
      DOI: 10.1177/0962280220987587
  • Clustered longitudinal data subject to irregular observation
    • Authors: Eleanor M Pullenayegum, Catherine Birken, Jonathon Maguire
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Data collected longitudinally as part of usual health care is becoming increasingly available for research, and is often available across several centres. Because the frequency of follow-up is typically determined by the patient’s health, the timing of measurements may be related to the outcome of interest. Failure to account for the informative nature of the observation process can result in biased inferences. While methods for accounting for the association between observation frequency and outcome are available, they do not currently account for clustering within centres. We formulate a semi-parametric joint model to include random effects for centres as well as subjects. We also show how inverse-intensity weighted GEEs can be adapted to account for clustering, comparing stratification, frailty models, and covariate adjustment to account for clustering in the observation process. The finite-sample performance of the proposed methods is evaluated through simulation and the methods illustrated using a study of the relationship between outdoor play and air quality in children aged 2–9 living in the Greater Toronto Area.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-29T08:19:36Z
      DOI: 10.1177/0962280220986193
  • Inference under covariate-adaptive randomization: A simulation study
    • Authors: Andrea Callegaro, B S Harsha Shree, Naveen Karkada
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In clinical trials, several covariate-adaptive designs have been proposed to balance treatment arms with respect to key covariates. Although some argue that conventional asymptotic tests are still appropriate when covariate-adaptive randomization is used, others think that re-randomization tests should be used. In this manuscript, we compare by simulation the performance of asymptotic and re-randomization tests under covariate-adaptive randomization. Our simulation study confirms results expected from existing theory (e.g. asymptotic tests do not control type I error when the model is misspecified). Furthermore, it shows that (i) re-randomization tests are as powerful as the asymptotic tests if the model is correct; (ii) re-randomization tests are more powerful when adjusting for covariates; (iii) minimization and permuted blocks provide similar results.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-28T04:26:00Z
      DOI: 10.1177/0962280220985564
  • A Bayesian dose–response meta-analysis model: A simulations study
           and application
    • Authors: Tasnim Hamza, Andrea Cipriani, Toshi A Furukawa, Matthias Egger, Nicola Orsini, Georgia Salanti
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Dose–response models express the effect of different dose or exposure levels on a specific outcome. In meta-analysis, where aggregate-level data are available, dose–response evidence is synthesized using either one-stage or two-stage models in a frequentist setting. We propose a hierarchical dose–response model implemented in a Bayesian framework. We develop our model assuming normal or binomial likelihood and accounting for exposures grouped in clusters. To allow maximum flexibility, the dose–response association is modelled using restricted cubic splines. We implement these models in R using JAGS and we compare our approach to the one-stage dose–response meta-analysis model in a simulation study. We found that the Bayesian dose–response model with binomial likelihood has lower bias than the Bayesian model with normal likelihood and the frequentist one-stage model when studies have small sample sizes. When the true underlying shape is log–log or half-sigmoid, the performance of all models depends on choosing an appropriate location for the knots. In all other examined situations, all models perform very well and give practically identical results. We also re-analyze the data from 60 randomized controlled trials (15,984 participants) examining the efficacy (response) of various doses of serotonin-specific reuptake inhibitor (SSRI) antidepressant drugs. All models suggest that the dose–response curve increases between zero dose and 30–40 mg of fluoxetine-equivalent dose, and thereafter shows a small decline. We draw the same conclusion when we take into account the fact that five different antidepressants have been studied in the included trials. We show that implementation of the hierarchical model in a Bayesian framework has similar performance to the frequentist approach, but overcomes some of its limitations and offers maximum flexibility to accommodate features of the data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-28T04:23:41Z
      DOI: 10.1177/0962280220982643
  • Bridging across patient subgroups in phase I oncology trials that
           incorporate animal data
    • Authors: Haiyan Zheng, Lisa V Hampson, Thomas Jaki
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In this paper, we develop a general Bayesian hierarchical model for bridging across patient subgroups in phase I oncology trials, for which preliminary information about the dose–toxicity relationship can be drawn from animal studies. Parameters that re-scale the doses to adjust for intrinsic differences in toxicity, either between animals and humans or between human subgroups, are introduced to each dose–toxicity model. Appropriate priors are specified for these scaling parameters, which capture the magnitude of uncertainty surrounding the animal-to-human translation and bridging assumption. After mapping data onto a common, ‘average’ human dosing scale, human dose–toxicity parameters are assumed to be exchangeable either with the standardised, animal study-specific parameters, or between themselves across human subgroups. Random-effects distributions are distinguished by different covariance matrices that reflect the between-study heterogeneity in animals and humans. The possibility of non-exchangeability is allowed for, to avoid inferences for extreme subgroups being overly influenced by their complementary data. We illustrate the proposed approach with hypothetical examples, and use simulation to compare the operating characteristics of trials analysed using our Bayesian model with several alternatives. Numerical results show that the proposed approach yields robust inferences, even when data from multiple sources are inconsistent and/or the bridging assumptions are incorrect.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-27T11:22:07Z
      DOI: 10.1177/0962280220986580
  • An effective technique for diabetic retinopathy using hybrid machine
           learning technique
    • Authors: N Satyanarayana Murthy, B Arunadevi
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Diabetic retinopathy (DR) is an eye disorder that develops progressively in individuals with diabetes. Complications of diabetes damage the blood vessels at the back of the retina. In extreme cases, DR can lead to sudden vision loss or visual impairment. This serious outcome can be avoided through timely treatment and early detection. Recently, the disorder has been spreading quickly, particularly in the working population, which has created demand for diagnosis of the disease at the earliest possible stage. Detection of the retinal blood vessels (RBVs) therefore plays a foremost role in halting the progression of the disorder. The growth of abnormal vessels marks the developmental stages of DR, which can be identified by extracting the RBVs. The major aim of our research is to develop an automatic approach for recognizing the blood vessels (BVs) for DR. The proposed method has two major steps: segmentation, followed by classification of the affected retinal BVs. Segmentation uses Fuzzy C-means Clustering with centroid initialization based on Kinetic Gas Molecule Optimization. In the classification step, the segmented images are given as input to a hybrid technique, a convolutional neural network with bidirectional long short-term memory (CNN with Bi-LSTM). The learning degree of the Bi-LSTM is revised using a self-attention mechanism to refine the classification accuracy. The experimental results show that the hybrid algorithm achieved higher accuracy, specificity, and sensitivity than existing techniques.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-27T04:20:28Z
      DOI: 10.1177/0962280220983541
  • An automation-based adaptive seamless design for dose selection and
           confirmation with improved power and efficiency
    • Authors: Lu Cui, Tianyu Zhan, Lanju Zhang, Ziqian Geng, Yihua Gu, Ivan SF Chan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In a drug development program, the efficacy and safety of multiple doses can be evaluated in patients through a phase 2b dose ranging study. With a demonstrated dose response in the trial, promising doses are identified. Their effectiveness then is further investigated and confirmed in phase 3 studies. Although this two-step approach serves the purpose of the program, in general, it is inefficient because of its prolonged development duration and the exclusion of the phase 2b data in the final efficacy evaluation and confirmation which are only based on phase 3 data. To address the issue, we propose a new adaptive design, which seamlessly integrates the dose finding and confirmation steps under one pivotal study. Unlike existing adaptive seamless phase 2b/3 designs, the proposed design combines the response adaptive randomization, sample size modification, and multiple testing techniques to achieve better efficiency. The design can be easily implemented through an automated randomization process. At the end, a number of targeted doses are selected and their effectiveness is confirmed with guaranteed control of family-wise error rate.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-18T02:58:57Z
      DOI: 10.1177/0962280220984822
  • Improving convergence in growth mixture models without covariance
           structure constraints
    • Authors: Daniel McNeish, Jeffrey R. Harring
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Growth mixture models are a popular method to uncover heterogeneity in growth trajectories. Harnessing the power of growth mixture models in applications is difficult given the prevalence of nonconvergence when fitting growth mixture models to empirical data. Growth mixture models are rooted in the random effect tradition, and nonconvergence often leads researchers to modify their intended model with constraints in the random effect covariance structure to facilitate estimation. While practical, doing so has been shown to adversely affect parameter estimates, class assignment, and class enumeration. Instead, we advocate specifying the models with a marginal approach to prevent the widespread practice of sacrificing class-specific covariance structures to appease nonconvergence. A simulation is provided to show the importance of modeling class-specific covariance structures and builds off existing literature showing that applying constraints to the covariance leads to poor performance. These results suggest that retaining class-specific covariance structures should be a top priority and that marginal models like covariance pattern growth mixture models that model the covariance structure without random effects are well-suited for such a purpose, particularly with modest sample sizes and attrition commonly found in applications. An application to PTSD data with such characteristics is provided to demonstrate (a) convergence difficulties with random effect models, (b) how covariance structure constraints improve convergence but to the detriment of performance, and (c) how covariance pattern growth mixture models may provide a path forward that improves convergence without forfeiting class-specific covariance structures.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-13T03:33:21Z
      DOI: 10.1177/0962280220981747
  • Online control of the familywise error rate
    • Authors: Jinjin Tian, Aaditya Ramdas
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Biological research often involves testing a growing number of null hypotheses as new data are accumulated over time. We study the problem of online control of the familywise error rate, that is, testing an a priori unbounded sequence of hypotheses (p-values) one by one over time without knowing the future, such that with high probability there are no false discoveries in the entire sequence. This paper unifies algorithmic concepts developed for offline (single batch) familywise error rate control and online false discovery rate control to develop novel online familywise error rate control methods. Though many offline familywise error rate methods (e.g., Bonferroni, fallback procedures and Sidak’s method) can trivially be extended to the online setting, our main contribution is the design of new, powerful, adaptive online algorithms that control the familywise error rate when the p-values are independent or locally dependent in time. Our numerical experiments demonstrate substantial gains in power, which are also formally proved in an idealized Gaussian sequence model. A promising application to the International Mouse Phenotyping Consortium is described.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-08T07:32:13Z
      DOI: 10.1177/0962280220983381
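      The abstract above notes that Bonferroni can be extended trivially to the online setting. A minimal sketch of one such extension follows; it is not the paper's new adaptive algorithm. The function name `online_bonferroni` and the spending sequence gamma_i = 6 / (pi^2 i^2) are assumptions chosen for illustration. Any non-negative sequence summing to at most 1 gives the same union-bound guarantee.

```python
import math


def online_bonferroni(p_values, alpha=0.05):
    """Test p-values one at a time, spending alpha * gamma_i on the i-th test.

    gamma_i = 6 / (pi^2 * i^2) sums to 1 over i = 1, 2, ..., so the total
    level spent never exceeds alpha, however long the sequence runs.
    (Illustrative choice of gamma; not taken from the paper.)
    """
    decisions = []
    for i, p in enumerate(p_values, start=1):
        level_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        # Each decision uses only the current p-value and its index,
        # so nothing about future hypotheses is needed.
        decisions.append(p <= level_i)
    return decisions


if __name__ == "__main__":
    print(online_bonferroni([0.001, 0.5, 0.0001]))
```

      Because the per-test levels shrink as 1/i^2, later hypotheses face a much stricter threshold. This loss of power is what adaptive online procedures aim to reduce.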
  • Continuous(ly) missing outcome data in network meta-analysis: A one-stage
           pattern-mixture model approach
    • Authors: Loukia M Spineli, Chrysostomos Kalyvas, Katerina Papadimitropoulou
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Appropriate handling of aggregate missing outcome data is necessary to minimise bias in the conclusions of systematic reviews. The two-stage pattern-mixture model has already been proposed to address aggregate missing continuous outcome data. While this approach is more appropriate than the exclusion of missing continuous outcome data and simple imputation methods, it does not offer flexible modelling of missing continuous outcome data to investigate their implications for the conclusions thoroughly. Therefore, we propose a one-stage pattern-mixture model approach under the Bayesian framework to address missing continuous outcome data in a network of interventions and gain knowledge about the missingness process in different trials and interventions. We extend the hierarchical network meta-analysis model for one aggregate continuous outcome to incorporate a missingness parameter that measures the departure from the missing at random assumption. We consider various effect size estimates for continuous data, and two informative missingness parameters, the informative missingness difference of means and the informative missingness ratio of means. We incorporate our prior belief about the missingness parameters while allowing for several possibilities of prior structures to account for the fact that the missingness process may differ in the network. The method is exemplified in two networks from published reviews comprising different amounts of missing continuous outcome data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2021-01-07T05:50:47Z
      DOI: 10.1177/0962280220983544
  • Class imbalance in gradient boosting classification algorithms:
           Application to experimental stroke data
    • Authors: Olga Lyashevska, Fiona Malone, Eugene MacCarthy, Jens Fiehler, Jan-Hendrik Buhk, Liam Morris
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Imbalance between positive and negative outcomes, a so-called class imbalance, is a problem generally found in medical data. Imbalanced data hinder the performance of conventional classification methods, which aim to improve the overall accuracy of the model without accounting for the uneven distribution of the classes. To rectify this, the data can be resampled by oversampling the positive (minority) class until the classes are approximately equally represented. After that, a prediction model such as a gradient boosting algorithm can be fitted with greater confidence. This classification method allows for non-linear relationships and deep interactive effects while focusing on difficult areas by iteratively shifting towards problematic observations. In this study, we demonstrate the application of these methods to medical data and develop a practical framework for evaluating the features contributing to the probability of stroke.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-28T08:12:12Z
      DOI: 10.1177/0962280220980484
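      The resampling step described in the abstract above (oversampling the minority class until the classes are equally represented) can be sketched as follows. This shows plain random oversampling with replacement and is not necessarily the resampling scheme used in the paper. The function name `oversample_minority` and its arguments are hypothetical. In practice the resampling would be applied to the training split only, so that duplicated rows do not leak into the evaluation data.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    # most_common sorts by count, so the second entry is the minority class.
    (_, n_major), (minority, n_minor) = counts.most_common(2)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Draw (with replacement) as many extra minority rows as are missing.
    extra = [rng.choice(minority_idx) for _ in range(n_major - n_minor)]
    new_rows = list(rows) + [rows[i] for i in extra]
    new_labels = list(labels) + [minority] * len(extra)
    return new_rows, new_labels


if __name__ == "__main__":
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    X_bal, y_bal = oversample_minority(X, y)
    print(Counter(y_bal))
```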
  • A two-stage Generalized Method of Moments model for feedback with
           time-dependent covariates
    • Authors: Elsa Vazquez-Arreola
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Correlated observations in longitudinal studies are often due to repeated measures on the subjects. Additionally, correlation may be realized due to the association between responses at a particular time and the predictors at earlier times. There are also feedback effects (relation between responses in the present and the covariates at a later time), though these are not always relevant and are often ignored. All these cases of correlation must be accounted for as they can have different effects on the regression coefficients. Several authors have provided models that reflect the direct and delayed impact of covariates on the response, utilizing valid moment conditions to estimate the relevant regression coefficients. However, there are applications when one cannot ignore the effect of the responses on future covariates. A two-stage model to account for the feedback, modeling the direct as well as the delayed effects of the covariates on future responses and vice versa is presented. The use of the two-stage model is demonstrated by revisiting child morbidity and its impact on future values of body mass index using Philippines health data. Also, obesity status and its feedback effects on physical activity and depression levels using the Add Health dataset are analyzed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-28T03:53:56Z
      DOI: 10.1177/0962280220981402
  • A group sequential design and sample size estimation for an immunotherapy
           trial with a delayed treatment effect
    • Authors: Bosheng Li, Liwen Su, Jun Gao, Liyun Jiang, Fangrong Yan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      A delayed treatment effect is often observed in the confirmatory trials for immunotherapies and is reflected by a delayed separation of the survival curves of the immunotherapy groups versus the control groups. This phenomenon makes the design based on the log-rank test not applicable because this design would violate the proportional hazard assumption and cause loss of power. Thus, we propose a group sequential design allowing early termination on the basis of efficacy based on a more powerful piecewise weighted log-rank test for an immunotherapy trial with a delayed treatment effect. We present an approach on the group sequential monitoring, in which the information time is defined based on the number of events occurring after the delay time. Furthermore, we developed a one-dimensional search algorithm to determine the required maximum sample size for the proposed design, which uses an analytical estimation obtained by the inflation factor as an initial value and an empirical power function calculated by a simulation-based procedure as an objective function. In the simulation, we tested the unstable accuracy of the analytical estimation, the consistent accuracy of the maximum sample size determined by the search algorithm and the advantages of the proposed design on saving sample size.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-24T02:53:05Z
      DOI: 10.1177/0962280220980780
  • CWL: A conditional weighted likelihood method to account for the delayed
           joint toxicity–efficacy outcomes for phase I/II clinical trials
    • Authors: Yifei Zhang, Yong Zang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The delayed outcome issue is common in early phase dose-finding clinical trials. This problem becomes more intractable in phase I/II clinical trials because both toxicity and efficacy responses are subject to the delayed outcome issue. Existing methods for phase I trials cannot be used directly in phase I/II trials because they cannot model the joint toxicity–efficacy distribution. In this paper, we propose a conditional weighted likelihood (CWL) method to circumvent this issue. The key idea of the CWL method is to decompose the joint probability into the product of marginal and conditional probabilities and then weight each probability based on each patient’s actual follow-up time. The CWL method makes no parametric model assumption on either the dose–response curve or the toxicity–efficacy correlation and therefore can be applied to any existing phase I/II trial design. Numerical trial applications show that the proposed CWL method yields desirable operating characteristics.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-22T04:02:22Z
      DOI: 10.1177/0962280220979328
  • Inferring median survival differences in general factorial designs via
           permutation tests
    • Authors: Marc Ditzhaus, Dennis Dobler, Markus Pauly
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Factorial survival designs with right-censored observations are commonly analyzed using Cox regression and explained by means of hazard ratios. However, in the case of non-proportional hazards, their interpretation can become cumbersome, especially for clinicians. We therefore offer an alternative: median survival times are used to estimate treatment and interaction effects and null hypotheses are formulated in contrasts of their population versions. Permutation-based tests and confidence regions are proposed and shown to be asymptotically valid. Their type-1 error control and power behavior are investigated in extensive simulations, showing the new methods’ wide applicability. The latter is complemented by an illustrative data analysis.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-22T03:58:03Z
      DOI: 10.1177/0962280220980784
  • Two-phase analysis and study design for survival models with error-prone
    • Authors: Kyunghee Han, Thomas Lumley, Bryan E Shepherd, Pamela A Shaw
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Increasingly, medical research is dependent on data collected for non-research purposes, such as electronic health records data. Health records data and other large databases can be prone to measurement error in key exposures, and unadjusted analyses of error-prone data can bias study results. Validating a subset of records is a cost-effective way of gaining information on the error structure, which in turn can be used to adjust analyses for this error and improve inference. We extend the mean score method for the two-phase analysis of discrete-time survival models, which uses the unvalidated covariates as auxiliary variables that act as surrogates for the unobserved true exposures. This method relies on a two-phase sampling design and an estimation approach that preserves the consistency of complete case regression parameter estimates in the validated subset, with increased precision leveraged from the auxiliary data. Furthermore, we develop optimal sampling strategies which minimize the variance of the mean score estimator for a target exposure under a fixed cost constraint. We consider the setting where an internal pilot is necessary for the optimal design so that the phase two sample is split into a pilot and an adaptive optimal sample. Through simulations and a data example, we evaluate efficiency gains of the mean score estimator using the derived optimal validation design compared to balanced and simple random sampling for the phase two sample. We also empirically explore efficiency gains that the proposed discrete optimal design can provide for the Cox proportional hazards model in the setting of a continuous-time survival outcome.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-17T05:18:04Z
      DOI: 10.1177/0962280220978500
  • Probability intervals of toxicity and efficacy design for dose-finding
           clinical trials in oncology
    • Authors: Xiaolei Lin, Yuan Ji
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Immunotherapy, gene therapy or adoptive cell therapies, such as the chimeric antigen receptor+ T-cell therapies, have demonstrated promising therapeutic effects in oncology patients. We consider statistical designs for dose-finding adoptive cell therapy trials, in which the monotonic dose–response relationship assumed in traditional oncology trials may not hold. Building upon a previous design called “TEPI”, we propose a new dose finding method – Probability Intervals of Toxicity and Efficacy (PRINTE), which utilizes toxicity and efficacy jointly in making dosing decisions, does not require a pre-elicited decision table and at the same time can handle Ockham’s razor properly in the statistical inference. We show that optimizing the joint posterior expected utility of toxicity and efficacy under a 0–1 loss is equivalent to maximizing the marginal model posterior probability in the two-dimensional probability space. An extensive simulation study under various scenarios is conducted, and the results show that PRINTE outperforms existing designs in the literature, since it assigns more patients to optimal doses and fewer to toxic ones, and selects optimal doses with higher percentages. The simple and transparent features together with good operating characteristics make PRINTE an improved design for dose-finding trials in oncology.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-17T04:33:02Z
      DOI: 10.1177/0962280220977009
  • Bayesian variable selection in logistic regression with application to
           whole-brain functional connectivity analysis for Parkinson’s disease
    • Authors: Xuan Cao, Kyoungjae Lee, Qingling Huang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Parkinson’s disease is a progressive, chronic, and neurodegenerative disorder that is primarily diagnosed by clinical examinations and magnetic resonance imaging (MRI). In this paper, we propose a Bayesian model to predict Parkinson’s disease employing a functional MRI (fMRI) based radiomics approach. We consider a spike and slab prior for variable selection in high-dimensional logistic regression models, and present an approximate Gibbs sampler by replacing a logistic distribution with a t-distribution. Under mild conditions, we establish model selection consistency of the induced posterior and show through simulation studies that the proposed method outperforms existing state-of-the-art methods. In fMRI analysis, 6216 whole-brain functional connectivity features are extracted for 50 healthy controls along with 70 Parkinson’s disease patients. We apply our method to the resulting dataset and further show its benefits with a higher average prediction accuracy of 0.83 compared to other contenders based on 10 random splits. The model fitting procedure also reveals the most discriminative brain regions for Parkinson’s disease. These findings demonstrate that the proposed Bayesian variable selection method has the potential to support radiological diagnosis for patients with Parkinson’s disease.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-14T03:12:52Z
      DOI: 10.1177/0962280220978990
  • Concordance probability as a meaningful contrast across disparate survival
    • Authors: Sean M Devlin, Glenn Heller
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The performance of time-to-event models is frequently assessed in part by estimating the concordance probability, which evaluates the probabilistic pairwise ordering of the model-based risk scores and survival times. The standard definition of this probability conditions on any survival time pair ordering, irrespective of whether the times are meaningfully separated. Inclusion of survival times that would be deemed clinically similar attenuates the concordance and moves the estimate away from the contrast-of-interest: comparing the risk scores between individuals with disparate survival times. In this manuscript, we propose a concordance definition and corresponding method to estimate the probability conditional on survival times being separated by at least a minimum difference. The proposed estimate requires direct input from the analyst to identify a separable survival region and, in doing so, is analogous to the clinically defined subgroups used for binary outcome area under the curve estimates. The method is illustrated in two cancer examples: a prognostic score in clear cell renal cell carcinoma and two biomarkers in metastatic prostate cancer.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-10T04:32:55Z
      DOI: 10.1177/0962280220973694
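      The idea in the abstract above is to restrict the concordance calculation to pairs whose survival times differ by at least a minimum amount. A naive sketch of that restriction follows. It assumes fully observed (uncensored) survival times, which the paper's estimator does not require, so it should be read as an illustration of the definition only. The function name `separated_concordance` and the parameter `min_gap` are hypothetical.

```python
def separated_concordance(risk, time, min_gap):
    """Concordance restricted to pairs with clearly different survival times.

    Among pairs whose survival times differ by at least min_gap, return the
    share in which the subject with the shorter survival has the higher risk
    score; ties in risk score count as one half.
    Assumption: no censoring (every survival time is fully observed).
    """
    if min_gap <= 0:
        raise ValueError("min_gap must be positive")
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(time[i] - time[j]) < min_gap:
                continue  # times too close to be a meaningful contrast
            usable += 1
            shorter, longer = (i, j) if time[i] < time[j] else (j, i)
            if risk[shorter] > risk[longer]:
                concordant += 1.0
            elif risk[shorter] == risk[longer]:
                concordant += 0.5
    if usable == 0:
        raise ValueError("no pair of survival times is separated by min_gap")
    return concordant / usable


if __name__ == "__main__":
    print(separated_concordance([3, 2, 1], [1, 2, 10], min_gap=5))
```

      With `min_gap=5`, the pair with times 1 and 2 is excluded from both numerator and denominator, so it neither rewards nor penalizes the risk score.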
  • Efficient and flexible simulation-based sample size determination for
           clinical trials with multiple design parameters
    • Authors: Duncan T Wilson, Richard Hooper, Julia Brown, Amanda J Farrin, Rebecca EA Walwyn
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Simulation offers a simple and flexible way to estimate the power of a clinical trial when analytic formulae are not available. The computational burden of using simulation has, however, restricted its application to only the simplest of sample size determination problems, often minimising a single parameter (the overall sample size) subject to power being above a target level. We describe a general framework for solving simulation-based sample size determination problems with several design parameters over which to optimise and several conflicting criteria to be minimised. The method is based on an established global optimisation algorithm widely used in the design and analysis of computer experiments, using a non-parametric regression model as an approximation of the true underlying power function. The method is flexible, can be used for almost any problem for which power can be estimated using simulation, and can be implemented using existing statistical software packages. We illustrate its application to a sample size determination problem involving complex clustering structures, two primary endpoints and small sample considerations.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-03T06:37:06Z
      DOI: 10.1177/0962280220975790
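      As a point of reference for the abstract above, the simplest version of simulation-based power estimation can be sketched as follows: simulate many trials under an assumed effect and count how often the test rejects. This is only the single-parameter building block, not the paper's multi-parameter optimisation framework. The two-arm design with normal outcomes, the known common standard deviation, the z-test, and the function name `simulated_power` are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def simulated_power(n_per_arm, effect, sd=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Monte Carlo power of a two-sided z-test comparing two arm means.

    Assumes normal outcomes with a known common standard deviation `sd`
    and a true mean difference of `effect` between the arms.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(2.0 / n_per_arm)  # standard error of the mean difference
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    # The rejection rate across simulated trials estimates the power.
    return rejections / n_sims


if __name__ == "__main__":
    print(simulated_power(n_per_arm=50, effect=0.5))
```

      A sample size search in this single-parameter setting would increase `n_per_arm` until the estimate exceeds the target power. Each evaluation costs `n_sims` simulated trials, which is the computational burden the abstract refers to.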
  • Statistical design considerations for trials that study multiple
    • Authors: Alexander M Kaizer, Joseph S Koopmeiners, Nan Chen, Brian P Hobbs
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Breakthroughs in cancer biology have defined new research programs emphasizing the development of therapies that target specific pathways in tumor cells. Innovations in clinical trial design have followed with master protocols defined by inclusive eligibility criteria and evaluations of multiple therapies and/or histologies. Consequently, characterization of subpopulation heterogeneity has become central to the formulation and selection of a study design. However, this transition to master protocols has led to challenges in identifying the optimal trial design and proper calibration of hyperparameters. We often evaluate a range of null and alternative scenarios; however, there has been little guidance on how to synthesize the potentially disparate recommendations for what may be optimal. This may lead to the selection of suboptimal designs and statistical methods that do not fully accommodate the subpopulation heterogeneity. This article proposes novel optimization criteria for calibrating and evaluating candidate statistical designs of master protocols in the presence of the potential for treatment effect heterogeneity among enrolled patient subpopulations. The framework is applied to demonstrate the statistical properties of conventional study designs when treatments offer heterogeneous benefit as well as identify optimal designs devised to monitor the potential for heterogeneity among patients with differing clinical indications using Bayesian modeling.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-03T06:33:27Z
      DOI: 10.1177/0962280220975187
  • Joint analysis of multivariate interval-censored survival data and a
           time-dependent covariate
    • Authors: Di Wu, Chenxi Li
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We develop a joint modeling method for multivariate interval-censored survival data and a time-dependent covariate that is intermittently measured with error. The joint model is estimated using nonparametric maximum likelihood estimation, which is carried out via an expectation–maximization algorithm, and the inference for finite-dimensional parameters is performed using the bootstrap. We also develop a similar joint modeling method for univariate interval-censored survival data and a time-dependent covariate, which outperforms existing methods in terms of model flexibility and interpretation. Simulation studies show that the model fitting and inference approaches perform very well under realistic sample sizes. We apply the method to a longitudinal study of dental caries in African-American children from low-income families in the city of Detroit, Michigan.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:55:48Z
      DOI: 10.1177/0962280220975064
  • Unbiasedness and efficiency of non-parametric and UMVUE estimators of the
           probabilistic index and related statistics
    • Authors: Johan Verbeeck, Vaiva Deltuvaite-Thomas, Ben Berckmoes, Tomasz Burzykowski, Marc Aerts, Olivier Thas, Marc Buyse, Geert Molenberghs
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In reliability theory, diagnostic accuracy, and clinical trials, the quantity [math], also known as the Probabilistic Index (PI), is a common treatment effect measure when comparing two groups of observations. The quantity [math], a linear transformation of PI known as the net benefit, has also been advocated as an intuitively appealing treatment effect measure. Parametric estimation of PI has received a lot of attention in the past 40 years, with the formulation of the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) for many distributions. However, the non-parametric Mann–Whitney estimator of the PI is also known to be UMVUE in some situations. To understand this seeming contradiction, in this paper a systematic comparison is performed between the non-parametric estimator for the PI and parametric UMVUE estimators in various settings. We show that the Mann–Whitney estimator is always an unbiased estimator of the PI with univariate, completely observed data, while the parametric UMVUE is not when the distribution is misspecified. Additionally, the Mann–Whitney estimator is the UMVUE when observations belong to an unrestricted family. When observations come from a more restrictive family of distributions, the loss in efficiency for the non-parametric estimator is limited in realistic clinical scenarios. In conclusion, the Mann–Whitney estimator is simple to use and is a reliable estimator for the PI and net benefit in realistic clinical scenarios.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:52:11Z
      DOI: 10.1177/0962280220966629
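      Illustration (not from the paper): the formulae in the abstract above were lost in extraction, so the sketch below assumes the usual definitions, namely PI = P(X < Y) + 0.5 P(X = Y) and net benefit = 2 PI - 1. Under those assumptions the Mann–Whitney estimator is the proportion of (x, y) pairs in which y is larger, with ties counted as one half. The function names are hypothetical.

```python
def probabilistic_index(x, y):
    """Mann-Whitney estimate of P(X < Y) + 0.5 * P(X = Y) from two samples."""
    # Score every (xi, yj) pair: 1 if xi < yj, 0.5 if tied, 0 otherwise.
    score = sum((xi < yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return score / (len(x) * len(y))


def net_benefit(x, y):
    """Net benefit as a linear transformation of the probabilistic index (assumed 2*PI - 1)."""
    return 2 * probabilistic_index(x, y) - 1


if __name__ == "__main__":
    control = [1, 2, 3]
    treated = [2, 3, 4]
    print(probabilistic_index(control, treated))
    print(net_benefit(control, treated))
```

      No distributional assumption enters the calculation, which is the property the abstract relies on when it compares this estimator with parametric alternatives under misspecification.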
  • Monte Carlo approaches to frequentist multiplicity-adjusted benefiting
           subgroup identification
    • Authors: Patrick M Schnell
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      One common goal of subgroup analyses is to determine the subgroup of the population for which a given treatment is effective. Like most problems in subgroup analyses, this benefiting subgroup identification requires careful attention to multiple testing considerations, especially Type I error inflation. To partially address these concerns, the credible subgroups approach provides a pair of bounding subgroups for the benefiting subgroup, constructed so that with high posterior probability one is contained by the benefiting subgroup while the other contains the benefiting subgroup. To date, this approach has been presented within the Bayesian paradigm only, and requires sampling from the posterior of a Bayesian model. Additionally, in many cases, such as regulatory submission, guarantees of frequentist operating characteristics are helpful or necessary. We present Monte Carlo approaches to constructing confidence subgroups, frequentist analogues to credible subgroups that replace the posterior distribution with an estimate of the joint distribution of personalized treatment effect estimates, and yield frequentist interpretations and coverage guarantees. The estimated joint distribution is produced using either draws from asymptotic sampling distributions of estimated model parameters, or bootstrap resampling schemes. The approach is applied to a publicly available dataset from randomized trials of Alzheimer’s disease treatments.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-12-01T08:48:29Z
      DOI: 10.1177/0962280220973705
  • Bayesian mixture cure rate frailty models with an application to gastric
           cancer data
    • Authors: Ali Karamoozian, Mohammad Reza Baneshi, Abbas Bahrampour
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Mixture cure rate models are commonly used to analyze lifetime data with long-term survivors. Frailty models, in turn, lead to accurate estimation of coefficients by accounting for heterogeneity in survival data. The gamma distribution is the most common choice for the frailty random variable. However, for survival data from populations with a cure rate, a discrete distribution for the frailty random variable may be preferable to a continuous one. We therefore propose two models in this study: the first uses a continuous gamma distribution for the frailty random variable, and the second uses a discrete hyper-Poisson distribution. Bayesian inference was carried out with the Weibull distribution and the generalized modified Weibull distribution as the baseline distributions in the first and second models, respectively. We used data from patients with gastric cancer to illustrate the application of these models to real data. The parameters and regression coefficients were estimated using the Metropolis with Gibbs sampling algorithm, one of the key techniques in Markov chain Monte Carlo simulation. A simulation study was also used to evaluate the performance of the Bayesian estimates. Based on the results of the Bayesian inference, the model with generalized modified Weibull and hyper-Poisson distributions is a suitable model for practical use and fits better than the model with Weibull and gamma distributions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-27T05:53:48Z
      DOI: 10.1177/0962280220974699
  • Bayesian adaptive decision-theoretic designs for multi-arm multi-stage
           clinical trials
    • Authors: Andrea Bassi, Johannes Berkhof, Daphne de Jong, Peter M van de Ven
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Multi-arm multi-stage clinical trials in which more than two drugs are simultaneously investigated provide gains over separate single- or two-arm trials. In this paper we propose a generic Bayesian adaptive decision-theoretic design for multi-arm multi-stage clinical trials with K ([math]) arms. The basic idea is that after each stage a decision about continuation of the trial and accrual of patients for an additional stage is made on the basis of the expected reduction in loss. For this purpose, we define a loss function that incorporates the patient accrual costs as well as costs associated with an incorrect decision at the end of the trial. An attractive feature of our loss function is that its estimation is computationally undemanding, even when K > 2. We evaluate the frequentist operating characteristics for settings with a binary outcome and multiple experimental arms. We consider the situations both with and without a control arm. In a simulation study, we show that our design increases the probability of making a correct decision at the end of the trial as compared to nonadaptive designs and adaptive two-stage designs.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-27T05:48:27Z
      DOI: 10.1177/0962280220973697
  • Employing a latent variable framework to improve efficiency in composite
           endpoint analysis
    • Authors: Martina McMenamin, Jessica K Barrett, Anna Berglind, James MS Wason
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Composite endpoints that combine multiple outcomes on different scales are common in clinical trials, particularly in chronic conditions. In many of these cases, patients will have to cross a predefined responder threshold in each of the outcomes to be classed as a responder overall. One instance of this occurs in systemic lupus erythematosus, where the responder endpoint combines two continuous, one ordinal and one binary measure. The overall binary responder endpoint is typically analysed using logistic regression, resulting in a substantial loss of information. We propose a latent variable model for the systemic lupus erythematosus endpoint, which assumes that the discrete outcomes are manifestations of latent continuous measures and can proceed to jointly model the components of the composite. We perform a simulation study and find that the method offers large efficiency gains over the standard analysis, the magnitude of which is highly dependent on the components driving response. Bias is introduced when joint normality assumptions are not satisfied, which we correct for using a bootstrap procedure. The method is applied to the Phase IIb MUSE trial in patients with moderate to severe systemic lupus erythematosus. We show that it estimates the treatment effect 2.5 times more precisely, offering a 60% reduction in required sample size.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-25T01:57:32Z
      DOI: 10.1177/0962280220970986
  • Small sample sizes: A big data problem in high-dimensional data analysis
    • Authors: Frank Konietschke, Karima Schwab, Markus Pauly
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In many experiments and especially in translational and preclinical research, sample sizes are (very) small. In addition, data designs are often high dimensional, i.e. more dependent than independent replications of the trial are observed. The present paper discusses the applicability of max t-test-type statistics (multiple contrast tests) in high-dimensional designs (repeated measures or multivariate) with small sample sizes. A randomization-based approach is developed to approximate the distribution of the maximum statistic. Extensive simulation studies confirm that the new method is particularly suitable for analyzing data sets with small sample sizes. A real data set illustrates the application of the methods.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-24T09:53:35Z
      DOI: 10.1177/0962280220970228
  • Unifying instrumental variable and inverse probability weighting
           approaches for inference of causal treatment effect and unmeasured
           confounding in observational studies
    • Authors: Tao Liu, Joseph W Hogan
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Confounding is a major concern when using data from observational studies to infer the causal effect of a treatment. Instrumental variables, when available, have been used to construct bound estimates on population average treatment effects when outcomes are binary and unmeasured confounding exists. With continuous outcomes, meaningful bounds are more challenging to obtain because the domain of the outcome is unrestricted. In this paper, we propose to unify the instrumental variable and inverse probability weighting methods, together with suitable assumptions in the context of an observational study, to construct meaningful bounds on causal treatment effects. The contextual assumptions are imposed in terms of the potential outcomes that are partially identified by data. The inverse probability weighting component incorporates a sensitivity parameter to encode the effect of unmeasured confounding. The instrumental variable and inverse probability weighting methods are unified using the principal stratification. By solving the resulting system of estimating equations, we are able to quantify both the causal treatment effect and the sensitivity parameter (i.e. the degree of the unmeasured confounding). We demonstrate our method by analyzing data from the HIV Epidemiology Research Study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-20T07:08:09Z
      DOI: 10.1177/0962280220971835
  • Functional clustering methods for longitudinal data with application to
           electronic health records
    • Authors: Bret Zeldow, James Flory, Alisa Stephens-Shields, Marsha Raebel, Jason A Roy
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We develop a method to estimate subject-level trajectory functions from longitudinal data. The approach can be used for patient phenotyping, feature extraction, or, as in our motivating example, outcome identification, which refers to the process of identifying disease status through patient laboratory tests rather than through diagnosis codes or prescription information. We model the joint distribution of a continuous longitudinal outcome and baseline covariates using an enriched Dirichlet process prior. This joint model decomposes into (local) semiparametric linear mixed models for the outcome given the covariates and simple (local) marginals for the covariates. The nonparametric enriched Dirichlet process prior is placed on the regression and spline coefficients, the error variance, and the parameters governing the predictor space. This leads to clustering of patients based on their outcomes and covariates. We predict the outcome at unobserved time points for subjects with data at other time points as well as for new subjects with only baseline covariates. We find improved prediction over mixed models with Dirichlet process priors when there are a large number of covariates. Our method is demonstrated with electronic health records consisting of initiators of second-generation antipsychotic medications, which are known to increase the risk of diabetes. We use our model to predict laboratory values indicative of diabetes for each individual and assess incidence of suspected diabetes from the predicted dataset.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-12T04:23:58Z
      DOI: 10.1177/0962280220965630
  • Selecting the number of categories of the lymph node ratio in cancer
           research: A bootstrap-based hypothesis test
    • Authors: Irantzu Barrio, Javier Roca-Pardiñas, Inmaculada Arostegui
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The high impact of the lymph node ratio as a prognostic factor is widely established in colorectal cancer, and it is used as a categorized predictor variable in several studies. However, the cut-off points as well as the number of categories considered differ considerably in the literature. Motivated by the need to obtain the best categorization of the lymph node ratio as a predictor of mortality in colorectal cancer patients, we propose a method to select the best number of categories for a continuous variable in a logistic regression framework. To this end, we propose a bootstrap-based hypothesis test, together with a new estimation algorithm for the optimal location of the cut-off points called BackAddFor, which is an updated version of the previously proposed AddFor algorithm. The performance of the hypothesis test was evaluated by means of a simulation study, under different scenarios, yielding type I errors close to the nominal errors and good power values whenever a meaningful difference in terms of prediction ability existed. Finally, the methodology proposed was applied to the CCR-CARESS study where the lymph node ratio was included as a predictor of five-year mortality, resulting in the selection of three categories.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-10T03:19:39Z
      DOI: 10.1177/0962280220965631
  • Flexible derivative time-varying model in matched case-crossover studies
           for a small number of geographical locations among the participants
    • Authors: Ana M Ortega-Villa, Inyoung Kim
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In matched case-crossover studies, any stratum effect is removed by conditioning on the fixed number of case–control sets in the stratum, and hence, the conditional logistic regression model is not able to detect any effects associated with matching covariates. However, some matching covariates such as time and location often modify the effect of covariates, making the estimations obtained by conditional logistic regression incorrect. Therefore, in this paper, we propose a flexible derivative time-varying coefficient model to evaluate effect modification by time and location, in order to make correct statistical inference, when the number of locations is small. Our proposed model is developed under the Bayesian hierarchical model framework and allows us to simultaneously detect relationships between the predictor and binary outcome and between the predictor and time. Inference is proposed based on the derivative function of the estimated function to determine whether there is an effect modification due to time and/or location, for a small number of locations among the participants. We demonstrate the accuracy of the estimation using a simulation study and an epidemiological example of a 1–4 bidirectional case-crossover study of childhood aseptic meningitis with drinking water turbidity.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-04T02:42:26Z
      DOI: 10.1177/0962280220968178
  • Random changepoint segmented regression with smooth transition
    • Authors: Julio M Singer, Francisco MM Rocha, Antonio Carlos Pedroso-de-Lima, Giovani L Silva, Giuliana C Coatti, Mayana Zatz
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We consider random changepoint segmented regression models to analyse data from a study conducted to verify whether treatment with stem cells may delay the onset of a symptom of amyotrophic lateral sclerosis in genetically modified mice. The proposed models capture the biological aspects of the data, accommodating a smooth transition between the periods with and without symptoms. An additional changepoint is considered to avoid negative predicted responses. Given the nonlinear nature of the model, we propose an algorithm to estimate the fixed parameters and to predict the random effects by fitting linear mixed models iteratively via standard software. We compare the variances obtained in the final step with bootstrapped and robust ones.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-11-04T02:36:06Z
      DOI: 10.1177/0962280220964953
  • Development of a mixture model allowing for smoothing functions of
           longitudinal trajectories
    • Authors: Ming Ding, Jorge E. Chavarro, Garrett M. Fitzmaurice
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In the health and social sciences, two types of mixture models have been widely used by researchers to identify participants within a population with heterogeneous longitudinal trajectories: latent class growth analysis and the growth mixture model. Both methods parametrically model trajectories of individuals, and capture latent trajectory classes, using an expectation–maximization algorithm. However, parametric modeling of trajectories using polynomial functions or monotonic spline functions results in limited flexibility for modeling trajectories; as a result, group membership may not be classified accurately due to model misspecification. In this paper, we propose a smoothing mixture model allowing for smoothing functions of trajectories using a modified algorithm in the M step. Specifically, participants are reassigned to only one group for which the estimated trajectory is the most similar to the observed one; trajectories are fitted using generalized additive mixed models with smoothing functions of time within each of the resulting subsamples. The smoothing mixture model is straightforward to implement using the recently released “gamm4” package (version 0.2–6) in R 3.5.0. It can incorporate time-varying covariates and be applied to longitudinal data with any exponential family distribution, e.g., normal, Bernoulli, and Poisson. Simulation results show favorable performance of the smoothing mixture model, when compared to latent class growth analysis and growth mixture model, in recovering highly flexible trajectories. The proposed method is illustrated by its application to body mass index data on individuals followed from adolescence to young adulthood and its relationship with incidence of cardiometabolic disease.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-27T08:47:34Z
      DOI: 10.1177/0962280220966019
  • Inference about age-standardized rates with sampling errors in the
    • Authors: Jiming Jiang, Eric J Feuer, Yuanyuan Li, Thuan Nguyen, Mandi Yu
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Cancer incidence and mortality are typically presented as age-standardized rates. Inference about these rates becomes complicated when denominators involve sampling errors. We propose a bias-corrected rate estimator as well as its corresponding variance estimator that take into account sampling errors in the denominators. Confidence intervals are derived based on the proposed estimators as well. The performance of the proposed methods is evaluated empirically based on simulation studies. More importantly, the advantage of the proposed method is demonstrated and verified in a real-life study of cancer mortality disparity. A web-based, user-friendly computational tool is also being developed at the National Cancer Institute to accompany the new methods, with the first application being the calculation of cancer mortality rates by US-born and foreign-born status. Finally, the promise of the proposed estimators to account for errors introduced by differential privacy procedures to the 2020 decennial census products is discussed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-16T03:43:57Z
      DOI: 10.1177/0962280220962516
  • Reference range: Which statistical intervals to use?
    • Authors: Wei Liu, Frank Bretz, Mario Cortina-Borja
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Reference ranges, which are data-based intervals aiming to contain a pre-specified large proportion of the population values, are powerful tools to analyse observations in clinical laboratories. Their main purpose is to classify any future observations from the population which fall outside them as atypical and thus possibly warranting further investigation. As a reference range is constructed from a random sample from the population, the event ‘a reference range contains [math] of the population’ is also random. Hence, all we can hope for is that such an event has a large occurrence probability. In this paper we argue that some intervals, including the P prediction interval, are not suitable as reference ranges since there is a substantial probability that these intervals contain less than [math] of the population, especially when the sample size is large. In contrast, a [math] tolerance interval is designed to contain [math] of the population with a pre-specified large confidence γ, so it is well suited as a reference range. An example based on real data illustrates the paper’s key points.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-15T03:27:06Z
      DOI: 10.1177/0962280220961793
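      Illustration (not from the paper): the claim above that a prediction interval often contains less than the target proportion of the population can be checked by simulation. The sketch assumes a standard normal population, a target proportion of 0.95 and, as a loudly stated simplification, a normal quantile in place of the Student t quantile in the prediction interval (reasonable only for large n). The function names and settings are hypothetical, and the tolerance interval recommended by the authors is not implemented.

```python
import random
from statistics import NormalDist, fmean, stdev


def content_of_prediction_interval(n, p, rng):
    """Population proportion covered by an approximate prediction interval.

    Draws a sample of size n from N(0, 1), builds mean +/- z * s * sqrt(1 + 1/n),
    and returns the true N(0, 1) probability mass inside that interval.
    """
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m, s = fmean(sample), stdev(sample)
    z = NormalDist().inv_cdf((1 + p) / 2)  # normal quantile used instead of t (assumption)
    half_width = z * s * (1 + 1 / n) ** 0.5
    population = NormalDist(0.0, 1.0)
    return population.cdf(m + half_width) - population.cdf(m - half_width)


def prob_content_below(n=200, p=0.95, n_sims=2000, seed=7):
    """Estimate the probability that the interval covers less than p of the population."""
    rng = random.Random(seed)
    below = sum(content_of_prediction_interval(n, p, rng) < p for _ in range(n_sims))
    return below / n_sims


if __name__ == "__main__":
    print(prob_content_below())
```

      Under these assumptions the interval falls short of the target proportion in a large share of the simulated samples. A tolerance interval is constructed to keep that probability below 1 - γ.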
  • Joint analysis of recurrence and termination: A Bayesian latent class
    • Authors: Zhixing Xu, Debajyoti Sinha, Jonathan R Bradley
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      As in many other clinical and economic studies, each subject in our motivating transplant study is at risk of recurrent events of non-fatal tissue rejection as well as the terminating event of death due to total graft rejection. For such studies, our model and associated Bayesian analysis aim for some practical advantages over competing methods. Our semiparametric latent-class-based joint model offers a coherent interpretation of the covariate (including race and gender) effects on all functions and model quantities that are relevant for understanding the effects of covariates on future event trajectories. Our fully Bayesian method for estimation and prediction uses a complete specification of the prior process of the baseline functions. We also derive a practical and theoretically justifiable partial likelihood-based semiparametric Bayesian approach to deal with the analysis when there is a lack of prior information about baseline functions. Our model and method can accommodate fixed as well as time-varying covariates. Our Markov chain Monte Carlo tools for both Bayesian methods are implementable via publicly available software. Our Bayesian analysis of the transplant study and a simulation study demonstrate practical advantages and improved performance of our approach.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-14T04:02:57Z
      DOI: 10.1177/0962280220962522
  • Sample size and sample composition for constructing growth reference
    • Authors: TJ Cole
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Growth reference centile charts are widely used in child health to assess weight, height and other age-varying measurements. The centiles are easy to construct from reference data, using the LMS method or GAMLSS (Generalised Additive Models for Location Scale and Shape). However, there is as yet no clear guidance on how to design such studies, and in particular how many reference data to collect, and this has led to study sizes varying widely. The paper aims to provide a theoretical framework for optimally designing growth reference studies based on cross-sectional data. Centiles for weight, height, body mass index and head circumference, in 6878 boys aged 0–21 years from the Fourth Dutch Growth Study, were fitted using GAMLSS. The effect on precision of varying the sample size and the distribution of measurement ages (sample composition) was explored by fitting a series of GAMLSS models to simulated data. Sample composition was defined as uniform on the age^λ scale, where λ was chosen to give constant precision across the age range. Precision was measured on the z-score scale, and was the same for all four measurements, with a standard error of 0.041 z-score units for the median and 0.066 for the 2nd and 98th centiles. Compared to a naïve calculation, the process of smoothing the centiles increased the notional sample size two- to threefold by ‘borrowing strength’. The sample composition for estimating the median curve was optimal for λ=0.4, reflecting considerable over-sampling of infants compared to children. However, for the 2nd and 98th centiles, λ=0.75 was optimal, with less infant over-sampling. The conclusion is that both sample size and sample composition need to be optimised. The paper provides practical advice on design, and concludes that optimally designed studies need 7000–25,000 subjects per sex.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-12T12:08:55Z
      DOI: 10.1177/0962280220958438
  • A weighting method for simultaneous adjustment for confounding and joint
           exposure-outcome misclassifications
    • Authors: Bas BL Penning de Vries, Maarten van Smeden, Rolf HH Groenwold
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Joint misclassification of exposure and outcome variables can lead to considerable bias in epidemiological studies of causal exposure-outcome effects. In this paper, we present a new maximum likelihood based estimator for marginal causal effects that simultaneously adjusts for confounding and several forms of joint misclassification of the exposure and outcome variables. The proposed method relies on validation data for the construction of weights that account for both sources of bias. The weighting estimator, which is an extension of the outcome misclassification weighting estimator proposed by Gravel and Platt (Weighted estimation for confounded binary outcomes subject to misclassification. Stat Med 2018; 37: 425–436), is applied to reinfarction data. Simulation studies were carried out to study its finite sample properties and compare it with methods that do not account for confounding or misclassification. The new estimator showed favourable large sample properties in the simulations. Further research is needed to study the sensitivity of the proposed method and that of alternatives to violations of their assumptions. The implementation of the estimator is facilitated by a new R function (ipwm) in an existing R package (mecor).
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-10-01T03:09:16Z
      DOI: 10.1177/0962280220960172
  • Pattern discovery of health curves using an ordered probit model with
           Bayesian smoothing and functional principal component analysis
    • Authors: Shijia Wang, Yunlong Nie, Jason M Sutherland, Liangliang Wang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      This article is motivated by the need to discover patterns in patients’ health based on their daily settings of care, in order to help health policy-makers distribute funding for health services more effectively. The hidden process of one’s health status is assumed to be a continuous smooth function, called the health curve, ranging from perfectly healthy to dead. The health curves are linked to the categorical setting of care using an ordered probit model and are inferred through Bayesian smoothing. The challenges include the nontrivial constraints on the lower bound of the health status (death) and on the model parameters to ensure model identifiability. We use the Markov chain Monte Carlo method to estimate the parameters and health curves. Functional principal component analysis is applied to the patients’ estimated health curves to discover common health patterns. The proposed method is demonstrated through an application to patients hospitalized for stroke in Ontario. Whilst this paper focuses on the method’s application to a health care problem, the proposed model and its implementation have the potential to be applied in many domains in which the response variable is ordinal and there is a hidden process. Our implementation is available at https://github.com/liangliangwangsfu/healthCurveCode.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-25T05:55:13Z
      DOI: 10.1177/0962280220951834
  • Change point detection in Cox proportional hazards mixture cure model
    • Authors: Bing Wang, Jialiang Li, Xiaoguang Wang
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The mixture cure model has been widely applied to survival data in which a fraction of the observations never experience the event of interest, despite long-term follow-up. In this paper, we study the Cox proportional hazards mixture cure model where the covariate effects on the distribution of uncured subjects’ failure time may jump when a covariate exceeds a change point. The nonparametric maximum likelihood estimation is used to obtain the semiparametric estimates. We employ a two-step computational procedure involving the Expectation-Maximization algorithm to implement the estimation. The consistency, convergence rate and asymptotic distributions of the estimators are carefully established under technical conditions and we show that the change point estimator is n-consistent. The m out of n bootstrap and the Louis algorithm are used to obtain the standard errors of the estimated change point and other regression parameter estimates, respectively. We also contribute a test procedure to check the existence of the change point. The finite sample performance of the proposed method is demonstrated via simulation studies and real data examples.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-24T05:37:55Z
      DOI: 10.1177/0962280220959118
  • Comparison of small-sample standard-error corrections for generalised
           estimating equations in stepped wedge cluster randomised trials with a
           binary outcome: A simulation study
    • Authors: JA Thompson, K Hemming, A Forbes, K Fielding, R Hayes
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Generalised estimating equations with the sandwich standard-error estimator provide a promising method of analysis for stepped wedge cluster randomised trials. However, they have inflated type-one error when used with a small number of clusters, which is common for stepped wedge cluster randomised trials. We present a large simulation study of binary outcomes comparing bias-corrected standard errors from Fay and Graubard; Mancl and DeRouen; Kauermann and Carroll; Morel, Bokossa, and Neerchal; and Mackinnon and White with an independent and exchangeable working correlation matrix. We constructed 95% confidence intervals using a t-distribution with degrees of freedom including clusters minus parameters (DFC-P), cluster periods minus parameters, and estimators from Fay and Graubard (DFFG), and Pan and Wall. Fay and Graubard and an approximation to Kauermann and Carroll (with simpler matrix inversion) were unbiased in a wide range of scenarios with an independent working correlation matrix and more than 12 clusters. They gave confidence intervals with close to 95% coverage with DFFG with 12 or more clusters, and DFC-P with 18 or more clusters. Both standard errors were conservative with fewer clusters. With an exchangeable working correlation matrix, approximated Kauermann and Carroll and Fay and Graubard had a small degree of under-coverage.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-24T05:16:53Z
      DOI: 10.1177/0962280220958735
  • A proportional risk model for time-to-event analysis in randomized
           controlled trials
    • Authors: Oliver Kuss, Annika Hoyer
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Regression models for continuous, binary, nominal, and ordinal outcomes almost completely rely on parametric models, whereas time-to-event outcomes are mainly analyzed by Cox’s Proportional Hazards model, an essentially non-parametric method. This is done despite a long list of disadvantages that have been reported for the hazard ratio, and also for the odds ratio, another effect measure sometimes used for time-to-event modelling. In this paper, we propose a parametric proportional risk model for time-to-event outcomes in a two-group situation. Modelling explicitly a risk instead of a hazard or an odds solves the current interpretational and technical problems of the latter two effect measures. The model further allows for computing absolute effect measures like risk differences or numbers needed to treat. As an additional benefit, results from the model can also be communicated on the original time scale, as an accelerated or a prolonged failure time, thus facilitating interpretation for a non-technical audience. Parameter estimation by maximum likelihood, while properly accounting for censoring, is straightforward and can be implemented in any statistical package that allows coding and maximizing a univariate likelihood function. We illustrate the model with an example from a randomized controlled trial on efficacy of a new glucose-lowering drug for the treatment of type 2 diabetes mellitus and give the results of a small simulation study.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-22T04:22:24Z
      DOI: 10.1177/0962280220953599
  • Efficient two-stage sequential arrays of proof of concept studies for
           pharmaceutical portfolios
    • Authors: Linchen He, Linqiu Du, Zoran Antonijevic, Martin Posch, Valeriy R Korostyshevskiy, Robert A Beckman
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Previous work has shown that individual randomized “proof-of-concept” (PoC) studies may be designed to maximize cost-effectiveness, subject to an overall PoC budget constraint. Maximizing cost-effectiveness has also been considered for arrays of simultaneously executed PoC studies. Defining Type III error as the opportunity cost of not performing a PoC study, we evaluate the common pharmaceutical practice of allocating PoC study funds in two stages. Stage 1, or the first wave of PoC studies, screens drugs to identify those to be permitted additional PoC studies in Stage 2. We investigate if this strategy significantly improves efficiency, despite slowing development. We quantify the benefit, cost, benefit-cost ratio, and Type III error given the number of Stage 1 PoC studies. Relative to a single stage PoC strategy, significant cost-effective gains are seen when at least one of the drugs has a low probability of success (10%) and especially when there are either few drugs (2) with a large number of indications allowed per drug (10) or a large portfolio of drugs (4). In these cases, the recommended number of Stage 1 PoC studies ranges from 2 to 4, tracking approximately with an inflection point in the minimization curve of Type III error.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-21T02:58:19Z
      DOI: 10.1177/0962280220958177
  • Extending the I-squared statistic to describe treatment effect
           heterogeneity in cluster, multi-centre randomized trials and individual
           patient data meta-analysis
    • Authors: Karla Hemming, James P Hughes, Joanne E McKenzie, Andrew B Forbes
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Treatment effect heterogeneity is commonly investigated in meta-analyses to identify if treatment effects vary across studies. When conducting an aggregate level data meta-analysis it is common to describe the magnitude of any treatment effect heterogeneity using the I-squared statistic, which is an intuitive and easily understood concept. The effect of a treatment might also vary across clusters in a cluster randomized trial, or across centres in a multi-centre randomized trial, and it can be of interest to explore this at the analysis stage. In cross-over trials and other randomized designs, in which clusters or centres are exposed to both treatment and control conditions, this treatment effect heterogeneity can be identified. Here we derive and evaluate a comparable I-squared measure to describe the magnitude of heterogeneity in treatment effects across clusters or centres in randomized trials. We further show how this methodology can be used to estimate treatment effect heterogeneity in an individual patient data meta-analysis.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-21T02:05:10Z
      DOI: 10.1177/0962280220948550
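For orientation, the conventional aggregate-data I-squared that the abstract above refers to can be computed from study-level effect estimates and their variances. The sketch below uses the standard Cochran's Q form, I-squared = max(0, (Q - df) / Q); it is not the extended cluster-level measure derived in the paper, and the function name is hypothetical.

```python
def i_squared(effects, variances):
    """Return (Q, I-squared in percent) for study-level effect estimates.

    Uses inverse-variance weights and the fixed-effect pooled estimate.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero; Q == 0 means no observed heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

For three studies with effects 0.1, 0.3 and 0.5 and a common variance of 0.01, Q is 8 on 2 degrees of freedom, so I-squared is 75%.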
  • Optimal two-stage sampling for mean estimation in multilevel populations
           when cluster size is informative
    • Authors: Francesco Innocenti, Math JJM Candel, Frans ES Tan, Gerard JP van Breukelen
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability proportional to size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-17T12:18:33Z
      DOI: 10.1177/0962280220952833
  • Adjusted win ratio with stratification: Calculation methods and
    • Authors: Samvel B Gasparyan, Folke Folkvaljon, Olof Bengtsson, Joan Buenconsejo, Gary G Koch
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The win ratio is a general method of comparing locations of distributions of two independent, ordinal random variables, and it can be estimated without distributional assumptions. In this paper we provide a unified theory of win ratio estimation in the presence of stratification and adjustment by a numeric variable. Building step by step on the estimate of the crude win ratio we compare corresponding tests with well known non-parametric tests of group difference (Wilcoxon rank-sum test, Fligner–Policello test, van Elteren test, test based on the regression on ranks, and the rank analysis of covariance test). We show that the win ratio gives an interpretable treatment effect measure with corresponding test to detect treatment effect difference under minimal assumptions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:43:40Z
      DOI: 10.1177/0962280220942558
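A minimal sketch of a crude (unstratified, unadjusted) win ratio is given below for orientation. It assumes that larger values are better and uses the common wins-divided-by-losses form with tied pairs dropped; the paper's own definition, tie handling, and stratified or adjusted estimators may differ. The function name is hypothetical.

```python
def crude_win_ratio(treatment, control):
    """Wins divided by losses over all treatment-control pairs.

    Assumes a larger outcome value is better; tied pairs count as neither.
    """
    wins = sum(1 for t in treatment for c in control if t > c)
    losses = sum(1 for t in treatment for c in control if t < c)
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses
```

With treatment outcomes [2, 3] and control outcomes [1, 3], there are two wins, one loss and one tie, giving a win ratio of 2.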
  • Mixed-effects models for the design and analysis of stepped wedge cluster
           randomized trials: An overview
    • Authors: Fan Li, James P Hughes, Karla Hemming, Monica Taljaard, Edward R. Melnick, Patrick J Heagerty
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The stepped wedge cluster randomized design has received increasing attention in pragmatic clinical trials and implementation science research. The key feature of the design is the unidirectional crossover of clusters from the control to intervention conditions on a staggered schedule, which induces confounding of the intervention effect by time. The stepped wedge design first appeared in the Gambia hepatitis study in the 1980s. However, the statistical model used for the design and analysis was not formally introduced until 2007 in an article by Hussey and Hughes. Since then, a variety of mixed-effects model extensions have been proposed for the design and analysis of these trials. In this article, we explore these extensions under a unified perspective. We provide a general model representation and regard various model extensions as alternative ways to characterize the secular trend, intervention effect, as well as sources of heterogeneity. We review the key model ingredients and clarify their implications for the design and analysis. The article serves as an entry point to the evolving statistical literatures on stepped wedge designs.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-07T04:51:24Z
      DOI: 10.1177/0962280220932962
  • Transformation based on likelihood ratio
    • Authors: Jianping Yang, Pei-Fen Kuan, Jialiang Li
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We respond here to a recent letter in this journal on the transformation based on likelihood ratio.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-05-26T04:43:00Z
      DOI: 10.1177/0962280220925509
  • Issues and solutions in biomarker evaluation when subclasses are involved
           under binary classification
    • Authors: Yingdong Feng, Lili Tian
      First page: 87
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In practice, it is common to evaluate biomarkers in binary classification settings (e.g. non-cancer vs. cancer) where one or both main classes involve multiple subclasses. For example, non-cancer class might consist of healthy subjects and benign cases, while cancer class might consist of subjects at early and late stages. The standard practice is pooling within each main class, i.e. all non-cancer subclasses are pooled together to create a control group, and all cancer subclasses are pooled together to create a case group. Based on the pooled data, the area under ROC curve (AUC) and other characteristics are estimated under binary classification for the purpose of biomarker evaluation. Despite the popularity of this pooling strategy in practice, its validity and implication in biomarker evaluation have never been carefully inspected. This paper aims to demonstrate that pooling strategy can be seriously misleading in biomarker evaluation. Furthermore, we present a new diagnostic framework as well as new accuracy measures appropriate for biomarker evaluation under such settings. In the end, an ovarian cancer data set is analyzed.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:30:55Z
      DOI: 10.1177/0962280220938077
  • Functional survival forests for multivariate longitudinal outcomes:
           Dynamic prediction of Alzheimer’s disease progression
    • Authors: Jeffrey Lin, Kan Li, Sheng Luo
      First page: 99
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      The random survival forest (RSF) is a non-parametric alternative to the Cox proportional hazards model in modeling time-to-event data. In this article, we developed a modeling framework to incorporate multivariate longitudinal data in the model building process to enhance the predictive performance of RSF. To extract the essential features of the multivariate longitudinal outcomes, two methods were adopted and compared: multivariate functional principal component analysis and multivariate fast covariance estimation for sparse functional data. These resulting features, which capture the trajectories of the multiple longitudinal outcomes, are then included as time-independent predictors in the subsequent RSF model. This non-parametric modeling framework, denoted as functional survival forests, is better at capturing the various trends in both the longitudinal outcomes and the survival model which may be difficult to model using only parametric approaches. These advantages are demonstrated through simulations and applications to the Alzheimer’s Disease Neuroimaging Initiative.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:43:32Z
      DOI: 10.1177/0962280220941532
  • Bayesian quantile nonhomogeneous hidden Markov models
    • Authors: Hefei Liu, Xinyuan Song, Yanlin Tang, Baoxue Zhang
      First page: 112
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Hidden Markov models are useful in simultaneously analyzing a longitudinal observation process and its dynamic transition. Existing hidden Markov models focus on mean regression for the longitudinal response. However, the tails of the response distribution are as important as the center in many substantive studies. We propose a quantile hidden Markov model to provide a systematic method to examine the entire conditional distribution of the response given the hidden state and potential covariates. Instead of considering homogeneous hidden Markov models, which assume that the probabilities of between-state transitions are independent of subject- and time-specific characteristics, we allow the transition probabilities to depend on exogenous covariates, thereby yielding nonhomogeneous Markov chains and making the proposed model more flexible than its homogeneous counterpart. We develop a Bayesian approach coupled with efficient Markov chain Monte Carlo methods for statistical inference. Simulations are conducted to assess the empirical performance of the proposed method. The proposed methodology is applied to a cocaine use study to provide new insights into the prevention of cocaine use.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-07-29T04:44:02Z
      DOI: 10.1177/0962280220942802
  • Variable selection for ultra-high dimensional quantile regression with
           missing data and measurement error
    • Authors: Yongxin Bai, Maozai Tian, Man-Lai Tang, Wing-Yan Lee
      First page: 129
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In this paper, we consider variable selection for ultra-high dimensional quantile regression model with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach and remove the bias caused by missing data using the inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With the proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimate enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure with a breast cancer data set.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-04T04:01:04Z
      DOI: 10.1177/0962280220941533
  • A variance shrinkage method improves arm-based Bayesian network
    • Authors: Zhenxun Wang, Lifeng Lin, James S Hodges, Richard MacLehose, Haitao Chu
      First page: 151
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Network meta-analysis is a commonly used tool to combine direct and indirect evidence in systematic reviews of multiple treatments to improve estimation compared to traditional pairwise meta-analysis. Unlike the contrast-based network meta-analysis approach, which focuses on estimating relative effects such as odds ratios, the arm-based network meta-analysis approach can estimate absolute risks and other effects, which are arguably more informative in medicine and public health. However, the number of clinical studies involving each treatment is often small in a network meta-analysis, leading to unstable treatment-specific variance estimates in the arm-based network meta-analysis approach when using non- or weakly informative priors under an unequal variance assumption. Additional assumptions, such as equal (i.e. homogeneous) variances for all treatments, may be used to remedy this problem, but such assumptions may be inappropriately strong. This article introduces a variance shrinkage method for an arm-based network meta-analysis. Specifically, we assume different treatment variances share a common prior with unknown hyperparameters. This assumption is weaker than the homogeneous variance assumption and improves estimation by shrinking the variances in a data-dependent way. We illustrate the advantages of the variance shrinkage method by reanalyzing a network meta-analysis of organized inpatient care interventions for stroke. Finally, comprehensive simulations investigate the impact of different variance assumptions on statistical inference, and simulation results show that the variance shrinkage method provides better estimation for log odds ratios and absolute risks.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-06T02:42:03Z
      DOI: 10.1177/0962280220945731
  • Random forests for high-dimensional longitudinal data
    • Authors: Louis Capitaine, Robin Genuer, Rodolphe Thiébaut
      First page: 166
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Random forests are one of the state-of-the-art supervised machine learning methods and achieve good performance in high-dimensional settings where p, the number of predictors, is much larger than n, the number of observations. Repeated measurements provide, in general, additional information, hence they are worth accounting for, especially when analyzing high-dimensional data. Tree-based methods have already been adapted to clustered and longitudinal data by using a semi-parametric mixed effects model, in which the non-parametric part is estimated using regression trees or random forests. We propose a general approach of random forests for high-dimensional longitudinal data. It includes a flexible stochastic model which allows the covariance structure to vary over time. Furthermore, we introduce a new method which takes intra-individual covariance into consideration to build random forests. Through simulation experiments, we then study the behavior of different estimation methods, especially in the context of high-dimensional data. Finally, the proposed method has been applied to an HIV vaccine trial including 17 HIV-infected patients with 10 repeated measurements of 20,000 gene transcripts and blood concentration of human immunodeficiency virus RNA. The approach selected 21 gene transcripts for which the association with HIV viral load was fully relevant and consistent with results observed during primary infection.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-10T03:40:07Z
      DOI: 10.1177/0962280220946080
  • Dynamic predictions of kidney graft survival in the presence of
           longitudinal outliers
    • Authors: Özgür Asar, Marie-Cécile Fournier, Etienne Dantan
      First page: 185
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In kidney transplantation, dynamic predictions of graft survival may be obtained from joint modelling of longitudinal and survival data for which a common assumption is that random-effects and error terms in the longitudinal sub-model are Gaussian. However, this assumption may be too restrictive, e.g. in the presence of outliers, and more flexible distributions would be required. In this study, we relax the Gaussian assumption by defining a robust joint modelling framework with t-distributed random-effects and error terms to obtain dynamic predictions of graft survival for kidney transplant patients. We take a Bayesian paradigm for inference and dynamic predictions and sample from the joint posterior densities. While previous research reported improved performances of robust joint models compared to the Gaussian version in terms of parameter estimation, dynamic prediction accuracy obtained from such an approach has not yet been evaluated. Our results based on a training sample from the French DIVAT kidney transplantation cohort illustrate that estimates for the slope parameters in the longitudinal and survival sub-models are sensitive to the distributional assumptions. From both an internal validation sample from the DIVAT cohort and an external validation sample from the Lille (France) and Leuven (Belgium) transplantation centers, calibration and discrimination performances appeared to be better under the robust joint models compared to the Gaussian version, illustrating the need to accommodate outliers in the dynamic prediction context. Simulation results support the findings of the validation studies.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-13T03:37:38Z
      DOI: 10.1177/0962280220945352
  • A pseudo-likelihood approach for multivariate meta-analysis of test
           accuracy studies with multiple thresholds
    • Authors: Annamaria Guolo, Duc-Khanh To
      First page: 204
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Multivariate meta-analysis of test accuracy studies when tests are evaluated in terms of sensitivity and specificity at more than one threshold represents an effective way to synthesize results by fully exploiting the data, if compared to univariate meta-analyses performed at each threshold independently. The approximation of logit transformations of sensitivities and specificities at different thresholds through a normal multivariate random-effects model is a recent proposal that straightforwardly extends the bivariate models well recommended for the one threshold case. However, drawbacks of the approach, such as poor estimation of the within-study correlations between sensitivities and between specificities, and severe computational issues can make it unappealing. We propose an alternative method for inference on common diagnostic measures using a pseudo-likelihood constructed under a working independence assumption between sensitivities and between specificities at different thresholds in the same study. The method does not require within-study correlations, overcomes the convergence issues and can be effortlessly implemented. Simulation studies highlight a satisfactory performance of the method, remarkably improving the results from the multivariate normal counterpart under different scenarios. The pseudo-likelihood approach is illustrated in the evaluation of a test used for diagnosis of preeclampsia as a cause of maternal and perinatal morbidity and mortality.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-13T03:37:09Z
      DOI: 10.1177/0962280220948085
  • Model-robust designs for nonlinear quantile regression
    • Authors: Selvakkadunko Selvaratnam, Linglong Kong, Douglas P Wiens
      First page: 221
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      We construct robust designs for nonlinear quantile regression, in the presence of both a possibly misspecified nonlinear quantile function and heteroscedasticity of an unknown form. The asymptotic mean-squared error of the quantile estimate is evaluated and maximized over a neighbourhood of the fitted quantile regression model. This maximum depends on the scale function and on the design. We entertain two methods to find designs that minimize the maximum loss. The first is local – we minimize for given values of the parameters and the scale function, using a sequential approach, whereby each new design point minimizes the subsequent loss, given the current design. The second is adaptive – at each stage, the maximized loss is evaluated at quantile estimates of the parameters, and a kernel estimate of scale, and then the next design point is obtained as in the sequential method. In the context of a Michaelis–Menten response model for an estrogen/hormone study, and a variety of scale functions, we demonstrate that the adaptive approach performs as well, in large study sizes, as if the parameter values and scale function were known beforehand and the sequential method applied. When the sequential method uses an incorrectly specified scale function, the adaptive method yields an, often substantial, improvement. The performance of the adaptive designs for smaller study sizes is assessed and seen to still be very favourable, especially so since the prior information required to design sequentially is rarely available.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-19T11:06:30Z
      DOI: 10.1177/0962280220948159
  • Efficient algorithms for covariate analysis with dynamic data using
           nonlinear mixed-effects model
    • Authors: Min Yuan, Zhi Zhu, Yaning Yang, Minghua Zhao, Kate Sasser, Hisham Hamadeh, Jose Pinheiro, Xu Steven Xu
      First page: 233
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Nonlinear mixed-effects modeling is one of the most popular tools for analyzing repeated measurement data, particularly for applications in the biomedical fields. Multiple integration and nonlinear optimization are the two major challenges for likelihood-based methods in nonlinear mixed-effects modeling. To solve these problems, approaches based on empirical Bayesian estimates have been proposed by breaking the problem into a nonlinear mixed-effects model with no covariates and a linear regression model without random effect. This approach is time-efficient as it involves no covariates in the nonlinear optimization. However, covariate effects based on empirical Bayesian estimates are underestimated and the bias depends on the extent of shrinkage. A marginal correction method has been proposed to correct the bias caused by shrinkage to some extent. However, the marginal approach appears to be suboptimal when testing covariate effects on multiple model parameters, a situation that is often encountered in real-world data analysis. In addition, the marginal approach cannot correct the inaccuracy in the associated p-values. In this paper, we proposed a simultaneous correction method (nSCEBE), which can handle the situation where covariate analysis is performed on multiple model parameters. Simulation studies and real data analysis showed that nSCEBE is accurate and efficient for both effect-size estimation and p-value calculation compared with the existing methods. Importantly, nSCEBE can be more than 2000 times faster than the standard mixed-effects models, potentially allowing utilization for high-dimension covariate analysis for longitudinal or repeated measured outcomes.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-25T04:57:06Z
      DOI: 10.1177/0962280220949898
  • Risk factor identification in cystic fibrosis by flexible hierarchical
           joint models
    • Authors: Weiji Su, Xia Wang, Rhonda D Szczesniak
      First page: 244
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Cystic fibrosis (CF) is a lethal autosomal disease hallmarked by respiratory failure. Maintaining lung function and minimizing frequency of acute respiratory events known as pulmonary exacerbations are essential to survival. Jointly modeling longitudinal lung function and exacerbation occurrences may provide better inference. We propose a shared-parameter joint hierarchical Gaussian process model with flexible link function to investigate the impacts of both demographic and time-varying clinical risk factors on lung function decline and to examine the associations between lung function and occurrence of pulmonary exacerbation. A two-level Gaussian process is used to capture the nonlinear longitudinal trajectory, and a flexible link function is introduced to the joint model in order to analyze the binary process. Bayesian model assessment criteria are provided for examining the overall performance of the joint models and the marginal fit of each submodel. We conduct simulation studies and apply the proposed model in a local CF center cohort. In the CF application, a nonlinear structure is supported in modeling both the longitudinal continuous and binary processes. A negative association is detected between lung function and pulmonary exacerbation by the joint model. The importance of risk factors, including gender, diagnostic status, insurance status, and BMI, is examined in joint models.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-08-26T05:21:45Z
      DOI: 10.1177/0962280220950369
  • A tail-based test to detect differential expression in RNA-sequencing data
    • Authors: Jiong Chen, Xinlei Mi, Jing Ning, Xuming He, Jianhua Hu
      First page: 261
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavily tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-01T05:32:34Z
      DOI: 10.1177/0962280220951907
  • Efficient orthogonal functional magnetic resonance imaging designs in the
           presence of drift
    • Authors: Rakhi Singh, John Stufken
      First page: 277
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      To study brain activity, by measuring changes associated with the blood flow in the brain, functional magnetic resonance imaging techniques are employed. The design problem in event-related functional magnetic resonance imaging studies is to find the best sequence of stimuli to be shown to subjects for precise estimation of the brain activity. Previous analytical studies concerning optimal functional magnetic resonance imaging designs often assume a simplified model with independent errors over time. Optimal designs under this model are called g-lag orthogonal designs. Recently, it has been observed that g-lag orthogonal designs also perform well under simplified models with auto-regressive error structures. However, these models do not include drift. We investigate the performance of g-lag orthogonal designs for models that incorporate drift parameters. Identifying g-lag orthogonal designs that perform best in the presence of a drift is important because a drift is typically assumed for the analysis of event-related functional magnetic resonance imaging data.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:36:09Z
      DOI: 10.1177/0962280220953870
  • Understanding between-cluster variation in prevalence and limits for how
           much variation is plausible
    • Authors: Mark D Chatfield, Daniel M Farewell
      First page: 286
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      In clinical trials and observational studies of clustered binary data, understanding between-cluster variation is essential: in sample size and power calculations of cluster randomised trials, for example, the intra-cluster correlation coefficient is often specified. However, quantifications of between-cluster variation can be unintuitive, and an intra-cluster correlation coefficient as low as 0.04 may correspond to surprisingly large between-cluster differences. We suggest that understanding is improved through visualising the implied distribution of true cluster prevalences – possibly by assuming they follow a beta distribution – or by calculating their standard deviation, which is more readily interpretable than the intra-cluster correlation coefficient. Even so, the bounded nature of binary data complicates the interpretation of variances as primary measures of uncertainty, and entropy offers an attractive alternative. Appealing to maximum entropy theory, we propose the following rule of thumb: that plausible intra-cluster correlation coefficients and standard deviations of true cluster prevalences are both bounded above by the overall prevalence, its complement, and one third. We also provide corresponding bounds for the coefficient of variation, and for a different standard deviation and intra-cluster correlation defined on the log odds scale. Using previously published data, we observe the quantities defined on the log odds scale to be more transportable between studies with different outcomes with different prevalences than the intra-cluster correlation and coefficient of variation. The latter increase and decrease, respectively, as prevalence increases from 0% to 50%, and the same is true for our bounds. Our work will help clinical trialists better understand between-cluster variation and avoid specifying implausibly high values for the intra-cluster correlation in sample size and power calculations.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:33:51Z
      DOI: 10.1177/0962280220951831
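      The relationship described in this abstract can be sketched numerically. The block below is a minimal illustration, not the authors' code: it assumes the usual definition of the intra-cluster correlation for clustered binary data, ICC = Var(p) / (pi * (1 - pi)), where p is a true cluster prevalence and pi is the overall prevalence, together with the beta-distribution assumption mentioned in the abstract. The function names are hypothetical, and the bound reflects our reading of the stated rule of thumb.

```python
import math


def sd_of_cluster_prevalences(prevalence, icc):
    """SD of true cluster prevalences implied by an ICC.

    Assumes ICC = Var(p) / (prevalence * (1 - prevalence)).
    """
    return math.sqrt(icc * prevalence * (1.0 - prevalence))


def beta_params(prevalence, icc):
    """Beta(a, b) parameters with the given mean and ICC.

    For a beta distribution, Var(p) = mean * (1 - mean) / (a + b + 1),
    so the ICC equals 1 / (a + b + 1).
    """
    total = 1.0 / icc - 1.0  # a + b
    return prevalence * total, (1.0 - prevalence) * total


def plausible_upper_bound(prevalence):
    """Rule of thumb as read from the abstract: the ICC and the SD are
    bounded above by the prevalence, its complement, and one third."""
    return min(prevalence, 1.0 - prevalence, 1.0 / 3.0)


if __name__ == "__main__":
    pi, icc = 0.5, 0.04
    print("SD of cluster prevalences:", sd_of_cluster_prevalences(pi, icc))
    print("Beta parameters (a, b):", beta_params(pi, icc))
    print("Upper bound for ICC and SD:", plausible_upper_bound(pi))
```

      Under these assumptions, an ICC of 0.04 at 50% prevalence corresponds to a standard deviation of about 0.10 in true cluster prevalences, which illustrates why a seemingly small ICC can imply large between-cluster differences.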
  • Estimating model-based nonnegative population marginal means in
           application to medical expenditures covered by different health care
           policies – A study on Medical Expenditure Panel Survey
    • Authors: Mingmei Tian, Jihnhee Yu
      First page: 299
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Medical care expenditure has historically been an important public health issue, which greatly impacts the government’s health policies as well as patients’ financial and medical decisions. In population health research, we commonly discretize a numeric attribute into a few ordinal groups to examine population characteristics. Oftentimes, population marginal mean estimation by the ANOVA approach is inflexible since it uses a pre-defined grouping of the covariate. In this paper, we propose a method to estimate the population marginal mean using B-spline-based regression in the manner of a generalized additive model as an alternative to the ANOVA. Since medical expenditure is always nonnegative, a Bayesian approach is also implemented to impose the nonnegative constraint on the marginal mean estimates. The proposed method can flexibly estimate marginal means for user-specified groupings after model fitting in a post-hoc manner, a clear advantage over the ANOVA approach. We show that this method is inferentially superior to the ANOVA through theoretical investigations and an extensive Monte Carlo study. The real data analysis using Medical Expenditure Panel Survey data, assisted by some visualization tools, demonstrates the applicability of the proposed approach and leads us to some interesting observations that may be relevant to public health discussions.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-10T04:31:50Z
      DOI: 10.1177/0962280220954241
  • A Bayesian hierarchical change point model with parameter constraints
    • Authors: Hong Li, Andreana Benitez, Brian Neelon
      First page: 316
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Alzheimer’s disease is the leading cause of dementia among adults aged 65 or above. Alzheimer’s disease is characterized by a change point signaling a sudden and prolonged acceleration in cognitive decline. The timing of this change point is of clinical interest because it can be used to establish optimal treatment regimens and schedules. Here, we present a Bayesian hierarchical change point model with a parameter constraint to characterize the rate and timing of cognitive decline among Alzheimer’s disease patients. We allow each patient to have a unique random intercept, random slope before the change point, random change point time, and random slope after the change point. The difference in slope before and after a change point is constrained to be nonpositive, and its parameter space is partitioned into a null region (representing normal aging) and a rejection region (representing accelerated decline). Using the change point time, the estimated slope difference, and the threshold of the null region, we are able to (1) distinguish normal aging patients from those with accelerated cognitive decline, (2) characterize the rate and timing for patients experiencing cognitive decline, and (3) predict personalized risk of progression to dementia due to Alzheimer’s disease. We apply the approach to data from the Religious Orders Study, a national cohort study of aging Catholic nuns, priests, and lay brothers.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-14T03:04:34Z
      DOI: 10.1177/0962280220948097
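      The per-patient trajectory described in this abstract can be pictured as a broken-stick line. The sketch below is only an illustration of that mean structure under stated assumptions, not the authors' Bayesian hierarchical model: it evaluates the expected score for one patient given an intercept, a slope before the change point, a nonpositive slope difference, and a change point time. The function names and the `null_threshold` parameter are hypothetical.

```python
def expected_score(t, intercept, slope_before, slope_diff, change_time):
    """Expected cognitive score at time t for one patient.

    The slope after the change point is slope_before + slope_diff, and
    slope_diff is constrained to be nonpositive, as in the abstract.
    """
    if slope_diff > 0:
        raise ValueError("slope_diff must be nonpositive")
    return intercept + slope_before * t + slope_diff * max(t - change_time, 0.0)


def is_accelerated_decline(slope_diff, null_threshold):
    """Hypothetical decision rule: a slope difference within
    [-null_threshold, 0] falls in the null region (normal aging);
    anything more negative falls in the rejection region."""
    return slope_diff < -null_threshold


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the study.
    for year in (2, 5, 7):
        print(year, expected_score(year, 30.0, -0.5, -1.5, 5.0))
    print(is_accelerated_decline(-1.5, null_threshold=0.25))
```

      In the full model each of these four quantities is a patient-level random effect with a prior, so the function above corresponds only to the deterministic part of a single patient's trajectory.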
  • Analysing body composition as compositional data: An exploration of the
           relationship between body composition, body mass and bone strength
    • Authors: D Dumuid, JA Martín-Fernández, S Ellul, RS Kenett, M Wake, P Simm, L Baur, T Olds
      First page: 331
      Abstract: Statistical Methods in Medical Research, Ahead of Print.
      Human body composition is made up of mutually exclusive and exhaustive parts (e.g. %truncal fat, %non-truncal fat and %fat-free mass) which are constrained to sum to the same total (100%). In statistical analyses, individual parts of body composition (e.g. %truncal fat or %fat-free mass) have traditionally been used as proxies for body composition, and have been linked with a range of health outcomes. But analysis of individual parts omits information about the other parts, which are intrinsically co-dependent because of the constant sum constraint of 100%. Further, body mass may be associated with health outcomes. We describe a statistical approach for body composition based on compositional data analysis. The body composition data are expressed as logratios to allow relative information about all the compositional parts to be explored simultaneously in relation to health outcomes. We describe a recent extension to the logratio approach to compositional data analysis which allows absolute information about the total of the compositional parts (body mass) to be considered alongside relative information about body composition. The statistical approach is illustrated by an example that explores the relationships between adults’ body composition, body mass and bone strength.
      Citation: Statistical Methods in Medical Research
      PubDate: 2020-09-17T12:21:33Z
      DOI: 10.1177/0962280220955221
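      The logratio idea in this abstract can be illustrated with a small example. The abstract does not say which logratio coordinates the authors use, so the sketch below assumes the centered logratio (clr) transform, one common choice in compositional data analysis, and appends the log of the total to carry the absolute information (body mass). The function names and the example values are hypothetical.

```python
import math


def clr(parts):
    """Centered logratio transform of strictly positive parts.

    Each part is expressed as the log of its ratio to the geometric
    mean of all parts, so only relative information is retained.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def clr_with_total(masses):
    """clr coordinates of the composition plus the log of the total,
    so that relative and absolute information sit side by side."""
    return clr(masses), math.log(sum(masses))


if __name__ == "__main__":
    # Illustrative masses in kg: truncal fat, non-truncal fat, fat-free mass.
    masses = [14.0, 10.5, 45.5]
    coords, log_total = clr_with_total(masses)
    print("clr coordinates:", coords)
    print("log body mass:", log_total)
```

      Because the clr coordinates depend only on ratios between parts, they are the same whether the parts are given in kilograms or as percentages of body mass; the separate log-total term is what distinguishes a heavier body from a lighter one with the same composition.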