Optimising dynamic treatment regimens using sequential multiple assignment randomised trials data with missing data

Jessica Xu; Anurika P De Silva; Katherine J Lee; Robert K Mahar; Julie A Simpson

doi:10.1186/s12874-025-02595-1

BMC Med Res Methodol. 2025 Jul 1;25(1):162. doi: 10.1186/s12874-025-02595-1.

Authors

Jessica Xu¹, Anurika P De Silva¹, Katherine J Lee²,³, Robert K Mahar^#,¹,², Julie A Simpson^#,⁴,⁵

Affiliations

¹ Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia.
² Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Royal Children's Hospital, Melbourne, VIC, Australia.
³ Department of Paediatrics, University of Melbourne, Melbourne, VIC, Australia.
⁴ Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia. julieas@unimelb.edu.au.
⁵ Nuffield Department of Medicine, University of Oxford, Oxford, UK. julieas@unimelb.edu.au.

^# Contributed equally.

Abstract

Dynamic treatment regimens are commonly used for patients with chronic or progressive medical conditions. Sequential multiple assignment randomised trials (SMARTs) are studies used to optimise dynamic treatment regimens by repeatedly randomising participants to treatments. Q-learning, a stage-wise regression-based method used to analyse SMARTs, uses backward induction to compare treatments administered as a sequence. Missing data are a common problem in randomised trials and can be complex in SMARTs given the sequential randomisation. Common methods for handling missing data, such as complete case analysis (CCA) and multiple imputation (MI), have been widely explored in single-stage randomised trials; however, the only study that explored these methods in SMARTs did not consider Q-learning. We evaluated the performance of CCA and MI on the estimation of Q-learning parameters in a SMART.

We simulated 1000 datasets of 500 participants, based on a SMART with two stages, under different missing data scenarios defined by missingness directed acyclic graphs (m-DAGs), percentages of missing data (20%, 40%), stage 2 treatment effects, and strengths of association with missingness in stage 2 treatment, patient history and outcome. We also compared CCA and MI using retrospective data from a longitudinal smoking cessation SMART.

When there was no treatment effect at either stage 1 or stage 2, we observed close to zero absolute bias in the stage 1 treatment effect and similar empirical standard errors for CCA and MI under all missing data scenarios. When all participants had a relatively large stage 2 treatment effect, we observed minimal bias from both CCA and MI, with slightly greater bias for MI. Empirical standard errors were higher for MI than for CCA under all scenarios except when missingness did not depend on any variables. When the stage 2 treatment effect varied between participants and missingness depended on other variables (for example, stage 1 responder status missing dependent on stage 1 treatment and baseline variables), we observed greater bias for MI when estimating the stage 1 treatment effect, and this bias increased with the percentage of missing data, while the bias for CCA remained minimal. The resulting empirical standard errors were lower or similar for MI compared to CCA under all missing data scenarios.

Results showed that for a two-stage SMART, MI failed to capture the differences between treatment effects when the stage 2 treatment effect varied between participants.
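To make the backward induction in Q-learning concrete, the following is a minimal sketch of a two-stage analysis followed by a complete case analysis. It is not the authors' simulation design: the linear working models, the variable names (`x1`, `a1`, `x2`, `a2`, `y`), the coefficient values and the missingness model are all assumptions chosen for illustration, with treatments coded -1/+1.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def q_learning_two_stage(x1, a1, x2, a2, y):
    """Two-stage Q-learning with linear working models (hypothetical helper)."""
    # Stage 2: regress the final outcome on history and stage 2 treatment.
    X2 = np.column_stack([np.ones_like(y), x1, a1, x2, a2, a2 * x2])
    b2 = fit_ols(X2, y)
    # Pseudo-outcome: predicted outcome under the best stage 2 treatment.
    # With a2 in {-1, +1}, max over a2 of a2 * c equals |c|.
    contrast = b2[4] + b2[5] * x2
    pseudo = X2[:, :4] @ b2[:4] + np.abs(contrast)
    # Stage 1: regress the pseudo-outcome on baseline and stage 1 treatment.
    X1 = np.column_stack([np.ones_like(y), x1, a1, a1 * x1])
    b1 = fit_ols(X1, pseudo)
    return b1, b2


# Assumed data-generating model (illustrative values only).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)                    # baseline covariate
a1 = rng.choice([-1.0, 1.0], size=n)       # stage 1 randomisation
x2 = 0.5 * x1 + rng.normal(size=n)         # intermediate covariate
a2 = rng.choice([-1.0, 1.0], size=n)       # stage 2 randomisation
y = (1.0 + 0.5 * x1 + 0.3 * a1 + 0.4 * x2
     + a2 * (0.5 + 0.5 * x2) + rng.normal(size=n))

# Assumed missingness model: x2 is missing with a probability that
# depends on stage 1 treatment and the baseline covariate.
p_miss = 1.0 / (1.0 + np.exp(-(-1.4 + 0.5 * a1 + 0.5 * x1)))
observed = rng.random(n) > p_miss

# Complete case analysis: keep only participants with x2 observed.
b1_cca, b2_cca = q_learning_two_stage(
    x1[observed], a1[observed], x2[observed], a2[observed], y[observed]
)
```

In this sketch the stage 2 treatment effect varies between participants through the `a2 * x2` term, which is the kind of setting where the abstract reports greater bias for MI. An MI analysis would instead impute `x2` several times, run `q_learning_two_stage` on each completed dataset and pool the estimates.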

Keywords: Missing data; Multiple imputation; Q-learning; Sequential multiple assignment randomised trials.

MeSH terms

Algorithms
Computer Simulation
Data Interpretation, Statistical
Humans
Models, Statistical
Randomized Controlled Trials as Topic* / methods
Randomized Controlled Trials as Topic* / statistics & numerical data
Research Design*
