Background: The temporal sequence of clinical events is crucial in outcomes research, yet standard machine learning (ML) approaches applied to electronic health records (EHRs) often ignore event timing, which limits predictive accuracy.
Methods: We introduce Temporal Learning with Dynamic Range (TLDR), a time-sensitive ML framework, to identify risk factors for post-acute sequelae of SARS-CoV-2 infection (PASC). Using longitudinal EHR data from over 85,000 patients in the Precision PASC Research Cohort (P2RC) from a large integrated academic medical center, we compare TLDR against a conventional atemporal ML model.
Results: TLDR demonstrated superior predictive performance, achieving a mean AUROC of 0.791 compared to 0.668 for the benchmark, an 18.4% relative improvement. TLDR's mean PRAUC of 0.590 also exceeded the benchmark's 0.421, a 40.14% relative increase. The framework generalized better, with a lower mean overfitting index (-0.028), indicating greater robustness. Beyond predictive gains, TLDR's use of time-stamped features enhanced interpretability, offering a more precise characterization of individual patient records.
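The percentage gains reported above can be checked with a short calculation. This is a minimal sketch that assumes the percentages are relative changes over the benchmark means; the helper name `relative_improvement` is illustrative and is not part of the MLHO package.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Percentage change of `new` relative to `baseline` (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Mean scores as reported in the Results: TLDR vs. the atemporal benchmark.
auroc_gain = relative_improvement(0.791, 0.668)
prauc_gain = relative_improvement(0.590, 0.421)

print(f"AUROC relative gain: {auroc_gain:.2f}%")
print(f"PRAUC relative gain: {prauc_gain:.2f}%")
```

Under that assumption, the two values come out at roughly 18.4% and 40.14%, matching the reported figures.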
Discussion: TLDR effectively captures exposure-outcome associations and offers flexibility in time-stamping strategies to suit diverse clinical research needs.
Conclusion: TLDR provides a simple yet effective approach for integrating dynamic temporal windows into predictive modeling. It is available within the MLHO R package to support further exploration of recurrent treatment and exposure patterns in various clinical settings.