Using Machine Learning to Predict Linezolid-Associated Thrombocytopenia

Infect Drug Resist. 2025 May 23:18:2653-2661. doi: 10.2147/IDR.S479658. eCollection 2025.

Abstract

Objective: To use artificial intelligence and machine learning to predict linezolid-associated thrombocytopenia and to identify related risk factors in patients.

Methods: Between January 2020 and December 2023, 284 patients receiving linezolid at Beijing Chaoyang Hospital were enrolled. The data were filtered for completeness and quality, then randomly divided into training and validation sets at a 3:1 ratio using stratified sampling. Four machine learning methods (logistic regression, Lasso regression, support vector machine (SVM), and random forest) were used to develop predictive models on the training set, with optimal hyperparameters determined through grid search. Model performance was assessed via 10-fold cross-validation on the training set, and the model with the highest AUC was selected. The chosen model was then evaluated on the independent validation set, and AUC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated.
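
The stratified 3:1 split and the 10-fold partition of the training set can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names `stratified_split` and `stratified_folds`, the random seed, and the synthetic label vector (42 events among 284 patients, taken from the reported totals) are all assumptions made for the example. Model fitting and grid search are omitted.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into (train, validation), preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, valid_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Round half up so each class contributes about 3/4 of its members.
        n_train = int(len(idx) * train_fraction + 0.5)
        train_idx.extend(idx[:n_train])
        valid_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(valid_idx)


def stratified_folds(labels, indices, k=10, seed=0):
    """Deal each class round-robin into k folds so event rates stay similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Synthetic labels (assumption): 1 = thrombocytopenia, 0 = no event.
labels = [1] * 42 + [0] * 242
train_idx, valid_idx = stratified_split(labels)
folds = stratified_folds(labels, train_idx, k=10)

# In cross-validation, each fold is held out once while a candidate
# model is fitted on the remaining nine folds.
for held_out, fold in enumerate(folds):
    fit_idx = [i for j, f in enumerate(folds) if j != held_out for i in f]
    assert not set(fit_idx) & set(fold)
```

Stratifying matters here because events are relatively rare (about 15% of patients); an unstratified split could leave a fold with very few positive cases and make the per-fold AUC unstable.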

Results: During treatment with linezolid, 42 (14.8%) of the 284 patients developed thrombocytopenia, with a mean onset of 12.0±5.6 days after starting linezolid therapy. The random forest model demonstrated the best performance, with an AUC of 0.902 (95% CI 0.814-0.991) in the validation set. This model achieved a sensitivity of 81.8%, a specificity of 86.9%, a PPV of 52.9%, and an NPV of 96.4%.
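
The four reported validation metrics all derive from a single 2x2 confusion matrix. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the values used here (9 true positives, 2 false negatives, 8 false positives, 53 true negatives) are hypothetical, chosen only because they reproduce the reported percentages; the function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # events correctly flagged
        "specificity": tn / (tn + fp),  # non-events correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who had an event
        "npv": tn / (tn + fn),          # cleared patients who had no event
    }


# Hypothetical counts (assumption): not reported in the abstract,
# back-calculated to be consistent with the published percentages.
metrics = diagnostic_metrics(tp=9, fn=2, fp=8, tn=53)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 81.8%
# specificity: 86.9%
# ppv: 52.9%
# npv: 96.4%
```

PPV and NPV depend on how common the event is in the evaluated group, so with an event rate of roughly 15% a modest PPV alongside a high NPV is the expected pattern.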

Conclusion: We developed a machine learning model to predict linezolid-associated thrombocytopenia, with the random forest model achieving an AUC of 0.902. This model can help clinicians assess patient risk and optimize treatment plans. Future work should validate the model in multicenter studies and explore its integration into clinical decision support systems.

Keywords: linezolid; machine learning; risk factors; thrombocytopenia.