Introduction: Candidemia carries a heavy burden in terms of mortality, especially when presenting as septic shock, and its early diagnosis remains crucial.
Methods: We assessed the performance of a deep learning model for the early differential diagnosis between candidemia and bacteremia. The model was trained on a large dataset of automatically extracted laboratory features.
Results: A total of 12,483 episodes of candidemia (1275; 10%) or bacteremia (11,208; 90%) were included. For recognizing candidemia, the deep learning model showed sensitivity 0.80, specificity 0.59, positive predictive value (PPV) 0.18, weighted PPV (wPPV) 0.88, and negative predictive value (NPV) 0.96 on the training set (area under the curve [AUC] 0.69), and sensitivity 0.70, specificity 0.58, PPV 0.16, wPPV 0.87, and NPV 0.95 on the test set (AUC 0.64). The learned discriminatory ability was then tested in the subgroup of patients with available serum β-D-glucan (BDG) and procalcitonin (PCT) values to explore additive or synergistic effects with these more specific markers. Neither feature selection nor transfer learning improved the diagnostic performance of a model based on BDG and PCT only.
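The low PPV and high NPV reported above follow largely from the low prevalence of candidemia among the included episodes. As a minimal sketch, the relationship can be checked with Bayes' rule. It assumes that the training and test sets both share the overall candidemia prevalence of 1275/12,483, which the abstract does not state. The function name `predictive_values` is illustrative and not part of the authors' code, and wPPV is left out because its weighting is not defined here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: each split has the overall prevalence (1275 of 12,483 episodes).
PREVALENCE = 1275 / 12483

# Sensitivity and specificity as reported for each set (rounded inputs).
train_ppv, train_npv = predictive_values(0.80, 0.59, PREVALENCE)
test_ppv, test_npv = predictive_values(0.70, 0.58, PREVALENCE)

print(f"training set: PPV {train_ppv:.2f}, NPV {train_npv:.2f}")
print(f"test set:     PPV {test_ppv:.2f}, NPV {test_npv:.2f}")
```

Under these assumptions the computed values agree with the reported PPV and NPV to within rounding of the inputs. This suggests that the model's positive predictions are limited mainly by the roughly 1:9 class imbalance rather than by an inconsistency in the reported metrics.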
Conclusions: A deep learning model trained on nonspecific laboratory features showed some discriminatory ability to differentiate candidemia from bacteremia, highlighting the ability of deep learning to exploit complex patterns within nonspecific laboratory data. However, the learned patterns did not improve the diagnostic performance of more specific markers. Further exploration of candidemia prediction using laboratory features through machine learning techniques remains a promising area of research, serving as a valuable complement to the development of large-scale models that also incorporate clinical features.
Keywords: Candida; Artificial intelligence; Biomarker; Machine learning; Neural networks.
© 2025. The Author(s).