Introduction: Meningitis, an inflammatory condition of the membranes surrounding the brain and spinal cord, can be caused by various agents. Bacterial meningitis is particularly severe due to its high morbidity and mortality rates. This study aims to develop machine learning (ML) models to classify the aetiology of bacterial meningitis using data from the Notifiable Diseases Information System (SINAN) in São Paulo State, Brazil.
Methods: Data were collected from the SINAN database, including sociodemographic variables, clinical symptoms, and cerebrospinal fluid (CSF) analyses. Five ML models (Random Forest, LightGBM, XGBoost, CatBoost, and AdaBoost) were applied to classify meningitis cases into bacterial, fungal, viral, and other types. Models were evaluated using AUC-ROC, accuracy, precision, recall, F1-score, and the Matthews correlation coefficient (MCC).
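For readers unfamiliar with the evaluation metrics named in the Methods, the following is a minimal sketch of how they can be computed for the binary case (bacterial vs. non-bacterial). It is illustrative only and is not the study's code: the function names (`confusion_counts`, `binary_metrics`, `auc_roc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1-score, and MCC from hard predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a random positive case outranks
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = bacterial, 0 = non-bacterial.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

AUC-ROC is computed from the predicted scores and so does not depend on the threshold, whereas the other five metrics are computed from thresholded predictions. MCC uses all four cells of the confusion matrix, which makes it informative when classes are imbalanced.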
Results: The CatBoost model demonstrated superior performance, achieving an AUC-ROC of 0.95 for binary classification (bacterial vs. non-bacterial) and 0.85 for multiclass classification (Neisseria meningitidis, Streptococcus pneumoniae, and Haemophilus influenzae). XGBoost and LightGBM also showed promising results with AUC-ROC scores of 0.94 and 0.92, respectively, for binary classification. The CatBoost model exhibited high sensitivity and reasonable specificity, highlighting its applicability in the rapid and accurate diagnosis of meningitis. SHAP analysis identified variables such as leukocyte count and the presence of petechiae as influential predictors in the models.
Conclusion: ML algorithms, particularly CatBoost, XGBoost, and LightGBM, proved highly effective in the differential diagnosis of meningitis, offering a valuable tool for the rapid identification of meningitis types and bacterial serogroups. These techniques can be integrated into public health protocols to improve meningitis outbreak responses and optimize patient treatment.
Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.