A BERT base model for the analysis of Electronic Health Records from diabetic patients

Enrico Manzini; Bogdan Vlacho; Josep Franch-Nadal; Joan Escudero; Ana Genova; Elisenda Reixach; Erich Andres; Israel Pizarro; Didac Mauricio; Alexandre Perera-Lluna

doi:10.1109/EMBC53108.2024.10782488

Annu Int Conf IEEE Eng Med Biol Soc. 2024 Jul:2024:1-4. doi: 10.1109/EMBC53108.2024.10782488.

PMID: 40039371

Abstract

The increasing availability of Electronic Health Records (EHRs) and the continuous improvement of deep learning (DL) predictive models are shifting health care from a paradigm centered on the specific knowledge of clinicians and specialists to one centered on large databases of patient data. However, applying DL models to clinical data is anything but simple, with many limitations arising from the limited availability of labeled data and from the inherent characteristics of clinical data. In the field of Natural Language Processing, the BERT model is achieving astonishing results thanks to pretraining on large unlabelled corpora and its capacity to analyse long sequences of data. Here we propose a BERT base model for the analysis of EHR sequences. The original BERT model has been adapted to handle different EHR modalities, and a state vector representing the patient at the beginning of the sequence has been introduced. The model has been trained on 5 years of data from more than 200,000 diabetic patients in Catalonia (Spain), using diagnosis codes, drug prescriptions, clinical variables and laboratory results. The proposed embedding model improves on the AUROC of the baseline models for different clinical tasks.
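
As a rough illustration of the adaptation described above, the following sketch shows one way a multi-modal EHR sequence with a leading patient state vector might be turned into model inputs. It is not the authors' implementation. The abstract does not specify the input layout, so several points are assumptions: the state vector occupies the first position, each event's input is the sum of a code embedding, a modality embedding and a position embedding (mirroring how the original BERT sums token, segment and position embeddings), and the names `Event`, `EmbeddingTable`, `build_inputs`, the modality labels, the toy dimension and the example codes are all hypothetical. Random vectors stand in for learned embeddings.

```python
import random
from dataclasses import dataclass

# Hypothetical modality labels, based on the data types named in the abstract.
MODALITIES = ["state", "diagnosis", "drug", "clinical", "lab"]
DIM = 8  # toy embedding width (assumption)


@dataclass
class Event:
    modality: str  # one of MODALITIES other than "state"
    code: str      # illustrative code or variable name


class EmbeddingTable:
    """Lazily created random vectors standing in for learned embeddings."""

    def __init__(self, dim, seed):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def __getitem__(self, key):
        # Create the vector on first access, then reuse it.
        if key not in self.table:
            self.table[key] = [self.rng.gauss(0.0, 1.0) for _ in range(self.dim)]
        return self.table[key]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(values) for values in zip(*vectors)]


def build_inputs(state_vector, events, code_emb, modality_emb, position_emb):
    """Return one input vector per position: patient state first, then events."""
    # Position 0 holds the patient state vector (assumed placement).
    inputs = [add(state_vector, modality_emb["state"], position_emb[0])]
    for pos, event in enumerate(events, start=1):
        inputs.append(
            add(
                code_emb[(event.modality, event.code)],
                modality_emb[event.modality],
                position_emb[pos],
            )
        )
    return inputs


code_emb = EmbeddingTable(DIM, seed=0)
modality_emb = EmbeddingTable(DIM, seed=1)
position_emb = EmbeddingTable(DIM, seed=2)

# Placeholder state; how the paper derives the state vector is not stated here.
patient_state = [0.0] * DIM
events = [
    Event("diagnosis", "E11"),   # illustrative diagnosis code
    Event("drug", "A10BA02"),    # illustrative drug code
    Event("lab", "HbA1c"),       # illustrative laboratory variable
]

inputs = build_inputs(patient_state, events, code_emb, modality_emb, position_emb)
print(len(inputs), len(inputs[0]))  # 4 positions, each of width DIM
```

In this sketch the modality embedding plays the role that segment embeddings play in the original BERT, letting a single encoder distinguish diagnoses, prescriptions, clinical variables and laboratory results within one sequence. The resulting list of vectors would then be passed to a Transformer encoder, which is omitted here.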

MeSH terms

Deep Learning
Diabetes Mellitus* / diagnosis
Electronic Health Records*
Humans
Natural Language Processing