Exploring Negated Entities for Named Entity Recognition in Italian Lung Cancer Clinical Reports

Stud Health Technol Inform. 2024 May 23;314:98-102. doi: 10.3233/SHTI240066.

Abstract

This paper explores how electronic health records (EHRs) can be leveraged for personalized health research by applying artificial intelligence (AI) techniques, specifically Named Entity Recognition (NER). By extracting key patient information from clinical texts, including diagnoses, medications, symptoms, and lab tests, NER enables rapid identification of relevant data, paving the way for future care paradigms. The study focuses on non-small cell lung cancer (NSCLC) in Italian clinical notes and introduces a novel set of 29 clinical entities that capture both the presence and the absence (negation) of relevant information associated with NSCLC. Using a state-of-the-art model pretrained on Italian biomedical texts, we achieve promising results (average F1-score of 80.8%), demonstrating the feasibility of using AI to extract biomedical information from Italian-language text.
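
As an illustration only, the sketch below shows one way negated entities could be encoded as separate labels and scored with an entity-level F1 averaged over labels. Everything in it is an assumption rather than the authors' method: the BIO tagging scheme, the hypothetical label names SYMPTOM and SYMPTOM_NEG, the exact-span matching criterion, the macro averaging (the abstract does not state how the average F1-score was computed), and the toy Italian sentence.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (label, start, end) spans.

    `end` is exclusive. An I- tag that does not continue a span of the
    same label is treated like O (a simplifying assumption).
    """
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return spans


def macro_f1(gold_tags, pred_tags):
    """Return (per-label F1 dict, macro-averaged F1) using exact span match."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    scores = {}
    for lab in {span[0] for span in gold | pred}:
        g = {s for s in gold if s[0] == lab}
        p = {s for s in pred if s[0] == lab}
        tp = len(g & p)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        total = precision + recall
        scores[lab] = 2 * precision * recall / total if total else 0.0
    macro = sum(scores.values()) / len(scores) if scores else 0.0
    return scores, macro


# Toy example (hypothetical): "Paziente con tosse , non dispnea ."
# Gold marks "dispnea" as a negated symptom; the prediction misses the negation.
gold = ["O", "O", "B-SYMPTOM", "O", "O", "B-SYMPTOM_NEG", "O"]
pred = ["O", "O", "B-SYMPTOM", "O", "O", "B-SYMPTOM", "O"]
per_label, macro = macro_f1(gold, pred)
print(per_label, macro)
```

In this toy case the span boundaries are all correct, yet the missed negation lowers the score for both labels, which is why treating presence and absence as distinct entity types makes negation errors visible in the evaluation.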

Keywords: EHRs; NER; NSCLC; deep learning; transformer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Carcinoma, Non-Small-Cell Lung / diagnosis
  • Data Mining / methods
  • Electronic Health Records*
  • Humans
  • Italy
  • Lung Neoplasms* / diagnosis
  • Natural Language Processing*