Use of natural language processing to identify patients with inflammatory breast cancer across a health-care system

JNCI Cancer Spectr. 2025 Apr 30;9(3):pkaf058. doi: 10.1093/jncics/pkaf058.

Abstract

Early identification and referral of inflammatory breast cancer remain challenging within large health-care systems, limiting access to specialized care. We developed and evaluated an artificial intelligence-driven platform integrating natural language processing (NLP) with electronic health records to systematically identify patients with potential inflammatory breast cancer across 5 campuses. Our platform analyzed 8 623 494 clinical notes, implementing a sequential review process: NLP screening followed by human validation and multidisciplinary confirmation. Initial NLP screening achieved 55.4% positive predictive value, improving to 78.4% with human-in-the-loop review. Notably, among 255 confirmed patients with inflammatory breast cancer, our system demonstrated 92.2% sensitivity, identifying 57 patients (22.4%) whom traditional surveillance methods had missed. Documentation patterns influenced system performance, with combined inflammatory breast cancer and T4d staging mentions showing the highest predictive value (98.2%). This proof-of-concept study demonstrates that lightweight NLP systems with targeted human review can identify rare cancer cases that may otherwise remain siloed within complex health-care networks, ultimately improving access to specialized care resources.
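
The abstract does not describe how the screening step was implemented, so the following is only a minimal sketch of what a lightweight keyword screen and the two reported metrics might look like. The regular expressions, the function names (`mention_category`, `ppv`, `sensitivity`), and the sample notes are all hypothetical and are not the authors' system. The only figures taken from the abstract are the 255 confirmed patients and the 57 patients missed by traditional surveillance.

```python
import re

# Hypothetical search patterns; the abstract does not list the actual terms.
IBC_PATTERN = re.compile(
    r"\binflammatory\s+breast\s+(?:cancer|carcinoma)\b", re.IGNORECASE
)
# Matches T4d with optional clinical/pathologic prefixes, e.g. "cT4d", "ypT4dN1M0".
T4D_PATTERN = re.compile(r"(?<![A-Za-z0-9])[cpy]{0,2}T4d", re.IGNORECASE)


def mention_category(note: str) -> str:
    """Classify a note by which mention types it contains."""
    has_ibc = bool(IBC_PATTERN.search(note))
    has_t4d = bool(T4D_PATTERN.search(note))
    if has_ibc and has_t4d:
        return "both"
    if has_ibc:
        return "ibc_only"
    if has_t4d:
        return "t4d_only"
    return "none"


def ppv(true_positives: int, flagged: int) -> float:
    """Positive predictive value: confirmed cases among all flagged cases."""
    return true_positives / flagged


def sensitivity(identified: int, confirmed: int) -> float:
    """Sensitivity: identified cases among all confirmed cases."""
    return identified / confirmed


# Invented example notes, for illustration only.
notes = [
    "Impression: inflammatory breast cancer, staged cT4dN1M0.",
    "History of inflammatory breast carcinoma, now in remission.",
    "Staging: T4d N2 M0.",
    "Screening mammogram negative.",
]
categories = [mention_category(n) for n in notes]

# Share of confirmed patients missed by traditional surveillance (from the abstract).
missed_share = round(57 / 255 * 100, 1)
```

With the reported counts, `57 / 255` rounds to 22.4%, matching the abstract. The abstract gives sensitivity only as a percentage; a count of 235 of 255 would be consistent with 92.2%, but that count is an inference and is not stated in the source.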

MeSH terms

  • Adult
  • Aged
  • Artificial Intelligence
  • Early Detection of Cancer* / methods
  • Electronic Health Records*
  • Female
  • Humans
  • Inflammatory Breast Neoplasms* / diagnosis
  • Inflammatory Breast Neoplasms* / pathology
  • Middle Aged
  • Natural Language Processing*
  • Predictive Value of Tests
  • Proof of Concept Study
  • Sensitivity and Specificity

Grants and funding