RadLex Normalization in Radiology Reports

Surabhi Datta; Jordan Godfrey-Stovall; Kirk Roberts

AMIA Annu Symp Proc. 2021 Jan 25:2020:338-347. eCollection 2020.

Authors

Surabhi Datta¹, Jordan Godfrey-Stovall¹, Kirk Roberts¹

Affiliation

¹ School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX.

PMID: 33936406
PMCID: PMC8075450

Abstract

Radiology reports have been widely used to extract clinically significant information about patients' imaging studies. However, limited research has focused on standardizing the extracted entities to a common radiology-specific vocabulary. Further, no study to date has attempted to leverage RadLex for standardization. In this paper, we aim to normalize a diverse set of radiological entities to RadLex terms. We manually construct a normalization corpus of 1706 entity mentions by annotating entities from three types of reports. We propose two deep learning-based NLP methods for automatic normalization, both built on a pre-trained language model (BERT). In each, we first employ BM25 to retrieve candidate concepts, from which a BERT-based model (a re-ranker or a span detector) predicts the normalized concept. The results are promising, with the best accuracy (78.44%) obtained by the span detector. Additionally, we discuss the challenges involved in corpus construction and propose new RadLex terms.
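
To make the candidate-retrieval step concrete, the following is a minimal sketch of BM25 ranking over a small term list. It is not the authors' implementation: the whitespace tokenizer, the `k1` and `b` values (common defaults), the `BM25` class, and the `TERMS` list are all assumptions made for illustration, and the strings in `TERMS` are hypothetical stand-ins, not entries taken from RadLex. In the pipeline the abstract describes, the retrieved candidates would then be passed to a BERT-based re-ranker or span detector, which is not shown here.

```python
import math
from collections import Counter

# Hypothetical stand-in vocabulary; NOT actual RadLex entries.
TERMS = [
    "pleural effusion",
    "pericardial effusion",
    "pulmonary nodule",
    "lung mass",
    "pneumothorax",
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, chosen for brevity)."""
    return text.lower().split()


class BM25:
    """Plain Okapi BM25 scorer over a fixed list of short documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = docs
        self.k1 = k1  # term-frequency saturation (common default)
        self.b = b    # length normalization strength (common default)
        self.tokens = [tokenize(d) for d in docs]
        self.tf = [Counter(t) for t in self.tokens]
        self.avgdl = sum(len(t) for t in self.tokens) / len(self.tokens)
        # Document frequency: number of documents containing each token.
        df = Counter()
        for toks in self.tokens:
            df.update(set(toks))
        n = len(docs)
        # Smoothed IDF that stays positive for every token.
        self.idf = {
            w: math.log(1 + (n - c + 0.5) / (c + 0.5)) for w, c in df.items()
        }

    def score(self, query, i):
        """BM25 score of document i for the given query string."""
        tf = self.tf[i]
        dl = len(self.tokens[i])
        total = 0.0
        for w in tokenize(query):
            if w not in tf:
                continue
            f = tf[w]
            denom = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[w] * f * (self.k1 + 1) / denom
        return total

    def top_k(self, query, k=3):
        """Return up to k documents with a non-zero score, best first."""
        scored = [(self.score(query, i), d) for i, d in enumerate(self.docs)]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: -pair[0])
        return [d for _, d in scored[:k]]


if __name__ == "__main__":
    index = BM25(TERMS)
    # Candidates for an entity mention as it might appear in a report.
    print(index.top_k("small pleural effusion", k=2))
```

Lexical retrieval of this kind only narrows the search space; mentions that share no tokens with the target term return no candidates, which is one reason a learned model is needed for the final prediction.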

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Deep Learning*
Diagnostic Imaging / methods*
Documentation / standards*
Humans
Natural Language Processing*
Radiology Information Systems / standards*
Radiology*
Unified Medical Language System
Vocabulary, Controlled
