Augmenting aer2vec: Enriching distributed representations of adverse event report data with orthographic and lexical information

J Biomed Inform. 2021 Jul:119:103833. doi: 10.1016/j.jbi.2021.103833. Epub 2021 Jun 8.

Abstract

Adverse Drug Events (ADEs) are prevalent, costly, and sometimes preventable. Post-marketing drug surveillance aims to monitor ADEs that occur after a drug is released to market. Reports of such ADEs are aggregated by reporting systems, such as the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). In this paper, we consider how best to represent data derived from FAERS reports for the purpose of detecting post-marketing surveillance signals that can inform regulatory decision making. In our previous work, we developed aer2vec, a method for deriving distributed representations (concept embeddings) of drugs and side effects from ADE reports, establishing the utility of distributional information for pharmacovigilance signal detection. Here, we advance this line of research by evaluating the utility of encoding orthographic and lexical information. We do so by adapting two Natural Language Processing methods, subword embedding and vector retrofitting, which were developed to encode such information into word embeddings. Models were compared on their ability to distinguish between positive and negative examples in a set of manually curated drug/ADE relationships. Both aer2vec enhancements offered performance advantages over baseline models, and the best performance was obtained when retrofitting and subword embeddings were applied in concert. In addition, this work demonstrates that models leveraging distributed representations do not require extensive manual preprocessing to perform well on this pharmacovigilance signal detection task, and may even benefit from information that would otherwise be lost during the normalization and standardization process.
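The two techniques named in the abstract can be illustrated with a minimal, generic sketch. This is not the aer2vec implementation, which the abstract does not describe in detail. It assumes fastText-style character n-grams for the subword component and the commonly used iterative retrofitting update, in which each vector is averaged with its lexicon neighbors while staying anchored to its original value. The function names, parameters, and toy terms below are all hypothetical.

```python
def char_ngrams(term, n_min=3, n_max=5):
    """Return character n-grams of a term, using '<' and '>' as boundary markers.

    Assumption: fastText-style subword units; the paper's exact scheme may differ.
    """
    padded = f"<{term.lower()}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams


def retrofit(vectors, neighbors, iterations=10, alpha=1.0):
    """Pull each vector toward the mean of its lexicon neighbors.

    vectors:   dict mapping term -> list of floats (original embeddings)
    neighbors: dict mapping term -> list of related terms (hypothetical lexicon)
    alpha:     weight keeping a vector close to its original value

    Update per term: q = (mean(neighbor vectors) + alpha * original) / (1 + alpha)
    """
    updated = {term: list(vec) for term, vec in vectors.items()}
    for _ in range(iterations):
        for term, related in neighbors.items():
            known = [r for r in related if r in updated]
            if term not in updated or not known:
                continue  # nothing to retrofit against
            dim = len(updated[term])
            for d in range(dim):
                neighbor_mean = sum(updated[r][d] for r in known) / len(known)
                updated[term][d] = (neighbor_mean + alpha * vectors[term][d]) / (1.0 + alpha)
    return updated


# Toy data, invented for illustration only.
toy_vectors = {"nausea": [0.0, 0.0], "vomiting": [2.0, 2.0], "rash": [5.0, -1.0]}
toy_lexicon = {"nausea": ["vomiting"], "vomiting": ["nausea"]}
toy_retrofitted = retrofit(toy_vectors, toy_lexicon)
```

In this sketch, terms linked in the lexicon ("nausea" and "vomiting") move closer together, while a term with no lexicon entry ("rash") keeps its original vector. The n-gram function shows how orthographically similar strings share subword units, which is the kind of information that aggressive normalization would discard.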

Keywords: Natural language processing; Pharmacovigilance; Post-marketing surveillance; Retrofitting; Subword embeddings; Word embeddings.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adverse Drug Reaction Reporting Systems
  • Drug-Related Side Effects and Adverse Reactions*
  • Humans
  • Natural Language Processing
  • Pharmacovigilance*
  • United States
  • United States Food and Drug Administration