DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires

Nat Commun. 2021 Mar 11;12(1):1605. doi: 10.1038/s41467-021-21879-w.

Abstract

Deep learning algorithms have been utilized to achieve enhanced performance in pattern-recognition tasks. The ability to learn complex patterns in data has tremendous implications in immunogenomics. T-cell receptor (TCR) sequencing assesses the diversity of the adaptive immune system and allows for modeling its sequence determinants of antigenicity. We present DeepTCR, a suite of unsupervised and supervised deep learning methods able to model highly complex TCR sequencing data by learning a joint representation of a TCR by its CDR3 sequences and V/D/J gene usage. We demonstrate the utility of deep learning to provide an improved 'featurization' of the TCR across multiple human and murine datasets, including improved classification of antigen-specific TCRs and extraction of antigen-specific TCRs from noisy single-cell RNA-Seq and T-cell culture-based assays. Our results highlight the flexibility and capacity for deep neural networks to extract meaningful information from complex immunogenomic data for both descriptive and predictive purposes.
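
To make the idea of a joint representation "by its CDR3 sequences and V/D/J gene usage" concrete, the following is a minimal sketch of how one TCR could be turned into a single numeric feature vector. It is an illustration under stated assumptions, not the DeepTCR implementation: the function names (`encode_cdr3`, `encode_gene`, `featurize_tcr`), the padding length, the plain one-hot encoding, and the example sequence and gene vocabularies are all hypothetical choices made here. A deep learning model would typically learn its representation from inputs of this kind rather than use the raw one-hot vector directly.

```python
from typing import List, Sequence

# The 20 standard amino acids, in a fixed order that defines the one-hot columns.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(index: int, size: int) -> List[int]:
    """Return a vector of zeros of length `size` with a 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_cdr3(cdr3: str, max_len: int) -> List[int]:
    """One-hot encode a CDR3 amino acid sequence, zero-padded to `max_len` positions.

    Raises ValueError if the sequence is longer than `max_len` or contains a
    character outside the 20 standard amino acids.
    """
    if len(cdr3) > max_len:
        raise ValueError(f"CDR3 longer than max_len={max_len}: {cdr3}")
    flat: List[int] = []
    for pos in range(max_len):
        if pos < len(cdr3):
            flat.extend(one_hot(AMINO_ACIDS.index(cdr3[pos]), len(AMINO_ACIDS)))
        else:
            flat.extend([0] * len(AMINO_ACIDS))  # padding position
    return flat


def encode_gene(gene: str, vocabulary: Sequence[str]) -> List[int]:
    """One-hot encode a V, D, or J gene name against a known vocabulary."""
    return one_hot(list(vocabulary).index(gene), len(vocabulary))


def featurize_tcr(
    cdr3: str,
    v_gene: str,
    d_gene: str,
    j_gene: str,
    v_vocab: Sequence[str],
    d_vocab: Sequence[str],
    j_vocab: Sequence[str],
    max_len: int = 40,  # assumed padding length, chosen for illustration only
) -> List[int]:
    """Concatenate the CDR3 encoding with the V/D/J gene-usage encodings."""
    return (
        encode_cdr3(cdr3, max_len)
        + encode_gene(v_gene, v_vocab)
        + encode_gene(d_gene, d_vocab)
        + encode_gene(j_gene, j_vocab)
    )


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper's datasets.
    features = featurize_tcr(
        "CASSLGQGAYEQYF",
        "TRBV7-9", "TRBD1", "TRBJ2-1",
        v_vocab=["TRBV5-1", "TRBV7-9"],
        d_vocab=["TRBD1", "TRBD2"],
        j_vocab=["TRBJ1-1", "TRBJ2-1"],
    )
    print(len(features))
```

Keeping the sequence and gene-usage encodings as separate segments of one vector means a downstream model sees both sources of information for the same receptor, which is the sense of "joint" assumed in this sketch.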

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence / genetics
  • Animals
  • Databases, Genetic
  • Deep Learning*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Mice
  • Neural Networks, Computer
  • RNA-Seq / methods
  • Receptors, Antigen, T-Cell / genetics*
  • T-Lymphocytes / immunology*

Substances

  • Receptors, Antigen, T-Cell