scPrediXcan integrates deep learning methods and single-cell data into a cell-type-specific transcriptome-wide association study framework

Cell Genom. 2025 May 14;5(5):100875. doi: 10.1016/j.xgen.2025.100875.

Abstract

Transcriptome-wide association studies (TWASs) help identify disease-causing genes but often fail to pinpoint disease mechanisms at the cellular level because of the limited sample sizes and sparsity of cell-type-specific expression data. Here, we propose scPrediXcan, which integrates state-of-the-art deep learning approaches that predict epigenetic features from DNA sequences with the canonical TWAS framework. Our prediction approach, ctPred, predicts cell-type-specific expression with high accuracy and captures complex gene-regulatory grammar that linear models overlook. Applied to type 2 diabetes (T2D) and systemic lupus erythematosus (SLE), scPrediXcan outperformed the canonical TWAS framework by identifying more candidate causal genes, explaining more genome-wide association study (GWAS) loci and providing insights into the cellular specificity of TWAS hits. Overall, our results demonstrate that scPrediXcan represents a significant advance, promising to deepen our understanding of the cellular mechanisms underlying complex diseases.
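As a rough illustration of the canonical TWAS association step that the abstract builds on, the sketch below computes genetically predicted expression from variant weights and tests it against a trait by simple linear regression. It is a minimal sketch on simulated data, not the authors' implementation. The function names (`predicted_expression`, `twas_zscore`), the simulated genotypes, the weights, and the effect size are all hypothetical. In scPrediXcan, the expression predictions would come from ctPred rather than from randomly drawn linear weights.

```python
import numpy as np

def predicted_expression(genotypes, weights):
    """Genetically predicted expression: dosage matrix (n x p) times per-variant weights (p,)."""
    return genotypes @ weights

def twas_zscore(pred_expr, phenotype):
    """Z-score of the slope from a simple linear regression of phenotype on predicted expression."""
    x = pred_expr - pred_expr.mean()
    y = phenotype - phenotype.mean()
    n = len(x)
    beta = (x @ y) / (x @ x)            # least-squares slope
    resid = y - beta * x
    sigma2 = (resid @ resid) / (n - 2)  # residual variance
    se = np.sqrt(sigma2 / (x @ x))      # standard error of the slope
    return beta / se

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 500, 20
genotypes = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # allele dosages 0/1/2
weights = rng.normal(0.0, 0.5, size=p)                       # hypothetical prediction weights
expr = predicted_expression(genotypes, weights)
phenotype = 0.5 * expr + rng.normal(0.0, 1.0, size=n)        # trait with a true expression effect
z = twas_zscore(expr, phenotype)
print(f"TWAS z-score for the simulated gene: {z:.2f}")
```

A large absolute z-score flags the gene as a candidate. In a cell-type-specific analysis, the same test would be repeated per gene and per cell type using that cell type's predicted expression.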

Keywords: Enformer; GWAS; PrediXcan; TWAS; deep learning; single-cell; single-cell RNA-seq; systemic lupus erythematosus; type 2 diabetes.

MeSH terms

  • Deep Learning*
  • Diabetes Mellitus, Type 2 / genetics
  • Epigenesis, Genetic
  • Gene Expression Profiling / methods
  • Genome-Wide Association Study* / methods
  • Humans
  • Lupus Erythematosus, Systemic / genetics
  • Single-Cell Analysis* / methods
  • Transcriptome* / genetics