Machine learning-based integration identifies plasma cells-related gene signature ST6GAL1 in idiopathic pulmonary fibrosis

BMC Pulm Med. 2025 Jul 2;25(1):295. doi: 10.1186/s12890-025-03696-9.

Abstract

Background: Idiopathic pulmonary fibrosis (IPF) is a rare, progressive fibrotic disease with a poor prognosis and limited treatment options. As a major component of the adaptive immune system in the lung, plasma cells play a crucial regulatory role during fibrosis. This study aimed to systematically explore plasma cell-related genes associated with prognosis in patients with IPF.

Methods: Marker genes for plasma cells were extracted by single-cell RNA sequencing (scRNA-seq) analysis. Hub genes most relevant to IPF status and to the level of plasma cell infiltration were screened by weighted gene co-expression network analysis (WGCNA). Differentially expressed genes (DEGs) were obtained from bulk RNA-seq and microarray data. A machine learning-based integrative procedure was then developed to construct a concordance plasma cell-related gene signature (PCRGS). A core gene in the PCRGS was further identified and validated experimentally. Finally, network pharmacology analysis was performed for the core gene.
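The abstract does not say how the three gene sources (scRNA-seq marker genes, WGCNA hub genes, and DEGs) were combined before the machine learning step. A minimal sketch, assuming the candidate pool is simply the set of genes present in all three lists, could look like the following. The function name `candidate_genes` and the placeholder gene names are hypothetical and are not taken from the study.

```python
def candidate_genes(marker_genes, hub_genes, degs):
    """Return the genes present in all three input lists, sorted by name.

    Assumption: candidates are the plain intersection of the three sources.
    The study may have combined them differently.
    """
    return sorted(set(marker_genes) & set(hub_genes) & set(degs))


if __name__ == "__main__":
    # Placeholder names only; these are not the study's gene lists.
    markers = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
    hubs = ["GENE_D", "GENE_B", "GENE_E"]
    degs = ["GENE_F", "GENE_B", "GENE_D", "GENE_A"]
    print(candidate_genes(markers, hubs, degs))
```

Using sets makes the result independent of the order and duplication of the input lists, which is convenient when the lists come from different pipelines.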

Results: The established PCRGS, based on seven genes (SLAMF7, JCHAIN, PNOC, POU2AF1, MEI1, ST6GAL1, and VOPP1), was identified as an independent prognostic factor for overall survival. It also demonstrated good robustness compared with conventional clinical features and 22 published signatures. ST6GAL1 was ultimately selected as the core gene, and both its localization in plasma cells and its over-expression in the lungs of bleomycin-injured mice were experimentally validated. Small-molecule drug prediction and docking analysis suggested quercetin as the optimal ligand targeting ST6GAL1, with which it might form a stable binding conformation.
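To illustrate how a seven-gene signature of this kind might be scored and evaluated, the sketch below computes a per-patient risk score and Harrell's concordance index (C-index) against overall survival. It rests on two assumptions not stated in the abstract: that the signature is a linear weighted sum of gene expression values, and that the C-index is the evaluation metric. The weights passed to `risk_score` are hypothetical placeholders, not the study's coefficients.

```python
from itertools import combinations

# The seven PCRGS genes named in the abstract.
PCRGS_GENES = ["SLAMF7", "JCHAIN", "PNOC", "POU2AF1", "MEI1", "ST6GAL1", "VOPP1"]


def risk_score(expression, weights):
    """Weighted sum of expression values (assumed linear form of the signature)."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def concordance_index(times, events, scores):
    """Harrell's C-index: fraction of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score.

    times  : follow-up time per patient
    events : 1/True if death was observed, 0/False if censored
    scores : risk score per patient (higher means worse predicted outcome)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs tied in time are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing C-indices across signatures on the same cohort is one common way to judge relative prognostic performance.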

Conclusions: The PCRGS might be used to evaluate IPF prognosis, and ST6GAL1, one of its component genes, is a potential therapeutic target. These results provide an important basis for future studies on the relationship between plasma cell-related genes and IPF.

Keywords: Gene signatures; IPF; Machine learning; Plasma cells; ST6GAL1.

MeSH terms

  • Aged
  • Animals
  • Female
  • Gene Expression Profiling
  • Humans
  • Idiopathic Pulmonary Fibrosis* / genetics
  • Idiopathic Pulmonary Fibrosis* / mortality
  • Machine Learning*
  • Male
  • Mice
  • Plasma Cells* / metabolism
  • Prognosis
  • Sialyltransferases* / genetics
  • Transcriptome

Substances

  • Sialyltransferases