scGT: Integration algorithm for single-cell RNA-seq and ATAC-seq based on graph transformer

Bioinformatics. 2025 Jun 24:btaf357. doi: 10.1093/bioinformatics/btaf357. Online ahead of print.

Abstract

Motivation: Multi-omics analysis of individual cells offers remarkable opportunities for exploring the dynamics and relationships of gene regulatory states across large atlas data. However, current integration algorithms have limited performance, largely because they ignore how correlation features within a dataset affect the discrepancies between omics.

Results: We propose scGT, a Graph Transformer-based model for single-cell RNA-seq and ATAC-seq data. It uses robust graph structures, strengthened by correlation features present in each raw dataset, to harmonize representations of multi-omics data, enabling multi-omics integration and effective label transfer. We compare scGT with other state-of-the-art methods on paired and unpaired datasets. The results show that scGT achieves more accurate label transfer and can integrate datasets with millions of cells. scGT also better preserves biological variation during integration.
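
As a rough illustration of the two ideas named above (a graph built from correlation features within one dataset, and label transfer between omics), the following minimal sketch may help. It is not the scGT implementation: the graph transformer is omitted entirely, a plain k-nearest-neighbour vote in an already-shared embedding stands in for the learned model, and the function names `correlation_knn_graph` and `transfer_labels` are hypothetical.

```python
import numpy as np


def correlation_knn_graph(X, k):
    """Link each cell (row of X) to its k most Pearson-correlated cells.

    Illustrative assumption only; not the graph construction used by scGT.
    """
    corr = np.corrcoef(X)               # cell-by-cell Pearson correlation
    np.fill_diagonal(corr, -np.inf)     # exclude self-edges
    neighbors = np.argsort(-corr, axis=1)[:, :k]
    adj = np.zeros(corr.shape, dtype=bool)
    rows = np.repeat(np.arange(X.shape[0]), k)
    adj[rows, neighbors.ravel()] = True
    return adj | adj.T                  # symmetrize the graph


def transfer_labels(ref_emb, ref_labels, query_emb, k):
    """Label each query cell by majority vote among its k nearest
    reference cells, assuming both sets share one embedding space."""
    ref_labels = np.asarray(ref_labels)
    # Pairwise Euclidean distances, shape (n_query, n_ref)
    dist = np.linalg.norm(query_emb[:, None, :] - ref_emb[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    predicted = []
    for idx in nearest:
        values, counts = np.unique(ref_labels[idx], return_counts=True)
        predicted.append(values[np.argmax(counts)])
    return np.array(predicted)
```

In this sketch the reference would typically be annotated RNA-seq cells and the query ATAC-seq cells, with both embeddings produced by whatever integration model is in use.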

Availability and implementation: The source code and data used in this paper are available at https://github.com/Jinsl-lab/scGT.

Supplementary information: Supplementary data are available at Bioinformatics online.