scMGCL: accurate and efficient integration representation of single-cell multiomics data

Bioinformatics. 2025 Jul 9:btaf392. doi: 10.1093/bioinformatics/btaf392. Online ahead of print.

Abstract

Motivation: Single-cell multi-omics data integration is essential for understanding cellular states and disease mechanisms, yet integrating heterogeneous data modalities remains a challenge. We present scMGCL, a graph contrastive learning framework for robust integration of single-cell ATAC-seq and RNA-seq data. Our approach leverages self-supervised learning on cell-cell similarity graphs, in which each modality's graph structure serves as an augmentation for the other. This cross-modality contrastive paradigm enables the learning of biologically meaningful, shared representations while preserving modality-specific features.
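
To make the cross-modality idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes paired cells (row i in both matrices is the same cell), ATAC-seq features already mapped to the same gene space as RNA-seq (for example as gene activity scores), a cosine kNN cell-cell graph, a single shared linear graph-convolution layer, and a symmetric InfoNCE loss. None of these details are stated in the abstract, and all function names (knn_graph, encode, info_nce, cross_modal_loss) are hypothetical.

```python
import numpy as np

def knn_graph(x, k=5):
    """Row-normalised, symmetrised kNN adjacency with self-loops (cosine similarity)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)            # exclude self when picking neighbours
    idx = np.argsort(-sim, axis=1)[:, :k]     # k most similar cells per row
    n = x.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(n)  # symmetrise and add self-loops
    return adj / adj.sum(axis=1, keepdims=True)

def encode(adj, x, w):
    """One linear graph-convolution layer: neighbour averaging, then projection."""
    return adj @ x @ w

def info_nce(z_a, z_b, tau=0.5):
    """InfoNCE loss: cell i in view A should match cell i in view B."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def cross_modal_loss(x_rna, x_atac, w, k=5, tau=0.5):
    """Assumed scheme: each modality is propagated over the OTHER modality's graph."""
    a_rna = knn_graph(x_rna, k)
    a_atac = knn_graph(x_atac, k)
    z_rna = encode(a_atac, x_rna, w)    # RNA features on the ATAC graph
    z_atac = encode(a_rna, x_atac, w)   # ATAC features on the RNA graph
    return 0.5 * (info_nce(z_rna, z_atac, tau) + info_nce(z_atac, z_rna, tau))

# Synthetic toy data, for illustration only (not from the paper).
rng = np.random.default_rng(0)
x_rna = rng.normal(size=(30, 12))
x_atac = x_rna + 0.1 * rng.normal(size=(30, 12))
w = rng.normal(size=(12, 4))            # shared projection weights (untrained)
loss = cross_modal_loss(x_rna, x_atac, w)
print(f"cross-modal contrastive loss: {loss:.4f}")
```

In a real pipeline the weights would be trained by minimising this loss with an autodiff framework; the sketch only evaluates the loss once to show how the two graphs and two feature matrices could be combined.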

Results: Benchmarking against state-of-the-art methods demonstrates that scMGCL outperforms them in cell-type clustering, label transfer accuracy, and preservation of marker gene correlations. scMGCL also significantly improves computational efficiency, reducing both runtime and memory usage. Extensive analyses of cell-type similarity and functional consistency further validate the method's effectiveness, making it a powerful tool for multi-omics data exploration.

Availability: Code and datasets are released at https://github.com/zlCreator/scMGCL.

Supplementary information: Supplementary data are available at Bioinformatics online.