Segger: Fast and accurate cell segmentation of imaging-based spatial transcriptomics data

Elyas Heidari; Andrew Moorman; Dániel Unyi; Nikhita Pasnuri; Gleb Rukhovich; Domenico Calafato; Anna Mathioudaki; Joseph M Chan; Tal Nawy; Moritz Gerstung; Dana Pe'er; Oliver Stegle

doi:10.1101/2025.03.14.643160

bioRxiv [Preprint]. 2025 Mar 16:2025.03.14.643160. doi: 10.1101/2025.03.14.643160.

Authors

Elyas Heidari^{1,2,3,4}, Andrew Moorman⁵, Dániel Unyi^{2,6}, Nikhita Pasnuri⁵, Gleb Rukhovich¹, Domenico Calafato¹, Anna Mathioudaki¹, Joseph M Chan⁷, Tal Nawy⁵, Moritz Gerstung^{1,8,9,10,11}, Dana Pe'er^{5,12}, Oliver Stegle^{2,3,13}

Affiliations

¹ Artificial Intelligence in Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany.
² Division of Computational Genomics and System Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany.
³ European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.
⁴ Collaboration for joint PhD degree between DKFZ and Heidelberg University, Faculty of Biosciences, Heidelberg, Germany.
⁵ Program for Computational and Systems Biology, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
⁶ Department of Telecommunications and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics.
⁷ Human Oncology & Pathogenesis Program and Department of Medicine, Thoracic Oncology Service, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
⁸ Heidelberg University, Faculty of Computer Science and Mathematics, Heidelberg, Germany.
⁹ Robert Bosch Center for Tumor Diseases, Stuttgart, Germany.
¹⁰ Medical Faculty, Eberhard-Karls-University, Tübingen, Germany.
¹¹ University Hospital Tübingen, Tübingen, Germany.
¹² Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
¹³ Cellular Genetics Programme, Wellcome Sanger Institute, Cambridge, UK.

Abstract

The accurate assignment of transcripts to their cells of origin remains the Achilles heel of imaging-based spatial transcriptomics, despite being critical for nearly all downstream analyses. Current cell segmentation methods are prone to over- and under-segmentation, misassign transcripts to cells, require manual intervention, and suffer from low sensitivity and scalability. We introduce segger, a versatile graph neural network based on a heterogeneous graph representation of individual transcripts and cells, that frames cell segmentation as a transcript-to-cell link prediction task and can leverage single-cell RNA-seq information to improve transcript assignments. On multiple Xenium dataset benchmarks, segger exhibits superior sensitivity and specificity, while requiring orders of magnitude less compute time than existing methods. The user-friendly open-source software implementation has extensive documentation (https://elihei2.github.io/segger_dev/), requires little manual intervention, integrates seamlessly into existing workflows, and enables atlas-scale applications.
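To make the "transcript-to-cell link prediction" framing concrete, the toy sketch below builds candidate transcript-to-cell edges and keeps the best-scoring link per transcript. It is an illustration only and not segger's implementation: the `Transcript` and `Cell` classes, the `radius` and `threshold` parameters, and the distance-based `link_score` are all assumptions introduced here, with `link_score` standing in for the score that a trained graph neural network would produce.

```python
import math
from dataclasses import dataclass


@dataclass
class Transcript:
    """One detected RNA molecule (hypothetical minimal node type)."""
    x: float
    y: float
    gene: str


@dataclass
class Cell:
    """One candidate cell, reduced to a centroid (hypothetical node type)."""
    x: float
    y: float


def candidate_edges(transcripts, cells, radius):
    """Return (transcript_idx, cell_idx, distance) for every pair within `radius`."""
    edges = []
    for ti, t in enumerate(transcripts):
        for ci, c in enumerate(cells):
            d = math.hypot(t.x - c.x, t.y - c.y)
            if d <= radius:
                edges.append((ti, ci, d))
    return edges


def link_score(distance, radius):
    """Placeholder score in [0, 1]; assumed stand-in for a learned model's output."""
    return 1.0 - distance / radius


def assign(transcripts, cells, radius=5.0, threshold=0.5):
    """Assign each transcript to its best-scoring cell, or None if no link passes."""
    best = {}  # transcript index -> (cell index, score)
    for ti, ci, d in candidate_edges(transcripts, cells, radius):
        s = link_score(d, radius)
        if s >= threshold and (ti not in best or s > best[ti][1]):
            best[ti] = (ci, s)
    return [best[ti][0] if ti in best else None for ti in range(len(transcripts))]


transcripts = [
    Transcript(0.5, 0.0, "EPCAM"),   # close to cell 0
    Transcript(9.0, 0.0, "CD3E"),    # close to cell 1
    Transcript(50.0, 50.0, "ACTB"),  # far from every cell
]
cells = [Cell(0.0, 0.0), Cell(10.0, 0.0)]
assignments = assign(transcripts, cells)
print(assignments)
```

Transcripts with no link above the threshold stay unassigned rather than being forced into the nearest cell, which is the practical difference between scoring links and partitioning the image into cell masks.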

Publication types

Preprint