Addressing the mean-variance relationship in spatially resolved transcriptomics data with spoon

Biostatistics. 2024 Dec 31;26(1):kxaf012. doi: 10.1093/biostatistics/kxaf012.

Abstract

An important task in the analysis of spatially resolved transcriptomics (SRT) data is to identify spatially variable genes (SVGs), or genes that vary in a 2D space. Current approaches rank SVGs based on either P-values or an effect size, such as the proportion of spatial variance. However, previous work in the analysis of RNA-sequencing data identified a technical bias with log-transformation, violating the "mean-variance relationship" of gene counts, where highly expressed genes are more likely to have a higher variance in counts but lower variance after log-transformation. Here, we demonstrate the mean-variance relationship in SRT data. Furthermore, we propose spoon, a statistical framework using empirical Bayes techniques to remove this bias, leading to more accurate prioritization of SVGs. We demonstrate the performance of spoon in both simulated and real SRT data. A software implementation of our method is available at https://bioconductor.org/packages/spoon.
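
The mean-variance relationship described above can be illustrated with a small simulation. The following is a minimal sketch, not the spoon method and not code from the package: it assumes Poisson-distributed counts (a simplification; real SRT counts are typically overdispersed), and the function names, gene means, and number of spots are hypothetical choices made only for illustration. It shows that the variance of raw counts grows with mean expression, while the variance of log2(count + 1) shrinks.

```python
import math
import random
import statistics


def sample_poisson(mean, rng):
    """Draw one Poisson count using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def mean_variance_summary(gene_means, n_spots=2000, seed=0):
    """Simulate counts per gene; report variance before and after log2(x + 1).

    Assumption: counts are Poisson with the given mean at every spot, so
    there is no spatial structure and all variance is technical noise.
    """
    rng = random.Random(seed)
    summary = []
    for mu in gene_means:
        counts = [sample_poisson(mu, rng) for _ in range(n_spots)]
        logged = [math.log2(c + 1) for c in counts]
        summary.append({
            "mean": mu,
            "count_var": statistics.variance(counts),
            "log_var": statistics.variance(logged),
        })
    return summary


# Hypothetical genes with low to high mean expression.
summary = mean_variance_summary([5, 20, 80, 320])
for row in summary:
    print(f"mean={row['mean']:>4}  "
          f"count variance={row['count_var']:8.2f}  "
          f"log variance={row['log_var']:.4f}")
```

Because the simulated genes have no spatial pattern, any difference in log-scale variance between them comes from expression level alone, which is the kind of bias that can distort a ranking based on variance components.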

Keywords: Gaussian process regression; empirical Bayes; mean–variance bias; spatial transcriptomics; spatially variable gene.

MeSH terms

  • Bayes Theorem
  • Data Interpretation, Statistical
  • Gene Expression Profiling* / methods
  • Gene Expression Profiling* / statistics & numerical data
  • Humans
  • Sequence Analysis, RNA / methods
  • Software*
  • Transcriptome* / genetics