Background: Single-cell RNA sequencing enables exploration of transcriptomic features at the level of individual cells, but the high dimensionality and sparsity of the data pose substantial challenges for downstream analysis. Feature selection is therefore a critical step to reduce dimensionality and enhance interpretability.
Results: We developed GLP, a robust feature selection algorithm that uses optimized locally estimated scatterplot smoothing (LOESS) regression to capture the relationship between a gene's average expression level and its positive ratio while minimizing overfitting. In our evaluations, GLP consistently outperformed eight leading feature selection methods across three benchmark criteria and improved downstream analysis, a substantial gain in gene subset selection.
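To make the idea of the Results concrete, the following is a minimal sketch of LOESS-based gene ranking, not the authors' implementation. It assumes that the positive ratio is the fraction of cells in which a gene has a nonzero count, that average expression is regressed on positive ratio, and that genes are ranked by how far their average expression lies above the fitted curve. The function names (`gene_stats`, `loess_fit`, `select_genes`) and the `frac` parameter are hypothetical, and the plain tricube-weighted local linear fit stands in for the optimized LOESS described above.

```python
import math


def gene_stats(counts):
    """Per-gene positive ratio and mean expression; counts is genes x cells."""
    ratios, means = [], []
    for row in counts:
        n_cells = len(row)
        ratios.append(sum(1 for v in row if v > 0) / n_cells)
        means.append(sum(row) / n_cells)
    return ratios, means


def loess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights, evaluated at each x.

    frac is the share of points used in each local window (an assumed,
    untuned default; the source does not state its bandwidth choice).
    """
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        sw = swd = swdd = swy = swdy = 0.0
        for xi, yi in zip(x, y):
            d = xi - x0
            u = abs(d) / h
            if u >= 1.0:
                continue
            w = (1.0 - u ** 3) ** 3  # tricube kernel
            sw += w
            swd += w * d
            swdd += w * d * d
            swy += w * yi
            swdy += w * d * yi
        denom = sw * swdd - swd * swd
        if denom <= 1e-12 * sw * swdd:
            # All weighted points share one x value: fall back to their mean.
            fitted.append(swy / sw)
        else:
            # Intercept of the weighted line in d = x - x0 is the fit at x0.
            fitted.append((swdd * swy - swd * swdy) / denom)
    return fitted


def select_genes(counts, n_top, frac=0.5):
    """Indices of the n_top genes with the largest positive residuals."""
    ratios, means = gene_stats(counts)
    fitted = loess_fit(ratios, means, frac=frac)
    residuals = [m - f for m, f in zip(means, fitted)]
    order = sorted(range(len(counts)), key=lambda i: residuals[i], reverse=True)
    return order[:n_top]


if __name__ == "__main__":
    # Toy data: ten genes whose mean tracks their positive ratio, plus one
    # gene (index 10) expressed far above what its positive ratio predicts.
    toy = [[2] * i + [0] * (10 - i) for i in range(1, 11)] + [[20] * 5 + [0] * 5]
    print(select_genes(toy, n_top=1))
```

The sketch re-sorts distances for every point, so it is quadratic in the number of genes. For real data, a library smoother would be the more practical choice.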
Conclusions: By preserving key biological information through feature selection, GLP provides informative features to enhance the accuracy and effectiveness of downstream analyses.
Keywords: Feature selection; Highly variable genes; Single-cell transcriptome.
© 2025. The Author(s).