Long non-coding RNAs (lncRNAs) play important roles in physiological and pathological processes. Accurately predicting lncRNA-protein interactions (LPIs) is a vital strategy for clarifying the functions and pathogenic mechanisms of lncRNAs. Current computational methods for evaluating LPIs leave significant room for improvement in utility and generalization. In this study, splitting the data with protein clusters as group information reveals that many LPI prediction methods generalize poorly because of data leakage that arises when the biological properties of LPIs are ignored. To address this issue, we present LPItabformer, a tabular Transformer framework for predicting LPIs that incorporates a domain shifts with uncertainty (DSU) module to enhance generalization. LPItabformer can alleviate the generalization challenges associated with biases in LPI data and preferences in protein binding patterns. In addition, LPItabformer shows greater robustness and generalization on human and mouse LPI datasets than state-of-the-art methods. Finally, we verify that LPItabformer is capable of predicting novel LPIs. Code is available at https://github.com/Ci-TJ/LPItabformer.
Keywords: Deep learning; Generalization; LncRNA-protein interactions; Long non-coding RNA; Tabular Transformer.
© 2025 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
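The group-aware data splitting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn's GroupKFold as a stand-in splitter, and the pair list, the protein_cluster mapping, and the labels are hypothetical toy values. The point is only that every pair involving proteins from the same cluster lands on one side of the split, so no protein cluster leaks from training into testing.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical toy data: (lncRNA, protein) pairs with binary interaction labels.
pairs = [
    ("lnc1", "P1"), ("lnc2", "P1"), ("lnc1", "P2"), ("lnc3", "P2"),
    ("lnc2", "P3"), ("lnc3", "P3"), ("lnc1", "P4"), ("lnc2", "P4"),
]
labels = np.array([1, 0, 1, 1, 0, 1, 0, 1])

# Hypothetical cluster assignment (e.g. from sequence-similarity clustering);
# P1 and P2 are assumed to be similar and therefore share a cluster.
protein_cluster = {"P1": 0, "P2": 0, "P3": 1, "P4": 2}
groups = np.array([protein_cluster[protein] for _, protein in pairs])

# Placeholder features: one row per pair (real features would encode both molecules).
features = np.arange(len(pairs)).reshape(-1, 1)

# GroupKFold keeps all samples of a group in the same fold.
splitter = GroupKFold(n_splits=3)
folds = list(splitter.split(features, labels, groups=groups))

for train_idx, test_idx in folds:
    train_clusters = set(groups[train_idx])
    test_clusters = set(groups[test_idx])
    # No protein cluster appears in both the training and the test set.
    assert train_clusters.isdisjoint(test_clusters)
```

A random split over pairs would instead allow the same protein, or a close homolog, to appear in both training and test sets, which is the kind of leakage the abstract describes.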