Accurate prognosis is a critical component of oncology research, enabling personalized treatment planning and optimized healthcare resource utilization. While existing prognostic models demonstrate promising performance on restricted datasets, they remain constrained by two limitations: modality-specific architectural designs and cancer type-specific training paradigms that hinder cross-domain generalization. To address these challenges, the Unified Multi-modal Pan-cancer Survival Network (UMPSNet) is introduced, which integrates histopathology images, genomic expression profiles, and four metadata categories through structured text templates. UMPSNet employs optimal transport (OT)-based attention for multi-modal feature alignment and a guided mixture of experts (GMoE) mechanism to address cancer-type distribution shifts. Comprehensive evaluation across 3,523 whole slide images (WSIs) from 2,831 patients spanning five TCGA cohorts demonstrated superior predictive performance (mean C-index=0.725), surpassing meticulously designed single-cancer models. Notably, in zero-shot transfer evaluation involving 392 pancreatic adenocarcinoma WSIs from 66 patients at Peking University Third Hospital, UMPSNet achieved a C-index of 0.652 without parameter fine-tuning, demonstrating generalization capacity for previously unseen malignancies. Additionally, UMPSNet identified prognostic gene signatures that consistently overlapped with clinically detected mutations (n=92) while revealing novel gene candidates, validating its clinical relevance and providing complementary insights for precision oncology. The UMPSNet framework establishes a new paradigm for multi-modal survival analysis by overcoming data heterogeneity and domain shift challenges, thereby providing a clinically adaptable tool for pan-cancer prognostic prediction.
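To illustrate the general idea behind OT-based cross-modal alignment, the following is a minimal sketch, not the authors' implementation: it computes an entropy-regularized transport plan between hypothetical histology-patch embeddings and genomic embeddings via Sinkhorn iterations, then uses the plan rows as attention-like weights. All dimensions, variable names, and the regularization value are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT plan via Sinkhorn iterations (uniform marginals).

    cost: (n, m) pairwise cost matrix between two token sets.
    Returns the (n, m) transport plan P with row sums ~1/n and col sums ~1/m.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # source marginal (patches)
    b = np.full(m, 1.0 / m)          # target marginal (genes)
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # scale columns toward marginal b
        u = a / (K @ v)              # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

# Toy alignment with hypothetical embeddings (dimensions are made up).
rng = np.random.default_rng(0)
patches = rng.normal(size=(5, 8))    # 5 histology patch embeddings
genes = rng.normal(size=(3, 8))      # 3 genomic feature embeddings
cost = np.linalg.norm(patches[:, None] - genes[None, :], axis=-1)
P = sinkhorn_plan(cost)
# Row-normalized plan acts as attention: each patch aggregates genomic context.
aligned = (P / P.sum(axis=1, keepdims=True)) @ genes
```

In this sketch the transport plan replaces a learned softmax attention map; its marginal constraints encourage every token in both modalities to participate in the alignment rather than letting a few dominate.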
Keywords: Deep learning; Multi-modal integration; Pan-cancer prognosis; Zero-shot generalization.
Copyright © 2025 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.