Hierarchical Swin Transformer Ensemble with Explainable AI for Robust and Decentralized Breast Cancer Diagnosis

Bioengineering (Basel). 2025 Jun 13;12(6):651. doi: 10.3390/bioengineering12060651.

Abstract

Early and accurate detection of breast cancer is essential for reducing mortality and improving clinical outcomes. However, deep learning (DL) models used in healthcare face significant challenges, including data privacy concerns, domain-specific overfitting, and limited interpretability. To address these issues, we propose BreastSwinFedNetX, a federated learning (FL)-enabled ensemble system that combines four hierarchical variants of the Swin Transformer (Tiny, Small, Base, and Large) with a Random Forest (RF) meta-learner. By utilizing FL, our approach enables collaborative model training across decentralized, institution-specific datasets while preserving data locality and preventing exposure of raw patient data. The model generalizes well across five benchmark datasets (BreakHis, BUSI, INbreast, CBIS-DDSM, and a Combined dataset), achieving an F1 score of 99.34% on BreakHis, a PR AUC of 98.89% on INbreast, and a Matthews Correlation Coefficient (MCC) of 99.61% on the Combined dataset. To enhance transparency and support clinical adoption, we incorporate explainable AI (XAI) through Grad-CAM, which highlights class-discriminative image regions. Additionally, we deploy the model in a real-time web application that supports uncertainty-aware predictions and clinician interaction while maintaining compliance with GDPR and HIPAA through secure federated deployment. Extensive ablation studies and paired statistical analyses further confirm the significance and robustness of each architectural component. By integrating transformer-based architectures, secure collaborative training, and explainable outputs, BreastSwinFedNetX provides a scalable and trustworthy AI solution for real-world breast cancer diagnostics.
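
To make the federated training idea concrete, the following is a minimal sketch of one server-side aggregation round in the style of federated averaging. The abstract does not state which aggregation rule BreastSwinFedNetX uses, so sample-size-weighted averaging is an assumption here, and the names `federated_average`, `client_weights`, and `client_sizes` are hypothetical. Only parameter values and sample counts are exchanged; raw patient images stay at each institution.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: each model is a dict mapping a parameter
# name to a flat list of floats. Real Swin Transformer weights would be
# tensors; flat lists keep the sketch dependency-free.
Weights = Dict[str, List[float]]


def federated_average(
    client_weights: Sequence[Weights],
    client_sizes: Sequence[int],
) -> Weights:
    """Return the sample-size-weighted average of the client parameters.

    Assumption: FedAvg-style aggregation. The source abstract does not
    specify the aggregation rule actually used.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        # Average each parameter element-wise, weighting every client
        # by the number of local training samples it contributed.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical institutions with 1 and 3 local samples respectively.
hospital_a = {"head.weight": [1.0, 2.0]}
hospital_b = {"head.weight": [3.0, 4.0]}
global_model = federated_average([hospital_a, hospital_b], [1, 3])
```

In a full round, each institution would fine-tune its local copy of the Swin models, send the updated parameters to the server, and receive the averaged model back before the next round.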

Keywords: breast cancer; clinical decision support; ensemble learning; federated learning; privacy-preserving; vision transformers.