LoRA-Enhanced RT-DETR: First Low-Rank Adaptation based DETR for real-time full body anatomical structures identification in musculoskeletal ultrasound

Comput Med Imaging Graph. 2025 Jun 18:124:102583. doi: 10.1016/j.compmedimag.2025.102583. Online ahead of print.

Abstract

Medical imaging models for object identification often rely on extensive pretraining data, which is difficult to obtain due to data scarcity and privacy constraints. In practice, hospitals typically have access only to pretrained model weights and not the original training data, which limits their ability to tailor models to specific patient populations and imaging devices. We address this challenge with the first Low-Rank Adaptation (LoRA)-enhanced Real-Time Detection Transformer (RT-DETR) model for full-body musculoskeletal (MSK) ultrasound (US). By injecting LoRA modules into select encoder and decoder layers of RT-DETR, we achieved a 99.45% (RT-DETR-L) and 99.68% (RT-DETR-X) reduction in trainable parameters while preserving the model's representational power. This extreme reduction enables efficient fine-tuning using only minimal institution-specific data and maintains robust performance even on anatomical structures absent from the fine-tuning set. In extensive 5-fold cross-validation, our LoRA-enhanced model outperformed traditional full-model fine-tuning, maintaining or improving detection accuracy across a wide range of MSK structures while demonstrating strong resilience to domain shifts. The proposed LoRA-enhanced RT-DETR significantly lowers the barrier for deploying transformer-based detection in clinics, offering a privacy-conscious, computationally lightweight solution for real-time, full-body MSK US identification.
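
The parameter savings described above follow from the general LoRA idea: a pretrained weight matrix W is frozen and only a low-rank update B·A is trained. The following is a minimal pure-Python sketch of that idea for a single linear layer, not the authors' implementation. The layer sizes, rank, scaling factor `alpha`, and all function names are hypothetical choices for illustration; none are taken from the paper.

```python
# Illustrative LoRA sketch for one linear layer (pure Python, no dependencies).
# All sizes, the rank, and alpha are hypothetical; they are NOT the paper's settings.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(x, W, A, B, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen pretrained weight.
    A (r x d_in) and B (d_out x r) are the only trainable factors.
    """
    rank = len(A)
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # low-rank trainable path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

def lora_trainable_params(d_out, d_in, rank):
    """Number of trainable values in A and B combined."""
    return rank * (d_out + d_in)

def reduction_percent(d_out, d_in, rank):
    """Percent reduction in trainable parameters for this one layer."""
    full = d_out * d_in
    return 100.0 * (1.0 - lora_trainable_params(d_out, d_in, rank) / full)

# Tiny example: a 2x3 frozen weight with a rank-1 adapter.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.5]]   # r x d_in
B = [[0.0], [0.0]]      # d_out x r, zero-initialised so the adapter starts as a no-op
x = [1.0, 2.0, 3.0]

y = lora_forward(x, W, A, B, alpha=2.0)
layer_reduction = reduction_percent(256, 256, 4)  # hypothetical 256x256 layer, rank 4
```

With B initialised to zero, the adapted layer reproduces the pretrained output exactly, so fine-tuning starts from the original model's behaviour. The per-layer reduction in this sketch is smaller than the whole-model figures reported in the abstract; a whole-model figure also depends on how many layers receive adapters and how many remain fully frozen.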

Keywords: Deep Learning; Fine-tune; Low-Rank Adaptation; Musculoskeletal Ultrasound; Object Detection; Transformer.