Large models in medical imaging: Advances and prospects

Chin Med J (Engl). 2025 Jun 20. doi: 10.1097/CM9.0000000000003699. Online ahead of print.

Abstract

Recent advances in large models show significant promise for transforming the field of medical imaging. These models, including large language models, large vision models, and multimodal large models, offer unprecedented capabilities in processing and interpreting complex medical data across various imaging modalities. By leveraging self-supervised pretraining on vast unlabeled datasets, cross-modal representation learning, and domain-specific medical knowledge adaptation through fine-tuning, large models can achieve higher diagnostic accuracy and more efficient workflows for key clinical tasks. This review summarizes the concepts, methods, and progress of large models in medical imaging, highlighting their potential in precision medicine. The article first outlines the integration of multimodal data under large model technologies, approaches for training large models with medical datasets, and the need for robust evaluation metrics. It then explores how large models can revolutionize applications in critical tasks such as image segmentation, disease diagnosis, personalized treatment strategies, and real-time interactive systems, thus pushing the boundaries of traditional imaging analysis. Despite this potential, the practical implementation of large models in medical imaging faces notable challenges, including the scarcity of high-quality medical data, the need for optimized perception of imaging phenotypes, safety considerations, and seamless integration with existing clinical workflows and equipment. As research progresses, the development of more efficient, interpretable, and generalizable models will be critical to ensuring their reliable deployment across diverse clinical environments. This review aims to offer insights into the current state of the field and to outline directions for future research that can facilitate the broader adoption of large models in clinical practice.

Keywords: Artificial intelligence; Diagnosis; Interactive system; Large language model; Large vision model; Multimodal data; Segmentation.