Background: Major depressive disorder (MDD) affects more than 300 million people worldwide and is a significant public health issue. However, the uneven distribution of medical resources and the complexity of diagnostic methods have left the disorder inadequately addressed in many countries and regions.
Methods: This paper introduces MDD-LLM, an AI-driven MDD diagnosis framework that uses fine-tuned large language models (LLMs) and extensive real-world samples to address challenges in MDD diagnosis. Specifically, we select 274,348 individual records from the UK Biobank cohort and design three tabular data transformation methods to create a large corpus for training and evaluating the proposed method. To illustrate the advantages of MDD-LLM, we perform comprehensive experiments and compare it against existing model-based solutions across multiple evaluation metrics.
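As an illustration of what a tabular data transformation can look like, the following minimal sketch serializes one record into a text prompt. It is not one of the three methods designed in this paper: the template, the function names `row_to_text` and `build_prompt`, and the field names are all hypothetical.

```python
from typing import Mapping, Optional


def row_to_text(record: Mapping[str, Optional[object]]) -> str:
    """Serialize one tabular record as 'feature is value' clauses.

    Missing values (None) are skipped rather than imputed.
    """
    parts = [
        f"{name} is {value}"
        for name, value in record.items()
        if value is not None
    ]
    return "; ".join(parts) + "."


def build_prompt(record: Mapping[str, Optional[object]]) -> str:
    """Wrap the serialized record in a hypothetical instruction template."""
    instruction = (
        "Based on the following participant information, answer whether "
        "the participant has major depressive disorder (yes or no)."
    )
    return instruction + "\n" + row_to_text(record)


# Hypothetical record; these field names are illustrative placeholders.
example = {
    "age": 54,
    "sex": "female",
    "sleep_duration_hours": 6,
    "smoking_status": None,
}
print(build_prompt(example))
```

A fine-tuning corpus could then pair each such prompt with its diagnostic label as the target text.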
Results: Experimental results show that MDD-LLM (70B) achieves an accuracy of 0.8378 and an AUC of 0.8919 (95% CI: 0.8799-0.9040), significantly outperforming existing machine learning and deep learning frameworks for MDD diagnosis. Given the limited exploration of LLMs in MDD diagnosis, we examine several factors that may influence the performance of the proposed method, including tabular data transformation techniques and fine-tuning strategies. We also analyze the model's interpretability by requiring MDD-LLM to explain its predictions and give the reasons for them.
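For readers unfamiliar with how an AUC and its confidence interval can be obtained, the sketch below computes the AUC from its pairwise definition and estimates a percentile bootstrap interval. This is one common approach and is an assumption here; the procedure actually used to compute the interval reported above is not specified in this abstract. The function names and the toy data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as 0.5. Quadratic time; fine for a sketch."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only; not drawn from the study.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
print(auc(toy_labels, toy_scores))
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200))
```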
Conclusion: This paper investigates the application of LLMs and large-scale training samples for diagnosing MDD. The findings indicate that LLM-driven schemes offer significant potential for accuracy, robustness, and interpretability in MDD diagnosis compared with traditional model-based solutions.
Keywords: Artificial intelligence; Large language models; Major depressive disorder; Medical data processing; Supervised fine-tuning.
Copyright © 2025. Published by Elsevier B.V.