Frontostriatal degeneration in Parkinson's disease (PD) is associated with language deficits, which can be identified using natural language processing, a promising tool for digital phenotyping. However, current evidence largely overlooks the disorder's cognitive phenotypes. We validated an AI-driven approach to capture digital language markers of PD with and without mild cognitive impairment (PD-MCI and PD-nMCI, respectively) relative to healthy controls (HCs). We extracted linguistic features from participants' connected speech with CLAN software and performed classification using support vector machines (SVM) and recursive feature elimination (RFE). Discrimination between PD and HCs reached an area under the curve (AUC) of 77%, with higher values when each patient subgroup was compared with HCs (AUC: 85% for PD-nMCI vs. HCs; 83% for PD-MCI vs. HCs) and an AUC of 75% for PD-nMCI vs. PD-MCI. Key linguistic features included retracing, action-verb, utterance-error, and verbless-utterance ratios. Despite the small sample size, which may limit statistical power and generalizability, this study highlights the potential of linguistic digital markers for early diagnosis and phenotyping of PD.
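As a reading aid for the AUC values reported above: an AUC can be interpreted as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. The following is a minimal sketch of that pairwise definition, not the study's analysis code. The function name `roc_auc` and all scores are hypothetical, invented purely for illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (NOT data from the study)
pd_scores = [0.9, 0.8, 0.6, 0.4]   # assumed scores for PD participants
hc_scores = [0.7, 0.5, 0.3, 0.2]   # assumed scores for healthy controls

auc = roc_auc(pd_scores, hc_scores)
print(f"AUC = {auc:.2%}")
```

Under this definition, an AUC of 50% corresponds to chance-level ranking and 100% to perfect separation of the two groups.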
© 2025. The Author(s).