Addressing Limited Generalizability in Artificial Intelligence-Based Brain Aneurysm Detection for Computed Tomography Angiography: Development of an Externally Validated Artificial Intelligence Screening Platform

Samuel D Pettersson; Jean Filo; Peter Liaw; Paulina Skrzypkowska; Tomasz Klepinowski; Tomasz Szmuda; Thomas B Fodor; Felipe Ramirez-Velandia; Piotr Zieliński; Yu-Ming Chang; Philipp Taussky; Christopher S Ogilvy

doi:10.1227/neu.0000000000003549

Neurosurgery. 2025 Jun 9. doi: 10.1227/neu.0000000000003549. Online ahead of print.

Affiliations

¹ Division of Neurosurgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA.
² Department of Neurosurgery, Medical University of Gdansk, Gdansk, Poland.
³ Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA.

PMID: 40488539

Abstract

Background and objectives: Brain aneurysm detection models, both in the literature and in industry, continue to lack generalizability during external validation, limiting clinical adoption. This challenge is largely due to extensive exclusion criteria during training data selection. The authors developed the first model to achieve generalizability using novel methodological approaches.

Methods: Computed tomography angiography (CTA) scans from 2004 to 2023 at the study institution were used for model training, including untreated unruptured intracranial aneurysms without extensive cerebrovascular disease. External validation used digital subtraction angiography-verified CTAs from an international center, while prospective validation was performed at the study institution over 9 months. A public web platform was created for further model validation.

Results: A total of 2194 CTA scans were used for this study. One thousand five hundred eighty-seven patients and 1920 aneurysms with a mean size of 5.3 ± 3.7 mm were included in the training cohort. The mean age of the patients was 69.7 ± 14.9 years, and 1203 (75.8%) were female. The model achieved a training Dice score of 0.88 and a validation Dice score of 0.76. Prospective internal validation on 304 scans yielded a lesion-level (LL) sensitivity of 82.5% (95% CI: 75.5-87.9) and specificity of 89.6% (95% CI: 84.5-93.2). External validation on 303 scans demonstrated comparable LL sensitivity and specificity of 83.5% (95% CI: 75.1-89.4) and 92.9% (95% CI: 88.8-95.6), respectively. Radiologist LL sensitivity from the external center was 84.5% (95% CI: 76.2-90.2), and 87.5% of the aneurysms missed by the radiologists were detected by the model.
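
For readers less familiar with the metrics reported above, the following is a minimal sketch of how a Dice score and a lesion-level proportion with a 95% confidence interval can be computed. It is not the authors' code. The function names are hypothetical, the Wilson score interval is an assumption (the abstract does not state which interval method was used), and the counts in the usage lines are illustrative values rather than data reported in the abstract.

```python
from math import sqrt


def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def sensitivity(tp, fn):
    """Fraction of true lesions that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negatives that were correctly left unflagged."""
    return tn / (tn + fp)


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


# Toy masks: one voxel of overlap, two predicted voxels, one true voxel.
toy_dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])

# Illustrative counts only (not reported in the abstract): 87 of 103 detected.
toy_sensitivity = sensitivity(tp=87, fn=16)
toy_low, toy_high = wilson_interval(87, 103)
```

With these illustrative counts the sensitivity is about 84.5% and the Wilson interval runs from roughly 76% to 90%. Other interval methods, such as Clopper-Pearson, would give slightly different bounds.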

Conclusion: The authors developed the first publicly testable artificial intelligence model for aneurysm detection on CTA scans, demonstrating generalizability and state-of-the-art performance in external validation. The model addresses key limitations of previous efforts and enables broader validation through a web-based platform.

Keywords: Artificial intelligence; Computed tomography angiography; Convolutional neural networks; Detection; Intracranial aneurysm; Segmentation.