Predicting semantic segmentation quality in laryngeal endoscopy images

Andreas M Kist; Sina Razi; René Groh; Florian Gritsch; Anne Schützenberger

doi:10.1371/journal.pone.0314573

PLoS One. 2025 Jul 3;20(7):e0314573. doi: 10.1371/journal.pone.0314573. eCollection 2025.

Authors

Andreas M Kist¹, Sina Razi¹, René Groh¹, Florian Gritsch¹, Anne Schützenberger²

Affiliations

¹ Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Bavaria, Germany.
² Division Phoniatrics and Pediatric Audiology, Department Otolaryngology, Head- and Neck-Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Bavaria, Germany.

Abstract

Endoscopy is a major tool for assessing the physiology of inner organs. Contemporary artificial intelligence methods are used to label medically important classes fully automatically on a pixel-by-pixel level. This so-called semantic segmentation is used, for example, to detect cancer tissue or to assess laryngeal physiology. However, because of the diversity of patients presenting, it is necessary to judge the segmentation quality. In this study, we present a fully automatic system to evaluate the segmentation performance in laryngeal endoscopy images. Using glottal area segmentation as an example, we show that the predicted segmentation quality, expressed as the intersection over union (IoU) metric, is on par with human raters. Using a traffic light system, we are able to identify problematic segmentation frames to allow human-in-the-loop improvements, which is important for the clinical adaptation of automatic analysis procedures.
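As an illustration of the two ideas named in the abstract, the following minimal sketch computes the intersection over union of two binary masks and maps a quality score to a traffic light color. It is not the authors' implementation: the function names `iou` and `traffic_light`, the thresholds 0.5 and 0.8, and the convention for two empty masks are all assumptions chosen for this example. In the study, the quality score is predicted by the system rather than computed against a reference mask as done here.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def traffic_light(score: float, red_below: float = 0.5,
                  green_from: float = 0.8) -> str:
    """Map a quality score in [0, 1] to a color (hypothetical thresholds)."""
    if score >= green_from:
        return "green"   # frame can likely be used as is
    if score >= red_below:
        return "yellow"  # frame may deserve a second look
    return "red"         # frame should be reviewed by a human


# Toy 2x3 masks: 2 overlapping pixels, 4 pixels in the union.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

score = iou(pred, target)
print(score, traffic_light(score))  # 0.5 yellow
```

Frames that fall into the "red" or "yellow" bin would be the candidates for the human-in-the-loop correction described above; where exactly the cut-offs should lie depends on the clinical application and is not specified here.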

Copyright: © 2025 Kist et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Algorithms
Artificial Intelligence
Humans
Image Processing, Computer-Assisted* / methods
Laryngoscopy* / methods
Larynx* / diagnostic imaging
Semantics*