Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis

Muhammad Saad; Muhammad A Moqeet; Hassan Mansoor; Shama Khan; Rabia Sharif; Fahim Ullah Khan; Ali H Naqvi; Warda Ali

doi:10.7759/cureus.79688

Cureus. 2025 Feb 26;17(2):e79688. doi: 10.7759/cureus.79688. eCollection 2025 Feb.

Authors

Muhammad Saad¹, Muhammad A Moqeet², Hassan Mansoor², Shama Khan², Rabia Sharif², Fahim Ullah Khan³, Ali H Naqvi¹, Warda Ali²

Affiliations

¹ Ophthalmology, Al-Shifa Trust Eye Hospital, Rawalpindi, PAK.
² Cornea and Refractive Surgery, Al-Shifa Trust Eye Hospital, Rawalpindi, PAK.
³ Cornea, Al-Shifa Trust Eye Hospital, Rawalpindi, PAK.

Abstract

Background: Vernal keratoconjunctivitis (VKC) is a recurrent allergic eye disease that requires accurate patient education to ensure proper management. AI-driven chatbots, such as Google Gemini Advanced (Mountain View, California, US), are increasingly being explored as potential tools for providing medical information. This study evaluates the accuracy, reliability, and clinical applicability of Google Gemini Advanced in addressing VKC-related queries.

Objective: To assess the performance of Google Gemini Advanced in delivering medically accurate and relevant information about VKC and to evaluate its reliability based on expert ratings.

Methods: A total of 125 responses generated by Google Gemini Advanced for 25 VKC-related questions were assessed by two independent cornea specialists. Responses were rated on accuracy, completeness, and potential harm using a 5-point Likert scale (1-5). Inter-rater reliability was measured using Cronbach's alpha. Responses were categorized as highly accurate (score of 5), having minor inconsistencies (score of 4), or inaccurate (scores of 1-3).

Results: Inter-rater reliability for the ratings of Google Gemini Advanced responses was high (Cronbach's alpha = 0.92, 95% CI: 0.87-0.94). Of the 125 responses, 108 (86.4%) were rated highly accurate (score of 5), while 17 (13.6%) had minor inconsistencies (score of 4) but posed no potential for harm. No responses were classified as inaccurate or potentially harmful. The combined mean score was 4.88 ± 0.31, consistent with the strong agreement between raters. The chatbot consistently provided reliable information across diagnostic, treatment, and prognosis-related queries, with minor gaps in complex grading and treatment-related discussions.

Discussion: The findings support the use of AI-driven chatbots like Google Gemini Advanced as potential tools for patient education in ophthalmology. The chatbot exhibited strong accuracy and consistency, particularly in addressing general VKC-related queries. However, areas for improvement remain, especially in providing detailed guidance on treatment protocols and ensuring completeness in responses to complex clinical questions.

Conclusion: Google Gemini Advanced demonstrates high reliability and accuracy in delivering medical information about VKC, making it a valuable tool for patient education. While its responses are consistent and generally accurate, expert oversight remains necessary to refine AI-generated content for clinical applications. Further research is needed to enhance AI-driven chatbots' ability to provide nuanced medical advice and integrate them safely into ophthalmic patient education and clinical decision-making.
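As an illustration of the scoring scheme described in the Methods, the following is a minimal Python sketch, not the authors' analysis code. It assumes the standard Cronbach's alpha formula with the two raters treated as items, and the function names (`cronbach_alpha`, `categorize`, `percent`) and the toy ratings are hypothetical. Only the counts 108, 17, and 125 come from the abstract.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha, treating each rater as one 'item'.

    ratings_by_rater: list of equal-length score lists, one list per rater.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(ratings_by_rater)
    n = len(ratings_by_rater[0])
    sum_item_var = sum(variance(r) for r in ratings_by_rater)
    totals = [sum(r[i] for r in ratings_by_rater) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_var / variance(totals))


def categorize(score):
    """Map a single 1-5 Likert score to the categories named in the Methods."""
    if score == 5:
        return "highly accurate"
    if score == 4:
        return "minor inconsistencies"
    return "inaccurate"


def percent(count, total):
    """Share of responses in a category, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical toy ratings for six responses (NOT the study data).
rater_a = [5, 5, 4, 5, 4, 5]
rater_b = [5, 5, 4, 5, 5, 5]
toy_alpha = cronbach_alpha([rater_a, rater_b])

# Counts reported in the abstract: 108 and 17 out of 125 responses.
highly_accurate_pct = percent(108, 125)
minor_issue_pct = percent(17, 125)

if __name__ == "__main__":
    print(f"toy alpha: {toy_alpha:.3f}")
    print(f"highly accurate: {highly_accurate_pct}%")
    print(f"minor inconsistencies: {minor_issue_pct}%")
```

The abstract does not state how the two raters' scores were combined into one category per response, so `categorize` is applied to a single score here.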

Keywords: artificial intelligence (ai); chatgpt; co-pilot; google gemini; health sciences; medical education; medical research; patient care.