Large Language Model-Supported Interactive Case-Based Learning: A Pilot Study
Haelynn Gim1, Benjamin Cook2, Jasmin Le2, Christina Guo3, Matthew Arnold2, Brandon Stretton2, Galina Gheihman1, Stephen Bacchi1
1Harvard Medical School, 2Adelaide Medical School, 3Johns Hopkins University
Objective:

The aim of this pilot study was to evaluate the performance of a large language model (LLM) in responding to medical students' case-based questions (e.g., by emulating a patient).

Background:

Artificial intelligence (AI), including LLMs, has been proposed as a potential strategy to augment case-based learning (CBL). However, the use of LLMs for educational purposes is mired in concerns regarding “hallucinations”. This risk of generating faulty information is frequently discussed in the literature but has been relatively understudied in the CBL context.

Design/Methods:

This study employed a cross-sectional analysis of the ability of an LLM to respond to medical student questions, as may occur in a CBL scenario. Five descriptions of patient cases were prepared, each with a different neurological presenting complaint. OpenAI’s GPT-4o-mini was used to emulate the patient in each interaction through a custom online interface. Medical student investigators (H.G., J.L., and B.C.) interrogated the LLM cases through free-text questioning regarding history, examination, and investigation results to arrive at a diagnosis. All student-investigator questions and AI-generated responses were evaluated by medical officers.
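The abstract does not describe the custom interface in implementation detail. The following is a minimal sketch, assuming the OpenAI Python SDK and a simple chat-completions loop, of how a prepared case might be used to have GPT-4o-mini emulate a patient; the case text, prompt wording, and function name (ask_patient) are illustrative assumptions rather than the study's actual setup.

```python
# Minimal sketch (not the study's actual interface): using the OpenAI Python SDK
# to have gpt-4o-mini role-play a patient from a prepared case description.
# The case description and prompt wording below are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical case description; the study used five prepared neurological cases.
CASE = (
    "You are role-playing a patient in a case-based learning exercise. "
    "Case details: a 58-year-old presenting with sudden-onset right-sided "
    "weakness and slurred speech beginning two hours ago. Answer the "
    "student's questions in character. Only reveal information consistent "
    "with the case; if asked about details not specified, say you are "
    "unsure rather than inventing findings."
)

# Conversation history, seeded with the case as a system message.
history = [{"role": "system", "content": CASE}]

def ask_patient(question: str) -> str:
    """Send one student question and return the simulated patient's reply."""
    history.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask_patient("When did your symptoms start?"))
```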

Results:

A total of 857 question-response pairs were generated from the interrogation of the five LLM cases. In response to student-generated questions, the LLM adhered to the provided case in 832/857 (97.1%) of responses. In 25/857 (2.9%) of responses, the LLM provided information beyond that contained in the provided case; one of these was considered inconsistent with the case. Overall, therefore, 856/857 (99.9%) of LLM responses were appropriate.

Conclusions:

This study has demonstrated that it is feasible to use an LLM to support CBL-type interactions with medical students. A contemporary LLM adhered closely to the provided cases and, when it deviated, typically gave responses that were medically reasonable. Further studies examining the effects of these approaches are required.

10.1212/WNL.0000000000211732
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.