Reduced Neurological Burnout in the ER Utilizing an Advanced Large Language Model
Alon Gorenshtein1, Moran Sorka2, Shiri Fistel1, Shahar Shelly1
1Department of Neurology, Rambam Medical Center, 2AI in neurology Laboratory, Technion – Israel Institute of Technology
Objective:
To assess whether a large language model (LLM) can generate professional medical discharge reports from the Emergency Department (ED) for neurological patients.
Background:
Discharge summary reports from the ED are pivotal components of medical documentation, essential for aiding future physicians in the review and continuity of patient care. Physicians spend twice as much time on electronic health record (EHR) documentation as on direct patient care. We aim to autogenerate patient-specific recommendations and report summaries to improve the overall quality and utility of discharge summaries.
Design/Methods:
We conducted a retrospective study comparing our model's output with real report summaries from patients who received neurological consultations in the ED. We developed a combination model using an LLM (Gemini 1.5-pro-002), complemented by prompt engineering and retrieval-augmented generation (RAG). We used Clinical-BioBERT embeddings to analyze the summaries and compared them using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and cosine similarity. Both ROUGE and cosine metrics are measured on a scale from 0 to 1, where higher values indicate greater alignment between the automatically generated summary and the reference.
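The two comparison metrics named above can be illustrated with a minimal sketch. This is not the study's pipeline: the function names (`cosine_similarity`, `rouge_n`), the whitespace tokenization, and the example sentences are assumptions made for illustration. In the study the vectors would come from Clinical-BioBERT embeddings, which are not computed here.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N precision, recall, and F1 from clipped n-gram overlap.

    Tokenization is a plain lowercase whitespace split (an assumption);
    real evaluations usually add stemming or other normalization.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # counts clipped to the minimum
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences, not taken from the study data.
generated = "the patient was discharged home"
reference = "the patient was sent home"
scores_1 = rouge_n(generated, reference, n=1)
scores_2 = rouge_n(generated, reference, n=2)
```

Because ROUGE counts exact n-gram matches while embedding cosine similarity compares meaning in vector space, two reports can score high on one and low on the other. For arbitrary vectors cosine similarity can also be negative; the 0 to 1 range applies when the compared embeddings do not point in opposing directions.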
Results:
We identified 250 detailed ED neurological summary reports. The mean cosine similarity between the model-generated and human neurologist reports was 0.89 ± 0.03, indicating strong semantic alignment. Report length differed significantly between LLM-generated and residents' reports (61.56 vs. 94.75, p < 0.001). The LLM reports were written at a lower Flesch-Kincaid Grade Level (FKGL = 11.3 vs. 12.22, p < 0.001), making them easier to read and potentially improving patient comprehension. Despite the strong semantic alignment, the low ROUGE scores (ROUGE-1 = 0.25, ROUGE-2 = 0.09, ROUGE-L = 0.19) suggest structural and compositional differences.
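For reference, the FKGL readability score reported above follows the standard Flesch-Kincaid formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below is illustrative only: the syllable counter is a crude vowel-group heuristic, the sentence splitter is a simple regular expression, and all function names are assumptions. The abstract does not state which tool the authors used.

```python
import re


def fkgl_from_counts(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts (standard formula)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough heuristic: one syllable per run of vowels, at least one.

    This overcounts silent vowels (e.g. a trailing 'e'); dedicated
    readability libraries use more careful rules or dictionaries.
    """
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fkgl(text):
    """Estimate FKGL for plain English text using the heuristics above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return fkgl_from_counts(len(words), len(sentences), syllables)


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
example_grade = fkgl_from_counts(words=100, sentences=5, syllables=150)
```

Shorter sentences and fewer syllables per word both lower the score, which is why a lower FKGL is read as easier text.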
Conclusions:
An advanced LLM showed the potential to reduce neurologists' workload by generating discharge reports that are more detailed and structured. Moreover, the explainability of LLM-generated recommendations can enhance patients' comprehension of their post-discharge care plans.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.