AI-assisted versus Human-generated Personal Statements for Neurology Residency Applications
Natalie Erickson1, Scott Millis1, Waleed Raheem Abood Abood1, Maysaa Basha2, Peter LeWitt2, Anza Memon3, Philip Ross4, Jacob Rube5, Carla Watson2, Deepti Zutshi6, Deepa Raghavan1, Rohit Marawar7
1Wayne State University, 2Wayne State University, Detroit Medical Center, 3John D. Dingell VAMC, Detroit, Michigan, 4Wayne State University Physicians Group, Department of Neurology, 5University Health Center, 6Wayne State University School of Medicine, 7Wayne State University - Detroit Medical Center
Objective:
To evaluate whether neurology faculty can distinguish AI-assisted (AI) from human-generated (HG) personal statements (PS) in residency applications.
Background:
Large language models (LLMs) can help brainstorm, draft, or edit PS for residency applications. No studies to date have examined the effect of LLM use on PS for neurology residency applications. Prior work in other specialties compared fully AI-generated versus HG PS, but most applicants are likely to use LLMs as an assistive tool, which our study evaluates.
Design/Methods:
Eleven anonymized AI-PS were collected from medical student volunteers (three post-match, eight pre-match). Volunteers received a brief introduction to LLM basics and had two weeks to submit a PS written using ChatGPT 4o without restriction. A pool of 86 deidentified PS from neurology applicants selected for interview in the 2023 Neurology Match (predating widespread use of ChatGPT) provided the HG source; 11 HG-PS were selected for comparison. The 22 PS were mixed, randomized, and independently rated by six blinded neurology faculty using sliding scales for various qualities. Two-sided t-tests compared ratings of AI-PS and HG-PS.
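For illustration only, the comparison described above can be sketched as a two-sided independent-samples t-test on per-statement ratings; the data below are hypothetical placeholders, not study data, and this is not the authors' analysis code.

    # Illustrative sketch (hypothetical data): two-sided independent-samples
    # t-test comparing faculty ratings of 11 AI-assisted vs. 11 human-generated
    # personal statements on one sliding-scale quality (e.g., readability).
    from scipy import stats

    # Hypothetical mean rating per PS (0-100 scale) across the six reviewers.
    ai_ps = [72, 65, 80, 58, 74, 69, 77, 63, 71, 68, 75]  # 11 AI-assisted PS
    hg_ps = [70, 62, 78, 61, 73, 66, 79, 60, 72, 67, 74]  # 11 human-generated PS

    # scipy's ttest_ind is two-sided by default.
    t_stat, p_value = stats.ttest_ind(ai_ps, hg_ps)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")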
Results:
Faculty reviewers had a median of 3.5 years of experience on residency selection committees. There were no statistically significant differences between the AI-PS and HG-PS for readability (p = 0.119), originality (p = 0.072), authenticity (p = 0.341), overall quality (p = 0.695), or the “why neurology?” item (p = 0.730). Three reviewers suspected AI use in some PS and three were unsure. On average, faculty judged 2.5 PS to be fully AI-generated and 8.1 to be AI-assisted, with a mean confidence of 41/100. Four of six reviewers were unaware of AAMC guidelines on AI use in residency PS.
Conclusions:
Blinded neurology faculty could not distinguish AI-assisted PS from human-generated PS across multiple quality measures and reported low confidence in detecting AI use.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.