2026 American Academy of Neurology Abstract Website

Objective:

To evaluate the role of open-access large language models (LLMs), under physician guidance, as diagnostic assistants in rare diseases with uncommon clinical presentations.

Background:

Rare diseases often lead to decades-long diagnostic struggles. Large language models (LLMs), a form of artificial intelligence (AI), have shown diagnostic promise in benchmark studies, yet most applications remain retrospective, accuracy-focused, or tied to institution-specific models. There is little real-world evidence on how open-access LLMs can support physicians facing diagnostic challenges in uncommon cases with complex clinical manifestations, particularly in resource-constrained community settings.

Design/Methods:

We conducted proof-of-concept testing of open-access LLMs (GPT, OpenEvidence) in two cases with confirmed rare disease diagnoses, then extended the approach to four unresolved or work-in-progress cases using physician-guided prompting. A focused literature review was also performed, and its extracted findings were compared against our case series.

Results:

Using only clinical vignettes, LLMs generated focused differential diagnoses for the two confirmed cases, COL3A1-related connective tissue disorder and COL4A1-related cerebral small vessel disease, and directed attention toward targeted molecular testing. These findings align with benchmark studies demonstrating LLM diagnostic accuracy in rare disease vignettes. For the four unresolved cases, outcomes illustrated a four-role framework: (1) Diagnostic clarification with management impact: reinterpretation of a PACS1 splice variant enabled adjustment of antiseizure medication (ASM), improving seizure control. (2) Multidisciplinary synthesis for resolution: DES desminopathy was identified through reclassification of a variant of uncertain significance (VUS) and reanalysis of imaging and pathology after decades of inconclusive workups. (3) Clinical reframing without a single etiology: a complex multisystem case was redirected from repeated testing toward anticipatory care planning. (4) Highlighting uncertainty and limits: a tumefactive perivascular space case produced divergent outputs, underscoring the need for physician oversight. Outputs were consistent between a general-purpose LLM (GPT) and a domain-specific model (OpenEvidence), supporting reproducibility across platforms.

Conclusions:

Readily available open-access LLMs, with judicious physician guidance, have the potential to improve diagnostic efficiency, accuracy, and care quality in rare diseases. Prospective, multi-center validation will be essential to establish reproducibility, safety, and patient-centered benefit.