MsLesionLLM: A Tool to Extract Key Radiological Metrics from Real-world Multiple Sclerosis Datasets
Shane Poole1, Kanishka Koshal1, Nikki Sisodia1, Kyra Henderson1, Jaeleene Wijangco1, Danelvis Paredes1, Chelsea Chen1, William Rowles1, Amit Akula1, Jens Wuerfel2, Vishaka Sharma2, Andreas Rauschecker3, Roland Henry1, Riley Bove1
1University of California San Francisco Weill Institute for Neurosciences, 2Hoffmann-LaRoche, 3University of California, San Francisco Center for Intelligent Imaging
Objective:
To (1) develop and validate msLesionLLM, an AI-powered prompt to extract information about new multiple sclerosis (MS)-related inflammatory activity from MRI reports and (2) apply the prompt to a real-world use case: MRI activity after starting B-cell depleting therapy.
Background:
Neuroimaging is routinely used to monitor disease activity in multiple sclerosis (MS). Artificial Intelligence (AI)-enabled large language models could be applied to efficiently analyze imaging reports and understand real-world effects of treatments.
Design/Methods:
In this retrospective observational study, a large language model (LLM), accessed through the Versa AI ecosystem that securely connects healthcare data with ChatGPT4, was applied to clinical MRI reports for adults with MS at a single center. The discovery phase involved iterative refinement of a prompt, using 5 annotated datasets, to detect new T2-weighted lesions and contrast-enhancing lesions. The validation phase involved applying the prompt to MRI reports from adults with MS initiating B-cell depleting therapy.
Results:
The validation phase included 1262 notes (536 patients: 70.4% female, median age 40.4 years, IQR 33.4-51.1). Prompt performance in the validation cohort for the detection of new T2-weighted lesions was: 97.0% accuracy / 97.4% sensitivity / 95.9% specificity / 98.3% positive predictive value / 93.6% negative predictive value. Performance for the detection of contrast-enhancing lesions was: 96.8% accuracy / 96.8% sensitivity / 91.2% specificity / 97.3% positive predictive value / 95.4% negative predictive value. When applied to the clinical use case, after the first 6 months on B-cell depleting therapy, 97.9% of all MRI reports revealed no enhancing lesions; after the first “rebaselining” scan, 97.5% of all reports revealed no new lesions.
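For readers less familiar with the five performance measures reported above, the following is a minimal sketch of how they are conventionally computed from paired binary labels (prompt output versus human annotation). It is not the authors' code: the function name `binary_metrics`, the `Metrics` container, and the toy labels are all hypothetical, and the example data are illustrative only, not study data.

```python
from dataclasses import dataclass
from math import nan


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float  # positive predictive value
    npv: float  # negative predictive value


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a metric is undefined
    # (for example, PPV when the prompt never predicts a positive).
    return numerator / denominator if denominator else nan


def binary_metrics(predicted: list[bool], reference: list[bool]) -> Metrics:
    """Compare prompt outputs with annotated labels, one pair per report.

    `predicted` is the (hypothetical) prompt output, `reference` the
    human annotation; True means "lesion activity reported".
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives

    return Metrics(
        accuracy=_ratio(tp + tn, tp + fp + tn + fn),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Toy example (illustrative labels only, not study data).
toy_predicted = [True, True, False, False, True]
toy_reference = [True, False, False, False, True]
toy_metrics = binary_metrics(toy_predicted, toy_reference)
```

In this toy case there are 2 true positives, 1 false positive, 2 true negatives, and no false negatives, so sensitivity and negative predictive value are both 1.0 while specificity and positive predictive value are both 2/3.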
Conclusions:
AI-enabled large language models can efficiently extract accurate information from unstructured imaging reports. Tools such as msLesionLLM could be applied in many other real-world, large clinical settings to answer questions relating to disease evolution and treatment response.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.