This study compares Shallow Machine Learning (SML) models and Large Language Models (LLMs) for predicting postoperative complications in neurosurgical applications.
Data were extracted from the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) registry. The postoperative outcomes evaluated included infections, cardiorespiratory events, thrombosis, bleeding, readmission, and reoperation. We employed multivariate logistic regression, machine learning algorithms, and nomograms for the analyses.
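For illustration, a minimal sketch of the SML baseline is shown below: a multivariate logistic regression predicting a binary "any complication" outcome, evaluated by AUC on a held-out split. The file name and column names are hypothetical placeholders rather than the study's actual NSQIP variables, and the AutoML, RuleFit, and nomogram analyses are not reproduced here.

```python
# Sketch of the SML baseline: logistic regression for "any complication",
# evaluated by AUC. File and column names below are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_csv("nsqip_neurosurgery.csv")      # hypothetical NSQIP extract
X = df.drop(columns=["any_complication"])       # preoperative predictors
y = df["any_complication"]                      # 1 = any adverse event

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Hold-out AUC: {auc:.4f}")
```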
A total of 13,287 patients were included, with postoperative complications occurring in 5.4%. The most common complication was infection (2.3%). For predicting any adverse event, the best AutoML model achieved the highest performance, with an AUC of 0.7989. The RuleFit model performed best for cardiovascular events (AUC 0.7688) and infections (AUC 0.7885).
Among the LLMs, the Llama 3 8B model achieved 70% prediction accuracy with a training time of 2.5 hours for one epoch. BioMedLM reached 60% accuracy for any complication, while BioMistral reached 77% accuracy with a training time of 4 hours for 3 epochs.
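A minimal sketch of how such LLM fine-tuning can be set up with the Hugging Face Transformers library is shown below: each patient record is serialized into a short text prompt ending with the observed outcome, and a causal language model is fine-tuned on those prompts. The model name, record fields, and hyperparameters are illustrative assumptions, not the study's exact configuration.

```python
# Sketch of LLM fine-tuning on serialized patient records.
# Model name, fields, and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_NAME = "meta-llama/Meta-Llama-3-8B"   # placeholder; gated model

def record_to_prompt(rec):
    # Hypothetical serialization of a tabular row into text.
    return (f"Age: {rec['age']}. ASA class: {rec['asa']}. "
            f"Procedure: {rec['procedure']}. "
            f"Postoperative complication: {'yes' if rec['complication'] else 'no'}")

records = [
    {"age": 62, "asa": 3, "procedure": "craniotomy", "complication": 1},
    {"age": 45, "asa": 2, "procedure": "lumbar fusion", "complication": 0},
]  # toy examples; the study trained on the full NSQIP extract

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = (Dataset.from_list([{"text": record_to_prompt(r)} for r in records])
           .map(tokenize, remove_columns=["text"]))

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llm_complications",
                           num_train_epochs=1,           # one epoch, as reported for Llama 3 8B
                           per_device_train_batch_size=1,
                           learning_rate=2e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```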
SML models are cost-effective and suitable for many clinical application scenarios, whereas LLMs require costly training, maintenance, and engineering. The LLMs evaluated here need further training and testing, and there remains room for improvement through fine-tuning; training on larger datasets may substantially improve their results.
Model link: https://huggingface.co/ShaheenLab/DR_SHAHEENAI