Optimizing the Diagnostic Pathway for Adult Genetic Epilepsy Patients: A Machine Learning Approach
David Lee1, Cynthia Peng2, Sai Zhang3, Yi Li2
1Johns Hopkins University School of Medicine, 2Neurology and Neurological Sciences, Stanford University School of Medicine, 3Biomedical Informatics & Data Science, Yale School of Medicine
Objective:
To develop a machine learning (ML) approach to identify adult epilepsy patients with a high likelihood of harboring a diagnostic genetic variant using routinely collected clinical data.
Background:
Identifying pathogenic genetic variants that cause epilepsy informs the choice of anti-seizure medication and the assessment of surgical candidacy. However, the diagnosis of genetic epilepsy in adults is often delayed for years, hindering optimal patient management.
Design/Methods:
We conducted a retrospective cohort study of adult epilepsy patients who underwent genetic testing at the Stanford Comprehensive Epilepsy Center (2018-2024). An eXtreme Gradient Boosting (XGBoost) model was trained on 80% of the cohort using 23 features extracted from clinical history, epilepsy type, and MRI/EEG findings to classify patients as having or not having a disease-causing genetic variant. SHapley Additive exPlanations (SHAP) followed by backward elimination were used to identify an optimal feature subset. The final model was trained using 5-fold cross-validation with nested grid search and evaluated on a 20% holdout test set using Area Under the Curve (AUC), sensitivity (SN), specificity (SP), precision (PR), and F1-score.
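The evaluation metrics named above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the study's code: the function names, the 0.5 decision threshold, the reading of AUC as the area under the ROC curve, and the toy labels and scores are all assumptions made for the example.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, precision, and F1 at a fixed threshold.
    The 0.5 default is an assumption; the abstract does not state one."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    pr = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * pr * sn / (pr + sn) if (pr + sn) else 0.0
    return {"sensitivity": sn, "specificity": sp, "precision": pr, "f1": f1}


# Toy data (hypothetical): 1 = genetic variant found, 0 = not found.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

toy_auc = auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

AUC is threshold-independent, whereas SN, SP, PR, and F1 all depend on the cutoff chosen to turn predicted probabilities into a yes/no call.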
Results:
Our cohort (n=505; 52.9% male) included 167 (33%) patients with an epilepsy-causing genetic variant. Initial training and SHAP analysis identified 14 informative features, from which backward elimination yielded an optimal subset of 8 features. The model achieved an AUC of 0.81 on training data. On the holdout test set, the final model achieved an AUC of 0.79 (SN=0.82; SP=0.72; PR=0.59; F1=0.69). The most influential predictive features were age at seizure onset, intellectual/developmental disability, and first-degree family history.
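As a quick arithmetic check, the reported F1-score follows from the reported precision and sensitivity, since F1 is their harmonic mean. The snippet below is a sketch using only the figures stated in the Results; the function and variable names are hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


reported_precision = 0.59    # PR on the holdout test set
reported_sensitivity = 0.82  # SN on the holdout test set

f1_check = f1_from_precision_recall(reported_precision, reported_sensitivity)
print(round(f1_check, 2))  # 0.69, matching the reported F1

# Share of the cohort with an epilepsy-causing variant: 167 of 505.
variant_share_pct = round(100 * 167 / 505)
print(variant_share_pct)  # 33
```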
Conclusions:
Our model, which uses a minimal set of routinely available clinical data, could help identify adult epilepsy patients with a genetic etiology. This approach has the potential to streamline the diagnostic pathway and optimize the use of genetic testing in clinical practice. Future work will focus on external validation of the model.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.