Machine Learning Models for Mortality and Readmission in Encephalopathy: Performance and Pitfalls From a Large ICU Cohort
Sindhu Vasireddy1, Shankar Biswas2, Elangovan Krishnan3, Yashasvi Srivastava2, Ayman Hamadttu4, Jeimy Marilyn Castellanos5
1Neurology, NMC Speciality Hospital, 2Internal Medicine, Ivano-Frankivsk National Medical University, 3Immunology and Microbiology, University of Louisville, 4Internal Medicine, Sudan University of Science and Technology, 5Universidad Autonoma del Estado de Quintana Roo
Objective:
To develop and evaluate electronic health record (EHR)–based machine learning models for predicting key adverse outcomes (in-hospital mortality, 30-day mortality, ICU readmission, a composite adverse outcome, and poor functional outcome) among critically ill adults with encephalopathy. The study aimed to identify the best-performing model family, determine the most predictive clinical features, and highlight methodological pitfalls, such as data leakage and spectrum bias, that may influence model performance and generalizability.
Background:
Encephalopathy is common among critically ill adults and portends worse outcomes, yet bedside prognostic tools are limited. We evaluated EHR-derived machine learning models to estimate the risk of mortality and ICU readmission in this population.
Design/Methods:
In a retrospective cohort from MIMIC-IV, we identified 11,468 patients with encephalopathy across 15,630 ICU stays. From demographics, vital signs, laboratory values, medication administrations, and mental-status assessments (217 features), we trained five machine learning model families (random forest, gradient boosting, logistic regression, support-vector machine, neural network) to predict hospital mortality, 30-day mortality, ICU readmission, a composite adverse outcome, and poor functional outcome. Models were developed with non-overlapping train–test splits designed to mitigate leakage. Discrimination was summarized as AUROC with 95% CIs.
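The abstract does not say how the non-overlapping splits or the confidence intervals were computed. As one plausible reading, the sketch below splits at the patient level, so that repeat ICU stays from one patient never appear on both sides of the split, and reports AUROC with a percentile bootstrap interval. It is an illustration under stated assumptions, not the authors' code: the `patient_id` key and all function names are hypothetical, and only the Python standard library is used.

```python
import random


def split_by_patient(stays, test_fraction=0.2, seed=0):
    """Assign whole patients (not individual ICU stays) to train or test.

    `stays` is a list of dicts with a hypothetical "patient_id" key.
    """
    patient_ids = sorted({s["patient_id"] for s in stays})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])
    train = [s for s in stays if s["patient_id"] not in test_patients]
    test = [s for s in stays if s["patient_id"] in test_patients]
    return train, test


def auroc(labels, scores):
    """AUROC as P(random positive outranks random negative); ties count half.

    Pairwise O(n_pos * n_neg) version, chosen for clarity rather than speed.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap."""
    point = auroc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Toy usage: 60 stays from 40 patients, so some patients have repeat stays.
stays = [{"patient_id": i % 40, "stay_id": i} for i in range(60)]
train, test = split_by_patient(stays, test_fraction=0.25, seed=1)
shared = {s["patient_id"] for s in train} & {s["patient_id"] for s in test}
assert not shared  # no patient contributes stays to both sets
```

The bootstrap here resamples individual predictions; with several stays per patient, resampling whole patients instead would respect the clustering and would usually give wider intervals.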
Results:
Random forests yielded the highest internal discrimination. AUROC (95% CI) was 0.984 (0.979–0.989) for hospital mortality, 0.985 (0.980–0.990) for ICU readmission, 0.971 (0.965–0.977) for the composite outcome, 0.905 (0.895–0.915) for 30-day mortality, and 0.874 (0.862–0.886) for poor functional outcome. Highly predictive features included age, illness-severity indicators, ICU type, and early instability in vital signs.
Conclusions:
Machine learning models achieved high discrimination for mortality and readmission among ICU patients with encephalopathy in internal testing. Given the risk of hidden leakage and spectrum effects in single-center retrospective data, external and prospective validation with calibration, decision-curve analysis, subgroup/fairness evaluation, and comparison to established scores is required before any clinical use.
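To make two of the recommended follow-up checks concrete, the sketch below computes the Brier score (an overall measure of probabilistic accuracy that reflects calibration) and the net benefit used in decision-curve analysis, with the standard "treat all" comparator. It is a minimal illustration on made-up numbers, not an analysis from this study; the function names are hypothetical.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def net_benefit(labels, probs, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of flagging every patient, the usual reference strategy."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (invented for illustration only).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
model_nb = net_benefit(labels, probs, threshold=0.5)
treat_all_nb = net_benefit_treat_all(labels, threshold=0.5)
```

A decision curve repeats the net-benefit calculation over a range of clinically plausible thresholds and plots the model against the "treat all" and "treat none" (net benefit of zero) strategies; a model with high AUROC can still show little or no net benefit if it is poorly calibrated.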
KEYWORDS: artificial intelligence; machine learning; prognostic modeling; encephalopathy; critical care.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.