To develop and compare machine-learning models that predict short-, intermediate-, and long-term mortality after ischemic stroke using routinely available ICU parameters.
Existing prognostic models for stroke mortality often predict only binary outcomes and rely on limited variable sets. Building on prior nomogram-based work by Li et al. (2022), this study applies a quaternary (four-class) approach that stratifies the timing of death across 30-, 90-, and 360-day horizons.
A retrospective cohort of 760 ICU-admitted stroke patients from MIMIC-III, previously identified by Li et al. (2022), was analyzed. Inclusion criteria required a confirmed stroke diagnosis and complete demographic, physiologic, and laboratory data. Sixty-three variables were extracted from the first 24 hours of ICU admission, encompassing demographics, comorbidities, vital signs, laboratory tests, neurologic status (Glasgow Coma Scale, GCS), and other key features. Mortality outcomes were categorized into four classes: death ≤ 30 days, death 31–90 days, death 91–360 days, and survival > 360 days. Six supervised classifiers (RandomForest, NaiveBayes, MultilayerPerceptron, J48, AdaBoostM1, LogitBoost) were trained and evaluated using 10-fold cross-validation in WEKA.
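The four-class outcome definition can be illustrated with a minimal sketch. The function name `mortality_class`, the label strings, and the convention that `None` means no recorded death are assumptions made for illustration. They are not taken from the study's code, and the study itself used WEKA rather than Python.

```python
from typing import Optional

# Hypothetical label names; the cut-points (30, 90, 360 days) follow the text.
LABELS = (
    "death_le_30d",      # death within 30 days
    "death_31_90d",      # death between day 31 and day 90
    "death_91_360d",     # death between day 91 and day 360
    "survival_gt_360d",  # alive beyond 360 days
)


def mortality_class(days_to_death: Optional[float]) -> str:
    """Map days from ICU admission to death onto one of four outcome classes.

    Assumption: None means no death was recorded during follow-up, which is
    treated here as survival beyond 360 days.
    """
    if days_to_death is None:
        return LABELS[3]
    if days_to_death < 0:
        raise ValueError("days_to_death must be non-negative")
    if days_to_death <= 30:
        return LABELS[0]
    if days_to_death <= 90:
        return LABELS[1]
    if days_to_death <= 360:
        return LABELS[2]
    return LABELS[3]


if __name__ == "__main__":
    for d in (12, 45, 200, 400, None):
        print(d, "->", mortality_class(d))
```

Encoding the outcome as a single ordered label, rather than three separate binary flags, is what lets a single multi-class classifier be trained on all four time horizons at once.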
Among all models, Random Forest demonstrated the best overall discrimination (ROC AUC = 0.795; sensitivity = 0.623; specificity = 0.823; precision = 0.623). AdaBoostM1 (ROC AUC = 0.714) and LogitBoost (ROC AUC = 0.783) achieved similar sensitivity (≈ 0.58) but lower precision than Random Forest. Naive Bayes (ROC AUC = 0.752) and J48 (ROC AUC = 0.705) provided moderate performance, while Multilayer Perceptron underperformed, likely reflecting the limited dataset size and class imbalance.
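For a four-class problem, sensitivity, specificity, and precision are usually computed per class in a one-vs-rest fashion and then averaged. The sketch below shows that calculation under the assumption that the reported figures are support-weighted averages, as in the "Weighted Avg." row of WEKA's evaluation summary. The abstract does not state the averaging scheme. The confusion matrix is a made-up toy example, not the study's data, and the function names are hypothetical.

```python
from typing import Dict, List


def one_vs_rest_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Per-class metrics from a confusion matrix.

    cm[i][j] is the number of cases whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true otherwise
        tn = total - tp - fn - fp
        per_class.append({
            "support": float(tp + fn),
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        })
    return per_class


def weighted_average(per_class: List[Dict[str, float]], key: str) -> float:
    """Average a per-class metric, weighting each class by its true count."""
    total = sum(c["support"] for c in per_class)
    return sum(c[key] * c["support"] for c in per_class) / total


# Toy 4x4 confusion matrix (illustrative only). Rows are true classes,
# columns are predicted classes, in the order: <=30 d, 31-90 d, 91-360 d, >360 d.
TOY_CM = [
    [6, 2, 1, 1],
    [2, 3, 2, 3],
    [1, 2, 4, 3],
    [1, 1, 2, 16],
]

if __name__ == "__main__":
    metrics = one_vs_rest_metrics(TOY_CM)
    for name in ("sensitivity", "specificity", "precision"):
        print(name, round(weighted_average(metrics, name), 3))
```

Under support weighting, the averaged sensitivity equals overall accuracy, because each class's recall is multiplied by exactly the share of cases that belong to it.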
A quaternary multi-class framework can stratify stroke mortality across multiple time horizons. The Random Forest classifier achieved the highest discrimination and the best balance between sensitivity and specificity, supporting its potential integration into ICU decision-support systems for post-stroke risk stratification.