Using Explainable AI in the Clinical Validation of MyCog: A Self-administered Cognitive Screener for Primary Care Settings
Callie Jones1, Stephanie Ruth Young1, Greg Byrne1, Elizabeth Dworak1, Julia Yoshino Benavente2, Richard Gershon1, Michael Wolf2, Cindy Nowinski1
1Medical Social Sciences, 2Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University
Objective:
This study examined whether MyCog, a brief tablet-based cognitive screening application, could accurately discriminate between older adults with and without cognitive impairment, using machine learning paired with explainable AI (XAI) methods to enhance clinical interpretability.
Background:
Primary care settings are optimal for early detection of cognitive impairment but face significant barriers, including time constraints and a lack of minimally burdensome assessments. Traditional screeners require staff administration and cannot capture granular behavioral data. MyCog addresses these challenges as an EHR-integrated, self-administered tablet application featuring two validated cognitive tasks: Dimensional Change Card Sort (executive functioning) and Picture Sequence Memory (episodic memory).
Design/Methods:
This cross-sectional validation study included 65 adults aged 65+ with diagnosed cognitive impairment and 80 cognitively normal controls. We employed ensemble modeling using five machine learning (ML) approaches (LASSO, Elastic Net, Random Forest, Bayesian Logistic Regression, Gradient Boosting) with nested cross-validation. XAI techniques included SHapley Additive exPlanations (SHAP) values for individual prediction explanations, feature importance rankings, and decision boundary visualization. Performance was evaluated using ROC AUC, sensitivity, specificity, and accuracy.
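To make the analytic workflow concrete, the following is a minimal sketch of nested cross-validation with one of the five named models (gradient boosting), written with scikit-learn. The synthetic data, fold counts, and hyperparameter grid are illustrative assumptions, not the study's actual pipeline or the MyCog dataset.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for the 145-participant feature matrix (65 impaired, 80 controls).
X, y = make_classification(n_samples=145, n_features=10, weights=[0.55, 0.45], random_state=0)

# Inner loop tunes hyperparameters; outer loop yields an unbiased performance estimate.
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Hypothetical hyperparameter grid, for illustration only.
param_grid = {"n_estimators": [100, 300], "max_depth": [2, 3]}
tuned_model = GridSearchCV(GradientBoostingClassifier(random_state=0), param_grid, cv=inner_cv, scoring="roc_auc")

# Nested CV: each outer fold evaluates a model tuned only on its own training split.
auc_scores = cross_val_score(tuned_model, X, y, cv=outer_cv, scoring="roc_auc")
print(f"Nested CV ROC AUC: {auc_scores.mean():.3f} +/- {auc_scores.std():.3f}")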
Results:
All models demonstrated strong diagnostic performance (AUC: 0.817-0.873). XAI analysis revealed memory accuracy (Picture Sequence Memory exact match) and executive functioning efficiency (Dimensional Change Card Sort rate-correct score) as the most predictive features. The final explainable consensus model achieved AUC 0.890, sensitivity 72.3-83.1%, specificity 78.8-91.2%, and accuracy 80.7-82.8%. SHAP analysis provided individualized feature contribution scores to support clinical understanding.
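As a hedged illustration of how such per-participant SHAP contributions can be produced, the sketch below uses the open-source shap package with a tree-based classifier. The feature names (e.g., psm_exact_match, dccs_rate_correct) are hypothetical stand-ins for the MyCog variables, and the data are synthetic; this is not the study's actual model.

import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=145, n_features=4, n_informative=4, n_redundant=0, random_state=0)
# Hypothetical feature names, invented for this sketch.
feature_names = ["psm_exact_match", "dccs_rate_correct", "psm_response_time", "dccs_accuracy"]

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_participants, n_features), log-odds units

# Per-participant contributions: positive values push the prediction toward "impaired".
for name, value in zip(feature_names, shap_values[0]):
    print(f"{name}: {value:+.3f}")

# Mean |SHAP| across participants gives a global feature importance ranking.
mean_abs = np.abs(shap_values).mean(axis=0)
print(dict(zip(feature_names, mean_abs.round(3))))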
Conclusions:
MyCog demonstrates strong diagnostic accuracy through a parsimonious, clinically interpretable model enhanced with XAI capabilities. The integration of explainable AI gives clinicians transparent, individualized insight into cognitive screening results, addressing the interpretability challenges that hinder ML adoption in healthcare. As a validated, self-administered tool that takes under 7 minutes and integrates seamlessly with the EHR, MyCog represents a practical solution combining diagnostic accuracy with clinical transparency.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.