Machine Learning-enhanced Dementia Testing: Reliability and Accuracy of the Autonomous Cognitive Examination
Calvin Howard1, Marcus Ng2
1Neurology, Harvard Medical School, 2Neurology, University of Manitoba
Objective:

To evaluate the reliability and diagnostic accuracy of the Autonomous Cognitive Examination, a machine learning-enhanced cognitive test, against the standard Addenbrooke's Cognitive Examination-III (ACE-III).

Background:

Dementia cases worldwide exceed 57 million and are projected to reach 153 million by 2050. Healthcare systems are strained, with an average diagnostic delay of three years, and up to 50% of patients die undiagnosed. Improved testing is critical.

Design/Methods:

We designed a comprehensive test of cognitive skills such as speaking, drawing, and writing. A suite of algorithms was developed to evaluate multiple cognitive domains while ensuring accessibility. Technologies employed include custom neural networks, geolocation, keyword recognition, natural language processing, touchscreen interfaces, and computer vision. We conducted a randomized controlled trial with 46 patients from the University of Manitoba and Brigham and Women's Hospital, Harvard Medical School. The sample size was chosen to provide 80% statistical power. Patients first took either our Autonomous Cognitive Examination or the standard Addenbrooke's Cognitive Examination-III (ACE-III), followed by the other test 0-6 weeks later. Test reliability and diagnostic metrics were compared between the two tests.
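
As an illustration of the reliability analysis, the sketch below computes a two-way random-effects, absolute-agreement, single-measurement intraclass correlation coefficient (ICC(2,1), Shrout & Fleiss) from paired total scores. The synthetic data, variable names, and score scale are assumptions for illustration only; this is not the study's actual analysis code.

```python
import numpy as np

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is an (n_subjects, n_tests) array of paired total scores,
    e.g. column 0 = ACE-III, column 1 = Autonomous Cognitive Examination.
    """
    n, k = scores.shape
    grand_mean = scores.mean()
    row_means = scores.mean(axis=1)          # per-subject means
    col_means = scores.mean(axis=0)          # per-test means

    # Mean squares from the two-way ANOVA decomposition
    ms_rows = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand_mean) ** 2) / (k - 1)
    residual = scores - row_means[:, None] - col_means[None, :] + grand_mean
    ms_error = np.sum(residual ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative synthetic data: 46 subjects, two administrations (percent scores)
rng = np.random.default_rng(0)
ability = rng.normal(75, 12, size=46)
paired = np.column_stack([ability + rng.normal(0, 5, 46),
                          ability + rng.normal(0, 5, 46)]).clip(0, 100)
print(f"ICC(2,1) = {icc_2_1(paired):.2f}")
```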

Results:

Our test achieved high reliability, with an intraclass correlation coefficient of 0.81 (95% CI 0.68-0.91). No significant difference was found in total scores between the two tests. Sub-analyses showed consistent scores across cognitive domains, with minor improvements in software-based fluency and language assessments. Of 19 questions, only two showed significant differences between tests. Diagnostic capability was assessed with the area under the receiver operating characteristic (ROC) curve, which was 0.93 (95% CI 0.85-0.99). Specificity was 1.0 at a score threshold of 76%, and sensitivity was 1.0 at a threshold of 87% (95% CI 1.0-1.0).
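
To make the diagnostic-accuracy metrics concrete, here is a minimal sketch of how an ROC AUC with a percentile-bootstrap confidence interval, and sensitivity/specificity at fixed score thresholds, could be computed from total scores and diagnostic labels. The synthetic data, the assumption that lower scores indicate impairment, and the helper function are illustrative assumptions, not the study's analysis code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Illustrative synthetic data: percent scores (lower score -> more likely dementia)
labels = rng.integers(0, 2, size=46)                  # 1 = dementia, 0 = control
scores = np.where(labels == 1,
                  rng.normal(65, 10, 46),             # affected: lower scores
                  rng.normal(88, 6, 46)).clip(0, 100)

# AUC uses the negated score so that lower test scores rank as "positive"
auc = roc_auc_score(labels, -scores)

# Percentile bootstrap for the AUC confidence interval
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(labels), len(labels))
    if len(np.unique(labels[idx])) < 2:               # need both classes in a resample
        continue
    boot.append(roc_auc_score(labels[idx], -scores[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])

def sens_spec(threshold: float):
    """Sensitivity/specificity when scores below `threshold` are called positive."""
    pred = scores < threshold
    sensitivity = np.mean(pred[labels == 1])
    specificity = np.mean(~pred[labels == 0])
    return sensitivity, specificity

print(f"AUC = {auc:.2f} (95% CI {lo:.2f}-{hi:.2f})")
for t in (76, 87):
    s, p = sens_spec(t)
    print(f"threshold {t}%: sensitivity {s:.2f}, specificity {p:.2f}")
```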

Conclusions:

We present a tool for reliable and comprehensive cognitive assessment. This tool has the potential to substantially scale cognitive testing, helping to address the growing dementia crisis and to provide quality testing in low-resource environments.

10.1212/WNL.0000000000208284