Predicting Alzheimer's Disease from Clinical Dementia Rating, Estimated Total Intracranial Volume, Atlas Scaling Factor: Incorporating Ensemble Approach into Automated Machine Learning
Ali Haider Bangash1, Sakina Batool2
1Shifa College of Medicine, STMU, 2Fatima Jinnah Medical University
Objective:
Artificial intelligence, particularly machine learning, has shown predictive and prognostic capabilities superior to those of conventional statistical techniques for both neurological and non-neurological conditions. We therefore explored automated machine learning (aML) to predict AD using variables such as socioeconomic status, normalized whole brain volume and atlas scaling factor.
Background:
Alzheimer’s Disease (AD) is a debilitating disease associated with the pathognomonic deposition of neurofibrillary tangles and neuritic plaques along with characteristic amyloid angiopathy-driven hippocampal granulovacuolar degeneration and Meynert's nucleus neuronal loss.
Design/Methods:
The dataset was obtained from Kaggle, where it is publicly shared. The study population comprised 373 human subjects (57% female; age range 60-98 years). Variables included years of education, socioeconomic status and Mini-Mental State Examination (MMSE) scores. A state-of-the-art aML framework was used to develop classification models with algorithms including Extreme Gradient Boosting, Random Forest and Neural Network. The models were tuned for hyperparameters, augmented with random-feature insertion and boosting on errors, and then stacked. An ensemble approach, which combines two or more algorithmic models into a model that performs better than any of its constituent components, was superimposed on the stacked models. The mwA-AUROC gauged the discriminative ability of the models, with the maximum value of 1 indicating perfect classification.
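The ensemble and scoring steps described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the ensemble is shown as a simple average of each model's class probabilities, and mwA-AUROC is read as a one-vs-rest AUROC averaged with weights equal to class prevalence. The class names, the two models and all probabilities are hypothetical toy values, not study data.

```python
from collections import Counter


def binary_auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_ovr_auroc(y_true, proba, classes):
    """One-vs-rest AUROC per class, weighted by class prevalence (assumed metric)."""
    counts = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for k, cls in enumerate(classes):
        binary = [1 if y == cls else 0 for y in y_true]
        column = [row[k] for row in proba]
        score += counts[cls] / total * binary_auroc(binary, column)
    return score


def average_ensemble(*model_probas):
    """Average the class probabilities of several models, sample by sample."""
    n_models = len(model_probas)
    return [[sum(vals) / n_models for vals in zip(*rows)]
            for rows in zip(*model_probas)]


# Hypothetical toy data: three classes, six subjects, two base models.
classes = ["Nondemented", "Demented", "Converted"]
y_true = ["Nondemented", "Demented", "Converted",
          "Nondemented", "Demented", "Nondemented"]
model_a = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2],
           [0.6, 0.2, 0.2], [0.3, 0.6, 0.1], [0.7, 0.2, 0.1]]
model_b = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.4, 0.2, 0.4],
           [0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05]]

ensemble = average_ensemble(model_a, model_b)
print(round(weighted_ovr_auroc(y_true, ensemble, classes), 3))
```

In practice a library routine would normally replace the hand-written metric; scikit-learn's `roc_auc_score` with `multi_class="ovr"` and `average="weighted"` computes a comparable quantity.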
Results:
An ensemble of CatBoost, Nearest Neighbours and Neural Network models, incorporating K-means features and boosting on errors, achieved an mwA-AUROC of 0.96 with a near-perfect AD detection score. However, this model did not perform well in classifying converted patients. The ensemble model exhibited the lowest log loss of all the models developed.
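As an illustration of the two quantities reported here, the sketch below computes multiclass log loss and per-class recall, the latter being one way a weak class such as converted patients could surface despite a high overall score. The choice of recall as the per-class measure is an assumption, and the labels and probabilities are hypothetical toy values, not study data.

```python
import math


def multiclass_log_loss(y_true, proba, classes, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    index = {cls: k for k, cls in enumerate(classes)}
    total = 0.0
    for y, row in zip(y_true, proba):
        # Clip to avoid log(0) when a model is fully confident and wrong.
        p = min(max(row[index[y]], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


def per_class_recall(y_true, y_pred, classes):
    """Share of each class's true members predicted as that class."""
    recall = {}
    for cls in classes:
        support = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recall[cls] = hits / support if support else float("nan")
    return recall


# Hypothetical toy predictions for three classes.
classes = ["Nondemented", "Demented", "Converted"]
y_true = ["Nondemented", "Demented", "Converted", "Nondemented", "Demented"]
proba = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.6, 0.2, 0.2],
         [0.7, 0.2, 0.1], [0.2, 0.7, 0.1]]

# Predicted label = class with the highest probability.
y_pred = [classes[row.index(max(row))] for row in proba]

print(multiclass_log_loss(y_true, proba, classes))
print(per_class_recall(y_true, y_pred, classes))
```

A lower log loss means the predicted probabilities sit closer to the true labels, so comparing this value across candidate models is one way to rank them.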
Conclusions:
Such predictive models, when deployed in the cloud, would serve to improve both individual patient management and resource allocation, ultimately translating into a significant reduction in the morbidity and mortality associated with this enfeebling condition.