Limited Value of Radiomics Compared to Quantitative MRI Measures for Predicting 10-year Disability in Newly Diagnosed Multiple Sclerosis: A Real-world Data Exploratory Study
Radu Tanasescu1, Ruizhe Li2, Amjad Altokhis4, Paul S Morgan3, Arman Eshaghi5, Xin Chen2, Nikolaos Evangelou6
1Department of Neurology, Nottingham University Hospitals, University of Nottingham, 2School of Computer Science, 3Radiological Sciences, University of Nottingham, 4University of Nottingham, 5UCL, 6Department of Neurology, Academic Clinical Neurology, University of Nottingham, QMC, Nottingham University Hospitals
Objective:
To compare the value of radiomic features versus quantitative MRI measures for predicting long-term disability in people with newly diagnosed multiple sclerosis (MS).
Background:
MRI measures (lesions, linear atrophy) correlate with MS severity; however, their predictive value for long-term prognosis is limited. Machine learning (ML) classifiers perform well for cross-sectional disability prediction, but their value for long-term prediction of the Expanded Disability Status Scale (EDSS) score is unclear.
Design/Methods:
We used 158 MRI datasets (sagittal T2-FLAIR and T1-weighted spin-echo sequences) with matching clinical data from 81 patients with MS from the Nottingham MS Clinic [52 women; 35.4 (±10.3) y; data at diagnosis, 5 years and 10 years]. We measured T2-FLAIR lesion (≥3 mm) number and volume, and linear atrophy (third-ventricular width, medullary width, corpus callosum index and inter-caudate diameter), using 3D Slicer 4.13.0.
A total of 107 radiomic features were extracted from the T2-FLAIR images using the PyRadiomics package. A multilayer perceptron (MLP) model was trained on clinical data, with and without the radiomic features, to forecast the likelihood of an EDSS score ≥6 at 10 years. Because of the limited amount of data, features were ranked using a Random Forest. After fine-tuning on a small validation set, the number of features was reduced to fewer than 10 to reduce noise and limit overfitting.
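The feature-ranking and classification steps described above can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the study's implementation: the array shapes, the number of retained features (`N_KEEP`), the hidden-layer size, the number of trees and the helper names `rank_features` and `fit_mlp` are all assumptions, and radiomic feature extraction itself is not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

N_KEEP = 8  # assumption: the abstract only states "fewer than 10"

# Synthetic stand-in for 81 patients; 113 columns is an assumed total
# (107 radiomic features plus a handful of clinical/MRI variables).
X, y = make_classification(n_samples=81, n_features=113,
                           n_informative=6, random_state=0)


def rank_features(X_train, y_train, n_keep=N_KEEP, seed=0):
    """Return the indices of the n_keep features with the highest
    Random Forest impurity-based importance."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    order = np.argsort(forest.feature_importances_)[::-1]
    return order[:n_keep]


def fit_mlp(X_train, y_train, selected, seed=0):
    """Standardise the selected columns and fit a small MLP on them."""
    scaler = StandardScaler().fit(X_train[:, selected])
    mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                        random_state=seed)
    mlp.fit(scaler.transform(X_train[:, selected]), y_train)
    return scaler, mlp


selected = rank_features(X, y)
scaler, mlp = fit_mlp(X, y, selected)

# In-sample probabilities of the positive class (here standing in for
# "EDSS >= 6 at 10 years"); shown only to illustrate the mechanics.
probs = mlp.predict_proba(scaler.transform(X[:, selected]))[:, 1]
```

Ranking first and training the MLP only on the top-ranked columns keeps the number of inputs small relative to the 81 patients, which is the stated motivation for the reduction.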
Results:
The MLP classifiers were evaluated on the whole dataset using 5-fold cross-validation. For the feature set comprising clinical/demographic and quantitative MRI data, the accuracy for predicting 10-year EDSS ≥6 was 0.56 before and 0.77 after feature selection. Baseline (diagnosis) clinical/demographic features alone achieved comparable accuracy (0.74). Adding radiomic features obtained from the clinical scans at diagnosis did not significantly improve accuracy (0.56/0.79). Adding 5-year follow-up data slightly improved accuracy (0.62/0.85).
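For readers unfamiliar with the evaluation scheme, 5-fold cross-validation can be sketched in plain Python as below. The data are synthetic, the class balance is an assumption, and `threshold_rule` is a hypothetical stand-in for the MLP; only the fold construction and the averaging of per-fold accuracies are the point of the example.

```python
import random
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validated_accuracy(features, labels, fit_predict, k=5, seed=0):
    """Hold out each fold once, train on the rest, and return the mean
    accuracy together with the per-fold accuracies."""
    scores = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        preds = fit_predict([features[i] for i in train_idx],
                            [labels[i] for i in train_idx],
                            [features[i] for i in test_idx])
        correct = sum(p == labels[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return mean(scores), scores


def threshold_rule(train_x, train_y, test_x):
    """Hypothetical classifier: cut at the midpoint of the class means."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    cut = (mean(pos) + mean(neg)) / 2
    return [1 if x > cut else 0 for x in test_x]


rng = random.Random(1)
labels = [1] * 30 + [0] * 51  # 81 synthetic patients; balance is assumed
features = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
mean_acc, fold_accs = cross_validated_accuracy(features, labels, threshold_rule)
```

If feature ranking is part of the model, repeating it inside `fit_predict` on each training fold keeps the held-out fold from influencing which features are chosen.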
Conclusions:
Within the limitations of a small sample size, radiomic features from the first (diagnostic) clinical MS scan do not significantly improve the prediction of long-term disability accumulation compared with quantitative MRI measures. The mechanisms underlying disability progression in MS are complex, and predictive models should account for the relative weight of factors beyond routine brain imaging.