Diagnosis of Multiple Sclerosis Subtype through Machine Learning Analysis of Frontal Cortex Metabolite Profiles
Abhinav V. Kurada1, Kelley M. Swanberg1,2, Hetty Prinsen2, and Christoph Juchem1,2,3,4

1Biomedical Engineering, Columbia University School of Engineering and Applied Science, New York, NY, United States, 2Radiology and Biomedical Imaging, Yale University School of Medicine, New Haven, CT, United States, 3Neurology, Yale University School of Medicine, New Haven, CT, United States, 4Radiology, Columbia University Medical Center, New York, NY, United States


The onset and progression of multiple sclerosis (MS) are accompanied by changes in brain biochemistry. Magnetic resonance spectroscopy (MRS) is a powerful tool for investigating these changes in vivo. Machine learning analysis of MRS-derived biochemical profiles may reveal metabolic patterns inherent in certain MS subtypes to inform their diagnosis. By employing a feature set of only metabolite concentrations derived from brain MRS data acquired at 7 Tesla, we achieved an 80% validation set accuracy for differentiating MS patients from healthy controls and a 70% validation set accuracy for differentiating relapsing-remitting and progressive MS patients.


Multiple sclerosis (MS) is a demyelinating autoimmune disease that affects over 2 million people worldwide.1 The disease presents clinically with significant heterogeneity and is classified into several different subtypes, the two broadest of which are relapsing-remitting (RR-MS), marked by bouts of neuronal inflammation and recovery, and progressive (P-MS), characterized by neurodegeneration and functional degradation in the absence of acute inflammatory flares.2 Treatment for these two MS courses follows separate cycles of care. While several FDA-approved medications can combat the frequency and severity of flares in relapsing-remitting MS, fewer therapies are available for the progressive course.3 Diagnosing MS often involves long periods of patient monitoring, especially for P-MS,4 and frequently requires invasive lumbar punctures.5 Noninvasive automated solutions may enable the more rapid and accurate differential diagnosis of MS and its subtypes. Since treatment paradigms are shifting towards starting treatment as early as possible after initial diagnosis, such solutions may prove to be even more critical.6 In vivo magnetic resonance spectroscopy (MRS) can contribute to their development. MRS has already revealed metabolic differences among MS subtypes7 and is thus poised to inform machine learning (ML) analyses for expanding diagnostic toolkits. While limited efforts have been made to employ MRS in ML-driven classification of MS,8 previous studies have typically not used metabolite concentrations as the sole input features for subtype classifier development.


Spectral acquisition and quantification. Nineteen individuals with RR-MS (13 female, 6 male; mean ± S.E.M. 50 ± 2.4 y.o.), 21 with P-MS (12 female, 9 male; 55 ± 1.7 y.o.), and 16 individuals without MS (healthy controls, HC; 9 female, 7 male; 51 ± 2.6 y.o.) underwent a 7 Tesla MR scan (Varian Medical Systems, Inc., Palo Alto, CA, USA) with B1 phase shimming, third-order spherical harmonics B0 shimming, and localization to a 27-cc cubic frontal cortex voxel for proton MRS, including macromolecule-suppressed STEAM (TI 300 ms, TE 10 ms, TM 50 ms, TR 3 s) as well as MEGA-sLASER for J-difference editing of glutathione (GSH) and γ-aminobutyric acid (GABA) (TE 72 ms, TR 3 s).9,10 Processing and quantification of metabolites relative to total creatine were achieved in INSPECTOR11 as reported previously.12

Machine learning analysis. Eleven metabolite concentrations (Figure 1) derived from three different MRS applications were used as input for analysis pipelines developed using Python and the scikit-learn library.13 SMOTE was first employed to balance class sizes14 prior to pre-processing and dimensionality reduction. Hyper-parameter optimization for several linear and non-linear classifiers was conducted using GridSearchCV.13 Leave-One-Out Cross Validation (LOOCV) was employed on a training set, and testing was performed on a held-out, evenly split, ten-patient validation set.
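The model-selection step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the study used scikit-learn, whereas this version spells out leave-one-out cross-validation and a small grid search over k for a K-nearest-neighbors classifier in plain Python. The data are synthetic (two Gaussian classes with eleven features each), SMOTE and PCA are omitted, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def standardize(train_rows, test_row):
    """Z-score features using statistics from the training rows only."""
    n_features = len(train_rows[0])
    means = [sum(r[j] for r in train_rows) / len(train_rows)
             for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in train_rows) / len(train_rows)
        stds.append(math.sqrt(var) or 1.0)  # guard against zero variance

    def scale(row):
        return [(row[j] - means[j]) / stds[j] for j in range(n_features)]

    return [scale(r) for r in train_rows], scale(test_row)


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, k):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        train_Z, z = standardize(train_X, X[i])
        hits += knn_predict(train_Z, train_y, z, k) == y[i]
    return hits / len(X)


def select_k(X, y, candidates):
    """Grid search: score every candidate k by LOOCV and keep the best."""
    scores = {k: loocv_accuracy(X, y, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for eleven metabolite ratios per subject (NOT study data).
rng = random.Random(0)


def make_class(center, n_subjects, n_features=11):
    return [[rng.gauss(center, 1.0) for _ in range(n_features)]
            for _ in range(n_subjects)]


X = make_class(0.0, 20) + make_class(2.0, 20)
y = ["A"] * 20 + ["B"] * 20

best_k, scores = select_k(X, y, candidates=(1, 3, 5, 7))
print(f"best k = {best_k}, LOOCV accuracy = {scores[best_k]:.3f}")
```

Standardization is refit inside every fold so that the held-out sample never influences the scaling. Any oversampling or dimensionality reduction step would need the same treatment to avoid leaking information into the accuracy estimate.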


Distinguishing MS from HC. Pipelines employing K-nearest neighbors (KNN), logistic regression (LR), support vector machines (SVM), and Gaussian process classifiers (GPC) each reached at least 75% LOOCV accuracy on the training set and 80% accuracy on the validation set (Figure 2A). All other examined pipelines achieved <70% validation set accuracy. Dimensionality reduction, namely principal component analysis (PCA), further increased classifier accuracy for MS patient vs. HC differentiation for the four best-performing pipelines. The effect of PCA on the KNN pipeline’s training and validation set performance is shown in Figure 3A. Eight principal components were employed to maximize the pipeline’s collective accuracy on both sets. The final KNN pipeline correctly predicted the disease state of 8 of the 10 individuals in the validation set, while misclassifying one MS patient as healthy and one HC as having MS (Figure 4A).

Differentiating between relapsing-remitting and progressive MS. The highest-performing pipeline (KNN) achieved a 77.8 ± 4.7% LOOCV accuracy and 70% validation set accuracy (Figure 2B). A stochastic gradient descent (SGD)-based pipeline also achieved a 70% validation set accuracy. All other pipelines achieved <60% validation set accuracy. Dimensionality reduction also increased MS subtype classifier accuracy and revealed that employing five principal components optimized KNN pipeline performance (Figure 3B). The KNN pipeline correctly predicted the disease course of seven of the ten patients in the validation set, while misclassifying one RR-MS patient and two P-MS patients (Figure 4B).
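The validation set accuracies above follow directly from the confusion matrix counts reported for Figure 4. The short sketch below recomputes them, together with per-class recall. It assumes that "evenly split" means five individuals per class in each ten-person validation set; that split and the helper name `summarize` are assumptions for illustration, not details confirmed by the text.

```python
def summarize(confusion):
    """Return overall accuracy and per-class recall.

    `confusion[true_label][predicted_label]` holds a count.
    """
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[label][label] for label in confusion)
    recall = {label: confusion[label][label] / sum(confusion[label].values())
              for label in confusion}
    return correct / total, recall


# Counts from the Figure 4 description, assuming 5 individuals per class.
ms_vs_hc = {
    "MS": {"MS": 4, "HC": 1},   # one MS patient classified as healthy
    "HC": {"MS": 1, "HC": 4},   # one HC predicted to have MS
}
rr_vs_p = {
    "RR-MS": {"RR-MS": 4, "P-MS": 1},  # one RR-MS patient misclassified
    "P-MS": {"RR-MS": 2, "P-MS": 3},   # two P-MS patients misclassified
}

acc_a, recall_a = summarize(ms_vs_hc)
acc_b, recall_b = summarize(rr_vs_p)
print(f"MS vs. HC: accuracy {acc_a:.0%}, recall {recall_a}")
print(f"RR-MS vs. P-MS: accuracy {acc_b:.0%}, recall {recall_b}")
```

Under the five-per-class assumption, the subtype classifier's errors fall more heavily on P-MS (three of five correct) than on RR-MS (four of five correct).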


Our results suggest that the metabolic profiles associated with RR-MS and P-MS are distinguishable through machine learning. To our knowledge, this study is the first to demonstrate that the classification of MS subtype can be performed using in vivo MRS data alone. Adding more input features to our models, including metabolite concentrations derived from MRS sequences beyond the three employed in this research, may further increase model accuracy. Finally, an external test set beyond our validation set is needed to further assess model generalizability.
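The need for a larger external test set can be made concrete by placing an interval around the accuracies measured on ten individuals. The sketch below computes a Wilson score interval for a binomial proportion; this calculation is an illustrative addition, not an analysis performed in the study, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Correct predictions out of 10 in each validation set, as reported above.
for label, correct in (("MS vs. HC", 8), ("RR-MS vs. P-MS", 7)):
    low, high = wilson_interval(correct, 10)
    print(f"{label}: {correct}/10 correct, ~95% interval {low:.2f} to {high:.2f}")
```

With only ten individuals the intervals span several tens of percentage points, which is consistent with the call for an external test set.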


This research was funded by the National Multiple Sclerosis Society (NMSS) grant “In Vivo Metabolomics of Oxidative Stress with 7 Tesla Magnetic Resonance Spectroscopy” (RG 5319) as well as the Yale Center for Clinical Investigation (YCCI) and falls under the purview of Yale Medical School Human Investigation Committee protocol #1107008743. We offer special thanks to Ms. Yvette Strong (YCCI) and the nursing staff at the Yale-New Haven Hospital Interventional Immunology Clinic for their tireless efforts to support patient recruitment.


1. Browne P, Chandraratna D, Angood C, Tremlett H, Baker C, Taylor BV, Thompson AJ. Atlas of multiple sclerosis 2013: a growing global problem with widespread inequity. Neurology. 2014;83(11):1022-1024.

2. Sand IK. Classification, diagnosis, and differential diagnosis of multiple sclerosis. Current Opinion in Neurology. 2015;28(3):193-205.

3. Ciotti JR, Cross AH. Disease-modifying treatment in progressive multiple sclerosis. Current Treatment Options in Neurology. 2018;20(5):12.

4. Thompson AJ, Banwell BL, Barkhof F, Carroll WM, Coetzee T, Comi G, Correale G et al. Diagnosis of multiple sclerosis: 2017 revisions of the McDonald criteria. The Lancet Neurology. 2017.

5. Stangel M, Fredrikson S, Meinl E, Petzold A, Stüve O, Tumani H. The utility of cerebrospinal fluid analysis in patients with multiple sclerosis. Nature Reviews Neurology. 2013;9(5):267.

6. Noyes K, Weinstock-Guttman B. Impact of diagnosis and early treatment on the course of multiple sclerosis. Am J Manag Care. 2013;19(17):s321-331.

7. Geurts JJG, Reuling IEW, Vrenken H, Uitdehaag BMJ, Polman CH, Castelijns AJ, Barkhof F, Pouwels PJW. MR spectroscopic evidence for thalamic and hippocampal, but not cortical, damage in multiple sclerosis. Magnetic Resonance in Medicine. 2006;55(3):478-483.

8. Ion-Mărgineanu A, Kocevar G, Stamile C, Sima DM, Durand-Dubief F, Van Huffel S, Sappey-Marinier D. Machine learning approach for classifying Multiple Sclerosis courses by combining clinical data with lesion loads and Magnetic Resonance metabolic features. Frontiers in Neuroscience. 2017;11:398.

9. Prinsen H, De Graaf RA, Mason GF, Pelletier D, Juchem C. Reproducibility measurement of glutathione, GABA, and glutamate: towards in vivo neurochemical profiling of multiple sclerosis with MR spectroscopy at 7T. Journal of Magnetic Resonance Imaging. 2017;45(1):187-198.

10. Swanberg KM, Prinsen H, Fulbright RK, et al. Towards in vivo neurochemical profiling of multiple sclerosis with MR spectroscopy at 7 Tesla: Cross-sectional assessment of frontal-cortex glutathione, GABA, and glutamate in individuals with relapsing-remitting and progressive multiple sclerosis. Proc. ISMRM. 2017;2970.

11. Juchem C. INSPECTOR - Magnetic Resonance Spectroscopy Software. Columbia TechVenture (CTV) License CU17130. 2016. innovation.columbia.edu/technologies/cu17130_inspector.

12. Swanberg KM, Prinsen H, Coman D, de Graaf RA, Juchem C. Quantification of glutathione transverse relaxation time T2 using echo time extension with variable refocusing selectivity and symmetry in the human brain at 7 Tesla. Journal of Magnetic Resonance. 2018;290:1-11.

13. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M et al. Scikit-learn: machine learning in Python. Journal of Machine Learning Research. 2011;12:2825-2830.

14. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research. 2002;16:321-357.


Figure 1: The concentrations of eleven total creatine-referenced metabolites were employed as input features for our classification pipelines. Nine of these metabolites were derived through macromolecule-suppressed Stimulated Echo Acquisition Mode (STEAM: TI 300 ms, TE 10 ms, TM 50 ms, TR 3 s). J-difference editing (JDE) enabled the quantification of GABA and GSH using semi-localized by adiabatic selective refocusing spectroscopy (MEGA-sLASER: TE 72 ms, TR 3 s).

Figure 2: Accuracy of the highest-performing classifiers for differentiating multiple sclerosis (MS) patients vs. healthy controls (HC) and differentiating relapsing-remitting and progressive MS patients. A) We were able to tune K-nearest neighbors (KNN), Gaussian process classifier (GPC), logistic regression (LR), and support vector machine (SVM) models to achieve 80% validation set accuracy for differentiating MS patients from HCs. Other MS versus HC pipelines achieved accuracies of 70% and below (gray dashed line). B) KNN and stochastic gradient descent (SGD) pipelines reached 70% validation set accuracy for classifying MS subtypes, while other pipelines produced accuracies of 60% and below.

Figure 3: Dimensionality reduction techniques, including principal component analysis (PCA), increased the accuracy of the leading pipelines for separating multiple sclerosis (MS) patients from healthy controls (HC) as well as for differentiating MS patients by their disease course. A) The optimal MS vs. HC KNN pipeline (82.1 ± 4.4% leave-one-out cross-validation (LOOCV) accuracy; 80% validation set accuracy) was achieved by employing 8 principal components. B) The most accurate MS subtype-classifying KNN pipeline (77.8 ± 4.7% LOOCV accuracy; 70% validation set accuracy) employed 5 principal components. The shaded regions surrounding the training set accuracy represent ± 1 S.E.M. of this accuracy.

Figure 4: Validation set confusion matrices for the top K-Nearest Neighbors (KNN) pipelines for differentiating multiple sclerosis (MS) patients from healthy controls (HC) and for classifying patients with different MS courses. A) The first KNN pipeline correctly predicted whether 8 of 10 patients had MS. One MS patient was incorrectly classified as being healthy and one HC was predicted to have MS. B) The second KNN pipeline correctly classified the disease state of 7 of 10 MS patients, while incorrectly predicting the disease course of two P-MS patients and one RR-MS patient.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)