Data-driven phenotypic clustering of Parkinson’s disease patients seeking deep brain stimulation
Joohi Jimenez Shahed¹, Arthur Berg², Michele York³, Jason Schwalb⁴, James Kirk⁵, Mustafa Siddiqui⁶, James McInerney²
¹Icahn School of Medicine at Mount Sinai, ²Penn State University, ³Baylor College of Medicine, ⁴Henry Ford Medical Group, ⁵Patient Advocate, ⁶Wake Forest Medical Center
Objective:
To identify unique phenotypic clusters among participants in RAD-PD (Registry for the Advancement of Deep Brain Stimulation in Parkinson’s Disease).

Background:
RAD-PD is a longitudinal quality improvement registry that systematically characterizes participants through patient-reported outcome measures and clinician-administered scales, with the overarching goal of improving outcomes from deep brain stimulation (DBS) in PD. Multiple investigations have demonstrated that DBS improves motor function and quality of life (QoL). Less is understood about non-motor outcomes and determinants.
Design/Methods:
A variety of demographic/social, disease-related, motor, non-motor, QoL, and treatment-related data points are captured in RAD-PD. Continuous variables from participants with complete pre-operative assessments were analyzed using R software. Pairs-plot correlation analysis, principal component analysis (PCA), and hierarchical clustering analysis (HCA) were conducted on normalized data.

Results:
Data from 133 subjects were included. Among 32 categories of RAD-PD data covering demographic/disease (n=5), non-motor (n=18), motor (n=3), cognitive (n=3), and QoL (n=3) variables, all 3 QoL scale scores were strongly correlated with each other (r=0.47-0.64, p<0.0001) and were otherwise most highly correlated (r>0.4) with measures of depression, anxiety, sleep/fatigue, NMSS total, mood/cognition, IADLs, and MDS-UPDRS part 2. Motor and disease-related variables did not correlate strongly with other variables. PCA revealed two dimensions that together explained 38.7% of the variance in the data, driven mainly by QoL, NMSS domain/total scores, and QUIP-RS subscores/total scores. HCA classified individuals into two groups differentiated by non-motor (NMSS and ICD scores), cognitive, and QoL features.
Conclusions:
Although recommendations for DBS in PD are often driven by motor disease features, QoL and non-motor symptoms are prominent, closely related features that can be combined to define phenotypic clusters. Such data-driven classifications can be used in the RAD-PD cohort for novel outcomes analyses and predictive modeling.

10.1212/WNL.0000000000203914