Complex Enough?: Evaluating Reliability in Rating Medical Complexity of a Vulnerable Group of Patients with Epilepsy
Jonathan Williams1, Ventaka Bendi1, Fabio Nascimento1, Julio Cuevas2, David Heron1, Rachel Miller1, Adam Greenblatt1
1Washington University in St. Louis, 2San Juan Bautista School of Medicine
Objective:
At-risk patients with epilepsy (PWE) face disparities in access to specialists and treatments. Effectively reaching vulnerable PWE with finite clinic resources will require protocols to identify, rate, and triage PWE appropriately. We sought to evaluate interrater variability between an expert (epileptologist) and non-experts (neurology trainees) in grading the complexity of a common set of clinical epilepsy cases.
Background:
PWE have higher rates of comorbid medical conditions, injury, and premature death than the general population. Minority status and low socioeconomic status are associated with a higher risk of epilepsy, and these sociodemographic traits are prevalent in our resident neurology clinic. Specialist care is associated with a lower risk of premature mortality for PWE. Available grading tools for seizure severity have been used in clinical trials or research, but their validity as estimates of epilepsy disease severity/complexity in outpatient clinics has not been well established.
Design/Methods:
This was an observational cross-sectional study. Seven de-identified epilepsy cases from the resident clinic were presented as standardized clinical vignettes. One epileptologist and eight neurology trainees completed a REDCap survey that paired the vignettes with qualitative and semiquantitative measures. The primary measures of interest were interrater agreement and interrater reliability. We also explored how strongly individual clinical features weighed on respondents' selections when responses were concordant.
Results:
  • Percent agreement: 73% (95% CI 52.4-93.7); p=0.00006559  

  • Interrater reliability (Gwet’s AC1): 0.460 (0.047-0.874) [moderate]; p=0.01721  
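For readers unfamiliar with the two statistics above, the following is a minimal sketch of how percent agreement and Gwet's AC1 can be computed for a single pair of raters. It is illustrative only: the abstract does not state how ratings from the nine respondents were pooled, so the two-rater form is an assumption, and the complexity labels and ratings below are hypothetical, not study data. The function names are likewise made up for this example.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which the two raters chose the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must grade the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def gwet_ac1(rater_a, rater_b, categories):
    """Gwet's AC1 for two raters and nominal categories.

    AC1 = (pa - pe) / (1 - pe), where pa is the observed agreement and
    pe = (1 / (q - 1)) * sum_k pi_k * (1 - pi_k), with pi_k the average
    proportion of ratings falling in category k across both raters.
    """
    q = len(categories)
    if q < 2:
        raise ValueError("At least two categories are required")
    n = len(rater_a)
    pa = percent_agreement(rater_a, rater_b)
    # Pool both raters' ratings to get the average category proportions.
    pooled = Counter(rater_a) + Counter(rater_b)
    pe = sum(
        (pooled[k] / (2 * n)) * (1 - pooled[k] / (2 * n)) for k in categories
    ) / (q - 1)
    return (pa - pe) / (1 - pe)


# Hypothetical complexity grades for seven vignettes (NOT the study's data).
CATEGORIES = ["low", "medium", "high"]
expert = ["high", "high", "low", "medium", "high", "low", "medium"]
trainee = ["high", "medium", "low", "medium", "high", "low", "high"]

if __name__ == "__main__":
    print("Percent agreement:", round(percent_agreement(expert, trainee), 3))
    print("Gwet's AC1:", round(gwet_ac1(expert, trainee, CATEGORIES), 3))
```

Because the chance-agreement term in AC1 shrinks when ratings cluster in one category, AC1 tends to be more stable than Cohen's kappa when category prevalence is skewed, which may be relevant with only seven vignettes. Confidence intervals and p-values such as those reported above require a variance estimate that this sketch does not include.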

Conclusions:
Among respondents grading epilepsy complexity, interrater agreement was 73% (p=0.00006559), but interrater reliability was only moderate (AC1 0.46, p=0.01721). For concordant responses, medications and seizure frequency had the greatest influence on respondents' selections. The level of percent agreement suggests that common principles exist and may be operationalized into protocols to reach those at higher risk. Standardized, evidence-based grading systems could help improve interrater reliability.
10.1212/WNL.0000000000205397