NLP-based Extraction of Social Determinants of Health in Patients Admitted with Spontaneous Intracranial Hemorrhage
Evan Zelt1, Areen Al-Dhoon1, Umit Topaloglu1, Aarti Sarwal1
1Wake Forest School of Medicine
Objective:
To determine how effectively Social Determinants of Health (SDOH) can be extracted from the clinical notes of patients with intracranial hemorrhage (ICH).
Background:
Social Determinants of Health (SDOH) have been a focus of research in intracranial hemorrhage (ICH) trials. Clinical text describes SDOH better than structured data in the EMR, but manual abstraction is expensive and prone to bias. Machine learning-based language extraction models require high-quality annotations, and clinical SDOH extraction models must be trained on SDOH-rich corpora. We created a framework to apply the Social History Annotation Corpus (SHAC) to clinical notes. Using the resulting annotations, we analyzed how rich in SDOH content each note type is.

Design/Methods:

This is an ongoing study assessing encounter notes from 600 patients with spontaneous ICH over 4 years. We sampled 5 patients and extracted their clinical notes across all encounters from the EMR, without date limitations, yielding 2315 notes. Two independent annotators were trained to mark each note’s SDOH elements according to SHAC guidelines, using adjudication tools to resolve disagreements under the oversight of a senior investigator. We counted each SDOH element in every note, grouped the counts by note type (29 types), and characterized the differences between types.
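
The counting and grouping step can be illustrated with a minimal sketch. It is not the study's actual pipeline: the record layout (`note_type`, `elements`), the function names, and the toy records are all hypothetical, and "yield" is assumed here to mean the average number of annotated SDOH elements per note, which the abstract does not define.

```python
from collections import Counter, defaultdict


def count_by_note_type(annotated_notes):
    """Tally each SDOH element per note type.

    `annotated_notes` is a hypothetical list of dicts, one per adjudicated
    note, with a "note_type" string and an "elements" list of SDOH labels.
    """
    counts = defaultdict(Counter)
    for note in annotated_notes:
        counts[note["note_type"]].update(note["elements"])
    return counts


def yield_per_note(annotated_notes):
    """Average number of annotated SDOH elements per note, by note type.

    This definition of yield is an assumption made for illustration.
    """
    totals = Counter()
    n_notes = Counter()
    for note in annotated_notes:
        totals[note["note_type"]] += len(note["elements"])
        n_notes[note["note_type"]] += 1
    return {t: totals[t] / n_notes[t] for t in n_notes}


# Toy, made-up records; these are not study data.
notes = [
    {"note_type": "Consult", "elements": ["gender", "tobacco", "alcohol"]},
    {"note_type": "Progress Note", "elements": ["gender"]},
    {"note_type": "Consult", "elements": ["gender", "insurance"]},
]

by_type = count_by_note_type(notes)
per_note = yield_per_note(notes)
```

Sorting `per_note` by value would then rank note types by yield under this assumed definition.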

Results:

Gender was the most frequently documented SDOH element, followed by tobacco and alcohol. Drugs, insurance, living status, and race were less frequent. Country of origin, environmental exposure, physical activity, and sexual orientation were never mentioned. Of the 29 note types, Consults, Progress Notes, and Therapy Evaluations produced the highest yield.

Conclusions:

SDOH extraction efforts would be most efficient when concentrated on Consults, Progress Notes, and Therapy Evaluations. Extraction of country of origin, environmental exposure, physical activity, and sexual orientation is limited by what current notes contain. Capturing these elements requires additional sources outside the EMR or their incorporation into the EMR as structured entries.

10.1212/WNL.0000000000206414