INSIGHT-MP: Interpretable Natural Language Processing System for Identification of Transthyretin Amyloidosis with Mixed Phenotype Using Machine Learning
Akshay Arora¹, Cynthia Sunderman¹, Briget da Graca¹, Elisa Priest¹, Muhammad Khan¹, Kendall Hammonds¹, Monica Bennett¹, Jason Ettlinger¹, Robert Gottlieb¹, John Venditto², Mia Papas²
¹Baylor Scott and White Research Institute, ²US Medical Affairs, BioPharmaceutical Medical, AstraZeneca
Objective:
To develop a machine learning model (INSIGHT-MP) for identification of patients likely to have a mixed transthyretin amyloidosis (ATTR) phenotype among heart failure patients with neuropathy, which could guide and optimize referrals for diagnostic testing. We also aim to offer insights into the key factors driving the model.
Background:

Patients with ATTR amyloidosis often have concomitant polyneuropathy and cardiomyopathy (ATTR-mixed phenotype), but the polyneuropathy frequently goes unrecognized. Machine learning combined with explainable artificial intelligence (AI) may improve early detection of the ATTR-mixed phenotype by identifying at-risk patients and helping clinicians understand the key predictors driving the model.

Design/Methods:

Patients with diagnoses of ATTR (ICD E85.x) or heart failure (ICD I50.x) and peripheral or autonomic neuropathy (ICD G54.x–G64.x, G90.x) were included. Natural language processing (NLP) with Named Entity Recognition was used to process unstructured clinical, echocardiographic, and electrocardiogram data from clinical notes extracted from the Baylor Scott & White Health Epic database. A Balanced Random Forest Classifier (BRFC) was trained on the clinical text to predict the presence of both ATTR and neuropathy diagnosis codes. Model explanations were generated using Local Interpretable Model-agnostic Explanations (LIME).
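
As a rough illustration of the balancing idea behind a Balanced Random Forest, the sketch below draws one tree's training sample so that both classes are equally represented: a bootstrap sample of the minority class plus an equally sized sample, with replacement, of the majority class. The abstract does not state which implementation or settings were used, so the function name `balanced_bootstrap` and the toy labels are hypothetical and serve only to show the mechanism.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree's balanced training sample.

    Draws a bootstrap sample (with replacement) from the minority class,
    then the same number of rows (with replacement) from the majority class.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    n = len(minority)
    return rng.choices(minority, k=n) + rng.choices(majority, k=n)


# Hypothetical toy labels (not study data): 5 positives among 100 rows.
labels = [1] * 5 + [0] * 95
rng = random.Random(0)  # fixed seed so the sketch is repeatable

sample = balanced_bootstrap(labels, rng)
counts = Counter(labels[i] for i in sample)
print(dict(counts))  # each class contributes 5 rows to this tree's sample
```

Repeating this draw for every tree keeps each tree from being dominated by the majority class, which is one common way to handle a rare outcome without discarding data globally.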

Results:

The mean age of the cohort was 71.6 (14.3) years; 47.3% were female; 75.8% were White, 18.1% Black, and 2.1% Asian; and 89.7% were non-Hispanic. The BRFC achieved a sensitivity of 90.0%, specificity of 86.2%, positive predictive value of 14.1%, negative predictive value of 99.7%, and F1 score of 24.5% in identifying ATTR-mixed phenotype (n=412) among patients with heart failure and neuropathy (n=13,500). LIME identified carpal tunnel, joint swelling, atrial fibrillation, dyspnea, weight loss, tingling, heat and cold intolerance, syncope, paresthesia, and claudication as the top predictors of ATTR-mixed phenotype.
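
For readers less familiar with these metrics, the sketch below shows how they are conventionally derived from a confusion matrix, and how the F1 score follows from positive predictive value and sensitivity. The confusion-matrix counts are hypothetical toy numbers, not the study's data; only the rounded 14.1% and 90.0% figures are taken from the results above.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def f1_from(ppv, sensitivity):
    """F1 is the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical toy counts (not study data).
toy = screening_metrics(tp=9, fp=20, fn=1, tn=70)
print({name: round(value, 3) for name, value in toy.items()})

# F1 recomputed from the rounded PPV and sensitivity reported above.
reported_f1 = f1_from(0.141, 0.900)
print(f"{reported_f1:.3f}")  # 0.244
```

The recomputed value of about 0.244 is in line with the reported 24.5% once rounding of the inputs is allowed for. Because F1 is a harmonic mean, it stays close to the lower of its two components, which is why a low positive predictive value yields a low F1 score even when sensitivity is high.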

Conclusions:
The integration of NLP, machine learning, and explainable AI may provide a valuable tool for the timely identification of undiagnosed ATTR-mixed phenotype cases among patients with heart failure and neuropathy.
DOI: 10.1212/WNL.0000000000210673
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.