Development and Validation of an AI-based Algorithm to Identify Recurrent Stroke Using National Claims Data in Korea
Jun Yup Kim1, Seong-Eun Kim1, Jiyoon Lee4, Hyokwang Kim5, Wi-Sun Ryu5, Jihoon Kang1, Hyunji Kim2, Hee-Joon Bae3
1Department of Neurology, 2Department of Laboratory Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, 3Seoul National University Bundang Hospital, Seoul National University College of Medicine, 4Department of Biostatistics, Korea University College of Medicine, 5Artificial Intelligence Research Center, JLK Inc
Objective:
This study aimed to develop and validate an artificial intelligence (AI)–based algorithm to identify recurrent stroke admissions using national claims data, enabling accurate epidemiologic statistics and supporting health policy planning.
Background:
Stroke is a major cause of death and disability worldwide. Recurrent stroke contributes substantially to the overall burden but is difficult to accurately identify using administrative claims data because current coding systems do not distinguish acute from chronic or non-acute events. Although validated algorithms for first-ever stroke exist, no nationwide claims-based algorithms are currently available to detect recurrent stroke in Korea.
Design/Methods:
The gold standard for recurrent stroke was defined using the Korean Acute Stroke Quality Assessment Program (ASQAP), a nationwide government-led evaluation of acute stroke care. Initial stroke admissions were identified during the 4th–9th assessment periods (2011–2021), and patients who were re-registered in the 10th assessment (October 2022–March 2023) were classified as having recurrent stroke. A rule-based algorithm was developed first, followed by a random forest machine learning model whose hyperparameters were optimized by grid search to maximize sensitivity and positive predictive value (PPV).
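The gold-standard labeling rule described above can be sketched as a simple set lookup. This is a minimal illustration, not the authors' implementation: the `Registration` record, the `patient_id` field, and the example data are hypothetical, and the sketch labels patients rather than individual admission cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    patient_id: str  # hypothetical identifier, not an actual ASQAP field name
    period: int      # ASQAP assessment period number


INDEX_PERIODS = range(4, 10)  # 4th-9th assessments (initial stroke admissions)
TARGET_PERIOD = 10            # 10th assessment (candidate recurrent admissions)


def label_recurrent(registrations):
    """Map each patient registered in the target period to True if the same
    patient was also registered in any earlier index period."""
    prior = {r.patient_id for r in registrations if r.period in INDEX_PERIODS}
    return {
        r.patient_id: r.patient_id in prior
        for r in registrations
        if r.period == TARGET_PERIOD
    }


# Toy data for illustration only.
records = [
    Registration("A", 5),
    Registration("A", 10),  # re-registered in the 10th assessment: recurrent
    Registration("B", 10),  # first registration in the 10th assessment
    Registration("C", 7),   # never re-registered, so not in the output
]
labels = label_recurrent(records)
```

Here `labels` marks patient A as recurrent and patient B as not recurrent; patient C is absent because only 10th-assessment registrations are labeled.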
Results:
A total of 13,078 admission cases during the 10th assessment period (male, 54.2%; mean age, 65.7 ± 15.5 years) were analyzed, of which 485 were confirmed as recurrent stroke. Thirty key identifiers were selected from claims data, and 58 conditions were generated from their combinations. The identifiers included brain CT, MRI, rt-PA, endovascular treatment, carotid endarterectomy or stenting, antithrombotics, and anticoagulants. The rule-based algorithm achieved a sensitivity of 72.0%, specificity of 98.5%, accuracy of 97.5%, and PPV of 64.7%. The AI-based model improved on all four measures (sensitivity 82.3%, specificity 98.9%, accuracy 98.3%, PPV 74.7%).
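For readers who want to see how the four reported measures relate, the sketch below computes them from a 2×2 confusion matrix. The abstract does not report cell counts; the counts used here are back-calculated from the rounded percentages for the rule-based algorithm (13,078 cases, 485 recurrent) and are an assumption for illustration only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, accuracy, and PPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # recurrent strokes correctly flagged
        "specificity": tn / (tn + fp),  # non-recurrent cases correctly cleared
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),          # flagged cases that are truly recurrent
    }


# Assumed counts, reconstructed from the rounded percentages (not reported
# by the authors): 485 recurrent and 12,593 non-recurrent admission cases.
rule_based = diagnostic_metrics(tp=349, fp=190, fn=136, tn=12403)

for name, value in rule_based.items():
    print(f"{name}: {value * 100:.1f}%")
```

Because recurrent stroke makes up only about 3.7% of the cases, specificity and accuracy stay high even with a moderate number of false positives, which is why sensitivity and PPV are the more informative measures here.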
Conclusions:
An AI-based algorithm accurately identified recurrent stroke admissions from national claims data. This approach may enable robust nationwide recurrent stroke statistics and inform resource allocation and health policy decisions.
Disclaimer: Abstracts were not reviewed by Neurology® and do not reflect the views of Neurology® editors or staff.