Postoperative Cervical Spine Radiograph Analysis: Generative Adversarial Networks Which Erase Spinal Instrumentation Improve the Performance of Landmark Detection Models
Sachin Govind1, Yunting Yu2, Nader Dahdaleh3
1Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, 2Penn State College of Medicine, 3Department of Neurological Surgery, Northwestern Medicine
Objective:
We evaluate a framework that identifies and digitally “erases” confounding spinal instrumentation from postoperative images to support long-term neurology follow-up.
Background:
Radiographic parameters following cervical spine surgery are presently assessed manually or with the aid of computer-assisted landmark detection. Existing radiographic algorithms are sometimes hindered by radiopaque vertebral fixation instrumentation, resulting in poor registration.
Design/Methods:
Landmarks and parameters of interest included vertebral endplates, spinous process A/P tips, intervertebral disc heights, and Cobb angles. Four models were compared: 1) a standard U-Net, 2) a standard registration-based model, and 3, 4) each standard model preceded by a masking U-Net and a Generative Adversarial Network (GAN) that digitally erase spine instrumentation (the "prefixed" models). The GAN was pre-trained on 11,694 cervical spine X-rays from the NHANES-II dataset with digitally added fixation hardware phantoms. All four models were evaluated on 132 annotated postoperative lateral cervical spine radiographs.
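As an illustration of how angular and height parameters can be derived from detected landmarks, the following is a minimal sketch and not the authors' implementation. It assumes each endplate is represented by two (x, y) landmarks (anterior and posterior corners); the function names and the midpoint-based disc height are hypothetical choices made for this example.

```python
import math

def endplate_angle(anterior, posterior):
    """Orientation (radians) of the line through two endplate landmarks."""
    return math.atan2(posterior[1] - anterior[1], posterior[0] - anterior[0])

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle in degrees between two endplate lines.

    Each endplate is a pair of (x, y) landmarks. Lines are treated as
    undirected, so the result always lies in [0, 90].
    """
    diff = abs(endplate_angle(*upper_endplate) - endplate_angle(*lower_endplate))
    diff %= math.pi                      # direction of each line is irrelevant
    return math.degrees(min(diff, math.pi - diff))

def disc_height(inferior_endplate_above, superior_endplate_below):
    """Assumed definition: distance between the midpoints of the two
    endplates bounding a disc, in the same units as the landmarks."""
    def midpoint(endplate):
        (x1, y1), (x2, y2) = endplate
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    return math.dist(midpoint(inferior_endplate_above),
                     midpoint(superior_endplate_below))

# Example with made-up coordinates (pixels)
upper = ((0.0, 0.0), (10.0, 0.0))      # horizontal endplate
lower = ((0.0, 20.0), (10.0, 30.0))    # endplate tilted by 45 degrees
angle = cobb_angle_deg(upper, lower)
height = disc_height(((0.0, 0.0), (10.0, 0.0)), ((0.0, 6.0), (10.0, 6.0)))
```

Because both quantities depend only on landmark coordinates, any error in landmark placement caused by instrumentation propagates directly into the computed angle and height.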
Results:
Inter-rater reliability for landmark locations corresponded to a normalized Euclidean distance (ED) of 1.56±0.14 mm. The standard U-Net and registration models located the C2-C6 vertebral parameters of interest with EDs of 2.11±0.24 mm and 1.91±0.22 mm, respectively. The prefixed U-Net and registration models achieved EDs of 1.61±0.18 mm and 1.11±0.24 mm, respectively. Using the inter-rater ED of 1.56 mm as the tolerance benchmark, precision scores were determined for each model: 88% vs. 95% for the standard vs. prefixed U-Net (p = 0.0132) and 79% vs. 88% for the standard vs. prefixed registration model (p = 0.0092). Subgroup analysis revealed that the prefixed models fared better (89% vs. 96%, p = 0.0004) for spinous processes vs. vertebral endplates, and that accuracy of computed landmarks was better for C2-C4 vs. C4-C6 (p = 0.0045).
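The two metrics reported above can be sketched as follows. This is a hedged illustration only: the abstract does not define how ED is normalized or how precision is computed, so the sketch assumes that normalization is a pixel-to-millimetre scaling (the hypothetical `mm_per_pixel` parameter) and that precision is the fraction of predicted landmarks falling within the 1.56 mm tolerance.

```python
import math

def euclidean_mm(pred, truth, mm_per_pixel):
    """Distance between a predicted and a reference landmark, in mm.
    Assumes isotropic pixel spacing given by mm_per_pixel."""
    return math.dist(pred, truth) * mm_per_pixel

def precision_at_tolerance(preds, truths, mm_per_pixel, tol_mm=1.56):
    """Fraction of landmarks whose error is within tol_mm (assumed
    definition of precision), plus the per-landmark errors."""
    errors = [euclidean_mm(p, t, mm_per_pixel) for p, t in zip(preds, truths)]
    within = sum(1 for e in errors if e <= tol_mm)
    return within / len(errors), errors

# Example with made-up landmark coordinates (pixels)
preds = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
truths = [(0.0, 0.0), (0.0, 0.0), (10.0, 12.0)]
rate, errors = precision_at_tolerance(preds, truths, mm_per_pixel=0.5)
```

In this toy example the per-landmark errors are 0, 2.5, and 1.0 mm, so two of the three landmarks fall within tolerance.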
Conclusions:
We demonstrate that GANs trained on spinal imaging can aid neurologists with postoperative cervical spine radiographic follow-up and that, when used as a prefix to landmark detection models, they outperform state-of-the-art detection models, with performance similar to gold-standard human-detected landmarks.