A task-based endpoint assessment for CNN segmentations in radiomics processing
Karl Spuhler1, Jie Ding1, Mario Serrano-Sosa1, and Chuan Huang1,2,3

1Biomedical Engineering, Stony Brook University, Stony Brook, NY, United States, 2Radiology, Stony Brook Medicine, Stony Brook, NY, United States, 3Psychiatry, Stony Brook Medicine, Stony Brook, NY, United States


In this study, we present evidence that CNN-based segmentations are sufficient for automated ROI delineation in radiomics processing.


Radiomics processing has received considerable interest over the past several years, with researchers increasingly seeking to treat images as mineable data sources that can noninvasively augment and support clinical decision making using imaging already widely acquired in standard care (1). While radiomics offers several potential benefits, its clinical adoption is often hindered by the need for reliable region of interest (ROI) delineation. Manual segmentation is a time-consuming process which must be performed by a trained expert, and radiomics features have been shown to vary with segmentation accuracy.

Recently, convolutional neural networks (CNNs) have emerged as a dominant technique in essentially all computer vision tasks, including image segmentation. CNNs automatically learn latent data representations without requiring explicit modelling or assumptions about image acquisition physics or biology. As such, CNNs are likely to provide an optimal framework for segmenting ROIs across a spectrum of radiomics tasks. Despite marked interest in radiomics and impressive CNN-driven improvements in image segmentation, little task-based validation, that is, assessment of whether CNN segmentations can support radiomics workflows, has been performed.

In this study, we demonstrate that a u-Net architecture (2) provides reliable segmentations for radiomics processing, using our recently published pipeline for sentinel lymph node (SLN) status prediction from DCE-MRI of women with breast cancer (3).


Our previous radiomics pipeline, recently published in JMRI (3), consists of a logistic regression model established on a 109-subject DCE-MRI dataset and validated on an independent 52-subject dataset. All ROIs for radiomics training and testing were drawn by an expert radiologist (R1) with 10 years of experience in breast MRI. In addition, a radiology resident (R2) drew a second set of ROIs for the validation set, in order to determine whether the variance introduced by feeding CNN segmentations to the trained radiomics pipeline exceeded that introduced by using a second source of manual ROIs.
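The structure of such a pipeline, a logistic regression model trained on features from one segmentation source and evaluated on validation features from another, can be sketched as follows. This is a minimal illustration with synthetic features; the feature values, dimensionality, and fitting hyperparameters are assumptions, not the study's actual data or model.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit a logistic regression model by plain gradient descent (bias included)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))          # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)          # gradient of log-loss
    return w

def predict_proba(X, w):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

# Synthetic "radiomics features": one informative dimension, two noise dimensions.
# Dataset sizes mirror the study (109 training, 52 validation subjects).
X_train = rng.normal(size=(109, 3))
y_train = (X_train[:, 0] > 0).astype(float)
w = fit_logistic(X_train, y_train)

# In the study, validation features would be extracted twice (R1 vs CNN ROIs)
# and both fed to the same trained model; here one synthetic set stands in.
X_val = rng.normal(size=(52, 3))
y_val = (X_val[:, 0] > 0).astype(float)
acc = np.mean((predict_proba(X_val, w) >= 0.5) == y_val)
```

The key design point is that the model is fit once and frozen; only the segmentation source feeding the validation features changes between comparisons.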

In order to assess the utility of CNN-based segmentations for this task, we trained a u-Net to segment breast lesions from DCE-MRI. The CNN training dataset was a superset of the radiomics training data, containing an additional 154 subjects acquired with different voxel sizes that were not included in the original radiomics training (Figure 1).


The CNN achieved a mean Dice coefficient of 0.70±0.20 in the validation set relative to R1, exceeding R2’s mean Dice coefficient of 0.62±0.16 relative to R1. A histogram of the CNN vs R1 Dice coefficients is shown in Figure 2.
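The Dice coefficient reported above is computed directly from two binary masks, as in this minimal numpy sketch (the toy masks and variable names are illustrative, not study data):

```python
import numpy as np

def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity between two binary segmentation masks: 2|A∩B| / (|A|+|B|)."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy example: a 4-voxel "R1" mask and an overlapping 6-voxel "CNN" mask
r1 = np.zeros((4, 4), dtype=bool); r1[1:3, 1:3] = True
cnn = np.zeros((4, 4), dtype=bool); cnn[1:3, 1:4] = True
print(dice_coefficient(r1, cnn))  # 2*4 / (4+6) = 0.8
```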

For the primary endpoint, the task-based analysis, the CNN segmentations achieved accuracy comparable to R1 segmentations in radiomics processing. Receiver operating characteristic data are shown in Figure 3 and prediction metrics are shown in Table 1.
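The kinds of prediction metrics typically tabulated in such a comparison (sensitivity, specificity, accuracy, ROC AUC) can be computed from labels and model scores as below. The example labels and scores are illustrative only, not the study's results.

```python
import numpy as np

def prediction_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, accuracy and ROC AUC for a binary predictor."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = y_score >= threshold
    tp = np.sum(y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / y_true.size
    # AUC via the Mann-Whitney statistic: probability a random positive
    # is scored above a random negative (ties count half)
    pos, neg = y_score[y_true], y_score[~y_true]
    auc = (np.mean(pos[:, None] > neg[None, :])
           + 0.5 * np.mean(pos[:, None] == neg[None, :]))
    return sens, spec, acc, auc

# Illustrative labels/scores only
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
sens, spec, acc, auc = prediction_metrics(labels, scores)
```

Running the same function on scores produced from R1-, R2-, and CNN-derived features would populate one row of the comparison table per segmentation source.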


This study yields two primary observations. First, feeding CNN-based segmentations to our trained radiomics model did not substantially degrade performance relative to the original radiologist’s (R1) segmentations. Second, segmentations provided by a second observer, R2, a radiology resident, introduced larger performance variation than the CNN-based segmentations did. This demonstrates that the u-Net model is a suitable candidate for advancing clinical adoption of radiomics through automated lesion segmentation.


A simple CNN model is able to provide robust and sufficiently accurate ROI segmentations for input into an existing radiomics pipeline. This is primarily supported by the fact that the CNN segmentations demonstrated task-based accuracy similar to expert manual segmentation while outperforming segmentations provided by a radiology resident. The community would benefit greatly from task-based analyses such as the one provided herein.


This work was supported by the National Institutes of Health (R03CA223052), the Carol M. Baldwin Foundation for Breast Cancer Research (2017-Huang), and the Walk-for-Beauty Foundation.


1. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology 2015;278(2):563-577.

2. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2015. Springer. p 234-241.

3. Liu C, Ding J, Spuhler K, Gao Y, Serrano Sosa M, Moriarty M, Hussain S, He X, Liang C, Huang C. Preoperative prediction of sentinel lymph node metastasis in breast cancer by radiomic signatures from dynamic contrast‐enhanced MRI. Journal of Magnetic Resonance Imaging 2018.


Figure 1. Distribution of training and testing data for both the CNN and radiomics models. Note that the radiomics training dataset is a subset of the CNN training dataset.

Figure 2. Dice coefficient histogram. R1 vs CNN.

Figure 3. ROC data for radiomics prediction accuracy using R1, R2 and CNN segmentations.

Table 1. Radiomics prediction metrics

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)