Survival prediction from DCE-MRI kinetic parameters in patients with osteosarcoma using deep learning
Junyu Guo1 and Wilburn E. Reddick1

1St Jude Children's Research Hospital, Memphis, TN, United States


DCE-MRI may be a prognostic biomarker for some tumors, including osteosarcoma. The purpose of this study was to assess whether DCE-MRI kinetic parameter maps of osteosarcoma can provide prognostic indicators of clinical outcomes using three deep convolutional neural networks (DCNNs). We found that the DCNNs can provide biomarkers for overall survival with accuracy above 0.8; that the three DCNNs perform comparably in predicting clinical outcomes; and that predictions made with a tumor mask were significantly better than those made without one.


Osteosarcoma (OS) is the most common malignant bone tumor in children. DCE-MRI is widely used in clinical studies to assess cancer treatment response and survival [1,2]. However, no reliable early biomarker has been reported for predicting response, overall survival, or event-free survival. One reason could be oversimplified quantification, such as reducing each parameter to a single mean value over the whole heterogeneous tumor. Deep convolutional neural networks (DCNNs) are machine-learning algorithms that are well suited to classifying images and predicting outcomes [3]. In this study, we deployed three DCNNs, ResNet [4], Inception [5], and a custom-built CNN, to investigate whether these networks can predict clinical outcomes from kinetic parameters in a 2D slice.


A total of 37 pediatric patients with OS treated on a phase II trial were included in this study. Protocol treatment comprised anti-angiogenic therapy (bevacizumab) and neoadjuvant combination chemotherapy. DCE-MRI data were acquired at different stages to monitor treatment before surgery. Four serial DCE-MRI examinations, at baseline, on day-2, on day 1, and on day 5, were included for DCNN training (all exams were within about 7 days of the start of treatment). All 37 patients had at least one of the four examinations. DCE-MRI data were acquired on a 1.5 T Siemens MRI scanner with 16 slices covering all or part of each tumor. Each acquisition had 50 phases with a temporal resolution of 7 seconds. DCE-MRI data were analyzed using a two-compartment pharmacokinetic model to generate four parametric maps: Ktrans, kep, ve, and vp [6]. Histologic response was assessed at week 10 after definitive surgery. Responders were defined as having ≥ 90% necrosis and nonresponders as having < 90%.
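The abstract does not give the model equation. As an illustration only, the sketch below assumes the widely used extended Tofts form, Ct(t) = vp·Cp(t) + Ktrans·∫Cp(τ)·exp(−kep·(t−τ))dτ with kep = Ktrans/ve, which is consistent with the four parameters named above but may differ from the model actually fitted. The function name, the plasma input curve, and the parameter values are all hypothetical; only the 50 phases at 7-second spacing come from the text.

```python
import math

def extended_tofts(t, cp, ktrans, ve, vp):
    """Tissue concentration Ct at each time in t (minutes).

    Assumed model (extended Tofts), not confirmed by the abstract:
    Ct(t) = vp*Cp(t) + Ktrans * integral_0^t Cp(tau)*exp(-kep*(t-tau)) dtau
    """
    kep = ktrans / ve  # efflux rate constant (1/min)
    ct = []
    for i in range(len(t)):
        # Trapezoidal approximation of the convolution integral up to t[i]
        integral = 0.0
        for j in range(1, i + 1):
            dt = t[j] - t[j - 1]
            f_prev = cp[j - 1] * math.exp(-kep * (t[i] - t[j - 1]))
            f_curr = cp[j] * math.exp(-kep * (t[i] - t[j]))
            integral += 0.5 * (f_prev + f_curr) * dt
        ct.append(vp * cp[i] + ktrans * integral)
    return ct

# 50 phases at 7 s temporal resolution, expressed in minutes
times = [k * 7.0 / 60.0 for k in range(50)]

# Hypothetical plasma input: a constant concentration, chosen only because
# it has a closed-form solution that the output can be checked against.
cp_const = [1.0 for _ in times]

# Hypothetical parameter values for a single voxel
ct_curve = extended_tofts(times, cp_const, ktrans=0.2, ve=0.4, vp=0.0)
```

With a constant input and vp = 0, the curve should approach ve·(1 − exp(−kep·t)), which gives a quick sanity check of the numerical integration. Producing parameter maps would mean fitting this forward model to each voxel's measured curve, which is not shown here.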

We built a 26-layer DCNN, called CNN26, and trained three DCNNs (CNN26, ResNet50, and InceptionV3) using Keras and TensorFlow. All DCE data were divided into training (~80%) and testing (~20%) sets for each of three classification tasks: responders vs. nonresponders, event-free survivors (EFS) vs. relapsed patients, and overall survivors vs. expired patients. In each exam, we selected 3 to 12 slices covering the central part of the tumor, depending on its size. The tumor images were further augmented by a factor of 32 using rotation and shift. The data sets are summarized in Table 1. All networks were trained for two epochs with a batch size of 100 or 150 (depending on memory) and otherwise default hyper-parameters. In addition, images multiplied by a tumor mask were used for separate training and testing. All training runs were repeated five times to test the stability of the predictions.
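The masking and augmentation steps can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the text states only that rotation and shift gave a factor of 32, so the particular combination here (four 90-degree rotations times eight small shifts) is an assumption, as are all function names and the toy image.

```python
def apply_mask(image, mask):
    """Multiply a 2D parameter map by a binary tumor mask, pixel by pixel."""
    return [[px * m for px, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]

def rotate90(image):
    """Rotate a 2D image clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]

def shift(image, dy, dx):
    """Shift a 2D image by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out

# Eight hypothetical shifts (including no shift); the real offsets are not stated.
SHIFTS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1), (1, -1)]

def augment(image, shifts=SHIFTS):
    """Return every rotation/shift combination: 4 rotations x len(shifts)."""
    variants = []
    current = image
    for _ in range(4):
        for dy, dx in shifts:
            variants.append(shift(current, dy, dx))
        current = rotate90(current)
    return variants

# Toy 3x3 "parameter map" and tumor mask, for illustration only
toy_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
toy_mask = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

masked = apply_mask(toy_map, toy_mask)
augmented = augment(masked)  # 32 variants of the masked slice
```

In practice the same transform would be applied with an array library and arbitrary rotation angles; the point of the sketch is only that each selected slice yields 32 training images and that masking zeroes everything outside the tumor before augmentation.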


We trained the three networks on each of the four kinetic parameter maps and on one combination map (3Paras: Ktrans, kep, and ve) for each of the three outcomes. Figure 1 shows that all three DCNNs can predict overall survival consistently, with an accuracy of up to 0.884 when a tumor mask is used. All networks performed poorly in predicting response and EFS, although the accuracy of response prediction reached 0.75 using InceptionV3 with 3Paras. Predictions of overall survival with and without the tumor mask are compared in Figure 2. Accuracy with a tumor mask was typically higher than without one, except for InceptionV3 with 3Paras. For predicting overall survival, ve was the optimal parameter for CNN26, while Ktrans was optimal for ResNet50 and InceptionV3. We assessed the accuracy of overall survival prediction for the three networks with their optimal kinetic parameters using receiver operating characteristic (ROC) analysis. The ROC curves and the corresponding areas under the curve (AUC) are shown in Figure 3. The accuracies were 0.845 for CNN26 with ve, 0.835 for ResNet50 with Ktrans, and 0.85 for InceptionV3 with Ktrans. The corresponding AUC values were 0.851, 0.87, and 0.893, respectively.
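For readers less familiar with the two metrics reported here, the sketch below shows how accuracy and AUC can be computed from a network's per-image output scores. The labels and scores are made up for illustration and are not study data; the 0.5 decision threshold and the function names are likewise assumptions. AUC is computed in its rank form, as the fraction of (positive, negative) pairs in which the positive case receives the higher score.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = survivor, 0 = expired) and network scores
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

toy_acc = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen threshold, while AUC summarizes performance over all thresholds, which is why the two values reported for each network need not rank the networks identically.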

Discussion / Conclusion

Our results show that all three DCNNs can predict overall survival accurately, although they perform differently for different parameter inputs. CNN26 was the most consistent, with less variation among repetitions than the other networks, which may be due to its larger batch size and simpler network structure. Although we did not attain a robust prediction of response, some results indicate that response may be predictable after fine-tuning the hyper-parameters. In conclusion, accurate prediction of overall survival can be achieved from a single 2D kinetic parameter map with three different DCNNs. Prediction of response and EFS may become more promising with fine-tuned networks.




1. Reddick WE, et al. Dynamic magnetic resonance imaging of regional contrast access as an additional prognostic factor in pediatric osteosarcoma. Cancer, 2001; 91(12):2230-2237.

2. Guo J, et al. Assessing vascular effects of adding bevacizumab to neoadjuvant chemotherapy in osteosarcoma using DCE-MRI. BJC, 2015; 113:1281-1288.

3. LeCun Y, et al. Deep learning. Nature, 2015; 521:436-444.

4. He K, et al. Deep Residual Learning for Image Recognition. 2015; arXiv:1512.03385.

5. Szegedy C, et al. Rethinking the Inception Architecture for Computer Vision. 2015; arXiv:1512.00567v3.

6. Tofts PS, et al. Estimating kinetic parameters from dynamic contrast-enhanced T1-weighted MRI of a diffusible tracer: standardized quantities and symbols. JMRI, 1999; 10(3):223-232.


Table 1. DCE-MRI data sets for the three classification tasks: response, EFS, and overall survival.

Figure 1. Accuracy of prediction of clinical results using three neural networks with a tumor mask.

Figure 2. Comparison of prediction accuracy using DCE parameters between two cases: with a tumor mask and without a mask. * represents a significant difference (p value < 0.05).

Figure 3. Three selected ROC curves for three neural networks with mask. DCE parameters ve for CNN26; Ktrans for ResNet50 and InceptionV3.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)