Deep Learning Model for Liver MRI Segmentation
Amber Michele Mittendorf1, Lawrence Ngo1, Erol Bozdogan1, Mohammad Chaudhry1, Steven Chen1, Gemini Janas1,2, Jacob Johnson3, Zhe Zhu1, Maciej Mazurowski1, and Mustafa R Bashir1,2,4

1Department of Radiology, Duke University Medical Center, Durham, NC, United States, 2Center for Advanced Magnetic Resonance Imaging, Duke University Medical Center, Durham, NC, United States, 3Department of Radiology, University of Wisconsin, Madison, WI, United States, 4Department of Medicine, Gastroenterology, Duke University Medical Center, Durham, NC, United States


Hepatic segmentation is an important but tedious clinical task used in a variety of applications. Existing techniques are relatively narrow in scope, requiring a particular type of MRI sequence or CT for accurate segmentation. We developed a convolutional neural network (CNN) capable of automated liver segmentation on single-shot fast spin echo, T1-weighted, or opposed-phase proton-density (OP-PD) weighted sequences, using separate training/validation and testing data sets. Compared with human segmenters, the CNN performed well, with mean volumetric Dice coefficients of 0.92-0.95. The CNN performed least consistently on OP-PD sequences, which had the smallest number of cases in the training/validation data set.


Hepatic segmentation is an important but tedious clinical task used for liver volumetry, quantitative analysis, liver lesion colocalization, and planning for surgical intervention and radiation therapy. Manual and automated segmentation methods have traditionally been used, predominantly for CT examinations. More recently, deep convolutional neural network (CNN) models have been applied to multi-organ segmentation of MR images. However, both traditional and CNN-based methods have typically been restricted to a single sequence/image type. In research involving liver MRI, data heterogeneity arises from a variety of factors, including diverse sequence types, scanner models and manufacturers, and patient populations, as well as motion corruption, misregistration, and missing data. A robust liver segmentation algorithm could be a cornerstone in developing more advanced algorithms for the analysis of these liver MRI studies.

Our purpose is to train and validate a convolutional neural network that performs accurate liver segmentation independent of MRI sequence, thereby improving applicability to varying protocols and sequences.


MRI examinations were collected retrospectively, and manual liver contouring was performed as the reference standard by one of two image analysts, each with at least two years of experience in liver segmentation. The training and validation data set included 103 single-shot fast spin echo (SSFSE), 62 precontrast non-fat-suppressed T1-weighted (T1w), and 39 opposed-phase proton density (OP-PD) MRI volumes from 104 unique subjects. Each volume consisted of 10-30 images including the liver, depending on the size of the liver and the slice thickness (5-10 mm). An independent test set of 30 SSFSE, 18 T1w, and 8 OP-PD volumes was then used to compare algorithm-derived and human-derived segmentation contours.

The deep learning model was a 2D encoder-decoder convolutional neural network inspired by the U-Net and Inception network architectures [3,4] (Figure 1). Segmentation accuracy was evaluated using Dice coefficients calculated for each 3D volume.
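As an illustration of the evaluation metric, the following is a minimal sketch of a volumetric Dice calculation, assuming that the per-slice binary masks from a 2D network are stacked into one 3D array before comparison with the reference contour. The function names and the handling of two empty masks are assumptions made for this sketch and are not taken from the study's code.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total


def volumetric_dice(pred_slices, ref_slices):
    """Stack per-slice 2D masks into 3D volumes and compute a single Dice value."""
    pred_vol = np.stack(pred_slices, axis=0)
    ref_vol = np.stack(ref_slices, axis=0)
    return dice_coefficient(pred_vol, ref_vol)


if __name__ == "__main__":
    # Toy volume: 2 slices of 4x4 pixels, reference liver is a 2x2 block per slice.
    ref = np.zeros((2, 4, 4), dtype=bool)
    ref[:, 1:3, 1:3] = True
    pred = ref.copy()
    pred[0, 1, 1] = False  # the prediction misses one voxel
    print(volumetric_dice(list(pred), list(ref)))
```

Computing Dice over the whole volume weights every voxel equally, so it can differ from the average of per-slice Dice values, where small slices near the liver margins count as much as large central ones.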


Dice coefficients were: for SSFSE, mean 0.95 ± 0.01 (range 0.92-0.97); for T1w, mean 0.94 ± 0.02 (range 0.91-0.96); for OP-PD, mean 0.92 ± 0.05 (range 0.82-0.96). Representative images are shown in Figure 2. Overall, the regions segmented by the human readers and the algorithm agreed closely across all sequences. Errors occurred most commonly at the tip of the left hepatic lobe. The lowest and least consistent performance was observed for the OP-PD sequences.
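The per-sequence summaries above follow a "mean ± spread (range)" pattern. A minimal sketch of producing such a summary from a list of per-volume Dice values is shown below. It assumes the ± value is a sample standard deviation, which the text does not state, and the input values are made-up placeholders, not study data.

```python
import statistics


def summarize_dice(values):
    """Format per-volume Dice values as 'mean ± SD (range min-max)'."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return f"{mean:.2f} ± {sd:.2f} (range {min(values):.2f}-{max(values):.2f})"


if __name__ == "__main__":
    # Placeholder values for illustration only; not results from the study.
    example = [0.90, 0.94, 0.98]
    print(summarize_dice(example))
```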


This CNN-based method demonstrates accurate liver segmentation, with mean Dice coefficients ranging from 0.92 to 0.95 across sequence types. The algorithm performed best on SSFSE sequences, possibly in part because this sequence type had the largest number of manually segmented examinations in the training data set. The lowest and least consistent performance was observed for the OP-PD sequence, which had the smallest number of cases in the training set. Our mean accuracy is comparable to that of previously described deep learning-based MRI segmentation methods, but our method works across multiple sequence types and therefore has broader applicability, as prior studies have predominantly addressed a single sequence.


We have developed a CNN-based method for fully automated liver segmentation that performs accurately on a variety of pulse sequence types. Performance varied across pulse sequence types. The algorithm's performance could potentially be improved, and its applicability further extended, by incorporating a larger training set, developing a 3D-based rather than slice-by-slice algorithm, and training on other image types, such as fat-suppressed images, contrast-enhanced images, and CT images.
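To make the slice-by-slice approach concrete, the sketch below applies a 2D predictor to each slice of a volume independently and stacks the results into a 3D mask. `segment_volume` and `threshold_predictor` are hypothetical names introduced for this sketch. The threshold function is only a stand-in for a trained network and is not the model described here.

```python
import numpy as np


def segment_volume(volume, predict_slice):
    """Apply a 2D predictor to each slice and stack the masks into a 3D array.

    volume: array shaped (slices, rows, cols).
    predict_slice: callable mapping one 2D image to a 2D binary mask.
    """
    volume = np.asarray(volume)
    if volume.ndim != 3:
        raise ValueError("expected a 3D array shaped (slices, rows, cols)")
    # Each slice is segmented on its own; no context from neighboring slices is used.
    masks = [np.asarray(predict_slice(image_2d), dtype=bool) for image_2d in volume]
    return np.stack(masks, axis=0)


def threshold_predictor(image_2d):
    """Hypothetical placeholder for a trained 2D CNN: a simple intensity threshold."""
    return image_2d > 0.5


if __name__ == "__main__":
    toy_volume = np.zeros((3, 2, 2))
    toy_volume[1] = 1.0  # only the middle slice contains "liver"
    mask = segment_volume(toy_volume, threshold_predictor)
    print(mask.shape, int(mask.sum()))
```

Because each slice is processed independently, a 2D model cannot use through-plane context, which is one reason a 3D-based algorithm might improve consistency between adjacent slices.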




1. Fu Y, Mazur TR, Wu X, Liu S, Chang X, Lu Y, et al. A novel MRI segmentation method using CNN-based correction network for MRI-guided adaptive radiotherapy. 0(0). doi: 10.1002/mp.13221.

2. Bobo MF, Bao S, Huo Y, Yao Y, Virostko J, Plassard AJ, et al. Fully convolutional neural networks improve abdominal organ segmentation. Proceedings of SPIE--the International Society for Optical Engineering. 2018;10574. Epub 2018/06/12. doi: 10.1117/12.2293751. PubMed PMID: 29887665; PubMed Central PMCID: PMC5992909.

3. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham; 2015. p. 234-241.

4. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016. p. 2818-2826.


Figure 1. Overall encoder-decoder architecture of the 2D fully convolutional network, inspired by U-Net and containing Inception modules.

Figure 2. Segmentation overlays on sample SSFSE (A), OP-PD (B), and T1w (C) images. Green represents the overlap of the automated and human analyst segmentations. Yellow represents areas segmented by the human analyst that were not included in the algorithm’s segmentation, while light blue represents areas segmented by the algorithm that were not included in the human analyst’s segmentation. Dice coefficients for these individual images were 0.97 (A), 0.97 (B), and 0.98 (C), respectively. 3D Dice coefficients for the associated volumes were 0.96 (A), 0.95 (B), and 0.96 (C), respectively.
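The color coding used in these overlays can be reproduced with a few boolean operations. The sketch below is an illustration under assumptions: the function name and the exact RGB values are chosen for this example and are not taken from the figure-generation code.

```python
import numpy as np

# Assumed RGB values for the three agreement classes.
GREEN = (0, 255, 0)           # segmented by both algorithm and human analyst
YELLOW = (255, 255, 0)        # segmented by the human analyst only
LIGHT_BLUE = (173, 216, 230)  # segmented by the algorithm only


def agreement_overlay(auto_mask, human_mask):
    """Return an RGB image color-coding agreement between two binary masks."""
    auto = np.asarray(auto_mask, dtype=bool)
    human = np.asarray(human_mask, dtype=bool)
    if auto.shape != human.shape:
        raise ValueError("masks must have the same shape")
    rgb = np.zeros(auto.shape + (3,), dtype=np.uint8)  # background stays black
    rgb[auto & human] = GREEN
    rgb[human & ~auto] = YELLOW
    rgb[auto & ~human] = LIGHT_BLUE
    return rgb


if __name__ == "__main__":
    auto = np.array([[1, 1, 0, 0]])
    human = np.array([[1, 0, 1, 0]])
    print(agreement_overlay(auto, human)[0])
```

In practice the colored image would be alpha-blended over the grayscale MR slice; that step is omitted here to keep the sketch short.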

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)