Proton density fat fraction results derived from deep learning auto-segmentation correlate strongly with results obtained by manual analysis
Ashley L. Louie1, Kang Wang1, Timoteo Delgado1, Michael S. Middleton1, Gavin Hamilton1, Tanya Wolfson2, Robert P. Myers3, C. Stephen Djedjos3, Rohit Loomba4, and Claude B. Sirlin1

1Liver Imaging Group, Department of Radiology, University of California San Diego, La Jolla, CA, United States, 2Computational and Applied Statistics Laboratory (CASL), SDSC, University of California San Diego, La Jolla, CA, United States, 3Gilead Sciences, Inc., Foster City, CA, United States, 4NAFLD Research Center, Division of Gastroenterology, University of California San Diego, La Jolla, CA, United States


A widely-accepted method to estimate hepatic proton-density fat fraction (PDFF) is to average values derived from manually drawn regions-of-interest (ROIs) in the nine Couinaud segments. An automated deep-learning-based segmentation tool has been developed to potentially replace this labor-intensive and technically-challenging method. The purpose of this study was to compare whole-liver PDFF values obtained using this auto-segmentation tool to results obtained using manual analysis for a longitudinal multi-center clinical trial of 72 patients with nonalcoholic steatohepatitis. We found that PDFF values estimated using the auto-segmentation tool were in close agreement with values derived by manually drawing ROIs.


MRI assessment of proton density fat fraction (PDFF), a quantitative imaging biomarker of hepatic steatosis, is noninvasive, standardized, and allows for whole liver coverage1. MRI-PDFF is currently estimated by averaging values from regions-of-interest (ROIs) drawn in the nine Couinaud liver segments2,3,4. Because this method is labor-intensive and time-consuming, an automated, deep-learning-based segmentation tool was developed to estimate MRI-PDFF more rapidly and efficiently5. The purpose of this study was to assess the correlation and agreement between MRI-PDFF estimated using this automated method and MRI-PDFF estimated using manual analysis in a multi-center drug development clinical trial.


Study design

We performed a secondary analysis of images obtained in a parent drug-development clinical trial (NCT02466516) conducted at 23 sites using one of four MR scanner types (GE Healthcare 1.5T and 3T, Siemens 1.5T and 3T). Each subject was scanned at three time points. The parent study required that enrolled subjects have biopsy-confirmed nonalcoholic steatohepatitis6.

MRI protocol

Abdominal 6-echo 2D spoiled-gradient-echo chemical-shift-encoded MRIs were acquired (Table 1) with a low flip angle to avoid T1-bias and six echoes to correct for R2* decay1,2,4.

Manual image analysis

One-cm radius ROIs were manually placed by a trained data analyst in each of the nine Couinaud liver segments on fifth-echo source images, avoiding major vessels and bile ducts, liver edges, the gallbladder, and artifacts3. ROIs were then propagated onto the five remaining source echoes, and the mean ROI signal intensities were exported and processed using a custom MATLAB script that accounted for spectral complexity of fat to estimate segmental PDFF values, which were then averaged to obtain a reference whole-liver PDFF value2,4,7.

Automated image analysis

A parametric PDFF map was created directly from source images for each MR exam using an Osirix plug-in8. A 2D U-Net convolutional neural network (CNN) implemented with the Python library Keras5 was used to create a whole-liver volume mask from the source images for each MR exam, and to exclude pixels within 1 cm of liver edges. The mask was applied to the parametric PDFF map to extract PDFF values from voxels inside the liver (Figure 1). The assumption was made that there are two normally-distributed populations of per-voxel PDFF values within the liver, one from liver parenchyma, and one from intrahepatic vessels5. A two-component Gaussian-mixture model (2-GMM) was used to distinguish the distribution of values attributable to major intrahepatic vessels from the distribution of values attributable to liver parenchyma. The values attributable to intrahepatic vessels were removed, and only PDFF values attributable to liver parenchyma were averaged to obtain a whole-liver PDFF value.
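The vessel-exclusion step described above can be illustrated with a minimal sketch. This is not the study's implementation: it fits a two-component one-dimensional Gaussian mixture with a plain expectation-maximization loop in standard-library Python, and it assumes that the component with the larger mixing weight represents liver parenchyma (the abstract does not state how the two components were told apart). The function names and the synthetic voxel values are hypothetical and serve only as an illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def responsibilities(values, weights, means, sds):
    """E-step: posterior probability of each component for every value."""
    resp = []
    for x in values:
        p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
        total = sum(p) or 1e-300  # guard against numerical underflow
        resp.append([p[0] / total, p[1] / total])
    return resp


def fit_two_gaussians(values, n_iter=100, min_sd=1e-3):
    """Fit a two-component 1D Gaussian mixture by EM.

    Initialization (an assumption of this sketch): means at the data
    minimum and maximum, a shared SD, and equal weights.
    """
    means = [min(values), max(values)]
    shared_sd = max(statistics.pstdev(values), min_sd)
    sds = [shared_sd, shared_sd]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        resp = responsibilities(values, weights, means, sds)
        # M-step: update weight, mean, and SD of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component is empty; keep its old parameters
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    return weights, means, sds


def parenchyma_mean_pdff(masked_pdff):
    """Average PDFF over voxels assigned to the assumed parenchyma component.

    ASSUMPTION: the component with the larger mixing weight is parenchyma.
    Returns (mean PDFF of kept voxels, number of kept voxels).
    """
    weights, means, sds = fit_two_gaussians(masked_pdff)
    resp = responsibilities(masked_pdff, weights, means, sds)
    liver = 0 if weights[0] >= weights[1] else 1
    kept = [x for x, r in zip(masked_pdff, resp) if r[liver] >= 0.5]
    return statistics.fmean(kept), len(kept)


def synthetic_voxels(seed=0):
    """Hypothetical masked PDFF values (%): 900 parenchyma, 100 vessel voxels."""
    rng = random.Random(seed)
    parenchyma = [rng.gauss(15.0, 1.5) for _ in range(900)]
    vessels = [rng.gauss(2.0, 1.0) for _ in range(100)]
    return parenchyma + vessels


if __name__ == "__main__":
    mean_pdff, n_kept = parenchyma_mean_pdff(synthetic_voxels())
    print(f"whole-liver PDFF: {mean_pdff:.2f}% from {n_kept} voxels")
```

In this sketch each voxel is assigned to the component with the higher posterior probability. A real pipeline might use a different threshold or a library mixture model, and the selection rule for the parenchyma component would need to be checked against the method in reference 5.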

Statistical analysis

Data from all three visits of every patient were included. Bland-Altman analysis was performed to estimate bias and 95% limits of agreement (LOA). An intra-class correlation coefficient (ICC) and a coefficient of variation for paired data (CV) were computed to assess agreement between PDFF values obtained using the two methods. Significance of the Bland-Altman bias was assessed using a mixed-effects regression extension of a paired t-test to adjust for within-patient dependence.
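As an illustration of the Bland-Altman quantities named above, the following sketch computes the bias as the mean of the paired differences (auto-segmentation minus manual) and the 95% limits of agreement as the bias plus or minus 1.96 times the sample standard deviation of those differences. This is the conventional formulation and is assumed here; the abstract does not give the exact computation. The sketch ignores within-patient dependence, which the study handled separately with a mixed-effects model, and the function name and paired values are hypothetical.

```python
import statistics


def bland_altman(auto, manual):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    bias = mean(auto - manual)
    LOA  = bias -/+ 1.96 * sample SD of the differences (conventional form)
    """
    if len(auto) != len(manual):
        raise ValueError("auto and manual must have the same length")
    if len(auto) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    half_width = 1.96 * sd
    return bias, bias - half_width, bias + half_width


if __name__ == "__main__":
    # Hypothetical whole-liver PDFF values (%), for illustration only.
    auto_pdff = [5.2, 10.1, 15.3, 20.0, 8.4]
    manual_pdff = [5.0, 10.3, 15.0, 19.8, 8.5]
    bias, lower, upper = bland_altman(auto_pdff, manual_pdff)
    print(f"bias = {bias:.2f}%, 95% LOA = ({lower:.2f}%, {upper:.2f}%)")
```

Because the limits are symmetric about the bias in this formulation, half the distance between the two limits divided by 1.96 recovers the standard deviation of the differences.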


Two hundred and one MR exams (72 adults, ages 18 to 72 yrs) were included in this analysis. PDFF values obtained from auto-segmentation agreed closely with those from manual analysis (ICC=0.996; CV=2.7%). Bland-Altman analysis showed that the mean bias (auto-segmentation PDFF minus manual analysis PDFF) was 0.10% (p=0.058; 95% limits of agreement: -1.09%, 1.29%) (Figure 2).


In this multi-center study we found strong correlation between hepatic PDFF values obtained by CNN-based auto-segmentation and hepatic PDFF values obtained by manual analysis. These results support the near-equivalence of the tested CNN-based auto-segmentation and manual methods to estimate hepatic PDFF. Our findings build on the results from a prior single-center study5 by showing strong performance of the automated method in a multi-center, multi-scanner setting.


This analysis suggests that CNN-based auto-segmentation may replace the labor-intensive standard manual analysis method for estimating hepatic PDFF, and that it is robust across imaging centers, scanner platforms, and field strengths.


The parent study for this analysis was funded by Gilead Sciences, Inc.


  1. Bydder et al, Magn Reson Imaging 2008;26:347
  2. Yokoo et al, Radiology 2011;258:749
  3. Hong et al, JMRI 2018;47:988
  4. Heba et al, JMRI 2015;43:398
  5. Wang et al, ISMRM Machine Learning Workshop, Mar 2018
  6. Loomba et al, Hepatology 2018;67:549
  7. Hamilton et al, NMR Biomed 2010;24:784
  8. Manning et al, Abdom Radiol 2017;42:833


Table 1: MRI parameters for 6-echo 2D spoiled-gradient-echo CSE-MRI sequence

Figure 1: An overview of the auto-segmentation pipeline. First, the source image (a) is collected from the scanner and then processed through an Osirix plug-in to generate a parametric PDFF map (b). Second, the CNN creates a whole liver volume mask (c). The 1-cm border (illustrated by the unmasked space between the purple outline) is then removed (d). This eroded mask is overlaid onto the parametric PDFF map, as shown in (e).

Figure 2: Bland-Altman plot showing agreement of PDFF results between auto-segmentation and manual analysis. Bias is trend-level significant, but small and not clinically relevant. Agreement is high.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)