Automated intervertebral disc segmentation using a two-pathway network
Fei Gao1, Shui Liu2, Xiaodong Zhang2, Jue Zhang1,3, and Xiaoying Wang2,3

1College of Engineering, Peking University, Beijing, China, 2Department of Radiology, Peking University First Hospital, Beijing, China, 3Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China


We developed a two-pathway fully convolutional network for refined intervertebral disc segmentation. The proposed pooling-free sub-branch captures more local fine-grained features. The quantitative results indicate that it outperforms the single-pathway variants for disc segmentation.


Lumbar disc degeneration (LDD) is the leading cause of lower back pain and carries a risk of disability worldwide. Clinically, grading LDD is a necessary step in making a suitable treatment plan, so accurate intervertebral disc (IVD) segmentation is an essential step for automated disc disease analysis. In routine clinical settings, MRI is the preferred imaging method for diagnosing IVD disease because it visualizes the discs better than other modalities1. Fully convolutional networks (FCNs) are commonly used for object segmentation in medical images and perform well. Among these models, U-net is a well-known architecture valued for its conciseness and effectiveness. However, it focuses more on abstract features and therefore lacks the local fine-grained features needed to segment small objects such as IVDs. To achieve accurate IVD segmentation, we constructed a two-pathway fully convolutional network, termed TwoPathFCN, with an additional pooling-free sub-branch that captures more local spatial features.


In this retrospective study, the data were collected from routine clinical practice and consist of T2-weighted MRI scans of 208 patients with various lumbar diseases, such as degeneration, herniation and scoliosis, acquired on different scanners. The dataset was randomly divided into training (70%), validation (10%) and test (20%) sets.
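
A random 70/10/20 split of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name `split_patients`, the fixed seed, the rounding rule and the choice to split at the patient level are all assumptions, since the abstract does not report them.

```python
import random


def split_patients(patient_ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition patient IDs into train/validation/test lists.

    Hypothetical helper: the seed and the rounding of subset sizes are
    assumptions made for this sketch only.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train, val, test = split_patients(range(208))
print(len(train), len(val), len(test))
```

Splitting by patient rather than by slice keeps all images of one subject in the same subset, which avoids leakage between training and test data.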

The encoder-decoder is a widely used architecture in semantic segmentation that balances local and global information. However, local information remains limited even when encoder features are concatenated to the decoder path, as in U-net2, mainly because of the pooling layers. Therefore, building on U-net, we constructed an additional computational path, termed the “local pathway”, that captures more local information using a pure CNN without pooling layers. We constructed TwoPathFCN by integrating the local pathway with the backbone pathway. The full architecture and its details are illustrated in Fig.1. In our implementation, the local pathway is a fully convolutional architecture built from the dense blocks of DenseNet3. Rather than adding more down-sampling layers (max-pooling or strided convolution) at the cost of losing low-level spatial information, we use dilated convolutions4 to enlarge the receptive field of the local pathway. The backbone pathway is a typical U-net architecture2 with 19 convolutional layers, four pooling layers, four upsampling layers and four skip concatenations. To combine the two pathways, the feature maps of the last layer of each pathway are concatenated and then fed to the output layer.
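
The trade-off between pooling and dilation can be made concrete with a small receptive-field calculation. The sketch below is illustrative only: the layer stacks (3x3 kernels, dilation rates 1, 2, 4, 8, 2x2 pooling) are hypothetical and are not the settings of TwoPathFCN, which are not given in the text.

```python
def receptive_field(layers):
    """Return (receptive field, cumulative stride) of a layer stack.

    layers: list of (kernel_size, dilation, stride) tuples, applied in order.
    A cumulative stride of 1 means the output keeps the input resolution.
    """
    rf, jump = 1, 1
    for kernel, dilation, stride in layers:
        rf += (kernel - 1) * dilation * jump  # growth in input pixels
        jump *= stride                        # spacing of output samples
    return rf, jump


# Hypothetical stacks, chosen only to illustrate the effect.
plain = [(3, 1, 1)] * 4                                   # four 3x3 convs
dilated = [(3, 1, 1), (3, 2, 1), (3, 4, 1), (3, 8, 1)]    # dilations 1,2,4,8
pooled = [(3, 1, 1), (2, 1, 2), (3, 1, 1), (2, 1, 2)]     # conv, pool, conv, pool

for name, stack in [("plain", plain), ("dilated", dilated), ("pooled", pooled)]:
    print(name, receptive_field(stack))
```

Under these assumed settings, the dilated stack sees a 31-pixel context at full resolution, whereas the plain stack sees 9 pixels and the pooled stack reduces the resolution by a factor of 4.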

Considering the imbalance between foreground and background pixels, we integrated a weighted pixel-wise cross-entropy loss into the proposed TwoPathFCN. Specifically, $$$w_{foreground}=0.9$$$ and $$$w_{background}=0.1$$$ are adopted during training to bias the model toward the foreground pixels.
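
A minimal sketch of such a weighted pixel-wise cross entropy is given below for a binary foreground/background mask. Only the two weights come from the text; the function name `weighted_cross_entropy`, the clipping constant `eps` and the normalization by the number of pixels are assumptions of this sketch.

```python
import math


def weighted_cross_entropy(probs, labels, w_fg=0.9, w_bg=0.1, eps=1e-7):
    """Mean weighted cross entropy over a flat list of pixels.

    probs:  predicted foreground probabilities in [0, 1]
    labels: ground-truth labels, 1 for foreground (disc) and 0 for background
    Averaging over pixels is an assumption; the text does not state it.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        if y == 1:
            total += -w_fg * math.log(p)
        else:
            total += -w_bg * math.log(1.0 - p)
    return total / len(labels)


# A missed disc pixel is penalized more than an equally confident false alarm.
missed_disc = weighted_cross_entropy([0.2], [1])
false_alarm = weighted_cross_entropy([0.8], [0])
print(missed_disc / false_alarm)
```

With these weights, an error on a foreground pixel costs nine times as much as an equally confident error on a background pixel.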


For the segmentation, sensitivity (SE), specificity (SP), accuracy (AC), Intersection over Union (IoU) and Dice coefficient (DI) were selected as evaluation metrics, each computed pixel-wise.
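
These five metrics follow from the pixel-wise counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). The sketch below uses the standard definitions on flattened binary masks; the function name `segmentation_metrics` and the toy masks are hypothetical, and how scores were aggregated across images is not stated in the text.

```python
def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def segmentation_metrics(pred, truth):
    """Pixel-wise SE, SP, AC, IoU and DI for flat binary masks (1 = disc)."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return {
        "SE": _ratio(tp, tp + fn),                 # sensitivity (recall)
        "SP": _ratio(tn, tn + fp),                 # specificity
        "AC": _ratio(tp + tn, tp + fp + tn + fn),  # accuracy
        "IoU": _ratio(tp, tp + fp + fn),           # intersection over union
        "DI": _ratio(2 * tp, 2 * tp + fp + fn),    # Dice coefficient
    }


# Toy masks: TP = 2, FP = 1, FN = 1, TN = 4.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
print(segmentation_metrics(pred, truth))
```

Because background pixels dominate, SP and AC tend to be high for any reasonable model, so IoU and DI are usually the more discriminative scores for small structures such as discs.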

For comparison, we trained three models on the same dataset: U-net (BackbonePathFCN), the pure local pathway (LocalPathFCN) and TwoPathFCN. The quantitative results of these variants are listed in Table I. As expected, TwoPathFCN ranks first, with the highest scores on almost all metrics. The segmentation results on four typical subjects from our test set, produced by the different variant architectures, are illustrated in Fig.2. The yellow arrows indicate that error-prone IVDs can still be effectively segmented by the proposed TwoPathFCN method. By harnessing two pathways, the global high-level features and the local fine-grained features can be captured simultaneously. As a result, the dual-pathway architecture produces better segmentation results than either single-pathway architecture.


In conclusion, the proposed two-pathway framework proved effective for IVD segmentation and provides a valuable tool for automated IVD disease analysis.


No acknowledgement found.


[1] C. W. A. Pfirrmann, A. Metzdorf, M. Zanetti, J. Hodler, and N. Boos, "Magnetic resonance classification of lumbar intervertebral disc degeneration," Spine, vol. 26, no. 17, pp. 1873-1878, Sep 1 2001.

[2] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015, pp. 234-241: Springer.

[3] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700-4708.

[4] F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv preprint arXiv:1511.07122, 2015.


Fig.1 Architecture of the proposed two-pathway FCN (TwoPathFCN).

Fig.2 Examples of segmentation results from test set achieved by four different methods, with each row denoting a specific subject. The first column and the last column denote, respectively, the original images and the ground truth annotated by experts. The yellow arrows indicate that error-prone IVDs can still be effectively segmented by the proposed TwoPathFCN method.

Table I. The segmentation performance, in terms of Dice, IoU, AC, SE, and SP, obtained by different methods on the test set.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)