A 3D Fully Convolutional Network with various input dimension for brain extraction in MRI
Xibo Zhang1, Zhe Liu2, Pascal Spincemaille2, and Yi Wang2

1Tsinghua University, Beijing, China, 2Cornell University, New York, NY, United States


We propose a 3D fully convolutional network that uses a cascade architecture and combines two different channels to overcome the low accuracy of traditional methods. The network is applied to brain extraction.

Target Audience

Researchers and clinicians interested in MRI image processing.


Brain extraction, also known as skull stripping, is a crucial step in magnetic resonance imaging (MRI) processing, especially for functional MRI, and is the first step of many MRI processing pipelines. Traditional topology-based brain extraction methods (1) were developed on specific datasets, and their accuracy may be unsatisfactory for general use. Machine-learning-based extraction methods using a convolutional neural network (CNN) with a single input size have recently been proposed (4). In this work, we explore combining multiple input sizes in a CNN to improve the accuracy of brain extraction.


Network: A 3D CNN was constructed as shown in Figure 1. We divided the original 3D MRI volume into patches of two different sizes in order to use both local and global information when determining the brain boundary: a small patch, as used in previous studies (6), focused on detailed information within a localized neighborhood and enabled a deeper network, while a larger patch provided more global information. Intermediate feature maps were connected to the later classification layer via shortcut connections. The network produced patches of size , which were then compiled to recover the full volume.
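The two-scale patch idea can be illustrated with a minimal NumPy sketch. The patch sizes (8 and 16 voxels), the zero padding at the volume border, and the function name `extract_dual_patches` are assumptions made for illustration only; the patch sizes used by the network are not recoverable from the text here.

```python
import numpy as np

# Hypothetical patch edge lengths; the actual sizes are not given in the text.
SMALL = 8
LARGE = 16

def extract_dual_patches(volume, center, small=SMALL, large=LARGE):
    """Return a small and a large cubic patch centred on the same voxel.

    The volume is zero-padded (an assumption) so that patches taken near
    the border of the volume keep their full size.
    """
    pad = large // 2
    padded = np.pad(volume, pad, mode="constant")
    # Shift the centre into the coordinate frame of the padded volume.
    cz, cy, cx = (c + pad for c in center)

    def crop(size):
        h = size // 2
        return padded[cz - h:cz - h + size,
                      cy - h:cy - h + size,
                      cx - h:cx - h + size]

    return crop(small), crop(large)

# Toy volume standing in for an MRI scan.
volume = np.arange(20 * 20 * 20, dtype=np.float32).reshape(20, 20, 20)
small_patch, large_patch = extract_dual_patches(volume, center=(0, 10, 19))
```

In this sketch the small patch is exactly the central region of the large one, so the two branches of a network would see the same location at two different extents of context.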

Data acquisition and processing: We obtained 18 T1w scans of healthy subjects (voxel size 0.94x1.5x0.94 mm3) from the Internet Brain Segmentation Repository (IBSR) dataset and 10 T1w scans of patients from the Weill Cornell clinical dataset.

Training: For the IBSR dataset, 16 of the 18 subjects were used for training, giving a total of 32000 patches, a randomly selected subset of which served as the validation set. For the clinical dataset, 9 of the 10 subjects were used for training. We trained the network with the Dice loss function as in (6) and the Adam optimizer (3) (learning rate 0.001, 30 epochs).
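For readers unfamiliar with the Dice loss, the following is a minimal NumPy sketch of one common "soft" formulation. It is not necessarily the exact form used in (6) or in this work; the smoothing constant `eps` and the function name `soft_dice_loss` are assumptions.

```python
import numpy as np

def soft_dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - 2*sum(p*t) / (sum(p) + sum(t)).

    probs  : predicted brain probabilities in [0, 1]
    target : binary ground-truth mask of the same shape
    eps    : small constant (an assumption) that avoids division by zero
    """
    probs = np.asarray(probs, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = np.sum(probs * target)
    denom = np.sum(probs) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)

target = np.array([1, 1, 0, 0])
loss_perfect = soft_dice_loss([1, 1, 0, 0], target)   # full overlap
loss_disjoint = soft_dice_loss([0, 0, 1, 1], target)  # no overlap
loss_partial = soft_dice_loss([1, 0, 0, 0], target)   # half of the brain found
```

The loss approaches 0 for a perfect prediction and 1 for a prediction with no overlap, which makes it less sensitive than voxel-wise cross-entropy to the imbalance between brain and background voxels.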

Post-processing and analysis: The network output gave a precise brain border. Because the goal is to keep the voxels inside this border and discard those outside it, we located a seed at the brain center and filled the whole brain from it. The result was then smoothed. Dice score, sensitivity and specificity were used to evaluate network performance on the hold-out data of each dataset.
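One possible reading of the seed-and-fill step is sketched below: pick a seed near the centre of the predicted mask, then grow a region from it through connected brain voxels so that disconnected false positives are discarded. The seed rule (foreground voxel nearest the mask centroid), the 6-neighbourhood connectivity, and the function names are all assumptions, and the smoothing step is omitted.

```python
from collections import deque
import numpy as np

def center_seed(mask):
    """Hypothetical seed rule: the foreground voxel closest to the mask centroid."""
    coords = np.argwhere(mask)
    if coords.size == 0:
        raise ValueError("mask contains no foreground voxels")
    centroid = coords.mean(axis=0)
    nearest = coords[np.argmin(np.sum((coords - centroid) ** 2, axis=1))]
    return tuple(int(c) for c in nearest)

def fill_from_seed(mask, seed):
    """Keep only the voxels of `mask` that are 6-connected to `seed`."""
    mask = np.asarray(mask, dtype=bool)
    kept = np.zeros_like(mask)
    if not mask[seed]:
        return kept
    kept[seed] = True
    queue = deque([seed])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            inside = all(0 <= n[i] < mask.shape[i] for i in range(3))
            if inside and mask[n] and not kept[n]:
                kept[n] = True
                queue.append(n)
    return kept

# Toy prediction: a 4x4x4 "brain" plus one stray false-positive voxel.
pred = np.zeros((7, 7, 7), dtype=bool)
pred[1:5, 1:5, 1:5] = True
pred[6, 6, 6] = True
seed = center_seed(pred)
cleaned = fill_from_seed(pred, seed)
```

In the toy example the stray voxel is removed while the connected block is kept, which is consistent with a post-processing step that raises specificity.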

$$$Dice = \frac{2|P \cap R|}{|P|+|R|} = \frac{2TP}{2TP+FP+FN}$$$

$$$Sensitivity= \frac{TP}{TP+FN}$$$

$$$Specificity= \frac{TN}{TN+FP}$$$

Here TP, TN, FP and FN denote the numbers of true positive, true negative, false positive and false negative voxels, and P and R denote the predicted and reference brain masks. The proposed network was compared with several alternative methods for brain extraction (2), as shown in Table 1.
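The three metrics can be computed directly from a pair of binary masks. The sketch below follows the formulas above; the function name `segmentation_metrics` and the toy masks are illustrative assumptions, and the code assumes both masks contain at least one brain and one non-brain voxel so that no denominator is zero.

```python
import numpy as np

def segmentation_metrics(pred, ref):
    """Dice, sensitivity and specificity from two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = int(np.sum(pred & ref))    # brain voxels correctly labelled brain
    tn = int(np.sum(~pred & ~ref))  # non-brain voxels correctly rejected
    fp = int(np.sum(pred & ~ref))   # non-brain voxels labelled brain
    fn = int(np.sum(~pred & ref))   # brain voxels that were missed
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy 1D "masks": TP = 2, FP = 1, FN = 1, TN = 4.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
ref = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, ref)
```

For these toy masks the Dice score is 4/6, the sensitivity is 2/3 and the specificity is 4/5.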


The quantitative comparison on the IBSR dataset is shown in Table 1. The proposed network achieved the highest Dice score and specificity, and the second-highest sensitivity (97.6%), behind the 99.1% given by HWA (1). Figure 2 compares the ground truth with the proposed brain extraction on a representative test subject. On the clinical dataset, the proposed network outperformed BET in quantitative metrics, as shown in Table 2; a qualitative comparison of brain extraction is shown in Figure 3. Table 2 also indicates that the post-processing improved the Dice score and specificity.


We propose a novel 3D fully convolutional network which combines two different input sizes for brain extraction.


We thank everyone in the Wang lab for inspiring discussions and selfless assistance.


1. Smith, Stephen M. "Fast robust automated brain extraction." Human brain mapping 17.3 (2002): 143-155.

2. Palanisamy, Kalavathi & Prasath, Surya. Methods on Skull Stripping of MRI Head Scan Images—a Review. Journal of Digital Imaging.

3. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

4. Kleesiek J, Urban G, Hubert A, et al. Deep MRI brain extraction: a 3D convolutional neural network for skull stripping. NeuroImage, 2016, 129: 460-469.

5. Zhao G, Liu F, Oler J, Meyerand ME, Kalin NH, Birn RM. Bayesian convolutional neural network based MRI brain extraction on nonhuman primates. NeuroImage, 2018.

6. Dolz, J., Desrosiers, C., Ben Ayed, I., 2017. 3D fully convolutional networks for subcortical segmentation in MRI: a large-scale study. NeuroImage 2017.


Figure 1. Network architecture. Kernel size and feature number are denoted at each convolutional layer. Input patch of size and is provided to a separate branch of the network, respectively. One max pooling layer is added in the branch for patch . The network output is a voxel-wise brain/non-brain classification.

Table 1. Dice score, sensitivity and specificity on the IBSR dataset for different methods. The best result in each category is highlighted in bold.

Figure 2. Comparison of the ground truth (red) and our brain extraction result (blue) on a test case, displayed in axial, sagittal and coronal views.

Table 2. Dice score, sensitivity and specificity on the clinical dataset for BET and for the proposed network with and without post-processing. Post-processing improved the Dice score and specificity.

Figure 3. Comparison of the BET method (yellow), proposed network (blue) and the ground truth (red).

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)