Cascaded Deep Learning Networks for Automated Image Quality Evaluation of Structural Brain MRI

1Department of Diagnostic and Interventional Imaging, University of Texas Health Science Center, Houston, TX, United States


Visual quality assessment of MRI is subjective and impractical for large datasets. In this study, we present a cascaded convolutional neural network (CNN) model for automated image quality evaluation of structural brain MRI. The multisite Autism Brain Imaging Data Exchange dataset of ~1000 subjects was used to train and evaluate the proposed model. The model performance was compared with expert evaluation. The first network rated individual slices, and the second network combined the slice ratings into a final image score. The network achieved 74% accuracy, 69% sensitivity, and 74% specificity, demonstrating that deep learning can provide robust image quality evaluation.


Magnetic resonance imaging (MRI) is used to study a wide variety of neurological disorders. Poor image quality can compromise the reliability of MRI-based measures.1 Slice-wise visual inspection of 3D MRI volumes is impractical for large datasets such as those encountered in multi-center clinical trials. Automated image quality evaluation of MRI is therefore crucial for excluding images of unacceptable quality from morphometric analyses and disease diagnosis.2 We propose a deep learning (DL) model based on convolutional neural networks (CNNs) that learns to evaluate image quality directly from the data. Brain MRI from the open-access Autism Brain Imaging Data Exchange (ABIDE) was used to evaluate the performance of the proposed technique. ABIDE also provides expert image quality ratings, which served as the ground truth for training the DL model.3


We used 1064 structural brain MRI volumes in this study. The DL model used for automated image quality evaluation (Fig. 1) has two stages. It first evaluates each input slice from a 3D MRI volume using a deep CNN, and then combines the slice ratings using a fully connected network to obtain the final volume-wise prediction. The architecture of the deep CNN used for slice quality prediction is shown in Fig. 2. The CNN consisted of an input layer, six convolution layers, one fully connected layer, and an output layer. The CNN was trained using 32 slices (5 mm apart) from each 3D volume along the three MRI planes (axial, sagittal, and coronal). Dropout was used to regularize the network weights, and data augmentation was performed to avoid overfitting. The predictions for individual planes from the first network were used to train the second network, which consisted of a fully connected layer and an output layer. The volume predictions from the three planes were averaged to provide the final image quality score. The maximum number of epochs was 500. Both networks used the rectified linear unit (ReLU) activation function and a sigmoid classification function in the last layer. The network coefficients were trained by minimizing the binary cross-entropy using Adam4 with a learning rate of 10⁻³ and associated parameters β1 = 0.9 and β2 = 0.999. Class weights of 0.57/4.03 for the acceptable/unacceptable images were used to account for the class imbalance in the training set. All processing was done on the Maverick2 cluster, equipped with 4 NVIDIA GTX graphics processing unit (GPU) cards per node, at the Texas Advanced Computing Center (TACC) in Austin, Texas. The network implementation was carried out in the Python Keras library5 using TensorFlow.6
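The slice-level CNN and its training configuration described above can be sketched in Keras as follows. The input size, filter counts, dropout rate, and fully connected width are illustrative assumptions; only the six 3×3 convolution layers with 2×2 max pooling, ReLU activations, sigmoid output, binary cross-entropy loss, and the stated Adam settings come from the text.

```python
# Sketch of the slice-quality CNN (stage 1); filter counts, input size,
# dropout rate, and dense width are assumptions, not the study's exact values.
from tensorflow import keras
from tensorflow.keras import layers


def build_slice_cnn(input_shape=(128, 128, 1)):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    # Six 3x3 convolution layers, each followed by 2x2 max pooling
    for filters in (16, 16, 32, 32, 64, 64):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.5)(x)                   # dropout regularization
    x = layers.Dense(64, activation="relu")(x)   # fully connected layer
    outputs = layers.Dense(1, activation="sigmoid")(x)  # slice quality score
    model = keras.Model(inputs, outputs)
    # Adam with the stated learning rate and moment parameters,
    # minimizing binary cross-entropy
    model.compile(
        optimizer=keras.optimizers.Adam(
            learning_rate=1e-3, beta_1=0.9, beta_2=0.999
        ),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_slice_cnn()
```

Training would then pass the stated class weights, e.g. `model.fit(x, y, epochs=500, class_weight={0: 0.57, 1: 4.03})`, with class 0 for 'include' and class 1 for 'exclude'.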


In total, our dataset contained 1064 volumes, of which 132 were labeled by experts as ‘exclude’ and 932 as ‘include’. The 1064 cases were split into 60% for training, 20% for validation, and 20% for testing. Sample images from the test set, along with the corresponding class type and the final model classification score, are shown in Fig. 3. The network provided an accuracy of 74% with 69% sensitivity and 74% specificity.
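The class weights quoted in Methods (0.57/4.03) can be reproduced from these label counts with the standard "balanced" weighting w_c = N_total / (2 · N_c); the abstract does not state the formula used, so this derivation is an assumption, but it matches the reported values exactly:

```python
# Balanced class weights w_c = N_total / (2 * N_c); this formula is an
# assumption, but it reproduces the 0.57/4.03 values quoted in Methods
# from the 932/132 label counts reported in Results.
n_include, n_exclude = 932, 132      # 'include' (class 0), 'exclude' (class 1)
n_total = n_include + n_exclude      # 1064 volumes

w_include = n_total / (2 * n_include)
w_exclude = n_total / (2 * n_exclude)

print(round(w_include, 2), round(w_exclude, 2))  # 0.57 4.03
```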


The main objective of this work was to develop an automated image quality evaluation method for structural brain MRI using a deep CNN. The advantage of deep learning is that it is fully data-driven, which eliminates the need for hand-crafted features. The accuracy (74%) of the model is higher than that of state-of-the-art traditional machine learning algorithms.
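For reference, the reported accuracy, sensitivity, and specificity follow the standard confusion-matrix definitions. The counts below are hypothetical, chosen only to illustrate the computation; the study's actual confusion matrix is not given in the abstract:

```python
# Standard confusion-matrix definitions of the reported metrics.
def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not the study's actual results)
acc, sens, spec = classification_metrics(tp=18, tn=139, fp=48, fn=8)
print(f"{acc:.2f} {sens:.2f} {spec:.2f}")  # 0.74 0.69 0.74
```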


The proposed cascaded deep learning CNN model shows potential for automated image quality evaluation, improving the robustness of downstream analyses. Once trained, the same network can be used to assess image quality for individual patients.


We acknowledge the following support: NINDS/NIH grant #1R56NS105857-01, the Endowed Chair in Biomedical Engineering, the Dunn Foundation, and the Texas Advanced Computing Center, Austin, TX, for providing access to the Maverick2 cluster.


1. Osadebey M, Pedersen M, Arnold D, Wendel-Mitoraj K. Image Quality Evaluation in Clinical Research: A Case Study on Brain and Cardiac MRI Images in Multi-Center Clinical Trials. IEEE J Transl Eng Health Med. 2018; 6:1800915. Published 2018 Aug 23. doi:10.1109/JTEHM.2018.2855213

2. Reuter M, Tisdall MD, Qureshi AZ, Buckner RL, van der Kouwe AJW, Fischl B. Head motion during MRI acquisition reduces gray matter volume and thickness estimates. Neuroimage. 2015;107:107-115.

3. Craddock C, Benhajali Y, Chu C, Chouinard F, Evans A, Jakab A, Khundrakpam BS, Lewis JD, Li Q, Milham M, Yan C, Bellec P. The Neuro Bureau Preprocessing Initiative: open sharing of preprocessed neuroimaging data and derivatives. In: Neuroinformatics 2013, Stockholm, Sweden; 2013.

4. Kingma DP, Ba JL. Adam: A method for stochastic optimization. In: International Conference on Learning Representations (ICLR); 2015.

5. Chollet F. Keras. 2015. Available: https://github.com/fchollet/keras

6. Abadi M, Barham P, Chen J, et al. TensorFlow: A system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16); 2016.


Fig. 1. The proposed cascaded deep learning model for automated image quality evaluation of brain MRI. The model takes a 3D MRI volume as input and outputs a confidence score for each class. Class 0 represents ‘include’ (acceptable) image quality and class 1 represents ‘exclude’ (unacceptable) image quality.

Fig. 2. Architecture of the deep convolutional neural network (CNN) for image quality prediction of individual brain slices. The numbers at each stage denote the width, height, and depth (number of kernels) of the feature maps. 3x3 convolution (conv) kernels and 2x2 maximum pooling (pool) operations were used to produce image features. The final fully connected (fc) and output layers combine the features to produce the quality prediction.

Fig. 3. Images with (a,b) low quality (‘exclude’, class 1) and (c,d) acceptable quality (‘include’, class 0), along with their deep learning quality ratings (score).

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)