Mohammad Zalbagi Darestani^{1} and Reinhard Heckel^{1,2}

^{1}Electrical and Computer Engineering, Rice University, Houston, TX, United States, ^{2}Electrical and Computer Engineering, Technical University of Munich, Munich, Germany

Convolutional Neural Networks (CNNs) are highly effective tools for image reconstruction problems. Typically, CNNs are trained on large sets of images, but, perhaps surprisingly, even without any training data, CNNs such as the Deep Image Prior and Deep Decoder achieve excellent imaging performance. Here, we build on those works by proposing an un-trained CNN for accelerated MRI along with performance-enhancing steps, including enforcing data consistency and combining multiple reconstructions. We show that the resulting method i) achieves reconstruction performance almost on par with baseline as well as state-of-the-art trained CNNs, but without any training, and ii) significantly outperforms competing sparsity-based approaches.

In this work, we study accelerating multi-coil MRI with un-trained CNNs that do not rely on any training data. In very recent work, it has been demonstrated that un-trained CNNs such as the Deep Image Prior [3] and the Deep Decoder [4] achieve excellent image reconstruction performance without any training data.

We are given the undersampled k-space measurements $$$\mathbf{y}_1,\ldots,\mathbf{y}_{n_c}$$$ of an unknown image, acquired by $$$n_c$$$ receiver coils with a given undersampling mask $$$\mathbf{M}$$$, and our goal is to estimate the image. We reconstruct an image by fitting the ConvDecoder to the measurements, i.e., by minimizing over the network parameters $$$\mathbf{C}$$$ the loss

$$
\mathcal{L}(\mathbf{C}) = \frac{1}{2} \sum_{i=1}^{n_c} \lVert \mathbf{y}_i - \mathbf{M} \mathbf{F} G_i(\mathbf{C}) \rVert_2^2.
$$

Here, $$$\mathbf{F}$$$ is the 2D Fourier transform, and $$$G_i$$$ is the $$$i$$$-th output channel of the ConvDecoder with parameters $$$\mathbf{C}$$$.
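To make the loss concrete, here is a minimal NumPy sketch (the function name and array layout are our own illustration, not the paper's implementation), assuming the network outputs one complex-valued image per coil:

```python
import numpy as np

def masked_kspace_loss(coil_images, measurements, mask):
    """L(C) = 0.5 * sum_i || y_i - M F G_i(C) ||_2^2.

    coil_images:  (n_c, H, W) complex array, the output channels G_i(C)
    measurements: (n_c, H, W) complex array, the undersampled k-space data y_i
    mask:         (H, W) binary array, the undersampling mask M
    """
    loss = 0.0
    for img, y in zip(coil_images, measurements):
        kspace = np.fft.fft2(img)       # F G_i(C): 2D Fourier transform
        residual = y - mask * kspace    # y_i - M F G_i(C)
        loss += 0.5 * np.linalg.norm(residual) ** 2
    return loss
```

At a minimizer, a final image can be obtained from the fitted output channels, e.g., via a root-sum-of-squares combination of the coil images.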

The new elements of this approach relative to previous works are i) the ConvDecoder architecture, ii) the data-consistency step, and iii) the ensembling step. The ConvDecoder is a variation of the Deep Decoder architecture [4] and yields slightly less blurry reconstructions; the data-consistency and ensembling steps each improve performance notably for all architectures we considered.
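The data-consistency step can be sketched as follows: at the sampled frequencies, replace the reconstruction's k-space values with the actual measurements, and keep the network's prediction everywhere else. A hedged NumPy illustration (our own naming, not the authors' code):

```python
import numpy as np

def data_consistency(coil_image, y, mask):
    """Enforce consistency with the measured k-space samples of one coil.

    coil_image: (H, W) complex reconstruction for the coil
    y:          (H, W) complex measured k-space (zeros where not sampled)
    mask:       (H, W) binary sampling mask
    """
    kspace = np.fft.fft2(coil_image)
    # Keep the network's prediction where we have no data,
    # and insert the measurements where we do.
    kspace = (1 - mask) * kspace + mask * y
    return np.fft.ifft2(kspace)
```

By construction, the output exactly reproduces the measurements on the sampled frequencies while leaving the unsampled frequencies untouched.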

Note that this method works without any estimates of the sensitivity maps. We also considered a slight variation of the method which takes sensitivity maps estimated with ESPIRiT [8] into account.

We study the performance of this method on the FastMRI knee and brain datasets [10]. We compared to total-variation (TV) norm minimization [9], a baseline trained U-net [2], and the state-of-the-art end-to-end variational network (VarNet) [1].

Figure 3 compares the performance of the ConvDecoder architecture to the DIP and DD architectures. All architectures use our data-consistency step. The results show that the data-consistency step proposed here improves the results for all methods, and that the ConvDecoder architecture addresses the smoothness and artifacts issues of DIP and DD.

Figure 4 shows that our ensembling technique enhances reconstruction quality and demonstrates how the PSNR score changes as the number of ConvDecoders in the ensemble varies.
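The ensembling step simply averages the reconstructions of several independently fitted decoders. As a toy illustration of why this helps (our own simulation, not the paper's experiment), model each run as the ground truth plus independent noise; averaging $$$k$$$ runs reduces the noise variance by a factor of $$$k$$$ and thus raises the PSNR:

```python
import numpy as np

def psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB for images scaled to [0, 1]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10 * np.log10(1.0 / mse)

def ensemble(reconstructions):
    """Average the reconstructions of independently fitted decoders."""
    return np.mean(reconstructions, axis=0)

# Toy model: each of 10 "runs" is the ground truth plus independent noise.
rng = np.random.default_rng(0)
truth = rng.random((64, 64))
runs = [np.clip(truth + 0.05 * rng.standard_normal(truth.shape), 0, 1)
        for _ in range(10)]
```

In this toy model the averaged reconstruction scores a higher PSNR than any individual run, mirroring the behavior reported in Figure 4.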

Finally, Figure 5 illustrates that for an out-of-distribution example (i.e., a brain image for networks trained on knees), trained neural networks lose significant performance, whereas un-trained networks are by design robust to such shifts.

The code to reproduce our results is available at https://github.com/MLI-lab/ConvDecoder, and the full version of this paper is available as a preprint on arXiv [12].

While state-of-the-art trained neural networks still slightly outperform our un-trained network in reconstruction accuracy, its (i) robustness to distribution shifts and (ii) independence of training data make un-trained networks an important tool in practice, especially in regimes where training data is scarce and robustness is needed.

1. A. Sriram et al. “End-to-end variational networks for accelerated MRI reconstruction”. In: arXiv:2004.06688 [eess.IV]. 2020.

2. O. Ronneberger, P. Fischer, and T. Brox. “U-net: convolutional networks for biomedical image segmentation”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. 2015, pp. 234–241.

3. D. Ulyanov, A. Vedaldi, and V. Lempitsky. “Deep image prior”. In: IEEE Conference on Computer Vision and Pattern Recognition. 2018, pp. 9446–9454.

4. R. Heckel and P. Hand. “Deep Decoder: concise image representations from untrained nonconvolutional networks”. In: International Conference on Learning Representations (ICLR). 2019.

5. D. Van Veen, A. Jalal, M. Soltanolkotabi, E. Price, S. Vishwanath, and A. G. Dimakis. “Compressed sensing with deep image prior and learned regularization”. In: arXiv:1806.06438 [stat.ML]. 2018.

6. S. Arora, V. Roeloffs, and M. Lustig. “Untrained modified deep decoder for joint denoising parallel imaging reconstruction”. In: International Society for Magnetic Resonance in Medicine Annual Meeting. 2020.

7. D. P. Kingma and J. Ba. “Adam: a method for stochastic optimization”. In: International Conference on Learning Representations (ICLR). 2015.

8. M. Uecker et al. “ESPIRiT—an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA”. In: Magnetic Resonance in Medicine. 2014, pp. 990–1001.

9. K. T. Block, M. Uecker, and J. Frahm. “Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint”. In: Magnetic Resonance in Medicine. 2007, pp. 1086–1098.

10. J. Zbontar et al. “FastMRI: An open dataset and benchmarks for accelerated MRI”. In: arXiv:1811.08839 [cs.CV]. 2018.

11. A. Mason et al. “Comparison of objective image quality metrics to expert radiologists’ scoring of diagnostic quality of MR images”. In: IEEE Transactions on Medical Imaging. 2019, pp. 1064–1072.

12. M. Z. Darestani and R. Heckel. “Can Un-trained Neural Networks Compete with Trained Neural Networks at Image Reconstruction?” In: arXiv:2007.02471 [eess.IV]. 2020.

ConvDecoder architecture. It consists of up-sampling, convolutional, ReLU, batch-normalization, and linear-combination layers.

Sample reconstructions for ConvDecoder, TV, U-net, and the end-to-end variational network (VarNet) for a validation image from multi-coil knee measurements (4x accelerated). The second row represents the zoomed-in version of the first row. ConvDecoder and the end-to-end variational network (VarNet) find the best reconstructions for this image (slightly better than U-net and significantly better than TV). The scores given below are averaged over 200 different mid-slice images from the FastMRI validation set.

Sample reconstructions for ConvDecoder, DIP, and Deep Decoder (DD) for a validation image from multi-coil knee measurements (4x accelerated). The second row represents the zoomed-in version of the first row. ConvDecoder gives the best reconstruction for this image. The scores below are averaged over 20 different randomly-chosen mid-slice images from the FastMRI dataset.

(a) Our ensembling trick yields a better reconstruction than the best image out of 10 runs. (b) PSNR scores increasingly improve when averaging the outputs of $$$k \in \{1,2,\ldots,10\}$$$ ConvDecoders. This technique also increases the SSIM score by $$$2\%$$$ on average.