Roberto Souza^{1,2} and Richard Frayne^{1,2}

Compressed sensing (CS) magnetic resonance (MR) imaging reduces MR exam times by decreasing the amount of data acquired, while still reconstructing high-quality images. Deep-learning methods have the advantage of reconstructing images in a single step, as opposed to iterative (and slower) CS methods. Our proposal leverages information from both the k-space and image domains, in contrast to most other deep-learning methods, which use only image-domain information. We compare our W-net model against four recently published deep-learning-based methods. W-net achieved the second-best results in the quantitative analysis, but produced more visually pleasing reconstructions.

^{1}Ronneberger et al., “U-net: Convolutional networks for biomedical image segmentation,” MICCAI 2015, pp. 234-241;

^{2}Zhu et al., “Image reconstruction by domain-transform manifold learning,” Nature 2018, pp. 487-;

^{3}Quan et al., “Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss,” IEEE Trans Med Imaging 2018, pp. 1488-1497;

^{4}Yang et al., “DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction,” IEEE Trans Med Imaging 2018, pp. 1310-1321;

^{5}Schlemper et al., “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Trans Med Imaging 2018, pp. 491-503;

^{6}Wang et al., “Image quality assessment: from error visibility to structural similarity,” IEEE Trans Image Processing 2004, pp. 600-612.

Flowchart of the proposed methodology. It receives undersampled k-space as input. The frequency-domain network (residual U-net) acts as an interpolation block that fills in missing k-space values. The image-domain network (U-net) acts as an anti-aliasing filter that further improves the reconstruction produced by the first network. The iDFT block leverages the mathematical formulation of the Fourier transform and has no learnable parameters.
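The iDFT block in the flowchart is a fixed, parameter-free operation. A minimal numpy sketch of the input side of this pipeline, using a hypothetical Cartesian line-skipping mask (a stand-in for the actual sampling scheme, which is not specified here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "fully sampled" image and its k-space representation.
image = rng.random((64, 64))
kspace = np.fft.fft2(image)

# Hypothetical undersampling mask: keep ~20% of the k-space lines.
mask = rng.random(64) < 0.2
undersampled = kspace * mask[:, None]

# iDFT block: magnitude of the inverse FFT -- a fixed transform with
# no learnable parameters.
zero_filled = np.abs(np.fft.ifft2(undersampled))
print(zero_filled.shape)
```

In the full method, the frequency-domain network would first interpolate the missing k-space values, and the image-domain network would then de-alias the resulting magnitude image.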

Simplified architecture of our W-net model. It consists of a residual U-net (left) and a U-net (right), connected through the magnitude of the iDFT operator, which is computed efficiently on the graphics processing unit (GPU) via the inverse fast Fourier transform (iFFT). The residual U-net on the left uses 5×5 kernels in its convolutional layers and the U-net on the right uses 3×3 kernels; these kernel sizes were set empirically. The model has ~11,000,000 trainable parameters and is trained end-to-end. Because the iDFT block has no learnable parameters, our model is much more compact than AUTOMAP.

Sample result of our hybrid approach for input k-space undersampled by 80%. The top row displays the k-space spectra, and the bottom row displays the magnitude of the iDFT of the corresponding k-space. From left to right: input undersampled k-space reconstruction (NRMSE = 3.2%), result of the frequency-domain U-net (NRMSE = 1.9%), result of the image-domain U-net (NRMSE = 1.6%), and the fully sampled reference reconstruction.
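The NRMSE values above can be computed as the root-mean-square error normalized by the reference intensity range. This is one common convention (other variants normalize by the mean or the RMS of the reference; the exact normalization used is not stated here):

```python
import numpy as np

def nrmse(reference, reconstruction):
    """RMSE normalized by the reference intensity range (one common
    convention among several possible normalizations)."""
    rmse = np.sqrt(np.mean((reference - reconstruction) ** 2))
    return rmse / (reference.max() - reference.min())

# Toy check: a perfect reconstruction scores 0, and a uniform 0.01
# offset on a [0, 1] reference scores 1.0%.
ref = np.linspace(0.0, 1.0, 100).reshape(10, 10)
print(nrmse(ref, ref))                        # 0.0
print(f"{100 * nrmse(ref, ref + 0.01):.1f}%")  # 1.0%
```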

Summary of the results for the different architectures at an acceleration factor of 5×. The top two results for each metric are shown in bold. Deep-Cascade achieved the best metrics with statistical significance (p < 0.01). W-net was second best, also with statistical significance (p < 0.01), compared to U-net, RefineGAN, and DAGAN. Mean ± standard deviation are reported over ten test subjects.

Sample reconstructions for k-space undersampled by 80%, with the cerebellum region highlighted, where differences are most noticeable. Our W-net presents the best image contrast and detail in the cerebellum region compared to the other techniques.