Efrat Shimron^{1}, Jonathan Tamir^{2}, Ke Wang^{1}, and Michael Lustig^{1}

^{1}Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA, United States, ^{2}Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States

This work reveals how naively using publicly available data for training and evaluating reconstruction algorithms may lead to artificially improved algorithm performance. We observed this practice in the “wild” and aim to bring it to the attention of the community. The underlying cause is a pair of common data-preprocessing steps that are often overlooked: k-space zero-padding in clinical scanners and JPEG compression in database storage. We show that retrospective subsampling of such preprocessed data leads to overly optimistic reconstructions. We demonstrate this phenomenon for Compressed Sensing, Dictionary Learning and Deep Neural Networks. This work hence highlights the importance of careful, task-adequate usage of public databases.
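The zero-padding effect described above can be illustrated with a small NumPy sketch. It is a toy example, not the authors' experimental pipeline. The array sizes, the random "acquired" k-space, the helper names (`fft2c`, `ifft2c`, `zero_pad_kspace`, `variable_density_mask`) and the quadratic sampling density are all assumptions chosen for illustration. The sketch also assumes the stored image is complex-valued, so the padded k-space region stays exactly zero.

```python
import numpy as np

def fft2c(image):
    """Centered, orthonormal 2D FFT (image -> k-space)."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image), norm="ortho"))

def ifft2c(kspace):
    """Centered, orthonormal 2D inverse FFT (k-space -> image)."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace), norm="ortho"))

def zero_pad_kspace(kspace, out_shape):
    """Embed the acquired k-space in the center of a larger all-zero array."""
    ny, nx = kspace.shape
    oy, ox = out_shape
    y0, x0 = (oy - ny) // 2, (ox - nx) // 2
    padded = np.zeros(out_shape, dtype=kspace.dtype)
    padded[y0:y0 + ny, x0:x0 + nx] = kspace
    return padded

def variable_density_mask(shape, rng):
    """Random mask whose sampling probability decays away from the k-space center.

    The quadratic density is a hypothetical choice; it samples roughly a
    quarter of all k-space locations.
    """
    ny, nx = shape
    y = np.linspace(-1, 1, ny, endpoint=False)[:, None]
    x = np.linspace(-1, 1, nx, endpoint=False)[None, :]
    r = np.sqrt(x**2 + y**2) / np.sqrt(2)   # 0 at the center, 1 at the corner
    density = (1 - r) ** 2
    return rng.random(shape) < density

rng = np.random.default_rng(0)

# Toy stand-in for the k-space actually acquired by the scanner (64 x 64).
acquired = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

# Scanner-side preprocessing: zero-pad to 128 x 128, then reconstruct the image.
stored_image = ifft2c(zero_pad_kspace(acquired, (128, 128)))

# Researcher-side step: recompute "k-space" from the stored image.
retro_kspace = fft2c(stored_image)

# Region of k-space that holds genuinely measured data.
inner = np.zeros((128, 128), dtype=bool)
inner[32:96, 32:96] = True

energy = np.abs(retro_kspace) ** 2
outer_energy_fraction = energy[~inner].sum() / energy.sum()

# Retrospective subsampling with a variable-density mask.
mask = variable_density_mask((128, 128), rng)
nominal_fraction = mask.mean()        # fraction sampled over the padded grid
inner_fraction = mask[inner].mean()   # fraction sampled where data really exists

print(f"energy outside the acquired region: {outer_energy_fraction:.2e}")
print(f"nominal sampling fraction:          {nominal_fraction:.3f}")
print(f"sampling fraction in acquired area: {inner_fraction:.3f}")
```

In this sketch the padded outer region carries essentially no energy, while the variable-density mask covers the genuinely acquired region more densely than the nominal sampling fraction suggests. The reconstruction task is therefore easier than the nominal subsampling rate implies.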

1. CS with

2. A Dictionary Learning method that jointly solves for the dictionary and the sparse code using alternating minimization

3. Unrolled DNN: The MoDL algorithm

The standard JPEG implementation of the Pillow library

1. Ong, F., Amin, S., Vasanawala, S., & Lustig, M. (2018). Mridata.org: An open archive for sharing MRI raw data. In Proc. Intl. Soc. Mag. Reson. Med. (Vol. 26, p. 1).

2. Zbontar, J., Knoll, F., Sriram, A., Murrell, T., Huang, Z., Muckley, M. J., ... & Parente, M. (2018). FastMRI: An open dataset and benchmarks for accelerated MRI. arXiv preprint arXiv:1811.08839.

3. IXI database. Resources for the computational analysis of brain development, Imperial College, http://biomedic.doc.ic.ac.uk/brain-development.

4. The Human Connectome Project, University of Southern California, http://www.humanconnectomeproject.org/publications/documents/.

5. Mueller, S. G., Weiner, M. W., Thal, L. J., Petersen, R. C., Jack, C. R., Jagust, W., ... & Beckett, L. (2005). Ways toward an early diagnosis in Alzheimer’s disease: the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Alzheimer's & Dementia, 1(1), 55-66.

6. Lustig, M., Donoho, D., & Pauly, J. M. (2007). Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6), 1182-1195.

7. Ravishankar, S., & Bresler, Y. (2010). MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE transactions on medical imaging, 30(5), 1028-1041.

8. Aggarwal, H. K., Mani, M. P., & Jacob, M. (2018). MoDL: Model-based deep learning architecture for inverse problems. IEEE transactions on medical imaging, 38(2), 394-405.

9. Guerquin-Kern, M., Lejeune, L., Pruessmann, K. P., & Unser, M. (2011). Realistic analytical phantoms for parallel magnetic resonance imaging. IEEE Transactions on Medical Imaging, 31(3), 626-636.

10. Herrmann, M. D., Clunie, D. A., Fedorov, A., Doyle, S. W., Pieper, S., Klepeis, V., ... & Kikinis, R. (2018). Implementing the DICOM standard for digital pathology. Journal of pathology informatics, 9.

11. Ong, F., & Lustig, M. (2019). SigPy: a python package for high performance iterative reconstruction. In Proceedings of the ISMRM 27th Annual Meeting, Montreal, Quebec, Canada (abstract 4819).

12. Tamir, J. (2020). Step-by-Step Reconstruction Using Learned Dictionaries. In: Proceedings of the ISMRM 28th Annual Meeting, (Virtual Meeting).

13. Pati, Y. C., Rezaiifar, R., & Krishnaprasad, P. S. (1993). Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of 27th Asilomar conference on signals, systems and computers (pp. 40-44). IEEE.

14. Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on signal processing, 54(11), 4311-4322.

15. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., ... & Desmaison, A. (2019). Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems (pp. 8026-8037).

16. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention (pp. 234-241). Springer, Cham.

18. Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

19. Clark, A. (2015). Pillow (PIL Fork) https://pillow.readthedocs.io/en/stable/.

*Subtle Crime II concept.* MR images are often saved in the DICOM format, which sometimes involves JPEG compression. However, JPEG-compressed images contain less high-frequency data than the original data; therefore, algorithms trained on retrospectively subsampled compressed data may benefit from the compression. These algorithms may hence exhibit misleadingly good performance.
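The loss of high-frequency content under JPEG compression can be checked with a short sketch that uses Pillow's JPEG encoder. This is a toy illustration under stated assumptions, not the authors' experiment. The synthetic noise texture, the quality setting of 20, the helper names (`jpeg_round_trip`, `high_freq_energy_fraction`) and the definition of "high frequency" as everything outside the central half of k-space along each axis are hypothetical choices.

```python
import io

import numpy as np
from PIL import Image

def jpeg_round_trip(image_u8, quality):
    """Compress an 8-bit grayscale image to JPEG in memory and decode it again."""
    buffer = io.BytesIO()
    Image.fromarray(image_u8).save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return np.asarray(Image.open(buffer))

def high_freq_energy_fraction(image):
    """Fraction of (mean-removed) k-space energy outside the central region."""
    centered = image.astype(np.float64) - image.mean()
    kspace = np.fft.fftshift(np.fft.fft2(centered))
    energy = np.abs(kspace) ** 2
    ny, nx = energy.shape
    inner = np.zeros((ny, nx), dtype=bool)
    inner[ny // 4:3 * ny // 4, nx // 4:3 * nx // 4] = True
    return energy[~inner].sum() / energy.sum()

rng = np.random.default_rng(0)

# Synthetic texture with a broad spectrum (an assumption; not MRI data).
texture = np.clip(128 + 40 * rng.standard_normal((128, 128)), 0, 255).astype(np.uint8)

compressed = jpeg_round_trip(texture, quality=20)

hf_before = high_freq_energy_fraction(texture)
hf_after = high_freq_energy_fraction(compressed)

print(f"high-frequency energy fraction before JPEG: {hf_before:.3f}")
print(f"high-frequency energy fraction after JPEG:  {hf_after:.3f}")
```

For this toy texture, the compressed image retains a smaller share of its energy in the outer part of k-space. Retrospectively subsampling such data therefore discards less information than subsampling the uncompressed original would.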