Fetal Motion Prediction from Volumetric MRI using Machine Learning
Junshen Xu1, Molin Zhang2, Larry Zhang1,3, Ellen Grant4,5, Polina Golland1,3, and Elfar Adalsteinsson1,6

1Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, United States, 2Department of Engineering Physics, Tsinghua University, Beijing, China, 3Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, United States, 4Harvard Medical School, Boston, MA, United States, 5Fetal-Neonatal Neuroimaging and Developmental Science Center, Boston Children’s Hospital, Boston, MA, United States, 6Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, United States


Prospective motion correction is a challenge in clinical fetal MR imaging because fetal motion is erratic and often substantial. To address this problem, we propose a two-stage machine learning pipeline that extracts fetal poses from echo planar MRI volumes at previous time points and uses them to predict future pose. This pipeline can be used to learn kinematic models of fetal motion and can provide valuable auxiliary information for real-time, online slice prescription in fetal MRI.


Fetal MR imaging remains challenging in the clinic, largely due to fetal motion and its consequent artifacts. Although single-shot sequences such as Half Fourier Acquisition Single Shot Turbo Spin Echo (HASTE) can reduce in-plane motion artifacts, single-shot T2-weighted imaging with HASTE is typically deployed in protocols with multiple repeated scans. These protocols lead to long scan times and uncertainty about the completeness of slice coverage, as the MR technologist “chases” the moving fetal brain by manually re-prescribing slices. Prospective motion correction through real-time updates of the slice prescription requires knowledge of fetal pose. To address this problem, we propose a two-stage pipeline for fetal motion prediction from volumetric MR images using machine learning methods, and we evaluate its performance retrospectively.


Volumetric MR data were acquired with multislice EPI (matrix size = 120 × 120 × 80, resolution = 3 mm × 3 mm × 3 mm, TR = 3.5 s) of the pregnant abdomen for subjects at gestational ages between 25 and 35 weeks. Fifteen fetal features, so-called keypoints (ankles, knees, hips, bladder, shoulders, elbows, wrists and eyes), were labeled manually in 1171 volumes. From these, 1113 sequences of 10 consecutive frames each were extracted. The objective of the pose prediction task is to estimate the fetal pose at each of the four time points that follow six consecutive 3D EPI volumes.

The proposed pipeline for motion prediction, shown in Figure 1, consists of two stages: pose estimation from volumetric MRI, and motion prediction based on the pose representation.

In the first stage, a convolutional neural network is used to estimate fetal poses (via keypoint coordinates) from 3D MRI. Inspired by human pose estimation for 2D images [1], we propose a 3D hourglass network for 3D fetal pose estimation (see Figure 2). The network uses upsampling and downsampling operations to capture multiscale features of the images and uses residual connections to preserve high-resolution information.

The second stage, motion prediction, can be regarded as the following autoregression problem:

$$ x_t = f(x_{t-1}, x_{t-2}, \ldots, x_{t-k}) $$


where $x_t$ is the pose of the fetus at time $t$ and $f$ is a function that predicts the next pose from the poses at the $k$ previous time points. Three autoregression algorithms previously proposed for human motion prediction and other time-series prediction problems were implemented and compared for fetal pose prediction: a linear autoregression model (LAR), a recurrent neural network (RNN) [2], and Autoregressive Trees (ART) [3]. For the RNN model, we adapted a single-layer gated recurrent unit (GRU) [4] architecture with 1024 hidden units. For the ART model, we used a random forest as the base model, which is more robust to noise in the data. Given the limited data, five-fold cross-validation was used to evaluate the performance of the different methods.
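As a minimal sketch of the simplest of these models, a linear autoregression can be fit by least squares on windows of k = 6 past poses and then rolled forward for four steps. The code below runs on synthetic data; the helper names (`make_windows`, `fit_lar`, `rollout`), the bias term, and the flattening of 15 keypoints into a 45-dimensional pose vector are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def make_windows(poses, k):
    """Pair each run of k consecutive poses (flattened input) with the pose that follows it."""
    n = len(poses) - k
    X = np.stack([poses[i:i + k].ravel() for i in range(n)])
    Y = poses[k:]
    return X, Y

def fit_lar(poses, k):
    """Least-squares fit of x_t ~ [x_{t-k}, ..., x_{t-1}, 1] @ W (bias term is an assumption)."""
    X, Y = make_windows(poses, k)
    X = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def rollout(history, W, steps):
    """Predict `steps` future poses, feeding each prediction back in as an input."""
    k = (W.shape[0] - 1) // history.shape[1]
    window = list(history[-k:])          # oldest pose first, as in make_windows
    preds = []
    for _ in range(steps):
        x = np.concatenate(window + [np.ones(1)])
        nxt = x @ W
        preds.append(nxt)
        window = window[1:] + [nxt]      # slide the window forward by one frame
    return np.stack(preds)

# Synthetic stand-in for tracked poses: 15 keypoints x 3 coordinates = 45 values per frame.
t = np.arange(120)[:, None]
phases = np.linspace(0.0, np.pi, 45)[None, :]
poses = np.sin(0.3 * t + phases)

W = fit_lar(poses[:100], k=6)            # fit on the first 100 frames
pred = rollout(poses[94:100], W, 4)      # six observed poses -> four predicted poses
print(np.abs(pred - poses[100:104]).max())
```

The synthetic trajectory obeys an exact linear recurrence, so the fit recovers it closely; real fetal motion is far less regular, which is the motivation for comparing against the nonlinear RNN and ART models.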


For visualization, we performed principal component analysis (PCA) on the pose data and plotted the first three principal components over time. Figure 3 shows two characteristic patterns of fetal movement. To evaluate the three autoregression models, the normalized root mean squared error (NRMSE) was calculated for different prediction lengths, as illustrated in Figure 4. For comparison, a zero-velocity (ZV) baseline was used, which assumes that the fetus does not move from time t-1 to time t. Figure 5 shows examples of motion prediction by the different methods at different time points.
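The zero-velocity baseline and the error metric can be sketched as follows. The abstract does not state how the RMSE is normalized or whether the error is pooled over all steps up to the prediction length, so both choices here (dividing by the RMS of the ground truth, pooling over the first T steps) are assumptions, as are the function names and the synthetic random-walk data.

```python
import numpy as np

def zero_velocity(history, steps):
    """Repeat the last observed pose for every future time step."""
    return np.repeat(history[-1:], steps, axis=0)

def nrmse(pred, truth):
    """RMSE divided by the RMS of the ground truth (normalization is an assumption)."""
    rmse = np.sqrt(np.mean((pred - truth) ** 2))
    return rmse / np.sqrt(np.mean(truth ** 2))

rng = np.random.default_rng(0)
# Synthetic random-walk sequence: 10 frames of 45 pose coordinates (15 keypoints x 3).
seq = np.cumsum(rng.normal(size=(10, 45)), axis=0)
history, future = seq[:6], seq[6:]

pred = zero_velocity(history, steps=4)
# Error pooled over the first T predicted frames, for prediction lengths T = 1..4.
errors = [nrmse(pred[:T], future[:T]) for T in range(1, 5)]
print(errors)
```

Any learned model can be scored with the same `nrmse` function, which makes the comparison against the baseline direct at each prediction length.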


The results in Figure 4 show that, in general, the ART method performs best. Another observation is that at T = 1, i.e., when predicting only the next step, the most complex method (RNN) does not outperform even the simplest zero-velocity baseline. A similar observation has been reported for human motion prediction tasks [2]. The examples in Figure 5 also show that the tree-based autoregression method is the most robust for longer-term predictions.

As shown in Figure 3, different fetuses may have different motion patterns; for example, some fetuses may be nearly stationary during the scan whereas others are very active and move frequently. The current pipeline models fetal motion only in general and does not exploit characteristic motion patterns, which may improve prediction performance in future implementations.


A machine learning pipeline for fetal motion prediction is proposed in this abstract. Future applications include both retrospective analysis of fetal movement and guidance of prospective slice prescription in real time. Future work will focus on improving prediction accuracy by incorporating priors for observed characteristic fetal motion patterns and on applying the pipeline to prospective fetal motion correction for MRI.


Funding: NIH R01 EB017337, U01 HD087211


[1] Newell, A., Yang, K., & Deng, J. (2016, October). Stacked hourglass networks for human pose estimation. In European Conference on Computer Vision (pp. 483-499). Springer, Cham.

[2] Martinez, J., Black, M. J., & Romero, J. (2017, July). On human motion prediction using recurrent neural networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 4674-4683). IEEE.

[3] Lehrmann, A. M., Gehler, P. V., & Nowozin, S. (2014). Efficient nonlinear Markov models for human motion. In 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE.

[4] Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.


Figure 1. The proposed two-stage pipeline for predicting fetal motion from volumetric MRI. Stage 1: fetal pose estimation from 3D MRI. Stage 2: motion prediction based on fetal pose.

Figure 2. Overall architecture of the proposed network.

Figure 3. Principal component analysis of two subjects. (a) and (c) are plots of the first three principal components for the two subjects. (b) and (d) are the sample autocorrelations, with confidence bounds, of the first principal component for the two subjects, which show the correlation of motion at different time lags.

Figure 4. NRMSE with different prediction lengths for different methods.

Figure 5. Examples of motion prediction by different methods at different time points. Dashed lines are the predicted poses and solid lines are the ground truth. The purple triangle connects the eyes and the base of the neck; blue connects the shoulders, which connect to the arms (green, red). Light blue connects the base of the neck to the bladder, which links to the hips (blue) and legs (green, red).

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)