Andrada Ianus^{1,2}, Inês Santiago^{2}, Daniele Ravi^{1}, Celso Matos^{2}, Daniel C. Alexander^{1}, and Noam Shemesh^{2}

Developing non-invasive imaging techniques for the detection and characterisation of lymph nodes is an important topic in cancer research. Diffusion MRI (dMRI) appears to be a promising modality for this task. This work investigates the ability of dMRI to differentiate benign and malignant lymph nodes based on a rich, ex-vivo dataset, and aims to find which measurements provide the most differentiation power.

Lymph node (LN) staging is one of the main determinants of the management of rectal cancer patients, yet current imaging methods show limited accuracy for that purpose^{1}. Diffusion MRI (dMRI) is becoming an increasingly important tool for non-invasive detection of malignant lymph nodes^{2-4} and potential characterization of their microstructure^{5}.

This work investigates the ability of dMRI to differentiate benign and malignant lymph nodes based on a rich, ex-vivo dataset and aims to find which measurements provide the most differentiation power. We also compare the performance of different classification techniques: Logistic Regression (LR), Random Forest (RF), Deep Neural Network (DNN) and Convolutional Neural Network (CNN).

Institutional setting, approved by ethics committee: A total of 31 malignant nodes and 23 benign nodes (as defined by a dedicated pathologist) were included, originating from patients histopathologically staged as N+ after surgery with curative intent, without neoadjuvant therapy. The nodes were preserved in 4% formaldehyde and moved to a 1% PBS solution 24h before scanning.

Image acquisition: Nodal pairs were imaged with a 16.4T Bruker® scanner. Imaging parameters: slice thickness 0.7 mm, in-plane resolution 0.14x0.14 mm^{2}, matrix size 70x70, TR/TE 2800/6.5 ms. Diffusion acquisition: stimulated echo acquisition mode (STEAM) experiments were performed with gradient duration δ = 1 ms and gradient separation Δ = {5,10,20,40,70,100,150} ms for b-values of 500 and 1000 s/mm^{2} and Δ = {10,20,30,50} ms for b-values of 1500 and 2000 s/mm^{2}, resulting in 22 shells (2 x 7 + 2 x 4) with 6 gradient directions for each shell.
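The shell count follows directly from the acquisition parameters above. The short sketch below enumerates the (b-value, Δ) combinations; the variable names and the ordering of the list are illustrative assumptions and are not meant to reproduce the shell numbering used in the figures.

```python
# Enumerate the (b-value, diffusion time) shells described in the acquisition.
# Names and list ordering are illustrative assumptions, not the authors' code.
low_b_values = [500, 1000]                      # s/mm^2
high_b_values = [1500, 2000]                    # s/mm^2
deltas_low_b = [5, 10, 20, 40, 70, 100, 150]    # ms, used with b = 500 and 1000
deltas_high_b = [10, 20, 30, 50]                # ms, used with b = 1500 and 2000

shells = [(b, d) for b in low_b_values for d in deltas_low_b]
shells += [(b, d) for b in high_b_values for d in deltas_high_b]

n_shells = len(shells)                    # 2*7 + 2*4 shells
n_directions = 6                          # gradient directions per shell
n_dw_images = n_shells * n_directions     # diffusion-weighted images only
n_features = 2 * n_shells                 # one MD and one FA value per shell

print(n_shells, n_dw_images, n_features)
```

The count of diffusion-weighted images excludes any non-diffusion-weighted reference images, which the text does not specify.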

Data analysis: A diffusion tensor (DT) was fitted voxelwise to each shell using the effective b-matrix of the acquisition, and then mean diffusivity (MD) and fractional anisotropy (FA) were computed. This results in 44 features (22 shells x 2 metrics, MD and FA) for each voxel.
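As an illustration of the two metrics, the sketch below computes MD and FA from an already fitted symmetric 3x3 tensor using their standard definitions. It is a minimal example, not the fitting pipeline used in this work; the function name `md_fa` is hypothetical, and FA is evaluated from Frobenius norms, which is equivalent to the usual eigenvalue formula for a symmetric tensor.

```python
import math

def md_fa(tensor):
    """Return (MD, FA) for a symmetric 3x3 diffusion tensor given as nested lists."""
    # Mean diffusivity: one third of the trace.
    md = (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0
    # Squared Frobenius norm of the tensor and of its anisotropic (deviatoric) part.
    norm_sq = sum(tensor[i][j] ** 2 for i in range(3) for j in range(3))
    dev_sq = sum(
        (tensor[i][j] - (md if i == j else 0.0)) ** 2
        for i in range(3)
        for j in range(3)
    )
    # FA = sqrt(3/2) * ||D - MD*I|| / ||D||, defined as 0 for a zero tensor.
    fa = math.sqrt(1.5 * dev_sq / norm_sq) if norm_sq > 0 else 0.0
    return md, fa
```

With this definition an isotropic tensor gives FA = 0, and a tensor with a single non-zero eigenvalue gives FA = 1.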
First, employing all features, we study the accuracy of LR, RF (TreeBagger function in Matlab with 100 trees and default parameters for classification) and a DNN (3 fully connected layers with 30, 5 and 1 nodes, respectively, implemented in Matlab). Then, to investigate the effect of including shape and texture, we also analyse a CNN with the following architecture: C^{3}_{32}-P-C^{3}_{10}-D-P-C^{5}_{10}-D-FC_{40}-FC_{2}-SL-CL, where C^{k}_{N} is a convolution layer with N filters of size k x k, P is a 2x2 max-pooling layer, D is a dropout layer, FC_{N} is a fully-connected layer with N nodes, SL is a soft-max layer and CL is a classification layer; each convolution layer is followed by a batch normalization layer and a ReLU layer. The CNN is applied to 45 channels (44 diffusion features and the probability output of the DNN). The dataset is split 80-20 for training and testing. The mean accuracy and its standard deviation are computed over 20 repetitions.
We also study the importance of different features for classification based on RF, and the classification accuracy using four feature subsets.
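The evaluation protocol (random 80-20 split, repeated 20 times, reporting mean accuracy and standard deviation) can be sketched as follows. This is an illustration under stated assumptions, not the Matlab code used in this work: the text does not say at which level the data are split, so the sketch simply splits a flat list of samples; the threshold classifier is a hypothetical stand-in for LR, RF or DNN; and the data are synthetic, perfectly separable toy values unrelated to the study.

```python
import random
import statistics

def repeated_holdout(samples, labels, fit, predict,
                     n_repeats=20, test_fraction=0.2, seed=0):
    """Mean and standard deviation of test accuracy over random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, round(test_fraction * len(samples)))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Hypothetical stand-in classifier: threshold halfway between the class means.
def fit_threshold(xs, ys):
    mean0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2.0, mean1 > mean0

def predict_threshold(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)

# Synthetic toy data only: 23 samples of class 0 and 31 of class 1.
toy_x = [1.0 + 0.01 * i for i in range(23)] + [2.0 + 0.01 * i for i in range(31)]
toy_y = [0] * 23 + [1] * 31
mean_acc, std_acc = repeated_holdout(toy_x, toy_y, fit_threshold, predict_threshold)
print(mean_acc, std_acc)
```

Passing `fit` and `predict` as arguments keeps the splitting logic independent of the classifier, so the same loop could be reused for each of the compared methods.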

Figure 1 shows parameter maps of MD and FA for a benign and a malignant node (measured at Δ = 5 ms and b = 1000 s/mm^{2}) and boxplots of voxelwise MD and FA values in the two nodes; both parameters differ significantly between the nodes (p << 0.01).

Figure 2 illustrates the classification accuracy and its uncertainty for LR (75.1±10.1%), RF (74.2±9.1%), DNN (76.6±9.2%) and CNN (76.3±1.3%) when employed with all 44 features. The uncertainty is around 10% for LR, RF and DNN, and much smaller (1.3%) for CNN.

Figure 3 plots the feature importance obtained from the RF classification. The most informative feature appears to be MD measured at short diffusion time (Δ = 5 ms and b = 1000 s/mm^{2}).
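The idea behind permutation-based feature importance can be sketched in a few lines: permute the values of one feature across observations and record the increase in prediction error. The sketch below is a simplified illustration, assuming a generic `predict` function and a single evaluation set; it does not reproduce the out-of-bag, per-tree computation of Matlab's TreeBagger, and the function name `permutation_importance` and the toy data are hypothetical.

```python
import random

def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Increase in misclassification rate when each feature is permuted in turn."""
    rng = random.Random(seed)

    def error(data):
        return sum(predict(r) != y for r, y in zip(data, labels)) / len(labels)

    baseline = error(rows)
    importances = []
    for j in range(n_features):
        # Shuffle column j across observations, leaving other features untouched.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(error(permuted) - baseline)
    return importances

# Toy example: the label depends only on feature 0, feature 1 is irrelevant.
toy_rows = [[float(i % 2), float(i)] for i in range(20)]
toy_labels = [i % 2 for i in range(20)]
toy_importances = permutation_importance(
    lambda r: int(r[0] > 0.5), toy_rows, toy_labels, n_features=2
)
print(toy_importances)
```

A feature that the classifier ignores receives an importance of exactly zero, since permuting it cannot change any prediction.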

Next, we repeat the classification for LR, RF and DNN with a reduced number of features: 1 or 4 shells with either only the best feature or both MD and FA. As illustrated in Figure 4, for 1 shell, including both MD and FA increases the accuracy for all classifiers; nevertheless, the increase is significant only for the DNN. The accuracy with reduced features is lower than the initial values; however, the difference is not significant (except for the DNN with 1 shell).

^{1} Grone, J. et al, Accuracy of Various Lymph Node Staging Criteria in Rectal Cancer with Magnetic Resonance Imaging, J Gastrointest Surg (2018) 22: 146.

^{2} Ogawa, M., et al, The Usefulness of Diffusion MRI in Detection of
Lymph Node Metastases of Colorectal Cancer. Anticancer Research,
2016. 36(2): p. 815-9.

^{3} Yasui, O., et al, Diffusion-weighted imaging in the
detection of lymph node metastasis in colorectal cancer. Tohoku J Exp
Med, 2009. 218(3): p. 177-83.

^{4} Heijnen, L.A., et al., Diffusion-weighted MR imaging in primary
rectal cancer staging demonstrates but does not characterise lymph
nodes. Eur Radiol, 2013. 23(12): p. 3354-60.

^{5} Ianus, A. et al, Comparison of diffusion MR models in lymph nodes at
ultra high field. ISMRM, 2018

^{6} Hasbahceci, M. et al, Diffusion MRI on lymph node staging of gastric adenocarcinoma, Quant Imaging Med Surg. 2015 Jun; 5(3): 392–400.

Figure 1. a) Example maps of MD and FA for a benign and a malignant node for the shell with Δ = 5 ms and b = 1000 s/mm^{2}; b) Box plots showing the voxelwise parameter differences between the two nodes. Both MD and FA values are significantly different with p << 0.01.

Figure 2. Mean and standard deviation of accuracy over 20 repetitions for logistic regression, random forest, deep neural network and convolutional neural network fitted to all 44 features.

Figure 3. Feature importance based on the random forest algorithm, i.e. the OOBPermutedVarDeltaError which describes the increase in prediction error if the values of a given variable are permuted across out-of-bag observations. MD features are marked in blue and FA in red. The x-axis shows the shell corresponding to each feature. The green lines delineate the best feature and next 3 best features.

Figure 4. Accuracy of LR, RF and DNN for a limited number of features from either 1 shell (#2) or 4 shells (#2, #13, #3, #22). In each case, either only the best feature for each shell was used (left) or both MD and FA maps were used (right).