\(\newcommand{L}[1]{\| #1 \|}\newcommand{VL}[1]{\L{ \vec{#1} }}\newcommand{R}[1]{\operatorname{Re}\,(#1)}\newcommand{I}[1]{\operatorname{Im}\, (#1)}\)

Calculating transformations between images

In which we discover optimization, cost functions and how to use them.

>>> # - compatibility with Python 2
>>> from __future__ import print_function  # print('me') instead of print 'me'
>>> from __future__ import division  # 1/2 == 0.5, not 0
>>> # - import common modules
>>> import numpy as np  # the Python array package
>>> import matplotlib.pyplot as plt  # the Python plotting package
>>> # - set gray colormap and nearest neighbor interpolation by default
>>> plt.rcParams['image.cmap'] = 'gray'
>>> plt.rcParams['image.interpolation'] = 'nearest'
>>> # Tell numpy to print numbers to 4 decimal places only
>>> np.set_printoptions(precision=4, suppress=True)

Hint

If running in the IPython console, consider running %matplotlib to enable interactive plots. If running in the Jupyter Notebook, use %matplotlib inline.

Introduction

We often want to work out some set of spatial transformations that will make one image be a better match to another.

One example is motion correction in FMRI. We know that the subject moves over the course of an FMRI run. To correct for this, we may want to take each volume in the run, and match this volume to the first volume in the run. To do this, we need to work out what the best set of transformations are to match - say - volume 1 to volume 0.

Get the data

We start with a single slice from the each of the first two volumes in the EPI (BOLD) data.

We load the 4D EPI image ds107_sub012_t1r2.nii, and get the data:

>>> import nibabel as nib
>>> img = nib.load('ds107_sub012_t1r2.nii')
>>> data = img.get_data()
>>> data.shape
(64, 64, 35, 166)

We need to make our data into floating point to make processing easier later

>>> data = data.astype(np.float32)

Get the first and second volumes:

>>> vol0 = data[..., 0]
>>> vol1 = data[..., 1]
>>> vol1.shape
(64, 64, 35)

Now we get our slices. Here I am getting slice index 17 in z, from the first volume:

>>> mid_vol0 = vol0[:, :, 17]
>>> plt.imshow(mid_vol0)
<...>

(png, hires.png, pdf)

_images/optimizing_space-8.png

Here is the corresponding slice from volume 1:

>>> mid_vol1 = vol1[:, :, 17]
>>> plt.imshow(mid_vol1)
<...>

(png, hires.png, pdf)

_images/optimizing_space-9.png

We might want to match these two slices so they are in the same position. We will try and match the slices by moving the voxel values in slice 1 so that they are a better match to the voxel values in slice 0. Our job is to find the best set of transformations to do this.

We would like to have some automatic way to calculate these transformations.

The rest of this page is about how this automated method could (and does) work.

At the moment it is hard to see any difference between the two images. That is still true even if we put the slices side by side:

>>> fig, axes = plt.subplots(1, 2)
>>> axes[0].imshow(mid_vol0)
<...>
>>> axes[0].set_title('slice 0')
<...>
>>> axes[1].imshow(mid_vol1)
<...>
>>> axes[1].set_title('slice 1')
<...>

(png, hires.png, pdf)

_images/optimizing_space-10.png

The slices are very similar, because the movement between the two volumes is very small - often the movements in an FMRI run are less than a single voxel.

In order to show the automated matching procedure in action, let’s start with an image that is obviously badly matched with slice 0. We will do this by pushing slice 0 (mid_vol1) 8 voxels down.

After that we will try to match the shifted version (shifted_mid_vol1) to mid_vol0:

>>> # Make slice full of zeros like mid_vol1 slice
>>> shifted_mid_vol1 = np.zeros(mid_vol1.shape)
>>> # Fill the lower 54 (of 64) x lines with mid_vol1
>>> shifted_mid_vol1[8:, :] = mid_vol1[:-8, :]
>>> # Now we have something like mid_vol1 but translated
>>> # in the first dimension
>>> plt.imshow(shifted_mid_vol1)
<...>

(png, hires.png, pdf)

_images/optimizing_space-11.png

Formulating the matching problem

Let’s say we did not know how many voxels the image had been translated. We want a good way to find out.

Let’s put our problem in a more concrete form. We are going to try translating our shifted_mid_vol1 over the first (\(x\)) axis. Call this translation \(t\), so a movement of -5 voxels in \(x\) gives \(t=-5\).

Call the first image \(\mathbf{X}\) (mid_vol0 in our case). Call the second image \(\mathbf{Y}\) (shifted_mid_vol1 in our case). \(\mathbf{Y}_t\) is \(\mathbf{Y}\) translated in \(x\) by \(t\) voxels.

\(t\) is a good translation if the image \(\mathbf{Y}_t\) is a good match for image \(\mathbf{X}\).

We need to quantify what we mean by good match. That is, we need some measure of quality of match, given two images \(\mathbf{X}\) and \(\mathbf{Y}\). Call this measure \(M(\mathbf{X}, \mathbf{Y})\), and let us specify that the value of \(M(\mathbf{X}, \mathbf{Y})\) should be lower when the images match well. We could therefore call \(M(\mathbf{X}, \mathbf{Y})\) a mismatch function.

Now we can formulate our problem - we want to find the translation \(t\) that gives the lowest value of \(M(\mathbf{X}, \mathbf{Y_t})\). We can express this as:

\[\hat{t} = \mathrm{argmin}_t M(\mathbf{X}, \mathbf{Y_t})\]

Read this as “the desired translation value \(\hat{t}\) is the value of \(t\) such that \(M(\mathbf{X}, \mathbf{Y_t})\) is minimized”.

Practically, we are going to need the following things:

  • a function to generate \(\mathbf{Y_t}\) given \(\mathbf{Y}\) and \(t\). Call this the transformation function;
  • a function \(M\) to give the mismatch between two images. Call this the mismatch function.

Here’s the transformation function to generate \(\mathbf{Y_t}\) - the image \(\mathbf{Y}\) shifted by \(t\) voxels in \(x\):

>>> def x_trans_slice(img_slice, x_vox_trans):
...     """ Return copy of `img_slice` translated by `x_vox_trans` voxels
...
...     Parameters
...     ----------
...     img_slice : array shape (M, N)
...         2D image to transform with translation `x_vox_trans`
...     x_vox_trans : int
...         Number of pixels (voxels) to translate `img_slice`; can be
...         positive or negative.
...
...     Returns
...     -------
...     img_slice_transformed : array shape (M, N)
...         2D image translated by `x_vox_trans` pixels (voxels).
...     """
...     # Make a 0-filled array of same shape as `img_slice`
...     trans_slice = np.zeros(img_slice.shape)
...     # Use slicing to select voxels out of the image and move them
...     # up or down on the first (x) axis
...     if x_vox_trans < 0:
...         trans_slice[:x_vox_trans, :] = img_slice[-x_vox_trans:, :]
...     elif x_vox_trans == 0:
...         trans_slice[:, :] = img_slice
...     else:
...         trans_slice[x_vox_trans:, :] = img_slice[:-x_vox_trans, :]
...     return trans_slice

Choosing a metric for image mismatch

Next we need a mismatch function that accepts two images, and returns a scalar value that is low when the images are well matched.

We could imagine a mismatch measure that used the values from subtracting the two images:

>>> plt.imshow(mid_vol0 - shifted_mid_vol1)
<...>

(png, hires.png, pdf)

_images/optimizing_space-13.png

Can we take the sum of this difference as our scalar measure of mismatch?

No, because the negative numbers (black) will cancel out the positive numbers (white). We need something better:

>>> def mean_abs_mismatch(slice0, slice1):
...     """ Mean absoute difference between images
...     """
...     return np.mean(np.abs(slice0 - slice1))

Now we can check different values of translation with our mismatch function. We move the image up and down for a range of values of \(t\) and recalculate the mismatch measure for every candidate translation:

>>> mismatches = []
>>> translations = range(-25, 15)  # Candidate values for t
>>> for t in translations:
...     # Make the translated image Y_t
...     unshifted = x_trans_slice(shifted_mid_vol1, t)
...     # Calculate the mismatch
...     mismatch = mean_abs_mismatch(unshifted, mid_vol0)
...     # Store it for later
...     mismatches.append(mismatch)
>>> plt.plot(translations, mismatches)
[...]
>>> plt.xlabel('translation (t)')
<...>
>>> plt.ylabel('mean absolute difference')
<...>

(png, hires.png, pdf)

_images/optimizing_space-16.png

We can try other measures of mismatch. Another measure of how well the images match might be the correlation of the voxel values at each voxel. When the images are well matched, we expect black values in one image to be matched with black in the other, ditto for white.

>>> # Number of voxels in the image
>>> n_voxels = np.prod(mid_vol1.shape)
>>> # Reshape vol0 slice as 1D vector
>>> mid_vol0_as_1d = mid_vol0.reshape(n_voxels)
>>> # Reshape vol1 slice as 1D vector
>>> mid_vol1_as_1d = mid_vol1.reshape(n_voxels)
>>> # These original slices should be very close to each other already
>>> # So - plotting one set of image values against the other should
>>> # be close to a straight line
>>> plt.plot(mid_vol0_as_1d, mid_vol1_as_1d, '.')
[...]
>>> plt.xlabel('voxels in vol0 slice')
<...>
>>> plt.ylabel('voxels in original vol1 slice')
<...>
>>> # Correlation coefficient between them
>>> print(np.corrcoef(mid_vol0_as_1d, mid_vol1_as_1d)[0, 1])
0.998119433113

(png, hires.png, pdf)

_images/optimizing_space-17.png
>>> # The shifted slice will be less well matched
>>> # Therefore the line will be less straight and narrow
>>> plt.plot(mid_vol0_as_1d, shifted_mid_vol1.ravel(), '.')
[...]
>>> plt.xlabel('voxels in vol0 slice')
<...>
>>> plt.ylabel('voxels in shifted vol1 slice')
<...>
>>> # Correlation coefficient between them will be nearer 0
>>> print(np.corrcoef(mid_vol0_as_1d, shifted_mid_vol1.ravel())[0, 1])
0.446286936383

(png, hires.png, pdf)

_images/optimizing_space-18.png

We expect that the correlation will be high and positive when the images are well matched. Our mismatch measure, on the other hand, should be low when the images are well-matched. So, we can use the negative correlation as our mismatch measure:

>>> def correl_mismatch(slice0, slice1):
...     """ Negative correlation between the two images, flattened to 1D """
...     correl = np.corrcoef(slice0.ravel(), slice1.ravel())[0, 1]
...     return -correl
>>> correl_mismatches = []
>>> translations = range(-25, 15)  # Candidate values for t
>>> for t in translations:
...     unshifted = x_trans_slice(shifted_mid_vol1, t)
...     mismatch = correl_mismatch(unshifted, mid_vol0)
...     correl_mismatches.append(mismatch)
>>> plt.plot(translations, correl_mismatches)
[...]
>>> plt.xlabel('translation (t)')
<...>
>>> plt.ylabel('correlation mismatch')
<...>

(png, hires.png, pdf)

_images/optimizing_space-21.png

So far we have only tried translations of integer numbers of voxels (0, 1, 2, 3...).

How about non-integer translations? Will these work?

To do a non-integer translation, we have to think more generally about how to make an image that matches another image, after a transformation.

Resampling in 2D

We need a more general resampling algorithm. This is like the 1D interpolation we do for Slice timing correction, but in two dimensions.

Let’s say we want to do a voxel translation of 0.5 voxels in x. The way we might go about this is the following.

To start, here are some names:

  • img0 is the image we are trying to match to (in our case mid_vol0);
  • img1 is the image we are trying to match by moving (in our case shifted_mid_vol1);
  • trans is the transformation from pixel coordinates in img1 to pixel coordinates in img0 (in our case adding 0.5 to the first coordinate value, so that [0, 0] becomes [0.5, 0]);
  • itrans is the inverse of trans, and gives the transformation to go from pixel coordinates in img0 to pixel coordinates in img1. In our case this is subtracting 0.5 from the first coordinate value.

The procedure for resampling is:

  • Make a 2D image the same shape as the image we want to match to (img0) - call this new_img0;
  • For each pixel in new_img0;
    • call the pixel coordinate for this pixel: coord_for_img0;
    • transform coord_for_img0 using itrans. In our case this would be to subtract 0.5 from the first coordinate value ([0, 0] becomes [-0.5, 0]). Call the transformed coordinate coord_for_img1;
    • Estimate the pixel value in img1 at coordinate coord_for_img1. Call this value img1_value_estimate;
    • Insert img1_value into new_img0 at coordinate coord_for_img0.

The “Estimate pixel value” step is called resampling. As you can see this is the same general idea as interpolating in one dimension. We saw one dimensional interpolation for Slice timing correction. There are various ways of interpolating in two or three dimensions, but one of the most obvious is the simple extension of linear interpolation to two (or more) dimensions - bilinear interpolation.

The scipy.ndimage library has routines for resampling in 2 or 3 dimensions:

>>> import scipy.ndimage as snd

In fact the affine_transform function from scipy.ndimage will do the whole process for us.

>>> def fancy_x_trans_slice(img_slice, x_vox_trans):
...     """ Return copy of `img_slice` translated by `x_vox_trans` voxels
...
...     Parameters
...     ----------
...     img_slice : array shape (M, N)
...         2D image to transform with translation `x_vox_trans`
...     x_vox_trans : float
...         Number of pixels (voxels) to translate `img_slice`; can be
...         positive or negative, and does not need to be integer value.
...     """
...     # Resample image using bilinear interpolation (order=1)
...     trans_slice = snd.affine_transform(img_slice, [1, 1], [-x_vox_trans, 0], order=1)
...     return trans_slice
>>> fine_mismatches = []
>>> fine_translations = np.linspace(-25, 15, 100)
>>> for t in fine_translations:
...     unshifted = fancy_x_trans_slice(shifted_mid_vol1, t)
...     mismatch = correl_mismatch(unshifted, mid_vol0)
...     fine_mismatches.append(mismatch)
>>> plt.plot(fine_translations, fine_mismatches)
[...]

(png, hires.png, pdf)

_images/optimizing_space-25.png

We are looking for the best x translation. At the moment we have to sample lots of x translations and then choose the best. Is there a better way?

Optimization

Optimization is a field of mathematics / computer science that solves this exact problem.

There are many optimization routines in Python, MATLAB and other languages. These routines typically allow you to pass some function, called the objective function, or the cost function. The optimization routine returns the parameters of the cost function that give the lowest value.

In our case, the cost function we need to minimize will accept one parameter (the translation), and return the mismatch value for that translation. So, it will need to create the image \(\mathbf{X_t}\) and return \(M(\mathbf{X}, \mathbf{Y_t})\).

What does it mean to “pass” a function in Python. Remember that, in Python, Functions are objects like any other.

The optimization works by running the cost function at some starting value of the parameter (in our case, x translation), and using an algorithm to choose the next value of the parameter to try. It continues trying new values until it finds a parameter value for which very small changes of the parameter up or down only increase the cost function value. At this point the routine stops and returns the parameter value.

To write the cost function, we use the fact that Python functions can access variables defined in the Global and local scope of Python variables. In our case the function cost_function can access variables shifted_mid_vol1 and mid_vol0 that we defined in the top level (global) scope of our interactive session:

>>> def cost_function(x_trans):
...     # Function can use image slices defined in the global scope
...     # Calculate X_t - image translated by x_trans
...     unshifted = fancy_x_trans_slice(shifted_mid_vol1, x_trans)
...     # Return mismatch measure for the translated image X_t
...     return correl_mismatch(unshifted, mid_vol0)
>>> # value of the negative correlation for no translatino
>>> print(cost_function(0))
-0.446286936383
>>> # value of the negative correlation for translation of -8 voxels
>>> print(cost_function(-8))
-0.997030807352

Now we get a general optimizing routine from the scipy Python library. fmin_powell finds the minimum of a function using Powell’s method of numerical optimization:

>>> from scipy.optimize import fmin_powell

We pass the fmin_powell routine our Python cost function, and a starting value of zero for the translation:

>>> print(fmin_powell(cost_function, [0]))
Optimization terminated successfully.
         Current function value: -0.997032
         Iterations: 2
         Function evaluations: 27
-7.9960266912...

The function ran, and found that a translation value very close to -8 gave the smallest value for our cost function.

What actually happened there? Let’s track the progress of fmin_powell using a callback function:

>>> def my_callback(params):
...    print("Trying parameters " + str(params))

fmin_powell calls this my_callback function when it is testing a new set of parameters.

>>> best_params = fmin_powell(cost_function, [0], callback=my_callback)
Trying parameters [-7.995]
Trying parameters [-7.996]
Optimization terminated successfully.
         Current function value: -0.997032
         Iterations: 2
         Function evaluations: 27
>>> print(best_params)
-7.9960266912...

The optimization routine fmin_powell is trying various different translations, finally coming to the optimum (minimum) translation close to -8.

More than one parameter

How about adding y translation as well?

Our optimization routine could deal with this in a very simple way - by optimizing the x translation, then the y translation, like this:

  • Adjust the x translation until it has reached a minimum, then;
  • Adjust the y translation until it has reached a minimum, then;
  • Repeat x, y minimization until the minimum for both is stable.

Although we could do that, in fact fmin_powell does something slightly more complicated when looking for the next best set of parameters to try. The details aren’t important to the general idea of searching over different parameter values to find the lowest value for the cost function.

>>> def fancy_xy_trans_slice(img_slice, x_y_trans):
...     """ Make a copy of `img_slice` translated by `x_y_trans` voxels
...
...     x_y_trans is a sequence or array length 2, containing
...     the (x, y) translations in voxels.
...
...     Values in `x_y_trans` can be positive or negative, and
...     can be floats.
...     """
...     x_y_trans = np.array(x_y_trans)
...     # Resample image using bilinear interpolation (order=1)
...     trans_slice = snd.affine_transform(img_slice, [1, 1], -x_y_trans, order=1)
...     return trans_slice
>>> def fancy_cost_at_xy(x_y_trans):
...     """ Give cost function at xy translation values `x_y_trans`
...     """
...     unshifted = fancy_xy_trans_slice(shifted_mid_vol1, x_y_trans)
...     return correl_mismatch(unshifted, mid_vol0)

Do the optimization of fancy_cost_at_xy using fmin_powell:

>>> best_params = fmin_powell(fancy_cost_at_xy, [0, 0], callback=my_callback)
Trying parameters [-7.995  0.   ]
Trying parameters [-7.996  0.   ]
Optimization terminated successfully.
         Current function value: -0.997032
         Iterations: 2
         Function evaluations: 133
>>> best_params
array([-7.996,  0.   ])

(You probably noticed that fmin_powell appears to be sticking to the right answer for y translation (0), without printing out any evidence that it is trying other y values. This is because of the way that the Powell optimization works - it is in effect doing separate 1-parameter minimizations in order to get the final 2-parameter minimization, and it is not reporting the intermediate steps in the 1-parameter minimizations)

Full 3D estimate

Now we know how to do two parameters, it is easy to extend this to six parameters.

The six parameters are:

  • x translation
  • y translation
  • z translation
  • rotation around x axis (pitch)
  • rotation around y axis (roll)
  • rotation around z axis (yaw)

Any transformation with these parameters is called a rigid body transformation because the transformation cannot make the object change shape (the object is rigid).

Implementing the rotations is just slightly out of scope for this tutorial (we would have to convert between angles and rotation matrices), so here is the 3D optimization with the three translation parameters:

To make it a bit more interesting, shift the second volume 8 voxels on the first axis, as we did for the 2D case. We will also push the image 5 voxels forward on the second axis:

>>> shifted_vol1 = np.zeros(vol1.shape)
>>> shifted_vol1[8:, 5:, :] = vol1[:-8, :-5, :]
>>> plt.imshow(shifted_vol1[:, :, 17])
<...>

(png, hires.png, pdf)

_images/optimizing_space-36.png

We define resampling for any given x, y, z translation:

>>> def xyz_trans_vol(vol, x_y_z_trans):
...     """ Make a new copy of `vol` translated by `x_y_z_trans` voxels
...
...     x_y_z_trans is a sequence or array length 3, containing
...     the (x, y, z) translations in voxels.
...
...     Values in `x_y_z_trans` can be positive or negative,
...     and can be floats.
...     """
...     x_y_z_trans = np.array(x_y_z_trans)
...     # [1, 1, 1] says to do no zooming or rotation
...     # Resample image using trilinear interpolation (order=1)
...     trans_vol = snd.affine_transform(vol, [1, 1, 1], -x_y_z_trans, order=1)
...     return trans_vol

Our cost function (picking up vol0, shifted_vol1 from the global scope):

>>> def cost_at_xyz(x_y_z_trans):
...     """ Give cost function value at xyz translation values `x_y_z_trans`
...     """
...     unshifted = xyz_trans_vol(shifted_vol1, x_y_z_trans)
...     return correl_mismatch(unshifted, vol0)

Do the optimization of cost_at_xyz using fmin_powell:

>>> best_params = fmin_powell(cost_at_xyz, [0, 0, 0], callback=my_callback)
Trying parameters [-7.5722 -4.9661  0.    ]
Trying parameters [-7.9968 -4.9997  0.    ]
Trying parameters [-8. -5.  0.]
Optimization terminated successfully.
         Current function value: -0.995145
         Iterations: 3
         Function evaluations: 270
>>> best_params
array([-8., -5.,  0.])

Finally, we make a new volume from shifted_vol1 with these transformations applied:

>>> unshifted_vol1 = snd.affine_transform(shifted_vol1, [1, 1, 1], -best_params)
>>> fig, axes = plt.subplots(1, 2)
>>> axes[0].imshow(vol0[:, :, 17])
<...>
>>> axes[0].set_title('vol0')
<...>
>>> axes[1].imshow(unshifted_vol1[:, :, 17])
<...>
>>> axes[1].set_title('unshifted vol1')
<...>

(png, hires.png, pdf)

_images/optimizing_space-41.png

Getting stuck in the wrong place

So far, all our cost function plots are simple, in the sense that they have one single obvious minimum. For example, here is a repeat of our earlier plot of the negative correlation value, as a function of translation in x, for the shifted single slice:

>>> correl_mismatches = []
>>> translations = range(-25, 15)  # Candidate values for t
>>> for t in translations:
...     unshifted = x_trans_slice(shifted_mid_vol1, t)
...     mismatch = correl_mismatch(unshifted, mid_vol0)
...     correl_mismatches.append(mismatch)
>>> plt.plot(translations, correl_mismatches)
[...]
>>> plt.title('Cost as a function of $t$')
<...>

(png, hires.png, pdf)

_images/optimizing_space-43.png

Notice the nice single minimum at around \(t=-8\).

Unfortunately, many cost functions don’t have one single minimum, but several. In fact this is so even for our simple correlation measure, if we look at larger translations (values of \(t\)):

>>> correl_mismatches = []
>>> translations = range(-60, 50)  # Candidate values for t
>>> for t in translations:
...     unshifted = x_trans_slice(shifted_mid_vol1, t)
...     mismatch = correl_mismatch(unshifted, mid_vol0)
...     correl_mismatches.append(mismatch)
>>> plt.plot(translations, correl_mismatches)
[...]
>>> plt.title('Cost as a function of more $t$')
<...>

(png, hires.png, pdf)

_images/optimizing_space-45.png

Remember that a minimum is a value for which the values to the left and right are higher. So, the -8 value of \(t\) is a minimum (with a negative correlation value of -1), but the value at around \(t=44\) is also a minimum, with a negative correlation value of around 0.2. The value at \(t=-8\) is a global minimum in the sense that it is the minimum with the lowest cost value across all values of \(t\). The value at around \(t=44\) is a local minimum.

In general, our optimization routines are only able to guarantee that they have found a local minimum. So, if we start our search in the wrong place, then the optimization routine may well find the wrong minimum.

Here is our original 1 parameter cost function (it is the same as the version above):

>>> def cost_function(x_trans):
...     # Function can use image slices defined in the
...     # global Python scope.
...     # Calculate X_t - image translated by x_trans
...     unshifted = fancy_x_trans_slice(shifted_mid_vol1, x_trans)
...     # Return mismatch measure for the translated image X_t
...     return correl_mismatch(unshifted, mid_vol0)

Now we minimize this cost function with fmin_powell, but starting nearer the local minimum:

>>> print(fmin_powell(cost_function, [35]))
Optimization terminated successfully.
         Current function value: 0.241603
         Iterations: 2
         Function evaluations: 55
43.0000006372...

Here we took a very bad starting value, but we would run the same risk if the images started off much further apart and we gave a starting value of 0.

One major part of using optimization, is being aware that it is possible for the optimization to find a “best” value that is a local rather than a global minimum. The art of optimization is finding a minimization algorithm and mismatch metric that are well-adapted to the particular problem.