r_31415

Reputation: 8982

Image Registration and affine transformation in Python

I have been reading Programming Computer Vision with Python by Jan Erik Solem, which is a pretty good book, but I haven't been able to resolve a question about image registration.

Basically, we have a bunch of images (faces) that need to be aligned a bit, so the first step is to compute a rigid transformation in the form of a similarity transformation:

x' = | sR t | x
     | 0  1 |

where x is the vector (a set of coordinates in this case) to be transformed into x' via a rotation R, a translation t and possibly a scaling s.

Solem calculates this rigid transformation for each image, which returns the rotation matrix R and the translation components tx and ty:
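To make the block matrix concrete, here is a minimal sketch of the similarity transform in homogeneous coordinates. The helper name `similarity_matrix` and the sample values are made up for illustration; they are not from the book.

```python
import numpy as np

def similarity_matrix(s, theta, tx, ty):
    """Build the 3x3 homogeneous matrix [[s*R, t], [0, 1]] (hypothetical helper)."""
    c, sn = np.cos(theta), np.sin(theta)
    R = np.array([[c, -sn], [sn, c]])  # 2x2 rotation by theta
    H = np.eye(3)
    H[:2, :2] = s * R                  # scaled rotation block
    H[:2, 2] = [tx, ty]                # translation column
    return H

# Scale by 2, rotate by 90 degrees, then shift by (1, 0).
H = similarity_matrix(s=2.0, theta=np.pi / 2, tx=1.0, ty=0.0)

x = np.array([1.0, 0.0, 1.0])  # the point (1, 0) in homogeneous form
x_prime = H @ x                # approximately (1, 2, 1)
```

The last component stays 1, so the first two entries of `x_prime` are the transformed coordinates.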

R,tx,ty = compute_rigid_transform(refpoints, points)

However, he reorders the elements of R for some reason:

T = array([[R[1][1], R[1][0]], [R[0][1], R[0][0]]])

and later he performs an affine transformation:

im2[:,:,i] = ndimage.affine_transform(im[:,:,i],linalg.inv(T),offset=[-ty,-tx])

In this example, this affine transformation is performed on each channel but that's not relevant. im[:,:,i] is the image to be processed and this procedure returns another image.

What is T and why are we inverting that matrix in the affine transformation? And what are the usual steps to achieve image registration?

Update

Here you can find the relevant part of this code in Google Books. Starts at the bottom of page 67.

Upvotes: 3

Views: 5483

Answers (2)

ColorRGB

Reputation: 1

I will try to answer your question and point out a mistake (?) in the book.

(1) Why use T = array([[R[1][1], R[1][0]], [R[0][1], R[0][0]]])? Because R,tx,ty = compute_rigid_transform(refpoints, points) computes the rotation matrix and translation in the form:

|x'| = s|R[0][0] R[0][1]||x| + |tx|             Equation (1)
|y'|    |R[1][0] R[1][1]||y|   |ty|

However, OUT = ndimage.affine_transform(IN,A,b) expects coordinates in (y,x) order, not (x,y). So Equation (1) becomes

|y'| = s|R[1][1] R[1][0]||y| + |ty| = T|y| + |ty|        Equation(2)
|x'|    |R[0][1] R[0][0]||x|   |tx|    |x|   |tx|

Therefore the matrix passed to ndimage.affine_transform() is linalg.inv(T), not linalg.inv(R).

(2) The affine transform OUT = ndimage.affine_transform(IN,A,b) actually maps output coordinates to input coordinates: A*OUT + b => IN. Solving Equation (2) for the input coordinates gives
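A small numerical check of the reordering, as a sketch: it assumes R is a plain 2x2 rotation and leaves out the scale s for brevity. The permutation matrix P and the sample point are illustrative and do not appear in the book.

```python
import numpy as np

theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The reordering from the book.
T = np.array([[R[1][1], R[1][0]], [R[0][1], R[0][0]]])

# P swaps the two coordinates, so P @ R @ P is R expressed in (y, x) order.
P = np.array([[0, 1], [1, 0]])
assert np.allclose(T, P @ R @ P)

# Transforming in (x, y) order with R matches transforming in (y, x) order with T.
xy = np.array([3.0, 5.0])
t_xy = np.array([1.0, 2.0])          # (tx, ty)
out_xy = R @ xy + t_xy               # Equation (1), without s
out_yx = T @ xy[::-1] + t_xy[::-1]   # Equation (2), without s
assert np.allclose(out_yx, out_xy[::-1])
```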

|y| = inv(T)|y'| - inv(T)|ty|
|x|         |x'|         |tx|

So the offset passed to ndimage.affine_transform() should be inv(T) applied to [-ty, -tx], not [-ty, -tx] itself. I think this is a bug in the original code.
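The output-to-input convention can be checked on a toy image. This is a sketch with made-up values: T is a 90 degree rotation in (row, column) order and t is a shift of 7 rows, chosen only so that the result lands on integer pixels. It is not the book's data.

```python
import numpy as np
from scipy import ndimage

# Forward transform in (row, col) order: p_out = T @ p_in + t
T = np.array([[0.0, -1.0], [1.0, 0.0]])
t = np.array([7.0, 0.0])

im = np.zeros((8, 8))
im[2, 3] = 1.0  # single bright pixel at row 2, col 3

# affine_transform looks up input[matrix @ p_out + offset] for each output pixel,
# so it needs the inverse mapping: p_in = inv(T) @ p_out - inv(T) @ t
A = np.linalg.inv(T)
out = ndimage.affine_transform(im, A, offset=-A @ t, order=0)

# Where the forward transform sends the bright pixel.
expected = np.rint(T @ np.array([2.0, 3.0]) + t).astype(int)

# Same matrix, but with the offset set to -t without multiplying by inv(T).
naive = ndimage.affine_transform(im, A, offset=-t, order=0)
```

In this constructed case `out` has its bright pixel at `expected`, while `naive` pushes it outside the frame. When T is close to the identity the two offsets nearly coincide, which may be why the difference is easy to miss.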

Upvotes: 0

aganders3

Reputation: 5955

It looks like an error in the code to me. T appears to just be the transpose of R, which for a rotation matrix is the same as the inverse. Then he takes the inverse (again) in the call to ndimage.affine_transform. I think it should be either T or linalg.inv(R) passed to that function.
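A quick numerical look at this observation, as a sketch assuming R is a pure rotation with no scale or shear. The angle and the matrix M are arbitrary illustrative values.

```python
import numpy as np

theta = np.deg2rad(25)
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s], [s, c]])

# The reordering used in the book.
T = np.array([[R[1][1], R[1][0]], [R[0][1], R[0][0]]])

# For a pure rotation the reordered matrix coincides with the transpose and the inverse.
same_as_transpose = np.allclose(T, R.T)
same_as_inverse = np.allclose(T, np.linalg.inv(R))

# So inverting T again gives back R.
inv_T_is_R = np.allclose(np.linalg.inv(T), R)

# For a general 2x2 matrix the same reordering is not the transpose.
M = np.array([[1.0, 2.0], [3.0, 4.0]])
M_reordered = np.array([[M[1][1], M[1][0]], [M[0][1], M[0][0]]])
differs_in_general = not np.allclose(M_reordered, M.T)
```

All four flags come out True here, so the equality with the transpose holds only because the diagonal entries of a rotation matrix are equal.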

Upvotes: 1
